lamindb.UPath
- class lamindb.UPath(*args, protocol=None, chain_parser=<upath._chain.FSSpecChainParser object>, **storage_options)¶
Bases: _UPathMixin, WritablePath, ReadablePath
Path-like access to files.
Offers the typical access patterns of file systems and object stores, for example:
upath = ln.UPath("s3://my-bucket/my-folder/my-file.txt")
upath.exists()  # file exists in storage
The class is an extension of universal_pathlib.UPath.
- Parameters:
pathlike – A string or Path to a local or cloud file or folder.
See also
from_auth(): If the S3 URI is in an S3 storage location managed by LaminHub, use this method to request federated AWS credentials.
Examples
Create a path object from a local file:
upath = ln.UPath("./my-folder/my-file.txt")
Create a path from an S3 URI:
# create a path that detects local AWS credentials
upath = ln.UPath("s3://my-bucket/my-folder/my-file.txt")
# create a path that requests federated AWS credentials from LaminHub
upath = ln.UPath.from_auth("s3://managed-bucket/my-folder/")
Create a path object from a GS URI:
upath = ln.UPath("gs://my-bucket/my-folder/my-file.txt")
In addition to what pathlib.Path and universal_pathlib.UPath offer, ln.UPath offers the following methods:
upath.view_tree()  # view a file tree
upath.upload_from("local-file.txt")  # upload a local file
upath.download_to("local-file.txt")  # download a file
upath.synchronize_to("local-folder/")  # synchronize a folder
- property anchor: str¶
The concatenation of the drive and root or an empty string.
- property drive: str¶
The drive prefix (letter or UNC path), if any.
Info¶
On non-Windows systems, the drive is always an empty string. On cloud storage systems, the drive is the bucket name or equivalent.
- property fs: AbstractFileSystem¶
The cached fsspec filesystem instance for the path.
This is the underlying fsspec filesystem instance. It’s instantiated on first filesystem access and cached. Can be used to access fsspec-specific functionality not exposed by the UPath API.
Examples¶
>>> from upath import UPath
>>> p = UPath("s3://my-bucket/path/to/file.txt")
>>> p.fs
<s3fs.core.S3FileSystem object at 0x...>
>>> p.fs.get_tags(p.path)
{'VersionId': 'null', 'ContentLength': 12345, ...}
- property info: PathInfo¶
A PathInfo object that exposes the file type and other file attributes of this path.
Returns¶
- UPathInfo: The UPathInfo object for this path.
- property modified: datetime | None¶
Return modified time stamp.
- property name¶
The final path component, if any.
- property parent: Self¶
The logical parent of the path.
Examples¶
>>> from upath import UPath
>>> p = UPath("s3://my-bucket/path/to/file.txt")
>>> p.parent
S3Path('s3://my-bucket/path/to')
- property parents: Sequence[Self]¶
A sequence providing access to the logical ancestors of the path.
Examples¶
>>> from upath import UPath
>>> p = UPath("memory:///foo/bar/baz.txt")
>>> list(p.parents)
[
    MemoryPath('memory:///foo/bar'),
    MemoryPath('memory:///foo'),
    MemoryPath('memory:///'),
]
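The ordering of `parents` (nearest ancestor first, root last) can be checked without any storage backend. The following is a minimal sketch that uses the standard library's `pathlib.PurePosixPath` as a stand-in for a `UPath`, on the assumption that `UPath` mirrors the pathlib behavior here; the reprs differ from the `MemoryPath(...)` output shown above.

```python
from pathlib import PurePosixPath

# PurePosixPath is a stand-in for UPath (assumption); it only manipulates strings.
p = PurePosixPath("/foo/bar/baz.txt")

# Ancestors are listed from the closest one up to the root.
ancestors = list(p.parents)

# The first entry of `parents` is the same object `parent` returns.
first_is_parent = ancestors[0] == p.parent
```
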
- parser: UPathParser = <wrapped class AnyProtocolFileSystemFlavour>¶
- property parts: Sequence[str]¶
Provides sequence-like access to the filesystem path components.
Examples¶
>>> from upath import UPath
>>> p = UPath("s3://my-bucket/path/to/file.txt")
>>> p.parts
('my-bucket/', 'path', 'to', 'file.txt')
>>> p2 = UPath("/foo/bar/baz.txt", protocol="memory")
>>> p2.parts
('/', 'foo', 'bar', 'baz.txt')
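For comparison, the `memory` example above yields the same tuple that the standard library produces for a POSIX path. This sketch uses `pathlib.PurePosixPath` as a stand-in (an assumption, not the `UPath` implementation); note that the S3 example differs because the bucket takes the place of the root.

```python
from pathlib import PurePosixPath

# Stand-in for UPath("/foo/bar/baz.txt", protocol="memory") (assumption).
p = PurePosixPath("/foo/bar/baz.txt")

# The root is its own component, followed by each path segment.
parts = p.parts

# Joining the parts back together reconstructs the original path.
rebuilt = PurePosixPath(*parts)
```
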
- property path: str¶
The path used by fsspec filesystem.
FSSpec filesystems usually handle paths stripped of protocol. This property returns the path suitable for use with the underlying fsspec filesystem. It guarantees that a filesystem’s strip_protocol method is applied correctly.
Examples¶
>>> from upath import UPath
>>> p = UPath("memory:///foo/bar.txt")
>>> str(p)
'memory:///foo/bar.txt'
>>> p.path
'/foo/bar.txt'
>>> p.fs.exists(p.path)
True
- property protocol: str¶
The fsspec protocol for the path.
Note¶
Protocols are linked to upath and fsspec filesystems via the upath.registry and fsspec.registry modules. They basically represent the URI scheme used for the specific filesystem.
Examples¶
>>> from upath import UPath
>>> p0 = UPath("s3://my-bucket/path/to/file.txt")
>>> p0.protocol
's3'
>>> p1 = UPath("/foo/bar/baz.txt", protocol="memory")
>>> p1.protocol
'memory'
- property root: str¶
The root of the path, if any.
- property stem¶
The final path component, minus its last suffix.
- property storage_options: Mapping[str, Any]¶
The read-only fsspec storage options for the path.
Note¶
Storage options are specific to each fsspec filesystem and can include parameters such as authentication credentials, connection settings, and other options that affect how the filesystem interacts with the underlying storage.
Examples¶
>>> from upath import UPath
>>> p = UPath("s3://my-bucket/path/to/file.txt", anon=True)
>>> p.storage_options['anon']
True
- property suffix¶
The final component’s last suffix, if any.
This includes the leading period. For example: ‘.txt’
- property suffixes¶
A list of the final component’s suffixes, if any.
These include the leading periods. For example: [‘.tar’, ‘.gz’]
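The relationship between `name`, `stem`, `suffix`, and `suffixes` is easiest to see on a file name with two extensions. A minimal sketch, using `pathlib.PurePosixPath` as a stand-in for `UPath` (an assumption; the file name is made up for illustration):

```python
from pathlib import PurePosixPath

# Hypothetical file name with two suffixes.
p = PurePosixPath("/data/archive.tar.gz")

name = p.name          # final path component
stem = p.stem          # name minus its LAST suffix only
suffix = p.suffix      # last suffix, with the leading period
suffixes = p.suffixes  # all suffixes, each with its leading period
```

Only the last suffix is stripped to form the stem, so `stem + suffix` always reproduces `name`.
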
- classmethod from_uri(uri, **storage_options)¶
- Return type:
Self
- classmethod cwd()¶
Return a new UPath object representing the current working directory.
- Return type:
Self
Info¶
None of the fsspec filesystems support a global current working directory, so this method only works for the base UPath class, returning the local current working directory.
- classmethod home()¶
Return a new UPath object representing the user’s home directory.
- Return type:
Self
Info¶
None of the fsspec filesystems support user home directories, so this method only works for the base UPath class, returning the local user’s home directory.
- classmethod from_auth()¶
Create an authenticated path object.
This method makes a request to LaminHub to obtain standard federated AWS credentials for the UPath object, compliant with universal_pathlib and fsspec.
Note: This only works for paths inside storage locations whose access is managed by LaminHub (Storage). For paths outside managed storage locations, local or environment credentials are used, following the standard UPath search strategy: AWS environment variables or AWS configuration files. Non-S3 paths are returned unchanged if they are already UPath objects.
Example
Create a path object from an S3 URI with federated AWS credentials:
upath = ln.UPath.from_auth("s3://managed-bucket/my-folder/")
- with_segments(*pathsegments)¶
Construct a new path object from any number of path-like objects.
- Return type:
Self
- with_name(name)¶
Return a new path with the file name changed.
- Return type:
Self
- joinpath(*pathsegments)¶
Combine this path with one or several arguments, and return a new path.
For one argument, this is equivalent to using the / operator.
- Return type:
Self
Examples¶
>>> from upath import UPath
>>> p = UPath("s3://my-bucket/path/to")
>>> p.joinpath("file.txt")
S3Path('s3://my-bucket/path/to/file.txt')
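The stated equivalence between `joinpath` with one argument and the `/` operator can be sketched with the standard library. `pathlib.PurePosixPath` is used here as a stand-in for `UPath` (an assumption), so no bucket or credentials are needed.

```python
from pathlib import PurePosixPath

# Stand-in for a cloud path such as s3://my-bucket/path/to (assumption).
base = PurePosixPath("/path/to")

joined = base.joinpath("file.txt")         # one argument
via_operator = base / "file.txt"           # same result with the / operator
nested = base.joinpath("sub", "file.txt")  # several segments at once
```
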
- iterdir()¶
Yield path objects of the directory contents.
The children are yielded in arbitrary order, and the special entries ‘.’ and ‘..’ are not included.
- Return type:
Iterator[Self]
- readlink()¶
- Return type:
Self
- copy(target, **kwargs)¶
Recursively copy this file or directory tree to the given destination.
- Return type:
TypeVar(_WT, bound=WritablePath) | UPath
- copy_into(target_dir, **kwargs)¶
Copy this file or directory tree into the given existing directory.
- Return type:
TypeVar(_WT, bound=WritablePath) | UPath
- move(target, **kwargs)¶
Recursively move this file or directory tree to the given destination.
- Return type:
TypeVar(_WT, bound=WritablePath) | UPath
- move_into(target_dir, **kwargs)¶
Move this file or directory tree into the given existing directory.
- Return type:
TypeVar(_WT, bound=WritablePath) | UPath
- symlink_to(target, target_is_directory=False)¶
- Return type:
None
- mkdir(mode=511, parents=False, exist_ok=False)¶
Create a new directory at this given path.
- Return type:
None
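The decimal defaults in these signatures are ordinary Unix permission bits: `mode=511` for `mkdir` is `0o777`, and `mode=438` for `touch` is `0o666`. The sketch below confirms the arithmetic and shows how `parents` and `exist_ok` interact, using a local `pathlib.Path` in a temporary directory as a stand-in; how a given object store treats directories and modes is not covered here.

```python
import tempfile
from pathlib import Path

# The signature defaults, rendered in the more familiar octal notation.
mkdir_mode = oct(511)
touch_mode = oct(438)

with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "a" / "b"

    # parents=True also creates the missing intermediate directory "a".
    nested.mkdir(parents=True)
    created = nested.is_dir()

    # exist_ok=True makes a repeated call a no-op instead of an error.
    nested.mkdir(parents=True, exist_ok=True)

    # With the default exist_ok=False, an existing directory raises.
    try:
        nested.mkdir()
        raised = False
    except FileExistsError:
        raised = True
```
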
- open(mode='r', buffering=_DefaultValue.UNSET, encoding=_DefaultValue.UNSET, errors=_DefaultValue.UNSET, newline=_DefaultValue.UNSET, **fsspec_kwargs)¶
Open the file pointed to by this path and return a file object, as the built-in open() function does.
- Return type:
IO[Any]
- stat(*, follow_symlinks=True)¶
Return the result of the stat() system call on this path, like os.stat() does.
- Return type:
StatResultType
- lstat()¶
- Return type:
StatResultType
- chmod(mode, *, follow_symlinks=True)¶
- Return type:
None
- exists(*, follow_symlinks=True)¶
Whether this path exists.
This method normally follows symlinks; to check whether a symlink exists, add the argument follow_symlinks=False.
- Return type:
bool
- is_dir(*, follow_symlinks=True)¶
Whether this path is a directory.
- Return type:
bool
- is_file(*, follow_symlinks=True)¶
Whether this path is a regular file (also True for symlinks pointing to regular files).
- Return type:
bool
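The three predicates `exists()`, `is_dir()`, and `is_file()` are typically used together. The following sketch runs them against a temporary local directory with `pathlib.Path` as a stand-in for `UPath` (an assumption); the file names are made up for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    data_file = folder / "data.txt"
    data_file.write_text("hello")
    missing = folder / "missing.txt"

    # Each tuple is (exists, is_dir, is_file).
    checks = {
        "folder": (folder.exists(), folder.is_dir(), folder.is_file()),
        "file": (data_file.exists(), data_file.is_dir(), data_file.is_file()),
        "missing": (missing.exists(), missing.is_dir(), missing.is_file()),
    }
```

A path that does not exist answers `False` to all three rather than raising.
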
- is_mount()¶
Check if this path is a mount point.
- Return type:
bool
Info¶
For fsspec filesystems this is always False.
- is_symlink()¶
Whether this path is a symbolic link.
- Return type:
bool
- is_junction()¶
Whether this path is a junction.
- Return type:
bool
Info¶
For fsspec filesystems this is always False.
- is_block_device()¶
Whether this path is a block device.
- Return type:
bool
Info¶
For fsspec filesystems this is always False.
- is_char_device()¶
Whether this path is a character device.
- Return type:
bool
Info¶
For fsspec filesystems this is always False.
- is_fifo()¶
Whether this path is a FIFO (named pipe).
- Return type:
bool
Info¶
For fsspec filesystems this is always False.
- is_socket()¶
Whether this path is a socket.
- Return type:
bool
Info¶
For fsspec filesystems this is always False.
- is_reserved()¶
Whether this path is reserved under Windows.
- Return type:
bool
Info¶
For fsspec filesystems this is always False.
- expanduser()¶
Return a new path with expanded ~ constructs.
- Return type:
Self
Info¶
For fsspec filesystems this is currently a no-op.
- glob(pattern, *, case_sensitive=None, recurse_symlinks=False)¶
Iterate over this subtree and yield all existing files (of any kind, including directories) matching the given relative pattern.
- Return type:
Iterator[Self]
- rglob(pattern, *, case_sensitive=None, recurse_symlinks=False)¶
Recursively yield all existing files (of any kind, including directories) matching the given relative pattern, anywhere in this subtree.
- Return type:
Iterator[Self]
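The difference between `glob` and `rglob` is whether the pattern is applied at this level only or anywhere in the subtree. A minimal sketch on a temporary local directory, with `pathlib.Path` standing in for `UPath` (an assumption) and made-up file names:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.txt").write_text("a")
    (root / "sub" / "b.txt").write_text("b")
    (root / "sub" / "c.csv").write_text("c")

    # glob: the pattern is relative to root, so only direct children match.
    top_level = sorted(p.relative_to(root).as_posix() for p in root.glob("*.txt"))

    # rglob: the same pattern is matched at every depth of the subtree.
    recursive = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.txt"))
```

Results are sorted here because the iteration order is not guaranteed.
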
- owner(*, follow_symlinks=True)¶
- Return type:
str
- group(*, follow_symlinks=True)¶
- Return type:
str
- absolute()¶
Return an absolute version of this path. No normalization or symlink resolution is performed.
Use resolve() to resolve symlinks and remove ‘..’ segments.
- Return type:
Self
- is_absolute()¶
True if the path is absolute (has both a root and, if applicable, a drive).
- Return type:
bool
- resolve(strict=False)¶
Make the path absolute, resolving all symlinks on the way and also normalizing it.
- Return type:
Self
- touch(mode=438, exist_ok=True)¶
Create this file with the given access mode, if it doesn’t exist.
- Return type:
None
- lchmod(mode)¶
- Return type:
None
- unlink(missing_ok=False)¶
Remove this file or link. If the path is a directory, use rmdir() instead.
- Return type:
None
- rmdir(recursive=True)¶
Remove this directory.
- Return type:
None
Warning¶
This method is non-standard compared to pathlib.Path.rmdir(), as it supports a recursive parameter to remove non-empty directories and defaults to recursive deletion.
This behavior is likely to change in future releases once .delete() is introduced.
- rename(target, *, recursive=_DefaultValue.UNSET, maxdepth=_DefaultValue.UNSET, **kwargs)¶
Move file, see fsspec.AbstractFileSystem.mv.
For example:
upath = UPath("s3://my-bucket/my-file")
upath.rename(UPath("s3://my-bucket/my-file-renamed"))
upath.rename("my-file-renamed")
- Return type:
Self
- replace(target)¶
Rename this path to the target path, overwriting if that path exists.
The target path may be absolute or relative. Relative paths are interpreted relative to the current working directory, not the directory of the Path object.
Returns the new Path instance pointing to the target path.
- Return type:
Self
Warning¶
This method is currently not implemented.
- as_uri()¶
Return the string representation of the path as a URI.
- Return type:
str
- as_posix()¶
Return the string representation of the path with POSIX-style separators.
- Return type:
str
- samefile(other_path)¶
- Return type:
bool
- relative_to(other, /, *_deprecated, walk_up=False)¶
Return the relative path to another path identified by the passed arguments. If the operation is not possible (because this is not related to the other path), raise ValueError.
The walk_up parameter controls whether .. may be used to resolve the path.
- Return type:
Self
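The success and failure cases of `relative_to` can be sketched with the standard library. `pathlib.PurePosixPath` stands in for `UPath` here (an assumption), and the paths are made up; `walk_up` is left out because the stdlib only accepts it from Python 3.12 on.

```python
from pathlib import PurePosixPath

p = PurePosixPath("/bucket/path/to/file.txt")

# The other path is an ancestor, so the remainder is returned.
rel = p.relative_to("/bucket/path")

# An unrelated path cannot be expressed without "..", so ValueError is raised.
try:
    p.relative_to("/other")
    unrelated_raises = False
except ValueError:
    unrelated_raises = True
```
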
- is_relative_to(other, /, *_deprecated)¶
Return True if the path is relative to the other path, False otherwise.
- Return type:
bool
- hardlink_to(target)¶
- Return type:
None
- full_match(pattern, *, case_sensitive=None)¶
Match this path against the provided glob-style pattern. Return True if matching is successful, False otherwise.
- Return type:
bool
- match(path_pattern, *, case_sensitive=None)¶
Match this path against the provided non-recursive glob-style pattern. Return True if matching is successful, False otherwise.
- Return type:
bool
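For `match`, a relative pattern is compared against the path from the right, which is why a pattern can match without covering the whole path. The sketch below uses `pathlib.PurePosixPath` as a stand-in for `UPath` (an assumption). It covers only `match`, since the standard library offers `full_match` only from Python 3.13 on.

```python
from pathlib import PurePosixPath

p = PurePosixPath("/a/b/c.py")

# A relative pattern is anchored at the end of the path.
by_name = p.match("*.py")
by_parent = p.match("b/*.py")

# "a" is not the direct parent of c.py, so this does not match.
wrong_parent = p.match("a/*.py")
```
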
- synchronize_to(destination, error_no_origin=True, print_progress=False, just_check=False, disable_boto3=False, **kwargs)¶
Sync to a local destination path.
- Return type:
bool
- upload_from(local_path, create_folder=None, print_progress=True, **kwargs)¶
Upload from the local path to self (a destination in the cloud).
If the local path is a directory, recursively upload its contents.
- Parameters:
local_path (str | Path | UPath) – A local path of a file or directory.
create_folder (bool | None, default: None) – Only applies if local_path is a directory and then defaults to True. If True, make a new folder in the destination using the directory name of local_path. If False, upload the contents of the directory to the root level of the destination.
print_progress (bool, default: True) – Print progress.
- Return type:
- Returns:
The destination path.
- to_url()¶
Public storage URL.
Generates a public URL for an object in an S3 bucket using fsspec’s UPath, considering the bucket’s region.
- Parameters:
upath (S3Path) – A UPath object representing an S3 path.
- Return type:
str
- Returns:
A string containing the public URL to the S3 object.
- download_to(local_path, print_progress=True, use_boto3=False, **kwargs)¶
Download from self (a destination in the cloud) to the local path.
- Parameters:
local_path (str | Path | UPath) – A local path to download to.
print_progress (bool, default: True) – Print progress.
use_boto3 (bool, default: False) – Use boto3 instead of s3fs to download a single file from S3. Ignored if the path is not a file or not in S3.
**kwargs – Additional arguments for the download.
- view_tree(*, level=2, only_dirs=False, n_max_files_per_dir_and_type=100, n_max_files=1000, include_paths=None, skip_suffixes=None)¶
Print a visual tree structure of files & directories.
- Parameters:
level (int, default: 2) – If 1, only iterate through one level; if 2, iterate through 2 levels; if -1, iterate through the entire hierarchy.
only_dirs (bool, default: False) – Only iterate through directories.
n_max_files (int, default: 1000) – Display limit. Will only show this many files. Doesn’t affect count.
include_paths (set[Any] | None, default: None) – Restrict to these paths.
skip_suffixes (list[str] | None, default: None) – Skip directories with these suffixes.
- Return type:
None
Example
View the file tree of a directory:
import lamindb as ln
dir_path = ln.examples.datasets.generate_cell_ranger_files(
    "sample_001", ln.settings.storage
)
ln.UPath(dir_path).view_tree()
#> 3 subdirectories, 15 files
#> sample_001
#> ├── web_summary.html
#> ├── metrics_summary.csv
#> ├── molecule_info.h5
#> ├── filtered_feature_bc_matrix
#> │   ├── features.tsv.gz
#> │   ├── barcodes.tsv.gz
#> │   └── matrix.mtx.gz
#> ├── analysis
#> │   └── analysis.csv
#> ├── raw_feature_bc_matrix
#> │   ├── features.tsv.gz
#> │   ├── barcodes.tsv.gz
#> │   └── matrix.mtx.gz
#> ├── possorted_genome_bam.bam.bai
#> ├── cloupe.cloupe
#> ├── possorted_genome_bam.bam
#> ├── filtered_feature_bc_matrix.h5
#> └── raw_feature_bc_matrix.h5
- joinuri(uri)¶
Join with urljoin behavior for UPath instances.
- Return type:
Examples¶
>>> from upath import UPath
>>> p = UPath("https://example.com/dir/subdir/")
>>> p.joinuri("file.txt")
HTTPSPath('https://example.com/dir/subdir/file.txt')
>>> p.joinuri("/anotherdir/otherfile.txt")
HTTPSPath('https://example.com/anotherdir/otherfile.txt')
>>> p.joinuri("memory:///foo/bar.txt")
MemoryPath('memory:///foo/bar.txt')
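Since `joinuri` is described as having urljoin behavior, the three cases above can be reproduced on plain strings with the standard library's `urllib.parse.urljoin`. This is a sketch of the stdlib function only, not of the `joinuri` implementation; the final case, a base without a trailing slash, shows stdlib behavior and is not confirmed for `joinuri`.

```python
from urllib.parse import urljoin

base = "https://example.com/dir/subdir/"

# A relative reference is resolved against the base directory.
relative = urljoin(base, "file.txt")

# A reference starting with "/" replaces the whole path of the base.
rooted = urljoin(base, "/anotherdir/otherfile.txt")

# A reference with a different scheme is returned unchanged.
other_scheme = urljoin(base, "memory:///foo/bar.txt")

# Without a trailing slash, the last segment of the base is replaced.
no_slash = urljoin("https://example.com/dir/subdir", "file.txt")
```
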
- write_bytes(data)¶
Open the file in bytes mode, write to it, and close the file.
- write_text(data, encoding=None, errors=None, newline=None)¶
Open the file in text mode, write to it, and close the file.
- read_bytes()¶
Open the file in bytes mode, read it, and close the file.
- read_text(encoding=None, errors=None, newline=None)¶
Open the file in text mode, read it, and close the file.
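The four read and write helpers form simple round trips. A minimal sketch on a temporary local file, with `pathlib.Path` standing in for `UPath` (an assumption) and a made-up file name:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "note.txt"

    # Text round trip; write_text returns the number of characters written.
    n_chars = target.write_text("hello\n", encoding="utf-8")
    text = target.read_text(encoding="utf-8")

    # Bytes round trip; write_bytes overwrites the file and returns the byte count.
    n_bytes = target.write_bytes(b"\x00\x01\x02")
    raw = target.read_bytes()
```

Each call opens and closes the file itself, so no explicit `open()` or context manager is needed.
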
- walk(top_down=True, on_error=None, follow_symlinks=False)¶
Walk the directory tree from this directory, similar to os.walk().
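Because `walk` is described as similar to `os.walk()`, the shape of the yielded triples can be sketched with `os.walk` itself. This uses a temporary local directory with made-up names and is not the `UPath.walk` implementation.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "top.txt").write_text("t")
    (root / "sub" / "leaf.txt").write_text("l")

    visited = []
    # Top-down (the default): a directory is yielded before its subdirectories.
    for dirpath, dirnames, filenames in os.walk(root):
        rel = Path(dirpath).relative_to(root).as_posix()
        visited.append((rel, sorted(dirnames), sorted(filenames)))
```
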
- with_stem(stem)¶
Return a new path with the stem changed.
- with_suffix(suffix)¶
Return a new path with the file suffix changed. If the path has no suffix, add given suffix. If the given suffix is an empty string, remove the suffix from the path.
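The three cases described for `with_suffix` (change, remove, add), together with `with_stem`, can be sketched as follows. `pathlib.PurePosixPath` stands in for `UPath` (an assumption), and the file names are made up for illustration.

```python
from pathlib import PurePosixPath

p = PurePosixPath("/data/report.txt")

changed = p.with_suffix(".csv")  # replace the existing suffix
removed = p.with_suffix("")      # an empty string removes the suffix
added = PurePosixPath("/data/report").with_suffix(".txt")  # no suffix yet: add one

restemmed = p.with_stem("summary")  # keep the suffix, change the stem
```

All of these return new path objects; the original `p` is left unchanged.
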