lamindb.Storage¶
- class lamindb.Storage(root: str, type: str, region: str | None)¶
Bases:
Record
,TracksRun
,TracksUpdates
Storage locations.
A storage location is either a directory/folder (local or in the cloud) or an entire S3/GCP bucket.
A LaminDB instance can manage and link multiple storage locations. But any storage location is managed by at most one LaminDB instance.
Managed vs. linked storage locations
The LaminDB instance can update & delete artifacts in managed storage locations but merely read artifacts in linked storage locations.
When you transfer artifacts from another instance, the default is to only copy metadata into the target instance, but merely link the data.
The
instance_uid
field indicates the managing LaminDB instance of a storage location.When you delete a LaminDB instance, you’ll be warned about data in managed storage locations while data in linked storage locations is ignored.
See also
storage
Default storage.
StorageSettings
Storage settings.
Examples
Configure the default storage location upon initiation of a LaminDB instance:
lamin init --storage ./mydata # or "s3://my-bucket" or "gs://my-bucket"
View the default storage location:
>>> ln.settings.storage PosixPath('/home/runner/work/lamindb/lamindb/docs/guide/mydata')
Dynamically change the default storage:
>>> ln.settings.storage = "./storage_2" # or a cloud bucket
Attributes¶
Simple fields¶
-
uid:
str
¶ Universal id, valid across DB instances.
-
root:
str
¶ Root path of storage. n s3 path. local path, etc. (required).
-
description:
str
|None
¶ A description of what the storage location is used for (optional).
-
type:
str
¶ Can be “local” vs. “s3” vs. “gs”.
-
region:
str
|None
¶ Cloud storage region, if applicable.
-
instance_uid:
str
|None
¶ Instance that manages this storage location.
-
created_at:
datetime
¶ Time of creation of record.
-
updated_at:
datetime
¶ Time of last update to record.
Relational fields¶
Class methods¶
- classmethod df(include=None, features=False, limit=100)¶
Convert to
pd.DataFrame
.By default, shows all direct fields, except
updated_at
.Use arguments
include
orfeature
to include other data.- Parameters:
include (
str
|list
[str
] |None
, default:None
) – Related fields to include as columns. Takes strings of form"ulabels__name"
,"cell_types__name"
, etc. or a list of such strings.features (
bool
|list
[str
], default:False
) – IfTrue
, map all features of theFeature
registry onto the resultingDataFrame
. Only available forArtifact
.limit (
int
, default:100
) – Maximum number of rows to display from a Pandas DataFrame. Defaults to 100 to reduce database load.
- Return type:
DataFrame
Examples
Include the name of the creator in the
DataFrame
:>>> ln.ULabel.df(include="created_by__name"])
Include display of features for
Artifact
:>>> df = ln.Artifact.df(features=True) >>> ln.view(df) # visualize with type annotations
Only include select features:
>>> df = ln.Artifact.df(features=["cell_type_by_expert", "cell_type_by_model"])
- classmethod filter(*queries, **expressions)¶
Query records.
- Parameters:
queries – One or multiple
Q
objects.expressions – Fields and values passed as Django query expressions.
- Return type:
QuerySet
- Returns:
A
QuerySet
.
See also
Guide: Query & search registries
Django documentation: Queries
Examples
>>> ln.ULabel(name="my label").save() >>> ln.ULabel.filter(name__startswith="my").df()
- classmethod get(idlike=None, **expressions)¶
Get a single record.
- Parameters:
idlike (
int
|str
|None
, default:None
) – Either a uid stub, uid or an integer id.expressions – Fields and values passed as Django query expressions.
- Return type:
- Returns:
A record.
- Raises:
lamindb.core.exceptions.DoesNotExist – In case no matching record is found.
See also
Guide: Query & search registries
Django documentation: Queries
Examples
>>> ulabel = ln.ULabel.get("FvtpPJLJ") >>> ulabel = ln.ULabel.get(name="my-label")
- classmethod lookup(field=None, return_field=None)¶
Return an auto-complete object for a field.
- Parameters:
field (
str
|DeferredAttribute
|None
, default:None
) – The field to look up the values for. Defaults to first string field.return_field (
str
|DeferredAttribute
|None
, default:None
) – The field to return. IfNone
, returns the whole record.
- Return type:
NamedTuple
- Returns:
A
NamedTuple
of lookup information of the field values with a dictionary converter.
See also
Examples
>>> import bionty as bt >>> bt.settings.organism = "human" >>> bt.Gene.from_source(symbol="ADGB-DT").save() >>> lookup = bt.Gene.lookup() >>> lookup.adgb_dt >>> lookup_dict = lookup.dict() >>> lookup_dict['ADGB-DT'] >>> lookup_by_ensembl_id = bt.Gene.lookup(field="ensembl_gene_id") >>> genes.ensg00000002745 >>> lookup_return_symbols = bt.Gene.lookup(field="ensembl_gene_id", return_field="symbol")
- classmethod search(string, *, field=None, limit=20, case_sensitive=False)¶
Search.
- Parameters:
string (
str
) – The input string to match against the field ontology values.field (
str
|DeferredAttribute
|None
, default:None
) – The field or fields to search. Search all string fields by default.limit (
int
|None
, default:20
) – Maximum amount of top results to return.case_sensitive (
bool
, default:False
) – Whether the match is case sensitive.
- Return type:
QuerySet
- Returns:
A sorted
DataFrame
of search results with a score in columnscore
. Ifreturn_queryset
isTrue
.QuerySet
.
Examples
>>> ulabels = ln.ULabel.from_values(["ULabel1", "ULabel2", "ULabel3"], field="name") >>> ln.save(ulabels) >>> ln.ULabel.search("ULabel2")
- classmethod using(instance)¶
Use a non-default LaminDB instance.
- Parameters:
instance (
str
|None
) – An instance identifier of form “account_handle/instance_name”.- Return type:
QuerySet
Examples
>>> ln.ULabel.using("account_handle/instance_name").search("ULabel7", field="name") uid score name ULabel7 g7Hk9b2v 100.0 ULabel5 t4Jm6s0q 75.0 ULabel6 r2Xw8p1z 75.0
Methods¶
- delete()¶
Delete.
- Return type:
None