lamindb.Storage

class lamindb.Storage(root: str, type: str, region: str | None)

Bases: Record, TracksRun, TracksUpdates

Storage locations.

A storage location is either a directory/folder (local or in the cloud) or an entire S3/GCP bucket.

A LaminDB instance can manage and link multiple storage locations, but any storage location is managed by at most one LaminDB instance.

Managed vs. linked storage locations

The LaminDB instance can update and delete artifacts in managed storage locations but can only read artifacts in linked storage locations.

When you transfer artifacts from another instance, the default is to copy only the metadata into the target instance and to link the data rather than copy it.

The instance_uid field indicates the managing LaminDB instance of a storage location.

When you delete a LaminDB instance, you’ll be warned about data in managed storage locations while data in linked storage locations is ignored.
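
The managed-versus-linked distinction can be sketched in plain Python. This snippet does not call the lamindb API: `StorageInfo`, `allowed_operations`, and the uid values are hypothetical stand-ins that only mirror how `instance_uid` identifies the managing instance.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class StorageInfo:
    """Hypothetical stand-in for a storage record (not the lamindb class)."""

    root: str
    instance_uid: Optional[str]  # uid of the managing instance, None if unmanaged


def allowed_operations(storage: StorageInfo, current_instance_uid: str) -> Set[str]:
    """Return the operations the current instance may perform on artifacts."""
    if storage.instance_uid == current_instance_uid:
        # Managed location: artifacts can be read, updated, and deleted.
        return {"read", "update", "delete"}
    # Linked location: artifacts can only be read.
    return {"read"}


# Made-up uids, for illustration only.
managed = StorageInfo(root="s3://my-bucket", instance_uid="instance-a")
linked = StorageInfo(root="s3://other-bucket", instance_uid="instance-b")

print(sorted(allowed_operations(managed, "instance-a")))  # ['delete', 'read', 'update']
print(sorted(allowed_operations(linked, "instance-a")))   # ['read']
```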

See also

storage

Default storage.

StorageSettings

Storage settings.

Examples

Configure the default storage location when initializing a LaminDB instance:

lamin init --storage ./mydata # or "s3://my-bucket" or "gs://my-bucket"

View the default storage location:

>>> ln.settings.storage
PosixPath('/home/runner/work/lamindb/lamindb/docs/guide/mydata')

Dynamically change the default storage:

>>> ln.settings.storage = "./storage_2" # or a cloud bucket

Attributes

property path: UPath

Bucket or folder path.

Cloud storage bucket:

>>> ln.Storage("s3://my-bucket").save()

Directory/folder in cloud storage:

>>> ln.Storage("s3://my-bucket/my-directory").save()

Local directory/folder:

>>> ln.Storage("./my-directory").save()

Simple fields

uid: str

Universal id, valid across DB instances.

root: str

Root path of storage: an S3 path, a local path, etc. (required).

description: str

A description of what the storage location is used for (optional).

type: str

One of “local”, “s3”, or “gs”.
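
As an illustration of how the three type values relate to the root strings used in the examples above, here is a minimal sketch that derives the type from the URL scheme. The helper `infer_storage_type` is hypothetical and not part of lamindb. It assumes POSIX-style local paths, and how lamindb itself sets this field may differ.

```python
from urllib.parse import urlparse


def infer_storage_type(root: str) -> str:
    """Hypothetical helper: map a storage root to "local", "s3", or "gs"."""
    scheme = urlparse(root).scheme
    if scheme in ("s3", "gs"):
        # Cloud roots carry their type in the URL scheme.
        return scheme
    if scheme in ("", "file"):
        # No scheme (or file://) is treated as a local path.
        return "local"
    raise ValueError(f"Unsupported storage root: {root!r}")


for root in ("s3://my-bucket", "gs://my-bucket", "./my-directory"):
    print(root, "->", infer_storage_type(root))
```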

region: str

Cloud storage region, if applicable.

instance_uid: str

Instance that manages this storage location.

created_at: datetime

Time of creation of record.

updated_at: datetime

Time of last update to record.

Relational fields

created_by: User

Creator of record.

run: Run

Last run that created or updated the record.

artifacts: Artifact

Artifacts contained in this storage location.

Class methods

classmethod df(include=None, join='inner', limit=100)

Convert to pd.DataFrame.

By default, shows all direct fields, except created_at.

If you’d like to include related fields, use parameter include.

Parameters:
  • include (str | list[str] | None, default: None) – Related fields to include as columns. Takes strings of form "labels__name", "cell_types__name", etc. or a list of such strings.

  • join (str, default: 'inner') – The join parameter of pandas.

Return type:

DataFrame

Examples

>>> labels = [ln.ULabel(name=f"Label {i}") for i in range(3)]
>>> ln.save(labels)
>>> ln.ULabel.filter().df(include=["created_by__name"])

classmethod filter(*queries, **expressions)

Query records.

Parameters:
  • queries – One or multiple Q objects.

  • expressions – Fields and values passed as Django query expressions.

Return type:

QuerySet

Returns:

A QuerySet.
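
Django-style query expressions are keyword arguments whose names combine a field path with an optional lookup, joined by double underscores. The sketch below illustrates only that naming convention. `LOOKUPS` is a small, incomplete subset of Django's lookups and `split_expression` is a hypothetical helper, not something lamindb or Django exposes.

```python
from typing import Tuple

# A small, non-exhaustive subset of Django field lookups.
LOOKUPS = {
    "exact", "iexact", "contains", "icontains",
    "startswith", "endswith", "gt", "gte", "lt", "lte", "in",
}


def split_expression(key: str) -> Tuple[str, str]:
    """Split an expression keyword into (field path, lookup)."""
    path, sep, last = key.rpartition("__")
    if sep and last in LOOKUPS:
        # Trailing segment is a lookup, e.g. name__startswith.
        return path, last
    # No lookup suffix: the whole key is a field path with an implicit "exact".
    return key, "exact"


print(split_expression("name"))              # ('name', 'exact')
print(split_expression("name__startswith"))  # ('name', 'startswith')
print(split_expression("created_by__name"))  # ('created_by__name', 'exact')
```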

Examples

>>> ln.ULabel(name="my ulabel").save()
>>> ulabel = ln.ULabel.get(name="my ulabel")

classmethod get(idlike=None, **expressions)

Get a single record.

Parameters:
  • idlike (int | str | None, default: None) – Either a uid stub, uid or an integer id.

  • expressions – Fields and values passed as Django query expressions.

Return type:

Record

Returns:

A record.

Raises:

lamindb.core.exceptions.DoesNotExist – In case no matching record is found.

Examples

>>> ulabel = ln.ULabel.get("2riu039")
>>> ulabel = ln.ULabel.get(name="my-label")

classmethod lookup(field=None, return_field=None)

Return an auto-complete object for a field.

Parameters:
  • field (str | DeferredAttribute | None, default: None) – The field to look up the values for. Defaults to first string field.

  • return_field (str | DeferredAttribute | None, default: None) – The field to return. If None, returns the whole record.

Return type:

NamedTuple

Returns:

A NamedTuple of lookup information of the field values with a dictionary converter.
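
To show what an auto-complete object with a dictionary converter can look like, here is a minimal sketch built on `collections.namedtuple`. It is not lamindb's implementation: `to_attribute_name` and `build_lookup` are hypothetical helpers, and the normalization rule (lowercase, non-word characters replaced by underscores) is an assumption that happens to map "ADGB-DT" to `adgb_dt`. The sketch also assumes the normalized names are unique.

```python
import keyword
import re
from collections import namedtuple
from typing import Sequence


def to_attribute_name(value: str) -> str:
    """Hypothetical normalization of a field value into a valid attribute name."""
    name = re.sub(r"\W+", "_", value).strip("_").lower()
    if not name.isidentifier() or keyword.iskeyword(name):
        # Prefix names that are empty, start with a digit, or are keywords.
        name = f"x_{name}"
    return name


def build_lookup(values: Sequence[str]):
    """Return a namedtuple whose attributes auto-complete to the given values."""
    Lookup = namedtuple("Lookup", [to_attribute_name(v) for v in values])
    return Lookup(*values)


values = ["ADGB-DT", "TP53"]
lookup = build_lookup(values)
print(lookup.adgb_dt)  # ADGB-DT

# Dictionary form keyed by the original values, for keys that are awkward as attributes.
lookup_dict = dict(zip(values, lookup))
print(lookup_dict["ADGB-DT"])  # ADGB-DT
```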

See also

search()

Examples

>>> import bionty as bt
>>> bt.settings.organism = "human"
>>> bt.Gene.from_source(symbol="ADGB-DT").save()
>>> lookup = bt.Gene.lookup()
>>> lookup.adgb_dt
>>> lookup_dict = lookup.dict()
>>> lookup_dict['ADGB-DT']
>>> lookup_by_ensembl_id = bt.Gene.lookup(field="ensembl_gene_id")
>>> lookup_by_ensembl_id.ensg00000002745
>>> lookup_return_symbols = bt.Gene.lookup(field="ensembl_gene_id", return_field="symbol")

classmethod search(string, *, field=None, limit=20, case_sensitive=False)

Search.

Parameters:
  • string (str) – The input string to match against the field ontology values.

  • field (str | DeferredAttribute | None, default: None) – The field or fields to search. Search all string fields by default.

  • limit (int | None, default: 20) – Maximum number of top results to return.

  • case_sensitive (bool, default: False) – Whether the match is case sensitive.

Return type:

QuerySet

Returns:

A sorted DataFrame of search results with a score in column score. If return_queryset is True, a QuerySet.

See also

filter() lookup()

Examples

>>> ulabels = ln.ULabel.from_values(["ULabel1", "ULabel2", "ULabel3"], field="name")
>>> ln.save(ulabels)
>>> ln.ULabel.search("ULabel2")

classmethod using(instance)

Use a non-default LaminDB instance.

Parameters:

instance (str | None) – An instance identifier of form “account_handle/instance_name”.

Return type:

QuerySet

Examples

>>> ln.ULabel.using("account_handle/instance_name").search("ULabel7", field="name")
            uid    score
name
ULabel7  g7Hk9b2v  100.0
ULabel5  t4Jm6s0q   75.0
ULabel6  r2Xw8p1z   75.0

Methods

delete()

Delete.

Return type:

None

save(*args, **kwargs)

Save.

Always saves to the default database.

Return type:

Record