Transfer data

This guide shows how to transfer data from a source database into the currently connected database.

# pip install lamindb
!lamin init --storage ./test-transfer --modules bionty
! using anonymous user (to identify, call: lamin login)
 initialized lamindb: anonymous/test-transfer
import lamindb as ln

ln.track("ITeOtm7bhtdq")
 connected lamindb: anonymous/test-transfer
 created Transform('ITeOtm7bhtdq0000', key='transfer.ipynb'), started new Run('r5jncvtO9tJOGJEH') at 2025-10-30 07:57:20 UTC
 notebook imports: lamindb==1.15a1

Query all artifacts in the laminlabs/lamindata instance and filter them to their latest versions.

# query all latest artifact versions
artifacts = ln.Artifact.using("laminlabs/lamindata").filter(is_latest=True)

# convert the QuerySet to a DataFrame and show the first 5 rows
artifacts.to_dataframe().head()
uid key description suffix kind otype size hash n_files n_observations version is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
12288 W1AiST5wLrbNEyVq0001 schmidt22/analyzed_data.h5ad None .h5ad dataset AnnData 18324333 n-f42E4wmAuD8DftgWCr0Q NaN 853.0 None True False 2025-10-26 16:10:22.536426+00:00 1 1 2 1274.0 151.0 9
12279 UaXXF8QbvSiMd1Hk0000 schmidt22_perturbseq/schmidt22_perturbseq_ense... Schmidt22 PerturbSeq data with Ensembl gene ID... .h5ad dataset AnnData 18315384 5JzJS1l2UOYj3dK1nEtabA NaN 853.0 None True False 2025-10-26 15:39:53.569692+00:00 1 1 2 1271.0 151.0 9
12277 Ywz5JiVNHOWSJDiK0001 schmidt22/gws-crispr-ifng-hits.parquet Hits from genome-wide CRISPRa IFNG screen in T... .parquet dataset DataFrame 215127 KjMYGx-H6DEIgPaYv-2XqQ NaN 123.0 None True False 2025-10-26 15:28:21.292185+00:00 1 1 2 1269.0 149.0 9
12266 lXmgHRUFufX439eI0001 schmidt22/gws-crispr-ifng-readout.parquet Genome-wide CRISPRa screen with IFN-gamma read... .parquet dataset DataFrame 1354322 dp0CPb638U6Iz3Ye-gMbog NaN 18930.0 None True False 2025-10-26 14:41:19.222297+00:00 1 1 2 1261.0 163.0 9
12244 aX5PQG3ERGe78q940000 sample.fasta None .fasta None None 11 83rEPcAoBHmYiIuyBYrFKg NaN NaN None True False 2025-10-24 07:01:54.231544+00:00 1 1 2 1247.0 NaN 9

You can further subset or search the QuerySet. Here we filter for artifacts whose description contains “Tabula Sapiens”.

artifact = artifacts.filter(description__contains="Tabula Sapiens").first()
artifact.describe()
Artifact: tabula_sapiens_lung.h5ad (6Wph)
|   description: Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.
├── uid: dPraor9rU1EofcFb6Wph            run: IeeCQQO (ux-session-tb-lung)
hash: 8mB1KK2wd51F6HQdvqipcQ         size: 3.6 GB                     
branch: main                         space: all                       
created_at: 2023-07-14 19:00:30 UTC  created_by: Koncopd              
├── storage/path: s3://lamindata/tabula_sapiens_lung.h5ad
└── Labels
    └── .ulabels                        ULabel                             TSP1, TSP2, TSP14                       
        .tissues                        bionty.Tissue                      lung                                    
        .cell_types                     bionty.CellType                    type I pneumocyte, adventitial cell, ba…
        .experimental_factors           bionty.ExperimentalFactor          anoxya, stroke                          

By saving the artifact record that’s currently attached to the source database instance, you transfer it to the default database instance.

artifact.save()
 transferred: Artifact(uid='dPraor9rU1EofcFb6Wph'), Storage(uid='D9BilDV2')
Artifact(uid='dPraor9rU1EofcFb6Wph', version=None, is_latest=True, key='tabula_sapiens_lung.h5ad', description='Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.', suffix='.h5ad', kind=None, otype=None, size=3899435772, hash='8mB1KK2wd51F6HQdvqipcQ', n_files=None, n_observations=None, branch_id=1, space_id=1, storage_id=2, run_id=2, schema_id=None, created_by_id=1, created_at=2023-07-14 19:00:30 UTC, is_locked=False)
How do I know whether a record is saved in the default database instance?

Every record has an attribute ._state.db which can take the following values:

  • None: the record has not yet been saved to any database

  • "default": the record is saved on the default database instance

  • "account/name": the record is saved on a non-default database instance referenced by account/name (e.g., laminlabs/lamindata)
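
The three cases above can be folded into a small helper. This is a minimal sketch: `describe_db_state` is a hypothetical name, not part of lamindb, and it assumes only the three values listed above.

```python
from typing import Optional


def describe_db_state(db: Optional[str]) -> str:
    """Translate a `._state.db` value into a human-readable description.

    Hypothetical helper for illustration; not part of lamindb.
    """
    if db is None:
        return "not yet saved to any database"
    if db == "default":
        return "saved on the default database instance"
    # any other string is taken to be an "account/name" instance slug
    return f"saved on the non-default database instance {db}"


# the three documented cases
print(describe_db_state(None))
print(describe_db_state("default"))
print(describe_db_state("laminlabs/lamindata"))
```

With a record at hand, the call would look like `describe_db_state(artifact._state.db)`.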

The artifact record has been transferred to the current database without feature & label annotations, but with updated data lineage.

artifact.describe()
Artifact: tabula_sapiens_lung.h5ad (6Wph)
|   description: Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.
├── uid: dPraor9rU1EofcFb6Wph            run: rIRguIJ (__lamindb_transfer__/4XIuR0tvaiXM)
hash: 8mB1KK2wd51F6HQdvqipcQ         size: 3.6 GB                                    
branch: main                         space: all                                      
created_at: 2023-07-14 19:00:30 UTC  created_by: anonymous                           
└── storage/path: s3://lamindata/tabula_sapiens_lung.h5ad

The data itself remains in its original storage location. That location has been added to the current instance’s storage records as a read-only location, which you can tell from the fact that its instance_uid doesn’t match the current instance.

ln.Storage.to_dataframe()
uid root description type region instance_uid is_locked created_at branch_id space_id created_by_id run_id
id
2 D9BilDV2 s3://lamindata None s3 us-east-1 4XIuR0tvaiXM False 2023-04-22 05:50:06.537267+00:00 1 1 1 2.0
1 6BU32n6siH16 /home/runner/work/lamindb/lamindb/docs/test-tr... None local None 1FHu5eE0uxm4 False 2025-10-30 07:57:16.922000+00:00 1 1 1 NaN
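
The instance_uid comparison can be written out as a small check. This is a sketch in plain Python, not a lamindb API. The rows are copied from the Storage table, `is_read_only` is a hypothetical helper, and the current instance uid is assumed to be the one shown on the local storage row.

```python
# Assumption: the current instance's uid is the one on the local storage row.
CURRENT_INSTANCE_UID = "1FHu5eE0uxm4"

# (uid, root, instance_uid), copied from the Storage table; the local root is
# truncated in the displayed output and is kept truncated here
storages = [
    ("D9BilDV2", "s3://lamindata", "4XIuR0tvaiXM"),
    ("6BU32n6siH16", "/home/runner/work/lamindb/lamindb/docs/test-tr...", "1FHu5eE0uxm4"),
]


def is_read_only(storage_instance_uid: str, current_instance_uid: str) -> bool:
    """A storage location whose instance_uid differs from the current instance is read-only."""
    return storage_instance_uid != current_instance_uid


labels = {
    uid: "read-only" if is_read_only(instance_uid, CURRENT_INSTANCE_UID) else "managed by this instance"
    for uid, _root, instance_uid in storages
}

for uid, label in labels.items():
    print(f"{uid}: {label}")
```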

See the state of the database.

ln.view()
****************
* module: core *
****************
Artifact
uid key description suffix kind otype size hash n_files n_observations version is_latest is_locked created_at branch_id space_id storage_id run_id schema_id created_by_id
id
1 dPraor9rU1EofcFb6Wph tabula_sapiens_lung.h5ad Part of Tabula Sapiens, a benchmark, first-dra... .h5ad None None 3899435772 8mB1KK2wd51F6HQdvqipcQ None None None True False 2023-07-14 19:00:30.621330+00:00 1 1 2 2 None 1
Run
uid name started_at finished_at params reference reference_type is_locked created_at branch_id space_id transform_id report_id _logfile_id environment_id created_by_id initiated_by_run_id
id
2 rIRguIJJUlrKHFYh None 2025-10-30 07:57:23.388000+00:00 None None None None False 2025-10-30 07:57:23.388000+00:00 1 1 2 None None None 1 1.0
1 r5jncvtO9tJOGJEH None 2025-10-30 07:57:20.219350+00:00 None None None None False 2025-10-30 07:57:20.220000+00:00 1 1 1 None None None 1 NaN
Storage
uid root description type region instance_uid is_locked created_at branch_id space_id created_by_id run_id
id
2 D9BilDV2 s3://lamindata None s3 us-east-1 4XIuR0tvaiXM False 2023-04-22 05:50:06.537267+00:00 1 1 1 2.0
1 6BU32n6siH16 /home/runner/work/lamindb/lamindb/docs/test-tr... None local None 1FHu5eE0uxm4 False 2025-10-30 07:57:16.922000+00:00 1 1 1 NaN
Transform
uid key description type source_code hash reference reference_type version is_latest is_locked created_at branch_id space_id created_by_id _template_id
id
2 4XIuR0tvaiXM0000 __lamindb_transfer__/4XIuR0tvaiXM Transfer from `laminlabs/lamindata` function None None None None None True False 2025-10-30 07:57:23.384000+00:00 1 1 1 None
1 ITeOtm7bhtdq0000 transfer.ipynb Transfer data notebook None None None None None True False 2025-10-30 07:57:20.214000+00:00 1 1 1 None
******************
* module: bionty *
******************
Source
uid entity organism name in_db currently_used description url md5 source_website version is_locked created_at branch_id space_id created_by_id run_id dataframe_artifact_id
id
33 5JnVODh4 BioSample all ncbi False True NCBI BioSample attributes s3://bionty-assets/df_all__ncbi__2023-09__BioS... None https://www.ncbi.nlm.nih.gov/biosample/docs/at... 2023-09 False 2025-10-30 07:57:17.020000+00:00 1 1 1 None None
32 MJRqduf9 bionty.Ethnicity human hancestro False True Human Ancestry Ontology http://purl.obolibrary.org/obo/hancestro/relea... None https://github.com/EBISPOT/hancestro 3.0 False 2025-10-30 07:57:17.020000+00:00 1 1 1 None None
31 10va5JSt bionty.DevelopmentalStage mouse mmusdv False True Mouse Developmental Stages https://github.com/obophenotype/developmental-... None https://github.com/obophenotype/developmental-... 2024-05-28 False 2025-10-30 07:57:17.020000+00:00 1 1 1 None None
30 1GbFkOdz bionty.DevelopmentalStage human hsapdv False True Human Developmental Stages https://github.com/obophenotype/developmental-... None https://github.com/obophenotype/developmental-... 2024-05-28 False 2025-10-30 07:57:17.020000+00:00 1 1 1 None None
29 1atB0WnU Drug all chebi False False Chemical Entities of Biological Interest s3://bionty-assets/df_all__chebi__2024-07-27__... None https://www.ebi.ac.uk/chebi/ 2024-07-27 False 2025-10-30 07:57:17.020000+00:00 1 1 1 None None
28 ugaIoIlj Drug all dron False True Drug Ontology http://purl.obolibrary.org/obo/dron/releases/2... None https://bioportal.bioontology.org/ontologies/DRON 2024-08-05 False 2025-10-30 07:57:17.020000+00:00 1 1 1 None None
27 3rm9aOzL BFXPipeline all lamin False True Bioinformatics Pipeline s3://bionty-assets/df_all__lamin__1.0.0__BFXpi... None https://lamin.ai 1.0.0 False 2025-10-30 07:57:17.020000+00:00 1 1 1 None None

View lineage:

artifact.view_lineage()
! calling anonymously, will miss private instances
[Image: data lineage graph rendered by artifact.view_lineage()]

The transferred dataset is linked to a special type of transform that stores the slug and uid of the source instance:

artifact.transform.description
'Transfer from `laminlabs/lamindata`'

The transform key has the form f"__lamindb_transfer__/{source_instance.uid}":

artifact.transform.key
'__lamindb_transfer__/4XIuR0tvaiXM'
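
Because the key follows a fixed pattern, the source instance uid can be recovered from it with plain string handling. This is a minimal sketch: `transfer_key` and `source_uid_from_key` are hypothetical helpers, not lamindb functions, and they rely only on the key format stated above.

```python
TRANSFER_KEY_PREFIX = "__lamindb_transfer__/"


def transfer_key(source_instance_uid: str) -> str:
    """Build a transfer transform key from a source instance uid."""
    return f"{TRANSFER_KEY_PREFIX}{source_instance_uid}"


def source_uid_from_key(key: str) -> str:
    """Extract the source instance uid from a transfer transform key."""
    if not key.startswith(TRANSFER_KEY_PREFIX):
        raise ValueError(f"not a transfer transform key: {key!r}")
    return key[len(TRANSFER_KEY_PREFIX):]


key = transfer_key("4XIuR0tvaiXM")
print(key)
print(source_uid_from_key(key))
```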

The current notebook run is linked as the initiated_by_run of the “transfer run”:

artifact.run.initiated_by_run.transform
Transform(uid='ITeOtm7bhtdq0000', version=None, is_latest=True, key='transfer.ipynb', description='Transfer data', type='notebook', hash=None, reference=None, reference_type=None, branch_id=1, space_id=1, created_by_id=1, created_at=2025-10-30 07:57:20 UTC, is_locked=False)

When you re-transfer a record, lamindb recognizes that it already exists in the target database and simply maps it to the existing record.

artifact = artifacts.filter(description__contains="Tabula Sapiens").first()
artifact.save()
 mapped: Artifact(uid='dPraor9rU1EofcFb6Wph')
Artifact(uid='dPraor9rU1EofcFb6Wph', version=None, is_latest=True, key='tabula_sapiens_lung.h5ad', description='Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.', suffix='.h5ad', kind=None, otype=None, size=3899435772, hash='8mB1KK2wd51F6HQdvqipcQ', n_files=None, n_observations=None, branch_id=1, space_id=1, storage_id=2, run_id=2, schema_id=None, created_by_id=1, created_at=2023-07-14 19:00:30 UTC, is_locked=False)
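
The difference between "transferred" and "mapped" can be pictured with a toy model. This is not lamindb's implementation: the target database is a plain dict, `save_to_target` is a hypothetical function, and matching records by uid is an assumption suggested by the log output.

```python
def save_to_target(target: dict, record: dict) -> str:
    """Toy model: copy the record on first save, map to the existing one afterwards."""
    uid = record["uid"]
    if uid in target:
        # the record already exists in the target, nothing is copied
        return "mapped"
    target[uid] = dict(record)
    return "transferred"


target_db = {}
record = {"uid": "dPraor9rU1EofcFb6Wph", "key": "tabula_sapiens_lung.h5ad"}

first = save_to_target(target_db, record)
second = save_to_target(target_db, record)
print(first, second)
```

The second call leaves the target unchanged, which mirrors the `mapped:` log line in the output.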

If you also want to transfer the artifact’s annotations, pass transfer="annotations" to save(). Note that this may populate your target database with metadata that doesn’t match the conventions you want to enforce.

artifact = artifacts.filter(description__contains="Tabula Sapiens").first()
artifact.save(transfer="annotations")
 mapped: Artifact(uid='dPraor9rU1EofcFb6Wph'), Tissue(uid='7Tt4iEKc'), CellType(uid='5tiBvp96'), CellType(uid='7Crr32HI'), CellType(uid='6dzoXJ3Y'), CellType(uid='01NqvhnI'), CellType(uid='5NceZTYm'), CellType(uid='4PSMdO3I'), CellType(uid='3JO0EdVd'), CellType(uid='6rfrjhvo'), CellType(uid='37mWPv6o'), CellType(uid='5Z76sCep'), CellType(uid='2OWUH6Z1'), CellType(uid='5TU8SFt5'), CellType(uid='ryEtgi1y'), CellType(uid='1lMgAPE8'), CellType(uid='7m6Ruz32'), CellType(uid='42qbvc90'), CellType(uid='puGNwNrs'), CellType(uid='1T8bGe2I'), CellType(uid='6IC9NGJE'), CellType(uid='6ujMwy7s'), CellType(uid='3eecYgWR'), CellType(uid='zQ4dyjEs'), CellType(uid='7mNqzyFE'), CellType(uid='5A9EFjNB'), CellType(uid='3lsrLTv6'), CellType(uid='1HYtHpIc'), CellType(uid='6UmKFrzn'), CellType(uid='7eZArDpo'), CellType(uid='2KCFdGIk'), CellType(uid='1V5wVqK5'), CellType(uid='5i19XYug'), CellType(uid='2nPA0h4F'), CellType(uid='5Xi2OLvZ'), CellType(uid='3kaL3W1c'), ExperimentalFactor(uid='5YDCOg0V'), ExperimentalFactor(uid='7R1OhRJ7')
 transferred: CellType(uid='4mZaXZQg'), CellType(uid='5rVn0X39'), CellType(uid='EWy46Sey'), CellType(uid='4yqLzwwm'), ULabel(uid='vfLXaHgD'), ULabel(uid='ZaVLDCZE'), ULabel(uid='gk6w8qC5'), ULabel(uid='tZCTk48f')
Artifact(uid='dPraor9rU1EofcFb6Wph', version=None, is_latest=True, key='tabula_sapiens_lung.h5ad', description='Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.', suffix='.h5ad', kind=None, otype=None, size=3899435772, hash='8mB1KK2wd51F6HQdvqipcQ', n_files=None, n_observations=None, branch_id=1, space_id=1, storage_id=2, run_id=2, schema_id=None, created_by_id=1, created_at=2023-07-14 19:00:30 UTC, is_locked=False)
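
To review what a transfer added to the target database, you can tally the record types named in a log line. This is a sketch using only the standard library: `count_record_types` is a hypothetical helper, and the regular expression assumes the `Type(uid='...')` format shown in the output.

```python
import re
from collections import Counter

# the "transferred:" line from the output, copied verbatim
log_line = (
    "transferred: CellType(uid='4mZaXZQg'), CellType(uid='5rVn0X39'), "
    "CellType(uid='EWy46Sey'), CellType(uid='4yqLzwwm'), ULabel(uid='vfLXaHgD'), "
    "ULabel(uid='ZaVLDCZE'), ULabel(uid='gk6w8qC5'), ULabel(uid='tZCTk48f')"
)


def count_record_types(line: str) -> Counter:
    """Count how many records of each type appear in a transfer log line."""
    return Counter(re.findall(r"(\w+)\(uid='[^']*'\)", line))


counts = count_record_types(log_line)
for record_type, n in counts.items():
    print(f"{record_type}: {n}")
```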

The artifact is now annotated.

artifact.describe()
Artifact: tabula_sapiens_lung.h5ad (6Wph)
|   description: Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.
├── uid: dPraor9rU1EofcFb6Wph            run: rIRguIJ (__lamindb_transfer__/4XIuR0tvaiXM)
hash: 8mB1KK2wd51F6HQdvqipcQ         size: 3.6 GB                                    
branch: main                         space: all                                      
created_at: 2023-07-14 19:00:30 UTC  created_by: anonymous                           
├── storage/path: s3://lamindata/tabula_sapiens_lung.h5ad
└── Labels
    └── .tissues                        bionty.Tissue                      lung                                    
        .cell_types                     bionty.CellType                    pulmonary alveolar type 1 cell, adventi…
        .experimental_factors           bionty.ExperimentalFactor          anoxya, stroke                          
        .ulabels                        ULabel                             TSP1, TSP2, TSP14                       
# test the last 3 cells here
assert artifact.transform.description == "Transfer from `laminlabs/lamindata`"
assert artifact.transform.key == "__lamindb_transfer__/4XIuR0tvaiXM"
assert artifact.transform.uid == "4XIuR0tvaiXM0000"
assert artifact.run.initiated_by_run.transform.description == "Transfer data"