Transfer data

This guide shows how to transfer data from a source database into the currently connected database.

# pip install 'lamindb[jupyter,bionty]'
!lamin init --storage ./test-transfer --modules bionty
! using anonymous user (to identify, call: lamin login)
 initialized lamindb: anonymous/test-transfer
import lamindb as ln

ln.track("ITeOtm7bhtdq")
 connected lamindb: anonymous/test-transfer
 created Transform('ITeOtm7bhtdq0000'), started new Run('yo80sGzI...') at 2025-07-14 06:40:07 UTC
 notebook imports: lamindb==1.8.0

Query all artifacts in the laminlabs/lamindata instance and filter them to their latest versions.

# query all latest artifact versions
artifacts = ln.Artifact.using("laminlabs/lamindata").filter(is_latest=True)

# convert the QuerySet to a DataFrame and show the first 5 rows
artifacts.df().head()
uid key description suffix kind otype size hash n_files n_observations _hash_type _key_is_virtual _overwrite_versions space_id storage_id schema_id version is_latest run_id created_at created_by_id _aux branch_id
id
1282 WQtsc0CQZKB9GEst0000 None Example R cars dataset .parquet dataset DataFrame 2402.0 eIk8NXNiwMoGmhhjrMILbg NaN NaN md5 True False 1 2 NaN None True 460.0 2025-01-15 14:22:51.192955+00:00 30 None 1
1349 9KD0HE9lVveLpvuI0000 data/prep_adata None .h5ad None AnnData 124511524.0 gnwU_GFFN_xIhtncrxu-tv NaN NaN sha1-fl True False 1 2 NaN None True NaN 2025-03-03 23:24:56.184549+00:00 35 None 1
1451 cGi8QjXNQQfZzL4n0000 simple-lineage/figures/pca_all.pdf None .pdf None None 4707.0 QexvSEBGMa80m0pV5KXd4w NaN NaN md5 True False 1 2 NaN None True 569.0 2025-04-01 11:33:47.714024+00:00 9 None 1
1699 2qBNr2ICBnMS8JSC0000 mini_text_files/file32.txt None .txt None None 2.0 Y2TT8PSVtqudz407XG4LAQ NaN NaN md5 False False 1 2 NaN None True 669.0 2025-05-05 14:15:55.974243+00:00 9 None 1
1742 FoaS7BF8AZpt0Va80000 mini_text_files/file64.txt None .txt None None 2.0 6l0vHEYIIy4H06o9mY5RNQ NaN NaN md5 False False 1 2 NaN None True 669.0 2025-05-05 14:16:01.479023+00:00 9 None 1

You can now further subset or search the QuerySet. Here we filter for artifacts whose description contains “Tabula Sapiens” and take the first match.

artifact = artifacts.filter(description__contains="Tabula Sapiens").first()
artifact.describe()
Artifact .h5ad
├── General
│   ├── uid: dPraor9rU1EofcFb6Wph          hash: 8mB1KK2wd51F6HQdvqipcQ
│   ├── size: 3.6 GB                       space: all
│   ├── branch: main                       created_at: 2023-07-14 19:00:30
│   ├── created_by: Koncopd (Sergei Rybakov)
│   ├── key: tabula_sapiens_lung.h5ad
│   ├── storage location / path: s3://lamindata/tabula_sapiens_lung.h5ad
│   ├── description: Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.
│   └── transform: ux-session-tb-lung
└── Labels
    └── .tissues                        bionty.Tissue                      lung                                    
        .cell_types                     bionty.CellType                    CD4-positive, alpha-beta T cell, CD8-po…
        .experimental_factors           bionty.ExperimentalFactor          anoxya, stroke                          
        .ulabels                        ULabel                             TSP1, TSP2, TSP14                       

By saving the artifact record that’s currently attached to the source database instance, you transfer it to the default database instance.

artifact.save()
 transferred: Artifact(uid='dPraor9rU1EofcFb6Wph'), Storage(uid='D9BilDV2')
Artifact(uid='dPraor9rU1EofcFb6Wph', is_latest=True, key='tabula_sapiens_lung.h5ad', description='Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.', suffix='.h5ad', size=3899435772, hash='8mB1KK2wd51F6HQdvqipcQ', branch_id=1, space_id=1, storage_id=2, run_id=2, created_by_id=1, created_at=2023-07-14 19:00:30 UTC)
How do I know if a record is saved in the default database instance or not?

Every record has an attribute ._state.db which can take the following values:

  • None: the record has not yet been saved to any database

  • "default": the record is saved on the default database instance

  • "account/name": the record is saved on a non-default database instance referenced by account/name (e.g., laminlabs/lamindata)
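
The three cases above can be summarized in a small helper. This is only a sketch: `describe_db_state` is a hypothetical name and not part of lamindb; it merely interprets a value you would read from `record._state.db`.

```python
from typing import Optional


def describe_db_state(db: Optional[str]) -> str:
    """Interpret a value read from `record._state.db` (hypothetical helper)."""
    if db is None:
        return "not yet saved to any database"
    if db == "default":
        return "saved on the default database instance"
    # any other value is expected to be an "account/name" slug
    return f"saved on the non-default database instance {db}"


print(describe_db_state(None))
print(describe_db_state("default"))
print(describe_db_state("laminlabs/lamindata"))
```

In a session, you would pass `artifact._state.db` to the helper before and after calling `artifact.save()`.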

The artifact record has been transferred to the current database without its feature and label annotations, but with updated data lineage.

artifact.describe()
Artifact .h5ad
└── General
    ├── uid: dPraor9rU1EofcFb6Wph          hash: 8mB1KK2wd51F6HQdvqipcQ
    ├── size: 3.6 GB                       space: all
    ├── branch: main                       created_at: 2023-07-14 19:00:30
    ├── created_by: anonymous
    ├── key: tabula_sapiens_lung.h5ad
    ├── storage location / path: s3://lamindata/tabula_sapiens_lung.h5ad
    ├── description: Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.
    └── transform: __lamindb_transfer__/4XIuR0tvaiXM

The data itself remained in its original storage location, which has been registered in the current instance as a read-only storage location (indicated by the fact that its instance_uid doesn’t match that of the current instance).

ln.Storage.df()
uid root description type region instance_uid space_id run_id created_at created_by_id _aux branch_id
id
1 XBeWU7nck6Vq /home/runner/work/lamindb/lamindb/docs/test-tr... None local None 1FHu5eE0uxm4 1 NaN 2025-07-14 06:40:03.979000+00:00 1 None 1
2 D9BilDV2 s3://lamindata None s3 us-east-1 4XIuR0tvaiXM 1 2.0 2023-04-22 05:50:06.537267+00:00 1 None 1
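
The instance_uid comparison can be spelled out on plain data. The sketch below copies the two rows from the table above into dictionaries; `read_only_storage_uids` is a hypothetical helper, and the current instance's uid is assumed to be the one shown on the local storage row.

```python
def read_only_storage_uids(storages, current_instance_uid):
    """Return uids of storage locations whose instance_uid differs from the current instance."""
    return [s["uid"] for s in storages if s["instance_uid"] != current_instance_uid]


# values copied from the Storage table above
storages = [
    {"uid": "XBeWU7nck6Vq", "root": "local", "instance_uid": "1FHu5eE0uxm4"},
    {"uid": "D9BilDV2", "root": "s3://lamindata", "instance_uid": "4XIuR0tvaiXM"},
]

# assumption: the local storage row belongs to the current instance
current_instance_uid = "1FHu5eE0uxm4"

print(read_only_storage_uids(storages, current_instance_uid))
```

Only the `s3://lamindata` location is flagged, because its instance_uid is that of the source instance.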

See the state of the database.

ln.view()
****************
* module: core *
****************
Artifact
uid key description suffix kind otype size hash n_files n_observations _hash_type _key_is_virtual _overwrite_versions space_id storage_id schema_id version is_latest run_id created_at created_by_id _aux branch_id
id
1 dPraor9rU1EofcFb6Wph tabula_sapiens_lung.h5ad Part of Tabula Sapiens, a benchmark, first-dra... .h5ad None None 3899435772 8mB1KK2wd51F6HQdvqipcQ None None sha1-fl False False 1 2 None None True 2 2023-07-14 19:00:30.621330+00:00 1 None 1
Run
uid name started_at finished_at reference reference_type _is_consecutive _status_code space_id transform_id report_id _logfile_id environment_id initiated_by_run_id created_at created_by_id _aux branch_id
id
1 yo80sGzIyHPLkFkc None 2025-07-14 06:40:07.463886+00:00 None None None None -1.0 1 1 None None None NaN 2025-07-14 06:40:07.464000+00:00 1 None 1
2 D3blUzckggoqMfRS None 2025-07-14 06:40:21.333000+00:00 None None None None NaN 1 2 None None None 1.0 2025-07-14 06:40:21.333000+00:00 1 None 1
Storage
uid root description type region instance_uid space_id run_id created_at created_by_id _aux branch_id
id
1 XBeWU7nck6Vq /home/runner/work/lamindb/lamindb/docs/test-tr... None local None 1FHu5eE0uxm4 1 NaN 2025-07-14 06:40:03.979000+00:00 1 None 1
2 D9BilDV2 s3://lamindata None s3 us-east-1 4XIuR0tvaiXM 1 2.0 2023-04-22 05:50:06.537267+00:00 1 None 1
Transform
uid key description type source_code hash reference reference_type space_id _template_id version is_latest created_at created_by_id _aux branch_id
id
2 4XIuR0tvaiXM0000 __lamindb_transfer__/4XIuR0tvaiXM Transfer from `laminlabs/lamindata` function None None None None 1 None None True 2025-07-14 06:40:21.326000+00:00 1 None 1
1 ITeOtm7bhtdq0000 transfer.ipynb Transfer data notebook None None None None 1 None None True 2025-07-14 06:40:07.446000+00:00 1 None 1
******************
* module: bionty *
******************
Source
uid entity organism name in_db currently_used description url md5 source_website space_id dataframe_artifact_id version run_id created_at created_by_id _aux branch_id
id
1 33TUF039 bionty.Organism vertebrates ensembl False True Ensembl https://ftp.ensembl.org/pub/release-112/specie... None https://www.ensembl.org 1 None release-112 None 2025-07-14 06:40:04.086000+00:00 1 None 1
2 6bbVUTCS bionty.Organism bacteria ensembl False True Ensembl https://ftp.ensemblgenomes.ebi.ac.uk/pub/bacte... None https://www.ensembl.org 1 None release-57 None 2025-07-14 06:40:04.086000+00:00 1 None 1
3 6s9nV6xh bionty.Organism fungi ensembl False True Ensembl https://ftp.ensemblgenomes.ebi.ac.uk/pub/fungi... None https://www.ensembl.org 1 None release-57 None 2025-07-14 06:40:04.086000+00:00 1 None 1
4 2PmTrc8x bionty.Organism metazoa ensembl False True Ensembl https://ftp.ensemblgenomes.ebi.ac.uk/pub/metaz... None https://www.ensembl.org 1 None release-57 None 2025-07-14 06:40:04.086000+00:00 1 None 1
5 7GPHh16S bionty.Organism plants ensembl False True Ensembl https://ftp.ensemblgenomes.ebi.ac.uk/pub/plant... None https://www.ensembl.org 1 None release-57 None 2025-07-14 06:40:04.086000+00:00 1 None 1
6 4tsksCMX bionty.Organism all ncbitaxon False True NCBItaxon Ontology http://purl.obolibrary.org/obo/ncbitaxon/2023-... None https://github.com/obophenotype/ncbitaxon 1 None 2023-06-20 None 2025-07-14 06:40:04.086000+00:00 1 None 1
7 4UGNz3fr bionty.Gene human ensembl False True Ensembl s3://bionty-assets/df_human__ensembl__release-... None https://www.ensembl.org 1 None release-112 None 2025-07-14 06:40:04.086000+00:00 1 None 1

View lineage:

artifact.view_lineage()
! calling anonymously, will miss private instances
[Figure: lineage graph of the transferred artifact]

The transferred dataset is linked to a special type of transform that stores the slug and uid of the source instance:

artifact.transform.description
'Transfer from `laminlabs/lamindata`'

The transform key has the form f"__lamindb_transfer__/{source_instance.uid}":

artifact.transform.key
'__lamindb_transfer__/4XIuR0tvaiXM'
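
Because the key follows a fixed pattern, the source instance uid can be recovered from it with plain string handling. A minimal sketch; `transfer_key` and `source_uid_from_key` are hypothetical helpers, not lamindb functions.

```python
from typing import Optional

TRANSFER_PREFIX = "__lamindb_transfer__/"


def transfer_key(source_instance_uid: str) -> str:
    """Build a transform key of the form shown above."""
    return f"{TRANSFER_PREFIX}{source_instance_uid}"


def source_uid_from_key(key: str) -> Optional[str]:
    """Return the source instance uid if `key` is a transfer key, else None."""
    if key.startswith(TRANSFER_PREFIX):
        return key[len(TRANSFER_PREFIX):]
    return None


print(transfer_key("4XIuR0tvaiXM"))
print(source_uid_from_key("__lamindb_transfer__/4XIuR0tvaiXM"))
print(source_uid_from_key("transfer.ipynb"))
```

The last call returns `None` because a regular notebook key such as `transfer.ipynb` does not carry the transfer prefix.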

The current notebook run is linked as the initiated_by_run of the “transfer run”:

artifact.run.initiated_by_run.transform
Transform(uid='ITeOtm7bhtdq0000', is_latest=True, key='transfer.ipynb', description='Transfer data', type='notebook', branch_id=1, space_id=1, created_by_id=1, created_at=2025-07-14 06:40:07 UTC)

When you re-transfer a record, lamindb detects that it already exists in the target database and simply maps to the existing record.

artifact = artifacts.filter(description__contains="Tabula Sapiens").first()
artifact.save()
 mapped: Artifact(uid='dPraor9rU1EofcFb6Wph')
Artifact(uid='dPraor9rU1EofcFb6Wph', is_latest=True, key='tabula_sapiens_lung.h5ad', description='Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.', suffix='.h5ad', size=3899435772, hash='8mB1KK2wd51F6HQdvqipcQ', branch_id=1, space_id=1, storage_id=2, run_id=2, created_by_id=1, created_at=2023-07-14 19:00:30 UTC)
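
The difference between “transferred” and “mapped” can be pictured with a toy model. This is a conceptual sketch only, not lamindb's implementation: the target database is a plain dictionary, and it assumes that an existing record is recognized by its uid.

```python
def transfer(record: dict, target: dict) -> str:
    """Copy `record` into `target` unless a record with the same uid is already present."""
    uid = record["uid"]
    if uid in target:
        # the record exists already: reuse it instead of copying again
        return "mapped"
    target[uid] = dict(record)
    return "transferred"


target_db = {}  # toy stand-in for the target database
record = {"uid": "dPraor9rU1EofcFb6Wph", "key": "tabula_sapiens_lung.h5ad"}

print(transfer(record, target_db))  # first transfer copies the record
print(transfer(record, target_db))  # second transfer finds it by uid
print(len(target_db))
```

Under this model, repeating a transfer leaves the target unchanged, which matches the log messages above.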

If you also want to transfer the artifact's annotations, pass transfer="annotations" to save(). Note that this might populate your target database with metadata that doesn’t match the conventions you want to enforce.

artifact = artifacts.filter(description__contains="Tabula Sapiens").first()
artifact.save(transfer="annotations")
 mapped: Artifact(uid='dPraor9rU1EofcFb6Wph'), Tissue(uid='7Tt4iEKc'), CellType(uid='5tiBvp96'), CellType(uid='7Crr32HI'), CellType(uid='6dzoXJ3Y'), CellType(uid='01NqvhnI'), CellType(uid='5NceZTYm'), CellType(uid='4PSMdO3I'), CellType(uid='3JO0EdVd'), CellType(uid='6rfrjhvo'), CellType(uid='37mWPv6o'), CellType(uid='5Z76sCep'), CellType(uid='2OWUH6Z1'), CellType(uid='5TU8SFt5'), CellType(uid='ryEtgi1y'), CellType(uid='1lMgAPE8'), CellType(uid='7m6Ruz32'), CellType(uid='42qbvc90'), CellType(uid='puGNwNrs'), CellType(uid='1T8bGe2I'), CellType(uid='6IC9NGJE'), CellType(uid='6ujMwy7s'), CellType(uid='3eecYgWR'), CellType(uid='zQ4dyjEs'), CellType(uid='7mNqzyFE'), CellType(uid='5A9EFjNB'), CellType(uid='3lsrLTv6'), CellType(uid='1HYtHpIc'), CellType(uid='6UmKFrzn'), CellType(uid='7eZArDpo'), CellType(uid='2KCFdGIk'), CellType(uid='1V5wVqK5'), CellType(uid='5i19XYug'), CellType(uid='2nPA0h4F'), CellType(uid='5Xi2OLvZ'), CellType(uid='3kaL3W1c'), ExperimentalFactor(uid='5YDCOg0V'), ExperimentalFactor(uid='7R1OhRJ7')
 transferred: CellType(uid='4mZaXZQg'), CellType(uid='5rVn0X39'), CellType(uid='EWy46Sey'), CellType(uid='4yqLzwwm'), ULabel(uid='vfLXaHgD'), ULabel(uid='ZaVLDCZE'), ULabel(uid='gk6w8qC5'), ULabel(uid='tZCTk48f')
Artifact(uid='dPraor9rU1EofcFb6Wph', is_latest=True, key='tabula_sapiens_lung.h5ad', description='Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.', suffix='.h5ad', size=3899435772, hash='8mB1KK2wd51F6HQdvqipcQ', branch_id=1, space_id=1, storage_id=2, run_id=2, created_by_id=1, created_at=2023-07-14 19:00:30 UTC)

The artifact is now annotated.

artifact.describe()
Artifact .h5ad
├── General
│   ├── uid: dPraor9rU1EofcFb6Wph          hash: 8mB1KK2wd51F6HQdvqipcQ
│   ├── size: 3.6 GB                       space: all
│   ├── branch: main                       created_at: 2023-07-14 19:00:30
│   ├── created_by: anonymous
│   ├── key: tabula_sapiens_lung.h5ad
│   ├── storage location / path: s3://lamindata/tabula_sapiens_lung.h5ad
│   ├── description: Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.
│   └── transform: __lamindb_transfer__/4XIuR0tvaiXM
└── Labels
    └── .tissues                        bionty.Tissue                      lung                                    
        .cell_types                     bionty.CellType                    pulmonary alveolar type 1 cell, adventi…
        .experimental_factors           bionty.ExperimentalFactor          anoxya, stroke                          
        .ulabels                        ULabel                             TSP1, TSP2, TSP14                       
# test the last 3 cells here
assert artifact.transform.description == "Transfer from `laminlabs/lamindata`"
assert artifact.transform.key == "__lamindb_transfer__/4XIuR0tvaiXM"
assert artifact.transform.uid == "4XIuR0tvaiXM0000"
assert artifact.run.initiated_by_run.transform.description == "Transfer data"

# clean up test instance
!lamin delete --force test-transfer
! calling anonymously, will miss private instances
 deleting instance anonymous/test-transfer