Transfer data

This guide shows how to transfer data from a source database instance into the current default database instance.

# !pip install 'lamindb[jupyter,aws,bionty]'
!lamin init --storage ./test-transfer --schema bionty
→ connected lamindb: anonymous/test-transfer
import lamindb as ln

ln.context.uid = "ITeOtm7bhtdq0000"
ln.context.track()
→ connected lamindb: anonymous/test-transfer
→ notebook imports: lamindb==0.76.6
→ created Transform(uid='ITeOtm7bhtdq0000') & created Run(started_at='2024-09-10 15:14:26 UTC')

Query all artifacts in the laminlabs/lamindata instance and filter them to their latest versions.

# query all latest artifact versions 
artifacts = ln.Artifact.using("laminlabs/lamindata").filter(is_latest=True)

# convert the QuerySet to a DataFrame and show the first 5 rows
artifacts.df().head()
uid version is_latest description key suffix type size hash n_objects n_observations _hash_type _accessor visibility _key_is_virtual storage_id transform_id run_id created_by_id updated_at
id
607 sRapK07mMtToihzFeTaf None True View Papalexi21 in Vitessce None .vitessce.json None 1527 jfAtjNNzdvetUaEo5zhf0Q NaN NaN md5 None 1 True 2 79.0 141.0 2 2024-04-30 12:51:16.348895+00:00
726 HXJ4DDAw8012jVKwoxgd None True View Kuppe2022 in Vitessce None .vitessce.json None 5258 JsVK8X8EGRsyTEMnD3Z-6g NaN NaN md5 None 1 True 2 79.0 198.0 2 2024-06-26 10:35:31.817677+00:00
1 WGDHevIgEDPJ6CB99foT None True tabula-muris-senis-facs-processed-official-ann... Data-objects/tabula-muris-senis-facs-processed... .h5ad None 4795677086 None NaN NaN None None 1 False 8 1.0 1.0 9 2023-10-14 15:40:50.063681+00:00
2 0PsCY8SjhD1FIqw9e99v None True paradisi05.jpg paradisi05.jpg .jpg None 29358 r4tnqmKI_SjrkdLzpuWp4g NaN NaN None None 1 False 2 30.0 6.0 2 2023-10-14 15:40:50.190289+00:00
3 1YOa919kJFXzvgfmq4Pv None True alignment_result.bam alignment_result.bam .bam None 14 rGSwtSEKB65DaaQq740p6A NaN NaN None None 1 False 2 30.0 6.0 2 2023-10-14 15:40:50.315646+00:00

You can now further subset or search the QuerySet. Here we filter for artifacts whose description contains “Tabula Sapiens” and take the first match.

artifact = artifacts.filter(description__contains="Tabula Sapiens").first()
artifact.describe()
Artifact(uid='dPraor9rU1EofcFb6Wph', is_latest=True, description='Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.', key='tabula_sapiens_lung.h5ad', suffix='.h5ad', size=3899435772, hash='8mB1KK2wd51F6HQdvqipcQ', _hash_type='sha1-fl', visibility=1, _key_is_virtual=False, updated_at='2023-10-14 15:40:52 UTC')
  Database instance
    slug: laminlabs/lamindata
  Provenance
    .storage = 's3://lamindata'
    .transform = 'Ingest Tabula Sapiens Lung'
    .run = '2023-07-14 12:53:17 UTC'
    .created_by = 'Koncopd'
  Usage
    .input_of_runs = '2023-07-15 17:12:16 UTC'
  Labels
    .tissues = 'lung'
    .cell_types = 'CD8-positive, alpha-beta T cell', 'fibroblast', 'pericyte', 'B cell', 'mesothelial cell', 'CD4-positive, alpha-beta T cell', 'natural killer cell', 'non-classical monocyte', 'myofibroblast cell', 'capillary endothelial cell', ...
    .experimental_factors = 'anoxya', 'stroke'
    .ulabels = 'TSP1', 'TSP2', 'TSP14'
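
The `description__contains` lookup used above follows Django's field-lookup naming, where `contains` is case-sensitive on most database backends (SQLite being a notable exception) and `icontains` is its case-insensitive counterpart. The sketch below imitates both behaviors in plain Python on descriptions copied from the outputs in this guide. The helper names `contains` and `icontains` are illustrative stand-ins, not LaminDB functions, and it is an assumption that the source instance's backend treats `contains` as case-sensitive.

```python
# Descriptions copied from the outputs shown in this guide.
descriptions = [
    "View Papalexi21 in Vitessce",
    "View Kuppe2022 in Vitessce",
    "Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.",
]


def contains(values, needle):
    """Case-sensitive substring match, imitating a `__contains` lookup."""
    return [v for v in values if needle in v]


def icontains(values, needle):
    """Case-insensitive substring match, imitating an `__icontains` lookup."""
    return [v for v in values if needle.lower() in v.lower()]


print(contains(descriptions, "Tabula Sapiens"))   # matches one description
print(contains(descriptions, "tabula sapiens"))   # matches nothing
print(icontains(descriptions, "tabula sapiens"))  # matches one description
```

If the capitalization of a description is unknown, a query such as `artifacts.filter(description__icontains="tabula sapiens")` would be the natural choice, assuming the standard Django lookups are passed through unchanged.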

By saving the artifact record that’s currently attached to the source database instance, you transfer it to the default database instance.

artifact.save()
→ mapped records: Tissue(uid='7Tt4iEKc'), CellType(uid='6IC9NGJE'), CellType(uid='zQ4dyjEs'), CellType(uid='6ujMwy7s'), CellType(uid='ryEtgi1y'), CellType(uid='2OWUH6Z1'), CellType(uid='4PSMdO3I'), CellType(uid='37mWPv6o'), CellType(uid='01NqvhnI'), CellType(uid='5Z76sCep'), CellType(uid='3kaL3W1c'), CellType(uid='5i19XYug'), CellType(uid='puGNwNrs'), CellType(uid='3JO0EdVd'), CellType(uid='3eecYgWR'), CellType(uid='1lMgAPE8'), CellType(uid='5tiBvp96'), CellType(uid='5NceZTYm'), CellType(uid='2nPA0h4F'), CellType(uid='6UmKFrzn'), CellType(uid='1HYtHpIc'), CellType(uid='6dzoXJ3Y'), CellType(uid='7Crr32HI'), CellType(uid='6rfrjhvo'), CellType(uid='5TU8SFt5'), CellType(uid='7m6Ruz32'), CellType(uid='42qbvc90'), CellType(uid='1T8bGe2I'), CellType(uid='7mNqzyFE'), CellType(uid='5A9EFjNB'), CellType(uid='3lsrLTv6'), CellType(uid='7eZArDpo'), CellType(uid='2KCFdGIk'), CellType(uid='1V5wVqK5'), CellType(uid='5Xi2OLvZ'), ExperimentalFactor(uid='5YDCOg0V'), ExperimentalFactor(uid='7R1OhRJ7')
→ transferred records: Artifact(uid='dPraor9rU1EofcFb6Wph'), Storage(uid='D9BilDV2'), CellType(uid='4mZaXZQg'), CellType(uid='5rVn0X39'), CellType(uid='EWy46Sey'), CellType(uid='4yqLzwwm'), ULabel(uid='vfLXaHgD'), ULabel(uid='gk6w8qC5'), ULabel(uid='tZCTk48f')
Artifact(uid='dPraor9rU1EofcFb6Wph', is_latest=True, description='Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.', key='tabula_sapiens_lung.h5ad', suffix='.h5ad', size=3899435772, hash='8mB1KK2wd51F6HQdvqipcQ', _hash_type='sha1-fl', visibility=1, _key_is_virtual=False, storage_id=2, transform_id=1, run_id=1, created_by_id=1, updated_at='2024-09-10 15:14:27 UTC')

How do I know if a record is saved in the default database instance or not?

Every record has an attribute ._state.db which can take the following values:

  • None: the record has not yet been saved to any database

  • "default": the record is saved on the default database instance

  • "account/name": the record is saved in a non-default database instance referenced by account/name (e.g., laminlabs/lamindata)
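
The three cases listed above can be turned into a small helper. This is a minimal sketch that relies only on the `._state.db` attribute. The function name `describe_db_state` and the `fake_record` stand-ins are hypothetical and exist only so the example runs without a database connection.

```python
from types import SimpleNamespace


def describe_db_state(record):
    """Return a human-readable description of where a record is saved.

    Relies only on the `._state.db` attribute described above.
    """
    db = record._state.db
    if db is None:
        return "not yet saved to any database"
    if db == "default":
        return "saved in the default database instance"
    return f"saved in the non-default database instance {db!r}"


def fake_record(db):
    """Hypothetical stand-in that mimics only the `._state.db` attribute."""
    return SimpleNamespace(_state=SimpleNamespace(db=db))


unsaved = fake_record(None)
local = fake_record("default")
remote = fake_record("laminlabs/lamindata")

for record in (unsaved, local, remote):
    print(describe_db_state(record))
```

With a real record you would pass the record itself, for example `describe_db_state(artifact)`. In this guide, `artifact._state.db` would be expected to read `"laminlabs/lamindata"` before `artifact.save()` and `"default"` afterwards.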

The artifact record and all other feature & label records have been transferred to the current database.

artifact.describe()
Artifact(uid='dPraor9rU1EofcFb6Wph', is_latest=True, description='Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.', key='tabula_sapiens_lung.h5ad', suffix='.h5ad', size=3899435772, hash='8mB1KK2wd51F6HQdvqipcQ', _hash_type='sha1-fl', visibility=1, _key_is_virtual=False, updated_at='2024-09-10 15:14:27 UTC')
  Provenance
    .storage = 's3://lamindata'
    .transform = 'Transfer data'
    .run = '2024-09-10 15:14:26 UTC'
    .created_by = 'anonymous'
  Labels
    .tissues = 'lung'
    .cell_types = 'CD8-positive, alpha-beta T cell', 'fibroblast', 'pericyte', 'B cell', 'mesothelial cell', 'CD4-positive, alpha-beta T cell', 'natural killer cell', 'non-classical monocyte', 'myofibroblast cell', 'capillary endothelial cell', ...
    .experimental_factors = 'anoxya', 'stroke'
    .ulabels = 'TSP1', 'TSP2', 'TSP14'

The data itself remained in its original storage location (s3://lamindata), which has been added to the current instance’s storage locations as a read-only location.

ln.Storage.df()
id  uid           root                                               description  type   region     instance_uid  run_id  created_by_id  updated_at
2   D9BilDV2      s3://lamindata                                     None         s3     us-east-1  None          1.0     1              2024-09-10 15:14:27.910497+00:00
1   831L9kyiqB7X  /home/runner/work/lamindb/lamindb/docs/test-tr...  None         local  None       1FHu5eE0uxm4  NaN     1              2024-09-10 15:14:21.991327+00:00

See the state of the database.

ln.view()
****************
* module: core *
****************
Artifact
uid version is_latest description key suffix type size hash n_objects n_observations _hash_type _accessor visibility _key_is_virtual storage_id transform_id run_id created_by_id updated_at
id
1 dPraor9rU1EofcFb6Wph None True Part of Tabula Sapiens, a benchmark, first-dra... tabula_sapiens_lung.h5ad .h5ad None 3899435772 8mB1KK2wd51F6HQdvqipcQ None None sha1-fl None 1 False 2 1 1 1 2024-09-10 15:14:27.912616+00:00
Run
uid started_at finished_at is_consecutive reference reference_type transform_id report_id environment_id parent_id created_by_id
id
1 ltSstFTHan3c3Iue99sE 2024-09-10 15:14:26.398218+00:00 None True None None 1 None None None 1
Storage
uid root description type region instance_uid run_id created_by_id updated_at
id
2 D9BilDV2 s3://lamindata None s3 us-east-1 None 1.0 1 2024-09-10 15:14:27.910497+00:00
1 831L9kyiqB7X /home/runner/work/lamindb/lamindb/docs/test-tr... None local None 1FHu5eE0uxm4 NaN 1 2024-09-10 15:14:21.991327+00:00
Transform
uid version is_latest name key description type source_code hash reference reference_type _source_code_artifact_id created_by_id updated_at
id
1 ITeOtm7bhtdq0000 None True Transfer data transfer.ipynb None notebook None None None None None 1 2024-09-10 15:14:26.393485+00:00
ULabel
uid name description reference reference_type run_id created_by_id updated_at
id
3 tZCTk48f TSP14 None None None 1 1 2024-09-10 15:14:33.653435+00:00
2 gk6w8qC5 TSP2 None None None 1 1 2024-09-10 15:14:33.643682+00:00
1 vfLXaHgD TSP1 None None None 1 1 2024-09-10 15:14:33.633011+00:00
User
uid handle name updated_at
id
1 00000000 anonymous None 2024-09-10 15:14:21.987389+00:00
******************
* module: bionty *
******************
CellType
uid name ontology_id abbr synonyms description source_id run_id created_by_id updated_at
id
112 4yqLzwwm bronchial vessel endothelial cell None None None None NaN 1 1 2024-09-10 15:14:31.755783+00:00
111 EWy46Sey respiratory mucous cell None None None None NaN 1 1 2024-09-10 15:14:31.736369+00:00
110 5rVn0X39 capillary aerocyte None None None None NaN 1 1 2024-09-10 15:14:31.725832+00:00
109 4mZaXZQg alveolar fibroblast None None None None NaN 1 1 2024-09-10 15:14:31.621140+00:00
108 3hXuCKYH perivascular cell CL:4033054 None None A Cell That Is Adjacent To A Vessel. A Perivas... 32.0 1 1 2024-09-10 15:14:31.406472+00:00
107 4qrbhCCl respiratory ciliated cell CL:4030034 None ciliated cell of the respiratory tract A Ciliated Cell Of The Respiratory System. Cil... 32.0 1 1 2024-09-10 15:14:31.406416+00:00
106 2aMXs0ko microvascular endothelial cell CL:2000008 None None Any Blood Vessel Endothelial Cell That Is Part... 32.0 1 1 2024-09-10 15:14:31.406360+00:00
ExperimentalFactor
uid name ontology_id abbr synonyms description molecule instrument measurement source_id run_id created_by_id updated_at
id
8 1was9kRO hypoxia EFO:0009444 None None A Decrease In The Amount Of Oxygen In The Body... None None None 62 1 1 2024-09-10 15:14:33.563066+00:00
7 2lctIHmn central nervous system disease EFO:0009386 None disease or disorder of central nervous system|... A Disease Involving The Central Nervous System. None None None 62 1 1 2024-09-10 15:14:33.563003+00:00
6 68LLeA7O brain disease EFO:0005774 None brain disease or disorder|disorder of brain|br... A Disease Affecting The Brain Or Part Of The B... None None None 62 1 1 2024-09-10 15:14:33.562939+00:00
5 2xDSpjH7 cerebrovascular disorder EFO:0003763 None Intracranial Vascular Disease|Vascular Disorde... A Disorder Resulting From Inadequate Blood Flo... None None None 62 1 1 2024-09-10 15:14:33.562875+00:00
4 6ISbvepx nervous system disease EFO:0000618 None disease or disorder of nervous system|nervous ... A Non-Neoplastic Or Neoplastic Disorder That A... None None None 62 1 1 2024-09-10 15:14:33.562808+00:00
3 20Nq3k7b disease EFO:0000408 None disorders|disease or disorder|medical conditio... A Disease Is A Disposition To Undergo Patholog... None None None 62 1 1 2024-09-10 15:14:33.562728+00:00
2 7R1OhRJ7 stroke EFO:0000712 None Vascular Accident, Brain|Cerebrovascular Accid... A Sudden Loss Of Neurological Function Seconda... None None None 62 1 1 2024-09-10 15:14:33.108088+00:00
Source
uid entity organism name version in_db currently_used description url md5 source_website dataframe_artifact_id run_id created_by_id updated_at
id
62 69Xc bionty.ExperimentalFactor all efo 3.66.0 False True The Experimental Factor Ontology http://www.ebi.ac.uk/efo/releases/v3.66.0/efo.owl 6bd24217c740af7e1e771c1dabc9680b https://bioportal.bioontology.org/ontologies/EFO None None 1 2024-09-10 15:14:33.557495+00:00
32 1Lhf bionty.CellType all cl 2024-05-15 False True Cell Ontology http://purl.obolibrary.org/obo/cl/releases/202... 8a8638a9e79567935793e5007704c650 https://obophenotype.github.io/cell-ontology None None 1 2024-09-10 15:14:31.390876+00:00
39 MUtA bionty.Tissue all uberon 2024-08-07 False True Uberon multi-species anatomy ontology http://purl.obolibrary.org/obo/uberon/releases... http://obophenotype.github.io/uberon None None 1 2024-09-10 15:14:29.813147+00:00
96 5JnV BioSample all ncbi 2023-09 False True NCBI BioSample attributes s3://bionty-assets/df_all__ncbi__2023-09__BioS... 918db9bd1734b97c596c67d9654a4126 https://www.ncbi.nlm.nih.gov/biosample/docs/at... None None 1 2024-09-10 15:14:22.115752+00:00
95 MJRq bionty.Ethnicity human hancestro 3.0 False True Human Ancestry Ontology https://github.com/EBISPOT/hancestro/raw/3.0/h... 76dd9efda9c2abd4bc32fc57c0b755dd https://github.com/EBISPOT/hancestro None None 1 2024-09-10 15:14:22.115691+00:00
94 6vJm bionty.DevelopmentalStage mouse mmusdv 2020-03-10 False False Mouse Developmental Stages http://aber-owl.net/media/ontologies/MMUSDV/9/... 5bef72395d853c7f65450e6c2a1fc653 https://github.com/obophenotype/developmental-... None None 1 2024-09-10 15:14:22.115631+00:00
93 10va bionty.DevelopmentalStage mouse mmusdv 2024-05-28 False True Mouse Developmental Stages https://github.com/obophenotype/developmental-... https://github.com/obophenotype/developmental-... None None 1 2024-09-10 15:14:22.115571+00:00
Tissue
uid name ontology_id abbr synonyms description source_id run_id created_by_id updated_at
id
23 kkib4Wcs lateral structure UBERON:0015212 None None Any Structure That Is Placed On One Side Of Th... 39 1 1 2024-09-10 15:14:29.820758+00:00
22 4QeoxdKp body proper UBERON:0013702 None None The Region Of The Organism Associated With The... 39 1 1 2024-09-10 15:14:29.820702+00:00
21 3XuRxEhw main body axis UBERON:0013701 None None A Principle Subdivision Of An Organism That In... 39 1 1 2024-09-10 15:14:29.820647+00:00
20 7ZCdHnvN subdivision of organism along main body axis UBERON:0011676 None axial subdivision of organism A Major Subdivision Of An Organism That Divide... 39 1 1 2024-09-10 15:14:29.820592+00:00
19 4o2HviGe multicellular anatomical structure UBERON:0010000 None multicellular structure An Anatomical Structure That Has More Than One... 39 1 1 2024-09-10 15:14:29.820536+00:00
18 31GPuSXP subdivision of trunk UBERON:0009569 None trunk subdivision|region of trunk None 39 1 1 2024-09-10 15:14:29.820480+00:00
17 4IV77xkH thoracic segment organ UBERON:0005181 None None An Organ That Part Of The Thoracic Segment Reg... 39 1 1 2024-09-10 15:14:29.820425+00:00
# clean up test instance
!lamin delete --force test-transfer
! calling anonymously, will miss private instances
• deleting instance anonymous/test-transfer