Transfer data
This guide shows how to transfer data from a source database instance into the current default database instance.
# !pip install 'lamindb[jupyter,aws,bionty]'
!lamin init --storage ./test-transfer --schema bionty
→ connected lamindb: anonymous/test-transfer
import lamindb as ln
ln.context.uid = "ITeOtm7bhtdq0000"
ln.context.track()
→ connected lamindb: anonymous/test-transfer
→ notebook imports: lamindb==0.76.6
→ created Transform(uid='ITeOtm7bhtdq0000') & created Run(started_at='2024-09-10 15:14:26 UTC')
Query all artifacts in the laminlabs/lamindata instance and filter them to their latest versions.
# query all latest artifact versions
artifacts = ln.Artifact.using("laminlabs/lamindata").filter(is_latest=True)
# convert the QuerySet to a DataFrame and show the first 5 rows
artifacts.df().head()
id | uid | version | is_latest | description | key | suffix | type | size | hash | n_objects | n_observations | _hash_type | _accessor | visibility | _key_is_virtual | storage_id | transform_id | run_id | created_by_id | updated_at |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
607 | sRapK07mMtToihzFeTaf | None | True | View Papalexi21 in Vitessce | None | .vitessce.json | None | 1527 | jfAtjNNzdvetUaEo5zhf0Q | NaN | NaN | md5 | None | 1 | True | 2 | 79.0 | 141.0 | 2 | 2024-04-30 12:51:16.348895+00:00 |
726 | HXJ4DDAw8012jVKwoxgd | None | True | View Kuppe2022 in Vitessce | None | .vitessce.json | None | 5258 | JsVK8X8EGRsyTEMnD3Z-6g | NaN | NaN | md5 | None | 1 | True | 2 | 79.0 | 198.0 | 2 | 2024-06-26 10:35:31.817677+00:00 |
1 | WGDHevIgEDPJ6CB99foT | None | True | tabula-muris-senis-facs-processed-official-ann... | Data-objects/tabula-muris-senis-facs-processed... | .h5ad | None | 4795677086 | None | NaN | NaN | None | None | 1 | False | 8 | 1.0 | 1.0 | 9 | 2023-10-14 15:40:50.063681+00:00 |
2 | 0PsCY8SjhD1FIqw9e99v | None | True | paradisi05.jpg | paradisi05.jpg | .jpg | None | 29358 | r4tnqmKI_SjrkdLzpuWp4g | NaN | NaN | None | None | 1 | False | 2 | 30.0 | 6.0 | 2 | 2023-10-14 15:40:50.190289+00:00 |
3 | 1YOa919kJFXzvgfmq4Pv | None | True | alignment_result.bam | alignment_result.bam | .bam | None | 14 | rGSwtSEKB65DaaQq740p6A | NaN | NaN | None | None | 1 | False | 2 | 30.0 | 6.0 | 2 | 2023-10-14 15:40:50.315646+00:00 |
You can now further subset or search the QuerySet. Here we filter for artifacts whose description contains “Tabula Sapiens” and take the first match.
artifact = artifacts.filter(description__contains="Tabula Sapiens").first()
artifact.describe()
Artifact(uid='dPraor9rU1EofcFb6Wph', is_latest=True, description='Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.', key='tabula_sapiens_lung.h5ad', suffix='.h5ad', size=3899435772, hash='8mB1KK2wd51F6HQdvqipcQ', _hash_type='sha1-fl', visibility=1, _key_is_virtual=False, updated_at='2023-10-14 15:40:52 UTC')
Database instance
slug: laminlabs/lamindata
Provenance
.storage = 's3://lamindata'
.transform = 'Ingest Tabula Sapiens Lung'
.run = '2023-07-14 12:53:17 UTC'
.created_by = 'Koncopd'
Usage
.input_of_runs = '2023-07-15 17:12:16 UTC'
Labels
.tissues = 'lung'
.cell_types = 'CD8-positive, alpha-beta T cell', 'fibroblast', 'pericyte', 'B cell', 'mesothelial cell', 'CD4-positive, alpha-beta T cell', 'natural killer cell', 'non-classical monocyte', 'myofibroblast cell', 'capillary endothelial cell', ...
.experimental_factors = 'anoxya', 'stroke'
.ulabels = 'TSP1', 'TSP2', 'TSP14'
By saving the artifact record that’s currently attached to the source database instance, you transfer it to the default database instance.
artifact.save()
→ mapped records: Tissue(uid='7Tt4iEKc'), CellType(uid='6IC9NGJE'), CellType(uid='zQ4dyjEs'), CellType(uid='6ujMwy7s'), CellType(uid='ryEtgi1y'), CellType(uid='2OWUH6Z1'), CellType(uid='4PSMdO3I'), CellType(uid='37mWPv6o'), CellType(uid='01NqvhnI'), CellType(uid='5Z76sCep'), CellType(uid='3kaL3W1c'), CellType(uid='5i19XYug'), CellType(uid='puGNwNrs'), CellType(uid='3JO0EdVd'), CellType(uid='3eecYgWR'), CellType(uid='1lMgAPE8'), CellType(uid='5tiBvp96'), CellType(uid='5NceZTYm'), CellType(uid='2nPA0h4F'), CellType(uid='6UmKFrzn'), CellType(uid='1HYtHpIc'), CellType(uid='6dzoXJ3Y'), CellType(uid='7Crr32HI'), CellType(uid='6rfrjhvo'), CellType(uid='5TU8SFt5'), CellType(uid='7m6Ruz32'), CellType(uid='42qbvc90'), CellType(uid='1T8bGe2I'), CellType(uid='7mNqzyFE'), CellType(uid='5A9EFjNB'), CellType(uid='3lsrLTv6'), CellType(uid='7eZArDpo'), CellType(uid='2KCFdGIk'), CellType(uid='1V5wVqK5'), CellType(uid='5Xi2OLvZ'), ExperimentalFactor(uid='5YDCOg0V'), ExperimentalFactor(uid='7R1OhRJ7')
→ transferred records: Artifact(uid='dPraor9rU1EofcFb6Wph'), Storage(uid='D9BilDV2'), CellType(uid='4mZaXZQg'), CellType(uid='5rVn0X39'), CellType(uid='EWy46Sey'), CellType(uid='4yqLzwwm'), ULabel(uid='vfLXaHgD'), ULabel(uid='gk6w8qC5'), ULabel(uid='tZCTk48f')
Artifact(uid='dPraor9rU1EofcFb6Wph', is_latest=True, description='Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.', key='tabula_sapiens_lung.h5ad', suffix='.h5ad', size=3899435772, hash='8mB1KK2wd51F6HQdvqipcQ', _hash_type='sha1-fl', visibility=1, _key_is_virtual=False, storage_id=2, transform_id=1, run_id=1, created_by_id=1, updated_at='2024-09-10 15:14:27 UTC')
How do I know if a record is saved in the default database instance or not?
Every record has an attribute ._state.db, which can take the following values:
- None: the record has not yet been saved to any database
- "default": the record is saved on the default database instance
- "account/name": the record is saved on a non-default database instance referenced by account/name (e.g., laminlabs/lamindata)
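The three cases can be wrapped in a small helper, shown here as a minimal sketch. The function describe_db_state and the stand-in record object are hypothetical and not part of lamindb; with a real record you would pass artifact._state.db instead.

```python
from types import SimpleNamespace
from typing import Optional


def describe_db_state(db: Optional[str]) -> str:
    """Translate a record's `._state.db` value into a readable status (hypothetical helper)."""
    if db is None:
        return "not yet saved to any database"
    if db == "default":
        return "saved on the default database instance"
    # Any other value is treated as an instance slug of the form "account/name".
    return f"saved on the non-default database instance {db}"


# Stand-in for a record queried from a source instance (an assumption,
# used so the sketch runs without a database connection).
record = SimpleNamespace(_state=SimpleNamespace(db="laminlabs/lamindata"))
print(describe_db_state(record._state.db))

# Simulate the state after a transfer: the record now lives on the default instance.
record._state.db = "default"
print(describe_db_state(record._state.db))
```

Checking this attribute before and after calling .save() is a quick way to confirm that a transfer took place.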
The artifact record and all other feature & label records have been transferred to the current database.
artifact.describe()
Artifact(uid='dPraor9rU1EofcFb6Wph', is_latest=True, description='Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.', key='tabula_sapiens_lung.h5ad', suffix='.h5ad', size=3899435772, hash='8mB1KK2wd51F6HQdvqipcQ', _hash_type='sha1-fl', visibility=1, _key_is_virtual=False, updated_at='2024-09-10 15:14:27 UTC')
Provenance
.storage = 's3://lamindata'
.transform = 'Transfer data'
.run = '2024-09-10 15:14:26 UTC'
.created_by = 'anonymous'
Labels
.tissues = 'lung'
.cell_types = 'CD8-positive, alpha-beta T cell', 'fibroblast', 'pericyte', 'B cell', 'mesothelial cell', 'CD4-positive, alpha-beta T cell', 'natural killer cell', 'non-classical monocyte', 'myofibroblast cell', 'capillary endothelial cell', ...
.experimental_factors = 'anoxya', 'stroke'
.ulabels = 'TSP1', 'TSP2', 'TSP14'
The data itself remains in the original storage location, which has been added to the current instance’s storage locations as a read-only location.
ln.Storage.df()
id | uid | root | description | type | region | instance_uid | run_id | created_by_id | updated_at |
---|---|---|---|---|---|---|---|---|---|
2 | D9BilDV2 | s3://lamindata | None | s3 | us-east-1 | None | 1.0 | 1 | 2024-09-10 15:14:27.910497+00:00 |
1 | 831L9kyiqB7X | /home/runner/work/lamindb/lamindb/docs/test-tr... | None | local | None | 1FHu5eE0uxm4 | NaN | 1 | 2024-09-10 15:14:21.991327+00:00 |
See the state of the database.
ln.view()
****************
* module: core *
****************
Artifact
id | uid | version | is_latest | description | key | suffix | type | size | hash | n_objects | n_observations | _hash_type | _accessor | visibility | _key_is_virtual | storage_id | transform_id | run_id | created_by_id | updated_at |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | dPraor9rU1EofcFb6Wph | None | True | Part of Tabula Sapiens, a benchmark, first-dra... | tabula_sapiens_lung.h5ad | .h5ad | None | 3899435772 | 8mB1KK2wd51F6HQdvqipcQ | None | None | sha1-fl | None | 1 | False | 2 | 1 | 1 | 1 | 2024-09-10 15:14:27.912616+00:00 |
Run
id | uid | started_at | finished_at | is_consecutive | reference | reference_type | transform_id | report_id | environment_id | parent_id | created_by_id |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | ltSstFTHan3c3Iue99sE | 2024-09-10 15:14:26.398218+00:00 | None | True | None | None | 1 | None | None | None | 1 |
Storage
id | uid | root | description | type | region | instance_uid | run_id | created_by_id | updated_at |
---|---|---|---|---|---|---|---|---|---|
2 | D9BilDV2 | s3://lamindata | None | s3 | us-east-1 | None | 1.0 | 1 | 2024-09-10 15:14:27.910497+00:00 |
1 | 831L9kyiqB7X | /home/runner/work/lamindb/lamindb/docs/test-tr... | None | local | None | 1FHu5eE0uxm4 | NaN | 1 | 2024-09-10 15:14:21.991327+00:00 |
Transform
id | uid | version | is_latest | name | key | description | type | source_code | hash | reference | reference_type | _source_code_artifact_id | created_by_id | updated_at |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | ITeOtm7bhtdq0000 | None | True | Transfer data | transfer.ipynb | None | notebook | None | None | None | None | None | 1 | 2024-09-10 15:14:26.393485+00:00 |
ULabel
id | uid | name | description | reference | reference_type | run_id | created_by_id | updated_at |
---|---|---|---|---|---|---|---|---|
3 | tZCTk48f | TSP14 | None | None | None | 1 | 1 | 2024-09-10 15:14:33.653435+00:00 |
2 | gk6w8qC5 | TSP2 | None | None | None | 1 | 1 | 2024-09-10 15:14:33.643682+00:00 |
1 | vfLXaHgD | TSP1 | None | None | None | 1 | 1 | 2024-09-10 15:14:33.633011+00:00 |
User
id | uid | handle | name | updated_at |
---|---|---|---|---|
1 | 00000000 | anonymous | None | 2024-09-10 15:14:21.987389+00:00 |
******************
* module: bionty *
******************
CellType
id | uid | name | ontology_id | abbr | synonyms | description | source_id | run_id | created_by_id | updated_at |
---|---|---|---|---|---|---|---|---|---|---|
112 | 4yqLzwwm | bronchial vessel endothelial cell | None | None | None | None | NaN | 1 | 1 | 2024-09-10 15:14:31.755783+00:00 |
111 | EWy46Sey | respiratory mucous cell | None | None | None | None | NaN | 1 | 1 | 2024-09-10 15:14:31.736369+00:00 |
110 | 5rVn0X39 | capillary aerocyte | None | None | None | None | NaN | 1 | 1 | 2024-09-10 15:14:31.725832+00:00 |
109 | 4mZaXZQg | alveolar fibroblast | None | None | None | None | NaN | 1 | 1 | 2024-09-10 15:14:31.621140+00:00 |
108 | 3hXuCKYH | perivascular cell | CL:4033054 | None | None | A Cell That Is Adjacent To A Vessel. A Perivas... | 32.0 | 1 | 1 | 2024-09-10 15:14:31.406472+00:00 |
107 | 4qrbhCCl | respiratory ciliated cell | CL:4030034 | None | ciliated cell of the respiratory tract | A Ciliated Cell Of The Respiratory System. Cil... | 32.0 | 1 | 1 | 2024-09-10 15:14:31.406416+00:00 |
106 | 2aMXs0ko | microvascular endothelial cell | CL:2000008 | None | None | Any Blood Vessel Endothelial Cell That Is Part... | 32.0 | 1 | 1 | 2024-09-10 15:14:31.406360+00:00 |
ExperimentalFactor
id | uid | name | ontology_id | abbr | synonyms | description | molecule | instrument | measurement | source_id | run_id | created_by_id | updated_at |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
8 | 1was9kRO | hypoxia | EFO:0009444 | None | None | A Decrease In The Amount Of Oxygen In The Body... | None | None | None | 62 | 1 | 1 | 2024-09-10 15:14:33.563066+00:00 |
7 | 2lctIHmn | central nervous system disease | EFO:0009386 | None | disease or disorder of central nervous system|... | A Disease Involving The Central Nervous System. | None | None | None | 62 | 1 | 1 | 2024-09-10 15:14:33.563003+00:00 |
6 | 68LLeA7O | brain disease | EFO:0005774 | None | brain disease or disorder|disorder of brain|br... | A Disease Affecting The Brain Or Part Of The B... | None | None | None | 62 | 1 | 1 | 2024-09-10 15:14:33.562939+00:00 |
5 | 2xDSpjH7 | cerebrovascular disorder | EFO:0003763 | None | Intracranial Vascular Disease|Vascular Disorde... | A Disorder Resulting From Inadequate Blood Flo... | None | None | None | 62 | 1 | 1 | 2024-09-10 15:14:33.562875+00:00 |
4 | 6ISbvepx | nervous system disease | EFO:0000618 | None | disease or disorder of nervous system|nervous ... | A Non-Neoplastic Or Neoplastic Disorder That A... | None | None | None | 62 | 1 | 1 | 2024-09-10 15:14:33.562808+00:00 |
3 | 20Nq3k7b | disease | EFO:0000408 | None | disorders|disease or disorder|medical conditio... | A Disease Is A Disposition To Undergo Patholog... | None | None | None | 62 | 1 | 1 | 2024-09-10 15:14:33.562728+00:00 |
2 | 7R1OhRJ7 | stroke | EFO:0000712 | None | Vascular Accident, Brain|Cerebrovascular Accid... | A Sudden Loss Of Neurological Function Seconda... | None | None | None | 62 | 1 | 1 | 2024-09-10 15:14:33.108088+00:00 |
Source
id | uid | entity | organism | name | version | in_db | currently_used | description | url | md5 | source_website | dataframe_artifact_id | run_id | created_by_id | updated_at |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
62 | 69Xc | bionty.ExperimentalFactor | all | efo | 3.66.0 | False | True | The Experimental Factor Ontology | http://www.ebi.ac.uk/efo/releases/v3.66.0/efo.owl | 6bd24217c740af7e1e771c1dabc9680b | https://bioportal.bioontology.org/ontologies/EFO | None | None | 1 | 2024-09-10 15:14:33.557495+00:00 |
32 | 1Lhf | bionty.CellType | all | cl | 2024-05-15 | False | True | Cell Ontology | http://purl.obolibrary.org/obo/cl/releases/202... | 8a8638a9e79567935793e5007704c650 | https://obophenotype.github.io/cell-ontology | None | None | 1 | 2024-09-10 15:14:31.390876+00:00 |
39 | MUtA | bionty.Tissue | all | uberon | 2024-08-07 | False | True | Uberon multi-species anatomy ontology | http://purl.obolibrary.org/obo/uberon/releases... | | http://obophenotype.github.io/uberon | None | None | 1 | 2024-09-10 15:14:29.813147+00:00 |
96 | 5JnV | BioSample | all | ncbi | 2023-09 | False | True | NCBI BioSample attributes | s3://bionty-assets/df_all__ncbi__2023-09__BioS... | 918db9bd1734b97c596c67d9654a4126 | https://www.ncbi.nlm.nih.gov/biosample/docs/at... | None | None | 1 | 2024-09-10 15:14:22.115752+00:00 |
95 | MJRq | bionty.Ethnicity | human | hancestro | 3.0 | False | True | Human Ancestry Ontology | https://github.com/EBISPOT/hancestro/raw/3.0/h... | 76dd9efda9c2abd4bc32fc57c0b755dd | https://github.com/EBISPOT/hancestro | None | None | 1 | 2024-09-10 15:14:22.115691+00:00 |
94 | 6vJm | bionty.DevelopmentalStage | mouse | mmusdv | 2020-03-10 | False | False | Mouse Developmental Stages | http://aber-owl.net/media/ontologies/MMUSDV/9/... | 5bef72395d853c7f65450e6c2a1fc653 | https://github.com/obophenotype/developmental-... | None | None | 1 | 2024-09-10 15:14:22.115631+00:00 |
93 | 10va | bionty.DevelopmentalStage | mouse | mmusdv | 2024-05-28 | False | True | Mouse Developmental Stages | https://github.com/obophenotype/developmental-... | | https://github.com/obophenotype/developmental-... | None | None | 1 | 2024-09-10 15:14:22.115571+00:00 |
Tissue
id | uid | name | ontology_id | abbr | synonyms | description | source_id | run_id | created_by_id | updated_at |
---|---|---|---|---|---|---|---|---|---|---|
23 | kkib4Wcs | lateral structure | UBERON:0015212 | None | None | Any Structure That Is Placed On One Side Of Th... | 39 | 1 | 1 | 2024-09-10 15:14:29.820758+00:00 |
22 | 4QeoxdKp | body proper | UBERON:0013702 | None | None | The Region Of The Organism Associated With The... | 39 | 1 | 1 | 2024-09-10 15:14:29.820702+00:00 |
21 | 3XuRxEhw | main body axis | UBERON:0013701 | None | None | A Principle Subdivision Of An Organism That In... | 39 | 1 | 1 | 2024-09-10 15:14:29.820647+00:00 |
20 | 7ZCdHnvN | subdivision of organism along main body axis | UBERON:0011676 | None | axial subdivision of organism | A Major Subdivision Of An Organism That Divide... | 39 | 1 | 1 | 2024-09-10 15:14:29.820592+00:00 |
19 | 4o2HviGe | multicellular anatomical structure | UBERON:0010000 | None | multicellular structure | An Anatomical Structure That Has More Than One... | 39 | 1 | 1 | 2024-09-10 15:14:29.820536+00:00 |
18 | 31GPuSXP | subdivision of trunk | UBERON:0009569 | None | trunk subdivision|region of trunk | None | 39 | 1 | 1 | 2024-09-10 15:14:29.820480+00:00 |
17 | 4IV77xkH | thoracic segment organ | UBERON:0005181 | None | None | An Organ That Part Of The Thoracic Segment Reg... | 39 | 1 | 1 | 2024-09-10 15:14:29.820425+00:00 |
# clean up test instance
!lamin delete --force test-transfer
! calling anonymously, will miss private instances
• deleting instance anonymous/test-transfer