Transfer data

Here, we’ll show how to transfer data from another instance into the current instance.

# !pip install 'lamindb[jupyter,aws,bionty]'
!lamin init --storage ./test-transfer --schema bionty
💡 connected lamindb: anonymous/test-transfer
import lamindb as ln

ln.settings.transform.stem_uid = "ITeOtm7bhtdq"
ln.settings.transform.version = "1"
ln.track()
💡 connected lamindb: anonymous/test-transfer
💡 notebook imports: lamindb==0.74.3
💡 saved: Transform(uid='ITeOtm7bhtdq5zKv', version='1', name='Transfer data', key='transfer', type='notebook', created_by_id=1, updated_at='2024-07-26 14:37:06 UTC')
💡 saved: Run(uid='xvh3aooj8v7Dc3cYGxQT', transform_id=1, created_by_id=1)
Run(uid='xvh3aooj8v7Dc3cYGxQT', started_at='2024-07-26 14:37:06 UTC', is_consecutive=True, transform_id=1, created_by_id=1)

Query all artifacts in the laminlabs/lamindata instance, a clone of CZ CELLxGENE (for more information, see cellxgene):

artifacts = ln.Artifact.using("laminlabs/lamindata")
artifacts.df().head()
| id | uid | version | description | key | suffix | type | accessor | size | hash | hash_type | n_objects | n_observations | visibility | key_is_virtual | storage_id | transform_id | run_id | created_by_id | updated_at |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 781 | V01p43DuAnerlhiVAXmW | None | hits from schmidt22 crispra GWS | None | .parquet | None | DataFrame | 18368 | txyBucy8ZdQ42HR76PRPtA | md5 | NaN | NaN | 1 | True | 2 | 106.0 | 218.0 | 2 | 2024-06-17 09:16:00.967539+00:00 |
| 450 | MbTyShN2FrU9IHKDiMwx | 1 | Source of transform Nv48yAceNSh85zKv | None | .ipynb | None | None | 10019 | H6AzzBUaXFDuw79YE4OouQ | md5 | NaN | NaN | 0 | True | 2 | NaN | NaN | 9 | 2024-01-03 00:33:24.963836+00:00 |
| 451 | 3q7AhNt3DId2KoCdS206 | None | requirements.txt | None | .txt | None | None | 8673 | 7Q1h9ePykHz1mEdxe5hQoA | md5 | NaN | NaN | 0 | True | 2 | NaN | NaN | 9 | 2024-01-03 00:33:26.599672+00:00 |
| 452 | kReqQ8BoaXlvWyzeAsiS | 3 | Report of transform Nv48yAceNSh85zKv | None | .html | None | None | 317980 | BevkxU-ppDie_k5PQs5J6A | md5 | NaN | NaN | 0 | True | 2 | NaN | NaN | 9 | 2024-01-03 00:33:27.680069+00:00 |
| 782 | zMe3L3txyJ2voFxL3D9L | 2 | Source of transform PtTXoc0RbOIq65cN | None | .ipynb | None | None | 4533 | S-8tqKXtOJysvJUZ96_r1Q | md5 | NaN | NaN | 0 | True | 2 | NaN | NaN | 2 | 2024-06-17 09:22:31.425987+00:00 |

Query or search the queryset:

artifact = artifacts.filter(description__icontains="tabula sapiens").first()
artifact
Artifact(uid='dPraor9rU1EofcFb6Wph', description='Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.', key='tabula_sapiens_lung.h5ad', suffix='.h5ad', size=3899435772, hash='8mB1KK2wd51F6HQdvqipcQ', hash_type='sha1-fl', visibility=1, key_is_virtual=False, created_by_id=3, storage_id=2, transform_id=8, run_id=9, updated_at='2023-10-14 15:40:52 UTC')
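
The description__icontains lookup in the query above follows the Django-style field lookup convention: it keeps records whose description contains the given text, ignoring case, and .first() returns the first match or None. The sketch below mimics that matching logic in plain Python. It is an illustration only, not how lamindb evaluates the query (which runs against the database); the helper names icontains and first are hypothetical, and the descriptions are copied from the outputs on this page.

```python
from typing import Optional

# Descriptions copied from the outputs on this page.
descriptions = [
    "hits from schmidt22 crispra GWS",
    "requirements.txt",
    "Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.",
]


def icontains(text: Optional[str], needle: str) -> bool:
    """Case-insensitive substring test; a missing description never matches."""
    return text is not None and needle.lower() in text.lower()


def first(matches: list) -> Optional[str]:
    """Return the first match, or None when nothing matched."""
    return matches[0] if matches else None


# "tabula sapiens" matches "Tabula Sapiens" because case is ignored.
hit = first([d for d in descriptions if icontains(d, "tabula sapiens")])
# No description mentions this text, so the result is None.
miss = first([d for d in descriptions if icontains(d, "tabula muris")])
print(hit)
print(miss)
```

Because .first() can return None, it is worth checking the result before calling save() on it.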

Save the artifact to the default instance:

artifact.save()
Artifact(uid='dPraor9rU1EofcFb6Wph', description='Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.', key='tabula_sapiens_lung.h5ad', suffix='.h5ad', size=3899435772, hash='8mB1KK2wd51F6HQdvqipcQ', hash_type='sha1-fl', visibility=1, key_is_virtual=False, created_by_id=1, storage_id=2, transform_id=1, run_id=1, updated_at='2024-07-26 14:37:07 UTC')
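
Comparing the artifact representation from the query with the one returned by save() shows which fields the transfer rewrites. The sketch below is a plain-Python comparison of values copied from those two outputs; the dictionaries before_save and after_save are hypothetical stand-ins for illustration, not lamindb objects.

```python
# Fields copied from the two Artifact representations printed above.
before_save = {
    "uid": "dPraor9rU1EofcFb6Wph",
    "key": "tabula_sapiens_lung.h5ad",
    "size": 3899435772,
    "hash": "8mB1KK2wd51F6HQdvqipcQ",
    "created_by_id": 3,
    "transform_id": 8,
    "run_id": 9,
    "updated_at": "2023-10-14 15:40:52 UTC",
}
after_save = {
    "uid": "dPraor9rU1EofcFb6Wph",
    "key": "tabula_sapiens_lung.h5ad",
    "size": 3899435772,
    "hash": "8mB1KK2wd51F6HQdvqipcQ",
    "created_by_id": 1,
    "transform_id": 1,
    "run_id": 1,
    "updated_at": "2024-07-26 14:37:07 UTC",
}

# Split the fields into those that survived the transfer and those rewritten.
unchanged = sorted(k for k in before_save if before_save[k] == after_save[k])
changed = sorted(k for k in before_save if before_save[k] != after_save[k])
print("unchanged:", unchanged)
print("changed:", changed)
```

In this output, the fields that identify the artifact and its content (uid, key, size, hash) are identical, while the provenance fields now point to the user, transform, and run of the current notebook. Integer ids are local to each instance, so equal id values across two instances do not by themselves indicate the same record.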

All features and labels have been transferred, while the data itself stays in its original S3 storage location (s3://lamindata in the output below):

artifact.describe()
Artifact(uid='dPraor9rU1EofcFb6Wph', description='Part of Tabula Sapiens, a benchmark, first-draft human cell atlas.', key='tabula_sapiens_lung.h5ad', suffix='.h5ad', size=3899435772, hash='8mB1KK2wd51F6HQdvqipcQ', hash_type='sha1-fl', visibility=1, key_is_virtual=False, updated_at='2024-07-26 14:37:07 UTC')
  Provenance
    .created_by = 'anonymous'
    .storage = 's3://lamindata'
    .transform = 'Transfer data'
    .run = '2024-07-26 14:37:06 UTC'

The database is populated correspondingly.

ln.view()
****************
* module: core *
****************
Artifact
| id | uid | version | description | key | suffix | type | accessor | size | hash | hash_type | n_objects | n_observations | visibility | key_is_virtual | storage_id | transform_id | run_id | created_by_id | updated_at |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | dPraor9rU1EofcFb6Wph | None | Part of Tabula Sapiens, a benchmark, first-dra... | tabula_sapiens_lung.h5ad | .h5ad | None | None | 3899435772 | 8mB1KK2wd51F6HQdvqipcQ | sha1-fl | None | None | 1 | False | 2 | 1 | 1 | 1 | 2024-07-26 14:37:07.262661+00:00 |
Run
| id | uid | started_at | finished_at | is_consecutive | reference | reference_type | transform_id | report_id | environment_id | created_by_id |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | xvh3aooj8v7Dc3cYGxQT | 2024-07-26 14:37:06.109651+00:00 | None | True | None | None | 1 | None | None | 1 |
Storage
| id | uid | root | description | type | region | instance_uid | run_id | created_by_id | updated_at |
|---|---|---|---|---|---|---|---|---|---|
| 2 | D9BilDV2 | s3://lamindata | None | s3 | us-east-1 | None | 1.0 | 1 | 2024-07-26 14:37:07.260378+00:00 |
| 1 | Ms8IAynZTXx0 | /home/runner/work/lamindb/lamindb/docs/test-tr... | None | local | None | None | NaN | 1 | 2024-07-26 14:37:03.490208+00:00 |
Transform
| id | uid | version | name | key | description | type | reference | reference_type | latest_report_id | source_code_id | created_by_id | updated_at |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ITeOtm7bhtdq5zKv | 1 | Transfer data | transfer | None | notebook | None | None | None | None | 1 | 2024-07-26 14:37:06.102340+00:00 |
User
| id | uid | handle | name | updated_at |
|---|---|---|---|---|
| 1 | 00000000 | anonymous | None | 2024-07-26 14:37:03.486268+00:00 |
******************
* module: bionty *
******************
Source
| id | uid | entity | organism | source | version | in_db | currently_used | source_name | url | md5 | source_website | df_id | run_id | created_by_id | updated_at |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 75 | 3pvh | BioSample | all | ncbi | 2023-09 | False | True | NCBI BioSample attributes | s3://bionty-assets/df_all__ncbi__2023-09__BioS... | 918db9bd1734b97c596c67d9654a4126 | https://www.ncbi.nlm.nih.gov/biosample/docs/at... | None | None | 1 | 2024-07-26 14:37:03.603298+00:00 |
| 74 | 5kwU | Ethnicity | human | hancestro | 3.0 | False | True | Human Ancestry Ontology | https://github.com/EBISPOT/hancestro/raw/3.0/h... | 76dd9efda9c2abd4bc32fc57c0b755dd | https://github.com/EBISPOT/hancestro | None | None | 1 | 2024-07-26 14:37:03.603135+00:00 |
| 73 | 4hcb | DevelopmentalStage | mouse | mmusdv | 2020-03-10 | False | True | Mouse Developmental Stages | http://aber-owl.net/media/ontologies/MMUSDV/9/... | 5bef72395d853c7f65450e6c2a1fc653 | https://github.com/obophenotype/developmental-... | None | None | 1 | 2024-07-26 14:37:03.602968+00:00 |
| 72 | 238S | DevelopmentalStage | human | hsapdv | 2020-03-10 | False | True | Human Developmental Stages | http://aber-owl.net/media/ontologies/HSAPDV/11... | 52181d59df84578ed69214a5cb614036 | https://github.com/obophenotype/developmental-... | None | None | 1 | 2024-07-26 14:37:03.602701+00:00 |
| 71 | 1auD | Drug | all | dron | 2023-03-10 | False | False | Drug Ontology | https://data.bioontology.org/ontologies/DRON/s... | 75e86011158fae76bb46d96662a33ba3 | https://bioportal.bioontology.org/ontologies/DRON | None | None | 1 | 2024-07-26 14:37:03.602358+00:00 |
| 70 | 4uDt | Drug | all | dron | 2024-03-02 | False | True | Drug Ontology | https://data.bioontology.org/ontologies/DRON/s... | 84138459de4f65034e979f4e46783747 | https://bioportal.bioontology.org/ontologies/DRON | None | None | 1 | 2024-07-26 14:37:03.602123+00:00 |
| 69 | 5e83 | BFXPipeline | all | lamin | 1.0.0 | False | True | Bioinformatics Pipeline | s3://bionty-assets/bfxpipelines.json | a7eff57a256994692fba46e0199ffc94 | https://lamin.ai | None | None | 1 | 2024-07-26 14:37:03.601960+00:00 |
# clean up test instance
!lamin delete --force test-transfer
!rm -rf test-transfer
❗ calling anonymously, will miss private instances
💡 deleting instance anonymous/test-transfer