Spatial
# !pip install 'lamindb[jupyter,bionty]'
!lamin init --storage ./test-spatial --schema bionty
→ connected lamindb: testuser1/test-spatial
import lamindb as ln
import bionty as bt
import matplotlib.pyplot as plt
import scanpy as sc
ln.track("daeFs3PkquDW0000")
→ connected lamindb: testuser1/test-spatial
→ created Transform('daeFs3Pk'), started new Run('en31K8KA') at 2024-12-20 15:07:25 UTC
→ notebook imports: bionty==0.53.2 lamindb==0.77.3 matplotlib==3.10.0 scanpy==1.10.4
An example spatial dataset
Here, we use a spatial gene expression dataset from Suo22, measured with Visium.
The dataset has two parts:
a high-resolution image of a slice of fetal liver
a gene expression dataset stored as .h5ad
img_path = ln.core.datasets.file_tiff_suo22()
img = plt.imread(img_path)
plt.imshow(img)
plt.show()
adata = ln.core.datasets.anndata_suo22_Visium10X()
# subset to the same image
adata = adata[adata.obs["img_id"] == "F121_LP1_4LIV"].copy()
adata
AnnData object with n_obs × n_vars = 3027 × 191
obs: 'in_tissue', 'array_row', 'array_col', 'sample', 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'pct_counts_in_top_50_genes', 'pct_counts_in_top_100_genes', 'pct_counts_in_top_200_genes', 'pct_counts_in_top_500_genes', 'mt_frac', 'img_id', 'EXP_id', 'Organ', 'Fetal_id', 'SN', 'Visium_Area_id', 'Age_PCW', 'Digestion time', 'paths', 'sample_id', '_scvi_batch', '_scvi_labels', '_indices', 'total_cell_abundance'
var: 'feature_types', 'genome', 'SYMBOL', 'mt'
obsm: 'NMF', 'means_cell_abundance_w_sf', 'q05_cell_abundance_w_sf', 'q95_cell_abundance_w_sf', 'spatial', 'stds_cell_abundance_w_sf'
# plot where CD45+ leukocytes are in the slice
sc.pl.scatter(adata, "array_row", "array_col", color="ENSG00000081237")
Validate annotations
We’ll register the expression data and the image together as a Collection. First, we validate the gene identifiers and the sample labels of the AnnData object.
curate = ln.Curator.from_anndata(
adata,
var_index=bt.Gene.ensembl_gene_id,
categoricals={"sample": ln.ULabel.name},
organism="human",
)
✓ added 1 record with Feature.name for "columns": 'sample'
curate.validate()
• saving validated records of 'var_index'
✓ added 191 records from public with Gene.ensembl_gene_id for "var_index": 'ENSG00000002586', 'ENSG00000004468', 'ENSG00000004897', 'ENSG00000007312', 'ENSG00000008086', 'ENSG00000008128', 'ENSG00000010278', 'ENSG00000010610', 'ENSG00000012124', 'ENSG00000013725', 'ENSG00000019582', 'ENSG00000026508', 'ENSG00000039068', 'ENSG00000059758', 'ENSG00000062038', 'ENSG00000065883', 'ENSG00000066294', 'ENSG00000070831', 'ENSG00000071991', 'ENSG00000073754', ...
✓ "var_index" is validated against Gene.ensembl_gene_id
• mapping "sample" on ULabel.name
! 1 term is not validated: 'WSSS_F_IMMsp9838712'
→ fix typos, remove non-existent values, or save terms via .add_new_from("sample")
False
curate.add_new_from("sample")
✓ added 1 record with ULabel.name for "sample": 'WSSS_F_IMMsp9838712'
curate.validate()
✓ "var_index" is validated against Gene.ensembl_gene_id
✓ "sample" is validated against ULabel.name
True
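The validate, add_new_from, validate sequence above can be pictured with a toy registry. The sketch below is plain Python and does not use or reproduce lamindb's implementation: `ToyCurator`, its method bodies, and the in-memory `registry` set are hypothetical stand-ins that only mirror the behavior seen in the logs, where validation fails while a term is unknown and passes once the term has been added.

```python
class ToyCurator:
    """Hypothetical stand-in for a curator; not lamindb's implementation."""

    def __init__(self, values, registry):
        self.values = set(values)  # distinct terms found in the column
        self.registry = registry   # set of terms that are already registered

    def non_validated(self):
        # Terms present in the data but missing from the registry.
        return sorted(self.values - self.registry)

    def validate(self):
        # True only if every term is already registered.
        return not self.non_validated()

    def add_new_from(self):
        # Register every term that is currently missing.
        self.registry.update(self.non_validated())


registry = set()  # empty registry, as in a fresh test instance
curator = ToyCurator(["WSSS_F_IMMsp9838712"], registry)

first = curator.validate()         # the sample label is not registered yet
missing = curator.non_validated()  # the terms that blocked validation
curator.add_new_from()
second = curator.validate()        # passes once the label is registered
print(first, missing, second)
```

Keeping validation and registration as separate steps means a typo in a label is surfaced before it is written to the registry.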
Register curated artifact
artifact_ad = curate.save_artifact(description="Suo22 Visium10X image F121_LP1_4LIV")
! 26 unique terms (96.30%) are not validated for name: 'in_tissue', 'array_row', 'array_col', 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'pct_counts_in_top_50_genes', 'pct_counts_in_top_100_genes', 'pct_counts_in_top_200_genes', ...
! did not create Feature records for 26 non-validated names: 'Age_PCW', 'Digestion time', 'EXP_id', 'Fetal_id', 'Organ', 'SN', 'Visium_Area_id', '_indices', '_scvi_batch', '_scvi_labels', 'array_col', 'array_row', 'img_id', 'in_tissue', 'log1p_n_genes_by_counts', 'log1p_total_counts', 'mt_frac', 'n_genes_by_counts', 'paths', 'pct_counts_in_top_100_genes', ...
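The 96.30% in the first warning follows from the column counts: the AnnData summary lists 27 `obs` columns, and only `sample` was registered as a feature, which leaves 26 of 27 unvalidated. A quick check of that arithmetic, with the counts read off the outputs above and illustrative variable names:

```python
n_obs_columns = 27  # columns listed under `obs` in the AnnData summary
n_validated = 1     # only 'sample' was registered as a Feature

n_not_validated = n_obs_columns - n_validated
pct = 100 * n_not_validated / n_obs_columns

print(f"{n_not_validated} unique terms ({pct:.2f}%) are not validated")
```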
artifact_ad.describe()
Artifact .h5ad/AnnData
├── General
│   ├── .uid = 'Xd4tEvEqbi8JqKFA0000'
│   ├── .size = 9743793
│   ├── .hash = '-mhaV6jtrbJtLHCIphbrHg'
│   ├── .n_observations = 3027
│   ├── .path =
│   │   /home/runner/work/lamin-usecases/lamin-usecases/docs/test-spatial/.lamindb/Xd4tEvEqbi8JqKFA0000.h5ad
│   ├── .created_by = testuser1 (Test User1)
│   ├── .created_at = 2024-12-20 15:07:36
│   └── .transform = 'Spatial'
├── Dataset features/.feature_sets
│   ├── var • 191 [bionty.Gene]
│   │   CD99 float
│   │   CD38 float
│   │   CDC27 float
│   │   CD79B float
│   │   CDKL5 float
│   │   CDK11A float
│   │   CD9 float
│   │   CD4 float
│   │   CD22 float
│   │   CD6 float
│   │   CD74 float
│   │   CD44 float
│   │   CDH1 float
│   │   CDK17 float
│   │   CDH3 float
│   │   CDK13 float
│   │   CD84 float
│   │   CDC42 float
│   │   CDH19 float
│   │   CD5L float
│   └── obs • 1 [Feature]
│       sample cat[ULabel] WSSS_F_IMMsp9838712
└── Labels
    └── .ulabels ULabel WSSS_F_IMMsp9838712
Register a collection
artifact_img = ln.Artifact(img_path, description="Suo22 image F121_LP1_4LIV")
artifact_img.save()
• path content will be copied to default storage upon `save()` with key `None` ('.lamindb/xE3OSN2voDazhOBn0000.tiff')
✓ storing artifact 'xE3OSN2voDazhOBn0000' at '/home/runner/work/lamin-usecases/lamin-usecases/docs/test-spatial/.lamindb/xE3OSN2voDazhOBn0000.tiff'
Artifact(uid='xE3OSN2voDazhOBn0000', is_latest=True, description='Suo22 image F121_LP1_4LIV', suffix='.tiff', size=119764004, hash='ZAnyai4Ys01P2fLR_aDIvq', _hash_type='sha1-fl', visibility=1, _key_is_virtual=True, storage_id=1, transform_id=1, run_id=1, created_by_id=1, created_at=2024-12-20 15:07:37 UTC)
collection = ln.Collection([artifact_ad, artifact_img], name="Suo22")
collection.save()
Collection(uid='OFoFJv8sc6UbVXMd0000', is_latest=True, name='Suo22', hash='2cD_KhRukcNMArq-dfyyhg', visibility=1, created_by_id=1, transform_id=1, run_id=1, created_at=2024-12-20 15:07:37 UTC)
# clean up test instance: storage must be empty before the instance can be deleted
collection.delete(permanent=True)
artifact_ad.delete(permanent=True)
artifact_img.delete(permanent=True)
!lamin delete --force test-spatial
!rm -r test-spatial