Spatial

Here, you’ll learn how to manage spatial datasets:

  1. curate and ingest spatial data (spatial1/4)

  2. query & analyze spatial datasets (spatial2/4)

  3. load the collection into memory & train a ML model (spatial3/4)

  4. create and share interactive visualizations with vitessce (spatial4/4)

Spatial omics data integrates molecular profiling (e.g., transcriptomics, proteomics) with spatial information, preserving the spatial organization of cells and tissues. It enables high-resolution mapping of molecular activity within biological contexts, crucial for understanding cellular interactions and microenvironments.

Many spatial technologies exist, including multiplexed imaging, spatial transcriptomics, spatial proteomics, whole-slide imaging, spatial metabolomics, and 3D tissue reconstruction. Data from all of them can be stored in the SpatialData data framework. For more details, see the original publication:

Marconato, L., Palla, G., Yamauchi, K.A. et al. SpatialData: an open and universal data framework for spatial omics. Nat Methods 22, 58–62 (2025). https://doi.org/10.1038/s41592-024-02212-x

Note

A collection of curated spatial datasets in SpatialData format is available on the scverse/spatialdata-db instance.

spatial data vs SpatialData terminology

When we mention spatial data, we refer to data from spatial assays, such as spatial transcriptomics or proteomics, that include spatial coordinates to represent the organization of molecular features in tissue. When we refer to SpatialData, we mean spatial omics data stored in the scverse SpatialData framework.

# pip install 'lamindb[jupyter,bionty]' spatialdata spatialdata-plot
!lamin init --storage ./test-spatial --modules bionty
 initialized lamindb: testuser1/test-spatial
import lamindb as ln
import bionty as bt
import spatialdata as sd
import warnings

warnings.filterwarnings("ignore")

spatial_guide_datasets = ln.Project(name="spatial guide datasets").save()
ln.track(project=spatial_guide_datasets)
 connected lamindb: testuser1/test-spatial
/opt/hostedtoolcache/Python/3.12.11/x64/lib/python3.12/site-packages/xarray_schema/__init__.py:1: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
  from pkg_resources import DistributionNotFound, get_distribution
 created Transform('T1OMY3rtuw2z0000', key='spatial.ipynb'), started new Run('BQdBaxymLEK5qCV1') at 2025-10-30 18:59:14 UTC
 notebook imports: bionty==1.8.1 lamindb==1.15a1 spatialdata==0.5.0
 recommendation: to identify the notebook across renames, pass the uid: ln.track("T1OMY3rtuw2z", project="spatial guide datasets")

Creating artifacts

You can use the from_spatialdata() method to create an Artifact object from a SpatialData object.

example_blobs_sdata = ln.core.datasets.spatialdata_blobs()
example_blobs_sdata
SpatialData object
├── Images
│     ├── 'blobs_image': DataArray[cyx] (3, 512, 512)
│     └── 'blobs_multiscale_image': DataTree[cyx] (3, 512, 512), (3, 256, 256), (3, 128, 128)
├── Labels
│     ├── 'blobs_labels': DataArray[yx] (512, 512)
│     └── 'blobs_multiscale_labels': DataTree[yx] (512, 512), (256, 256), (128, 128)
├── Points
│     └── 'blobs_points': DataFrame with shape: (<Delayed>, 4) (2D points)
├── Shapes
│     ├── 'blobs_circles': GeoDataFrame shape: (5, 2) (2D shapes)
│     ├── 'blobs_multipolygons': GeoDataFrame shape: (2, 1) (2D shapes)
│     └── 'blobs_polygons': GeoDataFrame shape: (5, 1) (2D shapes)
└── Tables
      └── 'table': AnnData (26, 3)
with coordinate systems:
    ▸ 'global', with elements:
        blobs_image (Images), blobs_multiscale_image (Images), blobs_labels (Labels), blobs_multiscale_labels (Labels), blobs_points (Points), blobs_circles (Shapes), blobs_multipolygons (Shapes), blobs_polygons (Shapes)
blobs_af = ln.Artifact.from_spatialdata(
    example_blobs_sdata, key="example_blobs.zarr"
).save()
blobs_af
 writing the in-memory object into cache
INFO     The Zarr backing store has been changed from None the new file path:                                      
         /home/runner/.cache/lamindb/cyXf0Vym1gBwRMIr0000.zarr                                                     
Artifact(uid='cyXf0Vym1gBwRMIr0000', version=None, is_latest=True, key='example_blobs.zarr', description=None, suffix='.zarr', kind='dataset', otype='SpatialData', size=12122461, hash='LZ9HLHkw8HhoOzS7Vt_VSg', n_files=116, n_observations=None, branch_id=1, space_id=1, storage_id=1, run_id=1, schema_id=None, created_by_id=1, created_at=2025-10-30 18:59:16 UTC, is_locked=False)

To retrieve the object from the database, you can, for example, query by key:

blobs_af = ln.Artifact.get(key="example_blobs.zarr")  # returns the Artifact record, not the data
local_zarr_path = blobs_af.cache()  # returns a local path to the cached .zarr store
example_blobs_sdata = (
    blobs_af.load()  # calls sd.read_zarr() on a locally cached .zarr store
)

To view the data lineage:

blobs_af.view_lineage()
_images/d7f9cf06ee1902b74035f7b5e16955bdf24897a22c051bbb35ff8f3c3a078eab.svg

Curating artifacts

For the remainder of the guide, we will work with two 10x Xenium datasets and one 10x Visium H&E image dataset that were previously ingested in raw form into the laminlabs/lamindata instance.

Metadata is stored in two places in the SpatialData object:

  1. Dataset-level metadata is stored in sdata.attrs["sample"].

  2. Measurement-specific metadata is stored in the associated tables in sdata.tables.

Define a schema

We define a lamindb.Schema to curate both sample and table metadata.

Curating different spatial technologies

Reading data from different spatial technologies into SpatialData objects can produce objects that differ considerably in structure and metadata. It can therefore be useful to define technology-specific Schemas by reusing Schema components.

# define features
ln.Feature(name="organism", dtype=bt.Organism).save()
ln.Feature(name="assay", dtype=bt.ExperimentalFactor).save()
ln.Feature(name="disease", dtype=bt.Disease).save()
ln.Feature(name="tissue", dtype=bt.Tissue).save()
ln.Feature(name="celltype_major", dtype=bt.CellType, nullable=True).save()

# define simple schemas
flexible_metadata_schema = ln.Schema(
    name="Flexible metadata", itype=ln.Feature, coerce_dtype=True
).save()
ensembl_gene_ids = ln.Schema(
    name="Spatial var level (Ensembl gene id)", itype=bt.Gene.ensembl_gene_id
).save()

# define composite schema
spatial_schema = ln.Schema(
    name="Spatialdata schema (flexible)",
    otype="SpatialData",
    slots={
        "attrs:sample": flexible_metadata_schema,
        "tables:table:obs": flexible_metadata_schema,
        "tables:table:var.T": ensembl_gene_ids,
    },
).save()
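
The slot keys in the composite schema can be read as colon-separated paths into the SpatialData object: attrs:sample points at sdata.attrs["sample"], and tables:table:var.T at the transposed var of the table named "table". The following minimal sketch spells out that reading. Note that slot_to_expression is a hypothetical helper written for illustration only; it is not part of lamindb and covers just the three key shapes used in this guide.

```python
def slot_to_expression(slot: str, root: str = "sdata") -> str:
    """Translate a slot key such as 'tables:table:obs' into the Python
    expression it is assumed to address (illustration only, not lamindb code)."""
    container, key, *attributes = slot.split(":")
    expression = f'{root}.{container}["{key}"]'
    for attribute in attributes:
        # a trailing '.T' (as in 'var.T') is kept verbatim and reads as a transpose
        expression += f".{attribute}"
    return expression


for slot in ("attrs:sample", "tables:table:obs", "tables:table:var.T"):
    print(f"{slot:20} -> {slot_to_expression(slot)}")
```

Reading the keys this way makes it easier to see which part of a dataset a validation message such as "in slot 'tables:table:obs'" refers to.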

Curate a Xenium dataset

# load first of two cropped Xenium datasets
xenium_aligned_1_sdata = (
    ln.Artifact.using("laminlabs/lamindata")
    .get(key="xenium_aligned_1_guide_min.zarr")
    .load()
)
xenium_aligned_1_sdata
 transferred: Artifact(uid='kVMuYil81BHTwQ9G0001'), Storage(uid='D9BilDV2')
SpatialData object, with associated Zarr store: /home/runner/.cache/lamindb/lamindata/xenium_aligned_1_guide_min.zarr
├── Images
│     ├── 'morphology_focus': DataTree[cyx] (1, 2310, 3027), (1, 1155, 1514), (1, 578, 757), (1, 288, 379), (1, 145, 189)
│     └── 'morphology_mip': DataTree[cyx] (1, 2310, 3027), (1, 1155, 1514), (1, 578, 757), (1, 288, 379), (1, 145, 189)
├── Points
│     └── 'transcripts': DataFrame with shape: (<Delayed>, 8) (3D points)
├── Shapes
│     ├── 'cell_boundaries': GeoDataFrame shape: (1899, 1) (2D shapes)
│     └── 'cell_circles': GeoDataFrame shape: (1812, 2) (2D shapes)
└── Tables
      └── 'table': AnnData (1812, 313)
with coordinate systems:
    ▸ 'aligned', with elements:
        morphology_focus (Images), morphology_mip (Images), transcripts (Points), cell_boundaries (Shapes), cell_circles (Shapes)
    ▸ 'global', with elements:
        morphology_focus (Images), morphology_mip (Images), transcripts (Points), cell_boundaries (Shapes), cell_circles (Shapes)
xenium_curator = ln.curators.SpatialDataCurator(xenium_aligned_1_sdata, spatial_schema)
try:
    xenium_curator.validate()
except ln.errors.ValidationError as error:
    print(error)
! 1 term not validated in feature 'columns' in slot 'attrs:sample': 'panel'
    → fix typos, remove non-existent values, or save terms via: curator.slots['attrs:sample'].cat.add_new_from('columns')
! 10 terms not validated in feature 'columns' in slot 'tables:table:obs': 'cell_id', 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'region', 'dataset', 'celltype_minor'
    → fix typos, remove non-existent values, or save terms via: curator.slots['tables:table:obs'].cat.add_new_from('columns')
! 9 terms not validated in feature 'celltype_major' in slot 'tables:table:obs': 'CAFs', 'Endothelial', 'Myeloid', 'PVL', 'T-cells', 'B-cells', 'Normal Epithelial', 'Plasmablasts', 'Cancer Epithelial'
    → fix typos, remove non-existent values, or save terms via: curator.slots['tables:table:obs'].cat.add_new_from('celltype_major')
9 terms not validated in feature 'celltype_major' in slot 'tables:table:obs': 'CAFs', 'Endothelial', 'Myeloid', 'PVL', 'T-cells', 'B-cells', 'Normal Epithelial', 'Plasmablasts', 'Cancer Epithelial'
    → fix typos, remove non-existent values, or save terms via: curator.slots['tables:table:obs'].cat.add_new_from('celltype_major')
xenium_aligned_1_sdata.tables["table"].obs["celltype_major"] = (
    xenium_aligned_1_sdata.tables["table"]
    .obs["celltype_major"]
    .replace(
        {
            "CAFs": "cancer associated fibroblast",
            "Endothelial": "endothelial cell",
            "Myeloid": "myeloid cell",
            "PVL": "perivascular cell",
            "T-cells": "T cell",
            "B-cells": "B cell",
            "Normal Epithelial": "epithelial cell",
            "Plasmablasts": "plasmablast",
            "Cancer Epithelial": "neoplastic epithelial cell",
        }
    )
)
try:
    xenium_curator.validate()
except ln.errors.ValidationError as error:
    print(error)
! 1 term not validated in feature 'columns' in slot 'attrs:sample': 'panel'
    → fix typos, remove non-existent values, or save terms via: curator.slots['attrs:sample'].cat.add_new_from('columns')
! 10 terms not validated in feature 'columns' in slot 'tables:table:obs': 'cell_id', 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'region', 'dataset', 'celltype_minor'
    → fix typos, remove non-existent values, or save terms via: curator.slots['tables:table:obs'].cat.add_new_from('columns')
! 2 terms not validated in feature 'celltype_major' in slot 'tables:table:obs': 'cancer associated fibroblast', 'neoplastic epithelial cell'
    → fix typos, remove non-existent values, or save terms via: curator.slots['tables:table:obs'].cat.add_new_from('celltype_major')
2 terms not validated in feature 'celltype_major' in slot 'tables:table:obs': 'cancer associated fibroblast', 'neoplastic epithelial cell'
    → fix typos, remove non-existent values, or save terms via: curator.slots['tables:table:obs'].cat.add_new_from('celltype_major')
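
The two validation passes above can be mimicked with a small toy model in plain Python. This is only a sketch under stated assumptions: KNOWN_CELL_TYPES is a hand-written stand-in for the registry lookup, filled with the seven names that passed validation in the output above, and the helpers standardize and not_validated are hypothetical names that do not exist in lamindb.

```python
# Mapping from the dataset's raw labels to ontology-style names (copied from the guide).
CELLTYPE_MAJOR_MAP = {
    "CAFs": "cancer associated fibroblast",
    "Endothelial": "endothelial cell",
    "Myeloid": "myeloid cell",
    "PVL": "perivascular cell",
    "T-cells": "T cell",
    "B-cells": "B cell",
    "Normal Epithelial": "epithelial cell",
    "Plasmablasts": "plasmablast",
    "Cancer Epithelial": "neoplastic epithelial cell",
}

# Assumption: toy registry holding only the names that validated in the guide's output.
KNOWN_CELL_TYPES = {
    "endothelial cell", "myeloid cell", "perivascular cell",
    "T cell", "B cell", "epithelial cell", "plasmablast",
}


def standardize(labels, mapping):
    """Replace labels via the mapping; labels without an entry pass through unchanged."""
    return [mapping.get(label, label) for label in labels]


def not_validated(labels, known):
    """Return the unique labels that are missing from `known`, in first-seen order."""
    missing = []
    for label in labels:
        if label not in known and label not in missing:
            missing.append(label)
    return missing


raw_labels = list(CELLTYPE_MAJOR_MAP)  # the nine raw labels reported by the curator
print(len(not_validated(raw_labels, KNOWN_CELL_TYPES)))
print(not_validated(standardize(raw_labels, CELLTYPE_MAJOR_MAP), KNOWN_CELL_TYPES))
```

In this toy model, all nine raw labels fail the first check and two remain after standardization, matching the curator messages. Those two remaining names are the ones registered as new labels in the next step.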
xenium_curator.slots["tables:table:obs"].cat.add_new_from("celltype_major")
xenium_1_curated_af = xenium_curator.save_artifact(key="xenium1.zarr")
! 1 term not validated in feature 'columns' in slot 'attrs:sample': 'panel'
    → fix typos, remove non-existent values, or save terms via: curator.slots['attrs:sample'].cat.add_new_from('columns')
! 10 terms not validated in feature 'columns' in slot 'tables:table:obs': 'cell_id', 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'region', 'dataset', 'celltype_minor'
    → fix typos, remove non-existent values, or save terms via: curator.slots['tables:table:obs'].cat.add_new_from('columns')
 writing the in-memory object into cache
INFO     The SpatialData object is not self-contained (i.e. it contains some elements that are Dask-backed from    
         locations outside /home/runner/.cache/lamindb/3AYeC9diJJliwPdS0000.zarr). Please see the documentation of 
         `is_self_contained()` to understand the implications of working with SpatialData objects that are not     
         self-contained.                                                                                           
INFO     The Zarr backing store has been changed from                                                              
         /home/runner/.cache/lamindb/lamindata/xenium_aligned_1_guide_min.zarr the new file path:                  
         /home/runner/.cache/lamindb/3AYeC9diJJliwPdS0000.zarr                                                     
xenium_1_curated_af.describe()
Artifact: xenium1.zarr (0000)
├── uid: 3AYeC9diJJliwPdS0000            run: BQdBaxy (spatial.ipynb)
kind: dataset                        otype: SpatialData          
hash: 68_dzfaDidoKacs0glIGNg         size: 33.5 MB               
branch: main                         space: all                  
created_at: 2025-10-30 18:59:35 UTC  created_by: testuser1       
n_files: 148                                                     
├── storage/path: /home/runner/work/lamin-usecases/lamin-usecases/docs/test-spatial/.lamindb/3AYeC9diJJliwPdS.zarr
├── Dataset features
├── attrs:sample (4)                                                                                           
│   assay                           bionty.ExperimentalFactor          10x Xenium                              
│   disease                         bionty.Disease                     ductal breast carcinoma in situ         
│   organism                        bionty.Organism                    human                                   
│   tissue                          bionty.Tissue                      breast                                  
├── tables:table:obs (1)                                                                                       
│   celltype_major                  bionty.CellType                    B cell, T cell, cancer associated fibro…
└── tables:table:var.T (313 biont…                                                                             
    ABCC11                          num                                                                        
    ACTA2                           num                                                                        
    ACTG2                           num                                                                        
    ADAM9                           num                                                                        
    ADGRE5                          num                                                                        
    ADH1B                           num                                                                        
    ADIPOQ                          num                                                                        
    AGR3                            num                                                                        
    AHSP                            num                                                                        
    AIF1                            num                                                                        
    AKR1C1                          num                                                                        
    AKR1C3                          num                                                                        
    ALDH1A3                         num                                                                        
    ANGPT2                          num                                                                        
    ANKRD28                         num                                                                        
    ANKRD29                         num                                                                        
    ANKRD30A                        num                                                                        
    APOBEC3A                        num                                                                        
    APOBEC3B                        num                                                                        
    APOC1                           num                                                                        
└── Labels
    └── .projects                       Project                            spatial guide datasets                  
        .organisms                      bionty.Organism                    human                                   
        .tissues                        bionty.Tissue                      breast                                  
        .cell_types                     bionty.CellType                    endothelial cell, myeloid cell, perivas…
        .diseases                       bionty.Disease                     ductal breast carcinoma in situ         
        .experimental_factors           bionty.ExperimentalFactor          10x Xenium                              

Curate additional Xenium datasets

We can reuse the same schema for a second Xenium dataset by passing it to from_spatialdata(), which validates the dataset when the artifact is saved:

xenium_aligned_2_sdata = (
    ln.Artifact.using("laminlabs/lamindata")
    .get(key="xenium_aligned_2_guide_min.zarr")
    .load()
)

xenium_aligned_2_sdata.tables["table"].obs["celltype_major"] = (
    xenium_aligned_2_sdata.tables["table"]
    .obs["celltype_major"]
    .replace(
        {
            "CAFs": "cancer associated fibroblast",
            "Endothelial": "endothelial cell",
            "Myeloid": "myeloid cell",
            "PVL": "perivascular cell",
            "T-cells": "T cell",
            "B-cells": "B cell",
            "Normal Epithelial": "epithelial cell",
            "Plasmablasts": "plasmablast",
            "Cancer Epithelial": "neoplastic epithelial cell",
        }
    )
)
 transferred: Artifact(uid='KFhRNPqcdoxBCNZt0001')
xenium_2_curated_af = ln.Artifact.from_spatialdata(
    xenium_aligned_2_sdata, key="xenium2.zarr", schema=spatial_schema
).save()
 writing the in-memory object into cache
INFO     The SpatialData object is not self-contained (i.e. it contains some elements that are Dask-backed from    
         locations outside /home/runner/.cache/lamindb/AQNNZvAEg7xBnPMe0000.zarr). Please see the documentation of 
         `is_self_contained()` to understand the implications of working with SpatialData objects that are not     
         self-contained.                                                                                           
INFO     The Zarr backing store has been changed from                                                              
         /home/runner/.cache/lamindb/lamindata/xenium_aligned_2_guide_min.zarr the new file path:                  
         /home/runner/.cache/lamindb/AQNNZvAEg7xBnPMe0000.zarr                                                     
 loading artifact into memory for validation
INFO     The SpatialData object is not self-contained (i.e. it contains some elements that are Dask-backed from    
         locations outside /home/runner/.cache/lamindb/AQNNZvAEg7xBnPMe0000.zarr). Please see the documentation of 
         `is_self_contained()` to understand the implications of working with SpatialData objects that are not     
         self-contained.                                                                                           
! 1 term not validated in feature 'columns' in slot 'attrs:sample': 'panel'
    → fix typos, remove non-existent values, or save terms via: curator.slots['attrs:sample'].cat.add_new_from('columns')
! 10 terms not validated in feature 'columns' in slot 'tables:table:obs': 'cell_id', 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'region', 'dataset', 'celltype_minor'
    → fix typos, remove non-existent values, or save terms via: curator.slots['tables:table:obs'].cat.add_new_from('columns')
 returning schema with same hash: Schema(uid='6cLgKMjC86L4fBl8', name=None, description=None, n=4, is_type=False, itype='Feature', otype=None, dtype=None, hash='L1oI2t9ZRMJIZ-HpbugLJQ', minimal_set=True, ordered_set=False, maximal_set=False, slot=None, branch_id=1, space_id=1, created_by_id=1, run_id=1, type_id=None, validated_by_id=None, composite_id=None, created_at=2025-10-30 18:59:35 UTC, is_locked=False)
 returning schema with same hash: Schema(uid='MIQhXQY4peBBBfj8', name=None, description=None, n=1, is_type=False, itype='Feature', otype=None, dtype=None, hash='n3UpwCoB80EtvbjvdpRAqA', minimal_set=True, ordered_set=False, maximal_set=False, slot=None, branch_id=1, space_id=1, created_by_id=1, run_id=1, type_id=None, validated_by_id=None, composite_id=None, created_at=2025-10-30 18:59:35 UTC, is_locked=False)
 returning schema with same hash: Schema(uid='NW8wNBdsAkkr22X3', name=None, description=None, n=313, is_type=False, itype='bionty.Gene.ensembl_gene_id', otype=None, dtype='num', hash='FFFt-2qmlVALrsMUPNoH0g', minimal_set=True, ordered_set=False, maximal_set=False, slot=None, branch_id=1, space_id=1, created_by_id=1, run_id=1, type_id=None, validated_by_id=None, composite_id=None, created_at=2025-10-30 18:59:35 UTC, is_locked=False)
xenium_2_curated_af.describe()
Artifact: xenium2.zarr (0000)
├── uid: AQNNZvAEg7xBnPMe0000            run: BQdBaxy (spatial.ipynb)
kind: dataset                        otype: SpatialData          
hash: noI1oD6jyNbhK3yysxHjAw         size: 38.9 MB               
branch: main                         space: all                  
created_at: 2025-10-30 18:59:39 UTC  created_by: testuser1       
n_files: 177                                                     
├── storage/path: /home/runner/work/lamin-usecases/lamin-usecases/docs/test-spatial/.lamindb/AQNNZvAEg7xBnPMe.zarr
├── Dataset features
├── attrs:sample (4)                                                                                           
│   assay                           bionty.ExperimentalFactor          10x Xenium                              
│   disease                         bionty.Disease                     ductal breast carcinoma in situ         
│   organism                        bionty.Organism                    human                                   
│   tissue                          bionty.Tissue                      breast                                  
├── tables:table:obs (1)                                                                                       
│   celltype_major                  bionty.CellType                    B cell, T cell, cancer associated fibro…
└── tables:table:var.T (313 biont…                                                                             
    ABCC11                          num                                                                        
    ACTA2                           num                                                                        
    ACTG2                           num                                                                        
    ADAM9                           num                                                                        
    ADGRE5                          num                                                                        
    ADH1B                           num                                                                        
    ADIPOQ                          num                                                                        
    AGR3                            num                                                                        
    AHSP                            num                                                                        
    AIF1                            num                                                                        
    AKR1C1                          num                                                                        
    AKR1C3                          num                                                                        
    ALDH1A3                         num                                                                        
    ANGPT2                          num                                                                        
    ANKRD28                         num                                                                        
    ANKRD29                         num                                                                        
    ANKRD30A                        num                                                                        
    APOBEC3A                        num                                                                        
    APOBEC3B                        num                                                                        
    APOC1                           num                                                                        
└── Labels
    └── .projects                       Project                            spatial guide datasets                  
        .organisms                      bionty.Organism                    human                                   
        .tissues                        bionty.Tissue                      breast                                  
        .cell_types                     bionty.CellType                    endothelial cell, myeloid cell, perivas…
        .diseases                       bionty.Disease                     ductal breast carcinoma in situ         
        .experimental_factors           bionty.ExperimentalFactor          10x Xenium                              

Curate Visium datasets

Analogously, we can curate Visium datasets. Here, we reuse the same schema:

visium_aligned_sdata = (
    ln.Artifact.using("laminlabs/lamindata")
    .get(key="visium_aligned_guide_min.zarr")
    .load()
)
visium_aligned_sdata
 transferred: Artifact(uid='bjH534dxVi1drmLZ0001')
SpatialData object, with associated Zarr store: /home/runner/.cache/lamindb/lamindata/visium_aligned_guide_min.zarr
├── Images
│     ├── 'CytAssist_FFPE_Human_Breast_Cancer_full_image': DataTree[cyx] (3, 1213, 952), (3, 607, 476), (3, 303, 238), (3, 152, 119), (3, 76, 60)
│     ├── 'CytAssist_FFPE_Human_Breast_Cancer_hires_image': DataArray[cyx] (3, 113, 88)
│     └── 'CytAssist_FFPE_Human_Breast_Cancer_lowres_image': DataArray[cyx] (3, 34, 27)
├── Shapes
│     └── 'CytAssist_FFPE_Human_Breast_Cancer': GeoDataFrame shape: (37, 2) (2D shapes)
└── Tables
      └── 'table': AnnData (37, 18085)
with coordinate systems:
    ▸ 'aligned', with elements:
        CytAssist_FFPE_Human_Breast_Cancer_full_image (Images), CytAssist_FFPE_Human_Breast_Cancer_hires_image (Images), CytAssist_FFPE_Human_Breast_Cancer_lowres_image (Images), CytAssist_FFPE_Human_Breast_Cancer (Shapes)
    ▸ 'downscaled_hires', with elements:
        CytAssist_FFPE_Human_Breast_Cancer_hires_image (Images), CytAssist_FFPE_Human_Breast_Cancer (Shapes)
    ▸ 'downscaled_lowres', with elements:
        CytAssist_FFPE_Human_Breast_Cancer_lowres_image (Images), CytAssist_FFPE_Human_Breast_Cancer (Shapes)
    ▸ 'global', with elements:
        CytAssist_FFPE_Human_Breast_Cancer_full_image (Images), CytAssist_FFPE_Human_Breast_Cancer_hires_image (Images), CytAssist_FFPE_Human_Breast_Cancer_lowres_image (Images), CytAssist_FFPE_Human_Breast_Cancer (Shapes)
visium_curated_af = ln.Artifact.from_spatialdata(
    visium_aligned_sdata, key="visium.zarr", schema=spatial_schema
).save()
 writing the in-memory object into cache
INFO     The SpatialData object is not self-contained (i.e. it contains some elements that are Dask-backed from    
         locations outside /home/runner/.cache/lamindb/OsKRqrNTuvRkHVns0000.zarr). Please see the documentation of 
         `is_self_contained()` to understand the implications of working with SpatialData objects that are not     
         self-contained.                                                                                           
INFO     The Zarr backing store has been changed from                                                              
         /home/runner/.cache/lamindb/lamindata/visium_aligned_guide_min.zarr the new file path:                    
         /home/runner/.cache/lamindb/OsKRqrNTuvRkHVns0000.zarr                                                     
 loading artifact into memory for validation
INFO     The SpatialData object is not self-contained (i.e. it contains some elements that are Dask-backed from    
         locations outside /home/runner/.cache/lamindb/OsKRqrNTuvRkHVns0000.zarr). Please see the documentation of 
         `is_self_contained()` to understand the implications of working with SpatialData objects that are not     
         self-contained.                                                                                           
! 7 terms not validated in feature 'columns' in slot 'tables:table:obs': 'in_tissue', 'array_row', 'array_col', 'spot_id', 'region', 'dataset', 'clone'
    → fix typos, remove non-existent values, or save terms via: curator.slots['tables:table:obs'].cat.add_new_from('columns')
! no values were validated for columns!
 starting creation of 17761 Gene records in batches of 10000
! 17 terms not validated in feature 'columns' in slot 'tables:table:var.T': 'ENSG00000284824', 'ENSG00000240224', 'ENSG00000243135', 'ENSG00000112096', 'ENSG00000285162', 'ENSG00000183729', 'ENSG00000285447', 'ENSG00000130723', 'ENSG00000274897', 'ENSG00000215271', 'ENSG00000221995', 'ENSG00000183791', 'ENSG00000263264', 'ENSG00000182584', 'ENSG00000184258', 'ENSG00000277203', 'ENSG00000286265'
    → fix organism 'human', fix typos, remove non-existent values, or save terms via: curator.slots['tables:table:var.T'].cat.add_new_from('columns')
 returning schema with same hash: Schema(uid='6cLgKMjC86L4fBl8', name=None, description=None, n=4, is_type=False, itype='Feature', otype=None, dtype=None, hash='L1oI2t9ZRMJIZ-HpbugLJQ', minimal_set=True, ordered_set=False, maximal_set=False, slot=None, branch_id=1, space_id=1, created_by_id=1, run_id=1, type_id=None, validated_by_id=None, composite_id=None, created_at=2025-10-30 18:59:35 UTC, is_locked=False)
 returning schema with same hash: Schema(uid='cQs2iMBESo7Pv9z8', name='Flexible metadata', description=None, is_type=False, itype='Feature', otype=None, dtype=None, hash='jKTX5yzmVwIdJdHH2ZfMAA', minimal_set=True, ordered_set=False, maximal_set=False, slot=None, branch_id=1, space_id=1, created_by_id=1, run_id=1, type_id=None, validated_by_id=None, composite_id=None, created_at=2025-10-30 18:59:17 UTC, is_locked=False)
 not annotating with 18068 features for slot tables:table:var.T as it exceeds 1000 (ln.settings.annotation.n_max_records)
visium_curated_af.describe()
Artifact: visium.zarr (0000)
├── uid: OsKRqrNTuvRkHVns0000            run: BQdBaxy (spatial.ipynb)
kind: dataset                        otype: SpatialData          
hash: K5LTYUrbsx9OfVjYLWv1tw         size: 5.5 MB                
branch: main                         space: all                  
created_at: 2025-10-30 18:59:52 UTC  created_by: testuser1       
n_files: 136                                                     
├── storage/path: /home/runner/work/lamin-usecases/lamin-usecases/docs/test-spatial/.lamindb/OsKRqrNTuvRkHVns.zarr
├── Dataset features
├── attrs:sample (4)                                                                                           
│   assay                           bionty.ExperimentalFactor          Visium Spatial Gene Expression          
│   disease                         bionty.Disease                     ductal breast carcinoma in situ         
│   organism                        bionty.Organism                    human                                   
│   tissue                          bionty.Tissue                      breast                                  
├── tables:table:obs (-1)                                                                                      
└── tables:table:var.T (18068 bio…                                                                             
└── Labels
    └── .projects                       Project                            spatial guide datasets                  
        .organisms                      bionty.Organism                    human                                   
        .tissues                        bionty.Tissue                      breast                                  
        .diseases                       bionty.Disease                     ductal breast carcinoma in situ         
        .experimental_factors           bionty.ExperimentalFactor          Visium Spatial Gene Expression          
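
The numbers in the Visium log fit together arithmetically, which explains why the tables:table:var.T slot is listed without individual genes. The sketch below only restates figures printed above. The is_annotated function is a hypothetical helper that mirrors the wording of the log message ("exceeds 1000"); it is an assumption about the rule, not lamindb's implementation.

```python
n_vars = 18085           # variables in the Visium table, from the SpatialData repr
n_not_validated = 17     # Ensembl gene IDs reported as not validated
n_max_records = 1000     # limit quoted in the log (ln.settings.annotation.n_max_records)

n_validated = n_vars - n_not_validated  # features left for the var.T slot


def is_annotated(n_features: int, n_max: int) -> bool:
    """Assumed rule: features are annotated unless their count exceeds the limit."""
    return n_features <= n_max


print(n_validated)                               # count reported for the Visium var.T slot
print(is_annotated(n_validated, n_max_records))  # Visium gene features
print(is_annotated(313, n_max_records))          # Xenium panel with 313 genes
```

Under this reading, the 313-gene Xenium panels stay below the limit and are listed gene by gene in describe(), whereas the Visium dataset is not.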

Overview of the curated datasets

visium_curated_af.view_lineage()
_images/cffd7da9bf87932161533c8af8ae6cc7a2f0db36f66d15a2486dbe82ace7c581.svg
ln.Artifact.to_dataframe(features=True, include=["hash", "size"])
 queried for all categorical features with dtype Record and non-categorical features: (0) []
id  uid                   key                              size      hash
7   OsKRqrNTuvRkHVns0000  visium.zarr                      5810515   K5LTYUrbsx9OfVjYLWv1tw
6   bjH534dxVi1drmLZ0001  visium_aligned_guide_min.zarr    5809684   a8rVkf_kjp9To9KI06i03g
5   AQNNZvAEg7xBnPMe0000  xenium2.zarr                     40823410  noI1oD6jyNbhK3yysxHjAw
4   KFhRNPqcdoxBCNZt0001  xenium_aligned_2_guide_min.zarr  40822308  oH569Lh4koYRB1I6AatnGQ
3   3AYeC9diJJliwPdS0000  xenium1.zarr                     35116259  68_dzfaDidoKacs0glIGNg
2   kVMuYil81BHTwQ9G0001  xenium_aligned_1_guide_min.zarr  35115305  8f1qC6IkpSvFw2H8TdhplQ
1   cyXf0Vym1gBwRMIr0000  example_blobs.zarr               12122461  LZ9HLHkw8HhoOzS7Vt_VSg
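
The size column above is given in bytes, whereas describe() printed rounded values such as 33.5 MB. As a quick cross-check, the two agree if MB is read as 2**20 bytes. The helper to_binary_megabytes below is a hypothetical name used only for this illustration.

```python
# Sizes in bytes of the three curated artifacts, copied from the table above.
SIZES_IN_BYTES = {
    "xenium1.zarr": 35116259,
    "xenium2.zarr": 40823410,
    "visium.zarr": 5810515,
}


def to_binary_megabytes(n_bytes: int) -> str:
    """Format a byte count in units of 2**20 bytes with one decimal place."""
    return f"{n_bytes / 2**20:.1f} MB"


for key, n_bytes in SIZES_IN_BYTES.items():
    print(f"{key:14} {n_bytes:>9} bytes = {to_binary_megabytes(n_bytes)}")
```

The formatted values reproduce the 33.5 MB, 38.9 MB, and 5.5 MB shown in the describe() outputs of the three curated artifacts.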
ln.finish()
 finished Run('BQdBaxymLEK5qCV1') after 40s at 2025-10-30 18:59:54 UTC