Curate and ingest spatial data

Now that we’ve analyzed and visualized the example dataset in the previous notebooks, let’s learn how to curate and ingest our own spatial data.

import lamindb as ln
import bionty as bt
import spatialdata as sd

ln.track()
 connected lamindb: testuser1/test-spatial
 created Transform('BfBEOezzv6rs0000', key='spatial3.ipynb'), started new Run('JX8pXm2ngXC9DoIY') at 2026-02-11 19:58:06 UTC
 notebook imports: bionty==2.1.0 lamindb==2.1.2 spatialdata==0.7.2
 recommendation: to identify the notebook across renames, pass the uid: ln.track("BfBEOezzv6rs")

Creating artifacts

You can use the from_spatialdata() method to create an Artifact object from a SpatialData object.

example_blobs_sdata = ln.core.datasets.spatialdata_blobs()
example_blobs_sdata
SpatialData object
├── Images
│     ├── 'blobs_image': DataArray[cyx] (3, 512, 512)
│     └── 'blobs_multiscale_image': DataTree[cyx] (3, 512, 512), (3, 256, 256), (3, 128, 128)
├── Labels
│     ├── 'blobs_labels': DataArray[yx] (512, 512)
│     └── 'blobs_multiscale_labels': DataTree[yx] (512, 512), (256, 256), (128, 128)
├── Points
│     └── 'blobs_points': DataFrame with shape: (<Delayed>, 4) (2D points)
├── Shapes
│     ├── 'blobs_circles': GeoDataFrame shape: (5, 2) (2D shapes)
│     ├── 'blobs_multipolygons': GeoDataFrame shape: (2, 1) (2D shapes)
│     └── 'blobs_polygons': GeoDataFrame shape: (5, 1) (2D shapes)
└── Tables
      └── 'table': AnnData (26, 3)
with coordinate systems:
    ▸ 'global', with elements:
        blobs_image (Images), blobs_multiscale_image (Images), blobs_labels (Labels), blobs_multiscale_labels (Labels), blobs_points (Points), blobs_circles (Shapes), blobs_multipolygons (Shapes), blobs_polygons (Shapes)
blobs_af = ln.Artifact.from_spatialdata(
    example_blobs_sdata, key="example_blobs.zarr"
).save()
blobs_af
 writing the in-memory object into cache
/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/ome_zarr/writer.py:319: FutureWarning: Passing storage-related arguments via **kwargs is deprecated. Please use the 'zarr_store_kwargs' parameter instead. **kwargs will be removed in a future version.
  da_delayed = da.to_zarr(
Artifact(uid='xO5vi2NoJbZOaZbP0000', version_tag=None, is_latest=True, key='example_blobs.zarr', description=None, suffix='.zarr', kind='dataset', otype='SpatialData', size=13054005, hash='isnwpiOBrO4To68q0RBaEw', n_files=79, n_observations=None, branch_id=1, space_id=1, storage_id=3, run_id=3, schema_id=None, created_by_id=3, created_at=2026-02-11 19:58:08 UTC, is_locked=False)

To retrieve the object from the database, you can, for example, query by key:

blobs_af = ln.Artifact.get(key="example_blobs.zarr")  # returns the Artifact record, not the SpatialData object
local_zarr_path = blobs_af.cache()  # returns a local path to the cached .zarr store
example_blobs_sdata = (
    blobs_af.load()  # calls sd.read_zarr() on a locally cached .zarr store
)

To see data lineage:

blobs_af.view_lineage()

Curating artifacts

For the remainder of the guide, we will work with two 10x Xenium datasets and one 10x Visium H&E image dataset that were previously ingested in raw form.

Metadata is stored in two places in the SpatialData object:

  1. Dataset-level metadata is stored in sdata.attrs["sample"].

  2. Measurement-specific metadata is stored in the associated tables in sdata.tables.

Define a schema

We define a lamindb.Schema to curate both sample and table metadata.

Curating different spatial technologies

Reading data from different spatial technologies into SpatialData objects can produce objects with very different structure and metadata. It can therefore be useful to define technology-specific schemas that reuse shared Schema components.

# define features
ln.Feature(name="organism", dtype=bt.Organism).save()
ln.Feature(name="assay", dtype=bt.ExperimentalFactor).save()
ln.Feature(name="disease", dtype=bt.Disease).save()
ln.Feature(name="tissue", dtype=bt.Tissue).save()
ln.Feature(name="celltype_major", dtype=bt.CellType).save()

# define simple schemas
flexible_metadata_schema = ln.Schema(
    name="Flexible metadata", itype=ln.Feature, coerce=True
).save()
ensembl_gene_ids = ln.Schema(
    name="Spatial var level (Ensembl gene id)", itype=bt.Gene.ensembl_gene_id
).save()

# define composite schema
spatial_schema = ln.Schema(
    name="Spatialdata schema (flexible)",
    otype="SpatialData",
    slots={
        "attrs:sample": flexible_metadata_schema,
        "tables:table:obs": flexible_metadata_schema,
        "tables:table:var.T": ensembl_gene_ids,
    },
).save()
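
The slot keys of the composite schema can be read as paths into the SpatialData object. As a rough illustration, the sketch below spells out which attribute each of the three keys appears to point at. The helper slot_to_accessor is hypothetical and not part of lamindb, and the mapping reflects our reading of the keys used in this guide rather than a documented rule.

```python
def slot_to_accessor(slot_key: str, obj_name: str = "sdata") -> str:
    """Translate a slot key into the Python expression it appears to refer to.

    Hypothetical helper for illustration only; it is not a lamindb function.
    """
    parts = slot_key.split(":")
    if parts[0] == "attrs" and len(parts) == 2:
        # "attrs:sample" -> dataset-level metadata dictionary
        return f'{obj_name}.attrs["{parts[1]}"]'
    if parts[0] == "tables" and len(parts) == 3:
        # "tables:table:obs"   -> per-observation metadata of the AnnData table
        # "tables:table:var.T" -> transposed var, so variable ids become columns
        return f'{obj_name}.tables["{parts[1]}"].{parts[2]}'
    raise ValueError(f"unrecognized slot key: {slot_key!r}")


for key in ["attrs:sample", "tables:table:obs", "tables:table:var.T"]:
    print(f"{key:20} -> {slot_to_accessor(key)}")
```

Reading the keys this way also explains why the gene-level schema is attached to var.T: after transposition the Ensembl gene ids are the column names, which is what a Schema validates.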

Curate a Xenium dataset

Create the central query object of our public lamindata instance:

db = ln.DB("laminlabs/lamindata")
# load first of two cropped Xenium datasets
xenium_aligned_1_sdata = db.Artifact.get(key="xenium_aligned_1_guide_min.zarr").load()
xenium_aligned_1_sdata
/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/lamindb/core/storage/_zarr.py:119: UserWarning: SpatialData is not stored in the most current format. If you want to use Zarr v3, please write the store to a new location using `sdata.write()`.
  scverse_obj = with_package("spatialdata", lambda mod: mod.read_zarr(store))
 transferred: Artifact(uid='kVMuYil81BHTwQ9G0001')
/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/zarr/core/group.py:3535: ZarrUserWarning: Object at zmetadata is not recognized as a component of a Zarr hierarchy.
  warnings.warn(
SpatialData object, with associated Zarr store: /home/runner/.cache/lamindb/lamindata/xenium_aligned_1_guide_min.zarr
├── Images
│     ├── 'morphology_focus': DataTree[cyx] (1, 2310, 3027), (1, 1155, 1514), (1, 578, 757), (1, 288, 379), (1, 145, 189)
│     └── 'morphology_mip': DataTree[cyx] (1, 2310, 3027), (1, 1155, 1514), (1, 578, 757), (1, 288, 379), (1, 145, 189)
├── Points
│     └── 'transcripts': DataFrame with shape: (<Delayed>, 8) (3D points)
├── Shapes
│     ├── 'cell_boundaries': GeoDataFrame shape: (1899, 1) (2D shapes)
│     └── 'cell_circles': GeoDataFrame shape: (1812, 2) (2D shapes)
└── Tables
      └── 'table': AnnData (1812, 313)
with coordinate systems:
    ▸ 'aligned', with elements:
        morphology_focus (Images), morphology_mip (Images), transcripts (Points), cell_boundaries (Shapes), cell_circles (Shapes)
    ▸ 'global', with elements:
        morphology_focus (Images), morphology_mip (Images), transcripts (Points), cell_boundaries (Shapes), cell_circles (Shapes)
xenium_curator = ln.curators.SpatialDataCurator(xenium_aligned_1_sdata, spatial_schema)
try:
    xenium_curator.validate()
except ln.errors.ValidationError as error:
    print(error)
! 1 term not validated in feature 'columns' in slot 'attrs:sample': 'panel'
    → fix typos, remove non-existent values, or save terms via: curator.slots['attrs:sample'].cat.add_new_from('columns')
! 10 terms not validated in feature 'columns' in slot 'tables:table:obs': 'cell_id', 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'region', 'dataset', 'celltype_minor'
    → fix typos, remove non-existent values, or save terms via: curator.slots['tables:table:obs'].cat.add_new_from('columns')
! 9 terms not validated in feature 'celltype_major' in slot 'tables:table:obs': 'CAFs', 'Endothelial', 'Myeloid', 'PVL', 'T-cells', 'B-cells', 'Normal Epithelial', 'Plasmablasts', 'Cancer Epithelial'
    → fix typos, remove non-existent values, or save terms via: curator.slots['tables:table:obs'].cat.add_new_from('celltype_major')
9 terms not validated in feature 'celltype_major' in slot 'tables:table:obs': 'CAFs', 'Endothelial', 'Myeloid', 'PVL', 'T-cells', 'B-cells', 'Normal Epithelial', 'Plasmablasts', 'Cancer Epithelial'
    → fix typos, remove non-existent values, or save terms via: curator.slots['tables:table:obs'].cat.add_new_from('celltype_major')
xenium_aligned_1_sdata.tables["table"].obs["celltype_major"] = (
    xenium_aligned_1_sdata.tables["table"]
    .obs["celltype_major"]
    .replace(
        {
            "CAFs": "cancer associated fibroblast",
            "Endothelial": "endothelial cell",
            "Myeloid": "myeloid cell",
            "PVL": "perivascular cell",
            "T-cells": "T cell",
            "B-cells": "B cell",
            "Normal Epithelial": "epithelial cell",
            "Plasmablasts": "plasmablast",
            "Cancer Epithelial": "neoplastic epithelial cell",
        }
    )
)
/tmp/ipykernel_3210/4072479217.py:4: FutureWarning: The behavior of Series.replace (and DataFrame.replace) with CategoricalDtype is deprecated. In a future version, replace will only be used for cases that preserve the categories. To change the categories, use ser.cat.rename_categories instead.
  .replace(
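
The FutureWarning above recommends ser.cat.rename_categories for categorical columns. A minimal sketch of that alternative follows, using a toy pandas Series as a stand-in for sdata.tables["table"].obs["celltype_major"] and only a subset of the mapping. The toy values and variable names are assumptions for illustration, and whether this removes the warning in your pandas version has not been verified here.

```python
import pandas as pd

# Subset of the label mapping used in this guide (old label -> ontology name).
celltype_mapping = {
    "CAFs": "cancer associated fibroblast",
    "T-cells": "T cell",
    "B-cells": "B cell",
}

# Toy stand-in for the categorical obs column; the values are illustrative.
celltype_major = pd.Series(
    ["CAFs", "T-cells", "B-cells", "T-cells"], dtype="category"
)

# Check that every category present in the data has a target name.
unmapped = sorted(set(celltype_major.cat.categories) - set(celltype_mapping))
print("unmapped categories:", unmapped)

# rename_categories keeps the categorical dtype. Categories missing from the
# mapping are passed through unchanged, and the new names must be unique.
renamed = celltype_major.cat.rename_categories(celltype_mapping)
print(renamed.tolist())
```

Because the same mapping is applied to both Xenium datasets in this guide, defining it once and reusing it would also avoid maintaining two copies.
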
try:
    xenium_curator.validate()
except ln.errors.ValidationError as error:
    print(error)
! 1 term not validated in feature 'columns' in slot 'attrs:sample': 'panel'
    → fix typos, remove non-existent values, or save terms via: curator.slots['attrs:sample'].cat.add_new_from('columns')
! 10 terms not validated in feature 'columns' in slot 'tables:table:obs': 'cell_id', 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'region', 'dataset', 'celltype_minor'
    → fix typos, remove non-existent values, or save terms via: curator.slots['tables:table:obs'].cat.add_new_from('columns')
! 2 terms not validated in feature 'celltype_major' in slot 'tables:table:obs': 'cancer associated fibroblast', 'neoplastic epithelial cell'
    → fix typos, remove non-existent values, or save terms via: curator.slots['tables:table:obs'].cat.add_new_from('celltype_major')
2 terms not validated in feature 'celltype_major' in slot 'tables:table:obs': 'cancer associated fibroblast', 'neoplastic epithelial cell'
    → fix typos, remove non-existent values, or save terms via: curator.slots['tables:table:obs'].cat.add_new_from('celltype_major')
xenium_curator.slots["tables:table:obs"].cat.add_new_from("celltype_major")
xenium_1_curated_af = xenium_curator.save_artifact(key="xenium1.zarr")
! 1 term not validated in feature 'columns' in slot 'attrs:sample': 'panel'
    → fix typos, remove non-existent values, or save terms via: curator.slots['attrs:sample'].cat.add_new_from('columns')
! 10 terms not validated in feature 'columns' in slot 'tables:table:obs': 'cell_id', 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'region', 'dataset', 'celltype_minor'
    → fix typos, remove non-existent values, or save terms via: curator.slots['tables:table:obs'].cat.add_new_from('columns')
 writing the in-memory object into cache
/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/ome_zarr/writer.py:319: FutureWarning: Passing storage-related arguments via **kwargs is deprecated. Please use the 'zarr_store_kwargs' parameter instead. **kwargs will be removed in a future version.
  da_delayed = da.to_zarr(
xenium_1_curated_af.describe()
Artifact: xenium1.zarr (0000)
├── uid: NVvRV9OmRnGiNEkC0000            run: JX8pXm2 (spatial3.ipynb)
kind: dataset                        otype: SpatialData           
hash: 9YRvCq3fDh3JoFYHsgsu7g         size: 34.8 MB                
branch: main                         space: all                   
created_at: 2026-02-11 19:58:27 UTC  created_by: testuser1        
n_files: 101                                                      
├── storage/path: /home/runner/work/lamin-usecases/lamin-usecases/docs/test-spatial/.lamindb/NVvRV9OmRnGiNEkC.zarr
├── Dataset features
├── attrs:sample (4)                                                                                           
│   assay                          bionty.ExperimentalFactor            10x Xenium                             
│   disease                        bionty.Disease                       ductal breast carcinoma in situ        
│   organism                       bionty.Organism                      human                                  
│   tissue                         bionty.Tissue                        breast                                 
├── tables:table:obs (1)                                                                                       
│   celltype_major                 bionty.CellType                      B cell, T cell, cancer associated fibr…
└── tables:table:var.T (313 bion…                                                                              
    ABCC11                         num                                                                         
    ACTA2                          num                                                                         
    ACTG2                          num                                                                         
    ADAM9                          num                                                                         
    ADGRE5                         num                                                                         
    ADH1B                          num                                                                         
    ADIPOQ                         num                                                                         
    AGR3                           num                                                                         
    AHSP                           num                                                                         
    AIF1                           num                                                                         
    AKR1C1                         num                                                                         
    AKR1C3                         num                                                                         
    ALDH1A3                        num                                                                         
    ANGPT2                         num                                                                         
    ANKRD28                        num                                                                         
    ANKRD29                        num                                                                         
    ANKRD30A                       num                                                                         
    APOBEC3A                       num                                                                         
    APOBEC3B                       num                                                                         
    APOC1                          num                                                                         
└── Labels
    └── .organisms                     bionty.Organism                      human                                  
        .tissues                       bionty.Tissue                        breast                                 
        .cell_types                    bionty.CellType                      endothelial cell, myeloid cell, periva…
        .diseases                      bionty.Disease                       ductal breast carcinoma in situ        
        .experimental_factors          bionty.ExperimentalFactor            10x Xenium                             

Curate additional Xenium datasets

We can reuse the same schema for a second Xenium dataset by passing it as schema= to from_spatialdata():

xenium_aligned_2_sdata = db.Artifact.get(key="xenium_aligned_2_guide_min.zarr").load()

xenium_aligned_2_sdata.tables["table"].obs["celltype_major"] = (
    xenium_aligned_2_sdata.tables["table"]
    .obs["celltype_major"]
    .replace(
        {
            "CAFs": "cancer associated fibroblast",
            "Endothelial": "endothelial cell",
            "Myeloid": "myeloid cell",
            "PVL": "perivascular cell",
            "T-cells": "T cell",
            "B-cells": "B cell",
            "Normal Epithelial": "epithelial cell",
            "Plasmablasts": "plasmablast",
            "Cancer Epithelial": "neoplastic epithelial cell",
        }
    )
)
/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/lamindb/core/storage/_zarr.py:119: UserWarning: SpatialData is not stored in the most current format. If you want to use Zarr v3, please write the store to a new location using `sdata.write()`.
  scverse_obj = with_package("spatialdata", lambda mod: mod.read_zarr(store))
no parent found for <ome_zarr.reader.Label object at 0x7f11060d7920>: None
no parent found for <ome_zarr.reader.Label object at 0x7f11060d74d0>: None
 transferred: Artifact(uid='KFhRNPqcdoxBCNZt0001')
/tmp/ipykernel_3210/980259403.py:6: FutureWarning: The behavior of Series.replace (and DataFrame.replace) with CategoricalDtype is deprecated. In a future version, replace will only be used for cases that preserve the categories. To change the categories, use ser.cat.rename_categories instead.
  .replace(
xenium_2_curated_af = ln.Artifact.from_spatialdata(
    xenium_aligned_2_sdata, key="xenium2.zarr", schema=spatial_schema
).save()
 writing the in-memory object into cache
/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/ome_zarr/writer.py:319: FutureWarning: Passing storage-related arguments via **kwargs is deprecated. Please use the 'zarr_store_kwargs' parameter instead. **kwargs will be removed in a future version.
  da_delayed = da.to_zarr(
 loading artifact into memory for validation
! 1 term not validated in feature 'columns' in slot 'attrs:sample': 'panel'
    → fix typos, remove non-existent values, or save terms via: curator.slots['attrs:sample'].cat.add_new_from('columns')
! 10 terms not validated in feature 'columns' in slot 'tables:table:obs': 'cell_id', 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'region', 'dataset', 'celltype_minor'
    → fix typos, remove non-existent values, or save terms via: curator.slots['tables:table:obs'].cat.add_new_from('columns')
 returning schema with same hash: Schema(uid='Wvdk05lQQ4YpU9aK', is_type=False, name=None, description=None, n_members=4, coerce=True, flexible=False, itype='Feature', otype=None, hash='W-i-fTcakSftIKm5nVwzGw', minimal_set=True, ordered_set=False, maximal_set=False, branch_id=1, space_id=1, created_by_id=3, run_id=3, type_id=None, created_at=2026-02-11 19:58:27 UTC, is_locked=False)
 returning schema with same hash: Schema(uid='X7n6rmuNvREF2vJu', is_type=False, name=None, description=None, n_members=1, coerce=True, flexible=False, itype='Feature', otype=None, hash='bM96XOcK2ZzKABmpqHYfOQ', minimal_set=True, ordered_set=False, maximal_set=False, branch_id=1, space_id=1, created_by_id=3, run_id=3, type_id=None, created_at=2026-02-11 19:58:27 UTC, is_locked=False)
 returning schema with same hash: Schema(uid='J5O5aFxZVEgzm7KN', is_type=False, name=None, description=None, n_members=313, coerce=None, flexible=False, itype='bionty.Gene.ensembl_gene_id', otype=None, hash='Wk3AsxwE47GcMHPnf3Ub1A', minimal_set=True, ordered_set=False, maximal_set=False, branch_id=1, space_id=1, created_by_id=3, run_id=3, type_id=None, created_at=2026-02-11 19:58:27 UTC, is_locked=False)
xenium_2_curated_af.describe()
Artifact: xenium2.zarr (0000)
├── uid: D5RrDU9Q5hBMA9Hb0000            run: JX8pXm2 (spatial3.ipynb)
kind: dataset                        otype: SpatialData           
hash: NMZ6bST6f0RE-yKK0LgowA         size: 36.8 MB                
branch: main                         space: all                   
created_at: 2026-02-11 19:58:32 UTC  created_by: testuser1        
n_files: 126                                                      
├── storage/path: /home/runner/work/lamin-usecases/lamin-usecases/docs/test-spatial/.lamindb/D5RrDU9Q5hBMA9Hb.zarr
├── Dataset features
├── attrs:sample (4)                                                                                           
│   assay                          bionty.ExperimentalFactor            10x Xenium                             
│   disease                        bionty.Disease                       ductal breast carcinoma in situ        
│   organism                       bionty.Organism                      human                                  
│   tissue                         bionty.Tissue                        breast                                 
├── tables:table:obs (1)                                                                                       
│   celltype_major                 bionty.CellType                      B cell, T cell, cancer associated fibr…
└── tables:table:var.T (313 bion…                                                                              
    ABCC11                         num                                                                         
    ACTA2                          num                                                                         
    ACTG2                          num                                                                         
    ADAM9                          num                                                                         
    ADGRE5                         num                                                                         
    ADH1B                          num                                                                         
    ADIPOQ                         num                                                                         
    AGR3                           num                                                                         
    AHSP                           num                                                                         
    AIF1                           num                                                                         
    AKR1C1                         num                                                                         
    AKR1C3                         num                                                                         
    ALDH1A3                        num                                                                         
    ANGPT2                         num                                                                         
    ANKRD28                        num                                                                         
    ANKRD29                        num                                                                         
    ANKRD30A                       num                                                                         
    APOBEC3A                       num                                                                         
    APOBEC3B                       num                                                                         
    APOC1                          num                                                                         
└── Labels
    └── .organisms                     bionty.Organism                      human                                  
        .tissues                       bionty.Tissue                        breast                                 
        .cell_types                    bionty.CellType                      endothelial cell, myeloid cell, periva…
        .diseases                      bionty.Disease                       ductal breast carcinoma in situ        
        .experimental_factors          bionty.ExperimentalFactor            10x Xenium                             

Curate Visium datasets

Analogously, we can curate a Visium dataset with the same schema:

visium_aligned_sdata = db.Artifact.get(key="visium_aligned_guide_min.zarr").load()
visium_aligned_sdata
/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/lamindb/core/storage/_zarr.py:119: UserWarning: SpatialData is not stored in the most current format. If you want to use Zarr v3, please write the store to a new location using `sdata.write()`.
  scverse_obj = with_package("spatialdata", lambda mod: mod.read_zarr(store))
 transferred: Artifact(uid='bjH534dxVi1drmLZ0001')
/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/zarr/core/group.py:3535: ZarrUserWarning: Object at zmetadata is not recognized as a component of a Zarr hierarchy.
  warnings.warn(
SpatialData object, with associated Zarr store: /home/runner/.cache/lamindb/lamindata/visium_aligned_guide_min.zarr
├── Images
│     ├── 'CytAssist_FFPE_Human_Breast_Cancer_full_image': DataTree[cyx] (3, 1213, 952), (3, 607, 476), (3, 303, 238), (3, 152, 119), (3, 76, 60)
│     ├── 'CytAssist_FFPE_Human_Breast_Cancer_hires_image': DataArray[cyx] (3, 113, 88)
│     └── 'CytAssist_FFPE_Human_Breast_Cancer_lowres_image': DataArray[cyx] (3, 34, 27)
├── Shapes
│     └── 'CytAssist_FFPE_Human_Breast_Cancer': GeoDataFrame shape: (37, 2) (2D shapes)
└── Tables
      └── 'table': AnnData (37, 18085)
with coordinate systems:
    ▸ 'aligned', with elements:
        CytAssist_FFPE_Human_Breast_Cancer_full_image (Images), CytAssist_FFPE_Human_Breast_Cancer_hires_image (Images), CytAssist_FFPE_Human_Breast_Cancer_lowres_image (Images), CytAssist_FFPE_Human_Breast_Cancer (Shapes)
    ▸ 'downscaled_hires', with elements:
        CytAssist_FFPE_Human_Breast_Cancer_hires_image (Images), CytAssist_FFPE_Human_Breast_Cancer (Shapes)
    ▸ 'downscaled_lowres', with elements:
        CytAssist_FFPE_Human_Breast_Cancer_lowres_image (Images), CytAssist_FFPE_Human_Breast_Cancer (Shapes)
    ▸ 'global', with elements:
        CytAssist_FFPE_Human_Breast_Cancer_full_image (Images), CytAssist_FFPE_Human_Breast_Cancer_hires_image (Images), CytAssist_FFPE_Human_Breast_Cancer_lowres_image (Images), CytAssist_FFPE_Human_Breast_Cancer (Shapes)
visium_curated_af = ln.Artifact.from_spatialdata(
    visium_aligned_sdata, key="visium.zarr", schema=spatial_schema
).save()
 writing the in-memory object into cache
/opt/hostedtoolcache/Python/3.12.12/x64/lib/python3.12/site-packages/ome_zarr/writer.py:319: FutureWarning: Passing storage-related arguments via **kwargs is deprecated. Please use the 'zarr_store_kwargs' parameter instead. **kwargs will be removed in a future version.
  da_delayed = da.to_zarr(
 loading artifact into memory for validation
! 7 terms not validated in feature 'columns' in slot 'tables:table:obs': 'in_tissue', 'array_row', 'array_col', 'spot_id', 'region', 'dataset', 'clone'
    → fix typos, remove non-existent values, or save terms via: curator.slots['tables:table:obs'].cat.add_new_from('columns')
! no values were validated for columns!
 starting creation of 17759 Gene records in batches of 10000
! 19 terms not validated in feature 'columns' in slot 'tables:table:var.T': 'ENSG00000284824', 'ENSG00000240224', 'ENSG00000243135', 'ENSG00000112096', 'ENSG00000285162', 'ENSG00000183729', 'ENSG00000285447', 'ENSG00000130723', 'ENSG00000148362', 'ENSG00000274897', 'ENSG00000139656', 'ENSG00000215271', 'ENSG00000221995', 'ENSG00000183791', 'ENSG00000263264', 'ENSG00000182584', 'ENSG00000184258', 'ENSG00000277203', 'ENSG00000286265'
    → fix typos, remove non-existent values, or save terms via: curator.slots['tables:table:var.T'].cat.add_new_from('columns')
 returning schema with same hash: Schema(uid='Wvdk05lQQ4YpU9aK', is_type=False, name=None, description=None, n_members=4, coerce=True, flexible=False, itype='Feature', otype=None, hash='W-i-fTcakSftIKm5nVwzGw', minimal_set=True, ordered_set=False, maximal_set=False, branch_id=1, space_id=1, created_by_id=3, run_id=3, type_id=None, created_at=2026-02-11 19:58:27 UTC, is_locked=False)
 returning schema with same hash: Schema(uid='O4XFddP81m1Ui9Uo', is_type=False, name='Flexible metadata', description=None, n_members=None, coerce=True, flexible=True, itype='Feature', otype=None, hash='jKTX5yzmVwIdJdHH2ZfMAA', minimal_set=True, ordered_set=False, maximal_set=False, branch_id=1, space_id=1, created_by_id=3, run_id=3, type_id=None, created_at=2026-02-11 19:58:09 UTC, is_locked=False)
 not annotating with 18066 features for slot tables:table:var.T as it exceeds 1000 (ln.settings.annotation.n_max_records)
visium_curated_af.describe()
Artifact: visium.zarr (0000)
├── uid: xbqj9qA4vAABG05R0000            run: JX8pXm2 (spatial3.ipynb)
kind: dataset                        otype: SpatialData           
hash: kJIvLVRIyClw6UllZ57qYg         size: 4.4 MB                 
branch: main                         space: all                   
created_at: 2026-02-11 19:58:44 UTC  created_by: testuser1        
n_files: 91                                                       
├── storage/path: /home/runner/work/lamin-usecases/lamin-usecases/docs/test-spatial/.lamindb/xbqj9qA4vAABG05R.zarr
├── Dataset features
├── attrs:sample (4)                                                                                           
│   assay                          bionty.ExperimentalFactor            Visium Spatial Gene Expression         
│   disease                        bionty.Disease                       ductal breast carcinoma in situ        
│   organism                       bionty.Organism                      human                                  
│   tissue                         bionty.Tissue                        breast                                 
├── tables:table:obs (None)                                                                                    
└── tables:table:var.T (18066 bi…                                                                              
└── Labels
    └── .organisms                     bionty.Organism                      human                                  
        .tissues                       bionty.Tissue                        breast                                 
        .diseases                      bionty.Disease                       ductal breast carcinoma in situ        
        .experimental_factors          bionty.ExperimentalFactor            Visium Spatial Gene Expression         

Overview of the curated datasets

visium_curated_af.view_lineage()
ln.Artifact.to_dataframe(features=True, include=["hash", "size"])
 queried for all categorical features of dtypes Record or ULabel and non-categorical features: (0) []
id  uid                   key                                                hash                    size
10  xbqj9qA4vAABG05R0000  visium.zarr                                        kJIvLVRIyClw6UllZ57qYg  4649914
9   bjH534dxVi1drmLZ0001  visium_aligned_guide_min.zarr                      a8rVkf_kjp9To9KI06i03g  5809684
8   D5RrDU9Q5hBMA9Hb0000  xenium2.zarr                                       NMZ6bST6f0RE-yKK0LgowA  38636977
7   KFhRNPqcdoxBCNZt0001  xenium_aligned_2_guide_min.zarr                    oH569Lh4koYRB1I6AatnGQ  40822308
6   NVvRV9OmRnGiNEkC0000  xenium1.zarr                                       9YRvCq3fDh3JoFYHsgsu7g  36539570
5   kVMuYil81BHTwQ9G0001  xenium_aligned_1_guide_min.zarr                    8f1qC6IkpSvFw2H8TdhplQ  35115305
4   xO5vi2NoJbZOaZbP0000  example_blobs.zarr                                 isnwpiOBrO4To68q0RBaEw  13054005
1   8sPWscz3SICG1D8t0001  xenium/2.0.0/Xenium_V1_humanLung_Cancer_FFPE_o...  jgalhtHw00CzuZA_jrTygw  7045222972
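
The size column above is in bytes, whereas describe() prints rounded sizes such as 34.8 MB. As a quick arithmetic check, the sketch below divides the byte counts of the three curated artifacts by 1024² and rounds to one decimal. The results match the describe() output shown earlier, which suggests (but does not prove) that those values use binary megabytes. The helper bytes_to_mib is hypothetical and not part of lamindb.

```python
# Sizes in bytes, copied from the table above.
sizes_in_bytes = {
    "xenium1.zarr": 36539570,
    "xenium2.zarr": 38636977,
    "visium.zarr": 4649914,
}


def bytes_to_mib(n_bytes: int) -> float:
    """Convert bytes to binary megabytes (1 MiB = 1024**2 bytes), rounded to 0.1."""
    return round(n_bytes / 1024**2, 1)


sizes_in_mib = {key: bytes_to_mib(size) for key, size in sizes_in_bytes.items()}
for key, size in sizes_in_mib.items():
    print(f"{key:14} {size} MiB")
```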
ln.finish()
 finished Run('JX8pXm2ngXC9DoIY') after 38s at 2026-02-11 19:58:45 UTC