Query & integrate data

import lamindb as ln
import bionty as bt

ln.track("wukchS8V976U0000")

→ connected lamindb: testuser1/test-facs

→ created Transform('wukchS8V976U0000'), started new Run('OkCdRnbO...') at 2025-07-29 19:26:02 UTC

→ notebook imports: bionty==1.6.1 lamindb==1.10a1

Inspect the CellMarker registry

Inspect your aggregated cell marker registry as a DataFrame:

bt.CellMarker.df().head()

	uid	name	synonyms	description	gene_symbol	ncbi_gene_id	uniprotkb_id	space_id	source_id	organism_id	run_id	created_at	created_by_id	_aux	branch_id
id
41	67ZpJGSKNFyE	CD14|19	None	None	None	None	None	1	NaN	1	2	2025-07-29 19:25:56.622000+00:00	1	None	1
36	5c4A0r7gMiGw	CD95		None	FAS	2194	P49327	1	12.0	1	2	2025-07-29 19:25:56.612000+00:00	1	None	1
37	3IPMBjs68Vy1	CXCR4		None	CXCR4	7852	P61073	1	12.0	1	2	2025-07-29 19:25:56.612000+00:00	1	None	1
38	525YfNUB967z	CD49B		None	ITGA2	3673	P17301	1	12.0	1	2	2025-07-29 19:25:56.612000+00:00	1	None	1
39	1iLDs6cZIpxj	CD69		None	CD69	969	Q07108	1	12.0	1	2	2025-07-29 19:25:56.612000+00:00	1	None	1

Search for a marker (synonym-aware):

bt.CellMarker.search("PD-1").df().head(2)

	uid	name	synonyms	description	gene_symbol	ncbi_gene_id	uniprotkb_id	space_id	source_id	organism_id	run_id	created_at	created_by_id	_aux	branch_id
id
29	33vFR1q26vnM	PD1	PID1|PD-1|PD 1	None	PDCD1	5133	A0A0M3M0G7	1	12	1	1	2025-07-29 19:25:45.005000+00:00	1	None	1
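
To illustrate what synonym-aware matching means, the sketch below resolves a query against both the `name` and the pipe-delimited `synonyms` values shown in the output above. It is a plain-Python illustration under simplifying assumptions (exact, case-insensitive matching only) and not how `search()` is implemented; the helper `find_marker` and the in-memory `records` list are hypothetical.

```python
# Hypothetical in-memory copy of a few registry rows (name, synonyms),
# taken from the outputs shown above.
records = [
    {"name": "PD1", "synonyms": "PID1|PD-1|PD 1"},
    {"name": "CD95", "synonyms": ""},
    {"name": "CXCR4", "synonyms": ""},
]


def find_marker(query, rows):
    """Return names whose name or any synonym equals the query (case-insensitive)."""
    wanted = query.strip().lower()
    hits = []
    for row in rows:
        # Synonyms are stored as a single pipe-delimited string.
        synonyms = [s for s in row["synonyms"].split("|") if s]
        candidates = [row["name"], *synonyms]
        if any(c.lower() == wanted for c in candidates):
            hits.append(row["name"])
    return hits


print(find_marker("PD-1", records))
print(find_marker("cxcr4", records))
```

Here the query "PD-1" resolves to the record named `PD1` because it matches one of that record's synonyms, which mirrors the search result above.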

Look up markers with auto-complete:

markers = bt.CellMarker.lookup()
markers.cd8
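
Auto-complete works because each record is exposed as an attribute with an identifier-safe name. The sketch below shows one way such a mapping could be built with the standard library; the normalization rule in `to_attribute` is an assumption for illustration and may differ from what `lookup()` actually does.

```python
import re
from types import SimpleNamespace

# Marker names that appear on this page.
names = ["CD8", "CD95", "CXCR4", "CD49B", "CD69", "PD1"]


def to_attribute(name):
    """Turn a marker name into a lowercase, identifier-safe attribute name."""
    key = re.sub(r"\W+", "_", name.strip().lower())
    # Identifiers cannot start with a digit, so prefix such keys.
    return f"_{key}" if key[:1].isdigit() else key


# Hypothetical stand-in for the object returned by `lookup()`.
lookup = SimpleNamespace(**{to_attribute(n): n for n in names})

print(lookup.cd8)
print(to_attribute("CD14|19"))
```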

Query panels and datasets by marker, e.g., find the artifacts whose flow panel contains 'CD8':

panels_with_cd8 = ln.FeatureSet.filter(cell_markers=markers.cd8).all()

ln.Artifact.filter(feature_sets__in=panels_with_cd8).df()

	uid	key	description	suffix	kind	otype	size	hash	n_files	n_observations	_hash_type	_key_is_virtual	_overwrite_versions	space_id	storage_id	schema_id	version	is_latest	run_id	created_at	created_by_id	_aux	branch_id
id
1	nFN3OYFHF2lZhW090000	None	Alpert19	.h5ad	dataset	AnnData	33450144	pQFB5xL-IDMLc9WgaYGJlg	None	166537	md5	True	False	1	1	None	None	True	1	2025-07-29 19:25:48.243000+00:00	1	{'af': {'0': True}}	1
2	SOYU78ZTkwOCvU530000	None	Oetjen18_t1	.h5ad	dataset	AnnData	46546520	BkQOx3xp3OR4FoOq4CsuJA	None	241552	md5	True	False	1	1	None	None	True	2	2025-07-29 19:25:57.133000+00:00	1	{'af': {'0': True}}	1
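
The query above runs in two steps: first select the panels that contain the marker, then select the artifacts linked to any of those panels. The sketch below reproduces that logic on plain dictionaries. The panel names, their contents, the artifact-to-panel links, and `other_dataset` are all hypothetical; only the descriptions `Alpert19` and `Oetjen18_t1` come from the output above.

```python
# Hypothetical panels (feature sets) and their markers; contents are illustrative.
panels = {
    "panel_a": {"CD8", "CD69", "CXCR4"},
    "panel_b": {"CD8", "CD95"},
    "panel_c": {"CD49B"},
}

# Hypothetical links from artifacts to panels.
artifact_panels = {
    "Alpert19": ["panel_a"],
    "Oetjen18_t1": ["panel_b"],
    "other_dataset": ["panel_c"],
}

# Step 1: panels that contain the marker.
panels_with_cd8 = {name for name, markers in panels.items() if "CD8" in markers}

# Step 2: artifacts linked to at least one of those panels.
matching = [
    artifact
    for artifact, linked in artifact_panels.items()
    if panels_with_cd8.intersection(linked)
]

print(sorted(panels_with_cd8))
print(matching)
```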

Access registries:

features = ln.Feature.lookup()

Find the cell markers shared between two artifacts:

artifacts = ln.Artifact.filter(feature_sets__in=panels_with_cd8).list()

shared_markers = artifacts[0].features["var"] & artifacts[1].features["var"]
shared_markers.list("name")
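
The `&` in the snippet above keeps only the markers present in both artifacts. The same idea can be sketched with plain Python sets; the marker names assigned to each artifact here are hypothetical and only serve to show the intersection.

```python
# Hypothetical marker names for two artifacts; real panels will differ.
markers_first = {"CD8", "CD69", "CXCR4", "CD95"}
markers_second = {"CD8", "CD95", "CD49B"}

# `&` on sets keeps only the elements present in both.
shared = markers_first & markers_second

# Sort for a stable, readable listing.
shared_names = sorted(shared)
print(shared_names)
```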