Changelog 2025
2025-10-07 db 1.12.1
2025-10-07 db 1.12.0
⚠️ Run lamin migrate deploy
All instances connected to LaminHub have been migrated and there is no need to act.
If you are an admin of a self-managed instance, please migrate your database with lamin migrate deploy.
The database migrations in this release consist of new fields and auxiliary tables.
For details see the Compatibility Matrix and the source code.
Highlights:
✨ Introduce a work_dir setting and use it to manage notebooks and scripts across directories PR @falexwolf
✨ Add PyTorch Lightning integration with a Callback class PR PR Guide @Zethson
✨ Allow virtual keys with custom real storage keys PR @Koncopd
Improvements to the Record class:
🚸 Expand Record so that it can act as an ontology akin to ULabel PR @falexwolf
🚸 Add conditional unique constraint for Record.name given type and space PR @falexwolf
Database schema:
🏗️ Enable dtype=User PR @falexwolf
🏗️ Enable attaching markdown blocks to all entities PR2 @falexwolf
🏗️ Add an is_locked field to SQLRecord PR1 PR2 @falexwolf
🏗️ Make description fields unlimited length for all registries and increase the max length for Artifact.key and Transform.key to 1024 chars PR @falexwolf
🏗️ Add a description field to Project, replace Person registry with Record registry PR @falexwolf
🏗️ Make feature dtype non-nullable on the database level if is_type=False PR @falexwolf
UX:
🚸 Add the current branch and the main branch by default for filtering PR @Koncopd
🚸 Enable creating projects from CLI PR @falexwolf
🚸 Consistent delete permanent argument across QuerySet, BaseSQLRecord and SQLRecord PR @Koncopd
🚸 Raise an error in AnnDataCurator when schema.otype != 'AnnData', no exceptions for otype is None PR @sunnyosun
🚸 Replace default organism warning with targeted message during validation error PR @falexwolf
🚸 Change logging for django module variable reset to debug PR @falexwolf
🚸 Improve logging when violating versioning constraints PR @falexwolf
🚸 De-duplicate data lineage in Collection.cache() PR @falexwolf
Bugs:
🐛 Fix dealing with uniqueness constraints of soft-deleted ontology records PR @sunnyosun
🐛 Extend query_relatives() to respect branches PR @falexwolf
🐛 Fix small chunksize for very large files when uploading to s3 PR @Koncopd
🐛 Set run status code to completion upon ln.tracked() PR @falexwolf
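The chunk size fix above touches a general constraint: an S3 multipart upload may consist of at most 10,000 parts, so a fixed small chunk size cannot cover a very large file. The following is a minimal sketch of size-aware chunking. The helper names and the 50 MiB default are assumptions made for illustration, and this is not LaminDB's actual implementation.

```python
# Hypothetical helpers illustrating size-aware chunking; not LaminDB's actual code.
S3_MAX_PARTS = 10_000            # an S3 multipart upload allows at most 10,000 parts
S3_MIN_PART_SIZE = 5 * 1024**2   # every part except the last must be at least 5 MiB


def choose_chunksize(file_size: int, default: int = 50 * 1024**2) -> int:
    """Return a chunk size in bytes that keeps the upload within the part limit."""
    # Smallest chunk size for which ceil(file_size / chunksize) <= S3_MAX_PARTS
    required = -(-file_size // S3_MAX_PARTS)  # integer ceiling division
    return max(default, required, S3_MIN_PART_SIZE)


def n_parts(file_size: int, chunksize: int) -> int:
    """Number of parts needed to upload file_size bytes with the given chunk size."""
    return -(-file_size // chunksize)
```

With these assumed defaults, a 1 TiB file would need more than 10,000 parts at 50 MiB per part, whereas the size-aware chunk size stays within the limit.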
Code organization:
🎨 Expose main CLI commands as functions on the lamin_cli root API for LaminR PR @falexwolf
🎨 Deprecate str_as_cat in Feature.from_dict() PR @Zethson
🎨 No longer allow upload as a transform type PR @falexwolf
Dependencies:
2025-09-23 db 1.11.3
Note: v1.11.2 didn't update internal dependencies correctly and was yanked on PyPI.
🐛 Fix inconsistency between .get() and .filter() due to branch_id PR @Koncopd
🐛 Add copy() to iteration over sys.modules PR @falexwolf
🚸 Throw a clearer error when creating a schema without features & components and is_type=True PR @sunnyosun
🚸 Fix the cache warning on ln.track() in scripts on Windows PR @Koncopd
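The copy() fix above addresses a general Python pitfall: a dictionary such as sys.modules must not change size while it is being iterated, and an import triggered during the loop can do exactly that. A minimal sketch of the pattern follows. Both function names are hypothetical, and the second function reproduces the failure on a plain dict rather than on sys.modules itself.

```python
import sys


def loaded_module_names() -> list[str]:
    """List names of loaded modules, iterating over a snapshot of sys.modules."""
    # .copy() takes a shallow snapshot, so imports triggered while looping
    # cannot resize the dictionary that is being iterated
    return [name for name, module in sys.modules.copy().items() if module is not None]


def resize_while_iterating() -> str:
    """Reproduce the failure mode on a plain dict and return the error message."""
    registry = {"a": 1}
    try:
        for name in registry:
            registry[name + "_new"] = 2  # mutating the dict during iteration
    except RuntimeError as error:
        return str(error)
    return "no error"
```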
2025-09-18 db 1.11.1
2025-09-14 db 1.11.0 | bionty 1.7.0 | wetlab 1.5.0
🏗️ Always import the entire public API & enable re-connecting in same Python session PR1 PR2 PR3 PR4 @falexwolf @Koncopd
Additional package re-organization
Because users tend to forget commonly used extras in installation commands, a few optional dependencies are now part of the default dependencies.
🏗️ Include wetlab & bionty in lamindb dependencies PR @falexwolf
🏗️ Integrate the 3 small Jupyter-related dependencies PR @falexwolf
🏗️ Pause maintenance of clinicore PR @falexwolf
Data validation:
✨ Support validating attrs of pandas DataFrames PR @Zethson
import lamindb as ln

study_metadata_schema = ln.Schema([
    ln.Feature(name="temperature", dtype=float).save(),
    ln.Feature(name="experiment", dtype=str).save(),
]).save()
schema = ln.Schema(
    slots={"attrs": study_metadata_schema},  # validates keys in df.attrs
    otype="DataFrame",
).save()
✨ Support validating linked features: replace str_as_ulabel argument with schema argument in .features.add_values() and enable passing feature annotations to ln.Artifact PR PR @Zethson
import lamindb as ln

schema = ln.Schema([
    ln.Feature(name="species", dtype=str).save(),
    ln.Feature(name="split", dtype=str).save(),
]).save()
artifact = ln.Artifact(
    "./test_file.txt",
    key="test.txt",
    features={"species": "bird", "split": "train"},
    schema=schema,
).save()
✨ Unstructured slot validation of scverse datastructures PR @Zethson
Other features:
✨ Fine-grained storage access management PR @falexwolf
✨ Allow getting artifacts by path via Artifact.get(path="...") PR @Koncopd
✨ Soft delete for all entities PR @falexwolf @Zethson
✨ Implement lazy artifact saving for streaming data PR @Koncopd
CELLxGENE-compliant curation:
✨ Enable validating & annotating datasets with a CELLxGENE schema. Corresponding new guide. PR @Zethson
Bug fixes:
🐛 Include a project feature in a sheet and test it PR @falexwolf
🐛 Add missing RecordUser link model PR @falexwolf
🐛 Fix queryset.to_dataframe(include=...) with using and annotations PR @Koncopd
🐛 Fix adding a column to a cloud AnnData object with consolidated metadata PR @Koncopd
🐛 Do not reset latest version when deleting latest versions of folder artifacts PR @Koncopd
Refactoring:
♻️ Rename .list() to .to_list(), .from_df() to .from_dataframe() PR @Koncopd
♻️ Refactor croissant file mapping PR @falexwolf
♻️ Refactor and prettify login() and lamin login PR @falexwolf
♻️ Add type record constraint PR @chaichontat
♻️ Transfer feature based on uid instead of name PR @sunnyosun
♻️ Simplify init_storage() PR @falexwolf
♻️ Adapt to new convention for tracking R environments PR @falexwolf
UX:
🚸 Better lamin delete and getting versioned entities from trash PR @falexwolf
🚸 Migrate easily in presence of fine-grained access connection strings PR @falexwolf
🚸 Print warning when creating spaces on SQLite or local instances PR @falexwolf
🚸 Expose available spaces for a user on fine-grained access instances PR @Koncopd
🚸 Parse and populate title in R notebooks on lamin save PR @Koncopd
🚸 Allow updating reports of notebooks on lamin save PR @Koncopd
Docs:
Upgrades:
Bionty:
✨ Add a script to register new standard ontology in bionty-assets PR @namsaraeva
♻️ Add indexes PR @falexwolf
Wetlab:
Assets:
Use cases:
📝 Refactor bulk RNA curation guide to being schema-based PR @Zethson
⬆️ Adapt sc-imaging2 and facs2 to anndata 0.12 PR @namsaraeva PR @sunnyosun
2025-08-12 db 1.10.2
✨ Add ProjectRecord for annotating sheets with projects PR @falexwolf
🚸 Ask for additional confirmation when creating storage locations through switching storage settings PR @falexwolf
🐛 Fix AnnDataAccessor for AnnData objects with indices stored as integers PR @Koncopd
🐛 Annotate artifacts passed to Curator with Schema PR @Zethson
2025-08-06 db 1.10.1
2025-07-29 db 1.10.0
Features.
✨ Add curate_from_croissant() to curate from MLCommons Croissant files PR @falexwolf
✨ Allow receiving and adding extra parameters for managed buckets PR @Koncopd
✨ Enable reverting database migrations PR @falexwolf
✨ Allow getting settings via the CLI PR @falexwolf
UX.
🚸 Stricter hash uniqueness on artifact and more indexes PR @falexwolf
🚸 Hide VitessceConfig artifacts PR @falexwolf
🚸 Move .datasets from .core to .examples PR @falexwolf
🚸 Cache branch and space settings PR @falexwolf
Refactors and bug fixes.
♻️ Adapt huggingface sync to the changes in their API PR @Koncopd
🐛 Fix auto-search for corrupted local storage location PR @falexwolf
2025-07-23 db 1.9.1
🐛 Fix keep-artifacts-local mode when no local storage location is found PR @falexwolf
🐛 Enable anonymous users to access public folders on AWS S3 PR @Koncopd
🚸 Create records under subtype via Curator.add_new_from PR @sunnyosun
🚸 On transform delete, delete TransformProject links because they might be protected through a run of the same transform PR @falexwolf
2025-07-21 db 1.9.0 | bionty 1.6.1
Features.
✨ Enable validating & annotating datasets with a CELLxGENE schema. Corresponding new guide. PR @Zethson
✨ Add lamin annotate, enable string-based annotation with non-ulabels, overhaul CLI docs PR @falexwolf
UX.
🚸 Enable ln.track() and ln.finish() for notebooks running on remote servers PR @falexwolf
🚸 Improve UX of working in keep-artifacts-local mode PR @falexwolf
🚸 Expose storage, branch, and space in the Artifact constructor PR @falexwolf
Bug fixes.
🐛 Fix search_local_root in keep-artifacts-local for storage roots without access PR @Koncopd
🐛 Fix cache synchronization for timestamps with a fractional part PR PR @Koncopd
🐛 Rework get_storage_region() to make it reliable and use it in upath.to_url() PR @Koncopd
🐛 Fix creation of a Schema with is_type=True PR @sunnyosun
🐛 Fix index hash calculation for Schema PR @sunnyosun
Performance.
⚡️ Speed up describe() by 6x PR @falexwolf
⚡️ Speed up lamin connect by 2x PR @falexwolf
⚡️ Implement performant synchronization for directories PR @Koncopd
Docs.
📝 Overhaul the Curate datasets guide PR @sunnyosun
📝 Add an actual README.md PR @falexwolf
Bionty.
♻️ New error message when url of ontology doesn't exist PR @namsaraeva
♻️ Remove Disease constructor overloads PR @namsaraeva
♻️ Update visibility of source.dataframe_artifact PR @sunnyosun
♻️ Make sure ensembl organism has a synonyms column PR @sunnyosun
♻️ Adapt to renaming of UPath.synchronize() to UPath.synchronize_to() PR PR @Koncopd
♻️ Remove pronto warning filters PR @namsaraeva
♻️ Add logging to Ontology.to_df() PR @namsaraeva
2025-07-14 db 1.8.0
🚸 Improve the experience of working in keep-artifacts-local mode PR @falexwolf
⚡️ Use native polars Object Store by default PR @Koncopd
🚸 Improve error message when attempting to curate against unsaved schema PR @Zethson
🚸 Improve UX for labeling unsaved records from other instances PR @Zethson
🐛 Properly ignore ln.track() and tracking warnings on read-only connections PR @Koncopd
2025-07-07 db 1.7.1
2025-07-06 db 1.7.0 | bionty 1.6.0
⚠️ Consider lamin migrate deploy
All instances connected to LaminHub have been migrated and there is no need to act.
If you are an admin of a self-managed instance, please migrate your database with lamin migrate deploy.
The database migrations in this release consist of new link tables that can interfere with certain deletion calls.
Features.
✨ Enable switching branches and spaces via the CLI PR @falexwolf
✨ Upload the R environment tracked in LaminR PR @falexwolf
✨ Create export functionality for records, enable annotating artifacts by records, enable tracking lineage of records, treat sheets as a record type PR PR PR @falexwolf
✨ Support cat_filters to enable specifying Source versions PR @Zethson
✨ Add dtype path PR @sunnyosun
UX improvements.
🚸 Improved storage location management: conveniently create & delete storage locations PR PR PR PR PR @falexwolf @Koncopd
🚸 Introduce run status codes, save source code & run environment of scripts upon ln.track() instead of ln.finish() PR @falexwolf
🚸 Simplify default transfer mode: no longer transfer annotations PR @falexwolf
🚸 Add settings.is_connected to check if the current instance is properly connected for use PR @Koncopd
🚸 No longer persist instance settings on ln.connect() PR @Koncopd
🚸 Case-insensitive uniqueness and lower-case pre-defined branch and space names PR @falexwolf
🚸 Warn when no categorical values are validated rather than throwing an error, e.g. when validating column names of a dataframe or labels of a label vector PR @sunnyosun
🚸 Introduce two-column layout in Artifact.describe() and add information like space, branch, and kind PR @falexwolf
🚸 Support nested sub types in curation PR PR @sunnyosun @falexwolf
🚸 Enable inferring all features of a queryset of records or artifacts by passing features="queryset" to .df() PR @falexwolf
🚸 Mark data transfers via transform key __lamindb_transfer__ PR @falexwolf
🚸 More aggressive checks for anon_public requests to S3 for public instances PR @Koncopd
🚸 Mark upload failures of artifacts via a private boolean indicating a successful save event: artifact._is_saved_to_storage_location PR @Koncopd
🚸 Correct treatment of carriage returns in logs PR @Koncopd
🚸 Add keep with default "first" to Lookup class PR @sunnyosun
🚸 Batched bulk saving for persisting large numbers of records efficiently PR @sunnyosun
🚸 Define all exceptions in lamindb_setup.errors PR @falexwolf
🚸 Leverage pandera's lazy validation PR @falexwolf
🚸 Enable setting overwrite_versions in the Artifact constructor PR @falexwolf
🚸 Default to hiding artifacts with kind = "__lamindb_run__" in regular queries PR @falexwolf
🚸 Mark as py.typed to allow mypy inspection and fix return type of BaseSQLRecord.save PR PR @ap--
🚸 Enable passing branch and space everywhere PR @falexwolf
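The batched bulk saving item above follows a common pattern: split a large collection of records into fixed-size batches and persist each batch with a single call. The changelog does not describe how LaminDB implements it, so the following is only a generic sketch. The names batched and save_in_batches and the save_batch callback are hypothetical.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator


def batched(records: Iterable, batch_size: int) -> Iterator[list]:
    """Yield successive lists of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    # islice pulls the next batch_size items; an empty list ends the loop
    while batch := list(islice(iterator, batch_size)):
        yield batch


def save_in_batches(
    records: Iterable, save_batch: Callable[[list], None], batch_size: int = 10_000
) -> int:
    """Call save_batch once per batch and return the number of batches written."""
    n_batches = 0
    for batch in batched(records, batch_size):
        save_batch(batch)  # hypothetical callback, e.g. one bulk insert per batch
        n_batches += 1
    return n_batches
```

Bounding the batch size keeps memory use and the size of each database statement predictable, at the cost of a few more round trips.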
Bug fixes.
🐛 Auto-version scripts in absence of ln.track() PR @falexwolf
🐛 Fix updates of re-loaded Schema auxiliary fields PR @Zethson
🐛 Fix a bug preventing backwards compatibility of instance settings files PR @Koncopd
🐛 Have transfer comply with settings.annotation.n_max_records and fix a bug related to repeated schema transfer PR @falexwolf
🐛 Fix transfer from instance with superset of modules to instance with subset of modules in presence of schema-annotation in the additional modules PR @falexwolf
Bionty changes.
🚸 Allow using public references without prefix filters PR @sunnyosun
⬆️ Add new ontology versions & fix registration bugs PR PR @Zethson
♻️ Downgrade sources versions in sources.yaml PR @namsaraeva
♻️ Do not upload dataframe if instance is bionty-assets PR @sunnyosun
♻️ Make treatment of Record link tables consistent with lamindb PR @falexwolf
🐛 Fix import_source with non-existing parents in the reference PR @namsaraeva
Deprecated code.
🔥 Remove long-deprecated ln.setup.load() and LaminDB v1 migration logic PR @falexwolf
2025-06-10 db 1.6.2
2025-06-03 db 1.6.1 | bionty 1.5.0
Bionty.
✨ Flexible ontology sources PR @sunnyosun
LaminDB.
🚸 Enable passing --branch and --space to lamin save PR @falexwolf
🐛 Fix query of feature-associated labels from non-ULabel registries PR @sunnyosun
2025-06-01 db 1.6.0 | bionty 1.4.0
⚠️ Consider lamin migrate deploy
All instances connected to LaminHub have been migrated and there is no need to act.
If you are an admin of a self-managed instance, please migrate your database with lamin migrate deploy.
The migrations in this release do not break old LaminDB clients with the exception of writing to the Param registry: the data in the corresponding SQL table got moved into the Feature registry.
The bulk of database-level changes was made in this PR @falexwolf.
remove unique constraint from Feature.name
replace hard unique constraint on Transform.hash and Artifact.hash with conditional unique constraint: hash can be duplicated for different keys
new names for how instances are referred to for type, these don't clash with the new record concept: ULabel.ulabels, Feature.features, Schema.schemas, Project.projects
default space uid is now "A" for "All"
Feature._expect_many now defaults to None so that the auto-display of single values as opposed to a set makes sense, and a user can enforce one (single) or the other (many) in the future
hash is populated for all FeatureValue records so that there is an easy way to universally identify a unique feature value
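The conditional unique constraint on hash mentioned above can be pictured with partial unique indexes. The changelog does not state the exact condition LaminDB uses, so the sketch below is an assumption: the table, the column names artifact_key and hash, and both index conditions are hypothetical, and SQLite from the standard library stands in for the real database.

```python
import sqlite3


def make_connection() -> sqlite3.Connection:
    """Create an in-memory table with conditional (partial) unique indexes."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE artifact (id INTEGER PRIMARY KEY, artifact_key TEXT, hash TEXT NOT NULL)"
    )
    # Assumed rule 1: with a key, only the (key, hash) pair has to be unique
    con.execute(
        "CREATE UNIQUE INDEX unique_key_hash ON artifact (artifact_key, hash) "
        "WHERE artifact_key IS NOT NULL"
    )
    # Assumed rule 2: without a key, the hash alone has to be unique
    con.execute(
        "CREATE UNIQUE INDEX unique_hash_if_no_key ON artifact (hash) "
        "WHERE artifact_key IS NULL"
    )
    return con


def try_insert(con: sqlite3.Connection, artifact_key, hash_value: str) -> bool:
    """Insert a row and report whether the unique indexes accepted it."""
    try:
        con.execute(
            "INSERT INTO artifact (artifact_key, hash) VALUES (?, ?)",
            (artifact_key, hash_value),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

Under these assumed rules, the same hash can be stored under two different keys, while repeating an identical key and hash pair is rejected.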
Changes to registries.
🏗️ Integrate the Param into the Feature registry PR @falexwolf: the change is backward compatible on the Python/R level; on the SQL level, records are transferred from the lamindb_param table to the lamindb_feature table during migrations
✨ Introduce a Branch registry PR @falexwolf
♻️ Rename Record to SQLRecord PR PR @falexwolf
✨ Introduce a flexible Record registry to manage any kind of entity without database migrations PR @falexwolf
Data curation.
✨ Add schema-based TiledbsomaExperimentCurator PR PR @Zethson
✨ Support curating lists as values in DataFrameCurator PR @sunnyosun
Bug fixes.
🐛 Fix transfer for cases in which genes are insufficiently populated PR @falexwolf
Dependency changes.
⬆️ No longer install contenttypes PR @falexwolf
UX improvements.
🚸 No longer duplicate tracking of predecessors through the corresponding link table on Transform PR @falexwolf
🚸 Add is_run_input to Artifact.get() and Collection.get() PR @Koncopd
🚸 Clearer error in parse_cat_dtype if cat dtype contains a module name and the module is not found PR @Koncopd
🚸 Better error message when user passes manual uid to track() + anticipate that the user might want to create new transforms in some cases also if hash matches PR @falexwolf
🚸 Improve UX of setting relationships of unsaved records PR @Zethson
🚸 Improve DoesNotExist error message upon DBRecord.get() PR @Zethson
♻️ Set current space when transferring records PR @Koncopd
♻️ Mark internal lamindb-produced artifacts with kind="__lamindb__" instead of _branch_code=0 PR @falexwolf
2025-05-13 db 1.5.3
2025-05-13 db 1.5.2
🐛 Reset SpatialData path when accessing the in-memory representation PR @Zethson
🚸 Do not validate twice within Artifact.from_X(...) when passing schema PR @falexwolf
2025-05-08 db 1.5.1
🐛 Fix a too strict unique constraint in composite schemas PR @falexwolf
🐛 Fix display of parents & children in view_parents(with_children) PR @falexwolf
⬆️ Adapt save_tiledbsoma_experiment to tiledbsoma==1.16.2 PR @Koncopd
2025-05-07 db 1.5.0 | bionty 1.3.2
Data lineage.
🚸 Make notebook & script tracking via ln.track() robust to renames PR @falexwolf
✨ Enable executing notebooks via jupyter nbconvert --execute PR @falexwolf
CLI updates.
✨ Enable cloud paths for lamin save PR @Koncopd
lamin save s3://my-bucket/my-file.txt
✨ Enable labeling with project during lamin save PR @falexwolf
lamin save ./my-folder --project my-project
Streaming artifacts.
✨ Enable polars in Artifact.open() and Collection.open() PR @Koncopd
✨ Enable .load(), .open(), and .mapped() on query sets of artifacts PR @Koncopd
Curation & schemas.
✨ Enable curating the index of a dataframe PR @falexwolf
schema = ln.Schema(
    features=[
        ln.Feature(name="required_feature", dtype=str).save(),
    ],
    index=ln.Feature(name="sample", dtype=ln.ULabel).save(),
).save()
🚸 Enable passing a ULabel type to dtype PR @falexwolf
perturbation_type = ln.ULabel.get(name="Perturbation")  # perturbation_type.is_type is True
ln.Feature(name="perturbation", dtype=perturbation_type)
🚸 Handle schema updates decently PR @falexwolf
🚸 Do not annotate with more than n_max_records = 1000 PR @falexwolf
🚸 Introduce a submodule lamindb.examples with schemas PR @falexwolf
🚸 Enable validating against nested dicts in spatialdata PR @falexwolf
🚸 Better handle validation of ensembl gene IDs and add curator representation PR @Zethson
🚸 Prettier Schema.describe() PR @sunnyosun
🚸 AnnData: enable explicit transposition in var schema definition PR @falexwolf
🚚 Rename the components argument of Schema() to slots PR @falexwolf
🐛 Fix respecting schema.ordered_set in DataFrame validation PR @sunnyosun
Bulk annotation with features & queries via features.
✨ Support feature dtype dict PR @falexwolf
ln.Feature(name="metadata_details", dtype=dict).save()
🚸 For artifacts, improve (1) bulk annotation with features + (2) queries by features PR @falexwolf
General UX improvements.
🚸 Do not raise exceptions on problems with copy_or_move_to_cache within Artifact.save PR @Koncopd
🚸 Allow passing key to save_vitessce_config() PR @namsaraeva
Docs.
📝 Document uid generation, prettify API reference docs PR @falexwolf
Refactoring.
General refactoring.
♻️ Eliminate monkey patching of django.db.models.QuerySet and django.db.models.Manager PR @Koncopd
♻️ Avoid non-lazy loads of settings on import of lamindb.models PR @Koncopd
Refactoring for curation & schemas.
♻️ Restore validation error messages & add their fine-grained testing PR @falexwolf
♻️ Can save csv artifacts in DataFrameCurator PR @sunnyosun
♻️ Clearer naming conventions in the internal curator codebase PR @falexwolf
♻️ Separate CatManager usage for .cat attribute and as legacy interface PR @falexwolf
♻️ Separate legacy curators from new curators PR @falexwolf
♻️ Execute curator examples and also show them in the curation guide PR @falexwolf
♻️ Refactor annotating with inferred feature sets PR @falexwolf
Fine-grained access management (in beta).
🚸 Better access management errors on Record.save() PR @Koncopd
🐛 Fix .using with fine-grained access instances and permissions test PR @Koncopd
✅ Temp table based authentication (adapt tests) PR @Koncopd
🚸 Delete version family if user wants to retain store by passing storage=False to artifact.delete(), but retain warning PR @falexwolf
2025-04-25 bionty 1.3.1
🐛 Fixed downloading old Ensembl versions. PR @sunnyosun
If you upgraded to bionty 1.3.0 and used Ensembl versions below 108, please clear the cached ontology source files.
import bionty as bt
import shutil
shutil.rmtree(bt.base.settings.dynamicdir)
2025-04-24 R 1.1.0
LaminR is now documented on docs.lamin.ai.
The previous docs site laminr.lamin.ai continues to host developer docs.
📝 Update documentation site to match the main docs website PR PR @lazappi
👷 Separate Seurat analysis from rest of the introduction notebook PR @falexwolf
♻️ Make R and Python quickstarts parallel PR @falexwolf
♻️ Move setup.Rmd to lamin-docs PR @falexwolf
New features.
✨ Improved Python dependency management with reticulate, deprecated install_lamindb() PR @lazappi
✨ Add tracking of the R environment using pak lockfiles PR @lazappi
Bug fixes.
🐛 Enable setting wrapped object slots like artifact$description, artifact$key, etc. PR @lazappi
🐛 Fix an issue that was preventing lamin_connect() from being run multiple times with the same instance PR @lazappi
🐛 Properly clear and delete temporary instances created using lamin_init_temp() PR @lazappi
Other changes.
2025-04-15 db 1.4.0 | bionty 1.3.0
✨ Add schema as an argument to Artifact.from_X(). PR @falexwolf
artifact = ln.Artifact.from_df(df, key="my_dataset.parquet", schema=schema).save()
✨ Enable defining simple schemas that merely enforce a feature identifier type. PR @falexwolf
schema = ln.Schema(itype=ln.Feature).save()  # <-- enforce valid feature identifiers, no need to define specific required features
✨ Enable defining optional features on a per-schema level & improve schema hash calculation. PR @sunnyosun
schema = ln.Schema(
    features=[
        ln.Feature(name="sample_id", dtype=str).save(),  # required
        ln.Feature(name="sample_name", dtype=str).with_config(optional=True),  # optional
    ],
).save()
✨ Introduce lamin run with a Modal backend. PR @ragyhaddad
lamin run my_script.py --project my_project  # <-- will run the script on Modal
✨ Support auto-download of Ensembl genes of all organisms. Guide PR @sunnyosun
gene_ontology = bt.base.Gene(source="ensembl", organism="rabbit", version="release-103")
gene_ontology.register_source_in_lamindb()  # register the new ontology source in lamindb
source = bt.Source.get(entity="bionty.Gene", name="ensembl", organism="rabbit", version="release-103")
bt.Gene.import_source(source=source)  # import all genes from that source
🚸 Enable querying by features & params through Artifact.filter() and Run.filter(). Guide PR @falexwolf
ln.Artifact.filter(scientist="Barbara McClintock")
User experience.
🚸 from_source no longer returns None but throws a NoResultFound exception if the lookup in the public ontology fails PR @sunnyosun
🚸 Allow renaming artifacts & transforms within the same version family PR @falexwolf
🚸 Better support minimal_set, maximal_set, ordered_set in curators PR @sunnyosun
🚸 Enable passing the stem uid to lamin save PR @falexwolf
🚸 No longer throw an error but merely print a warning when attempting to update a schema PR @falexwolf
🚸 Enable plain notebook uploads by making a default run for notebook in case no run is found PR @falexwolf
🚸 Enable authenticating and setting the current instance through environment variables PR @falexwolf
🚸 Show link to hub in view_lineage() and render lineage through graphviz also in scripts PR @falexwolf
🚸 Order IsVersioned.versions query set PR @falexwolf
🚸 Do not print warning about missing schema modules PR @falexwolf
Refactors.
♻️ Eliminate duplicated parsing & record creation during curation PR @falexwolf
♻️ Remove verbosity and organism arguments on CatManager level PR PR @falexwolf
♻️ Organize categorical curation code with CatColumn PR @sunnyosun
♻️ Add return_graph argument to view_lineage() PR @lazappi
Docs.
📝 Compare lamindb with pydantic and pandera in an FAQ doc PR @falexwolf
📝 Document accessing any Ensembl genes PR @sunnyosun
Bugs.
🐛 Fix validation of var_index PR @sunnyosun
🐛 Fix numcodecs==0.16.0 incompatibility with zarr v2 PR @Koncopd
🐛 Fix organism passing to from_source PR @sunnyosun
🐛 Return an empty set not a dict for modules in instance settings PR @falexwolf
Bionty.
🚸 Make the default organism "human" instead of None PR @falexwolf
⬆️ Support Python 3.13 & remove support for Python 3.9 PR @Zethson
♻️ Improve Ensembl prefix detection PR @sunnyosun
♻️ Use UPath.synchronize in s3_bionty_assets PR @Koncopd
2025-03-27 db 1.3.2 | bionty 1.2.1
🐛 Fix bionty ontology sources sync through reticulate PR @falexwolf
🐛 Fix data transfer when the target instance has no schema modules PR @falexwolf
2025-03-26 db 1.3.1 | bionty 1.2.0
In Bionty, you can now add custom ontology sources through the Source registry.
import bionty as bt
import pandas as pd

df = pd.read_csv("./our_inhouse_genes.csv")  # a csv describing gene metadata e.g. from parsing a GTF file
custom_source = bt.Source(entity="bionty.Gene", organism="human", name="Our genes", version="2025-04-01").save()
bt.Gene.add_source(custom_source, df=df)  # couple the custom source to the Gene registry
Detailed changes
Bionty now relies on a single file source.yaml to reference public sources.
✨ Enable updating existing records to a new ontology PR PR @sunnyosun
✨ Robust support of custom sources PR @sunnyosun
♻️ Refactor sync_public_sources PR @sunnyosun
♻️ Refactor default source configuration PR @sunnyosun
♻️ Make EFO parsing the same as other ontologies PR @sunnyosun
♻️ No longer use local source yaml files PR @sunnyosun
♻️ Move source tests from lamindb to bionty PR @sunnyosun
♻️ Standardize organism scientific names from ensembl source PR @sunnyosun
♻️ Increase uid length for Source to 8 chars PR @falexwolf
LaminDB changes.
🐛 Enable transferring features pointing to multiple labels PR @sunnyosun
🐛 More extensive validation for updates to artifact.key and artifact.suffix PR @falexwolf
🚸 Refactor conventions for files written during init: the SQLite file is now .lamindb/lamin.db and the storage marker is .lamindb/storage_uid.txt PR @falexwolf
🚸 Make upload of large directories more robust by reducing batch size PR @Koncopd
🚸 Avoid requiring coerce_dtype for "int" and "float" in case an integer or float pd.Series.dtype only deviates by numerical precision/range PR @falexwolf
🚸 In AnnDataCurator, make 'obs' schema optional and allow 'uns' schema PR @falexwolf
2025-03-16 db 1.3.0
New features.
✨ Add schema-based SpatialDataCurator PR1 PR2 PR3 @Zethson
✨ Add schema-based MuDataCurator PR @sunnyosun
✨ Add lamin get for artifacts and lamin load for collections PR @Zethson @falexwolf
Other changes.
⬆️ Support CELLxGENE schema 5.2.0 PR1 PR2 @sunnyosun
🚸 Skip ln.track() when connected in read-only mode PR @falexwolf
🚸 Error if trying to register an instance without a storage in the hub PR @Koncopd
🚸 Refactor organism constraints during validation PR @sunnyosun
🚸 Add more constructor signatures and specific inherited types PR @falexwolf
🚸 No logging message if database is behind by minor version PR @falexwolf
📝 Re-structure curation guides PR1 PR2 @falexwolf
📝 Integrate tutorials into introduction guide PR @falexwolf
2025-03-10 R 1.0.0
✨ laminr now has feature parity with lamindb. PR @lazappi
Run install_lamindb(), which will ensure lamindb >= 1.2 in the Python environment used by reticulate.
Replace db <- connect() with ln <- import_module("lamindb") and see the "Detailed changes" dropdown.
The ln object is largely similar to the db object in laminr < v1 and matches lamindb's Python API (. → $).
Detailed changes
What | Before | After |
---|---|---|
Connect to the default LaminDB instance | | |
Start tracking | | |
Get an artifact from another instance | | |
Create an artifact from a path | | |
Finish tracking | | |
See the updated "Get started" vignette for more information.
User-facing changes:
Add an import_module() function to import Python modules with additional functionality, e.g., import_module("lamindb") for lamindb
Add functions for accessing more lamin CLI commands
Add a new "Introduction" vignette that replicates the code from the Python lamindb introduction guide
Internal changes:
Add an internal wrap_python() function to wrap Python objects while replacing Python methods with R methods as needed, leaving most work to {reticulate}
Update the internal check_requires() function to handle Python packages
Add custom cache()/load() methods to the Artifact class
Add custom track()/finish() methods to the lamindb module
2025-03-09 db 1.2.0
✨ Enable auto-linking entities to projects. Guide PR @falexwolf
ln.track(project="My project")
🚸 Better support for spatialdata with Artifact.from_spatialdata() and artifact.load(). PR1 PR2 @Zethson
🚸 Introduce .slots in Schema, Curator, and artifact.features to access schemas and curators by dataset slot. PR @sunnyosun
schema.slots["obs"]  # -> schema for .obs slot of AnnData
curator.slots["obs"]  # -> curator for .obs slot of AnnData
artifact.features["obs"]  # -> feature set for .obs slot of AnnData
🏗️ Re-structured the internal API away from monkey-patching Django models. PR @falexwolf
⚠️ Use of internal API
If you used the internal API, you might experience a breaking change. The most drastic change is that all internal registry-related functionality is now re-exported under lamindb.models.
🚸 When re-creating an Artifact, link subsequent runs instead of updating .run and linking previous runs. PR @falexwolf
On the hub.
More details here. @chaichontat
Before | After |
---|---|
An artifact is only shown as an output for the latest run that created the artifact. Previous runs don't show it. | All runs that (re-)create an artifact show it as an output. |
More changes:
✨ Enable Artifact.open() and Artifact.load() for .gz files PR @Koncopd
🐛 Fix passing a path to ln.track() when no path found by nbproject PR @Koncopd
🐛 Do not overwrite ._state_db of records when the current instance is passed to .using PR @Koncopd
🚸 Do not show track warning for read-only connections PR @Koncopd
🚸 Raise NotImplementedError in Artifact.load() if there is no loader PR @Koncopd
2025-02-27 db 1.1.1
🚸 Make the obs and var DataFrameCurator objects accessible via AnnDataCurator.slots PR @sunnyosun
🚸 Better error message upon re-creation of schema with same name and different hash PR @falexwolf
🚸 Raise consistency error if a source path suffix doesn't match the artifact key suffix PR @falexwolf
🚸 Automatically add missing columns upon DataFrameCurator.standardize() if nullable is True PR @falexwolf
🚸 Allow specifying fsspec upload options in Artifact.save PR @Koncopd
🚸 Populate Artifact.n_observations in Artifact.from_df() PR @Koncopd
🐛 Run pip freeze with current python interpreter PR @ap--
🐛 Fix notebook re-run with same hash PR @falexwolf
2025-02-18 db 1.1.0
⚠️ The FeatureSet registry got renamed to Schema.
All your code is backward compatible. The Schema registry encompasses feature sets as a special case.
✨ Conveniently track functions including inputs, outputs, and parameters with a decorator: ln.tracked(). PR1 PR2 @falexwolf
import lamindb as ln

@ln.tracked()
def subset_dataframe(
    input_artifact_key: str,  # all arguments tracked as parameters of the function run
    output_artifact_key: str,
    subset_rows: int = 2,
    subset_cols: int = 2,
) -> None:
    artifact = ln.Artifact.get(key=input_artifact_key)
    df = artifact.load()  # auto-tracked as input
    new_df = df.iloc[:subset_rows, :subset_cols]
    ln.Artifact.from_df(new_df, key=output_artifact_key).save()  # auto-tracked as output
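To picture how a decorator can record every argument of a call as a parameter, here is a minimal standard-library sketch. It is not how ln.tracked() is implemented. The names tracked_sketch, RUN_LOG, and subset are hypothetical, and an in-memory list stands in for a run registry.

```python
import functools
import inspect

RUN_LOG: list[dict] = []  # hypothetical in-memory stand-in for a run registry


def tracked_sketch(func):
    """Record the function name and all bound arguments before each call."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()  # include parameters left at their default values
        RUN_LOG.append({"function": func.__name__, "params": dict(bound.arguments)})
        return func(*args, **kwargs)

    return wrapper


@tracked_sketch
def subset(n_rows: int, n_cols: int = 2) -> int:
    """Toy function: return the number of cells in the subset."""
    return n_rows * n_cols


result = subset(3)  # logs n_rows=3 and the default n_cols=2
```

Binding the call against the signature is what lets defaults such as n_cols be captured even when the caller omits them.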
✨ Make sub-types of ULabel, Feature, Schema, Project, Param, and Reference. PR @falexwolf
On the hub.
More details here. @awgaan @chaichontat
Before |
After |
---|---|
perturbation = ln.ULabel(name="Perturbation", is_type=True).save()
ln.ULabel(name="DMSO", type=perturbation).save()
ln.ULabel(name="IFNG", type=perturbation).save()
✨ Use an overhauled dataset curation flow. @falexwolf @Zethson @sunnyosun
support persisting validation constraints as a pandera-compatible schema
support validating any feature type, no longer just categoricals
make the relationship between features, dataset schema, and curator evident
Detailed changes for the overhauled curation flow.
⚠️ The API gained the lamindb.curators module as the new way to access Curator classes for different data structures.
This release introduces the schema-based DataFrameCurator and AnnDataCurator
The old-style curation flow for categoricals based on lamindb.Curator.from_objecttype() continues to work
Before |
After |
---|---|
Key PRs.
✨ Overhaul curation guides + enable default values and filters on valid categories for features PR @falexwolf
✨ Schema-based curators: AnnDataCurator PR @falexwolf
✨ Schema-based curators: DataFrameCurator PR @falexwolf
Enabling PRs.
✨ Allow passing artifact to Curator PR @sunnyosun
🎨 A ManyToMany between Schema.components and .composites PR @falexwolf
♻️ Mark Schema fields as non-editable PR @falexwolf
✨ Add auxiliary field nullable to Feature PR @falexwolf
♻️ Prettify AnnDataCurator implementation PR @falexwolf
🚸 Better error for malformed categorical dtype PR @falexwolf
🎨 A ManyToMany between Schema.components and .composites PR @falexwolf
🚚 Restore .feature_sets as a ManyToManyField PR @falexwolf
🚚 Rename CatCurator to CatManager PR @falexwolf
🎨 Let Curator.validate() throw an error PR @falexwolf
♻️ Re-purpose BaseCurator as Curator, introduce CatCurator and consolidate shared logic under CatCurator PR @falexwolf
♻️ Refactor organism handling in curators PR @falexwolf
🔥 Eliminate all logic related to using_key in curators PR @falexwolf
🚚 Bulk-rename old-style curators to CatCurator PR @falexwolf
🎨 Self-contained definition of CellxGene schema / validation constraints PR @falexwolf
🚚 Move PertCurator from wetlab here and add CellxGene Curator test PR @falexwolf
🚚 Move CellXGene Curator from cellxgene-lamin here PR @falexwolf
import lamindb as ln
import pandas as pd

schema = ln.Schema(
    name="small_dataset1_obs_level_metadata",
    features=[
        ln.Feature(name="CD8A", dtype=int).save(),  # integer counts for CD8A marker
        ln.Feature(name="perturbation", dtype=ln.ULabel).save(),  # a categorical feature that validates against the ULabel registry
        ln.Feature(name="sample_note", dtype=str).save(),  # a note for the sample
    ],
).save()
df = pd.DataFrame({
    "CD8A": [1, 4, 0],
    "perturbation": ["DMSO", "IFNG", "DMSO"],  # illustrative labels, one per row; all columns need equal length
    "sample_note": ["value_1", "value_2", "value_3"],
    "temperature": [22.2, 25.7, 27.3],
})
curator = ln.curators.DataFrameCurator(df, schema)
artifact = curator.save_artifact(key="example_datasets/dataset1.parquet")  # validates compliance with schema, annotates with metadata
assert artifact.schema == schema  # the validating schema
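The idea of validating any feature type, not only categoricals, can be sketched without any dependencies. In the following illustration a schema maps column names either to a Python type or to a set of permissible categories. All names (schema_sketch, validate_rows, VALID_PERTURBATIONS) are hypothetical and unrelated to the lamindb or pandera APIs.

```python
# Hypothetical stand-in for a registry of valid labels.
VALID_PERTURBATIONS = {"DMSO", "IFNG"}

schema_sketch = {
    "CD8A": int,  # numeric feature: checked by type
    "perturbation": VALID_PERTURBATIONS,  # categorical feature: checked by membership
    "sample_note": str,  # free-text feature: checked by type
}


def validate_rows(rows, schema):
    """Return a list of (row_index, column, problem); an empty list means valid."""
    problems = []
    for index, row in enumerate(rows):
        for column, constraint in schema.items():
            if column not in row:
                problems.append((index, column, "missing"))
            elif isinstance(constraint, set):
                if row[column] not in constraint:
                    problems.append((index, column, "invalid category"))
            elif not isinstance(row[column], constraint):
                problems.append((index, column, "wrong type"))
    return problems


rows = [
    {"CD8A": 1, "perturbation": "DMSO", "sample_note": "value_1", "temperature": 22.2},
    {"CD8A": 4, "perturbation": "IFNG", "sample_note": "value_2", "temperature": 25.7},
    {"CD8A": "0", "perturbation": "unknown", "sample_note": "value_3", "temperature": 27.3},
]
problems = validate_rows(rows, schema_sketch)  # only the third row has issues
```

Columns that are not part of the schema, such as temperature here, are simply ignored by this sketch.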
✨ Easily filter on a validating schema. @falexwolf @Zethson @sunnyosun
On the hub.
With the Schema filter button, find all datasets that satisfy a given schema (→ explore).
schema = ln.Schema.get(name="small_dataset1_obs_level_metadata")  # get a schema
ln.Artifact.filter(schema=schema).df()  # filter all datasets that were validated by the schema
✨ Collection.open() returns a pyarrow dataset. PR @Koncopd
import lamindb as ln
import pandas as pd

df = pd.DataFrame({"feat1": [0, 0, 1, 1], "feat2": [6, 7, 8, 9]})
df[:2].to_parquet("df1.parquet", engine="pyarrow")
df[2:].to_parquet("df2.parquet", engine="pyarrow")
artifact1 = ln.Artifact("df1.parquet", key="df1.parquet").save()  # path of the first shard written above
artifact2 = ln.Artifact("df2.parquet", key="df2.parquet").save()  # path of the second shard written above
collection = ln.Collection([artifact1, artifact2], key="parquet_col")
dataset = collection.open()  # backed by files in the cloud storage
dataset.to_table().to_pandas().head()
✨ Support s3-compatible endpoint urls, say your on-prem MinIO deployment. PR @Koncopd
Speed up instance creation through squashed migrations.
⚡ Squash migrations PR1 PR2 @falexwolf
Tiledbsoma.
✨ Support endpoint_url in operations with tiledbsoma PR1 PR2 @Koncopd
✨ Add Artifact.from_tiledbsoma to populate n_observations PR @Koncopd
MappedCollection.
🐛 Allow filtering on np.nan in obs_filter of MappedCollection PR @Koncopd
🐛 Fix labels for NaN in categorical columns for MappedCollection PR @Koncopd
SpatialDataCurator.
🐛 Fix var_index standardization of SpatialDataCurator PR1 PR2 @Zethson
🐛 Fix sample level metadata optional in SpatialDataCatManager PR @Zethson
Core functionality.
✨ Allow checking the need for syncing without actually syncing PR @Koncopd
✨ Check for corrupted cache in Artifact.load() & Artifact.open() PR PR @Koncopd
✨ Infer n_observations in Artifact.from_anndata PR @Koncopd
🐛 Account for VSCode appending languageid to markdown cell in notebook tracking PR @falexwolf
🐛 Normalize module names for robust checking in _check_instance_setup() PR @Koncopd
🐛 Fix idempotency of Feature creation when description is passed and improve filter and get error behavior PR @Zethson
🚸 Make new version upon passing existing key to Collection PR @falexwolf
🚸 Throw better error upon checking instance.modules when loading a lamindb schema module PR @Koncopd
🚸 Validate existing records in the DB irrespective of whether an ontology source is passed or not PR @sunnyosun
🚸 Full guarantee of avoiding duplicating Transform, Artifact & Collection in concurrent runs PR @falexwolf
🚸 Better user feedback during keyword validation in Record constructor PR @Zethson
🚸 Improve local storage not found warning message PR @Zethson
🚸 Better error message when attempting to save a file while not being connected to an instance PR @Zethson
🚸 Error for non-keyword parameters for Artifact.from_x methods PR @Zethson
Housekeeping.
2025-01-23 db 1.0.5
2025-01-21 db 1.0.4
🚚 Revert Collection.description back to unlimited length TextField. PR @falexwolf
2025-01-21 db 1.0.3
🚸 In track(), improve logging in RStudio sessions. PR @falexwolf
2025-01-20 R 0.4.0
🚚 Migrate to lamindb v1 PR @falexwolf
🚸 Improve the user experience for setting up Python & reticulate PR @lazappi
2025-01-20 db 1.0.2
🚚 Improvements for lamindb v1 migrations. PR @falexwolf
add a .description field to Schema
enable labeling Run with ULabel
add a .predecessors and .successors field to Project akin to what's present on Transform
make .uid fields not editable
2025-01-18 db 1.0.1
🐛 Block non-admin users from confirming the dialogue for integrating lnschema-core. PR @falexwolf
2025-01-17 db 1.0.0
This release makes the API consistent, integrates lnschema_core & ourprojects into the lamindb package, and introduces a breadth of database migrations to enable future features without disruption. You'll now need at least Python 3.10.
Your code will continue to run as is, but you will receive warnings about a few renamed API components.
What | Before | After |
---|---|---|
Dataset vs. model | | |
Python object for | | |
Number of files | | |
| | |
| | |
Consecutiveness field | | |
Run initiator | | |
| | |
Migration guide:
Upon lamin connect account/instance you will be prompted to confirm migrating away from lnschema_core
After that, you will be prompted to call lamin migrate deploy to apply database migrations
New features:
✨ Allow filtering by multiple obs columns in MappedCollection PR @Koncopd
✨ In git sync, also search git blob hash in non-default branches PR @Zethson
✨ Add relationship with Project to everything except Run, Storage & User so that you can easily filter for the entities relevant to your project PR @falexwolf
✨ Capture logs of scripts during ln.track() PR1 PR2 @falexwolf @Koncopd
✨ Support "|"-separated multi-values in Curator PR @sunnyosun
🚸 Accept None in connect() and improve migration dialogue PR @falexwolf
UX improvements:
🚸 Simplify the ln.track() experience PR @falexwolf
1. you can omit the uid argument
2. you can organize transforms in folders
3. versioning is fully automated (requirement for 1.)
4. you can save scripts and notebooks without running them (corollary of 1.)
5. you avoid the interactive prompt in a notebook and the throwing of an error in a script (corollary of 1.)
6. you are no longer required to add a title in a notebook
🚸 Raise error when modifying Artifact.key in problematic ways PR1 PR2 @sunnyosun @Koncopd
🚸 Better error message on running ln.track() within Python terminal PR @Koncopd
🚸 Hide traceback for InstanceNotEmpty using Click Exception PR @Zethson
🚸 Only auto-search ._name_field in sub-classes of CanCurate PR @falexwolf
🚸 Simplify installation & API overview PR @falexwolf
🚸 Make lamin_run_uid categorical in tiledbsoma stores PR @Koncopd
🚸 Raise ValueError when trying to search a None value PR @Zethson
Bug fixes:
🐛 Skip deleting storage when deleting outdated versions of folder-like artifacts PR @Koncopd
🐛 Let SOMACurator() validate and annotate all .obs columns PR @falexwolf
🐛 Fix renaming of feature sets PR @sunnyosun
🐛 Do not raise an exception when default AWS credentials fail PR @Koncopd
🐛 Only map synonyms when field is name PR @sunnyosun
🐛 Fix source in .from_values PR @sunnyosun
🐛 Fix creating instances with storage in the current local working directory PR @Koncopd
🐛 Fix NA values in Curator.add_new_from() PR @sunnyosun
Refactors, renames & maintenance:
🏗️ Integrate lnschema-core into lamindb PR1 PR2 @falexwolf @Koncopd
🏗️ Integrate ourprojects into lamindb PR @falexwolf
♻️ Manage created_at, updated_at on the database-level, make created_by not editable PR @falexwolf
🚚 Rename transform type "glue" to "linker" PR @falexwolf
🚚 Deprecate the --schema argument of lamin init in favor of --modules PR @falexwolf
DevOps:
Detailed list of database migrations
Those not yet announced above will be announced with the functionality they enable.
♻️ Add contenttypes Django plugin PR @falexwolf
🚚 Prepare introduction of persistable Curator objects by renaming FeatureSet to Schema on the database-level PR @falexwolf
🚚 Add a .type foreign key to ULabel, Feature, FeatureSet, Reference, Param PR @falexwolf
🚚 Introduce RunData, TidyTable, and TidyTableData in the database PR @falexwolf
All remaining database schema changes were made in this PR @falexwolf. Data migrations happen automatically.
remove _source_code_artifact from Transform, it's been deprecated since 0.75
  data migration: for all transforms that have _source_code_artifact populated, populate source_code
rename Transform.name to Transform.description because it's analogous to Artifact.description
  backward compat:
    in the Transform constructor use name to populate key in all cases in which only name is passed
    return the same transform based on key in case source_code is None via ._name_field = "key"
  data migrations:
    there already was a legacy description field that was never exposed on the constructor; to be safe, we concatenated potential data in it on the new description field
    for all transforms that have key=None and name!=None, use name to pre-populate key
rename Collection.name to Collection.key for consistency with Artifact & Transform and the high likelihood of you wanting to organize them hierarchically
a _branch_code integer on every record to model pull requests
  include visibility within that code
  repurpose visibility=0 as _branch_code=0 as "archive"
  put an index on it
  code a "draft" as _branch_code = 2, and "draft prs" as negative branch codes
rename values "number" to "num" in dtype
an ._aux json field on Record
a SmallInteger run._status_code that allows writing finished_at in clean-up operations so that there is a run time also for aborted runs
rename Run.is_consecutive to Run._is_consecutive
a _template_id FK to store the information of the generating template (whether a record is a template is coded via _branch_code)
rename _accessor to otype to publicly declare the data format as suffix, accessor
rename Artifact.type to Artifact.kind
a FK to artifact run._logfile which holds logs
a hash field on ParamValue and FeatureValue to enforce uniqueness without running the danger of failure for large dictionaries
add a boolean field ._expect_many to Feature/Param that defaults to True/False and indicates whether values for this feature/param are expected to occur a single or multiple times for every single artifact/run
  for feature
    if it's True (default), the values come from an observation-level aggregation and a dtype of datetime on the observation-level means set[datetime] on the artifact-level
    if it's False, it's an artifact-level value and datetime means datetime; this is an edge case because an arbitrary artifact would always be a set of arbitrary measurements that would need to be aggregated ("one just happens to measure a single cell line in that artifact")
  for param
    if it's False (default), the values mean artifact/run-level values and datetime means datetime
    if it's True, the values would be from an aggregation; this seems like an edge case, but it could be relevant, say, when characterizing a model ensemble trained with different parameters
remove the .transform foreign key from artifact and collection for consistency with all other records; introduce a property and a simple filter statement instead that maintains the same UX
store provenance metadata for TransformULabel, RunParamValue, ArtifactParamValue
enable linking projects & references to transforms & collections
rename Run.parent to Run.initiated_by_run
introduce a boolean flag on artifact that's called _overwrite_versions, which indicates whether versions are overwritten or stored separately; it defaults to False for file-like artifacts and to True for folder-like artifacts
Rename n_objects to n_files for more clarity
Add a Space registry to lamindb with an FK on every BasicRecord
add a name column to Run so that a specific run can be used as a named specific analysis
remove _previous_runs field on everything except Artifact & Collection
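The hash field on ParamValue and FeatureValue listed above can be motivated with a small sketch: instead of comparing large dictionaries directly, a fixed-length digest of a canonical serialization is stored and used for uniqueness. The hashing scheme (SHA-256 over sorted-key JSON) and the in-memory ValueStoreSketch are assumptions for illustration, not the scheme used in the database.

```python
import hashlib
import json


def hash_value(value) -> str:
    """Return a fixed-length digest of a JSON-serializable value."""
    # sort_keys makes the serialization independent of dictionary insertion order
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:32]


class ValueStoreSketch:
    """Hypothetical in-memory stand-in for a table with a unique hash column."""

    def __init__(self):
        self._by_hash = {}

    def get_or_create(self, value):
        key = hash_value(value)
        created = key not in self._by_hash
        if created:
            self._by_hash[key] = value
        return self._by_hash[key], created


store = ValueStoreSketch()
big = {"layers": list(range(1000)), "lr": 0.01}
_, created_first = store.get_or_create(big)
# same content with a different key order maps to the same record
_, created_again = store.get_or_create({"lr": 0.01, "layers": list(range(1000))})
```

The digest has the same length regardless of how large the value is, which is what makes it suitable for a uniqueness check.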