Changelog 2024

Note

💡 LaminDB implements “migration-based versioning”. When upgrading your LaminDB installation to a new minor version in major.minor.patch, you also migrate your database via lamin migrate deploy.
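As a rough illustration of the rule in the note, the sketch below checks whether an upgrade crosses a major or minor version boundary and would therefore call for running lamin migrate deploy. The helper names `parse_version` and `needs_migration` are hypothetical and not part of LaminDB.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split 'major.minor.patch' into integers; missing parts default to 0."""
    parts = [int(part) for part in version.split(".")]
    while len(parts) < 3:
        parts.append(0)
    major, minor, patch = parts[:3]
    return major, minor, patch


def needs_migration(installed: str, target: str) -> bool:
    """Return True if the upgrade changes the major or minor version."""
    return parse_version(installed)[:2] != parse_version(target)[:2]


print(needs_migration("0.76.16", "0.77.0"))  # minor version changes
print(needs_migration("0.77.2", "0.77.3"))   # only the patch version changes
```

A patch-level upgrade leaves the database schema untouched under this rule, so only the first call reports that a migration is due.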

Get notified by watching releases for git repositories: lamindb, laminhub, laminr, and bionty.

🪜 For older changes, see: 2023 · 2022

2024-12-01 db 0.77.3

  • ✨ Add Curator.from_tiledbsoma() PR @Koncopd

  • ✨ Add an artifact loader for .yaml PR @Koncopd

  • 🎨 Better names for sections in .describe() PR1 PR2 @sunnyosun

  • 🚸 Improve feedback on not-up-to-date notebook content PR @falexwolf

  • 🚸 Log tiledbsoma target path PR @Koncopd

  • 🐛 Fix curator.validate() from public PR @sunnyosun

  • 🐛 Fix very long runtimes for Artifact.describe PR @Koncopd

  • 🐛 Fix organism in curator.standardize PR @sunnyosun

  • 🐛 Better error behavior for repeated calls of standardize PR @Zethson

  • 🐛 Fix the error on special chars in search strings PR @Koncopd

  • 🐛 Fix display of (non-categorical) str features in .describe() PR @sunnyosun

  • 🐛 Fix IPython import error PR @Koncopd

  • 🐛 Fix the error on existing cache on copy to cache in Artifact.save() PR @Koncopd

  • ⬆️ Exclude s3fs==2024.10.0 PR @Koncopd

  • ⬆️ Unpin supabase PR @Koncopd

2024-12-02 R 0.3.0

  • ✨ Add Artifact$open() to stream array-like artifacts PR @lazappi

  • ✨ Track artifacts as inputs PR @lazappi

  • ✨ Allow connecting to private LaminDB instances PR @rcannood

  • 🚸 Improve UX of db$track() and db$finish() PR @lazappi @falexwolf

2024-12-01 db 0.77.2

  • 🚸 A more intuitive artifact.describe() PR @sunnyosun

  • ✨ Enable easily joining features onto artifacts via Artifact.df() PR @falexwolf

  • ✨ Support features with dtype = 'str' PR @falexwolf

  • ✨ Support features with dtype = 'datetime' and improve feature values handling PR @falexwolf

  • 🎨 Let .from_values() return RecordList and better treat categorical PR @falexwolf

  • 🎨 Add .standardize() to Curator PR @sunnyosun

  • 🎨 Make search in bionty.base consistent with lamindb PR @Koncopd

2024-11-25 hub 0.31

🚸 Improve speed and relevance of search. Make search consistent with lamindb 0.77. PR @Koncopd @fredericenard

Details

Since lamindb 0.77 and laminhub 0.31, search isn’t fuzzy anymore. This leads to much more predictable and relevant results, similar to what users know from GitHub, Slack, and similar tools.

Search results for lamindb and bionty are now exemplified here: faq/search.

More changes.

  • ⚡ Faster loading speed on launch @awgaan

  • 🐛 Fixed incorrect sorting in the version selector @chaichontat

2024-11-21 R 0.2.0

✨ Read and write data with LaminR, an R client for LaminDB. PR @rcannood @lazappi

install.packages("laminr", dependencies = TRUE)  # install the laminr package from CRAN
library(laminr)

db <- connect()  # connect to the instance you configured on the terminal
db$track(path = "./my-analysis.Rmd")  # track a run of your notebook or script
artifact <- db$Artifact$get("3TNCsZZcnIBv2WGb0001")  # get an artifact record by uid
df <- artifact$load() # load the artifact into memory, e.g., a DataFrame

# do your work

db$Artifact$from_path("./my_result_folder", description = "My result")$save()  # save a folder
db$finish()  # mark the run finished

2024-11-21 db 0.77.0 | bionty 0.53

✨ Enable validation of Literal and other field types. PR @sunnyosun

  • A Literal-typed CharField is validated for any Record:

    from typing import Literal

    from lnschema_core import Record, fields
    
    CRISPRType = Literal[
        "CRISPRi",
        "CRISPRa",
    ]
    
    class Treatment(Record):
        system: CRISPRType = fields.CharField()
    
    Treatment(system="crispr")
    #> FieldValidationError:
    #>   system: crispr is not a valid value
    #>     → valid values are: CRISPRa, CRISPRi
    
  • For custom field types, use Django’s rich field validation and subclass ValidateFields:

    from django.core.validators import RegexValidator
    from lnschema_core import Record, ValidateFields, fields
    
    class Reference(Record, ValidateFields):
        url: str = fields.URLField()
        doi: str = fields.CharField(
            validators=[
                RegexValidator(
                    regex=r"^(?:https?://(?:dx\.)?doi\.org/|doi:|DOI:)?10\.\d+/.*$",
                    message="Must be a DOI (10.1000/xyz123 or https://doi.org/10.1000/xyz123)",
                )
            ],
        )
    
    Reference(doi="abc.ef", url="myurl.com")
    #> FieldValidationError:
    #>   url: myurl.com is not valid
    #>     → Enter a valid URL.
    #>   doi: abc.ef is not valid
    #>     → Must be a DOI (10.1000/xyz123 or https://doi.org/10.1000/xyz123)
    
  • These possibilities are now leveraged in all schema modules.
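To make the idea behind Literal-based validation concrete, here is a minimal standalone sketch that checks a value against the members of a typing.Literal using only the standard library. The function `validate_literal` and the local `FieldValidationError` class are hypothetical stand-ins; this is not how lnschema_core implements validation.

```python
from typing import Literal, get_args

CRISPRType = Literal["CRISPRi", "CRISPRa"]


class FieldValidationError(ValueError):
    """Hypothetical stand-in for the error type shown in the changelog."""


def validate_literal(field_name: str, value: str, literal_type) -> None:
    """Raise if `value` is not one of the values allowed by `literal_type`."""
    valid = get_args(literal_type)  # the allowed values of the Literal
    if value not in valid:
        raise FieldValidationError(
            f"{field_name}: {value} is not a valid value; "
            f"valid values are: {', '.join(sorted(valid))}"
        )


validate_literal("system", "CRISPRi", CRISPRType)  # passes silently
try:
    validate_literal("system", "crispr", CRISPRType)
except FieldValidationError as error:
    print(error)
```

Because the allowed values live in the type annotation, adding a category means editing the Literal only, without touching a separate list of choices.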

🍱 Overhauled Nextflow integration. PR @Zethson

  • The registration script now leverages standardized nf-core metadata paths

  • Example use case is now more comprehensive and based on nf-core/scrnaseq

Other enhancements.

  • 🚸 Mark .qmd & .Rmd files as notebooks, not scripts PR @falexwolf

  • 🚸 Suppress hf filesystem warning due to not being explicitly implemented in upath PR @Koncopd

2024-11-15 db 0.76.16

New features.

  • ✨ Support saving R code including .qmd and .Rmd PR @falexwolf

  • ✨ Support registering artifacts on Hugging Face PR @Koncopd

Other enhancements.

  • 📝 Add guide on gene symbol mapping Guide PR @Zethson

  • 🚸 Improve speed and relevance of search PR @Koncopd

  • 🚸 Refactor ln.track() to improve logging and method signature PR @falexwolf

  • 🚸 Enable querying with records from a different database instance PR @falexwolf

  • 🚸 Enable autocompletion for inherited methods in Jupyter PR @Koncopd

  • 🎨 Make EHRCurator immutable PR @Zethson

  • 🚸 Warn if curating against gene symbols PR @Zethson

  • 🚸 Better logging during curation PR PR @Zethson

  • 🚸 Add pip install extras for all schema modules PR @Zethson

Fixes.

  • 🐛 Fix transferring artifacts from a source instance with fewer schema modules PR @sunnyosun

  • 🐛 Fix registering Gene columns in DataFrameCurator PR @sunnyosun

Deprecations.

  • 🚚 Deprecate import_from_source in favor of import_source PR @Zethson

2024-11-13 hub 0.30

✨ A link that omits the last 4 version-coding characters now links to the latest version of a record, e.g., 13VINnFk89PE → 13VINnFk89PE0006. PR @chaichontat
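The resolution rule can be sketched in a few lines of plain Python. The function `resolve_latest` is hypothetical, and the assumption that the 4-character suffix sorts in version order is made for illustration only; the hub may resolve versions differently.

```python
VERSION_SUFFIX_LENGTH = 4  # the last 4 characters of a uid encode the version


def resolve_latest(link_id: str, known_uids: list[str]) -> str:
    """Return a full uid: `link_id` itself, or the latest uid sharing its stem."""
    if link_id in known_uids:
        return link_id  # already a full, version-specific uid
    candidates = [
        uid for uid in known_uids if uid[:-VERSION_SUFFIX_LENGTH] == link_id
    ]
    if not candidates:
        raise KeyError(f"no record found for {link_id}")
    # assumption: version suffixes sort lexicographically in version order
    return max(candidates, key=lambda uid: uid[-VERSION_SUFFIX_LENGTH:])


uids = ["13VINnFk89PE0005", "13VINnFk89PE0006", "FPnfDtJz8qbE0000"]
print(resolve_latest("13VINnFk89PE", uids))
```

A full uid still resolves to exactly that version, so existing version-specific links keep working.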

🚸 Filter selectors are more intuitive. PR @chaichontat

✨ Plots can now visualize time series. PR @chaichontat @Golodhros

Further enhancements.

Fixes.

2024-10-29 db 0.76.15

✨ Stream pyarrow.dataset via Artifact.open() PR @Koncopd

import lamindb as ln
import pandas as pd

df = pd.DataFrame({"col1": [0, 0, 1, 1], "col2": [6, 7, 8, 9]})
# write a parquet dataset partitioned by the values of col1
df.to_parquet("df.parquet", engine="pyarrow", partition_cols=["col1"])
artifact = ln.Artifact("df.parquet", description="A partitioned parquet").save()
with artifact.open() as dataset:    # get pyarrow.dataset.Dataset
    batches = dataset.to_batches()  # get a streaming iterator over batches
    dataset.to_table().to_pandas()  # read into memory and convert to pandas DataFrame

✨ Add .query_parents() and .query_children() to hierarchical registries PR @Koncopd

import lamindb as ln

label1 = ln.ULabel(name="label1").save()
label2 = ln.ULabel(name="label2").save()
label3 = ln.ULabel(name="label3").save()
label1.children.add(label2)
label2.children.add(label3)
label3.query_parents()  # returns a QuerySet with label1 and label2
label1.query_children() # returns a QuerySet with label2 and label3
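
The transitive behavior of the example above (all ancestors or all descendants, not just direct neighbors) can be mimicked with a small graph traversal. This is a standalone sketch with hypothetical helpers `add_child` and `collect`; it does not reflect how lamindb queries the database.

```python
from collections import defaultdict

# adjacency maps for a toy label hierarchy
children: dict[str, set[str]] = defaultdict(set)
parents: dict[str, set[str]] = defaultdict(set)


def add_child(parent: str, child: str) -> None:
    """Register a direct parent-child relation in both directions."""
    children[parent].add(child)
    parents[child].add(parent)


def collect(start: str, edges: dict[str, set[str]]) -> set[str]:
    """Collect every node reachable from `start` by following `edges`."""
    found: set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbor in edges.get(node, ()):
            if neighbor not in found:
                found.add(neighbor)
                stack.append(neighbor)
    found.discard(start)  # never report the start node itself
    return found


add_child("label1", "label2")
add_child("label2", "label3")
print(sorted(collect("label3", parents)))   # all ancestors of label3
print(sorted(collect("label1", children)))  # all descendants of label1
```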

More changes:

2024-10-18 db 0.76.14 | bionty 0.52

Features.

  • ✨ Safe handling of renaming internal features and labels PR @sunnyosun

  • ✨ Enable curating multiple categorical features per artifact against the same label registry PR @falexwolf

  • ✨ Add Collection.append() PR @Koncopd

User experience.

  • 🚸 Document how to subclass Curator PR @Zethson

  • 🚸 Automate Curator.add_validated_from and remove it from the API Breaking PR @sunnyosun

  • 🚸 Remove unnecessary BioRecord.list_source() Breaking PR @sunnyosun

Bionty changes.

  • 🎨 Add unique constraints and fix gene_ref_is_symbol to label_ref_is_name PR @falexwolf

  • 🎨 Remove update in add_ontology_from_df() PR @sunnyosun

Fixes.

  • 🐛 Fix tracking of notebooks on PyCharm PR @Koncopd

  • 🐛 Fix loading artifact by key and give clear errors if no artifacts or transforms found PR @Koncopd

Deprecations.

2024-10-14 hub 0.29

✨ Advanced filters. @chaichontat @adamdev21

✨ Filter on click. @chaichontat

✨ Links to filtered artifacts in record tables. @chaichontat

✨ Links to artifacts in Features dashboard. @chaichontat

💄 Prettify plotting panels on Overview page PR @chaichontat

Enhancements.

Fixes.

  • 💄 Do not display N/A fields in hover card

  • 🐛 Fix duplicated alias error with filters

  • 🐛 Fix collection copy button

  • 🐛 Deduplicate transform references in “Outputs” section

  • 🐛 Fix hang when loading transform

  • 🐛 Fix transform pane collapsing behavior

  • 🐛 Fix flicker

Deprecations.

  • 🚸 Beta API key is now stable; deprecate legacy API key

2024-10-11 db 0.76.13

  • 🚸 Do not error if a schema module of an instance isn’t installed PR @falexwolf

  • 🚸 Fail more gracefully upon invalid lamin init calls PR @Koncopd

  • 🚸 Clearer feedback when somebody tries to switch the default instance PR @falexwolf

  • 🐛 Fix .get() with .using() PR @sunnyosun

  • 🐛 Fix .view_lineage() for circular inputs/outputs PR @sunnyosun

  • 🐛 Fix transferring to target instances that have more schema modules than the source instance PR @sunnyosun

  • 🐛 Fix monkey-patching __getitem__ on QuerySet PR @sunnyosun

  • 🐛 Do not double track runs PR @falexwolf

2024-10-08 db 0.76.12 | bionty 0.51

  • ✨ Overhaul save_vitessce_config() to support multiple artifacts and non-.zarr PR @keller-mark

  • 🚸 Query with typed labels through .features PR @falexwolf

  • 📝 Document how to query by dictionary-like run parameters PR @falexwolf

  • 🍱 New ExperimentalFactor version: efo-3.70.0 PR @Zethson

  • 🐛 Fix lamin load ... --with-env across servers PR @Koncopd

2024-10-02 hub 0.28

Major improvements.

Minor changes.

  • 🐛 Fix ulabel filtering on Artifacts page @adamdev21

  • 🐛 Fix collection page @sunnyosun

  • 🐛 Do not re-encode vitessce url if it’s non-lamin @sunnyosun

  • 🚸 A friendly message when a user still drafts a transform and hasn’t provided source code or run report @sunnyosun

  • 🚸 Use key as a download name for a transform @sunnyosun

  • 🚸 Only show linked labels in filters @adamdev21

  • 💄 Order transforms & runs by created_at @sunnyosun

  • 💄 On Overview page, show data formats, feature sets, and artifact size and counts @chaichontat

  • ✨ Support refSpecUrl in VitessceConfig @sunnyosun

2024-10-01 db 0.76.11

  • ✨ Add a reference manager schema module: findrefs PR @falexwolf

  • 🐛 Fix label name display in .describe() PR @falexwolf

  • 🐛 Fix permission error when saving a new artifact on Python 3.12 PR @Koncopd

  • 💄 Strip NotebookNotSaved error from report after ln.finish() PR @Koncopd

2024-09-30 db 0.76.10

  • 🚸 Re-worked the CLI: lamin load → lamin connect & lamin get → lamin load PR @Koncopd @ap--

  • ⚡ Improve performance of ln.connect(), lamin connect, and lamin load for a notebook PR @falexwolf

  • ⚡️ Speed up curation workflows through from_values PR @sunnyosun

  • 🚸 Improve lamin load UX for notebooks & scripts PR @falexwolf

  • 🚸 Transfer: Warn about inconsistencies between source & target instances PR @falexwolf

  • 🎨 Move .from_values() from Record to CanValidate PR @falexwolf

  • 📝 Document how to work with run parameters PR @falexwolf

  • ✨ Track transfers as transforms PR @falexwolf

  • ⚡️ Speed up describe PR @sunnyosun

  • 🚸 Minimal ln.track() PR @falexwolf

  • ✨ Enable lamin load from on-prem domains PR @falexwolf

2024-09-26 db 0.76.9

Curating perturbations

Genetic, compound, and environmental perturbations and their targets can be curated with the wetlab schema. This can be achieved at varying levels of detail. For example, with a high degree of detail:


import bionty as bt
import wetlab as wl

# define the genetic perturbation
EGFR_kd = wl.GeneticTreatment(
    system="CRISPR Cas9",
    name="EGFR knockdown",
    sequence="AGCTGACCGTGA",
    on_target_score=85,
    off_target_score=15,
).save()

# define the perturbation target and link the targeted gene
EGFR_gene = bt.Gene.from_source(symbol="EGFR").save()
EGFR_kd_target = wl.TreatmentTarget(name="cell growth").save()
EGFR_kd_target.genes.add(EGFR_gene)

# annotate an existing artifact with the treatment
artifact.genetic_treatments.add(EGFR_kd)

2024-09-23 db 0.76.8

  • 🐛 Ensure is_latest is set to False in previous version if matching on artifact.key PR @falexwolf

  • ✨ Store artifacts under their virtual keys in cache PR @Koncopd

2024-09-18 db 0.76.7 | bionty 0.50

  • ✨ Enable getting the latest run environment for a transform PR @falexwolf

  • ✨ Enable displaying images via artifact.load(), add documentation for artifact loaders PR @falexwolf

  • 🚸 Do not throw an error but prompt upon ln.context.track() in a notebook PR @falexwolf

  • 🚸 Allow using Collection.mapped() without saving the collection PR @Koncopd

  • 🚸 Simplify CLI commands PR @falexwolf

  • 🚸 Add parameter validation in bionty PR @Zethson

Technical changes

  • 🔊 More logging for import_from_source PR @sunnyosun

  • 🔊 Warn instead of hint about missed input tracking PR @Koncopd

  • ✨ Add n_observations to tiledbsoma-like artifacts PR @Koncopd

  • 🎨 Remove stream argument from artifact.load() PR @Koncopd

Ontology versions

  • 🍱 New Tissue version: uberon-2024-03-22 PR @Zethson

  • 🍱 New Disease version: mondo-2024-05-08 PR @Zethson

  • 🍱 New ExperimentalFactor version: efo-3.65.0 PR @Zethson

  • 🍱 New CellType version: cl-2024-04-05 PR @Zethson

Use case changes

  • ✨ Add support for cellxgene-schema 5.1.0 PR @Zethson

  • 📝 Integrate cellxgene guides and add data loader examples PR @Koncopd

  • 📝 Track all AnnData inputs in scrna-tiledbsoma PR @Koncopd

2024-09-09 db 0.76.6 | bionty 0.50

  • ✨ Enable negations in filter() PR @falexwolf

  • ✨ lamin get via key or uid PR @falexwolf

  • 🎨 Replace direct relation of Collection to FeatureSet with indirect relation through Artifact PR @falexwolf

  • 🎨 Remove backward relationships for Run, User & Source foreign keys PR PR @falexwolf @sunnyosun

  • 🐛 Reload in Transform() upon passing existing uid PR @falexwolf

2024-09-05 db 0.76.5

2024-09-04 db 0.76.4

2024-08-30 db 0.76.3 | bionty 0.49

✨ tiledbsoma integration. Guide PR @Koncopd

Example

Create a tiledbsoma.Experiment array store or append AnnData objects to an existing store.

# create new versioned tiledbsoma.Experiment
artifact = ln.integrations.save_tiledbsoma_experiment(
    adatas,
    measurement_name="RNA"
)

# append to existing tiledbsoma.Experiment
revised_artifact = ln.integrations.save_tiledbsoma_experiment(
    adatas,
    measurement_name="RNA",
    revises=artifact
)

Bionty updates.

  • 🍱 Add chebi & chembl PR @Zethson

  • 🍱 Add additional relationship types & update DevelopmentalStage and Tissue PR @Zethson

  • 🍱 New CellLine version: depmap-2024-Q2 PR @Zethson

More changes.

  • 🚚 Deprecate Curate in favor of Curator PR @sunnyosun

  • 🔥 Remove lamin register and password argument of lamin login PR PR @falexwolf

2024-08-26 hub 0.27

✨ Instance overview page. @chaichontat

💄 Show Vitessce button next to dataset instead of VitessceConfig file. @sunnyosun

🏗️ Much improved on-prem deployment. @fredericenard

2024-08-23 db 0.76.2

🚸 Simplify versioning. PR @falexwolf

Semantic version strings in .version are now optional as in git.

For Artifact & Transform, you can now also create new versions by passing the key argument.

artifact_v1 = ln.Artifact.from_df(df, key="my_datasets/my_study1.parquet").save()
# below automatically creates a new version of artifact_v1 because the `key` matches
artifact_v2 = ln.Artifact.from_df(df_updated, key="my_datasets/my_study1.parquet").save()

  • 🚚 Deprecate is_new_version_of argument in favor of revises

  • 🚚 Deprecate passing version to constructors; rather set .version after creating records
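
To illustrate the idea of key-based versioning, the toy registry below creates a new version whenever a record is saved under an existing key with different content, and keeps exactly one latest record per key. `ArtifactRecord`, `ToyRegistry`, and the rule that identical content returns the existing record are assumptions for this sketch and do not describe lamindb internals.

```python
from dataclasses import dataclass, field


@dataclass
class ArtifactRecord:
    key: str
    content_hash: str
    is_latest: bool = True


@dataclass
class ToyRegistry:
    records: list[ArtifactRecord] = field(default_factory=list)

    def save(self, key: str, content_hash: str) -> ArtifactRecord:
        """Save content under `key`, versioning any earlier record with that key."""
        latest = next(
            (r for r in self.records if r.key == key and r.is_latest), None
        )
        if latest is not None:
            if latest.content_hash == content_hash:
                return latest  # assumption: identical content yields no new version
            latest.is_latest = False  # the earlier record is no longer the latest
        record = ArtifactRecord(key=key, content_hash=content_hash)
        self.records.append(record)
        return record


registry = ToyRegistry()
v1 = registry.save("my_datasets/my_study1.parquet", "hash-of-df")
v2 = registry.save("my_datasets/my_study1.parquet", "hash-of-df-updated")
print(v1.is_latest, v2.is_latest)
```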

More changes.

2024-08-16 db 0.76.1

🚸 Overhauled context tracking experience with ln.context.track() Details & PR @falexwolf @chaichontat

ln.context.uid = "FPnfDtJz8qbE0000"  # <-- auto-generated by ln.context.track()

# track the execution of your notebook or script with inputs & outputs
ln.context.track()

How does it look on the hub?

If you don’t label with a semantic version tag, you’ll get an auto-generated revision id.

⚠️ Breaking change: ln.track() now returns None instead of a Run. Access the run via ln.context.run instead.

More changes:

  • 🚸 Update .get() to accept expressions so that it can replace .filter(...).one() PR @falexwolf

  • ✨ MappedCollection compatible with latest scdataloader PR @Koncopd PR @jkobject

2024-08-14 db 0.76

2024-08-10 hub 0.26

2024-08-08 db 0.75.1

🚸 Improved the cellxgene_lamin curation guide.

2024-08-08 bionty 0.48

New ontology versions.

  • 🍱 New Tissue version: uberon-2024-05-13 PR @Zethson

  • 🍱 New Tissue version: uberon-2024-01-18 PR @Zethson

  • 🍱 New Phenotype version: zp-2024-04-18 PR @Zethson

  • 🍱 New Phenotype version: pato-2024-03-28 PR @Zethson

  • 🍱 New Phenotype version: mp-2024-06-18 PR @Zethson

  • 🍱 New Phenotype version: hp-2024-04-26 PR @Zethson

  • 🍱 New Pathway version: pw-7.84 PR @Zethson

  • 🍱 New Pathway version: go-2024-06-17 PR @Zethson

  • 🍱 New Disease version: mondo-2024-06-04 PR @Zethson

  • 🍱 New Disease version: doid-2024-05-29 PR @Zethson

  • 🍱 New Disease version: mondo-2024-01-03 PR @Zethson

  • 🍱 New ExperimentalFactor version: efo-3.66.0 PR @Zethson

  • 🍱 New ExperimentalFactor version: efo-3.62.0 PR @Zethson

  • 🍱 New Drug version: dron-2024-08-05 PR @Zethson

  • 🍱 New CellType version: cl-2024-05-15 PR @Zethson

  • 🍱 New CellType version: cl-2024-01-04 PR @Zethson

2024-08-03 db 0.75

✨ Track mutations of array stores. Guide PR @Koncopd

  • Artifacts that store mutable arrays can lead to non-reproducible queries.

  • To monitor reproducibility and data lineage, mutations are now tracked when Artifact.open(mode="w") is used as a context manager for tiledbsoma array stores:

    with artifact.open(mode="w") as array:
        ...  # mutate the array store here
    
    # `artifact` now points to a new version of the artifact with an updated hash
    

🚸 A better structured API. PR @falexwolf

  • 🚸 Easier typing & maintenance of categorical fields via typing.Literal instead of Django’s migration-dependent CharField.choices

  • 🚸 Less clutter in auto-complete

    • 🚚 All fields pointing to link records start with links_

    • 🚚 Several fields for Artifact are now private via _ prefix: accessor, key_is_virtual, feature_values, param_values, hash_type, previous_runs

  • 🎨 More consistency

    • 🚚 Rename Transform.parents to Transform.predecessors to disambiguate procedural/temporal from ontological/conceptual hierarchies

    • 🎨 Feature names (Feature.name) are now guaranteed to be unique in a lamindb instance

    • 🎨 Consistent length of hash fields: HASH_LENGTH=22

    • 🚚 Rename input_of to input_of_runs

    • 🎨 Transform.latest_report is now a property pointing to Transform.latest_run.report to simplify the schema

    • 🎨 Artifact.type now defaults to None when passing a path so that auxiliary files and folders aren’t labeled as dataset

  • 🚸 Better definition of Collection

    • 🚚 Rename fields .artifact to .meta_artifact and .unordered_artifacts to .artifacts

    • Iteration over an ordered QuerySet of artifacts is now possible via .ordered_artifacts

    • For collections that have a single data artifact, access it via .data_artifact

  • 🏗️ Towards searchable source code

    • 🚚 Rename Transform.source_code to Transform._source_code_artifact

    • Re-introduce Transform.source_code as a text field together with a field hash
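
Regarding the fixed hash length mentioned above: one common way to arrive at 22-character hash strings is to base64-encode a 16-byte digest and strip the padding. The sketch below shows that arithmetic with the standard library; whether lamindb computes its hashes exactly this way is an assumption, and `short_hash` is a hypothetical helper.

```python
import base64
import hashlib

HASH_LENGTH = 22


def short_hash(data: bytes) -> str:
    """Return a 22-character, URL-safe hash string for `data`."""
    digest = hashlib.md5(data).digest()  # 16 bytes
    encoded = base64.urlsafe_b64encode(digest).decode("ascii")  # 24 chars, ends in "=="
    return encoded.rstrip("=")  # dropping the padding leaves 22 characters


value = short_hash(b"example content")
print(len(value))
```

Sixteen bytes are 128 bits, and base64 carries 6 bits per character, so 22 characters are needed regardless of the input size.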

Better storage management.

  • 🚸 Enable deleting artifacts in all managed storage locations of the current instance PR @falexwolf

  • ♻️ Do not write storage records to hub for local test instances PR @falexwolf

  • 🐛 Fix populating storage.instance_uid during init_instance PR @falexwolf

Various updates.

2024-08-03 bionty 0.47

🏗️ Bionty is now a single Python package. PR PR PR PR PR

⚠️ Migration: Once you load an instance, you’ll be asked to uninstall lnschema_bionty and run lamin migrate deploy

  • On the SQL level, tables are now prefixed with bionty_ instead of lnschema_bionty_

  • On the Django level, you can mount the bionty app instead of the lnschema_bionty app

🚸 You can now import from in-house ontology sources. PR @sunnyosun

  • 🚚 Rename PublicSource to Source & from_public to from_source

  • Import from any parquet file into your registry, akin to how Bionty imports public ontology sources

User experience.

  • ⚡ Performantly import bulk records via .import_from_source()

  • 🚸 More reliable ontology_id field recognition

  • ✨ Better error message for synonym duplications PR @Zethson

  • 🚚 All link model fields start with links_ PR @falexwolf

  • 🎨 CellMarker.name is now unique together with organism PR @sunnyosun

New ontologies.

  • ✨ Add ICD ontology for Disease PR PR @Zethson

  • 🍱 New Protein version: uniprot-2024-03 PR @sunnyosun

  • 🍱 New Gene version: ensembl-111/112 PR @Zethson

  • 🍱 New ExperimentalFactor version: efo-3.63 PR @Zethson

  • 🍱 New CellType version: cl-2024-02-13 PR @Zethson

  • 🍱 New Tissue version: uberon-2024-02-20 PR @Zethson

  • 🍱 New Organism version: ensembl-release-111 & ensembl-release-112 PR @sunnyosun

  • 🍱 New Disease version: mondo-2024-02-06 PR @Zethson

  • 🍱 New Disease version: DOID-2024-01-31 PR @Zethson

  • 🍱 New Phenotype version: hp-2024-03-06 PR @Zethson

  • 🍱 New Phenotype version: mp-2024-02-07 PR @Zethson

  • 🍱 New Phenotype version: zp-2024-01-22 PR @Zethson

  • 🍱 New Pathway version: pw-7.82 PR @Zethson

  • 🍱 New Drug version: DRON-2024-03-02 PR @Zethson

2024-07-26 hub 0.25

Overhauled the REST API: better performance and architecture.

UI improvements.

2024-07-26 db 0.74.3

⚡ Speed up populating parent records by an order of magnitude, remove the parents keyword (PR @sunnyosun).

Features.

  • ✨ Allow for multiple local storage locations with the same root path PR @falexwolf

  • ✨ Add add_from_df method to BioRecord PR @sunnyosun

Chores.

2024-07-22 db 0.74.2

The API is now cleaner and fields are typed.

Details

All users who don’t use Django outside of lamindb can make Django’s internal API, which clutters the Record namespaces, private by running lamin set private-django-api on the command line.

tiledbsoma is now better supported.

  • ✨ Artifact.open() for tiledbsoma stores PR @Koncopd

Better names.

  • 🚚 Deprecate Artifact.backed() in favor of Artifact.open() PR @Koncopd

  • 🚚 Deprecate Annotate in favor of Curate PR @falexwolf

  • 🚚 Deprecate Registry in favor of Record PR @falexwolf

Better documentation.

Security updates & bug fixes.

  • 🔒 Enable Ruff security rules (bandit) & CodeQL PR @Zethson

  • 🐛 Fix return values of .save() for a few classes PR @falexwolf

2024-07-01 hub 0.24

2024-06-26 db 0.74.1

♻️ Refactor ln.settings PR @falexwolf.

  • ✨ Pass custom names for scripts via ln.settings.transform.name = "My script"

  • ⚠️ ln.settings.storage returns a StorageSettings object (root via ln.settings.storage.root)

Features.

  • ✨ Support different join types in QuerySet.df() PR @insavchuk

Use cases.

Docs.

2024-06-20 db 0.74

✨ You can now distinguish model-like and dataset-like artifacts via a type field in the Artifact registry.

  • 🚸 Leverage artifact.params.add_values() to annotate model-like artifacts like you leverage artifact.features.add_values() to annotate dataset-like artifacts

  • 🏗️ Add type field to Artifact, allow linking model-like artifacts against params, validate params akin to validating features, enable features-based annotation with non-ulabels PR @falexwolf

  • 🚸 Support dict in add_values PR @Zethson

♻️ Refactor after upath upgrade. PR PR @Koncopd

2024-06-13 db 0.73.2

  • 🐛 Fix clashing reverse accessors for .previous_runs and .run PR @falexwolf

  • 🐛 Import IPython inside view PR @Koncopd

2024-06-05 db 0.73.1

  • 🏗️ Instantly synchronize instance schema with the hub PR @fredericenard

  • ⬆️ Upgrade universal_pathlib to 0.2.2 PR @Koncopd

  • 🐛 Fix generation of uid for manual Transform constructor PR @falexwolf

  • 🔥 Remove artifact.stage() in favor of artifact.cache() (deprecated since 0.70.0)

2024-05-29 db 0.73.0

Annotating & querying by features improved:

  • ✨ Support non-categorical feature values PR @falexwolf

  • ✨ Annotate dict-style with features & values PR @falexwolf

  • ✨ Query by features via .features.filter(key=value) PR @falexwolf

  • 🏗️ Feature values decoupled from feature sets PR @falexwolf

Example:

# annotate dict-style (feature & category names get validated)
artifact.features.add_values({
    "species": "setosa",
    "scientist": ["Barbara McClintock", "Edgar Anderson"],
    "instrument": "Leica IIIc Camera",
    "temperature": 27.6,
    "study": "Study 0: initial plant gathering",
    "is_awesome": True
})

# get the dict back
artifact.features.get_values()

# query by feature
ln.Artifact.features.filter(is_awesome=True)

Various improvements:

  • 🚚 Additional non-breaking constraints in the core schema PR @falexwolf

  • 🚸 Make .upload_from(), .download_to(), and .view_tree() more user friendly PR @falexwolf PR @Koncopd

  • 🚸 More intuitive version updating dialogue PR @falexwolf

  • 🐛 Actually add tracking run for entities beyond Artifact & Collection PR @falexwolf

  • 🚸 ln.track() returns run PR @falexwolf

  • 🚸 Better duplicate detection and search PR @falexwolf

  • 🚸 Prettier .describe() PR @falexwolf

  • 🚸 More interactivity in lamin save PR @falexwolf

  • 🚸 create flag in .from_values() PR @falexwolf

  • 🚸 Better ordering of fields in dataframe & record representations PR @falexwolf

  • 📝 Improved API reference: docs now show relationship attributes PR @falexwolf

2024-05-19 db 0.72.1

  • ⬆️ Update bionty PR @sunnyosun

  • 🐛 Deal with migration errors when keep-artifacts-local is true PR @falexwolf

2024-05-19 db 0.72.0

  • ✨ Extend managed access for AWS S3 to arbitrary paths PR @Koncopd @fredericenard

  • ✨ Extended data lineage tracking PR @falexwolf

    • Now store all creating runs and all updating runs for any entity, not just for Artifact & Collection, e.g., runs can now have CellType record outputs

    • Code is simpler through inheritance from two new base classes: TracksRun and TracksUpdates

  • ♻️ Briefer and richer syntax for denoting feature types, renamed Feature.type to Feature.dtype, e.g., for categorical features, a valid type can be: cat[ULabel|bionty.Drug] PR @falexwolf

  • ✨ Support non-categorical metadata PR @falexwolf

    • Track non-categorical features: int, float, bool, datetime, lists & dictionaries stored in a FeatureValue registry

    • Track arbitrary typed parameters for runs through a Param registry analogous to the Feature registry: this replaces the hard-to-validate, hard-to-migrate, and hard-to-query json field of Run

  • 🏗️ Refactor link models PR PR @falexwolf

    • All annotation-related links are now stratified by Feature: what held for ULabel now also holds for CellType and all other Bionty registries

    • Indicate whether semantic keys were used during validation to enable warnings upon renames

    • Protect artifact annotations rather than cascade delete them

    • More consistent naming of link models, e.g., ulabels.artifact_links instead of ulabels.artifactulabel_set

    • Dropped linking Bionty entities directly against Collection

    • Pruned & squashed migrations for faster instance creation
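
The dtype syntax mentioned above, e.g. cat[ULabel|bionty.Drug], is easy to take apart with plain string handling. The parser below is a hypothetical sketch for illustration and is not lamindb's own parsing code.

```python
def parse_dtype(dtype: str) -> tuple[str, list[str]]:
    """Split 'cat[ULabel|bionty.Drug]' into ('cat', ['ULabel', 'bionty.Drug'])."""
    if "[" not in dtype:
        return dtype, []  # plain dtypes such as 'int', 'float', 'bool'
    if not dtype.endswith("]"):
        raise ValueError(f"malformed dtype string: {dtype}")
    kind, _, inner = dtype[:-1].partition("[")
    registries = [name.strip() for name in inner.split("|") if name.strip()]
    return kind, registries


print(parse_dtype("cat[ULabel|bionty.Drug]"))
print(parse_dtype("float"))
```

The pipe-separated part names the registries whose records are valid categories for the feature.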

2024-05-14 db 0.71.3

2024-05-07 db 0.71.2

  • ✨ Enable passing parameters to ln.track() PR @falexwolf

2024-05-07 db 0.71.1

  • 🚸 Upload source code of scripts upon ln.finish() and no longer upon ln.track() PR @falexwolf

  • 🎨 Make features.add_feature_set public PR @sunnyosun

  • 🎨 Use the same uid for the same feature set in transfer PR @sunnyosun

  • 🎨 Upon upload switch to virtual key PR @falexwolf

  • ⚑️ Zarr and cache improvements PR @Koncopd

  • ♻️ Extend valid suffixes to composite suffixes PR @falexwolf

  • 🔥 Remove little-used artifact.view_tree() PR @falexwolf
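
To show what recognizing composite suffixes can involve, the sketch below extracts a suffix such as .tar.gz or .anndata.zarr from a path instead of only the last extension. The set `COMPOSITE_SUFFIXES` and the function `extract_suffix` are illustrative assumptions, not lamindb's actual list or implementation.

```python
from pathlib import PurePosixPath

# assumption: an illustrative, non-exhaustive set of composite suffixes
COMPOSITE_SUFFIXES = {".tar.gz", ".anndata.zarr", ".vitessce.json"}


def extract_suffix(path: str) -> str:
    """Return the longest known composite suffix, else the last simple suffix."""
    suffixes = PurePosixPath(path).suffixes
    for start in range(len(suffixes)):
        candidate = "".join(suffixes[start:])
        is_last = start == len(suffixes) - 1
        if is_last or candidate in COMPOSITE_SUFFIXES:
            return candidate
    return ""  # the path has no suffix at all


print(extract_suffix("results/run1.tar.gz"))
print(extract_suffix("my.study.anndata.zarr"))
print(extract_suffix("table.v2.parquet"))
```

Checking against a known set avoids treating arbitrary dots in a file name, as in table.v2.parquet, as part of the suffix.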

2024-05-01 db 0.71.0

  • ✨ Manage multiple storage locations with integrity PR @falexwolf

  • 🚚 Add an instance_uid field to Storage PR 374 @falexwolf

  • 🚸 Proper progress bars for upload and download PR @Koncopd

  • 🚸 Make save return self PR @falexwolf

2024-04-24 db 0.70.4

  • ✨ Allow passing path to .from_anndata PR @sunnyosun

  • 🚸 In .setup.delete(), check for data deletion & delete from hub PR @falexwolf

  • ⚑️ Speed up latest_version PR @falexwolf

  • 🚸 Better user feedback on folder-like artifacts PR @falexwolf

2024-04-22 db 0.70.3

  • 🚸 Update metadata like description upon re-running PR @falexwolf

  • 🐛 Fix detection of AnnData in zarr and h5ad, refactor directory upload PR @Koncopd

  • 🚸 Raise error if transforms of type notebook or script are passed manually PR @falexwolf

2024-04-19 db 0.70.2

  • ♻️ In Vitessce integration, separate VitessceConfig from its referenced artifacts PR @falexwolf

  • 🚸 In ln.finish(), remove flag i_saved_the_notebook PR @falexwolf

2024-04-18 db 0.70.1

2024-04-17 db 0.70.0

  • 🚸 Update data source in case transform is re-run PR @falexwolf

  • 🚸 Enable labeling transforms via transform.ulabels PR @falexwolf

  • 🚚 Deprecate stage() in favor of cache() PR @falexwolf

2024-04-12 db 0.69.10

  • ✨ Add .obsm and .layers to MappedCollection and rename label_keys to obs_keys PR @Koncopd

  • 🚸 Eliminate kwargs PR @sunnyosun

  • ✨ Introduce Annotate.from_mudata PR @sunnyosun

2024-04-08 db 0.69.9

2024-04-04 db 0.69.8

2024-04-03 db 0.69.7

  • ✨ Add ability to upload arbitrary files or folders from CLI PR @falexwolf

  • 🐛 Fix anndata backed mode incompatibility with scipy 1.13.0 f

2024-04-02 db 0.69.6

  • 🚑️ Temp fix region for non-hosted buckets PR @sunnyosun

2024-03-30 db 0.69.5

2024-03-30 db 0.69.4

2024-03-28 db 0.69.3

  • ✨ Introduce annotation flow via Annotate.from_df and Annotate.from_anndata PR 1 2 3 @sunnyosun

2024-03-26 db 0.69.2

2024-03-18 db 0.69.1

✨ To try out, add lamindb.validation with the Validator class PR @sunnyosun

2024-03-17 db 0.69.0

Main new features:

  • ✨ Integrate lamindb with git PR PR @falexwolf

  • ✨ Introduce ln.finish(), track run finish times as run.finished_at, rename run.run_at to run.started_at, upload notebooks during ln.finish() PR @falexwolf

  • 🚸 Upload script source code and environment during ln.track() PR @falexwolf

Other changes:

  • ✨ Allow including simple related fields in .df() PR @falexwolf

  • 🚚 Move transform settings into settings PR @falexwolf

  • ✨ Add latest_version filter for QuerySet PR @falexwolf

  • 🚚 Rename transform.short_name to transform.key PR @falexwolf

  • 🚸 Return storage_idx in MappedCollection PR @Koncopd

  • ♻️ Add a JSON field to Run PR @falexwolf

2024-03-11 db 0.68.2

  • 🚸 Move transform & run artifacts into cache before uploading PR @falexwolf

  • 🚸 More sensible transform types PR @falexwolf

  • 🚚 Rename lnschema_lamin1 to wetlab PR @falexwolf

2024-03-08 db 0.68.1

  • 🚸 You can now use ln.connect() to connect to a LaminDB instance PR @falexwolf

  • 🚸 You can no longer delete data from non-default storage locations, as these might be tracked in other instances PR @sunnyosun

  • 🚸 Enable transferring data from local instances to remote instances PR @sunnyosun

2024-03-01 db 0.68.0ΒΆ

🚸 Decouple features linking from Artifact construction PR 1 2 3 @sunnyosun.

import bionty as bt
import lamindb as ln

# mysc_adata is assumed to be an AnnData object that is already in memory

# default constructor for PathLike
artifact = ln.Artifact("mysc.h5ad", description="raw data")
# from_ constructors for other types
artifact = ln.Artifact.from_anndata(mysc_adata, description="raw data")  # no longer links features
artifact = artifact.save()

# high-level feature linking
artifact.features.add_from_anndata(var_field=bt.Gene.ensembl_gene_id)
artifact.features.add_from_df()

# low-level feature linking
meta = ln.Feature.from_values(mysc_adata.obs.columns, field="name")
genes = bt.Gene.from_values(mysc_adata.var.ensembl_gene_id, field="ensembl_gene_id")
artifact.features.add(meta, slot="obs")  # obs columns go into the obs slot
artifact.features.add(genes, slot="var")  # genes go into the var slot

# labels linking (no change)
labels = ln.ULabel.from_values(mysc_adata.obs.donor, field=...)
ln.save(labels)
artifact.labels.add(labels)

2024-02-02 db 0.67.3

2024-01-14 db 0.67.2

  • ✨ Enable staging notebooks & code using the CLI PR @falexwolf

2024-01-12 db 0.67.1

  • 🐛 Fix idempotency of collection.save() PR @falexwolf

  • 🚸 Disallow bulk-delete for Artifact, Transform & Collection PR @falexwolf

  • 🚸 Init transform versions at 1 PR @falexwolf

  • ✨ Load json and html files PR @falexwolf

2024-01-11 db 0.67.0

  • 🚚 Rename .bionty to .public, .from_bionty to .from_public PR @sunnyosun

2024-01-09 db 0.66.1

  • 🐛 Fix id matching in view_lineage PR @sunnyosun

  • ♻️ Fix connection timeouts PR @Koncopd

  • ♻️ Incorporate edge cases in inner and outer join in Collection.mapped PR @Koncopd

  • 🎨 Do not create organism records when calling .bionty() PR @sunnyosun

2024-01-07 db 0.66.0

2024-01-05 db 0.65.1

2024-01-02 db 0.65.0

bionty

| Name | PR | Developer | Date | Version |
|---|---|---|---|---|
| 🚚 Rename Bionty to PublicOntology class | 536 | sunnyosun | 2024-01-12 | 0.36.0 |
| 🚚 Rename bionty to bionty-base | 539 | sunnyosun | | |
| 🚚 Rename PublicSource to Source | 263 | sunnyosun | 2024-07-26 | 0.44.0 |
| 🎨 Do not add obsolete terms due to ontology_id duplication | 261 | sunnyosun | 2024-07-25 | |
| ⚡️ Speed up parents | 259 | sunnyosun | 2024-07-22 | |
| 🚚 Rename registry to record | 256 | falexwolf | 2024-07-17 | |
| ♻️ Consciously use class method | 255 | falexwolf | 2024-07-10 | |
| 🐛 Fix clashing reverse accessors between .previous_runs and .run | 249 | falexwolf | 2024-06-13 | 0.43.0 |
| ♻️ Reformulate data lineage, remove json field from run | 247 | falexwolf | 2024-05-19 | 0.42.0 |
| ♻️ Protect gene, protein, cell_marker & pathway in their FeatureSet relationships | 246 | falexwolf | 2024-05-18 | |
| 🏗️ Naming conventions for link tables, protecting deletion in link tables, maintaining integrity upon label & feature renames | 245 | falexwolf | 2024-05-18 | |
| ♻️ Account for migrations in lnschema_core | 244 | falexwolf | 2024-05-17 | |
| 🔥 Prune migrations | 243 | falexwolf | 2024-05-16 | |
| 🏗️ Spell out link tables with Artifact and link features | 239 | falexwolf | 2024-05-16 | |
| 🔥 Remove linking Collection to all Bionty entities | 238 | falexwolf | 2024-05-15 | |
| ✨ Add sources | 237 | sunnyosun | 2024-05-14 | |
| 🎨 Fix passing arguments to from_public | 236 | sunnyosun | 2024-05-13 | |
| 🐛 Fix organism | 235 | sunnyosun | 2024-05-08 | |
| 🚑️ Fix public_source in inspect | 232 | sunnyosun | 2024-04-18 | |
| 🐛 Fix syncing public sources | 230 | sunnyosun | 2024-04-11 | |
| ✨ Add PublicSource.set_as_currently_used | 223 | sunnyosun | 2024-03-14 | 0.41.4 |
| ✏️ Fix encoding | 213 | sunnyosun | 2024-01-12 | 0.38.4 |
| 🚑️ Re-encode PublicSource | 212 | sunnyosun | 2024-01-10 | 0.38.3 |
| 🚚 Rename .bionty to .public | 208 | sunnyosun | 2024-01-09 | |
| 🚸 Do not create organism when calling bionty | 206 | sunnyosun | 2024-01-08 | |

wetlab

| Name | PR | Developer | Date | Version |
|---|---|---|---|---|
| ♻️ Model categoricals via simple Literal | 60 | falexwolf | 2024-08-02 | |
| ✨ Improved support for perturbations | 56 | Zethson | 2024-07-29 | |
| 🚚 Rename lnschema-lamin1 to wetlab | 47 | sunnyosun | 2024-03-08 | 0.27.0 |

nbproject

| Name | PR | Developer | Date | Version |
|---|---|---|---|---|
| ⬆️ Upgrade to pydantic v2 | 284 | falexwolf | 2024-07-23 | |
| ⚡️ Warn instead of raising the exception when ipylab is not installed | 283 | Koncopd | 2024-05-08 | 0.10.3 |
| ♻️ Make ipylab an optional dependency | 282 | falexwolf | 2024-05-06 | |
| 🔇 Silence erroneous logging | 279 | falexwolf | 2024-02-27 | 0.10.1 |
| 🚸 Init version at 1 | 277 | falexwolf | 2024-01-11 | 0.10.0 |