lamindb.Branch

class lamindb.Branch(name: str, description: str | None = None)

Bases: BaseSQLRecord

Branches for change management with archive and trash states.

The 3 built-in branches: main, trash & archive

The main branch acts as the default branch.

The trash branch acts like a trash bin on a file system. If you delete a SQLRecord object via .delete(), it gets moved onto the trash branch and scheduled for deletion.

The archive branch hides objects from queries and searches without scheduling them for deletion. To move an object into the archive, run: obj.branch_id = 0; obj.save().
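
The difference between the three built-in branches can be illustrated with a toy in-memory model. This is only a sketch of the behavior described above, not LaminDB's implementation: ToyRecord, soft_delete, archive and default_query are hypothetical names, and branches are modeled as plain strings rather than database rows.

```python
from dataclasses import dataclass

# Toy stand-ins for the three built-in branches (assumption: plain strings).
MAIN, TRASH, ARCHIVE = "main", "trash", "archive"


@dataclass
class ToyRecord:
    name: str
    branch: str = MAIN  # new objects start on the default branch


def soft_delete(record: ToyRecord) -> None:
    """Mimic .delete(): move the record onto the trash branch."""
    record.branch = TRASH


def archive(record: ToyRecord) -> None:
    """Mimic archiving: hide the record without scheduling it for deletion."""
    record.branch = ARCHIVE


def default_query(records: list[ToyRecord]) -> list[str]:
    """Names a default query would show: neither trashed nor archived."""
    return [r.name for r in records if r.branch not in (TRASH, ARCHIVE)]


records = [ToyRecord("a"), ToyRecord("b"), ToyRecord("c")]
soft_delete(records[0])
archive(records[1])
visible = default_query(records)
print(visible)  # → ['c']
```

Both the trashed and the archived record disappear from the default query; in this model they differ only in which branch they sit on, which is what decides whether they are later deleted.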

Parameters:
  • name – A unique name. Uniqueness is enforced on the lower-cased name, so two names that differ only in case cannot coexist.

  • description – A description.
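
The uniqueness rule for name can be sketched in a few lines. The helper name_is_available below is hypothetical and only mirrors the rule as stated; the real constraint is enforced by the database.

```python
def name_is_available(name: str, existing_names: list[str]) -> bool:
    """Return True if no existing name equals `name` after lower-casing both."""
    taken = {n.lower() for n in existing_names}
    return name.lower() not in taken


existing = ["main", "trash", "archive", "My_Branch"]
print(name_is_available("my_branch", existing))  # → False
print(name_is_available("feature-x", existing))  # → True
```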

Examples

To create a contribution branch and switch to it, run:

lamin switch -c my_branch

To merge a contribution branch into main, run:

lamin switch main  # switch to the main branch
lamin merge my_branch  # merge contribution branch into main

To see the current branch along with other information, run:

lamin info

To annotate the current branch with a README.md, run:

lamin annotate branch --readme README.md

To comment on the current branch, run:

lamin annotate branch --comment "I think we should revisit this, tomorrow, WDYT?"

To describe the current branch (optionally include comments), run:

lamin describe branch --include comments

To trace on which branch a SQLRecord object was created, run:

sqlrecord.created_on.describe()

To open a Change Request for a branch, run:

lamin update branch --status draft  # for current branch
lamin update branch --name my_branch --status review  # for any branch

Or, in Python:

branch = ln.Branch.get(name="my_branch")
branch.status = "draft"  # open the Change Request in draft state
branch.save()

branch.status = "review"  # request review for the Change Request
branch.save()

Just like Pull Requests on GitHub, branches are never deleted so that the provenance of a change stays traceable.

Managing is_latest during branching

is_latest is branch-aware during development and reconciled on merge.

  • Creating a new version on a contribution branch keeps the previous version on main as is_latest=True.

  • After lamin merge, only one object per version family remains with is_latest=True in the target branch.

  • If both source and target branches have is_latest=True, the merged branch keeps the newest object by created_at.
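
The three rules above can be sketched as a small in-memory reconciliation. This is a minimal illustration under stated assumptions, not the code that lamin merge runs: ToyVersion, its family field and merge_into are hypothetical, and merging is modeled as reassigning the branch of each object.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ToyVersion:
    family: str  # hypothetical key shared by all versions of one object
    version: str
    branch: str
    created_at: datetime
    is_latest: bool


def merge_into(target: str, source: str, objects: list[ToyVersion]) -> None:
    """Move objects from `source` onto `target`, then keep one is_latest per family."""
    for obj in objects:
        if obj.branch == source:
            obj.branch = target
    families = {o.family for o in objects if o.branch == target}
    for family in families:
        candidates = [
            o for o in objects
            if o.branch == target and o.family == family and o.is_latest
        ]
        if len(candidates) > 1:
            # Both branches claimed is_latest: the newest object by created_at wins.
            newest = max(candidates, key=lambda o: o.created_at)
            for o in candidates:
                o.is_latest = o is newest


v1 = ToyVersion("fam", "1", "main", datetime(2024, 1, 1), True)
v2 = ToyVersion("fam", "2", "my_branch", datetime(2024, 2, 1), True)
merge_into("main", "my_branch", [v1, v2])
print(v1.is_latest, v2.is_latest)  # → False True
```

Before the call, both objects carry is_latest=True on their respective branches; afterwards only the newer one does.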

Example flow:

# before merge
# main: v1.is_latest=True
# contribution branch: v2(revises=v1).is_latest=True
lamin switch main
lamin merge my_branch
# after merge on main: v2.is_latest=True, v1.is_latest=False

Logical vs. physical branching

LaminDB uses logical branching via SQLRecord’s .branch field, treating branch like any other field during queries & tracing, and keeping infrastructure simple and platform-agnostic. However, it doesn’t allow isolating SQL UPDATE statements on a branch (only their corresponding DbWrite events). Here are some notable alternatives:

  • Some Postgres platforms like Supabase or Neon, by contrast, provide physical branching by cloning entire databases. This allows for isolated SQL UPDATE statements but creates separate, disconnected environments and considerable overhead.

  • Project Nessie is a versioned catalog for data lakes that tracks file states. LaminDB is analogous to Nessie in that it also treats branching on the metadata catalog level (considering LaminDB’s SQL database as the metadata catalog).

  • Dolt is a specialized database engine that provides storage-level branching. It allows branch isolation and merging at the engine level. While powerful, it requires using the Dolt database itself.

Why logical branching? Data science and ML workflows are primarily append-only. A “change” usually produces a new version of an artifact, transform, or collection, or new runs and other new objects, rather than an in-place modification, so the row-level branch field provides isolation for 99% of use cases. This avoids the technical complexity of row duplication, preserves database integrity, and allows the is_latest logic to reconcile versions globally upon merge.
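
The trade-off can be made concrete with a toy table in which branch is just another column. The rows and the query helper below are hypothetical and stand in for a registry table; the sketch only shows why appended rows are isolated per branch while an in-place update is not.

```python
# One physical table; "branch" is an ordinary column (toy model, not LaminDB's schema).
rows = [
    {"id": 1, "key": "data.parquet", "version": "1", "branch": "main"},
]


def query(branch: str) -> list[dict]:
    """Filter by branch like by any other field."""
    return [r for r in rows if r["branch"] == branch]


# Append-only change: a new version is a new row tagged with the contribution branch.
rows.append({"id": 2, "key": "data.parquet", "version": "2", "branch": "my_branch"})
main_versions = [r["version"] for r in query("main")]
branch_versions = [r["version"] for r in query("my_branch")]
print(main_versions, branch_versions)  # → ['1'] ['2']

# In-place change: there is only one physical row, so the edit is not isolated.
rows[0]["key"] = "renamed.parquet"
main_keys = [r["key"] for r in query("main")]
print(main_keys)  # → ['renamed.parquet']
```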

Attributes

property status: Literal['standalone', 'draft', 'review', 'merged', 'closed']

Branch status.

Get and set the status of the branch.

status      code  description
----------  ----  --------------------------------------------------
closed      -2    Change Request was closed without merging.
merged      -1    The branch was merged into another branch.
standalone  0     A standalone branch without Change Request.
draft       1     Change Request exists but is not ready for review.
review      2     Change Request is ready for review.

The database stores the branch status as an integer code in field _status_code.
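
The mapping between status names and integer codes can be written down directly from the table above. The helper names encode_status and decode_status are hypothetical; the sketch only shows the mapping, not how the status property is implemented.

```python
# Status names and codes as listed in the documentation table.
STATUS_TO_CODE = {
    "closed": -2,
    "merged": -1,
    "standalone": 0,
    "draft": 1,
    "review": 2,
}
CODE_TO_STATUS = {code: status for status, code in STATUS_TO_CODE.items()}


def encode_status(status: str) -> int:
    """Translate a status name into its integer code."""
    if status not in STATUS_TO_CODE:
        raise ValueError(f"unknown status: {status!r}")
    return STATUS_TO_CODE[status]


def decode_status(code: int) -> str:
    """Translate an integer code back into its status name."""
    if code not in CODE_TO_STATUS:
        raise ValueError(f"unknown status code: {code!r}")
    return CODE_TO_STATUS[code]


print(encode_status("review"))  # → 2
print(decode_status(-1))  # → merged
```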

Example

See the status of a branch:

branch.status
#> 'standalone'

Open a Change Request in draft state:

branch.status = "draft"
branch.save()

Request review for the Change Request:

branch.status = "review"
branch.save()

Query by status:

ln.Branch.filter(status="merged").to_dataframe()

Simple fields

name: str

Name of branch.

uid: str

Universal id.

This id is useful if one wants to apply the same patch to many database instances.

description: str | None

Description of branch.

created_at: datetime

Time of creation of record.

Relational fields

space: Space

The space in which the object is defined.

created_by: User

Creator of branch.

users: RelatedManager[User]

Users linked to this branch (e.g. reviewers) ← branches.

ulabels: RelatedManager[ULabel]

ULabels annotating this branch ← ulabel.

projects: RelatedManager[Project]

Projects annotating this branch ← project.

ablocks: RelatedManager[BranchBlock]

Attached blocks ← branch.

Class methods

classmethod filter(*queries, **expressions)

Query records.

Parameters:
  • queries – One or multiple Q objects.

  • expressions – Fields and values passed as Django query expressions.

Return type:

QuerySet

Examples

>>> ln.Project(name="my label").save()
>>> ln.Project.filter(name__startswith="my").to_dataframe()

classmethod get(idlike=None, **expressions)

Get a single record.

Parameters:
  • idlike (int | str | None, default: None) – Either a uid stub, uid or an integer id.

  • expressions – Fields and values passed as Django query expressions.

Raises:

lamindb.errors.ObjectDoesNotExist – In case no matching record is found.

Return type:

SQLRecord

Examples

record = ln.Record.get("FvtpPJLJ")
record = ln.Record.get(name="my-label")

classmethod to_dataframe(include=None, features=False, limit=100)

Evaluate and convert to pd.DataFrame.

By default, this returns up to 100 rows for a fast overview. Pass limit=None to fetch all matching records.

By default, maps simple fields and foreign keys onto DataFrame columns.

Guide: Query & search registries

Parameters:
  • include (str | list[str] | None, default: None) – Related data to include as columns. Takes strings of form "records__name", "cell_types__name", etc. or a list of such strings. For Artifact, Record, and Run, can also pass "features" to include features with data types pointing to entities in the core schema. If "privates", includes private fields (fields starting with _).

  • features (bool | list[str], default: False) – Configure the features to include. Can be a feature name or a list of such names. If "queryset", infers the features used within the current queryset. Only available for Artifact, Record, and Run.

  • limit (int, default: 100) – Maximum number of rows to display. Defaults to 100. If None, includes all results.

  • order_by – Field name to order the records by. Prefix with ‘-’ for descending order. Defaults to ‘-id’ to get the most recent records. This argument is ignored if the queryset is already ordered or if the specified field does not exist.

Return type:

DataFrame

Examples

Include the name of the creator:

ln.Record.to_dataframe(include="created_by__name")

Include features:

ln.Artifact.to_dataframe(include="features")

Include selected features:

ln.Artifact.to_dataframe(features=["cell_type_by_expert", "cell_type_by_model"])

classmethod search(string, *, field=None, limit=20, case_sensitive=False)

Search.

Parameters:
  • string (str) – The input string to match against the field ontology values.

  • field (str | DeferredAttribute | None, default: None) – The field or fields to search. Search all string fields by default.

  • limit (int | None, default: 20) – Maximum number of top results to return.

  • case_sensitive (bool, default: False) – Whether the match is case sensitive.

Return type:

QuerySet

Returns:

A QuerySet of matching records, sorted by relevance to the search string.

See also

filter(), lookup()

Examples

records = ln.ULabel.from_values(["Label1", "Label2", "Label3"]).save()
ln.ULabel.search("Label2")

classmethod lookup(field=None, return_field=None)

Return an auto-complete object for a field.

Parameters:
  • field (str | DeferredAttribute | None, default: None) – The field to look up the values for. Defaults to first string field.

  • return_field (str | DeferredAttribute | None, default: None) – The field to return. If None, returns the whole record.

  • keep – When multiple records are found for a lookup, how to return the records:
      - "first": return the first record.
      - "last": return the last record.
      - False: return all records.

Return type:

NamedTuple

Returns:

A NamedTuple of lookup information of the field values with a dictionary converter.
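
As a rough illustration of what such an object looks like, the sketch below builds a NamedTuple from a list of values and pairs it with a dictionary keyed by the original values. The helpers to_attribute_name and build_lookup are hypothetical, and the key normalization (lower-casing, replacing non-identifier characters with underscores) is an assumption inferred from the examples in this section, not LaminDB's actual code. The real lookup returns records rather than strings.

```python
import re
from collections import namedtuple


def to_attribute_name(value: str) -> str:
    """Lower-case and replace characters that are invalid in identifiers with '_'."""
    return re.sub(r"\W", "_", value.lower())


def build_lookup(values: list[str]):
    """Return (NamedTuple for dot access, dict keyed by the original values)."""
    keys = [to_attribute_name(v) for v in values]
    Lookup = namedtuple("Lookup", keys)
    return Lookup(*values), dict(zip(values, values))


lookup, lookup_dict = build_lookup(["ADGB-DT", "ENSG00000002745"])
print(lookup.adgb_dt)  # → ADGB-DT
print(lookup.ensg00000002745)  # → ENSG00000002745
print(lookup_dict["ADGB-DT"])  # → ADGB-DT
```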

See also

search()

Examples

Look up via auto-complete on attribute access (.):

import bionty as bt
bt.Gene.from_source(symbol="ADGB-DT").save()
lookup = bt.Gene.lookup()
lookup.adgb_dt

Look up via auto-complete in dictionary:

lookup_dict = lookup.dict()
lookup_dict['ADGB-DT']

Look up via a specific field:

lookup_by_ensembl_id = bt.Gene.lookup(field="ensembl_gene_id")
lookup_by_ensembl_id.ensg00000002745

Return a specific field value instead of the full record:

lookup_return_symbols = bt.Gene.lookup(field="ensembl_gene_id", return_field="symbol")

classmethod connect(instance)

Query a non-default LaminDB instance.

Parameters:

instance (str | None) – An instance identifier of form “account_handle/instance_name”.

Return type:

QuerySet

Examples

ln.Record.connect("account_handle/instance_name").search("label7", field="name")

Methods

save(*args, **kwargs)

Save.

Always saves to the default database.

Return type:

TypeVar(T, bound= SQLRecord)

classmethod describe(include=None)

Describe record including relations.

Parameters:
  • return_str (bool, default: False) – Return a string instead of printing.

  • include (None | Literal['comments'], default: None) – Include additional content. Use "comments" to display readme and comment blocks.

Return type:

None | str

delete(permanent=None)

Delete.

Parameters:

permanent (bool | None, default: None) – For consistency, False raises an error, as soft delete is impossible.

Returns:

When permanent=True, returns Django’s delete return value – a tuple of (deleted_count, {registry_name: count}). Otherwise returns None.

refresh_from_db(using=None, fields=None, from_queryset=None)

Reload field values from the database.

By default, the reloading happens from the database this instance was loaded from, or by the read router if this instance wasn’t loaded from any database. The using parameter will override the default.

Fields can be used to specify which fields to reload. The fields should be an iterable of field attnames. If fields is None, then all non-deferred fields are reloaded.

When accessing deferred fields of an instance, the deferred loading of the field will call this method.