Transform a number of array shards into a single array store

In the previous notebooks, we’ve seen how to incrementally create a collection of datasets and train models on it.

In some situations, we want to concatenate all datasets into one big array store to speed up ad-hoc queries from the cloud for slices defined by arbitrary metadata.

This is what CELLxGENE does to create Census: a number of .h5ad files are concatenated into a single TileDB-SOMA array store. To see how this looks for CELLxGENE, refer to CELLxGENE: scRNA-seq.