Query & integrate data

import lamindb as ln
import lnschema_bionty as lb

lb.settings.species = "human"
💡 loaded instance: testuser1/test-facs (lamindb 0.55.0)

ln.track()
💡 notebook imports: lamindb==0.55.0 lnschema_bionty==0.31.2
💡 Transform(id='wukchS8V976Uz8', name='Query & integrate data', short_name='facs3', version='0', type=notebook, updated_at=2023-10-04 16:42:50, created_by_id='DzTjkKse')
💡 Run(id='pZPMQ0V1hGPsIyVB3lIH', run_at=2023-10-04 16:42:50, transform_id='wukchS8V976Uz8', created_by_id='DzTjkKse')

Inspect the CellMarker registry

Inspect your aggregated cell marker registry as a DataFrame:

lb.CellMarker.filter().df().head()
              name   synonyms  gene_symbol  ncbi_gene_id  uniprotkb_id  species_id  bionty_source_id  updated_at           created_by_id
id
lRZYuH929QDw  CD85j            None         None          None          uHJU        EBWO              2023-10-04 16:42:20  DzTjkKse
L0WKZ3fufq0J  CD11c            ITGAX        3687          P20702        uHJU        EBWO              2023-10-04 16:42:20  DzTjkKse
0evamYEdmaoY  Igd              None         None          None          uHJU        EBWO              2023-10-04 16:42:20  DzTjkKse
CR7DAHxybgyi  CD38             CD38         952           B4E006        uHJU        EBWO              2023-10-04 16:42:20  DzTjkKse
0qCmUijBeByY  CD94             KLRD1        3824          Q13241        uHJU        EBWO              2023-10-04 16:42:20  DzTjkKse

Search for a marker (synonym-aware):

lb.CellMarker.search("PD-1").head(2)
         id            synonyms        __ratio__
name
PD1      2VeZenLi2dj5  PID1|PD-1|PD 1  100.000000
CD14/19  9VptKqpwq9BZ                  54.545455
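
The notebook does not show how `search` computes `__ratio__`. As a rough illustration of the idea only, the sketch below ranks names by the best similarity over a name and its pipe-separated synonyms, using `difflib` from the standard library rather than lamindb's own scorer. `REGISTRY`, `ratio`, and `search` are hypothetical names, and the scores will not match the output above:

```python
from difflib import SequenceMatcher

# Toy registry: marker name -> pipe-separated synonyms.
# PD1 and CD14/19 are copied from the search output above; CD8 has no synonyms.
REGISTRY = {
    "PD1": "PID1|PD-1|PD 1",
    "CD14/19": "",
    "CD8": "",
}


def ratio(a, b):
    """Case-insensitive similarity on a 0-100 scale (difflib, not lamindb's scorer)."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def search(query, registry):
    """Rank names by the best score over the name and each of its synonyms."""
    scored = []
    for name, synonyms in registry.items():
        candidates = [name] + [s for s in synonyms.split("|") if s]
        scored.append((name, max(ratio(query, c) for c in candidates)))
    # Highest score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


results = search("PD-1", REGISTRY)
print(results[0])
```

Because "PD-1" is listed as a synonym of PD1, that record gets a perfect score even though the query differs from the record's name.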

Look up markers with auto-complete:

markers = lb.CellMarker.lookup()

markers.cd8
CellMarker(id='ttBc0Fs01sYk', name='CD8', synonyms='', gene_symbol='CD8A', ncbi_gene_id='925', uniprotkb_id='P01732', updated_at=2023-10-04 16:42:20, species_id='uHJU', bionty_source_id='EBWO', created_by_id='DzTjkKse')
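
To make the auto-complete idea concrete, here is a minimal sketch of a lookup object, assuming that display names are normalized to lowercase Python identifiers. It is not lamindb's implementation; `to_attribute` and `build_lookup` are hypothetical helpers, and the records are a subset of the fields shown above:

```python
import re
from types import SimpleNamespace

# Records copied from the registry output above (subset of fields).
RECORDS = [
    {"name": "CD8", "gene_symbol": "CD8A", "ncbi_gene_id": "925"},
    {"name": "CD11c", "gene_symbol": "ITGAX", "ncbi_gene_id": "3687"},
    {"name": "CD94", "gene_symbol": "KLRD1", "ncbi_gene_id": "3824"},
]


def to_attribute(name):
    """Turn a display name into a lowercase Python identifier."""
    key = re.sub(r"\W+", "_", name.lower()).strip("_")
    # Identifiers cannot start with a digit, so prefix those with an underscore.
    return f"_{key}" if key[:1].isdigit() else key


def build_lookup(records):
    """Expose each record as an attribute so editors can tab-complete names."""
    return SimpleNamespace(**{to_attribute(r["name"]): r for r in records})


markers = build_lookup(RECORDS)
print(markers.cd8["gene_symbol"])
```

Under this normalization a name such as 'CD14/19' would become the attribute `cd14_19`.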

Query files by markers

Query panels and datasets by marker, e.g., find the datasets whose flow panel includes 'CD8':

panels_with_cd8 = ln.FeatureSet.filter(cell_markers=markers.cd8).all()
ln.File.filter(feature_sets__in=panels_with_cd8).df()
                      storage_id  key   suffix  accessor  description  version  size      hash                    hash_type  transform_id    run_id                initial_version_id  updated_at           created_by_id
id
GUHGDTFLeHzBjo4TKBwe  dSnMNy0e    None  .h5ad   AnnData   Oetjen18_t1  None     46501304  I8nRS02iBs5z1J01b2qwOg  md5        SmQmhrhigFPLz8  Xw5xc4OteUjTU31pfcZC  None                2023-10-04 16:42:41  DzTjkKse
dwdXtuUm7KR2XuBNSbsv  dSnMNy0e    None  .h5ad   AnnData   Alpert19     None     33369696  VsTnnzHN63ovNESaJtlRUQ  md5        OWuTtS4SAponz8  E3HTifOWblJJ2zFrUEug  None                2023-10-04 16:42:27  DzTjkKse
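
The query above runs in two steps: first find the feature sets (panels) that contain the marker, then find the files linked to any of those feature sets. A plain-Python sketch of that logic, with no database involved, might look like this. All panel and file IDs and their contents are hypothetical toy data:

```python
# Toy link tables; every ID and panel composition below is hypothetical.
PANEL_MARKERS = {
    "panel_a": {"CD8", "CD3", "CD27"},
    "panel_b": {"CD8", "CD45RA"},
    "panel_c": {"CD38", "CD94"},
}
FILE_PANELS = {
    "file_1": {"panel_a"},
    "file_2": {"panel_b"},
    "file_3": {"panel_c"},
}


def panels_with(marker, panel_markers):
    """Step 1: feature sets (panels) that contain the marker."""
    return {panel for panel, markers in panel_markers.items() if marker in markers}


def files_with_panels(panels, file_panels):
    """Step 2: files linked to at least one of those panels."""
    return sorted(f for f, linked in file_panels.items() if linked & panels)


panels_with_cd8 = panels_with("CD8", PANEL_MARKERS)
files = files_with_panels(panels_with_cd8, FILE_PANELS)
print(files)
```

In this toy data `file_3` is excluded because its only panel does not include 'CD8'.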

Access registries:

features = ln.Feature.lookup()

Find shared cell markers between two files:

files = ln.File.filter(feature_sets__in=panels_with_cd8).list()
file1, file2 = files[0], files[1]
shared_markers = file1.features["var"] & file2.features["var"]
shared_markers.list("name")
['CD8', 'Cd4', 'Ccr7', 'CD27', 'CD45RA', 'CD3']
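
Conceptually, the shared markers are the intersection of the two panels. A minimal sketch with plain lists, assuming names are compared as exact strings. The two panels are hypothetical; only the six shared names come from the output above:

```python
# Hypothetical panels; only the six shared names come from the output above.
panel_1 = ["CD8", "Cd4", "Ccr7", "CD27", "CD45RA", "CD3", "CD38"]
panel_2 = ["CD3", "CD45RA", "CD27", "Ccr7", "Cd4", "CD8", "CD94"]


def shared(first, second):
    """Markers present in both panels, in the order they appear in `first`."""
    second_set = set(second)
    return [name for name in first if name in second_set]


shared_markers = shared(panel_1, panel_2)
print(shared_markers)
```

The comparison in this sketch is case-sensitive, so 'Cd4' in one panel would not match 'CD4' in another. Linking both files to the same registry records avoids that kind of mismatch.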