decoupler.filter_by_expr

decoupler.filter_by_expr(adata, obs=None, group=None, lib_size=None, min_count=10, min_total_count=15, large_n=10, min_prop=0.7)

Determine which genes have sufficiently large counts to be retained in a statistical analysis.

Adapted from the function filterByExpr of edgeR (https://rdrr.io/bioc/edgeR/man/filterByExpr.html).

Parameters:
adata : AnnData

AnnData obtained after running decoupler.get_pseudobulk.

obs : DataFrame, None

Metadata dataframe. Only needed if adata is not an AnnData object.

group : str, None

Name of the .obs column to group by. If None, all samples are assumed to belong to one group.

lib_size : int, float, None

Library size. If None, defaults to the sum of counts per sample.

min_count : int

Minimum count required per gene for at least some samples.

min_total_count : int

Minimum total count required per gene across all samples.

large_n : int

Number of samples per group that is considered to be “large”.

min_prop : float

Minimum proportion of samples in the smallest group that express the gene.

Returns:
genes : ndarray

Array of genes to be kept.
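
Example:

The parameters above interact in a way that is easier to see in code. The following is a minimal NumPy sketch of the filtering rule as documented for edgeR's filterByExpr, from which this function is adapted. It is not decoupler's own implementation: the name filter_by_expr_sketch, the samples-by-genes layout of counts, and the boolean-mask return value are assumptions made for illustration.

```python
import numpy as np


def filter_by_expr_sketch(counts, groups=None, lib_size=None, min_count=10,
                          min_total_count=15, large_n=10, min_prop=0.7):
    """Hypothetical helper: boolean mask over genes (columns) to keep.

    counts is assumed to be a samples x genes matrix of raw counts.
    """
    counts = np.asarray(counts, dtype=float)
    n_samples = counts.shape[0]

    # Library size defaults to the total count of each sample.
    if lib_size is None:
        lib_size = counts.sum(axis=1)
    lib_size = np.broadcast_to(np.asarray(lib_size, dtype=float), (n_samples,))

    # Size of the smallest group; with no grouping, all samples form one group.
    if groups is None:
        min_sample_size = n_samples
    else:
        _, sizes = np.unique(np.asarray(groups), return_counts=True)
        min_sample_size = sizes.min()

    # For "large" groups, only a proportion of the extra samples is required.
    if min_sample_size > large_n:
        min_sample_size = large_n + (min_sample_size - large_n) * min_prop

    # Translate min_count into a counts-per-million cutoff.
    cpm_cutoff = min_count / np.median(lib_size) * 1e6
    cpm = counts / lib_size[:, None] * 1e6

    tol = 1e-14
    keep_cpm = (cpm >= cpm_cutoff).sum(axis=0) >= min_sample_size - tol
    keep_total = counts.sum(axis=0) >= min_total_count - tol
    return keep_cpm & keep_total


# Toy data: 4 samples x 3 genes, two groups of two samples each.
counts = np.array([
    [100, 0, 900],
    [120, 1, 880],
    [5, 0, 995],
    [8, 2, 990],
])
groups = ["a", "a", "b", "b"]
mask = filter_by_expr_sketch(counts, groups=groups)
```

In this toy data the first gene passes the CPM cutoff in only two samples, so it is kept when the smallest group has two samples but would be dropped if all four samples were treated as a single group. With decoupler itself, the returned genes are typically used to subset the pseudobulk AnnData, for example adata[:, genes].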