decoupler.benchmark

decoupler.benchmark(mat, obs, net, perturb, sign, metrics=['auroc', 'auprc', 'mcauroc', 'mcauprc', 'rank', 'nrank'], groupby=None, by='experiment', f_expr=True, f_srcs=False, min_exp=5, pi0=0.5, n_iter=1000, seed=42, verbose=True, use_raw=True, decouple_kws={})

Benchmark methods or networks on a given set of perturbation experiments using activity inference with decoupler.

Parameters:

mat (list, DataFrame or AnnData): List of [features, matrix], DataFrame (samples x features) or an AnnData instance.
obs (DataFrame or None): Metadata containing the perturbed targets and the sign of the perturbation. If mat is an AnnData instance, its mat.obs attribute is used instead.
net (DataFrame, dict): Network in long format. Can be a dictionary of networks, where each key is a network name and each value is the corresponding long-format DataFrame.
perturb (str): Column name in obs with the perturbed sources.
sign (str, int): Column name in obs with the sign of the perturbation. Can be set to 1 or -1 if all experiments are overexpressions or knockouts, respectively.
metrics (list, str): Performance metric(s) to compute. See the description of get_performance for more details.
groupby (list, str, None): Column(s) by which to group experiments. Performance metric(s) can be computed per group if enough experiments are available.
by (str): Whether to evaluate performance at the 'experiment' or at the 'source' level.
f_expr (bool): Whether to filter out experiments whose perturbed sources are not in the given net. Defaults to True.
f_srcs (bool): Whether to filter out sources in net for which there is no perturbation data. Defaults to False.
min_exp (int): Minimum number of perturbation experiments per group.
pi0 (float): Reference ratio for calibrated metrics. Corresponds to the baseline/reference class imbalance to which the metric is calibrated.
n_iter (int): Number of downsampling iterations used for the 'mcauroc' and 'mcauprc' metrics.
seed (int): Random seed to use.
verbose (bool): Whether to show progress.
use_raw (bool): Use the raw attribute of mat if present.
decouple_kws (dict): Parameters for the decoupler.decouple function. If more than one net is given, use a nested dictionary where each main key is a network name and each value is a dictionary with the required arguments.
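
The input shapes described above can be sketched with toy data. This is a minimal illustration, not a reference setup: the sample, gene and network names, the obs column names "TF" and "sign", and all values are hypothetical. The net columns "source", "target" and "weight" and the decouple arguments "methods", "consensus" and "min_n" are assumed to match decoupler.decouple in the documented version.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
samples = [f"exp{i}" for i in range(6)]   # hypothetical experiment ids
genes = [f"G{i}" for i in range(8)]       # hypothetical feature names

# mat: DataFrame with samples as rows and features as columns.
mat = pd.DataFrame(rng.normal(size=(len(samples), len(genes))),
                   index=samples, columns=genes)

# obs: one row per sample, naming the perturbed source and the sign
# of the perturbation (-1 for a knockout, 1 for an overexpression).
obs = pd.DataFrame(
    {"TF": ["TF1", "TF2", "TF1", "TF2", "TF1", "TF2"],
     "sign": [-1, -1, 1, -1, 1, -1]},
    index=samples,
)

# net: long format, one row per source-target interaction.
net_a = pd.DataFrame({
    "source": ["TF1"] * 4 + ["TF2"] * 4,
    "target": genes,
    "weight": [1.0, 1.0, -1.0, 1.0, 1.0, -1.0, 1.0, 1.0],
})
net_b = net_a.assign(weight=1.0)  # same edges, unsigned weights

# Several networks: dictionary keyed by network name.
nets = {"net_a": net_a, "net_b": net_b}

# Nested decouple_kws: the main keys mirror the network names, and each
# value holds the arguments forwarded to decoupler.decouple for that net.
decouple_kws = {
    "net_a": {"methods": ["ulm", "mlm"], "consensus": False, "min_n": 3},
    "net_b": {"methods": ["ulm"], "min_n": 3},
}
```

Here perturb would be "TF" and sign would be "sign". If every experiment were a knockout, the sign column could be dropped and sign=-1 passed instead.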

Returns:

df (DataFrame): DataFrame containing the metric scores.
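
A call following the signature above might be wrapped as in the sketch below. The helper name run_benchmark and its defaults (the "TF" column, sign=-1, the two metrics chosen) are hypothetical. decoupler is imported inside the function so that the definition itself does not require the package to be installed.

```python
def run_benchmark(mat, obs, net, perturb="TF", sign=-1, decouple_kws=None):
    """Hypothetical thin wrapper around decoupler.benchmark.

    mat, obs and net are expected in the shapes listed under Parameters.
    Returns the DataFrame of metric scores produced by the benchmark.
    """
    import decoupler as dc  # lazy import: only needed when the function runs

    return dc.benchmark(
        mat, obs, net,
        perturb=perturb,             # obs column naming the perturbed source
        sign=sign,                   # -1 assumes every experiment is a knockout
        metrics=["auroc", "auprc"],  # subset of the default metrics
        by="experiment",
        min_exp=5,
        verbose=False,
        decouple_kws=decouple_kws or {},
    )
```

Passing a dictionary of networks as net, together with a matching nested decouple_kws, would benchmark several networks in one call.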