decoupler.benchmark
- decoupler.benchmark(mat, obs, net, perturb, sign, metrics=['auroc', 'auprc', 'mcauroc', 'mcauprc', 'rank', 'nrank'], groupby=None, by='experiment', f_expr=True, f_srcs=False, min_exp=5, pi0=0.5, n_iter=1000, seed=42, verbose=True, use_raw=True, decouple_kws={})
Benchmark methods or networks on a given set of perturbation experiments using activity inference with decoupler.
- Parameters:
- mat : list, DataFrame or AnnData
List of [features, matrix], dataframe (samples x features) or an AnnData instance.
- obs : DataFrame or None
Metadata containing the perturbed targets and the sign of the perturbation. If mat is an AnnData instance, use its mat.obs attribute instead.
- net : DataFrame, dict
Network in long format. Can also be a dictionary of networks, where each key is the network name and each value is the corresponding long-format DataFrame.
- perturb : str
Column name in obs with perturbed sources.
- sign : str, int
Column name in obs with sign of the perturbation. Can be set to 1 or -1 if all experiments are overexpression or knockouts, respectively.
- metrics : list, str
Performance metric(s) to compute. See the description of get_performance for more details.
- groupby : list, str, None
Performance metric(s) can be computed per group if enough experiments are available.
- by : str
Whether to evaluate performances at the “experiment” or at the “source” level.
- f_expr : bool
Whether to filter out experiments whose perturbed sources are not in the given net. Defaults to True.
- f_srcs : bool
Whether to filter out sources in net for which there is no perturbation data. Defaults to False.
- min_exp : int
Minimum number of perturbation experiments per group.
- pi0 : float
Reference ratio for calibrated metrics. Corresponds to the baseline (reference) class imbalance to which the metric is calibrated.
- n_iter : int
Number of downsampling iterations used for the ‘mcauroc’ and ‘mcauprc’ metrics.
- seed : int
Random seed to use.
- verbose : bool
Whether to show progress.
- use_raw : bool
Use raw attribute of mat if present.
- decouple_kws : dict
Parameters for the decoupler.decouple function. If more than one net is given, use a nested dictionary where each main key is a network name and each value is a dictionary with the required arguments for that network.
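The nested form of decouple_kws described above can be sketched as follows. The network names (net_a, net_b) are hypothetical, and the inner keys (methods, consensus, min_n) are assumed to be keyword arguments accepted by decoupler.decouple in the installed version; check them against that function's documentation before relying on them.

```python
# Hypothetical network names; they are assumed to match the keys of the
# dictionary passed as `net` when several networks are benchmarked.
net_names = ["net_a", "net_b"]

# One inner dictionary of decoupler.decouple arguments per network.
# 'methods', 'consensus' and 'min_n' are assumed argument names.
decouple_kws = {
    "net_a": {"methods": ["ulm"], "consensus": False},
    "net_b": {"methods": ["ulm", "mlm"], "consensus": True, "min_n": 5},
}

# Names present in only one of the two collections, to catch typos early.
unmatched = set(decouple_kws) ^ set(net_names)
if unmatched:
    raise ValueError(f"decouple_kws keys do not match network names: {sorted(unmatched)}")
```

With a single network, a flat dictionary such as {"methods": ["ulm"]} would take the place of the nested one.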
- Returns:
- df : DataFrame
DataFrame containing the metrics’ scores.
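A minimal sketch of how a call might be assembled from the signature above, assuming a dataset in which every experiment is a knockout (so sign is set to -1 rather than to a column name). The column name 'TF' and the helper run_benchmark are hypothetical; the keyword names are taken from the documented signature.

```python
# Keyword arguments taken from the documented signature.
# 'TF' is a hypothetical column of obs holding the perturbed source.
benchmark_kwargs = {
    "perturb": "TF",
    "sign": -1,                      # all experiments assumed to be knockouts
    "metrics": ["auroc", "auprc"],
    "by": "experiment",
    "min_exp": 5,
    "seed": 42,
    "verbose": False,
}


def run_benchmark(mat, obs, net):
    """Run decoupler.benchmark on user-supplied inputs.

    decoupler is imported here so that the configuration above can be
    built and inspected without the package installed.
    """
    import decoupler as dc

    return dc.benchmark(mat, obs, net, **benchmark_kwargs)
```

Calling run_benchmark(mat, obs, net) with an expression matrix, its metadata, and a long-format network would then return the DataFrame of scores described under Returns.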