Release notes

1.4.0

get_pseudobulk changes:
- Default values no longer filter features. For feature filtering, see the new functions filter_by_expr and filter_by_prop.
- If feature filters are used, it may return more genes than before due to a change of > min_props to >= min_props.
- Now it returns quality control metrics such as psbulk_n_cells, psbulk_counts and psbulk_props.
- Now groups_col accepts multiple keys.
- Now mode accepts a dictionary of callable functions. The resulting profiles will be stored in .layers.
swap_layer now has a new argument X_layer_key, the .layers key under which the original .X is moved and stored.
Pseudobulk and bulk vignettes have been updated to use the PyDESeq2 package.
run_consensus now accepts extra arguments through the new parameter args, which are passed down to decouple.
OmniPath functions now return resources with sorted indexes and raise a warning if the version is too old.
run_wsum and run_wmean now correctly accept empty null distributions.
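
The change from > min_props to >= min_props noted above for get_pseudobulk only matters for features that sit exactly on the threshold. A minimal sketch in plain Python, not the package's own code; the gene names and proportions are made up for illustration:

```python
# Hypothetical proportion of cells expressing each gene in one sample.
props = {"GENE_A": 0.05, "GENE_B": 0.20, "GENE_C": 0.35}
min_props = 0.20  # illustrative threshold, named as in the note above

# Old behaviour: strictly greater than the threshold.
kept_strict = [gene for gene, p in props.items() if p > min_props]

# New behaviour: greater than or equal to the threshold.
kept_inclusive = [gene for gene, p in props.items() if p >= min_props]

print(kept_strict)
print(kept_inclusive)
```

GENE_B, which lies exactly at the threshold, is dropped by the strict comparison and kept by the inclusive one, which is why the same settings may now return more genes.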

Added filter_by_expr feature filtering function from edgeR.
Added filter_by_prop feature filtering function. In previous versions it was incorporated inside get_pseudobulk.
Added plot_psbulk_samples to assess the quality of pseudobulk samples.
Added plot_filter_by_expr to assess which filtering thresholds to use in filter_by_expr.
Added plot_filter_by_prop to assess which filtering thresholds to use in filter_by_prop.
Added plot_volcano_df to plot volcano plots from long format dataframes.
Added plot_targets to plot downstream target genes of a source by their change and weight.
Added get_collectri to retrieve the CollecTRI gene regulatory network.
Added get_ksn_omnipath to retrieve the Kinase-Substrate network from OmniPath.
Added rank_sources_groups to identify marker sources (TFs, pathways, etc.) per group of samples/cells.

get_pseudobulk now has new arguments: mode to change how to summarize profiles and skip_checks to bypass checks.
OmniPath functions now accept more organism synonyms.

get_pseudobulk and get_acts now have a dtype argument due to future AnnData changes.
plot_metrics_scatter and plot_metrics_boxplot now use GroupBy.mean(numeric_only=True).
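
As background for the GroupBy.mean(numeric_only=True) change: in recent pandas versions, averaging a grouped frame that still holds non-numeric columns fails unless numeric_only=True is passed. A small sketch with a made-up table; the column names are assumptions and do not reflect the actual benchmark output:

```python
import pandas as pd

# Hypothetical results table mixing text and numeric columns.
df = pd.DataFrame({
    "method": ["mlm", "mlm", "ulm", "ulm"],
    "net": ["netA", "netB", "netA", "netB"],  # non-numeric column
    "score": [0.5, 1.0, 0.25, 0.75],
})

# numeric_only=True drops "net" instead of trying to average strings.
means = df.groupby("method").mean(numeric_only=True)
print(means)
```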

OmniPath wrappers (get_resource, get_dorothea and get_progeny) now accept any organism name.

Fixed compatibility with an API change in sklearn.tree.
Forced gene names in extract to be in Unicode format.
Changed integer format from int32 to int64 to accommodate larger datasets across methods.
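
The notes do not say which arrays were switched from int32 to int64, so the following is only a generic NumPy illustration of why the wider type accommodates larger datasets: values above the int32 maximum wrap around silently when accumulated in int32.

```python
import numpy as np

int32_max = np.iinfo(np.int32).max  # largest value an int32 can hold

# Two illustrative values that each fit in int32 but whose sum does not.
values = np.array([2_000_000_000, 2_000_000_000], dtype=np.int64)

total64 = values.sum(dtype=np.int64)                   # accumulates in 64 bits
total32 = values.astype(np.int32).sum(dtype=np.int32)  # accumulates in 32 bits, wraps

print(int32_max, total64, total32)
```

The 64-bit total is the expected sum, while the 32-bit total comes out negative because the result exceeds int32_max.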

Added conversion utility function translate_net to translate nets across organisms.

extract now removes empty samples and features.
run_consensus now follows the same format as other methods, old function is now called cons.
get_pseudobulk now checks whether the input contains raw integer counts.
plot_volcano can now plot without subsetting features by a network and can save plots to disk.
plot_volcano now uses adjustText to better plot text labels.
plot_volcano can now set logFC and p-value limits for outliers.
get_top_targets can now also work without subsetting features by a network and returns significant adjusted p-values.
get_contrast can now also work without needing to group.
udt and mdt now check if skranger and sklearn are installed, respectively.
get_toy_data now contains more example TFs.
get_top_targets now returns logFCs and pvals as column names instead of logFC and pval.
format_contrast_results now also returns the adjusted p-value.

Added dense_run util function which runs methods ignoring zeros in the data.
Added plot_violins and plot_barplot functions.
Added p_adjust_fdr util function to correct p-values for FDR.
Added get_ora_df function to run ORA (over-representation analysis) from lists of genes instead of an input mat.
Added shuffle_net function to randomize networks.
Added benchmarking metrics metric_auroc, metric_auprc, metric_mcauroc and metric_mcauprc.
Added get_toy_benchmark_data function to generate a toy example for benchmarking.
Added show_metrics function to show available metrics.
Added benchmark, format_benchmark_inputs and get_performances functions to benchmark methods and nets.
Added plot_metrics_scatter function to plot the results of running the benchmarking pipeline.
Added plot_metrics_scatter_cols function to plot the results of running the benchmarking pipeline grouped by two levels.
Added plot_metrics_boxplot function to plot the distributions of Monte-Carlo benchmarking metrics.
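
The p_adjust_fdr utility listed above corrects p-values for FDR. The notes do not name the procedure; a common choice is Benjamini-Hochberg, sketched below in plain Python under that assumption. The function name bh_adjust is hypothetical and is not part of the package.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvals)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, scaling by n / rank and
    # taking a running minimum so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


adj = bh_adjust([0.01, 0.04, 0.03, 0.20])
print(adj)
```

The running minimum is what keeps a smaller raw p-value from ending up with a larger adjusted value than its neighbour.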