decoupler.run_gsva
- decoupler.run_gsva(mat, net, source='source', target='target', kcdf=False, mx_diff=True, abs_rnk=False, min_n=5, seed=42, verbose=False, use_raw=True)
Gene Set Variation Analysis (GSVA).
GSVA (Hänzelmann et al., 2013) starts by transforming the input molecular readouts in mat to a readout-level statistic using a non-parametric estimate of the cumulative distribution function (optionally with a Gaussian kernel, see kcdf). Then, readout-level statistics are ranked per sample and normalized to up-weight the two tails of the rank distribution. Afterwards, an enrichment score gsva_estimate is calculated using a running sum statistic that is normalized by subtracting the magnitude of the largest negative deviation from the largest positive one.
Hänzelmann S. et al. (2013) GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics, 14, 7.
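The ranking and running-sum steps described above can be sketched in plain Python. This is an illustrative simplification, not decoupler's implementation: it takes one sample's readout-level statistics as given (the cumulative distribution step is skipped), uses a weight exponent of 1, and the names random_walk and enrichment_score are hypothetical helpers introduced only for this example.

```python
def random_walk(stats, feature_set):
    """Running-sum walk over features ranked by decreasing statistic.

    stats: dict mapping feature name -> readout-level statistic (one sample).
    feature_set: set of features that belong to the source being scored.
    """
    p = len(stats)
    # Rank features from the highest to the lowest statistic.
    order = sorted(stats, key=stats.get, reverse=True)
    # Symmetric rank weights: large at both tails, small in the middle.
    weights = {f: abs(p / 2 - i) for i, f in enumerate(order)}

    hit_total = sum(weights[f] for f in order if f in feature_set)
    n_miss = sum(1 for f in order if f not in feature_set)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("feature set must have weighted hits and leave some misses")

    walk, running = [], 0.0
    for f in order:
        if f in feature_set:
            running += weights[f] / hit_total  # step up on a hit
        else:
            running -= 1.0 / n_miss            # step down on a miss
        walk.append(running)
    return walk


def enrichment_score(walk, mx_diff=True, abs_rnk=False):
    """Collapse a walk into one score, mirroring the mx_diff / abs_rnk options."""
    max_pos = max(0.0, max(walk))  # largest positive deviation
    max_neg = min(0.0, min(walk))  # largest negative deviation (<= 0)
    if mx_diff:
        if abs_rnk:
            return max_pos - max_neg  # sum of the two magnitudes
        return max_pos + max_neg      # magnitude difference
    # Maximum deviation from 0, keeping its sign.
    return max_pos if max_pos > abs(max_neg) else max_neg


# Made-up statistics for six features; the set has one member at each extreme.
stats = {"g1": 2.0, "g2": 1.5, "g3": 0.3, "g4": -0.2, "g5": -1.0, "g6": -1.8}
walk = random_walk(stats, {"g1", "g6"})

print(enrichment_score(walk))                 # approximately 0.2
print(enrichment_score(walk, abs_rnk=True))   # approximately 1.0
print(enrichment_score(walk, mx_diff=False))  # approximately 0.6
```

With a set whose members sit at both extremes of the ranking, the default score stays close to zero because the two deviations cancel, whereas abs_rnk=True adds them up.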
- Parameters:
- mat : list, DataFrame or AnnData
List of [features, matrix], dataframe (samples x features) or an AnnData instance.
- net : DataFrame
Network in long format.
- source : str
Column name in net with source nodes.
- target : str
Column name in net with target nodes.
- kcdf : bool
Whether to use a Gaussian kernel during the non-parametric estimation of the cumulative distribution function. By default no kernel is used (faster); set to True to reproduce the original GSVA behaviour in R.
- mx_diff : bool
Changes how the enrichment statistic (ES) is calculated. If True (default), ES is calculated as the magnitude difference between the largest positive and negative random walk deviations. If False, ES is calculated as the maximum deviation of the random walk from 0.
- abs_rnk : bool
Used only when mx_diff=True. If False (default), the enrichment statistic (ES) is calculated as the magnitude difference between the largest positive and negative random walk deviations. If True, the magnitudes of the two deviations are summed instead, so feature sets with features enriched on either extreme (high or low) will be regarded as ‘highly’ activated.
- min_n : int
Minimum number of targets per source. Sources with fewer targets are removed.
- seed : int
Random seed to use.
- verbose : bool
Whether to show progress.
- use_raw : bool
Use the raw attribute of mat if present.
- Returns:
- estimate : DataFrame
GSVA scores. Stored in .obsm['gsva_estimate'] if mat is AnnData.