decoupler.run_ora

decoupler.run_ora(mat, net, source='source', target='target', n_up=None, n_bottom=0, n_background=20000, min_n=5, seed=42, verbose=False, use_raw=True)

Over Representation Analysis (ORA).

ORA measures the overlap between a regulator’s target feature set and a list of the most altered molecular features in mat. The most altered features can be selected from the top and/or bottom of the molecular readout distribution; by default, the top 5% of positive values are used. From these, a contingency table is built and a one-tailed Fisher’s exact test is computed to determine whether a regulator’s set of features is over-represented among the features selected from the data. The resulting score, ora_estimate, is the negative log10 of the obtained p-value.
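
The test described above can be sketched with the standard library alone. This is an illustrative sketch, not decoupler's implementation: the helper name `ora_sketch` is hypothetical, and it assumes that the fourth cell of the contingency table is obtained by subtracting the other three cells from `n_background`. The one-tailed p-value is computed as the upper tail of the hypergeometric distribution, which is equivalent to a one-sided Fisher's exact test.

```python
from math import comb, log10


def ora_sketch(selected, targets, n_background=20000):
    """Hypothetical helper: one-tailed Fisher's exact test for over-representation.

    selected: features picked from the data (e.g. the top-ranked ones).
    targets: features belonging to one regulator (source) in the network.
    """
    selected, targets = set(selected), set(targets)

    # 2x2 contingency table
    a = len(selected & targets)   # selected and in the target set
    b = len(targets - selected)   # in the target set, not selected
    c = len(selected - targets)   # selected, not in the target set
    d = n_background - a - b - c  # assumption: remainder of the background

    total = a + b + c + d
    n_targets = a + b
    n_selected = a + c

    # P(overlap >= a) under the hypergeometric null distribution
    tail = sum(
        comb(n_targets, k) * comb(total - n_targets, n_selected - k)
        for k in range(a, min(n_targets, n_selected) + 1)
    )
    pval = tail / comb(total, n_selected)
    return -log10(pval), pval


estimate, pval = ora_sketch(
    selected={"g1", "g2", "g3"},
    targets={"g1", "g2", "g4", "g5"},
    n_background=10,
)
```

With a larger background and the same overlap, the p-value shrinks and the score grows, which is why the choice of n_background affects the results.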

Parameters:
mat : list, DataFrame or AnnData

List of [features, matrix], dataframe (samples x features) or an AnnData instance.

net : DataFrame

Network in long format.

source : str

Column name in net with source nodes.

target : str

Column name in net with target nodes.

n_up : int, None

Number of top-ranked features to select as observed features. If None, the top 5% of positive features are selected.

n_bottom : int

Number of bottom ranked features to select as observed features.

n_background : int

Integer indicating the background size.

min_n : int

Minimum number of targets per source. Sources with fewer targets are removed.

seed : int

Random seed to use.

verbose : bool

Whether to show progress.

use_raw : bool

Use raw attribute of mat if present.
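
To make the roles of n_up, n_bottom and min_n more concrete, the following sketch shows one way the two preprocessing steps could look in plain Python. It is an assumption-laden illustration rather than the library's code: `select_features` and `filter_sources` are hypothetical helpers, the network is represented as (source, target) pairs instead of a DataFrame, and n_up is passed explicitly instead of being derived from the 5% default.

```python
def select_features(values, n_up, n_bottom=0):
    """Hypothetical helper: pick the observed features for one sample.

    values: dict mapping feature name -> molecular readout.
    Returns the n_up highest-ranked and n_bottom lowest-ranked features.
    """
    ranked = sorted(values, key=values.get, reverse=True)
    top = ranked[:n_up]
    bottom = ranked[len(ranked) - n_bottom:] if n_bottom > 0 else []
    return set(top) | set(bottom)


def filter_sources(edges, min_n=5):
    """Hypothetical helper: drop sources with fewer than min_n targets.

    edges: iterable of (source, target) pairs, i.e. a long-format network.
    """
    targets = {}
    for source, target in edges:
        targets.setdefault(source, set()).add(target)
    return {s: t for s, t in targets.items() if len(t) >= min_n}


readouts = {"a": 3.0, "b": -2.0, "c": 1.0, "d": 0.5}
observed = select_features(readouts, n_up=1, n_bottom=1)

edges = [("TF1", "g1"), ("TF1", "g2"), ("TF2", "g1")]
kept = filter_sources(edges, min_n=2)
```

Here `observed` contains the single highest and single lowest feature, and only TF1 survives the filter because TF2 has a single target.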

Returns:
estimate : DataFrame

ORA scores, which are the -log10(p-values). Stored in .obsm['ora_estimate'] if mat is AnnData.

pvals : DataFrame

Obtained p-values. Stored in .obsm['ora_pvals'] if mat is AnnData.