decoupler.run_mdt
- decoupler.run_mdt(mat, net, source='source', target='target', weight='weight', trees=100, min_leaf=5, n_jobs=-1, min_n=5, seed=42, verbose=False, use_raw=True)
Multivariate Decision Tree (MDT).
MDT fits a multivariate regression random forest for each sample, where the observed molecular readouts in mat are the response variable and the regulator weights in net are the covariates. Targets that have no weight for a given regulator are assigned a weight of zero. The feature importances of the fitted model are reported as the activities (mdt_estimate) of the regulators in net.
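The per-sample fit described above can be sketched as follows. This is a minimal illustration and not the library's actual implementation: it assumes scikit-learn's RandomForestRegressor as the forest, and the helper name mdt_sketch and the toy data are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def mdt_sketch(sample, weights, trees=100, min_leaf=5, seed=42):
    """Hypothetical helper: one forest fit for one sample.

    sample  : (n_features,) observed readouts (the response).
    weights : (n_features, n_sources) regulator weights (the covariates),
              with zeros where a target has no weight for a regulator.
    """
    forest = RandomForestRegressor(
        n_estimators=trees,
        min_samples_leaf=min_leaf,
        random_state=seed,
    )
    # Each target feature is one observation; each regulator is one covariate.
    forest.fit(weights, sample)
    # Feature importances play the role of regulator activities.
    return forest.feature_importances_


# Toy data (assumed): 200 target features, 3 regulators.
rng = np.random.default_rng(0)
n_features, n_sources = 200, 3
weights = rng.choice([-1.0, 0.0, 1.0], size=(n_features, n_sources))
# The readouts are driven by the first regulator only, plus small noise.
sample = 2.0 * weights[:, 0] + rng.normal(scale=0.1, size=n_features)

activities = mdt_sketch(sample, weights)
print(activities)
```

Because importances are normalized within each fitted forest, the resulting scores are non-negative and comparable across regulators within a sample rather than signed effect sizes.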
- Parameters:
- mat : list, DataFrame or AnnData
List of [features, matrix], DataFrame (samples x features) or an AnnData instance.
- net : DataFrame
Network in long format.
- source : str
Column name in net with source nodes.
- target : str
Column name in net with target nodes.
- weight : str
Column name in net with weights.
- trees : int
Number of trees in the forest.
- min_leaf : int
The minimum number of samples required to be at a leaf node.
- n_jobs : int
Number of jobs to run in parallel.
- min_n : int
Minimum number of targets per source. Sources with fewer targets are removed.
- seed : int
Random seed to use.
- verbose : bool
Whether to show progress.
- use_raw : bool
Use the raw attribute of mat if present.
- Returns:
- estimate : DataFrame
MDT scores. Stored in .obsm['mdt_estimate'] if mat is AnnData.
- pvals : DataFrame
Obtained p-values. Stored in .obsm['mdt_pvals'] if mat is AnnData.
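To make the roles of net, source, target, weight and min_n more concrete, the sketch below turns a long-format network into a targets x sources weight matrix using pandas. It is an illustration under assumptions, not the library's code: the helper net_to_matrix is hypothetical, and counting targets only among the features present in mat is an assumed behavior.

```python
import pandas as pd


def net_to_matrix(net, features, source="source", target="target",
                  weight="weight", min_n=5):
    """Hypothetical helper: long-format network -> (targets x sources) matrix."""
    # Keep only edges whose target is among the measured features (assumption).
    net = net[net[target].isin(features)]
    # Drop sources with fewer than min_n remaining targets.
    counts = net.groupby(source)[target].nunique()
    keep = counts[counts >= min_n].index
    net = net[net[source].isin(keep)]
    # Pivot to wide format; missing source-target pairs get a weight of zero.
    wide = net.pivot(index=target, columns=source, values=weight)
    return wide.reindex(index=features).fillna(0.0)


# Toy network (assumed): T1 has three measured targets, T2 and T3 one each.
net = pd.DataFrame({
    "source": ["T1", "T1", "T1", "T2", "T3", "T3"],
    "target": ["G1", "G2", "G3", "G1", "G2", "G9"],
    "weight": [1.0, -1.0, 1.0, 1.0, 1.0, 1.0],
})
features = ["G1", "G2", "G3", "G4"]

W = net_to_matrix(net, features, min_n=2)
print(W)
```

With min_n=2, only T1 survives the filter, and G4, which has no edge in the network, receives a weight of zero.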