pydeseq2.ds.DeseqStats
- class DeseqStats(dds, contrast=None, alpha=0.05, cooks_filter=True, independent_filter=True, prior_LFC_var=None, lfc_null=0.0, alt_hypothesis=None, inference=None, quiet=False)
Bases: object
PyDESeq2 statistical tests for differential expression.
Implements p-value estimation for differential gene expression according to the DESeq2 pipeline [LHA14].
Also supports apeGLM log-fold change shrinkage [ZIL19].
- Parameters:
dds (DeseqDataSet) – DeseqDataSet for which dispersion and LFCs were already estimated.
contrast (list or None) – A list of three strings, in the following format: ['variable_of_interest', 'tested_level', 'ref_level']. Names must correspond to the metadata passed to the DeseqDataSet. E.g., ['condition', 'B', 'A'] will measure the LFC of ‘condition B’ compared to ‘condition A’. For continuous variables, the last two strings should be left empty, e.g. ['measurement', '', '']. If None, the last variable from the design matrix is chosen as the variable of interest, and the reference level is picked alphabetically. (default: None).
alpha (float) – P-value and adjusted p-value significance threshold (usually 0.05). (default: 0.05).
cooks_filter (bool) – Whether to filter p-values based on Cook's outliers. (default: True).
independent_filter (bool) – Whether to perform independent filtering to correct p-value trends. (default: True).
prior_LFC_var (ndarray) – Prior variance for LFCs, used for ridge regularization. (default: None).
lfc_null (float) – The log2 fold change under the null hypothesis. (default: 0.0).
alt_hypothesis (str or None) – The alternative hypothesis for computing Wald p-values. One of ["greaterAbs", "lessAbs", "greater", "less"] or None. By default (None), the normal Wald test assesses deviation of the estimated log fold change from the null hypothesis, as given by lfc_null. The alternative hypothesis corresponds to what the user wants to find rather than the null hypothesis. (default: None).
inference (Inference) – Instance of an object implementing the inference routines. (default: DefaultInference).
quiet (bool) – Suppress deseq2 status updates during fit. (default: False).
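As a minimal usage sketch of the parameters above: the snippet below builds contrast lists in the documented three-string format and passes one to DeseqStats. The helpers build_contrast and run_stats are hypothetical names introduced for illustration and are not part of PyDESeq2. It assumes that dds is a DeseqDataSet whose dispersions and LFCs have already been estimated, and that the summary() method of recent PyDESeq2 releases is available to run the analysis.

```python
def build_contrast(variable, tested_level="", ref_level=""):
    """Return a three-string contrast list (hypothetical helper, not part of PyDESeq2)."""
    # Categorical variables need both levels; continuous variables need neither.
    if bool(tested_level) != bool(ref_level):
        raise ValueError("give both levels, or neither for a continuous variable")
    return [variable, tested_level, ref_level]


def run_stats(dds, contrast, alpha=0.05):
    """Run the tests on an already-fitted DeseqDataSet and return results_df."""
    # Lazy import so the helper above can be used without PyDESeq2 installed.
    from pydeseq2.ds import DeseqStats

    stats = DeseqStats(dds, contrast=contrast, alpha=alpha)
    stats.summary()  # assumed to run the analysis and populate results_df
    return stats.results_df


# 'condition B' compared to 'condition A'
categorical = build_contrast("condition", "B", "A")
# Continuous variable: the last two strings stay empty
continuous = build_contrast("measurement")
```

Calling run_stats(dds, categorical) on a fitted dataset would then return the summary table described under results_df.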
- base_mean
Genewise means of normalized counts.
- Type:
pandas.Series
- contrast_vector
Vector encoding the contrast (variable being tested).
- Type:
ndarray
- design_matrix
A DataFrame with experiment design information (to split cohorts). Indexed by sample barcodes. Depending on the contrast that is provided to the DeseqStats object, it may differ from the DeseqDataSet design matrix, as the reference level may need to be adapted.
- Type:
pandas.DataFrame
- LFC
Estimated log-fold change between conditions and intercept, in natural log scale.
- Type:
pandas.DataFrame
- SE
Standard LFC error.
- Type:
pandas.Series
- statistics
Wald statistics.
- Type:
pandas.Series
- p_values
P-values estimated from Wald statistics.
- Type:
pandas.Series
- padj
P-values adjusted for multiple testing.
- Type:
pandas.Series
- results_df
Summary of the statistical analysis.
- Type:
pandas.DataFrame
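The relationship between LFC, SE, statistics, p_values and padj can be sketched in plain Python. This is not PyDESeq2's implementation. It assumes the default two-sided Wald test against a standard normal distribution, with the LFC, its standard error and the null value all on the same log scale. It also assumes Benjamini-Hochberg adjustment (the default procedure in DESeq2), and it ignores Cook's filtering and independent filtering. All function names and input numbers are hypothetical.

```python
import math
from statistics import NormalDist


def ln_to_log2(lfc_ln):
    """Convert a natural-log fold change to log2 scale."""
    return lfc_ln / math.log(2)


def wald_two_sided(lfc, se, lfc_null=0.0):
    """Return (statistic, p-value) for a two-sided Wald test.

    Assumes lfc, se and lfc_null are expressed on the same log scale.
    """
    stat = (lfc - lfc_null) / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(stat)))
    return stat, p_value


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Illustrative numbers only, not real data.
lfcs = [1.2, -0.1, 0.8, 0.05]   # natural-log fold changes
ses = [0.3, 0.4, 0.35, 0.5]     # standard errors on the same scale

results = [wald_two_sided(lfc, se) for lfc, se in zip(lfcs, ses)]
p_vals = [p for _, p in results]
padj = benjamini_hochberg(p_vals)

alpha = 0.05
significant = [q < alpha for q in padj]
lfcs_log2 = [ln_to_log2(lfc) for lfc in lfcs]
```

The last line reflects that LFC is stored in natural log scale, so dividing by ln(2) is needed before comparing it with a log2 threshold such as lfc_null.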
References
[LHA14] Michael I Love, Wolfgang Huber, and Simon Anders. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology, 15(12):1–21, 2014. doi:10.1186/s13059-014-0550-8.