pydeseq2.ds.DeseqStats
- class DeseqStats(dds, contrast, alpha=0.05, cooks_filter=True, independent_filter=True, prior_LFC_var=None, lfc_null=0.0, alt_hypothesis=None, inference=None, quiet=False, n_cpus=None)
Bases: object
PyDESeq2 statistical tests for differential expression.
Implements p-value estimation for differential gene expression according to the DESeq2 pipeline [LHA14].
Also supports apeGLM log-fold change shrinkage [ZIL19].
- Parameters:
dds (DeseqDataSet) – DeseqDataSet for which dispersion and LFCs were already estimated.
contrast (list or ndarray) – Either a list of three strings or a numpy array. If a list of three strings, it must be in the following format: ['variable_of_interest', 'tested_level', 'ref_level']. Names must correspond to the metadata passed to the DeseqDataSet. E.g., ['condition', 'B', 'A'] will measure the LFC of ‘condition B’ compared to ‘condition A’. If a numpy array, it must be a contrast vector with one entry per column of the design matrix.
alpha (float) – P-value and adjusted p-value significance threshold (usually 0.05). (default: 0.05).
cooks_filter (bool) – Whether to filter p-values based on Cook's outliers. (default: True).
independent_filter (bool) – Whether to perform independent filtering to correct p-value trends. (default: True).
prior_LFC_var (ndarray) – Prior variance for LFCs, used for ridge regularization. (default: None).
lfc_null (float) – The log2 fold change under the null hypothesis. (default: 0).
alt_hypothesis (str, optional) – The alternative hypothesis for computing Wald p-values. By default, the normal Wald test assesses deviation of the estimated log fold change from the null hypothesis, as given by lfc_null. One of ["greaterAbs", "lessAbs", "greater", "less"] or None. The alternative hypothesis corresponds to what the user wants to find rather than the null hypothesis. (default: None).
inference (Inference) – Implementation of inference routines object instance. (default: DefaultInference).
quiet (bool) – Suppress deseq2 status updates during fit. (default: False).
n_cpus (int | None)
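To make the two accepted forms of contrast more concrete, the sketch below turns a three-string contrast into a numeric contrast vector. It is an illustration only, not the library's implementation: the helper name contrast_vector and the column naming scheme "<variable>_<level>_vs_<baseline>" are assumptions made for this example.

```python
def contrast_vector(columns, contrast):
    """Build a numeric contrast vector from a three-string contrast.

    Assumption (hypothetical naming scheme): every non-baseline level has a
    design matrix column named "<variable>_<level>_vs_<baseline>", and the
    baseline level is absorbed into the intercept.
    """
    variable, tested, ref = contrast
    vector = [0.0] * len(columns)
    for i, name in enumerate(columns):
        if name.startswith(f"{variable}_{tested}_vs_"):
            vector[i] = 1.0   # coefficient of the tested level
        elif name.startswith(f"{variable}_{ref}_vs_"):
            vector[i] = -1.0  # subtract the reference level if it has a column
    return vector


# Hypothetical design matrix columns for a three-level 'condition' variable.
columns = ["intercept", "condition_B_vs_A", "condition_C_vs_A"]

b_vs_a = contrast_vector(columns, ["condition", "B", "A"])
c_vs_b = contrast_vector(columns, ["condition", "C", "B"])
print(b_vs_a)  # [0.0, 1.0, 0.0]
print(c_vs_b)  # [0.0, -1.0, 1.0]
```

In both cases the vector has one entry per design matrix column, which is the length requirement stated for the numpy array form.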
- base_mean
Genewise means of normalized counts.
- Type:
- contrast_vector
Vector encoding the contrast (variable being tested).
- Type:
ndarray
- design_matrix
A DataFrame with experiment design information (to split cohorts). Indexed by sample barcodes. Depending on the contrast that is provided to the DeseqStats object, it may differ from the DeseqDataSet design matrix, as the reference level may need to be adapted.
- Type:
pandas.DataFrame
- LFC
Estimated log-fold change between conditions and intercept, in natural log scale.
- Type:
- SE
Standard error of the LFC estimates.
- Type:
- statistics
Wald statistics.
- Type:
- p_values
P-values estimated from Wald statistics.
- Type:
- padj
P-values adjusted for multiple testing.
- Type:
- results_df
Summary of the statistical analysis.
- Type:
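The attributes above form a chain: LFC and SE give the Wald statistics, which give p_values, which are then adjusted into padj. The following sketch reproduces that chain with the standard library for the default two-sided test. It is a simplified illustration under stated assumptions, not the library's code: it assumes Benjamini-Hochberg adjustment, ignores Cook's filtering and independent filtering, and uses made-up input numbers and hypothetical function names.

```python
import math


def wald_statistic(lfc, se, lfc_null=0.0):
    """Distance of the estimate from the null value, in standard errors."""
    return (lfc - lfc_null) / se


def two_sided_p_value(stat):
    """P(|Z| > |stat|) for a standard normal Z."""
    return math.erfc(abs(stat) / math.sqrt(2.0))


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up LFCs (natural log scale) and standard errors for four genes.
lfcs_ln = [1.2, -0.1, 0.8, 2.5]
ses = [0.3, 0.4, 0.5, 0.6]

stats = [wald_statistic(lfc, se) for lfc, se in zip(lfcs_ln, ses)]
p_values = [two_sided_p_value(s) for s in stats]
padj = benjamini_hochberg(p_values)

# Dividing by ln(2) converts natural-log fold changes to log2 fold changes.
lfcs_log2 = [lfc / math.log(2.0) for lfc in lfcs_ln]
```

Each adjusted value is at least as large as the raw p-value it came from, so a gene that passes the alpha threshold on padj also passes it on p_values.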
References
[LHA14] Michael I. Love, Wolfgang Huber, and Simon Anders. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology, 15(12):1–21, 2014. doi:10.1186/s13059-014-0550-8.
Methods
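lfc_shrink(coeff[, adapt]) – LFC shrinkage with an apeGLM prior [ZIL19].
plot_MA([log, save_path]) – Create a log ratio (M)-average (A) plot using matplotlib.
run_wald_test() – Perform a Wald test.
summary(**kwargs) – Run the statistical analysis.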
- lfc_shrink(coeff, adapt=True)
LFC shrinkage with an apeGLM prior [ZIL19].
Shrinks LFCs using a heavy-tailed Cauchy prior, leaving p-values unchanged.
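To see why a heavy-tailed prior matters, the toy sketch below computes a shrunken estimate for a single LFC by minimizing a negative log posterior on a grid. It is not apeGLM itself: it assumes a normal approximation to the likelihood (centered at the unshrunken estimate with its standard error), a Cauchy prior with a fixed, made-up scale, and a hypothetical function name shrink_lfc.

```python
import math


def shrink_lfc(mle, se, prior_scale=1.0, grid_size=10001):
    """Toy MAP estimate of one LFC under a Cauchy prior centered at zero.

    Assumptions: normal likelihood approximation around `mle` with standard
    error `se`, and a fixed prior scale. The minimizer lies between 0 and the
    unshrunken estimate, so the grid only covers that interval.
    """
    lo, hi = min(0.0, mle), max(0.0, mle)
    best_beta, best_obj = 0.0, float("inf")
    for k in range(grid_size):
        beta = lo + (hi - lo) * k / (grid_size - 1)
        neg_log_lik = (beta - mle) ** 2 / (2.0 * se ** 2)
        neg_log_prior = math.log1p((beta / prior_scale) ** 2)
        obj = neg_log_lik + neg_log_prior
        if obj < best_obj:
            best_beta, best_obj = beta, obj
    return best_beta


small = shrink_lfc(0.5, se=1.0)   # weak signal relative to its uncertainty
large = shrink_lfc(10.0, se=1.0)  # strong signal

print(f"0.5 shrinks to {small:.3f}")
print(f"10.0 shrinks to {large:.3f}")
```

Under these assumptions the small estimate is pulled most of the way toward zero, while the large one is left almost unchanged, which is the qualitative behavior a heavy-tailed prior is chosen for.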
- plot_MA(log=True, save_path=None, **kwargs)
Create a log ratio (M)-average (A) plot using matplotlib.
Useful for looking at log fold-change versus mean expression between two groups, samples, etc. Uses matplotlib to emulate the plotMA() function in DESeq2 in R.
- run_wald_test()
Perform a Wald test.
Get gene-wise p-values for gene over/under-expression.
- Return type:
- summary(**kwargs)
Run the statistical analysis.
The results are stored in the results_df attribute.
- Parameters:
**kwargs – Keyword arguments: providing new values for lfc_null or alt_hypothesis will override the corresponding DeseqStats attributes.
- Return type:
- property variables
Get the names of the variables used in the model definition.