pydeseq2.preprocessing

Functions

deseq2_norm(counts)

Return normalized counts and size_factors.

Uses the median of ratios method.

Parameters:

counts (pandas.DataFrame or ndarray) – Raw counts. One column per gene, one row per sample.

Return type:

Tuple[Union[DataFrame, ndarray], Union[DataFrame, ndarray]]

Returns:

  • deseq2_counts (pandas.DataFrame or ndarray) – DESeq2 normalized counts. One column per gene; rows are indexed by sample barcodes.

  • size_factors (pandas.DataFrame or ndarray) – DESeq2 normalization factors.
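To make the median of ratios method concrete, here is a minimal NumPy sketch of the idea. It is an illustration under stated assumptions, not the library's code: the function name median_of_ratios_sketch is hypothetical, it only handles array input (no DataFrame index handling), and it assumes that genes with a zero count in any sample are excluded from the size-factor estimate.

```python
import numpy as np


def median_of_ratios_sketch(counts):
    """Hypothetical helper: median-of-ratios normalization on a samples x genes array."""
    counts = np.asarray(counts, dtype=float)
    # log(0) is -inf; silence the divide-by-zero warning it triggers.
    with np.errstate(divide="ignore"):
        log_counts = np.log(counts)
    # Gene-wise mean of log counts (log of the geometric mean across samples).
    logmeans = log_counts.mean(axis=0)
    # Assumption: keep only genes whose mean log count is finite.
    keep = np.isfinite(logmeans)
    # Per-sample log ratio of each kept gene to its geometric mean.
    log_ratios = log_counts[:, keep] - logmeans[keep]
    # Size factor of a sample = exp(median of its log ratios).
    size_factors = np.exp(np.median(log_ratios, axis=1))
    # Divide every row (sample) by its size factor.
    normalized = counts / size_factors[:, None]
    return normalized, size_factors


# Two samples, three genes; the second sample is sequenced twice as deeply.
raw = np.array([[10, 20, 0],
                [20, 40, 5]])
normalized, size_factors = median_of_ratios_sketch(raw)
```

In this toy input the third gene has a zero count, so it does not contribute to the size factors, yet it is still divided by them in the output. After normalization the first two genes have identical values in both samples.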

deseq2_norm_fit(counts)

Return logmeans and filtered_genes, needed in the median of ratios method.

Logmeans and filtered_genes can then be used to normalize external datasets.

Parameters:

counts (pandas.DataFrame or ndarray) – Raw counts. One column per gene, one row per sample.

Return type:

Tuple[ndarray, ndarray]

Returns:

  • logmeans (ndarray) – Gene-wise mean log counts.

  • filtered_genes (ndarray) – Genes whose mean log count is finite (different from -∞), that is, genes with no zero count in any sample.
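The two quantities returned by the fit step can be sketched as follows. This is an illustrative reimplementation, not the library's code: fit_sketch is a hypothetical name, and representing filtered_genes as a boolean mask over genes is an assumption of the sketch.

```python
import numpy as np


def fit_sketch(counts):
    """Hypothetical helper: gene-wise mean log counts and a mask of usable genes."""
    counts = np.asarray(counts, dtype=float)
    # log(0) is -inf; silence the divide-by-zero warning it triggers.
    with np.errstate(divide="ignore"):
        logmeans = np.log(counts).mean(axis=0)
    # Assumption: filtered_genes is a boolean mask, True where the log mean is finite.
    filtered_genes = np.isfinite(logmeans)
    return logmeans, filtered_genes


# Reference dataset: two samples, three genes. The third gene has a zero count.
reference = np.array([[1, 4, 0],
                      [4, 16, 7]])
logmeans, filtered_genes = fit_sketch(reference)
```

A single zero count is enough to drive a gene's mean log count to -∞, which is why the third gene is masked out here even though it is expressed in the second sample.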

deseq2_norm_transform(counts, logmeans, filtered_genes)

Return normalized counts and size factors from the median of ratios method.

Can be applied to an external dataset, using the logmeans and filtered_genes previously computed by the fit function.

Parameters:

  • counts (pandas.DataFrame or ndarray) – Raw counts. One column per gene, one row per sample.

  • logmeans (ndarray) – Gene-wise mean log counts.

  • filtered_genes (ndarray) – Genes whose mean log count is finite (different from -∞), that is, genes with no zero count in any sample of the dataset used for fitting.

Return type:

Tuple[Union[DataFrame, ndarray], Union[DataFrame, ndarray]]

Returns:

  • deseq2_counts (pandas.DataFrame or ndarray) – DESeq2 normalized counts. One column per gene; rows are indexed by sample barcodes.

  • size_factors (pandas.DataFrame or ndarray) – DESeq2 normalization factors.
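The sketch below illustrates how statistics fitted on one dataset could be reused to normalize another. It is a hedged illustration rather than the library's code: transform_sketch is a hypothetical name, the logmeans and filtered_genes values are made-up reference statistics, and filtered_genes is assumed to be a boolean mask over genes.

```python
import numpy as np


def transform_sketch(counts, logmeans, filtered_genes):
    """Hypothetical helper: normalize counts with previously fitted log means."""
    counts = np.asarray(counts, dtype=float)
    # log(0) is -inf; silence the divide-by-zero warning it triggers.
    with np.errstate(divide="ignore"):
        log_counts = np.log(counts)
    # Compare each sample to the reference log means, on the kept genes only.
    log_ratios = log_counts[:, filtered_genes] - logmeans[filtered_genes]
    # Size factor of a sample = exp(median of its log ratios).
    size_factors = np.exp(np.median(log_ratios, axis=1))
    return counts / size_factors[:, None], size_factors


# Made-up statistics from a reference dataset (three genes, last one excluded).
logmeans = np.array([np.log(2.0), np.log(8.0), -np.inf])
filtered_genes = np.array([True, True, False])

# External dataset with the same genes in the same column order.
external = np.array([[4, 16, 3],
                     [1, 4, 0]])
ext_normalized, ext_size_factors = transform_sketch(external, logmeans, filtered_genes)
```

Because the size factors are computed against the reference log means rather than the external data's own, the external samples end up on the same scale as the reference dataset. This relies on the external counts having the same genes in the same column order as the data used for fitting.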