When working with RNA-Seq data, you may wonder whether expression values such as TPM, FPKM, or RPKM can be used directly for differential expression analysis.
For example, many researchers have questions such as:
- Can TPM be used for differential expression analysis?
- Can FPKM or RPKM values be analyzed with a t-test?
- Can TPM or FPKM be used as input for DESeq2?
- How should we understand the differences between TPM, FPKM, RPKM, and Gene Counts?
- Can TPM or FPKM be used for visualization, such as PCA or clustering?
Past RNA-Seq papers and public datasets often report expression values as RPKM, FPKM, or TPM. Because of this, using these values for differential expression analysis may appear to be a standard approach in RNA-Seq data analysis.
However, the fact that these values have been widely used in the past does not mean that they are appropriate input data for differential expression analysis. In particular, when using differential expression methods such as DESeq2 or edgeR, the basic starting point should be Gene Counts, not TPM, FPKM, or RPKM.
In short, for RNA-Seq differential expression analysis, you should generally start from Gene Counts rather than TPM, FPKM, or RPKM.
The main reason to use TPM or FPKM is to compare the relative expression levels of different genes within the same sample.
For sample-to-sample comparisons, group comparisons, differential expression analysis, or checking the results of differential expression analysis, there is little reason to use TPM or FPKM.
Below, we explain why.
TPM, FPKM, and RPKM Are Expression Measures, Not Standard Input for Differential Expression Analysis
TPM, FPKM, and RPKM are expression measures calculated from RNA-Seq reads while taking gene length and library size into account.
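As a minimal sketch of how these measures are commonly defined, the following Python snippet computes RPKM and TPM for a single sample from raw counts and gene lengths. The gene names, counts, and lengths are hypothetical toy values chosen for illustration. FPKM follows the same formula as RPKM, with fragments counted instead of reads.

```python
# Hypothetical toy data for one sample: gene -> (raw count, gene length in bp).
genes = {
    "geneA": (100, 1000),
    "geneB": (100, 4000),
    "geneC": (800, 5000),
}


def rpkm(genes):
    """Reads per kilobase of gene length per million mapped reads."""
    total_reads = sum(count for count, _ in genes.values())
    return {
        name: count / (length / 1e3) / (total_reads / 1e6)
        for name, (count, length) in genes.items()
    }


def tpm(genes):
    """Length-normalize first, then scale so the sample sums to one million."""
    rate = {name: count / (length / 1e3) for name, (count, length) in genes.items()}
    total_rate = sum(rate.values())
    return {name: r / total_rate * 1e6 for name, r in rate.items()}


if __name__ == "__main__":
    for name in genes:
        print(name, round(rpkm(genes)[name], 1), round(tpm(genes)[name], 1))
```

In this toy sample, geneA and geneB have the same raw count, but geneA receives a four times larger RPKM and TPM because it is four times shorter.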
Because these values appear as continuous numerical values in a table, they may seem suitable for general statistical analysis. However, an important point is that converting counts into TPM, FPKM, or RPKM does not remove the instability that was present in the original count values.
When looking at raw Gene Counts, low-count genes have small values, and it is relatively easy to recognize that they may be strongly affected by measurement variability or sampling noise.
However, TPM, FPKM, and RPKM include a correction step for gene length. Especially for short genes, this means that even a small original count can become a relatively large value after conversion. Conversely, for long genes, even a moderately large count can become a relatively small value.
As a result, low-count genes can be stretched diagonally in scatter plots, and values that may actually have low reliability can appear as if they were ordinary expression data.
This issue is important when considering the dynamic range of RNA-Seq.
It is explained with figures in the following article:
The Dynamic Range of RNA-Seq

In this way, with FPKM or RPKM, instability derived from low counts can become less visible because of gene length correction. If statistical tests such as t-tests are then applied to such values, group differences are evaluated while unreliable values are still included. As a result, the analysis may pick up differences that reflect measurement-level fluctuations or unstable low-count signals rather than meaningful biological differences.
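The effect described above can be sketched with two hypothetical genes. The example assumes, as a simplification, that sampling noise is roughly Poisson, so that the relative noise of a count is about 1/sqrt(count); real RNA-Seq data usually show additional variability. The library size, counts, and gene lengths are made-up values.

```python
import math


def rpkm(count, length_bp, total_reads):
    """Reads per kilobase of gene length per million mapped reads."""
    return count / (length_bp / 1e3) / (total_reads / 1e6)


def poisson_relative_noise(count):
    """Coefficient of variation of a Poisson variable with this mean (assumption)."""
    return 1 / math.sqrt(count)


TOTAL_READS = 20_000_000  # hypothetical library size

# Hypothetical genes: name -> (raw count, gene length in bp).
toy_genes = {"short_gene": (3, 300), "long_gene": (100, 10_000)}

if __name__ == "__main__":
    for name, (count, length_bp) in toy_genes.items():
        value = rpkm(count, length_bp, TOTAL_READS)
        noise = poisson_relative_noise(count)
        print(f"{name}: count={count}, RPKM={value:.2f}, relative noise={noise:.0%}")
```

Both toy genes come out at about 0.5 RPKM, although one value rests on 3 reads and the other on 100. Under the Poisson simplification, the relative noise is roughly 58% for the short gene and 10% for the long gene, and this difference is no longer visible in the converted values.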
Can TPM Be Used for Differential Expression Analysis?
TPM is sometimes described as being more comparable across samples than FPKM or RPKM. For this reason, one might think that TPM, unlike FPKM or RPKM, can be used for differential expression analysis.
However, TPM does not fundamentally solve this problem. TPM is calculated by correcting for gene length and then scaling values so that the total expression within each sample becomes constant. Neither step adds information about the underlying counts, so the instability derived from low counts does not disappear.
As with FPKM and RPKM, genes with originally low counts can appear as ordinary expression values after conversion. For this reason, TPM values should not simply be used as input for a t-test.
In addition, TPM has another important limitation. TPM is scaled so that the total expression within each sample is constant. Therefore, when the TPM value of one gene becomes larger, the TPM values of other genes may appear relatively smaller.
In other words, TPM represents relative proportions within a sample. It cannot be treated as if the expression level of each gene changes independently. This is known as the problem of compositional data, and it is one reason why TPM is difficult to use directly for sample-to-sample comparisons or differential expression analysis.
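The compositional effect can be illustrated with a small sketch. The length-normalized abundances below are hypothetical, and only geneA is changed between the two conditions.

```python
def tpm_from_rates(rates):
    """Scale length-normalized abundances so that they sum to one million."""
    total = sum(rates.values())
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Hypothetical length-normalized abundances (reads per kilobase).
before = {"geneA": 100.0, "geneB": 450.0, "geneC": 450.0}
# Only geneA increases tenfold; geneB and geneC are unchanged.
after = dict(before, geneA=1000.0)

tpm_before = tpm_from_rates(before)
tpm_after = tpm_from_rates(after)

if __name__ == "__main__":
    for gene in before:
        print(gene, round(tpm_before[gene]), round(tpm_after[gene]))
```

In this toy case, the TPM of geneB falls from 450,000 to about 236,842 even though its underlying abundance did not change, simply because geneA now takes up a larger share of the fixed total.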
The Main Reason to Use TPM, FPKM, or RPKM Is Gene-to-Gene Comparison Within the Same Sample
So, what should TPM, FPKM, or RPKM be used for?
The main reason to actively use these values is when you want to compare relative expression levels between genes within the same sample.
For example, if you want to examine whether Gene A or Gene B is relatively more highly expressed within a single sample, TPM or FPKM may be useful because they take gene length and library size into account.
However, for sample-to-sample comparisons, group comparisons, differential expression analysis, or checking the results of differential expression analysis, there is little reason to use TPM or FPKM.
This point is explained in more detail using real data in the following article:
The Pitfall of “Cleaned” Data — Why FPKM and TPM Are Not Enough: Insights from GSE159751
Should TPM or FPKM Be Used for PCA or Clustering?
TPM and FPKM are sometimes said to be useful for displaying expression values or for visualization methods such as PCA and clustering. Because of this, a common misunderstanding is that Gene Counts should be used for differential expression analysis, while TPM or FPKM should be used for PCA and clustering.
However, this interpretation requires caution. When TPM or FPKM are said to be usable for visualization, in many cases this means that they are better than using unnormalized raw Gene Counts directly for PCA or clustering. If Gene Counts are appropriately normalized and transformed, they are useful not only for differential expression analysis, but also for PCA and clustering.
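As a rough illustration of starting from Gene Counts for exploratory analysis, the sketch below applies a simple log2(CPM + 1) transformation and compares samples by Euclidean distance, the kind of quantity that clustering relies on. This transformation is only a stand-in chosen for brevity; it is not the specific normalization or transformation implemented in DESeq2 or edgeR. The counts are hypothetical: sample 2 is sample 1 sequenced at twice the depth, and sample 3 has a different expression profile.

```python
import math

# Hypothetical raw counts: gene -> counts in samples 1, 2, 3.
counts = {
    "geneA": [10, 20, 50],
    "geneB": [100, 200, 20],
    "geneC": [890, 1780, 930],
}


def log_cpm(counts, pseudocount=1.0):
    """Counts per million per sample, then log2 with a pseudocount."""
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [math.log2(c / lib_sizes[j] * 1e6 + pseudocount) for j, c in enumerate(row)]
        for gene, row in counts.items()
    }


def sample_distance(values, i, j):
    """Euclidean distance between samples i and j across all genes."""
    return math.sqrt(sum((row[i] - row[j]) ** 2 for row in values.values()))


transformed = log_cpm(counts)

if __name__ == "__main__":
    print("raw:     ", sample_distance(counts, 0, 1), sample_distance(counts, 0, 2))
    print("log-CPM: ", sample_distance(transformed, 0, 1), sample_distance(transformed, 0, 2))
```

On the unnormalized counts, sample 1 looks farther from sample 2 than from sample 3, which only reflects sequencing depth. After normalization and transformation, samples 1 and 2 coincide and sample 3 is the one that stands apart.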
In particular, if differential expression analysis is performed from Gene Counts, but PCA and clustering are performed using RPKM, FPKM, or TPM, the interpretation of the analysis results can become very difficult. In other words, it is more consistent to use Gene Counts as the starting point throughout the workflow, from differential expression analysis to result validation.
This point is also discussed using real data in the following article:
The Pitfall of “Cleaned” Data — Why FPKM and TPM Are Not Enough: Insights from GSE159751
Why TPM or FPKM Should Not Be Used as Input for DESeq2
DESeq2 and edgeR are widely used methods for RNA-Seq differential expression analysis. These methods are designed based on the assumption that RNA-Seq count data are used as input.
The important point is that the input for DESeq2 and edgeR should basically be Gene Counts, not TPM or FPKM.
See also:
Are t-Tests Really Inappropriate for RNA-Seq?|DESeq2, edgeR, and Why We Should Not Overtrust Statistical Models
TPM and FPKM are values that have already been transformed while taking gene length and library size into account. By contrast, DESeq2 and edgeR take Gene Counts as input and model factors such as library size, mean-dependent variance, and variability among samples.
Therefore, if TPM or FPKM values are used as input for DESeq2, the data no longer match the assumptions of the count-based model.
In other words, combinations such as TPM + DESeq2 or FPKM + DESeq2 are generally inappropriate. When using DESeq2 or edgeR, Gene Counts should be used as input.
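Before passing a table to a count-based tool, a simple sanity check can help catch values that have already been converted. The helpers below are hypothetical and are not part of DESeq2 or edgeR; they only test whether every value is a non-negative integer and whether each sample sums to about one million, which would suggest TPM rather than counts.

```python
def looks_like_raw_counts(columns):
    """Return True if every value in every sample is a non-negative integer.

    columns: list of per-sample lists of numeric values (hypothetical layout).
    """
    return all(
        value >= 0 and float(value).is_integer()
        for column in columns
        for value in column
    )


def looks_like_tpm(columns, tolerance=1.0):
    """Return True if every sample sums to roughly one million."""
    return all(abs(sum(column) - 1e6) <= tolerance for column in columns)


if __name__ == "__main__":
    count_table = [[0, 5, 12], [3, 0, 7]]          # hypothetical raw counts
    tpm_table = [[250000.0, 749999.5, 0.5]]        # hypothetical TPM values
    print(looks_like_raw_counts(count_table), looks_like_tpm(count_table))
    print(looks_like_raw_counts(tpm_table), looks_like_tpm(tpm_table))
```

A check like this is only a heuristic: it cannot prove that a table contains valid Gene Counts, but it can flag a table that clearly does not.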
Differential Expression Analysis Should Start from Gene Counts
In RNA-Seq differential expression analysis, the first step should be to prepare Gene Counts. From there, normalization, preprocessing, and filtering should be performed according to the purpose of the analysis.
When using DESeq2 or edgeR, Gene Counts are also used as input. Even for visualization and exploratory analysis, it is more consistent with the logic of differential expression analysis to use values appropriately normalized and transformed from Gene Counts.
The important point is not to assume that a data format is appropriate simply because it is “already normalized,” it consists of “continuous values,” or it is labeled as “expression data.”
In RNA-Seq data analysis, we should not judge values only by their names. We need to understand how the values were generated, what assumptions they carry, and what types of analysis they are appropriate for.
Summary: TPM, FPKM, and RPKM Should Be Treated Carefully in Differential Expression Analysis
TPM, FPKM, and RPKM are common expression measures encountered in RNA-Seq data analysis. However, they should not be treated as standard input data for differential expression analysis.
- TPM, FPKM, and RPKM are mainly useful for comparing relative expression levels between genes within the same sample.
- For sample-to-sample comparisons, group comparisons, and differential expression analysis, you should generally start from Gene Counts.
- Applying a t-test directly to FPKM or RPKM requires caution.
- TPM does not fundamentally solve the problems of low-count instability or variance structure.
- When TPM or FPKM are said to be usable for PCA or clustering, this often means that they are better than using raw Gene Counts directly.
- Appropriately normalized and transformed Gene Counts are also useful for PCA and clustering.
- Using TPM or FPKM as input for DESeq2 is inconsistent with the assumptions of the model.
The reliability of results is not determined by the name of the statistical method alone. Whether using DESeq2, edgeR, or a t-test, it is important to examine the data distribution, relationships among samples, low-expression genes, outliers, batch effects, and the biological context.
What really matters in RNA-Seq differential expression analysis is not which method you trust, but whether you look at the data, check the assumptions, and interpret the results carefully.
For related discussions, please see the following articles:
- The Dynamic Range of RNA-Seq
- The Pitfall of “Cleaned” Data — Why FPKM and TPM Are Not Enough: Insights from GSE159751
- Are t-Tests Really Inappropriate for RNA-Seq?|DESeq2, edgeR, and Why We Should Not Overtrust Statistical Models
- Bulk RNA-Seq Data Analysis Tutorial: Learn the Workflow and How to Interpret Results