Methylation Data Analysis Tutorial

The approach presented in this tutorial applies broadly to Illumina methylation arrays and to other methylation profiling technologies. The key is to separate methylation sites into two categories: promoters and everything else. A promoter is not quite the same thing as a CpG island: 30% of genes have no CpG island around their transcription start site (TSS), yet the methylation levels of their promoters are still appropriately controlled. We therefore begin by preparing to analyze promoters comprehensively, not only CpG islands.

Methylation Data Analysis Tutorial 1 - Defining the promoter probes and other probes.

Promoters are generally kept unmethylated, while regions outside them are typically kept highly methylated. CpG islands largely coincide with promoters, but some promoters lie outside CpG islands. We therefore show how to flexibly expand the regions that are recognized as promoters. This is only a basic idea; you can modify the definition to meet your needs.
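
As a rough illustration of the idea, the following Python sketch labels each probe as a promoter probe or an "other" probe using a strand-aware window around the TSS. The window sizes (1,500 bp upstream, 500 bp downstream), the gene and probe names, and the coordinates are all assumptions made for this example; they are not the definition used in the video, and you should adjust them to your own needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    tss: int      # transcription start site (1-based coordinate)
    strand: str   # "+" or "-"


def promoter_window(gene, upstream=1500, downstream=500):
    """Return the inclusive (start, end) of the promoter window.

    "Upstream" follows the direction of transcription, so the window
    is mirrored for genes on the minus strand.
    """
    if gene.strand == "+":
        return max(1, gene.tss - upstream), gene.tss + downstream
    return max(1, gene.tss - downstream), gene.tss + upstream


def classify_probes(probes, genes, upstream=1500, downstream=500):
    """Map each probe ID to a gene name (promoter probe) or None (other probe)."""
    windows = [
        (g.chrom, *promoter_window(g, upstream, downstream), g.name) for g in genes
    ]
    labels = {}
    for probe_id, (chrom, pos) in probes.items():
        # A probe takes the first promoter window that contains it.
        labels[probe_id] = next(
            (name for c, start, end, name in windows
             if c == chrom and start <= pos <= end),
            None,
        )
    return labels


# Hypothetical genes and probes, for illustration only.
genes = [Gene("GENE_A", "chr1", 10_000, "+"), Gene("GENE_B", "chr1", 50_000, "-")]
probes = {
    "cg0001": ("chr1", 9_000),   # 1,000 bp upstream of GENE_A
    "cg0002": ("chr1", 10_400),  # 400 bp downstream of the GENE_A TSS
    "cg0003": ("chr1", 30_000),  # far from both genes
    "cg0004": ("chr1", 51_200),  # 1,200 bp upstream of GENE_B (minus strand)
    "cg0005": ("chr1", 49_000),  # 1,000 bp downstream of the GENE_B TSS
}
labels = classify_probes(probes, genes)
```

Widening `upstream` or `downstream` is one simple way to expand what counts as a promoter. The linear scan over all windows is fine for a toy example; a real array with hundreds of thousands of probes would call for an interval index or a dedicated genomics library.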

Methylation Data Analysis Tutorial 2 - Calculating average beta values per promoter or genomic bin.

Using the promoter regions defined in the previous video, we extract the probes that fall within promoters and calculate the average beta value per promoter. The remaining probes are grouped with their neighbors into fixed-size genomic bins. Converting an enormous number of individual methylation sites into a manageable number of biological regions makes the analysis easier and more understandable.
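
The summarization step can be sketched as follows for a single sample. This is a minimal Python example under stated assumptions: the 100,000 bp bin size, the probe IDs, the beta values, and the `labels` mapping (probe ID to gene name for promoter probes) are all hypothetical and are not taken from the video.

```python
from collections import defaultdict
from statistics import mean

BIN_SIZE = 100_000  # assumed bin width in bp; choose what suits your data


def summarize(betas, positions, labels, bin_size=BIN_SIZE):
    """Average beta values per promoter, or per genomic bin for other probes."""
    groups = defaultdict(list)
    for probe_id, beta in betas.items():
        gene = labels.get(probe_id)
        if gene is not None:
            # Promoter probe: group by the gene whose promoter contains it.
            key = ("promoter", gene)
        else:
            # Other probe: group by the fixed-size bin that contains it.
            chrom, pos = positions[probe_id]
            bin_start = (pos // bin_size) * bin_size
            key = ("bin", f"{chrom}:{bin_start}-{bin_start + bin_size}")
        groups[key].append(beta)
    return {key: mean(values) for key, values in groups.items()}


# Hypothetical single-sample data, for illustration only.
positions = {
    "cg0001": ("chr1", 9_000),
    "cg0002": ("chr1", 10_400),
    "cg0003": ("chr1", 230_000),
    "cg0004": ("chr1", 260_000),
    "cg0005": ("chr1", 410_000),
}
labels = {"cg0001": "GENE_A", "cg0002": "GENE_A"}  # promoter probes only
betas = {"cg0001": 0.10, "cg0002": 0.20, "cg0003": 0.80,
         "cg0004": 0.90, "cg0005": 0.70}

region_means = summarize(betas, positions, labels)
```

Here five probes collapse into three regions: one promoter and two bins. Repeating the call for every sample yields a region-by-sample matrix that is much smaller than the original probe-level table.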

Methylation Data Analysis Tutorial 3 - Analyzing methylation levels of promoters.

The methylation levels of promoters are strongly associated with the expression patterns of the downstream genes. The methylation patterns of promoters are not random; they are controlled in some way. In this video, you will see that promoters form clusters according to their methylation patterns.
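
To give a feel for what clustering promoters by methylation pattern means, here is a deliberately simple Python sketch. It uses a greedy "leader" clustering on Pearson correlation, which is a stand-in chosen for brevity and not necessarily the method used in the video (hierarchical clustering with a heatmap is a common alternative). The gene names, beta values, and the 0.8 correlation threshold are assumptions for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def leader_clusters(profiles, min_r=0.8):
    """Greedy leader clustering.

    Each promoter joins the first cluster whose leader (its first member)
    correlates with it at or above min_r; otherwise it starts a new cluster.
    The result depends on input order, which is the price of simplicity.
    """
    clusters = []  # list of (leader_profile, member_names)
    for name, profile in profiles.items():
        for leader, members in clusters:
            if pearson(leader, profile) >= min_r:
                members.append(name)
                break
        else:
            clusters.append((profile, [name]))
    return [members for _, members in clusters]


# Hypothetical average promoter beta values across four samples.
promoter_betas = {
    "GENE_A": [0.05, 0.10, 0.60, 0.70],
    "GENE_B": [0.08, 0.12, 0.55, 0.75],
    "GENE_C": [0.80, 0.70, 0.10, 0.05],
    "GENE_D": [0.75, 0.72, 0.15, 0.10],
}
clusters = leader_clusters(promoter_betas)
```

In this toy data, GENE_A and GENE_B gain methylation across the samples while GENE_C and GENE_D lose it, so the promoters fall into two groups.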

Methylation Data Analysis Tutorial 4 - Analyzing methylation levels of genomic bins.

The methylation patterns of genomic bins do not relate to the expression patterns of neighboring genes. However, they do form clusters, and these clusters are associated with chromosomal regions. The biological meaning of this association remains to be studied.
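
One simple way to inspect whether bin clusters line up with chromosomal regions is to count, for each cluster, where its bins are located. The Python sketch below tabulates bins per chromosome. The bin names (in a `chrom:start-end` format assumed for this example) and the cluster assignments are hypothetical, and counting at the chromosome level is only one possible resolution.

```python
from collections import Counter, defaultdict


def cluster_by_chromosome(bin_clusters):
    """Count, for each cluster, how many of its bins fall on each chromosome."""
    table = defaultdict(Counter)
    for bin_name, cluster_id in bin_clusters.items():
        # Bin names are assumed to look like "chr1:0-100000".
        chrom = bin_name.split(":", 1)[0]
        table[cluster_id][chrom] += 1
    return {cluster_id: dict(counts) for cluster_id, counts in table.items()}


# Hypothetical cluster assignments for six genomic bins.
bin_clusters = {
    "chr1:0-100000": 1,
    "chr1:100000-200000": 1,
    "chr1:200000-300000": 1,
    "chr2:0-100000": 2,
    "chr2:100000-200000": 2,
    "chr1:300000-400000": 2,
}
table = cluster_by_chromosome(bin_clusters)
```

A cluster whose bins concentrate on one chromosome, or on one stretch of it, would be consistent with the association described above. Whether such a concentration is more than chance would need a proper statistical test on real data.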