This section presents descriptive figures on data quality: reads with adapter, reads mapped to miRNAs, and reads mapped to other small RNAs.
After adapter removal, we can plot the size distribution of the small RNAs.
Number of miRNAs with > 10 counts.
sample | colSums(counts(obj) > 10) |
---|---|
miRQC_A | 590 |
miRQC_A_repeat | 626 |
miRQC_B | 531 |
miRQC_B_repeat | 494 |
miRQC_C | 647 |
miRQC_C_repeat | 640 |
miRQC_D | 619 |
miRQC_D_repeat | 631 |
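The per-sample counts above come from the expression in the table header. As a minimal sketch of what that expression does, here is the same idea on a made-up matrix; `counts_toy` and its values are hypothetical and stand in for `counts(obj)`.

```r
# Toy count matrix: rows are miRNAs, columns are samples (values are made up).
counts_toy <- matrix(
  c( 0, 12, 250,
    15,  3,  11,
    40,  0,   9),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("mir_a", "mir_b", "mir_c"), c("s1", "s2", "s3"))
)

threshold <- 10

# counts_toy > threshold is a logical matrix; colSums counts the TRUE
# entries per column, i.e. the number of miRNAs above the threshold per sample.
detected <- colSums(counts_toy > threshold)
print(detected)
```

Changing `threshold` is the only adjustment needed to report a different cutoff.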
The data was analyzed with seqcluster.
This tool uses all reads, both uniquely mapped and multi-mapped. The first step clusters sequences at every location where they overlap. The second step creates meta-clusters: units that merge all clusters sharing the same sequences. The output is therefore a set of meta-clusters, groups of common sequences that could come from different regions of the genome.
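To make the meta-cluster idea concrete, the sketch below merges clusters that share at least one sequence. It is only an illustration of the concept, not seqcluster's actual implementation; the `clusters` list and the `merge_clusters` helper are hypothetical.

```r
# Hypothetical clusters: each one holds the ids of the sequences found at one locus.
clusters <- list(
  c1 = c("seq1", "seq2"),
  c2 = c("seq2", "seq3"),
  c3 = c("seq4"),
  c4 = c("seq3", "seq5")
)

# Merge clusters that share any sequence into meta-clusters.
merge_clusters <- function(clusters) {
  meta <- list()
  for (name in names(clusters)) {
    members <- name
    seqs <- clusters[[name]]
    keep <- rep(TRUE, length(meta))
    for (i in seq_along(meta)) {
      # Absorb every existing meta-cluster that overlaps the current one.
      if (length(intersect(meta[[i]]$seqs, seqs)) > 0) {
        members <- c(meta[[i]]$members, members)
        seqs <- union(meta[[i]]$seqs, seqs)
        keep[i] <- FALSE
      }
    }
    meta <- c(meta[keep], list(list(members = members, seqs = seqs)))
  }
  meta
}

meta <- merge_clusters(clusters)
for (m in meta) {
  cat(paste(m$members, collapse = ","), "->", paste(m$seqs, collapse = ","), "\n")
}
```

In this toy input, `c1`, `c2` and `c4` are chained together through shared sequences and end up in one meta-cluster, while `c3` stays on its own.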
In this table, 1 is the fraction of the genome covered by at least 1 read, and 0 is the fraction of the genome without reads.
coverage | ratio_genome |
---|---|
0 | 0.9997890 |
1 | 0.0002112 |
For human data with a strong small RNA signal, the typical value is about 0.0002. This will change for smaller genomes.
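As a rough sketch of how such a ratio can be derived, the code below merges overlapping read intervals and divides the covered bases by the genome size. The coordinates, `genome_size`, and the `covered_bases` helper are all hypothetical; the report's own numbers may have been computed differently.

```r
# Hypothetical read alignments on one sequence (1-based, inclusive coordinates).
reads <- data.frame(start = c(10, 15, 100), end = c(30, 40, 120))
genome_size <- 1000  # toy value, not a real genome length

# Count bases covered by at least one interval, merging overlaps first.
covered_bases <- function(start, end) {
  ord <- order(start)
  start <- start[ord]
  end <- end[ord]
  total <- 0
  cur_start <- start[1]
  cur_end <- end[1]
  for (i in seq_along(start)[-1]) {
    if (start[i] <= cur_end) {
      # Overlapping interval: extend the current block.
      cur_end <- max(cur_end, end[i])
    } else {
      # Gap: close the current block and start a new one.
      total <- total + (cur_end - cur_start + 1)
      cur_start <- start[i]
      cur_end <- end[i]
    }
  }
  total + (cur_end - cur_start + 1)
}

covered <- covered_bases(reads$start, reads$end)
ratio <- covered / genome_size
coverage_table <- data.frame(coverage = c(0, 1), ratio_genome = c(1 - ratio, ratio))
print(coverage_table)
```

Because the denominator is the genome size, the same amount of small RNA signal gives a larger ratio on a smaller genome.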
Number of reads in the data after each step:
Check complex meta-clusters: these arise when small RNAs are spread over the whole genome, so that repetitive small RNAs map to thousands of places and share many sequences across many positions. If any meta-cluster accounts for > 40% of the total data, it may be worth adding filters such as a minimum number of counts (-e) or --min--shared in seqcluster prepare.
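One way to run this check is to compute the share of all counts that each meta-cluster holds and flag those above 40%. This is a minimal sketch on made-up numbers, assuming "total data" means the sum over all samples; `clus_toy` is a hypothetical stand-in for the meta-cluster count matrix.

```r
# Toy meta-cluster count matrix: rows are meta-clusters, columns are samples.
clus_toy <- matrix(
  c(500, 450,
     30,  20,
     60,  70,
     10,  15),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("mc1", "mc2", "mc3", "mc4"), c("s1", "s2"))
)

# Fraction of all counts assigned to each meta-cluster.
share <- rowSums(clus_toy) / sum(clus_toy)

# Meta-clusters holding more than 40% of the data.
flagged <- share[share > 0.4]
print(round(share, 3))
print(names(flagged))
```

If anything is flagged, that is the cue to consider the filters mentioned above.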
Number of clusters with > 10 counts.
sample | colSums(clus_ma > 10) |
---|---|
miRQC_A | 710 |
miRQC_A_repeat | 717 |
miRQC_B | 504 |
miRQC_B_repeat | 491 |
miRQC_C | 706 |
miRQC_C_repeat | 710 |
miRQC_D | 732 |
miRQC_D_repeat | 723 |
DESeq2 is used for this analysis.
miRNA | baseMean | log2FoldChange | lfcSE | stat | pvalue | padj |
---|---|---|---|---|---|---|
hsa-miR-10a-5p | 74339.987 | 7.109443 | 0.0548203 | 129.68640 | 0 | 0 |
hsa-miR-10b-5p | 224315.085 | 6.742609 | 0.0544403 | 123.85329 | 0 | 0 |
hsa-miR-133a-3p | 7913.271 | 5.212065 | 0.0712453 | 73.15658 | 0 | 0 |
hsa-miR-141-3p | 10951.626 | 9.265565 | 0.1646798 | 56.26411 | 0 | 0 |
hsa-miR-143-3p | 708824.163 | 2.675688 | 0.0490673 | 54.53099 | 0 | 0 |
hsa-miR-148a-3p | 22314.666 | 3.931444 | 0.0573949 | 68.49818 | 0 | 0 |
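The columns of this table are related to each other. Assuming the default DESeq2 Wald test was used (the report does not say so explicitly), `stat` is `log2FoldChange / lfcSE` and the p-value is the two-sided normal tail probability of that statistic. The sketch below re-derives both from the first three rows; the `res` data frame is re-typed by hand from the table.

```r
# First three rows of the table above, copied by hand.
res <- data.frame(
  row.names = c("hsa-miR-10a-5p", "hsa-miR-10b-5p", "hsa-miR-133a-3p"),
  log2FoldChange = c(7.109443, 6.742609, 5.212065),
  lfcSE = c(0.0548203, 0.0544403, 0.0712453),
  stat = c(129.68640, 123.85329, 73.15658)
)

# Wald statistic: effect size divided by its standard error.
wald <- res$log2FoldChange / res$lfcSE

# Two-sided p-value under a standard normal; with statistics this large
# the tail probability underflows and is reported as 0.
pval <- 2 * pnorm(-abs(wald))

print(data.frame(reported = res$stat, recomputed = round(wald, 4), pvalue = pval,
                 row.names = rownames(res)))
```

This is why `pvalue` and `padj` appear as exactly 0: the values are smaller than what double precision can represent, not literally zero.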
isomiR | baseMean | log2FoldChange | lfcSE | stat | pvalue | padj |
---|---|---|---|---|---|---|
hsa-miR-10a-5p.iso.t5:0.t3:0.ad:u-A.mm:0 | 2765.335 | 6.714222 | 0.1423670 | 47.16135 | 0 | 0 |
hsa-miR-10a-5p.iso.t5:0.t3:d-T.ad:0.mm:0 | 1785.733 | 6.689695 | 0.1741045 | 38.42346 | 0 | 0 |
hsa-miR-10a-5p.iso.t5:0.t3:u-G.ad:0.mm:0 | 46731.083 | 7.038884 | 0.0483076 | 145.70955 | 0 | 0 |
hsa-miR-10a-5p.iso.t5:0.t3:u-TG.ad:0.mm:0 | 3222.939 | 6.982275 | 0.1455374 | 47.97583 | 0 | 0 |
hsa-miR-10a-5p.iso.t5:d-T.t3:0.ad:0.mm:0 | 3713.865 | 7.028550 | 0.1394841 | 50.38961 | 0 | 0 |
hsa-miR-10a-5p.iso.t5:d-T.t3:0.ad:u-A.mm:0 | 3873.404 | 7.126067 | 0.1373717 | 51.87435 | 0 | 0 |
id | baseMean | log2FoldChange | lfcSE | stat | pvalue | padj | miRQC_A | miRQC_A_repeat | miRQC_B | miRQC_B_repeat | miRQC_C | miRQC_C_repeat | miRQC_D | miRQC_D_repeat |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
297 | 14544.633 | 5.560037 | 0.0946888 | 58.71908 | 0 | 0 | 22681 | 31034 | 812 | 570 | 6907 | 7753 | 18505 | 19849 |
334 | 11359.672 | 9.382545 | 0.1810657 | 51.81847 | 0 | 0 | 18401 | 23887 | 41 | 34 | 4976 | 5468 | 15495 | 15582 |
360 | 12436.680 | -4.044059 | 0.0862888 | -46.86657 | 0 | 0 | 1034 | 1347 | 27992 | 20045 | 22317 | 23068 | 7280 | 7983 |
478 | 8452.558 | 5.186183 | 0.0964175 | 53.78882 | 0 | 0 | 13495 | 17804 | 602 | 441 | 4318 | 4724 | 10346 | 11209 |
529 | 2577.567 | 5.266629 | 0.1290920 | 40.79748 | 0 | 0 | 3921 | 5424 | 168 | 125 | 1308 | 1344 | 3432 | 3478 |
549 | 2769.405 | 5.199808 | 0.1208995 | 43.00935 | 0 | 0 | 4420 | 5441 | 195 | 132 | 1445 | 1581 | 3634 | 3790 |