This post records statistics on transcript sequences.
- 2006.12.20: I noticed that dbEST has been updated.
- dbEST release 121506 - December 15, 2006. Number of public entries: 40,227,012
- UniGene for Mouse, Rat, Dog, and Rice has been updated.
- dbEST has been updated to release 111706 - November 17, 2006.
- The Soybean UniGene build went up from #25 to #26. Nov. 3, 2006
- The Human UniGene build went up from #195 to #196. Oct. 18, 2006
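I suspect that accumulating the NCBI dbEST and UniGene summaries over time and plotting them will reveal something, so I have decided to collect them on this blog.
Two kinds of information are collected:
- NCBI dbEST Summary release 121506 - December 15, 2006
- NCBI UniGene summary
I have only just started collecting, so there is not yet enough data to draw graphs. It is a long-term project, but by this time next year I may be able to plot them.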
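As a first step toward those graphs, the per-organism change between two releases can be computed directly from the summary entries. The following is only a minimal sketch: the function names (`parse_release_date`, `parse_summary`, `growth`) are hypothetical, it assumes each entry has the form "organism name, then a comma-grouped count", and it assumes the release identifier is a date in MMDDYY form, as the release headings in this post suggest.

```python
import re
from datetime import datetime

# Assumed entry shape: "Bos taurus (cattle) 1,236,788",
# i.e. an organism name followed by a comma-grouped count.
ENTRY = re.compile(r"^(?P<organism>.+?)\s+(?P<count>\d{1,3}(?:,\d{3})*)$")


def parse_release_date(release_id):
    """Interpret a release identifier such as '121506' as MMDDYY."""
    return datetime.strptime(release_id, "%m%d%y").date()


def parse_summary(lines):
    """Map organism name -> EST count for one release."""
    counts = {}
    for line in lines:
        match = ENTRY.match(line.strip())
        if match:
            counts[match.group("organism")] = int(match.group("count").replace(",", ""))
    return counts


def growth(older, newer):
    """Change in EST count for organisms present in both releases."""
    return {name: newer[name] - older[name] for name in newer if name in older}


# Two entries copied from the November and December 2006 releases below.
november = parse_summary([
    "Bos taurus (cattle) 1,141,099",
    "Xenopus laevis (African clawed frog) 542,288",
])
december = parse_summary([
    "Bos taurus (cattle) 1,236,788",
    "Xenopus laevis (African clawed frog) 737,698",
])
delta = growth(november, december)
print(parse_release_date("121506"), delta)
```

Keying the counts by release date would give one time series per organism, which is the shape a plotting library would need once more releases have been collected.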
dbEST Summary
dbEST release 121506 - December 15, 2006
- Homo sapiens (human) 7,895,603
- Mus musculus + domesticus (mouse) 4,740,859
- Bos taurus (cattle) 1,236,788
- Oryza sativa (rice) 1,211,078
- Zea mays (maize) 1,160,495
- Danio rerio (zebrafish) 1,152,629
- Xenopus tropicalis 1,090,063
- Rattus norvegicus + sp. (rat) 871,144
- Triticum aestivum (wheat) 855,067
- Xenopus laevis (African clawed frog) 737,698
- Arabidopsis thaliana (thale cress) 686,778
- Ciona intestinalis 686,396
- Sus scrofa (pig) 641,857
- Gallus gallus (chicken) 599,175
- Drosophila melanogaster (fruit fly) 514,613
- Hordeum vulgare + subsp. vulgare (barley) 437,713
- Salmo salar (Atlantic salmon) 430,223
- Canis familiaris (dog) 365,909
- Glycine max (soybean) 359,402
- Caenorhabditis elegans (nematode) 346,064
- Pinus taeda (loblolly pine) 329,469
- Vitis vinifera (wine grape) 316,756
- Oryzias latipes (Japanese medaka) 309,868
- Aedes aegypti (yellow fever mosquito) 298,060
- Branchiostoma floridae (Florida lancelet) 277,538
- Gasterosteus aculeatus (three spined stickleback) 276,992
- Oncorhynchus mykiss (rainbow trout) 260,886
- Malus x domestica (apple tree) 254,891
- Pimephales promelas 249,941
- Solanum lycopersicum (tomato) 249,392
- Saccharum officinarum (sugarcane) 246,301
- Solanum tuberosum (potato) 226,798
- Medicago truncatula (barrel medic) 225,129
- Sorghum bicolor (sorghum) 204,208
- Ovis aries (sheep) 186,664
- Bombyx mori (domestic silkworm) 184,200
- Gossypium hirsutum (upland cotton) 177,048
- Physcomitrella patens subsp. patens 174,908
- Hydra magnipapillata 174,162
- Chlamydomonas reinhardtii 167,641
- Schistosoma mansoni (blood fluke) 158,841
- Dictyostelium discoideum 155,032
- Anopheles gambiae (African malaria mosquito) 153,165
- Lotus japonicus 150,631
- Trichosurus vulpecula 147,199
- Strongylocentrotus purpuratus (purple urchin) 141,833
- Picea glauca 132,624
- Toxoplasma gondii 129,421
- Molgula tectiformis 106,863
- Macaca fascicularis 101,192
dbEST release 111706 - November 17, 2006
- Homo sapiens (human) 7,895,572
- Mus musculus + domesticus (mouse) 4,722,069
- Oryza sativa (rice) 1,211,064
- Zea mays (maize) 1,160,485
- Danio rerio (zebrafish) 1,152,269
- Bos taurus (cattle) 1,141,099
- Xenopus tropicalis 1,039,143
- Rattus norvegicus + sp. (rat) 871,144
- Triticum aestivum (wheat) 855,067
- Arabidopsis thaliana (thale cress) 734,275
- Ciona intestinalis 686,396
- Sus scrofa (pig) 640,034
- Gallus gallus (chicken) 599,171
- Xenopus laevis (African clawed frog) 542,288
- Drosophila melanogaster (fruit fly) 514,613
- Hordeum vulgare + subsp. vulgare (barley) 437,321
- Salmo salar (Atlantic salmon) 428,803
- Canis familiaris (dog) 365,909
- Glycine max (soybean) 359,402
- Caenorhabditis elegans (nematode) 346,064
- Pinus taeda (loblolly pine) 329,469
- Vitis vinifera (wine grape) 316,756
- Oryzias latipes (Japanese medaka) 309,868
- Aedes aegypti (yellow fever mosquito) 298,060
- Branchiostoma floridae (Florida lancelet) 277,538
- Gasterosteus aculeatus (three spined stickleback) 276,992
- Oncorhynchus mykiss (rainbow trout) 260,886
- Malus x domestica (apple tree) 254,422
- Pimephales promelas 249,941
- Solanum lycopersicum (tomato) 249,392
- Saccharum officinarum (sugarcane) 246,301
- Solanum tuberosum (potato) 226,798
- Medicago truncatula (barrel medic) 225,129
- Sorghum bicolor (sorghum) 204,208
- Ovis aries (sheep) 186,664
- Bombyx mori (domestic silkworm) 184,200
- Gossypium hirsutum (upland cotton) 177,047
- Physcomitrella patens subsp. patens 174,908
- Hydra magnipapillata 174,162
- Chlamydomonas reinhardtii 167,641
- Schistosoma mansoni (blood fluke) 158,841
- Dictyostelium discoideum 155,032
- Anopheles gambiae (African malaria mosquito) 153,165
- Lotus japonicus 150,631
- Trichosurus vulpecula 147,199
- Strongylocentrotus purpuratus (purple urchin) 141,833
- Picea glauca 132,624
- Toxoplasma gondii 129,421
- Molgula tectiformis 106,863
- Macaca fascicularis 101,192
dbEST release 100606 - October 6, 2006
Number of public entries: 38,953,178
- Homo sapiens (human) 7,893,983
- Mus musculus + domesticus (mouse) 4,720,064
- Oryza sativa (rice) 1,188,565
- Zea mays (maize) 1,143,728
- Bos taurus (cattle) 1,137,353
- Danio rerio (zebrafish) 1,134,553
- Xenopus tropicalis 1,044,182
- Rattus norvegicus + sp. (rat) 871,144
- Triticum aestivum (wheat) 855,066
- Ciona intestinalis 686,396
- Sus scrofa (pig) 623,929
- Arabidopsis thaliana (thale cress) 622,973
- Gallus gallus (chicken) 599,141
- Xenopus laevis (African clawed frog) 537,424
- Drosophila melanogaster (fruit fly) 514,545
- Hordeum vulgare + subsp. vulgare (barley) 437,321
- Canis familiaris (dog) 365,909
- Glycine max (soybean) 359,151
- Caenorhabditis elegans (nematode) 346,064
- Pinus taeda (loblolly pine) 329,469
- Vitis vinifera (wine grape) 316,756
- Oryzias latipes (Japanese medaka) 309,868
- Aedes aegypti (yellow fever mosquito) 298,060
- Branchiostoma floridae (Florida lancelet) 277,538
- Gasterosteus aculeatus (three spined stickleback) 273,259
- Oncorhynchus mykiss (rainbow trout) 260,886
- Malus x domestica (apple tree) 254,169
- Pimephales promelas 249,941
- Saccharum officinarum (sugarcane) 246,301
- Salmo salar (Atlantic salmon) 237,274
- Solanum tuberosum (potato) 226,798
- Medicago truncatula (barrel medic) 225,129
- Sorghum bicolor (sorghum) 204,208
- Lycopersicon esculentum (tomato) 199,873
- Ovis aries (sheep) 186,664
- Bombyx mori (domestic silkworm) 184,200
- Gossypium hirsutum (upland cotton) 177,037
- Physcomitrella patens subsp. patens 174,908
- Hydra magnipapillata 174,162
- Chlamydomonas reinhardtii 167,641
- Schistosoma mansoni (blood fluke) 158,841
- Dictyostelium discoideum 155,032
- Anopheles gambiae (African malaria mosquito) 153,165
- Lotus japonicus 150,631
- Strongylocentrotus purpuratus (purple urchin) 141,833
- Picea glauca 132,624
- Toxoplasma gondii 129,421
- Trichosurus vulpecula 111,634
- Molgula tectiformis 106,863
- Macaca fascicularis 101,442
UniGene Homo sapiens:Human
UniGene Build #196
Sequences Included in UniGene
- Known genes are from GenBank 30 Aug 2006
- ESTs are from dbEST through 30 Aug 2006
- 163,705 mRNAs
- 4,881 Models
- 48,742 HTC
- 1,733,348 EST, 3'reads
- 3,986,551 EST, 5'reads
- 1,051,626 EST, other/unknown
- 6,988,853 total sequences in clusters

Build Method: Genome Based. Alignments between transcript sequences and genomic sequences are used to generate clusters of sequences originating from the same gene.

Final Number of Clusters (sets)
- 86,804 sets total
- 26,053 sets contain at least one mRNA
- 12,687 sets contain at least one HTC sequence
- 80,829 sets contain at least one EST
- 23,349 sets contain both mRNAs and ESTs

Histogram of cluster sizes for UniGene Hs build 196
- 32769-65536 1
- 16385-32768 6
- 8193-16384 22
- 4097-8192 62
- 2049-4096 233
- 1025-2048 739
- 513-1024 2141
- 257-512 4326
- 129-256 4268
- 65-128 3376
- 33-64 3150
- 17-32 3436
- 9-16 4061
- 5-8 5367
- 3-4 6217
- 2 6068
- 1 40423
UniGene Build #195
Sequences Included in UniGene
- Known genes are from GenBank 25 Jul 2006
- ESTs are from dbEST through 25 Jul 2006
- 161,677 mRNAs
- 6,454 Models
- 48,622 HTC
- 1,732,950 EST, 3'reads
- 3,985,237 EST, 5'reads
- 1,044,933 EST, other/unknown
- 6,979,873 total sequences in clusters

Build Method: Genome Based. Alignments between transcript sequences and genomic sequences are used to generate clusters of sequences originating from the same gene.

Final Number of Clusters (sets)
- 86,804 sets total
- 26,187 sets contain at least one mRNA
- 12,683 sets contain at least one HTC sequence
- 83,579 sets contain at least one EST
- 23,357 sets contain both mRNAs and ESTs

Histogram of cluster sizes for UniGene Hs build 195
- 16385-32768 8
- 8193-16384 22
- 4097-8192 59
- 2049-4096 240
- 1025-2048 740
- 513-1024 2146
- 257-512 4289
- 129-256 4239
- 65-128 3271
- 33-64 3089
- 17-32 3274
- 9-16 4043
- 5-8 5434
- 3-4 6547
- 2 6579
- 1 42824
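Because these summaries are copied by hand, a quick consistency check on each build may be worthwhile before plotting. The sketch below uses the Human Build #195 figures above and tests two things: that the per-type sequence counts add up to the reported total sequences in clusters, and that the histogram bins add up to the reported total number of sets. The second check assumes every cluster falls into exactly one size bin, which is an assumption made here rather than something stated on the UniGene page; the helper name `to_int` is likewise hypothetical.

```python
def to_int(text):
    """Convert a comma-grouped count such as '86,804' to an integer."""
    return int(text.replace(",", ""))


# Figures copied from the Homo sapiens Build #195 summary.
sequence_counts = {
    "mRNAs": "161,677",
    "Models": "6,454",
    "HTC": "48,622",
    "EST, 3'reads": "1,732,950",
    "EST, 5'reads": "3,985,237",
    "EST, other/unknown": "1,044,933",
}
total_sequences = to_int("6,979,873")

# Histogram: cluster size range -> number of clusters in that range.
histogram = {
    "16385-32768": 8, "8193-16384": 22, "4097-8192": 59, "2049-4096": 240,
    "1025-2048": 740, "513-1024": 2146, "257-512": 4289, "129-256": 4239,
    "65-128": 3271, "33-64": 3089, "17-32": 3274, "9-16": 4043,
    "5-8": 5434, "3-4": 6547, "2": 6579, "1": 42824,
}
total_sets = to_int("86,804")

# Check 1: sequence types should add up to the total sequences in clusters.
sequences_ok = sum(to_int(v) for v in sequence_counts.values()) == total_sequences

# Check 2 (assumed property): histogram bins should add up to the total sets.
clusters_ok = sum(histogram.values()) == total_sets

print("sequence counts consistent:", sequences_ok)
print("cluster histogram consistent:", clusters_ok)
```

A build that fails either check is not necessarily wrong at the source; it may simply mean that a figure was copied from a different page view or build.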
UniGene Mus musculus:Mouse
UniGene Build #159
Sequences Included in UniGene
- Known genes are from GenBank 30 Oct 2006
- ESTs are from dbEST through 30 Oct 2006
- 84,068 mRNAs
- 6,172 Models
- 128,190 HTC
- 1,545,208 EST, 3'reads
- 2,223,743 EST, 5'reads
- 292,822 EST, other/unknown
- 4,280,203 total sequences in clusters

Build Method: Genome Based. Alignments between transcript sequences and genomic sequences are used to generate clusters of sequences originating from the same gene.

Final Number of Clusters (sets)
- 64,618 sets total
- 22,203 sets contain at least one mRNA
- 23,765 sets contain at least one HTC sequence
- 61,396 sets contain at least one EST
- 19,498 sets contain both mRNAs and ESTs

Histogram of cluster sizes for UniGene Mm build 159
- 8193-16384 3
- 4097-8192 13
- 2049-4096 63
- 1025-2048 292
- 513-1024 1186
- 257-512 3451
- 129-256 4677
- 65-128 3738
- 33-64 3298
- 17-32 3020
- 9-16 3464
- 5-8 4190
- 3-4 5631
- 2 4827
- 1 26765
UniGene Build #158
Sequences Included in UniGene
- Known genes are from GenBank 04 Sep 2006
- ESTs are from dbEST through 04 Sep 2006
- 83,045 mRNAs
- 6,232 Models
- 128,755 HTC
- 1,543,211 EST, 3'reads
- 2,223,941 EST, 5'reads
- 292,786 EST, other/unknown
- 4,277,970 total sequences in clusters

Build Method: Genome Based. Alignments between transcript sequences and genomic sequences are used to generate clusters of sequences originating from the same gene.

Final Number of Clusters (sets)
- 66,184 sets total
- 22,171 sets contain at least one mRNA
- 24,120 sets contain at least one HTC sequence
- 61,418 sets contain at least one EST
- 19,490 sets contain both mRNAs and ESTs

Histogram of cluster sizes for UniGene Mm build 158
- 8193-16384 3
- 4097-8192 12
- 2049-4096 62
- 1025-2048 280
- 513-1024 1152
- 257-512 3434
- 129-256 4779
- 65-128 3920
- 33-64 3556
- 17-32 3444
- 9-16 3784
- 5-8 4167
- 3-4 5218
- 2 4083
- 1 26738
UniGene Rattus norvegicus:Rat
UniGene Build #157
Sequences Included in UniGene
- Known genes are from GenBank 30 Oct 2006
- ESTs are from dbEST through 30 Oct 2006
- 31,717 mRNAs
- 9,383 Models
- 643 HTC
- 333,209 EST, 3'reads
- 335,508 EST, 5'reads
- 60,722 EST, other/unknown
- 771,182 total sequences in clusters

Build Method: Genome Based. Alignments between transcript sequences and genomic sequences are used to generate clusters of sequences originating from the same gene.

Final Number of Clusters (sets)
- 83,779 sets total
- 14,027 sets contain at least one mRNA
- 601 sets contain at least one HTC sequence
- 48,217 sets contain at least one EST
- 10,102 sets contain both mRNAs and ESTs

Histogram of cluster sizes for UniGene Rn build 157
- 2049-4096 5
- 1025-2048 11
- 513-1024 29
- 257-512 103
- 129-256 479
- 65-128 1962
- 33-64 4371
- 17-32 4661
- 9-16 4318
- 5-8 4107
- 3-4 4462
- 2 5345
- 1 22351
UniGene Build #156
Sequences Included in UniGene
- Known genes are from GenBank 03 Sep 2006
- ESTs are from dbEST through 03 Sep 2006
- 31,483 mRNAs
- 9,532 Models
- 644 HTC
- 333,155 EST, 3'reads
- 335,452 EST, 5'reads
- 60,644 EST, other/unknown
- 770,910 total sequences in clusters

Build Method: Genome Based. Alignments between transcript sequences and genomic sequences are used to generate clusters of sequences originating from the same gene.

Final Number of Clusters (sets)
- 52,183 sets total
- 13,877 sets contain at least one mRNA
- 601 sets contain at least one HTC sequence
- 48,205 sets contain at least one EST
- 10,006 sets contain both mRNAs and ESTs

Histogram of cluster sizes for UniGene Rn build 156
- 2049-4096 5
- 1025-2048 10
- 513-1024 29
- 257-512 103
- 129-256 480
- 65-128 1953
- 33-64 4393
- 17-32 4670
- 9-16 4344
- 5-8 4143
- 3-4 4470
- 2 5231
- 1 22352
UniGene Gallus gallus:chicken
UniGene Build #31
Sequences Included in UniGene
- Known genes are from GenBank 02 Aug 2006
- ESTs are from dbEST through 02 Aug 2006
- 30,376 mRNAs
- 0 Models
- 0 HTC
- 22,361 EST, 3'reads
- 408,319 EST, 5'reads
- 78,287 EST, other/unknown
- 539,343 total sequences in clusters

Build Method: Transcript Based. Alignments between all transcript sequences are used to generate clusters of sequences originating from the same gene.

Final Number of Clusters (sets)
- 30,837 sets total
- 17,010 sets contain at least one mRNA
- 0 sets contain at least one HTC sequence
- 30,241 sets contain at least one EST
- 16,414 sets contain both mRNAs and ESTs

Histogram of cluster sizes for UniGene Gga build 31
- 1025-2048 4
- 513-1024 16
- 257-512 66
- 129-256 223
- 65-128 1231
- 33-64 3272
- 17-32 3977
- 9-16 4425
- 5-8 4996
- 3-4 7541
- 2 2704
- 1 2382
UniGene Canis familiaris:Dog
UniGene Build #17
Sequences Included in UniGene
- Known genes are from GenBank 30 Oct 2006
- ESTs are from dbEST through 30 Oct 2006
- 2,331 mRNAs
- 0 Models
- 0 HTC
- 125,578 EST, 3'reads
- 22,469 EST, 5'reads
- 139,759 EST, other/unknown
- 290,137 total sequences in clusters

Build Method: Genome Based. Alignments between transcript sequences and genomic sequences are used to generate clusters of sequences originating from the same gene.

Final Number of Clusters (sets)
- 22,349 sets total
- 1,236 sets contain at least one mRNA
- 0 sets contain at least one HTC sequence
- 21,861 sets contain at least one EST
- 748 sets contain both mRNAs and ESTs

Histogram of cluster sizes for UniGene Cfa build 17
- 4097-8192 4
- 2049-4096 10
- 1025-2048 13
- 513-1024 18
- 257-512 41
- 129-256 122
- 65-128 305
- 33-64 752
- 17-32 1469
- 9-16 2522
- 5-8 3939
- 3-4 5453
- 2 2881
- 1 4820
UniGene Build #16
Sequences Included in UniGene
- Known genes are from GenBank 16 Jul 2006
- ESTs are from dbEST through 16 Jul 2006
- 2,170 mRNAs
- 27,336 Models
- 0 HTC
- 120,542 EST, 3'reads
- 21,631 EST, 5'reads
- 142,756 EST, other/unknown
- 314,435 total sequences in clusters

Build Method: Genome Based. Alignments between transcript sequences and genomic sequences are used to generate clusters of sequences originating from the same gene.

Final Number of Clusters (sets)
- 23,611 sets total
- 1,105 sets contain at least one mRNA
- 0 sets contain at least one HTC sequence
- 23,167 sets contain at least one EST
- 829 sets contain both mRNAs and ESTs

Histogram of cluster sizes for UniGene Cfa build 16
- 2049-4096 3
- 1025-2048 10
- 513-1024 22
- 257-512 52
- 129-256 184
- 65-128 453
- 33-64 1100
- 17-32 2282
- 9-16 3604
- 5-8 3806
- 3-4 3183
- 2 2449
- 1 6463
UniGene Oryza sativa:Rice
UniGene Build #63
Sequences Included in UniGene
- Known genes are from GenBank 29 Oct 2006
- ESTs are from dbEST through 29 Oct 2006
- 72,256 mRNAs
- 0 Models
- 60 HTC
- 544,384 EST, 3'reads
- 341,075 EST, 5'reads
- 175,939 EST, other/unknown
- 1,133,714 total sequences in clusters

Build Method: Genome Based. Alignments between transcript sequences and genomic sequences are used to generate clusters of sequences originating from the same gene.

Final Number of Clusters (sets)
- 35,244 sets total
- 28,140 sets contain at least one mRNA
- 50 sets contain at least one HTC sequence
- 32,079 sets contain at least one EST
- 24,975 sets contain both mRNAs and ESTs

Histogram of cluster sizes for UniGene Os build 63
- 2049-4096 18
- 1025-2048 41
- 513-1024 98
- 257-512 313
- 129-256 996
- 65-128 2649
- 33-64 4577
- 17-32 5422
- 9-16 4739
- 5-8 3835
- 3-4 2920
- 2 1458
- 1 8178
UniGene Build #62
Sequences Included in UniGene
- Known genes are from GenBank 15 Jul 2006
- ESTs are from dbEST through 15 Jul 2006
- 44,773 mRNAs
- 12,738 Models
- 61 HTC
- 537,566 EST, 3'reads
- 336,307 EST, 5'reads
- 171,956 EST, other/unknown
- 1,103,401 total sequences in clusters

Build Method: Genome Based. Alignments between transcript sequences and genomic sequences are used to generate clusters of sequences originating from the same gene.

Final Number of Clusters (sets)
- 46,381 sets total
- 27,631 sets contain at least one mRNA
- 51 sets contain at least one HTC sequence
- 39,744 sets contain at least one EST
- 20,998 sets contain both mRNAs and ESTs

Histogram of cluster sizes for UniGene Os build 62
- 4097-8192 2
- 2049-4096 14
- 1025-2048 39
- 513-1024 94
- 257-512 295
- 129-256 987
- 65-128 2495
- 33-64 4344
- 17-32 5114
- 9-16 4649
- 5-8 4125
- 3-4 3617
- 2 3078
- 1 17528
UniGene Triticum aestivum:Wheat
UniGene Build #46
Sequences Included in UniGene
- Known genes are from GenBank 31 Jul 2006
- ESTs are from dbEST through 31 Jul 2006
- 2,313 mRNAs
- 0 Models
- 0 HTC
- 176,268 EST, 3'reads
- 293,083 EST, 5'reads
- 274,521 EST, other/unknown
- 746,185 total sequences in clusters

Build Method: Transcript Based. Alignments between all transcript sequences are used to generate clusters of sequences originating from the same gene.

Final Number of Clusters (sets)
- 38,566 sets total
- 1,730 sets contain at least one mRNA
- 0 sets contain at least one HTC sequence
- 38,425 sets contain at least one EST
- 1,589 sets contain both mRNAs and ESTs

Histogram of cluster sizes for UniGene Ta build 46
- 16385-32768 1
- 8193-16384 1
- 4097-8192 6
- 2049-4096 13
- 1025-2048 35
- 513-1024 103
- 257-512 204
- 129-256 395
- 65-128 967
- 33-64 1863
- 17-32 2829
- 9-16 4011
- 5-8 6274
- 3-4 10419
- 2 4368
- 1 7077
UniGene Zea mays:Maize
UniGene Build #59
Sequences Included in UniGene
- Known genes are from GenBank 12 Sep 2006
- ESTs are from dbEST through 12 Sep 2006
- 5,260 mRNAs
- 0 Models
- 8,962 HTC
- 197,119 EST, 3'reads
- 193,851 EST, 5'reads
- 483,327 EST, other/unknown
- 888,519 total sequences in clusters

Build Method: Transcript Based. Alignments between all transcript sequences are used to generate clusters of sequences originating from the same gene.

Final Number of Clusters (sets)
- 54,378 sets total
- 4,178 sets contain at least one mRNA
- 7,731 sets contain at least one HTC sequence
- 54,240 sets contain at least one EST
- 4,043 sets contain both mRNAs and ESTs

Histogram of cluster sizes for UniGene Zm build 59
- 4097-8192 1
- 2049-4096 5
- 1025-2048 16
- 513-1024 82
- 257-512 255
- 129-256 738
- 65-128 1884
- 33-64 3549
- 17-32 4109
- 9-16 4510
- 5-8 5281
- 3-4 7512
- 2 4708
- 1 21728
UniGene Glycine max:soybean
UniGene Build #26
Sequences Included in UniGene
- Known genes are from GenBank 12 Oct 2006
- ESTs are from dbEST through 12 Oct 2006
- 1,149 mRNAs
- 0 Models
- 173 HTC
- 63,426 EST, 3'reads
- 224,767 EST, 5'reads
- 7,047 EST, other/unknown
- 296,562 total sequences in clusters

Build Method: Transcript Based. Alignments between all transcript sequences are used to generate clusters of sequences originating from the same gene.

Final Number of Clusters (sets)
- 21,707 sets total
- 920 sets contain at least one mRNA
- 144 sets contain at least one HTC sequence
- 21,635 sets contain at least one EST
- 848 sets contain both mRNAs and ESTs

Histogram of cluster sizes for UniGene Gma build 26
- 2049-4096 3
- 1025-2048 4
- 513-1024 23
- 257-512 68
- 129-256 172
- 65-128 365
- 33-64 930
- 17-32 1938
- 9-16 3230
- 5-8 4657
- 3-4 6756
- 2 1579
- 1 1982
UniGene Build #25
Sequences Included in UniGene
- Known genes are from GenBank 06 Aug 2006
- ESTs are from dbEST through 06 Aug 2006
- 1,102 mRNAs
- 0 Models
- 107 HTC
- 63,419 EST, 3'reads
- 224,542 EST, 5'reads
- 7,042 EST, other/unknown
- 296,212 total sequences in clusters

Build Method: Transcript Based. Alignments between all transcript sequences are used to generate clusters of sequences originating from the same gene.

Final Number of Clusters (sets)
- 21,699 sets total
- 879 sets contain at least one mRNA
- 89 sets contain at least one HTC sequence
- 21,618 sets contain at least one EST
- 798 sets contain both mRNAs and ESTs

Histogram of cluster sizes for UniGene Gma build 25
- 2049-4096 3
- 1025-2048 4
- 513-1024 23
- 257-512 68
- 129-256 172
- 65-128 365
- 33-64 927
- 17-32 1932
- 9-16 3221
- 5-8 4661
- 3-4 6762
- 2 1571
- 1 1990