Hi folks,
I want to calculate the frequency with which a defined stretch of 8 amino acids should appear in the human proteome. For example, assuming that there are 20,000 proteins in a human cell and each protein is 300 aa long, how often would one see the 'RSPMRSSM' stretch? Any help is highly appreciated. Thanks.
Simple math question
-
- Carbon Member
- Posts: 4
- Joined: Thu Jan 23, 2014 1:17 am
msaeed wrote: Hi folks,
I want to calculate the frequency with which a defined stretch of 8 amino acids should appear in the human proteome. For example, assuming that there are 20,000 proteins in a human cell and each protein is 300 aa long, how often would one see the 'RSPMRSSM' stretch? Any help is highly appreciated. Thanks.
I am not an expert in math/probabilities, but here is an approach:
The peptide is 8 aa long -> if every amino acid is equally likely at each position -> 21^8 (= 37,822,859,361) different peptides are possible in total.
In a protein of 300 aa there are 293 positions at which the peptide could start (300-8+1), so we have 293 chances per protein to find the peptide of interest. Since there are 20,000 proteins, we have 20,000 x 293 = 5,860,000 chances to find the peptide.
Now we divide the number of chances across all proteins by the total number of possible 8 aa peptides -> 5,860,000/37,822,859,361 = 1.55E-04. This is the expected number of occurrences of that peptide in the whole proteome; for a value this small it is also approximately the chance of seeing the peptide at least once.
Of course the proteins do not use the whole sequence space (a lot of proteins contain conserved domains with similar peptide stretches).
What do you think of my calculation? Any mistakes?
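For anyone who wants to play with the numbers, the calculation above can be sketched in a few lines of Python. This is only a sketch under the same idealised assumptions (every residue equally likely and independent, all proteins exactly 300 aa); the function and variable names are made up for illustration.

```python
# Expected number of occurrences of one specific 8-mer, assuming every
# residue is equally likely and independent (an idealisation).
ALPHABET_SIZE = 21       # value used in the post; use 20 for the standard amino acids
PEPTIDE_LENGTH = 8
PROTEIN_LENGTH = 300     # assumed identical for every protein
NUM_PROTEINS = 20_000


def expected_occurrences(alphabet_size, peptide_length, protein_length, num_proteins):
    """Return (total start positions) / (number of possible peptides)."""
    # Start positions of a peptide_length window inside one protein: 300 - 8 + 1 = 293
    windows_per_protein = protein_length - peptide_length + 1
    total_windows = windows_per_protein * num_proteins
    possible_peptides = alphabet_size ** peptide_length
    return total_windows / possible_peptides


expected = expected_occurrences(ALPHABET_SIZE, PEPTIDE_LENGTH, PROTEIN_LENGTH, NUM_PROTEINS)
print(f"{expected:.2e}")  # 1.55e-04
```

Changing ALPHABET_SIZE to 20 or PROTEIN_LENGTH to another value shows how sensitive the estimate is to those assumptions.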
-
- E. Coli Lysate Member
- Posts: 107
- Joined: Wed Dec 21, 2011 8:22 pm
-
- Angiotensin Member
- Posts: 42
- Joined: Thu Dec 27, 2012 12:26 pm
-
- Carbon Member
- Posts: 4
- Joined: Fri Apr 10, 2015 6:12 pm
Hi Rolando, thanks for taking the time to figure out the answer. I could not get my head around how you came up with the number 293. How about tackling this problem with simple maths:
1. Total number of possible 8-mer peptides = 20^8 = 2.56 x 10^10
2. If every other factor is unchanged, my peptide of choice will appear once in 2.56 x 10^10 peptides, or once every 2 x 10^11 amino acids (2.56 x 10^10 * 8 = 2.05 x 10^11).
3. Since there are 20,000 proteins each containing 300 amino acids, the total number of amino acids is 6,000,000.
4. 6,000,000 / 2.05 x 10^11 = 2.9 x 10^-5, so apparently the chances of my peptide of choice appearing in the whole proteome are almost negligible.
Let me know if this makes sense to you. Thanks again.
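The four steps above can be reproduced with a short Python sketch, which also counts where the number 293 comes from. It is only an illustration under the thread's assumptions (20 equally likely amino acids, 20,000 proteins of 300 aa); the variable names and the all-alanine placeholder sequence are made up for the example.

```python
ALPHABET_SIZE = 20
PEPTIDE_LENGTH = 8
PROTEIN_LENGTH = 300
NUM_PROTEINS = 20_000

# Steps 1-4 as written in the post
possible_peptides = ALPHABET_SIZE ** PEPTIDE_LENGTH      # step 1: 2.56e10
residues_per_hit = possible_peptides * PEPTIDE_LENGTH    # step 2: about 2.05e11
total_residues = NUM_PROTEINS * PROTEIN_LENGTH           # step 3: 6,000,000
estimate = total_residues / residues_per_hit             # step 4
print(f"{estimate:.1e}")  # 2.9e-05

# Where 293 comes from: slide an 8-residue window along a 300-residue
# protein and count the start positions. Only the length matters here,
# so the placeholder sequence is just 300 alanines.
placeholder_protein = "A" * PROTEIN_LENGTH
windows = [
    placeholder_protein[start:start + PEPTIDE_LENGTH]
    for start in range(PROTEIN_LENGTH - PEPTIDE_LENGTH + 1)
]
print(len(windows))  # 293
```

Multiplying by 8 in step 2 amounts to cutting the proteome into non-overlapping 8-mers, whereas the sliding window counts every overlapping 8-mer. That is the main reason this estimate comes out roughly 8 times smaller than a window-based one using the same 20-letter alphabet.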