PRM experiment using different size FASTA files returns totally different results

Posted: Tue Jul 17, 2018 2:02 pm
by Suola
Hi all,

I recently performed a targeted PRM experiment on a Thermo QEplus, with about 60 target peptides from three different human proteins. Among other analyses, I ran two MaxQuant analyses on the resulting data just for giggles: one using a full human FASTA file from UniProt, and one with just the three proteins and their close relatives (ten proteins total) in the FASTA file. These ten proteins are also in the larger whole human proteome FASTA file. Otherwise, all MaxQuant parameters were exactly the same in the two analyses.

I was surprised to see that the results were very different. The analysis using the small FASTA file returned a total of nine peptides that belong to the three proteins of interest; the PEP values were all about 0.01 and the MaxQuant scores ranged from 4 to 22. The MS/MS counts were typically around 200.

The analysis using the larger FASTA file returned eleven peptides that all belong to various other proteins, but no peptides related to my three proteins of interest. The PEP values ranged from 0.00010434 down to 8.20E-71 and the MaxQuant scores from 56 to 146. The MS/MS counts were 1 for all the peptides.

What does this data mean? Why does the larger FASTA file, which contains all the sequences in the smaller one, not return the same peptides? Does the peptide scoring depend on the FASTA file size?

Best wishes,

Re: PRM experiment using different size FASTA files returns totally different results

Posted: Wed Sep 05, 2018 1:08 am
by aky
Peptide scoring does not depend on the FASTA file size as such, but it does correlate with database size. To clarify, database size here does not mean the FASTA file literally but rather the effective search space, i.e. the set of candidate peptides considered for a given spectrum. The raw score is a goodness-of-fit measure (how well a given peptide matches the spectrum), whereas the final E-value/p-value, or any other measure such as PEP, is a global confidence metric. It depends on which peptides competed to match the spectrum, so how convincing a match looks is also defined by its competitors. With poor competitors, you look great; with fierce competitors, you barely win.
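
To make the competitor effect concrete, here is a minimal toy sketch in Python. It is not MaxQuant's PEP calculation; it only assumes that each random candidate peptide independently has some small chance (the hypothetical `p_single`) of reaching a given raw score, and then asks how likely it is that at least one of the candidates does so by chance. The function name and all numbers are made up for illustration.

```python
def chance_of_random_hit(p_single, n_candidates):
    """Probability that at least one of n_candidates random peptides
    reaches a given raw score, if each one does so independently
    with probability p_single (toy assumption, not MaxQuant's model)."""
    return 1.0 - (1.0 - p_single) ** n_candidates


if __name__ == "__main__":
    p_single = 1e-4  # hypothetical per-candidate chance of a random match
    # Hypothetical candidate counts: tiny FASTA vs. progressively larger search spaces
    for n_candidates in (10, 1_000, 100_000):
        chance = chance_of_random_hit(p_single, n_candidates)
        print(f"{n_candidates:>7} candidates -> chance of a random hit: {chance:.4g}")
```

The same raw score is much easier to reach by chance when many candidates compete for the spectrum, so under this toy model a match that looks acceptable against a ten-protein search space can be unconvincing against the whole proteome.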