Problems converting .wiff to MGF

ablakely
Carbon Member
Posts: 3
Joined: Fri Oct 02, 2015 7:24 am

Postby ablakely » Fri Oct 02, 2015 7:45 am

Hello,
I am working on an undergraduate bioinformatics project, but I have reached an impasse in my workflow.

I am trying to convert a .wiff file from a TripleTOF 5600 instrument to MGF format using msconvert.
The input file is ~2.7 GB, but the resulting MGF file is ~29 GB and appears to contain random data.

The output looks something like this:

Code:

BEGIN IONS
114.13452 0.0
114.25532 10.0
114.55423 10.0
114.74324 0.0
115.11503 0.0
115.21001 0.0
115.33174 10.0
115.63323 10.0
...


Each peak list has hundreds of peaks at very small m/z increments, from 114 on up, with intensities of only 0.0 and 10.0.

I would expect the MGF output to be much smaller than the .wiff input, so something is clearly wrong. Has anyone else encountered this problem?
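
A quick way to put numbers on that observation is sketched below. It is only a rough heuristic, not anything msconvert does itself: it takes the lines of one peak list, skips anything that is not an "m/z intensity" pair, and reports the peak count, the fraction of zero-intensity points, and the median m/z gap. Many zero-intensity points at closely spaced m/z values would suggest profile rather than centroid data. The function name `summarize_peaks` is made up for this example, and the sample is the excerpt shown above.

```python
from statistics import median


def summarize_peaks(lines):
    """Return (n_peaks, zero_fraction, median_mz_gap) for one MGF peak list."""
    peaks = []
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            mz, intensity = float(parts[0]), float(parts[1])
        except ValueError:
            continue  # header lines such as BEGIN IONS or KEY=value pairs
        peaks.append((mz, intensity))
    if len(peaks) < 2:
        return len(peaks), 0.0, None
    zero_fraction = sum(1 for _, inten in peaks if inten == 0.0) / len(peaks)
    gaps = [b[0] - a[0] for a, b in zip(peaks, peaks[1:])]
    return len(peaks), zero_fraction, median(gaps)


# The eight peaks quoted in the post above.
sample = """BEGIN IONS
114.13452 0.0
114.25532 10.0
114.55423 10.0
114.74324 0.0
115.11503 0.0
115.21001 0.0
115.33174 10.0
115.63323 10.0
""".splitlines()

n_peaks, zero_fraction, median_gap = summarize_peaks(sample)
print(n_peaks, zero_fraction, median_gap)
```

For a real file, the same function can be fed the lines between each BEGIN IONS and END IONS pair instead of the in-memory sample.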

Additional information:

The .wiff file is from this data repository on PRIDE:
http://www.ebi.ac.uk/pride/archive/projects/PXD001506


The specific files I was using are Duo_CTRL_200ng_BSA1fmol_SWATH_20121003_01.wiff and the corresponding .wiff.scan file
(http://www.ebi.ac.uk/pride/data/archive/2015/08/PXD001506/Duo_CTRL_200ng_BSA1fmol_SWATH_20121003_01.wiff
http://www.ebi.ac.uk/pride/data/archive/2015/08/PXD001506/Duo_CTRL_200ng_BSA1fmol_SWATH_20121003_01.wiff.scan)

Any help would be much appreciated.
Thanks

Craig
E. Coli Lysate Member
Posts: 220
Joined: Sun Jun 26, 2011 6:49 pm

Postby Craig » Sat Oct 03, 2015 5:13 am

It looks like it's writing profile data to the MGF file. Try adding this to your msconvert command line:
--filter "peakPicking true 1-"
That will tell it to write centroid data, using the vendor centroiding algorithm ("true"), for MS levels 1 and higher.
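
For anyone scripting the conversion, here is a minimal sketch of how that command line could be assembled from Python. It assumes msconvert is on the PATH; the helper name `build_msconvert_command` and the output directory `converted` are made up for illustration. `--mgf` selects MGF output and `-o` sets the output directory; the filter string is the one given above.

```python
import shlex


def build_msconvert_command(wiff_path, out_dir):
    """Build an msconvert argument list that writes centroided MGF output."""
    return [
        "msconvert",
        wiff_path,
        "--mgf",                            # write MGF instead of the default mzML
        "--filter", "peakPicking true 1-",  # vendor centroiding, MS levels 1 and up
        "-o", out_dir,                      # hypothetical output directory
    ]


cmd = build_msconvert_command(
    "Duo_CTRL_200ng_BSA1fmol_SWATH_20121003_01.wiff", "converted"
)
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would run the conversion on a machine with ProteoWizard installed. Keeping the filter as a single list element avoids shell-quoting problems with the spaces inside it.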

Postby ablakely » Sun Oct 04, 2015 12:17 pm

Thank you for the reply. I ran msconvert again with the new argument, this time from a command window, and noticed some errors.

Code:

[SpectrumWorkerThreads::work] error in thread: bad allocation
[SpectrumWorkerThreads::work] error in thread: bad allocation
[SpectrumWorkerThreads::work] error in thread: bad allocation


The output still appears to be incorrect.
Only certain discrete intensity values show up: 10.0, 20.0, 31.0, 41.0, 51.0, 61.0 and 71.0.
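
One way to confirm which intensity values occur, without loading a multi-gigabyte MGF file into memory, is to tally them while reading line by line. The sketch below assumes peak lines are whitespace-separated "m/z intensity" pairs; the function name `intensity_histogram` is made up for this example, and the m/z values in the sample are illustrative, not taken from the real file.

```python
from collections import Counter


def intensity_histogram(lines):
    """Count how often each intensity value occurs across MGF peak lines."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            float(parts[0])              # m/z, parsed only to validate the line
            intensity = float(parts[1])
        except ValueError:
            continue  # header lines such as BEGIN IONS or KEY=value pairs
        counts[intensity] += 1
    return counts


# Illustrative sample; the m/z values here are invented for the example.
sample = [
    "BEGIN IONS",
    "400.10 10.0",
    "400.25 20.0",
    "401.00 10.0",
    "402.50 31.0",
    "END IONS",
]
hist = intensity_histogram(sample)
print(sorted(hist.items()))
```

An open file handle can be passed in place of the sample list, since iterating over a file yields one line at a time and only the counter is held in memory.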


Looks like I will have to start over using a different dataset, perhaps something that doesn't use the .wiff format.
Thanks for your help.

Postby Craig » Mon Oct 05, 2015 3:26 pm

I'm not sure about the errors (sounds like you may be out of memory), but I think the quantized nature of the data is correct. You can look at the .wiff file in a ProteoWizard GUI called SeeMS. Here is a screenshot of a random scan (#1366, retention time 58.0255 min or 3480.153 s):
[Screenshot: Duo_CTRL_200ng_BSA1fmol_SWATH_20121003_01.scan1366.png]


And here is the section of the .mgf file for that scan:
[Attachment: .mgf excerpt for scan 1366]

You will see they agree perfectly.

Postby ablakely » Thu Oct 08, 2015 7:09 pm

I see. I wonder why the data are quantized like that for many of the earlier scans in the file. Is it that the peaks are close to the limit of detection, so the intensity can only take a few coarse values at those levels?

I now have ProteoWizard installed which should make it much easier for me to interpret what I'm seeing.


Anyway, thanks for looking into it further for me.

