Input and output
glaDIAtor-nf consumes:
- .mzML or .mzXML files with spectra from DIA mass spectrometry
- .mzML or .mzXML files with spectra from DDA mass spectrometry (optional)
- .fasta files with the corresponding protein amino acid sequences
produces:
- sample × peptide abundance matrices in .tsv
- sample × protein abundance matrices in .tsv
Input data from DIA and DDA
DIA and DDA input files must be in either mzML or mzXML format, with the data simplified through peak picking.
If the DIA or DDA data is in a proprietary raw format, it must be converted first. Both peak picking and conversion can often be performed with the ProteoWizard msconvert and qtofpeakpicker tools.
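Before converting anything, it can help to see which files in a spectra directory are not yet in one of the accepted formats. The following is a minimal sketch, not part of glaDIAtor-nf: the function name needs_conversion is hypothetical, and it assumes all spectra files sit directly inside one directory.

```shell
#!/usr/bin/env bash
# List files in a directory that are not yet in mzML or mzXML format.
# needs_conversion is a hypothetical helper name used only for this example.
needs_conversion() {
    local dir=$1
    # -maxdepth 1: look only at files directly inside the directory.
    # -name: match the .mzML and .mzXML extensions exactly as written above.
    find "$dir" -maxdepth 1 -type f \
        ! -name '*.mzML' ! -name '*.mzXML' -print | sort
}
```

Calling needs_conversion dia-spectra would then print every file that still has to go through msconvert or qtofpeakpicker.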
Raw DDA files can be peak picked and converted to mzML with qtofpeakpicker:
# Peak pick each raw .wiff file and write the result as mzML into the same directory
for f in dda-spectra/*.wiff; do
    wine qtofpeakpicker --resolution=2000 --area=1 --threshold=1 --smoothwidth=1.1 --in "$f" --out "dda-spectra/$(basename --suffix=.wiff "$f").mzML"
done
Raw DIA files can be peak picked and converted to mzML with msconvert:
# Convert up to five .wiff files in parallel; -iname matches file names only, so the directory is given to find as the search root
find dia-spectra -iname '*.wiff' -print0 | xargs -P5 -0 -I{} wine msconvert {} --filter "peakPicking true 1-" --filter 'titleMaker <RunId>.<ScanNumber>.<ScanNumber>.<ChargeState> File:"<SourcePath>", NativeID:"<Id>"' -o dia-spectra/
Some raw data found in databases has already been peak picked. In that case, omit the --filter "peakPicking true 1-" switch so that only the conversion to mzML is performed.
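For data that is already peak picked, the conversion-only call could look like the sketch below. It keeps the msconvert options shown above and leaves out the peakPicking filter. Wrapping the call in a function named convert_only and the MSCONVERT override variable are assumptions made for this example, so that the command can be dry-run with echo. Neither is part of ProteoWizard or glaDIAtor-nf.

```shell
#!/usr/bin/env bash
# Convert already peak-picked raw files to mzML without peak picking again.
# convert_only and MSCONVERT are hypothetical names used only in this sketch;
# MSCONVERT defaults to "wine msconvert" and can be set to "echo" for a dry run.
convert_only() {
    local indir=$1 outdir=$2
    # Split the command string into words, e.g. "wine msconvert" -> wine, msconvert.
    local -a cmd
    read -r -a cmd <<< "${MSCONVERT:-wine msconvert}"
    # No --filter "peakPicking true 1-" here: only the format conversion runs.
    find "$indir" -iname '*.wiff' -print0 |
        xargs -0 -P5 -I{} "${cmd[@]}" {} \
            --filter 'titleMaker <RunId>.<ScanNumber>.<ScanNumber>.<ChargeState> File:"<SourcePath>", NativeID:"<Id>"' \
            -o "$outdir"
}
```

For example, MSCONVERT=echo convert_only dia-spectra dia-spectra/ prints the commands that would be run, and convert_only dia-spectra dia-spectra/ runs them through wine.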
ProteoWizard is available as a Docker image at https://hub.docker.com/r/chambm/pwiz-skyline-i-agree-to-the-vendor-licenses. Use of this image is demonstrated in getting started.
mzML vs mzXML
The mzXML format is a predecessor of the mzML format. We recommend converting raw data directly to mzML and avoiding the older mzXML format whenever possible.