OpenMS  2.7.0
GNPSExport

Export MS/MS data in .MGF format for GNPS (http://gnps.ucsd.edu).

GNPS (Global Natural Products Social Molecular Networking, http://gnps.ucsd.edu) is an open-access knowledge base for community-wide organization and sharing of raw, processed or identified tandem mass spectrometry (MS/MS) data. The GNPS web platform makes it possible to search spectra against public MS/MS spectral libraries and to perform various data analyses such as MS/MS molecular networking, network annotation propagation (http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1006089), and DEREPLICATOR-based annotation (https://www.nature.com/articles/nchembio.2219). The GNPS manuscript is available here: https://www.nature.com/articles/nbt.3597

This tool was developed for the Feature-Based Molecular Networking (FBMN) workflow on GNPS (https://gnps.ucsd.edu/ProteoSAFe/static/gnps-splash2.jsp).

Please cite our article: Nothias, LF., Petras, D., Schmid, R. et al. Feature-based molecular networking in the GNPS analysis environment. Nat Methods 17, 905–908 (2020). https://doi.org/10.1038/s41592-020-0933-6

See the FBMN workflow documentation here: https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking/

In brief, after running an OpenMS "metabolomics" pipeline, the GNPSExport TOPP tool can be used on the consensusXML file and corresponding mzML files to generate the files needed for FBMN on GNPS. These two files are:

  • The MS/MS spectral data file (.MGF format), which is generated with the GNPSExport tool.
  • The feature quantification table (.CSV format), which is generated with the TextExporter tool.

For each consensusElement in the consensusXML file, GNPSExport produces one representative consensus MS/MS spectrum (called a peptide annotation in OpenMS jargon), which is written to the MS/MS spectral file (.MGF file). Several modes for generating the consensus MS/MS spectrum are available and are described below. Note that these parameters are defined in the GNPSExport INI parameter file.

Representative command:

GNPSExport -ini iniFile-GNPSExport.ini -in_cm filefilter.consensusXML -in_mzml inputFile0.mzML inputFile1.mzML -out GNPSExport_output.mgf
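
When the export is scripted, the same command can be assembled programmatically. The following is a minimal Python sketch, not part of OpenMS: the helper name build_gnps_export_command is hypothetical, and it uses only the flags shown in the command above (-ini, -in_cm, -in_mzml, -out).

```python
import shlex


def build_gnps_export_command(ini_file, consensus_file, mzml_files, out_file):
    """Assemble the GNPSExport argument list (hypothetical helper).

    Only the flags that appear in the representative command are used.
    """
    return [
        "GNPSExport",
        "-ini", ini_file,
        "-in_cm", consensus_file,
        "-in_mzml", *mzml_files,
        "-out", out_file,
    ]


cmd = build_gnps_export_command(
    "iniFile-GNPSExport.ini",
    "filefilter.consensusXML",
    ["inputFile0.mzML", "inputFile1.mzML"],
    "GNPSExport_output.mgf",
)
# Print the command as a single shell-style line for inspection.
print(shlex.join(cmd))
```

To actually run the tool, the list could be passed to subprocess.run(cmd, check=True), assuming GNPSExport is installed and on the PATH.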

The GNPSExport TOPP tool can be run on a consensusXML file and the corresponding mzML files to generate an MS/MS spectral file (MGF format) and a corresponding feature quantification table (.TXT format) that contains the LC-MS peak area intensities.

Requirements:

  • The IDMapper has to be run on the featureXML files in order to associate MS2 scan(s) (peptide annotations) with each feature. These peptide annotations are used by the GNPSExport.
  • The FileFilter has to be run on the consensusXML file, prior to the GNPSExport, in order to remove consensusElements without MS2 scans (peptide annotation).

Parameters:

  • Binning (ms2_bin_size): Defines the binning width of fragment ions during the merging of eligible MS/MS spectra.
  • Cosine Score Threshold (merged_spectra:cos_similarity): Defines the necessary pairwise cosine similarity with the highest precursor intensity MS/MS scan.
  • Output Type (output_type): Options for outputting GNPSExport spectral processing are:
    1. [RECOMMENDED] Merged spectra: merged_spectra - For each consensusElement, the GNPSExport will merge all the eligible MS/MS scans into one representative consensus MS/MS spectrum. Eligible MS/MS scans have a pairwise cosine similarity with the MS/MS scan of highest precursor intensity above the Cosine Score Threshold. The fragment ions of the merged MS/MS scans are binned in m/z (Da) ranges whose width is defined by the Binning parameter.
    2. Most intense: most_intense - For each consensusElement, the GNPSExport will output the most intense MS/MS scan (the one with the highest precursor ion intensity) as the consensus MS/MS spectrum.
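
The two output types can be illustrated with a short Python sketch. This is not the OpenMS implementation: the scan representation (a dict with "precursor_intensity" and "peaks"), the function names, the choice to sum intensities within a bin, the use of the bin centre as the reported m/z, and treating a similarity equal to the threshold as eligible are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def bin_peaks(peaks, bin_size):
    """Sum fragment intensities into m/z bins of width bin_size."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_size)] += intensity
    return binned


def cosine(a, b):
    """Cosine similarity between two binned spectra (bin index -> intensity)."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_intense(scans):
    """most_intense mode: pick the scan with the highest precursor intensity."""
    return max(scans, key=lambda scan: scan["precursor_intensity"])


def merged_spectrum(scans, bin_size, cos_threshold):
    """merged_spectra mode: merge scans similar enough to the reference scan."""
    reference = bin_peaks(most_intense(scans)["peaks"], bin_size)
    merged = defaultdict(float)
    for scan in scans:
        binned = bin_peaks(scan["peaks"], bin_size)
        # Assumption: scans at or above the threshold are eligible.
        if cosine(reference, binned) >= cos_threshold:
            for bin_index, intensity in binned.items():
                merged[bin_index] += intensity
    # Report each bin by its centre m/z, sorted by m/z.
    return sorted(((index + 0.5) * bin_size, value) for index, value in merged.items())
```

In this sketch bin_size plays the role of ms2_bin_size and cos_threshold the role of merged_spectra:cos_similarity; a scan that shares no fragment bins with the reference scan has a similarity of 0 and is left out of the merged spectrum.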

Note that the mass accuracy and the retention time window for pairing MS/MS scans with an LC-MS feature or consensusElement are defined at the IDMapper tool step.

A representative OpenMS-GNPS workflow would sequentially use these OpenMS TOPP tools:

  1. Input mzML files
  2. Run the FeatureFinderMetabo tool on the mzML files.
  3. Run the IDMapper tool on the featureXML and mzML files.
  4. Run the MapAlignerPoseClustering tool on the featureXML files.
  5. Run the MetaboliteAdductDecharger tool on the featureXML files.
  6. Run the FeatureLinkerUnlabeledKD tool or the FeatureLinkerUnlabeledQT tool on the featureXML files and output a consensusXML file.
  7. Run the FileFilter on the consensusXML file to keep only consensusElements with at least one MS/MS scan (peptide identification).
  8. Run the GNPSExport on the "filtered consensusXML file" to export an .MGF file.
  9. Run the TextExporter on the "filtered consensusXML file" to export a .TXT file.
  10. Upload your files to GNPS and run the Feature-Based Molecular Networking workflow. Instructions are here: https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking/
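
Before uploading, it can be useful to check that the exported .MGF file contains the expected number of spectra (one per consensusElement kept by the FileFilter). The following Python sketch counts spectrum blocks by relying only on the BEGIN IONS / END IONS delimiters of the MGF format; the function name count_mgf_spectra and the sample lines are made up for illustration and do not reproduce actual GNPSExport output.

```python
def count_mgf_spectra(lines):
    """Count BEGIN IONS / END IONS blocks in an iterable of MGF lines."""
    count = 0
    inside = False
    for raw in lines:
        line = raw.strip()
        if line == "BEGIN IONS":
            if inside:
                raise ValueError("nested BEGIN IONS")
            inside = True
        elif line == "END IONS":
            if not inside:
                raise ValueError("END IONS without BEGIN IONS")
            inside = False
            count += 1
    if inside:
        raise ValueError("unterminated spectrum block")
    return count


# Made-up sample content with two spectrum blocks.
sample = """BEGIN IONS
100.5 18.0
200.5 9.0
END IONS

BEGIN IONS
150.1 3.0
END IONS
"""
print(count_mgf_spectra(sample.splitlines()))
```

For a real export, an open file handle can be passed instead of the sample lines, since the function accepts any iterable of lines.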

The GitHub repository for that ProteoSAFe workflow and the OpenMS Python wrappers is available here: https://github.com/Bioinformatic-squad-DorresteinLab/openms-gnps-workflow

An online version of the OpenMS-GNPS pipeline for FBMN, running on the CCMS server (http://proteomics.ucsd.edu/), is available on GNPS: https://ccms-ucsd.github.io/GNPSDocumentation/featurebasedmolecularnetworking-with-openms/

The command line parameters of this tool are:

INI file documentation of this tool: