Interesting article in PLoS Genetics:
The Complex Genetic Architecture of the Metabolome
Association between more than 200,000 single-nucleotide variants across the genome and levels of 327 metabolites in 96 strains of Arabidopsis thaliana showed that only 23–30% of the variation in cellular metabolite levels was associated with specific sites in the genome.
Tuesday, November 23, 2010
Monday, November 15, 2010
The Untargeted Metabolomics Workflow
Over the past two years, the methodology that I have employed to make metabolite identifications using an untargeted metabolomics workflow has evolved. In 2008, not only were metabolite databases smaller, but they also lacked some of the advanced functionality that is available today. For example, searching for sodium and potassium adducts required manually calculating masses from the observed m/z values. We have come a long way, with improvements in both metabolomics software and databases facilitating metabolite identification.
The major databases emerging as key players for untargeted studies are HMDB, Lipid Maps, and METLIN. Each, in my opinion, has its own advantages. I would like to start this blog by surveying which databases metabolomics investigators use most frequently and why. HMDB, for example, provides so-called MetaboCards in which fundamental biological facts are introduced for queried molecules. This information can be particularly useful for filtering out putative hits for metabolites that may not be relevant to the sample type being analyzed, such as a hit for a plant metabolite in results from bacterial cells. Another function recently incorporated into METLIN is the ability to search fragment ions from MS/MS data. With this function, it is now possible to do MS/MS on all features of interest in a dataset prior to querying databases to potentially reduce false-negative hits.
A few years ago, the workflow of identifying metabolites in a global MS-based study offered little room for creativity. I am certain today, however, that investigators are taking advantage of the various new database functions in a multitude of innovative ways. I hope that by discussing and exchanging ideas about our untargeted workflows we can learn new ways to facilitate what I would still classify as the rate-limiting step in metabolomics: metabolite identification.
So what process do you use to make metabolite identifications? How do you prioritize your feature lists? Do you search all the databases on the web, or do you confine yourself to an in-house library? I look forward to reading about your different points of view!
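To make the manual adduct calculation mentioned above concrete, here is a minimal sketch in Python. It assumes singly charged positive-mode ions and uses commonly quoted monoisotopic mass shifts. The function names and the glucose example are illustrative only and do not come from any particular database or software package.

```python
# Monoisotopic mass shifts (Da) for common singly charged positive-mode adducts.
# Each shift is the added ion's mass minus one electron mass.
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
}


def neutral_mass_candidates(observed_mz):
    """Return the neutral mass implied by each adduct assumption (hypothetical helper)."""
    return {adduct: observed_mz - shift for adduct, shift in ADDUCT_SHIFTS.items()}


def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


# Illustrative feature: m/z 203.0526, which could be sodiated glucose
# (glucose monoisotopic mass is about 180.06339 Da).
candidates = neutral_mass_candidates(203.0526)
for adduct, mass in candidates.items():
    print(f"{adduct}: neutral mass {mass:.5f} Da")
```

Each candidate neutral mass would then be searched against a database within a chosen ppm tolerance. Databases that accept adduct types directly perform this subtraction for you.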
Friday, November 12, 2010
A short history of XCMS
XCMS is an open-source, platform-independent R package that was developed to perform untargeted metabolite profiling with LC/MS. XCMS reads and processes LC/MS data stored in NetCDF, mzXML, mzData and mzML files. It provides methods for peak picking, non-linear retention time alignment, visualization, relative quantitation and statistics. XCMS is capable of simultaneously preprocessing, analyzing, and visualizing the raw data from hundreds of samples. The original XCMS paper, published in 2006, has been cited more than 270 times (Google Scholar, 11/12/2010).
Colin Smith initially developed XCMS in 2004. In 2008, Steffen Neumann and Ralf Tautenhahn joined the development team, followed by Paul Benton in 2009. Today, XCMS contains two methods for LC/MS feature detection and a method for peak detection in single high-res spectra (FTICR, MALDI, DIMS). Two different non-linear retention time correction methods are available, along with two methods for grouping LC/MS features. A separate method aligns single high-res spectra using a moving-window technique. XCMS can generate mass spectra, TICs, EICs and EIC overlays, 3D LC/MS surface plots and boxplots. Methods to read and preprocess MS/MS data are also available. XCMS can make use of multicore processors, as well as MPI or SNOW clusters, to speed up data processing. XCMS is now widely used for untargeted metabolomics, metabolic profiling and biomarker discovery.
Spring 2004: Added methods for reading and displaying raw data from NetCDF files (Colin)
Summer 2004: Developed methods for kernel density peak grouping and LOESS retention time alignment (Colin)
Fall 2004: Developed matched filter peak picker, EIC generation (Colin)
March 2005: Checked into Bioconductor SVN repository (Colin)
December 2005: mzXML, mzData import added (Colin)
April 2007: centWave peak detection added (Ralf)
November 2007: Reading of MS/MS spectra added (Steffen)
January 2008: single spectra alignment method added (Steffen)
July 2008: Multiprocessor peak picking added via MPI (Ralf)
March 2009: OBI-Warp retention time alignment added (Steffen, Ralf)
April 2009: group nearest alignment method added (Steffen, Ralf)
June 2010: gap filler/stitch method added (Paul)
September 2010: 64-bit support added (Steffen, Ralf)
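For readers unfamiliar with the EICs (extracted ion chromatograms) mentioned above, the idea can be sketched in a few lines of Python. This is a conceptual illustration only, not XCMS code and not how XCMS implements it. The scan layout (a retention time paired with a list of m/z and intensity tuples), the function name, and the example values are all assumptions made for the sketch.

```python
def extract_eic(scans, target_mz, tol_ppm=10.0):
    """Sum intensities inside an m/z window for every scan (hypothetical helper).

    scans: list of (retention_time, [(mz, intensity), ...]) tuples.
    Returns a list of (retention_time, summed_intensity) pairs.
    """
    half_width = target_mz * tol_ppm / 1e6
    lo, hi = target_mz - half_width, target_mz + half_width
    eic = []
    for rt, peaks in scans:
        total = sum(intensity for mz, intensity in peaks if lo <= mz <= hi)
        eic.append((rt, total))
    return eic


# Made-up centroided scans: retention time in seconds, then (m/z, intensity) pairs.
scans = [
    (10.0, [(203.0525, 100.0), (250.1000, 40.0)]),
    (11.0, [(203.0527, 500.0), (203.2000, 30.0)]),
    (12.0, [(250.1001, 45.0)]),
]
eic = extract_eic(scans, target_mz=203.0526)
for rt, intensity in eic:
    print(rt, intensity)
```

Plotting summed intensity against retention time gives the chromatographic trace for that m/z window, which is the kind of trace that peak picking operates on.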