Visit our GitHub page to find out more about our current projects. Below you can find a list of our latest software.

MSFragger: ultrafast and comprehensive peptide identification in mass spectrometry-based proteomics


MSFragger is an ultrafast database search tool that uses a fragment ion indexing method to rapidly perform spectral similarity comparisons. On a typical quad-core workstation, MSFragger can perform an open search (500 Da precursor mass tolerance window) in under 10 minutes for a single LC-MS/MS run. It is implemented in Java and is available as a standalone JAR.
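The idea behind a fragment ion index can be illustrated with a toy sketch. This is not MSFragger's implementation: the bin width, the function names, and the fragment masses are all assumptions made for illustration. Theoretical fragment masses are binned once, and each spectrum peak then looks up the peptides sharing its bin, so no peptide has to be compared against the spectrum one by one.

```python
from collections import defaultdict

BIN_WIDTH = 0.02  # Da; hypothetical bin width chosen only for this sketch


def build_fragment_index(peptide_fragments):
    """Map each fragment-mass bin to the set of peptides with a fragment in it."""
    index = defaultdict(set)
    for peptide, masses in peptide_fragments.items():
        for mass in masses:
            index[round(mass / BIN_WIDTH)].add(peptide)
    return index


def count_matches(index, spectrum_peaks):
    """Count, per peptide, how many spectrum peaks fall on one of its fragment bins."""
    counts = defaultdict(int)
    for peak in spectrum_peaks:
        center = round(peak / BIN_WIDTH)
        hits = set()
        for bin_id in (center - 1, center, center + 1):  # tolerate bin-edge effects
            hits |= index.get(bin_id, set())
        for peptide in hits:
            counts[peptide] += 1
    return dict(counts)


# Toy fragment masses (made up, not computed from real sequences).
toy_fragments = {
    "PEPTIDE_A": [98.06, 227.10, 324.16],
    "PEPTIDE_B": [98.06, 254.16, 355.21],
}
toy_index = build_fragment_index(toy_fragments)
toy_counts = count_matches(toy_index, [98.06, 227.10, 324.16])
print(toy_counts)
```

Because candidates are found through their fragments rather than through a narrow precursor mass filter, the same lookup works when the precursor tolerance is very wide, which is the situation in an open search.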

Kong, A. T.; Leprevost, F. V.; Avtonomov, D. M.; Mellacheruvu, D.; Nesvizhskii, A. I. MSFragger: Ultrafast and Comprehensive Peptide Identification in Mass Spectrometry-Based Proteomics. Nat. Methods 2017, 14 (5), 513–520.

MSFragger website
Download via FragPipe

Philosopher: A complete toolkit for shotgun proteomics data analysis

Philosopher provides easy access to third-party tools and custom algorithms, allowing users to build complete proteomics analyses, from peptide-spectrum matching to annotated protein reports. Philosopher is also tuned for open search analysis, providing a modified version of the Prophets for peptide validation and protein inference. To date, Philosopher is the only proteomics toolkit that allows you to process and analyze both closed and open search results.

Philosopher website

TMT-Integrator: A tool that integrates channel abundances from multiple TMT samples and exports a general report for downstream analysis

The main purpose of TMT-Integrator is to extract and combine channel abundances from multiple TMT samples. It takes PSM tables generated by Philosopher as input and exports a general report in which the columns are sample names and the rows are abundances at a specified level. TMT-Integrator currently provides four output levels: gene, protein, peptide, and phosphorylation site.
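The shape of such a report can be sketched as follows. This is a minimal illustration, not TMT-Integrator's code: the row layout, the key names (`gene`, `protein`, `abundances`), and the use of plain summation as the aggregation step are all assumptions.

```python
from collections import defaultdict


def integrate_channels(psm_rows, level="gene"):
    """Sum channel abundances per group (e.g. per gene) and per sample.

    psm_rows: iterable of dicts holding a grouping key (hypothetical names such
    as "gene" or "protein") and an "abundances" dict of sample name -> intensity.
    Returns {group: {sample: summed abundance}}.
    """
    report = defaultdict(lambda: defaultdict(float))
    for row in psm_rows:
        for sample, value in row["abundances"].items():
            report[row[level]][sample] += value
    return {group: dict(samples) for group, samples in report.items()}


# Hypothetical PSM rows with two samples.
toy_psms = [
    {"gene": "G1", "protein": "P1", "abundances": {"sample1": 10.0, "sample2": 20.0}},
    {"gene": "G1", "protein": "P2", "abundances": {"sample1": 5.0, "sample2": 1.0}},
    {"gene": "G2", "protein": "P3", "abundances": {"sample1": 2.0, "sample2": 3.0}},
]
gene_report = integrate_channels(toy_psms, level="gene")
protein_report = integrate_channels(toy_psms, level="protein")
```

Switching `level` changes only the row grouping; the columns remain the sample names, which mirrors the report layout described above.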

TMT-Integrator Website

BatMass: a Java Software Platform for LC-MS Data Visualization in Proteomics and Metabolomics


BatMass is a mass spectrometry data visualization tool. It was created as an extensible platform that provides basic functionality such as project management, raw mass spectrometry data access, various GUI widgets, and extension points.

Avtonomov, D. M.; Raskind, A.; Nesvizhskii, A. I. BatMass: A Java Software Platform for LC-MS Data Visualization in Proteomics and Metabolomics. J. Proteome Res. 2016, 15 (8), 2500–2509.

BatMass Website

DIA-Umpire: comprehensive computational framework for data independent acquisition proteomics


DIA-Umpire is an open source Java program for computational analysis of data independent acquisition (DIA) mass spectrometry-based proteomics data. It enables untargeted peptide and protein identification and quantitation using DIA data, and also incorporates targeted extraction to reduce the number of cases of missing quantitation.

Tsou, C. C.; Avtonomov, D.; Larsen, B.; Tucholska, M.; Choi, H.; Gingras, A. C.; Nesvizhskii, A. I. DIA-Umpire: Comprehensive Computational Framework for Data-Independent Acquisition Proteomics. Nat. Methods 2015, 12 (3), 258–264.

DIA-Umpire Website

mapDIA: Preprocessing and statistical analysis of quantitative proteomics data from data independent acquisition mass spectrometry


mapDIA is software for statistical analysis of differential expression using MS/MS fragment-level quantitative data from data independent acquisition (DIA) proteomics experiments. It offers a series of tools for essential data preprocessing, including a novel retention time-based normalization method and multiple peptide/fragment selection steps.

Teo, G.; Kim, S.; Tsou, C. C.; Collins, B.; Gingras, A. C.; Nesvizhskii, A. I.; Choi, H. MapDIA: Preprocessing and Statistical Analysis of Quantitative Proteomics Data from Data Independent Acquisition Mass Spectrometry. J Proteomics 2015, 129, 108–120.

mapDIA Website

The CRAPome: a contaminant repository for affinity purification - mass spectrometry data


The Contaminant Repository for Affinity Purification (CRAPome) is a database of annotated negative controls contributed by the proteomics research community. It addresses the common problem of distinguishing real interactions from the non-specific background (also known as 'contaminants'). The database and associated computational tools for scoring protein interactions are available online. The intuitive web interface can be used to explore the database and to analyze user-uploaded data.

Mellacheruvu, D.; Wright, Z.; Couzens, A. L.; Lambert, J. P.; St-Denis, N. A.; Li, T.; Miteva, Y. V.; Hauri, S.; Sardiu, M. E.; Low, T. Y.; et al. The CRAPome: A Contaminant Repository for Affinity Purification-Mass Spectrometry Data. Nat. Methods 2013, 10 (8), 730–736.

CRAPome home

SAINT: probabilistic scoring of affinity purification–mass spectrometry data


SAINT comprises computational models and software for assigning confidence scores to protein-protein interactions in label-free quantitative AP-MS datasets. For each observed interaction with associated label-free quantification, SAINT calculates the probability of a true interaction. The modeling incorporates various data normalization steps and can also use the quantitative information from negative control purifications to improve specificity in small-to-intermediate scale experiments (SAINT v. 2). The method was initially developed for label-free spectral count data, but was later extended to MS1 intensity-based quantitative data (SAINT-MS1). SAINTexpress is a recently developed fast version of the algorithm.
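To make "probability of a true interaction" concrete, here is a deliberately simplified sketch. It assumes a two-component Poisson mixture for spectral counts with fixed, hypothetical rates and prior; SAINT's actual models are more elaborate and estimate their parameters from the data, so this should be read as an illustration of the idea only.

```python
import math


def poisson_pmf(count, rate):
    """Probability of observing `count` events under a Poisson(rate) model."""
    return math.exp(-rate) * rate ** count / math.factorial(count)


def interaction_probability(count, true_rate, false_rate, prior_true=0.5):
    """Posterior probability that a spectral count comes from the 'true' component.

    true_rate and false_rate are hypothetical mean counts for true and false
    interactions; in practice the false rate could be informed by negative
    control purifications.
    """
    true_part = prior_true * poisson_pmf(count, true_rate)
    false_part = (1.0 - prior_true) * poisson_pmf(count, false_rate)
    return true_part / (true_part + false_part)


high = interaction_probability(10, true_rate=10.0, false_rate=1.0)
low = interaction_probability(1, true_rate=10.0, false_rate=1.0)
```

With these toy rates, a prey seen with 10 spectra scores close to 1, while a prey seen with a single spectrum scores close to 0.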

Choi, H.; Liu, G.; Mellacheruvu, D.; Tyers, M.; Gingras, A. C.; Nesvizhskii, A. I. Analyzing Protein-Protein Interactions from Affinity Purification-Mass Spectrometry Data with SAINT. Curr Protoc Bioinformatics 2012, Chapter 8, Unit8.15.

SAINT Website

ProHits: integrated software for mass spectrometry-based interaction proteomics

ProHits is a Laboratory Information Management System (LIMS) for interaction proteomics developed primarily by the Anne-Claude Gingras and Mike Tyers laboratories in collaboration with the Nesvizhskii lab. It is a comprehensive system that integrates the TPP/iProphet for peptide/protein identification and the SAINT suite of tools for interaction scoring.

Nature Biotechnology, 2010


LuciPHOr2: site localization of generic post-translational modifications from tandem mass spectrometry data

LuciPHOr2 re-implements the original LuciPHOr algorithm (see above) in Java and expands it to work on any post-translational modification. LuciPHOr2 has several advantages over the previous version: it can run on any computer that runs Java, it can score any PTM, and it can score results from any search tool. Like the original LuciPHOr, this release can process PeptideProphet XML files (pepXML). It can also read tab-delimited files with scores from any protein search tool.

Bioinformatics, 2014

LuciPHOr2 Website

Abacus: A computational tool for extracting and pre-processing spectral count data for label-free quantitative proteomic analysis

Abacus is a computational tool for extracting label-free quantitative information (spectral counts) from MS/MS data sets. It aggregates data from multiple experiments, adjusts spectral counts to accurately account for peptides shared across multiple proteins, and performs common normalization steps. It can also output the spectral count data at the gene level, thus simplifying the integration and comparison of gene and protein expression data. Abacus is compatible with the widely used Trans-Proteomic Pipeline suite of tools and comes with a graphical user interface that makes it easy to interact with the program. The main aim of Abacus is to streamline the analysis of spectral count data by providing an automated, easy-to-use solution for extracting this information from proteomic data sets for subsequent, more sophisticated statistical analysis.
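One common way to account for shared peptides is to split their spectral counts among the candidate proteins in proportion to each protein's unique counts. The sketch below illustrates that scheme; it is an assumption made for illustration and not necessarily the exact formula Abacus uses, and the function and variable names are hypothetical.

```python
def adjusted_spectral_counts(peptide_counts, peptide_to_proteins):
    """Distribute shared-peptide counts in proportion to unique counts.

    peptide_counts: {peptide: spectral count}
    peptide_to_proteins: {peptide: list of proteins containing that peptide}
    Returns {protein: adjusted spectral count}.
    """
    # Step 1: counts from peptides that map to exactly one protein.
    unique = {}
    for peptide, proteins in peptide_to_proteins.items():
        for protein in proteins:
            unique.setdefault(protein, 0.0)
        if len(proteins) == 1:
            unique[proteins[0]] += peptide_counts[peptide]

    # Step 2: split each shared peptide's counts by the unique-count ratio.
    adjusted = dict(unique)
    for peptide, proteins in peptide_to_proteins.items():
        if len(proteins) > 1:
            total_unique = sum(unique[p] for p in proteins)
            for protein in proteins:
                if total_unique > 0:
                    share = unique[protein] / total_unique
                else:
                    share = 1.0 / len(proteins)  # no unique evidence: split evenly
                adjusted[protein] += peptide_counts[peptide] * share
    return adjusted


toy_counts = {"pepA": 6, "pepB": 2, "pepShared": 4}
toy_mapping = {"pepA": ["P1"], "pepB": ["P2"], "pepShared": ["P1", "P2"]}
toy_adjusted = adjusted_spectral_counts(toy_counts, toy_mapping)
print(toy_adjusted)  # {'P1': 9.0, 'P2': 3.0}
```

In the toy data, P1 has three times the unique evidence of P2, so it receives three of the four shared spectra.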

Proteomics, 2011

Abacus Website

QPROT: Statistical method for testing differential expression using protein-level intensity data in label-free quantitative proteomics

QPROT is software for testing differential protein expression using MS1 and MS/MS-level continuous quantitative data. It features a hierarchical model with a predictive recursive algorithm, and includes percentile normalization and multithreading for fast computing.

Proteomics, 2015

QPROT Website

QSPEC: Significance analysis of spectral count data in label-free shotgun proteomics

QSPEC is software for the analysis of differential protein expression using label-free spectral count data. Its hierarchical model pools statistical information for mean and variance estimates across all proteins when only a limited number of replicates is available. In a typical quantitative proteomics experiment, there are rarely enough replicates to make conventional statistical tests such as the t-test applicable. QSPEC addresses this problem and calculates the ratio of likelihoods (Bayes factor) for differential expression for each protein based on certain model assumptions (Poisson-family distributions for count data and Gaussian distribution for intensity data).
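A Bayes factor for count data can be illustrated with a much simpler model than the one QSPEC fits. The sketch below assumes Poisson counts with a conjugate Gamma prior on the rate (hypothetical shape and rate of 1.0) and no pooling across proteins, so it shows only the form of the comparison: one model gives each condition its own rate, the other gives both conditions a shared rate.

```python
import math


def log_marginal(counts, shape=1.0, rate=1.0):
    """Log marginal likelihood of Poisson counts under a Gamma(shape, rate) prior.

    The product of factorial terms is omitted because it cancels in the
    Bayes factor computed below.
    """
    total, n = sum(counts), len(counts)
    return (shape * math.log(rate) - math.lgamma(shape)
            + math.lgamma(shape + total)
            - (shape + total) * math.log(rate + n))


def log_bayes_factor(group_a, group_b, shape=1.0, rate=1.0):
    """Log Bayes factor: separate rates per group versus one shared rate."""
    separate = log_marginal(group_a, shape, rate) + log_marginal(group_b, shape, rate)
    shared = log_marginal(list(group_a) + list(group_b), shape, rate)
    return separate - shared


different = log_bayes_factor([20, 22], [2, 3])  # positive: favors differential expression
similar = log_bayes_factor([5, 5], [5, 5])      # negative: favors a shared rate
```

A positive log Bayes factor favors differential expression, and a negative one favors no difference between the conditions.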

Molecular & Cellular Proteomics, 2008

Please use QPROT instead.

NestedCluster: Analysis of Protein Complexes via Model-based Biclustering of Label-free Quantitative AP-MS Data

NestedCluster is a biclustering method for constructing protein complexes using (filtered) high-confidence interaction data from label-free quantitative AP-MS experiments. The method forms bait clusters based on the similarity of quantitative interaction profiles as anchors of protein complexes, and identifies submatrices of prey proteins showing consistent quantitative association within the anchor bait clusters. The statistical model determines the optimal number of bait clusters and prey clusters in the data, automatically yielding the configuration of highly probable protein complexes.

Molecular Systems Biology, 2010

Download NestedCluster

Trans-Proteomic Pipeline

We developed core components of the widely used open source data analysis pipeline (Trans-Proteomic Pipeline, TPP) for primary processing of mass spectrometry-based proteomic data. The pipeline is currently maintained by the Seattle Proteomics Center at the Institute for Systems Biology.

Download TPP