
The annotation of genomes from next-generation sequencing platforms needs to be rapid, high-throughput, and fully integrated and automated. Although a few Web-based annotation services have recently become available, they may not be the best solution for researchers who need to annotate a large number of genomes, possibly including proprietary data, and store them locally for further analysis.
A standalone software application, the Annotation of microbial Genome Sequences (AGeS) system, incorporates publicly available and in-house-developed bioinformatics tools and databases, many of which are parallelized for high-throughput performance. AGeS was designed to support three main capabilities. The first is the storage of input contig sequences in FASTA format and the resulting annotation data in a central, customized database, where the data manipulation and visualization steps are performed through easy-to-use graphical user interfaces (GUIs). The second is the annotation of microbial genomes using an integrated software pipeline, which analyzes sequence contigs and locates genomic regions that code for proteins, RNAs, and other genomic elements through the Do-It-Yourself Annotation (DIYA) framework. The identified protein-coding regions are then annotated using an in-house-developed, high-throughput pipeline, the Pipeline for Protein Annotation (PIPA). The third capability is the visualization of annotated sequences using the open-source genome browser GBrowse. Full genome and protein annotation, storage, and visualization for bacterial genomes have been implemented.
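As an illustration of the first capability (storing FASTA contigs in a central database), the following is a minimal sketch only. It assumes a SQLite database and a hypothetical `contigs` table; the actual AGeS database engine and schema are not described in this brief, and the function names `parse_fasta` and `store_contigs` are invented for the example.

```python
import sqlite3
from typing import Iterable, Iterator, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (contig_id, sequence) pairs from FASTA-formatted lines."""
    contig_id = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if contig_id is not None:
                yield contig_id, "".join(chunks)
            # Use the first whitespace-delimited token after '>' as the ID.
            header = line[1:].split()
            contig_id = header[0] if header else ""
            chunks = []
        elif contig_id is not None:
            chunks.append(line)
    if contig_id is not None:
        yield contig_id, "".join(chunks)


def store_contigs(conn: sqlite3.Connection,
                  records: Iterable[Tuple[str, str]]) -> int:
    """Insert contigs into a hypothetical 'contigs' table; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contigs ("
        "contig_id TEXT PRIMARY KEY, "
        "sequence TEXT NOT NULL, "
        "length INTEGER NOT NULL)"
    )
    rows = [(cid, seq, len(seq)) for cid, seq in records]
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT INTO contigs VALUES (?, ?, ?)", rows)
    return len(rows)
```

A caller could pass an open file handle to `parse_fasta` and the resulting records to `store_contigs`. Keeping the parser separate from the storage step means the same records could also feed an annotation stage.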
The AGeS system was designed and implemented to provide a standalone, integrated solution that users can install on their computers. AGeS can be installed on either a standalone Linux computer or a Linux cluster. When run on a multicore Linux computer or a Linux cluster, AGeS supports OpenMPI for parallel execution and PBS for batch submission. The AGeS system has been designed for easy integration with future sequence analysis modules. Its Web applications use technologies based on open standards, including Java, JavaScript, and XML.
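To make the combination of PBS batch submission and MPI execution concrete, here is a hedged sketch that renders a generic PBS job script as text. It is not taken from AGeS: the helper `render_pbs_script`, the job name, and the `annotate_contigs` command are hypothetical placeholders, and the `nodes=...:ppn=...` resource line assumes Torque-style PBS syntax, which may differ on other PBS variants.

```python
import shlex
from typing import Sequence


def render_pbs_script(job_name: str, nodes: int, cores_per_node: int,
                      walltime: str, command: Sequence[str]) -> str:
    """Return the text of a PBS job script that launches `command` under mpirun."""
    total_ranks = nodes * cores_per_node
    # Quote each argument so paths with spaces survive the shell.
    quoted = " ".join(shlex.quote(arg) for arg in command)
    lines = [
        "#!/bin/bash",
        f"#PBS -N {job_name}",
        f"#PBS -l nodes={nodes}:ppn={cores_per_node}",  # Torque-style request (assumption)
        f"#PBS -l walltime={walltime}",
        'cd "$PBS_O_WORKDIR"',  # directory from which the job was submitted
        f"mpirun -np {total_ranks} {quoted}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 'annotate_contigs' is a placeholder, not an actual AGeS executable.
    print(render_pbs_script("ages_demo", 2, 8, "01:00:00",
                            ["annotate_contigs", "--input", "contigs.fasta"]))
```

The generated text would typically be saved to a file and submitted with `qsub`; on a single multicore machine the `mpirun` line alone could be run directly.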
This work was done by Kamal Kumar, Valmik Desai, Li Cheng, Maxim Khitrov, Deepak Grover, Ravi Vijaya Satya, Chenggang Yu, Nela Zavaljevski, and Jaques Reifman of the Army Medical Research and Materiel Command.
AGeS: A Software System for Microbial Genome Sequence Annotation (reference ARL-0129) is currently available for download from the TSP library.