Biology

[*] ABaCUS is a no-frills program developed to investigate the significance of the putative correspondence between exons and units of protein structure. The analysis attempts to reject the reference hypothesis (sometimes called a "null" hypothesis) that no correspondence exists. This program may be useful in your research if you are studying the origin and evolution of intron-containing genes.
[CM] AMBER (Assisted Model Building with Energy Refinement) refers to two things: a molecular mechanical force field for the simulation of biomolecules (which is in general use in a variety of simulation programs); and a package of molecular simulation programs which includes source code and demos. The current version of this package is AMBER ver 5 which is sold for UCSF by Oxford Molecular. The code is written in Fortran and C and requires approximately 65 megabytes of disk space.
[*] Angler is a browser of C. elegans embryo development in time and space.
[*] Autoseq is a small package of base calling software for ABI automated DNA sequencers. It is intended as a starting point for researchers interested in new base calling algorithms. It implements a simple Bayesian peak identification algorithm which is experimental and does not generally perform better than software provided by ABI.
[*] Biomer is a Java-based, on-line biomolecular modeling package. It should be useful in generating initial structures of biopolymers and small organic molecules. Simple force-field energy minimizations can be carried out in addition to simulated annealing with molecular dynamics.
[*] Biomer is a web-based molecular modeling package for biopolymers and small organic molecules. Features include model builders for nucleic acids, polypeptides, and polysaccharides, AMBER force-field geometry optimization, and simulated annealing with molecular dynamics. Requires a Java virtual machine (http://www.javasoft.com/).
[*] Biostatistic Software is a collection of software developed in the course of statistical research at the Department of Biostatistics, University of Zurich. It currently contains the following items:
  • Local polynomial regression fitting with Epanechnikov weights
  • Local polynomial regression fitting with ridging
  • MATLAB Smoothing Toolbox -- a shell script of tools for integration of nonparametric regression and smoothing techniques into MATLAB.
  • Kernel regression fitting with local and global bandwidth optimization
  • Kernel density estimation with global bandwidth selection
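The kernel methods named in this list can be illustrated with a minimal Python sketch. It is not the Zurich code: the function names are hypothetical, the bandwidth is supplied by hand rather than optimized, and the regression shown is the Nadaraya-Watson (local constant) estimator, which is the simplest case of local polynomial fitting with Epanechnikov weights.

```python
def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kernel_density(x, samples, bandwidth):
    """Kernel density estimate at x using one global bandwidth."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    total = sum(epanechnikov((x - s) / bandwidth) for s in samples)
    return total / (len(samples) * bandwidth)


def kernel_regression(x, xs, ys, bandwidth):
    """Nadaraya-Watson estimate at x: a kernel-weighted mean of ys.

    Returns None when no observation falls inside the kernel window.
    """
    weights = [epanechnikov((x - xi) / bandwidth) for xi in xs]
    total = sum(weights)
    if total == 0.0:
        return None
    return sum(w * y for w, y in zip(weights, ys)) / total
```

Bandwidth selection, the part the listed tools automate, is left out here; in practice it is usually chosen by cross-validation or a plug-in rule.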
[*] BRAGI is an interactive protein-modelling program. It was developed for the specific purpose of modelling unknown proteins from the structure of a known one, but it has many other capabilities:
  • Display of proteins or non-proteins; Different file formats (Brookhaven, RDI, Sybyl MOL2)
  • Display of structures in various modes (space filling, Ball&Stick, tube, ...)
  • Display in stereo mode
  • Calculation and display of the molecular surface (coloured by electrostatics or hydrophobicity; solid or wireframe)
  • Calculation and display of secondary structure
  • Display of backbone torsion angles (RAMACHANDRAN-Plot)
  • Exchange of amino acids & variation of torsion angles with interactive energy calculation
  • Comparison of the three-dimensional structure and superposition of two molecules by RMS fitting
  • Calculation of covalent and potential hydrogen bonds
  • Output as a hard copy or plot
  • Building or replacing protein loops
  • Creation of new molecules or modification of existing ones in a molecule editor (still in development)
[GPL] BTL (Bioinformatics Template Library) uses templates to implement generic programming in the same way as the Standard Template Library (STL). The focus of this library is on the data structures and algorithms used within the fields of bioinformatics and molecular modelling. Templates allow the development of efficient generic programming modules using compile-time mechanisms. Although the BTL has been designed with biomolecular applications in mind, it contains classes of more general utility.
[GPL] Clarrhmos is a description language and simulator for myocardium structure and electrophysiology. Input to the program is a model file specifying the types of cells, action potential shape, refractory period, 3D placement of the cells, relation of parameters to other functions, pacing, and electrode placement. Output is a file describing depolarisation and repolarisation of the myocardium and electrograms. A graphic tool for interactive inspection of the output file is also included.
[*] CloneIt! is an ANSI C program that finds sub-cloning strategies, in-frame deletions and frameshifts using restriction enzymes and DNA polymerases.
[*] Clustal W is a general purpose multiple alignment program for DNA or proteins. See also http://www.no.embnet.org/clustalw.html
[*] DCSE (Dedicated Comparative Sequence Editor) is a multiple alignment editor. It can be used to edit protein, DNA or RNA alignments. The structure of the molecules can be incorporated in the alignment.
[*] New! DNA/GUI (DNA Graphical User Interface) is an interactive software system for rapid and efficient analysis of images of the types used in genome mapping, such as autoradiograms and electrophoretic gels.
[*] DomainFinder is an interactive program for the determination and characterization of dynamical domains in proteins.
[*] Dotter is a dot-matrix program with interactive greyscale rendering for genomic DNA and protein sequence analysis. It is a graphical dotplot program for detailed comparison of two sequences: every residue in one sequence is compared to every residue in the other sequence. The first sequence runs along the x-axis and the second sequence along the y-axis. In regions where the two sequences are similar to each other, a row of high scores runs diagonally across the dot matrix. If you compare a sequence against itself to find internal repeats, the main diagonal scores maximally, since it is the 100% perfect self-match.
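The dot-matrix idea described for Dotter can be sketched in a few lines of Python. This is only an illustration of the general technique, not Dotter's implementation: the function names are hypothetical, and scoring is a plain count of identical residues in a sliding diagonal window rather than the score matrices and greyscale rendering a real dotplot program would use.

```python
def dot_matrix(seq_x, seq_y, window=1):
    """Score every pair of positions in two sequences.

    matrix[y][x] is the number of identical residues in the diagonal
    window of the given length that starts at seq_x[x] and seq_y[y].
    """
    rows = len(seq_y) - window + 1
    cols = len(seq_x) - window + 1
    matrix = []
    for y in range(rows):
        row = []
        for x in range(cols):
            score = sum(
                1 for k in range(window) if seq_x[x + k] == seq_y[y + k]
            )
            row.append(score)
        matrix.append(row)
    return matrix


def render(matrix, threshold):
    """Turn the score matrix into text rows: '*' at or above threshold."""
    return [
        "".join("*" if score >= threshold else "." for score in row)
        for row in matrix
    ]


if __name__ == "__main__":
    # Self-comparison: the main diagonal is the perfect self-match.
    for line in render(dot_matrix("ACGT", "ACGT", window=2), threshold=2):
        print(line)
```

A longer window suppresses isolated chance matches, so similar regions stand out as unbroken diagonals.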
[*] Dynamite is a code-generating language developed for sequence comparison methods. Dynamic programming methods are used for many aspects of sequence analysis, including the common Smith-Waterman algorithm and profile or HMM searches. Dynamite generates the code for these algorithms automatically. It is probably useful only to the limited number of programmers who want to develop new algorithms based on dynamic programming for sequence comparison.
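For readers unfamiliar with the kind of algorithm Dynamite generates, here is a hand-written Python sketch of the Smith-Waterman local alignment score. It does not show Dynamite's own language or output; the function name and the scoring values (match, mismatch, and a linear gap penalty) are assumptions chosen for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (linear gap penalty).

    Only two rows of the dynamic programming matrix are kept, so
    memory use is proportional to len(b).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score = max(
                0,                          # start a new local alignment
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            )
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best
```

Recovering the alignment itself requires keeping the full matrix and tracing back from the best-scoring cell, which is omitted here.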
[*] New! E-CELL is a modeling and simulation environment for biochemical and genetic processes. The E-CELL system allows a user to define functions of proteins, protein-protein interactions, protein-DNA interactions, regulation of gene expression and other features of cellular metabolism, as a set of reaction rules.
[GPL] EMBOSS (European Molecular Biology Open Software Suite) is a collaboration of European biological software developers which aims to integrate a range of currently available packages and tools for sequence analysis into a general, publicly available, suite of programs and libraries.
[*] Evalign is a computerised evaluation method for protein sequence alignment algorithms. It compares computer alignments with structural alignments of protein sequences and reports how accurate your sequence alignment program is. It gives additional information on the characteristics of different alignment algorithms and shows the specific types of failings in an alignment. It introduces a parameter termed the 'position shift rate' (psr, as a unit) for the comparison of sequence alignments.
The psr is derived from the number of incorrect alignment positions in comparison with the correct structural alignments. It is an alternative to simply counting the number of identically positioned residues matched in alignments, and it is conceptually easy to comprehend.
See also ftp://ftp.mrc-lmb.cam.ac.uk/pub/genomes/jong/Software/Evalign
[*] Fasta is a set of tools for biological sequence analysis, including programs for library search and local homology. The FASTA package provides a modular set of sequence comparison programs that can run on conventional single-processor computers or in parallel on multiprocessor computers. Seven programs are currently available: fasta3, fastx3, fasty3, tfastx3, tfasty3, tfasta3, and ssearch3. All of the comparison programs share a set of basic command-line options; additional options are available for individual comparison functions.
[*] fastDNAml is a program derived from Joseph Felsenstein's version 3.3 DNAML (part of his PHYLIP package). Users should consult the documentation for DNAML before using this program. fastDNAml is an attempt to solve the same problem as DNAML, but to do so faster and using less memory, so that larger trees and/or more bootstrap replicates become tractable. Much of fastDNAml is merely a recoding of the PHYLIP 3.3 DNAML program from PASCAL to C. See also http://ftp.sunet.se/pub/molbio/evolve/
[*] New! gBioSeq is a graphical multiple DNA sequence alignment editor for GTK/Linux. It provides an easy-to-use interface for manual alignment, like SeqApp on Macintosh. It is currently available only in French.
[*] New! GDE (Genetic Data Environment) is a set of programs for multiple sequence alignment and analysis. The programs use an expandable user interface which allows the addition of external analysis functions without any rewriting of code. Analysis functions can be written in any language (C, Fortran, Pascal, Basic, shell scripts), and can be seamlessly merged into the system's menu/dialog box interface.
[*] GEANFAMMER (GEnome ANalysis and Protein FAMily MakER) refers to a perl5 program, a suite of perl5 programs, a perl5 module or a perl5 subroutine library. It has been developed for the analysis of most of the complete bacterial genomes announced since 1995. It covers the whole procedure of preparing statistically and biologically more relevant protein (sequence) duplication modules ahead of further biological analysis such as structure and function assignment. With it, anybody can easily analyse the duplication level and types of sequence families in any genome or database.
[CM] GeneMark is a package for finding gene locations within unannotated sequence texts using mathematical models of gene encoding. The package is available for a wide number of computing platforms and features RBS site analysis, sequence frameshift detection, easy to read graphical output and reports, and statistics for a wide array of organisms.
[CM] New! GeneMine is the first expert bioinformatics data mining system to automatically query multiple independent sources. It has been designed to provide users with a set of features that can be readily customized to their specifications. It can also be configured to take advantage of new sequence alignment algorithms, protein structure modeling algorithms, and other analysis tools.
[CM] GERM (Genetically Evolved Receptor Models) develops three-dimensional atom-based models of previously uncharacterized receptor sites, starting from a small structure-activity series of active and inactive compounds (3 or more structures). The program is described in J. Medicinal Chem., 37, 2527-2536 (1994). Available from D. Eric Walters, Ph.D., Associate Professor, Biological Chemistry, Finch University of Health Sciences/The Chicago Medical School, 3333 Green Bay Road, North Chicago, IL 60064 USA (Telephone: + 1 708 578 3000, ext. 498, Fax: + 1 708 578 3240).
[*] GRS (Genome Retrieval and Segment analysis) is a graphic tool for retrieval and visualization of genome segments from partially or completely sequenced genomes. To facilitate visual identification of conserved genomic motifs, genes are color-coded according to their presumed functional roles. Aligned genes can be rapidly screened for potential homology by automatic retrieval and alignment of the corresponding protein sequences.
[*] G_neuron is a single-neuron simulator for the Hodgkin-Huxley model, with many transmembrane currents implemented.
[*] HMMER: Hidden Markov models (HMMs) are statistical models of the primary structure consensus of a sequence family. Anders Krogh, David Haussler, and co-workers at UC Santa Cruz introduced a form of HMM which is well-suited to protein and DNA sequence analysis, adopting HMM techniques which have been used for years in speech recognition. HMMs had been used in biology before -- notably, for modeling protein structure -- but the Krogh paper had a particularly dramatic impact.
[GPL] Loci Project is an extremely modular and extensible system of clients and servers for distributing bioinformatics programs and databases. We are concentrating first on making GUI wrappers for command-line programs and databases that run on UNIX-type systems, of which countless exist. We are using the highly object-oriented scripting language Python as the base language.
[*] MOLPHY is a program package for MOLecular PHYlogenetics. It contains several C and Perl programs.
[*] Naccess is a stand-alone program that calculates the accessible area of a molecule from a PDB (Protein Data Bank) format file. It can calculate the atomic and residue accessibilities for both proteins and nucleic acids, and is available for free from this site for researchers at academic and non-profit-making institutions.
[*] NCBI Toolkit (the NCBI Software Development Toolkit) was developed for the production and distribution of GenBank, Entrez, BLAST, and related services by NCBI, and it can also be used to help write biology programs.
[*] NeuronC is a computer language for simulating biophysically based neural circuits, with a focus on circuitry for vision. It allows the user to construct a model of a large neural circuit (up to 100,000 compartments) and set up an experimental protocol including 2D visual stimulus (spots, bars, gratings, wavelength, intensity, time) and recording (voltage clamp, current clamp, channel state, etc.).
[*] NJplot is a tree-drawing program able to draw any binary tree expressed in the standard phylogenetic tree format (e.g., the format used by the PHYLIP package). NJplot is especially convenient for rooting the unrooted trees obtained from parsimony, distance or maximum likelihood tree-building methods. Any rooting of the unrooted tree can be interactively specified using the mouse. NJplot also allows zooming, branch swapping, display of bootstrap scores and printing in the PostScript format. NJplot can therefore be used as a graphical extension of any phylogenetic program package that uses the standard tree format for storing trees (i.e., most such packages).
[*] PAML is a program package for phylogenetic analyses of DNA or protein sequences using maximum likelihood.
[*] PHYLIP (the PHYLogeny Inference Package) is a package of programs for inferring phylogenies (evolutionary trees).
[*] PHYLO_WIN is a graphic tool for molecular phylogenetic inference. It displays a sequence alignment and allows easy selection of the sequences and sites to analyse. Trees are built with the most common tree-making algorithms, including maximum parsimony, maximum likelihood and numerous distance-based methods. See also the original site
[*] Protein Structure Software is a collection of programs and libraries, some of general applicability and some for use with protein structure. The programs are:
  • Torsions -- A C program for calculating phi, psi and omega torsions and C-alpha pseudo-torsions.
  • ProFit -- Designed to be the ultimate(?) protein least squares fitting program.
  • QTree -- A program for generating CPK, Ball and Stick and worms pictures of molecules using the quad-tree algorithm.
  • MINT -- A graphical user interface to Andrej Sali's Modeller program.
  • Cluster -- A program for performing cluster analysis on an arbitrary set of vectors.
  • cdoc -- automatic generation of HTML documentation of C source code.
The author also developed an extensive library of C routines for the manipulation of protein structures. The routines are extensively documented within the code, but no separate documentation is currently available. The routines include reading and writing of PDB and CSSR files, rotation, translation, resequencing, calculations, fitting, adding hydrogens, ordering, sequence reading and alignment, etc., etc.
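As an illustration of the torsion calculation that a program like Torsions performs, the following Python sketch computes the dihedral angle defined by four points. It is not taken from the listed C code; the function names are hypothetical and the points are plain (x, y, z) tuples rather than atoms read from a PDB file.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def torsion(p1, p2, p3, p4):
    """Dihedral angle in degrees about the p2-p3 bond, in (-180, 180].

    0 corresponds to the cis (eclipsed) arrangement of p1 and p4,
    180 to the trans arrangement.
    """
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b3 = _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal to the plane through p1, p2, p3
    n2 = _cross(b2, b3)  # normal to the plane through p2, p3, p4
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

For a protein backbone, phi would use the atoms C(i-1), N(i), CA(i), C(i) and psi the atoms N(i), CA(i), C(i), N(i+1), passed in that order.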
[*] PUZZLE is a computer program to reconstruct phylogenetic trees from molecular sequence data by maximum likelihood. It implements a fast tree search algorithm, quartet-puzzling, that allows analysis of large data sets and automatically assigns estimations of support to each internal branch. PUZZLE also computes pairwise maximum likelihood distances as well as branch lengths for user specified trees. In addition, PUZZLE offers a novel method, likelihood-mapping, to investigate the support of a hypothesized internal branch without computing an overall tree. The models of substitution provided by PUZZLE are TN, HKY, F84, Km, JC, SH for nucleotides and Dayhoff, JTT, mtREV24 for amino acids. Rate heterogeneity is modelled by a discrete Gamma distribution and by allowing invariable sites. The corresponding parameters can be inferred from the data set.
[*] RASSE is a structure-based method for de novo drug design. A detailed description of the methodology can be found in J. Chem. Inf. Comput. Sci., 1996, 36, 1187-1194.
[*] SAM (System for Assembling Markers) takes as input a file of clones and their marker content. The output is a graphical display which shows the order of the markers and the alignment of the clones to the markers. A solution can be saved in a file which can subsequently be loaded back into SAM.
[*] SCWRL (SideChain placement With a Rotamer Library) is a program for adding sidechains to a protein backbone based on the backbone-dependent rotamer library. The library provides lists of chi1-chi2-chi3-chi4 values and their relative probabilities for residues at given phi-psi values, and the program explores these conformations to minimize sidechain-backbone clashes and sidechain-sidechain clashes. You may get output from the program at any of three steps (best library rotamers, no clashes relieved; backbone clashes relieved; backbone and sidechain-sidechain clashes relieved). The current version of the program predicts chi1 correctly for 80.0% of all residues on native backbones in a test set of 316 proteins (see results). There are several methods for entering the desired sequence of residues, as well as ways of indicating that pieces of the structure are to remain static.
[*] SC_Rate (Sequence id Composition id rate) is a general method to evaluate the accuracy of protein sequence alignments with very low sequence identity. High composition identity at relatively low sequence identity is found to have an adverse effect on protein sequence alignment accuracy. A reliability index derived from sequence and composition identity (defined as the SC rate) was evaluated against the accuracy of each alignment position in a database of sequence alignments compared with the corresponding structural alignments. This showed that there is a correlation between the SC rate and the alignment error rate for each alignment position, indicating that it can be used to predict an alignment's reliability systematically. SC_Rate is a Perl5 program (available both in Perl and C binary format), which produces lines of predictions for the reliability of any alignment (both pairwise and multiple).
[*] SEALS (A System for Easy Analysis of Lots of Sequences) is a software package expressly designed for large-scale research projects in bioinformatics. Using a friendly, scalable command-line user interface, SEALS provides dozens of commands to help the user quickly implement standard sequence analysis protocols, design new investigations, and generally Get Things Done with dispatch.
[*] SeqPup is a biological sequence editor and analysis program. It includes links to network services and external analysis programs. It is usable on common computer systems, including Macintosh, MS-Windows and X-Windows. Features include:
  • multiple sequence alignment and single sequence editing
  • read and write several sequence file formats
  • sequence feature editing, and marking in prints
  • translate dna/amino, consensus, reverse/complement, distance and similar functions
  • pretty print of alignments and sequences with boxed and shaded regions
  • Internet sequence analysis (fetch sequences using SRS keyword search, NCBI-BLAST similarity search)
[*] Sequin is a program designed to aid in the submission of sequences to the GenBank, EMBL, and DDBJ sequence databases. It is capable of handling simple submissions which contain a single short mRNA sequence, and complex submissions containing long sequences, multiple annotations, segmented sets of DNA, or phylogenetic and population studies.
[*] Swiss PDBViewer is an application that provides a user-friendly interface for analysing several proteins at the same time. The proteins can be superimposed in order to deduce structural alignments and compare their active sites or any other relevant parts. Amino acid mutations, H-bonds, angles, energy minimization, and distances between atoms are easy to obtain thanks to the intuitive graphic and menu interface.
[*] tacg is a character-based, command-line tool for the restriction enzyme analysis of DNA on Unix-like operating systems. See also WWWtacg, the WWW interface to tacg.
[*] Wise2 (The new Wise Package) is a series of modules written in C for biological sequence processing. Much of Wisetools is based around profile matching to DNA sequences. Wisetools includes an algorithm that allows a protein profile to be compared to a DNA sequence using all six translation frames and allowing for frame shifts in the forward and backward frames.
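The six translation frames mentioned for Wise2 can be illustrated with a short Python sketch. It is not Wise2's algorithm: the function names are hypothetical, the standard genetic code is assumed, and the frameshift handling that Wise2 adds on top is not modelled.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Return translations for frames +1..+3 and -1..-3."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

For example, `six_frames("ATGGCC")` gives `"MA"` in frame +1 and `"GH"` in frame -1, the latter read from the reverse complement `GGCCAT`.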