EMBOSS: Applications not found in GCG

 

This is a rough list of the EMBOSS applications that have functions which are not covered by any GCG program.

Many EMBOSS programs have been excluded from this list because they do most of what a GCG program does, even though they offer some extra functionality that GCG lacks. For example, 'showseq' has been excluded because most of what it does is very similar to the GCG program 'publish', even though it can mark up the sequence display with annotations from the feature table, which the GCG program cannot.

It is therefore worth investigating the corresponding EMBOSS program carefully to see what extra things it can do that the GCG program cannot.

Note that all EMBOSS programs can read sequences in any of the standard sequence formats known to EMBOSS. An EMBOSS program tests each sequence it reads to determine its format, so you can use sequences in GCG, MSF, GenBank, EMBL, FASTA and other formats.

By default, EMBOSS programs write sequences in FASTA format, but you can explicitly tell an EMBOSS program to write in a specified format, or you can set the default format that all EMBOSS programs use when writing sequences.
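As a rough illustration of the two paragraphs above, the following Python sketch builds a 'seqret' command line that leaves the input format for EMBOSS to detect and names the output format explicitly. The file names are hypothetical, the helper functions are invented for this example, and the use of the '-osformat' qualifier and the EMBOSS_OUTFORMAT environment variable reflects common EMBOSS usage that you should check against your local installation.

```python
import os
import shutil
import subprocess


def seqret_command(infile, outfile, out_format=None):
    """Build a seqret command line (hypothetical helper for this example).

    No input format is given: EMBOSS is left to detect it.
    """
    cmd = ["seqret", "-sequence", infile, "-outseq", outfile]
    if out_format is not None:
        # Assumption: -osformat names the output sequence format explicitly.
        cmd += ["-osformat", out_format]
    return cmd


def env_with_default_format(out_format):
    """Return a copy of the environment with a default output format set.

    Assumption: EMBOSS reads EMBOSS_OUTFORMAT as the default output format.
    """
    env = dict(os.environ)
    env["EMBOSS_OUTFORMAT"] = out_format
    return env


# Hypothetical file names, used only for illustration.
cmd = seqret_command("example.gcg", "example.embl", out_format="embl")
print(" ".join(cmd))

# Run the command only if EMBOSS is installed and the input file exists.
if shutil.which("seqret") and os.path.exists("example.gcg"):
    subprocess.run(cmd + ["-auto"], check=True)
```

Calling seqret_command without an out_format leaves the output in whatever default applies, which is FASTA unless a different default has been set.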

The full list of EMBOSS applications is here, and the applications grouped by function are here.

The EMBOSS developers are always interested in hearing your requirements and suggestions for improvements. They can be contacted via the mailing lists.

Note also that the source code of the EMBOSS programs is available for your local experts to inspect and modify to meet your local requirements.

Program name Description
abiview Reads an ABI file and displays the trace
antigenic Finds antigenic sites in proteins
banana Bending and curvature plot in B-DNA
biosed Replace or delete sequence sections
btwisted Calculates the twisting in a B-DNA sequence
cai CAI codon adaptation index
chaos Create a chaos game representation plot for a sequence
charge Protein charge plot
checktrans Reports STOP codons and ORF statistics of a protein
chips Codon usage statistics
coderet Extract CDS, mRNA and translations from feature tables
cpgplot Plot CpG rich areas
cpgreport Reports all CpG rich regions
cutseq Removes a specified section from a sequence
dbiblast Index a BLAST database
dbifasta Index a fasta database
dbiflat Index a flat file database
degapseq Removes gap characters from sequences
descseq Alter the name or description of a sequence
diffseq Find differences between nearly identical sequences
dotpath Displays a non-overlapping wordmatch dotplot of two sequences
dreg Regular expression search of a nucleotide sequence
embossversion Writes the current EMBOSS version number
emowse Protein identification by mass spectrometry
entret Reads and writes (returns) flatfile entries
est2genome Align EST and genomic DNA sequences
extractfeat Extract features from a sequence
extractseq Extract regions from a sequence
findkm Find Km and Vmax for an enzyme reaction by a Hanes/Woolf plot
fuzztran Protein pattern search after translation
garnier Predicts protein secondary structure
geecee Calculates the fractional GC content of nucleic acid sequences
getorf Finds and extracts open reading frames (ORFs)
infoalign Information on a multiple sequence alignment
infoseq Displays some simple information about sequences
isochore Plots isochores in large DNA sequences
listor Writes a list file of the logical OR of two sets of sequences
marscan Finds MAR/SAR sites in nucleic sequences
maskfeat Mask off features of a sequence
maskseq Mask off regions of a sequence
megamerger Merge two large overlapping nucleic acid sequences
merger Merge two overlapping nucleic acid sequences
mwcontam Shows molwts that match across a set of files
mwfilter Filter noisy molwts from mass spec output
newcpgreport Report CpG rich areas
newcpgseek Reports CpG rich regions
notseq Excludes a set of sequences and writes out the remaining ones
nthseq Writes one sequence from a multiple set of sequences
octanol Displays protein hydropathy
oddcomp Finds protein sequence regions with a biased composition
pasteseq Insert one sequence into another
pepinfo Plots simple amino acid properties in parallel
pepnet Displays proteins as a helical net
pepstats Protein statistics
pepwindow Displays protein hydropathy
pepwindowall Displays protein hydropathy of a set of sequences
pestfind Finds PEST motifs as potential proteolytic cleavage sites
polydot Displays all-against-all dotplots of a set of sequences
preg Regular expression search of a protein sequence
pscan Scans proteins using PRINTS
recoder Remove restriction sites but maintain the same translation
redata Search REBASE for enzyme name, references, suppliers etc
restover Finds restriction enzymes that produce a specific overhang
seealso Finds programs sharing group names
seqmatchall Does an all-against-all comparison of a set of sequences
seqretsplit Reads and writes (returns) sequences in individual files
showdb Displays information on the currently available databases
showfeat Show features of a sequence
sigcleave Reports protein signal cleavage sites
silent Silent mutation restriction enzyme scan
sirna Finds siRNA duplexes in mRNA
skipseq Reads and writes (returns) sequences, skipping the first few
splitter Split a sequence into (overlapping) smaller sequences
stssearch Searches a DNA database for matches with a set of STS primers
supermatcher Finds a match of a large sequence against one or more sequences
tmap Displays membrane spanning regions
tranalign Align nucleic coding regions given the aligned proteins
trimest Trim poly-A tails off EST sequences
trimseq Trim ambiguous bits off the ends of sequences
twofeat Finds neighbouring pairs of features in sequences
vectorstrip Strips out DNA between a pair of vector sequences
wossname Finds programs by keywords in their one-line documentation