
Command Line Tool

Usage
Options
Changes between versions

Usage

$ ruby gene_painter.rb -i <alignment> -p <yaml_files> [<options>]

-i or --input	Path to fasta-formatted multiple sequence alignment
-p or --path	Path to folder containing gene structures in YAML or GFF format
Standard output format	Mark exons by '-' and introns by '|'
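
The call above can also be assembled programmatically. The following is a minimal sketch only: the helper `gene_painter_command` is hypothetical and not part of GenePainter, and the file and folder names are placeholders. It builds the argument list from the two required arguments (-i and -p) plus optional extras and prints the resulting command line.

```ruby
require "shellwords"

# Hypothetical helper (not part of GenePainter): assembles the documented
# required arguments (-i, -p) plus any extra options into an argument list.
def gene_painter_command(alignment, gene_structure_dir, extra_options = [])
  ["ruby", "gene_painter.rb", "-i", alignment, "-p", gene_structure_dir, *extra_options]
end

# File and folder names are placeholders; replace them with your own data.
command = gene_painter_command("alignment.fas", "gene_structures/", ["--intron-phase"])

# Print the command for inspection; system(*command) would actually run it.
puts Shellwords.join(command)
```

Passing the arguments as a list rather than a single string avoids quoting problems when paths contain spaces.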

Options

Text-based output format

--intron-phase	Mark introns by their phase instead of '|'
--phylo	Mark exons by '0' and introns by '1'
--spaces	Mark exons by space (' ') instead of '-'
--no-standard-output	Specify to skip standard output format.
--alignment	Output the alignment file with additional lines containing intron phases
--fuzzy N	Introns at most N base pairs apart from each other are aligned
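
To illustrate how the marker characters differ between the text-based formats, here is a toy sketch. It assumes a gene can be described by an alignment length and a list of intron columns; the helper `render_pattern` is hypothetical, and the real output layout of GenePainter may differ.

```ruby
# Hypothetical helper, not GenePainter code: draws one character per
# alignment column, using the exon marker unless the column carries an intron.
def render_pattern(length, intron_columns, exon_char: "-", intron_char: "|")
  (0...length).map { |col| intron_columns.include?(col) ? intron_char : exon_char }.join
end

introns = [3, 7] # made-up 0-based intron columns

puts render_pattern(10, introns)                                  # standard markers
puts render_pattern(10, introns, exon_char: " ")                  # markers as with --spaces
puts render_pattern(10, introns, exon_char: "0", intron_char: "1") # markers as with --phylo
```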

Graphical output format

--svg	Draw a graphical representation of genes in SVG format.
--svg-format FORMAT	Switch between different formats. FORMAT must be one of "normal", "reduced" or "both". "normal" draws details of aligned exons and introns [default]; "reduced" focuses on common introns only; "both" draws both formats.
--pdb FILE	Mark consensus or merged gene structure in PDB file FILE. The consensus gene structure contains introns conserved in N % of all genes; specify N with option --consensus N [default: 80%]. Two scripts for execution in PyMOL are provided: 'color_exons.py' to mark consensus exons and 'color_splicesites.py' to mark splice junctions of consensus exons.
--pdb-chain CHAIN	Mark gene structures for chain CHAIN. [default: Use chain A]
--pdb-ref-prot PROT	Use protein PROT as reference for alignment with pdb sequence. [default: First protein in alignment]
--pdb-ref-prot-struct	Color only intron positions occurring in the reference protein structure.
--tree	Generate Newick tree file and SVG representation

Meta information and statistics

--consensus N	Mark all introns conserved in N % of genes. Specify N as decimal number between 0 and 1.
--merge	Merge all introns into a single exon intron pattern
--statistics	Output additional file with statistics about common introns. To include information about taxonomy, specify options --taxonomy and --taxonomy-to-fasta.
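
The idea behind --consensus and --merge can be sketched as follows. This is an illustration under stated assumptions, not GenePainter's implementation: intron positions are modelled as plain alignment columns, the gene names and positions are made up, and "conserved in N % of genes" is read as "present in at least that fraction of genes".

```ruby
# Hypothetical sketch: introns_by_gene maps a gene name to its intron columns.
# Returns the columns found in at least `fraction` (0..1) of all genes.
def consensus_introns(introns_by_gene, fraction)
  counts = Hash.new(0)
  introns_by_gene.each_value do |positions|
    positions.uniq.each { |pos| counts[pos] += 1 }
  end
  min_genes = fraction * introns_by_gene.size
  counts.select { |_pos, n| n >= min_genes }.keys.sort
end

# Hypothetical sketch of a merged pattern: every intron seen in any gene.
def merged_introns(introns_by_gene)
  introns_by_gene.values.flatten.uniq.sort
end

genes = {
  "GeneA" => [12, 40, 77],
  "GeneB" => [12, 40],
  "GeneC" => [12, 77],
  "GeneD" => [12, 40]
}

p consensus_introns(genes, 0.75) # columns shared by at least 3 of 4 genes
p merged_introns(genes)          # union of all intron columns
```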

Taxonomy

--taxonomy FILE	Use this option to mark introns by taxonomy. FILE is either an NCBI taxonomy database dump file or an excerpt of the NCBI taxonomy. Lineage must be a semicolon-separated list of taxa from root to species.
--taxonomy-to-fasta FILE	Text-based file mapping gene structure file names to species names. Each entry consists of one or more genes, given as a semicolon-separated list, followed by the species name. The delimiter between gene list and species name must be a colon. The species name itself must be enclosed by double quotes like this: "SPECIES"
--taxonomy-common-to X,Y,Z	Mark introns common to taxa X,Y,Z. List must consist of at least one NCBI taxon (scientific name)
--[no-]exclusively-in-taxa	Mark introns occurring (not) exclusively in listed taxa. [default: not exclusively]
--introns-per-taxon	Mark newly gained introns for every inner node in taxonomy.
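
The mapping format expected by --taxonomy-to-fasta can be checked with a few lines of Ruby. The parser below is a sketch based only on the description above (semicolon-separated genes, a colon, then the species name in double quotes); the function name and the example gene names are hypothetical, and GenePainter's own parser may accept or reject other variants.

```ruby
# Hypothetical parser for one mapping line of the form
#   Gene1;Gene2:"Species name"
# Returns [gene_names, species] or nil if the line does not fit that shape.
def parse_mapping_line(line)
  match = line.strip.match(/\A(.+):"([^"]+)"\z/)
  return nil unless match

  genes = match[1].split(";").map(&:strip).reject(&:empty?)
  [genes, match[2]]
end

# Made-up example entries.
p parse_mapping_line('HsMyo1;HsMyo2:"Homo sapiens"')
p parse_mapping_line('MmMyo1:"Mus musculus"')
p parse_mapping_line("HsMyo1 Homo sapiens") # no colon or quotes, so nil
```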

Parse NCBI taxonomy

--no-grep	Read the NCBI taxonomy dump into RAM. This will require some additional hundred MBs of RAM. [default: taxonomy dump is parsed with grep calls]
--nice	Run grep calls with lower priority. Please make sure to have nice in your executable path when using this option.

Analysis and output of all or subset of data

--analyse-all-output-all	Analyse all data and provide full output [default]
--analyse-all-output-selection	Analyse all data and provide text-based and graphical output for selection only. All introns are analysed, including those not present in selection
--analyse-selection-output-selection	Analyse selected data and provide output for selection only
--analyse-selection-on-all-data-output-selection	Analyse intron positions of selected data in all data and provide output for selection only. Introns present in selection are analysed in all data

Selection criteria for data and output selection

--select-all	No selection applied (default)
--selection-based-on-regex "REGEX"	Regular expression applied on gene structure file names. Regex must be enclosed by double quotes
--selection-based-on-list X,Y,Z	List of gene structures to be used
--selection-based-on-species SPECIES	Use all gene structures associated with species. Specify also --taxonomy-to-fasta to map gene structure file names to species names
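
Before running an analysis, it can help to preview which gene structure files a selection would pick. The sketch below mimics --selection-based-on-regex and --selection-based-on-list on a made-up list of file names. Both helpers are hypothetical; in particular, whether GenePainter matches names with or without the file extension is an assumption here.

```ruby
# Hypothetical preview of a regex-based selection on file names.
def select_by_regex(file_names, pattern)
  regex = Regexp.new(pattern)
  file_names.select { |name| name.match?(regex) }
end

# Hypothetical preview of a list-based selection ("X,Y,Z").
# Assumes list entries are file names without extension.
def select_by_list(file_names, list)
  wanted = list.split(",").map(&:strip)
  file_names.select { |name| wanted.include?(File.basename(name, ".*")) }
end

files = ["HsMyo1.yaml", "HsMyo2.yaml", "MmMyo1.yaml", "DmMyo1.gff"] # made-up names

p select_by_regex(files, "^Hs")
p select_by_list(files, "HsMyo1,DmMyo1")
```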

General options

-o or --outfile FILENAME	Prefix of the output files.
--path-to-output PATH	Path to the location where output files should be stored.
--range START,STOP	Restrict genes to range START-STOP in alignment
--[no-]delete-range	(Not) Delete specified range
--keep-common-gaps	Keep common gaps in alignment. This option affects only the output of --alignment
--no-best-position-introns	Plot introns always onto beginning of a gap. Default: Align introns if their position differs by alignment gaps only
--[no-]separate-introns-in-textbased-output	(Not) Separate each consecutive pair of introns by an exon placeholder in text-based output formats. Default: Separate introns unless the output lines get too long.
-h or --help	List all options available.

For a complete list of all options available, please refer to the documentation.

Changes in command line parameters from v.1.0 to v.2.0

v.1.0 parameter	v.2.0 parameter
-a	--alignment
-n	--intron-phase
-phylo	--phylo
-s	--spaces
-svg WIDTH,HEIGHT FORMAT	--svg and --svg-format
-start START and -stop STOP	--range START,STOP
-pdb	--pdb
-pdb_prot	--pdb-ref-prot
-ref_prot_struct	--pdb-ref-prot-struct
-consensus	--consensus, no longer restricted to combination with -pdb
-f and -penalize_endgaps	obsolete
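
For scripts still using v.1.0 parameters, the one-to-one renames in the table above can be applied mechanically. The following is a rough sketch with a hypothetical helper name. It renames only flags with a direct v.2.0 counterpart and reports the ones that need manual attention (-svg, -start, -stop and the obsolete flags), since their argument layout changed or they were removed.

```ruby
# Hypothetical migration helper, not shipped with GenePainter.
# Returns [translated_args, flags_needing_manual_review].
def translate_v1_args(args)
  renamed = {
    "-a" => "--alignment",
    "-n" => "--intron-phase",
    "-phylo" => "--phylo",
    "-s" => "--spaces",
    "-pdb" => "--pdb",
    "-pdb_prot" => "--pdb-ref-prot",
    "-ref_prot_struct" => "--pdb-ref-prot-struct",
    "-consensus" => "--consensus"
  }
  # Flags whose arguments changed shape or that are obsolete in v.2.0.
  manual = ["-svg", "-start", "-stop", "-f", "-penalize_endgaps"]

  needs_review = args.select { |arg| manual.include?(arg) }
  translated = args.map { |arg| renamed.fetch(arg, arg) }
  [translated, needs_review]
end

translated, review = translate_v1_args(["-a", "-n", "-start", "10", "-stop", "50"])
p translated
p review
```

Flags listed for review are left untouched in the translated arguments so that nothing is silently dropped.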