ermineJ CLI

This page is intended for users who are already comfortable using a command line shell.

In addition to providing a scriptable interface to the software, the ErmineJ CLI provides access to a few less-used features that are not accessible through the graphical user interface (GUI). The CLI can also be used to start the GUI.

To access the CLI, you need to have installed the generic bundle. See the instructions.

Your environment must define the JAVA_HOME variable (the directory where Java is installed, e.g. /usr/lib/java) and the ERMINEJ_HOME variable (pointing to the ermineJ installation directory). You may want to add $ERMINEJ_HOME/bin to your PATH so you can run the scripts with less typing.
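As a minimal sketch of that setup for a POSIX shell: the Java path is the example given above, while the ermineJ installation directory ($HOME/ermineJ) is a hypothetical location that you would replace with your own.

```shell
# Hypothetical locations; substitute the paths used on your own system.
export JAVA_HOME=/usr/lib/java
export ERMINEJ_HOME="$HOME/ermineJ"

# Optional: put the launcher scripts on the PATH so they can be run by name.
export PATH="$ERMINEJ_HOME/bin:$PATH"
```

Adding these lines to your shell startup file (for example ~/.profile) would make the settings persistent across sessions.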

Once you have set up the package, you should be able to run ermineJ with ermineJ.bat (Windows) or ermineJ.sh. The rest of these instructions assume a *nix platform (e.g. Linux, Mac OS X); using the included ermineJ.bat file is analogous.

Usage

See further down the page for an example.

 $ERMINEJ_HOME/bin/ermineJ.sh  [-A] [-a <file>] [-b] [-batch <scoreFileList>]
       [-C <config file>] [-c <file>] [-d <directory>] [-e <integer>] [-F]
       [-f <directory>] [-G] [-g <BEST|MEAN>] [-h] [-i <iterations>] [-j] [-l]
       [-M <value>] [-m <option>] [-n <value>] [-nomf] [-o <output file>]
       [-q <quantile>] [-r <data file>] [-S <file>] [-s <score file>]
       [-t <threshold>] [-x <maxClassSize>] [-y <minClassSize>]

Options

The following options are supported:

 -A,--affy                     Affymetrix annotation file format
 -a,--annots <file>            Annotation file to be used [required unless using GUI]
 -b                            Sets 'big is better' option for gene scores to true [default = false]
 -batch <scoreFileList>        Batch process score files from a list, one per line. Incompatible
                               with -o, -s, -G
 -C,--config <config file>     Configuration file to use (saves typing); additional options given on
                               the command line override those in the file. If you don't use this option, no configuration file
                               will be used.
 -c,--classFile <file>         Gene set ('class') file, e.g. GO XML file [required unless using GUI]
 -d <directory>                Data directory; default is your ermineJ.data directory
 -e,--scoreCol <integer>       Column for scores in input file
 -F,--filterNonSpecific        Filter out non-specific probes (default annotation format only),
                               default=true
 -f <directory>                Directory where custom gene sets are located
 -G,--gui                      Launch the GUI.
 -g,--reps <value>             What to do when genes have multiple scores in input file (due to
                               multiple probes per gene): BEST = best of replicates; MEAN = mean of replicates; default=MEAN
 -h,--help                     Print this message
 -i,--iters <integer>          Number of iterations (GSR and CORR methods only)
 -j,--genesOut                 Output should include gene symbols for all gene sets (default=don't
                               include symbols)
 -l,--logTrans                 Log transform the scores (and change sign; recommended for p-values),
                               default=true
 -M,--mtc <value>              Multiple test correction method: BONFERONNI = Bonferroni FWE,
                               WESTFALLYOUNG = Westfall-Young (slow), BENJAMINIHOCHBERG = Benjamini-Hochberg FDR [default]
 -m,--stats <value>            Method for computing raw class statistics (used for test=GSR only):
                               MEAN (mean), QUANTILE (quantile), MEAN_ABOVE_QUANTILE (mean above quantile), or PRECISIONRECALL (area under
                               the precision-recall curve); default=ORA
 -n,--test <value>             Method for computing gene set significance:  ORA (ORA),  GSR
                               (resampling of gene scores; use with -m to choose algorithm),  CORR (profile correlation),  ROC
                               (ROC)
 -nomf                         Disable multifunctionality correction (default: on)
 -o,--output <file>            Output file name; if omitted, results are written to standard out
 -q,--quantile <integer>       Quantile to use, only used for 'MEAN_ABOVE_QUANTILE', default=50
                               (median)
 -r,--rawData <file>           Raw data file, only needed for profile correlation analysis
 -S,--saveconfig <file>        Save preferences in the specified file
 -s,--scoreFile <file>         Score file, required for all but profile correlation method
 -t,--threshold <value>        Score threshold, only used for ORA; default = 0.001
 -x,--maxClassSize <integer>   Sets the maximum class size; default = 100
 -y,--minClassSize <integer>   Sets the minimum class size; default = 10
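The -batch option expects a list of score files, one per line. The sketch below shows one way to build such a list and pass it to ermineJ. The helper function make_score_list, the list name scoreFileList.txt, and the scores_*.txt naming pattern are all hypothetical. The annotation and gene set file names are borrowed from the example on this page. The ermineJ call is skipped if ermineJ.sh is not on the PATH.

```shell
# Hypothetical helper: write each existing file given as an argument
# into a list file, one path per line (the format -batch expects).
make_score_list() {
    list=$1
    shift
    : > "$list"                      # create or truncate the list file
    for f in "$@"; do
        if [ -e "$f" ]; then
            printf '%s\n' "$f" >> "$list"
        fi
    done
}

# Hypothetical naming pattern for the score files in the current directory.
make_score_list scoreFileList.txt scores_*.txt

# Run the batch only if the launcher is available.
# Note: -batch cannot be combined with -o, -s or -G.
if command -v ermineJ.sh >/dev/null 2>&1; then
    ermineJ.sh -batch scoreFileList.txt \
        -c ~/ermineJ.data/go_daily-termdb.rdf-xml.gz \
        -a ~/ermineJ.data/Generic_human_noParents.an.txt.gz \
        -n ORA -t 0.0001
fi
```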

Example

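A minimal command line that uses the defaults for everything except the three key input files, the choice of method (ORA), and the score threshold (0.0001). Because -o is omitted, the results go to standard output, which is redirected here to results.txt.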

ermineJ.sh -s geneScores.txt -c ~/ermineJ.data/go_daily-termdb.rdf-xml.gz \
    -a ~/ermineJ.data/Generic_human_noParents.an.txt.gz -n ORA -t 0.0001 \
     > results.txt
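A variant of the command above, sketched here for the resampling method, uses only options from the list on this page: -n GSR selects the test, -m MEAN selects the raw statistic, -i sets the number of iterations, and -o names the output file instead of redirecting standard output. The iteration count of 10000 is an arbitrary illustrative value, not a recommendation. The call is skipped with a message if ermineJ.sh is not on the PATH.

```shell
# Collect the arguments once so they are easy to read and adjust.
# Input file names are the ones used in the example above.
set -- \
    -s geneScores.txt \
    -c ~/ermineJ.data/go_daily-termdb.rdf-xml.gz \
    -a ~/ermineJ.data/Generic_human_noParents.an.txt.gz \
    -n GSR -m MEAN \
    -i 10000 \
    -o results.txt

if command -v ermineJ.sh >/dev/null 2>&1; then
    ermineJ.sh "$@"
else
    echo "ermineJ.sh not found on PATH" >&2
fi
```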

Configuration files and the command line

For GUI users, the configuration file is the “settings” file, normally called “erminej.properties” and stored in the user’s home directory. The CLI permits the use of a configuration file (identified with the -C option) instead of setting parameters as arguments to the shell command. This section describes how the CLI interprets this option.

Note that this behavior has changed in recent versions of ermineJ. Also, when running ermineJ from Webstart or the Windows installed version, changes made in the GUI are immediately written to the default configuration file stored in your home directory.

  • If you don’t specify a configuration file, all the options must be supplied on the command line. In previous versions, the configuration file would be read in by default.
  • If you do specify a configuration file, options can be overridden on the command line, but they will not be written into the config file.
  • If you don’t specify a configuration file and use -G to start the GUI, the default configuration file will be written and used as usual: it will be modified by other options you pass in or change in the GUI.
  • If you do specify a configuration file and use -G to start the GUI, the specified config file will NOT be modified.

This makes it possible to reuse customized configuration files consistently, if so desired.
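The rules above suggest a workflow along the following lines. This is a sketch under two assumptions that this page does not confirm: that -S can be combined with an ordinary analysis run, and that a file written by -S can be read back with -C. The file name myAnalysis.properties and the output file names are hypothetical. The calls are skipped if ermineJ.sh is not on the PATH.

```shell
# Hypothetical name for a reusable settings file.
config=myAnalysis.properties

if command -v ermineJ.sh >/dev/null 2>&1; then
    # First run: give all options explicitly and save them with -S
    # (assumes -S may accompany a normal analysis run).
    ermineJ.sh -s geneScores.txt \
        -c ~/ermineJ.data/go_daily-termdb.rdf-xml.gz \
        -a ~/ermineJ.data/Generic_human_noParents.an.txt.gz \
        -n ORA -t 0.0001 \
        -S "$config" -o results.txt

    # Later run: reuse the saved settings with -C. The -t value given here
    # overrides the one in the file for this run only; per the rules above,
    # the override is not written back into the config file.
    ermineJ.sh -C "$config" -t 0.001 -o results_t0.001.txt
fi
```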