\input texinfo
@setfilename GNU_Parallel_Tutorial.info

@documentencoding utf-8

@settitle GNU Parallel Tutorial

@node Top
@top GNU Parallel Tutorial

@menu
* GNU Parallel Tutorial::
* Prerequisites::
* Input sources::
* Building the command line::
* Controlling the output::
* Controlling the execution::
* Remote execution::
* Saving to an SQL base (advanced)::
* --pipe::
* Shebang::
* Semaphore::
* Informational::
* Profiles::
* Spread the word::
@end menu

@node GNU Parallel Tutorial
@chapter GNU Parallel Tutorial

This tutorial shows off much of GNU @strong{parallel}'s functionality. It is
meant to teach the options in GNU @strong{parallel}, not to show realistic
examples from the real world.

Spend an hour walking through the tutorial. Your command line will
love you for it.

@node Prerequisites
@chapter Prerequisites

To run this tutorial you must have the following:

@table @asis
@item parallel >= version 20140622
@anchor{parallel >= version 20140622}

Install the newest version with:

@verbatim
  (wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash
@end verbatim

This will also install the newest version of the tutorial:

@verbatim
  man parallel_tutorial
@end verbatim

Most of the tutorial will work on older versions, too.

@item abc-file:
@anchor{abc-file:}

The file can be generated by:

@verbatim
  parallel -k echo ::: A B C > abc-file
@end verbatim

@item def-file:
@anchor{def-file:}

The file can be generated by:

@verbatim
  parallel -k echo ::: D E F > def-file
@end verbatim

@item abc0-file:
@anchor{abc0-file:}

The file can be generated by:

@verbatim
  perl -e 'printf "A\0B\0C\0"' > abc0-file
@end verbatim

@item abc_-file:
@anchor{abc_-file:}

The file can be generated by:

@verbatim
  perl -e 'printf "A_B_C_"' > abc_-file
@end verbatim

@item tsv-file.tsv
@anchor{tsv-file.tsv}

The file can be generated by:

@verbatim
  perl -e 'printf "f1\tf2\nA\tB\nC\tD\n"' > tsv-file.tsv
@end verbatim

@item num8
@anchor{num8}

The file can be generated by:

@verbatim
  perl -e 'for(1..8){print "$_\n"}' > num8
@end verbatim

@item num128
@anchor{num128}

The file can be generated by:

@verbatim
  perl -e 'for(1..128){print "$_\n"}' > num128
@end verbatim

@item num30000
@anchor{num30000}

The file can be generated by:

@verbatim
  perl -e 'for(1..30000){print "$_\n"}' > num30000
@end verbatim

@item num1000000
@anchor{num1000000}

The file can be generated by:

@verbatim
  perl -e 'for(1..1000000){print "$_\n"}' > num1000000
@end verbatim

@item num_%header
@anchor{num_%header}

The file can be generated by:

@verbatim
  (echo %head1; echo %head2; perl -e 'for(1..10){print "$_\n"}') > num_%header
@end verbatim

@item For remote running: ssh login on 2 servers with no password in $SERVER1 and $SERVER2
@anchor{For remote running: ssh login on 2 servers with no password in $SERVER1 and $SERVER2}

@verbatim
  SERVER1=server.example.com
  SERVER2=server2.example.net
@end verbatim

You must be able to:

@verbatim
  ssh $SERVER1 echo works
  ssh $SERVER2 echo works
@end verbatim

It can be set up by running 'ssh-keygen -t dsa; ssh-copy-id $SERVER1'
and using an empty passphrase. Repeat the 'ssh-copy-id' step for $SERVER2.

@end table

@node Input sources
@chapter Input sources

GNU @strong{parallel} reads input from input sources. These can be files, the
command line, and stdin (standard input or a pipe).

@menu
* A single input source::
* Multiple input sources::
* Changing the argument separator.::
* Changing the argument delimiter::
* End-of-file value for input source::
* Skipping empty lines::
@end menu

@node A single input source
@section A single input source

Input can be read from the command line:

@verbatim
  parallel echo ::: A B C
@end verbatim

Output (the order may be different because the jobs are run in
parallel):

@verbatim
  A
  B
  C
@end verbatim

The input source can be a file:

@verbatim
  parallel -a abc-file echo
@end verbatim

Output: Same as above.

STDIN (standard input) can be the input source:

@verbatim
  cat abc-file | parallel echo
@end verbatim

Output: Same as above.

@node Multiple input sources
@section Multiple input sources

GNU @strong{parallel} can take multiple input sources given on the command
line. GNU @strong{parallel} then generates all combinations of the input
sources:

@verbatim
  parallel echo ::: A B C ::: D E F
@end verbatim

Output (the order may be different):

@verbatim
  A D
  A E
  A F
  B D
  B E
  B F
  C D
  C E
  C F
@end verbatim
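
Generating all combinations amounts to taking the Cartesian product of
the input sources. As a rough mental model only (a sequential
plain-shell sketch, not how GNU @strong{parallel} is implemented; the
variable names are made up for illustration), the same lines can be
produced with nested loops:

@verbatim
```shell
# Sequential sketch of "all combinations": one loop per input source.
# GNU parallel would run the echo commands as parallel jobs instead,
# so its output order is not guaranteed.
out=$(
  for first in A B C; do
    for second in D E F; do
      echo "$first $second"
    done
  done
)
printf '%s\n' "$out"
```
@end verbatim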

The input sources can be files:

@verbatim
  parallel -a abc-file -a def-file echo
@end verbatim

Output: Same as above.

STDIN (standard input) can be one of the input sources using @strong{-}:

@verbatim
  cat abc-file | parallel -a - -a def-file echo 
@end verbatim

Output: Same as above.

Instead of @strong{-a} files can be given after @strong{::::}:

@verbatim
  cat abc-file | parallel echo :::: - def-file
@end verbatim

Output: Same as above.

::: and :::: can be mixed:

@verbatim
  parallel echo ::: A B C :::: def-file
@end verbatim

Output: Same as above.

@menu
* Matching arguments from all input sources::
@end menu

@node Matching arguments from all input sources
@subsection Matching arguments from all input sources

With @strong{--xapply} you can get one argument from each input source:

@verbatim
  parallel --xapply echo ::: A B C ::: D E F
@end verbatim

Output (the order may be different):

@verbatim
  A D
  B E
  C F
@end verbatim

If one of the input sources is too short, its values will wrap:

@verbatim
  parallel --xapply echo ::: A B C D E ::: F G
@end verbatim

Output (the order may be different):

@verbatim
  A F
  B G
  C F
  D G
  E F
@end verbatim
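
The wrap-around can be pictured as indexing the shorter source modulo
its length. The following plain-shell sketch reproduces the pairing
shown above; it is an illustration under that assumption, not GNU
@strong{parallel}'s own code, and all variable names are hypothetical:

@verbatim
```shell
# Sketch of the wrap-around pairing: the shorter list is indexed
# modulo its length. Relies on default word splitting of the lists.
long='A B C D E'
short='F G'
set -- $short
count=$#                      # number of values in the shorter source
out=$(
  i=0
  for value in $long; do
    set -- $short
    shift $(( i % count ))    # wrap around when the short list runs out
    echo "$value $1"
    i=$(( i + 1 ))
  done
)
printf '%s\n' "$out"
```
@end verbatim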

@node Changing the argument separator.
@section Changing the argument separator.

GNU @strong{parallel} can use other separators than @strong{:::} or @strong{::::}. This is
typically useful if @strong{:::} or @strong{::::} is used in the command to run:

@verbatim
  parallel --arg-sep ,, echo ,, A B C :::: def-file
@end verbatim

Output (the order may be different):

@verbatim
  A D
  A E
  A F
  B D
  B E
  B F
  C D
  C E
  C F
@end verbatim

Changing the argument file separator:

@verbatim
  parallel --arg-file-sep // echo ::: A B C // def-file
@end verbatim

Output: Same as above.

@node Changing the argument delimiter
@section Changing the argument delimiter

GNU @strong{parallel} will normally treat a full line as a single argument: It
uses @strong{\n} as argument delimiter. This can be changed with @strong{-d}:

@verbatim
  parallel -d _ echo :::: abc_-file
@end verbatim

Output (the order may be different):

@verbatim
  A
  B
  C
@end verbatim

The NUL character can be given as @strong{\0}:

@verbatim
  parallel -d '\0' echo :::: abc0-file
@end verbatim

Output: Same as above.

A shorthand for @strong{-d '\0'} is @strong{-0} (this will often be used to read files
from @strong{find ... -print0}):

@verbatim
  parallel -0 echo :::: abc0-file
@end verbatim

Output: Same as above.

@node End-of-file value for input source
@section End-of-file value for input source

GNU @strong{parallel} can stop reading when it encounters a certain value:

@verbatim
  parallel -E stop echo ::: A B stop C D
@end verbatim

Output:

@verbatim
  A
  B
@end verbatim
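
The effect is the same as a sentinel check in a loop. A minimal
plain-shell sketch of that idea (sequential, and only an analogy for
what @strong{-E} does):

@verbatim
```shell
# Stop consuming arguments as soon as the end-of-file value is seen.
out=$(
  for arg in A B stop C D; do
    if [ "$arg" = stop ]; then
      break
    fi
    echo "$arg"
  done
)
printf '%s\n' "$out"
```
@end verbatim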

@node Skipping empty lines
@section Skipping empty lines

Using @strong{--no-run-if-empty} GNU @strong{parallel} will skip empty lines.

@verbatim
  (echo 1; echo; echo 2) | parallel --no-run-if-empty echo
@end verbatim

Output:

@verbatim
  1
  2
@end verbatim

@node Building the command line
@chapter Building the command line

@menu
* No command means arguments are commands::
* Replacement strings::
* More than one argument::
* Quoting::
* Trimming space::
@end menu

@node No command means arguments are commands
@section No command means arguments are commands

If no command is given after parallel, the arguments themselves are
treated as commands:

@verbatim
  parallel ::: ls 'echo foo' pwd
@end verbatim

Output (the order may be different):

@verbatim
  [list of files in current dir]
  foo
  [/path/to/current/working/dir]
@end verbatim

The command can be a script, a binary or a Bash function if the function is
exported using @strong{export -f}:

@verbatim
  # Only works in Bash
  my_func() {
    echo in my_func $1
  }
  export -f my_func
  parallel my_func ::: 1 2 3
@end verbatim

Output (the order may be different):

@verbatim
  in my_func 1
  in my_func 2
  in my_func 3
@end verbatim

@node Replacement strings
@section Replacement strings

@menu
* The 7 predefined replacement strings::
* Changing the replacement strings::
* Perl expression replacement string::
* Positional replacement strings::
* Positional perl expression replacement string::
* Input from columns::
* Header defined replacement strings::
* More pre-defined replacement strings::
@end menu

@node The 7 predefined replacement strings
@subsection The 7 predefined replacement strings

GNU @strong{parallel} has several replacement strings. If no replacement
strings are used the default is to append @strong{@{@}}:

@verbatim
  parallel echo ::: A/B.C
@end verbatim

Output:

@verbatim
  A/B.C
@end verbatim

The default replacement string is @strong{@{@}}:

@verbatim
  parallel echo {} ::: A/B.C
@end verbatim

Output:

@verbatim
  A/B.C
@end verbatim

The replacement string @strong{@{.@}} removes the extension:

@verbatim
  parallel echo {.} ::: A/B.C
@end verbatim

Output:

@verbatim
  A/B
@end verbatim

The replacement string @strong{@{/@}} removes the path:

@verbatim
  parallel echo {/} ::: A/B.C
@end verbatim

Output:

@verbatim
  B.C
@end verbatim

The replacement string @strong{@{//@}} keeps only the path:

@verbatim
  parallel echo {//} ::: A/B.C
@end verbatim

Output:

@verbatim
  A
@end verbatim

The replacement string @strong{@{/.@}} removes the path and the extension:

@verbatim
  parallel echo {/.} ::: A/B.C
@end verbatim

Output:

@verbatim
  B
@end verbatim
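
For the path-related replacement strings, ordinary shell parameter
expansion gives the same results on this example. The sketch below is
only an analogy: the variable names are made up for illustration, and
edge cases (such as a file name without an extension) may behave
differently from GNU @strong{parallel}:

@verbatim
```shell
# Shell counterparts of the path-related replacement strings,
# applied to the same example argument.
f=A/B.C
noext=${f%.*}            # like {.}  : remove the extension
base=${f##*/}            # like {/}  : remove the path
dir=$(dirname "$f")      # like {//} : keep only the path
base_noext=${base%.*}    # like {/.} : remove path and extension
printf '%s\n' "$noext" "$base" "$dir" "$base_noext"
```
@end verbatim

This prints A/B, B.C, A and B, matching the outputs shown above.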

The replacement string @strong{@{#@}} gives the job number:

@verbatim
  parallel echo {#} ::: A B C
@end verbatim

Output (the order may be different):

@verbatim
  1
  2
  3
@end verbatim

The replacement string @strong{@{%@}} gives the job slot number (between 1 and
number of jobs to run in parallel):

@verbatim
  parallel -j 2 echo {%} ::: A B C
@end verbatim

Output (the order may be different and 1 and 2 may be swapped):

@verbatim
  1
  2
  1
@end verbatim

@node Changing the replacement strings
@subsection Changing the replacement strings

The replacement string @strong{@{@}} can be changed with @strong{-I}:

@verbatim
  parallel -I ,, echo ,, ::: A/B.C
@end verbatim

Output:

@verbatim
  A/B.C
@end verbatim

The replacement string @strong{@{.@}} can be changed with @strong{--extensionreplace}:

@verbatim
  parallel --extensionreplace ,, echo ,, ::: A/B.C
@end verbatim

Output:

@verbatim
  A/B
@end verbatim

The replacement string @strong{@{/@}} can be changed with @strong{--basenamereplace}:

@verbatim
  parallel --basenamereplace ,, echo ,, ::: A/B.C
@end verbatim

Output:

@verbatim
  B.C
@end verbatim

The replacement string @strong{@{//@}} can be changed with @strong{--dirnamereplace}:

@verbatim
  parallel --dirnamereplace ,, echo ,, ::: A/B.C
@end verbatim

Output:

@verbatim
  A
@end verbatim

The replacement string @strong{@{/.@}} can be changed with @strong{--basenameextensionreplace}:

@verbatim
  parallel --basenameextensionreplace ,, echo ,, ::: A/B.C
@end verbatim

Output:

@verbatim
  B
@end verbatim

The replacement string @strong{@{#@}} can be changed with @strong{--seqreplace}:

@verbatim
  parallel --seqreplace ,, echo ,, ::: A B C
@end verbatim

Output (the order may be different):

@verbatim
  1
  2
  3
@end verbatim

The replacement string @strong{@{%@}} can be changed with @strong{--slotreplace}:

@verbatim
  parallel -j2 --slotreplace ,, echo ,, ::: A B C
@end verbatim

Output (the order may be different and 1 and 2 may be swapped):

@verbatim
  1
  2
  1
@end verbatim

@node Perl expression replacement string
@subsection Perl expression replacement string

When the predefined replacement strings are not flexible enough, a perl
expression can be used instead. One example is removing two
extensions, so that foo.tar.gz becomes foo:

@verbatim
  parallel echo '{= s:\.[^.]+$::;s:\.[^.]+$::; =}' ::: foo.tar.gz
@end verbatim

Output:

@verbatim
  foo
@end verbatim

In @strong{@{= =@}} you can access all of GNU @strong{parallel}'s internal functions
and variables. A few are worth mentioning.

@strong{total_jobs()} returns the total number of jobs:

@verbatim
  parallel echo Job {#} of {= '$_=total_jobs()' =} ::: {1..5}
@end verbatim

Output:

@verbatim
  Job 1 of 5
  Job 2 of 5
  Job 3 of 5
  Job 4 of 5
  Job 5 of 5
@end verbatim

@strong{Q(...)} shell quotes the string:

@verbatim
  parallel echo {} shell quoted is {= '$_=Q($_)' =} ::: '*/!#$'
@end verbatim

Output:

@verbatim
  */!#$ shell quoted is \*/\!\#\$
@end verbatim

@strong{$job->skip()} skips the job:

@verbatim
  parallel echo {= 'if($_==3) { $job->skip() }' =} ::: {1..5}
@end verbatim

Output:

@verbatim
  1
  2
  4
  5
@end verbatim

@strong{@@arg} contains the input source variables:

@verbatim
  parallel echo {= 'if($arg[1]==$arg[2]) { $job->skip() }' =} ::: {1..3} ::: {1..3}
@end verbatim

Output:

@verbatim
  1 2
  1 3
  2 1
  2 3
  3 1
  3 2
@end verbatim

If the strings @strong{@{=} and @strong{=@}} cause problems they can be replaced with @strong{--parens}:

@verbatim
  parallel --parens ,,,, echo ',, s:\.[^.]+$::;s:\.[^.]+$::; ,,' ::: foo.tar.gz
@end verbatim

Output: Same as above.

To define a shorthand replacement string use @strong{--rpl}:

@verbatim
  parallel --rpl '.. s:\.[^.]+$::;s:\.[^.]+$::;' echo '..' ::: foo.tar.gz
@end verbatim

Output: Same as above.

If the shorthand starts with @strong{@{} it can be used as a positional
replacement string, too:

@verbatim
  parallel --rpl '{..} s:\.[^.]+$::;s:\.[^.]+$::;' echo '{..}' ::: foo.tar.gz
@end verbatim

Output: Same as above.

GNU @strong{parallel}'s 7 replacement strings are implemented like this:

@verbatim
  --rpl '{} '
  --rpl '{#} $_=$job->seq()'
  --rpl '{%} $_=$job->slot()'
  --rpl '{/} s:.*/::'
  --rpl '{//} $Global::use{"File::Basename"} ||= eval "use File::Basename; 1;"; $_ = dirname($_);'
  --rpl '{/.} s:.*/::; s:\.[^/.]+$::;'
  --rpl '{.} s:\.[^/.]+$::'
@end verbatim

@node Positional replacement strings
@subsection Positional replacement strings

With multiple input sources the arguments from the individual input
sources can be accessed with @strong{@{}number@strong{@}}:

@verbatim
  parallel echo {1} and {2} ::: A B ::: C D
@end verbatim

Output (the order may be different):

@verbatim
  A and C
  A and D
  B and C
  B and D
@end verbatim

The positional replacement strings can also be modified using @strong{/}, @strong{//}, @strong{/.}, and  @strong{.}:

@verbatim
  parallel echo /={1/} //={1//} /.={1/.} .={1.} ::: A/B.C D/E.F
@end verbatim

Output (the order may be different):

@verbatim
  /=B.C //=A /.=B .=A/B
  /=E.F //=D /.=E .=D/E
@end verbatim

If a position is negative, it refers to the input source counted
from the end:

@verbatim
  parallel echo 1={1} 2={2} 3={3} -1={-1} -2={-2} -3={-3} ::: A B ::: C D ::: E F
@end verbatim

Output (the order may be different):

@verbatim
  1=A 2=C 3=E -1=E -2=C -3=A
  1=A 2=C 3=F -1=F -2=C -3=A
  1=A 2=D 3=E -1=E -2=D -3=A
  1=A 2=D 3=F -1=F -2=D -3=A
  1=B 2=C 3=E -1=E -2=C -3=B
  1=B 2=C 3=F -1=F -2=C -3=B
  1=B 2=D 3=E -1=E -2=D -3=B
  1=B 2=D 3=F -1=F -2=D -3=B
@end verbatim

@node Positional perl expression replacement string
@subsection Positional perl expression replacement string

To use a perl expression as a positional replacement string, simply
prefix the perl expression with the number and a space:

@verbatim
  parallel echo '{=2 s:\.[^.]+$::;s:\.[^.]+$::; =} {1}' ::: bar ::: foo.tar.gz
@end verbatim

Output:

@verbatim
  foo bar
@end verbatim

If a shorthand defined using @strong{--rpl} starts with @strong{@{} it can be used as
a positional replacement string, too:

@verbatim
  parallel --rpl '{..} s:\.[^.]+$::;s:\.[^.]+$::;' echo '{2..} {1}' ::: bar ::: foo.tar.gz
@end verbatim

Output: Same as above.

@node Input from columns
@subsection Input from columns

The columns in a file can be bound to positional replacement strings
using @strong{--colsep}. Here the columns are separated by TAB (\t):

@verbatim
  parallel --colsep '\t' echo 1={1} 2={2} :::: tsv-file.tsv
@end verbatim

Output (the order may be different):

@verbatim
  1=f1 2=f2
  1=A 2=B
  1=C 2=D
@end verbatim
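
Binding columns to positions is comparable to splitting each line on
the separator in a shell read loop. A minimal sketch of that analogy
(sequential; the variable names are hypothetical and the input is
generated inline instead of being read from tsv-file.tsv):

@verbatim
```shell
# Split each line on TAB and bind the fields to two variables,
# comparable to {1} and {2} with --colsep '\t'.
tab=$(printf '\t')
out=$(
  printf 'f1\tf2\nA\tB\nC\tD\n' |
  while IFS=$tab read -r col1 col2; do
    echo "1=$col1 2=$col2"
  done
)
printf '%s\n' "$out"
```
@end verbatim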

@node Header defined replacement strings
@subsection Header defined replacement strings

With @strong{--header} GNU @strong{parallel} will use the first value of the input
source as the name of the replacement string. Only the non-modified
version @strong{@{@}} is supported:

@verbatim
  parallel --header : echo f1={f1} f2={f2} ::: f1 A B ::: f2 C D
@end verbatim

Output (the order may be different):

@verbatim
  f1=A f2=C
  f1=A f2=D
  f1=B f2=C
  f1=B f2=D
@end verbatim

It is useful with @strong{--colsep} for processing files with TAB separated values:

@verbatim
  parallel --header : --colsep '\t' echo f1={f1} f2={f2} :::: tsv-file.tsv
@end verbatim

Output (the order may be different):

@verbatim
  f1=A f2=B
  f1=C f2=D
@end verbatim

@node More pre-defined replacement strings
@subsection More pre-defined replacement strings

@strong{--plus} adds the replacement strings @strong{@{+/@} @{+.@} @{+..@} @{+...@} @{..@}  @{...@}
@{/..@} @{/...@} @{##@}}. The idea being that @strong{@{+foo@}} matches the opposite of @strong{@{foo@}}
and @strong{@{@}} = @strong{@{+/@}}/@strong{@{/@}} = @strong{@{.@}}.@strong{@{+.@}} = @strong{@{+/@}}/@strong{@{/.@}}.@strong{@{+.@}} = @strong{@{..@}}.@strong{@{+..@}} =
@strong{@{+/@}}/@strong{@{/..@}}.@strong{@{+..@}} = @strong{@{...@}}.@strong{@{+...@}} = @strong{@{+/@}}/@strong{@{/...@}}.@strong{@{+...@}}.

@verbatim
  parallel --plus echo {} ::: dir/sub/file.ext1.ext2.ext3
  parallel --plus echo {+/}/{/} ::: dir/sub/file.ext1.ext2.ext3
  parallel --plus echo {.}.{+.} ::: dir/sub/file.ext1.ext2.ext3
  parallel --plus echo {+/}/{/.}.{+.} ::: dir/sub/file.ext1.ext2.ext3
  parallel --plus echo {..}.{+..} ::: dir/sub/file.ext1.ext2.ext3
  parallel --plus echo {+/}/{/..}.{+..} ::: dir/sub/file.ext1.ext2.ext3
  parallel --plus echo {...}.{+...} ::: dir/sub/file.ext1.ext2.ext3
  parallel --plus echo {+/}/{/...}.{+...} ::: dir/sub/file.ext1.ext2.ext3
@end verbatim

Output:

@verbatim
  dir/sub/file.ext1.ext2.ext3
@end verbatim

@strong{@{##@}} is simply the total number of jobs:

@verbatim
  parallel --plus echo Job {#} of {##} ::: {1..5}
@end verbatim

Output:

@verbatim
  Job 1 of 5
  Job 2 of 5
  Job 3 of 5
  Job 4 of 5
  Job 5 of 5
@end verbatim

@node More than one argument
@section More than one argument

With @strong{--xargs} GNU @strong{parallel} will fit as many arguments as possible on a
single line:

@verbatim
  cat num30000 | parallel --xargs echo | wc -l
@end verbatim

Output (if you run this under Bash on GNU/Linux):

@verbatim
  2
@end verbatim

The 30000 arguments fitted on 2 lines.

The maximal length of a single line can be set with @strong{-s}. With a maximal
line length of 10000 chars 17 commands will be run:

@verbatim
  cat num30000 | parallel --xargs -s 10000 echo | wc -l
@end verbatim

Output:

@verbatim
  17
@end verbatim

For better parallelism GNU @strong{parallel} can distribute the arguments
between all the parallel jobs when end of file is reached.

Below GNU @strong{parallel} reads the last argument when generating the second
job. When GNU @strong{parallel} reads the last argument, it spreads all the
arguments for the second job over 4 jobs instead, as 4 parallel jobs
are requested.

The first job will be the same as the @strong{--xargs} example above, but the
second job will be split into 4 evenly sized jobs, resulting in a
total of 5 jobs:

@verbatim
  cat num30000 | parallel --jobs 4 -m echo | wc -l
@end verbatim

Output (if you run this under Bash on GNU/Linux):

@verbatim
  5
@end verbatim

This is even more visible when running 4 jobs with 10 arguments. The
10 arguments are being spread over 4 jobs:

@verbatim
  parallel --jobs 4 -m echo ::: 1 2 3 4 5 6 7 8 9 10
@end verbatim

Output:

@verbatim
  1 2 3
  4 5 6
  7 8 9
  10
@end verbatim
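
The grouping above is what you get when each job receives the number
of arguments divided by the number of jobs, rounded up. The sketch
below reproduces it with @strong{xargs}. The rounding rule is an
assumption inferred from this output, not a statement about GNU
@strong{parallel}'s internals, and the variable names are made up:

@verbatim
```shell
# Assumed rule: arguments per job = ceil(arguments / job slots).
slots=4
set -- 1 2 3 4 5 6 7 8 9 10
per_job=$(( ($# + slots - 1) / slots ))   # ceil(10 / 4) = 3
# xargs runs the groups one after another, not in parallel.
out=$(printf '%s\n' "$@" | xargs -n "$per_job" echo)
printf '%s\n' "$out"
```
@end verbatim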

A replacement string can be part of a word. @strong{-m} will not repeat the context:

@verbatim
  parallel --jobs 4 -m echo pre-{}-post ::: A B C D E F G
@end verbatim

Output (the order may be different):

@verbatim
  pre-A B-post
  pre-C D-post
  pre-E F-post
  pre-G-post
@end verbatim

To repeat the context use @strong{-X} which otherwise works like @strong{-m}:

@verbatim
  parallel --jobs 4 -X echo pre-{}-post ::: A B C D E F G
@end verbatim

Output (the order may be different):

@verbatim
  pre-A-post pre-B-post
  pre-C-post pre-D-post
  pre-E-post pre-F-post
  pre-G-post
@end verbatim

To limit the number of arguments use @strong{-N}:

@verbatim
  parallel -N3 echo ::: A B C D E F G H
@end verbatim

Output (the order may be different):

@verbatim
  A B C
  D E F
  G H
@end verbatim

@strong{-N} also sets the positional replacement strings:

@verbatim
  parallel -N3 echo 1={1} 2={2} 3={3} ::: A B C D E F G H
@end verbatim

Output (the order may be different):

@verbatim
  1=A 2=B 3=C
  1=D 2=E 3=F
  1=G 2=H 3=
@end verbatim
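
Note that the last group is shorter, so its third position is empty.
The same behaviour can be pictured with the shell's own positional
parameters. This is a sequential sketch of the idea, not GNU
@strong{parallel}'s implementation:

@verbatim
```shell
# Consume the arguments three at a time; positions that are missing
# in the last group expand to the empty string.
set -- A B C D E F G H
out=$(
  while [ $# -gt 0 ]; do
    echo "1=$1 2=${2-} 3=${3-}"
    if [ $# -ge 3 ]; then
      shift 3
    else
      shift $#
    fi
  done
)
printf '%s\n' "$out"
```
@end verbatim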

@strong{-N0} reads 1 argument but inserts none:

@verbatim
  parallel -N0 echo foo ::: 1 2 3
@end verbatim

Output:

@verbatim
  foo
  foo
  foo
@end verbatim

@node Quoting
@section Quoting

Command lines that contain special characters may need to be protected from the shell.

The @strong{perl} program @strong{print "@@ARGV\n"} basically works like @strong{echo}.

@verbatim
  perl -e 'print "@ARGV\n"' A
@end verbatim

Output:

@verbatim
  A
@end verbatim

To run that in parallel the command needs to be quoted:

@verbatim
  parallel perl -e 'print "@ARGV\n"' ::: This wont work
@end verbatim

Output:

@verbatim
  [Nothing]
@end verbatim
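
The reason is that the calling shell removes the single quotes before
GNU @strong{parallel} sees the command, and the command is then parsed
by a shell a second time. The sketch below shows what is left after
the first round of parsing. It is a simplified illustration that
assumes the arguments are joined with spaces, and the variable name is
made up:

@verbatim
```shell
# The calling shell strips the single quotes; only the words remain.
set -- perl -e 'print "@ARGV\n"'
joined="$*"    # the words joined by spaces, as a second shell would see them
printf '%s\n' "$joined"
```
@end verbatim

This prints the line without the single quotes. When a shell parses
that line again, only the word print is the program given to
@strong{-e}; the double-quoted string becomes an ordinary argument,
so nothing is printed.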

To quote the command use @strong{-q}:

@verbatim
  parallel -q perl -e 'print "@ARGV\n"' ::: This works
@end verbatim

Output (the order may be different):

@verbatim
  This
  works
@end verbatim

Or you can quote the critical part using @strong{\'}:

@verbatim
  parallel perl -e \''print "@ARGV\n"'\' ::: This works, too
@end verbatim

Output (the order may be different):

@verbatim
  This
  works,
  too
@end verbatim

GNU @strong{parallel} can also backslash-quote full lines. Simply run this:

@verbatim
  parallel --shellquote
  parallel: Warning: Input is read from the terminal. Only experts do this on purpose. Press CTRL-D to exit.
  perl -e 'print "@ARGV\n"'
  [CTRL-D]
@end verbatim

Output:

@verbatim
  perl\ -e\ \'print\ \"@ARGV\\n\"\'
@end verbatim

This can then be used as the command:

@verbatim
  parallel perl\ -e\ \'print\ \"@ARGV\\n\"\' ::: This also works
@end verbatim

Output (the order may be different):

@verbatim
  This
  also
  works
@end verbatim

@node Trimming space
@section Trimming space

Space can be trimmed from the arguments using @strong{--trim}. To trim on the right side:

@verbatim
  parallel --trim r echo pre-{}-post ::: ' A '
@end verbatim

Output:

@verbatim
  pre- A-post
@end verbatim

To trim on the left side:

@verbatim
  parallel --trim l echo pre-{}-post ::: ' A '
@end verbatim

Output:

@verbatim
  pre-A -post
@end verbatim

To trim on both sides:

@verbatim
  parallel --trim lr echo pre-{}-post ::: ' A '
@end verbatim

Output:

@verbatim
  pre-A-post
@end verbatim

@node Controlling the output
@chapter Controlling the output

The output can be prefixed with the argument:

@verbatim
  parallel --tag echo foo-{} ::: A B C
@end verbatim

Output (the order may be different):

@verbatim
  A       foo-A
  B       foo-B
  C       foo-C
@end verbatim

To prefix it with another string use @strong{--tagstring}:

@verbatim
  parallel --tagstring {}-bar echo foo-{} ::: A B C
@end verbatim

Output (the order may be different):

@verbatim
  A-bar   foo-A
  B-bar   foo-B
  C-bar   foo-C
@end verbatim

To see what commands will be run without running them use @strong{--dryrun}:

@verbatim
  parallel --dryrun echo {} ::: A B C
@end verbatim

Output (the order may be different):

@verbatim
  echo A
  echo B
  echo C
@end verbatim

To print the commands before running them use @strong{--verbose}:

@verbatim
  parallel --verbose echo {} ::: A B C
@end verbatim

Output (the order may be different):

@verbatim
  echo A
  echo B
  A
  echo C
  B
  C
@end verbatim

GNU @strong{parallel} will postpone the output until the command completes:

@verbatim
  parallel -j2 'printf "%s-start\n%s" {} {};sleep {};printf "%s\n" -middle;echo {}-end' ::: 4 2 1
@end verbatim

Output:

@verbatim
  2-start
  2-middle
  2-end
  1-start
  1-middle
  1-end
  4-start
  4-middle
  4-end
@end verbatim

To get the output immediately use @strong{--ungroup}:

@verbatim
  parallel -j2 --ungroup 'printf "%s-start\n%s" {} {};sleep {};printf "%s\n" -middle;echo {}-end' ::: 4 2 1
@end verbatim

Output:

@verbatim
  4-start
  42-start
  2-middle
  2-end
  1-start
  1-middle
  1-end
  -middle
  4-end
@end verbatim

@strong{--ungroup} is fast, but can cause half a line from one job to be mixed
with half a line of another job. That has happened in the second line,
where the line '4-middle' is mixed with '2-start'.

To avoid this use @strong{--linebuffer}:

@verbatim
  parallel -j2 --linebuffer 'printf "%s-start\n%s" {} {};sleep {};printf "%s\n" -middle;echo {}-end' ::: 4 2 1
@end verbatim

Output:

@verbatim
  4-start
  2-start
  2-middle
  2-end
  1-start
  1-middle
  1-end
  4-middle
  4-end
@end verbatim

To force the output in the same order as the arguments use @strong{--keep-order}/@strong{-k}:

@verbatim
  parallel -j2 -k 'printf "%s-start\n%s" {} {};sleep {};printf "%s\n" -middle;echo {}-end' ::: 4 2 1
@end verbatim

Output:

@verbatim
  4-start
  4-middle
  4-end
  2-start
  2-middle
  2-end
  1-start
  1-middle
  1-end
@end verbatim

@menu
* Saving output into files::
@end menu

@node Saving output into files
@section Saving output into files

GNU @strong{parallel} can save the output of each job into files:

@verbatim
  parallel --files echo ::: A B C
@end verbatim

Output will be similar to this:

@verbatim
  /tmp/pAh6uWuQCg.par
  /tmp/opjhZCzAX4.par
  /tmp/W0AT_Rph2o.par
@end verbatim

By default GNU @strong{parallel} will cache the output in files in @strong{/tmp}. This
can be changed by setting @strong{$TMPDIR} or @strong{--tmpdir}:

@verbatim
  parallel --tmpdir /var/tmp --files echo ::: A B C
@end verbatim

Output will be similar to this:

@verbatim
  /var/tmp/N_vk7phQRc.par
  /var/tmp/7zA4Ccf3wZ.par
  /var/tmp/LIuKgF_2LP.par
@end verbatim

Or:

@verbatim
  TMPDIR=/var/tmp parallel --files echo ::: A B C
@end verbatim

Output: Same as above.

The output files can be saved in a structured way using @strong{--results}:

@verbatim
  parallel --results outdir echo ::: A B C
@end verbatim

Output:

@verbatim
  A
  B
  C
@end verbatim

These files were also generated containing the standard output
(stdout), standard error (stderr), and the sequence number (seq):

@verbatim
  outdir/1/A/seq
  outdir/1/A/stderr
  outdir/1/A/stdout
  outdir/1/B/seq
  outdir/1/B/stderr
  outdir/1/B/stdout
  outdir/1/C/seq
  outdir/1/C/stderr
  outdir/1/C/stdout
@end verbatim

@strong{--header :} will take the first value as name and use that in the
directory structure. This is useful if you are using multiple input
sources:

@verbatim
  parallel --header : --results outdir echo ::: f1 A B ::: f2 C D
@end verbatim

Generated files:

@verbatim
  outdir/f1/A/f2/C/seq
  outdir/f1/A/f2/C/stderr
  outdir/f1/A/f2/C/stdout
  outdir/f1/A/f2/D/seq
  outdir/f1/A/f2/D/stderr
  outdir/f1/A/f2/D/stdout
  outdir/f1/B/f2/C/seq
  outdir/f1/B/f2/C/stderr
  outdir/f1/B/f2/C/stdout
  outdir/f1/B/f2/D/seq
  outdir/f1/B/f2/D/stderr
  outdir/f1/B/f2/D/stdout
@end verbatim

The directories are named after the variables and their values.
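
The naming scheme can be spelled out as nested loops over the
values. The sketch below only prints the expected paths for the example
above (it creates no files, and the loop variables are hypothetical):

@verbatim
```shell
# Print the paths that follow the name/value directory scheme:
# outdir/<name1>/<value1>/<name2>/<value2>/<file>
out=$(
  for f1 in A B; do
    for f2 in C D; do
      for file in seq stderr stdout; do
        echo "outdir/f1/$f1/f2/$f2/$file"
      done
    done
  done
)
printf '%s\n' "$out"
```
@end verbatim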

@node Controlling the execution
@chapter Controlling the execution

@menu
* Number of simultaneous jobs::
* Shuffle job order::
* Interactivity::
* A terminal for every job::
* Timing::
* Progress information::
* Termination::
* Limiting the resources::
@end menu

@node Number of simultaneous jobs
@section Number of simultaneous jobs

The number of concurrent jobs is given with @strong{--jobs}/@strong{-j}:

@verbatim
  /usr/bin/time parallel -N0 -j64 sleep 1 :::: num128
@end verbatim

With 64 jobs in parallel the 128 @strong{sleep}s will take 2-8 seconds to run -
depending on how fast your machine is.

By default @strong{--jobs} is the same as the number of CPU cores. So this:

@verbatim
  /usr/bin/time parallel -N0 sleep 1 :::: num128
@end verbatim

should take twice the time of running 2 jobs per CPU core:

@verbatim
  /usr/bin/time parallel -N0 --jobs 200% sleep 1 :::: num128
@end verbatim

@strong{--jobs 0} will run as many jobs in parallel as possible:

@verbatim
  /usr/bin/time parallel -N0 --jobs 0 sleep 1 :::: num128
@end verbatim

which should take 1-7 seconds depending on how fast your machine is.

@strong{--jobs} can read from a file which is re-read when a job finishes:

@verbatim
  echo 50% > my_jobs
  /usr/bin/time parallel -N0 --jobs my_jobs sleep 1 :::: num128 &
  sleep 1
  echo 0 > my_jobs
  wait
@end verbatim

During the first second only 50% of the CPU cores will run a job. Then @strong{0} is
put into @strong{my_jobs} and the rest of the jobs will be started in
parallel.

Instead of basing the percentage on the number of CPU cores 
GNU @strong{parallel} can base it on the number of CPUs:

@verbatim
  parallel --use-cpus-instead-of-cores -N0 sleep 1 :::: num8
@end verbatim

@node Shuffle job order
@section Shuffle job order

If you have many jobs (e.g. from multiple combinations of input
sources), it can be handy to shuffle the jobs, so that a different mix
of values is run first. Use @strong{--shuf} for that:

@verbatim
  parallel --shuf echo ::: 1 2 3 ::: a b c ::: A B C
@end verbatim

Output:

@verbatim
  All combinations but different order for each run.
@end verbatim

@node Interactivity
@section Interactivity

GNU @strong{parallel} can ask the user if a command should be run using @strong{--interactive}:

@verbatim
  parallel --interactive echo ::: 1 2 3
@end verbatim

Output:

@verbatim
  echo 1 ?...y
  echo 2 ?...n
  1
  echo 3 ?...y
  3
@end verbatim

GNU @strong{parallel} can be used to put arguments on the command line for an
interactive command such as @strong{emacs} to edit one file at a time:

@verbatim
  parallel --tty emacs ::: 1 2 3
@end verbatim

Or give multiple arguments in one go to open multiple files:

@verbatim
  parallel -X --tty vi ::: 1 2 3
@end verbatim

@node A terminal for every job
@section A terminal for every job

Using @strong{--tmux} GNU @strong{parallel} can start a terminal for every job run:

@verbatim
  seq 10 20 | parallel --tmux 'echo start {}; sleep {}; echo done {}'
@end verbatim

This will tell you to run something similar to:

@verbatim
  tmux -S /tmp/tmsrPrO0 attach
@end verbatim

Using normal @strong{tmux} keystrokes (CTRL-b n or CTRL-b p) you can cycle
between windows of the running jobs. When a job is finished it will
pause for 10 seconds before closing the window.

@node Timing
@section Timing

Some jobs do heavy I/O when they start. To avoid a thundering herd GNU
@strong{parallel} can delay starting new jobs. @strong{--delay} @emph{X} will make
sure there is at least @emph{X} seconds between each start:

@verbatim
  parallel --delay 2.5 echo Starting {}\;date ::: 1 2 3
@end verbatim

Output:

@verbatim
  Starting 1
  Thu Aug 15 16:24:33 CEST 2013
  Starting 2
  Thu Aug 15 16:24:35 CEST 2013
  Starting 3
  Thu Aug 15 16:24:38 CEST 2013
@end verbatim

If jobs taking more than a certain amount of time are known to fail,
they can be stopped with @strong{--timeout}. The accuracy of @strong{--timeout} is
2 seconds:

@verbatim
  parallel --timeout 4.1 sleep {}\; echo {} ::: 2 4 6 8
@end verbatim

Output:

@verbatim
  2
  4
@end verbatim

GNU @strong{parallel} can compute the median runtime for jobs and kill those
that take more than 200% of the median runtime:

@verbatim
  parallel --timeout 200% sleep {}\; echo {} ::: 2.1 2.2 3 7 2.3
@end verbatim

Output:

@verbatim
  2.1
  2.2
  3
  2.3
@end verbatim

@node Progress information
@section Progress information

Based on the runtime of completed jobs GNU @strong{parallel} can estimate the
total runtime:

@verbatim
  parallel --eta sleep ::: 1 3 2 2 1 3 3 2 1
@end verbatim

Output:

@verbatim
  Computers / CPU cores / Max jobs to run
  1:local / 2 / 2

  Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
  ETA: 2s 0left 1.11avg  local:0/9/100%/1.1s 
@end verbatim

GNU @strong{parallel} can give progress information with @strong{--progress}:

@verbatim
  parallel --progress sleep ::: 1 3 2 2 1 3 3 2 1
@end verbatim

Output:

@verbatim
  Computers / CPU cores / Max jobs to run
  1:local / 2 / 2

  Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
  local:0/9/100%/1.1s
@end verbatim

A progress bar can be shown with @strong{--bar}:

@verbatim
  parallel --bar sleep ::: 1 3 2 2 1 3 3 2 1
@end verbatim

And a graphic bar can be shown with @strong{--bar} and @strong{zenity}:

@verbatim
  seq 1000 | parallel -j10 --bar '(echo -n {};sleep 0.1)' 2> >(zenity --progress --auto-kill)
@end verbatim

A logfile of the jobs completed so far can be generated with @strong{--joblog}:

@verbatim
  parallel --joblog /tmp/log exit  ::: 1 2 3 0 
  cat /tmp/log
@end verbatim

Output:

@verbatim
  Seq     Host    Starttime       Runtime Send    Receive Exitval Signal  Command
  1       :       1376577364.974  0.008   0       0       1       0       exit 1
  2       :       1376577364.982  0.013   0       0       2       0       exit 2
  3       :       1376577364.990  0.013   0       0       3       0       exit 3
  4       :       1376577365.003  0.003   0       0       0       0       exit 0
@end verbatim

The log contains the job sequence, which host the job was run on, the
start time and run time, how much data was transferred, the exit
value, the signal that killed the job, and finally the command being
run.
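
Because the joblog is a TAB separated file with one header line, it
can be post-processed with standard tools. The following is a minimal
sketch and not part of GNU @strong{parallel}: it builds a small sample log
with the columns shown above (the temporary file and the numbers are
made up for illustration) and uses @strong{awk} to list the jobs that had a
non-zero exit value or were killed by a signal:

@verbatim
  # Sample joblog (TAB separated); in real use this would be /tmp/log
  log=$(mktemp)
  printf 'Seq\tHost\tStarttime\tRuntime\tSend\tReceive\tExitval\tSignal\tCommand\n' > "$log"
  printf '1\t:\t1376577364.974\t0.008\t0\t0\t1\t0\texit 1\n' >> "$log"
  printf '2\t:\t1376577364.982\t0.013\t0\t0\t2\t0\texit 2\n' >> "$log"
  printf '3\t:\t1376577365.003\t0.003\t0\t0\t0\t0\texit 0\n' >> "$log"
  # Skip the header; column 7 is Exitval, 8 is Signal, 9 is Command
  failed=$(awk -F '\t' 'NR > 1 && ($7 != 0 || $8 != 0) { print $1 ": " $9 }' "$log")
  echo "$failed"
@end verbatim

Output:

@verbatim
  1: exit 1
  2: exit 2
@end verbatim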

With a joblog GNU @strong{parallel} can be stopped and later pick up where it
left off. It is important that the input of the completed jobs is
unchanged.

@verbatim
  parallel --joblog /tmp/log exit  ::: 1 2 3 0 
  cat /tmp/log
  parallel --resume --joblog /tmp/log exit  ::: 1 2 3 0 0 0
  cat /tmp/log
@end verbatim

Output:

@verbatim
  Seq     Host    Starttime       Runtime Send    Receive Exitval Signal  Command
  1       :       1376580069.544  0.008   0       0       1       0       exit 1
  2       :       1376580069.552  0.009   0       0       2       0       exit 2
  3       :       1376580069.560  0.012   0       0       3       0       exit 3
  4       :       1376580069.571  0.005   0       0       0       0       exit 0

  Seq     Host    Starttime       Runtime Send    Receive Exitval Signal  Command
  1       :       1376580069.544  0.008   0       0       1       0       exit 1
  2       :       1376580069.552  0.009   0       0       2       0       exit 2
  3       :       1376580069.560  0.012   0       0       3       0       exit 3
  4       :       1376580069.571  0.005   0       0       0       0       exit 0
  5       :       1376580070.028  0.009   0       0       0       0       exit 0
  6       :       1376580070.038  0.007   0       0       0       0       exit 0
@end verbatim

Note how the start time of the last 2 jobs is clearly different: they were started in the second run.

With @strong{--resume-failed} GNU @strong{parallel} will re-run the jobs that failed:

@verbatim
  parallel --resume-failed --joblog /tmp/log exit  ::: 1 2 3 0 0 0
  cat /tmp/log
@end verbatim

Output:

@verbatim
  Seq     Host    Starttime       Runtime Send    Receive Exitval Signal  Command
  1       :       1376580069.544  0.008   0       0       1       0       exit 1
  2       :       1376580069.552  0.009   0       0       2       0       exit 2
  3       :       1376580069.560  0.012   0       0       3       0       exit 3
  4       :       1376580069.571  0.005   0       0       0       0       exit 0
  5       :       1376580070.028  0.009   0       0       0       0       exit 0
  6       :       1376580070.038  0.007   0       0       0       0       exit 0
  1       :       1376580154.433  0.010   0       0       1       0       exit 1
  2       :       1376580154.444  0.022   0       0       2       0       exit 2
  3       :       1376580154.466  0.005   0       0       3       0       exit 3
@end verbatim

Note how seq 1, 2, and 3 have been repeated because they had an exit
value different from 0.

@strong{--retry-failed} does almost the same as @strong{--resume-failed}. Where
@strong{--resume-failed} reads the commands from the command line (and
ignores the commands in the joblog), @strong{--retry-failed} ignores the
command line and reruns the commands mentioned in the joblog.

@verbatim
  parallel --retry-failed --joblog /tmp/log
  cat /tmp/log
@end verbatim

Output:

@verbatim
  Seq     Host    Starttime       Runtime Send    Receive Exitval Signal  Command
  1       :       1376580069.544  0.008   0       0       1       0       exit 1
  2       :       1376580069.552  0.009   0       0       2       0       exit 2
  3       :       1376580069.560  0.012   0       0       3       0       exit 3
  4       :       1376580069.571  0.005   0       0       0       0       exit 0
  5       :       1376580070.028  0.009   0       0       0       0       exit 0
  6       :       1376580070.038  0.007   0       0       0       0       exit 0
  1       :       1376580154.433  0.010   0       0       1       0       exit 1
  2       :       1376580154.444  0.022   0       0       2       0       exit 2
  3       :       1376580154.466  0.005   0       0       3       0       exit 3
  1       :       1376580164.633  0.010   0       0       1       0       exit 1
  2       :       1376580164.644  0.022   0       0       2       0       exit 2
  3       :       1376580164.666  0.005   0       0       3       0       exit 3
@end verbatim

@node Termination
@section Termination

For certain jobs there is no need to continue if one of the jobs fails
and has an exit code different from 0. GNU @strong{parallel} will stop spawning new jobs
with @strong{--halt soon,fail=1}:

@verbatim
  parallel -j2 --halt soon,fail=1 echo {}\; exit {} ::: 0 0 1 2 3
@end verbatim

Output:

@verbatim
  0
  0
  1
  parallel: Starting no more jobs. Waiting for 2 jobs to finish. This job failed:
  echo 1; exit 1
  2
  parallel: Starting no more jobs. Waiting for 1 jobs to finish. This job failed:
  echo 2; exit 2
@end verbatim

With @strong{--halt now,fail=1} the running jobs will be killed immediately:

@verbatim
  parallel -j2 --halt now,fail=1 echo {}\; exit {} ::: 0 0 1 2 3
@end verbatim

Output:

@verbatim
  0
  0
  1
  parallel: This job failed:
  echo 1; exit 1
@end verbatim

If @strong{--halt} is given a percentage, this percentage of the jobs must fail
before GNU @strong{parallel} stops spawning more jobs:

@verbatim
  parallel -j2 --halt soon,fail=20% echo {}\; exit {} ::: 0 1 2 3 4 5 6 7 8 9
@end verbatim

Output:

@verbatim
  0
  1
  parallel: This job failed:
  echo 1; exit 1
  2
  parallel: This job failed:
  echo 2; exit 2
  parallel: Starting no more jobs. Waiting for 1 jobs to finish.
  3
  parallel: This job failed:
  echo 3; exit 3
@end verbatim

If you are looking for success instead of failures, you can use
@strong{success}. This will finish as soon as the first job succeeds:

@verbatim
  parallel -j2 --halt now,success=1 echo {}\; exit {} ::: 1 2 3 0 4 5 6
@end verbatim

Output:

@verbatim
  1
  2
  3
  0
  parallel: This job succeeded:
  echo 0; exit 0
@end verbatim

GNU @strong{parallel} can retry the command with @strong{--retries}. This is useful if a
command fails for unknown reasons now and then.

@verbatim
  parallel -k --retries 3 'echo tried {} >>/tmp/runs; echo completed {}; exit {}' ::: 1 2 0
  cat /tmp/runs
@end verbatim

Output:

@verbatim
  completed 1
  completed 2
  completed 0

  tried 1
  tried 2
  tried 1
  tried 2
  tried 1
  tried 2
  tried 0
@end verbatim

Note how jobs 1 and 2 were tried 3 times, but 0 was not retried because it had exit code 0.

@menu
* Termination signals (advanced)::
@end menu

@node Termination signals (advanced)
@subsection Termination signals (advanced)

Using @strong{--termseq} you can control which signals are sent when killing
children. Normally children will be killed by sending them @strong{SIGTERM},
waiting 200 ms, then another @strong{SIGTERM}, waiting 100 ms, then another
@strong{SIGTERM}, waiting 50 ms, then a @strong{SIGKILL}, finally waiting 25 ms
before giving up. It looks like this:

@verbatim
  show_signals() {
    perl -e 'for(keys %SIG) { $SIG{$_} = eval "sub { print \"Got $_\\n\"; }";} while(1){sleep 1}' 
  }
  export -f show_signals
  echo | parallel --termseq TERM,200,TERM,100,TERM,50,KILL,25 -u --timeout 1 show_signals
@end verbatim

Output:

@verbatim
  Got TERM
  Got TERM
  Got TERM
@end verbatim

Or just:

@verbatim
  echo | parallel -u --timeout 1 show_signals
@end verbatim

Output: Same as above.

You can change this to @strong{SIGINT}, @strong{SIGTERM}, @strong{SIGKILL}:

@verbatim
  echo | parallel --termseq INT,200,TERM,100,KILL,25 -u --timeout 1 show_signals
@end verbatim

Output:

@verbatim
  Got INT
  Got TERM
@end verbatim

The @strong{SIGKILL} does not show because it cannot be caught, and thus the child dies.

@node Limiting the resources
@section Limiting the resources

To avoid overloading systems GNU @strong{parallel} can look at the system load
before starting another job:

@verbatim
  parallel --load 100% echo load is less than {} job per cpu ::: 1 
@end verbatim

Output:

@verbatim
  [when the load is less than the number of cpu cores]
  load is less than 1 job per cpu
@end verbatim

GNU @strong{parallel} can also check if the system is swapping. 

@verbatim
  parallel --noswap echo the system is not swapping ::: now
@end verbatim

Output:

@verbatim
  [when the system is not swapping]
  the system is not swapping now
@end verbatim

Some jobs need a lot of memory, and should only be started when there
is enough memory free. Using @strong{--memfree} GNU @strong{parallel} can check if
there is enough memory free. Additionally, GNU @strong{parallel} will kill
off the youngest job if the memory free falls below 50% of the
size. The killed job will be put back on the queue and retried later.

@verbatim
  parallel --memfree 1G echo will run if more than 1 GB is ::: free
@end verbatim

GNU @strong{parallel} can run the jobs with a nice value. This will work both
locally and remotely.

@verbatim
  parallel --nice 17 echo this is being run with nice -n ::: 17
@end verbatim

Output:

@verbatim
  this is being run with nice -n 17
@end verbatim

@node Remote execution
@chapter Remote execution

GNU @strong{parallel} can run jobs on remote servers. It uses @strong{ssh} to
communicate with the remote machines. 

@menu
* Sshlogin::
* Transferring files::
* Working dir::
* Avoid overloading sshd::
* Ignore hosts that are down::
* Running the same commands on all hosts::
* Transferring environment variables and functions::
* Showing what is actually run::
@end menu

@node Sshlogin
@section Sshlogin

The most basic sshlogin is @strong{-S} @emph{host}:

@verbatim
  parallel -S $SERVER1 echo running on ::: $SERVER1
@end verbatim

Output:

@verbatim
  running on [$SERVER1]
@end verbatim

To use a different username prepend the server with @emph{username@@}:

@verbatim
  parallel -S username@$SERVER1 echo running on ::: username@$SERVER1
@end verbatim

Output:

@verbatim
  running on [username@$SERVER1]
@end verbatim

The special sshlogin @strong{:} is the local machine:

@verbatim
  parallel -S : echo running on ::: the_local_machine
@end verbatim

Output:

@verbatim
  running on the_local_machine
@end verbatim

If @strong{ssh} is not in $PATH it can be prepended to $SERVER1:

@verbatim
  parallel -S '/usr/bin/ssh '$SERVER1 echo custom ::: ssh
@end verbatim

Output:

@verbatim
  custom ssh
@end verbatim

The @strong{ssh} command can also be given using @strong{--ssh}:

@verbatim
  parallel --ssh /usr/bin/ssh -S $SERVER1 echo custom ::: ssh
@end verbatim

or by setting @strong{$PARALLEL_SSH}:

@verbatim
  export PARALLEL_SSH=/usr/bin/ssh
  parallel -S $SERVER1 echo custom ::: ssh
@end verbatim

Several servers can be given using multiple @strong{-S}:

@verbatim
  parallel -S $SERVER1 -S $SERVER2 echo ::: running on more hosts
@end verbatim

Output (the order may be different):

@verbatim
  running
  on
  more
  hosts
@end verbatim

Or they can be separated by @strong{,}:

@verbatim
  parallel -S $SERVER1,$SERVER2 echo ::: running on more hosts
@end verbatim

Output: Same as above.

Or newline:

@verbatim
  # This gives a \n between $SERVER1 and $SERVER2
  SERVERS="`echo $SERVER1; echo $SERVER2`"
  parallel -S "$SERVERS" echo ::: running on more hosts
@end verbatim

They can also be read from a file (replace @emph{user@@} with the user on @strong{$SERVER2}):

@verbatim
  echo $SERVER1 > nodefile
  # Force 4 cores, special ssh-command, username
  echo 4//usr/bin/ssh user@$SERVER2 >> nodefile
  parallel --sshloginfile nodefile echo ::: running on more hosts
@end verbatim

Output: Same as above.

Every time a job finishes, the @strong{--sshloginfile} will be re-read, so
it is possible to both add and remove hosts while running.
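
Adding or removing a host is then just a matter of editing the file
while GNU @strong{parallel} is running. The following is a sketch of such
edits; the host names are made up, and a temporary file stands in for
@strong{nodefile}:

@verbatim
  # Hypothetical host names; substitute your own servers
  nodefile=$(mktemp)
  printf '%s\n' server1.example.com server2.example.com > "$nodefile"
  # Add a host (forced to 4 cores) by appending a line
  echo 4/server3.example.com >> "$nodefile"
  # Remove a host: write the new list to a separate file and rename
  # it, so that a partially written list is not picked up
  grep -v -x -F server1.example.com "$nodefile" > "$nodefile.new"
  mv "$nodefile.new" "$nodefile"
  cat "$nodefile"
@end verbatim

Output:

@verbatim
  server2.example.com
  4/server3.example.com
@end verbatim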

The special @strong{--sshloginfile ..} reads from @strong{~/.parallel/sshloginfile}.

To force GNU @strong{parallel} to treat a server as having a given number of CPU
cores prepend the number of cores followed by @strong{/} to the sshlogin:

@verbatim
  parallel -S 4/$SERVER1 echo force {} cpus on server ::: 4
@end verbatim

Output:

@verbatim
  force 4 cpus on server
@end verbatim

Servers can be put into groups by prepending @emph{@@groupname} to the
server and the group can then be selected by appending @emph{@@groupname} to
the argument if using @strong{--hostgroup}:

@verbatim
  parallel --hostgroup -S @grp1/$SERVER1 -S @grp2/$SERVER2 echo {} ::: \
    run_on_grp1@grp1 run_on_grp2@grp2
@end verbatim

Output:

@verbatim
  run_on_grp1
  run_on_grp2
@end verbatim

A host can be in multiple groups by separating the groups with @strong{+}, and
you can force GNU @strong{parallel} to limit the groups on which the command
can be run with @strong{-S} @emph{@@groupname}:

@verbatim
  parallel -S @grp1 -S @grp1+grp2/$SERVER1 -S @grp2/$SERVER2 echo {} ::: \
    run_on_grp1 also_grp1
@end verbatim

Output:

@verbatim
  run_on_grp1
  also_grp1
@end verbatim

@node Transferring files
@section Transferring files

GNU @strong{parallel} can transfer the files to be processed to the remote
host. It does that using rsync.

@verbatim
  echo This is input_file > input_file
  parallel -S $SERVER1 --transferfile {} cat ::: input_file 
@end verbatim

Output:

@verbatim
  This is input_file
@end verbatim

If the files are processed into another file, the resulting file can be
transferred back:

@verbatim
  echo This is input_file > input_file
  parallel -S $SERVER1 --transferfile {} --return {}.out cat {} ">"{}.out ::: input_file 
  cat input_file.out
@end verbatim

Output: Same as above.

To remove the input and output file on the remote server use @strong{--cleanup}:

@verbatim
  echo This is input_file > input_file
  parallel -S $SERVER1 --transferfile {} --return {}.out --cleanup cat {} ">"{}.out ::: input_file 
  cat input_file.out
@end verbatim

Output: Same as above.

There is a shorthand for @strong{--transferfile @{@} --return --cleanup} called @strong{--trc}:

@verbatim
  echo This is input_file > input_file
  parallel -S $SERVER1 --trc {}.out cat {} ">"{}.out ::: input_file 
  cat input_file.out
@end verbatim

Output: Same as above.

Some jobs need a common database for all jobs. GNU @strong{parallel} can
transfer that using @strong{--basefile} which will transfer the file before the
first job:

@verbatim
  echo common data > common_file
  parallel --basefile common_file -S $SERVER1 cat common_file\; echo {} ::: foo
@end verbatim

Output:

@verbatim
  common data
  foo
@end verbatim

To remove it from the remote host after the last job use @strong{--cleanup}.

@node Working dir
@section Working dir

The default working dir on the remote machines is the login dir. This
can be changed with @strong{--workdir} @emph{mydir}.

Files transferred using @strong{--transferfile} and @strong{--return} will be relative
to @emph{mydir} on remote computers, and the command will be executed in
the dir @emph{mydir}.

The special @emph{mydir} value @strong{...} will create working dirs under
@strong{~/.parallel/tmp} on the remote computers. If @strong{--cleanup} is given
these dirs will be removed.

The special @emph{mydir} value @strong{.} uses the current working dir.  If the
current working dir is beneath your home dir, the value @strong{.} is
treated as the relative path to your home dir. This means that if your
home dir is different on remote computers (e.g. if your login is
different) the relative path will still be relative to your home dir.

@verbatim
  parallel -S $SERVER1 pwd ::: ""
  parallel --workdir . -S $SERVER1 pwd ::: ""
  parallel --workdir ... -S $SERVER1 pwd ::: ""
@end verbatim

Output:

@verbatim
  [the login dir on $SERVER1]
  [current dir relative on $SERVER1]
  [a dir in ~/.parallel/tmp/...]
@end verbatim
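
The path mapping done by @strong{--workdir .} can be pictured with plain
shell variables. This only illustrates the rule described above and is
not how GNU @strong{parallel} implements it; all paths are hypothetical:

@verbatim
  # Hypothetical paths: local home, local current dir, remote home
  local_home=/home/alice
  local_cwd=/home/alice/project/data
  remote_home=/home/bob
  # Strip the local home dir to get the relative path ...
  rel=${local_cwd#"$local_home"/}
  # ... and resolve it against the home dir on the remote computer
  remote_cwd=$remote_home/$rel
  echo "$remote_cwd"
@end verbatim

Output:

@verbatim
  /home/bob/project/data
@end verbatim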

@node Avoid overloading sshd
@section Avoid overloading sshd

If many jobs are started on the same server, @strong{sshd} can be
overloaded. GNU @strong{parallel} can insert a delay between each job run on
the same server:

@verbatim
  parallel -S $SERVER1 --sshdelay 0.2 echo ::: 1 2 3
@end verbatim

Output (the order may be different):

@verbatim
  1
  2
  3
@end verbatim

@strong{sshd} will be less overloaded if using @strong{--controlmaster}, which will
multiplex ssh connections:

@verbatim
  parallel --controlmaster -S $SERVER1 echo ::: 1 2 3
@end verbatim

Output: Same as above.

@node Ignore hosts that are down
@section Ignore hosts that are down

In clusters with many hosts a few of them are often down. GNU @strong{parallel}
can ignore those hosts. In this case the host 173.194.32.46 is down:

@verbatim
  parallel --filter-hosts -S 173.194.32.46,$SERVER1 echo ::: bar 
@end verbatim

Output:

@verbatim
  bar
@end verbatim

@node Running the same commands on all hosts
@section Running the same commands on all hosts

GNU @strong{parallel} can run the same command on all the hosts:

@verbatim
  parallel --onall -S $SERVER1,$SERVER2 echo ::: foo bar
@end verbatim

Output (the order may be different):

@verbatim
  foo
  bar
  foo
  bar
@end verbatim

Often you will just want to run a single command on all hosts without
arguments. @strong{--nonall} is a no argument @strong{--onall}:

@verbatim
  parallel --nonall -S $SERVER1,$SERVER2 echo foo bar
@end verbatim

Output:

@verbatim
  foo bar
  foo bar
@end verbatim

When @strong{--tag} is used with @strong{--nonall} and @strong{--onall} the @strong{--tagstring} is the host:

@verbatim
  parallel --nonall --tag -S $SERVER1,$SERVER2 echo foo bar
@end verbatim

Output (the order may be different):

@verbatim
  $SERVER1 foo bar
  $SERVER2 foo bar
@end verbatim

@strong{--jobs} sets the number of servers to log in to in parallel.

@node Transferring environment variables and functions
@section Transferring environment variables and functions

Using @strong{--env} GNU @strong{parallel} can transfer an environment variable to the
remote system.

@verbatim
  MYVAR='foo bar'
  export MYVAR
  parallel --env MYVAR -S $SERVER1 echo '$MYVAR' ::: baz
@end verbatim

Output:

@verbatim
  foo bar baz
@end verbatim

This works for functions, too, if your shell is Bash:

@verbatim
  # This only works in Bash
  my_func() {
    echo in my_func $1
  }
  export -f my_func
  parallel --env my_func -S $SERVER1 my_func ::: baz
@end verbatim

Output:

@verbatim
  in my_func baz
@end verbatim

GNU @strong{parallel} can copy all defined variables and functions to the
remote system. It just needs to record which ones to ignore in
@strong{~/.parallel/ignored_vars}. Do that by running this once:

@verbatim
  parallel --record-env
  cat ~/.parallel/ignored_vars
@end verbatim

Output:

@verbatim
  [list of variables to ignore - including $PATH and $HOME]
@end verbatim

Now all new variables and functions defined will be copied when using
@strong{--env _}:

@verbatim
  # The function is only copied if using Bash
  my_func2() {
    echo in my_func2 $VAR $1
  }
  export -f my_func2
  VAR=foo
  export VAR

  parallel --env _ -S $SERVER1 'echo $VAR; my_func2' ::: bar
@end verbatim

Output:

@verbatim
  foo
  in my_func2 foo bar
@end verbatim

@node Showing what is actually run
@section Showing what is actually run

@strong{--verbose} will show the command that would be run on the local
machine. When a job is run on a remote machine, this is wrapped with
@strong{ssh} and possibly transferring files and environment variables, setting
the workdir, and setting @strong{--nice} value. @strong{-vv} shows all of this.

@verbatim
  parallel -vv -S $SERVER1 echo ::: bar
@end verbatim

Output:

@verbatim
  ssh lo -- exec perl -e \''@GNU_Parallel=("use","IPC::Open3;","use","MIME::Base64");
  eval"@GNU_Parallel";my$eval;$eval=decode_base64(join"",@ARGV);eval$eval;'\' 
  JEVOVnsiUEFSQUxMRUxfUElEIn09IjI3MzQiOyRFTlZ7IlBBUkFMTEVMX1NFUSJ9PSIx
  IjskYmFzaGZ1bmMgPSAiIjtAQVJHVj0iZWNobyBiYXIiOyRzaGVsbD0iJEVOVntTSEVM
  TH0iOyR0bXBkaXI9Ii90bXAiOyRuaWNlPTA7ZG97JEVOVntQQVJBTExFTF9UTVB9PSR0
  bXBkaXIuIi9wYXIiLmpvaW4iIixtYXB7KDAuLjksImEiLi4ieiIsIkEiLi4iWiIpW3Jh
  bmQoNjIpXX0oMS4uNSk7fXdoaWxlKC1lJEVOVntQQVJBTExFTF9UTVB9KTskU0lHe0NI
  TER9PXN1YnskZG9uZT0xO307JHBpZD1mb3JrO3VubGVzcygkcGlkKXtzZXRwZ3JwO2V2
  YWx7c2V0cHJpb3JpdHkoMCwwLCRuaWNlKX07ZXhlYyRzaGVsbCwiLWMiLCgkYmFzaGZ1
  bmMuIkBBUkdWIik7ZGllImV4ZWM6JCFcbiI7fWRveyRzPSRzPDE/MC4wMDErJHMqMS4w
  MzokcztzZWxlY3QodW5kZWYsdW5kZWYsdW5kZWYsJHMpO311bnRpbCgkZG9uZXx8Z2V0
  cHBpZD09MSk7a2lsbChTSUdIVVAsLSR7cGlkfSl1bmxlc3MkZG9uZTt3YWl0O2V4aXQo
  JD8mMTI3PzEyOCsoJD8mMTI3KToxKyQ/Pj44KQ==;
  bar
@end verbatim

When the command gets more complex, the output is so hard to read that it is only useful for debugging:

@verbatim
  my_func3() {
    echo in my_func $1 > $1.out
  }
  export -f my_func3
  parallel -vv --workdir ... --nice 17 --env _ --trc {}.out -S $SERVER1 my_func3 {} ::: abc-file
@end verbatim

Output will be similar to:

@verbatim
  ( ssh lo -- mkdir -p ./.parallel/tmp/hk-3492-1;rsync --protocol 30
  -rlDzR -essh ./abc-file lo:./.parallel/tmp/hk-3492-1 );ssh lo --
  exec perl -e \''@GNU_Parallel=("use","IPC::Open3;","use","MIME::Base64");
  eval"@GNU_Parallel";my$eval;$eval=decode_base64(join"",@ARGV);eval$eval;'\'
  c3lzdGVtKCJta2RpciIsIi1wIiwiLS0iLCIucGFyYWxsZWwvdG1wL2hrLTM0OTItMSIp
  OyBjaGRpciAiLnBhcmFsbGVsL3RtcC9oay0zNDkyLTEiIHx8cHJpbnQoU1RERVJSICJw
  YXJhbGxlbDogQ2Fubm90IGNoZGlyIHRvIC5wYXJhbGxlbC90bXAvaGstMzQ5Mi0xXG4i
  KSAmJiBleGl0IDI1NTskRU5WeyJHUEdfQUdFTlRfSU5GTyJ9PSIvdG1wL2dwZy10WjVI
  U0QvUy5ncGctYWdlbnQ6MjM5NzoxIjskRU5WeyJQQVJBTExFTF9TRVEifT0iMSI7JEVO
  VnsiU1FMSVRFVEJMIn09InNxbGl0ZTM6Ly8vJTJGdG1wJTJGcGFyYWxsZWwuZGIyL3Bh
  cnNxbDIiOyRFTlZ7IlBBUkFMTEVMX1BJRCJ9PSIzNDkyIjskRU5WeyJTUUxJVEUifT0i
  c3FsaXRlMzovLy8lMkZ0bXAlMkZwYXJhbGxlbC5kYjIiOyRFTlZ7IlBBUkFMTEVMX1BJ
  RCJ9PSIzNDkyIjskRU5WeyJQQVJBTExFTF9TRVEifT0iMSI7QGJhc2hfZnVuY3Rpb25z
  PXF3KG15X2Z1bmMzKTsgaWYoJEVOVnsiU0hFTEwifT1+L2NzaC8pIHsgcHJpbnQgU1RE
  RVJSICJDU0gvVENTSCBETyBOT1QgU1VQUE9SVCBuZXdsaW5lcyBJTiBWQVJJQUJMRVMv
  RlVOQ1RJT05TLiBVbnNldCBAYmFzaF9mdW5jdGlvbnNcbiI7IGV4ZWMgImZhbHNlIjsg
  fSAKJGJhc2hmdW5jID0gIm15X2Z1bmMzKCkgeyAgZWNobyBpbiBteV9mdW5jIFwkMSA+
  IFwkMS5vdXQKfTtleHBvcnQgLWYgbXlfZnVuYzMgPi9kZXYvbnVsbDsiO0BBUkdWPSJt
  eV9mdW5jMyBhYmMtZmlsZSI7JHNoZWxsPSIkRU5We1NIRUxM
  fSI7JHRtcGRpcj0iL3RtcCI7JG5pY2U9MTc7ZG97JEVOVntQQVJBTExFTF9UTVB9PSR0
  bXBkaXIuIi9wYXIiLmpvaW4iIixtYXB7KDAuLjksImEiLi4ieiIsIkEiLi4iWiIpW3Jh
  bmQoNjIpXX0oMS4uNSk7fXdoaWxlKC1lJEVOVntQQVJBTExFTF9UTVB9KTskU0lHe0NI
  TER9PXN1YnskZG9uZT0xO307JHBpZD1mb3JrO3VubGVzcygkcGlkKXtzZXRwZ3JwO2V2
  YWx7c2V0cHJpb3JpdHkoMCwwLCRuaWNlKX07ZXhlYyRzaGVsbCwiLWMiLCgkYmFzaGZ1
  bmMuIkBBUkdWIik7ZGllImV4ZWM6JCFcbiI7fWRveyRzPSRzPDE/MC4wMDErJHMqMS4w
  MzokcztzZWxlY3QodW5kZWYsdW5kZWYsdW5kZWYsJHMpO311bnRpbCgkZG9uZXx8Z2V0
  cHBpZD09MSk7a2lsbChTSUdIVVAsLSR7cGlkfSl1bmxlc3MkZG9uZTt3YWl0O2V4aXQo
  JD8mMTI3PzEyOCsoJD8mMTI3KToxKyQ/Pj44KQ==;_EXIT_status=$?;
  mkdir -p ./.; rsync --protocol 30 --rsync-path=cd\
  ./.parallel/tmp/hk-3492-1/./.\;\ rsync -rlDzR -essh
  lo:./abc-file.out ./.;ssh lo -- \(rm\ -f\
  ./.parallel/tmp/hk-3492-1/abc-file\;\ sh\ -c\ \'rmdir\
  ./.parallel/tmp/hk-3492-1/\ ./.parallel/tmp/\ ./.parallel/\
  2\>/dev/null\'\;rm\ -rf\ ./.parallel/tmp/hk-3492-1\;\);ssh lo --
  \(rm\ -f\ ./.parallel/tmp/hk-3492-1/abc-file.out\;\ sh\ -c\ \'rmdir\
  ./.parallel/tmp/hk-3492-1/\ ./.parallel/tmp/\ ./.parallel/\
  2\>/dev/null\'\;rm\ -rf\ ./.parallel/tmp/hk-3492-1\;\);ssh lo -- rm
  -rf .parallel/tmp/hk-3492-1; exit $_EXIT_status;
@end verbatim

@node Saving to an SQL base (advanced)
@chapter Saving to an SQL base (advanced)

GNU @strong{parallel} can save into an SQL base. Point GNU @strong{parallel} to a
table and it will put the joblog there together with the variables and
the output, each in their own column.

GNU @strong{parallel} uses a DBURL to address the table. A DBURL has this format:

@verbatim
  vendor://[[user][:password]@][host][:port]/[database[/table]]
@end verbatim

Example:

@verbatim
  mysql://scott:tiger@my.example.com/mydatabase/mytable
  postgresql://scott:tiger@pg.example.com/mydatabase/mytable
  sqlite3:///%2Ftmp%2Fmydatabase/mytable
@end verbatim

To refer to @strong{/tmp/mydatabase} with @strong{sqlite} you need to encode the @strong{/} as @strong{%2F}.
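
The encoded form can be built from the file name instead of being
typed by hand. A minimal sketch, assuming the path contains no other
characters that need encoding (the variable names are only for
illustration):

@verbatim
  # Hypothetical database file; any absolute path works the same way
  dbfile=/tmp/mydatabase
  # Replace every / in the path with %2F
  encoded=$(printf '%s' "$dbfile" | sed 's,/,%2F,g')
  DBURL=sqlite3:///$encoded
  echo "$DBURL"
@end verbatim

Output:

@verbatim
  sqlite3:///%2Ftmp%2Fmydatabase
@end verbatim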

Run a job using @strong{sqlite} on @strong{mytable} in @strong{/tmp/mydatabase}:

@verbatim
  DBURL=sqlite3:///%2Ftmp%2Fmydatabase
  DBURLTABLE=$DBURL/mytable
  parallel --sqlandworker $DBURLTABLE echo ::: foo bar ::: baz quuz
@end verbatim

To see the result:

@verbatim
  sql $DBURL 'SELECT * FROM mytable ORDER BY Seq;'
@end verbatim

Output will be similar to:

@verbatim
  Seq|Host|Starttime|JobRuntime|Send|Receive|Exitval|_Signal|Command|V1|V2|Stdout|Stderr
  1|:|1451619638.903|0.806||8|0|0|echo foo baz|foo|baz|foo baz
  |
  2|:|1451619639.265|1.54||9|0|0|echo foo quuz|foo|quuz|foo quuz
  |
  3|:|1451619640.378|1.43||8|0|0|echo bar baz|bar|baz|bar baz
  |
  4|:|1451619641.473|0.958||9|0|0|echo bar quuz|bar|quuz|bar quuz
  |
@end verbatim

The first columns are well known from @strong{--joblog}. @strong{V1} and @strong{V2} are
data from the input sources. @strong{Stdout} and @strong{Stderr} are standard
output and standard error, respectively.

@menu
* Using multiple workers::
@end menu

@node Using multiple workers
@section Using multiple workers

Using an SQL base as storage costs a lot of performance.

One of the situations where it makes sense is if you have multiple
workers.

You can then have a single master machine that submits jobs to the SQL
base (but does not do any of the work):

@verbatim
  parallel --sql $DBURLTABLE echo ::: foo bar ::: baz quuz
@end verbatim

On the worker machines you run exactly the same command except you
replace @strong{--sql} with @strong{--sqlworker}.

@verbatim
  parallel --sqlworker $DBURLTABLE echo ::: foo bar ::: baz quuz
@end verbatim

To run a master and a worker on the same machine use @strong{--sqlandworker}
as shown earlier.

@node --pipe
@chapter --pipe

The @strong{--pipe} functionality puts GNU @strong{parallel} in a different mode:
Instead of treating the data on stdin (standard input) as arguments
for a command to run, the data will be sent to stdin (standard input)
of the command.

The typical situation is:

@verbatim
  command_A | command_B | command_C
@end verbatim

where command_B is slow, and you want to speed up command_B.

@menu
* Chunk size::
* Records::
* Record separators::
* Header::
* --pipepart::
@end menu

@node Chunk size
@section Chunk size

By default GNU @strong{parallel} will start an instance of command_B, read a
chunk of 1 MB, and pass that to the instance. Then start another
instance, read another chunk, and pass that to the second instance.

@verbatim
  cat num1000000 | parallel --pipe wc
@end verbatim

Output (the order may be different):

@verbatim
  165668  165668 1048571
  149797  149797 1048579
  149796  149796 1048572
  149797  149797 1048579
  149797  149797 1048579
  149796  149796 1048572
   85349   85349  597444
@end verbatim

The size of the chunk is not exactly 1 MB because GNU @strong{parallel} only
passes full lines - never half a line, thus the block size is only
1 MB on average. You can change the block size to 2 MB with @strong{--block}:

@verbatim
  cat num1000000 | parallel --pipe --block 2M wc
@end verbatim

Output (the order may be different):

@verbatim
  315465  315465 2097150
  299593  299593 2097151
  299593  299593 2097151
   85349   85349  597444
@end verbatim
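
The byte counts can be checked by hand. Assuming @strong{num1000000} contains
the numbers 1 to 1000000, one per line (which is consistent with the
@strong{wc} output above), the first 165668 lines take up 1048571 bytes, and
one more line would exceed 1 MB (1048576 bytes). That is why the first
chunk in the 1 MB example stops at that line:

@verbatim
  # Bytes used by the first 165668 lines, and by one line more
  fits=$(seq 165668 | wc -c)
  too_big=$(seq 165669 | wc -c)
  # $(( )) also strips any padding that wc may add
  echo "$((fits)) $((too_big))"
@end verbatim

Output:

@verbatim
  1048571 1048578
@end verbatim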

GNU @strong{parallel} treats each line as a record. If the order of records is
unimportant (e.g. you need all lines processed, but you do not care
which is processed first), then you can use @strong{--round-robin}. Without
@strong{--round-robin} GNU @strong{parallel} will start a command per block; with
@strong{--round-robin} only the requested number of jobs will be started
(@strong{--jobs}). The records will then be distributed between the running
jobs:

@verbatim
  cat num1000000 | parallel --pipe -j4 --round-robin wc
@end verbatim

Output will be similar to:

@verbatim
  149797  149797 1048579
  299593  299593 2097151
  315465  315465 2097150
  235145  235145 1646016
@end verbatim

One of the 4 instances got a single block, 2 instances got 2 full
blocks each, and one instance got 1 full and 1 partial block.

@node Records
@section Records

GNU @strong{parallel} sees the input as records. The default record is a single
line.

Using @strong{-N140000} GNU @strong{parallel} will read 140000 records at a time:

@verbatim
  cat num1000000 | parallel --pipe -N140000 wc
@end verbatim

Output (the order may be different):

@verbatim
  140000  140000  868895
  140000  140000  980000
  140000  140000  980000
  140000  140000  980000
  140000  140000  980000
  140000  140000  980000
  140000  140000  980000
   20000   20000  140001
@end verbatim

Notice that the last job could not get the full 140000 lines, but only
20000 lines.

If a record is 75 lines, @strong{-L} can be used:

@verbatim
  cat num1000000 | parallel --pipe -L75 wc
@end verbatim

Output (the order may be different):

@verbatim
  165600  165600 1048095
  149850  149850 1048950
  149775  149775 1048425
  149775  149775 1048425
  149850  149850 1048950
  149775  149775 1048425
   85350   85350  597450
      25      25     176
@end verbatim

Notice GNU @strong{parallel} still reads a block of around 1 MB; but instead of
passing full lines to @strong{wc} it passes a multiple of 75 lines at a time. This
of course does not hold for the last job (which in this case got 25
lines).

@node Record separators
@section Record separators

GNU @strong{parallel} uses separators to determine where two records split.

@strong{--recstart} gives the string that starts a record; @strong{--recend} gives the
string that ends a record. The default is @strong{--recend '\n'} (newline).

If both @strong{--recend} and @strong{--recstart} are given, then the record will only
split if the recend string is immediately followed by the recstart
string.

Here the @strong{--recend} is set to @strong{', '}:

@verbatim
  echo /foo, bar/, /baz, qux/, | parallel -kN1 --recend ', ' --pipe echo JOB{#}\;cat\;echo END
@end verbatim

Output:

@verbatim
  JOB1
  /foo, END
  JOB2
  bar/, END
  JOB3
  /baz, END
  JOB4
  qux/,
  END
@end verbatim

Here the @strong{--recstart} is set to @strong{/}:

@verbatim
  echo /foo, bar/, /baz, qux/, | parallel -kN1 --recstart / --pipe echo JOB{#}\;cat\;echo END
@end verbatim

Output:

@verbatim
  JOB1
  /foo, barEND
  JOB2
  /, END
  JOB3
  /baz, quxEND
  JOB4
  /,
  END
@end verbatim

Here both @strong{--recend} and @strong{--recstart} are set:

@verbatim
  echo /foo, bar/, /baz, qux/, | parallel -kN1 --recend ', ' --recstart / --pipe echo JOB{#}\;cat\;echo END
@end verbatim

Output:

@verbatim
  JOB1
  /foo, bar/, END
  JOB2
  /baz, qux/,
  END
@end verbatim

Note the difference between setting one string and setting both strings.
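
A common use of @strong{--recstart} alone is input where every record begins
with a marker. The sketch below uses made-up FASTA-like data (the names
s1 and s2 are only for illustration) in which each record starts with
the character >:

@verbatim
  # Each record starts with '>'; -N1 gives every job exactly one record
  printf '>s1\nACGT\n>s2\nGGCC\n' |
    parallel -kN1 --recstart '>' --pipe echo JOB{#}\;cat
@end verbatim

Each job should print its job number followed by one complete record.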

With @strong{--regexp} the @strong{--recend} and @strong{--recstart} strings are treated as regular expressions:

@verbatim
  echo foo,bar,_baz,__qux, | parallel -kN1 --regexp --recend ,_+ --pipe echo JOB{#}\;cat\;echo END
@end verbatim

Output:

@verbatim
  JOB1
  foo,bar,_END
  JOB2
  baz,__END
  JOB3
  qux,
  END
@end verbatim

GNU @strong{parallel} can remove the record separators with @strong{--remove-rec-sep}/@strong{--rrs}:

@verbatim
  echo foo,bar,_baz,__qux, | parallel -kN1 --rrs --regexp --recend ,_+ --pipe echo JOB{#}\;cat\;echo END
@end verbatim

Output:

@verbatim
  JOB1
  foo,barEND
  JOB2
  bazEND
  JOB3
  qux,
  END
@end verbatim

@node Header
@section Header

If the input data has a header, the header can be repeated for each
job by matching the header with @strong{--header}. If headers start with
@strong{%} you can do this:

@verbatim
  cat num_%header | parallel --header '(%.*\n)*' --pipe -N3 echo JOB{#}\;cat
@end verbatim

Output (the order may be different):

@verbatim
  JOB1
  %head1
  %head2
  1
  2
  3
  JOB2
  %head1
  %head2
  4
  5
  6
  JOB3
  %head1
  %head2
  7
  8
  9
  JOB4
  %head1
  %head2
  10
@end verbatim

If the header is 2 lines, @strong{--header 2} will work:

@verbatim
  cat num_%header | parallel --header 2 --pipe -N3 echo JOB{#}\;cat
@end verbatim

Output: Same as above.
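
For data with a single header line, such as a CSV file, the same idea
should apply with @strong{--header 1}. A minimal sketch, using made-up data:

@verbatim
  # The first line is the header; it is repeated in front of every
  # chunk of 2 data records
  printf 'name,value\na,1\nb,2\nc,3\n' |
    parallel -k --header 1 --pipe -N2 cat
@end verbatim

The header line should appear once per job, so twice in total here.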

@node --pipepart
@section --pipepart

@strong{--pipe} is not very efficient. It maxes out at around 500
MB/s. @strong{--pipepart} can easily deliver 5 GB/s. There are a few
limitations, though: the input has to be a normal file (not a pipe)
given by @strong{-a} or @strong{::::}, and @strong{-L}/@strong{-l}/@strong{-N} do not work.

@verbatim
  parallel --pipepart -a num1000000 --block 3m wc
@end verbatim

Output (the order may be different):

@verbatim
 444443  444444 3000002
 428572  428572 3000004
 126985  126984  888890
@end verbatim
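
To check that no data is lost or duplicated when a file is split, the
line counts of all chunks can be added up. A small sketch, assuming
@strong{mktemp}, @strong{seq} and @strong{awk} are available (the temporary file is only
for illustration):

@verbatim
  # Make a small regular file, as --pipepart cannot read from a pipe
  tmp=$(mktemp)
  seq 1000 > "$tmp"
  # Split it into chunks of roughly 1 kB and sum the per-chunk line counts
  parallel --pipepart -a "$tmp" --block 1k wc -l |
    awk '{ sum += $1 } END { print sum }'
  rm "$tmp"
@end verbatim

The sum should equal the 1000 lines in the file.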

@node Shebang
@chapter Shebang

@menu
* Input data and parallel command in the same file::
* Parallelizing existing scripts::
@end menu

@node Input data and parallel command in the same file
@section Input data and parallel command in the same file

GNU @strong{parallel} is often called like this:

@verbatim
  cat input_file | parallel command
@end verbatim

With @strong{--shebang} the @emph{input_file} and @strong{parallel} can be combined into the same script.

UNIX shell scripts start with a shebang line like this:

@verbatim
  #!/bin/bash
@end verbatim

GNU @strong{parallel} can do that, too. With @strong{--shebang} the arguments can be
listed in the file. The @strong{parallel} command is the first line of the
script:

@verbatim
  #!/usr/bin/parallel --shebang -r echo

  foo
  bar
  baz
@end verbatim

Output (the order may be different):

@verbatim
  foo
  bar
  baz
@end verbatim

@node Parallelizing existing scripts
@section Parallelizing existing scripts

GNU @strong{parallel} is often called like this:

@verbatim
  cat input_file | parallel command
  parallel command ::: foo bar
@end verbatim

If @strong{command} is a script, @strong{parallel} and the script can be combined into
a single file, so that this will run the script in parallel:

@verbatim
  cat input_file | command
  command foo bar
@end verbatim

This @strong{perl} script @strong{perl_echo} works like @strong{echo}:

@verbatim
  #!/usr/bin/perl

  print "@ARGV\n"
@end verbatim

It can be called like this:

@verbatim
  parallel perl_echo ::: foo bar
@end verbatim

By changing the @strong{#!}-line it can be run in parallel:

@verbatim
  #!/usr/bin/parallel --shebang-wrap /usr/bin/perl

  print "@ARGV\n"
@end verbatim

Thus this will work:

@verbatim
  perl_echo foo bar
@end verbatim

Output (the order may be different):

@verbatim
  foo
  bar
@end verbatim

This technique can be used for:

@table @asis
@item Perl:
@anchor{Perl:}

@verbatim
  #!/usr/bin/parallel --shebang-wrap /usr/bin/perl
  
  print "Arguments @ARGV\n";
@end verbatim

@item Python:
@anchor{Python:}

@verbatim
  #!/usr/bin/parallel --shebang-wrap /usr/bin/python
  
  import sys
  print('Arguments', str(sys.argv))
@end verbatim

@item Bash/sh/zsh/Korn shell:
@anchor{Bash/sh/zsh/Korn shell:}

@verbatim
  #!/usr/bin/parallel --shebang-wrap /bin/bash
  
  echo Arguments "$@"
@end verbatim

@item csh:
@anchor{csh:}

@verbatim
  #!/usr/bin/parallel --shebang-wrap /bin/csh
  
  echo Arguments "$argv"
@end verbatim

@item Tcl:
@anchor{Tcl:}

@verbatim
  #!/usr/bin/parallel --shebang-wrap /usr/bin/tclsh
  
  puts "Arguments $argv"
@end verbatim

@item R:
@anchor{R:}

@verbatim
  #!/usr/bin/parallel --shebang-wrap /usr/bin/Rscript --vanilla --slave
  
  args <- commandArgs(trailingOnly = TRUE)
  print(paste("Arguments ",args))
@end verbatim

@item GNUplot:
@anchor{GNUplot:}

@verbatim
  #!/usr/bin/parallel --shebang-wrap ARG={} /usr/bin/gnuplot
  
  print "Arguments ", system('echo $ARG')
@end verbatim

@item Ruby:
@anchor{Ruby:}

@verbatim
  #!/usr/bin/parallel --shebang-wrap /usr/bin/ruby
  
  print "Arguments "
  puts ARGV
@end verbatim

@item Octave:
@anchor{Octave:}

@verbatim
  #!/usr/bin/parallel --shebang-wrap /usr/bin/octave
  
  printf ("Arguments");
  arg_list = argv ();
  for i = 1:nargin
    printf (" %s", arg_list{i});
  endfor
  printf ("\n");
@end verbatim

@item Common LISP:
@anchor{Common LISP:}

@verbatim
  #!/usr/bin/parallel --shebang-wrap /usr/bin/clisp
  
  (format t "~&~S~&" 'Arguments)
  (format t "~&~S~&" *args*)
@end verbatim

@item PHP:
@anchor{PHP:}

@verbatim
  #!/usr/bin/parallel --shebang-wrap /usr/bin/php
  <?php
  echo "Arguments";
  foreach(array_slice($argv,1) as $v)
  {
    echo " $v";
  }
  echo "\n";
  ?>
@end verbatim

@item Node.js:
@anchor{Node.js:}

@verbatim
  #!/usr/bin/parallel --shebang-wrap /usr/bin/node

  var myArgs = process.argv.slice(2);
  console.log('Arguments ', myArgs);
@end verbatim

@item LUA:
@anchor{LUA:}

@verbatim
  #!/usr/bin/parallel --shebang-wrap /usr/bin/lua
  
  io.write "Arguments"
  for a = 1, #arg do
    io.write(" ")
    io.write(arg[a])
  end
  print("")
@end verbatim

@item C#:
@anchor{C#:}

@verbatim
  #!/usr/bin/parallel --shebang-wrap ARGV={} /usr/bin/csharp
  
  var argv = Environment.GetEnvironmentVariable("ARGV");
  print("Arguments "+argv);
  
@end verbatim

@end table

@node Semaphore
@chapter Semaphore

GNU @strong{parallel} can work as a counting semaphore. This is slower and less
efficient than its normal mode.

A counting semaphore is like a row of toilets. People needing a toilet
can use any toilet, but if there are more people than toilets, they
will have to wait for one of the toilets to be available.

An alias for @strong{parallel --semaphore} is @strong{sem}.

@strong{sem} will follow a person to the toilets, wait until a toilet is
available, leave the person in the toilet and exit.

@strong{sem --fg} will follow a person to the toilets, wait until a toilet is
available, stay with the person in the toilet and exit when the person
exits.

@strong{sem --wait} will wait for all persons to leave the toilets.

@strong{sem} does not have a queue discipline, so the next person is chosen
randomly.

@strong{-j} sets the number of toilets.

@menu
* Mutex::
* Counting semaphore::
* Timeout::
@end menu

@node Mutex
@section Mutex

The default is to have only one toilet (this is called a mutex). The
program is started in the background and @strong{sem} exits immediately. Use
@strong{--wait} to wait for all @strong{sem}s to finish:

@verbatim
  sem 'sleep 1; echo The first finished' &&
    echo The first is now running in the background &&
    sem 'sleep 1; echo The second finished' &&
    echo The second is now running in the background
  sem --wait
@end verbatim

Output:

@verbatim
  The first is now running in the background
  The first finished
  The second is now running in the background
  The second finished
@end verbatim

The command can be run in the foreground with @strong{--fg}, which will only
exit when the command completes:

@verbatim
  sem --fg 'sleep 1; echo The first finished' &&
    echo The first finished running in the foreground &&
    sem --fg 'sleep 1; echo The second finished' &&
    echo The second finished running in the foreground
  sem --wait
@end verbatim

The difference between this and just running the command is that a
mutex is set, so if other @strong{sem}s were running in the background, only one
would run at a time.

To choose which semaphore is used, name it with
@strong{--semaphorename}/@strong{--id}. Run this in one terminal:

@verbatim
  sem --id my_id -u 'echo First started; sleep 10; echo The first finished'
@end verbatim

and simultaneously this in another terminal:

@verbatim
  sem --id my_id -u 'echo Second started; sleep 10; echo The second finished'
@end verbatim

Note how the second will only be started when the first has finished.

@node Counting semaphore
@section Counting semaphore

A mutex is like having a single toilet: When it is in use everyone
else will have to wait. A counting semaphore is like having multiple
toilets: Several people can use the toilets, but when they all are in
use, everyone else will have to wait.

@strong{sem} can emulate a counting semaphore. Use @strong{--jobs} to set the number of
toilets like this:

@verbatim
  sem --jobs 3 --id my_id -u 'echo First started; sleep 5; echo The first finished' &&
  sem --jobs 3 --id my_id -u 'echo Second started; sleep 6; echo The second finished' &&
  sem --jobs 3 --id my_id -u 'echo Third started; sleep 7; echo The third finished' &&
  sem --jobs 3 --id my_id -u 'echo Fourth started; sleep 8; echo The fourth finished' &&
  sem --wait --id my_id
@end verbatim

Output:

@verbatim
  First started
  Second started
  Third started
  The first finished
  Fourth started
  The second finished
  The third finished
  The fourth finished
@end verbatim
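
A typical use is to limit how many background jobs a shell loop starts
at once. The sketch below is an illustration only (the id demo and the
job numbers are made up): it runs at most 2 jobs at a time and waits
for all of them at the end:

@verbatim
  # Start 4 jobs, but let at most 2 of them run at the same time
  for i in 1 2 3 4; do
    sem --jobs 2 --id demo "sleep 1; echo Job $i finished"
  done
  # Block until every job using the semaphore 'demo' has finished
  sem --wait --id demo
@end verbatim

The order of the output lines may vary.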

@node Timeout
@section Timeout

With @strong{--semaphoretimeout} you can force running the command anyway after
a number of seconds (positive number) or give up after that many
seconds (negative number):

@verbatim
  sem --id foo -u 'echo Slow started; sleep 5; echo Slow ended' &&
  sem --id foo --semaphoretimeout 1 'echo Force this running after 1 sec' &&
  sem --id foo --semaphoretimeout -2 'echo Give up after 2 sec'
  sem --id foo --wait
@end verbatim

Output:

@verbatim
  Slow started
  parallel: Warning: Semaphore timed out. Stealing the semaphore.
  Force this running after 1 sec
  Slow ended
  parallel: Warning: Semaphore timed out. Exiting.
@end verbatim

Note how the 'Give up' was not run.

@node Informational
@chapter Informational

GNU @strong{parallel} has some options to give short information about the
configuration.

@strong{--help} will print a summary of the most important options:

@verbatim
  parallel --help
@end verbatim

Output:

@verbatim
  Usage:
  parallel [options] [command [arguments]] < list_of_arguments
  parallel [options] [command [arguments]] (::: arguments|:::: argfile(s))...
  cat ... | parallel --pipe [options] [command [arguments]]
  
  -j n           Run n jobs in parallel
  -k             Keep same order
  -X             Multiple arguments with context replace
  --colsep regexp      Split input on regexp for positional replacements
  {} {.} {/} {/.} {#}  Replacement strings
  {3} {3.} {3/} {3/.}  Positional replacement strings
  
  -S sshlogin    Example: foo@server.example.com
  --slf ..       Use ~/.parallel/sshloginfile as the list of sshlogins
  --trc {}.bar   Shorthand for --transfer --return {}.bar --cleanup
  --onall        Run the given command with argument on all sshlogins
  --nonall       Run the given command with no arguments on all sshlogins
  
  --pipe         Split stdin (standard input) to multiple jobs.
  --recend str   Record end separator for --pipe.
  --recstart str Record start separator for --pipe.
  
  See 'man parallel' for details
  
  When using GNU Parallel for a publication please cite:
  
  O. Tange (2011): GNU Parallel - The Command-Line Power Tool,
  ;login: The USENIX Magazine, February 2011:42-47.
@end verbatim

When asking for help, always report the full output of this:

@verbatim
  parallel --version
@end verbatim

Output:

@verbatim
  GNU parallel 20130822
  Copyright (C) 2007,2008,2009,2010,2011,2012,2013 Ole Tange and Free Software Foundation, Inc.
  License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
  This is free software: you are free to change and redistribute it.
  GNU parallel comes with no warranty.
  
  Web site: http://www.gnu.org/software/parallel
  
  When using GNU Parallel for a publication please cite:
  
  O. Tange (2011): GNU Parallel - The Command-Line Power Tool, 
  ;login: The USENIX Magazine, February 2011:42-47.
@end verbatim

In scripts @strong{--minversion} can be used to ensure the user has at least
this version:

@verbatim
  parallel --minversion 20130722 && echo Your version is at least 20130722.
@end verbatim

Output:

@verbatim
  20130722
  Your version is at least 20130722.
@end verbatim
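
In a script the printed version number is usually not wanted, so it can
be discarded and only the exit status used. A minimal sketch (the
function name require_parallel is made up for this example; 20140622 is
the version required by this tutorial):

@verbatim
  # Return success only if GNU parallel is at least the given version
  require_parallel() {
    parallel --minversion "$1" > /dev/null 2>&1
  }

  if require_parallel 20140622; then
    echo "GNU parallel is new enough"
  else
    echo "Please upgrade GNU parallel" >&2
  fi
@end verbatim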

If using GNU @strong{parallel} for research the BibTeX citation can be
generated using @strong{--bibtex}:

@verbatim
  parallel --bibtex
@end verbatim

Output:

@verbatim
  @article{Tange2011a,
   title = {GNU Parallel - The Command-Line Power Tool},
   author = {O. Tange},
   address = {Frederiksberg, Denmark},
   journal = {;login: The USENIX Magazine},
   month = {Feb},
   number = {1},
   volume = {36},
   url = {http://www.gnu.org/s/parallel},
   year = {2011},
   pages = {42-47}
  }
@end verbatim

With @strong{--max-line-length-allowed} GNU @strong{parallel} will report the maximal
size of the command line:

@verbatim
  parallel --max-line-length-allowed
@end verbatim

Output (may vary on different systems):

@verbatim
  131071
@end verbatim

@strong{--number-of-cpus} and @strong{--number-of-cores} run system specific code to
determine the number of CPUs and CPU cores on the system. On
unsupported platforms they will return 1:

@verbatim
  parallel --number-of-cpus 
  parallel --number-of-cores
@end verbatim

Output (may vary on different systems):

@verbatim
  4
  64
@end verbatim

@node Profiles
@chapter Profiles

The defaults for GNU @strong{parallel} can be changed systemwide by putting the
command line options in @strong{/etc/parallel/config}. They can be changed for
a user by putting them in @strong{~/.parallel/config}.

Profiles work the same way, but have to be referred to with @strong{--profile}:

@verbatim
  echo '--nice 17' > ~/.parallel/nicetimeout
  echo '--timeout 300%' >> ~/.parallel/nicetimeout
  parallel --profile nicetimeout echo ::: A B C
@end verbatim

Output:

@verbatim
  A
  B
  C
@end verbatim

Profiles can be combined:

@verbatim
  echo '-vv --dry-run' > ~/.parallel/dryverbose
  parallel --profile dryverbose --profile nicetimeout echo ::: A B C
@end verbatim

Output:

@verbatim
  \nice -n17 /bin/bash -c echo\ A
  \nice -n17 /bin/bash -c echo\ B
  \nice -n17 /bin/bash -c echo\ C
@end verbatim

@node Spread the word
@chapter Spread the word

I hope you have learned something from this tutorial.

If you like GNU @strong{parallel}:

@itemize
@item (Re-)walk through the tutorial if you have not done so in the past year
(http://www.gnu.org/software/parallel/parallel_tutorial.html)

@item Give a demo at your local user group/team/colleagues

@item Post the intro videos and the tutorial on Reddit, Diaspora*,
forums, blogs, Identi.ca, Google+, Twitter, Facebook, Linkedin,
mailing lists

@item Request or write a review for your favourite blog or magazine
(especially if you do something cool with GNU @strong{parallel})

@item Invite me for your next conference

@end itemize

If you use GNU @strong{parallel} for research:

@itemize
@item Please cite GNU @strong{parallel} in your publications (use @strong{--bibtex})

@end itemize

If GNU @strong{parallel} saves you money:

@itemize
@item (Have your company) donate to FSF or become a member
https://my.fsf.org/donate/

@end itemize

(C) 2013,2014,2015,2016 Ole Tange, GPLv3

@bye