I sometimes use shell scripts to batch-process data stored across many files.
When there are many files, processing them sequentially, waiting for one file to finish before starting the next, can take a very long time.
Looking for a better way, I found "GNU parallel", a command-line tool that runs commands in parallel, and I am using it now.
Install
Install it from the distribution's package.
$ sudo apt-get install parallel
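After installing, it can be worth confirming that the command is actually on the PATH. The following is only a minimal sketch; the variable name `parallel_status` is made up for illustration, and it assumes that `parallel --version` prints a version line, as GNU parallel does.

```shell
# Check that a `parallel` command is on PATH and report its version line.
if command -v parallel >/dev/null 2>&1; then
  parallel_status="installed"
  # GNU parallel prints its name and version on the first line.
  parallel --version 2>/dev/null | head -n 1
else
  parallel_status="missing"
fi
echo "parallel is $parallel_status"
```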
Usage
There are many ways to invoke parallel; here I introduce the most general-purpose one, which passes a list of arguments through a pipe.
<argument list> | parallel '<execute command>'
Write the command to run in <execute command>, and write {} at the position where each argument should be embedded.
$ seq 5 -1 1 | parallel --no-notice 'sleep {} && echo {}'
4
5
2
3
1
- parallel displays a notice every time it is invoked; the --no-notice option used here suppresses it.
- Each argument is embedded at the {} position.
- By default, parallel runs as many jobs at once as there are CPU cores.
- The machine used here has 2 cores, so two jobs ran at a time: "sleep 5" and "sleep 4" started together, and the results were displayed in the order "4 5", then "2 3", then "1".
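The same pattern applies to the file-processing case described at the start. The following is a minimal sketch under stated assumptions: the scratch directory, the file names, and the variable `line_counts` are hypothetical, and `grep -c .` (count non-empty lines) stands in for whatever per-file processing is actually needed.

```shell
# Hypothetical setup: three small text files in a scratch directory.
workdir=$(mktemp -d)
printf 'a\n'       > "$workdir/one.txt"
printf 'a\nb\n'    > "$workdir/two.txt"
printf 'a\nb\nc\n' > "$workdir/three.txt"

# Pipe the file list to parallel; {} is replaced by each path.
# sort makes the result stable, since jobs may finish in any order.
line_counts=$(printf '%s\n' "$workdir"/*.txt | parallel 'grep -c . {}' | sort -n)
echo "$line_counts"

# Clean up the scratch directory.
rm -rf "$workdir"
```

Depending on the installed version, a notice may be printed to the terminal; --no-notice can be added as in the examples above.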
Number of concurrent parallel executions
The number of concurrent executions can be specified with the -j option.
$ seq 5 -1 1 | parallel --no-notice -j 10 'sleep {} && echo {}'
1
2
3
4
5
Because the number of concurrent executions is set to 10, all 5 jobs run at the same time, and the results are displayed in the order the jobs finish, shortest sleep first.
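To see the effect of -j more directly, the same two jobs can be run with different limits. This is a rough sketch, not a benchmark: the variable names are made up, and it assumes a sleep command that accepts fractional seconds (GNU sleep does).

```shell
# Two jobs: the first sleeps longer than the second.
# -j 1 runs them one at a time, so output follows the input order.
one_at_a_time=$(printf '0.6\n0.1\n' | parallel -j 1 'sleep {} && echo {}')

# -j 2 runs both at once, so the shorter sleep is printed first.
both_at_once=$(printf '0.6\n0.1\n' | parallel -j 2 'sleep {} && echo {}')

# Unquoted on purpose, so each result is shown on a single line.
echo "-j 1:" $one_at_a_time
echo "-j 2:" $both_at_once
```

With -j 1 the behavior is the same as the sequential processing this article set out to avoid, which makes it a handy baseline for comparison.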
Getting the execution result
The exit status of parallel is stored in the variable $? (0 if every job succeeded, non-zero if any job failed).
All succeeded
$ echo -e "1\n2\n3" | parallel --no-notice 'sleep {} && echo {}'
1
2
3
$ echo $?
0
When any one of them fails
$ echo -e "1\nA\n3" | parallel --no-notice 'sleep {} && echo {}'
sleep: invalid time interval `A'
Try 'sleep --help' for more information.
1
3
$ echo $?
1
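In a script, the exit status can be captured right after the pipeline and used to branch. The following is a minimal sketch; the variable names `status` and `result` and the input values are hypothetical, and it again assumes a sleep command that accepts fractional seconds.

```shell
# Hypothetical input: the second argument is not a valid sleep duration.
# Error messages from the failing job are discarded here.
printf '0.1\nA\n0.2\n' | parallel 'sleep {}' 2>/dev/null
status=$?

# Branch on the exit status captured right after the pipeline.
if [ "$status" -eq 0 ]; then
  result="all jobs succeeded"
else
  result="some jobs failed (exit status $status)"
fi
echo "$result"
```

Saving $? into a variable immediately matters, because any later command overwrites it.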