Parallel Compression

How to use a multi-core CPU to accelerate tar compression.

# stream the tar archive to stdout and let pigz compress it on all cores
tar cf - paths-to-archive | pigz > archive.tar.gz
# equivalently, tell tar to invoke pigz as its compression program
tar -c --use-compress-program=pigz -f tar.file dir_to_zip
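
By default pigz spawns one compression thread per core; the -p flag caps the thread count, and pigz -d decompresses the result (decompression itself is largely single-threaded). A small sketch using the same placeholder paths as above:

# cap pigz at 4 compression threads instead of using every core
tar cf - paths-to-archive | pigz -p 4 > archive.tar.gz
# decompress and extract the archive again
pigz -dc archive.tar.gz | tar xf -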

For each directory, create a tar file

~/VRust$ for i in *; do tar -c --use-compress-program=pigz -f ../VRust_Compress/$i.tar.gz $i; done

Ref: https://stackoverflow.com/questions/15936003/for-each-dir-create-a-tar-file


The script above will not work if a directory name contains spaces, because the name will be word-split, and it will also tar plain files if any exist at this level.

You can use this command to list the top-level directories without recursing:

find . -maxdepth 1 -mindepth 1 -type d

and this one to perform a tar on each one:

find . -maxdepth 1 -mindepth 1 -type d -exec tar cvf {}.tar {} \;
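
The two ideas can be combined: since find passes each directory name as a single argument, this also avoids the word-splitting problem of the for loop above, and --use-compress-program=pigz yields one compressed archive per directory. A sketch assuming GNU find and GNU tar:

find . -maxdepth 1 -mindepth 1 -type d -exec tar --use-compress-program=pigz -cf {}.tar.gz {} \;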

Progress Bar

  • Extract

Ref: https://stackoverflow.com/questions/19372373/how-to-add-progress-bar-to-a-somearchive-tar-xz-extract

First, you’ll need to install pv, which on macOS can be done with:

brew install pv

On Debian or Ubuntu, it can be done with: apt install pv.

Pipe the compressed file through pv into the tar command:

pv mysql.tar.gz | tar -xz
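
The same pattern works for other formats; for the .tar.xz case in the referenced question, the archive can be read explicitly from stdin (a sketch, with somearchive.tar.xz as a placeholder name):

pv somearchive.tar.xz | tar -xJf -
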
  • Compress

Ref: https://superuser.com/questions/168749/is-there-a-way-to-see-any-tar-progress-per-file

tar cf - /folder-with-big-files -P | pv -s $(du -sb /folder-with-big-files | awk '{print $1}') | gzip > big-files.tar.gz

For macOS (from Kenji’s answer), where du lacks the -b flag, so the size in kilobytes is multiplied by 1024:

tar cf - /folder-with-big-files -P | pv -s $(($(du -sk /folder-with-big-files | awk '{print $1}') * 1024)) | gzip > big-files.tar.gz

Explanation:

• tar: the archiving tool
• cf: create an archive file
• -: write the archive to stdout instead of a file (so it can be piped to the next command)
• /folder-with-big-files: the input folder to archive
• -P: use absolute paths (not necessary, see the comments on the answer)

pipe to

• pv: progress monitor tool
• -s: use the following size as the total data size to transfer (for the percentage calculation)
• $(...): evaluate the enclosed expression
• du -sb /folder-with-big-files: summarize disk usage in one line, in bytes; returns e.g. 8367213097 /folder-with-big-files
• | awk '{print $1}': keep only the first field of the du output (the byte count, dropping the folder name)
pipe to

• gzip: the gzip compression tool
• > big-files.tar.gz: redirect the compressed stream to the output file
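
Since gzip is single-threaded, the progress bar can be combined with the parallel compression from the first section by swapping gzip for pigz; a sketch assuming Linux du with the -b flag:

tar cf - /folder-with-big-files | pv -s $(du -sb /folder-with-big-files | awk '{print $1}') | pigz > big-files.tar.gz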