HOW TO? Attempt to compress a large BINARY file?

Webtest
Level 4
Posts: 375
Joined: Sun Feb 21, 2010 4:45 pm
Location: Carlisle, Pennsylvania, USA

HOW TO? Attempt to compress a large BINARY file?

Post by Webtest »

Blessings Esteemed Forum Participants & Lurkers!
I am trying to preserve an image of a Windows 10 partition from a hard drive. I did a file copy of the partition, but there were about 20 files that could not be copied.

I also did a "dd" of the entire partition using "dcfldd" (a GREAT tool to use!). Since only about a third of the partition is used, and in that there are hundreds of thousands of tiny single sector files, I'm thinking that the image might shrink down quite a bit if I can find the right tool.

"zip" has a binary option, but neither the long nor the short version of the option is available for use in Linux Mint. I found a comment that "gzip" shouldn't be used for binary files, but otherwise there is no mention of "binary" in the documentation. What tool can I use to TRY to shrink the large BINARY image file?

I already know that 'most' binary files don't usually shrink significantly, but I want to try on this unusual one.

Thank you for any and all comments, suggestions, and assistance with this question.
Blessings in abundance, all the best,
Art in Carlisle, PA USA
BOAT - a hole in the water that you pour money into
LINUX - a hole in your life that you pour TIME into

HP dx2400 Core 2 Duo 8 GB. Mint 13/15/17.x/18.x Mate <on LOCKED SD cards, and Kanguru USB drives> No Hard Drive / No SSD
xenopeek
Level 25
Posts: 29604
Joined: Wed Jul 06, 2011 3:58 am

Re: HOW TO? Attempt to compress a large BINARY file?

Post by xenopeek »

Do you still have the W10 partition on the hard drive? A disk image backup program may be a better solution; it will be able to make a compressed partition backup. Such tools generally use the filesystem information to back up only the sectors that are in use. dd, by contrast, copies the entire partition, including whatever random data is in unused sectors. There's no requirement for unused sectors to be zeroed; the filesystem just keeps track of which sectors are in use, and a disk image backup program would skip the rest. That leftover data in unused sectors may make the dd image harder to compress.
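A rough illustration of why the contents of unused sectors matter: the sketch below compresses 1 MiB of zeros and 1 MiB of random bytes with gzip and compares the results. The random file is only a stand-in for leftover data in unused sectors; the file names and the choice of gzip are assumptions made for the demo, not something from the posts above.

```shell
#!/bin/sh
# Illustration only: zero-filled data vs. random data under the same compressor.
tmp=$(mktemp -d)

# 1 MiB of zeros, like unused sectors that happen to be zeroed
dd if=/dev/zero of="$tmp/zeros.bin" bs=1024 count=1024 2>/dev/null
# 1 MiB of random bytes, a stand-in for unused sectors holding leftover data
dd if=/dev/urandom of="$tmp/random.bin" bs=1024 count=1024 2>/dev/null

# Compressed sizes in bytes (arithmetic expansion strips any padding from wc)
zero_size=$(( $(gzip -c "$tmp/zeros.bin" | wc -c) ))
rand_size=$(( $(gzip -c "$tmp/random.bin" | wc -c) ))

echo "zeros:  $zero_size bytes after gzip"
echo "random: $rand_size bytes after gzip"

rm -rf "$tmp"
```

The zero-filled file should shrink to a tiny fraction of its size, while the random one should not shrink at all.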

If that's not an option, you'll want to use a compression tool that is multithreaded, or else it will take forever. zstd, pixz and lbzip2, to name a few good multithreaded compressors, are all available on Linux Mint. pixz (.xz file format) will generally give you the smallest file, but it has the slowest compression and decompression. zstd can get close, is very tunable (highly configurable compression level), and for a similar compressed file size has the fastest decompression speed of all. lbzip2 (.bz2 file format) is the fastest compressor, but it won't give as small a file as pixz or zstd can, and it won't decompress as fast as zstd.
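For reference, a minimal sketch of how each of the three tools might be invoked on a single image file. The helper name compress_image and the output file names are made up for illustration; the levels are left at each tool's default, so adjust to taste (e.g. zstd accepts a level such as -19).

```shell
# compress_image TOOL FILE
# Hypothetical helper: compress FILE with TOOL, leaving the original in place.
compress_image() {
    tool=$1
    file=$2
    case "$tool" in
        # -T0 = use all cores, -q = quiet; zstd keeps the input by default
        zstd)   zstd -T0 -q "$file" -o "$file.zst" ;;
        # pixz used as a filter: reads stdin, writes .xz data to stdout
        pixz)   pixz < "$file" > "$file.xz" ;;
        # -k = keep the input file; writes FILE.bz2 next to it
        lbzip2) lbzip2 -k "$file" ;;
        *)      echo "unknown tool: $tool" >&2; return 2 ;;
    esac
}
```

Usage would be something like compress_image zstd win10.img, where win10.img is a placeholder for the name of your dd image.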

I assume faster compression speed with a "good enough" compressed size is what you care about, and that you don't want to have the compressor running for days, so lbzip2 seems like a candidate? You can install lbzip2 with the command apt install lbzip2. To compress a single file in the current directory at the best compression level, you can use the command lbzip2 -zk9v filename, where you replace filename with the name of the file. It will display progress and create a filename.bz2 compressed file, keeping the original. There are plenty of compression tools that can decompress bz2 files, so it's future proof. Make sure you have enough free disk space before you start.
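One way to check the free-space point before starting, as a sketch: the helper name has_room is made up, and it deliberately assumes the worst case, namely that the compressed file ends up as large as the original.

```shell
# has_room FILE DIR
# Hypothetical helper: succeed if the filesystem holding DIR has at least
# as much free space as FILE is large (worst case: no compression at all).
has_room() {
    # File size in KiB, rounded up
    need_kb=$(( ($(wc -c < "$1") + 1023) / 1024 ))
    # Available KiB on the target filesystem (POSIX output format, 4th column)
    free_kb=$(df -Pk "$2" | awk 'NR==2 {print $4}')
    [ "$free_kb" -ge "$need_kb" ]
}
```

For example: has_room filename . && lbzip2 -zk9v filename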
AndyMH
Level 21
Posts: 13736
Joined: Fri Mar 04, 2016 5:23 pm
Location: Wiltshire

Re: HOW TO? Attempt to compress a large BINARY file?

Post by AndyMH »

I would echo xenopeek's suggestion: use an image backup utility. If you want to compress the image you already have, install pigz; it's what I use inside foxclone to compress partition images. Why pigz? It's multi-threaded, so it will use all your cores = faster. I would expect to see anything up to a 50% reduction in image size. Some early testing I did on foxclone:
[Attached screenshot: "Screenshot from 2021-12-09 11-50-24.png"]
Note that the image size shown there (at zero compression level) is the used blocks, not the size of the partition.
Thinkcentre M720Q - LM21.3 cinnamon, 4 x T430 - LM21.3 cinnamon, Homebrew desktop i5-8400+GTX1080 Cinnamon 19.0
rene
Level 20
Posts: 12212
Joined: Sun Mar 27, 2016 6:58 pm

Re: HOW TO? Attempt to compress a large BINARY file?

Post by rene »

Just a FWIW addition that says little beyond the two posts above, because, yes, don't use (dcfl)dd for backing up a filesystem. But if you prefer to work at a lower level than e.g. Foxclone: I back up NTFS partitions with e.g.

sudo ntfsclone -s -o - /dev/sdz9 | pigz -3 >system-name.sdz9.ntfsclone.gz
Obviously, substitute the correct partition specifier and make sure the partition is unmounted.
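Before relying on such a backup, it may be worth checking that the compressed file is intact. A minimal sketch, with a made-up helper name verify_gz; it relies on pigz output being ordinary gzip format, so plain gzip can test it if pigz is not installed.

```shell
# verify_gz FILE
# Hypothetical helper: succeed only if FILE is a complete, readable gzip stream.
verify_gz() {
    if command -v pigz >/dev/null 2>&1; then
        pigz -t "$1"    # -t = test integrity, writes nothing
    else
        gzip -t "$1"    # fallback: same check with plain gzip
    fi
}
```

For example: verify_gz system-name.sdz9.ntfsclone.gz && echo "archive OK". Restoring should then be the reverse pipe, along the lines of the ntfsclone man page example: pigz -dc system-name.sdz9.ntfsclone.gz | sudo ntfsclone --restore-image --overwrite /dev/sdz9 - (this overwrites the target partition, so triple-check the device name).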

Re: HOW TO? Attempt to compress a large BINARY file?

Post by AndyMH »

And drifting slightly, my testing of pigz suggested that there was no point in using a compression level greater than 1; the time/size trade-off was not worth it. This was compressing partclone images, and I suspect ntfsclone images will be similar.
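To run this kind of comparison on your own data, something like the following sketch could be used. The helper name compare_levels and the chosen levels are arbitrary; it uses pigz when available and falls back to gzip, and the timing is only to whole seconds, so test on a reasonably large sample.

```shell
# compare_levels FILE
# Hypothetical helper: print "level compressed-bytes seconds" for a few levels.
compare_levels() {
    gz=gzip
    command -v pigz >/dev/null 2>&1 && gz=pigz
    for level in 1 3 6 9; do
        start=$(date +%s)
        # Compress to stdout and count bytes; nothing is written to disk
        bytes=$(( $("$gz" "-$level" -c "$1" | wc -c) ))
        end=$(date +%s)
        echo "$level $bytes $((end - start))"
    done
}
```

For example: compare_levels sample.img, where sample.img is a placeholder for a chunk of your own image.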

Re: HOW TO? Attempt to compress a large BINARY file?

Post by rene »

AndyMH wrote: Thu Dec 09, 2021 9:31 am And drifting slightly, my testing of pigz suggested that there was not any point in using a compression level greater than 1, the time/size trade-off was not worth it.
Last time you said it was 3, which was in fact the reason I used 3... viewtopic.php?p=1713878#p1713878

Re: HOW TO? Attempt to compress a large BINARY file?

Post by Webtest »

Blessings and THANK YOU xenopeek, AndyMH, and rene!
All of your responses were very much to the point, very helpful, and support my OP objective. Google searches nearly always pointed back to the useless-in-this-case "dd" manpage. The comments about multi-threading were very accurate. I am using an older 'spare' machine for this project (4-core i5 w/4 GB memory) and was copying from the W10 system SATA drive to a new Seagate 2 TB SATA drive. The caja/system file copy AND the dcfldd command are both SINGLE THREADED. They use all four cores, but only ONE AT A TIME! The file copy operation took at least 5 hours for 35 GB ... I ran it overnight, and when I went back to it, it was hung up on the next-to-the-last un-copyable file. The "time to completion" progress bar was a TOTAL JOKE - it never got above "21 minutes". The dcfldd operation only took about an hour for 159 GB.

I can't use anything "Windows" because I cannot get into the Win10 OS. It has been running a single-user/admin account with NO password ever set or used for a couple of years at a friend's house, but in the process of (unsuccessfully!) trying to set up wired network printer sharing with his OLD Vista system I changed the firewall "style" setting from "Business" to "Home". After the machine was rebooted it suddenly decided that it could NOT do ANYTHING without the ?correct password? ??? We tried a lot of them, but no password had ever been set AFAIK. I guess I'm going to have to have W10 reloaded. It was trivial to plug in and use my Locked LiveMedia LM18.3 USB system. He's happily running a W10 "junk box" core-2 machine for now.

All of your comments were GREAT and thanks for the suggested compression programs! I will look into them all and mark the post RESOLVED when I get one working.
Blessings in abundance, all the best,
Art in Carlisle, PA USA

Re: HOW TO? Attempt to compress a large BINARY file?

Post by AndyMH »

rene wrote: Thu Dec 09, 2021 9:54 am Last time you said it was 3, which was in fact the reason I used 3... viewtopic.php?p=1713878#p1713878
I got it wrong :( . Did more testing after that post.