My compression tests
I've been testing out some compression programs. If anyone is interested in the results, the page is here (http://members.optushome.com.au/dogg01/compression/index.htm)
I'm constantly updating it.
JoNty
07-04-2004, 08:03 AM
I used to test a lot of different file compressors and archivers a while ago. I'll have to get back into it again. Will give your results a read a bit later.
-
JoNty
Sweet, good to know this stuff for next time I use it. Thanks for that.
thingy
07-04-2004, 09:43 AM
Back in the good ol' modem days before the 'net became big, compression was a big deal, as it took 15 to 20 minutes to download one floppy disk's worth of data.
These days, whatever works works. It's more about getting something that everyone else will be compatible with seeing as so many people have computers and it's harder to convince everyone to use the same program as you (or at least have it installed).
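The 15 to 20 minute figure can be sanity-checked with a bit of arithmetic. This is only a rough sketch: the modem speeds and the 10 bits per byte on the wire (8 data bits plus start and stop bits) are assumptions for illustration, not something stated in the post.

```python
# Rough download-time estimate for one floppy disk over a dial-up modem.
# Assumptions (not from the post): modem speeds below, 10 bits sent per byte.
FLOPPY_BYTES = 1_474_560      # a "1.44 MB" 3.5-inch floppy: 1440 * 1024 bytes
BITS_PER_BYTE_ON_WIRE = 10    # 8 data bits plus start and stop bits


def download_minutes(size_bytes, modem_bps):
    """Minutes needed to move size_bytes at modem_bps with no compression."""
    bytes_per_second = modem_bps / BITS_PER_BYTE_ON_WIRE
    return size_bytes / bytes_per_second / 60


for bps in (14_400, 28_800):
    print(f"{bps} bit/s: {download_minutes(FLOPPY_BYTES, bps):.1f} minutes")
```

Under those assumptions a 14.4 kbit/s modem needs about 17 minutes per floppy, which lands inside the quoted range, and halving the file size with compression halves the wait.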
Thanx for the positive feedback.
Yeah thingy, the hardest part is getting people to use the best program, which I find 7-Zip to be so far. It consistently gets good results and doesn't take forever to compress.
I find bzip2 works best on Linux (quick, good compression), or should I be using something else?
Sbutter
07-04-2004, 06:00 PM
I find myself using WinAce a lot, but I just may have to try out 7-Zip.
Ace, Rar, or 7-Zip, with 7z being the best. I'm not sure if any are available on Linux though.
Drakin
08-04-2004, 12:04 PM
Try DURILCA 0.4
As seen here http://www.maximumcompression.com/data/summary_sf.php
Although it depends what you're after. These people are into compression ratio; some people are into speed.
svvampy
08-04-2004, 12:26 PM
The time involved would be interesting.
You may also want to check out a different approach, 'lossy compression' with Lzip (http://lzip.sourceforge.net/).
I'll stick to WinRAR. Still got that rusty old version running from like a year ago, just haven't been arsed to change to the latest. There's a saying... if it ain't broken, don't fix it.
tikdoph
08-04-2004, 01:33 PM
Just out of curiosity, any particular reason why you didn't include WinZip? I would have been interested to see how it rates compared to the others, especially seeing as how it's pretty much the most popular compression program in the world.
Originally posted by WiTT
... if it ain't broken, don't fix it.
Amen, brother. ;)
That theory works best here at work, I tell you. The number of times people have wanted the new Windows installed, or the new Office, or some other upgrade that didn't do jack... I just want to hurt those people with a blunt axe. Oh yeah, and hand grenades. Everyone loves hand grenades, especially those old-school ones that look like an Olympic torch.
Originally posted by Drakin
Try DURILCA 0.4
As seen here http://www.maximumcompression.com/data/summary_sf.php
Although it depends what you're after. These people are into compression ratio; some people are into speed.
As stated on the page, I'm only testing GUI-based programs. I just can't be bothered doing command-line stuff. Plus most people wouldn't have a clue how to use a non-GUI program. My goal is to encourage usage of the more efficient formats.
I'm definitely after compression. DURILCA looks good. I visit maximumcompression too.
Originally posted by svvampy
The time involved would be interesting.
You may also want to check out a different approach, 'lossy compression' with Lzip.
Yes, I've thought about the time. But there's really no accurate way to measure it besides sitting there with a stopwatch. I'd have to be running a clean environment, that is, Windows with minimal services running and not being touched during compression. Fragmentation of the HDD would have an impact too. Considering that some tests can take up to 8 hours, I have no intention of monitoring the time closely. Plus I use the computer while it's compressing, which greatly influences performance and thus time taken. The best I could do is time how long each program takes to get to 5% and make a 'fast' / 'slow' / 'super slow' type of rating system.
I have no interest in lossy compression. Lossy means that you lose data in the compression, so when you uncompress, not all of the original data is there. That would never work for general file compression, where every byte has to come back exactly. It's better suited to images and sound, where the original can be approximated with much less data.
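To make the lossless versus lossy distinction concrete, here is a minimal sketch in Python. It uses the standard zlib module for the lossless half; the "lossy" half is just a toy quantisation of some made-up sample values, not how any real image or audio codec works.

```python
import zlib

# Lossless: decompressing gives back exactly the bytes that went in.
original = b"Every byte of a program or document matters. " * 10
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)
print(len(original), "->", len(compressed), "bytes, identical:", restored == original)

# Toy lossy step (illustrative only): keep coarse levels of some sample values.
samples = [0, 3, 7, 12, 18, 25, 33, 42]   # hypothetical audio-like samples
quantized = [s // 8 for s in samples]     # fine detail is thrown away here
approx = [q * 8 for q in quantized]       # best reconstruction still available
print("samples:", samples)
print("approx: ", approx)
```

The first half is what an archiver must do. The second half shows why lossy schemes are fine for sound or pictures but useless for an executable: the reconstruction is close, but it is not the original.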
Originally posted by tikdoph
Just out of curiosity, any particular reason why you didn't include WinZip? I would have been interested to see how it rates compared to the others, especially seeing as how it's pretty much the most popular compression program in the world.
I used 7-Zip to compress in standard zip format. It's listed in the results under 'Zip'. 7-Zip also claims 2-10% better compression than WinZip in the zip format. The major disadvantage of the zip format is that it compresses each file individually and stores the compressed files in one big file. Other programs can compress the files as a whole (a 'solid' archive), letting them share redundant data across files and often achieve much better ratios.
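The per-file versus whole-archive difference is easy to see with a small experiment. The sketch below uses Python's zlib (the same deflate family the zip format uses) on three made-up files that share most of their content; the file names and contents are hypothetical, and real archivers add headers and other details that are ignored here.

```python
import zlib

# Hypothetical sample "files" that share most of their content.
body = b"The quick brown fox jumps over the lazy dog. " * 20
files = {
    "a.txt": b"File A\n" + body,
    "b.txt": b"File B\n" + body,
    "c.txt": b"File C\n" + body,
}


def per_file_size(files):
    """Compress each file on its own (zip-style) and add up the sizes."""
    return sum(len(zlib.compress(data, 9)) for data in files.values())


def solid_size(files):
    """Concatenate all files and compress them as one stream (solid-style)."""
    return len(zlib.compress(b"".join(files.values()), 9))


separate = per_file_size(files)
solid = solid_size(files)
print(f"per-file: {separate} bytes, solid: {solid} bytes")
```

Compressed as one stream, the second and third files cost almost nothing because the compressor can refer back to the first. The trade-off is that pulling a single file out of a solid archive means decompressing everything before it.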
druid
08-04-2004, 10:33 PM
The results are interesting, I hadn't heard of some of the programs. Selecting GUI-based programs makes sense, but running from the command line would give you more power with easier automation.
It would be nice and illustrative if you could plot the results to graphs. And I must agree that the time factor is rather essential in compression even if it's just a ballpark estimate. I think it should be easily measured with some utility. Isn't it even possible to make most programs create a log of their timestamped actions?
Another significant avenue in compression is memory usage. That's harder to measure though, and even more so in a GUI environment. If you wanted to be exact with the measurements, you might want to use a genuine single-user system. There are too many processes in Windows that disturb measurements by causing cache pollution etc. The imprecision of the system clock is a problem on all systems, which means you have to do several runs to get values with reasonable confidence intervals.
Going single-user usually means command line and different implementations of the algorithms though. Speaking of which, I put up some of the results* of the compression tests (http://www.niksula.cs.hut.fi/~jkaarlas/labra/) we did as sort of research. The algorithms used were different Lempel-Ziv (Zip) variations, which we (poorly, as we later found out) implemented ourselves in C to get a good testing platform. The test programs were compiled to run in MS-DOS, which we used without any extensions to get a genuine single-user platform. At first the system clock was reprogrammed to give us ~1 microsecond accuracy, but it caused so much instability that we eventually set it to ~100 microseconds.
To get accurate results the tests were repeated 1-10 times depending on the file size (fewer reruns for bigger files). Different size and algorithm combinations were tested for the following formats: random data, plain ASCII text, HTML, binary files (executables), JPEG, GIF, BMP, WAV. With all the different combinations a total of 33 688 tests were run, and they took about 51 hours on a 400 MHz K6-2 with 192 megs of RAM. It is noteworthy that our implementations didn't get even close to gzip or WinZip compression ratios.
Unfortunately neither the timing and statistical data nor the final report is available in English at this time.
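As a rough illustration of the "several runs" idea, here is a minimal Python sketch that times the same compression job a few times and summarises the spread. It is not the setup used in the tests described in this thread; zlib stands in for whatever compressor is being measured, and the 1.96 factor is a normal approximation that is only a crude guide with so few runs.

```python
import statistics
import time
import zlib


def time_compression(data, runs=5, level=9):
    """Return the wall-clock time in seconds of each compression run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        zlib.compress(data, level)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Mean, sample standard deviation, and rough 95% interval half-width."""
    mean = statistics.mean(timings)
    stdev = statistics.stdev(timings)
    half_width = 1.96 * stdev / len(timings) ** 0.5
    return mean, stdev, half_width


data = b"some reasonably repetitive test data\n" * 50_000
mean, stdev, half_width = summarize(time_compression(data))
print(f"mean {mean:.4f} s, stdev {stdev:.4f} s, roughly +/- {half_width:.4f} s")
```

On a busy desktop the interval will be wide, which is exactly the problem being described: the noisier the machine, the more runs it takes before the average means much.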
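Why the file type matters so much can be shown with a tiny experiment. This is only a sketch using Python's zlib, not the implementations from these tests: random bytes stand in for data with no redundancy (already-compressed formats such as JPEG and GIF behave much the same way), and a repeated line of text stands in for highly redundant input.

```python
import os
import zlib


def ratio(data, level=9):
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, level)) / len(data)


random_data = os.urandom(100_000)
text_data = b"plain ASCII text tends to repeat itself quite a lot.\n" * 2000

print(f"random data: {ratio(random_data):.3f}")
print(f"text data:   {ratio(text_data):.3f}")
```

The random input comes out no smaller than it went in (slightly larger, in fact, because of container overhead), while the repetitive text shrinks to a tiny fraction of its size.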
* sorry about the colors if you are color blind.
P.S. The Lzip thing is obviously a joke and a rather old one at that.
Wow, looks like you went to quite a bit of effort there.
I agree that compression time would be a nice factor to log. However, there are many difficulties in getting accurate measurements, as you mentioned. My machine is used quite often, not just by myself. And I'm sharing internet through it, so there are too many factors which influence the results. Yes, some programs have logs. Others don't. Some don't even display percentage or time passed, only a progress bar. Right now it's simply too hard. But I may make a rating system in the future.
I considered adding RAM usage. But I didn't deem it necessary because I stated at the start that no more than 512 megabytes will ever be used. 512 is pretty much the standard these days. The most I ever use anyhow is around 450, to allow a bit of room for Windows itself. 7-Zip has a memory usage display while selecting formats/settings. I've suggested it to the WinRK author and he said that he will implement it one day, as well as fix the bugs I encountered while using his program.
Unless you have an established system in place, you may find yourself repeating the tests to get additional statistics. That's something I'm not really keen to do.
Good idea about the graphs. I may do it some day. Currently I'm just after the statistics. Plus the page is temporary until I come up with a new design. I'm writing it in Notepad, which I've never done before, so it may take a bit.
druid
09-04-2004, 12:02 AM
On second thought the machine being used occasionally isn't a bad thing since you are concentrating on GUI progs. If you ever clock the tests you could just make a footnote stating that it's an approximation of real world usage (as it is).
I agree that the memory usage might only be interesting academically. The point about 512 megs is good as long as there isn't much (any) swapping going on.
P.S. I forgot to add that the results from the 33 688 tests were written into text files and later read into a database for easier analysis.
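For anyone wanting to do something similar, loading text results into a database takes only a few lines. The sketch below is an assumption-heavy illustration using Python's built-in sqlite3 and csv modules: the column layout, the algorithm names and every number in it are made up, and it says nothing about the database or file format actually used for the 33 688 tests.

```python
import csv
import io
import sqlite3

# Hypothetical result lines: algorithm, file type, original bytes,
# compressed bytes, seconds. All values are invented for illustration.
RESULTS_TEXT = """\
algo_a,text,100000,41000,0.82
algo_b,text,100000,45500,0.64
algo_a,wav,100000,93000,1.10
"""


def load_results(text, conn):
    """Create the results table if needed and insert one row per text line."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "algorithm TEXT, file_type TEXT, original INTEGER, "
        "compressed INTEGER, seconds REAL)"
    )
    rows = csv.reader(io.StringIO(text))
    conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load_results(RESULTS_TEXT, conn)

# Average compression ratio per algorithm, computed by the database.
query = (
    "SELECT algorithm, AVG(compressed * 1.0 / original) "
    "FROM results GROUP BY algorithm ORDER BY algorithm"
)
summary = conn.execute(query).fetchall()
for algorithm, avg_ratio in summary:
    print(f"{algorithm}: average ratio {avg_ratio:.3f}")
```

Once the rows are in a table, questions like "which algorithm wins on WAV files" become one-line queries instead of an afternoon with a text editor.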
The thing is that sometimes I'm surfing while it's compressing. Sometimes I'm playing online games. So there's quite a lot of CPU usage there.
Yeah, I totally avoid swapping. It can make compression times 20 times longer. And I doubt it's healthy for the HDD.
I should put my results in a spreadsheet but I don't have Excel installed. Guess I'll have to do it manually later on.
I'd like to have all kinds of statistics, but it's not possible and can make things too complex sometimes. I reckon compression ratio and time are the most important factors.
Well, in case anyone is interested (highly doubtful), I've updated the site with new results and programs. I'm constantly adding new stuff. It also has a new layout and a new location:
http://members.optushome.com.au/dogg01/index.htm