In my previous blog entry, I discussed how JPEG is widely known as a lossy format and the two causes of the loss: coloring and quantization (Q) tables. The Q tables are what lead to continual data loss every time you resave an image. However, not everyone understands how the data loss from Q tables impacts the image.
But I Saw It On YouTube!
Chris Hanson recently pointed me to a YouTube video that claims to show what happens after a JPEG image is resaved 500 times.
The video starts with a picture of my next wife, actress Alyson Hannigan, and shows it seriously degrade over the course of 500 resaves.
There's a problem here: the visible artifacts. This isn't how JPEG works. The video, which claims to have resaved the JPEG image 500 times, is doing something other than "JPEG".
Converting to Frequency
To understand the kind of data loss from JPEG Q tables, you need to understand how Q tables work.
The image is divided into 8x8 pixel squares. The 8x8 squares are converted into scalars for 64 frequencies. The 64 frequency basis functions look like these:
(I didn't make these values up -- they come from the red channel 8x8 square at 216x152 to 223x159 in the image below -- her eye.)
So what this means: take the first basis frequency (solid white) and scale it by -49. Add to it the second basis frequency (white/black) multiplied by -145, and so on. The total sum of scaled basis functions yields the actual color.
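Here is a minimal Python sketch of that sum, assuming the standard 8x8 JPEG cosine basis and its usual scaling. Only the scalars -49 and -145 come from the text; the other 62 entries are zero placeholders because the full table is not reproduced here, and placing -145 at horizontal frequency 1 is also an assumption. A real decoder would then add 128 to each result to get an 8-bit channel value.

```python
import math

N = 8

def basis(u, v):
    """8x8 pattern for horizontal frequency u and vertical frequency v."""
    cu = 1 / math.sqrt(2) if u == 0 else 1.0
    cv = 1 / math.sqrt(2) if v == 0 else 1.0
    return [[0.25 * cu * cv
             * math.cos((2 * x + 1) * u * math.pi / 16)
             * math.cos((2 * y + 1) * v * math.pi / 16)
             for x in range(N)] for y in range(N)]

def reconstruct(scalars):
    """Sum the 64 basis patterns, each scaled by scalars[v][u]."""
    block = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            scale = scalars[v][u]
            if scale == 0:
                continue  # a zero scalar contributes nothing
            pattern = basis(u, v)
            for y in range(N):
                for x in range(N):
                    block[y][x] += scale * pattern[y][x]
    return block

# Placeholder table: only the first two scalars are taken from the text.
scalars = [[0] * N for _ in range(N)]
scalars[0][0] = -49   # first basis function (solid)
scalars[0][1] = -145  # second basis function (assumed: horizontal frequency 1)

block = reconstruct(scalars)
print([round(value, 2) for value in block[0]])  # top row, before the +128 shift
```

With this scaling, a lone first scalar of 80 produces a flat block of 10s, which is why the first basis function controls the overall level of the square.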
Q Tables
The top-left basis function (solid white) represents the lowest frequency range. In contrast, the bottom right (checkerboard) is the highest frequency range. Since the human eye is not very sensitive to high frequencies, Q tables are used to reduce the values.
(Again, not made up. It comes from the image below.)
To apply it, divide each scalar by the associated Q value. For example, -49/12 = -4.08. Since JPEGs use integer math, this becomes -4. The total table becomes:
From a compression viewpoint, this is exciting. Most 8x8 pixel squares can be reduced to a bunch of low numbers and zeros -- easy to compress. This is how JPEG compresses data.
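The division step can be sketched in a few lines of Python. This assumes the scalars are already integers and that the encoder rounds to the nearest integer, which is an assumption about the encoder rather than something stated above; for the -49/12 example the result is -4 either way. The function name is made up for illustration.

```python
def quantize(scalar, q):
    """Divide an integer frequency scalar by its Q value, rounding to nearest."""
    magnitude = (abs(scalar) + q // 2) // q
    return -magnitude if scalar < 0 else magnitude

print(quantize(-49, 12))  # -4, the example from the text
print(quantize(5, 12))    # 0: small scalars collapse to zero
```

The second call shows where the compression comes from: any scalar smaller than half its Q value is stored as zero.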
To recover the image, we multiply the stored, quantized values by the Q table to recover the set of frequency scalars. In this case, we get:
Now, this isn't exactly like the original data, but when converted from frequencies to pixels, it becomes "close enough." Of course, many different sets of frequency values decode to the same pixels, so when those pixels are converted back to frequencies on the next save, some of the values (including the zeros) may become non-zero. This means that more values will be dropped off the next time we resave and apply quantization tables. In fact, even 100% Q tables (where all values are "1") will yield a little loss because the transformation from pixels to frequencies requires fractional values and JPEG uses integers. (That's why 100% quality is really 99% quality.)
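To make the recovery step concrete: the stored -4 multiplied by its Q value of 12 gives -48, not the original -49. The sketch below runs one full save-and-reload cycle on a single 8x8 square in plain Python, assuming the standard JPEG cosine transform and a single Q value for every frequency (a simplification; real tables vary per frequency). The gradient block is an illustrative input, not data from the image, and the function names are hypothetical.

```python
import math

N = 8

def _scale(k):
    return 1 / math.sqrt(2) if k == 0 else 1.0

def _cos(pos, freq):
    return math.cos((2 * pos + 1) * freq * math.pi / 16)

def to_frequencies(pixels):
    """Forward transform: pixels (shifted by -128) to 64 frequency scalars."""
    return [[0.25 * _scale(u) * _scale(v) * sum(
                 (pixels[y][x] - 128) * _cos(x, u) * _cos(y, v)
                 for y in range(N) for x in range(N))
             for u in range(N)] for v in range(N)]

def to_pixels(scalars):
    """Inverse transform: frequency scalars back to (fractional) pixels."""
    return [[128 + 0.25 * sum(
                 _scale(u) * _scale(v) * scalars[v][u] * _cos(x, u) * _cos(y, v)
                 for v in range(N) for u in range(N))
             for x in range(N)] for y in range(N)]

def save_and_reload(pixels, q=1):
    """One resave: transform, quantize, dequantize, decode, round to 0..255."""
    scalars = to_frequencies(pixels)
    stored = [[round(s / q) for s in row] for row in scalars]   # quantize
    recovered = [[level * q for level in row] for row in stored]  # dequantize
    decoded = to_pixels(recovered)
    reloaded = [[min(255, max(0, round(p))) for p in row] for row in decoded]
    return scalars, reloaded

# Illustrative gradient block (assumption, not taken from the image).
block = [[16 * x + 2 * y for x in range(N)] for y in range(N)]
scalars, reloaded = save_and_reload(block, q=1)

fractional = sum(abs(s - round(s)) > 1e-6 for row in scalars for s in row)
max_error = max(abs(reloaded[y][x] - block[y][x])
                for y in range(N) for x in range(N))
print(f"{fractional} of 64 scalars are fractional; largest pixel change: {max_error}")
```

Even with q=1, some scalars are fractional before rounding, which is the source of the small loss at "100%" quality. Whether any pixel actually changes depends on the block.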
The net result is that multiple resaves will remove high frequency components from the 8x8 squares. What once were crisp edges are now blurs. However, the overall color will remain the same (approximately the average color for the entire 8x8 pixel square).
Finally, there is the 8x8 grid. Every 8x8 square is treated independently. A huge distortion in one square will not impact any neighboring squares. With one exception: subsampling. Depending on how the JPEG was saved, the chrominance components may use an 8x8, 8x16, 16x8, or a 16x16 grid. So let's say that the image uses a 16x16 grid. It means that no distortions in any 16x16 square will impact any adjacent 16x16 squares. They are all still independent.
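As a small illustration of the grid, this Python sketch finds the square that contains a given pixel. The function name and the width-by-height reading of the grid sizes are assumptions made for the example.

```python
def block_bounds(x, y, grid_w=8, grid_h=8):
    """Return (left, top, right, bottom) of the grid square containing (x, y)."""
    left = (x // grid_w) * grid_w
    top = (y // grid_h) * grid_h
    return left, top, left + grid_w - 1, top + grid_h - 1

print(block_bounds(220, 155))          # (216, 152, 223, 159)
print(block_bounds(220, 155, 16, 16))  # (208, 144, 223, 159)
```

The first result is the 8x8 square from 216x152 to 223x159 mentioned earlier; with a 16x16 grid the same pixel falls in a larger square, and distortions stay inside that square.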
Enough Math!
In theory, JPEGs will constantly get worse with each resave. In practice, JPEGs usually hit a local minimum (where there are no more changes) after a few dozen resaves. For example, I found this relatively high-quality picture of Alyson Hannigan:
I resaved the image repeatedly at 99% quality. (Load, save at 99%, reload, resave at 99%, repeat.) At 99% quality, the changes stop after 11 resaves. (Since Q99 takes very tiny steps, it hits a local minimum quickly.) Resaved files #11 through #500 all have the exact same sha1 checksum. At 75% quality, it stops after 54 resaves (saves #54 through #500 are identical). Here are the two images (and they really are just a little different):
Resave #500 at 75% quality
Resave #500 at 99% quality
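The load, save, reload loop with a checksum comparison can be sketched as follows. This is not the script used for the numbers above. The loop takes any resave function; `pillow_resave` is a hypothetical helper that assumes the third-party Pillow library, and its quality scale may not match the one used here. The comparison only works if the encoder is deterministic and writes no changing metadata such as timestamps.

```python
import hashlib

def first_stable_resave(data, resave, limit=500):
    """Resave up to `limit` times; return the number of the first save whose
    bytes are repeated by the next save, or None if that never happens."""
    previous = None
    for n in range(1, limit + 1):
        data = resave(data)
        digest = hashlib.sha1(data).hexdigest()
        if digest == previous:
            return n - 1  # save n-1 and save n are byte-identical
        previous = digest
    return None

def pillow_resave(data, quality=75):
    """Hypothetical helper: decode and re-encode JPEG bytes with Pillow."""
    import io
    from PIL import Image  # third-party; imported only when this is called
    with Image.open(io.BytesIO(data)) as img:
        out = io.BytesIO()
        img.save(out, format="JPEG", quality=quality)
    return out.getvalue()

# Usage (file name is a placeholder):
# with open("photo.jpg", "rb") as f:
#     print(first_stable_resave(f.read(), pillow_resave))
```

Passing the resave step in as a function keeps the loop independent of any particular encoder, so the same check can be pointed at a different tool.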
In both cases, the differences from the original are minor. Her hair and sweater are barely less crisp than the original. (And the original that I started with isn't "original".) Let's compare this with frame #497 from the YouTube video:
Since there is no possible way an 8x8 JPEG square can become significantly darker or lighter with just a resave, there has to be something else going on.
(For fun, I even asked Derek R. to repeat the resave experiment since he uploaded a little script to automate resaving. He wrote back: "I tried compressing the Hannigan pic 500 times, however, I couldn't produce the artifacts in your youtube video of the blocks gradually appearing. I tried a few compression ratios, and it would basically converge after several iterations.")
Thank You, MrGrundleFun
Originally, I was going to blog about how the YouTube video was a lie. JPEG doesn't do that! However, I ended up digging a little deeper...
The key clue came from the YouTube video author, MrGrundlefun. In his video's text description, he wrote:
I took the original JPEG photo and opened it in Photoshop. Then I saved over itself as quality level 10 (out of 12). Then I closed the file and reopened it and did it again, 500 times. Each time I saved a copy and numbered them. Then I took every third picture and made this short movie out of them. If I used all 500, the movie would have dragged on too long, and the slow changes would be even harder to notice.
And there it is: Photoshop.
I repeated the experiment manually, using Photoshop. I lost count around 12 (doing it manually and the phone rang), but this is about 20 resaves:
With fewer than two dozen resaves, you can already see parts of the walls getting brighter and darker -- much more than the JPEG algorithm can account for.
Photoshop does some undocumented, proprietary magic to make high frequency areas appear a little sharper. (I think they are trying to mitigate loss from JPEG artifacts.) I've known about this for a few years and call it "rainbowing" -- it is a separation between the red and blue color channels that shows up during an error level analysis. (It's a tell-tale sign that an Adobe application, like Photoshop, was used.) Gimp does rainbowing a little; Photoshop does it a lot.
Now we have multiple JPEG resaves plus something other than JPEG happening between each resave. That "something other than JPEG" from Photoshop is enough to keep the image degradation from terminating after a dozen or more resaves.
Yes, repeatedly saving a JPEG makes the image worse. But repeatedly saving it with Photoshop makes it much worse.
Really great article - I have a question - do you think PS also does this between conversions to different Color Spaces?
I did a crude comparison (http://www.broadhurst-family.co.uk/lefteye/MainPages/c16-colour_watcher.htm) and could not explain the results, especially RGB to CMYK.
Thank you for sharing
Chris
This one helps explain the principle that causes another type of glitch/artifact I really appreciate:
http://www.flickr.com/photos/37634994@N05/3471426989/
http://www.flickr.com/photos/37634994@N05/3472236406/in/photostream/
I asked the author of these images exactly what his process was (I knew it couldn't be simply recursive compression). He said that there was an intermediate before re-saving: shifting or scaling the image. This causes a sort of destabilization of the Q tables that results in the effect in those images.
I've tried it myself and gotten similar, but not equivalent results: http://www.flickr.com/photos/kylemcdonald/3821578318/
I still can't quite figure out his exact process, and he doesn't remember. Any insights?
Also, I'd really enjoy seeing some similar analysis of lossily compressed audio formats. I know you're more into visual information, but an analysis of mp3 would be awesome.
Each time I read your articles I find image analysis more and more interesting.