The DC3 just posted all of the solutions submitted this year (click on "Challenge Results" and then "View" next to the team's name). [Update: The links have been taken down temporarily, but will return shortly. Update: They're back!]
Challenge #402 was advanced image analysis. Most teams focused strictly on meta data. However, a few teams did use the
Error Level Analyser. (Brag: Error Level Analyser was created by Noah and is based on the Error Level Analysis description found in my
Black Hat paper.) One team, APWG, used Error Level Analyser and their own variant of the same algorithm.
JPEG Ballistics
One way to tell if an image is "original" is to see if it came from a known original source.
JPEG Fingerprinting (also called "ballistics") attempts to match meta data and file structure with a known device. For example, the JPEG meta data may include a camera make and model. Using programs like Exiftool and JPEGsnoop, this meta information can be quickly extracted.
However, there is a problem with analysis based strictly on meta data: meta data can lie. Here's a fun test: Start with a picture from a digital camera. Exiftool should list lots of juicy meta data that identifies the camera. Load the image in Photoshop, draw on it, then resave it. The new picture has your drawing on it, but its meta data still identifies the camera.
While Exiftool only extracts meta data, other tools, including JPEGsnoop, evaluate a variety of image attributes.
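To make the meta data step concrete, here is a minimal Python sketch (not how Exiftool or JPEGsnoop work internally) that pulls three text fields out of a JPEG's Exif block: Make, Model, and Software. The Software field is often where a resave by an editor shows up. The function names are made up for this example, and the parser assumes a well-formed file and only looks at the first image directory (IFD0).

```python
import struct

# Standard Exif/TIFF tag numbers for three ASCII fields in IFD0.
TAGS = {0x010F: "Make", 0x0110: "Model", 0x0131: "Software"}

def _parse_ifd0(tiff):
    """Read ASCII tags of interest from the first IFD of a TIFF block."""
    if len(tiff) < 8:
        return {}
    if tiff[:2] == b"II":
        endian = "<"   # little-endian ("Intel") byte order
    elif tiff[:2] == b"MM":
        endian = ">"   # big-endian ("Motorola") byte order
    else:
        return {}
    magic, ifd_offset = struct.unpack(endian + "HI", tiff[2:8])
    if magic != 42 or ifd_offset + 2 > len(tiff):
        return {}
    (count,) = struct.unpack(endian + "H", tiff[ifd_offset:ifd_offset + 2])
    found = {}
    for i in range(count):
        start = ifd_offset + 2 + 12 * i
        entry = tiff[start:start + 12]
        if len(entry) < 12:
            break
        tag, field_type, n = struct.unpack(endian + "HHI", entry[:8])
        if tag in TAGS and field_type == 2:      # type 2 = ASCII string
            if n <= 4:                           # short values are stored inline
                raw = entry[8:8 + n]
            else:                                # longer values live at an offset
                (offset,) = struct.unpack(endian + "I", entry[8:12])
                raw = tiff[offset:offset + n]
            found[TAGS[tag]] = raw.split(b"\x00")[0].decode("ascii", "replace")
    return found

def exif_ascii_tags(data):
    """Return Make/Model/Software from the first Exif APP1 segment, if any."""
    if data[:2] != b"\xff\xd8":                  # must start with SOI
        return {}
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        if marker == 0xDA:                       # start of scan: headers are over
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload[:6] == b"Exif\x00\x00":
            return _parse_ifd0(payload[6:])
        pos += 2 + length
    return {}
```

Calling `exif_ascii_tags(open("picture.jpg", "rb").read())` on a camera original would normally return the camera's make and model. If Software names an image editor, the file is not straight from the camera, whatever the other fields say.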
Size Matters
JPEG Fingerprinting goes much further than just the meta data. For example, the Fujifilm Finepix F10 6.3MP Digital Camera can only take pictures at 3024x2016, 2848x2136, 2048x1536, 1600x1200, or 640x480. If you have a picture at any other resolution, then you know it did not come directly from this camera -- even if the meta data says that this camera was involved in the picture creation.
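The size check is easy to sketch. The snippet below is a minimal Python illustration rather than a real tool: it reads the width and height from the first start-of-frame segment and compares them against the F10 resolutions listed above. The names `jpeg_dimensions`, `F10_SIZES`, and `size_is_plausible` are made up for this example, and the marker walk assumes a well-formed file.

```python
import struct

# Resolutions listed above for the Fujifilm FinePix F10, as (width, height).
F10_SIZES = {(3024, 2016), (2848, 2136), (2048, 1536), (1600, 1200), (640, 480)}

# Start-of-frame markers: 0xC0-0xCF except DHT (0xC4), JPG (0xC8), DAC (0xCC).
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}

def jpeg_dimensions(data):
    """Return (width, height) from the first SOF segment, or None."""
    if data[:2] != b"\xff\xd8":                  # must start with SOI
        return None
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        if marker == 0xDA:                       # start of scan: no SOF found
            return None
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            if pos + 9 > len(data):
                return None
            # Layout: marker(2) length(2) precision(1) height(2) width(2)
            height, width = struct.unpack(">HH", data[pos + 5:pos + 9])
            return (width, height)
        pos += 2 + length
    return None

def size_is_plausible(data, known_sizes=F10_SIZES):
    """True if the image size is one the camera is known to produce."""
    return jpeg_dimensions(data) in known_sizes
```

A `False` result says the file did not come directly from this camera. A `True` result only says the size does not rule it out.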
Q Table Matching
More wicked is quantization table analysis. Quantization tables (Q tables) are used by JPEGs to reduce the signal level and increase compression. Ideally the Q tables should be optimized for each image. However, this is computationally expensive -- imagine clicking on "Save As Jpeg" and then waiting 2 minutes for it to save. It is even worse with digital cameras; most cameras lack the resources for high-intensity computations.
Rather than computing the Q tables as needed, most applications and cameras use hard-coded Q tables. If your camera has three quality levels (high, medium, low) then you are actually selecting one of three hard-coded Q tables stored in the camera. Photoshop actually has 13 different hard-coded Q tables (I'm ignoring Save For Web, which has even more options). And different versions of Photoshop have different Q tables.
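Because the Q tables are stored in the file itself, they can be read back out. Here is a minimal Python sketch that collects the tables from a JPEG's DQT segments so that two files can be compared. The function names are invented for this example, the values are returned in the order they are stored in the file (zigzag order), and the parser assumes a well-formed file.

```python
import struct

def quantization_tables(data):
    """Map table id -> tuple of 64 values, in stored (zigzag) order."""
    tables = {}
    if data[:2] != b"\xff\xd8":                  # must start with SOI
        return tables
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        if marker == 0xDA:                       # start of scan: headers are over
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        body = data[pos + 4:pos + 2 + length]
        if marker == 0xDB:                       # DQT: one or more tables
            i = 0
            while i < len(body):
                precision = body[i] >> 4         # 0 = 8-bit values, 1 = 16-bit
                table_id = body[i] & 0x0F
                i += 1
                size = 128 if precision else 64
                chunk = body[i:i + size]
                if len(chunk) < size:
                    break
                fmt = ">64H" if precision else "64B"
                tables[table_id] = struct.unpack(fmt, chunk)
                i += size
        pos += 2 + length
    return tables

def same_q_tables(data_a, data_b):
    """True if both files carry identical, non-empty sets of Q tables."""
    tables_a = quantization_tables(data_a)
    return bool(tables_a) and tables_a == quantization_tables(data_b)
```

Comparing a questioned picture against a known original from the same camera and quality setting is then a single `same_q_tables(...)` call.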
Most applications and devices use custom Q tables. For cameras, the tables may be optimized for the CCD, manufacturer's color space, or image size. Most cameras have different Q tables -- they can even be different between different cameras in the same product line.
Unfortunately, Q tables are not always unique. Many open source tools use the same Q tables (why derive when you can reuse?), and practically everyone does "99% quality" the same way (the Q tables are all "1 1 1 1 1 ...."). But even between different cameras, the same Q tables may appear. For example, the Nikon E2500 has Q tables that match 92% quality. The exact same Q tables are used by the Seiko Epson Corp. PhotoPC 3000Z.
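One reason so many tools share tables is that they reuse the Independent JPEG Group (IJG) library, which derives its tables by scaling a fixed base table according to the quality setting. The Python sketch below reproduces that scaling for the luminance table as I understand it from libjpeg, clamping values to 255 as for baseline JPEG. The function names are made up, and the tables here are in row-major order, so a table read from a file would need to be un-zigzagged before comparing.

```python
# Example luminance table from Annex K of the JPEG standard (row-major order),
# which IJG libjpeg uses as its base table.
BASE_LUMA = (
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
)

def ijg_luma_table(quality):
    """Scale the base table the way libjpeg does for a 1-100 quality value."""
    quality = max(1, min(100, quality))
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    # Round, then clamp each entry to the range 1..255.
    return tuple(min(255, max(1, (v * scale + 50) // 100)) for v in BASE_LUMA)

def matching_ijg_quality(table):
    """Return the IJG quality whose luminance table equals `table`, else None."""
    table = tuple(table)
    for quality in range(1, 101):
        if ijg_luma_table(quality) == table:
            return quality
    return None
```

Under this particular formula, the all-ones table is what a setting of 100 produces. Quality percentages are not standardized, so other tools or estimators may label the same table differently. A return value of `None` means the table is not a plain IJG table, which is itself a useful clue.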
All Together
Ideally, you want to match the meta data, image size, and Q tables (and anything else you can find) to known samples from a real camera. If something does not match, then you know it did not come directly from the camera. And if you get a perfect match, then it is a strong indicator (but not necessarily proof) that it came from the camera.
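Putting the three checks together might look like the Python sketch below. Everything in it is hypothetical: `CameraProfile` stands for whatever reference data you have collected from known originals, and `ImageFacts` for the values extracted from the picture in question.

```python
from dataclasses import dataclass

@dataclass
class CameraProfile:
    """Hypothetical reference data gathered from known camera originals."""
    make: str
    model: str
    sizes: set          # set of (width, height) the camera can produce
    qtable_sets: list   # each entry: dict of table id -> tuple of 64 values

@dataclass
class ImageFacts:
    """Values extracted from the picture being examined."""
    make: str
    model: str
    size: tuple
    qtables: dict

def mismatches(image, profile):
    """List every way the image disagrees with the camera profile."""
    problems = []
    if (image.make, image.model) != (profile.make, profile.model):
        problems.append("meta data names a different camera")
    if image.size not in profile.sizes:
        problems.append("image size is not one the camera produces")
    if image.qtables not in profile.qtable_sets:
        problems.append("Q tables do not match any known table set")
    return problems
```

Any entry in the returned list means the picture did not come directly from the camera. An empty list is only a strong indicator, not proof.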
Applied Ballistics
Different digital cameras have different quality images. Some pictures may be grainy, others may have trouble with bright colors, and still others may not have a very white "white".
Let's say that you want to evaluate pictures from a digital camera before deciding on buying it. There is a web site called Digital Photography Review (DPR) that offers a gallery of pictures taken with different digital cameras. They have the images scaled to 50% (not original, but good for viewing) as well as the original images from the camera.
Ah, "original images". This is where the problem comes in. They have literally hundreds of high quality pictures that really came from the cameras. These are "original". However, of the 300+ original photos, I have found nearly 10% are not "original".
In most cases, the non-originals were saved using Photoshop. Now keep in mind, I am certainly not saying that these pictures have been edited or doctored or are forgeries. They appear to have just been saved using Photoshop. The problem is, JPEGs lose quality each time they are resaved, and these have been resaved -- they are not "original from the camera".
Here are three examples:
- Fujifilm FinePix S100 FS. The first picture shows a golden mirror hanging on a red wall. The original image is huge (3840x2880), but it shows a lot of grain and noise. The original was actually saved using "Adobe Photoshop CS3 Windows". Considering that the resave should remove noise, this is really a horrible picture. Then again, all of the other original gallery pictures from this camera do not show as much noise -- so maybe the noise was added using Photoshop or emphasized by Photoshop. (Probably not, but it is an option since the image is not "original".)
- Kodak DCS Pro SLR/c. The first picture is a wonderfully clean image of a Ferris wheel. The problem is, this original was actually saved using Adobe Photoshop CS Windows.
- Leica M8. The first picture is a stunning black-and-white cityscape. However, it has meta data saying that it was saved using Photoshop 3.0. The other black-and-white sample image also says Photoshop 3.0. However, the color samples each lack this Photoshop meta data. I suspect that Photoshop was used to create the black-and-white images from color images.
Then again, perhaps we should not trust the gallery images provided by the actual camera manufacturers. For example, Leica has a gallery of sample images taken by their cameras. Every one that I have checked so far was actually saved using tools like "Adobe Photoshop CS3 Macintosh" and "CREATOR: gd-jpeg v1.0 (using IJG JPEG v62), quality = 100.". I cannot find a single sample on Leica's web site that actually came directly from a camera.
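Strings like the gd-jpeg "CREATOR" line are typically stored in a JPEG comment (COM) segment rather than in the Exif data. As a rough Python sketch, assuming a well-formed file, the comments can be listed like this (the function name is made up for the example):

```python
import struct

def jpeg_comments(data):
    """Return the text of every COM segment found before the image data."""
    comments = []
    if data[:2] != b"\xff\xd8":                  # must start with SOI
        return comments
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        if marker == 0xDA:                       # start of scan: headers are over
            break
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == 0xFE:                       # COM segment
            text = data[pos + 4:pos + 2 + length]
            comments.append(text.decode("latin-1").strip())
        pos += 2 + length
    return comments
```

A comment naming an image library or editor is a quick sign that the file was written by software other than the camera.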
So as not to pick on Leica, the same can be said for
sample pictures at Nikon. I could find no camera-originals to download, but they have a cool Flash wrapper around post-processed and resaved sample images.
In contrast, Kodak and their Kodak Gallery have no sample pictures (but lots of things you can buy) and I could find no way to browse images at http://www.kodakgallery.com/. One would think that, with two different web locations that are both titled "Kodak Gallery", this camera manufacturer would want people to see samples from their cameras. One would be wrong.
But all is not lost: galleries at Canon and Casio really do have straight-from-the-camera samples.
I don't have a copy of CS3, so I don't know if there is an option to strip out meta info.
However, here are some examples where CS3 was used to resave the image AND all of the other meta data from the camera is still there.
http://a.img-dpreview.com/gallery/fujiS100fs_samples/originals/dscf1043_acr.jpg
http://a.img-dpreview.com/gallery/olympus_12-60_2p8-4_o20_samples/originals/aw020984.jpg
http://a.img-dpreview.com/gallery/pentaxk200d_samples/originals/imgp0833_acr.jpg
The best way to get rid of camera meta data is to open the camera image, select-all, copy, open a new image, and paste. The new image will contain the picture but not the camera's meta information.
I am certainly keen on seeing the challenge question & solutions but it appears that they are not available on the DC3 site when I click-through. Congrats on your win, if I've read that right!
I have a couple minor points to add to your posting:
- With respect to the 3 sample images you referenced as being apparently "original" but revealed as edited: all of these are likely output from Photoshop's Adobe Camera RAW, which is the only practical way of sharing "original" images that have been recorded in RAW mode (not JPEG). This would also have been used to perform the monochrome transformation.
- It's always great to find exceptions... as I pointed out on my JPEGsnoop page, here's an example of an "original" sample image on Canon's website that has been modified with Photoshop:
http://www.usa.canon.com/app/images/PowerShot_2006/PS_SD800IS/sample_image_3.JPG . The reason why this is a nice example is that it is not simply a case of ACR being used for export.
- For reference, JPEGsnoop does use a comparison of several major characteristics in its overall assessment: Q-table, EXIF metadata, image dimensions and presence of makernotes. It is worth noting that nearly all image editors will destroy makernote data, so this is a particularly handy double-check for integrity purposes.
- To Jay: It depends on what metadata you are looking for (EXIF vs makernotes). Photoshop provides two JPEG save mechanisms, Save As and Save for Web. Save for Web is designed to drop EXIF metadata. If you want to preserve EXIF metadata, then you will need to use "Save As". However, in both cases you will typically lose the camera makernote metadata info.
Thanks for the great feedback.
I have updated the blog entry to make it clear that Exiftool only uses meta data, while JPEGsnoop evaluates multiple attributes.
I'm sure you are right about ACR doing the conversion. The sad thing is, Adobe always screws up the image by doing a DCT adjustment before encoding. It would be better to see the actual (huge) RAW files, or use something like Image Converter (http://www.imageconverterplus.com/how-to-convert/raw_jpg.html) which does not modify the image contents before saving. (I will probably have a blog entry about what Adobe products do later.)
With regard to the DC3 Challenge, I didn't win, but 3rd place is nothing to sneeze at.
The DC3 took down the links temporarily. (Long story, but I think it was because of a request from me.) It should be back up soon (but at a gov pace, "soon" is arbitrary).
Thanks for the info. Calvin pointed out that EXIF is preserved when you do Save As, but it's not in my case. I had this case where I wanted to identify the camera model but the image didn't have that info. Going a bit deeper, there is a little thumbnail in the EXIF data which is supposed to sometimes contain the thumbnail version of the original image. Although the actual image gets altered, the thumbnail stays intact. When I just tested this by opening up a photo in Photoshop CS3, making some changes, and trying to retrieve the thumbnail, it basically had the thumbnail of the currently edited photo, not the old one. So it basically overwrites that as well, I guess. What do you think, Neal?
EXIF Make/Model: NONE
EXIF Makernotes: NONE
With regard to the Fuji example in the text, this is identified on the page as being from Adobe Camera RAW ("ACR"), and is presumably included as a way of showing how the camera's sensor performs before its own noise reduction routines kick in. I assume the same is the case with the SLR/C image. It looks as if most of the Leica M8 sample images were output with ACR - the text points out that the camera's JPG output is unimpressive, and presumably most people shooting the M8 would be shooting RAW and processing the images in a similar way. imaging-resource.com used to have a small selection of RAW samples, but I expect the cost of hosting 25 MB RAW files would not justify the returns.