Saturday, September 15, 2012

The Science of Photography -- Part Twelve

In the last installment of “The Science” I described how the digital photo sensor in your camera converts light into an electrical signal. This article will continue the process, discussing how the electrical signal is converted to a digital signal -- a string of zeros and ones -- and how these “binary” digital signals are used by the various computing processes within the camera.

I’ll begin with a description of the binary number system and a definition of “bits,” those elemental building blocks of all things digital. We’ll learn what “bit depth” means and how it applies to computer circuitry and digital cameras, and we’ll compare the raw digital image to the final processed image in the “Joint Photographic Experts Group” or JPEG format. (There are a few other photo formats found on computers, including TIFF and PNG, but almost all digital cameras produce JPEG files.)

If you looked closely at the example “RAW” output from the photo sensor on the “Colour Cambridge” web page, you may have noticed that it is very green and not appealing. JPEG performs two important functions in a digital camera. First, it reduces the size of the file, a process called “compression,” and second, it converts the digital information into a pleasing and viewable representation of the image captured by the camera. RAW files have a function in the modern digital photographic world, but that function is to be converted to a more pleasing picture.

Let’s start our digital journey by learning the lingua franca of the digital computer: numbers! Numbers expressed in binary notation!!

Binary Numbers

Computers only work with numbers. Wait, you ask, then how does a word processor work? Answer: There is a code that matches numbers to letters. In the ASCII code, the lower case letter “a” is decimal “97” or binary “0110 0001.” The reason computers use binary numbers is that binary is the simplest system. If a transistor is turned on, for example, we’ll call that a “1.” And if the transistor is shut off, we call it a “0.”
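
If you have Python handy, here is a minimal sketch of that letter-to-number lookup. The function name letter_to_binary is made up for illustration; ord() and format() are standard Python built-ins.

```python
def letter_to_binary(letter: str) -> str:
    """Return the 8-bit binary code for one character, split into two groups of four."""
    code = ord(letter)            # ord() gives the character's numeric code
    bits = format(code, "08b")    # write it in binary, padded to eight digits
    return bits[:4] + " " + bits[4:]

print(ord("a"))                 # 97
print(letter_to_binary("a"))    # 0110 0001
```

The padding to eight digits assumes a plain ASCII letter; characters with larger codes would simply produce a longer string.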

In the decimal number system, for example, "1945" has four places, and each place represents a power of ten. Counting from the right, the first place is ten raised to the zero power = 1. The second place is ten to the first power = 10. The third place is ten squared or 100, and the fourth place is ten to the third power = 1000. So 1945 is one thousand, plus nine hundreds, plus four tens, plus five ones. It is called a decimal or base-ten system because each place represents a power of ten; “decimal” comes from the Latin word for ten.

In the decimal system there are ten “digits”: 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. When counting up, when you reach nine, you “carry” one to the column on the left and then start over counting with zero.

The binary system works the same way, except there are only two digits, zero and one (that is what binary means, “two,” like a bicycle). So start counting at 0, then 1, then carry to the next column and start over with zero: "10." Here are the binary numbers for decimal 0 through 16:

Decimal    Binary
      0         0
      1         1
      2        10
      3        11
      4       100
      5       101
      6       110
      7       111
      8      1000
      9      1001
     10      1010
     11      1011
     12      1100
     13      1101
     14      1110
     15      1111
     16    1 0000
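
The counting rule behind this table (a 1 rolls over to 0 and carries to the left) can be sketched in a few lines of Python. The function name increment is my own choice for illustration, and the binary number is kept as a string of characters so the carry is easy to follow.

```python
def increment(bits: str) -> str:
    """Add one to a binary number written as a string of 0s and 1s."""
    digits = list(bits)
    i = len(digits) - 1
    # Working from the right, every 1 rolls over to 0 and carries left.
    while i >= 0 and digits[i] == "1":
        digits[i] = "0"
        i -= 1
    if i >= 0:
        digits[i] = "1"          # the first 0 we meet absorbs the carry
    else:
        digits.insert(0, "1")    # every column rolled over, so add a new column
    return "".join(digits)

number = "0"
for decimal in range(17):
    print(f"{decimal:>7}{number:>10}")
    number = increment(number)
```

Running it prints the same seventeen rows, except that 16 appears as 10000 without the space.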

Each place in a binary number represents a power of two. The first column is two to the zero power, which equals “ones” just like with decimal. The second column or place is 2’s, the third column is 4’s, and the fourth column is 8’s. Sixteen decimal is just a “1” in the sixteens column.
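
Here is a small Python sketch of the place-value idea, using the same rule for both decimal and binary. The names place_values and total are invented for this example.

```python
def place_values(digits: str, base: int) -> list:
    """Pair each digit with the value of its place, leftmost digit first."""
    count = len(digits)
    return [(int(d, base), base ** (count - 1 - i)) for i, d in enumerate(digits)]

def total(digits: str, base: int) -> int:
    """Add up digit times place value for every column."""
    return sum(digit * place for digit, place in place_values(digits, base))

print(place_values("1945", 10))   # [(1, 1000), (9, 100), (4, 10), (5, 1)]
print(total("1945", 10))          # 1945
print(place_values("1101", 2))    # [(1, 8), (1, 4), (0, 2), (1, 1)]
print(total("1101", 2))           # 13
print(total("10000", 2))          # 16
```

The only thing that changes between the two systems is the base: 10 for decimal, 2 for binary.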

Notice that binary numbers can be a lot “longer,” or contain a lot more digits, than decimal numbers, so they can be hard to read. We often break them up into groups of four separated by a space, much like the commas in large decimal numbers such as 1,000,000.

Note also that the largest decimal number you can have with four digits is 9,999, and four digits represent 10,000 numbers from 0 to 9,999. In a four-bit (or column or place) binary number, the largest is 1111, which is "15" decimal, and there are sixteen distinct numbers from 0000 to 1111. (In binary, a bunch of ones is like a bunch of nines in decimal, and it’s like the odometer in your car: it’s about to “roll over.”)
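
As a quick check of those counts, here is a hedged Python sketch. The function names count_and_largest and next_on_odometer are made up for illustration.

```python
def count_and_largest(base: int, places: int) -> tuple:
    """Return (how many distinct numbers, the largest number) for a fixed number of places."""
    count = base ** places
    return count, count - 1

def next_on_odometer(value: int, base: int, places: int) -> int:
    """Add one, wrapping back to zero when every column is at its highest digit."""
    return (value + 1) % (base ** places)

print(count_and_largest(10, 4))   # (10000, 9999)
print(count_and_largest(2, 4))    # (16, 15)

# A four-bit counter sitting at 1111 rolls over to 0000.
print(format(next_on_odometer(0b1111, 2, 4), "04b"))   # 0000
```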

So, back to digital sensors. Each value coming from the sensor is represented as a discrete number. The circuits in computers are designed for specific sizes of numbers, and don’t work well with numbers that are bigger (or “longer”). So the bit size of the number representing the amount of red or blue or green is limited. It may be eight bits or even twelve, but rarely larger. Thus, the binary number representing the intensity of the three primary colors at one pixel point is three 8-bit or 12-bit numbers combined. That would be a total of 3 x 8 = 24 or 3 x 12 = 36 bits. Then multiply this 24 or 36 bits by the number of pixels, and you get the total file size in bits.

Actually, recall that most camera manufacturers quote a total count of sensors, rather than the half or one-third of that to represent each individual color. Therefore the count is more like camera MP times eight or twelve bits. Assume 10MP and 8 bits; that would be 80Mb for a “raw” file. (That's 80 mega-bits. With computers and storage devices, we often refer to a collection of bits, a sort of digital "word." Typically eight bits is called a "byte," and disk drives and files are usually measured in mega-bytes, not mega-bits. So the file size would be 10MP times 8 bits, or one byte, for a total of 10MB or mega-bytes. The lower case "b" is bits and the upper case "B" is bytes.)
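
The arithmetic is easy to get tangled, so here is a small Python sketch of it. It assumes one value per sensor site, no compression, and "mega" meaning one million; the function name raw_size is invented for the example.

```python
BITS_PER_BYTE = 8

def raw_size(megapixels: float, bits_per_value: int) -> tuple:
    """Return (megabits, megabytes) for an uncompressed file with one value per sensor site."""
    megabits = megapixels * bits_per_value
    megabytes = megabits / BITS_PER_BYTE
    return megabits, megabytes

print(raw_size(10, 8))    # (80, 10.0)  -> 80 Mb, which is 10 MB
print(raw_size(10, 12))   # (120, 15.0) -> 120 Mb, which is 15 MB
```

Dividing by eight is the whole difference between the lower case "b" and the upper case "B."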


RAW files (even though it is not an acronym, RAW is usually written in all upper case) contain the actual sensor data with little or no processing. The Bayer transform is performed, combining adjacent and overlapping sensors, so it is typically a “virtual pixel” count, but no other processing is performed. In a sense, therefore, the RAW data is complete. Nothing has been hidden or lost.

Besides the fact that RAW files are large, they are unfinished. They still need a lot of processing to bring out a final image file that computer monitors or printers will display or print in a manner that matches the original image. Plus, these files contain a lot of redundant information. For example, a picture of the blue sky would have thousands or millions of pixels all registering the same color. Imagine representing that color as the number 12. A small patch of sky, say three rows of eight pixels, would then be:

12 12 12 12 12 12 12 12
12 12 12 12 12 12 12 12
12 12 12 12 12 12 12 12

Pretty wasteful of memory space. Suppose, instead, you wrote it as:

3 rows all “12”

Now, in order to do that, you need some symbols for the “3 rows of,” and these would have to be additional values beyond just the numbers used for the specific color of a pixel. This can be done in two ways: either add bits to the picture information (say, 8-bit color plus 2 more bits to carry this “repeat instruction”), or organize the data differently, adding areas of data that are interpreted not as pixel data but as instructions for the pixel data. The savings from removing the redundancy justify the extra data needed to describe the data. We call data about data “metadata.”
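
Replacing a run of identical values with a count and a value is usually called run-length encoding. Here is a minimal Python sketch of that idea. It is only an illustration of removing redundancy, not the scheme any particular camera or file format uses, and the function names are my own.

```python
def run_length_encode(values: list) -> list:
    """Collapse runs of repeated values into [count, value] pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][1] == value:
            runs[-1][0] += 1           # same value as before: extend the current run
        else:
            runs.append([1, value])    # new value: start a new run
    return runs

def run_length_decode(runs: list) -> list:
    """Expand [count, value] pairs back into the original list."""
    values = []
    for count, value in runs:
        values.extend([value] * count)
    return values

sky = [12] * 24 + [13, 13, 40]
encoded = run_length_encode(sky)
print(encoded)                               # [[24, 12], [2, 13], [1, 40]]
print(run_length_decode(encoded) == sky)     # True
```

Twenty-seven numbers shrink to three pairs, and decoding gives back exactly what went in, so nothing is lost with this kind of compression.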

Furthermore, the JPEG scheme contains knowledge of how the eye works and it actually eliminates some detail that the eye doesn’t notice well anyway in favor of other detail that the eye is very sensitive to. This process of reducing data by removing redundancy is called “compression” and the process of compression that loses some detail is called “lossy compression.”

JPEG is a lossy compression that reduces data size at the expense of some amount of detail. JPEG compression is actually adjustable: you can go for maximum compression and the smallest file, or for less compression and more fidelity.

This is similar to the file size reduction of audio files using the MP3 compression algorithm.

The key to all these compression techniques is that, when you open the file on your computer, the computer program has algorithms to restore the image by “un-doing” the compression, although lossy compression will lose some detail.
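
To see why lossy compression cannot be fully undone, here is a toy Python sketch. It is not JPEG's actual algorithm, just a simple stand-in: each value is rounded down to a multiple of a chosen step (a technique usually called quantization), so nearby values collapse into one. The names quantize, restore, and step are made up for illustration.

```python
def quantize(values: list, step: int) -> list:
    """Keep only which multiple of `step` each value falls in (the lossy part)."""
    return [v // step for v in values]

def restore(codes: list, step: int) -> list:
    """Undo the scaling; whatever was rounded away stays lost."""
    return [c * step for c in codes]

pixels = [200, 201, 203, 198, 57]

print(restore(quantize(pixels, 1), 1))   # [200, 201, 203, 198, 57]  nothing lost
print(restore(quantize(pixels, 4), 4))   # [200, 200, 200, 196, 56]  small differences gone
```

A bigger step leaves fewer distinct numbers to store but throws away more detail, which loosely mirrors the adjustable trade-off between compression and fidelity.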

Most small digital cameras convert the RAW data internally and save the pictures in JPEG (or .jpg) format. For one thing, every different model of camera produces different RAW files depending on the very specific size and organization of the sensor.

More expensive cameras will output RAW files, but then you have to have software, such as Adobe Photoshop, that can decode the RAW file and produce viewable images or convert to JPEG. Those who own a copy of Photoshop or any other computer application that can perform this function know that they are continually getting software updates to add the latest RAW formats. By the way, most of the cameras that allow RAW output also come with a program from the manufacturer, Nikon or Canon or whoever, that will convert the RAW format. Those often require updating too, because the camera manufacturers often improve the software in the camera that processes the digital data. (We call these programs inside the camera “firmware” because they are harder to change than “software” on the desktop or laptop computer.)

The number of distinct color values in a digital picture depends on the so-called “bit depth,” or number of bits used to convert the sensor light intensity into a number value. The more bits, the greater the detail of color or “color depth.” It can be confusing: since there are three colors, R, G, and B, you may hear the color depth stated either as 8-bit (per color) or as 24-bit (for the combination of the three 8-bit color values).

The technical term for the number of discrete color values in a data sample is "dynamic range," although that is more an expression of the highest and lowest possible values. For example, with music, dynamic range is the range from the softest tone to the loudest tone. But the number of discrete steps in that range is what the number of bits will indicate. Of course, the human eye has a limited ability to distinguish colors that are very close to each other, so too many bits would just be wasted since we would not notice the difference anyway.

The dynamic range of an 8-bit number is 256: there are 256 discrete values from 0000 0000 to 1111 1111. Twelve-bit binary numbers have a total of 4096 values from 0000 0000 0000 to 1111 1111 1111. The human eye has a limited dynamic range, though it is higher than 12 bits per color. Remember, too, that the total "pixel" number is three times that, or 36 bits, to record all the R, G, and B values.
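
These counts all come from raising two to the number of bits. A short Python sketch, with the function name levels chosen just for this example:

```python
def levels(bits: int) -> int:
    """Number of distinct values that fit in the given number of bits."""
    return 2 ** bits

for bits_per_color in (8, 12):
    bits_per_pixel = 3 * bits_per_color   # one value each for R, G, and B
    print(bits_per_color, levels(bits_per_color), bits_per_pixel, levels(bits_per_pixel))
# 8 256 24 16777216
# 12 4096 36 68719476736

print(format(levels(8) - 1, "b"))     # 11111111
print(format(levels(12) - 1, "b"))    # 111111111111
```

The last two lines show that the largest 8-bit and 12-bit values are simply eight and twelve ones in a row.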

Photoshop is limited to 16, 24, 32, or 48 bit images. It is very easy to convert a lower bit-depth to a higher bit-depth, but that does not add detail to the picture. Still, by converting to a higher bit-depth, you can do all your work in Photoshop and then save at a lower bit-depth. That reduces the rounding errors that build up in the Photoshop algorithms. Just as with sensor size, bigger is better, even if it is later rounded down. Images stored in 32-bit format are called HDR or High Dynamic Range. HDR photographs are typically made by taking three individual pictures at different exposure settings and combining the 12-bit information into a higher dynamic range.
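
Here is a toy Python sketch of why working at a higher bit-depth reduces rounding errors. It is not Photoshop's actual algorithm; the "edit" is simply darkening every value by half and then brightening it back, using whole-number arithmetic, and all the function names are invented for illustration.

```python
def darken_then_brighten(value: int) -> int:
    """Halve a value (dropping any fraction), then double it again."""
    return (value // 2) * 2

def edit_in_8_bit(values: list) -> list:
    """Do the edit directly on 8-bit values (0 to 255)."""
    return [darken_then_brighten(v) for v in values]

def edit_in_16_bit(values: list) -> list:
    """Widen to 16 bits, do the same edit, then round back down to 8 bits."""
    widened = [v * 257 for v in values]                  # 0..255 becomes 0..65535
    edited = [darken_then_brighten(v) for v in widened]
    return [(v + 128) // 257 for v in edited]            # round to the nearest 8-bit value

original = list(range(256))
print(len(set(edit_in_8_bit(original))))       # 128
print(edit_in_16_bit(original) == original)    # True
```

In 8 bits, every odd value is lost and only 128 of the 256 levels survive the round trip. In 16 bits the same rounding error is too small to matter once the result is saved back to 8 bits.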

Here is a list of advantages of RAW vs. JPEG that I found on the Internet. It is a good list of the relative reasons to choose one or the other. But, in almost all cases, when you’re done with your editing and processing, you want to produce a JPEG. You may want to keep the RAW version too in case you want to go back and do more editing.

A Raw file is…
  • Not an image file per se (it will require special software to view, though this software is easy to get).
  • Typically a proprietary format (with the exception of Adobe’s DNG format that isn’t widely used yet).
  • At least 8 bits per color – red, green, and blue (24-bits per X,Y location), though most DSLRs record 12-bit color (36-bits per location).
  • Uncompressed (an 8 megapixel camera will produce an 8 MB Raw file).
  • The complete (lossless) data from the camera’s sensor.
  • Higher in dynamic range (ability to display highlights and shadows).
  • Lower in contrast (flatter, washed out looking).
  • Not as sharp.
  • Not suitable for printing directly from the camera or without post processing.
  • Read only (all changes are saved in an XMP “sidecar” file or to a JPEG or other image format).
  • Sometimes admissible in a court as evidence (as opposed to a changeable image format).
  • Waiting to be processed by your computer.

In comparison a JPEG is…
  • A standard format readable by any image program on the market or available open source.
  • Exactly 8-bits per color (24-bits per location).
  • Compressed (by looking for redundancy in the data, like a ZIP file, or stripping out what humans can’t perceive, like an MP3).
  • Fairly small in file size (an 8 megapixel camera will produce JPEGs between 1 and 3 MB in size).
  • Lower in dynamic range.
  • Higher in contrast.
  • Sharper.
  • Immediately suitable for printing, sharing, or posting on the Web.
  • Not in need of correction most of the time (75% in my experience).
  • Able to be manipulated, though not without losing data each time an edit is made – even if it’s just to rotate the image (the opposite of lossless).
  • Processed by your camera.

To most people, the biggest advantage of JPEG is the smaller file size, so more pictures can fit on the memory card in your camera, on the hard drive in your computer, or on the Internet when sending email or posting a picture. Further, remember, I said RAW can’t be viewed normally anyway, since there are hundreds if not thousands of RAW file formats.

Finally, all picture files, be they RAW or JPEG, also contain additional “metadata” describing different details of the picture such as its resolution, the camera model that took the picture, the time and date, the exposure, and tons of other good stuff “about” the picture. This can even include GPS location data. But that’s enough “metadata” for one episode of “The Science ...” Hopefully now you can count by twos, describe the difference between RAW and JPEG, and explain what is meant by photograph metadata. You’ve learned a lot and it’s time for the lesson to be over.

Recess ...
