A Guide to Databending Images with Audacity

I’ve recently started playing around with “databending” of images, which is essentially taking a file and editing it using a program not meant for editing that type of file. This can result in some really strange and interesting effects – especially with images.

I’ve been using Audacity, a free program for editing audio, to distort images by applying effects meant for audio to them. I’ve got some interesting results thus far:

“Normalize” effect
“Delay” effect

The first thing about databending images is that you’ll need images in uncompressed formats, like .bmp or .tiff. You may have never heard of these image types before, but they’re very useful for what we’re doing. They differ from the types of images, say, your phone takes pictures in (probably .jpg) in that they aren’t compressed; no algorithm is applied to make the image smaller. This leads to large file sizes (the base image I used for the above two was about 36 MB), but we can reduce that later on. In this tutorial, we’ll be using .bmp files.

To get the required .bmp images, just open the image you want to use in Paint (or any image editor, really), and then export it to .bmp. This will generate a large file, probably several times the size of the original (the image I used went from 3 MB to about 36 MB), which will be our master image. Make a copy and save the master, as we don’t want to accidentally mess it up while we’re playing around with it.

At this point, we’re ready to open the image in Audacity and start playing around. There is, however, one thing to watch out for: the image’s header. The header is data stored at the beginning of the image’s overall data that specifies some information about it. It is vital that the header isn’t touched when applying effects, as doing so will make the computer unable to read the image.

We’ll use a hex editor – a program that allows you to view and edit a file’s raw bytes. I use HxD. If we open the copy of the master image that I used for the two images above in the editor, we see this in the first part of the data:

This is the actual raw data that makes up a small part of the image. When people say that computers store data as ones and zeros, this is what they’re talking about: every one of those sets of two numbers/letters corresponds to a certain 8 digit combination of ones and zeros. We see it here as the numbers and letters because it’s being converted to hexadecimal, a base 16 number system with 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F as characters, in that order. Every individual hexadecimal character here represents half a byte, or 4 bits. One bit represents a state of “0” or “1”, so each of these two hexadecimal digits is eight 0s or 1s, or one byte of data.
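To make the relationship between bytes, hex digits, and bits concrete, here is a small Python sketch. Python is just a convenient way to illustrate the idea and isn't part of the Audacity workflow; the two sample bytes are the “42 4D” pair that starts every bitmap.

```python
# Show one byte three ways: decimal, two hex digits, eight binary digits.
sample = bytes([0x42, 0x4D])  # the first two bytes of a bitmap file


def describe(byte: int) -> str:
    """Format a single byte as 'decimal = hex = binary'."""
    return f"{byte:3d} = {byte:02X} = {byte:08b}"


for b in sample:
    print(describe(b))
#  66 = 42 = 01000010
#  77 = 4D = 01001101
```

Each hex digit lines up with exactly four of the binary digits, which is why two of them make one byte.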

So what exactly does this data represent, anyway? There are two parts to this data: the header, and the actual pixels in the image. The header is everything from the start of the first line to the last “00” before the “7F” on the 8th line, while everything after that makes up the start of the pixels.

To break down the header, here’s what each part means. At the start, “42 4D” is a standard set of bytes at the start of all bitmaps. It corresponds to the letters “BM” if converted to ASCII character data, which is just a way of indicating that this is a bitmap image. The next part, “7A 24 2E 02”, is the number of bytes the image takes up, stored as a 32-bit integer. Note here that there are four bytes, and thus 32 bits (4 bytes times 8 bits per byte), which is why we call this a 32-bit integer. The bytes are stored least significant first (little-endian), so they read as the hex number 022E247A. In this case, this number is 36578426, or about 36.58 MB. After that, the “C0 0F” and the “D0 0B” are the resolution of the image, width and height, respectively. Translating those bytes as little-endian 16-bit integers gives us 4032 and 3024, the resolution of the image. (In the standard bitmap layout each of these fields is actually four bytes wide, but the upper two bytes are zero here, so the result is the same.) (Side note: altering the width of the image here can produce interesting effects; however, this value should only be shifted to lower numbers, as shifting it higher will make the image unreadable.) After the resolution, there’s an “01”, which is the number of planes and will always be one, and “18 00”, which is the bit depth. This is 24 here, though it can also be other values such as 1, 4, or 8.
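The same fields can be read programmatically. The sketch below assumes the common Windows bitmap layout (BITMAPINFOHEADER or a later version), in which every field is little-endian and the width and height are four bytes wide; with the upper bytes zero, as in this image, they read the same as 16-bit values. The sample header is rebuilt from the values quoted above. The pixel offset of 122 and the info-header size of 108 are assumptions that are not quoted in the text; 122 is simply the file size minus 4032 × 3024 × 3 pixel bytes.

```python
import struct


def parse_bmp_header(data: bytes) -> dict:
    """Read the header fields discussed above from the start of a BMP file."""
    if len(data) < 30 or data[:2] != b"BM":
        raise ValueError("not a bitmap, or too short to hold a header")
    # Offset 2: file size, 4 reserved bytes (skipped), offset of the pixel data.
    file_size, pixel_offset = struct.unpack_from("<I4xI", data, 2)
    # Offset 14: info-header size, width, height, planes, bits per pixel.
    _info_size, width, height, planes, bit_depth = struct.unpack_from("<IiiHH", data, 14)
    return {
        "file_size": file_size,
        "pixel_offset": pixel_offset,
        "width": width,
        "height": height,
        "planes": planes,
        "bit_depth": bit_depth,
    }


# First 30 bytes of a header built from the article's values.
# 122 and 108 are assumed values (see the note above), not quoted ones.
sample_header = struct.pack("<2sI4xI", b"BM", 36578426, 122) + struct.pack(
    "<IiiHH", 108, 4032, 3024, 1, 24
)

print(sample_header[:6].hex(" "))  # 42 4d 7a 24 2e 02
print(parse_bmp_header(sample_header))
```

To try it on a real file, read the first 30 bytes or so with `open(path, "rb").read(30)` and pass them to `parse_bmp_header`.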

The last bit of note in the header is the “42 47 52 73”, which corresponds to the letters “BGRs” in ASCII text. Read in reverse (the header stores multi-byte values little-endian), it spells “sRGB”, the color space the image uses. The reversed letters are also a great segue into our next part. Starting at the three bytes “7F 72 74”, right after the end of the “00” bytes (which are just a sort of padding), the actual pixel data begins. We interpret it thus: each of those bytes is an 8-bit integer corresponding to a color value in the RGB color model, which is to say, it’s a value between 0 and 255 representing the intensity of one of the three primary colors. Bitmaps store these three values in reverse order as well: the first byte is the blue intensity, the second is the green, and the third is the red, or “BGR”. Thus, to convert to RGB, we simply switch the first and last byte, which gives us “74 72 7F”. Plugging this into a hex to RGB converter gives us a light tan-gray color, or the exact color of the first pixel in the bottom left of the image. From here, each consecutive set of three bytes is another pixel, displayed in the actual image in order from left to right and bottom to top. Thus, while the first three bytes are the bottom leftmost pixel, the last three are the top rightmost.
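The blue-green-red ordering and the bottom-up layout can be sketched in a few lines of Python. The function names are made up for this illustration, and the position helper assumes a 24-bit, bottom-up bitmap whose rows need no padding (true for a 4032-pixel-wide image, since 4032 × 3 is a multiple of 4).

```python
def bgr_to_hex(pixel: bytes) -> str:
    """Turn three stored bytes (blue, green, red) into an RGB hex color."""
    blue, green, red = pixel
    return f"#{red:02X}{green:02X}{blue:02X}"


def pixel_position(index: int, width: int, height: int) -> tuple[int, int]:
    """Map the index-th stored pixel to (x, y), with y = 0 as the top row."""
    row_from_bottom, x = divmod(index, width)
    return x, height - 1 - row_from_bottom


print(bgr_to_hex(bytes.fromhex("7F7274")))            # #74727F
print(pixel_position(0, 4032, 3024))                  # (0, 3023): bottom left
print(pixel_position(4032 * 3024 - 1, 4032, 3024))    # (4031, 0): top right
```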

You can adjust the pixel color with a hex editor. For example, altering the first three bytes from “7F 72 74” to “00 00 FF” will make the pixel pure red. Adding “00” to the start of the pixels without overwriting anything will shift the color palette, because every byte after it moves over by one and lands in a different color channel. Adding another “00” will shift it again. A third “00” will bring the image back to its original colors, since every byte is then back in its own channel, just one pixel further along.
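Here is a toy Python version of those hex-editor experiments, using three made-up pixels instead of a real image. It only illustrates the byte arithmetic; the helper names are hypothetical.

```python
# Three stored pixels in BGR order: pure blue, pure green, pure red.
pixels = bytes.fromhex("FF0000" "00FF00" "0000FF")


def as_rgb(data: bytes) -> list[tuple[int, int, int]]:
    """Group bytes into complete BGR triples and return them as (R, G, B)."""
    usable = len(data) - len(data) % 3
    return [(data[i + 2], data[i + 1], data[i]) for i in range(0, usable, 3)]


def insert_zeros(data: bytes, count: int) -> bytes:
    """Insert `count` zero bytes before the pixel data without overwriting anything."""
    return b"\x00" * count + data


# Overwriting the first three bytes with 00 00 FF makes the first pixel pure red.
patched = b"\x00\x00\xFF" + pixels[3:]
print(as_rgb(patched)[0])                # (255, 0, 0)

print(as_rgb(pixels))                    # blue, green, red
print(as_rgb(insert_zeros(pixels, 1)))   # every byte now sits in a different channel
print(as_rgb(insert_zeros(pixels, 3)))   # original colors again, shifted by one pixel
```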

If you want more technical information about bitmaps, go here. To start the process of databending the image, fire up Audacity, and go to File > Import > Raw Data. This will allow us to import arbitrary data into Audacity, which it will then read as audio. This window will appear:

Under encoding, select “A-Law”. A-Law is an encoding scheme that will read our file in as raw data without mangling the header or altering the bytes; picking, for example, “Signed 8-bit PCM” will mangle the header of the bitmap and cause the data to be unreadable by image software on export. “U-Law” can also be used, but it will slightly alter the pixel data and make the image a little distorted even without applying any effects. It works, but not as cleanly as A-Law does. Make sure the byte order is set to “Little-endian”, which is a way the data is read and not terribly relevant here, and that it’s set to have one channel. The next two settings should be left as is: setting the offset to anything other than 0 will cut parts of the header off (offset 0 is the first byte in the image, offset 1 is the second, etc.), and importing anything less than the entire file will make the image unreadable, because the file size recorded in the header will no longer match the actual amount of data. I’ve never touched the sample rate, so it may have interesting effects somehow.
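One plausible reason A-Law leaves untouched bytes alone is that the standard A-law table (ITU-T G.711) maps all 256 byte values to 256 different sample values, so decoding and re-encoding can give every byte back exactly. The sketch below is a rough Python rendering of that standard mapping, written to check the round trip. It is not Audacity's actual code, and it assumes Audacity's importer and exporter follow the standard table and do nothing else to untouched samples.

```python
# Upper bound of each A-law segment, after the 16-bit sample is shifted right by 3.
SEGMENT_ENDS = (0x1F, 0x3F, 0x7F, 0xFF, 0x1FF, 0x3FF, 0x7FF, 0xFFF)


def alaw_decode(code: int) -> int:
    """Convert one A-law byte to a signed 16-bit sample."""
    code ^= 0x55                      # A-law inverts every other bit
    value = (code & 0x0F) << 4        # 4-bit mantissa
    segment = (code & 0x70) >> 4      # 3-bit segment number
    if segment == 0:
        value += 8
    else:
        value = (value + 0x108) << (segment - 1)
    return value if code & 0x80 else -value


def alaw_encode(sample: int) -> int:
    """Convert a signed 16-bit sample to one A-law byte."""
    sample >>= 3
    if sample >= 0:
        mask = 0xD5
    else:
        mask = 0x55
        sample = -sample - 1
    for segment, end in enumerate(SEGMENT_ENDS):
        if sample <= end:
            break
    else:
        return 0x7F ^ mask            # louder than the top segment: clip
    shift = 1 if segment < 2 else segment
    return ((segment << 4) | ((sample >> shift) & 0x0F)) ^ mask


# Every possible byte survives a decode/encode round trip unchanged.
print(all(alaw_encode(alaw_decode(b)) == b for b in range(256)))  # True
```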

From here, we’ll get something like this after the import.

This is the data from the image, read as audio. Go ahead and play it if you want, but be warned that it’s mostly just loud noise. However, sometimes the results are interesting: notably, opening mspaint.exe in the manner above will produce almost melodious audio.

There’s a last thing to know about headers before we start editing: how to find them in Audacity. Because the header is at the start of the image, it makes sense that the header is at the start of the audio track. If we look at the very front of this track, we see this:

This is about the first 15 milliseconds of the track. The first part, with the line that goes under the center, is the data we examined earlier in the header. Everything from where the track starts to that little dip at about a fifth of the way in where the squiggly line moves to the center is the header. If we select this section, export it, and then open it in HxD, we can see this:

This is the exact same data we saw in the header earlier, down to the number of “00” bytes before the pixel data! Thus, everything after this portion is fair game for distortion, as long as this portion isn’t touched. If we were to export the data right after this spot, we’d see the exact same bytes as the pixels, and we do:
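If you would rather calculate where the header ends than eyeball it, the numbers from earlier are enough. The sketch below assumes a 24-bit bitmap with nothing stored after the pixel data, one byte per audio sample (which is what a one-channel 8-bit import gives), and a sample rate of 44100 Hz. The sample rate is an assumption, so substitute whatever the import dialog showed.

```python
def header_length(file_size: int, width: int, height: int, bytes_per_pixel: int = 3) -> int:
    """Estimate the header size as the file size minus the pixel data."""
    # Bitmap rows are padded to a multiple of 4 bytes.
    row_bytes = (width * bytes_per_pixel + 3) // 4 * 4
    return file_size - row_bytes * height


def header_duration_ms(header_bytes: int, sample_rate: int = 44100) -> float:
    """Length of the header on the audio timeline, at one byte per sample."""
    return header_bytes / sample_rate * 1000


size = header_length(36578426, 4032, 3024)
print(size)                                  # 122 bytes
print(round(header_duration_ms(size), 2))    # 2.77 ms at the assumed 44100 Hz
```

At that rate the header covers roughly the first 2.8 ms of the track, which is in line with it taking up about a fifth of a 15 ms view.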

To start editing, just select everything past the header, and start applying effects from the “Effect” menu. I recommend trying out “Normalize”, which will change the color palette to one more based on primary colors; “Delay”, which will create strangely colored ghostly imprints across the image; and “Bass and Treble” set to a bass boost over 20 dB, which will create odd-looking monochrome images. Some effects don’t work (for example, effects that alter the length of the track) and produce an unreadable image on export, but many effects work well, and all the ones that work produce interesting results.
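The rule of leaving the header alone, changing only what follows, and never changing the length can also be written as code. The sketch below is a stand-in for the idea, not a reimplementation of anything Audacity does: `bend` and `echo` are hypothetical names, the echo works on raw byte values rather than decoded audio samples, and the pixel offset is read from byte 10 of the header as in the standard bitmap layout.

```python
import struct


def bend(data: bytes, effect) -> bytes:
    """Apply `effect` to the pixel bytes only, leaving the header untouched."""
    (pixel_offset,) = struct.unpack_from("<I", data, 10)
    header, pixels = data[:pixel_offset], data[pixel_offset:]
    bent = bytes(effect(pixels))
    if len(bent) != len(pixels):
        raise ValueError("effect changed the data length; the image would not open")
    return header + bent


def echo(pixels: bytes, delay: int = 3000, decay: float = 0.5) -> bytes:
    """A crude echo: add a quieter copy of the byte `delay` positions earlier."""
    out = bytearray(pixels)
    for i in range(delay, len(pixels)):
        # Clip to 255 so the result still fits in one byte.
        out[i] = min(255, pixels[i] + int(pixels[i - delay] * decay))
    return bytes(out)


# Usage with hypothetical file names:
# with open("copy.bmp", "rb") as f:
#     data = f.read()
# with open("bent.bmp", "wb") as f:
#     f.write(bend(data, echo))
```

Because the echo repeats bytes from earlier in the file, it smears earlier parts of the image over later ones, loosely like the ghostly imprints from “Delay”.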

After you’ve applied effects, it’s time to export. Go to File > Export > Export Audio, then name the file whatever.raw, set the header to “RAW (header-less)” and the encoding to “A-Law”, then export. This will export the image’s data in the same way we imported it, and keep the program from adding a header of its own that would mangle the one we already have.

Once this is done, rename the file to whatever.bmp and open it to see the effects. If it won’t open, make sure you haven’t altered the header or the length of the track, though as I said, some effects just don’t work regardless of this. If it worked, congrats! You’ve successfully databent an image. Repeat as you want – there are all kinds of ways to combine effects to create ever stranger images. Happy databending!
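If an export refuses to open, the two usual suspects can be checked by comparing it against the master copy. The function below is a hypothetical helper that assumes a standard bitmap header with the pixel offset stored at byte 10; an empty result only rules out these two problems, not every possible one.

```python
import struct


def check_export(master: bytes, bent: bytes) -> list[str]:
    """Report the two common reasons a databent bitmap will not open."""
    problems = []
    if len(bent) != len(master):
        problems.append(f"length changed: {len(master)} -> {len(bent)} bytes")
    (pixel_offset,) = struct.unpack_from("<I", master, 10)
    if bent[:pixel_offset] != master[:pixel_offset]:
        problems.append("header bytes were altered")
    return problems


# Usage with hypothetical file names:
# with open("master.bmp", "rb") as a, open("whatever.bmp", "rb") as b:
#     print(check_export(a.read(), b.read()) or "header and length look fine")
```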