Last month I decided to revive my long-neglected MP3 collection after relying on Spotify for many years. This meant using Apple Music for the first time since it was called iTunes, and in the course of ripping a few hundred CDs I found a great many bugs. One particularly fun one was that sometimes, the application will just stop accepting new artwork; you will drag an image into the file info dialog as usual but the window then displays a placeholder image instead of the one you just tried to save. The only way I have found to stop this happening is to completely restart the computer.
In the course of investigating the bug, I wondered whether the app might be genuinely adding the artwork to the MP3 files, but failing to display the artwork in the UI. After all, it has several other bugs where the file metadata is correct but the UI displays them incorrectly, for example splitting one album into multiple objects in the library browser, or displaying tracks out of numerical order.
There’s another reason I wanted to check if the artwork was actually being saved into the files. Apple Music has an option to fetch album art for you, but aside from the fact that it’s very bad at finding the correct artwork, it doesn’t actually burn it into the file, it caches it somewhere else on disk. So, if you move the files to another device you lose the artwork. You need to burn the art into the MP3 files if you want to reliably keep it.
Now I don’t know anything about the MP3 file format so I wasn’t sure how to check whether a given file actually contains the expected image data. I assumed that reading the entire JPG file into a string and looking for that string inside the MP3 might not work, because it might not keep a literal copy of the entire image file – it might discard some headers, for example.
My work on Building Git taught me about a very useful compression algorithm called XDelta, which Git uses to compress objects it stores in packfiles and sends over the network. It works by building an index from a source string, and then compresses a target string to produce a list of instructions for rebuilding the target from the source.
index = Pack::XDelta.create_index('the quick brown fox jumps over the lazy dog')
index.compress('the quick brown fox writes ruby')
# => [
# #<struct Pack::Delta::Copy offset=0, size=20>,
# #<struct Pack::Delta::Insert data="writes ruby">
# ]
Here we can recover our target string by copying the first 20 bytes of the
source string, and then appending the new text "writes ruby"
. When serialized,
the Copy
instructions are encoded as just a few bytes each, so this massively
shrinks the size of files that have long strings of bytes in common.
XDelta isn’t a diff algorithm per se, but it is very good at identifying parts of strings that are the same. It’s also much faster than typical diff algorithms when comparing files at the level of individual bytes, rather than line-by-line, so it works great for large binary files.
I wondered whether this could help me determine whether an MP3 file contained the intended image data. I took two copies of the file, one before I tried adding artwork to it, and one after. I also stored a copy of the artwork I was trying to import.
cover = File.read('cover.jpg')
no_art = File.read('no-art.mp3')
with_art = File.read('with-art.mp3')
Now, if we build an index using the “after” file, which we expect to contain the artwork, we can use it to compress both the cover image and the “before” file, showing us which parts of those files are the same as the final MP3.
index = Pack::XDelta.create_index(with_art)
compressed_cover = index.compress(cover)
pp compressed_cover
compressed_audio = index.compress(no_art)
pp compressed_audio
For the cover image, this prints a single Copy
instruction, showing that we
can reconstruct cover.jpg
by just copying one big range of bytes out of the
final MP3 file:
[#<struct Pack::Delta::Copy offset=551, size=361841>]
This is surprisingly simple – the MP3 just contains a verbatim copy of the entire cover image file. When I tested files after Apple Music stopped accepting new artwork, the output would look more like this:
[#<struct Pack::Delta::Insert
data=
"\xFF\xD8\xFF\xE0\x00\x10JFIF\x00\x01\x01\x01\x00H\x00H\x00\x00\xFF\xDB...">,
#<struct Pack::Delta::Insert
data=
"\x1E\x1E\x1E\x1E\x1E\x1E\x1E\x1E\x1E\x1E\x1E\x1E\x1E\x1E\x1E\x1E\x1E\x1E...">,
...
]
Seeing a lot of Insert
instructions means the target file contains a lot of
data that’s not present in the source file, so this quickly told me that the
application was genuinely not saving the artwork, as opposed to saving it but
not displaying it.
It’s also worth having a look at the compressed audio – the compression of the file without art based on the one with art. This contains a few more instructions but shows that most of the data in the “before” file is present verbatim in the “after” file. The only novel data is the ID3 header at the very beginning of the file.
[#<struct Pack::Delta::Insert data="ID3\x02\x00\x00\x00\x00\x11\"">,
#<struct Pack::Delta::Copy offset=10, size=529>,
#<struct Pack::Delta::Copy offset=362400, size=1681>,
#<struct Pack::Delta::Copy offset=372632, size=15576294>]
We can check a few more things by printing the sizes of all the files:
p \
cover: cover.bytesize,
no_art: no_art.bytesize,
with_art: with_art.bytesize
This gives us:
{:cover=>361841, :no_art=>15578514, :with_art=>15948926}
Notice that the filesize of with-art.mp3
is 15,948,926 bytes, which equals the
value you get by adding the offset
and size
values from the final Copy
instruction: 372,632 + 15,576,294 = 15,948,926. This tells us that the final
15,576,294 bytes of both files – the majority of their content – are
identical; the compression instructions read right up to the end of the source
file. Also, this song is 6:29 long, and encoded at 320kbps, so we’d expect the
audio data to be around 15,560,000 bytes in length, so it looks very likely that
this long range of identical data is the audio.
Finally let’s convert the offset
/size
values from the Copy
instructions
into start/end addresses hex:
compressed_audio.grep(Pack::Delta::Copy).each do |copy|
p \
start: copy.offset.to_s(16),
end: (copy.offset + copy.size - 1).to_s(16)
end
This prints the following:
{:start=>"a", :end=>"21a"}
{:start=>"587a0", :end=>"58e30"}
{:start=>"5af98", :end=>"f35c7d"}
If we look at with-art.mp3
using hexdump
we discover that the first slice
from a
to 21a
contains the song’s metadata. The other slices begin at
587a0
and 5af98
, and hexdump
displays the following around those
addresses:
00058790 a6 ab e2 33 b7 93 ff d9 00 00 00 00 00 00 00 00 |...3............|
000587a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
0005af90 00 00 00 00 00 00 00 00 ff fb e2 00 00 00 07 44 |...............D|
0005afa0 90 3a 81 28 63 70 d7 31 e7 60 18 cc 6e 5c 3a 40 |.:.(cp.1.`..n\:@|
So the first slice is just copying a long range of null bytes out of the source
file, not particularly interesting. The second slice is the audio data, and
hexdump
shows that this address is where non-null data starts showing up again
in the source file.
This is just a quick example of how you can use XDelta to compare files when you don’t know much about their internal structure. By revealing similarities and showing you which portions of a file are the same between different versions, you can learn quite a bit about how the files are structured and spot inconsistencies easily. You can also implicitly check which parts of the source file are not copied into the target, and I have previously used this to diagnose a bug where a database was truncating stored files. It’s a very useful tool to have on hand when dealing with binary data.
–
Building Git recently passed 2,000 copies sold and is on sale for 50% off
until the end of January. Use voucher code TWENTYFOUR
at checkout to redeem
the discount.