Determining a CD Drive's Read Offset in Linux

CDs are becoming a bit of an historical artefact these days, what with the proliferation of music streaming services! Even my favourite music purveyor now offers much of its catalogue in the form of downloadable digital files, rather than physical CDs.

Nevertheless, you will still come across music which is only available in physical form and, of course, you may have hundreds or even thousands of physical CDs from your past which need to be converted to digital files at some point (a process called 'CD ripping').

When you come to do that ripping, though, it's important that the computer CD (or DVD) drive you use to do the deed is reading the physical CD accurately… and that's harder to do than it perhaps ought to be!

For starters, an audio CD doesn't allow random access: you have to start somewhere and read sequentially on from that point. Which means it's rather important to know precisely what that starting point is… and most CD players can't reliably tell you that! Nearly every CD player mis-places its reading head by a number of audio samples: so your audio program says “read audio sample 1000” (for example), and the drive will actually return data from sample 1000 + or - something.

The “something” tends to vary by manufacturer and/or by model. In other words, it doesn't vary depending on what CD you're playing but by what device is doing the playing. Once you have determined that your player adds 6 samples when playing CD x, you can be fairly certain it will add 6 samples when playing CDs y, z and a, too. Most NEC CD drives will, for example, add 48 to the sample number they're supposed to be reading. This difference between where a drive is meant to be reading and where it's actually reading is called the 'Drive Offset', and it's important to know it if you genuinely want perfect duplication of CDs.
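To make the arithmetic concrete, here is a minimal sketch of the description above. It assumes the simple model in the text (the drive returns data from the requested sample plus a fixed per-drive number); the function name actual_sample is hypothetical and not part of any ripping tool.

```shell
#!/bin/bash
# Illustrative only: model a drive that always lands a fixed number of
# samples away from where it was asked to read.
# actual_sample is a hypothetical helper name.
actual_sample() {
  local requested=$1   # sample the ripping program asked for
  local offset=$2      # the drive's fixed offset, e.g. 6 or 48 (may be negative)
  echo $(( requested + offset ))
}

actual_sample 1000 6    # a drive with an offset of 6
actual_sample 1000 48   # a drive with an offset of 48
```

The two calls print 1006 and 1048: the same request lands in a different place depending only on the drive, not on the disc.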

There are software tools which can work out what the drive offset is for a particular CD (or DVD) drive; unfortunately, they tend to be Windows-based and require tricks to get them working on Linux (Exact Audio Copy, for example, can tell you your drive's offset, but only runs on Linux via Wine, which I dislike using on my desktops unless I really have to). Whipper is a native Linux program that can also determine your drive's offset, but it's difficult to install on openSUSE 15 (my distro of choice).

So, instead, I've consulted AccurateRip, which is a publicly-accessible list of known CD and DVD drive models and the known drive offsets for each. If you know the model of your drive, you can consult that list and work out what your offset is.
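Done by hand from a terminal, that lookup might look like the sketch below. It assumes you have a local, pipe-delimited copy of the list in which the offset is the second field of each line; the function name lookup_offset and the file name offsets.csv are hypothetical.

```shell
#!/bin/bash
# Sketch: find a drive model in a local, pipe-delimited copy of the offsets list.
# Assumption: the offset is the second '|'-separated field on the matching line.
lookup_offset() {
  local model=$1 file=$2
  # -F treats the model as a literal string, so '-' or '.' are not regex syntax;
  # NR==1 keeps only the first matching line if several models match.
  grep -F -- "$model" "$file" | awk -F '|' 'NR==1 { print $2 }'
}

# Hypothetical usage: lookup_offset "DH-16AES" offsets.csv
```

An empty result means the model string wasn't found, which may only mean that the list spells the model differently from how your system reports it.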

Unfortunately, that's all a bit manual for my tastes! So, I've automated it somewhat :-)

If you create a blank document (say, testcd.sh) and paste this set of commands into it:

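#!/bin/bash
# Looks up a CD/DVD drive's read offset in a copy of the AccurateRip drive list.
clear
read -r -p "Which CD/DVD drive should I check? : " DEVICE
# Keep only lsscsi rows of type cd/dvd that mention the requested device
MATCH=$(lsscsi | grep -i 'cd/dvd' | grep -iF -- "$DEVICE")
if [ -z "$DEVICE" ] || [ -z "$MATCH" ]; then
  echo "That's not known as a CD or DVD device!"
  exit 1
else
  # Quote the URL so the shell doesn't treat '?' as a glob character
  wget --quiet "https://www.dizwell.com/lib/exe/fetch.php?media=docs:cddriveoffsets.csv" -O /tmp/cddriveoffsets.csv
  # Field 5 of the lsscsi row is taken as the model number; this may differ
  # on drives whose vendor/model strings contain a different number of words
  DEVICENAME=$(echo "$MATCH" | awk 'NR==1 {print $5}')
  # The list is pipe-delimited; the offset is the second field
  DRIVE_OFFSET=$(grep -F -- "$DEVICENAME" /tmp/cddriveoffsets.csv | awk -F "|" 'NR==1 {print $2}')
  rm -f /tmp/cddriveoffsets.csv
  if [ -n "$DRIVE_OFFSET" ]; then
    echo "Drive Offset for $DEVICENAME ($DEVICE) is: " "$DRIVE_OFFSET"
  else
    echo "Your drive's model name/number is not in the AccurateRip database."
  fi
  exit 0
fi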

…you can then run the script with the command:

bash testcd.sh

You'll see output such as this:

Which CD/DVD drive should I check? : /dev/sr1
Drive Offset for DH-16AES (/dev/sr1) is:  6

When prompted, you type in the device name of the CD or DVD drive you want checked (if you type in a device identifier for a normal hard disk or anything else which isn't explicitly known to be a CD- or DVD-ROM drive, the script tells you so and aborts).

Usually, that would be “/dev/sr0”, but because my PC has two CD drives and I only use the second of them for ripping, I've specified “/dev/sr1” (without the quotes, though). Armed with that information, the script then checks the model name/number for that device using the lsscsi command and then uses plain old grep to check if that model number is found within the AccurateRip database. If it is, it returns the known offset for that model; otherwise, it tells you your drive is not known by AccurateRip.

With only a limited number of CD drives in the house to test it on, I can't be sure whether lsscsi will always return useful and usable model numbers, but it works for me with my other CD drive, for example:

Which CD/DVD drive should I check? : /dev/sr0
Drive Offset for DS-8D9SH (/dev/sr0) is:  6

…which is a different model number but happens to have the same offset as my main drive. (Note, a manual check of the AccurateRip database confirms the script isn't just outputting the number 6 no matter what input is thrown at it!!)
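If lsscsi's columns don't line up on a particular machine, the kernel also exposes each SCSI device's model string under /sys. The sketch below reads it from there. The function name drive_model and the SYSFS_BLOCK override (present only so the function can be pointed at a test directory) are hypothetical, and there is no guarantee the string matches AccurateRip's naming for every drive.

```shell
#!/bin/bash
# Sketch: read a drive's model string from sysfs instead of parsing lsscsi.
# SYSFS_BLOCK is a hypothetical override so the function can be tested
# against a fake directory; it defaults to the real /sys/block.
drive_model() {
  local dev=${1##*/}                       # "/dev/sr1" -> "sr1"
  local path="${SYSFS_BLOCK:-/sys/block}/$dev/device/model"
  [ -r "$path" ] || return 1               # no such device, or no model file
  # Trim the padding spaces the kernel leaves around the string
  sed 's/^[[:space:]]*//; s/[[:space:]]*$//' "$path"
}

# Hypothetical usage: drive_model /dev/sr1
```

Unlike the awk field pick, this returns the whole model string, spaces included, so it may need trimming down before it will match an entry in the offsets list.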

Once you know your drive's offset, you can use it when ripping a CD to compensate for your drive's inability to read precisely where it's told to read! For example, a basic command to rip track 1 from an audio CD using cdparanoia would be:

cdparanoia -B -z=40 -d /dev/sr1 "1"

Which means “please rip track 1 from device /dev/sr1, and if you encounter a scratch or other defect that makes reading the audio data tricky, take a maximum of 40 goes to do so before giving up”.

Except that command will rip (in my case) audio that starts 6 samples away from where it should, because my drive is known to mis-read starting points by 6 samples, so my rip won't be accurate.

Let's say I run it anyway, however:

~> cd Music
~/Music> cdparanoia -B -z=40 -d /dev/sr1 "1"
cdparanoia III release 10.2 (September 11, 2008)

Ripping from sector       0 (track  1 [0:00.00])
          to sector   22874 (track  1 [5:04.74])

outputting to track01.cdda.wav

 (== PROGRESS == [                              | 022874 00 ] == :^D * ==)   

Done.

Cdparanoia itself is happy: the “:^D” smiley face at the end there indicates it believes it's just performed a perfect rip. Now let's save that output file with a new name, and then re-rip the same track, but this time telling cdparanoia about our drive's known offset:

~/Music> mv track01.cdda.wav run1.wav
~/Music> cdparanoia -B -z=40 -d /dev/sr1 "1" -O 6
cdparanoia III release 10.2 (September 11, 2008)

Ripping from sector       0 (track  1 [0:00.00])
          to sector   22874 (track  1 [5:04.74])

outputting to track01.cdda.wav

 (== PROGRESS == [                              | 022874 00 ] == :^D * ==)   

Done.

I therefore now have two different audio files in my Music directory: run1.wav (ripped without the offset) and track01.cdda.wav (ripped with it).

I can assure you, though obviously you'll have to take my word for it, that both files play just fine and sound identical. But are they, in fact, digitally identical?

~/Music> ffmpeg -i run1.wav -map 0:a -f md5 - 2>/dev/null
MD5=5a6c91d7dd0b129da033fc8fc61d5ec7

~/Music> ffmpeg -i track01.cdda.wav -map 0:a -f md5 - 2>/dev/null
MD5=7e6e92ffff034609b5d691c82f7e5ce7

That ffmpeg command calculates an MD5 hash of the audio data from the named file. If the audio data were exactly the same in both files, the MD5 hashes for them would be identical. But you can see that the hashes are different. So in this case, sounds can be deceptive: the files sound the same, but aren't actually the same, and it's the file ripped with the offset parameter that is technically the more accurate replica of the original.
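If ffmpeg isn't installed, a cruder check gives the same yes/no answer for this case: compare the two files byte for byte. The sketch below assumes both files were written by the same ripping program, so that any difference lies in the audio data rather than in the file header; same_rip is a hypothetical function name.

```shell
#!/bin/bash
# Sketch: report whether two rips are byte-for-byte identical.
# cmp -s prints nothing and signals the result through its exit status.
same_rip() {
  if cmp -s -- "$1" "$2"; then
    echo "identical"
  else
    echo "different"
  fi
}

# Hypothetical usage: same_rip run1.wav track01.cdda.wav
```

Note that this compares whole files, header included, whereas the ffmpeg command above hashes only the decoded audio stream.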

blog/a_word_on_cd_drives.txt · Last modified: 2019/04/04 19:10 by dizwell