Audio Compression
When songs are recorded and stored, it’s standard practice to work with tools (equipment, programs) that are capable of preserving the original quality of the music. Before digital audio, songs were stored in analog formats: first vinyl records, then tape cassettes. CDs arrived at the beginning of the digital age and stored music as uncompressed digital audio. Digital audio files, unlike analog recordings, are often compressed using specialized software. Compression was adopted to achieve smaller file sizes and fit more songs on listening devices, which used to have very limited storage. There are multiple types of audio file compression, each with pros and cons.
The most widely used audio files across streaming services and personal devices today are lossy compressed files. Because these files are designed to fit audio data into a smaller file size, some information is discarded. Lossy compression can be adjusted to compress audio heavily or only slightly, and audio file formats strive for an ideal balance between audio quality and file size. A lossy encoder can shrink an uncompressed 50MB music file to around 7MB, for example. This data reduction is generally not considered a significant detriment to sound quality, because the encoder targets data that is deemed imperceptible. The most common lossy audio formats are AAC (Advanced Audio Coding) and MP3, which are the de facto standard for storing and distributing digital audio today.
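To put the 50MB to 7MB example in perspective, the short sketch below works out the compression ratio and the percentage saved. The function name is illustrative and not part of any audio library.

```python
def compression_stats(original_mb, compressed_mb):
    """Return (compression ratio, percent size reduction) for two file sizes."""
    ratio = original_mb / compressed_mb
    reduction_pct = (1 - compressed_mb / original_mb) * 100
    return ratio, reduction_pct


# The example figures from the text: 50MB uncompressed, about 7MB compressed.
ratio, reduction = compression_stats(50, 7)
print(f"{ratio:.1f}:1 ratio, {reduction:.0f}% smaller")  # 7.1:1 ratio, 86% smaller
```

In other words, roughly six sevenths of the original data is discarded in this example.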
Bitrate, Sample Rate and Bit Depth
Bitrate describes the amount of data used to represent each second of audio, usually expressed in kilobits per second (kbps). For uncompressed audio it is determined by the sampling rate, the bit depth and the number of channels; for compressed audio it is a setting chosen at encoding time. A higher bitrate generally means better audio quality, although this is not always the case. Uncompressed files contain more information and therefore have high bitrates, while lossy compressed files generally contain less information and therefore have lower bitrates.
The sample rate is the number of audio samples taken per second: the number of times per second that recording equipment converts sound into data. The Nyquist-Shannon sampling theorem states that a signal can be captured accurately if the sample rate is at least double the maximum frequency of the source. Human hearing tops out at about 20kHz, so sampling at a little more than double that figure is assumed to lose nothing that the human ear would realistically hear. High-resolution audio can be recorded at rates as high as 192kHz. This is often unnecessary, although there are instances where a higher sampling rate helps improve the listening experience.
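The doubling rule is easy to check numerically. The sketch below uses illustrative function names and a 20kHz hearing limit as stated above; the 32kHz rate is included only as a hypothetical example of a rate that falls short.

```python
HEARING_LIMIT_HZ = 20_000  # approximate upper limit of human hearing


def nyquist_rate_hz(max_frequency_hz):
    """Sample rate that must be exceeded to capture max_frequency_hz."""
    return 2 * max_frequency_hz


def covers_hearing_range(sample_rate_hz):
    """True if the sample rate is more than double the hearing limit."""
    return sample_rate_hz > nyquist_rate_hz(HEARING_LIMIT_HZ)


for rate in (32_000, 44_100, 192_000):
    highest = rate / 2  # highest frequency this rate can represent
    print(f"{rate} Hz captures up to {highest:.0f} Hz:",
          "covers hearing range" if covers_hearing_range(rate) else "falls short")
```

Both 44.1kHz and 192kHz clear the 40kHz threshold; the higher rate simply leaves far more headroom above the audible band.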
Analog-to-digital converters have a built-in low-pass filter. This filter removes frequencies that are beyond the sampling limit. If the sampling rate is 44.1kHz, anything below half of that (22.05kHz) can be accurately resolved. Anything above that limit would be misrepresented as spurious lower frequencies (aliasing), so the low-pass filter removes that content before sampling. Increasing the sampling rate moves the low-pass filter higher in the frequency range, further from the range of human hearing, resulting in theoretically cleaner-sounding audio. The most common sampling rate is 44.1kHz, which is also the sampling rate of audio CDs. This means that the audio is sampled 44,100 times per second during recording, and on playback the hardware reconstructs the sound from those 44,100 samples per second. The individual samples also vary in the amount of information they hold. Bit depth is the number of bits in each sample, or how information-rich each of the 44,100 samples is.
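To see why the low-pass filter matters, it helps to work out where an unfiltered tone above half the sampling rate would end up. Such a tone does not disappear; it folds back to a lower frequency. The sketch below is a simplified model for a single pure tone, and the function name is illustrative.

```python
def aliased_frequency_hz(tone_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after sampling with no low-pass filter."""
    folded = tone_hz % sample_rate_hz
    # Frequencies above half the sample rate mirror back below it.
    return min(folded, sample_rate_hz - folded)


SAMPLE_RATE_HZ = 44_100

print(aliased_frequency_hz(10_000, SAMPLE_RATE_HZ))  # 10000 (below the limit, unchanged)
print(aliased_frequency_hz(25_000, SAMPLE_RATE_HZ))  # 19100 (folds into the audible band)
print(aliased_frequency_hz(30_000, SAMPLE_RATE_HZ))  # 14100
```

An inaudible 30kHz tone would therefore show up as a clearly audible 14.1kHz tone, which is the kind of artifact the filter is there to prevent.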
Audio files with a high sample rate and high bit depth have more detail. Having more detail generally requires a higher bitrate and results in a larger file size.
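For uncompressed audio the relationship is simple multiplication: sample rate times bit depth times number of channels. The sketch below assumes the standard CD parameters of 16 bits and two channels, which are not stated in this article, and uses 24-bit stereo at 192kHz purely as an illustrative high-resolution setting.

```python
def pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels):
    """Bitrate of uncompressed (PCM) audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


# CD audio: 44.1kHz, 16-bit, stereo (assumed standard CD parameters).
print(pcm_bitrate_kbps(44_100, 16, 2))   # 1411.2

# Illustrative high-resolution setting: 192kHz, 24-bit, stereo.
print(pcm_bitrate_kbps(192_000, 24, 2))  # 9216.0
```

The first result matches the 1,411kbps figure commonly quoted for audio CDs.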
The right bitrate for a file depends on the use of the file and the means of delivering the audio. In general, a higher bitrate means more information is stored and therefore better sound quality. The bitrate of an audio CD is always 1,411 kilobits per second (kbps). MP3 bitrates typically range from around 96 to 320kbps, and some streaming services range from around 96 to 256kbps.
High bitrates appeal to audiophiles, but they are not always the best choice. If listeners will be downloading the audio or playing it from physical formats where storage space is not an issue, you can afford a high bitrate. If they’re streaming it, you likely want a lower bitrate so the audio can be streamed without buffering. However, below about 90kbps the human ear will notice a significant drop in quality. High-quality audio also matters little if it is not played back on quality hardware. Users listening on mass-market earbuds or headphones will not get everything that high-fidelity audio could offer. CD-quality audio sounds its best on a professional stereo system that can adequately reproduce the range of frequencies that 1,411kbps is able to accommodate. The average earbuds, and many desktop speakers, cannot reproduce that full range.
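The trade-off between bitrate and delivery becomes concrete when bitrates are converted into file sizes. The sketch below estimates the size of a track from its bitrate and duration, using a hypothetical four-minute song, decimal megabytes (1MB = 1,000,000 bytes), and ignoring metadata and container overhead.

```python
def audio_size_mb(bitrate_kbps, duration_s):
    """Approximate file size in MB for a constant-bitrate track."""
    total_bits = bitrate_kbps * 1000 * duration_s
    return total_bits / 8 / 1_000_000  # bits -> bytes -> megabytes


DURATION_S = 4 * 60  # hypothetical four-minute song

for label, kbps in [("low-bitrate stream", 96),
                    ("high-bitrate stream", 256),
                    ("top-quality MP3", 320),
                    ("CD audio", 1411.2)]:
    print(f"{label:>20}: {audio_size_mb(kbps, DURATION_S):6.2f} MB")
```

Under these assumptions the same song takes roughly 3MB at 96kbps and over 40MB at CD bitrate, which is why streaming services favor the lower end of the range.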
Is the MP3 dead?
The MP3 rose to prominence in the late 1990s and revolutionized the way we listen to music. The format reduced file sizes by as much as 95 percent, allowing listeners to fit whole albums on compact digital devices. Two decades later, the MP3 is dead, according to the people who invented it. The Fraunhofer Institute for Integrated Circuits, a division of the German research institution that developed the MP3 in the late '80s, recently announced that its licensing program for MP3-related patents has been terminated. The institute said that while the MP3 is still popular, it has been outpaced by “more efficient audio formats”. Users will still be able to listen to MP3 files, but without industry support, a shift to newer alternatives looks inevitable.
“Most state-of-the-art media services such as streaming or TV and radio broadcasting use modern codecs such as the AAC (Advanced Audio Coding) family...Those can deliver more features and a higher audio quality at lower bitrates compared to MP3,” said Fraunhofer IIS in a press announcement.
Designed as the successor to the MP3 format, AAC achieves better sound quality at the same bitrate, or comparable quality at a lower bitrate and therefore a smaller file size. AAC at 256kbps is comparable to MP3 at 320kbps. AAC can also encode higher frequencies than MP3, whose encoders typically cut off at or below about 20kHz. While a better compression format, AAC isn’t as widely adopted as MP3, at least not yet. As it stands, not as many media devices support AAC. If compatibility is a primary concern, then MP3 is probably still the better choice. However, the shift toward AAC and other more advanced formats seems inevitable, and MP3 is expected to decline in the coming years.