Dithering and Sample rate conversion before MP3 encoding? Complete Study

Test Flow#2: R8brain SRC and Reaper Dither Output

In this test, the original test tone is first downsampled from 96 KHz to 44.1 KHz using R8brain. The output bit depth is left unchanged at 24-bit and the quality is set to “very high”.

The R8brain output, now 24-bit/44.1KHz, is then dithered using Reaper (File – Batch file/item converter). Sample rate and channels are set to match the source. The output format is set to WAV with both Dither and Noise shaping checked, and the WAV bit depth is set to 16-bit PCM.

Finally, the 16-bit/44.1KHz WAV test tone is converted to a 16-bit/44.1KHz MP3 at the maximum bit rate/quality setting, which is 320kbps. This is the result:

Test 2 result

The Reaper dithering introduces substantial noise around 15000Hz. This is dithering noise. For songs with critical high-frequency content (such as vocals and cymbals), the dithering noise can interfere in this frequency range. However, there are no serious aliasing distortion issues in the MP3.
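Noise concentrated in the upper part of the spectrum is the pattern that noise shaping is designed to produce: the quantization error is pushed away from the midrange and toward high frequencies. The following is a minimal Python sketch of the idea, assuming TPDF dither and a first-order error-feedback shaper. Reaper's actual dither and noise-shaping algorithm is not described in this article and is probably different; the function names and the 16-tap moving average used as a crude low-pass are illustrative assumptions only.

```python
import math
import random

def quantize(samples, shaped, seed=1):
    """Round samples (expressed in 16-bit LSB units) to integers with TPDF dither.

    When shaped is True, the previous sample's quantization error is
    subtracted from the next input (first-order error feedback), which
    high-pass filters the error that ends up in the output.
    """
    rng = random.Random(seed)
    out, err = [], 0.0
    for x in samples:
        target = x - err if shaped else x
        dither = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)  # TPDF, +/-1 LSB
        y = math.floor(target + dither + 0.5)
        err = y - target
        out.append(y)
    return out

def power(values):
    """Mean square of a sequence."""
    return sum(v * v for v in values) / len(values)

def low_band_power(noise, taps=16):
    """Power remaining after a crude low-pass (moving average over `taps` samples)."""
    averaged = [sum(noise[i:i + taps]) / taps
                for i in range(len(noise) - taps + 1)]
    return power(averaged)

# Hypothetical test signal: a 1 kHz sine at 44.1 kHz with fractional LSB values,
# standing in for a higher-resolution source.
tone = [8000 * math.sin(2 * math.pi * 1000 * n / 44100) for n in range(20000)]

flat_noise = [y - x for x, y in zip(tone, quantize(tone, shaped=False))]
shaped_noise = [y - x for x, y in zip(tone, quantize(tone, shaped=True))]

print("total noise power  flat:", power(flat_noise), " shaped:", power(shaped_noise))
print("low-band power     flat:", low_band_power(flat_noise),
      " shaped:", low_band_power(shaped_noise))
```

In this sketch the shaped version carries more total noise than the flat version, but less of it falls in the low band. That trade-off is consistent with seeing a visible hump of noise at the top of the spectrum after a dither pass with noise shaping enabled.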

Test Flow#3: SRC and Dithering by R8brain only

In this experiment, the Voxengo R8brain is used for both dithering (by setting output bit depth to 16-bit) and sample rate conversion. The 16-bit/44.1KHz output is then converted to MP3. This is the result:

Test Flow #3 result

As expected, the noise is spread throughout the spectrum, even into the middle frequencies where human ears are most sensitive. This is because the free version of R8brain does not offer any noise shaping for its dithering.
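Without noise shaping, dither noise is white, so it sits at the same level in the midrange as at the top of the spectrum. As a rough worked figure, assuming plain TPDF dither (the article does not state which dither R8brain applies), the noise power is 1/12 LSB² from rounding plus 1/6 LSB² from the dither, or 1/4 LSB² in total. Against a full-scale sine at 16 bits that works out to about 93.3 dB. The short Python sketch below computes that figure and checks the 1/4 LSB² value by simulation; the function names and test signal are illustrative.

```python
import math
import random

def flat_dither_snr_db(bits):
    """SNR of a full-scale sine against TPDF-dithered quantization noise.

    Noise power: 1/12 LSB^2 (rounding) + 1/6 LSB^2 (TPDF dither) = 1/4 LSB^2.
    """
    sine_power = (2 ** (bits - 1)) ** 2 / 2  # sine amplitude of 2^(bits-1) LSB
    return 10 * math.log10(sine_power / 0.25)

def measured_noise_power(n=50000, seed=7):
    """Simulate TPDF-dithered rounding and return the mean squared error in LSB^2."""
    rng = random.Random(seed)
    total = 0.0
    for i in range(n):
        x = 12000 * math.sin(2 * math.pi * 997 * i / 44100)  # fractional LSB values
        dither = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
        y = math.floor(x + dither + 0.5)
        total += (y - x) ** 2
    return total / n

print("theoretical SNR at 16 bits (dB):", flat_dither_snr_db(16))
print("simulated noise power (LSB^2):", measured_noise_power())
```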

Test Flow#4: SRC by R8brain and no external dithering

In this last test, sample rate conversion is done first by R8brain, then the 24-bit/44.1KHz output is fed directly to the Reaper LAME MP3 encoder. The output is a 16-bit/44.1KHz MP3. This is the result:

Test Flow #4 result

Notably, there is no significant dithering noise (unlike in the previous tests). Aliasing distortion is also minimal because the sample rate conversion was done with R8brain before converting to MP3.

Conclusions and Recommendations

Based on the above study, you can conclude that:

a.) The LAME MP3 encoder's internal sample rate conversion does not perform very well. Converting from a 96 KHz sample rate to 44.1 KHz with it produces serious aliasing distortion in the result.

b.) The LAME MP3 encoder does a great job of keeping dithering noise out of the MP3 result, as shown in Test #4, where a 24-bit/44.1KHz test tone is fed to the LAME encoder, yielding a standard 16-bit/44.1KHz MP3.

c.) Using external dithering tools such as those in Reaper or R8brain adds more dithering noise than the LAME MP3 encoder itself. Although there are other great dithering algorithms, the study shows that you do not need to dither your high resolution masters to produce a high quality MP3.

d.) Converting a 24-bit/96KHz WAV test tone directly to MP3 results in a cleaner 16-bit/48KHz 320kbps MP3 (see Test #1). However, MP3s at a 48 KHz sample rate are not as commonly distributed as those at 44.1 KHz. If the 24-bit/96KHz source is converted directly to a 44.1 KHz MP3, the result has significant aliasing distortion.
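To see why a poor sample rate converter produces aliasing, it helps to work out where an out-of-band tone lands. Any content above half the new sample rate (22.05 KHz for a 44.1 KHz target) that is not filtered out first folds back into the audible band. The Python sketch below computes the folded frequency, assuming the worst case of no anti-alias filtering at all; real converters, including LAME's, do filter, just with varying effectiveness. The function name is illustrative.

```python
def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a tone appears when sampled with no anti-alias filter.

    The spectrum repeats every sample_rate_hz and mirrors around half of it,
    so the tone is first wrapped into [0, sample_rate_hz) and then folded
    into [0, sample_rate_hz / 2].
    """
    wrapped = tone_hz % sample_rate_hz
    return min(wrapped, sample_rate_hz - wrapped)

# A 30 kHz component is representable at 96 kHz, but not at 44.1 kHz.
for tone in (10000, 30000, 40000):
    print(tone, "Hz at 96 kHz ->", alias_frequency(tone, 96000), "Hz;",
          "at 44.1 kHz ->", alias_frequency(tone, 44100), "Hz")
```

In a spectrogram of a swept tone, such folded components show up as extra lines that run against the direction of the sweep, assuming the sweep contains content above 22.05 KHz.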

Therefore, the following are recommended for the best MP3 conversion quality:

1.) Perform sample rate conversion on your high resolution masters before the MP3 encoding. Use a quality SRC like R8brain (even the free version would do). Do not enable the dithering feature.

2.) You do not need to dither the high resolution masters (e.g. your 24-bit masters) before MP3 encoding. The LAME MP3 encoder does a great job of avoiding the dither noise that would otherwise come with the reduction in bit depth (from 24-bit to 16-bit), as shown in the tests above.

3.) If your target audio will be used in video and film projects (where a 48KHz sample rate is common), it would be appropriate to use the Test #1 procedure, where high resolution audio such as 24-bit/96KHz yields a cleaner MP3 at 16-bit/48KHz 320kbps. This can be done by encoding the high resolution master directly with the LAME MP3 encoder, without passing it through an external SRC or dithering.

4.) The standard/recommended workflow for most projects would be (your tools may vary):

High resolution audio master ==> Sample rate conversion only (convert to the standard 44.1 KHz MP3 sample rate, bit depth unaltered) ==> LAME MP3 encoder (set to maximum quality) ==> High quality 16-bit/44.1KHz 320kbps MP3

5.) An in-depth listening test is recommended for critical audio projects. After the MP3 conversion, you can do an A-B test (dithered versus not dithered) and decide for yourself.
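For readers who prefer the command line, the recommended workflow above can be sketched as two commands. This is not the toolchain used in the tests: SoX is swapped in for R8brain as the sample rate converter, and the standalone `lame` command stands in for Reaper's LAME export, so results may differ from the graphs shown here. The Python sketch below only builds and prints the commands; the file names and the `build_commands` helper are hypothetical.

```python
import shlex

def build_commands(master_wav, resampled_wav, mp3_path):
    """Return (src_cmd, encode_cmd) for the SRC-then-encode workflow.

    Assumption: SoX replaces R8brain for the sample rate conversion.
    """
    src_cmd = [
        "sox",
        "-D",                    # disable SoX's automatic dither
        master_wav,
        "-b", "24",              # keep the output at 24-bit
        resampled_wav,
        "rate", "-v", "44100",   # very-high-quality resample to 44.1 kHz
    ]
    encode_cmd = [
        "lame",
        "-b", "320",             # 320 kbps
        "-q", "0",               # slowest, highest-quality encoding algorithm
        resampled_wav,
        mp3_path,
    ]
    return src_cmd, encode_cmd

src, enc = build_commands("master_24_96.wav", "master_24_44.wav", "master.mp3")
print(shlex.join(src))
print(shlex.join(enc))
```

To actually run the steps, each list could be passed to `subprocess.run(cmd, check=True)` on a system where both tools are installed.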

Content last updated on June 28, 2012

  • Thanks Eduardo for sharing your findings 🙂

  • Eduardo Ono

    Dear Mr. Maningo.

    I work as a researcher at UNICAMP (State University of Campinas), Brazil, and I like to study some audio technologies in my free time. I have some high resolution audio sources (24-bit/96KHz) that I used to encode in MP3 format. I found your article quite interesting and it has helped me get better MP3 encodes. I have made some experiments myself with your source file (“Swept_24.wav”) using foobar2000 plus LAME 3.99.5 and reached almost the same conclusions. It seems that it is better to downsample first than to feed the file directly to LAME. Foobar has a “simple mode” and an “ultra mode” for sample rate conversion. In “simple mode” it was better to downsample to 48 KHz instead of 44.1 KHz (since the source file has a sample rate of 96 KHz), but I presume that for an 88.2 KHz original it is better to downsample to 44.1 KHz. In “ultra mode” I found practically no difference between the two sample rates.

    Regarding dithering, I’ve seen many articles on the Internet saying that whether or not you dither is irrelevant, since there is no “bit-resolution” for MP3 as there is for WAV or FLAC, for example. Their basic explanation is that MP3 is not encoded in the time domain (amplitude x time) but in the frequency domain, so “bit-resolution” does not apply. Other articles state that MP3 is always encoded at 16-bit resolution. I am not sure which assumption is correct, but I think it deserves further explanation to settle the right use (or non-use) of dithering.

    Best regards,

    Eduardo Ono

  • Emerson Maningo

    Thank you so much Jeremy M. for sharing your findings! It looks like we both arrive at the same conclusions and recommendations BUT using different tools (e.g. you are using Audiofile Engineering software samples and Izotope Dither tools). This is interesting 🙂 I am sending you a reply about your email. Cheers!

    Emerson

  • Jeremy M.

    Great Article!

    I also reran the same test using a Mac with OS X Lion (10.7.4). I also used different software just for comparison. I used iZotope’s RX for the spectral images, dithering (MBIT+), and resampling from 96 KHz to 44.1 KHz. I then used Sample Manager by Audiofile Engineering to batch encode to 320kbps MP3 using the LAME encoder. I reached the same conclusions. See the images below (click the images for a zoomed version).

    1.) This is the original Wave File. I used the “Swept_24.wav” from the test signals provided at the top of this article.

    expt1

    2.) This is the Wave File resampled from 96 KHz to 44.1 KHz using the iZotope RX Advanced Resample plug-in.

    expt2

    3.) This is the wave file dithered from 24 bit to 16 bit, with the sample rate left at 96 KHz.

    expt3

    4.) This is the wave file dithered from 24 bit to 16 bit and resampled to 44.1 KHz.

    expt4

    5.) This is the encoded mp3 file using the LAME encoder and Sample Manager by Audiofile Engineering software. This is a straight encode with no dithering nor any resampling.

    expt5

    6.) This is the encoded mp3 file using the LAME encoder and Sample Manager by Audiofile Engineering software. This MP3 encode had no dithering but was resampled to 44.1 KHz.

    expt6

    7.) This is the encoded mp3 file using the LAME encoder and Sample Manager by Audiofile Engineering software. This MP3 encode was dithered using the iZotope RX dither module and was resampled to 44.1 KHz.

    expt7

    8.) For giggles, this MP3 encode was dithered using the iZotope RX dither module and was not resampled to 44.1 KHz.

    expt8

    Observations:

    1. It is better to down sample from 96 KHz sample rate to 44.1 KHz before doing mp3 encode
    2. Do not dither as it adds too much noise and artifacts
    3. The LAME MP3 encoder already does a form of downsampling, so be aware.

    I’ve gone back and forth on this question for some time: “To dither or not to dither for MP3”. My project is converting my old vinyl to uncompressed WAV audio so I can archive it, while also keeping a second, compressed copy (MP3) for playing out on Serato Scratch Live. I’ve recorded some records at 16 bit and others at 24 bit. Going forward I’ll rip my vinyl at 24 bit to keep as faithful a copy as possible. If I don’t plan on burning to a CD then I assume I can play the higher bit depth and skip the question in the first place. Anyway, I thought I’d share my findings as well.

  • Emerson Maningo

    Thanks Scotty, this keeps me pondering also. As a background of the above test, I’m not using the Reaper rendering tool. I’m using the Reaper Batch converter that allows the user to set the output sample rate. In test 1 above, I set the output sample rate to be the same as the source file. So yes, I am directly feeding the 24/96 source to the LAME encoder, this is the screenshot:

    Obviously, if I tell Reaper that the output sample rate should be the same as the source file, then Reaper should not automatically perform SRC, right? Otherwise this option in the Item Converter would be pretty useless and should be removed from the tool.

    Granting that it won’t automatically convert the sample rate, the audio data would then be handled by the LAME resample function, which converts it to MP3 format along with the SRC, etc. The first graph in test 1 is the LAME result (no SRC done by Reaper):

    But when I set the Reaper to do the SRC, this is the result:

    That is obviously different from LAME resample processing.

  • Scotty

    Don’t forget that Reaper has built-in sample rate conversion that becomes active as soon as a different sample rate is specified for an output file. Therefore, it is not necessarily LAME that’s adding aliasing, noise, etc., but Reaper itself. Have you tried feeding the 24/96 file directly to LAME and using its own resample function?

  • Emerson Maningo

    Probably. I have not seen the actual LAME MP3 encoding source code. It would be exciting to have some LAME developers comment on this topic. It is also possible that they implement their own dithering during the MP3 conversion process. To my ears, MP3s created by LAME from high resolution digital audio (e.g. 24-bit) sound better than those converted from 16-bit.

  • Ringmod

    Is it possible that LAME does a “great job in eliminating dither noise” because it is not really adding dither at all, but simply truncating when going from a higher to a lower bit depth? That could explain the results in #1 and #4.

  • Emerson Maningo

    Hi Scott,
    Yes, at first I didn’t explore much of Adobe Audition 1.5. I dug deeper and found out that it does include a spectral analysis tool, which can be useful in any study related to the audio frequency spectrum. Thank you for dropping by. Cheers.

  • Scott Petrovic

    Cool. I didn’t know Audition had a spectral analysis tool in it. Good study.