MP3Pro vs. MP3
MP3 is now discussed a lot on the net, along with other formats
and sound compression methods: which coder is better, LAME or
Fraunhofer, and whether there is a noticeable difference between
256 Kbit/s and 320 Kbit/s. Some are satisfied with the MP3 format;
others prefer to store music in alternatives such as OGG Vorbis or
LQT. But everybody is still waiting for a better substitute for the
standard MP3, which is already 9 years old.
Improvements can be considered from two points of view: better
sound quality after a file is compressed at a given bitrate, or a
smaller compressed file at the same quality. Which do the developers
aim at?
What came before?
As you know, the size of transferred data matters a lot, and audio
files are usually quite big. What should we do? Should we use the
WMA format promoted by Microsoft, which offers quality comparable
to MP3 (though only up to 192 Kbit/s) in smaller files, but lacks
ID tags and is copy protected? Don't worry - your favorite MP3
format recently (June 14, 2001) got a successor named MP3Pro.
Developers
MP3Pro was developed by Coding Technologies Inc., founded in 1997.
The company develops and markets codecs based on the SBR (Spectral
Band Replication) technology. Coding Technologies collaborates with
the Fraunhofer Institute and Thomson Multimedia and has a number of
respectable investors, such as Heinz Gerhauser, Head of the
Fraunhofer Institute. Coding Technologies therefore has access to
all Fraunhofer projects. The name MP3Pro was given by Thomson
Multimedia, which promoted the format together with RCA.
Nitty-gritty details
MP3Pro is an MP3 codec based on the SBR technology. The technology
was developed because digital music had to be transferred over the
Internet in real time, and because mobile computers and various
portable digital players needed it. A limited data rate or a small
memory size makes it possible to use only low bitrates when
compressing sound into the MP3 or AAC formats. Faster connections
such as ISDN or xDSL do not provide a constant data flow either,
because the Internet is often overloaded.
More or less acceptable quality is achieved with compression at
128 Kbit/s or higher. At low bitrates there are problems: sometimes
the frequency range has to be reduced to fit the audio data, and
sometimes artifacts appear after encoding. This shows that a
psychoacoustic model alone is not enough at bitrates lower than
128 Kbit/s. The idea of the SBR technology is to narrow the
frequency range slightly at the encoding stage by cutting off the
highs, which are then restored at the decoding stage from
information about the lower frequencies.
This means that SBR is applied at the decoding stage. Where, then,
is the information for recovering the highs stored (clearly, the
low frequencies alone are not enough to get the highs)? MP3Pro,
unlike MP3, contains two streams: the first is Layer III, and the
second carries the information for recovering the highs. That is
why a file compressed into MP3Pro (which also has the *.mp3
extension) can be played in an ordinary player as well, but at a
sampling frequency of 22 kHz, since the player can see only the
first stream.
New players that can read the second stream reproduce the sound
at the intended quality.
The first player on the market to support this format is Thomson
mp3PRO Audio Player 1.0.2. It includes a demo version of a coder
that compresses WAV files into MP3Pro (only up to 64 Kbit/s).
Its functions are very simple. It lacks even "repeat" and
"random", but has a playlist. :)
For those who need more functions, and for faithful Winamp users,
there is the mp3PROAudioDecoder 0.98 beta 5 plugin for decoding
MP3Pro files.
The abovementioned player and plugin can also play ordinary MP3
files. But the beta version of the Winamp plugin has two
considerable drawbacks. First, it replaces the standard ID3 tag
editor with its own (which can edit only first-version, i.e. ID3v1,
tags). Second, there is a bug in the playback of MP3 files
compressed with VBR: when you seek to a position inside a track,
it starts playing from the very beginning.
There is also a full coder in the form of a DLL library for
Nero 5.5.4.0, which supports all available bitrates and the
parameters that determine the quality and speed of compression.
Source files can have the extensions *.wav, *.mp3, *.vqf and *.aif.
Test tools
We used several programs for the examination. Winamp v2.76 with
an MP3 decoder from Fraunhofer IIS (v.2.23) was used for decoding
MP3 files. Today there are no special programs for decoding MP3Pro
files directly into WAVE files, and recording with an external
program (e.g. SoundForge) can introduce distortions. SpectraLAB
4.32.14 was used to build and analyze AFCs (amplitude-frequency
characteristics); Analyzer 2000 was used for studying sonograms;
SoundForge 4.5 was used to remove noise picked up during encoding
at the beginning or end of files.
Codecs used: MP3Pro (the full version from Nero Burning ROM
v5.5.4.0), MP3 (LAME Encoder v3.89 beta of 10.07.2001), Windows
Media Audio 8.0 Final (WM8EUtil of 6.10.2001).
Test equipment:
- Intel Celeron 600@855 MHz (9x95 MHz);
- MB Chaintech 6BJM (Intel 440BX);
- Memory 384 MBytes PC133;
- HDD Fujitsu MPH 40 GBytes;
- Creative SoundBlaster Live 5.1.
The F&D SPS-699 acoustic system and the Pioneer
D290 headphones were also used in the tests.
Tests
First of all, I should note that analysis of AFCs and sonograms
doesn't work very well for psychoacoustic algorithms, which is why
our main instrument will be our ears.
For the tests we used music of different styles and synthetic
fragments. The original WAVE file was first encoded by each codec,
then decoded back to WAVE and cleaned up in the wave editor; after
that the AFCs of the original and final WAVE files were compared.
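The comparison step can be sketched in a few lines. This is only an illustration of the idea, not what SpectraLAB does internally: the function names, the frame size and the naive DFT are all assumptions made for the example, and the input is taken to be a plain list of samples at least one frame long.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (slow, but fine for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        spectrum.append(abs(acc) / n)
    return spectrum

def average_response_db(samples, frame_size=64):
    """Average the per-bin magnitude over consecutive frames, in dB."""
    starts = range(0, len(samples) - frame_size + 1, frame_size)
    sums = [0.0] * (frame_size // 2 + 1)
    for s in starts:
        for k, m in enumerate(magnitude_spectrum(samples[s:s + frame_size])):
            sums[k] += m
    # The 1e-12 floor avoids log10(0) for empty bins.
    return [20 * math.log10(max(v / len(starts), 1e-12)) for v in sums]

def response_difference_db(original, decoded, frame_size=64):
    """Per-bin level of the decoded file relative to the original, in dB."""
    a = average_response_db(original, frame_size)
    b = average_response_db(decoded, frame_size)
    return [y - x for x, y in zip(a, b)]
```

A negative value in a bin means the codec attenuated that frequency. As noted above, such a static curve says little about how a psychoacoustic codec actually sounds.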
We included Microsoft's WMA format because it is a direct
competitor of MP3Pro and because it is interesting to take a
closer look at WMA 8.0.
MP3 at 128 Kbit/s will be compared with MP3Pro and WMA at
64 Kbit/s. MP3 at 192 Kbit/s will be considered a competitor to
MP3Pro and WMA at 96 Kbit/s.
Music types:
- Modern dance music (Gala "Keep The Secret", rich in
stereo effects)
- Jazz with live performers (Joe Cocker "Could You Be Loved",
live music with powerful vocals rich in middle and high frequencies)
- Pop music with vocal (Nek "Laura No Esta", bright vocal and
rich mids)
Let's start with the AFC of the dance music, at bitrates of
128 Kbit/s for MP3 and 64 Kbit/s for MP3Pro and WMA.
At frequencies below 10 kHz all codecs look almost identical
(except for a dip at 30 Hz in the MP3, which is hardly noticeable).
The lower graph (an enlarged fragment of the highs) clearly shows
the frequency-range limit of each codec. MP3Pro@64 (@ is followed
by the bitrate) is limited to 16.3 kHz, MP3@128 to 15.5 kHz, and
WMA to 13.5 kHz. All of them sounded almost the same: only the old
MP3 "treated" all frequencies more correctly, and WMA had slightly
muffled mids (but you will never notice it with plastic speakers
and cheap headphones). MP3Pro showed quite acceptable quality even
at such a low bitrate; in frequency balance it was only slightly
edged out by MP3@128.
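Limits like these can also be read off a measured response numerically. The sketch below is one possible way to do it, under assumptions that are not from the article: the response is a list of per-bin levels in dB, bins are evenly spaced `bin_hz` apart starting at 0 Hz, and the cutoff is taken to be the highest bin that is still within `drop_db` of the peak. The 40 dB threshold is an arbitrary choice.

```python
def cutoff_frequency(levels_db, bin_hz, drop_db=40.0):
    """Frequency of the highest bin whose level is within drop_db of the peak."""
    peak = max(levels_db)
    for k in range(len(levels_db) - 1, -1, -1):
        if levels_db[k] >= peak - drop_db:
            return k * bin_hz

# Hypothetical response: flat up to 16.3 kHz, then silence up to 22 kHz.
levels = [-20.0] * 164 + [-90.0] * 57
print(cutoff_frequency(levels, 100.0))
```

The occasional "splashes" above the cutoff that show up in sonograms would push this estimate upward, so a real measurement would need averaging or a stricter threshold.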
Now comes jazz.
MP3@128 has the most accurate result. Although filtering of the
upper range starts at 15.5 kHz, its delivery of the highs up to
that point is more accurate than that of the competitors.
MP3Pro@64 is richer in high-frequency detail, but interestingly,
some of that detail is absent from the original! On the whole, the
sound is good, the lows and the mids are very close to MP3@128,
and the vocal is accurately delivered.
The AFC of WMA differs greatly in character from those of MP3 and
MP3Pro. This suggests that Microsoft changed the coding algorithm
in the new version of its codec; and in jazz WMA is a real
outsider. The mids are indistinct, the highs are blurred, and the
sound is unnatural. The reverberation is lacking, and while Joe
Cocker's vocal sounds good, the female backing vocal is distorted.
However, not a single codec can be given an A for such a
complicated composition. The 3D picture is, in fact, damaged, and
the highs are not transparent at all.
Now let's take a look at the pop music.
The leader here is MP3Pro: although its AFC differed considerably
from the original in the highs, the sound, in general, was
excellent.
MP3 played quite well without "decorating" the highs. WMA couldn't
avoid artifacts in the mids: the sound was somewhat hoarse.
Apparently Microsoft decided to cut off the highs at 15-16 kHz in
order to compress the lower frequencies better.
And now let me turn to the more pleasant high bitrates.
Here I'm not going to show the AFCs for pop and dance, since at
high bitrates the distinction between them disappears completely.
MP3Pro and MP3 have the most accurate graphs, while WMA simply
omits a lot of fine detail. MP3Pro takes the lead in the highs.
MP3 also performs well and cuts off its highs at 18-19 kHz. WMA
has some strange fluctuations at high frequencies. But all the
fragments sound quite good and are hardly distinguishable to an
"average" ear. MP3Pro has a better 3D sound, MP3 has higher
resolution and accuracy, and WMA is a bit too "resonant". The
stereo panorama of WMA is broken, like that of MP3Pro, while
MP3@192 is much closer to the original.
In the pop music WMA doesn't distort Nek's vocal and sounds at
the level of MP3@128-160. As far as frequency balance is
concerned, MP3Pro@96 and MP3@192 are on a par.
Naturally, a higher bitrate improves music that is as difficult
to encode as jazz. Its spectral saturation gives all the codecs
plenty of work.
None of the codecs comes closer than 80% to the original. Such
compositions require at least 320 Kbit/s, and even then their
similarity to the original won't exceed 90-95%.
MP3 has the most realistic sound, though I wish it were more
transparent. WMA has the worst sound: the highs are too loud while
the mids are muffled. MP3Pro sounds quite good, outpacing WMA in
all parameters and losing to MP3@192. However, the latter again
has some odd highs that are absent from the original.
MP3Pro is the closest to the original up to 9 kHz; above that it
becomes inaccurate (see the enlarged fragment).
How does the MP3Pro work?
Let's look once more at several enlarged graphs.
As you can see, MP3Pro and MP3 are very similar in the lows, up
to 7.5-9 kHz. This suggests that MP3Pro compresses the lower
frequency range using the methods of MP3.
Here, MP3 is the closest to the original, while the graph of
MP3Pro differs considerably from the others. At the same time,
listening shows that the sound is very good for this bitrate!
However, a static AFC can't explain many things.
Therefore, we will turn to the sonograms of the original signal,
MP3Pro and WMA.
Something is already clear: you can see the filtering of the
highs - a little above 16 kHz for MP3Pro@64 and a bit below
14 kHz for WMA (though some splashes can be seen up to 16 kHz).
The sonogram shows that WMA treats the highs more correctly; in
MP3Pro the high-frequency components are diffused, while the lows
are closer to the original than in WMA. But if you listen to the
fragments you will find that the highs sound better in MP3Pro and
are less irritating than in WMA.
If you remember, the idea of MP3Pro (or rather of the SBR
technology) is to cut off the highs so that the frequency range
becomes narrower, and then to restore them at the decoding stage
from information about the lower frequencies. Data on the high
frequencies and their amplitude/power must be stored somewhere in
the file, but the detailed algorithm is kept secret. However,
there are several hypotheses. I'd like to consider the most
plausible one, which assumes that the high-frequency range is cut
into parts and encoded separately.
To settle the remaining questions, let's carry out a simple
synthetic test: I will take a signal with white noise from 0 to
6-6.5 kHz and add a 12 kHz tone, which will reveal how the codec
operates once the signal has passed through it.
You can see that the tone has turned into noise. This means that
the codec didn't know whether it was a tone or something else, as
it had information only on the power of the signal. If we assume
that the high-frequency range is cut into parts, then the second
stream of MP3Pro contains data on the signal's power in each part.
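A test signal of this kind is easy to generate. The sketch below is an approximation rather than the exact signal used in the test: the band-limited noise is imitated by a sum of sinusoids with random frequencies and phases, and the sample rate, levels and function names are assumptions made for the example.

```python
import cmath
import math
import random

SAMPLE_RATE = 44100  # Hz; assumed, the article does not state the rate

def make_test_signal(duration_s=1.0, noise_top_hz=6500.0, tone_hz=12000.0,
                     components=200, seed=0):
    """Noise-like band from 0 to noise_top_hz plus a pure tone at tone_hz."""
    rng = random.Random(seed)
    n = int(duration_s * SAMPLE_RATE)
    # Equal-amplitude partials with random frequency and phase stand in
    # for band-limited white noise.
    partials = [(rng.uniform(0.0, noise_top_hz), rng.uniform(0.0, 2 * math.pi))
                for _ in range(components)]
    scale = 0.5 / math.sqrt(components)
    samples = []
    for i in range(n):
        t = i / SAMPLE_RATE
        noise = scale * sum(math.sin(2 * math.pi * f * t + p)
                            for f, p in partials)
        samples.append(noise + 0.25 * math.sin(2 * math.pi * tone_hz * t))
    return samples

def tone_level(samples, freq_hz):
    """Amplitude of the component at freq_hz (a single-bin DFT)."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * i / SAMPLE_RATE)
              for i, x in enumerate(samples))
    return 2 * abs(acc) / len(samples)
```

Comparing `tone_level(signal, 12000.0)` with the level at neighbouring frequencies before and after the codec shows whether the tone survived or was smeared into a band of noise.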
Under this assumption, when encoding at 64 Kbit/s the signal is
divided into 3 parts: 0-8.1 kHz, 8.2-16.3 kHz, and everything
above 16.3 kHz, which the coder deletes. The coder then takes the
part from 8.2 to 16.3 kHz and cuts it into several pieces as well,
calculates the average power of the signal per frame for each
piece and records it in the same frame, where an ordinary player
can't see it. The part from 0 to 8.1 kHz is compressed the usual
way, i.e. by an MP3 coder, and this is exactly the part that
ordinary players can see.
Then comes the decoding stage. The MP3 part is decoded first;
the decoder then selects the middle-frequency part of it
(4.1-8.1 kHz) and shifts it up in pitch to 8.2-16.3 kHz. The
resulting part is divided into pieces, to which the power data
from the frames are applied.
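The hypothesis described above can be modelled in a few lines. This is a toy sketch of the guess made in this article, not the real SBR algorithm (which is not public here): it works on a single frame given as a list of magnitudes per frequency bin, ignores phase, and the number of pieces, the function names and the data layout are all assumptions. The spectrum is assumed to extend past 16.3 kHz.

```python
import math

CORE_TOP_HZ = 8100.0   # top of the part coded as plain MP3 (article's figure)
HIGH_TOP_HZ = 16300.0  # top of the restored part at 64 Kbit/s (article's figure)

def encode_frame(mags, bin_hz, pieces=8):
    """Keep the core band; store only the mean power of each high-band piece."""
    core_bins = int(CORE_TOP_HZ / bin_hz) + 1
    high_end = int(HIGH_TOP_HZ / bin_hz) + 1
    core = list(mags[:core_bins])
    high = list(mags[core_bins:high_end])  # bins above high_end are dropped
    size = math.ceil(len(high) / pieces)
    powers = []
    for s in range(0, len(high), size):
        chunk = high[s:s + size]
        powers.append(sum(m * m for m in chunk) / len(chunk))
    return {"core": core, "powers": powers,
            "piece_size": size, "high_len": len(high)}

def decode_frame(enc):
    """Rebuild the high band from the upper core band, scaled piece by piece."""
    core = enc["core"]
    n_core = len(core)
    # Octave shift: output bin b takes its shape from core bin b // 2.
    patch = [core[min((n_core + j) // 2, n_core - 1)]
             for j in range(enc["high_len"])]
    out = list(core)
    size = enc["piece_size"]
    for p, target in enumerate(enc["powers"]):
        chunk = patch[p * size:(p + 1) * size]
        power = sum(m * m for m in chunk) / len(chunk)
        gain = math.sqrt(target / power) if power > 0 else 0.0
        out.extend(m * gain for m in chunk)
    return out
```

Fed a flat low band plus a single strong bin at 12 kHz, this model returns the same total power in that region, but spread evenly over the whole piece: the tone comes back as a band of noise, which matches what the synthetic test showed.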
This much work requires at least a Pentium 200 MMX processor,
against a Pentium 90 for MP3.
Performance
Higher requirements for decoding don't necessarily mean the same
for encoding. To make sure, we will carry out a small test of the
codecs' performance.
The composition we have taken lasts 4 min 12 sec. The data are
in the table below.
WMA takes the lead from 64 to 128 Kbit/s. MP3Pro also has steady
results, and the difference between the Quality and Speed settings
is minimal. The usual MP3 has a wide spread in values, with
performance dropping as the bitrate falls.
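For a sense of scale, the size of the 4 min 12 sec test track at each bitrate follows directly from duration times bitrate. The small sketch below assumes a constant bitrate, 1 Kbit = 1000 bits, and ignores tags and headers; the function name is made up for the example.

```python
def encoded_size_bytes(duration_s, bitrate_kbps):
    """Approximate size of a constant-bitrate stream, in bytes."""
    return duration_s * bitrate_kbps * 1000 // 8

DURATION_S = 4 * 60 + 12  # the 4 min 12 sec test composition

for kbps in (64, 96, 128, 192):
    print(kbps, "Kbit/s:", encoded_size_bytes(DURATION_S, kbps), "bytes")
```

At 64 Kbit/s the track takes about 2 MBytes, half of what MP3 needs at 128 Kbit/s.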
Conclusion
Well, MP3Pro and WMA are still unable to replace MP3 and will be
used only at low bitrates. The ability to compress music with
decent quality at 64 Kbit/s is great. You should take into
account, though, that such a low bitrate is sensitive to the
spectral saturation of the encoded sound, so the quality at
64 Kbit/s will depend greatly on the composition. But an ordinary
listener of modern dance music who owns equipment of low or
average quality will be quite satisfied with MP3Pro and WMA at
64 Kbit/s.
The developers of MP3Pro created a convincing illusion of
high-quality sound, and I think a lot of people will be pleased
with this format. It should also be noted that MP3Pro is not copy
protected.
The new format won't replace the usual MP3 for music lovers,
first because of the lack of support for high bitrates, and second
because the SBR technology synthesizes high frequencies rather
than recovers them.
The last thing to point out is that encoding quality at 64 Kbit/s
is much better in the full version than in the demo.
Highs:
- decent quality of the sound at low bitrates
- low system requirements
- high compression degree
Lows:
- lack of high bitrate support
- synthesis of high frequencies from low ones