Libvorbisenc is an encoding convenience library intended to
encapsulate the elaborate setup that libvorbis requires for encoding.
Libvorbisenc gives easy access to all high-level adjustments an
application may require when encoding and also exposes some low-level
tuning parameters to allow applications to make detailed adjustments
to the encoding process.
All the libvorbisenc routines are declared in "vorbis/vorbisenc.h".
Note: libvorbis and libvorbisenc always
encode in a single pass. Thus, all possible encoding setups will work
properly with live input and produce streams that decode properly when
streamed. See the subsection titled "managed bitrate
modes" for details on setting limits on bitrate usage when Vorbis
streams are used in a limited-bandwidth environment.
Libvorbisenc is used only during encoder setup; its function
is to automate initialization of a multitude of settings in a
vorbis_info structure which libvorbis then uses as a reference
during the encoding process. Libvorbisenc plays no part in the
encoding process after setup.
Encode setup using libvorbisenc consists of three steps:
1. high-level initialization of a vorbis_info structure by
calling one of vorbis_encode_setup_vbr() or vorbis_encode_setup_managed()
with the basic input audio parameters (rate and channels) and the
basic desired encoded audio output parameters (VBR quality or ABR/CBR
bitrate)
2. optional adjustment of the basic setup defaults using vorbis_encode_ctl()
3. calling vorbis_encode_setup_init() to
finalize the high-level setup into the detailed low-level reference
values needed by libvorbis to encode audio. The vorbis_info
structure is then ready to use for encoding by libvorbis.
The sampling rate (in samples per second) of the input audio. Common examples are 8000 for telephony, 44100 for CD audio and 48000 for DAT. Note that a mono sample (one center value) and a stereo sample (one left value and one right value) each count as a single sample.
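To make the sample-counting convention concrete, the following sketch uses two hypothetical helpers (they are not part of libvorbis): duration depends only on the sample count and rate, while the number of individual values in an interleaved buffer also depends on the channel count.

```c
#include <stddef.h>

/* One sample covers every channel at one instant, so the duration
 * depends only on the sample count and the sampling rate. */
double duration_seconds(size_t samples, long rate)
{
    return (double)samples / (double)rate;
}

/* Number of individual values held by an interleaved buffer. */
size_t interleaved_values(size_t samples, int channels)
{
    return samples * (size_t)channels;
}
```

One second of 44100 Hz audio is 44100 samples in both mono and stereo; the stereo buffer simply holds twice as many values.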
The number of channels encoded in each input sample. By default,
stereo input modes (two channels) are 'coupled' by Vorbis 1.1 such
that the stereo relationship between the samples is taken into account
when encoding. Stereo coupling may be disabled by using vorbis_encode_ctl() with OV_ECTL_COUPLING_SET.
quality and VBR modes
Vorbis is natively a VBR codec; a user requests a given constant
quality and the encoder keeps the encoding quality constant
while allowing the bitrate to vary. 'Quality' modes (Variable BitRate)
will always produce the most consistent encoding results as well as
the highest quality for the amount of bits used.
A decimal float value requesting a desired quality. Libvorbisenc 1.1 allows quality requests in the range of -0.1 (lowest quality, smallest files) through +1.0 (highest quality, largest files). Quality -0.1 is intended as an ultra-low setting in which low bitrate is much more important than quality consistency. Quality settings 0.0 and above are intended to produce consistent results at all times.
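An application that takes the quality value from user input may want to keep it inside the documented range before passing it on. A minimal sketch, assuming a hypothetical helper named clamp_quality that is not part of libvorbisenc:

```c
/* Hypothetical helper: clamps a requested quality to the range the
 * text gives for libvorbisenc 1.1, -0.1 through +1.0. */
float clamp_quality(float q)
{
    const float lowest = -0.1f;
    const float highest = 1.0f;

    if (q < lowest)
        return lowest;
    if (q > highest)
        return highest;
    return q;
}
```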
The maximum allowed bitrate, set in bits
per second. If the bitrate would otherwise rise such that oversized
frames would underflow the bit-reservoir by consuming banked bits,
bitrate management will force the encoder to use fewer bits per frame
by encoding with a more aggressive psychoacoustic model.
This setting is a hard limit; the bitstream will never be allowed, under
any circumstances, to increase above the specified bitrate over the
average period set by the reservoir; it may momentarily rise over if
inspected on a granularity much finer than the average period across
the reservoir. Normally, the encoder will conserve bits gracefully by
using more aggressive psychoacoustics to shrink a frame when forced
to. However, if the encoder runs out of means of gracefully shrinking
a frame, it will simply take the smallest frame it can otherwise
generate and truncate it to the maximum allowed length. Note that
this is not an error and although it will obviously adversely affect
audio quality, a Vorbis decoder will be able to decode a truncated
frame into audio.
The average desired bitrate of a stream, set
in bits per second. Average bitrate is tracked via a reservoir like
minimum and maximum bitrate; however, the averaging reservoir does not
impose a hard limit; it is used to nudge the bitrate toward the
desired average by slowly adjusting the psychoacoustic aggressiveness.
As such, the reservoir size does not affect the average bitrate
behavior. Because this setting alone is not used to impose hard
bitrate limits, the bitrate of a stream produced using only the
average bitrate constraint will track the average over time
but not necessarily adhere strictly to that average for any given
period. Should a strict localized average be required, average
bitrate should be used along with minimum bitrate and maximum bitrate.
The minimum allowed bitrate, set in bits per second. If
the bitrate would otherwise fall such that undersized frames would
overflow the bit-reservoir with unused bits, bitrate management will
force the encoder to use more bits per frame by encoding with a less
aggressive psychoacoustic model.
This setting is a hard limit; the
bitstream will never be allowed, under any circumstances, to drop
below the specified bitrate over the average period set by the
reservoir; it may momentarily fall under if inspected on a granularity
much finer than the average period across the reservoir. Normally,
the encoder will fill out undersized frames with additional useful
coding information by increasing the perceived quality of the stream.
If the encoder runs out of useful ways to consume more bits, it will
pad frames out with zeroes.
The size of the minimum/maximum bitrate
tracking reservoir, set in bits. The reservoir is used as a 'bit
bank' to average out localized surges and dips in bitrate while
providing predictable, guaranteed buffering behavior for streams to be
used in situations with constrained transport bandwidth. The default
setting is two seconds of average bitrate.
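The default size follows directly from the average bitrate. As a small worked sketch (both helper names are hypothetical and not part of libvorbisenc):

```c
/* Default reservoir size per the text: two seconds of average bitrate. */
long default_reservoir_bits(long average_bps)
{
    return 2 * average_bps;
}

/* Seconds of audio at the average bitrate that a reservoir spans. */
double reservoir_seconds(long reservoir_bits, long average_bps)
{
    return (double)reservoir_bits / (double)average_bps;
}
```

At an average of 128000 bits per second the default reservoir is therefore 256000 bits.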
When a single frame is larger than the maximum allowed overall
bitrate, the bits are 'borrowed' from the bitrate reservoir; if the
reservoir contains insufficient bits to cover the deficit, the encoder
must find some way to reduce the frame size.
When a frame is under the minimum limit, the surplus bits are placed
into the reservoir, banking them for future use. If the reservoir is
already full of banked bits, the encoder is forced to find some way to
make the frame larger.
If the frame size is between the minimum and maximum rates (thus
implying the minimum and maximum allowed rates are different), the
reservoir gravitates toward a fill point configured by the
reservoir bias setting described next. If the reservoir is
fuller than the fill point (a 'surplus of surplus'), the encoder will
consume a number of bits from the reservoir equal to the number of
bits by which the frame exceeds minimum size. If the reservoir is
emptier than the fill point (a 'surplus of deficit'), bits are returned
to the reservoir equaling the current frame's number of bits under the
maximum frame size. The idea of the fill point is to buffer against
both underruns and overruns, by trying to hold the reservoir to a
specific fullness.
Reservoir bias is a setting between 0.0 and 1.0 that biases bitrate
management toward smoothing bitrate spikes (0.0) or bitrate dips
(1.0); the default setting is 0.1.
Using settings toward 0.0 causes the bitrate manager to hoard bits in
the bit reservoir such that there is a large pool of banked surplus to
draw upon during short spikes in bitrate. As a result, the encoder
will react less aggressively and less drastically to curtail frame size
during brief surges in bitrate.
Using settings toward 1.0 causes the bitrate manager to empty the bit
reservoir such that there is a large buffer available to store surplus
bits during sudden drops in bitrate. As a result, the encoder will
react less aggressively and less drastically to support minimum frame
sizes during drops in bitrate and will tend not to store any extra
bits in the reservoir for future bitrate spikes.
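The reservoir rules can be illustrated with a simplified model. This sketch follows the prose only and is not libvorbis's actual bookkeeping; all names are hypothetical. It makes two assumptions the text does not state: the fill point is a linear function of the bias (0.0 keeps the reservoir full of banked bits, 1.0 keeps it empty), and the reservoir stops at the fill point rather than overshooting it.

```c
/* Simplified model of the reservoir rules in the prose; all names are
 * hypothetical and this is not libvorbis's actual bookkeeping. */
typedef struct {
    long size;       /* reservoir capacity, in bits */
    long fill;       /* bits currently banked */
    long fill_point; /* level the reservoir gravitates toward */
    long min_bits;   /* frame size at the minimum bitrate */
    long max_bits;   /* frame size at the maximum bitrate */
} reservoir;

/* ASSUMPTION: linear mapping in which bias 0.0 keeps the reservoir
 * full of banked bits and bias 1.0 keeps it empty. */
long fill_point_from_bias(long size, double bias)
{
    return (long)((1.0 - bias) * (double)size);
}

/* Accounts for one frame; returns the size the frame ends up with. */
long reservoir_account(reservoir *r, long frame_bits)
{
    if (frame_bits > r->max_bits) {
        /* Oversized frame: borrow the deficit from banked bits. */
        long deficit = frame_bits - r->max_bits;
        if (deficit > r->fill) {
            frame_bits = r->max_bits + r->fill; /* frame must shrink */
            deficit = r->fill;
        }
        r->fill -= deficit;
    } else if (frame_bits < r->min_bits) {
        /* Undersized frame: bank the surplus if there is room. */
        long surplus = r->min_bits - frame_bits;
        long room = r->size - r->fill;
        if (surplus > room) {
            frame_bits = r->min_bits - room; /* frame must grow */
            surplus = room;
        }
        r->fill += surplus;
    } else if (r->fill > r->fill_point) {
        /* In range, reservoir too full: drain toward the fill point. */
        r->fill -= frame_bits - r->min_bits;
        if (r->fill < r->fill_point) /* ASSUMPTION: no overshoot */
            r->fill = r->fill_point;
    } else if (r->fill < r->fill_point) {
        /* In range, reservoir too empty: refill toward the fill point. */
        r->fill += r->max_bits - frame_bits;
        if (r->fill > r->fill_point) /* ASSUMPTION: no overshoot */
            r->fill = r->fill_point;
    }
    return frame_bits;
}
```

Feeding a sequence of frame sizes through reservoir_account() shows how a low bias leaves more banked bits available for spikes, while a high bias leaves more room to absorb drops.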
average track damping
A decimal value, in seconds, that controls how quickly the average
bitrate tracker is allowed to slew from enforcing minimum frame sizes
to maximum frame sizes and vice versa. The default value is 1.5 seconds.
When the 'average bitrate' setting is in use, the average bitrate
tracker uses an unbounded reservoir to track overall bitrate-to-date
in the stream. When bitrates are too low, the tracker will try to
nudge bitrates up and when the bitrate is too high, nudge it down.
The damping value regulates the maximum strength of the nudge; it
describes, in seconds, how quickly the tracker may transition from an
extreme nudge in one direction to an extreme nudge in the other.
In Vorbis 1.1, vorbis_encode_ctl() can
adjust the following additional parameters not described elsewhere:
Configures whether bitrate
management is in use. Normally, this value is set implicitly
during encoding setup; however, the supported means of selecting a
quality mode by bitrate (that is, requesting a true VBR stream, but
doing so by asking for an approximate bitrate) is to use vorbis_encode_setup_managed()
and then to explicitly turn off bitrate management by calling vorbis_encode_ctl() with OV_ECTL_RATEMANAGE2_SET.
Stereo streams (and, in the future, surround
streams) are normally encoded assuming the channels form a stereo
image and that lossy-stereo modelling is appropriate; this is called
'coupling'. Stereo coupling may be explicitly enabled or disabled.
Sets the hard lowpass of a given encoding mode;
this may be used to conserve a few bits in high-rate audio that has
limited bandwidth, or in testing of the encoder's acoustic model. The
encoder is generally already configured with ideal lowpasses (if any
at all) for given modes; use of this parameter is strongly discouraged
if the point is to try to 'improve' a given encoding mode for general
encoding.
impulse coding aggressiveness
By default, libvorbis
attempts to compromise between preventing wide bitrate swings and
high-resolution impulse coding (which is required for the crispest
possible attacks, but also requires a relatively large momentary
bitrate increase). This parameter allows an application to tune the
compromise or eliminate it; a value of 0.0 indicates normal behavior
while a value of -15.0 requests maximum possible impulse
coding aggressiveness.