public class HuffmanCodec extends Object implements PrefixCodec, Serializable
A Huffman coder is built starting from an array of frequencies corresponding to each symbol. Frequency 0 symbols are allowed, but they will degrade the resulting code.
Instances of this class compute a canonical Huffman code (Eugene S. Schwartz and Bruce Kallick, “Generating a Canonical Prefix Encoding”, Commun. ACM 7(3), pages 166−169, 1964), which can by quickly decoded using table lookups. The construction uses the most efficient onepass inplace codelength computation procedure described by Alistair Moffat and Jyrki Katajainen in “InPlace Calculation of MinimumRedundancy Codes”, Algorithms and Data Structures, 4th International Workshop, number 955 in Lecture Notes in Computer Science, pages 393−402, SpringerVerlag, 1995.
We note by passing that this coded uses a CanonicalFast64CodeWordDecoder
, which does not support codelengths above 64.
However, since the worst case for codelengths is given by Fibonacci numbers, and frequencies are to be provided as integers,
no codeword longer than the base[(5^{1/2} + 1)/2] logarithm of 5^{1/2} · 2^{31} (less than 47) will ever be generated.
PrefixCoder
from the
shortest code word, the code word length[], and the symbol[].
Modifier and Type  Class and Description 

static class 
HuffmanCodec.DecoderInputs
Class encapsulates the data necessary to reconstruct a
CanonicalFast64CodeWordDecoder or recreate the code. 
Modifier and Type  Field and Description 

int 
size
The number of symbols of this coder.

Constructor and Description 

HuffmanCodec(int[] frequency)
Creates a new Huffman codec using the given vector of frequencies.

HuffmanCodec(int[] frequency,
HuffmanCodec.DecoderInputs decoderInputs)
Creates a new Huffman codec using the given vector of frequencies.

Modifier and Type  Method and Description 

CodeWordCoder 
coder()
Returns a coder for the compression technique represented by this coded.

BitVector[] 
codeWords()
Returns the vector of prefixfree codewords used by this prefix coder.

Decoder 
decoder()
Returns a decoder for the compression technique represented by this coded.

static PrefixCoder 
newCoder(BitVector shortestCodeWord,
int[] length,
int[] symbol)
(Re)constructs the canonical huffman code from the shortest code word,
the nondecreasing bit lengths of each code word, and the permutation of
the symbols corresponding to those bit lengths.

static PrefixCoder 
newCoder(HuffmanCodec.DecoderInputs decoderInputs)
(Re)constructs the canonical huffman code from the shortest code word,
the nondecreasing bit lengths of each code word, and the permutation of
the symbols corresponding to those bit lengths.

int 
size()
Returns the number of symbols handled by this codec.

public HuffmanCodec(int[] frequency)
frequency
 a vector of nonnnegative frequencies.public HuffmanCodec(int[] frequency, HuffmanCodec.DecoderInputs decoderInputs)
frequency
 a vector of nonnegative frequencies.decoderInputs
 The inputs necessary to reconstruct a
CanonicalFast64CodeWordDecoder
will be set on this
object.public CodeWordCoder coder()
Codec
coder
in interface Codec
coder
in interface PrefixCodec
public Decoder decoder()
Codec
public int size()
Codec
public BitVector[] codeWords()
PrefixCodec
codeWords
in interface PrefixCodec
public static PrefixCoder newCoder(HuffmanCodec.DecoderInputs decoderInputs)
decoderInputs
 This contains the necessary and sufficient information to
recreate the PrefixCoder
.PrefixCoder
instance for the corresponding
canonical huffman code.public static PrefixCoder newCoder(BitVector shortestCodeWord, int[] length, int[] symbol)
shortestCodeWord
 The code word with the shortest bit length.length
 The bit length of each code word in the nondecreasing order
assigned when the code was generated. The length of this array
is the #of symbols in the code.symbol
 The permutation of the symbols in the assigned when the
canonical huffman code was generated. The length of this array
is the #of symbols in the code.PrefixCoder
instance for the corresponding
canonical huffman code.HuffmanCodec.DecoderInputs
Copyright © 2006–2019 SYSTAP, LLC DBA Blazegraph. All rights reserved.