Huffman Coding

In computer science and information theory, a Huffman code is a particular type of optimal prefix code that is commonly used for lossless data compression. The process of finding or using such a code is Huffman coding, an algorithm developed by David A. Huffman.

| Group | Symbol (a_i) | a | b | c | d | e | Total |
|---|---|---|---|---|---|---|---|
| Input (A, W) | Weights (w_i) | 0.10 | 0.15 | 0.30 | 0.16 | 0.29 | Sum = 1 |
| Output C | Codewords (c_i) | 010 | 011 | 11 | 00 | 10 | |
| Output C | Codeword length in bits (l_i) | 3 | 3 | 2 | 2 | 2 | |
| Output C | Contribution to weighted path length (l_i w_i) | 0.30 | 0.45 | 0.60 | 0.32 | 0.58 | L(C) = 2.25 |
| Optimality | Probability budget (2^(−l_i)) | 1/8 | 1/8 | 1/4 | 1/4 | 1/4 | = 1.00 |
| Optimality | Information content in bits (−log2 w_i) | ≈ 3.32 | 2.74 | 1.74 | 2.64 | 1.79 | |
| Optimality | Contribution to entropy (−w_i log2 w_i) | 0.332 | 0.411 | 0.521 | 0.423 | 0.518 | H(A) = 2.205 |

For any code that is biunique, meaning that the code is uniquely decodable, the sum of the probability budgets across all symbols is always less than or equal to one. In the example above the sum is exactly one, and the weighted average codeword length L(C) = 2.25 bits/symbol is only slightly larger than the entropy H(A) = 2.205 bits/symbol.

In a separate example (a different source than the one tabulated above), using a Huffman code lowers the average length to 1.85 bits/symbol; that is still far from the theoretical limit because the probabilities of the symbols are different from negative powers of two.

The technique works by creating a binary tree of nodes. Huffman's original algorithm is optimal for symbol-by-symbol coding with a known input probability distribution, i.e., separately encoding unrelated symbols in such a data stream.
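The tree construction and the quantities in the five-symbol example above (weights 0.10, 0.15, 0.30, 0.16, 0.29) can be checked with a short Python sketch. It is a minimal illustration, not a reference implementation: the function name `huffman_code`, the use of a heap, and the convention that the lighter of the two merged subtrees receives bit 0 are all assumptions made here. Other tie-breaking or labelling choices give different bit patterns with the same codeword lengths.

```python
import heapq
import math
from itertools import count


def huffman_code(weights):
    """Return {symbol: codeword} for a dict mapping symbols to weights.

    Illustrative sketch: repeatedly merge the two lightest subtrees,
    prefixing '0' to every codeword in the lighter one and '1' to the other.
    """
    if len(weights) == 1:
        # A single symbol still needs one bit to be representable.
        return {next(iter(weights)): "0"}

    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(w, next(tiebreak), {sym: ""}) for sym, w in weights.items()]
    heapq.heapify(heap)

    while len(heap) > 1:
        w0, _, lighter = heapq.heappop(heap)
        w1, _, heavier = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in lighter.items()}
        merged.update({s: "1" + c for s, c in heavier.items()})
        heapq.heappush(heap, (w0 + w1, next(tiebreak), merged))

    return heap[0][2]


# Weights taken from the example in the text.
weights = {"a": 0.10, "b": 0.15, "c": 0.30, "d": 0.16, "e": 0.29}
code = huffman_code(weights)

# Weighted path length L(C): sum of l_i * w_i.
avg_len = sum(weights[s] * len(code[s]) for s in weights)
# Sum of the probability budgets 2^(-l_i); at most 1 for a uniquely decodable code.
kraft = sum(2.0 ** -len(code[s]) for s in weights)
# Entropy H(A): sum of -w_i * log2(w_i).
entropy = -sum(w * math.log2(w) for w in weights.values())

for sym in sorted(code):
    print(sym, code[sym])
print(f"L(C) = {avg_len:.2f}, budget sum = {kraft:.2f}, H(A) = {entropy:.3f}")
```

With these weights no two subtrees ever tie, so the codeword lengths (3, 3, 2, 2, 2) do not depend on the tie-breaking counter; only the 0/1 labelling is a free choice.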
