The main intent of a communication system is to transfer messages from a sender to a receiver. In this process, the signal used to transmit the information should conform to the properties of the transmission channel. Information coding is used to transform the message into the signal, i.e., the message is encoded into codewords. The primary difference between Huffman coding and Shannon Fano coding is that Huffman coding constructs an optimal variable-length encoding directly from the symbol probabilities, whereas Shannon Fano coding only guarantees codeword lengths that satisfy the Kraft inequality, the condition under which a prefix code with those lengths exists. This can be explained as follows: if each valid codeword length l is taken in exponential form as 2^(-l), the resulting group of values sums to at most 1 and therefore behaves like a probability mass function.
Here, a prefix code means a code in which no codeword forms the prefix of any other codeword.
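The Kraft inequality and the prefix property can be checked with a short Python sketch (the helper names here are illustrative, not part of either algorithm):

```python
# Sketch: for binary codeword lengths l_i, a prefix code with those
# lengths exists exactly when sum(2 ** -l_i) <= 1 (Kraft inequality).
def kraft_sum(lengths):
    """Return the sum of 2^(-l) over all codeword lengths."""
    return sum(2 ** -l for l in lengths)

def is_prefix_free(codewords):
    """True if no codeword is a prefix of any other codeword."""
    return not any(
        a != b and b.startswith(a) for a in codewords for b in codewords
    )

lengths = [1, 2, 3, 3]          # e.g. the codewords 0, 10, 110, 111
print(kraft_sum(lengths))       # 1.0, so the Kraft inequality holds
print(is_prefix_free(["0", "10", "110", "111"]))  # True
```

A sum of exactly 1.0 means the code is complete: no further codeword could be added without violating the prefix property.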
| Basis for comparison | Huffman Coding | Shannon Fano Coding |
| --- | --- | --- |
| Basic | Based on source symbol probabilities | Based on the cumulative distribution function |
| Developed by | David Huffman | Claude Shannon and Robert Fano |
| Invented in the year | 1952 | 1949 |
Definition of Huffman Coding
Huffman coding is a lossless compression algorithm, commonly applied to the compression of ASCII text. In this algorithm, the Huffman codes, which are prefix codes, are generated from a group of probabilities. It was designed by David Huffman in 1952, while he was studying information theory at MIT. The algorithm follows a bottom-up approach: a binary tree is built by repeatedly merging the two least probable nodes, which yields a code of minimal expected length.
It works by translating the characters contained in a data file into binary codes. The most frequent characters in the file receive the shortest binary codes, and the least frequent characters receive the longest ones. It is a widely used method for lossless text compression, where the compressed data can be restored to its original form.
Now, let us understand the algorithm with the help of some steps:
- In the very first step, sort the symbols or characters in descending order of probability.
- If two probabilities are equal, sort the corresponding symbols by index in descending order.
- Then select the two symbols with the lowest probabilities, assign the upper one the bit '1' and the lower one the bit '0', and combine them into a new symbol whose probability is the sum of the two.
- Re-sort the symbols as in the first step.
- If two probabilities are equal, place the newest symbol below the old one.
- Repeat steps 3 to 5 until the combined probability reaches 1.0.
- Finally, read off the binary codeword of each symbol from the resulting tree.
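The merging procedure above can be sketched in Python (an illustrative implementation, not the article's own; a heap stands in for the repeated re-sorting of steps 4 and 5):

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build Huffman codes for the characters of `text` by repeatedly
    merging the two least frequent nodes, bottom-up."""
    freq = Counter(text)
    # Each heap entry: (frequency, unique tie-breaker, {symbol: code-so-far})
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, codes1 = heapq.heappop(heap)  # lowest frequency
        f2, _, codes2 = heapq.heappop(heap)  # second lowest
        # Prepend one bit to every codeword in each merged subtree.
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

# 'a' is the most frequent letter, so it receives a 1-bit code;
# the rarer letters b, r, c, d each receive a 3-bit code here.
codes = huffman_codes("abracadabra")
```

The unique integer tie-breaker in each heap entry keeps the tuple comparison from ever reaching the (non-comparable) dictionaries.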
Definition of Shannon Fano Coding
Similar to Huffman coding, the Shannon Fano algorithm is used to create a uniquely decodable code. It was developed by Claude Shannon and Robert Fano in the year 1949, earlier than the Huffman coding algorithm. This technique also uses the probabilities of the data to encode it; however, it does not ensure optimal code generation. It is regarded as a technique for constructing prefix codes from a group of symbols and their probabilities.
Furthermore, the codeword lengths of these codes are l(x) = ⌈log2(1/P(x))⌉, and codes with these lengths are known as Shannon codes. As mentioned above, the Shannon codeword lengths fulfil the Kraft inequality and can therefore be used to generate uniquely decodable codes.
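As a small illustration (the probability values here are an example of my own, not from the article), the Shannon codeword lengths can be computed directly from this formula:

```python
import math

# Shannon codeword lengths l(x) = ceil(log2(1 / P(x))) for an
# example probability distribution over four symbols.
probs = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}
lengths = {s: math.ceil(math.log2(1 / p)) for s, p in probs.items()}
print(lengths)  # {'a': 2, 'b': 2, 'c': 3, 'd': 4}

# The lengths satisfy the Kraft inequality, so a prefix code exists,
# but the slack below 1.0 shows the code need not be optimal.
print(sum(2 ** -l for l in lengths.values()))
```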
Here are the steps of the algorithm:
- In this algorithm too, sort the symbols in descending order of probability (frequency of occurrence).
- If two symbols have the same frequency, sort them in order of increasing symbol index.
- Then divide the symbols into two sets whose total probabilities differ as little as possible.
- Repeat step 3 on each set until every set contains only one symbol.
- Once finished, read the code of each symbol from the resulting binary tree.
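The splitting procedure above can be sketched in Python (an illustrative implementation, not the article's own; the input is assumed to be sorted by descending probability, per step 1):

```python
def shannon_fano(symbols):
    """Recursively split a probability-sorted list of (symbol, probability)
    pairs into two sets with the smallest possible difference in total
    probability, giving one set the bit '0' and the other the bit '1'."""
    if len(symbols) == 1:
        return {symbols[0][0]: ""}
    total = sum(p for _, p in symbols)
    # Find the split index that minimizes |left total - right total|.
    best_i, best_diff, running = 1, float("inf"), 0.0
    for i in range(1, len(symbols)):
        running += symbols[i - 1][1]
        diff = abs(running - (total - running))
        if diff < best_diff:
            best_diff, best_i = diff, i
    codes = {s: "0" + c for s, c in shannon_fano(symbols[:best_i]).items()}
    codes.update(
        {s: "1" + c for s, c in shannon_fano(symbols[best_i:]).items()}
    )
    return codes

table = [("a", 0.4), ("b", 0.3), ("c", 0.2), ("d", 0.1)]
print(shannon_fano(table))  # {'a': '0', 'b': '10', 'c': '110', 'd': '111'}
```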
Key Differences Between Huffman Coding and Shannon Fano Coding
- Huffman coding builds on the source symbol probabilities and the prefix code condition, while Shannon Fano coding is based on the cumulative distribution function.
- The codes produced by the Shannon Fano method are not always optimal, whereas Huffman coding always produces an optimal prefix code.
Of the two encoding methods, Huffman coding is the more efficient and, unlike Shannon Fano coding, is always optimal.