Explanation
- **Huffman Coding** is a **lossless data compression** algorithm that assigns variable-length codes to input characters based on their frequencies.
-
- The idea is to give shorter codes to characters that appear more frequently and longer codes to less frequent characters. This results in compressed data representation.
-
- It is widely used in file compression formats like **ZIP** and **JPEG**.
-
-
Steps
- Build a frequency table of characters in the input.
- Build a priority queue (min-heap) using the frequency of each character.
- Build a Huffman Tree: Combine the two nodes with the lowest frequency until there is only one node left. This node is the root of the Huffman tree.
- Generate Huffman codes: Traverse the tree to assign binary codes to each character.
- Encode the input data using the generated Huffman codes.
- Decode the data by traversing the Huffman tree.
-
Time Complexity
- Building the frequency table: O(n), where n is the number of characters.
- Building the heap: O(n log n), where n is the number of unique characters.
- Encoding: O(n), where n is the length of the input.