算法-二叉树2-霍夫曼编码

分类: 数据结构与算法 |
一、基本概念
霍夫曼编码是一种古老但是及其有效的数据压缩方法。在计算机中,每一个字符在没有压缩时都占有一个字节(unicode为两个字节),每个字符都有相同的位数(8位或16位)。但在实际使用时,高频字符和低频字符使用相同固定长度,必然会增加总的字节长度,因为我们可以降低高频字符所占位数,容忍低频字符位数较多的情况。
霍夫曼编码就是基于上面的思路,将高频字符使用位数较小的编码表示,而低频字符则使用相对较长的位数。需要注意的是,压缩后的高频字符不能是其他字符的前缀,因为这样在串行读入整个消息时无法区分是高频字符还是其他字符的一部分。例如如果用01表示E,0111101表示X,那么无法判断。
二、霍夫曼编码——霍夫曼树
那么如何进行霍夫曼编码呢?霍夫曼利用二叉树构建了霍夫曼树,算法如下:
1.
首先,霍夫曼树的节点包含两种信息,字符权重(出现频率)和字符本身:
public class HuffmanNode implements Comparable{
}
2.
算法如下:
1)
2)
3)
4)重复3)中过程,直到队列中只有一棵树,这棵树就是这条信息的霍夫曼树
http://s15/mw690/002i2qf8zy7dFprk3j85e&690
private HuffmanNode root;
private Map, Integer> frequencyMap = new HashMap<>();
public Map, Integer> getFrequencyMap() {
return frequencyMap;
}
public void setFrequencyMap(String rawMessage) {
for(int i = 0; i < rawMessage.length(); i++) {
; String character = String.valueOf(rawMessage.charAt(i))
(null == frequencyMap.get(character)) { if
frequencyMap.put(character, 1);
}
else {
int count = frequencyMap.get(character);
frequencyMap.put(character, ++count);
}
}
}
public void buildHuffmanTree() {
; PriorityQueue queue = initHuffmanQueue()
(queue.size() > 1) { while
; addHuffmanQueue(queue)
}
root = queue.poll();
}
private PriorityQueue initHuffmanQueue() {
new PriorityQueue<>(); PriorityQueue result =
(Map.Entry, Integer> entry : for frequencyMap.entrySet()) {
new HuffmanNode(); HuffmanNode node =
node.setCharater(entry.getKey());
node.setWeight(entry.getValue());
result.add(node);
}
return result;
}
private void addHuffmanQueue(PriorityQueue queue) {
; HuffmanNode first = queue.poll()
HuffmanNode second = queue.poll();
HuffmanNode node = new HuffmanNode();
node.leftNode = first;
node.rightNode = second;
node.setWeight(first.getWeight() + second.getWeight());
queue.add(node);
}
3.
霍夫曼树构建好了我们就可以得到一张编码表,也就是字符到霍夫曼编码的映射关系,具体实现如下:
private static String ZERO = "0";
private static String ONE = "1";
private Map, String> encodeTable = new HashMap<>();
public Map, String> getEncodeTable() {
if(encodeTable.isEmpty()) {
root, ""); generateEncodeTable(
}
return encodeTable;
}
private void generateEncodeTable(HuffmanNode node, String binaryNum) {
if(node.leftNode == null && node.rightNode == null) {
, binaryNum); encodeTable.put(node.getCharater()
} return;
leftNode, binaryNum + ZERO); generateEncodeTable(node.
generateEncodeTable(node.rightNode, binaryNum + ONE);
}
4.
编码的过程就是将信息转换成霍夫曼编码,具体实现如下:
public String encode(String rawMessage) {
""; String result =
Map, String> encodeTable = getEncodeTable();
(int i = 0; i < rawMessage.length(); i++) { for
; String character = String.valueOf(rawMessage.charAt(i))
result += encodeTable.get(character);
}
return result;
}
三、霍夫曼解码
下面会详细阐述如何进行简单的解码。假设我们有信息“SUSIE SAYS IT IS EASY”的霍夫曼编码表如下:
" " -> "00"
"A" -> "1110"
"S" -> "10"
"T" -> "0110"
"U" -> "0111"
"E" -> "1111"
"Y" -> "010"
"I" -> "110"
霍夫曼树如下:
http://s16/mw690/002i2qf8zy7dFpsssZFdf&690
对于给定的压缩码100111101101111001011100
从树根开始,如果1则走向右子树,如果是0走左子树,直到找到叶节点(包含字符)。然后回到树根,重复上面过程直到压缩码被解压完成。
下图为上述过程的一个例子:
实现如下:
public String decode(String encoded) {
""; String result =
HuffmanNode currentNode = root;
(int i = 0; i < encoded.length(); i++) { for
if(currentNode.leftNode == null && currentNode.rightNode == null){
; result = result + currentNode.getCharater()
currentNode = root;
}
; String binaryNum = String.valueOf(encoded.charAt(i))
(ZERO.equals(binaryNum)) { if
leftNode; currentNode = currentNode.
}
else {
rightNode; currentNode = currentNode.
}
if(currentNode.leftNode == null && currentNode.rightNode == null){ }
; result = result + currentNode.getCharater()
}
return result;
}
四、全部实现代码
public class HuffmanTree {
private static String ZERO = "0";
String ONE = "1"; private static
HuffmanNode root; private
Map, Integer> frequencyMap = new HashMap<>(); private
Map, String> encodeTable = new HashMap<>(); private
Map, Integer> getFrequencyMap() { public
return frequencyMap;
}
public void setFrequencyMap(String rawMessage) {
for(int i = 0; i < rawMessage.length(); i++) {
; String character = String.valueOf(rawMessage.charAt(i))
(null == frequencyMap.get(character)) { if
frequencyMap.put(character, 1);
}
else {
int count = frequencyMap.get(character);
frequencyMap.put(character, ++count);
}
public void buildHuffmanTree() { }
}
; PriorityQueue queue = initHuffmanQueue()
(queue.size() > 1) { while
; addHuffmanQueue(queue)
}
root = queue.poll();
}
public String decode(String encoded) {
""; String result =
HuffmanNode currentNode = root;
(int i = 0; i < encoded.length(); i++) { for
if(currentNode.leftNode == null && currentNode.rightNode == null){
; result = result + currentNode.getCharater()
currentNode = root;
}
; String binaryNum = String.valueOf(encoded.charAt(i))
(ZERO.equals(binaryNum)) { if
leftNode; currentNode = currentNode.
}
else {
rightNode; currentNode = currentNode.
}
if(currentNode.leftNode == null && currentNode.rightNode == null){ }
; result = result + currentNode.getCharater()
}
return result;
}
public String encode(String rawMessage) {
""; String result =
Map, String> encodeTable = getEncodeTable();
(int i = 0; i < rawMessage.length(); i++) { for
; String character = String.valueOf(rawMessage.charAt(i))
result += encodeTable.get(character);
}
return result;
}
public Map, String> getEncodeTable() {
if(encodeTable.isEmpty()) {
root, ""); generateEncodeTable(
}
return encodeTable;
}
private void generateEncodeTable(HuffmanNode node, String binaryNum) {
if(node.leftNode == null && node.rightNode == null) {
encodeTable.put(node.getCharater(), binaryNum);
} return;
leftNode, binaryNum + ZERO); generateEncodeTable(node.
generateEncodeTable(node.rightNode, binaryNum + ONE);
}
private PriorityQueue initHuffmanQueue() {
new PriorityQueue<>(); PriorityQueue result =
(Map.Entry, Integer> entry : for frequencyMap.entrySet()) {
new HuffmanNode(); HuffmanNode node =
node.setCharater(entry.getKey());
node.setWeight(entry.getValue());
result.add(node);
}
return result;
}
private void addHuffmanQueue(PriorityQueue queue) {
; HuffmanNode first = queue.poll()
HuffmanNode second = queue.poll();
HuffmanNode node = new HuffmanNode();
node.leftNode = first;
node.rightNode = second;
node.setWeight(first.getWeight() + second.getWeight());
queue.add(node);
}
public HuffmanNode getRoot() {
return root;
}
public void setRoot(HuffmanNode root) {
this.root = root;
}
public void inOrder(HuffmanNode node) {
if(node == null) {
return;
}
leftNode); inOrder(node.
System.out.println(node.getCharater() + " weight:" + node.getWeight());
inOrder(node.rightNode);
}
}