Questions:
1. (10 marks) Suppose eight characters have a distribution A:(1), B:(1), C:(1), D:(2), E:(3), F:(5), G:(5), H:(10). Draw a Huffman tree for this distribution. (Because the algorithm may group subtrees with equal probability in a different order, your answer is not strictly unique.)
Answer:
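The code lengths (though not the exact tree shape, which is not unique) can be checked by running the greedy merging procedure. The sketch below is illustrative and not part of the original question; huffman_codes is a hypothetical helper name:

```python
import heapq

def huffman_codes(freqs):
    """Build a Huffman tree bottom-up and return a code for each symbol."""
    # Each heap entry: (weight, tiebreak, tree), where a tree is either
    # a bare symbol or a (left, right) pair of subtrees.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        # Repeatedly merge the two lowest-weight subtrees.
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, count, (t1, t2)))
        count += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")   # left branch gets a 0
            walk(tree[1], prefix + "1")   # right branch gets a 1
        else:
            codes[tree] = prefix or "0"
    walk(heap[0][2], "")
    return codes

freqs = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 3, "F": 5, "G": 5, "H": 10}
codes = huffman_codes(freqs)
for sym in sorted(codes):
    print(sym, codes[sym])
```

Whatever tie-breaking order is used, the weighted total code length of an optimal tree for this distribution is 74 bits (the sum of all internal-node weights: 2 + 3 + 5 + 8 + 10 + 18 + 28).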
2. (30 marks)
(a) (10 marks) What is the entropy (η) of the image below, where numbers (0, 20, 50, 99) denote the gray-level intensities?
99 99 99 99 99 99 99 99
20 20 20 20 20 20 20 20
 0  0  0  0  0  0  0  0
 0  0 50 50 50 50  0  0
 0  0 50 50 50 50  0  0
 0  0 50 50 50 50  0  0
 0  0 50 50 50 50  0  0
 0  0  0  0  0  0  0  0
(b) (15 marks) Show step by step how to construct the Huffman tree to encode the above four intensity values in this image. Show the resulting code for each intensity value.
(c) (5 marks) What is the average number of bits needed for each pixel, using your Huffman code? How does it compare to η?
Answer:
(a) P20 = P99 = 1/8; P50 = 1/4; P0 = 1/2. Therefore, η = (1/2)·1 + (1/4)·2 + 2·(1/8)·3 = 1.75 bits per pixel.
(b) Only the final tree is shown below. Resulting code: 0: “1”, 50: “01”, 20: “000”, 99: “001”
(c) Average number of bits = 0.5 · 1 + 0.25 · 2 + 2 · 0.125 · 3 = 1.75. This happens to be identical to η; equality occurs only when all probabilities are powers of two, i.e., of the form 2^k for an integer k. Otherwise, the Huffman average will be larger than η.
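Both numbers can be verified with a few lines of Python (a sketch; the image literal and the code_len table are transcribed from parts (a) and (b)):

```python
from collections import Counter
from math import log2

# The 8x8 image from part (a), written row by row.
image = (
    [99] * 8 + [20] * 8 + [0] * 8
    + [0, 0, 50, 50, 50, 50, 0, 0] * 4
    + [0] * 8
)

counts = Counter(image)
n = len(image)

# Entropy: eta = sum over intensities of p * log2(1/p).
eta = sum((c / n) * log2(n / c) for c in counts.values())

# Average bits per pixel under the Huffman code from part (b):
# 0 -> "1", 50 -> "01", 20 -> "000", 99 -> "001".
code_len = {0: 1, 50: 2, 20: 3, 99: 3}
avg = sum((c / n) * code_len[v] for v, c in counts.items())

print(eta, avg)  # both equal 1.75
```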
3. (20 marks) Arithmetic Coding and Huffman Coding are two popular lossless compression methods.
(a) (5 marks) What are the advantages and disadvantages of Arithmetic Coding as compared to Huffman Coding?
(b) (15 marks) Suppose the alphabet is [A, B, C], and the known probability distribution is PA = 0.5, PB = 0.4, PC = 0.1. For simplicity, let’s also assume that both encoder and decoder know that the length of the messages is always 3, so there is no need for a terminator.
i. (7 marks) How many bits are needed to encode the message BBB by Huffman coding?
ii. (8 marks) How many bits are needed to encode the message BBB by arithmetic coding?
Answer:
(a) The main advantage of Arithmetic Coding over Huffman Coding is that it can spend a fractional number of bits per symbol. In Huffman Coding, the minimum code length for a symbol is 1 bit, since the code is read off a binary tree with 0 or 1 attached to each branch; Arithmetic Coding has no such lower bound. Its main disadvantage is higher computational cost per symbol.
(b)
i. 6 bits. Huffman code: A - 0, B - 10, C - 11; or A - 1, B - 00, C - 01. Either way, B is coded with 2 bits, so BBB takes 3 × 2 = 6 bits.
ii. 4 bits. B occupies the interval [0.5, 0.9), so the interval for BBB is [0.78, 0.844), of width 0.4^3 = 0.064. The 4-bit binary fraction 0.1101 (= 0.8125) lies inside this interval, so 4 bits are enough; since the decoder knows the message length is 3, no terminator is needed.
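The interval narrowing for BBB can be sketched as follows (encode_interval is an illustrative helper, assuming the conventional cumulative intervals A → [0, 0.5), B → [0.5, 0.9), C → [0.9, 1.0)):

```python
# Arithmetic-coding encoder sketch for the distribution in 3(b):
# PA = 0.5, PB = 0.4, PC = 0.1.
intervals = {"A": (0.0, 0.5), "B": (0.5, 0.9), "C": (0.9, 1.0)}

def encode_interval(message):
    """Return the final [low, high) interval for the message."""
    low, high = 0.0, 1.0
    for sym in message:
        s_low, s_high = intervals[sym]
        width = high - low
        # Narrow [low, high) to the sub-interval assigned to sym.
        low, high = low + width * s_low, low + width * s_high
    return low, high

low, high = encode_interval("BBB")
print(round(low, 3), round(high, 3))  # 0.78 0.844
```

Any binary fraction inside [0.78, 0.844) identifies the message; 0.1101 (= 0.8125) is one such 4-bit choice.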
4. (30 marks)
(a) (10 marks) What are the advantages of Adaptive Huffman Coding compared to the original Huffman Coding algorithm?
(b) (20 marks) Assume that Adaptive Huffman Coding is used to code an information source S with a vocabulary of four letters (a, b, c, d). Before any transmission, the initial coding is a = 00, b = 01, c = 10, d = 11. As in the example illustrated in Fig. 7.7, a special symbol NEW will be sent before any letter that is sent for the first time. Fig. 7.11 is the Adaptive Huffman Tree after sending letters “aabb”. After that, the additional bitstream received by the decoder for the next few letters is 01010010101.
i. (10 marks) What are the additional letters received?
ii. (10 marks) Draw the adaptive Huffman trees after each of the additional letters is received.
Answer:
(a) Like other adaptive compression algorithms, it is more dynamic and therefore offers better compression: statistics are gathered and updated dynamically as the data stream arrives, and as the probability distribution of the received symbols changes, symbols are given new (longer or shorter) codes. It also works when prior statistics of the data distribution are unavailable, as is the case in most multimedia applications. Finally, it saves overhead, since no symbol table needs to be transmitted.
(b)
i. The additional letters received are “b (01) a (01) c (00 10) c (101)”.
ii. The trees are as below.
5. (10 marks) Consider the dictionary-based LZW compression algorithm. Suppose the alphabet is the set of symbols {0, 1}. Show the dictionary (symbol sets plus associated codes) and output for LZW compression of the input
0 1 1 0 0 1 1
Answer:
With input 0 1 1 0 0 1 1, and initial codes 1 for “0” and 2 for “1”, we have:

s     k     output   code   string
                     1      0
                     2      1
0     1     1        3      01
1     1     2        4      11
1     0     2        5      10
0     0     1        6      00
0     1
01    1     3        7      011
1     EOF   2

Output codes: 1 2 2 1 3 2.
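The trace can be checked with a short script. This is a sketch: lzw_encode is an illustrative helper, and it assigns initial codes 1, 2, … to the alphabet symbols in order, one common textbook convention:

```python
def lzw_encode(symbols, alphabet):
    """LZW over a symbol stream; returns (output codes, final dictionary)."""
    # Dictionary keys are tuples of symbols; initial codes start at 1.
    dictionary = {(sym,): i + 1 for i, sym in enumerate(alphabet)}
    next_code = len(alphabet) + 1
    output = []
    s = ()  # current longest match
    for k in symbols:
        if s + (k,) in dictionary:
            s = s + (k,)            # extend the match
        else:
            output.append(dictionary[s])     # emit code for s
            dictionary[s + (k,)] = next_code  # add s+k to the dictionary
            next_code += 1
            s = (k,)
    if s:
        output.append(dictionary[s])  # flush the final match
    return output, dictionary

codes, dictionary = lzw_encode(["0", "1", "1", "0", "0", "1", "1"], ["0", "1"])
print(codes)  # [1, 2, 2, 1, 3, 2]
```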