Data Structures And Algorithms Data Structures and Algorithms 22:544:613:SEC03 & 26:198:685SEC03 Homework 4 Instructor: Farid Alizadeh Due Date: Sunday

Data Structures And Algorithms Data Structures and Algorithms

22:544:613:SEC03 & 26:198:685SEC03

Homework 4

Instructor: Farid Alizadeh
Due Date: Sunday December 12 , 2021, at 11:50PM

Note: This is a strict dealine

last updated on December 4, 2021

Please answer the following questions in electronic form and upload and
submit your files to the Canvas Assignment site before the due date. Make sure
to click on the submit button.

You have two choices for the format.

First Choice: Submit in pdf format all theoretical questions. You may ei-
ther typeset or hand-write your answers. In case of hand-write transform your
work into pdf using apps such as CamScan. You should have only a single
pdf file. For Python scripts, include the text of the script as a separate *.py
file. You should have a single file for Questions 2 & 7. For other questions, if
you use Python to compute your answers include it with the pdf text above.

Second Choice: You can use a single Python Notebook *.ipynb file. You
should use the text format of the notebook for the theoretical results, and the
Python script for the programming parts.


22:544:613SEC03 & 26:711:685SEC03, Fall 2021
Homework 4 Due date: 12/12/2021 at 11:50PM

1. Consider the following set of symbols, along with their frequency of oc-
currence in a document:

‘a’ ‘b’ ‘c’ ‘d’ ‘e’
5 10 3 17 24

Step by step show how the Huffman coding will work in this case. Show
the status of the binary tree at each step. Show the final tree and the final
binary encoding of each symbol.

2. The knapsack problem is an optimization problem that attempts to find
the best combination of items. To motivate the problem, assume that a
vendor-hiker can carry a number of items with him for a steep hike. He
plans to sell these items (for example, bottles of water and soda, cookies,
snacks, etc.) The items are numbered 1 through n. Each item i has selling
price pi and a weight of wi. The hiker can only carry with him a total
weight of W kilograms. Also for each item he has only one, so he has
to decide whether to take the item or not. Therefore, the problem is for
item i whether to take it with him (xi = 1) or not (xi = 0.) so that he
maximizes his revenue, without the total weights of items he carries with
hom exceeding his total weight of W. We assume that demand is high
and the hiker can sell all of his items.

This problem can be formulated as an optimization problem as follows:

max p1x1 + · · ·+ pnxn
s.t. w1x1 + · · ·+ wnxn ≤ W

xi ∈ {0, 1}

For example, let us assume that the items are as in the following table:

item: water bottle bottle of soda pack of chips pack of trail mix
value: $4 $3 $2.5 $1.2

weight: 4 3 2 1

And let’s assume that the hiker can carry a total of 5kg. Then we want
to solve how many of each item the hiker wishes to take with him to get
the maximum revenue.

2a) Propose a greedy algorithm for solving this problem. What is the
complexity of your algorithm? Give an example that shows that your
greedy algorithm actually does not always find the optimal solution.


22:544:613SEC03 & 26:711:685SEC03, Fall 2021
Homework 4 Due date: 12/12/2021 at 11:50PM

2b) Now develop a dynamic programming approach. To do so, since our
goal is to find the optimal values of x1, x2, . . . , xn, define

V(k, w) = the best solution if we can take only

items 1…k and can have maximum value weight of w

Here by “items 1 … k” we mean some arbitrary order which is set at
the beginning and is fixed throughout the execution of the algorithm.
First, find what is the value V(1, w). In other words, if we could only
carry item 1 with weight w1 and price p1, then how may of this item
would we carry and what would be our revenue?

2c) Now show that the following recurrence relation is correct:

V(k, w) = max
V(k − 1, w), V(k, w − wk) + pk

In words: The optimal value of using the first k items and weight
limit of w is

• either equal to the optimal value using the first k−1 items (that
is the new kth item is not used at all,)

• or it uses the kth item at least once, so the remainder now is
using the items 1 through k but with only weight limit of w−wk.
This remainder must also be optimal for these parameters (the
dynamic programming principle.)

2d) The recurrence above gives only the optimal value of revenue. To
find the actual value of how many of each item we need to use,
that is to determine the variables xi, we first need to set up an-
other variable x(k, w) indicating whether item k is taken or not, when
maximum weight capacity is w and only the first k items are used.
Then x(k, w) = 1 if item k is used with maximum capacity w, and
x(k, w) = 0 when item k is not used. The decision whether to use
item k or not is made when building the value table V(k, w).

From the values of x(k, w) it is possible to determine x1, x2, . . . , xn.

2e) Using the recurrence for V(k, w), and the definition of x(k, w) devise a
dynamic programming algorithm as follows and apply your algorithm
to the example given above along with the table.

• First, set up a values table to record V(k, w) with rows indexed
from 1 to k. In row i we can only use items 1 through i. The
columns are from 1 to w (or actually from v = mini wi, because if
the hiker’s maximum capacity is less than the lightest item, then
he cannot take anything with him.)

• Then explain how you would fill this table row by row, starting
from row 1. In the table in cell [k, w] you should write both
x(k, w) and the optimal value V(k, w).


22:544:613SEC03 & 26:711:685SEC03, Fall 2021
Homework 4 Due date: 12/12/2021 at 11:50PM

p k ↓ w → 1 2 3 4 5
3 1 x, v x, v x, v x, v x, v
3 2 x, v x, v x, v x, v x, v
2.5 3 x, v x, v x, v x, v x, v
1.2 4 x, v x, v x, v x, v x, v

Here, in each cell k, w, the red item is x(k, w) and the black item is
V(k, w)

2f) Explain how from the values of x(k, w) in the table you can recover
the actual values of xi the amount of each item to take. (Hint: Start
from the last cell in the table and work backwards.)

2g) Give a time complexity analysis of the dynamic programming algo-
rithm. Is your algorithm polynomial time?

2h) Write a Python function knapsack(weights, prices, max weight)
which takes a list of weights, a list of prices, and a maximum capacity
and returns an integer list amount which indicates the amount xi of
each item taken, the total revenue rev, the vTble and xTbl the tables
for the x(k, w) and V(k, w). Inside the function you should set up the
tables for V(k, w) and x(k, w) and carry out the computations as
explained above.

3. Consider the following graph with weights on the edges.


22:544:613SEC03 & 26:711:685SEC03, Fall 2021
Homework 4 Due date: 12/12/2021 at 11:50PM

















1 6





3a) show the adjacency list representation of the graph.

3b) Run Kruskal’s algorithm to find the smallest spanning tree. Assume
the set of edges are sorted in increasing order of weight (so, no heap
is needed.) At each iteration show the status of the forest as repre-
sented by the data structure for union-find operations. Show the tree
representation of sets only in tree format. Also show the status of
this tree representation after every union-find operation.

3c) Now run Prim’s “grow-a-tree” algorithm starting at node ‘s’. At
each iteration make sure that at each stage to show the partially
constructed tree, and the frontier set.

3d) Now apply Dijkstra’s algorithm find all the shortest paths from vertex
‘s’ to all other vertices. Again, at each stage show the shortest path
to the nodes discovered, and highlight the frontier set.

4. Run the Bellman-Ford algorithm to find the shortest path from vertex ‘a’
in the following graph to all other nodes, or show that there is a negative-
length cycle.

At each iteration show the ordered set of edges, and the relabeling and
the predecessor of each vertex. Use the lexical ordering of the edges, for


22:544:613SEC03 & 26:711:685SEC03, Fall 2021
Homework 4 Due date: 12/12/2021 at 11:50PM

instance edge a-b comes before b-c etc. Determine if the graph has a
negative cycle or not, and if so, how does the Bellman-Ford algorithm
finds it.

a b c


2 -1




-1 8



5. Consider the following 2-3 tree:








c e i j l

5a) Insert the letter ‘i’ into this tree, and show the status of the tree step
by step until the final result. At each step also show the corresponding
red-black tree with red links leaning left. Show all the rotations and
change of color on the links which are needed.

5b) Apply DeleteMin operation on the resulting tree. Again show the
status of the 2-3 tree step by step. (No need to show the corresponding


22:544:613SEC03 & 26:711:685SEC03, Fall 2021
Homework 4 Due date: 12/12/2021 at 11:50PM

red-black tree operations.) Make sure to show how you backtrack to
eliminate all the illegal “4-nodes” that may have arisen in the process.

6. Consider a hash table of length m = 11 where data are stored using
chaining. Using the multiplicative method, that is the hash function

h(k) = bm(kA mod 1)c with A =

5 − 1


6a) Insert the keys 1, 2, . . . , 20 into the hash table. Show the final status
of table. (You may write a very short script to find the hash values
of 1,2,. . . ,20. You may also run this homework using a Python script.
However, this is optional. If you do, include your Pythion code as

6b) Search for number 21, and show the cells you have to search to realize
it is not in the table.

6c) Delete 1 from the table and show the resulting table.

7. Mini-Python project: In this assignment you will run an experiment
to compare several hash functions. Suppose the set of items we are going
to store in the table is the set of integers [1…N]; of course with possible
repetitions. Suppose we allocate a table of length m for this purpose.

7a) The hash functions we are considering are:

h1(k) = bm(kA1 mod 1)c with A1 =

5 − 1


h2(k) = bm(kA2 mod 1)c with A2 = ln(2)
h3(k) = bm(kA3 mod 1)c with A3 = 0.7
h4(k) = k mod m

7b) Generate N random integers from the range [1, . . . , N] (with repeti-
tion) and put them in a list called L. Experiment with N = 1000, and
m = 11.

7c) Map the numbers in L to an index using the hash functions hi(k)
for i = 1, 2, 3, 4 as above. Put these values in lists called Hi for
i = 1, 2, 3, 4.

7d) Compute the histogram of each array, which produces the frequency
count of each index using the numpy.histogram function.


22:544:613SEC03 & 26:711:685SEC03, Fall 2021
Homework 4 Due date: 12/12/2021 at 11:50PM

7e) Plot each of the four histograms using matplotlib.pyplot function

7f) Compare these four histograms and see if they all “look” reasonably
uniform. Does any of them behave poorly?


Looking for this or a Similar Assignment? Click below to Place your Order