Posts on Computer science

Run-time limits of comparison-based sorting algorithms


Posted by Diego Assencio on 2017.10.01 under Computer science (Algorithms)

Let $T(n)$ be the run-time of some comparison-based sorting algorithm for a sequence of length $n$. By "comparison-based", we mean a sorting algorithm which accesses input array elements only via comparisons as is the case for general-purpose sorting algorithms such as merge sort, quicksort, and heapsort. Such an algorithm does not know in advance which values the input array can store (as is for instance the case with bucket sort), but simply orders them using a predefined "less than" comparison function. In this post, we will show that regardless of how well designed an algorithm of this type is, its worst-case running time cannot be faster than $c n\log_2(n)$ as $n$ becomes large (for some constant $c \gt 0$). In more technical words, there exist constants $n_0 \geq 0$ and $c \gt 0$ such that $T(n) \geq cn\log_2(n)$ for $n \geq n_0$, and therefore $T(n) = \Omega(n\log_2(n))$.

This post will only deal with deterministic algorithms, but the results extend to randomized algorithms as well. The main idea of the proof is to consider only input arrays which are permutations of $S_n = \{1, 2, \ldots, n\}$. The total number of distinct arrays which represent permutations of $S_n$ is $n!$.

For a given deterministic, comparison-based sorting algorithm, let $k$ be the largest number of comparisons within which it can sort any of the $n!$ arrays mentioned above. Because the execution of the algorithm is based on the results of the value comparisons it performs, there are at most $2^k$ possible distinct execution paths for it. To clarify, the algorithm sorts an input array by repeatedly comparing pairs of values and deciding on how to proceed depending on which value is greater according to the comparison function. Each comparison generates two possible branches of execution, and since we know the total number of comparisons performed cannot exceed $k$, there can be at most $2^k$ distinct execution paths for the arrays we are considering.

Suppose now that $2^k \lt n!$. If that is the case, then the number of possible execution paths of the algorithm is smaller than the total number of input arrays we are considering. Therefore, there must be at least two distinct input arrays for which the algorithm executes following the exact same path (this is referred to as the pigeonhole principle in mathematics). This implies that when the algorithm is applied to these arrays, the same transformations (e.g. elements swaps) are applied to each of them, with the final result being the same: the values of $S_n$ ordered according to the predefined comparison function. This is not possible since the algorithm is deterministic: by inverting its steps from the final (sorted) result, we cannot arrive at two distinct initial inputs. Therefore $2^k \geq n!$.

The final step in the proof needs a simple fact about $n!$: $$ \displaystyle n! = n(n-1)(n-2)\ldots\frac{n}{2}\left(\frac{n}{2} - 1\right) \ldots1 \geq \left(\frac{n}{2}\right)^{\frac{n}{2}} $$ Hence, since $2^k \geq n!$: $$ \log_2(2^k) \geq \log_2(n!) \Longrightarrow k \geq \frac{n}{2}\log\left(\frac{n}{2}\right) \Longrightarrow k = \Omega(n\log_2(n)) $$ Therefore, the best lower bound for the worst-case running time of the sorting algorithm is $T(n) = \Omega(n\log_2(n))$.

Comments (0) Direct link

The DFT of an image: phase vs. magnitude


Posted by Diego Assencio on 2017.09.01 under Computer science (Digital signal processing)

Let $X$ be an $M \times N$ matrix (a two-dimensional array) of complex numbers $x_{mn}$ for $m = 0, 1, \ldots, M - 1$ and $n = 0, 1, \ldots, N - 1$. The discrete Fourier transform of $X$ is another $M \times N$ matrix $\tilde{X}$ of complex numbers $\tilde{x}_{pq}$ such that: $$ \tilde{x}_{pq} = \sum_{m = 0}^{M - 1} \sum_{n = 0}^{N - 1} x_{mn} e^{-i2\pi pm/M} e^{-i2\pi qn/N} \label{post_81059bc883bd6156c2e83f83e2e38c2b_dft_2d} $$ for $p = 0, 1, \ldots, M - 1$ and $q = 0, 1, \ldots, N - 1$. The original matrix $X$ can be recovered from $\tilde{X}$ through its inverse discrete Fourier transform: $$ x_{mn} = \displaystyle\frac{1}{MN} \sum_{p = 0}^{M - 1} \sum_{q = 0}^{N - 1} \tilde{x}_{pq} e^{i2\pi pm/M} e^{i2\pi qn/N} $$ Since a grayscale image can be represented as a matrix whose elements are 8-bit integers representing pixel values ranging from $0$ (black) to $255$ (white), we can compute its DFT using equation \eqref{post_81059bc883bd6156c2e83f83e2e38c2b_dft_2d}. Interestingly, the magnitude of the DFT of such an image does not contain much information, but its phase does. To clarify what this means, let us define two $M \times N$ matrices $\tilde{X}_{\textrm{mag}}$ and $\tilde{X}_{\textrm{phase}}$ whose elements are given by: $$ \begin{eqnarray} (\tilde{X}_{\textrm{mag}})_{pq} &=& \left|\tilde{x}_{pq}\right| \label{post_81059bc883bd6156c2e83f83e2e38c2b_M}\\[5pt] (\tilde{X}_{\textrm{phase}})_{pq} &=& \tilde{x}_{pq} / \left|\tilde{x}_{pq}\right| \label{post_81059bc883bd6156c2e83f83e2e38c2b_P} \end{eqnarray} $$ Being a complex number, $\tilde{x}_{pq} = \rho_{pq} e^{i\phi_{pq}}$, where $\rho_{pq}$ is its magnitude and $\phi_{pq}$ is its phase. From equations \eqref{post_81059bc883bd6156c2e83f83e2e38c2b_M} and \eqref{post_81059bc883bd6156c2e83f83e2e38c2b_P}, we have that $(\tilde{X}_{\textrm{mag}})_{pq} = \rho_{pq}$ and $(\tilde{X}_{\textrm{phase}})_{pq} = e^{i\phi_{pq}}$, so the matrices $\tilde{X}_{\textrm{mag}}$ and $\tilde{X}_{\textrm{phase}}$ contain only magnitude and phase information of the elements of $\tilde{X}$ respectively.

If we compute the inverse DFTs of $\tilde{X}_{\textrm{mag}}$ and $\tilde{X}_{\textrm{phase}}$, what type of "images" will we recover? Our original image had only real values, but these inverse DFTs will in general not, so we will consider only their real components. Additionally, the pixel values of the resulting images will not necessarily fall within the closed interval $[0, 255]$, but we can clip the negative values (set them to zero) and rescale the nonnegative values to have them fall within this interval. An example of what the resulting images look like is shown on figure 1 (all figures in this post were generated using the script available here).

Fig. 1: An original image (left) and the images produced by computing the inverse DFTs of the magnitude (center) and phase (right) components of the original image's DFT.

From figure 1, we can see that the inverse DFT of the magnitude matrix $\tilde{X}_{\textrm{mag}}$ produces a nearly black image, but the inverse DFT of the phase matrix $\tilde{X}_{\textrm{phase}}$ shows well-defined contours from the original image (if you cannot see them, try increasing the brightness of your screen or click on the figure to see a larger version of it). This indicates that the phase component of the image's DFT carries more information than the magnitude component. To better see this, we can apply histogram equalization to the images produced from the inverse DFTs. The results, shown in figure 2, serve as further confirmation of our hypothesis.

Fig. 2: An original image (left) and the images produced (after histogram equalization) by computing the inverse DFTs of the magnitude (center) and phase (right) components of the original image's DFT.

Comments (1) Direct link

Solving Sudoku puzzles using linear programming


Posted by Diego Assencio on 2017.02.28 under Computer science (Linear programming)

In this post, I will show how solving a Sudoku puzzle is equivalent to solving an integer linear programming (ILP) problem. This equivalence allows us to solve a Sudoku puzzle using any of the many freely available ILP solvers; an implementation of a solver (in Python 3) which follows the formulation described in this post can be found found here.

A Sudoku puzzle is an $N \times N$ grid divided in blocks of size $m \times n$, i.e., each block contains $m$ rows and $n$ columns, with $N = mn$ since the number of cells in a block is the same as the number of rows/columns on the puzzle. The most commonly known version of Sudoku is a $9 \times 9$ grid (i.e., $N = 9$) with $3 \times 3$ blocks (i.e., $m = n = 3$). Initially, a cell can be either empty or contain an integer value in the interval $[1,N]$; non-empty cells are fixed and cannot be modified as the puzzle is solved. The rules for solving the puzzle are:

1.each integer value $k \in [1,N]$ must appear exactly once in each row
2.each integer value $k \in [1,N]$ must appear exactly once in each column
3.each integer value $k \in [1,N]$ must appear exactly once in each block.

Each rule above individually implies that every cell of the puzzle will have a number assigned to it when the puzzle is solved, and that each row/column/block of a solved the puzzle will represent some permutation of the sequence $\{1, 2, \ldots, N\}$.

The version of Sudoku outlined by these rules is the standard one. There are many variants of Sudoku, but I hope that after reading this post, you will realize that the these variants can also be expressed as ILP problems using the same ideas presented here.

The rules above can be directly expressed as constraints of an ILP problem. Our formulation will be such that the constraints will enforce everything needed to determine a valid solution to the puzzle, and the objective function will therefore be irrelevant since any point which satisfies the constraints will represent a solution to the problem (notice, however, that some Sudoku puzzles may contain more than one solution, but I am assuming the reader will be content with finding a single one of those). Therefore, our objective function will be simply ${\bf 0}^T{\bf x} = 0$, where ${\bf 0}$ is a vector with all elements set to $0$ and ${\bf x}$ is a vector representing all variables used in the ILP formulation below.

The puzzle grid can be represented as an $N \times N$ matrix, and each grid cell can be naturally assigned a pair of indices $(i,j)$, with $i$ and $j$ representing the cell row and column respectively (see figure 1). The top-left grid cell has $(i,j) = (1,1)$, and the bottom-right one has $(i,j) = (N,N)$, with $i$ increasing downwards and $j$ increasing towards the right.

Fig. 1: A Sudoku puzzle with blocks of size $m \times n = 2 \times 3$. The cell indices $(i,j)$ are shown inside every cell, and the block indices $(I,J)$ are shown on the left and top sides of the grid respectively. Both the height (number of rows) and width (number of columns) of the puzzle are equal to $N = m n = 6$. The puzzle has $n = 3$ blocks along the vertical direction and $m = 2$ blocks along the horizontal direction.

Let us then define $N^3$ variables as follows: $x_{ijk}$ is an integer variable which is restricted to be either $0$ or $1$, with $1$ meaning the value at cell $(i,j)$ is equal to $k$, and $0$ meaning the value at cell $(i,j)$ is not $k$. Rule (1) above, i.e., the requirement that each $k \in [1,N]$ must appear exactly once per row, can be expressed as: $$ \sum_{j=1}^N x_{ijk} = 1 \quad \textrm{ for } \quad i,k = 1,2,\ldots,N \label{post_25ea1e49ca59de51b4ef6885dcc3ee3b_row_constraint} $$ In other words, for a fixed row $i$ and a fixed $k \in [1,N]$, only a single $x_{ijk}$ will be $1$ on that row for $j = 1, 2, \ldots, N$.

If the fact that the constraints above do not have any "$\leq$" is bothering you, remind yourself of the fact that $x = a$ can be expressed as $a \leq x \leq a$, which in turn is equivalent to the combination of $-x \leq -a$ and $x \leq a$, i.e., any equality constraint can be expressed as a pair of "less-than-or-equal-to" constraints like the ones we need in linear programming problems.

Rule (2), i.e., the requirement that each $k \in [1,N]$ must appear exactly once per column, can be expressed as: $$ \sum_{i=1}^N x_{ijk} = 1 \quad \textrm{ for } \quad j,k = 1,2,\ldots,N \label{post_25ea1e49ca59de51b4ef6885dcc3ee3b_column_constraint} $$ Expressing rule (3), i.e., the requirement that each $k \in [1,N]$ must appear exactly once per block, is a bit more complicated. A way to simplify this task is by assigning pairs of indices $(I, J)$ to each block in a similar way as we did for cells (see figure 1): $(I,J) = (1,1)$ represents the top-left block, and $(I, J) = (n,m)$ represents the bottom-right block, with $I$ increasing downwards and ranging from $1$ to $n$ (there are $n$ blocks along the vertical direction) and $J$ increasing towards the right and ranging from $1$ to $m$ (there are $m$ blocks along the horizontal direction). Block $(I,J)$ will therefore contain cells with row indices $i = (I-1)m + 1, \ldots, Im$ and column indices $j = (J-1)n + 1, \ldots, Jn$. Therefore, rule (3) can be expressed as: $$ \sum_{i=(I-1)m + 1}^{Im} \sum_{j=(J-1)n + 1}^{Jn} x_{ijk} = 1 \quad \textrm{ for }\quad \left\{ \begin{matrix} \; I = 1,2,\ldots,n \\[5pt] \; J = 1,2,\ldots,m \\[5pt] \; k = 1,2,\ldots,N \end{matrix} \right. \label{post_25ea1e49ca59de51b4ef6885dcc3ee3b_block_constraint} $$ Notice that both equations \eqref{post_25ea1e49ca59de51b4ef6885dcc3ee3b_row_constraint} and \eqref{post_25ea1e49ca59de51b4ef6885dcc3ee3b_column_constraint} represent $N^2$ constraints each. As it turns out, equation \eqref{post_25ea1e49ca59de51b4ef6885dcc3ee3b_block_constraint} contains $nmN = N^2$ constraints as well.

So far, our formulation does not prevent $x_{ijk}$ from being equal to $1$ for two or more distinct values $k$ at the same cell $(i,j)$. We need therefore to impose these constraints by hand: $$ \sum_{k=1}^N x_{ijk} = 1 \quad \textrm{ for } \quad i,j = 1,2,\ldots,N \label{post_25ea1e49ca59de51b4ef6885dcc3ee3b_cell_constraint} $$ Not all cells are initially empty on a Sudoku puzzle. Some cells will already contain values at the beginning, and those values are necessary so that the solution to the puzzle can be deduced logically (ideally, there should be a single valid solution). Let $\mathcal{C}$ be the set of tuples $(i,j,k)$ representing the fact that a cell $(i,j)$ contains the value $k$ at the beginning. We then have: $$ x_{ijk} = 1 \quad \textrm{ for } (i,j,k) \in \mathcal{C} \label{post_25ea1e49ca59de51b4ef6885dcc3ee3b_initial_puzzle_constraint} $$ Our last set of constraints limits the values which each variable $x_{ijk}$ can take: it's either $0$ or $1$ (our ILP formulation then technically defines a binary integer linear programming problem, or BILP for short): $$ 0 \leq x_{ijk} \leq 1 \quad \textrm{ for } \quad i,j,k = 1,2,\ldots,N \label{post_25ea1e49ca59de51b4ef6885dcc3ee3b_binary_constraint} $$ Since most ILP solvers allow bounds on the values which each $x_{ijk}$ can take to be set directly, this last set of constraints often does not need to be specified in the same manner as the previous ones.

We now have a complete ILP formulation of a Sudoku puzzle: our goal is to minimize the objective function $f(x_{111}, \ldots, x_{ijk}, \ldots x_{NNN}) = 0$ subject to all constraints specified on equations \eqref{post_25ea1e49ca59de51b4ef6885dcc3ee3b_row_constraint}, \eqref{post_25ea1e49ca59de51b4ef6885dcc3ee3b_column_constraint}, \eqref{post_25ea1e49ca59de51b4ef6885dcc3ee3b_block_constraint}, \eqref{post_25ea1e49ca59de51b4ef6885dcc3ee3b_cell_constraint}, \eqref{post_25ea1e49ca59de51b4ef6885dcc3ee3b_initial_puzzle_constraint} and \eqref{post_25ea1e49ca59de51b4ef6885dcc3ee3b_binary_constraint}.

After solving the ILP problem outlined above, the solution to the Sudoku puzzle can be constructed directly by placing, at each cell $(i,j)$, the value $k$ such that $x_{ijk} = 1$.

Comments (1) Direct link