Image compression: the singular value decomposition

Posted by Diego Assencio on 2014.03.07 under Computer science (Numerical methods)

In a previous post, I described how one can use the singular value decomposition of a matrix to represent it in a compressed form. The trick was to discard information (singular values) from the original matrix to generate an "approximate" version of it. This post describes how that technique can be used to also compress images. The method is shown for educational purposes only: it is not adequate for professional image compression.

I will use octave to illustrate the method. On Ubuntu/Debian, you can install octave by opening a terminal and running the following command (the module octave-image must also be installed):

sudo apt-get install octave octave-image


Now start octave. Images can be read in octave with the function imread:

I = imread("dog.jpg");

 Fig. 1: Sample picture with $425 \times 425$ pixels.

The command above will read the file dog.jpg (shown on figure 1) and store it as a set of three matrices on the object $I$. Each matrix contains the amount of red, green and blue on each pixel of the original image respectively. Their entries are one-byte integers which can take any value in the range $[0,255]$. To be clear, $I(i,j,n)$ contains the amount of the color $n$ on the pixel $(i,j)$, where $n = 1,2,3$ for red, green and blue respectively (pixel $(1,1)$ is the one on the top left of the image). For a fixed $n$ and the sample image used, $I(:,:,n)$ is a $425 \times 425$ matrix.

Before we actually do image compression using $I$, let's first work on a simpler example: the grayscale version of the image. Octave has a built-in function which converts colored images to grayscale:

J = rgb2gray(I);


The created object $J$ is a $425 \times 425$ matrix which contains the grayscale intensity of each pixel of the original image.

The function defined below will be used to compress matrices in the same way as described in the previous post:

# A = input matrix, N = number of singular values to keep
function [Uc, Sigmac, Vc] = compress_matrix(A, N)
[U, Sigma, V] = svd(A);
Uc = U(:, 1:N);
Sigmac = Sigma(1:N, 1:N);
Vc = V(:, 1:N);
end


NOTE: each entry of $J$ is a one-byte integer, but each entry of $U_c$, $\Sigma_c$ or $V_c$ is a double (8 bytes). This means the total number of stored matrix entries for these three matrices is not something we can directly compare with $N_J := \textrm{size}(J) = 425^2$. However, for the purposes of this post this will not be an issue since our goal is to compress images and not matrices.

Let's go ahead and compress $J$ by keeping only its $N = 50$ largest singular values:

[Uc, Sigmac, Vc] = compress_matrix(J, 50);
Jc = uint8(Uc * Sigmac * Vc');


The values of the matrix $J_c = U_c\Sigma_c V_c^T$ must first be converted from doubles back to one-byte integers to correctly represent grayscale intensities. The smaller the value of $N$, the less information from the original image we preserve, so discarding too many singular values might yield a compressed image with very poor quality. To visualize the compressed image, run:

imshow(Jc)


To visualize both the original and compressed images next to each other, run:

figure
subplot(1,2,1)
xlabel("original")
imshow(J)
subplot(1,2,2)
xlabel("compressed")
imshow(Jc)


Figure 2 shows the grayscale version of the original image and also compressed versions for several values of $N$ (for an animation showing the images for $N = 50...1$, click here):

 Fig. 2: Grayscale versions of the original picture. From left to right and top to bottom, the pictures show: the original image and compressed versions with $N = 100$, $50$, $25$, $10$, $5$. Notice how the main aspects of the image are preserved even for small values of $N$. Unsurprisingly, however, the dog becomes less recognizable as $N$ is decreased.

The question which must be asked at this point is: would the size of the generated image $J_c$ be significantly different from the original image $J$ if both were stored as JPEG files? The answer is not a clear "yes" since the JPEG compression mechanism is not the same as the one we used here and therefore it might not be able to benefit much from our "matrix compression" trick. To see what happens, let's store both $J$ and $J_c$ as JPEG files with maximum quality and check the resulting file sizes:

imwrite(J, "dog-gray-original.jpg", 'Quality', 100)
imwrite(Jc, "dog-gray-compressed.jpg", 'Quality', 100)


Table 1 shows the size of the resulting JPEG files for different values of $N$. The size of the file dog-gray-original.jpg is $119.9\textrm{kB}$. Taking $N = 425$ means keeping all singular values; in this case, the size of both JPEG files ("original" and "compressed") are the same:

$N$JPEG size (in kB)
$425$$119.9 100$$118.2$
$50$$108.3 25$$93.4$
$10$$71.9 5$$59.9$
 Table 1: Sizes of the generated JPEG images for different values of $N$.

As table 1 shows, the generated JPEG images become smaller as we discard more singular values. However, to generate an image with only $50\%$ of the size of the original, we must keep only $N = 5$ singular values. The resulting image is unfortunately quite blurred at that point (see figure 2).

Back to the original colored image

We can now proceed to compress the original colored image (see figure 1). Remember that $I(:,:,n)$ is a $425 \times 425$ matrix which contains the intensity of red, green and blue at each pixel of the original image for $n = 1, 2, 3$ respectively:

N = 50

[U1c, Sigma1c, V1c] = compress_matrix(I(:,:,1), N);
[U2c, Sigma2c, V2c] = compress_matrix(I(:,:,2), N);
[U3c, Sigma3c, V3c] = compress_matrix(I(:,:,3), N);

Ic = uint8( zeros(size(I,1), size(I,2), 3) );

Ic(:,:,1) = uint8(U1c * Sigma1c * V1c');
Ic(:,:,2) = uint8(U2c * Sigma2c * V2c');
Ic(:,:,3) = uint8(U3c * Sigma3c * V3c');


Figure 3 shows the colored version of the original image and also compressed versions for several values of $N$ (for an animation showing the images for $N = 50...1$, click here):

 Fig. 3: Colored versions of the original picture. From left to right and top to bottom, the pictures show: the original image and compressed versions with $N = 100$, $50$, $25$, $10$, $5$.

As we did above, we can compare the sizes of the original and compressed images by storing them both as JPEG files (we must store the original image too to make sure both are generated with the same JPEG quality):

imwrite(I, "dog-color-original.jpg", 'Quality', 100)
imwrite(Ic, "dog-color-compressed.jpg", 'Quality', 100)


Table 2 shows the size of the resulting JPEG files for different values of $N$. The size of the file dog-color-original.jpg is $195.6\textrm{kB}$. Taking $N = 425$ means keeping all singular values; in this case, the size of both JPEG files ("original" and "compressed") are the same:

$N$JPEG size (in kB)
$425$$195.6 300$$209.2$
$200$$233.4 100$$253.8$
$50$$248.3 25$$225.1$
$10$$175.8 5$$153.5$
 Table 2: Sizes of the generated JPEG images for different values of $N$.

Strangely, the size of the compressed image grows as we reduce the value of $N$ but starts shrinking quickly for $N \leq 50$. This indicates that the JPEG compression mechanism does not always work well with the compression technique presented in this post.