Basic Image Processing

In my most recent Introduction to Computer Programming (aka CS101), the instructor decided to show us a very simple image processing: blurring and secret image hiding. There are several things he did not mention because they are too complicated, but I think someone might be interested, so I decided to write it here.

I am not talking about how to load or save images – you can easily find documentation online for your favourite language. Our professors talked about two algorithms in class: blurring and secret image hiding. I am going to cover only these two here.

Blur an image

The most simple algorithm is to take an average of 8 neighbouring pixels and itself, which he demonstrated. But what can we go from this? Blurring can be also defined as a “convolution kernel”. Convolution kernel (more info on Wikipedia, in terms of image processing, is a matrix that gets convolved with the image. For example, this blurring algorithm can also be written as:

$$ M = \begin{bmatrix}1\over{9} & 1\over{9} & 1\over{9} \\[0.3em] 1\over{9} & 1\over{9} & 1\over{9} \\[0.3em] 1\over{9} & 1\over{9} & 1\over{9}\end{bmatrix} $$

More complicated kernels for other tasks can be easily found on Wikipedia.

Hiding secret image

The algorithm is simple: store a greyscale image inside the LSB (the least significant bits) of another image. But this actually has quite a few problems: first, converting the image to greyscale and preserving luminance is not easy; second, since 8-bit isn’t divisible by three, some colour channels of the hiding image may need to carry more bits than the other; and three, lossy image compression can kill data hidden by this technique.

Greyscale conversion

Our professors provide a very basic formula to do this, while easy, sadly very inaccurate:

$$ Y = {{R+G+B}\over{3}} $$

This does not preserve luminance at all because of how our eyes work. Each colour has a different contribution to the total luminosity. Moreover, sRGB colourspace (which is what most images are in) is gamma-compressed, which means that the value does not scale linearly to actual luminosity.

So to convert the sRGB image to greyscale while preserving luminosity, first we need to convert the RGB to a linear scale (R’G’B’) as defined by sRGB specification:

$$ C'=\begin{cases}\dfrac{C}{12.92}, & C\le0.0405\\\left(\dfrac{C+0.055}{1.055}\right)^{2.4}, & C\gt0.0405 \end{cases} $$

Or in Java code (as that is what we are learning):

public double decompressGamma(int color) {
    if (color < 0 || color > 255)
        return Double.NaN; // invalid number

    double c = color/255d;

    if (c <= 0.0405)
        return c/12.92;
    else
        return Math.pow((c+0.055)/1.055, 2.4);
}

(Note that all colour values in the maths formula have the range of [0, 1] while in the Java code, the integer has a range of [0, 255] and the double has a range of [0, 1])

After we have linear colour values nicely in floating point value now, the next step is to create the Y (luminance), which according to sRGB is defined by:

$$ Y' = 0.2126 R' + 0.7152 G' + 0.0722 B' $$

Or in Java code:

public double toLuminance(double r, double g, double b) {
    return 0.2126*r+0.7152*g+0.0722*b;
}

Next, we must compress the final luminance value back to the gamma-compressed value, by inversing the decompress operation above:

$$ Y=\begin{cases}12.92\ Y', & Y' \le 0.0031308\\1.055\ {Y'}^{\frac{1}{2.4}}-0.055, & Y' > 0.0031308. \end{cases} $$

Or in Java code:

public int compressGamma(double color) {
    if (color < 0d || color > 1d)
        return -1; // invalid number

    double result = -1;

    if (color <= 0.0031308)
        result = color*12.92;
    else
        result = Math.pow(color, 1d/2.4)*1.055-0.055;

    return (int) Math.round(result);
}

There! We finally have a luminance-preserved greyscale image!

Storing data in LSB bits

Next is how we hide the data in another image. If you look at the above luminance combining factor, you will notice that the factor for G is huge while B is very small. This is because our eyes perceive green color very well but percept blue color horribly. We need to split 8 bits into three sections: using this information, I think we should split bits to be stored in (R,G,B) as (2,2,4) or (3,1,4). This is to maximise the green data and minimise the blue data precision.

Finally, one must know that lossy image compression such as JPEG works on the same principle as us: the LSB part of the image is not so important, so these areas are destroyed. Thus, you must save your image in lossless formats such as BMP, PNG or TIFF.

That’s all for today! Thank you for reading!