Wednesday 31 December 2014

Fractal Image Compression Based on Iterated Function Theory

Introduction
The development of the Internet and digital imaging necessitates a staggering amount of data to represent modern images. This in turn results in larger storage space and longer time for transmission over computer networks. To address these two problems, there has been a growing interest among researchers and companies in devising effective image compression algorithms. Introduced by Michael Barnsley in 1988, Fractal image compression has been recognised and is now one of the most pervasive coding methods.

History of Fractal Image Compression
In 1982, Benoit B. Mandelbrot, an IBM Mathematician, pioneered a new geometry: fractal geometry. Fractal geometry and traditional geometry differ in that the former can represent the geometry of natural objects such as trees and clouds while the latter cannot. With this emerging branch of mathematics, computer scientists have at their command the tools for simulating artificial but natural looking landscapes. In 1988, Michael Barnsley, the author of Fractals Everywhere, discusses the mathematics of Iterated Function System (IFS) in his book, and proves the Collage Theorem. In essence, the theorem states what an Iterated Function System should be to represent an image. (Barnsley, 1988). This leads one to wonder: in reverse, can an Iterated Function System that generates the original image be constructed from an image? Barnsley proposed that storing images as Iterated Function System can lead to image compression. This is how this idea emerged.

Idea of Fractal Image Compression
Fractal Image Compression (FIC) is an image coding technology rested upon fractal geometry. Natural photographs have redundant information. Some parts are replica of certain parts. As such, they can be altered to imitate a picture that needs to be resized. With fractal image compression, the bit-by-bit storage of the image is represented by an iterated function, resulting in a dramatically less storage.

Iterated Function Systems
IFS theory utilises affine transformations to relate different parts of an image. A combination of rotation, scaling, translation is known as an affine transformation. Using solely such transformations, sophisticated pictures can be defined.

Mathematical notation
$x_{n+1}=a\cdot x_n$ is a scaling transformation. When $0<a<1$, then $x_{n+1}<x_n$ and x is scaled down. When $a>1$, then x is scaled up. Finally, when $-1<a<0$, x jumps back and forth in ever decreasing jumps until it settles at the origin, which is the attractor.
Adding translation is expressed by $x_{n+1}=a\cdot x_n+b$, where b is a real number.
A general form for an affine transformation: $W\begin{pmatrix} x \\ y \end{pmatrix}=\begin{pmatrix} a&b \\ c&d \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}+\begin{pmatrix} e \\ f \end{pmatrix}=\begin{pmatrix} ax+by+e \\ cx+dy+f \end{pmatrix}$,
where W represents affine transformation.

To find an affine transformation, one needs to find six coefficients a, b, c, d, e and f such that the transformation W obeys: W(original image) ~ desired image. The first step is to introduce x, y coordinate axes. Secondly, randomly choose and mark three points on the original image. Then, mark the respective points on desired image and determine their coordinates.

The coefficients a, b, e can be determined by solving the three linear equations:

$\alpha_1 a+\alpha_2 b+e=\widetilde{\alpha_1}$
$\beta_1 a+\beta_2 b+e=\widetilde{\beta_1}$
$\gamma_1 a+\gamma_2 b+e=\widetilde{\gamma_1}$

The coefficients c, d, f are found similarly from the following equations:

$\alpha_1 c+\alpha_2 d+f=\widetilde{\alpha_2}$
$\beta_1 c+\beta_2 d+f=\widetilde{\beta_2}$
$\gamma_1 c+\gamma_2 d+f=\widetilde{\gamma_2}$

Equation solvers such as TK Solver Plus or Eureka can be used to find the coefficient values.

Traditional way of Fractal Image Compression
1. Dividing the image – Divide it into large and small blocks: N non-overlapping range blocks Ri and M domain blocks Di that can overlap with others. Those N range blocks should cover the whole image.
2. Matching – For each range block Ri, find a suitable domain block Di such that after applying a contractive transformation Wi, it resembles Ri. In other words, Wi(Di) = Ri.
3. Encoding – Find the best match Di for each Ri. Store the locations of Di and Ri, and the parameters of Wi. The goal is to find Di and Ri by minimizing the distance between them.
4. Decoding – Start with a random image which has the same size as the compressed image, and decode it. For a chosen M domain blocks Di, use a respective contractive transformation. This is the first iteration. In practice, the image will become stable after around ten iterations. When the attractor or final image is reached, the encoded image has been decoded. 

No comments:

Post a Comment