Thursday, March 19, 2009

Small survey of Objective Image Quality metrics

PDF version of this post is here
The importance of objective quality metrics cannot be overstated: such methods are used in automated image restoration algorithms, for comparison of image compression algorithms, and so on. The types of quality metrics are presented graphically in Fig. 1.

All proposed quality metrics can be divided into two general classes: subjective and objective [2].

Subjective evaluation of image quality is oriented toward the Human Visual System (HVS). As mentioned in [3], the best way to assess the quality of an image is perhaps to look at it, because human eyes are the ultimate receivers in most image processing environments. The subjective quality measurement Mean Opinion Score (MOS) has been used for many years.

Objective metrics include the Mean Squared Error (MSE), or $L_p$-norm [4,5], and measures that mimic the HVS, such as [6,7,8,9,10,11]. In particular, it is well known that a large number of neurons in the primary visual cortex are tuned to visual stimuli with specific spatial locations, frequencies, and orientations. Image quality metrics that incorporate perceptual quality measures by modelling the HVS were proposed in [12,13,14,15,16]. An image quality measure (IQM) that computes image quality from the 2-D spatial frequency power spectrum of an image was proposed in [10]. Still, such metrics perform poorly in real applications and are widely criticized for not correlating well with perceived quality [3].

As promising techniques for image quality measurement, the Universal Quality Index [17,3], the Structural SIMilarity index [18,19], and the Multidimensional Quality Measure Using SVD [1] are worth mentioning.

Figure 1: Types of image quality metrics.

So there are three objective methods of image quality estimation to be discussed below: the UQI, the SSIM, and the Multidimensional Quality Measure Using SVD. Brief information about the main ideas of those metrics is given. But first of all, let me render homage to the good old mean squared error (MSE) metric.


The Good Old MSE

Considering that $x=\{x_i \mid i = 1,2,\dots,N\}$ and $y=\{y_i \mid i = 1,2,\dots,N\}$ are two images, where $N$ is the number of pixels, the MSE between these images is:

$$\mathrm{MSE}(x,y) = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - y_i\right)^2 \qquad (1)$$
Of course, there is a more general formulation of the MSE, well suited to image processing, given by Fienup [5]: the normalized root-mean-square error

$$\mathrm{NRMSE} = \left[\frac{\sum_{i}\left|\alpha\,\hat{x}_i - x_i\right|^2}{\sum_{i}\left|x_i\right|^2}\right]^{1/2} \qquad (2)$$

where $\hat{x}$ is the reconstructed image, $x$ is the reference image, and $\alpha$ is the scaling constant that minimizes the error:

$$\alpha = \frac{\sum_{i} \hat{x}_i\, x_i}{\sum_{i} \hat{x}_i^2} \qquad (3)$$

(Fienup's full metric is additionally made invariant to translations of $\hat{x}$.)
Such an NRMSE metric allows one to estimate image quality especially well in various applications of digital deconvolution. Although Eq. (2) is better than the plain MSE, the NRMSE metric has been criticized a lot.
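To make Eqs. (1)-(3) concrete, here is a minimal numpy sketch of both metrics (my own illustration, not Fienup's code: the translation search of the full invariant metric is omitted, and real-valued images are assumed):

```python
import numpy as np

def mse(x, y):
    """Plain mean squared error of Eq. (1)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return np.mean((x - y) ** 2)

def nrmse(x_hat, x):
    """Normalized RMSE in the spirit of Eqs. (2)-(3): the reconstruction
    x_hat is first scaled by the least-squares optimal constant alpha,
    and the residual energy is normalized by the energy of x."""
    x_hat, x = np.asarray(x_hat, float).ravel(), np.asarray(x, float).ravel()
    alpha = (x_hat @ x) / (x_hat @ x_hat)
    return np.sqrt(np.sum((alpha * x_hat - x) ** 2) / np.sum(x ** 2))

# Toy usage: a noisy copy of a smooth gradient image.
rng = np.random.default_rng(0)
img = np.tile(np.linspace(0.0, 1.0, 64), (64, 1))
noisy = img + 0.05 * rng.standard_normal(img.shape)
print(mse(img, noisy), nrmse(noisy, img))
```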

As noted in the remarkable paper [19], the MSE is commonly used for many reasons. The MSE is simple, parameter-free, and easy to compute. Moreover, the MSE has a clear physical meaning as the energy of the error signal. Such an energy measure is preserved under any orthogonal linear transformation, such as the Fourier transform. The MSE is widely used in optimization tasks and in deconvolution problems [21,22,23]. Finally, competing algorithms have most often been compared using the MSE or the peak signal-to-noise ratio (PSNR).

But problems arise when one tries to predict human perception of image fidelity and quality using the MSE. As shown in [19], the MSE can be nearly the same for images with very different types of distortion. That is why there have been many attempts to overcome the MSE's limitations and find new image quality metrics. Some of them are briefly discussed below.

Multidimensional Quality Measure Using SVD

A new image quality metric called the ``Multidimensional Quality Measure Using SVD'' was proposed in [1]. The main idea is that every real matrix $A$ can be decomposed into a product of three matrices, $A = USV^T$, where $U$ and $V$ are orthogonal matrices, $U^TU = I$, $V^TV = I$, and $S = \mathrm{diag}(s_1, s_2, \dots)$. The diagonal entries of $S$ are called the singular values of $A$, the columns of $U$ are called the left singular vectors of $A$, and the columns of $V$ are called the right singular vectors of $A$. This decomposition is known as the Singular Value Decomposition (SVD) of $A$ [24]. If the SVD is applied to the full image, we obtain a global measure, whereas if a smaller block is used, we compute the local error in that block:

$$D_i = \sqrt{\sum_{j=1}^{N}\left(s_j - \hat{s}_j\right)^2} \qquad (4)$$

Here $s_j$ are the singular values of the original block, $\hat{s}_j$ are the singular values of the distorted block, and $N$ is the block size. If the image size is $K \times K$, we have $(K/N) \times (K/N)$ blocks. The set of distances $D_i$, when displayed in a graph, represents a ``distortion map''.
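A naive numpy sketch of the distortion map of Eq. (4) is given below (the 8x8 block size is my own arbitrary choice, and the image side is assumed to be divisible by it; the original paper [1] further condenses the map into a single scalar measure):

```python
import numpy as np

def msvd_distortion_map(orig, dist, n=8):
    """For every n-by-n block, the Euclidean distance between the singular
    values of the original and of the distorted block, Eq. (4)."""
    K = orig.shape[0]
    m = K // n
    dmap = np.zeros((m, m))
    for bi in range(m):
        for bj in range(m):
            a = orig[bi * n:(bi + 1) * n, bj * n:(bj + 1) * n]
            b = dist[bi * n:(bi + 1) * n, bj * n:(bj + 1) * n]
            s = np.linalg.svd(a, compute_uv=False)
            s_hat = np.linalg.svd(b, compute_uv=False)
            dmap[bi, bj] = np.sqrt(np.sum((s - s_hat) ** 2))
    return dmap
```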


A universal image quality index (UQI)

As a more promising new paradigm of image quality measurement, a universal image quality index was proposed in [17]. This image quality metric is based on the following idea:
The main function of the human eyes is to extract structural information from the viewing field, and the human visual system is highly adapted for this purpose. Therefore, a measurement of structural distortion should be a good approximation of perceived image distortion.
The key point of the new philosophy is the switch from error measurement to structural distortion measurement. The problem, then, is how to define and quantify structural distortions. First, let us define the necessary mathematics [17] for an original image $x$ and a test image $y$. The universal quality index can be written as [3]:

$$Q = \frac{4\,\sigma_{xy}\,\bar{x}\,\bar{y}}{\left(\sigma_x^2 + \sigma_y^2\right)\left[\left(\bar{x}\right)^2 + \left(\bar{y}\right)^2\right]} \qquad (5)$$

where $\bar{x}$, $\bar{y}$ are the mean values, $\sigma_x^2$, $\sigma_y^2$ the variances, and $\sigma_{xy}$ the covariance of $x$ and $y$. The index factors into three components:

$$Q = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \cdot \frac{2\,\bar{x}\,\bar{y}}{\left(\bar{x}\right)^2 + \left(\bar{y}\right)^2} \cdot \frac{2\,\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}$$

The first component is the linear correlation coefficient between $x$ and $y$, i.e., a measure of the loss of correlation. The second component measures how close the mean values of $x$ and $y$ are, i.e., luminance distortion. The third component measures how similar the variances of the signals are, i.e., contrast distortion.


The UQI measurement is applied to local regions using a sliding-window approach. To obtain the overall quality index, the average of the local quality indices $Q_j$ is computed:

$$Q = \frac{1}{M}\sum_{j=1}^{M} Q_j \qquad (6)$$

where $M$ is the number of window positions.

As mentioned in [17], the average quality index agrees well with the mean subjective ranks of observers. That gives researchers a very powerful tool for image quality estimation.
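A compact numpy sketch of Eqs. (5) and (6) may look as follows (an 8x8 sliding window, as in [17]; this is my own illustration rather than the authors' MATLAB implementation, and flat windows, where the denominator of Eq. (5) vanishes, are not handled):

```python
import numpy as np

def uqi_window(x, y):
    """Universal Quality Index of Eq. (5) for a single window."""
    x, y = x.ravel().astype(float), y.ravel().astype(float)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(ddof=1), y.var(ddof=1)
    cxy = np.cov(x, y, ddof=1)[0, 1]
    return 4.0 * cxy * mx * my / ((vx + vy) * (mx ** 2 + my ** 2))

def uqi(x, y, w=8):
    """Average of the local indices over all window positions, Eq. (6)."""
    qs = [uqi_window(x[i:i + w, j:j + w], y[i:i + w, j:j + w])
          for i in range(x.shape[0] - w + 1)
          for j in range(x.shape[1] - w + 1)]
    return float(np.mean(qs))
```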

Structural SIMilarity (SSIM) index

The Structural SIMilarity (SSIM) index proposed in [18] is a generalized form of the Universal Quality Index [17]. As above, $x$ and $y$ are discrete non-negative signals; $\mu_x$, $\sigma_{x}^2$, and $\sigma_{xy}$ are the mean value of $x$, the variance of $x$, and the covariance of $x$ and $y$, respectively (and similarly for $\mu_y$ and $\sigma_y^2$). According to [18], the luminance, contrast, and structure comparison measures are given as follows:

$$l(x,y) = \frac{2\mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1} \qquad (7)$$

$$c(x,y) = \frac{2\sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2} \qquad (8)$$

$$s(x,y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3} \qquad (9)$$
where $C_1$, $C_2$, and $C_3$ are small constants given by $C_1 = (K_1 L)^2$, $C_2 = (K_2 L)^2$, and $C_3 = C_2/2$. Here $L$ is the dynamic range of the pixel values, and $K_1 \ll 1$ and $K_2 \ll 1$ are two scalar constants. The general form of the Structural SIMilarity (SSIM) index between signals $x$ and $y$ is defined as:

$$\mathrm{SSIM}(x,y) = \left[l(x,y)\right]^{\alpha}\left[c(x,y)\right]^{\beta}\left[s(x,y)\right]^{\gamma} \qquad (10)$$
where $\alpha$, $\beta$, and $\gamma$ are parameters that define the relative importance of the three components [18]. If $\alpha = \beta = \gamma = 1$, the resulting SSIM index is given by:

$$\mathrm{SSIM}(x,y) = \frac{\left(2\mu_x\mu_y + C_1\right)\left(2\sigma_{xy} + C_2\right)}{\left(\mu_x^2 + \mu_y^2 + C_1\right)\left(\sigma_x^2 + \sigma_y^2 + C_2\right)} \qquad (11)$$

The SSIM index is bounded above by 1 and attains this maximum only when the two images coincide. The universal image quality index proposed in [17] corresponds to the case $C_1 = C_2 = 0$ and is therefore a special case of Eq. (11).
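For illustration, here is a single-window numpy sketch of Eq. (11). The published index [18] computes this locally over a sliding window and averages the local values; $K_1 = 0.01$ and $K_2 = 0.03$ are the default constants suggested in [18]:

```python
import numpy as np

def ssim_single_window(x, y, L=255.0, K1=0.01, K2=0.03):
    """SSIM of Eq. (11) computed over one window (global statistics)."""
    x, y = x.ravel().astype(float), y.ravel().astype(float)
    C1, C2 = (K1 * L) ** 2, (K2 * L) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(ddof=1), y.var(ddof=1)
    cxy = np.cov(x, y, ddof=1)[0, 1]
    return ((2 * mx * my + C1) * (2 * cxy + C2)) / \
           ((mx ** 2 + my ** 2 + C1) * (vx + vy + C2))
```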

A drawback of the basic SSIM index is its sensitivity to relative translations, scalings, and rotations of images [18]. To handle such situations, a wavelet-domain version of SSIM, called the complex wavelet SSIM (CW-SSIM) index, was developed [25]. The CW-SSIM index is also inspired by the fact that in natural images local phase contains more structural information than magnitude [26], while rigid translations of image structures lead to consistent phase shifts.

Despite its simplicity, the SSIM index performs remarkably well [18] across a wide variety of image and distortion types as has been shown in intensive human studies [27].

Instead of conclusion

As it was said in [19], ``we hope to inspire signal processing engineers to rethink whether the MSE is truly the criterion of choice in their own theories and applications, and whether it is time to look for alternatives.'' And I think that such articles provide a great deal of precious information for deciding to move away from the MSE.


Useful links:
A very good and brief survey of image quality metrics, with links to MATLAB examples. Zhou Wang's page, with a huge number of articles and MATLAB source code for UQI and SSIM. Another useful link on HDR image quality metrics.


Bibliography


1
Aleksandr Shnayderman, Alexander Gusev, and Ahmet M. Eskicioglu.
A multidimensional image quality measure using singular value decomposition.
In Image Quality and System Performance. Edited by Miyake, Yoichi; Rasmussen, D. Rene. Proceedings of the SPIE, Volume 5294, pp. 82-92, 2003.
2
A. M. Eskicioglu and P. S. Fisher.
A survey of image quality measures for gray scale image compression.
In Proceedings of 1993 Space and Earth Science Data Compression Workshop, pp. 49-61, Snowbird, UT, April 2, 1993.
3
Zhou Wang, Alan C. Bovik, and Ligang Lu.
Why is image quality assessment so difficult?
In Proceedings of ICASSP'02, vol. 4, pp. IV-3313-IV-3316, 2002.
4
W. K. Pratt.
Digital Image Processing.
John Wiley and Sons, Inc., USA, 1978.
5
J.R. Fienup.
Invariant error metrics for image reconstruction.
Applied Optics, vol. 36, no. 32, pp. 8352-8357, 1997.
6
J. L. Mannos and D. J. Sakrison.
The effects of a visual fidelity criterion on the encoding of images.
IEEE Transactions on Information Theory, Vol. 20, No. 4:525-536, July 1974.
7
J. O. Limb.
Distortion criteria of the human viewer.
IEEE Transactions on Systems, Man, and Cybernetics, Vol. 9, No. 12:778-793, December 1979.
8
H. Marmolin.
Subjective MSE measures.
IEEE Transactions on Systems, Man, and Cybernetics, Vol. 16, No. 3:486-489, May/June 1986.
9
J. A. Saghri, P. S. Cheatham, and A. Habibi.
Image quality measure based on a human visual system model.
Optical Engineering, Vol. 28, No. 7:813-818, July 1989.
10
N. B. Nill and B. H. Bouzas.
Objective image quality measure derived from digital image power spectra.
Optical Engineering, 31(4):813-825, 1992.
11
A.A. Webster, C. T. Jones, M. H. Pinson, S. D. Voran, and S. Wolf.
An objective video quality assessment system based on human perception.
In Proceedings of SPIE, Vol. 1913, 1993.
12
T. N. Pappas and R. J. Safranek.
Perceptual criteria for image quality evaluation.
In Handbook of Image and Video Processing (A. Bovik, ed.). Academic Press, May 2000.
13
B. Girod.
What's wrong with mean-squared error?
In Digital Images and Human Vision (A. B. Watson, ed.), pages 207-220. The MIT Press, 1993.
14
S. Daly.
The visible difference predictor: An algorithm for the assessment of image fidelity.
In Proceedings of SPIE, vol. 1616, pp. 2-15, 1992.
15
A. B. Watson, J. Hu, and J. F. McGowan III.
Digital video quality metric based on human vision.
Journal of Electronic Imaging, vol. 10, no. 1:20-29, 2001.
16
J.-B. Martens and L. Meesters.
Image dissimilarity.
Signal Processing, vol. 70:155-176, Nov. 1998.
17
Z. Wang and A.C. Bovik.
A universal image quality index.
IEEE Signal Processing Letters, vol. 9, no. 3:81-84, Mar. 2002.
18
Z. Wang, A.C. Bovik, H.R. Sheikh, and E.P. Simoncelli.
Image quality assessment: From error visibility to structural similarity.
IEEE Transactions on Image Processing, vol. 13, no. 4:600-612, Apr. 2004.
19
Zhou Wang and Alan C. Bovik.
Mean squared error: Love it or leave it?
IEEE Signal Processing Magazine, 26(1):98-117, January 2009.
20
D.M. Chandler and S.S. Hemami.
VSNR: A wavelet-based visual signal-to-noise ratio for natural images.
IEEE Transactions on Image Processing, vol. 16, no. 9:2284-2298, Sept. 2007.
21
N. Wiener.
The Extrapolation, Interpolation and Smoothing of Stationary Time Series.
New York: Wiley, 1949.
22
J.R. Fienup.
Refined Wiener-Helstrom image reconstruction.
Annual Meeting of the Optical Society of America, Long Beach, CA, October 18, 2001.
23
James R. Fienup, Douglas K. Griffith, L. Harrington, A. M. Kowalczyk, Jason J. Miller, and James A. Mooney.
Comparison of reconstruction algorithms for images from sparse-aperture systems.
In Proc. SPIE, Image Reconstruction from Incomplete Data II, volume 4792, pages 1-8, 2002.
24
D. Kahaner, C. Moler, and S. Nash.
Numerical Methods and Software.
Prentice-Hall, Inc., 1989.
25
Z. Wang and E.P. Simoncelli.
Translation insensitive image similarity in complex wavelet domain.
In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 573-576, Mar. 2005.
26
T.S. Huang, J.W. Burdett, and A.G. Deczky.
The importance of phase in image processing filters.
IEEE Transactions on Acoustic, Speech, and Signal Processing, vol. 23, no. 6:529-542, Dec. 1975.
27
H.R. Sheikh, M.F. Sabir, and A.C. Bovik.
A statistical evaluation of recent full reference image quality assessment algorithms.
IEEE Transactions on Image Processing, vol. 15, no. 11:3449-3451, Nov. 2006.

Monday, March 16, 2009

Interesting facts about snakes' vision

The more I learn about the vision systems of animals, the more I think that specialists in artificial imaging would be well advised to read and study biology and biophysics. So the main topic of this post is snakes that have the ability of thermal vision.

Not all snakes have the ability of heat vision, but some groups of pythons and rattlesnakes can see both in the visible and in the far-IR band [1]. Snakes use infra-red radiation with wavelengths centred on 10 micrometres (the wavelengths emitted by warm-blooded animals). As it was written in [1],

certain groups of snakes do what no other animals or artificial devices can do. They form detailed images of extremely small heat signatures. What is most fascinating is that they do this with receptors that are microscopic in size, extraordinarily sensitive, uncooled, and are able to repair themselves. Snake infra-red imagers are at least 10 times more sensitive than the best artificial infra-red sensors...[1]

Several papers give us a better understanding of how snakes can actually see and attack prey using only heat vision. A brief survey of articles devoted to snake vision, as well as some thoughts, is given below.

How does the snake see?

The detection system, which consists of cavities located on each side of the head called pit organs, operates on a principle similar to that of a pinhole camera [2]. Pit vipers and boids, the two snake groups that possess this ability, have heat-sensitive membranes that can detect the temperature difference between a moving prey and its surroundings on the millikelvin scale. If the radiation intensity hitting the membrane at some point is larger than the thermal radiation emitted by the membrane itself, the membrane heats up at that location [2]. Such cavities are pictured in Fig. 1.

Figure 1: Snake's heat vision: a) head of a pit viper with nostril, pit hole, and eye, left to right. Photograph courtesy of Guido Westhoff; b) A pit viper's infra-red-sensitive pit organ works like a pinhole camera. The image from the paper [2].


Using the Planck radiation law as an approximation of the emitted heat intensity, 99% of the radiation is emitted at wavelengths under 75 micrometres, and the radiation intensity is maximal at 9.5 micrometres [3], which is within the 8-12 micrometre IR atmospheric transmittance window [4].
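The quoted peak is easy to check with Wien's displacement law (the temperature T of about 305 K below is my own assumption for the skin of a warm-blooded prey):

```python
# Wien's displacement law: lambda_max = b / T.
b = 2897.8    # Wien's constant, micrometre-kelvin
T = 305.0     # approximate surface temperature of a mammal, K (assumption)
print(b / T)  # -> about 9.5 micrometres, matching the figure quoted in [3]
```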

Because the pit hole is very large compared to the membrane size, the radiation strikes many points. The optical quality of the infra-red vision is much too blurry to allow snakes to strike prey with the observed accuracy of about 5 degrees. Most fascinating are the number of heat-sensitive sensors and their precision:

In pit vipers, which have only two pit holes (one in front of each eye), a block of about 1600 sensory cells lie on a membrane which has a field of view of about 100 degrees . This means the snake's brain would receive an image resolution of about 2.5 degrees for point-like objects, such as eyes, which are one of the hottest points on mammals... [2]
If the aperture was very small, the amount of energy per unit time (second) reaching the membrane would also be small. The need to gather a reasonable amount of thermal energy per second necessitates the ``pinhole'' of the pit organ to be very large, thus greatly reducing its optical performance. If on the other hand the aperture of the organ is large, the image of a point source of heat is disc-shaped rather than point-like. Since, however, the size of the disc-shaped image may be determined by the detectors on the membrane, it is still possible to tell from which direction the radiation comes, ensuring directional sensitivity of the system [3]. The aperture size was probably an evolutionary trade-off between image sharpness and radiant flux [2]. Although the image that is formed on the pit membrane has a very low quality, the information that is needed to reconstruct the original temperature distribution in space is still available [3].

So how could a snake possibly use such poorly focused IR input to find its prey in darkness with a surprising angular precision of 5 degrees? How might the snake extract information on the location of the prey from the blurred image that is formed on the pit membrane?


What does the snake see?

Without the ability of real-time imaging, the IR organ would be of little use to the snake. Dr. van Hemmen and colleagues showed that it is possible to reconstruct the original heat distribution from the blurred image on the membrane [3].

The image on the membrane resulting from the total heat distribution in space will be some complicated shape that consists of the superposition of the contributions of all heat sources [3]. A superposition of edge detectors in the brain can then reconstruct the heat distribution by using the whole image on the membrane for each point in space to be reconstructed. So reconstruction is possible because the information is still available in the blurred image on the pit membrane, where the receptors are [2]. As a demonstration of the model, a sample image (see Fig. 2) was used.


Figure 2: The famous hare by Dürer (left) was converted into 8-bit gray levels at a resolution of 32x32 (right). The image is from the paper [2].

Since a snake has limited computational resources (all ``calculations'' must be realizable in neuronal ``hardware'') the reconstruction model must be simple. Our model [5] thus uses only one computational step (it is noniterative) to estimate the input image from the measured response on the pit membrane. It resembles a Wiener filter and is akin to, but different from, some of the algorithms used in image reconstruction [6].
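The actual neuronal model of [5] is of course more subtle, but the spirit of a one-step, Wiener-like reconstruction can be sketched with a generic regularized deconvolution (the disc-shaped PSF, the noise levels, and all parameters below are my own toy assumptions):

```python
import numpy as np

def wiener_deconvolve(measured, psf, nsr=1e-3):
    """One-step (non-iterative) Wiener-style reconstruction: divide out the
    known blur in the Fourier domain, regularized by a noise-to-signal ratio."""
    H = np.fft.fft2(np.fft.ifftshift(psf))
    G = np.conj(H) / (np.abs(H) ** 2 + nsr)
    return np.real(np.fft.ifft2(np.fft.fft2(measured) * G))

# Toy pit-organ model: a wide "pinhole" gives a disc-shaped PSF that
# smears a point-like heat source over the membrane.
n = 64
yy, xx = np.mgrid[-n // 2:n // 2, -n // 2:n // 2]
psf = ((xx ** 2 + yy ** 2) <= 10 ** 2).astype(float)
psf /= psf.sum()
scene = np.zeros((n, n))
scene[20, 40] = 1.0                       # warm prey as a point source
rng = np.random.default_rng(1)
blurred = np.real(np.fft.ifft2(np.fft.fft2(scene) *
                               np.fft.fft2(np.fft.ifftshift(psf))))
measured = blurred + 1e-4 * rng.standard_normal((n, n))
restored = wiener_deconvolve(measured, psf)
```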

So it is highly remarkable that snakes can perform some kind of image processing, much like our artificial devices based on ``wavefront coding'' [7,8] and ``pupil engineering'' [9,10] techniques.


Image processing in nature

A neuronal algorithm was developed [11] that accurately reconstructs the heat image from the membrane. The most vital requirements are accurate detectors and the ability to detect edges in the images produced on the pit membrane [2]. That is similar to the situation with ``wavefront coding'' devices: the dynamic range and accuracy of the ADC are much more important than the number of elements.

I would like to introduce an analogy here: such imaging is like drawing a picture in sand. The finer the sand, the more accurate and delicate the pictures one can draw; that is the case of a high-dynamic-range detector. And vice versa: in coarse, stony sand it is difficult to draw a fine tracery; that is the case of a low-dynamic-range detector [12,13].

But let us get back to the model of snakes vision:

The model has a fairly high noise tolerance. For input noise levels up to 50%, the hare is recognizable. Sensitivity to measurement errors is larger. In our calculations, one pixel of the reconstructed image corresponds to about 3 degrees . For detector noise levels up to about 1% of the membrane heat intensity, a good reconstruction is possible, meaning that the edge of the hare may be determined with about one pixel accuracy. At detector noise levels beyond about 1%, the image is not so easily recognizable, but the presence of an object is still evident...[5]

The assumptions that went into the calculations are a ``worst case scenario''. For instance, we assumed [3] that the input to the pit organ is totally uncorrelated, meaning that the snake has no idea what heat distribution to expect. In reality, important information about the environment is always available. For example, typical temperature and size of a prey animal may be encoded in the neuronal processing structure. If the snake ``knows'' what kind of images to expect, the reconstruction process can be enhanced considerably [3].

How does the reconstruction matrix become imprinted on the snake's neural circuitry in the first place? ``It can't be genetic coding,'' says van Hemmen. ``The snake would need a suitcase full of genes to encode such detail. Besides we know that snakes ...need a season of actual learning, not just anatomical maturation, to acquire their extraordinary skills.''... [11]

Fig. 3 shows deconvolution results that give us an idea of the snake's vision capabilities.


Figure 3: On the left, this figure displays the membrane heat intensity as captured by the ``pithole camera''. On the right are reconstructions for four different membrane noise levels. The pit membrane was taken as a flat square containing 41x41 receptors. The model works equally well if applied to other membrane shapes. The membrane noise term was taken to be Gaussian with $\sigma$ = 25, 100, 200, and 500 from left to right and top to bottom, corresponding to 0.25%, 1%, 2%, and 5% of the maximal membrane intensity. The image is from the paper [2].

Ultimately, a snake's ability to utilize information from the pit organs depends on its capability to detect edges in the image produced on the pit membrane. If the snake performed no reconstruction, but instead simply targeted bloblike ``hot spots'' on the membrane, it would still have to be able to discern the edge of the blob. The present model performs edge detection for all spatial positions and hence automatically creates a full reconstruction. A level of neuronal processing beyond what is represented in our model is unlikely to be beneficial since the quality of the system is fundamentally limited by the relatively small number of heat receptors.[5]

Conclusion

Snakes' heat vision, when reconstructed, presents such a clear image that it surpasses even many man-made devices: it is far better than any technical uncooled infra-red camera with a similar number of detector cells [2].

Bibliography


1
Liz Tottenham.
Infrared imaging research targets 'snake vision'.
web publication - Discovery: Florida Tech, DE-402-901:4-5, 2002.
2
Lisa Zyga.
Snakes' heat vision enables accurate attacks on prey.
PhysOrg.com, www.physorg.com/news76249412.html, page 2, 2006.
3
Andreas B. Sichert, Paul Friedel, and J. Leo van Hemmen.
Modelling imaging performance of snake infrared sense.
In Proceedings of the 13th Congress of the Societas Europaea Herpetologica. pp. 219-223; M. Vences, J. Kohler, T. Ziegler, W. Bohme (eds): Herpetologia Bonnensis II., 2006.
4
David A. Allen.
Infrared: The New Astronomy.
1975.
5
Andreas B. Sichert, Paul Friedel, and J. Leo van Hemmen.
Snake's perspective on heat: Reconstruction of input using an imperfect detection system.
Physical Review Letters, 97:068105, 2006.
6
R. C. Puetter, T. R. Gosnell, and Amos Yahil.
Digital image reconstruction: Deblurring and denoising.
Annual Review of Astronomy and Astrophysics, 43:139-194, 2005.
7
J. van der Gracht, E.R. Dowski, M. Taylor, and D. Deaver.
New paradigm for imaging systems.
Optics Letters, Vol. 21, No 13:919-921, July 1, 1996.
8
Jr. Edward R. Dowski and Gregory E. Johnson.
Wavefront coding: a modern method of achieving high-performance and/or low-cost imaging systems.
In Proc. SPIE, Current Developments in Optical Design and Optical Engineering VIII, volume 3779, pages 137-145, 1999.
9
R. J. Plemmons, M. Horvath, E. Leonhardt, V. P. Pauca, S. Prasad, S. B. Robinson, H. Setty, T. C. Torgersen, J. van der Gracht, E. Dowski, R. Narayanswamy, and P. E. X. Silveira.
Computational imaging systems for iris recognition.
In Proc. SPIE, Advanced Signal Processing Algorithms, Architectures, and Implementations XIV, volume 5559, pages 346-357, 2004.
10
Sudhakar Prasad, Todd C. Torgersen, Victor P. Pauca, Robert J. Plemmons, and Joseph van der Gracht.
Engineering the pupil phase to improve image quality.
In Proc. SPIE, Visual Information Processing XII, volume 5108, pages 1-12, 2003.
11
Bertram Schwarzschild.
Neural-network model may explain the surprisingly good infrared vision of snakes.
Physics Today, 59(9):18-20, September 2006.
12
M.V. Konnik.
Image's linearization from commercial cameras used in optical-digital systems with optical coding.
In Proceedings of 5th International Conference of young scientists ``Optics-2007'', Saint-Petersburg, pages 354-355, 2007.
13
M.V. Konnik, E.A. Manykin, and S.N. Starikov.
Increasing linear dynamic range of commercial digital photocamera used in imaging systems with optical coding.
In OSAV'2008 Topical meeting, Saint-Petersburg, Russia, 2008.


Saturday, February 28, 2009

Discriminative sensing

Recently I read a very interesting paper [1] by Keith Lewis about discriminative sensing. In this post I have selected the ideas I find most interesting and added some thoughts of my own.

Both natural and artificial vision systems have many points in common: for example, three different photoreceptor types for the red, green, and blue bands of visible light, or the ability to process multi-element scenes. But natural vision systems have the significant advantage of pre-processing images before they are processed and understood by the visual cortex in the brain:

In general the biological imaging sensor takes a minimalist approach to sensing its environment, whereas current optical engineering approaches follow a ``brute'' force solution...[1]


This is a very serious problem: in the case of in-vehicle systems, which require real-time image processing, such ``brute force'' solutions are inefficient. Once you need to process images fast, you have to use powerful computers, parallel image processing algorithms, and symmetric multiprocessing. All such ``brute force'' solutions increase the energy consumption of in-vehicle systems, require more batteries, and eventually increase the size and weight of these dinosaur-like devices.

Further, only one sensor is used in many unmanned systems. Images produced by such a sensor tend to be redundant and excessively large, and hence difficult to process quickly.

In the biological world, most organisms have an abundant and diverse assortment of peripheral sensors, both across and within sensory modalities. Multiple sensors offer many functional advantages in relation to the perception and response to environmental signals....[1]

So I am convinced that the next generation of imaging techniques and devices should use ideas and methods from natural vision systems. Indeed, it is sometimes useful to take lessons from Nature, as from an engineer with billions of years of experience.

Bio-inspiration

I have always been fascinated by insects: such small creatures that can distinguish and understand objects and make decisions in complicated situations, more or less intelligently. For example,

...the fly has compound eyes, ...as well as the requisite neural processing cortex, all within an extremely small host organism. Its compound eyes provide the basis for sensing rapid movements across a wide field of view, and as such provide the basis of a very effective threat detection system...[1]

That's the main idea, I presume: only relevant objects are registered by the small compound eyes and then understood by the neural processing cortex of the fly. Interestingly enough, there exist vision systems much more complex even than human vision:

...a more complex vision architecture is found in the mantis shrimp[2]. Receptors in different regions of its eye are anatomically diverse and incorporate unusual structural features, not seen in other compound eyes. Structures are provided for analysis of the spectral and polarisation properties of light, and include more photoreceptor classes for analysis of ultraviolet light, color, and polarization than occur in any other known visual system...This implies that the visual cortex must be associated with some significant processing capability if the objective is to generate an image of its environment...[1]

In contrast, artificial systems are far less intelligent than ants or flies. Our unmanned systems are required to register the whole scene at once, without understanding or even preprocessing it. Then, using on-board computers, unmanned devices process this huge stream of images pixel by pixel, without understanding which signals are relevant.

Although both natural and artificial vision systems use the same idea of tri-chromatic photoreceptors, the results differ dramatically. While animals are very good at recognizing prey or threats, artificial systems such as correlators and expert systems are relatively bad at making decisions. Primitive artificial YES-NO logic is not as flexible as natural neural networks with their fuzzy rule sets and growing experience of dealing with threats.

This situation is much like the history of human attempts at flight: for a long time people tried to get off the ground by imitating birds. Success came only after the idea of flight was understood.

Beyond the Nyquist limit

As an example of a non-trivial yet elegant approach, coded aperture systems are remarkable. The idea can be applied both in the visible [3] and in the IR [4] band. As has been truly stated, such a technique
...provides significant advantage in improving signal-to-noise ratio at the detector, without compromising the other benefits of the coded aperture technique. Radiation from any point in the scene is still spread across several hundred elements, but this is also sufficient to simplify the signal processing required to decode the image...[1]
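In essence, every scene point casts a shifted shadow of the mask onto the detector, so the measurement is a convolution with a known code that can be inverted numerically. Here is a toy numpy sketch of that principle (a random binary mask and regularized inverse filtering, all parameters my own; real systems use specially designed patterns, e.g. the URA/MURA families, with delta-like autocorrelations):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 128
mask = (rng.random((n, n)) < 0.5).astype(float)  # random binary coded aperture
scene = np.zeros((n, n))
scene[40, 60], scene[90, 30] = 1.0, 0.5          # two point sources

# Each scene point casts a shifted copy of the mask: a (circular) convolution.
M = np.fft.fft2(mask)
detector = np.real(np.fft.ifft2(np.fft.fft2(scene) * M))
detector += 0.01 * rng.standard_normal((n, n))   # detector noise

# Decoding: regularized inverse filtering with the known mask spectrum.
decoded = np.real(np.fft.ifft2(np.fft.fft2(detector) *
                               np.conj(M) / (np.abs(M) ** 2 + 10.0)))
```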

It is noteworthy that analogous techniques such as ``wavefront coding'' [5,6] and ``pupil engineering'' [7,8] are applied in various optical systems, too. Application of such paradigms allows the creation of unique devices that combine the high parallelism of optics with the flexibility of digital image processing algorithms.

It is clear that there is a little way to go yet before such computational imaging systems can be fielded on a practical basis...[1]
Moreover, such computational imaging systems are already here, in practical applications! Devices based on such techniques are used in security systems [9], tomography [10], aberration correction in optical systems [11,12], depth-of-field improvement [13], and so on.

It is curious that the coded aperture approach can be found even in natural vision systems, such as snake vision [14]. These sensory organs enable the snake to successfully strike prey even in total darkness or following the disruption of other sensory systems. Although the image that is formed on the pit membrane has a very low quality, the information that is needed to reconstruct the original temperature distribution in space is still available. A mathematical model that allows the original heat distribution to be reconstructed from the low-quality image on the membrane is reported in [15].

Instead of conclusion

There is no doubt that more and more approaches from natural vision systems will be used in artificial imaging systems. Hence, the more we know about animals' eyes, the better we can design our artificial vision systems. I presume that in the near future many of us are going to be constant readers of biological scientific journals...

Bibliography


1
Keith Lewis.
Discriminative sensing techniques.
Proc. of SPIE, Vol. 7113:71130C-10, 2008.
2
Cronin, T. W. and Marshall, J.
Parallel processing and image analysis in the eyes of mantis shrimps.
Biol. Bulletin, 200:177, 2001.
3
Slinger, C., Eismann, M., Gordon, N., Lewis, K., McDonald, G., McNie, M., Payne, D., Ridley, K., Strens, M., de Villiers G., and Wilson R.
An investigation of the potential for the use of a high resolution adaptive coded aperture system in the mid-wave infrared.
In Proc. SPIE 6714, 671408, 2007.
4
Slinger, C., Dyer, G., Gordon, N., McNie, M., Payne, D., Ridley, K., Todd, M., de Villiers, G., Watson, P., Wilson, R., Clark, T., Jaska, E., Eismann, M., Meola, J., and Rogers, S.
Adaptive coded aperture imaging in the infrared: towards a practical implementation.
In Proc. SPIE Annual Meeting, 2008.
5
J. van der Gracht, E.R. Dowski, M. Taylor, and D. Deaver.
New paradigm for imaging systems.
Optics Letters, Vol. 21, No 13:919-921, July 1, 1996.
6
Jr. Edward R. Dowski and Gregory E. Johnson.
Wavefront coding: a modern method of achieving high-performance and/or low-cost imaging systems.
In Proc. SPIE, Current Developments in Optical Design and Optical Engineering VIII, volume 3779, pages 137-145, 1999.
7
R. J. Plemmons, M. Horvath, E. Leonhardt, V. P. Pauca, S. Prasad, S. B. Robinson, H. Setty, T. C. Torgersen, J. van der Gracht, E. Dowski, R. Narayanswamy, and P. E. X. Silveira.
Computational imaging systems for iris recognition.
In Proc. SPIE, Advanced Signal Processing Algorithms, Architectures, and Implementations XIV, volume 5559, pages 346-357, 2004.
8
Sudhakar Prasad, Todd C. Torgersen, Victor P. Pauca, Robert J. Plemmons, and Joseph van der Gracht.
Engineering the pupil phase to improve image quality.
In Proc. SPIE, Visual Information Processing XII, volume 5108, pages 1-12, 2003.
9
Songcan Lai and Mark A. Neifeld.
Digital wavefront reconstruction and its application to image encryption.
Optics Communications, 178:283-289, 2000.
10
Daniel L. Marks, Ronald A. Stack, and David J. Brady.
Three-dimensional tomography using a cubic-phase plate extended depth-of-field system.
Optics Letters, 24(4):253-255, 1999.
11
H. Wach, E.R. Dowski, and W.T. Cathey.
Aberration invariant optical/digital incoherent systems.
Applied Optics, Vol. 37, No. 23:5359-5367, August 10, 1998.
12
Sara C. Tucker, W. Thomas Cathey, and Edward R. Dowski, Jr.
Extended depth of field and aberration control for inexpensive digital microscope systems.
Optics Express, Vol. 4, No. 11:467-474, 24 May 1999.
13
Daniel L. Barton, Jeremy A. Walraven, Edward R. Dowski Jr., Rainer Danz, Andreas Faulstich, and Bernd Faltermeier.
Wavefront coded imaging systems for MEMS analysis.
Proc. of ISTFA, pages 295-303, 2002.
14
Andreas B. Sichert, Paul Friedel, and J. Leo van Hemmen.
Snake's perspective on heat: Reconstruction of input using an imperfect detection system.
PHYSICAL REVIEW LETTERS, PRL 97:068105-1-4, 2006.
15
Andreas B. Sichert, Paul Friedel, and J. Leo van Hemmen.
Modelling imaging performance of snake infrared sense.
In Proceedings of the 13th Congress of the Societas Europaea Herpetologica. pp. 219-223; M. Vences, J. Kohler, T. Ziegler, W. Bohme (eds): Herpetologia Bonnensis II., 2006.

Tuesday, January 6, 2009

Optical-digital encryption systems

As an alternative approach to encryption, hybrid optical-digital systems are worth noting. Such systems optically introduce a known distortion into the registered image; the introduced distortion can then be compensated digitally.

All of such encryption systems can be divided in three category: optical, digital, and hybrid optical-digital systems. A brief survey of them is presented further.

Digital encryption systems

Digital encryption systems perform all operations numerically in a computer [1], so no optical setup needs to be built. There are several mostly digital systems, such as the Virtual Optics system [2] and virtual-optical holography (VOH) [3], that are characterized by high cryptographic resistance. Other digital systems worth noting use the fractional wavelet transform [4] or the fractional Fourier transform [5].

One of the most widespread digital encryption techniques is the virtual-optical imaging scheme (VOIS) [2], which simulates an optical imaging system in a computer model. The Fresnel approximation is used as the computational model; the encryption parameters are the wavelength $\lambda$ of the coherent ``virtual'' light, the distance $d_0$ between the image to be encoded and the ``lens'', the focal distance $f$ of the ``lens'', and the distance $d_i$ between the ``lens'' and the observation plane (see Fig. 1).

Figure 1: The key diagram of the Virtual Optics.

The Fourier spectrum of the image to be encoded is multiplied by the Fourier spectrum of the coding mask, so one needs to know the exact values of $\lambda$, $d_0$, $d_i$, and $f$ in order to decrypt the image.
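For a feeling of what such a ``virtual optics'' model computes, here is a minimal transfer-function Fresnel propagator in numpy (my own sketch, not the VOIS code from [2]; the constant phase factor $e^{ikz}$ is dropped):

```python
import numpy as np

def fresnel_propagate(field, wavelength, z, dx):
    """Propagate a complex field over distance z using the Fresnel
    transfer function exp(-i*pi*wavelength*z*(fx^2 + fy^2));
    dx is the sampling interval in the input plane."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)
    FX, FY = np.meshgrid(fx, fx)
    H = np.exp(-1j * np.pi * wavelength * z * (FX ** 2 + FY ** 2))
    return np.fft.ifft2(np.fft.fft2(field) * H)
```

In a VOIS-like scheme the key is essentially the parameter tuple ($\lambda$, $d_0$, $f$, $d_i$): a wrong value of any of them leaves the decrypted field defocused and unreadable.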


Optical encryption systems

Optical encryption techniques exploit the high speed and parallelism of optical image processing. As encryption keys, diffractive optical elements (DOEs) are used; they are synthesized and output to ferroelectric [6] or LCD modulators [7]. For example, such systems use a lensless approach [8], 4-f based systems [9], the fractional Fourier transform [10,11,12], or a double random phase mask [13]. Apart from these systems, encryption with toroidal zone plates is worth mentioning [14].

The most widespread approach in optical encryption systems is the double random phase mask [13,15,16,17]. As briefly described in [18], the image to be encrypted, $P$, is immediately followed by a first random phase mask, which is the first key, $X$. Both the image and the mask are located in the object focal plane of a first lens (see Fig. 2).

Figure 2: DRPE coding scheme

The Fourier transform (FT) of the product $P \cdot X$ is therefore obtained in the image focal plane of this lens. This product is then multiplied by another random phase mask, which is the second key, $Y$. Lastly, another FT is performed by a second lens to return to the spatial domain. Since the last FT does not add anything to the security of the system, all analyses can be performed in the Fourier plane [18]. The ciphered image $C$ is then:

$$C = Y \cdot \mathrm{FT}(P \cdot X) \qquad (1)$$

where $\mathrm{FT}$ stands for the Fourier transform operation. In most of the paper [18], $P$ is assumed to be a grey-level image.
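Eq. (1) is easy to play with numerically. Here is a toy numpy sketch of DRPE encryption and decryption (random phase keys; the last lens/FT is omitted, exactly as in the analysis above):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 128
P = rng.random((n, n))                       # stand-in grey-level image
X = np.exp(2j * np.pi * rng.random((n, n)))  # key 1: input-plane phase mask
Y = np.exp(2j * np.pi * rng.random((n, n)))  # key 2: Fourier-plane phase mask

C = Y * np.fft.fft2(P * X)                   # ciphered image, Eq. (1)

# Decryption with the correct keys: undo Y, inverse-FT, undo X.
P_rec = np.real(np.fft.ifft2(C / Y) / X)
assert np.allclose(P_rec, P)
```

Decryption with a wrong $X$ or $Y$ yields white-noise-like output, which is exactly where the cryptographic resistance of the scheme comes from.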

Such systems produce encrypted images characterized by high cryptographic resistance. Images are encrypted in a very short time because the optical processing is parallel. The complexity and expense of the optical setup are the drawbacks of such systems. Moreover, several vulnerabilities of the double random phase mask were reported recently [19,20,21].

Hybrid optical-digital systems

Hybrid optical-digital systems combine the advantages of optical processing (high speed and parallelism) and digital processing (the flexibility of digital image processing methods). The application of digital methods to optical coding reduces the weight and cost of the devices.

Among the most widespread optical-digital paradigms, ``wavefront coding'' [22] and ``pupil engineering'' [23] are worth mentioning. Systems based on these paradigms are used for enhancing the depth of field in microscopic imaging [24], aberration compensation [25], depth-of-field improvement in MEMS systems [26], and enhancement of tomography images [27].

A coding diffractive optical element (DOE) is introduced into the imaging path of such devices; hence the convolution of the input object with the point spread function (PSF) of the DOE is performed optically. As a result, the registered image is blurred, but the blur is the same across the image. Digital deconvolution is then performed to reconstruct the image and compensate for the introduced distortion. An example of a hybrid optical-digital device is shown in Fig. 3.

Figure 3: Hybrid optical-digital imaging system: a photo sensor and a kinoform (DOE).

Hybrid optical-digital systems based on ``wavefront coding'' and ``pupil engineering'' paradigms can be used not only for encryption but for depth-of-field enhancing, too.

Such systems are advantageous because of their low cost, flexibility, and versatility (they may be used not only for data encryption). A disadvantage is that after decryption the visual quality of the image is slightly degraded.

Bibliography


1
Ari Y. Benbasat.
A survey of current optical security techniques.
Technical report, MIT Media Lab, Prepared for Prof. Cardinal Warde 6.637 Spring 1999 Research Project, April 15, 1999.
2
Xiang Peng, Zhiyong Cui, and Tieniu Tan.
Information encryption with virtual-optics imaging system.
Optics Communications, 212(4-6):235-245, November 2002.
3
Xiang Peng and Peng Zhang.
Security of virtual-optics-based cryptosystem.
Optik, 117:525-531, 2006.
4
Linfei Chen and Daomu Zhao.
Optical image encryption based on fractional wavelet transform.
Optics Communications, 254(4-6):361-367, October 2005.
5
B.M. Hennelly and J.T. Sheridan.
Image encryption and the fractional Fourier transform.
Optik - International Journal for Light and Electron Optics, 114(6):251-265, 2003.
6
G. Unnikrishnan, M. Pohit, and K. Singh.
A polarization encoded optical encryption system using ferroelectric spatial light modulator.
Optics Communications, 185(1-3):25-31, November 2000.
7
Chau-Jern Cheng and Mao-Ling Chen.
Polarization encoding for optical encryption using twisted nematic liquid crystal spatial light modulators.
Optics Communications, 237(1-3):45-52, July 2004.
8
Guohai Situ and Jingjuan Zhang.
A lensless optical security system based on computer-generated phase only masks.
Optics Communications, 232(1-6):115-122, March 2004.
9
Xiaogang Wang, Daomu Zhao, and Linfei Chen.
Image encryption based on extended fractional Fourier transform and digital holography technique.
Optics Communications, 260(2):449-453, April 2006.
10
Naveen Kumar Nishchal, Joby Joseph, and Kehar Singh.
Fully phase-encrypted memory using cascaded extended fractional Fourier transform.
Optics and Lasers in Engineering, 42(2):141-151, August 2004.
11
Naveen Kumar Nishchal, Joby Joseph, and Kehar Singh.
Securing information using fractional Fourier transform in digital holography.
Optics Communications, 235(4-6):253-259, May 2004.
12
Banghe Zhu and Shutian Liu.
Optical image encryption based on the generalized fractional convolution operation.
Optics Communications, 195(5-6):371-381, August 2001.
13
Enrique Tajahuerce, Osamu Matoba, Steven C. Verrall, and Bahram Javidi.
Optoelectronic information encryption with phase-shifting interferometry.
Applied Optics, Vol. 39, No. 14:2313-2320, 10 May 2000.
14
John Fredy Barrera, Rodrigo Henao, and Roberto Torroba.
Optical encryption method using toroidal zone plates.
Optics Communications, 248(1-3):35-40, April 2005.
15
Enrique Tajahuerce, Jesus Lancis, Bahram Javidi, and Pedro Andres.
Optical security and encryption with totally incoherent light.
Optics Letters, Vol. 26, No. 10:678-680, May 15, 2001.
16
Enrique Tajahuerce and Bahram Javidi.
Encrypting three-dimensional information with digital holography.
Applied Optics, Vol. 39, No. 35:6595-6601, 10 December 2000.
17
Bahram Javidi and Enrique Tajahuerce.
Three-dimensional object recognition by use of digital holography.
Optics Letters, Vol. 25, No. 9:610-612, May 1, 2000.
18
Yann Frauel, Albertina Castro, Thomas J. Naughton, and Bahram Javidi.
Resistance of the double random phase encryption against various attacks.
Optics Express, Vol. 15, No. 16:10253-10265, 6 August 2007.
19
A. Carnicer, M. Montes-Usategui, S. Arcos, and I. Juvells.
Vulnerability to chosen-cyphertext attacks of optical encryption schemes based on double random phase keys.
Optics Letters, 30:1644-1646, 2005.
20
Unnikrishnan Gopinathan, David S. Monaghan, Thomas J. Naughton, and John T. Sheridan.
A known-plaintext heuristic attack on the Fourier plane encryption algorithm.
Optics Express, Vol. 14, No. 8:3181-3186, 2006.
21
X. Peng, P. Zhang, H. Wei, and B. Yu.
Known-plaintext attack on optical encryption based on double random phase keys.
Optics Letters, 31:1044-1046, 2006.
22
W.T. Cathey and E.R. Dowski.
New paradigm for imaging systems.
Applied Optics, 41:6080-6092, 2002.
23
Sudhakar Prasad, Todd C. Torgersen, Victor P. Pauca, Robert J. Plemmons, and Joseph van der Gracht.
Engineering the pupil phase to improve image quality.
In Proc. SPIE, Visual Information Processing XII, volume 5108, pages 1-12, 2003.
24
P. Potuluri, M. R. Fetterman, and D. J. Brady.
High depth of field microscopic imaging using an interferometric camera.
Optics Express, Vol. 8, No. 11:624-630, 21 May 2001.
25
Sara C. Tucker, W. Thomas Cathey, and Edward R. Dowski, Jr.
Extended depth of field and aberration control for inexpensive digital microscope systems.
Optics Express, Vol. 4, No. 11:467-474, 24 May 1999.
26
Daniel L. Barton, Jeremy A. Walraven, Edward R. Dowski Jr., Rainer Danz, Andreas Faulstich, and Bernd Faltermeier.
Wavefront coded imaging systems for MEMS analysis.
Proc. of ISTFA, pages 295-303, 2002.
27
Daniel L. Marks, Ronald A. Stack, and David J. Brady.
Three-dimensional tomography using a cubic-phase plate extended depth-of-field system.
Optics Letters, 24(4):253-255, 1999.