Reasoning verbally about image processing

Working on a new training class in digital camera design, I found it hard to describe the processes and things involved in capturing and processing a 'digital image' without using words that themselves are based on an intuitive understanding of what an image 'is'.


Take this statement: "a digital camera takes a picture".

It looks straightforward until you look closer and see that it is in fact quite subtle.

This is not just because searching for clear definitions can take us in a circle (for instance, Google defines a 'picture' as a 'visual representation', a 'representation' as an 'image' and an 'image' as a 'picture'). It is also that the language to reason about images is itself based on imagery. If I want to explain some aspect of an image processing problem then I might choose something to 'illustrate' the problem. Or I might ask you to try to 'see' the problem from a particular 'point of view'. These visual analogies are helpful because they derive from some strong intuitive understanding, but they can (a) confuse by using visual metaphors to describe concrete visual things, and (b) refer to understanding without actually explaining it.

It may not be important, but it might be. If I say something that seems straightforward: "a digital camera takes a picture" then we all know what a 'picture' is. Don't we? Go on, then, tell me - without saying it is a 'visual representation' or an 'image' or using any other visual metaphor that in fact is just saying that 'a picture is a picture'.

Take this further. What is the 'picture' that the camera takes? Perhaps we can refine this by extending the statement to say that the camera 'takes a picture OF something'. So, if I frame part of a view in the viewfinder, is that part of the view the 'picture'? Probably not - I might say that view would 'make a nice picture' but not talk of its being the picture (except in a recursive metaphor where I say the view IS a picture, using a metaphor of a picture to say something about the view that the picture would be a picture of). Perhaps the picture is the image projected by the camera lens onto a surface? (Notice that I am not yet sure what an 'image' is.) That could indeed be a picture but it would in fact not look very like one - the camera's CCD on which the image is projected is a field of bumpy microlenses, and the sensors are designed to absorb light, not reflect it, so I would probably not see much, and what I could see would be a horrid bumpy thing. So is the picture the pattern of light that would have been projected if the CCD array had instead been a little screen? Or is the picture the stored result in memory of reading the CCD array? The stored picture would then be very different from the 'real' projected picture, not least because it has been sampled, distorted and polluted by noise. And I can't see the CCD array contents, and something I can't see does not feel like it would be a picture. Perhaps the picture is the result of taking the stored array and displaying it? Personally I feel this is getting close - a picture is intrinsically something that is looked at (unless you are into conceptual modern art). But now the display medium becomes important. The picture will look different if it is displayed on an LCD panel, or on a CRT monitor, or is printed on paper (photos may look good on screen but awful when printed).

I think the way through this is to see that a 'picture' is inextricably linked with its being displayed on a surface: so any reasoning about it should take into account the process of rendering it onto that surface. Another word might be used to mean 'the physical pattern of light focussed by the lens onto the sensors' - and this might be the 'image'. When digitized this could be the 'digital image'. The image is of the part of the scene that is focussed onto the sensor by the camera lens. This is supposedly the same as what is seen through the frame of the viewfinder, so we could use the word 'frame' to suggest the physically real part of the scene whose image is projected ('frame' being commonly used to describe the composition of a photograph - and of course also as in a 'picture frame'). Which leaves the 'scene' to be the overall world view that the camera's human user experiences.

Although this discussion concerns the interpretation and meaning of words, I think it also leads to the idea that we cannot reason effectively about just some component parts of a digital camera - but instead must start off with a system-level approach where we recognise and clarify the essential inter-relationships. Maybe the simple statement about the camera taking a picture needs to be written more pedantically (and restrictively) as:

"a digital camera records a map of the light that is projected by the lens as an image of the part of a scene that is framed by the viewfinder, with the intention that this map will be rendered onto some medium to be looked at by a person".

Then we can start to break this down into its component stages and reason about each stage in some limited isolation.
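As a rough illustration of that breakdown, the vocabulary above (scene, frame, image, digital map, picture) can be written as a chain of small stages. This is only a sketch under stated assumptions: the function names, the ideal lens, the Gaussian noise model, the 256 quantization levels and the power-law display response are all hypothetical choices made for the example, not a description of any real camera.

```python
import random


def frame_scene(scene, top, left, height, width):
    """Frame: the part of the scene selected by the viewfinder."""
    return [row[left:left + width] for row in scene[top:top + height]]


def project_image(frame):
    """Image: the light pattern the lens puts on the sensor.
    Assumption: an ideal lens, so the image is just a copy of the frame."""
    return [row[:] for row in frame]


def record_map(image, levels=256, noise=0.0, rng=None):
    """Digital image: add noise, clamp to [0, 1], quantize to integers.
    Assumption: Gaussian noise with standard deviation `noise`."""
    rng = rng or random.Random(0)
    recorded = []
    for row in image:
        recorded.append([
            round(min(1.0, max(0.0, v + rng.gauss(0.0, noise))) * (levels - 1))
            for v in row
        ])
    return recorded


def render_picture(digital_map, levels=256, response=2.2):
    """Picture: the map as shown on some medium.
    Assumption: the medium has a simple power-law response."""
    return [[(v / (levels - 1)) ** response for v in row] for row in digital_map]


def take_picture(scene, top, left, height, width):
    """Chain the stages: scene -> frame -> image -> map -> picture."""
    frame = frame_scene(scene, top, left, height, width)
    image = project_image(frame)
    digital_map = record_map(image)
    return render_picture(digital_map)
```

Changing only `render_picture` (for example its `response` value) changes the picture without touching the recorded map, which is the distinction between 'digital image' and 'picture' argued for above.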
