Tutorial on Local image operations and Convolution

This tutorial covers three subtopics on local image operations. In general, many standard operations may be realized with Python package functions (such as those in scipy). We also cover the case of explicit implementation, both for exposition and for the development and validation of new local functions.

  1. The implementation of local operations such as convolution
  2. The management of byte pixel types
  3. Boundary padding to maintain image dimensions

1. Image Convolution

Image processing and computer vision operations may frequently be implemented with available Python packages. A simple example of image convolution is shown in the following.

The basic VX image class may be created either from a v4 file or explicitly as shown above. The structure element .i is the image data in a numpy array. The object im also contains the image metadata.

1.1 Convolution with scipy
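
As a minimal sketch of library-based convolution, the following uses scipy.ndimage.convolve on a small hand-made array. The array values, the 3x3 averaging kernel, and the choice of zero padding (mode='constant') are illustrative assumptions, not the tutorial's original example.

```python
import numpy as np
from scipy import ndimage

# Small illustrative image (assumed values), held as float to avoid byte overflow.
img = np.array([[10, 20, 30],
                [40, 50, 60],
                [70, 80, 90]], dtype=float)

# 3x3 averaging kernel whose elements sum to one.
kernel = np.full((3, 3), 1.0 / 9.0)

# mode='constant' with cval=0.0 treats pixels outside the image as zero,
# so the output has the same dimensions as the input.
result = ndimage.convolve(img, kernel, mode='constant', cval=0.0)

print(result.shape)
print(result)
```

The border pixels are averaged with the assumed zeros outside the image, so they come out darker than the interior; other values of mode (for example 'nearest' or 'reflect') handle the boundary differently.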

1.2 Explicit Convolution with padding

An explicit (but less efficient) implementation of convolution may be achieved with the vx.Vx embedim function. It is very common to make the output of a local image operation have the same dimensions as the input image; padding is a convenient mechanism to achieve this for explicit, pixel-level programs.

Consider below the explicit implementation of convolution.
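
A minimal sketch of an explicit, loop-based convolution with padding follows. It assumes numpy's np.pad as a stand-in for the embedim function mentioned above (whose exact signature is not shown here); the function name convolve_explicit and the test arrays are hypothetical.

```python
import numpy as np
from scipy import ndimage

def convolve_explicit(img, kernel):
    """Explicit 2-D convolution with zero padding (hypothetical helper).

    Assumes an odd-sized kernel; np.pad stands in for embedim.
    """
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    # Pad with zeros so the output keeps the input dimensions.
    padded = np.pad(img.astype(float), ((ph, ph), (pw, pw)), mode='constant')
    # Convolution (unlike correlation) flips the kernel in both axes.
    flipped = kernel[::-1, ::-1]
    out = np.zeros(img.shape, dtype=float)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            # Pixel (y, x) of img is the center of this window in padded.
            window = padded[y:y + kh, x:x + kw]
            out[y, x] = np.sum(window * flipped)
    return out

img = np.arange(20, dtype=float).reshape(4, 5)
kernel = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]], dtype=float)

explicit = convolve_explicit(img, kernel)
library = ndimage.convolve(img, kernel, mode='constant', cval=0.0)
print(np.allclose(explicit, library))
```

The asymmetric kernel is chosen deliberately: with a symmetric kernel, forgetting the flip would go unnoticed.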

Note that the outcome is identical to that of the scipy library function.

2. Byte pixel data format and scaling

An issue with practical images is that the pixels are typically in byte format, which is adequate for viewing images with the human visual system (HVS). There are several details that need to be addressed when performing computations on byte-valued pixels. The strategy for dealing with this low-dynamic-range pixel format may depend upon the application.

For recent deep learning systems, a common approach is to convert the pixels to float format and to scale their values from 0.0 to 1.0. For run-time systems, special hardware and data formats may be used to improve efficiency.
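
The float conversion described above can be sketched as follows. The array values are illustrative assumptions, and dividing by 255 is one common convention for mapping byte pixels to the range 0.0 to 1.0.

```python
import numpy as np

# Illustrative byte image (assumed values).
img_u8 = np.array([[0, 64],
                   [128, 255]], dtype=np.uint8)

# Convert to float and scale so that 0 maps to 0.0 and 255 maps to 1.0.
img_f = img_u8.astype(np.float32) / 255.0

# To produce a byte image again: rescale, round, clip, and cast.
restored = np.clip(np.rint(img_f * 255.0), 0, 255).astype(np.uint8)

print(img_f.min(), img_f.max())
print(np.array_equal(restored, img_u8))
```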

For many applications, arbitrary scaling of input pixels may be a problem, especially when the desired output is also a byte image for human review. Below we give examples of two approaches when using byte pixel data.


2.1

Consider the following modified image:

We see that the result is incorrect because of numerical overflow: the array dtype 'uint8' cannot represent any value greater than 255 or less than 0. Note that Python does not provide any error message to inform you that an overflow occurred. Since the kernel is (by default) set to type float, one way to achieve a valid output is to scale the kernel. Alternatively, one could change the array dtype to “int16” (or “float32”); however, in that case, it would be necessary to scale the result to a byte again in order to create a conventional image file for human viewing.
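
The overflow behavior and the two remedies can be illustrated with a small sketch on plain numpy arrays. The pixel values are assumptions chosen so that one sum exceeds 255; this is not the tutorial's original image.

```python
import numpy as np

# Two rows of byte pixels (assumed values).
a = np.array([200, 100, 50], dtype=np.uint8)
b = np.array([100, 100, 50], dtype=np.uint8)

# uint8 + uint8 stays uint8: 200 + 100 = 300 wraps to 300 - 256 = 44,
# and no error or warning is raised for array arithmetic.
wrapped = a + b

# Remedy 1: widen the dtype before computing. The result no longer
# fits a byte, so it must be rescaled before saving as a byte image.
wide = a.astype(np.int16) + b.astype(np.int16)

# Remedy 2: scale the weights so they sum to one (here 0.5 + 0.5),
# which keeps the result inside the 0..255 range.
averaged = 0.5 * a + 0.5 * b
as_byte = np.clip(np.rint(averaged), 0, 255).astype(np.uint8)

print(wrapped)   # first element has wrapped around
print(wide)
print(as_byte)
```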

In general, it is common practice to scale (positive-valued) convolution kernels so that they sum to one; each output pixel is then a weighted average of input pixels, so no overflow is possible. The situation is more complex if any kernel elements have negative values.
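
As a hedged sketch of this normalization, the following divides a positive-valued kernel by its sum and applies it to a worst-case all-white image. The kernel, the image, and the boundary mode 'nearest' are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

# Positive-valued kernel (assumed), normalized so its elements sum to one.
kernel = np.array([[0, 1, 0],
                   [1, 1, 1],
                   [0, 1, 0]], dtype=float)
kernel_norm = kernel / kernel.sum()

# Worst case for overflow: every pixel at the maximum byte value.
img = np.full((4, 4), 255, dtype=np.uint8)

# Compute in float, then round, clip, and cast back to a byte image.
out = ndimage.convolve(img.astype(float), kernel_norm, mode='nearest')
out_byte = np.clip(np.rint(out), 0, 255).astype(np.uint8)

print(kernel_norm.sum())
print(out.max())
```

With a kernel containing negative weights, the output can fall below 0 or exceed 255 even when the weights sum to one, so an explicit offset or rescaling step would be needed before casting to a byte.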

3. Image padding for local operations and index offsets

Python requires that the “first” element of an array has index values of [0,0]. This means that a padded image has a “spatial” offset with respect to the initial image, which makes indexing a little messier, as is shown in the loop-expanded explicit example below.

3.1 Explicit neighborhood pixel indexing

Consider that we want a convolution of a pixel with its four-connected near neighbors. One way to do this is to use a kernel of

0 1 0
1 1 1 
0 1 0

Consider the explicit version of this convolution below. While the relative indices of the nearest-neighbor (NN) pixels are +/- one pixel from the center pixel location, offsets (of +1 in this case) need to be added to all ip pixel indices to accommodate the padding.
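
The index offsets can be sketched with a loop-expanded version of the four-connected kernel. This sketch assumes np.pad with a one-pixel zero border in place of the tutorial's padding function, reuses the name ip for the padded image, and uses a hypothetical function name and test image.

```python
import numpy as np

def four_neighbor_sum(img):
    """Sum of each pixel and its four-connected neighbors (hypothetical helper)."""
    rows, cols = img.shape
    # One-pixel zero border; int32 avoids uint8 overflow in the sum.
    ip = np.pad(img.astype(np.int32), 1, mode='constant', constant_values=0)
    out = np.zeros((rows, cols), dtype=np.int32)
    for y in range(rows):
        for x in range(cols):
            # Pixel (y, x) of img sits at (y + 1, x + 1) in ip,
            # so every relative index gets an extra +1.
            out[y, x] = (ip[y + 1, x + 1]      # center:  ( 0,  0) + 1
                         + ip[y, x + 1]        # above:   (-1,  0) + 1
                         + ip[y + 2, x + 1]    # below:   (+1,  0) + 1
                         + ip[y + 1, x]        # left:    ( 0, -1) + 1
                         + ip[y + 1, x + 2])   # right:   ( 0, +1) + 1
    return out

img = np.arange(1, 10, dtype=np.uint8).reshape(3, 3)
result = four_neighbor_sum(img)
print(result)
```

Because the kernel is symmetric, this sum is the same whether it is read as a convolution or a correlation.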

In summary: (1) good library packages exist for many standard image processing functions in Python. (2) When developing programs for images, be aware of pixel precision in your program design and remember that Python does not usually report overflow errors. (3) When developing or validating new algorithms with pixel indexing, be careful to get the index values correct when image padding is involved.