VisionX files use a tagged type system. That is, each component of the file consists of a tag, which contains a type and a length, followed by data of the specified length. The philosophy of this organization (used in many image file formats) is that a large number of diverse data types can easily be accommodated. Many applications modify just those components that they are designed to operate on and ignore any others. In this way, the file structure is easily extended to add new component types. In addition, some higher-level structures require multiple tags. For example, an image consists of a bounding box component followed by a pixel data component; the bounding box provides the structure information while the pixel data provides the image contents.
An advantage of the above design is that new features can be easily added to the system without modifying all the commands (of which VisionX has over 200) to accommodate them. A disadvantage is that not all commands treat new data structures in the expected manner.
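The tag-then-data organization can be illustrated with a minimal sketch of a reader that skips components it does not understand. The byte layout used here (two little-endian 32-bit integers for type and length) and the names iter_components and pack_component are assumptions made for illustration; the actual VisionX on-disk encoding is not described in this text.

```python
import struct
from typing import Iterator, Tuple

# HYPOTHETICAL layout, for illustration only: a tag is two little-endian
# unsigned 32-bit integers (type, length) followed by `length` data bytes.
TAG = struct.Struct("<II")


def pack_component(ctype: int, data: bytes) -> bytes:
    """Build one tagged component: tag (type, length) followed by the data."""
    return TAG.pack(ctype, len(data)) + data


def iter_components(buf: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (type, data) for each tagged component found in buf."""
    pos = 0
    while pos < len(buf):
        if pos + TAG.size > len(buf):
            raise ValueError("truncated tag")
        ctype, length = TAG.unpack_from(buf, pos)
        pos += TAG.size
        if pos + length > len(buf):
            raise ValueError("truncated data")
        yield ctype, buf[pos:pos + length]
        pos += length  # the length field lets a reader step over any type


# A reader that only understands type 1 can ignore the unknown type 99,
# because the tag's length says how many bytes to step over.
stream = (pack_component(1, b"abc")
          + pack_component(99, b"unknown")
          + pack_component(1, b"de"))
known = [data for ctype, data in iter_components(stream) if ctype == 1]
```

Because every tag carries a length, a new component type can be introduced without changing readers that do not use it.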
There are several VisionX commands for viewing the components of a VisionX file. Every file has a title component and a cumulative history component; these can be viewed with the vls command or from the View menu of vxm. A summary of the components in a file can be obtained with the vps command (with a t option) or with the View menu of xvm. The contents of a file with each component listed can be viewed with the vpr command.
The main component organizations currently used in VisionX are outlined below.
The 2D image is the fundamental building block of vision systems. In VisionX it consists of two components: a bounding box and pixel data. The bounding box specifies the index range of the image, and the pixel data tag specifies the base type and the size of the data. Multicomponent pixels (e.g., color pixels) are indicated by the pixel data length being a multiple of the size specified by the bounding box. Originally the bounding box specified four elements (xlow, xhigh, ylow and yhigh); more recently it often has six elements, with the addition of zlow and zhigh. For 2D images these last two values are usually set to zero. For color index images the image structure is usually preceded by a color lookup table component.
Programming tools are available for treating 2D images like 2D arrays.
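The multicomponent rule above can be sketched as a short calculation: the number of components per pixel is the pixel data length divided by the size implied by the bounding box. This sketch assumes that the extent along each axis is high minus low, by analogy with the z ranges used for 3D images; that convention, the BBox class, and the channels_per_pixel function are illustrative assumptions and not part of the VisionX programming tools.

```python
from dataclasses import dataclass


@dataclass
class BBox:
    """Bounding box with the six elements named in the text."""
    xlow: int
    xhigh: int
    ylow: int
    yhigh: int
    zlow: int = 0   # usually zero for 2D images
    zhigh: int = 0  # usually zero for 2D images


def channels_per_pixel(bbox: BBox, base_size: int, data_length: int) -> int:
    """Return how many base-type values each pixel holds.

    ASSUMPTION: the image is (xhigh - xlow) by (yhigh - ylow) pixels.
    base_size is the size in bytes of the base type; data_length is the
    length in bytes of the pixel data component.
    """
    npix = (bbox.xhigh - bbox.xlow) * (bbox.yhigh - bbox.ylow)
    single = npix * base_size  # bytes needed for one value per pixel
    if single <= 0 or data_length % single != 0:
        raise ValueError("pixel data length does not match bounding box")
    return data_length // single


box = BBox(xlow=0, xhigh=4, ylow=0, yhigh=3)
grey = channels_per_pixel(box, base_size=1, data_length=12)   # one value per pixel
color = channels_per_pixel(box, base_size=1, data_length=36)  # three values per pixel
```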
The 3D image is a relatively recent feature of VisionX. In VisionX files it is realized by a set of 2D images in which the z-specification of the bounding box is consistent. For example, a 3D image with three voxels in the z-direction would consist of three 2D images that have the same x and y bounding box values. In addition, if the bounding box of the first image has zlow = 0 and zhigh = 1 (indicating a 0-1 range in z), then the box for the second image must have zlow = 1 and zhigh = 2, and the box for the third image must have zlow = 2 and zhigh = 3. Most VisionX commands designed to operate primarily on 3D images have a v3 prefix in their command name. Note that many older VisionX commands will simply treat 3D images as a set of 2D images and will frequently perform the correct function without requiring any modification. Note also that sets of 2D images that do not conform to the 3D convention are still valid VisionX data files but will not be treated as 3D entities.
Programming tools are available for treating 3D images like 3D arrays.
A file may be organized in frames. A frame consists of a start frame component, the frame contents components, and an end frame component. There are many cases when a whole image file is not to be read into memory in one step (when the file is a movie, for example); the frame provides a mechanism for "chunking" a file so that a file read operation may read just one frame at a time as a unit. The end frame component prevents read-ahead into the next frame when this capability is required. A temporal sequence of images (movie) is usually represented by a set of 2D images with similar x and y specifications in separate frames. Prior to the introduction of the 3D convention above, 3D images were also represented by this structure. In fact it is still appropriate to store large 3D image sets in this format.
There are two fundamental programming tools for reading data files: the first reads the whole file in a single operation and the second reads a file one frame at a time. Most commands use the latter form, as this enables them to process files of arbitrarily large size. Programming tools are available for processing a moving window of frames.
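Frame-at-a-time reading can be sketched as a generator that consumes components lazily and stops at each end frame component, so nothing from the next frame is read until it is requested. The component representation (a kind string plus data) and the names FRAME_START, FRAME_END and iter_frames are assumptions for illustration; they are not the VisionX reading routines.

```python
from typing import Iterable, Iterator, List, Tuple

# HYPOTHETICAL component kinds used only in this sketch.
FRAME_START = "frame_start"
FRAME_END = "frame_end"
Component = Tuple[str, bytes]


def iter_frames(components: Iterable[Component]) -> Iterator[List[Component]]:
    """Yield the contents of one frame at a time.

    Components outside any frame are ignored in this sketch, and a frame
    left unclosed at the end of input is not yielded.
    """
    frame = None
    for comp in components:
        kind = comp[0]
        if kind == FRAME_START:
            frame = []
        elif kind == FRAME_END:
            if frame is None:
                raise ValueError("end frame without start frame")
            yield frame  # pause here: no read-ahead into the next frame
            frame = None
        elif frame is not None:
            frame.append(comp)


# A two-frame "movie": each frame holds one 2D image (bounding box + pixels).
stream = [
    (FRAME_START, b""), ("bbox", b"..."), ("pixels", b"..."), (FRAME_END, b""),
    (FRAME_START, b""), ("bbox", b"..."), ("pixels", b"..."), (FRAME_END, b""),
]
frames = list(iter_frames(stream))
```

Because the generator pauses at each end frame component, memory use depends on the size of one frame rather than the size of the whole file.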
A 4D image in VisionX is represented by a framed sequence of 3D images. That is, each frame contains one 3D image. In the future, an extended version of the 3D image structure could be used for 4D images; however, while such a file can be created now, there are currently no programming tools to directly support this data structure.
Commands are available for reorganizing the dimensions of 4D images so that they can be processed with standard 2D and 3D commands.
Objects are collections or groups of components. Objects are delineated by the object component. Components between two object components are considered to comprise a single object. Object groupings are nested within frames and do not cross frame boundaries.
Objects are very useful for grouping, say, a set of polygons into a single entity. Other attributes (such as color) may then be included in the group.
3D graphics in VisionX are based on files containing a set of 3D polygons (preceded by a single 3D bounding box). A set of polygons may be grouped into a single "object" using the object component mentioned above, such that a file may contain a collection of "objects". In addition, a polygon or set of polygons may be preceded by other attributes such as a face color and a boundary color. An important feature of the VisionX system is the matching of coordinate systems of both images and polygons, which makes possible the mixed rendering of both image and polygon surface data.
Programming tools are available for the rendering of polygon files that contain just 3- and 4-sided polygons. A number of utility commands are available for manipulating 3D polygon files in the above format.
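Object grouping can be sketched as splitting a component sequence at each object component. In this sketch every object component starts a new group that runs until the next object component or the end of the input, and components before the first object component are kept separately; that treatment of the first and last groups, along with the names OBJECT and group_objects, is an assumption made for illustration.

```python
from typing import Iterable, List, Tuple

OBJECT = "object"  # HYPOTHETICAL kind name for the object component
Component = Tuple[str, bytes]


def group_objects(
    components: Iterable[Component],
) -> Tuple[List[Component], List[List[Component]]]:
    """Split components into (ungrouped, objects).

    ASSUMPTION: each object component opens a new group; the group ends
    at the next object component or at the end of the input.
    """
    ungrouped: List[Component] = []
    objects: List[List[Component]] = []
    current = None
    for comp in components:
        if comp[0] == OBJECT:
            current = []
            objects.append(current)
        elif current is None:
            ungrouped.append(comp)
        else:
            current.append(comp)
    return ungrouped, objects


# One frame's contents: a bounding box, then two objects. The first object
# carries a color attribute along with its polygons.
contents = [
    ("bbox", b""),
    (OBJECT, b""), ("color", b""), ("polygon", b""), ("polygon", b""),
    (OBJECT, b""), ("polygon", b""),
]
ungrouped, objects = group_objects(contents)
```

Since object groupings do not cross frame boundaries, a function like this would be applied to the contents of one frame at a time.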
Two 3D image structures are outlined above: 3D images and image sequences. At this time, some commands will operate on only one of these formats, not both. However, the vdim command will convert between these formats. In general, it is possible to add a vdim pipe between two incompatible commands.