12 Digital Image Processing and Automated Image Analysis in Echocardiography
Digital Image Processing and Automated Analysis: Definition and Motivation
In clinical practice, quantitative measurements are still underemployed: visual estimation of parameters such as ejection fraction, or semiquantitative classification (e.g., wall motion scoring for stress echocardiography), still plays a dominant role. Such eyeballing is fast and requires little effort, and some experts achieve admirable accuracy. In general, however, it is inaccurate, irreproducible, subjective, and hard to learn.1 Visual estimation of quantifiable measures should be discouraged for any purpose beyond rough classification whenever a quantitative alternative is available. Quantitative analysis is advisable when interpretations are performed repetitively, when more subtle differences are sought, when interpretation experience is limited, and whenever scientific research is the goal.
Procedures such as stress echocardiography that currently rely on visual scoring of wall motion could benefit enormously from automated analysis; the lack of quantification and the large interobserver, intraobserver, and interinstitution variabilities2 are perceived as important limitations. The potential of real-time 3D echocardiography for volume estimation and regional wall motion quantification also depends largely on automated analysis.
Digital Image Storage, Communication, and Compression
Digital Images
Inside any modern ultrasound system, images are created as digital images. The generation of the ultrasound images (ultrasound physics, signal processing, and instrumentation) is beyond the scope of this chapter; excellent descriptions can be found in many handbooks.3–5 Digital images consist of pixels whose brightness or color is represented by a numeric (digital) value. Brightness level is also referred to as intensity or gray value. A cineloop or movie is a sequence of such images, typically at a frame rate of 20 to 200 images per second. The digital representation makes it possible to store and process images in a computer; hence the term digital image processing. Modern ultrasound systems are fully digital and support the storage and communication of digital images and cineloops. For display on a monitor and recording on VCR, these digital images are converted into an analog video signal. The use of analog video output or VCR tape should be strongly discouraged for analysis purposes. Although it is possible to redigitize the analog video with devices such as frame grabbers, this results in severe, irreversible loss of information and image quality: spatial resolution, frame rate, and intensity accuracy are degraded; the separation among image, graphics, and color overlays is lost; and calibration and patient information disappear.
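To make the pixel representation concrete, the following minimal sketch (in Python with NumPy; the chapter prescribes no particular software, so this choice is an assumption) builds a grayscale image as a two-dimensional array of brightness values and a cineloop as a stack of such images.

    import numpy as np

    # A digital grayscale image: a 2D array of pixels, each holding a
    # numeric brightness (gray) value; 8-bit storage gives 256 levels (0-255).
    rows, cols = 480, 640
    image = np.zeros((rows, cols), dtype=np.uint8)
    image[200:280, 300:340] = 180          # a brighter rectangular region

    # A cineloop: a time sequence of such images. At 50 images per second
    # (within the 20 to 200 range mentioned above), consecutive frames
    # are 20 ms apart.
    frame_rate = 50.0
    cineloop = np.zeros((50, rows, cols), dtype=np.uint8)
    print(f"frame interval: {1000.0 / frame_rate:.1f} ms")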
Storage Formats and Image Communication
DICOM
The method of choice for digital image storage and exchange is DICOM (Digital Imaging and Communications in Medicine6). DICOM is a generally accepted international standard for medical images of all modalities, including all types of ultrasound imaging. The DICOM standard (current version: 2009) is still being extended and improved to better support new developments in medical imaging. As its name implies, DICOM is a communication standard rather than a file format: it defines how medical imaging devices such as ultrasound systems, Picture Archiving and Communication Systems (PACS) servers, and printers communicate to transport, store, retrieve, find, or print images and associated patient information. All major manufacturers have committed themselves to supporting DICOM. Ultimately, this should lead to the integrated electronic patient record, which contains the full patient file, including patient history, laboratory reports, images of all modalities, and other information. As a result of this broad scope, DICOM is a very complicated standard: the full description covers several thousand pages.6 A very readable explanation of DICOM for echocardiographers is given by Thomas.7
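As an illustration, the open-source pydicom library can read a DICOM object exported by an ultrasound system. This is a minimal sketch rather than a complete DICOM workflow, and the file name is hypothetical.

    import pydicom

    # A DICOM object bundles the image data with patient and study
    # information in a single file (the file name here is hypothetical).
    ds = pydicom.dcmread("echo_study.dcm")

    print(ds.PatientName)                  # patient information
    print(ds.Modality)                     # "US" for ultrasound
    print(ds.Rows, ds.Columns)             # image dimensions in pixels
    n_frames = int(getattr(ds, "NumberOfFrames", 1))  # >1 for a cineloop
    pixels = ds.pixel_array                # pixel data as a NumPy array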
Image Compression
To reduce data storage requirements, image compression can be employed. Lossless compression techniques (such as run-length encoding [RLE] and lossless JPEG) can reduce file sizes by a factor of 2 to 5, and decompression produces a perfect copy of the original image. Lossy compression achieves much higher compression ratios (up to 20 to 100) by eliminating information to which the eye is least sensitive, at the cost of some irreversible image degradation. This degradation is visually acceptable (JPEG compression by a factor of 20 has been found to produce no diagnostically significant degradation8) and is marginal compared with the degradation associated with VCR storage. However, the compression artifacts may certainly influence digital image processing and analysis. Severely lossy compression is not advisable for archiving or when digital image postprocessing is foreseen. Lossy compression techniques include lossy JPEG, fractal and wavelet compression, and MPEG. DICOM currently6 supports the RLE, JPEG (lossless and lossy), JPEG2000, and MPEG2 compression schemes.
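The principle behind lossless compression is easy to demonstrate with run-length encoding. The toy sketch below illustrates the idea only; it is not the RLE codec defined by the DICOM standard.

    def rle_encode(values):
        """Collapse a pixel sequence into (value, run length) pairs."""
        runs = []
        for v in values:
            if runs and runs[-1][0] == v:
                runs[-1][1] += 1           # extend the current run
            else:
                runs.append([v, 1])        # start a new run
        return runs

    def rle_decode(runs):
        """Expand the pairs again; decoding is lossless by construction."""
        out = []
        for v, n in runs:
            out.extend([v] * n)
        return out

    row = [0, 0, 0, 180, 180, 180, 180, 0, 0]  # one image row
    assert rle_decode(rle_encode(row)) == row  # perfect reconstruction

Long runs of identical values compress well; the speckle-rich content of ultrasound images contains few such runs, which is one reason lossless gains remain in the modest range of 2 to 5 mentioned above.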
Medical Image Processing
Medical image processing is a thriving subdiscipline of digital image processing.9,10 Several good handbooks on medical image processing, with special attention to ultrasound, are available.11,12
Image Enhancement: Level Manipulations, Filtering
Brightness level manipulations include all one-to-one conversions of image brightness levels (input) to display brightness levels (output), either linear or nonlinear. Examples are digital contrast/brightness adjustments, image inversion, and gamma correction. Some examples are given in Figure 12-1. Note that many level manipulations may result in clipping (see Fig. 12-1, C, D, and E) and in a reduction of the number of brightness levels effectively used. The extreme example is thresholding (see Fig. 12-1, E), in which all brightness levels above a threshold are set to white and all below to black.
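The following minimal NumPy sketch (an illustration, not any vendor's implementation) expresses four such one-to-one mappings for an 8-bit image: linear contrast/brightness adjustment with clipping, inversion, gamma correction, and thresholding.

    import numpy as np

    def adjust(img, gain, offset):
        # Linear contrast/brightness mapping; values outside 0-255 are
        # clipped, reducing the number of brightness levels effectively used.
        out = gain * img.astype(np.float32) + offset
        return np.clip(out, 0, 255).astype(np.uint8)

    def invert(img):
        return 255 - img                   # image inversion

    def gamma_correct(img, gamma):
        # Nonlinear one-to-one mapping of input to display brightness.
        return (255.0 * (img / 255.0) ** gamma).astype(np.uint8)

    def threshold(img, t):
        # The extreme case: all levels above t become white, the rest black.
        return np.where(img > t, 255, 0).astype(np.uint8)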
Pseudocoloring involves a direct conversion of brightness levels to a color scale, generally labeled with names such as Rainbow, Ocean, or Harvest. Because the eye is more sensitive to color differences than to intensity differences, this may reveal subtle intensity differences. It can be visually pleasing but may also be highly suggestive, as it clusters similar gray values into color groups. Because brightness levels in ultrasound are highly dependent on signal attenuation and local gain settings, the borders suggested visually by these colors have little practical significance.13 Pseudocoloring is also sometimes applied to highlight, with color, brightness differences with respect to some baseline value (e.g., an increase above the local brightness level in a baseline image), for example to visualize the arrival of contrast agents in perfusion imaging. This is an effective tool, but one should be aware that tissue motion may also induce a brightness change and show up as color.
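Pseudocoloring amounts to passing each gray value through a color lookup table. The sketch below uses an invented rainbow-like ramp; the vendor color scales named above are proprietary, so this particular mapping is purely illustrative.

    import numpy as np

    # A 256-entry lookup table (LUT) assigns an RGB color to each gray value.
    g = np.arange(256, dtype=np.float32) / 255.0
    lut = np.stack([
        np.clip(2 * g - 1, 0, 1),          # red grows in the bright half
        1 - np.abs(2 * g - 1),             # green peaks at mid-gray
        np.clip(1 - 2 * g, 0, 1),          # blue dominates the dark half
    ], axis=1)

    def pseudocolor(img):
        """Map an 8-bit grayscale image to RGB through the lookup table."""
        return (255 * lut[img]).astype(np.uint8)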
Image Interpretation: The Interpretation Pyramid
The interpretation of medical images is an extremely complicated task that is very hard to transfer to a computer. For us humans, vision is a natural task that we perform instantly and automatically. From the study of human perception, however, we know that vision is anything but a simple, straightforward process. Think of the many well-known optical illusions: there is a lot of hidden interpretation going on. In the interpretation of images, several levels of information abstraction can be distinguished. This is generally known as the image interpretation pyramid (Fig. 12-2). The levels of this pyramid give us more insight into the mechanisms of different automated techniques and their limitations. A good analogy is found in the interpretation of handwriting or spoken language; this analogy is described in Table 12-1. For interpretation of a written text, one has to know about the alphabet, spelling, vocabulary, syntax, and semantics, and ultimately about the subject of the text, the intentions of the source, and adornments such as humor, sarcasm, and metaphor. These last aspects are not about language; they refer to the real-world domain that the text is discussing. Interpretation is not a simple bottom-up process of combining letters into words into sentences into significance. Text can be fragmented, and there are imperfections such as misspellings and ambiguities, as well as missing domain knowledge, that necessitate feedback among all levels, and even guessing, to arrive at a consistent interpretation.
Cardiac Image Interpretation
Automated methods for cardiac image interpretation cannot yet emulate the higher levels of this interpretation pyramid. In practice, they cope with this complexity in three ways:
1. They use simplifying assumptions regarding the objects. For example, the LV is considered to be a dark, round object in the middle of the image; the endocardial contour is convex; the endocardium is the strongest edge in the image; and the cardiac wall will not move more than x pixels per frame. Most such assumptions hold only to some extent or are overly general (a deliberately naive sketch of such assumption-driven detection follows this list).
2. They limit themselves to a subset of the problem domain, such as certain standard views (e.g., only short-axis at midpapillary level), image quality (no dropouts, low noise), anatomy (e.g., no congenital defects), or imaging equipment type or settings (scale, gain, frequency).
3. They require the user to handle the high-level aspects by initializing, guiding, or correcting the system.
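As announced in item 1, the following deliberately naive sketch shows how such simplifying assumptions translate into a detection procedure: starting from a user-supplied cavity center, it searches along each radial ray for the strongest dark-to-bright transition and labels that point the endocardial border. It represents no published ABD algorithm, and it fails exactly where its assumptions fail (dropouts, artifacts, a noncentral or noncircular cavity).

    import numpy as np

    def radial_edge_search(img, center, n_rays=64, max_radius=100):
        """Naive border search: assume a dark cavity around a known center
        and take the strongest dark-to-bright step on each ray as the
        endocardial edge."""
        cy, cx = center
        contour = []
        for k in range(n_rays):
            angle = 2.0 * np.pi * k / n_rays
            radii = np.arange(1, max_radius)
            ys = np.clip((cy + radii * np.sin(angle)).astype(int),
                         0, img.shape[0] - 1)
            xs = np.clip((cx + radii * np.cos(angle)).astype(int),
                         0, img.shape[1] - 1)
            profile = img[ys, xs].astype(np.float32)
            r = int(np.argmax(np.diff(profile)))    # largest brightness step
            contour.append((ys[r + 1], xs[r + 1]))  # (row, column) of edge
        return contour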
Rules for a Well-Behaved Automated Border Detection Method
1. The method should generate “correct” contours. Because this judgment may be subjective (in the light of multiple possible interpretations), a system should preferably be able to adapt to the expert user’s general ideas about correct contours.
2. The contours should be reproducible. This seems obvious for an automatic system, but almost all systems require some type of user interaction (parameter choices, indication of a starting point or region, corrections), which will lead to some variability in results. This interobserver and intraobserver variability should preferably be smaller than the interobserver and intraobserver variabilities associated with similar manual work.
3. The method should be user friendly. It should engage the user only for high-level expert decisions, not for handling “stupid” mistakes or for making repetitive corrections.
Automated Border Detection in Echocardiography
Problems and Pitfalls of Border Detection in Echography
1. Pixel intensity does not directly reflect any physical property of the tissue visualized, in contrast to the Lambert-Beer law for radiography or the Hounsfield units for computed tomography. In ultrasound, images are formed by sound reflection and scattering, resulting in a combination of interference patterns (ultrasound speckle patterns) and reflections at tissue transitions or inhomogeneities. Different tissues are often distinguishable only by subtle differences in texture (speckle patterns) or by the coherent behavior of this texture over time, rather than by different intensity values.
2. Ultrasonographic image information is very anisotropic and position-dependent. Reflection intensity, lateral and radial point-spread functions, and signal-to-noise ratio are strongly dependent on both the depth and the angle of incidence of the ultrasound beam, as well as on the user-controlled time gain compensation settings.
3. Image disturbances (artifacts) are caused by factors such as side lobes, reverberations, clutter, and aberration. Many of these problems are especially prominent with high gain settings, which are often necessary in obese or older patients.
4. Parts of the anatomy are not imaged, because of dropouts (for structures parallel to the ultrasound beam), shadowing (behind acoustically impermeable structures such as bone or lung), scan sector limitations, and limited echocardiographic windows. Still-frame images generally miss some information; the human eye compensates for this when viewing a sequence of images. It resolves ambiguities and interpolates the missing parts by exploiting the temporal coherence of structures and speckle, which allows discrimination among noise, artifacts, and anatomy.
5. In specific cases (especially in 3D imaging) the limited temporal resolution and the scanning process may introduce artifacts. The sequential scanning of lines combines information from different moments into one image. For quickly moving structures, this may lead to spatial distortion. Sharp transitions between “older” and “newer” image parts may appear. This is particularly prominent in real-time 3D ultrasound, where information from different heartbeats is stitched together to include a complete object (such as the LV).
6. In 2D ultrasound, the exact spatial localization of the cross-sectional plane is generally not known. This is in contrast to 3D techniques (3D ultrasound, magnetic resonance imaging, or computed tomography), where the 3D context is known, and this information is often employed in model positioning for detection. In 2D cardiac ultrasound, the choice of the imaged cross section depends both on the skill and precision of the sonographer and on the available echocardiographic window, which is limited by ribs, lungs, and so on. Apart from volume measurement errors, this may also result in detection problems if the ABD method relies on assumptions about shape and the presence or absence of other structures such as valves or papillary muscles.
Practical Considerations for Automated Border Detection
Practical considerations for appropriate border detection (either automatic or manual) are listed in Box 12-1, subdivided into three categories. A few explanations follow.
Box 12-1 Practical Considerations for Automated Border Detection