The limits of resolution

This paper challenges the notion that a fixed spatial resolution is applicable for digitising all items in a collection, irrespective of their characteristics. The fineness of detail in crafted objects depends on the granularity of the substrate, the diameter of the tool, and the visual contrast sensitivity of the craftsman. The ability of an imaging device to capture fine detail is characterised by its modulation transfer function (MTF). A technique based on the 1-D Fourier transform is introduced for determining the spatial frequency of the finest detail in an object. It is proposed that for capturing hand-crafted detail the image should resolve features of 40 μm diameter, requiring 50 samples per mm (nominally 1200 dpi).


INTRODUCTION
Many 'cook book' guidelines are available for digital capture of images in archives and collections. Typically these recommend scanning at 300 dots/inch (dpi), with images captured as 24-bit RGB files in a lossless format such as TIFF. A typical example is found in the specification published by a committee at Columbia University (Table 1; AcIS, 1997). To be fair, the document does go on to differentiate between the number of pixels captured per inch from the original material and DPI as the number of dots per inch on computer displays and printers. It emphasises that selection of the optimum resolution should start with a determination of the smallest meaningful element that must be legible in the end product: 'When dealing with textual materials, this determination is relatively easy: find the smallest letter, numeral, diacritic, or symbol that must be clearly distinguished. In printed books the smallest textual element is often the superscript footnote numbers or letters with diacritics; with hand-written documents there is a great deal of variation. It is much more difficult to determine what the smallest meaningful element in a photograph or artwork is. In part it depends on who will use the scanned image and in what way. A non-specialist may look at a landscape photograph casually, while a geologist may need to be able to distinguish the stratigraphy of the cliff in the background.
… Color photographs should be tested at 300 dpi to see if it will suffice; higher resolution may be needed for photos with significant small details. Historical artefacts like papyri may require 600 dpi if extremely fine details of paper grain, etc. must be captured.' Thus the criterion for spatial resolution depends on the uses and users of images in the collection. But what if these are not known at the time of digitising? Subsequent users of the images may find that the digital image is not fit for purpose. For example, a manuscript might have been scanned at sufficient resolution to produce 'facsimile' printed reproductions at 300 dpi, but a paleographer who later needs to distinguish the fine details of serifs may find that 1000 pixels per inch are needed. In general it is a mistake to adopt a 'one size fits all' approach, and to choose fixed parameters.
The principle advocated in this paper is that archival image records should be captured at sufficient resolution to support all later uses, including (but not limited to) reading of text and interpretation, reproduction on display or in print, and analysis of material and technique. Given the time and effort required for the logistics of digitising a collection, it is desirable to capture as much information as possible from the original document at the time of digitisation, including spatial detail (points, edges, texture), the spectrum of colorants (substrate, inks, paints), and tonal range (lightest to darkest). Ideally the master digital image file should contain all of the hand-crafted information in the original, from which other subsidiary images can later be derived. The cost of extra memory for storing high-resolution images is small in comparison to the cost of repeating the digitisation process. What is not captured initially cannot easily be regained later!

DETAIL IN CRAFTED OBJECTS
A topic of particular interest for collections in museums and galleries concerns the limits of spatial resolution in artefacts produced by artists and craftsmen. A key question for any art medium is: 'What is the finest man-made detail in the surface?' The question is important because it sets an upper limit on the spatial resolution needed for digitisation in either 2D or 3D. Before opto-mechanical aids such as microscopes and machine tools were available, the fineness of rendered detail in crafted objects was limited by the point diameter of the hand tool, the granularity of the substrate, manual dexterity, and visual contrast sensitivity. Preliminary analysis of documents or objects in a collection can establish the optimal image capture parameters, particularly spatial resolution, spectral resolution and dynamic range. The preferred approach is first to determine the characteristics of a typical object, including its substrate and colorants, then to choose a capture device to encode them with sufficient precision. Because archival image records should support all later uses, including uses not known at the time of acquisition, it is highly desirable to capture as much information as possible from the original. The granularity of a substrate limits its ability to hold the strokes applied by the artist. In the case of stone or clay tablets, the limit can be approximated by the mean particle size, i.e. the diameter of one grain of the material, such as sediment or the lithified particles in clastic rock. Particle sizes are classified by the Wentworth scale, a modification of which is known as the Krumbein phi scale, defined logarithmically to the base 2:
$$\phi = -\log_2 \frac{D}{D_0}$$
where $D$ is the diameter of the particle, $D_0$ is the reference diameter equal to 1 mm, and $\phi$ is the scale parameter.
Table 2 shows the lower half of the Krumbein scale, applicable to incised tablets, suggesting that the finest detail rendered in a sandstone composed of very fine sand would be approximately 60 µm whereas for a mud tablet composed of river silt it could be as little as 4 µm.
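The relation between particle diameter and the phi scale can be checked numerically. The following is a minimal sketch, assuming the standard Krumbein definition φ = −log2(D/D0) with D0 = 1 mm; the function names are illustrative and not taken from the paper.

```python
import math

D0_MM = 1.0  # reference diameter of the Krumbein scale, in mm


def phi_from_diameter(d_mm: float) -> float:
    """Krumbein phi value for a particle diameter given in mm."""
    return -math.log2(d_mm / D0_MM)


def diameter_from_phi(phi: float) -> float:
    """Particle diameter in mm for a given phi value."""
    return D0_MM * 2.0 ** (-phi)


if __name__ == "__main__":
    # Class boundaries in the lower half of the scale (sand, silt, clay).
    for phi in range(0, 10):
        d_um = diameter_from_phi(phi) * 1000.0
        print(f"phi = {phi}: D = {d_um:.1f} um")
```

Under this definition each unit step in φ halves the grain diameter, so the boundaries at φ = 4 and φ = 8 fall close to the 60 µm and 4 µm values quoted for very fine sand and river silt.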
The finest mark that can be made when painting with a brush is limited by the width of a single hair. A nominal value often chosen for human hair is 75 μm, but this (like other anthropometric data) is subject to variance across the population. Animal hairs vary in thickness in different parts of the body, and also according to the season. Table 3 shows the range of diameters for human and animal hair, indicating that 20 μm is a likely minimum dimension for brush strokes. By comparison, the synthetic fibre with the trade name Taklon, used in modern brushes for art and makeup, is manufactured in several diameters from 80 to 200 μm, which mimic hair and boar bristle (Wikipedia). An impressive example of fine painted detail is found in A Faun and His Family with a Slain Lion by Lucas Cranach (c.1526). Recent examination of the painting at the Getty Museum (Szafran & Woollett, 2008) revealed a tiny detail of a running figure on the road in the background (Figure 1). The overall width of the figure is only 1 mm, and the width of the strokes for his legs, sword and staff is approximately 50 μm. The width of the cracks in the paint (craquelure) is less than 20 μm. In general the crack patterns are complex, caused by variations in temperature, humidity and stretching of the canvas support (Keck, 1969). The distribution of crack widths is continuous, extending down to vanishingly small widths. The detail in paintings on woven substrates such as canvas and silk is influenced by the fineness of the weave. A study of French canvases of hemp, linen and cotton indicated a range of 10-23 threads/cm for the warp and 8-20 threads/cm for the weft (Carbonnel, 1980). The minimum distance between adjacent threads is therefore approximately 430 μm (10 mm divided by 23 threads). Silks may have a finer thread width of 100 μm and a pitch of 200 μm, with the threads more closely spaced on the warp axis (Table 4). Kuhn (1995) described a Chinese polychrome shroud with a warp:weft thread count of 156:38 which, taken as threads per cm, corresponds to a warp period of only 64 μm.
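The thread-count arithmetic for woven supports is a simple reciprocal. The sketch below assumes counts are expressed in threads per cm, as in the canvas study cited above; the helper name is hypothetical.

```python
def thread_pitch_um(threads_per_cm: float) -> float:
    """Centre-to-centre spacing of adjacent threads, in micrometres."""
    return 10_000.0 / threads_per_cm  # 1 cm = 10,000 um


if __name__ == "__main__":
    # Warp and weft ranges reported for French canvases (threads/cm).
    for axis, (low, high) in {"warp": (10, 23), "weft": (8, 20)}.items():
        print(f"{axis}: pitch from {thread_pitch_um(high):.0f} um "
              f"to {thread_pitch_um(low):.0f} um")
```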
For all types of woven materials the weft, which is threaded transversely and compacted at irregular intervals, is more variable and widely spaced than the fixed warp threads. (Winter et al., 1996) Modern sewing needle sizes are specified on an index scale 1-12, with the smallest (size 12) having a diameter of 35 μm. The finest pins available for sewing and lace-making have a diameter of 40 μm.
Indentations made with pin-pricks in paper or wood may therefore have a diameter of less than 40 μm, depending on pressure applied.
Radiographs generated by X-rays are normally life-size representations of the original object, i.e. at 1:1 scale, which produces the sharpest possible image with minimal geometric distortion. The resolution of radiographic film is limited by its grain size, which is typically 25 μm (Clogg & Caple, 1996). In the study of tool marks on ivory, to distinguish medieval originals from modern forgeries, Fiegenbaum (1996) used microscope image analysis to estimate the spatial parameters of the ridges in the material surface made by hand tools such as gouges and scrapers. The ridge structures were found to have a typical periodicity of 50 μm.
The detail in stained glass panels is typically not so fine as for paintings on canvas because fine lines are prone to erosion during the firing process, and the windows are generally intended to be viewed from a greater distance. Figure 2 shows a detail from a window at Fairford Church (MacDonald, 1997), where each image from the digital camera was 3072 by 2320 pixels, covering an area of glass 51.2 cm wide by 38.7 cm high at a surface resolution of approximately 6 pixels/mm. This resolution was sufficient to sample the finest detail of the brush strokes, which were spaced by about 0.5 mm.

VISUAL CONTRAST SENSITIVITY
For any craftsman a key limiting factor on the fineness of detail that can be rendered with a tool is visual contrast sensitivity. This is a measure of the eye's ability to discriminate fine detail, and is strongly dependent on the level of illumination. The contrast sensitivity function (CSF) was first investigated by Robson (1966) who showed the interdependence between spatial and temporal contrast. Figure 3 shows that in bright sunlight (photopic conditions) the spatial CSF peaks between 5 and 10 cycles/degree (cpd) and falls away rapidly above 30 cpd. At a 'close' working distance of 30 cm, 10 cpd is equivalent to 20 line pairs/mm on the work surface, which requires 40 pix/mm to resolve. In lower levels of light the pupil of the eye dilates and the optical performance of the eye is reduced. At lower levels still the cone photoreceptors become inactive and only the rods detect light (scotopic vision), which provides poor spatial resolution because the retinal network pools the rod signals over a wide collection area.

Figure 3: Spatial contrast sensitivity function at different levels of retinal illuminance. http://webvision.med.utah.edu/KallSpatial.html
Thus a craftsman working in good lighting without the aid of a magnifier or microscope should readily be able to see and produce fine details of 50 μm, and possibly down to 20 μm, subject to manual dexterity with the tool. This is subject to the reduced ability to focus at close working distances with increasing age (presbyopia) and also the reduction in CSF with age (Ross et al., 1985). The minimum focusing distance (near point) increases steadily with age from approximately 10 cm at age 20, to 20 cm at age 40, to 100 cm at age 60 (Figure 4) because of hardening of the crystalline lens in the eye (Duane, 1912). A twenty-year-old craftsman working under bright lighting in optimal conditions very close at 10 cm (4 inches) to the surface might in theory be able to resolve 100 line pairs per mm, i.e. features of only 10 μm in diameter.

MODULATION TRANSFER FUNCTION
The ability of an image capture system, such as a scanner or camera, to capture spatial detail from an object or scene is dependent on the performance of each component in cascade: the optics, the aperture, the sampling array, and the electronics. The performance of each component is characterised by the modulation transfer function (MTF), defined as the ratio of output to input modulation for each spatial frequency (Boreman, 2001). The MTF of the complete system is the product of the individual MTFs of all the components:
$$\mathrm{MTF}_{\mathrm{system}}(f) = \mathrm{MTF}_{\mathrm{optics}}(f) \cdot \mathrm{MTF}_{\mathrm{aperture}}(f) \cdot \mathrm{MTF}_{\mathrm{sampling}}(f) \cdot \mathrm{MTF}_{\mathrm{electronics}}(f)$$
There is an optimum range of MTF for best overall system performance (Figure 5; Burns and Williams, 2001). If the system MTF is too poor (lower bound), the image will be blurred and lacking in contrast. If the system MTF is too high (upper bound) the image will be noisy, and ringing will be apparent at high-contrast edges, together with other undesirable aliasing patterns.
The MTF of a Nikon D200 digital camera with Nikkor 105mm lens was evaluated, using the slanted edge technique (Burns, 2000) with the ISO 16067 target (Figure 6). At a focused distance of approx. 70 cm from the target, the sampling rate was 30 pixels/mm, corresponding to a Nyquist limit of 15 cycles/mm (Figure 7).

Figure 6: The ISO 16067-1 standard test target
In this configuration details of 70 μm diameter could be resolved. The spatial frequency response of the camera/lens combination at an aperture of f/11 was superior in the horizontal direction, with the 0.5 threshold at about 8 cy/mm vs. 7 cy/mm for the vertical direction.
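The cascade rule, in which the system MTF is the product of the component MTFs, can be illustrated with a short sketch. The component curves below are assumptions chosen for illustration only: a Gaussian lens response with a hypothetical half-modulation frequency, and the sinc response of an idealised square pixel aperture. They are not the measured curves of the camera described above.

```python
import numpy as np


def mtf_optics(f, f_half=20.0):
    """Hypothetical Gaussian lens MTF falling to 0.5 at f_half cycles/mm."""
    return np.exp(-np.log(2.0) * (np.asarray(f, dtype=float) / f_half) ** 2)


def mtf_pixel_aperture(f, pitch_mm):
    """MTF of an idealised square pixel aperture of width pitch_mm."""
    # np.sinc(x) is the normalised sinc, sin(pi x) / (pi x).
    return np.abs(np.sinc(np.asarray(f, dtype=float) * pitch_mm))


def mtf_system(f, pitch_mm, f_half=20.0):
    """System MTF as the product of the component MTFs."""
    return mtf_optics(f, f_half) * mtf_pixel_aperture(f, pitch_mm)


if __name__ == "__main__":
    pitch = 1.0 / 30.0                  # 30 pixels/mm on the object
    freqs = np.linspace(0.0, 15.0, 7)   # up to the Nyquist limit
    for f, m in zip(freqs, mtf_system(freqs, pitch)):
        print(f"{f:5.1f} cycles/mm  MTF = {m:.3f}")
```

Because every component MTF is at most 1, the product can never exceed the weakest component at any frequency, which is why a sharp lens cannot compensate for a coarse sampling array or vice versa.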

TECHNIQUE FOR SPATIAL ANALYSIS
It may not be necessary to digitise a collection at very high spatial resolution if the man-made detail in the original objects (documents, artwork, artefacts, etc.) is limited. Especially for a homogeneous collection, in which all objects are of a similar material and produced by a similar technique, it is useful to analyse some representative objects to determine the maximum spatial frequency present and then to set the scanning resolution accordingly.
For investigation, the Nikon D200 digital camera with 105mm lens was mounted on a copy-stand and focused at 31.4 cm (approximately 12 inches) distance from the baseboard, with two high-frequency fluorescent light sources of 5000K correlated colour temperature positioned at 45° on either side. The image of 3900 × 2616 pixels covered an area of 23.6 × 15.9 mm at a resolution of 165 pixels/mm. (In this setting the image on the sensor was magnified to exactly 1:1 scale with the original.) The Nyquist frequency was therefore 82.5 cycles/mm, and so the finest resolvable detail was 12 μm.
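The sampling figures for this setup follow from the pixel count and the imaged width. The sketch below reproduces the arithmetic, taking the finest resolvable detail as one period at the Nyquist frequency (two pixels), which is the convention the figures above imply; the function names are illustrative.

```python
def sampling_rate(pixels: int, extent_mm: float) -> float:
    """Sampling rate on the object surface, in pixels per mm."""
    return pixels / extent_mm


def nyquist_frequency(pixels_per_mm: float) -> float:
    """Highest spatial frequency that can be sampled, in cycles per mm."""
    return pixels_per_mm / 2.0


def finest_detail_um(pixels_per_mm: float) -> float:
    """One period at the Nyquist frequency (two pixels), in micrometres."""
    return 1000.0 / nyquist_frequency(pixels_per_mm)


if __name__ == "__main__":
    rate = sampling_rate(3900, 23.6)
    print(f"sampling rate : {rate:.1f} pixels/mm")
    print(f"Nyquist limit : {nyquist_frequency(rate):.1f} cycles/mm")
    print(f"finest detail : {finest_detail_um(rate):.1f} um")
```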
A facsimile reproduction of the 14th-century illuminated manuscript 'Les Très Belles Heures de Turin' was examined. This was reproduced in 1993 by Schwitter (Basel) in eight inks: the standard process colours of cyan, magenta, yellow and black (CMYK) plus blue, orange, brown and gold, all printed with exceptionally fine halftone screens of 300 lines per inch (Figure 8). The distribution of spatial frequencies in the image was determined by computing the 1-D Fourier power spectrum across one dimension of a section of the image and averaging the spectrum across the other dimension (Figure 9). The distribution was very noisy for a single line. When averaged over multiple lines the noise was greatly reduced and the salient features became apparent (Figure 10). The frequency of the halftone dot pattern appeared as a spike at 12 cycles/mm, and the painted features of the image (lines and contours) as an elevated section in the range 6-10 cycles/mm, with two spikes at approximately 6 and 8 cycles/mm. Thus all of the spatial detail in the image, excluding the halftone pattern, could be captured by scanning at a resolution of 10 cycles/mm, i.e. a sample size of 50 μm, giving 20 pix/mm in the image.
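The row-averaged spectrum described above can be sketched with NumPy. This is a minimal illustration under stated assumptions rather than the procedure actually used for the facsimile: the test image is synthetic (a weak 12 cycles/mm sinusoid buried in noise), a Hann window is applied to each row to limit leakage, and the function name is hypothetical.

```python
import numpy as np


def mean_row_power_spectrum(image, pixels_per_mm):
    """Return (frequencies in cycles/mm, power spectrum averaged over rows)."""
    # Remove the mean of each row so the DC term does not dominate.
    rows = image - image.mean(axis=1, keepdims=True)
    window = np.hanning(image.shape[1])
    # 1-D transform along each row, then average the power across rows.
    spectrum = np.fft.rfft(rows * window, axis=1)
    power = (np.abs(spectrum) ** 2).mean(axis=0)
    freqs = np.fft.rfftfreq(image.shape[1], d=1.0 / pixels_per_mm)
    return freqs, power


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pixels_per_mm = 165.0
    n_rows, n_cols = 256, 1650           # a strip 10 mm wide
    x_mm = np.arange(n_cols) / pixels_per_mm
    pattern = 0.2 * np.sin(2.0 * np.pi * 12.0 * x_mm)
    image = pattern + rng.normal(0.0, 1.0, size=(n_rows, n_cols))

    freqs, power = mean_row_power_spectrum(image, pixels_per_mm)
    peak = freqs[np.argmax(power)]
    print(f"strongest periodic component: {peak:.1f} cycles/mm")
    print(f"suggested sampling rate     : {2.0 * peak:.0f} pixels/mm or more")
```

Averaging the power over many rows leaves the periodic component unchanged while the fluctuation of the noise floor shrinks roughly with the square root of the number of rows, which is why the spike becomes easier to identify than in a single line.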

DISCUSSION
Based on analysis of the dimensions of various physical media and visual contrast sensitivity in Sections 2 and 3, the minimum discernible feature size should be somewhere in the range 20-75 μm (Table 4). The Nyquist criterion requires that the original be sampled at a rate of at least twice the highest spatial frequency, so these dimensions correspond to minimum sampling rates ranging from 27 to 100 pixels per mm. Such high sampling rates would generate a lot of data: for example, 100 pixels per linear mm would give 10,000 pixels per mm², or 10^10 = 10 Gpixels per m², producing 60 Gbytes per m² for a 16-bit-per-channel RGB image. Nonetheless, the cost of disk storage at 1 Terabyte per 16 m² of painted surface is small compared to the labour costs of moving the object from the gallery, setting up the camera, capturing the image, image processing, data transfer and cataloguing. At the National Gallery in London the digital image archive in current use (see http://cima.nglondon.org.uk/collection/) was captured by the MARC digital camera (Saunders, 1998), and through the setup in the photographic studio every image is optically scaled to the range 7,500-10,000 pixels in the longer axis, regardless of the physical dimensions of the original painting. This means that the sampling resolution on the painted surface varies with the size of the painting, being higher for small paintings than for large ones. The difference in the level of detail is very obvious when areas of the same physical size are juxtaposed (Figure 11). For a small painting it is possible to resolve very fine details, as shown in Figure 12, where the individual hairs of the Madonna's eyebrow are well defined. Each of the broader hairs is on average 2.5 pixels in width, representing a stroke width of 90 μm, but the finer hairs can also be resolved, with a stroke width of approximately 55 μm.
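The sampling rates and data volumes quoted above can be reproduced with a few lines of arithmetic. The sketch assumes two samples per feature (the Nyquist criterion as applied in the text), uncompressed storage, and decimal units (1 Gbyte = 10^9 bytes); the function names are illustrative.

```python
def min_sampling_rate(feature_um: float) -> float:
    """Pixels per mm needed to place two samples across one feature."""
    return 2000.0 / feature_um


def bytes_per_m2(pixels_per_mm: int, channels: int = 3,
                 bits_per_channel: int = 16) -> int:
    """Uncompressed image size for one square metre of surface."""
    pixels = (pixels_per_mm * 1000) ** 2      # 1 m = 1000 mm
    return pixels * channels * bits_per_channel // 8


if __name__ == "__main__":
    for feature in (75.0, 20.0):
        print(f"{feature:4.0f} um feature -> "
              f"{min_sampling_rate(feature):.0f} pixels/mm")
    per_m2 = bytes_per_m2(100)
    print(f"100 pixels/mm: {per_m2 / 1e9:.0f} Gbytes per m^2")
    print(f"16 m^2 of surface: {16 * per_m2 / 1e12:.2f} Tbytes")
```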
A proposal is currently being considered that in future all paintings in the National Gallery collection will be digitised to a 'documentary record standard' of 24 pix/mm = 609.6 pix/inch. This would resolve details of 83 μm in all paintings, similar to that currently achieved in the Madonna of the Pinks, and would produce a larger image file than at present for any painting with either dimension larger than about 30 cm. Such a proposal is welcome, but the proposed resolution would still fail to capture the finest brush strokes and craquelure. It would be preferable to adopt a digitising standard of at least 50 pixels/mm (nominal 1,200 dpi) so that details of size 40 μm could be resolved. This assumes that the image capture device (camera or scanner) would have a sufficiently high MTF to resolve such details.
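The two candidate standards can be compared by converting them to pixels per inch and by counting how many pixels fall across a feature of a given width. The sketch below is illustrative only: the 50 μm stroke and 20 μm crack widths are the values discussed earlier in the paper, and the helper names are hypothetical.

```python
MM_PER_INCH = 25.4


def pixels_per_inch(pixels_per_mm: float) -> float:
    """Convert a sampling rate from pixels/mm to pixels/inch."""
    return pixels_per_mm * MM_PER_INCH


def pixels_across(feature_um: float, pixels_per_mm: float) -> float:
    """Number of pixels spanning a feature of the given width."""
    return feature_um * pixels_per_mm / 1000.0


if __name__ == "__main__":
    features = {"brush stroke": 50.0, "craquelure": 20.0}  # widths in um
    for rate in (24.0, 50.0):
        print(f"{rate:.0f} pixels/mm = {pixels_per_inch(rate):.1f} pixels/inch")
        for name, width in features.items():
            print(f"  {name} ({width:.0f} um): "
                  f"{pixels_across(width, rate):.1f} pixels")
```

A feature spanned by fewer than two pixels falls below the two-sample criterion used throughout the paper, which is the sense in which the lower standard fails to capture the finest strokes.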
In summary, the recommended approach for any image digitisation campaign is:
• Analyse the original documents or artefacts to determine the finest spatial detail;
• Choose the best scanner or camera optics to optimise MTF and minimise distortion;
• Select a sampling resolution (pixels/mm) at least double the highest spatial frequency in the original;
• For a camera, correct non-uniform distribution of illumination (white card) in the document plane;
• Always store RAW images as the master records, from which all other formats can be derived.
This approach is consistent with guidelines published by the EC Minerva Network (Drake et al., 2003):
• Image capture (by scanner or digital camera) should be carried out at the highest reasonable resolution. This will often result in very large master files; smaller files can be extracted from the master, for purposes such as web delivery. However, a higher-quality image can never be derived from a lower-quality image.
• The definition of a 'reasonable' resolution will depend on the nature of the material being scanned, and on the uses to which the scanned image will be put. For example, if the scanned images are only ever to be used as thumbnails, this can allow scanning at a low resolution. Equally, the resolution must capture the most significant details of the item: if scanning at a high resolution yields no more information than at a lower resolution, the high resolution scanning is difficult to justify.