Horse kicks, flying bombs and potsherds: statistical theory contributes to archaeological survey



Clive Orton

In the application of statistics in archaeology, it is essential to choose an appropriate statistical model for the specific archaeological problem being addressed. This article discusses the application of statistical models to large-scale archaeological field survey and describes the development of software (sherdnav) designed to guide fieldwalkers to where they should look next, once they have surveyed all the primary spots in a block, and its application to the Noviodunum Archaeological Project (NAP).

DOI: http://dx.doi.org/10.5334/ai.1005

During a career spent in developing, using and teaching the applications of statistical techniques in archaeology, it has been apparent to me that a crucial first step towards a successful analysis is the choice of an appropriate statistical model to describe the situation being studied. The process of choosing a model not only focuses the mind on the essentials of a problem, it also guides one towards an appropriate choice of technique, which in these days of user-friendly software can be the most difficult part of an analysis. This has been a particularly valuable exercise in two case studies: an analysis of Museum Collection Condition Surveys¹ and a study of ceramic production centres.² More recently, the development of a range of statistical models of increasing complexity has proved its value in a new area: the undertaking of large-scale archaeological field survey, and in particular the Noviodunum Archaeological Project (NAP) (Fig. 1).³ The aims of this paper are to:
• Outline the problem of carrying out archaeological field surveys effectively and efficiently
• Tell the story of the search for models that could guide the progress of the fieldwork, and how it led to the development of software that could be used in "real time" in the field
• Present some preliminary results and show how they can be interpreted
• Sketch out a possible route for future developments.

The problem⁴
One aim of the NAP is to locate and characterize archaeological sites in the area surrounding Noviodunum, to shed light on the ways in which the fort may have interacted with its hinterland. This is achieved by extensive field-walking, that is, by walking over the terrain and collecting surface scatters of artefacts which may indicate the locations of buried sites. Traditionally, this is done by teams of walkers moving linearly across the landscape at a pre-determined spacing (e.g. 10m apart).⁵ However, the density of surface artefacts around Noviodunum is so great that this would have rapidly filled the available storage space, so a less intensive approach, the "spot" method, was used. In this approach, a regular grid of circular "spots" of a chosen size is laid out, and all the artefacts within each spot are collected; nothing is collected from outside the spots. At Noviodunum, the spots were given an area of 2 square metres (i.e. a radius of 0.80m) and were placed on a rectangular grid at 30m intervals. The grid is set out by laying out a central line of points using a total station, and then laying out the points either side of that line using an optical right-angle and tapes. The team of five walkers progresses along the transect, stopping every fifth line to input the data to the program, and undertake any extra collection that is indicated (see below). The transects are located in the landscape using hand-held GPS (Figs 2-4).
It was anticipated that some spots would contain a high density of artefacts (specifically, Roman pottery sherds), and would be considered "on site", while others would contain a lower density of sherds and be considered "off site".⁶ It would then be possible to define roughly the extent of any sites that had been located by simply drawing a curve round contiguous groups of high-density spots. Because of the wide (30m) spacing between the spots, this would give only a rough delineation of any sites, and a finer drawing of their outline was felt to be necessary. It was decided to achieve this by surveying further spots, at 10m intervals, but only when needed to refine the boundary of a site. Considering pairs of spots 30m apart, there are three possible situations and two practical outcomes, as follows:
a) both spots are "high density": both are "on site", so there is no need to survey between them
b) both spots are "low density": both are "off site", so there is no need to survey between them
c) one spot is "high density" and the other is "low density": one is "on site" and one is "off site", so there is a need to survey two further spots between them, reducing spacing from 30m to 10m.
This approach throws down the statistical challenge of deciding which spots are of "high density" and which are of "low density", preferably with the minimum of data processing so that any intermediate spots could be surveyed as soon as possible. This was technically possible because each team has a PDA and could thus enter data, carry out calculations and ascertain which (if any) further spots needed to be surveyed.
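The pair rule (a) to (c) can be sketched in a few lines of Python. This is an illustration only, not sherdnav's code: the function name, the coordinate convention (eastings and northings in metres) and the assumption that each spot has already been classified as high or low density are all hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (easting, northing) in metres; an assumed convention


def intermediate_spots(a: Point, a_high: bool, b: Point, b_high: bool) -> List[Point]:
    """Apply rules (a)-(c) to one pair of primary spots 30m apart.

    Returns the extra spots to survey: none when both spots fall in the
    same density class, two (at one third and two thirds of the way)
    when the classes differ.
    """
    if a_high == b_high:
        return []  # cases (a) and (b): no need to survey between them
    (ax, ay), (bx, by) = a, b
    # case (c): two further spots reduce the spacing from 30m to 10m
    return [(ax + (bx - ax) * k / 3, ay + (by - ay) * k / 3) for k in (1, 2)]
```

For two spots at (0, 0) and (30, 0), one high and one low, the function returns the spots at 10m and 20m along the line between them.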

The search for a model
For purposes of analysis, the spots are grouped into "blocks" of 25 spots (five-by-five). From a statistical point of view, the problem is to take a set of 25 counts (the numbers of sherds in each of the 25 spots in a block), and to decide whether they could all reasonably be seen as samples from the same background density (accepting that some variation will occur), or whether this simple belief is untenable, and they must therefore be from zones of different densities. The first alternative corresponds to options (a) and (b) above, and the second to option (c).⁷
This begs a further question: if there were sherds scattered randomly across a block, whether at high or low density, what sort of pattern of variation would we expect to find between the numbers found in each spot? Let's look at the sorts of numbers that might be involved, drawing on data from the 2005 survey. Fig. 4 shows the numbers of Roman sherds found in each spot of a test block. It's useful to re-cast this as a histogram, which shows how many spots contain no sherds, one sherd, two sherds, etc. (Fig. 5). How might this histogram look if sherds had been scattered randomly in the block at the same overall density? Does our histogram look anything like this theoretical ideal (which is known as complete spatial randomness or CSR)?
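One way to make this comparison concrete is to tabulate the observed histogram next to the frequencies expected if every spot in the block shared a single Poisson mean, which is the usual formalisation of CSR for counts. The sketch below is an illustration under that assumption; the function names are hypothetical, the variance-to-mean ratio is a common textbook check rather than something taken from sherdnav, and the example counts are invented, not the Fig. 4 data.

```python
import math
from collections import Counter
from typing import List, Tuple


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def csr_table(counts: List[int]) -> Tuple[float, List[Tuple[int, int, float]]]:
    """Return the block mean and rows of (k, observed spots, expected spots)."""
    n = len(counts)
    mean = sum(counts) / n
    observed = Counter(counts)
    rows = [
        (k, observed.get(k, 0), n * poisson_pmf(k, mean))
        for k in range(max(counts) + 1)
    ]
    return mean, rows


def dispersion_index(counts: List[int]) -> float:
    """Sample variance divided by mean; close to 1 for Poisson counts."""
    n = len(counts)
    mean = sum(counts) / n
    if mean == 0:
        raise ValueError("no sherds in the block: the index is undefined")
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean


if __name__ == "__main__":
    # Invented counts for a 25-spot block, for illustration only.
    example = [0, 1, 0, 2, 1, 0, 0, 3, 1, 0, 7, 9, 1, 0, 2,
               8, 6, 0, 1, 1, 0, 2, 0, 1, 5]
    mean, rows = csr_table(example)
    for k, obs, exp in rows:
        print(f"{k:2d} sherds: observed {obs:2d}, expected {exp:5.2f}")
    print("variance/mean =", round(dispersion_index(example), 2))
```

A ratio well above 1 indicates more variation between spots than a single random scatter would produce, which is the situation in which a block is likely to straddle zones of different density.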
At this point we need to digress into statistical theory. In 1837 a French mathematician, Siméon-Denis Poisson, described a mathematical model for the numbers of "events" that would occur in a certain "interval", if there were very many such events, each of which had a very small probability of occurring in any particular interval.⁸ This became known, not surprisingly, as the Poisson distribution, and was admired as a piece of mathematics, but remained an abstract concept until 1898, when Ladislaus von Bortkiewicz, a German economist, interpreted "event" as a soldier in the Prussian army being killed by a horse kick, and the "interval" as a period of time, such as a year. His data on the numbers of Prussian soldiers killed in this way did indeed fit the theoretical Poisson distribution, i.e. deaths occurred randomly over time.⁹ Later, it was appreciated that "interval" could refer to space as well as time. The British actuary Clarke was able to fit a Poisson model to the spatial distribution of flying-bomb hits on London, demonstrating that within the general target area their spatial occurrence was random.¹⁰ In a further [...]
The means of the two distributions, and their relative weights, are estimated using maximum likelihood estimation (mle).¹² If the lower of the two means is zero, we have the P+ model (Poisson plus a number of zeros)¹³ as a special case of the PP model. So far, the PP model has never failed to fit.¹⁴ Sherdnav can find both the "simplest adequate model" and the "best-fit model"¹⁵ to describe the counts in a block.
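To illustrate what estimating two means and their relative weights involves, the sketch below fits a mixture of two Poisson distributions, P(count = k) = (1 - w) Pois(k; low) + w Pois(k; high), by the EM algorithm. Several things here are assumptions rather than facts from the text: that the PP model is such a two-component mixture (which the surviving description suggests), that EM is an acceptable route to the maximum likelihood estimates (the text does not say how sherdnav computes them), and all function and key names. Unsurveyed spots (recorded as -1) would need to be dropped before calling it.

```python
import math
from typing import Dict, List


def log_poisson(k: int, mean: float) -> float:
    """Log of the Poisson probability of count k, for a positive mean."""
    return -mean + k * math.log(mean) - math.lgamma(k + 1)


def fit_two_poisson(counts: List[int], iterations: int = 200) -> Dict[str, object]:
    """Fit a two-component Poisson mixture by EM (an assumed method)."""
    n = len(counts)
    if min(counts) == max(counts):
        raise ValueError("all counts are equal; a single Poisson is enough")
    mean = sum(counts) / n
    low, high, w_high = mean / 2, mean * 2, 0.5  # crude starting values
    tiny = 1e-9  # keeps means and weights away from exact 0 or 1
    resp = [0.5] * n
    for _ in range(iterations):
        # E-step: probability that each spot belongs to the high component
        resp = []
        for c in counts:
            a = math.log(1 - w_high) + log_poisson(c, low)
            b = math.log(w_high) + log_poisson(c, high)
            m = max(a, b)  # subtract the maximum to avoid underflow
            resp.append(math.exp(b - m) / (math.exp(a - m) + math.exp(b - m)))
        # M-step: re-estimate the weight and the two means
        total_high = sum(resp)
        total_low = n - total_high
        w_high = min(max(total_high / n, tiny), 1 - tiny)
        high = max(sum(r * c for r, c in zip(resp, counts)) / max(total_high, tiny), tiny)
        low = max(sum((1 - r) * c for r, c in zip(resp, counts)) / max(total_low, tiny), tiny)
    if low > high:  # keep the labels in a fixed order
        low, high, w_high = high, low, 1 - w_high
        resp = [1 - r for r in resp]
    return {"low_mean": low, "high_mean": high,
            "weight_high": w_high, "p_high": resp}
```

A low mean that converges towards zero corresponds to the P+ special case described above. The `p_high` values give, for each spot, the estimated probability that it belongs to the higher-density distribution, which is one possible basis for labelling a spot "on site".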
If the PP model is needed, the block is partly on-site and partly off-site. Sherdnav then compares the count in each spot with those in its immediate neighbours; if they belong to different distributions, then survey of the intermediate (10m) spots is recommended (intensive survey).
If the block appears to be partly or wholly on-site, survey of the appropriate neighbouring squares is recommended (extensive survey).
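The two recommendations can be sketched as follows, assuming each spot in a block has already been labelled on-site (True), off-site (False) or unsurveyed (None). The grid layout (row 0 as the northern edge), the restriction to orthogonal neighbours and the function names are assumptions made for illustration; they are not taken from sherdnav itself.

```python
from typing import Dict, List, Optional, Tuple

# grid[row][col]; row 0 is assumed to be the northern edge of the block.
Grid = List[List[Optional[bool]]]
Cell = Tuple[int, int]


def intensive_pairs(grid: Grid) -> List[Tuple[Cell, Cell]]:
    """Pairs of neighbouring spots with different labels.

    Each pair marks a 30m gap where the intermediate (10m) spots
    should be surveyed. Unsurveyed spots are ignored.
    """
    pairs = []
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            here = grid[r][c]
            # Looking only east and south visits every pair exactly once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    there = grid[rr][cc]
                    if here is not None and there is not None and here != there:
                        pairs.append(((r, c), (rr, cc)))
    return pairs


def extensive_sides(grid: Grid) -> Dict[str, bool]:
    """For each side of the block, whether any spot on that edge is on-site."""
    edges = {
        "north": grid[0],
        "south": grid[-1],
        "west": [row[0] for row in grid],
        "east": [row[-1] for row in grid],
    }
    return {side: any(v is True for v in edge) for side, edge in edges.items()}
```

For a block whose north-west corner is on-site, `intensive_pairs` lists the gaps along the edge of that corner, and `extensive_sides` flags the northern and western neighbouring blocks for survey.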
Figure 6 shows the division of a test block into spots that are on-site and spots that are off-site, and indicates which intermediate spots need to be surveyed intensively. Blocks adjacent to sides of the block which are wholly or partly on-site are recommended for extensive survey.

Implementation
Sherdnav is designed to run on Pocket Excel© on a PDA. Users enter data in three adjacent columns: eastings, northings and count, using -1 to indicate unsurveyed spots. Spots are entered in a standard order. The outcomes consist of four additional columns headed north, east, south and west, telling users whether they should survey the extra spots immediately to the north, east, south and west of each spot, and a row telling them whether they should survey the blocks to the north, east, south and west. All the calculations and intermediate working are hidden from users, thus making good use of the small screen.

Use in the field
The software was first used in the field in April 2006, revealing weaknesses that led to some false positives (surveying spots that did not need it). These were put right for the summer 2006 season, which unfortunately discovered very little, and did not seriously test the software's capabilities. The software worked very well during the Easter 2007 survey, and the data are currently being evaluated.

Future work
The main criticism of the sherdnav approach is that it is, as currently implemented, purely local. It gives clear advice about actions within a block, but says nothing about what to do in the 30m-wide strips between blocks. Indeed, the definitions of high and low density could be quite different in two adjacent blocks. Additional software is needed to sit above the level of the individual blocks, to allow them to be metaphorically stitched together. This is the next major task. Other lesser tasks are to examine the relative merits of the "simplest adequate model" and the "best-fit model", and to examine whether the theoretical advantages of mle over MoM outweigh the practical complications.

Figure 1 Location map showing site of the Roman fort of Noviodunum.

Figure 2 Spot walking at Noviodunum. From top-centre clockwise: laying out the centre line using a total station; setting out the line of spots using an optical right angle; locating the end of a transect using a handheld GPS; a line of spots being walked.

Figure 3 Recording sheet for one survey block of the NAP.

Figure 4 Completed recording sheet for the test block, showing numbers of sherds found in each spot (a count of -1 indicates a spot that could not be surveyed).

Figure 5 Histogram of the data from Figure 4, showing the numbers of spots with 0, 1, 2, … sherds.

Figure 6 Recording sheet for the test block after analysis, showing on-site spots (red), off-site spots (blue) and extra spots to be surveyed (shaded).