A Lightweight Classification Algorithm for Human Activity Recognition in Outdoor Spaces

The aim of this paper is to discuss the development of a lightweight classification algorithm for human activity recognition in a defined setting. Current techniques for analysing data, such as machine learning, are often very resource intensive, meaning they can only be implemented on machines or devices that have large amounts of storage or processing power. The lightweight algorithm uses Euclidean distance to measure the difference between two points and predict the class of new records. The results of the algorithm are largely positive, achieving an accuracy of 100% when classifying records taken from the same sensor position and an accuracy of 80% when records are taken from different sensor positions. The intended outcome of this work is to foster the development of lightweight algorithms for future devices that consume less energy and require a lower computational capacity.


INTRODUCTION
Development in the Internet of Things means the world is quickly filling up with digital sensors measuring everything from location and movement to humidity [1]. This is increasing the amount of data generated in the world. Estimates show that the amount of data generated is doubling every two years [1]. However, to unlock the full potential of all of this data it needs to be analysed. If the data goes unanalysed it is arguably just noise.
Current techniques for data analysis, such as machine learning or artificial intelligence, are excellent at making sense of all this data. However, this analysis comes with a high computational cost, which can require large amounts of storage or processing power. Machine learning also has the drawback that the model must be trained on a data set before accurate results can be given. Collecting enough data to train a model is often time consuming and expensive [2].
It is important that we give users the ability to analyse all this data. With 68% of UK adults owning a smartphone, it is likely that the smartphone is the most commonly owned computational device [3]. But high-cost techniques like machine learning are too computationally intensive to be implemented well (or at all) on a smartphone or smartwatch [2]. Current methods that allow a user to analyse this data with a smartphone often involve accessing some external, resource-rich service or platform over the internet. This in turn requires the user to have an internet connection. Connecting to the internet is still an issue for many users, given that 4G mobile data coverage is limited to just 43% of the UK's landmass [4]. Wi-Fi is another option, but it is mostly limited to indoor spaces, and large Wi-Fi networks raise serious security concerns, so it is not an ideal option.
It therefore makes sense to perform some local analysis of the data on the smartphone, near the point of data collection. This will enable users to interact fully with the data around them. A possible solution is the development of lightweight algorithms. Such algorithms would have a low computational cost, would not need the computing power of an external resource, and could therefore run on a smartphone without the need to connect to the internet. This paper presents an example of a low-computational-cost algorithm for activity recognition in an outdoor gym. The paper is structured as follows: section two explains the data set and the algorithm. Section three presents the results of the algorithm and discusses the findings, comparing them to classification results from an ANN (artificial neural network). Finally, section four concludes the paper.

METHODOLOGY
The data used in this study was collected as part of a wider study, in collaboration with the Dublin Institute of Technology, on the development of a Bluetooth-enabled MEMS device for increasing user engagement in outdoor areas. The data set consists of nine subjects each performing two exercises, with an air walker and a pull-down machine, in an outdoor location.
While performing the exercises, each of the participants wore three smartphones located at the upper arm, wrist and hip pocket. Each of the smartphones contains a gyroscope and an accelerometer. The aforementioned MEMS device was also attached to the gym equipment and recorded each of the exercises. This resulted in sixty-nine records being collected (a protocol issue with one of the files means only sixty-eight records are used in the tests).
The goal of the lightweight algorithm is to classify which of the two exercises is being performed, regardless of the position (hip pocket, wrist, upper arm or machine mounted) at which the data is recorded. The classification accuracy will be compared to the results of a simple ANN trained and tested on the same data set.

Euclidean classification algorithm (ECA)
The lightweight algorithm uses Euclidean distance to measure how different records are from each other.
Euclidean distance is defined as the length of the shortest (straight-line) path between two points. This is not a new measurement by any standard. It has been used in the development of a fuzzy motion classifier [5]. Classification is performed by establishing an example profile for each of the exercises; more information on how the example profiles are selected is given in section 2.2.
The distance from a new point to each of the example profiles gives a good indication as to which class the new record belongs to. The new record is predicted to belong to the class whose example profile is the shortest distance away.
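The nearest-profile rule described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the class labels and the three-feature vectors are hypothetical and chosen only for illustration (the paper uses fourteen features per record).

```python
import math


def euclidean_distance(p, c):
    """Straight-line distance between two equal-length feature vectors."""
    if len(p) != len(c):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((pi - ci) ** 2 for pi, ci in zip(p, c)))


def classify(new_point, profiles):
    """Return the label of the example profile closest to new_point.

    profiles maps a class label to that class's example feature vector.
    """
    return min(
        profiles,
        key=lambda label: euclidean_distance(new_point, profiles[label]),
    )


# Hypothetical example profiles with three made-up feature values each.
profiles = {
    "air_walker": [0.2, 1.1, 0.5],
    "pull_down": [0.9, 0.3, 1.4],
}

# The new record lies much closer to the air walker profile.
predicted = classify([0.3, 1.0, 0.6], profiles)
```

Because only one distance per class is computed, the cost of a prediction grows with the number of features and classes rather than with the size of a training set, which is what keeps the approach lightweight.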
The algorithm reads in data from the sensor's gyroscope (X, Y and Z) and accelerometer (X, Y and Z). From the accelerometer data, total acceleration is calculated using Equation (1), where a_X is acceleration in the X direction, a_Y is acceleration in the Y direction and a_Z is acceleration in the Z direction. To reduce noise in the data, windowing is applied, taking the mean of every five data points. From this data we extract the fourteen features listed in Table 1. At this stage all fourteen features are used, though they may not all be required; feature ranking and selection will be the next step in the development of the algorithm (see further developments in the conclusion). The example profiles also contain the same fourteen features, and noise reduction is applied to them in the same way.
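The two preprocessing steps described above, total acceleration and windowed averaging, might look as follows in Python. This is a sketch under stated assumptions: the text does not say whether the five-point windows overlap or how a trailing incomplete window is handled, so the code assumes non-overlapping windows and drops any incomplete remainder. The function names are hypothetical.

```python
import math


def total_acceleration(ax, ay, az):
    """Magnitude of the acceleration vector for each sample."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(ax, ay, az)]


def window_means(samples, size=5):
    """Mean of each consecutive, non-overlapping block of `size` samples.

    Assumption: a trailing block with fewer than `size` samples is dropped.
    """
    usable = len(samples) - len(samples) % size
    return [
        sum(samples[i:i + size]) / size
        for i in range(0, usable, size)
    ]


# Eleven made-up samples give two complete windows; the last sample is dropped.
smoothed = window_means([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
```

The same smoothing would be applied to each gyroscope and accelerometer channel before the features in Table 1 are extracted.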

$$ a = \sqrt{a_X^2 + a_Y^2 + a_Z^2} \tag{1} $$

We then calculate the Euclidean distance from a new point to an example profile using Equation (2), where d is the distance, n is the number of features, p is the new point containing the fourteen features outlined in Table 1 and c is an example profile containing the same fourteen features.

$$ d(p, c) = \sqrt{\sum_{i=1}^{n} (p_i - c_i)^2} \tag{2} $$

We then measure the distance from the new point to the other example profile. Whichever gives the shorter distance is the predicted class of the new data point. The algorithm is summarised in the flow diagram shown in Figure 1.

2.2 Selection of the example profiles
The selection of an example profile for each of the exercises is one of the most important parts of the algorithm. If the point is too central to its own class, it might not classify outliers correctly. It would be very simple to run the algorithm multiple times to find the two records from the given data set that give the best results. However, this would be an unfair test of the algorithm, as the results are not likely to be that good in a real-world situation. A better and fairer test is to find which of the four sensor positions is best at classifying records from the other sensors. Two records from this sensor group can then be used as the example profiles. This gives a truer reflection of how the algorithm would cope in the real world.
To find out which was the best sensor position, four tests were conducted. First, we selected a random test subject. The data from this test subject was used as the example profile for each of the tests.
The four tests were structured as follows: the first tested how well the upper arm data from the random test subject classified the upper arm data from the other eight test subjects. This was repeated for the hip pocket, wrist and machine-mounted data.
To compare these tests, the accuracy, specificity and sensitivity of the four tests were calculated for each of the classes using Equations (3)-(5), where TP is true positive, TN is true negative, FP is false positive and FN is false negative. The results for the four tests are shown in Table 2.

$$ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{3} $$

$$ \text{Specificity} = \frac{TN}{TN + FP} \tag{4} $$

$$ \text{Sensitivity} = \frac{TP}{TP + FN} \tag{5} $$

The results of these tests show that the hip pocket is the worst at predicting the class of records taken from the same sensor position. This is probably because the hip pocket is less secure than the upper arm, wrist and machine-mounted positions. It is very easy for the smartphone to be placed in the pocket upside down or back to front, meaning the accelerometer and gyroscope would give very different results for each of the records and it would be hard to classify them. Its ability to classify records from different sensor positions is not as good as originally expected.
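The accuracy, specificity and sensitivity measures used above can be computed from predicted and actual labels as in the following Python sketch. It uses the standard definitions of the three measures; the function names and the example labels are hypothetical and are not taken from the study's data.

```python
def confusion_counts(actual, predicted, positive):
    """Count TP, TN, FP and FN, treating `positive` as the class of interest."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Share of all records that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def specificity(tn, fp):
    """Share of negative records correctly identified as negative."""
    return tn / (tn + fp)


def sensitivity(tp, fn):
    """Share of positive records correctly identified as positive."""
    return tp / (tp + fn)


# Made-up labels: one air walker record is misclassified as pull-down.
actual = ["air_walker", "air_walker", "pull_down", "pull_down"]
predicted = ["air_walker", "pull_down", "pull_down", "pull_down"]
tp, tn, fp, fn = confusion_counts(actual, predicted, positive="air_walker")
```

Calling `confusion_counts` once per class, with each exercise in turn as the positive class, gives the per-class figures. The sketch does not guard against a class with no positive or no negative records, which would cause a division by zero.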

In the following stages of the project, it is planned to improve both the accuracy in classifying different gym equipment and the sensitivity in recognising the associated human activity.
The next step in this line of research should be to investigate whether the computational cost of the algorithm can be reduced further while improving its accuracy. A further step would be to investigate how the energy efficiency of these devices can be increased, in accordance with a recent trend in technology development called "Planned Obsolescence" [5].

Figure 1: Flow diagram for the Euclidean classification algorithm

Table 1: Name and description of the fourteen features in the feature vector
Graham McCalmont • Huiru Zheng • Haiying Wang • Sally McClean • Matteo Zallio • Damon Berry

Table 2: Accuracy, specificity and sensitivity for each of the sensor groups

This paper presented a lightweight algorithm that was able to classify which exercise is being performed, if data from the same sensor is used.