INTRODUCTION
Human Activity Recognition (HAR) is one of the most popular fields of study (Hussain et al., 2022). Owing to the availability of accelerometers and other sensors, their low energy consumption and low cost, and developments in computer vision, artificial intelligence, and the Internet of Things (IoT), applications have been built around human-centered models that categorize, recognize, and detect human behavior. Scholars have proposed several approaches on this topic (Yadav et al., 2022). HAR has become a crucial tool for monitoring a person’s dynamics, and it is accomplished using machine learning (ML) techniques (Brishtel et al., 2023). HAR automatically analyzes and detects human activities from data acquired by wearable devices and smartphone sensors, such as accelerometers, gyroscopes, location and time sources, and various other environmental sensors (Thapa et al., 2023). Combined with other technologies such as IoT, it is used in diverse application areas such as industry, healthcare, and sports (Park et al., 2019).
Human activity detection is applied in fields such as elderly care, healthcare, and preventive medicine (Mazzia et al., 2022). Furthermore, with the dramatic increase in devices with built-in sensors, such as smartphones, the cost of sensing devices has decreased significantly. Consequently, mobile activity detection has been studied actively (Shen et al., 2023). Conventional activity detection methods often use an ML approach such as naive Bayes, support vector machine, decision tree, or random forest to detect actions from feature vectors derived from the signal in each time window by means of the Fourier transform or statistical values (Moutinho et al., 2023). Recurrent neural networks (RNNs) contain directed cycles, which makes them appropriate for handling time-series data such as video, audio signals, and natural language. Hierarchical multi-layered convolutional neural networks (CNNs) have recently achieved notable results in fields such as image processing and have drawn attention to the technique known as deep learning (DL). Because an RNN is deep along the temporal direction, it is likewise regarded as a DL approach. Whereas conventional activity recognition approaches take feature vectors as input, DL takes the original data directly as input (Islam et al., 2022). This allows the computation of feature vectors to be skipped during training and recognition, so a speedup can be anticipated, particularly in recognition. DL is also expected to improve recognition accuracy (Khodabandelou et al., 2023).
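The conventional pipeline described above (time windows, then statistical and Fourier features, then a classical classifier) can be sketched as follows. This is a minimal illustration only: the function names, the window size, and the choice of three features (mean, standard deviation, dominant frequency bin) are assumptions made for the example, not the settings of any cited study.

```python
import cmath
import math
import statistics


def sliding_windows(signal, size, step):
    """Split a 1-D signal into fixed-size windows taken every `step` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]


def window_features(window):
    """Return [mean, standard deviation, dominant DFT bin] for one window."""
    n = len(window)
    mean = statistics.mean(window)
    std = statistics.pstdev(window)
    # Magnitudes of DFT bins 1..n//2 (bin 0 only carries the mean).
    mags = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(window))
        mags.append(abs(coeff))
    dominant_bin = 1 + max(range(len(mags)), key=mags.__getitem__)
    return [mean, std, float(dominant_bin)]


# Synthetic accelerometer-like trace: two cycles per 16 samples.
trace = [math.sin(2 * math.pi * 2 * t / 16) for t in range(32)]
feature_vectors = [window_features(w) for w in sliding_windows(trace, 16, 8)]
```

Each feature vector would then be passed to a classifier such as a random forest; a DL model, by contrast, would consume the raw windows directly.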
This paper presents a new Arithmetic Optimization Algorithm with LSTM Autoencoder (AOA-LSTMAE) technique for HAR in the IoT environment. The main aim of the AOA-LSTMAE technique is to recognize several types of human activities in the IoT environment. To accomplish this, the technique uses the P-ResNet model for feature extraction and the LSTMAE classification model for the recognition of different activities. To improve the recognition efficacy of the LSTMAE model, AOA is used for hyperparameter optimization. The AOA-LSTMAE algorithm is validated by simulation on benchmark activity recognition data.
LITERATURE REVIEW
Zhang et al. (2023) introduced a new architecture containing three parts: feature selection based on an oppositional and chaos particle swarm optimization (PSO) method, deep decision fusion based on Dempster-Shafer (D-S) evidence theory, and an entropy multi-input 1D-CNN that leverages frequency-domain and time-domain signals. The architecture was assessed on the WISDM and UCI HAR datasets. Slim et al. (2021) presented a new approach for enhancing the DL structure by means of a genetic algorithm (GA) and novel statistical features. The GA is used to find the optimum values of the DL variables, and the novel statistical features are added to the features extracted automatically by the CNN. Khan et al. (2022) established a hybrid method that merges LSTM and CNN for activity detection, where the CNN extracts spatial features and the LSTM network learns temporal information. An extensive ablation study was carried out over various conventional ML and DL methods to attain an optimum solution for HAR. In addition, a novel challenging dataset was generated using the Kinect V2 sensor.
Pesenti et al. (2023) modeled a DL-based method that uses inertial sensors to provide industrial exoskeletons with HAR and adaptive payload compensation. Inertial measurement units are easily wearable or can be embedded in any industrial exoskeleton. The authors used LSTM networks to perform HAR and to categorize the weight of lifted objects. The method was trained and tested on data from 12 young, healthy volunteers. Basak et al. (2022) devised DSwarm-Net, a framework that combines DL with a swarm intelligence (SI)-based meta-heuristic and uses 3-D skeleton data for action classification. Malik et al. (2023) developed a multi-view interaction-level action detection mechanism that uses 2-D skeleton data and a DL structure to achieve higher precision while minimizing computational complexity. The system extracts 2-D skeleton data from the dataset with the OpenPose approach, and the extracted 2-D skeleton features are then fed as input to a CNN-LSTM structure for detecting actions. Nafea et al. (2021) introduced a novel approach that uses a CNN with varying kernel dimensions and a bidirectional LSTM (Bi-LSTM) to capture features at various resolutions. The novelty of that study lies in the effective extraction of temporal and spatial features from sensor data and the selection of the optimum video representation using Bi-LSTM and CNN. Muaaz et al. (2022) presented Wi-Sense, a HAR system that uses a CNN to detect human actions from the fingerprint derived from Wi-Fi channel state information.
THE PROPOSED MODEL
In this paper, we have presented a novel AOA-LSTMAE model for the automated recognition and classification of human activities to aid elderly and disabled people. The main aim of the AOA-LSTMAE technique is to recognize several types of human activities in the IoT environment. To accomplish this, the technique comprises a P-ResNet feature extractor, LSTMAE classification, and AOA-based hyperparameter tuning. Figure 1 illustrates the overall procedure of the AOA-LSTMAE algorithm.
Feature extraction
The P-ResNet model is used to produce feature vectors. P-ResNet is derived from ResNet and offers a technique for data classification (Xu et al., 2022). The network structure of P-ResNet includes six parts: five convolutional stages and a final fully connected (FC) layer. Convolution, batch normalization (BN), and the ReLU activation function together produce the output of each convolutional layer. Moreover, to prevent overfitting and to reduce the computation and the number of parameters in the network, average pooling and max pooling are applied. The input image is resized to 224×224×3. The first convolutional layer of the P-ResNet network is a 7×7 convolution, whose receptive field is large enough for extracting image features in these databases. In the task for which P-ResNet was designed, the categorization of maize seeds, more subtle features had to be extracted, and the network depth had to be chosen so as to keep the model small. Thus, convolutional stages 2 to 5 were enhanced to better suit the classification. The design uses 24 3×3 convolutional layers for learning, with additional nonlinear activation functions that make the decision functions more discriminative while efficiently reducing the number of parameters. Additionally, in online inspection in the seed processing industry, the main area occupies a small region of the image, and the proportion of useful data is low. To avoid useless and redundant data, a pooling layer was added to aggregate spatial information before the convolutional kernels of the remaining downsampling stages.
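The claim that stacks of 3×3 convolutions reduce the parameter count can be checked with simple arithmetic. The sketch below compares three stacked 3×3 layers with a single 7×7 layer covering the same receptive field; it assumes stride 1, equal input and output channel counts, and no biases, and the channel count of 64 is an illustrative value rather than a P-ResNet setting.

```python
def stacked_conv_weights(kernel, channels, layers):
    """Weights in `layers` square convolutions with `channels` in and out (no biases)."""
    return layers * kernel * kernel * channels * channels


def receptive_field(kernel, layers):
    """Receptive field of `layers` stacked stride-1 convolutions of the given kernel size."""
    return 1 + layers * (kernel - 1)


channels = 64  # assumed for illustration
three_small = stacked_conv_weights(3, channels, layers=3)  # 27 * C^2
one_large = stacked_conv_weights(7, channels, layers=1)    # 49 * C^2
same_field = receptive_field(3, 3) == receptive_field(7, 1)
```

Under these assumptions the stacked 3×3 layers need 27C² weights against 49C² for the 7×7 layer, while also inserting two extra nonlinearities.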
Activity recognition using the LSTMAE model
The LSTMAE model is used to identify several kinds of human activities. The LSTM network is a refined version of the RNN that remembers long-term dependencies effectively (Faraz et al., 2020). RNNs suffer from the vanishing gradient problem, which the LSTM network resolves. The keystone of the LSTM is the cell (or memory unit). A cell incorporates one tanh layer and three sigmoid layers, which form three gates that regulate the flow of data into and out of the cell. The input and output gates control the data entering and leaving the cell, respectively. The forget gate, which has a sigmoid function, resets the memory unit. Given the input x _{z} , the data flow in an LSTM cell is expressed using the following equations:
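In the standard LSTM formulation, and in the gate notation of this section, the cell update can be written as below. The weight matrices $W$, the bias vectors $b$, the sigmoid symbol $\sigma$, and the candidate state $\widetilde{C}_{z}$ are notation assumed here for illustration.

```latex
\begin{aligned}
f_{z} &= \sigma\left(W_{f}\,[h_{z-1}, x_{z}] + b_{f}\right) \\
i_{z} &= \sigma\left(W_{i}\,[h_{z-1}, x_{z}] + b_{i}\right) \\
\widetilde{C}_{z} &= \tanh\left(W_{C}\,[h_{z-1}, x_{z}] + b_{C}\right) \\
C_{z} &= f_{z} * C_{z-1} + i_{z} * \widetilde{C}_{z} \\
o_{z} &= \sigma\left(W_{o}\,[h_{z-1}, x_{z}] + b_{o}\right) \\
h_{z} &= o_{z} * \tanh\left(C_{z}\right)
\end{aligned}
```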
where tanh denotes the hyperbolic tangent function, * signifies the point-wise multiplication operator, and the new candidate cell state ( $\widetilde{C}$ ) carries the novel information. o _{z} , f _{z} , and i _{z} indicate the output, forget, and input gates at time z, respectively. C _{z} denotes the cell state vector, and h _{z} denotes the hidden state at the current time z.
An autoencoder (AE) is an artificial neural network (ANN) with two parts: an encoder h = f( x) and a decoder that generates a reconstruction $\widehat{x}=g(h)$ . The model is forced to prioritize the relevant properties of the input; that is, an AE learns useful aspects of the data.
AE is used to encode and compress the data. It is an unsupervised ANN that builds a reduced encoded representation of the data and then learns to reconstruct the data from it. The AE contains LSTM layers in the encoder and decoder units, and dropout is applied after each LSTM layer as a regularization technique to avoid overfitting. First, the AE is trained. Then, the encoder part is used as the feature generator. Lastly, the LSTM-based predictor is trained. Figure 2 represents the structure of the LSTMAE model.
Let w denote the timestep (window length) in the time-sequence data, so that x _{z}, x _{z+1}, …, x _{z+w} are used to forecast the next value. X denotes the AE-LSTM network input and can be shown as follows:
$X=\{x_{z},x_{z+1},\ldots,x_{z+w}\}$
x _{z+w+1} is used as the target during the training stage.
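The windowing described here, with x _{z}, …, x _{z+w} as input and x _{z+w+1} as target, can be sketched as follows. The function name `make_windows` and the toy series are hypothetical and serve only to show the indexing.

```python
def make_windows(series, w):
    """Pair each input window x_z..x_{z+w} with its target x_{z+w+1}."""
    inputs, targets = [], []
    for z in range(len(series) - w - 1):
        inputs.append(series[z:z + w + 1])   # w + 1 consecutive samples
        targets.append(series[z + w + 1])    # the sample that follows the window
    return inputs, targets


# Toy series standing in for one feature channel.
series = list(range(10))
inputs, targets = make_windows(series, w=3)
```

With ten samples and w = 3, this yields six pairs, the first being ([0, 1, 2, 3], 4).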
AOA-based hyperparameter tuning
Finally, AOA is used for the optimal hyperparameter tuning of the LSTMAE approach. Population-based optimizers begin the improvement procedure from a randomly generated set of candidate solutions (Deepa and Chokkalingam, 2022). A set of optimization rules incrementally improves the generated solutions, and an objective function evaluates them. Repeating this process raises the probability of reaching the global optimum of the given problem. Population-based optimization proceeds in two main phases: exploration and exploitation. Exploration covers the search space broadly, whereas exploitation improves the solutions acquired during exploration. The subsequent subsections describe diversification (exploration) and intensification (exploitation) in AOA. Multiplication, division, subtraction, and addition are the main arithmetic operators, as explained in Algorithm 1.
Inspiration
Arithmetic is an essential element of number theory and, together with geometry, algebra, and analysis, one of the most important parts of modern mathematics. AOA uses the arithmetic operators as a mathematical optimizer to determine the optimum element, subject to specific conditions, from a group of candidate solutions.
Initialized step
Equation (8) displays the group of candidate solutions ( Y) in AOA. In every iteration, the best solution acquired so far is regarded as the optimum candidate.
Input: Initialize the AOA parameters with the maximal iteration count
Output: Acquire the optimum solution
While ( c_iteration < M_iteration) do
  Estimate the fitness function (FF)
  Attain the optimum solution found so far
  Update the value of M _{OA}
  For ( j = 1 to solutions) do
    For ( k = 1 to positions) do
      Create the random values R _{1}, R _{2}, and R _{3} between zero and one
      If R _{1} > M _{OA} Then
        Diversification stage
        If R _{2} > 0.5 Then
          Execute the division math operator
          Update the kth position of the jth solution
        Else
          Execute the multiplication math operator
          Update the kth position of the jth solution
        End If
      Else
        Intensification stage
        If R _{3} > 0.5 Then
          Execute the subtraction math operator
          Update the kth position of the jth solution
        Else
          Execute the addition math operator
          Update the kth position of the jth solution
        End If
      End If
    End For
  End For
  c_iteration = c_iteration + 1
End While
Before AOA starts working, the search phase (exploration or exploitation) must be chosen. Equation (9) computes the math optimizer accelerated ( M _{OA} ) function.
M _{OA} ( c_iteration) denotes the function value at the current iteration c_iteration, which runs up to the maximal iteration count M_iteration; it is computed from the minimal and maximal values of the accelerated function.
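In the original AOA formulation, the accelerated function grows linearly with the iteration count. A sketch in the notation of this section is given below; the symbols $Min$ and $Max$ for the smallest and largest values of the function are assumed names.

```latex
M_{OA}(c\_iteration) = Min + c\_iteration \times \frac{Max - Min}{M\_iteration}
```

Because M _{OA} rises toward $Max$ as the iterations proceed, the test R _{1} > M _{OA} in Algorithm 1 succeeds less often over time, so the search shifts gradually from diversification to intensification.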
Diversification stage
This stage covers the diversification (exploration) behavior of AOA. The division and multiplication operators yield highly distributed values, which suits them to exploring the search space. The exploration operators of AOA therefore search for a near-optimum solution by means of division and multiplication, following the simplest rule that mimics the behavior of the arithmetic operators. Equation (10) gives the position update of the exploration step.
y _{j,k} ( c_iteration + 1) denotes the kth position of the jth solution in the next iteration. δ is a small constant, and α is the control parameter. U _{k} and L _{k} are the upper and lower bounds of the kth position.
At the current iteration, the function value is given by M _{OA} ( c_iteration). Additionally, β is the sensitive parameter.
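The exploration update and the coefficient that carries the sensitive parameter β can be written as below, following the original AOA formulation and the branch condition of Algorithm 1. Two names are assumptions introduced for illustration: $best(y_{k})$, the kth position of the best solution found so far, and $M_{OP}$, the coefficient known in the original AOA formulation as the math optimizer probability. $U_{k}$ and $L_{k}$ are the upper and lower bounds of the kth position.

```latex
y_{j,k}(c\_iteration + 1) =
\begin{cases}
best(y_{k}) \div (M_{OP} + \delta) \times \big((U_{k} - L_{k}) \times \alpha + L_{k}\big), & R_{2} > 0.5 \\
best(y_{k}) \times M_{OP} \times \big((U_{k} - L_{k}) \times \alpha + L_{k}\big), & \text{otherwise}
\end{cases}

M_{OP}(c\_iteration) = 1 - \frac{c\_iteration^{1/\beta}}{M\_iteration^{1/\beta}}
```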
Intensification stage
Addition and subtraction produce highly dense results. Because of their low dispersion, subtraction and addition can easily approach the target. Consequently, the exploitation search finds the near-optimum solution after a few iterations. Equation (12) describes the exploitation step.
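In the original AOA formulation, and with the branch condition of Algorithm 1, the exploitation update takes the form below. As before, $best(y_{k})$ (the kth position of the best solution so far) and $M_{OP}$ (the iteration-dependent coefficient of the original formulation) are names assumed for illustration.

```latex
y_{j,k}(c\_iteration + 1) =
\begin{cases}
best(y_{k}) - M_{OP} \times \big((U_{k} - L_{k}) \times \alpha + L_{k}\big), & R_{3} > 0.5 \\
best(y_{k}) + M_{OP} \times \big((U_{k} - L_{k}) \times \alpha + L_{k}\big), & \text{otherwise}
\end{cases}
```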
The exploitation search operators help to avoid becoming trapped in a local search region. The random values R _{1}, R _{2}, and R _{3} are drawn from the interval between zero and one. The exploitation search step thus supports reaching the optimum solution.
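The loop of Algorithm 1 can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the implementation used in the experiments: the function name `aoa_minimize`, the default values (`moa_min=0.2`, `moa_max=1.0`, `alpha=0.5`, `beta=5.0`, `delta=1e-12`), the clamping of positions to the bounds, and the `mop` coefficient (taken from the original AOA formulation) are all assumptions. The objective is minimized here, so a score to be maximized would be negated.

```python
import random


def aoa_minimize(objective, lower, upper, n_solutions=20, max_iter=200,
                 moa_min=0.2, moa_max=1.0, alpha=0.5, beta=5.0,
                 delta=1e-12, seed=0):
    """Minimize `objective` over box bounds with an AOA-style search."""
    rng = random.Random(seed)
    dim = len(lower)
    # Random initial candidate solutions Y (one row per solution).
    pop = [[rng.uniform(lower[k], upper[k]) for k in range(dim)]
           for _ in range(n_solutions)]
    best = min(pop, key=objective)[:]
    best_val = objective(best)
    for c in range(1, max_iter + 1):
        # Accelerated function: rises linearly from moa_min toward moa_max.
        moa = moa_min + c * (moa_max - moa_min) / max_iter
        # Assumed coefficient from the original AOA: falls from near 1 to 0.
        mop = 1.0 - (c ** (1.0 / beta)) / (max_iter ** (1.0 / beta))
        for j in range(n_solutions):
            for k in range(dim):
                r1, r2, r3 = rng.random(), rng.random(), rng.random()
                scale = (upper[k] - lower[k]) * alpha + lower[k]
                if r1 > moa:        # diversification stage
                    if r2 > 0.5:    # division operator
                        value = best[k] / (mop + delta) * scale
                    else:           # multiplication operator
                        value = best[k] * mop * scale
                else:               # intensification stage
                    if r3 > 0.5:    # subtraction operator
                        value = best[k] - mop * scale
                    else:           # addition operator
                        value = best[k] + mop * scale
                # Keep the position inside its bounds (an added safeguard).
                pop[j][k] = min(max(value, lower[k]), upper[k])
            val = objective(pop[j])
            if val < best_val:      # keep the best solution found so far
                best, best_val = pop[j][:], val
    return best, best_val
```

For hyperparameter tuning, `objective` would be a hypothetical function that trains the LSTMAE with the candidate settings (for example, learning rate and dropout) and returns a validation error. A toy call such as `aoa_minimize(lambda v: sum(x * x for x in v), [-5.0, -5.0], [10.0, 10.0])` exercises the same loop.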
The choice of fitness function is a key aspect of the AOA system. An encoded solution is used to evaluate the goodness of candidate solutions. Here, the accuracy value is the main criterion used to design the fitness function.
where TP signifies the true-positive value and FP refers to the false-positive value.
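One plausible reading of a fitness built only from TP and FP is the precision-style ratio TP / (TP + FP), to be maximized. The sketch below assumes that form; the function name `precision_fitness` and the sample counts are hypothetical.

```python
def precision_fitness(tp, fp):
    """Share of positive predictions that are correct: TP / (TP + FP)."""
    total = tp + fp
    # Assumed convention: no positive predictions scores zero.
    return tp / total if total else 0.0


# Illustrative counts only: 73 true positives and 1 false positive.
score = precision_fitness(73, 1)
```

An optimizer that minimizes its objective would use the negated score, or one minus the score, as the value to minimize.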
RESULTS AND DISCUSSION
The experimental outcomes of the AOA-LSTMAE methodology are tested on the UR fall detection dataset, which comprises 314 instances in two classes, as depicted in Table 1. Figure 3 shows sample images.
The proposed technique is simulated using Python 3.6.5 on a PC with an i5-8600K processor, a GeForce 1050 Ti 4 GB GPU, 16 GB RAM, a 250 GB SSD, and a 1 TB HDD. The parameter settings are as follows: learning rate: 0.01, activation: ReLU, epoch count: 50, dropout: 0.5, and batch size: 5.
Figure 4 presents the activity recognition results of the AOA-LSTMAE technique in the form of a confusion matrix. The results show that the AOA-LSTMAE technique recognized the fall and nonfall events effectively.
Table 2 and Figure 5 report the overall activity detection outcome of the AOA-LSTMAE method under distinct epoch counts. With 500 epochs, the AOA-LSTMAE technique attains an average accu _{y} of 99.12%, prec _{n} of 99.12%, reca _{l} of 99.12%, spec _{y} of 99.12%, and F _{score} of 99.12%. With 1500 epochs, it attains an average accu _{y} of 98.91%, prec _{n} of 98.46%, reca _{l} of 98.91%, spec _{y} of 98.91%, and F _{score} of 98.68%. With 2000 epochs, it attains an average accu _{y} of 98.23%, prec _{n} of 98.23%, reca _{l} of 98.23%, spec _{y} of 98.23%, and F _{score} of 98.23%. Finally, with 3000 epochs, it reaches an average accu _{y} of 97.35%, prec _{n} of 97.35%, reca _{l} of 97.35%, spec _{y} of 97.35%, and F _{score} of 97.35%.
Class | Accu _{y} | Prec _{n} | Reca _{l} | Spec _{y} | F _{score} |
---|---|---|---|---|---|
Epoch 500 | | | | | |
Fall event | 98.65 | 98.65 | 98.65 | 99.58 | 98.65 |
Nonfall event | 99.58 | 99.58 | 99.58 | 98.65 | 99.58 |
Average | 99.12 | 99.12 | 99.12 | 99.12 | 99.12 |
Epoch 1000 | | | | | |
Fall event | 97.30 | 94.74 | 97.30 | 98.33 | 96.00 |
Nonfall event | 98.33 | 99.16 | 98.33 | 97.30 | 98.74 |
Average | 97.82 | 96.95 | 97.82 | 97.82 | 97.37 |
Epoch 1500 | | | | | |
Fall event | 98.65 | 97.33 | 98.65 | 99.17 | 97.99 |
Nonfall event | 99.17 | 99.58 | 99.17 | 98.65 | 99.37 |
Average | 98.91 | 98.46 | 98.91 | 98.91 | 98.68 |
Epoch 2000 | | | | | |
Fall event | 97.30 | 97.30 | 97.30 | 99.17 | 97.30 |
Nonfall event | 99.17 | 99.17 | 99.17 | 97.30 | 99.17 |
Average | 98.23 | 98.23 | 98.23 | 98.23 | 98.23 |
Epoch 2500 | | | | | |
Fall event | 97.30 | 98.63 | 97.30 | 99.58 | 97.96 |
Nonfall event | 99.58 | 99.17 | 99.58 | 97.30 | 99.38 |
Average | 98.44 | 98.90 | 98.44 | 98.44 | 98.67 |
Epoch 3000 | | | | | |
Fall event | 95.95 | 95.95 | 95.95 | 98.75 | 95.95 |
Nonfall event | 98.75 | 98.75 | 98.75 | 95.95 | 98.75 |
Average | 97.35 | 97.35 | 97.35 | 97.35 | 97.35 |
Abbreviation: AOA-LSTMAE, Arithmetic Optimization Algorithm with LSTM Autoencoder.
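The per-class measures in the table follow from the entries of a two-class confusion matrix. The sketch below shows the computation for the fall class. The counts are assumptions, not values stated in the text: 74 fall and 240 nonfall instances with one error in each class are chosen because they sum to 314 and are consistent with the epoch-500 row.

```python
def binary_metrics(tp, fn, fp, tn):
    """Precision, recall, specificity, and F-score (in %) for the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f_score = 2 * precision * recall / (precision + recall)
    return {
        "precision": round(100 * precision, 2),
        "recall": round(100 * recall, 2),
        "specificity": round(100 * specificity, 2),
        "f_score": round(100 * f_score, 2),
    }


# Assumed counts for the fall class: 73 detected falls, 1 missed fall,
# 1 nonfall flagged as a fall, 239 correctly rejected nonfalls.
fall = binary_metrics(tp=73, fn=1, fp=1, tn=239)
```

Swapping the roles of the two classes (`tp=239, fn=1, fp=1, tn=73`) gives the nonfall row in the same way.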
Figure 6 portrays the training and validation accuracy of the AOA-LSTMAE method on epoch 500. The results indicate that the accuracy of the AOA-LSTMAE algorithm increases as the number of epochs grows. Furthermore, the validation accuracy exceeding the training accuracy indicates that the AOA-LSTMAE method learns effectively on epoch 500.
Figure 7 gives the training and validation loss of the AOA-LSTMAE method on epoch 500. The results show that the training and validation loss values remain close to each other, which indicates that the AOA-LSTMAE algorithm learns effectively on epoch 500.
Figure 8 gives the detailed precision-recall (PR) curve of the AOA-LSTMAE approach on epoch 500. The figure shows that the AOA-LSTMAE approach attains high PR values on all classes.
Figure 9 shows a receiver operating characteristic (ROC) study of the AOA-LSTMAE method on epoch 500. The figure shows that the AOA-LSTMAE method attains high ROC values in every class.
Table 3 and Figure 10 give a brief comparison of the accu _{y} of the AOA-LSTMAE technique with that of existing methods (Vaiyapuri et al., 2021). The results show that the ResNet-50 and ResNet-101 approaches perform worst, with accu _{y} of 95.40% and 96.20%, respectively. The VGG-16, VGG-19, and IMEFD-ODCNN models achieve closer accu _{y} values of 97.60%, 98.00%, and 98.57%, respectively. The AOA-LSTMAE technique exhibits the best result, with an accu _{y} of 99.12%.
Methods | Accuracy (%) |
---|---|
VGG-16 | 97.60 |
VGG-19 | 98.00 |
ResNet-50 | 95.40 |
ResNet-101 | 96.20 |
IMEFD-ODCNN | 98.57 |
AOA-LSTMAE | 99.12 |
Abbreviation: AOA-LSTMAE, Arithmetic Optimization Algorithm with LSTM Autoencoder.
The computation time of the AOA-LSTMAE algorithm and the existing methods was analyzed in terms of training time (TRT) and testing time (TST), as shown in Table 4 and Figure 11. The AOA-LSTMAE algorithm attains the lowest TRT, 9.01 s, and the lowest TST, 8.30 s, whereas all the existing models require more time on both measures.
Methods | Training time (seconds) | Testing time (seconds) |
---|---|---|
VGG-16 | 39.21 | 18.48 |
VGG-19 | 46.31 | 22.87 |
ResNet-50 | 23.68 | 14.65 |
ResNet-101 | 25.76 | 15.43 |
IMEFD-ODCNN | 16.90 | 11.29 |
AOA-LSTMAE | 9.01 | 8.30 |
Abbreviations: AOA-LSTMAE, Arithmetic Optimization Algorithm with LSTM Autoencoder; TRT, training time; TST, testing time.
Therefore, the results confirm that the AOA-LSTMAE technique improves recognition results over the other models.
CONCLUSION
In this paper, we have developed a novel AOA-LSTMAE model for the automated recognition and classification of human activities to aid elderly and disabled people. The main aim of the AOA-LSTMAE technique is to recognize several types of human activities in the IoT environment. To accomplish this, the technique comprises a P-ResNet feature extractor, LSTMAE classification, and AOA-based hyperparameter tuning, in which AOA serves as a hyperparameter optimizer to improve the recognition efficacy of the LSTMAE model. The AOA-LSTMAE system was validated by simulation on benchmark activity recognition data, and the results showed an improvement of the proposed model over other recent algorithms. In the future, a ten-fold cross-validation approach can be applied to investigate the performance of the proposed model.