Article title: Addressing Polymorphic Attack Strategies with Misbehavior Detection for ITS


ABSTRACT
Cooperative Intelligent Transportation Systems (cITSs) are an Internet of Things (IoT) application whose purpose is to improve road safety and traffic efficiency. Within such a system, vehicles communicate with one another by establishing a Vehicular Ad-Hoc Network (VANET) along a road section. Although this connectivity facilitates the exchange of information related to road safety and traffic efficiency, it also puts vehicles at risk: an attacker could compromise one or more vehicles and use them to share false information, causing congestion and/or life-threatening accidents. Several studies have tried to address this issue, but they assume that the network topology and/or the attack behavior is stationary. This is not realistic, because the cITS is dynamic in nature and attackers may have the ability and resources to change their behavior continuously. These assumptions therefore lead to low detection accuracy and high false alarm rates. To this end, this paper proposes a misbehavior detection model that can cope with the dynamicity of both the cITS topology and the attack behavior. The model starts by addressing the missing data that occur at the early stages of model formation after a topology change. A deep learning approach is then used to select the discriminative features used to train the detection model. We expect the proposed model to help overcome the limitations of related solutions by detecting attacks that change their behavior continuously.

INTRODUCTION
The purpose of Cooperative Intelligent Transportation Systems (cITSs) is to improve road safety and traffic efficiency. Within a road section, vehicles create a network to communicate and exchange different types of data about the traffic situation and safety. However, such connectivity exposes the cITS vehicles (called nodes) to many threats originating outside as well as inside the network, in the form of hijacked, rogue, and/or faulty nodes.

Research Motivation
-Vehicles in the cITS are vulnerable to many forms of cyberattacks that compromise the data exchanged between vehicles, causing operational disruptions such as road congestion and accidents. Existing misbehavior detection system (MDS) solutions try to protect the cITS by inspecting the patterns of the behavioral signatures of cyberattacks. However, these solutions are built on unrealistic assumptions about the reliability of the data exchanged between the nodes in the cITS. Attackers could hijack one or more vehicles and manipulate the data they generate and share with neighboring nodes. Relying on such compromised information to build detection models therefore adversely affects the accuracy of those solutions.
-Existing MDS solutions assume that attacks target only a limited number of vehicles. However, this assumption does not necessarily hold, as some advanced attacks such as botnets can compromise most of the nodes in the network, which invalidates the honest-majority assumption. These solutions adopt the anomaly detection approach: they define the boundaries of normal behavior and consider instances falling outside these boundaries to be attacks.
-Existing solutions incorporate an adaptation mechanism into the detection model to adjust the security thresholds dynamically in real time and cope with the non-stationary nature of the cITS topology. However, they assume that the data related to the new driving situation are sufficient. This does not hold, as the amount of data collected at the early stages after a change in the cITS topology might not be sufficient to accurately determine and build the new thresholds of the security profiles. Therefore, these solutions cannot detect attacks accurately during that period.
-Existing solutions assume that attackers always follow identical or similar attack strategies. However, many attackers and malicious programs use obfuscation techniques and change their attack strategies to evade detection. Therefore, these solutions become outdated very quickly and are unable to cope with new attack strategies.
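
As an illustration of the anomaly detection approach mentioned in the points above, the following minimal sketch fits a boundary of normal behavior from observed values and flags anything that falls outside it. The "mean plus or minus k standard deviations" rule, the function names, and the sample speed values are assumptions chosen for illustration; they are not the method of any specific MDS solution.

```python
import statistics


def fit_boundary(normal_values, k=3.0):
    """Estimate a (low, high) boundary of normal behavior.

    Assumption: normal behavior is summarized as mean +/- k standard
    deviations of the observed values. Real MDS solutions may use
    different boundary definitions.
    """
    mean = statistics.fmean(normal_values)
    std = statistics.pstdev(normal_values)
    return mean - k * std, mean + k * std


def is_anomalous(value, boundary):
    """Return True if the value falls outside the normal boundary."""
    low, high = boundary
    return value < low or value > high


# Hypothetical reported speeds (m/s) from neighboring vehicles.
observed_speeds = [10.0, 12.0, 11.0, 13.0, 9.0]
boundary = fit_boundary(observed_speeds)

print(is_anomalous(11.0, boundary))  # a typical report
print(is_anomalous(30.0, boundary))  # a report far from the others
```

With only a handful of samples, as is the case shortly after a topology change, the estimated mean and deviation can shift noticeably with each new observation, so the boundary itself is unreliable. This is the situation described in the third point.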

Problem Statement
Although several studies have proposed MDS solutions for the cITS that assess the trustworthiness of the data exchanged between the nodes and cope with the dynamic nature of the topology, these solutions have several limitations that affect the detection accuracy.

Limitation #1
The amount of data collected at the early stages after a change in the cITS topology is insufficient to accurately determine and build the new thresholds of the security profiles, because the data contain many missing (null) as well as immature values. Based on the literature review, existing solutions estimate the redundancy and relevance coefficients by using a statistical approach; our contribution here is to use an autoencoder to estimate these coefficients.
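
For context, the kind of statistical estimate that existing solutions rely on can be sketched as follows. This is a minimal illustration under stated assumptions: relevance is taken as the absolute Pearson correlation between a feature and the label, and redundancy as the mean absolute correlation between a feature and the other features. The function names and the toy data are hypothetical, and the sketch does not implement the autoencoder-based estimate proposed here.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def relevance_and_redundancy(features, labels):
    """Return (relevance, redundancy) dictionaries keyed by feature name.

    Assumption: relevance = |corr(feature, label)| and
    redundancy = mean |corr(feature, other feature)|.
    """
    names = list(features)
    relevance = {n: abs(pearson(features[n], labels)) for n in names}
    redundancy = {}
    for n in names:
        others = [abs(pearson(features[n], features[m])) for m in names if m != n]
        redundancy[n] = sum(others) / len(others) if others else 0.0
    return relevance, redundancy


# Toy data: "b" duplicates the information in "a"; "c" is mostly unrelated.
features = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],
    "c": [1.0, -1.0, 1.0, -1.0],
}
labels = [0, 0, 1, 1]  # 1 = misbehaving, 0 = honest (hypothetical)

relevance, redundancy = relevance_and_redundancy(features, labels)
print(relevance)
print(redundancy)
```

A feature selector would then prefer features with high relevance and low redundancy. Such correlation-based coefficients are computed directly from the available samples, so missing and immature values affect them directly.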

-Semantic features will be extracted and fed into the detection model in order to cope with the dynamic nature of the attacks' behavior, improve the detection of polymorphic attacks, and re-adjust the security parameters accordingly.
Semantic features are features related to behavior. Examples include how many times a vehicle changes lane, how many data packets it sends per second, and how many accidents it reports per second. Such features give more insight into the behavior itself, so the model can distinguish more accurately between a misbehaving vehicle and an honest one.
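
The three example features above could be computed from a log of received messages roughly as follows. This is a minimal sketch: the `Message` record, its field names, and the one-second minimum window are assumptions made for illustration and do not reflect an actual cITS message format.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical record of one message received from a vehicle."""
    vehicle_id: str
    timestamp: float        # seconds
    lane: int               # lane index reported in the message
    reports_accident: bool  # True if the message reports an accident


def behavioral_features(messages):
    """Compute per-vehicle semantic (behavioral) features.

    Returns {vehicle_id: {"lane_changes", "packets_per_s",
    "accident_reports_per_s"}}.
    """
    by_vehicle = {}
    for msg in sorted(messages, key=lambda m: m.timestamp):
        by_vehicle.setdefault(msg.vehicle_id, []).append(msg)

    features = {}
    for vehicle_id, msgs in by_vehicle.items():
        # Observation window; assumed to be at least 1 s to avoid
        # dividing by zero when only one message was received.
        duration = max(msgs[-1].timestamp - msgs[0].timestamp, 1.0)
        lane_changes = sum(
            1 for prev, cur in zip(msgs, msgs[1:]) if prev.lane != cur.lane
        )
        accident_reports = sum(1 for m in msgs if m.reports_accident)
        features[vehicle_id] = {
            "lane_changes": lane_changes,
            "packets_per_s": len(msgs) / duration,
            "accident_reports_per_s": accident_reports / duration,
        }
    return features


log = [
    Message("v1", 0.0, 1, False),
    Message("v1", 1.0, 1, False),
    Message("v1", 2.0, 2, True),
    Message("v1", 3.0, 2, False),
    Message("v1", 4.0, 1, False),
]
print(behavioral_features(log)["v1"])
```

A vehicle that reports accidents or changes lanes far more often than its neighbors would stand out in these features even if each individual message looks plausible.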

Dataset
The dataset used in this research will be the Next Generation Simulation (NGSIM) Vehicle Trajectories dataset. It is an open, publicly available data set of real-world vehicle trajectories collected by smart vehicles, and it consists of different patterns representing different driving situations and driver behaviors. In particular, NGSIM was built by collecting data from vehicles moving on a 500 m long section of a seven-lane highway. For each vehicle, the data were recorded for 45 minutes using 16 sensors.

CONCLUSION
A misbehavior detection model is proposed that addresses the limitation of existing solutions, which assume that both the cITS topology and the attack strategy are stationary. The model consists of three main components: pre-processing, feature selection, and training/testing. The pre-processing component estimates the missing data that occur at the early stages of model formulation after a topology change. The discriminative features are chosen at the feature selection stage by employing a deep autoencoder to calculate the redundancy coefficients more accurately. Deep learning is then used to build the model by training a Deep Belief Network (DBN) on the data and features prepared and selected during the first and second phases. In our future work, the three components of the model will be developed and integrated into an MDS model that can