An intrusion detection system (IDS) is a device or software application that monitors network or system activities for malicious activity or policy violations and reports to a management station. IDSs come in a variety of "flavors" and approach the goal of detecting suspicious traffic in different ways. There are network-based (NIDS) and host-based (HIDS) intrusion detection systems. A NIDS is a network security system that monitors network traffic for attacks, including those that come from inside the network (authorized users). When NIDS designs are classified by system interactivity, there are two types: on-line and off-line. An on-line NIDS deals with the network in real time: it analyses Ethernet packets and applies a set of rules to them to decide whether they constitute an attack. An off-line NIDS deals with stored data and passes it through some process to decide whether it contains an attack. Some systems may attempt to stop an intrusion attempt, but this is neither required nor expected of a monitoring system. Intrusion detection and prevention systems (IDPS) are primarily focused on identifying possible incidents, logging information about them, and reporting attempts. In addition, organizations use IDPSes for other purposes, such as identifying problems with security policies, documenting existing threats and deterring individuals from violating security policies. IDPSes have become a necessary addition to the security infrastructure of nearly every organization.
IDPSes typically record information related to observed events, notify security administrators of important observed events and produce reports. Many IDPSes can also respond to a detected threat by attempting to prevent it from succeeding. They use several response techniques, which involve the IDPS stopping the attack itself, changing the security environment (e.g. reconfiguring a firewall) or changing the attack’s content.
Intrusion detection systems are of two main types, network based (NIDS) and host based (HIDS) intrusion detection systems.
1.2 Network Intrusion Detection Systems
Network Intrusion Detection Systems (NIDS) are placed at a strategic point or points within the network to monitor traffic to and from all devices on the network. A NIDS analyses the traffic passing on the entire subnet and matches it against a library of known attacks. Once an attack is identified or abnormal behavior is sensed, an alert can be sent to the administrator. For example, a NIDS can be installed on the subnet where the firewalls are located in order to see whether someone is trying to break into the firewall. Ideally one would scan all inbound and outbound traffic; however, doing so might create a bottleneck that would impair the overall speed of the network. OPNET and NetSim are commonly used tools for simulating network intrusion detection systems. NID systems are also capable of comparing the signatures of similar packets in order to link and drop harmful packets whose signature matches a record in the NIDS.
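The idea of matching traffic against a library of known attacks can be sketched as follows. This is only a minimal illustration: the signature names and byte patterns are hypothetical, and a real NIDS would use a far richer rule language than plain substring search.

```python
# Hypothetical signature library: name -> byte pattern (illustrative values only).
SIGNATURES = {
    "example-shell-probe": b"/bin/sh",
    "example-path-traversal": b"../../",
}


def match_signatures(payload, signatures=SIGNATURES):
    """Return the sorted names of all signatures whose pattern occurs in the payload."""
    return sorted(name for name, pattern in signatures.items() if pattern in payload)


def inspect(packets):
    """Yield (index, matched_names) for every packet payload that matches a signature."""
    for index, payload in enumerate(packets):
        matches = match_signatures(payload)
        if matches:
            yield index, matches


if __name__ == "__main__":
    traffic = [b"GET /index.html", b"GET /../../etc/passwd", b"exec /bin/sh"]
    for index, names in inspect(traffic):
        print(f"alert: packet {index} matches {names}")
```

Each yielded pair corresponds to an alert that would be sent to the administrator; packets without a match pass silently.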
1.3 Host Intrusion Detection Systems
Host Intrusion Detection Systems (HIDS) run on individual hosts or devices on the network. A HIDS monitors the inbound and outbound packets from the device only and will alert the user or administrator if suspicious activity is detected. It takes a snapshot of existing system files and matches it to the previous snapshot. If the critical system files were modified or deleted, an alert is sent to the administrator to investigate. An example of HIDS usage can be seen on mission critical machines, which are not expected to change their configurations.
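The snapshot comparison described above might look like the following sketch. It assumes that a "snapshot" is a mapping from file path to a SHA-256 digest of the file contents; the function names are hypothetical and are not taken from any particular HIDS product.

```python
import hashlib
from pathlib import Path


def take_snapshot(paths):
    """Map each path to the SHA-256 digest of its contents (None if the file is missing)."""
    snapshot = {}
    for path in paths:
        p = Path(path)
        snapshot[str(path)] = hashlib.sha256(p.read_bytes()).hexdigest() if p.is_file() else None
    return snapshot


def compare_snapshots(previous, current):
    """Return (path, reason) alerts for files that changed since the previous snapshot."""
    alerts = []
    for path, old_digest in previous.items():
        new_digest = current.get(path)
        if new_digest == old_digest:
            continue  # unchanged, nothing to report
        if new_digest is None:
            alerts.append((path, "deleted"))
        else:
            # Covers changed contents and files that appeared where none existed before.
            alerts.append((path, "modified"))
    return alerts
```

A monitoring loop would call `take_snapshot` periodically on the list of critical system files and forward any alerts returned by `compare_snapshots` to the administrator.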
• True Positive (attack, alert): A legitimate attack which triggers an IDS to produce an alarm.
• False Positive (no attack, alert): An event signaling an IDS to produce an alarm when no attack has taken place.
• False Negative (attack, no alert): No alarm is raised when an attack has taken place.
• True Negative (no attack, no alert): No attack has taken place and no alarm is raised.
• Noise: Data or interference that can trigger a false positive or obscure a true positive.
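The four outcomes listed above can be expressed as a small lookup. The sketch below assumes each observed event is recorded as a pair of booleans (whether an attack occurred, whether an alert was raised); the function names are illustrative only.

```python
def classify_outcome(attack_occurred, alert_raised):
    """Return 'TP', 'FP', 'FN' or 'TN' for a single observed event."""
    if attack_occurred:
        return "TP" if alert_raised else "FN"
    return "FP" if alert_raised else "TN"


def count_outcomes(events):
    """Tally outcomes over an iterable of (attack_occurred, alert_raised) pairs."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for attack_occurred, alert_raised in events:
        counts[classify_outcome(attack_occurred, alert_raised)] += 1
    return counts
```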
1.4 K-means Clustering
K-means clustering is one of the simplest unsupervised clustering algorithms. The algorithm takes an input parameter "k" and partitions a dataset of "n" objects into k clusters so that intra-cluster similarity is high and inter-cluster similarity is low. "K" is a positive integer given in advance. K-means clustering takes less time than hierarchical clustering and can yield better results.
With the help of clustering, the training dataset is divided into five subsets: four attack datasets, one for each type of intrusion, and one normal dataset containing normal traffic. The clustering algorithm consists of the following five steps:
1) Define the number of clusters K.
2) Initialize the K cluster centroids. This can be done by arbitrarily dividing all objects into K clusters, computing their centroids, and verifying that all centroids are different from each other. Alternatively, the centroids can be initialized to K arbitrarily chosen, different objects.
3) Iterate over all objects and compute the distances to the centroids of all clusters. Assign each object to the cluster with the nearest centroid.
4) Recalculate the centroids of the modified clusters.
5) Repeat steps 3 and 4 until the centroids no longer change.
A distance function is required in order to compute the distance (i.e. the dissimilarity) between two objects. The most commonly used distance function is the Euclidean one, which is defined as:
$$d(x, y) = \sqrt{\sum_{i=1}^{m} (x_i - y_i)^2}$$
where $x = (x_1, \ldots, x_m)$ and $y = (y_1, \ldots, y_m)$ are two input vectors with m quantitative features. In the Euclidean distance function, all features contribute equally to the function value. However, since different features are usually measured with different metrics or at different scales, they must be normalized before applying the distance function.
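The clustering steps and the Euclidean distance can be sketched in plain Python as follows. This is a minimal illustration, not the implementation used in the experiments: min-max scaling is assumed as the normalization method (the text does not name one), centroids are initialized to K arbitrarily chosen objects, and the `seed` and `max_iter` parameters are additions for reproducibility.

```python
import math
import random


def min_max_normalize(rows):
    """Rescale every feature to [0, 1] so that no single feature dominates the distance."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]


def euclidean(x, y):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def k_means(points, k, max_iter=100, seed=0):
    """Return (centroids, labels); assumes `points` contains no duplicate objects."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # step 2: K arbitrarily chosen objects
    labels = []
    for _ in range(max_iter):
        # Step 3: assign each object to the cluster with the nearest centroid.
        labels = [min(range(k), key=lambda c: euclidean(p, centroids[c])) for p in points]
        # Step 4: recalculate each centroid as the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        # Step 5: stop once the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels
```

For the intrusion data described here, `k` would be 5 (four attack types plus normal traffic), and each point would be the normalized vector of selected features for one connection record.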
1.5 SVM Classifier
An SVM classifier is used because it can produce better results for binary classification than other classifiers. In our proposed technique, a non-linear kernel function is used, and the resulting maximum-margin hyperplane is fitted in a transformed feature space, which is a Hilbert space of infinite dimension. The Gaussian radial basis function is given by:
$$K(x, x') = \exp\left(-\frac{\|x - x'\|^2}{2\sigma^2}\right)$$
Here $x'$ defines the center of the radial basis function, the vector $x$ is the pattern applied to the input, and $\sigma$ is a measure of the width of the Gaussian function with center $x'$. The input dataset, which has a large number of attributes, is transformed into data with k+1 attributes by performing the above steps. This data is then given to the SVM to detect whether there is an intrusion or not.
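As a small illustration of the kernel, the sketch below evaluates a Gaussian radial basis function, assuming the common form K(x, x') = exp(-||x - x'||^2 / (2 sigma^2)). The default `sigma=1.0` is an arbitrary choice for the example, not a value taken from the experiments.

```python
import math


def rbf_kernel(x, center, sigma=1.0):
    """Gaussian RBF value for input pattern x and center x', with width parameter sigma."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))
```

The kernel equals 1 when the pattern coincides with the center and decays towards 0 as the pattern moves away; a larger `sigma` makes the decay slower.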
1.6 Testing and Validation
For our experiments we use the KDD CUP 99 dataset. KDD CUP 1999 contains 41 fields as attributes and a 42nd field as a label. In our algorithm we use selected features. The 42nd field can be generalized as Normal, DoS, Probing, U2R, and R2L. The description of the KDD CUP 99 data used for our method is shown in Table 1. The performance of each method is measured according to the Accuracy, Detection Rate and False Positive Rate using the following expressions:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\text{Detection Rate} = \frac{TP}{TP + FN}$$
$$\text{False Positive Rate} = \frac{FP}{FP + TN}$$
where:
FN is False Negative
TN is True Negative
TP is True Positive
FP is False Positive
The detection rate is the number of attacks detected by the system divided by the number of attacks in the data set. The false positive rate is the number of normal connections that are misclassified as attacks divided by the number of normal connections in the data set.
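The three measures can be computed directly from the four counts. In the sketch below, detection rate and false positive rate follow the definitions given in the text; accuracy is assumed to be the usual ratio of correctly classified connections to all connections. The counts in the example are made up for illustration and are not experimental results.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all connections that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def detection_rate(tp, fn):
    """Attacks detected divided by the number of attacks in the data set."""
    return tp / (tp + fn)


def false_positive_rate(fp, tn):
    """Normal connections misclassified as attacks divided by all normal connections."""
    return fp / (fp + tn)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    tp, fn, fp, tn = 90, 10, 5, 95
    print(accuracy(tp, tn, fp, fn), detection_rate(tp, fn), false_positive_rate(fp, tn))
```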
1.7 Fuzzy Logic
Fuzzy logic is a problem-solving control system approach that lends itself to implementation in systems up to and including multichannel PC or workstation-based acquisition and control systems. It can be implemented in hardware, software, or both. It offers a simple way to arrive at a definite decision based on vague, ambiguous, imprecise, noisy, or missing input information.
The proposed system is based on fuzzy logic. It is a form of multi-valued logic derived from fuzzy set theory to deal with reasoning that is approximate rather than precise. In contrast with "crisp logic", where binary sets have binary logic, fuzzy logic variables may have a truth value that ranges between 0 and 1 and is not constrained to the two truth values of classic propositional logic. Using the Mamdani fuzzy model, the rules take the form:
$$R_i: \text{if } \tilde{x} \text{ is } A_i \text{ then } \tilde{y} \text{ is } B_i, \quad i = 1, 2, \ldots, k$$
where
$\tilde{x}$ is the input (antecedent) linguistic variable,
$A_i$ are the antecedent linguistic terms (constants),
$\tilde{y}$ is the output (consequent) linguistic variable,
$B_i$ are the consequent linguistic terms.
The fuzzy module is integrated with the AODV routing protocol. It consists of four components: the fuzzy factor withdrawal module, the fuzzy calculation module, the fuzzy confirmation module and the alarm packet generation module. The fuzzy factor withdrawal module extracts the parameters required for analysis from network traffic. These parameters are passed to the fuzzy calculation module, which applies various fuzzy rules and membership functions to calculate the fidelity level of the node. This fidelity level is compared with a threshold value in the fuzzy confirmation module to check the behavior of the node. If the fidelity level is less than the threshold, an alarm packet with the IP address of the detected malicious node is broadcast in the network.
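The flow from extracted parameter to alarm packet can be sketched as below. Everything specific here is an assumption made for illustration: the single input parameter (a packet forwarding ratio in [0, 1]), the triangular membership functions, the three rules, the output values and the threshold of 0.5 are all hypothetical. The output is combined with a simple weighted average of rule outputs (height defuzzification) rather than a full Mamdani centroid computation.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 outside (a, c), rising linearly to 1 at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Hypothetical rules: "if forwarding ratio is <term> then fidelity is <value>".
# Each entry is ((a, b, c) of the antecedent term, representative fidelity value).
RULES = [
    ((-0.5, 0.0, 0.5), 0.1),  # low forwarding ratio    -> low fidelity
    ((0.0, 0.5, 1.0), 0.5),   # medium forwarding ratio -> medium fidelity
    ((0.5, 1.0, 1.5), 0.9),   # high forwarding ratio   -> high fidelity
]

THRESHOLD = 0.5  # hypothetical threshold for the confirmation step


def fidelity_level(forward_ratio):
    """Fuzzy calculation: weighted average of rule outputs by firing strength."""
    x = min(max(forward_ratio, 0.0), 1.0)  # clamp to the range covered by the terms
    strengths = [triangular(x, *term) for term, _ in RULES]
    total = sum(strengths)
    return sum(s * value for s, (_, value) in zip(strengths, RULES)) / total


def check_node(ip_address, forward_ratio, threshold=THRESHOLD):
    """Fuzzy confirmation: return an alarm packet if fidelity is below the threshold."""
    if fidelity_level(forward_ratio) < threshold:
        return {"type": "ALARM", "malicious_node": ip_address}
    return None
```

In a real deployment the returned alarm would be broadcast through the routing protocol; here it is just a dictionary so that the decision logic can be inspected in isolation.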
1.8 Problem Statement
Intrusion detection faces a number of challenges: it must reliably detect malicious activities in a network and must perform efficiently to cope with the large amount of network traffic. Intrusion detection systems are gauged on their detection precision and detection stability.
The majority of existing systems face challenges such as a low detection rate and a high false alarm rate. A false alarm classifies a normal connection as an attack and therefore obstructs legitimate users' access to network resources.
These problems are due to the sophistication of the attacks and their intended similarity to normal behavior. More intelligence is brought into the IDS by means of machine learning. Theoretically, it is possible for a machine learning algorithm to achieve the best performance by maximizing the detection accuracy.
However, this requires infinite training sample sizes. This gives rise to the need to enhance detection precision and stability.
Early researchers focused on expert systems and statistical approaches, but when these encounter large datasets, the results become worse.
Network security has become key for many financial and business web applications. Intrusion detection is one of the approaches to resolving the problem of network security. The imperfection of intrusion detection systems (IDS) has given data mining an opportunity to make several important contributions to the field of intrusion detection. In recent years, data mining techniques have been used for building IDS. We propose a new approach that utilizes data mining techniques such as neuro-fuzzy models and radial basis support vector machines (SVM) to help an IDS attain a higher detection rate. The proposed technique has four major steps. First, k-means clustering is used to generate different training subsets. Then, based on the obtained training subsets, different neuro-fuzzy models are trained. Subsequently, a vector for SVM classification is formed. Finally, classification using a radial SVM is performed to detect whether an intrusion has happened or not. To illustrate the applicability and capability of the new approach, the experiments use the KDD CUP 1999 dataset.