This part of the research investigates and selects critical features from a given dataset (KDD Cup 1999). A wrapper technique can be used to select critical features for both normal and malicious data. Figure 3.6 describes activity M2 in the research framework and shows how the technique extracts malicious and normal features from the dataset supplied to the wrapper.
Consequently, the result of this technique is a critical feature set for each connection type.
The general aspects of feature selection in the wrapper approach are illustrated in Figure 3.5. This study adopts the wrapper approach to investigate critical features in traffic connections. The wrapper uses a classifier to determine the importance of the selected features, then selects and refines the correlated features that represent the behavior of each pattern in the dataset. The candidate classification technique used as the induction method in the wrapper is the decision tree (DT). The genetic algorithm (GA) is wrapped with the induction method as a random search method because of its global search strategy, which is inspired by the principle of natural selection. Different wrappers will be investigated, each consisting of the GA as the random search method wrapped with one of the suggested induction functions (classifiers). More details about these techniques can be found in Chapter 2. Chapter 4 discusses the design and experiments for phase 1, and a comparative analysis compares the enhancements with other published studies.
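The wrapper idea described above can be sketched in a few lines: a candidate feature subset is encoded as a 0/1 mask, the dataset is projected onto the selected columns, and the induction method scores the projection. This is a minimal sketch under stated assumptions, not the study's implementation. The names `project`, `wrapper_fitness`, and `stump_accuracy` are hypothetical, and `stump_accuracy` (training accuracy of a one-level decision tree) only stands in for the full decision tree used as the induction method.

```python
from typing import Callable, List, Sequence

Mask = Sequence[int]          # 1 = feature kept, 0 = feature dropped
Rows = List[List[float]]


def project(rows: Rows, mask: Mask) -> Rows:
    """Keep only the columns whose mask bit is 1."""
    return [[v for v, keep in zip(row, mask) if keep] for row in rows]


def wrapper_fitness(mask: Mask, rows: Rows, labels: List[int],
                    score: Callable[[Rows, List[int]], float]) -> float:
    """Score a feature subset by running the induction method on it."""
    if not any(mask):
        return 0.0  # an empty subset cannot classify anything
    return score(project(rows, mask), labels)


def stump_accuracy(rows: Rows, labels: List[int]) -> float:
    """Hypothetical stand-in for a decision tree: best single-threshold split."""
    best = 0.0
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            # Each side predicts its majority class.
            correct = sum(max(side.count(c) for c in set(side))
                          for side in (left, right) if side)
            best = max(best, correct / len(rows))
    return best


# Toy data (made up for illustration): column 0 separates the classes, column 1 is noise.
rows = [[0.1, 5.0], [0.2, 1.0], [0.9, 5.0], [0.8, 1.0]]
labels = [0, 0, 1, 1]
print(wrapper_fitness([1, 0], rows, labels, stump_accuracy))
print(wrapper_fitness([0, 1], rows, labels, stump_accuracy))
```

In a fuller experiment the `score` argument would be replaced by a real decision tree evaluated on held-out data; the wrapper logic around it stays the same.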
A feature selection method generally consists of four steps. In this study, candidate feature subsets are generated by the genetic algorithm (GA).
Stopping criteria based on the search itself are:
- Whether a predefined number of features has been selected
- Whether a predefined number of iterations has been reached
Stopping criteria based on an evaluation function are:
- Whether addition (or deletion) of any feature does not produce a better subset
- Whether an optimal subset according to some evaluation function is obtained
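The stopping criteria listed above can be expressed as a single check that the search calls once per iteration. The following is a minimal sketch; `StopConfig`, `should_stop`, and the returned reason strings are hypothetical names introduced for illustration, and "no better subset" is approximated here by comparing the two most recent best scores.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class StopConfig:
    max_iterations: int                      # predefined number of iterations
    target_features: Optional[int] = None    # predefined number of features
    optimal_score: Optional[float] = None    # score treated as "optimal"


def should_stop(cfg: StopConfig, iteration: int, n_selected: int,
                score_history: Sequence[float]) -> Optional[str]:
    """Return the reason for stopping, or None to keep searching."""
    # Criteria based on the search itself.
    if iteration >= cfg.max_iterations:
        return "max_iterations"
    if cfg.target_features is not None and n_selected == cfg.target_features:
        return "target_features"
    # Criteria based on the evaluation function.
    if (cfg.optimal_score is not None and score_history
            and score_history[-1] >= cfg.optimal_score):
        return "optimal_subset"
    if len(score_history) >= 2 and score_history[-1] <= score_history[-2]:
        return "no_improvement"
    return None


cfg = StopConfig(max_iterations=20, target_features=5, optimal_score=1.0)
print(should_stop(cfg, iteration=3, n_selected=4, score_history=[0.5, 0.6]))
print(should_stop(cfg, iteration=20, n_selected=4, score_history=[0.5, 0.6]))
```

Returning a reason rather than a bare boolean makes it easier to report which criterion ended a run.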
The genetic algorithm is mainly composed of three operators: reproduction, crossover, and mutation.
Reproduction selects good strings; crossover combines good strings to try to generate better offspring; mutation alters a string locally to attempt to create a better string. In each generation, the population is evaluated and tested for termination of the algorithm. If the termination criterion is not satisfied, the population is operated upon by the three GA operators and then re-evaluated. This procedure continues until the termination criterion is met. The working of the proposed wrapper method is shown in the figure below. GA is used as the random search method with one classifier, the decision tree (DT), as the induction method wrapped with GA. The relevant attributes identified by the proposed wrapper are then validated by different classifiers.
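The three operators can be illustrated on bit strings in which each bit marks whether a feature is selected. The text does not state which variants are used, so this sketch assumes roulette-wheel (fitness-proportionate) reproduction, single-point crossover, and bit-flip mutation; the function names are hypothetical.

```python
import random
from typing import List, Tuple

Bits = List[int]


def reproduce(population: List[Bits], fitnesses: List[float],
              rng: random.Random) -> List[Bits]:
    """Roulette-wheel selection (assumed variant): fitter strings are copied more often."""
    if sum(fitnesses) == 0:
        # No string is better than another, so pick uniformly.
        return [rng.choice(population)[:] for _ in population]
    chosen = rng.choices(population, weights=fitnesses, k=len(population))
    return [ind[:] for ind in chosen]


def crossover(a: Bits, b: Bits, rng: random.Random,
              rate: float) -> Tuple[Bits, Bits]:
    """Single-point crossover (assumed variant): swap the tails of two parents."""
    if len(a) > 1 and rng.random() < rate:
        point = rng.randint(1, len(a) - 1)
        return a[:point] + b[point:], b[:point] + a[point:]
    return a[:], b[:]


def mutate(ind: Bits, rng: random.Random, rate: float) -> Bits:
    """Bit-flip mutation: each bit is flipped independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in ind]


rng = random.Random(0)
child_a, child_b = crossover([0] * 8, [1] * 8, rng, rate=1.0)
print(child_a, child_b)
print(mutate(child_a, rng, rate=0.033))
```

Passing an explicit `random.Random` instance keeps runs reproducible, which helps when comparing wrappers across experiments.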
For the GA, the population size is 20, the number of generations is 20 (used as the terminating condition), the crossover rate is 0.6, and the mutation rate is 0.033.
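To show how these settings fit together, the sketch below runs a GA loop with the quoted population size, generation count, crossover rate, and mutation rate. Everything else is an assumption made for illustration: the operator variants (roulette-wheel selection, single-point crossover, bit-flip mutation), the name `run_ga`, and in particular `toy_fitness`, which rewards a made-up set of "relevant" feature indices and merely stands in for the decision tree accuracy the wrapper would compute.

```python
import random
from typing import Callable, List, Tuple

Mask = List[int]

# GA settings quoted in the text.
POP_SIZE = 20
GENERATIONS = 20          # terminating condition
CROSSOVER_RATE = 0.6
MUTATION_RATE = 0.033


def run_ga(n_features: int, fitness: Callable[[Mask], float],
           seed: int = 0) -> Tuple[Mask, float]:
    """Return the best mask seen and its fitness (fitness must be non-negative)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(POP_SIZE)]
    best, best_score = population[0][:], fitness(population[0])
    for generation in range(GENERATIONS):
        # Evaluate the population and remember the best string so far.
        scores = [fitness(ind) for ind in population]
        for ind, s in zip(population, scores):
            if s > best_score:
                best, best_score = ind[:], s
        if generation == GENERATIONS - 1:
            break  # termination criterion met
        # Reproduction (assumed roulette wheel); epsilon keeps weights positive.
        parents = rng.choices(population, weights=[s + 1e-9 for s in scores],
                              k=POP_SIZE)
        offspring = []
        for a, b in zip(parents[0::2], parents[1::2]):
            if rng.random() < CROSSOVER_RATE:  # assumed single-point crossover
                cut = rng.randint(1, n_features - 1)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            offspring.extend([a, b])
        # Bit-flip mutation also copies each string, so parents are never aliased.
        population = [[1 - g if rng.random() < MUTATION_RATE else g for g in ind]
                      for ind in offspring]
    return best, best_score


RELEVANT = {2, 5, 7}  # hypothetical indices, not taken from the study


def toy_fitness(mask: Mask) -> float:
    """Stand-in for classifier accuracy: reward relevant bits, penalise extras."""
    hits = sum(mask[i] for i in RELEVANT)
    extras = sum(mask) - hits
    return max(0.0, hits / len(RELEVANT) - 0.01 * extras)


# KDD Cup 1999 connection records have 41 features.
best_mask, best_score = run_ga(41, toy_fitness, seed=1)
print(sum(best_mask), "features selected, fitness", round(best_score, 3))
```

With only 20 strings and 20 generations the search is short, so results on a toy objective like this vary with the seed; the sketch is meant to show the control flow rather than to reproduce any reported result.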