Use the 10% subset of the KDD Cup 1999 dataset, available at http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html, for this exercise.
This assignment focuses on data preprocessing. Good data preprocessing leads to good analysis. Prepare this dataset by addressing
various preprocessing tasks, such as removing missing values, fixing headers, normalizing columns if need be, discretizing continuous
columns, and selecting features.
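
To make two of the tasks above concrete, here is a minimal sketch of min-max normalization and equal-width discretization on a single numeric column. It is only an illustration of the arithmetic, not part of the assignment, which is to be done in WEKA. The function names and the sample values are hypothetical, and the binning shown is one common scheme among several.

```python
from typing import List


def min_max_normalize(values: List[float]) -> List[float]:
    """Rescale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values: List[float], n_bins: int) -> List[int]:
    """Assign each value the index of one of n_bins equally wide intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Made-up sample values standing in for one continuous column (assumption).
sample = [0.0, 2.0, 5.0, 10.0]
normalized = min_max_normalize(sample)
binned = equal_width_bins(sample, 2)
print(normalized)
print(binned)
```

Whether normalization is needed at all is worth discussing in your write-up: decision trees split on thresholds, so rescaling a column usually does not change which splits are available.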
For this assignment, assume that you are preparing the data for analysis with a decision tree algorithm. You are not
expected to perform classification at this point. The main goal is to become familiar with this data and with data preprocessing
strategies that can affect your mining outcomes.
Use WEKA as the preprocessing tool for this exercise. Submit a document describing the preprocessing tasks you performed. Include
screenshots where necessary (please do not submit pictures of every single step unless they are necessary to your narrative).
The grade for this assignment is based on: (a) types of data preprocessing performed (20%); (b) articulation of reasoning (why you
performed each preprocessing step) (40%); (c) discussion of findings or obstacles in preprocessing (20%); (d) clarity of writing (20%).