Data Preprocessing

Use the 10% subset of the KDD Cup 1999 dataset, available at http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html, for this exercise.

This assignment focuses on data preprocessing; good preprocessing leads to good analysis. Prepare the dataset by addressing preprocessing tasks such as removing missing values, fixing headers, normalizing columns where needed, discretizing continuous columns, and selecting features.
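
To make a few of these tasks concrete, here is a minimal Python sketch of what header fixing, normalization, and discretization do to the data. It is an illustration only and not a substitute for the WEKA workflow. It assumes the attribute names are supplied in a "name: type." layout with the class labels on the first line; the helper names, the "label" column name, and the toy values are hypothetical. WEKA's own filters may use different defaults.

```python
# Illustrative sketch only: toy values and hypothetical helper names.

def parse_names(names_text):
    """Return attribute names from text in an assumed 'name: type.' layout."""
    names = []
    # The first line is assumed to list the class labels, so it is skipped.
    for line in names_text.strip().splitlines()[1:]:
        if ":" in line:
            names.append(line.split(":", 1)[0].strip())
    # The class column has no entry of its own; "label" is a made-up name.
    return names + ["label"]


def min_max(values):
    """Rescale numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: nothing to scale
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins):
    """Map each number to a bin index 0..n_bins-1 using equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


sample_names = """back,normal,smurf.
duration: continuous.
protocol_type: symbolic.
src_bytes: continuous.
"""
header = parse_names(sample_names)
src_bytes = [0, 50, 100, 150, 200]   # toy values, not taken from the dataset
scaled = min_max(src_bytes)
binned = equal_width_bins(src_bytes, 4)

print(header)
print(scaled)
print(binned)
```

Equal-width binning is only one discretization strategy; equal-frequency or class-aware binning may suit skewed columns better, and that choice is worth justifying in your write-up.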

For this assignment, assume that you are preparing the data for analysis with a decision tree algorithm. You are not expected to perform classification at this point. The assignment is mainly meant to familiarize you with this data and with preprocessing strategies that can affect your mining outcomes.
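
Because the target is a decision tree, one way to reason about feature selection is information gain, the entropy-based criterion that many decision tree learners use to choose splits. The assignment does not prescribe this method; the sketch below is one possible illustration in Python, with hypothetical function names and toy data, showing why an attribute that never varies contributes nothing while an attribute that separates the classes scores highly.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a nominal feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data, not taken from the dataset.
protocol = ["tcp", "tcp", "udp", "udp", "icmp", "icmp"]
flag = ["SF", "SF", "SF", "SF", "SF", "SF"]          # constant column
label = ["normal", "normal", "normal", "normal", "attack", "attack"]

ig_protocol = information_gain(protocol, label)
ig_flag = information_gain(flag, label)

print("protocol:", ig_protocol)
print("flag:", ig_flag)
```

In this toy example the constant column has zero gain and is a natural candidate for removal, which is the kind of reasoning the write-up should make explicit for each attribute you drop.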

Use WEKA as the preprocessing tool for this exercise. Submit a document describing the preprocessing tasks you performed. Include screenshots where necessary (please do not submit pictures of every single step unless they are necessary to your narrative).

The grade for this assignment is based on: (a) types of data preprocessing performed (20%); (b) articulation of reasoning, that is, why you performed each preprocessing step (40%); (c) discussion of findings or obstacles in preprocessing (20%); (d) clarity of writing (20%).