Data-Analysis

Applying pre-processing techniques and machine learning models to analyse data

Below are the details of the datasets used by the notebooks in this repository:

DataA : A time-series dataset collected from a set of motion sensors for wearable activity recognition. The data is given in time order, with 19,000 samples and 81 features. Missing values are marked as Not Available (NA), and the data also contains some outliers.
DataB : Handwritten digits 0, 1, 2, 3, and 4 (5 classes). The dataset contains 2066 samples with 784 features each, corresponding to a 28 x 28 gray-scale (0-255) image of the digit with its pixels arranged column-wise.
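Because each DataB sample stores its 28 x 28 pixels column-wise, a plain row-major reshape would give a transposed digit. The following is a minimal sketch of recovering the image from a flat 784-value vector, assuming the data has been loaded into a NumPy array; the helper name `row_to_image` and the synthetic sample are illustrative and not part of the repository.

```python
import numpy as np


def row_to_image(row, side=28):
    """Reshape a flat, column-wise pixel vector into a side x side image.

    Hypothetical helper: the name and signature are assumptions for this sketch.
    """
    pixels = np.asarray(row)
    if pixels.size != side * side:
        raise ValueError(f"expected {side * side} values, got {pixels.size}")
    # order="F" fills the output column by column, matching a column-wise layout.
    return pixels.reshape((side, side), order="F")


# Synthetic stand-in for one sample (not real DataB data).
sample = np.arange(784)
image = row_to_image(sample)
```

With this layout, the second value of the vector lands in the second row of the first column, which is what a column-wise arrangement implies.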

bank-additional : The dataset is available at https://archive.ics.uci.edu/ml/datasets/bank+marketing. The data relates to direct marketing campaigns of a Portuguese banking institution. The campaigns were based on phone calls. Often, more than one contact with the same client was required in order to assess whether the product (bank term deposit) would be subscribed ('yes') or not ('no').
DataDNA : Splice-junction data from DNA sequences. The dataset includes 2200 samples with 57 features. It is a binary classification problem: the class labels are either +1 or -1 and are given in the last column.
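Since the DataDNA labels sit in the last column, the features and labels have to be separated before training a classifier. The sketch below shows one way to do that, assuming the table is already in memory as a 2-D numeric array; the helper name `split_features_labels` and the small synthetic table are illustrative, and the actual file format and column count in the repository may differ.

```python
import numpy as np


def split_features_labels(table):
    """Split a 2-D table into features (all but last column) and labels (last column).

    Hypothetical helper: assumes the labels are +1 / -1 in the last column.
    """
    data = np.asarray(table)
    X = data[:, :-1]
    y = data[:, -1].astype(int)
    # Guard against anything other than the two expected class labels.
    unexpected = set(np.unique(y).tolist()) - {-1, 1}
    if unexpected:
        raise ValueError(f"unexpected class labels: {sorted(unexpected)}")
    return X, y


# Small synthetic table (not real DataDNA data): 6 samples, 5 features + 1 label.
rng = np.random.default_rng(0)
features = rng.integers(0, 4, size=(6, 5))
labels = np.array([1, -1, 1, 1, -1, -1])
table = np.column_stack([features, labels])

X, y = split_features_labels(table)
```

Checking the label set up front makes a misaligned or mis-parsed file fail early rather than during model fitting.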
