Predicting-Sales

Load the CSV file - Result is a Pandas dataframe
- Input:
  - Critic Score (Float with NaNs)
  - Publisher (Category)
  - User Score (Float with NaNs)
  - Genre (Category)
  - Name (String)
  - User Count (Float with NaNs)
  - Critic Count (Float with NaNs)
- Output: Global Sales (Float)
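A minimal sketch of the loading step. The inline CSV text stands in for `video_games_sales.csv`; the column names (`Critic_Score`, `User_Score`, and so on) and the rows themselves are made-up assumptions, so adjust them to match the real file header.

```python
import io
import pandas as pd

# Made-up stand-in for video_games_sales.csv; column names are assumptions.
csv_text = """Name,Genre,Publisher,Critic_Score,Critic_Count,User_Score,User_Count,Global_Sales
Space Racer,Racing,Acme,76,51,8.0,322,1.50
Farm Story,Simulation,Globex,,,,,0.40
Space Racer 2,Racing,Acme,81,40,7.5,120,2.10
"""

# With the real file this would be pd.read_csv("video_games_sales.csv").
df = pd.read_csv(io.StringIO(csv_text))

input_cols = ["Critic_Score", "Publisher", "User_Score", "Genre",
              "Name", "User_Count", "Critic_Count"]
X = df[input_cols]        # inputs
y = df["Global_Sales"]    # output

print(X.dtypes)           # score and count columns load as float because of the NaNs
print(X.isna().sum())     # number of NaNs per column
```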
Tokenize Names - Bag of Words
- A loop builds the vocab, then counts how many times each vocab word appears in each name
```
Put the Bag-of-Words code here later
```
- Result: a list of ints for each name; the length of that list is $n_b$ (the vocab size)
  - You will often see a setting like this: max vocab: 200 words (drop infrequent). This keeps the 200 most common words and drops the ones that rarely show up
  NOTE: Sometimes you may see an error that looks something like this:
```
Mat1 * Mat2, 
	Shapes (1007, 2)
	and (4, 97)
```
  This is why the shape of your data matters: the inner dimensions (2 and 4 here) have to match for the matrix multiplication to work
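The shape error above can be reproduced with a small sketch. The tensors are filled with zeros because only their shapes matter here.

```python
import torch

a = torch.zeros(1007, 2)
b = torch.zeros(4, 97)

failed = False
try:
    a @ b  # inner dimensions are 2 and 4, so this cannot be multiplied
except RuntimeError as err:
    failed = True
    print("matmul failed:", err)

# When the inner dimensions match, the result is (rows of left, cols of right).
c = torch.zeros(1007, 4) @ b
print(c.shape)  # torch.Size([1007, 97])
```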
Collect results from all rows (16720, $n_b$) - Save as tensor and call t_names
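One possible Bag-of-Words sketch, not necessarily the version that will end up in `main.py`. The game names are made up, and `tokenize`, `bag_of_words`, and `max_vocab` are hypothetical names chosen for illustration.

```python
import re
from collections import Counter

import torch

names = ["Space Racer", "Space Racer 2", "Farm Story"]  # made-up names

def tokenize(name):
    # Lowercase and split into runs of letters/digits.
    return re.findall(r"[a-z0-9]+", name.lower())

# Build the vocab: keep only the max_vocab most common words.
max_vocab = 200
counts = Counter(tok for name in names for tok in tokenize(name))
vocab = {word: i for i, (word, _) in enumerate(counts.most_common(max_vocab))}
n_b = len(vocab)

def bag_of_words(name):
    # One count per vocab word; words outside the vocab are dropped.
    row = [0] * n_b
    for tok in tokenize(name):
        if tok in vocab:
            row[vocab[tok]] += 1
    return row

# Collect all rows into one tensor of shape (number of names, n_b).
t_names = torch.tensor([bag_of_words(n) for n in names], dtype=torch.float32)
print(t_names.shape)
```

With the real dataset the first dimension would be 16720 instead of 3.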
Take user_score column
- Options:
  1. Remove rows with NaNs - this Pandas function is dropna()
  2. Imputation: fill in the missing values
    1. Use the average value to fill in all missing values - the Pandas function is fillna()
    2. Make it zero
    3. Train a model on the other columns to predict the user_score (do not use global_sales, since that is the output being predicted)
- Result: tensor(16720, 1)
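A short sketch of options 1, 2.1, and 2.2 on a tiny made-up column. The column name `User_Score` is an assumption about the CSV header.

```python
import numpy as np
import pandas as pd
import torch

df = pd.DataFrame({"User_Score": [8.0, np.nan, 6.0, np.nan]})  # made-up values

# Option 1: remove rows with NaNs.
dropped = df.dropna(subset=["User_Score"])

# Option 2.1: fill missing values with the column average (mean() skips NaNs).
mean_filled = df["User_Score"].fillna(df["User_Score"].mean())

# Option 2.2: fill missing values with zero.
zero_filled = df["User_Score"].fillna(0.0)

# Turn the chosen column into a tensor with one feature per row.
t_user_score = torch.tensor(mean_filled.to_numpy(), dtype=torch.float32).unsqueeze(1)
print(t_user_score.shape)  # torch.Size([4, 1])
```

Note that option 1 changes the number of rows, so the same rows would have to be dropped from every other column before concatenating.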
Take publisher column - One-Hot encoding without having to tokenize it - this Pandas function is get_dummies()
- Result: (16720, $n_p$) - $p$ for publisher
Take genre column - One-Hot encoding, same as publisher
- Result: (16720, $n_g$) - $g$ for genre
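A sketch of the one-hot step for both category columns. The publisher and genre values are made up, and the column names are assumptions about the CSV header.

```python
import pandas as pd
import torch

df = pd.DataFrame({
    "Publisher": ["Acme", "Globex", "Acme"],   # made-up values
    "Genre": ["Sports", "Action", "Puzzle"],
})

# One column per distinct category, 1.0 where the row has that category.
pub = pd.get_dummies(df["Publisher"], dtype=float)
gen = pd.get_dummies(df["Genre"], dtype=float)

t_publisher = torch.tensor(pub.to_numpy(), dtype=torch.float32)
t_genre = torch.tensor(gen.to_numpy(), dtype=torch.float32)

n_p, n_g = t_publisher.shape[1], t_genre.shape[1]
print(t_publisher.shape, t_genre.shape)
```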
Figure out the total size
```
torch.cat([t_names, t_user_score, ...], dim=1)  # dim=1 joins columns, so the row count stays the same
```
- Result: (16720, $n_b + 1 + 1 + 1 + n_p + n_g$)
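A sketch of the concatenation with tiny placeholder tensors. The sizes are made up, and only one numeric column (`t_user_score`) is included to keep it short; the other numeric columns would be added to the list the same way.

```python
import torch

# Tiny stand-in sizes; the real row count is 16720.
n_rows, n_b, n_p, n_g = 5, 4, 3, 2

t_names = torch.zeros(n_rows, n_b)
t_user_score = torch.zeros(n_rows, 1)
t_publisher = torch.zeros(n_rows, n_p)
t_genre = torch.zeros(n_rows, n_g)

# dim=1 stacks the tensors side by side; every tensor must have the same row count.
features = torch.cat([t_names, t_user_score, t_publisher, t_genre], dim=1)
input_dim = features.shape[1]

print(features.shape)  # torch.Size([5, 10])
```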
Build a neuron
```
from torch import nn
model = nn.Linear(input_dim, 1)  # input_dim features in, 1 predicted value out
```
Here input_dim is the number of input features the model expects for each sample in the dataset, which is the second dimension of the concatenated tensor.
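A sketch of passing a batch through the neuron. The batch is random data and `input_dim = 10` is an arbitrary stand-in, used only to show the shapes.

```python
import torch
from torch import nn

input_dim = 10                          # stand-in for the real feature count
features = torch.randn(5, input_dim)    # 5 samples, input_dim features each

model = nn.Linear(input_dim, 1)
pred = model(features)                  # one predicted value per sample

print(pred.shape)           # torch.Size([5, 1])
print(model.weight.shape)   # torch.Size([1, 10])
```

If `features` had a different number of columns than `input_dim`, the forward pass would raise the shape error described earlier.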
