
Book dataset statistics can't align #34

@Tokkiu

Hi, I used the preprocessing script you provided to process the book dataset. I downloaded the data file from the website you mentioned. However, I got the following book statistics:

total items: 367982
total users: 603668
total behaviors: 8898041

The processed data you provided, however, has the following statistics:
total items: 313966
total users: 459133
total behaviors: 8898041
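
To make the gap concrete, the two sets of statistics above can be subtracted field by field. This is only a small sketch; the dictionary names `mine` and `provided` are illustrative and do not come from the repository.

```python
# Statistics from my run of the preprocessing script
mine = {"items": 367982, "users": 603668, "behaviors": 8898041}
# Statistics of the processed data provided with the repository
provided = {"items": 313966, "users": 459133, "behaviors": 8898041}

# Per-field difference (mine minus provided)
diff = {key: mine[key] - provided[key] for key in mine}
print(diff)  # {'items': 54016, 'users': 144535, 'behaviors': 0}
```

The behavior count is identical, while my run has 54016 more items and 144535 more users.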

All I did was:

  1. Download the dataset from http://jmcauley.ucsd.edu/data/amazon/index.html
  2. Decompress the file to get reviews_Books_5.json
  3. Run the script: python preprocess/data.py book
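
For an independent check of the raw counts, the statistics can also be recomputed straight from the decompressed file, without going through data.py. The sketch below is not the repository's preprocessing logic. It assumes that the file holds one JSON object per line and that each record carries `reviewerID` (user) and `asin` (item) fields; the function name `count_stats` is hypothetical.

```python
import json
from typing import Iterable, Tuple


def count_stats(lines: Iterable[str]) -> Tuple[int, int, int]:
    """Return (total items, total users, total behaviors) for JSON-lines reviews.

    Assumes each non-empty line is a JSON object with "reviewerID" and "asin" keys.
    """
    users = set()
    items = set()
    behaviors = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        users.add(record["reviewerID"])
        items.add(record["asin"])
        behaviors += 1  # every review counts as one behavior
    return len(items), len(users), behaviors


# Tiny in-memory demo with made-up records (not real dataset rows)
demo = [
    '{"reviewerID": "u1", "asin": "b1"}',
    '{"reviewerID": "u1", "asin": "b2"}',
    '{"reviewerID": "u2", "asin": "b1"}',
]
print(count_stats(demo))  # (2, 2, 3)
```

On the real data, the same function can be given an open file handle, e.g. `with open("reviews_Books_5.json") as f: print(count_stats(f))`. No filtering is applied here, so the result reflects the raw file only.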

The mismatch confuses me. Could you explain where it comes from, or publish the latest version of data.py?

Thank you for your feedback!
