Define the interface of Dataset.py #4

@MarcCote

We should discuss here the interface of the datasets the smartpy library will be manipulating. In contrast with https://github.com/SMART-Lab/mldata, this is not a generic dataset manager.

Some questions

  • What types of datasets do we want to support?
  • How will we support the notion of inputs and targets (i.e. a variable number of sets of data)?
  • Are trainset, validset and testset just three instances of Dataset, or should the split be part of the class?
  • Should we provide symbolic variables for inputs and targets? Yes, the dataset is a convenient place to put them: they would be easily accessible there instead of being hidden in some obscure function responsible for compiling the Theano graph.
  • How do we support targets for unlabeled datasets? Usually, models trained on unlabeled data use the input as their target; should this be the default behaviour when there are no targets?
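
To make these questions concrete, here is a minimal sketch of one possible answer, not a settled design. It assumes one `Dataset` instance per split and falls back to the inputs when no targets are given. The names `has_targets` and `name` are hypothetical, and the symbolic variables are only mentioned in a comment so that the sketch runs without Theano.

```python
from typing import Optional, Sequence


class Dataset:
    """Hypothetical sketch: one instance per split (trainset, validset, testset)."""

    def __init__(self, inputs: Sequence, targets: Optional[Sequence] = None,
                 name: str = "dataset"):
        if targets is not None and len(targets) != len(inputs):
            raise ValueError("inputs and targets must have the same number of examples")
        self.name = name
        self.inputs = inputs
        self._targets = targets
        # Assumption: symbolic variables for inputs and targets (e.g. Theano
        # tensors) could also be stored here as attributes; they are omitted
        # to keep the sketch dependency-free.

    @property
    def has_targets(self) -> bool:
        """True if explicit targets were provided."""
        return self._targets is not None

    @property
    def targets(self) -> Sequence:
        # Assumed default: an unlabeled dataset uses its inputs as targets.
        return self.inputs if self._targets is None else self._targets

    def __len__(self) -> int:
        return len(self.inputs)


# Usage: a labeled split and an unlabeled one.
trainset = Dataset([[0.0, 1.0], [1.0, 0.0]], targets=[1, 0], name="trainset")
unlabeled = Dataset([[0.0, 1.0], [1.0, 0.0]], name="unlabeled")
```

With this layout the train/valid/test split lives outside the class, which keeps `Dataset` small; the alternative of putting the split inside the class remains open for discussion.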
