Request for required newsdev.tok.en and newstest.tok.en training files #8

@GenTxt

Hello:
I'm trying to figure out the format of the newsdev.tok.en and newstest.tok.en files required for SDAE training.
I tried using the Penn Treebank test and valid files as renamed substitutes, but training throws this ValueError:

Environment: Nvidia 1070 GPU, Ubuntu 16.04

Building model
Building f_log_probs... Done
Building f_ctx... Done
Building f_cost... Done
Computing gradient... Done
Building optimizers... Done
Optimization
Traceback (most recent call last):
  File "train_book.py", line 54, in <module>
    'test_text': ['data/newstest.tok.en']}) # using PTB file
  File "train_book.py", line 31, in main
    embeddings=params['embeddings'][0])
  File "/home/pixelhead/Desktop/FastSent/SentenceRepresentation/SDAE/desent.py", line 953, in train
    if x == None:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
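
For what it's worth, the ValueError may come from the comparison itself and not from the file format, though I have not confirmed this against desent.py. Below is a minimal sketch, assuming `x` at line 953 is a NumPy array with more than one element and a recent NumPy version that compares to None element by element. The function names and the sample array are hypothetical and not taken from the repository.

```python
import numpy as np


def is_missing_buggy(x):
    # Mirrors the `if x == None:` check shown in the traceback.
    # For an array, `x == None` yields an array of booleans, and
    # `if` cannot reduce a multi-element array to a single truth value.
    if x == None:
        return True
    return False


def is_missing(x):
    # Identity check: never triggers element-wise comparison.
    return x is None


# Hypothetical stand-in for a batch of word indices.
batch = np.array([[1, 2], [3, 4]])

try:
    is_missing_buggy(batch)
    raised = False
except ValueError:
    raised = True

print("buggy check raised ValueError:", raised)
print("identity check on array:", is_missing(batch))
print("identity check on None:", is_missing(None))
```

If that assumption holds, changing the check to `if x is None:` might avoid the exception, but it would not tell me whether the PTB files are in the right format.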

I would appreciate links to these files or, if that is not possible, a simple example of the required format.

A link to, or a recommended substitute for, "D_medium_cbow_pdw_8B.pkl" would also be helpful.

embeddings='../Files/D_medium_cbow_pdw_8B.pkl',
dictionary='../Files/dict.pkl',
valid_text='../Files/newsdev.tok.en',
test_text='../Files/newstest.tok.en'

I have downloaded all the other recommended models and files.

Cheers
