Hello!
Thank you for releasing your code for FastSent! I tried to use it, but I got this error just before the training step:
2016-02-24 14:36:52,081 : INFO : training model with 1 workers on 52178 vocabulary and 200 features, using sg=0 hs=1 sample=0 and negative=0
2016-02-24 14:36:52,081 : INFO : expecting 10655070 examples, matching count from corpus used for vocabulary survey
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "/fastsent/SentenceRepresentation/FastSent/gensim/gensim/models/word2vec.py", line 735, in worker_loop
if not worker_one_job(job, init):
File "/fastsent/SentenceRepresentation/FastSent/gensim/gensim/models/word2vec.py", line 724, in worker_one_job
tally, raw_tally = self._do_train_job(items, alpha, inits)
File "~/fastsent/SentenceRepresentation/FastSent/gensim/gensim/models/word2vec.py", line 661, in _do_train_job
tally += train_sentence_cbow(self, sentences, alpha, work, neu1)
File "gensim/models/word2vec_inner.pyx", line 397, in gensim.models.word2vec_inner.train_sentence_cbow (word2vec_inner.c:4436)
cdef int cbow_mean = model.cbow_mean
TypeError: unhashable type: 'list'
My sentence iterator is the same as yours, except that I added a UTF-8 decoding step just before yielding the split line.
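For illustration, an iterator of the kind described above could look like the sketch below. This is not the exact code from the FastSent repository or from this report: the class name `Utf8Sentences`, the `path` argument, and the one-sentence-per-line file layout are all assumptions. It only shows where the decoding step sits relative to the split.

```python
import io


class Utf8Sentences(object):
    """Hypothetical sentence iterator (name and layout are assumptions).

    Reads a text file with one sentence per line and yields each
    sentence as a flat list of unicode tokens.
    """

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        # Open in binary mode so the decoding step is explicit and
        # behaves the same on Python 2 and Python 3.
        with io.open(self.path, "rb") as handle:
            for raw_line in handle:
                # Decode first, then split on whitespace.
                yield raw_line.decode("utf-8").split()
```

Each item yielded here is a single flat list of tokens, which can be checked by printing the first item of the iterator before passing it to training.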