Python 3 string/byte comparison error #25

@sophiazhi

When calling Task.generate_triplets, the following error occurred:

Writing ABX triplets to task file...
Traceback (most recent call last):
  File "/home/szhi/miniconda3/envs/abx_eval/lib/python3.8/pdb.py", line 1705, in main
    pdb._runscript(mainpyfile)
  File "/home/szhi/miniconda3/envs/abx_eval/lib/python3.8/pdb.py", line 1573, in _runscript
    self.run(statement)
  File "/home/szhi/miniconda3/envs/abx_eval/lib/python3.8/bdb.py", line 580, in run
    exec(cmd, globals, locals)
  File "<string>", line 1, in <module>
  File "/home/szhi/multimodal_learning/abxpy_eval.py", line 1, in <module>
    import argparse
  File "/home/szhi/multimodal_learning/abxpy_eval.py", line 96, in df2task
    t.generate_triplets(output=task_file)
  File "/home/szhi/miniconda3/envs/abx_eval/lib/python3.8/site-packages/ABXpy/task.py", line 773, in generate_triplets
    self._compute_triplets(
  File "/home/szhi/miniconda3/envs/abx_eval/lib/python3.8/site-packages/ABXpy/task.py", line 834, in _compute_triplets
    out_regs.write(regressors, indexed=True)
  File "/home/szhi/miniconda3/envs/abx_eval/lib/python3.8/site-packages/ABXpy/h5tools/h5io.py", line 196, in write
    self.__initialize_datasets__(sample_data)
  File "/home/szhi/miniconda3/envs/abx_eval/lib/python3.8/site-packages/ABXpy/h5tools/h5io.py", line 243, in __initialize_datasets__
    sample_data = self.__parse_input_data__(sample_data)
  File "/home/szhi/miniconda3/envs/abx_eval/lib/python3.8/site-packages/ABXpy/h5tools/h5io.py", line 226, in __parse_input_data__
    raise ValueError(
ValueError: It is necessary to write to all of the managed datasets simultaneously.
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
> /home/szhi/miniconda3/envs/abx_eval/lib/python3.8/site-packages/ABXpy/h5tools/h5io.py(226)__parse_input_data__()
-> raise ValueError(
(Pdb) set(data.keys())
{'phone_2', 'phone_1'}
(Pdb) self.managed_datasets
[b'phone_1', b'phone_2']

The error is raised because set(data.keys()) != set(self.managed_datasets), but the two sets differ only in type: the keys of data are str, while self.managed_datasets holds bytes. I tried passing on=b'phone' instead of on='phone' to the Task object, but got AssertionError('ON attribute must be specified by a string'). I'm not sure whether it's possible to change my input or environment (while staying on Python 3) to avoid this error.
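
For illustration, here is a minimal standalone sketch of the mismatch, using the values from the pdb output above. It does not import ABXpy; `to_text` is a hypothetical helper, not part of the library, and decoding the names as UTF-8 is an assumption about how they were stored.

```python
def to_text(name, encoding="utf-8"):
    """Return name as str, decoding it first if it is bytes (hypothetical helper)."""
    return name.decode(encoding) if isinstance(name, bytes) else name


# Values copied from the pdb session above.
data_keys = {"phone_1", "phone_2"}
managed_datasets = [b"phone_1", b"phone_2"]

# In Python 3, str and bytes never compare equal, so the two sets differ
# even though the names look the same.
mismatch = data_keys != set(managed_datasets)

# Once the bytes names are decoded, the sets are equal.
normalized = {to_text(name) for name in managed_datasets}
match = data_keys == normalized

print(mismatch, match)
```

This only reproduces the comparison that fails. Where such a normalization would belong inside ABXpy (for example, wherever the managed dataset names are set or read back) is not something the traceback shows.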
