h2v_net = buildNetwork(2048, 5, 230400, hiddenclass=LSTMLayer, outputbias=False, recurrent=True)
Even with only 2048 input neurons, 5 LSTM cells and 230400 output neurons, 5 epochs take 10 seconds to complete.
At that rate, 500 epochs of training take about 16 minutes for every sample, which is not acceptable.
It seems that a Hearing to Vision LSTM is impossible with today's hardware.
* It does not even start when there are 16384 LSTM cells.
See commit 48e29c7
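
A rough weight count helps explain both the slow training and the failure to start with 16384 cells. The sketch below is plain Python and does not call PyBrain. The helper name `lstm_net_weight_count` is hypothetical, and the layout it assumes is an approximation of what `buildNetwork` creates with the arguments above: every LSTM cell takes 4 inputs (three gates plus the cell input), peepholes are off, the hidden layer has a bias, and `recurrent=True` adds a full hidden-to-hidden connection. The 8 bytes per weight (float64) is also an assumption.

```python
def lstm_net_weight_count(n_in, n_lstm, n_out, hidden_bias=True, output_bias=False):
    """Estimate the trainable weights of an input -> LSTM -> output network.

    Assumed layout (not read from PyBrain): 4 inputs per LSTM cell,
    no peepholes, fully connected layers, one recurrent hidden connection.
    """
    lstm_inputs = 4 * n_lstm              # 3 gates + cell input per cell
    count = n_in * lstm_inputs            # input layer -> LSTM layer
    count += n_lstm * lstm_inputs         # recurrent LSTM -> LSTM connection
    count += n_lstm * n_out               # LSTM layer -> output layer
    if hidden_bias:
        count += lstm_inputs              # bias unit -> LSTM layer
    if output_bias:
        count += n_out                    # bias unit -> output layer
    return count


if __name__ == "__main__":
    for cells in (5, 16384):
        weights = lstm_net_weight_count(2048, cells, 230400)
        gib = weights * 8 / 2**30         # assuming float64 weights
        print(f"{cells} LSTM cells: {weights} weights, about {gib:.2f} GiB")
```

Under these assumptions the 5-cell network has about 1.19 million weights, most of them in the connection to the 230400 outputs. The 16384-cell network would need about 4.98 billion weights, roughly 37 GiB for the parameters alone, which would be consistent with it not starting at all.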