-
Notifications
You must be signed in to change notification settings - Fork 152
Refresh examples to ensure that they run in current version #494
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Running the network word count example showed an attempt to await a list object in the Tornado framework. I tracked it down to the EmitServer.handle_stream assuming the result of source._emit was awaitable. Pivoting on inspect.isawaitable there makes the example as written work. I added a try/except block to trap the keyboard interrupt and suppress the stack spew to the example.
* Change the examples to only seed with the value 1
* The thread example blocks on emit due to the cycle in the stream
so it CAN only seed with one value.
* Take all explicit references to Tornado out of the asyncio variant
* Suppress the stack spew from KeyboardException
* Ensure all variants have the same output
Using BeautifulSoup4 and making sure to filter out empty lists before feeding them flatten, this runs again.
Trap the StopIteration that erupts from update when passed an empty iterable so that the stream can continue.
|
Thank you, and I have not forgotten #493 either. There is some code to ensure that if any of the nodes are asynchronous, then all of them should be. So |
The From going back through the code around Source/Stream creation, it seems that the example as written guarantees that the resultant stream is synchronous: streamz/examples/network_wordcount.py Lines 10 to 11 in 8c73290
Line 40 in 8c73290
Lines 258 to 259 in 8c73290
I believe that is why the
I think as a framework there is probably value in both threaded and asyncio support. The asyncio module/design appears to be fairly polarizing in the larger Python ecosystem. A significant number of projects that I run into in the wild forswear it entirely while others are asyncio-only. Our historical code is non-asyncio/multiprocessing and our more current work is all asyncio-native. We've definitely had difficulties supporting both. |
I found to bugs in the main code by refreshing the examples to work under the current API:
Source.from_tcpcould kill the TCP socket because it assumed the result ofself.source._emitwas always awaitable.core.flattenin 0.6.4 would skip over empty iterables, but the change to handle no-length iterables made it raise a spurious StopIteration under the same condition.Example refreshes:
Unit tests:
test_tcp_word_count_exampleto make it less flaky.