
Test http4s servers fail to release bound port on shutdown when using fs2 3.12.0 #3590

@mtomko


This issue has been discussed briefly on Discord in the cats-effect channel, but I have recently concluded that the most fruitful avenue for exploration may be fs2 rather than cats-effect, although my suspicion is that the bug may straddle the two libraries.

The general idea is that we have a number of integration tests that spin up http4s servers so the tests can run requests against them. The servers are run within Resource so we can guarantee that they shut down cleanly when the tests are finished. We are careful not to run tests in parallel, and we've been setting the fork property so tests run in a forked JVM.
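For readers unfamiliar with those two settings, here is a rough sketch of what they typically look like. It assumes an sbt build, which is not stated above, and it is not copied from our actual build files.

```scala
// build.sbt fragment (sketch only; assumes sbt 1.x slash syntax)

// Run tests in a separate JVM instead of inside the sbt process.
Test / fork := true

// Run test suites one after another rather than concurrently,
// so two suites never compete for the same fixed port.
Test / parallelExecution := false
```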

When cats-effect 3.6.0 came out, we upgraded cats-effect and fs2 and suddenly a number of our tests began failing with errors like:

==> X com.example.ce361.ReproSpec.shutdown 2  2.168s java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:565)
    at sun.nio.ch.ServerSocketChannelImpl.netBind(ServerSocketChannelImpl.java:344)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:301)
    at java.nio.channels.ServerSocketChannel.bind(ServerSocketChannel.java:224)
    at fs2.io.net.SelectingSocketGroup.$anonfun$serverResource$7(SelectingSocketGroup.scala:112)
    at delay @ fs2.io.net.SelectingSocketGroup.$anonfun$serverResource$6(SelectingSocketGroup.scala:107)
    at traverse @ fs2.io.net.SelectingSocketGroup.$anonfun$serverResource$4(SelectingSocketGroup.scala:106)
    at flatMap @ fs2.io.net.SelectingSocketGroup.$anonfun$serverResource$4(SelectingSocketGroup.scala:106)
    at *>$extension @ org.typelevel.keypool.internal.RequestSemaphore$.apply(RequestSemaphore.scala:79)

Our interpretation of the failure is that at the end of the resource scope, when the http4s servers should have shut down, they have not released their ports in the OS. When a new test then tries to start a new server (on the same port, because we just use a fixed port), it cannot bind to the test port and fails. As a result, we have been stuck on cats-effect 3.5.7 and fs2 3.11.0 since early this year, and staying there is becoming increasingly untenable.
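To make that interpretation concrete, here is a minimal sketch that uses only the JDK's `ServerSocketChannel`, with no http4s, fs2 or cats-effect involved. It is not the code from the repro repository. The object and method names (`PortReuseSketch`, `bind`, `canRebind`) are made up for illustration, and it binds an OS-assigned port on 127.0.0.1 rather than our fixed test port.

```scala
import java.net.InetSocketAddress
import java.nio.channels.ServerSocketChannel
import scala.util.{Try, Using}

// Hypothetical helper names; this only illustrates the bind/close behaviour
// we believe is behind the BindException, not the fs2 code path itself.
object PortReuseSketch {

  // Open a server channel, bind it to `addr`, and return it still open.
  // If the bind fails, the channel is closed before the error is rethrown.
  def bind(addr: InetSocketAddress): ServerSocketChannel = {
    val ch = ServerSocketChannel.open()
    try ch.bind(addr)
    catch { case e: Throwable => ch.close(); throw e }
  }

  // Attempt a second bind on `addr`; any channel that does get bound
  // is closed again immediately by Using.resource.
  def canRebind(addr: InetSocketAddress): Boolean =
    Try(Using.resource(bind(addr))(_ => ())).isSuccess

  def main(args: Array[String]): Unit = {
    // Port 0 asks the OS for any free port, so the sketch does not
    // collide with whatever else is running on the machine.
    val first = bind(new InetSocketAddress("127.0.0.1", 0))
    val addr  = first.getLocalAddress.asInstanceOf[InetSocketAddress]

    val whileOpen  = canRebind(addr) // first channel still holds the port
    first.close()
    val afterClose = canRebind(addr) // first channel has been released

    println(s"rebind while open: $whileOpen, rebind after close: $afterClose")
  }
}
```

The point of the sketch is the contrast between the two attempts: a second bind while the first channel is still open is the situation we think our tests end up in, and a second bind after the channel has really been closed is what we expect the Resource finalizer to give us.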

I have made a repro case in this repository, mirroring a common basic setup for our tests:

https://github.com/mtomko/ce361

Note that if you use fs2 3.11.0, it seems to work fine, but when you bump to 3.12.0, it fails.

I'm happy to do some legwork here, but without a bit of guidance I might be hard pressed to make any progress. I was planning to begin by looking at the polling change introduced in 3.12.0, but I'm not familiar with the internals of either fs2 or cats-effect, so I have a pretty steep learning curve.
