Skip to content

Conversation

@jlizen
Copy link
Owner

@jlizen jlizen commented Jan 6, 2025

The highlight here is future module, with some other small stuff.

Future module

Based on initial usage in tokio-rustls, I want to expand this package to also include 'batteries' for libraries that are manually implementing futures. Since, needing to progressively poll vacation work before or alongside inner futures, is going to be a frequent use case. And, it takes a lot of intrusive boilerplate to manually implement it.

It's all optional under feature flag, though the only new dependency is pin-project-lite. LMK if you think I should remove the flag.

Currently I am targeting two use cases:

  • do some compute-heavy work BEFORE starting to poll the future, in order to construct it
    • in tokio-rustls, this means the initial handshake crypto work before our first network call
  • do some compute-heavy work intermittently ALONGSIDE polling the inner future
    • in tokio-rustls, this means processing bytes that came in via I/O in order to generate the next crypto segment of a handshake

The first case is fairly simple. That is essentially just the futures-util Then combinator, with some convenience methods in the builder to construct the vacation work. I ended up rolling my own, but I'm happy to cut over to using futures-util if there are any issues with my impl.

The flow for VacationFirst is:

  • construct the vacation work via a builder that accepts the sync work closure
  • poll vacation work to completion on its own
  • after work completes, call a FnOnce 'resolver' closure to process the results into the inner future, which might pull in other existing resources that aren't Send + 'static
  • then just pass-through poll the inner future

The second case is a bit more complex. The constraint I'm operating in is, the inner future generally cannot be assumed to be fully composed of resources that are send + 'static. But, we do need access to certain elements of state derived from the inner future, both to generate the vacation work, and to handle the results of the vacation work.

We also would prefer to continue polling the inner future alongside the vacation future, since tokio-rustls's I/O work can sometimes proceed simultaneously. Anyway the inner future can always return Poll::Pending if it needs to wait on vacation.

The API I went with for VacationOngoing, is:

  • Expose a hook in the inner future, so the vacation future can use a mutable pointer to call something like inner_future.get_buffer_to_process_with_vacation().
  • Poll that hook after polling the inner future, using a closure to construct the vacation work (or surface an error)
    • That future doesn't have mutable access to the inner future, instead the closure can own any resources it needs
  • If there is vacation work, also drive it every poll, surfacing any errors
  • Once vacation work completes, call a 'resolver' closure in the same task as the inner future, which does any post-processing that needs mutable access to inner future

A couple notes:

  • Currently we only support tracking one piece of vacation work at a time, that's what tokio-rustls needed. I can cut an issue to add support for multiple, it wouldn't be hard to add
  • Currently I require that the caller sometimes Box things in the VacationFirst resolver closure or VacationAlongside get_work closure. I could do it for them, but often they need to return futures that already are sized and unpin, since they have defined them lower down. So it would have forced extra allocations to do it implicitly. Open to alternatives.

Concurrency handling changes

I noticed an edge case in concurrency handling, where the calling future might drop (which holds the vacation executor handle to the sync work). The concurrency permit was acquired and dropped in the foreground, so concurrency would be released.

However, our vacation work isn't actually cancellable since we cut over to sync API, since it doesn't yield anywhere to drop work. So, the work could keep running after concurrency permit dropped, resulting in going beyond the limit.

Moving the permit drop to after the vacation work finishes, instead.

Remove default features

The tokio default feature no longer seems sensible given that we don't implicitly update default strategies anyway. Just forces libraries to add no-default-features = true. Removing the implicit tokio default.

Test improvements

My concurrency tests were occasionally failing, and I realized why - I was racing initializing the strategy against multiple test cases per integ test binary. Which, for spawn_blocking and some custom executor tests, implicitly is loading a specific runtime handle to use with the strategy.

And, that runtime ends when the test ends. Which, if it's not the concurrency test, will generally finish quicker than the concurreny test. This results in the listener channel closing and the vacation work getting cancelled, from the caller's perspective, and thus exceeds concurrency limit.

I also added an assertion that all vacation work actually succeeded, I was silently ignoring the cancellation related errors.

Avoid async fn in trait

We had a trait definition for execute() that had a async function in it. Given that tokio-rustls supports 1.70, and AFIT is 1.75, I'd rather avoid forcing a MSRV bump to use vacation. We don't actually need the trait for anything since we are using enum dispatch anyway. Refactored to remove.

Add operation namespace to vacation::execute()

Updated the vacation::execute() input to include an operation namespace, that is just a &'static str. This will open up a world of customizability without adding config drilling into the crate.

I focused on how it impacts the input end (as well as custom closure, since that was easy). I'll add richer support for customizing strategy based on this operation namespace (and chance of blocking), later.

@jlizen
Copy link
Owner Author

jlizen commented Jan 6, 2025

Just want to call out, the usage of VacationOngoing future might still be a bit awkward with tokio-rustls. Since, even if i surface a buffer containing the bytes we need to process for the handshake, we also need mutable access to the rustls session to actually process the packets. I need to dig more into rustls internals to see whether I can avoid sending the entire session off with the work. Ref https://docs.rs/rustls/latest/rustls/client/struct.ClientConnection.html#method.process_new_packets

My previous impl was to split apart the tokio-rustls feature and send the full session into the vacation work. I technically could still do that, more concisely, since i'll have the mutable pointer to the inner future. So I could still add a new state transition to the inner future that just returns pending until vacation work is complete.

I'd prefer to avoid that, though, we'll see. I suspect it's going to be a tradeoff between adding complexity to the tokio-rustls future, or trying to tweak rustls to play nicer with this sort of usage.

Anyway I think probably this is the right API for vacaiton, since it both provides flexibility to do something like this.

@jlizen
Copy link
Owner Author

jlizen commented Jan 7, 2025

Updated it based on offline discussion:

  • renamed various things, it's now OffloadWith or OffloadFirst, resolver -> incorporator, etc
  • Add sub-builder for the get offload fn / offload future, and the incorporator, to more strongly couple the two
  • Take an explicit argument to specify for OffloadWith, whether to poll the inner or not
  • Return/accept already-constructed vacation future, in the get_offload_fn or offload_future calls, rather than implicitly calling vacation::execute() on those in/outputs

The examples are still somewhat contrived... I'm going to construct a richer example when I go to integrate this with tokio-rustls, which I can backport here. Between the integ and unit tests, though, I think the API is fairly clear.

The part that I like the least is that now we require the caller to call Box::new(Box::pin()) around their input vacation future (for OffloadFirst) or output work (for OffloadWith), which is necessary since a FnOnce() impl can't have impl bounds. Previously we did as well, but we handled it implicitly, so it was less verbose.

If we'd prefer to go back to the caller specifying synchronous work closures and implicitly executing, we could, but I agree that that's a bit unintuitive.

The other note is, I do think I want to add namespacing in the execute call (ref #3 (comment)). But, I'll do that in a subsequent PR to avoid overloading this one. I like that idea a lot though, since it allows quite granular strategy selection (including selective prioritization via custom closure), without a lot of config drilling.

Anyway, I'll try to at least have the input side of that (in execute()) in place in time to get the tokio-rustls change in, even if I don't have the caller / strategy handling of that namespace.

@jlizen
Copy link
Owner Author

jlizen commented Jan 7, 2025

Something to call out - while it might seem like we could just include the incorporator function behavior directly in the async block our get_offload_fn returns, that doesn't actually work until we have async closure support. Since we need a non 'static lifetime on the &mut inner future. And we are targeting older rust versions.

I'll add that to a FAQ (along with why not just use channels).

I think it might work ok on the OffloadFirst future... Do you think that is a better API? It will make the input offload work future busier, but make one less builder call? The downside is that it totally removes any vacation related bounds so maybe is more prone to misuse?

At that point I also really might as well just use futures util... In which case maybe I just ditch the OffloadFirst functionality entirely since it is trivial for libraries to implement with futures util then combinator?

On the fence about this, feedback welcome

@jlizen
Copy link
Owner Author

jlizen commented Jan 7, 2025

Just went ahead and threw in operation namespace on the input side. Since I want to pin down the input signature before PR-ing tokio-rustls.

The only thing actually using it right now is the custom closure. This will generally close out: #3

For #4 - we still will want to allow the caller to specify a mapping of strategies to namespace, presumably, rather than always using custom closures. Similar story to adding handling based on chance of blocking. That I'll keep as a separate PR.

@jlizen
Copy link
Owner Author

jlizen commented Jan 8, 2025

I'm actually thinking of refactoring to have the suppress inner poll mode accept an owned inner future into the get_offload_work closure, and then the caller can directly include the incorporate fn in that returned async block since it will own the inner future anyway. And then they would return the inner future olfeom the async block.

This is nice because it wouldn't require shuffling around a new state machine step inside of tokio-rustls. Presumably other libraries with a similar pattern too.

In general I think that api might be a bit more intuitive, so I might ditch the pass through poll inner entirely until I have a use case for it? It would be easy enough to add back in...

@jlizen
Copy link
Owner Author

jlizen commented Jan 9, 2025

I think I'm sold on removing the OffloadFirst api entirely since you can get essentially the same thing by using futures_util::then in the calling API, which has conservative MSRV support.

Meanwhile, the OffloadWith API has more utility, but I think I'm on board with limiting its scope to simply always own the inner future when get_offload_work returns something. That way we can have a single closure rather than this two-part get_work vs incorporate_work fn stuff. Keeps the library much leaner. Also simplifies usage in tokio-rustls.

So the signature struct would be something like:

pub struct OffloadWithFuture<InnerFut: Future, GetOffloadFn> {
    inner: OffloadWithInner<InnerFut>,
    get_offload_fn: GetOffloadFn,
}
enum OffloadWithInner<InnerFut: Future> {
    InnerFut(InnerFut),
    OffloadActive(Pin<Box<dyn Future<Output = Result<InnerFut, InnerFut::Output>> + Send>>)
}

Which would imply also only having suppress inner poll mode.

We can layer borrowed inner futures back in, if we want, but I think it's scope creep to do so without a direct use case.

The main awkward bit with this approach is, the GetOffloadFn now needs to return an enum either containing the work (which owns the inner future), or a none equivalent that returns the inner future. Since we can't call it with &mut. I think that's fine though.

@jlizen
Copy link
Owner Author

jlizen commented Jan 9, 2025

We discussed offline, I'll make the above changes but in a separate PR to avoid continuing to bloat this one up

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants