
Conversation


@mbolaris mbolaris commented Jan 5, 2026

Introduce a per-thread regex Matcher cache in HlsPlaylistParser to significantly reduce object allocation overhead during playlist parsing.

Summary

Cache and reuse regex Matcher objects to prevent OOM and ANR issues during HLS playlist parsing.

Problem:
Each Matcher object allocates memory in BOTH Java heap and native heap, creating two critical issues in production:

  1. Native heap exhaustion: Native allocations are substantial and not subject to normal Java GC pressure. When Matcher objects are created faster than they're garbage collected, the native heap can be exhausted even when Java heap has space available, causing OutOfMemoryError in the native allocator.

  2. GC-induced ANRs: Excessive Matcher allocation causes frequent GC cycles. This is particularly severe with multiple concurrent playback sessions (e.g., a MultiView feature running 4 simultaneous streams) on lower-performance devices, where sustained GC pressure from thousands of short-lived Matcher objects causes Application Not Responding (ANR) events.

Both issues are exacerbated by frequent HLS playlist refreshes (every 2-6 seconds), creating continuous allocation pressure.

Solution:

  • Uses ThreadLocal<MatcherCacheState> for lock-free per-thread isolation
  • Employs an access-ordered LinkedHashMap as an LRU cache (max 32 entries)
  • Reuses cached Matcher objects via reset() instead of creating new instances
  • Eliminates both Java heap AND native heap allocation pressure (see the sketch below)
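
A minimal sketch of this scheme, assuming the names from this description (MatcherCacheState, the 32-entry cap); the actual implementation may differ in detail:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MatcherCache {
  private static final int MAX_ENTRIES = 32;

  // One cache per thread, so lookups need no synchronization.
  private static final ThreadLocal<MatcherCacheState> CACHE =
      ThreadLocal.withInitial(MatcherCacheState::new);

  static Matcher matcher(Pattern pattern, CharSequence input) {
    return CACHE.get().matcher(pattern, input);
  }

  private static final class MatcherCacheState {
    // Access order makes the LinkedHashMap an LRU cache: each lookup moves
    // the entry to the tail, and the eldest entry is evicted once the cap
    // is exceeded.
    private final LinkedHashMap<Pattern, Matcher> matchers =
        new LinkedHashMap<Pattern, Matcher>(
            /* initialCapacity= */ 16, /* loadFactor= */ 0.75f, /* accessOrder= */ true) {
          @Override
          protected boolean removeEldestEntry(Map.Entry<Pattern, Matcher> eldest) {
            return size() > MAX_ENTRIES;
          }
        };

    Matcher matcher(Pattern pattern, CharSequence input) {
      Matcher matcher = matchers.get(pattern);
      if (matcher == null) {
        matcher = pattern.matcher(input); // First use of this pattern on this thread.
        matchers.put(pattern, matcher);
      } else {
        matcher.reset(input); // Reuse: no new Java-heap or native-heap allocation.
      }
      return matcher;
    }
  }
}
```

A call site such as pattern.matcher(line) then becomes MatcherCache.matcher(pattern, line) (MatcherCache is a hypothetical holder name for illustration).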

Performance impact:

  • Reduces Matcher allocations by >99% in high-demand scenarios
    (e.g., live TV with large caches or multiple concurrent streams can
    generate 20M+ allocations, reduced to ~37K allocations + 19.9M reuses)
  • Eliminates native heap exhaustion risk from Matcher object churn
  • Drastically reduces GC frequency and duration, preventing ANRs
  • Removes dependency on GC timing for native memory reclamation
  • Typical cache occupancy: 6-12 patterns (well under 32 limit)
  • Zero synchronization overhead through thread-local storage

Testing:

  • Validated over 2+ hours with production HLS streams
  • 99.82% reuse rate across 3,692 loader threads
  • No native memory allocation errors observed
  • Significant reduction in GC events during multi-stream playback
  • No functional changes to parsing behavior
  • All existing tests pass


google-cla bot commented Jan 5, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.


mbolaris commented Jan 6, 2026

@googlebot I signed the CLA.

@marcbaechinger marcbaechinger self-requested a review January 6, 2026 11:25
@marcbaechinger marcbaechinger self-assigned this Jan 6, 2026

@marcbaechinger marcbaechinger left a comment


Thanks! I added one comment to discuss the use of ThreadLocal.

};
}

private static final ThreadLocal<MatcherCacheState> matcherCache =
Contributor

Thanks for your PR and the reasoning in the PR description. Also, many thanks for the data regarding your usage with your stream and the reduction of issues you see. That's all very convincing.

I'm a bit skeptical about the use of ThreadLocal. My assumption is you've chosen this approach mainly because all the methods that need access to the cache are static. So I think, and appreciate, that you were kind of forced into such an approach by the design of the parser.

My issue with that approach (probably besides my never actually having used it :)) is that from the perspective of the HlsPlaylistParser it's unclear how the parser is used with regard to threading. We don't know whether it is only ever used by a short-lived thread, like as part of a Loader, or whether someone attempts to call it from a long-lived thread or one that is part of a thread pool.

It also appears to me that the parser is created by DefaultPlaylistTracker on the playback thread and is then passed to the Loader, where the parser is actually used on a separate (short-lived, I believe) thread.

I think what I'm trying to say is that if we had an approach that is not bound to a thread, we would have much less to think about now, and we also would not have to care about any future refactoring that changes the threading model in which the parser may be used.

If I don't want to think about all that too much, then I would argue we can instantiate a MatcherCacheState per parser instance, or per parse(Uri) call. We would then need to pass that cache object to each of the static methods you changed, like parseOptionalBooleanAttribute and friends. Given these are private methods, that would be OK API-wise. There are 7 such methods.

I think for your case, for which you have measured things, this would have the same effect. Currently an HlsPlaylistParser instance is created each time an HLS playlist is parsed [1]. There isn't a case where a parser instance is used twice on the same thread.

WDYT when you hear this? My suggestion would therefore be to keep pretty much everything from this change except the ThreadLocal.

[1]

```java
private void loadPlaylistImmediately(Uri playlistRequestUri) {
  ParsingLoadable.Parser<HlsPlaylist> mediaPlaylistParser =
      playlistParserFactory.createPlaylistParser(multivariantPlaylist, playlistSnapshot);
  DataSpec dataSpec =
      new DataSpec.Builder()
          .setUri(playlistRequestUri)
          .setFlags(DataSpec.FLAG_ALLOW_GZIP)
          .build();
  if (cmcdConfiguration != null) {
    CmcdData.Factory cmcdDataFactory =
        new CmcdData.Factory(cmcdConfiguration, CmcdData.STREAMING_FORMAT_HLS)
            .setObjectType(CmcdData.OBJECT_TYPE_MANIFEST);
    if (primaryMediaPlaylistSnapshot != null) {
      cmcdDataFactory.setIsLive(!primaryMediaPlaylistSnapshot.hasEndTag);
    }
    dataSpec = cmcdDataFactory.createCmcdData().addToDataSpec(dataSpec);
  }
  ParsingLoadable<HlsPlaylist> mediaPlaylistLoadable =
      new ParsingLoadable<>(
          mediaPlaylistDataSource, dataSpec, C.DATA_TYPE_MANIFEST, mediaPlaylistParser);
  mediaPlaylistLoader.startLoading(
      mediaPlaylistLoadable,
      /* callback= */ this,
      loadErrorHandlingPolicy.getMinimumLoadableRetryCount(mediaPlaylistLoadable.type));
}
```

Contributor

There isn't a case where a parser instance is used twice on the same thread.

Just to keep me honest. I don't think what I said in the quote above is true.

https://github.com/androidx/media/blob/release/libraries/exoplayer/src/main/java/androidx/media3/exoplayer/upstream/Loader.java#L239

Author

Thanks for the feedback. I've removed the ThreadLocal and now create a MatcherCacheState per parse() call, passed through the private helpers. This keeps matcher reuse scoped to a single parse operation while preserving the allocation benefits.
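
A hypothetical sketch of that shape (the parse signature and the helper body are illustrative, not the exact code; MatcherCacheState is the cache class from this change, now instantiated per call rather than held in a ThreadLocal):

```java
// Imports assumed by these hypothetical fragments of HlsPlaylistParser:
import android.net.Uri;
import java.io.IOException;
import java.io.InputStream;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// One cache per parse call: ownership and lifetime are explicit, and no
// thread affinity is assumed.
public HlsPlaylist parse(Uri uri, InputStream inputStream) throws IOException {
  MatcherCacheState matcherCache = new MatcherCacheState();
  // ... existing parsing logic, threading matcherCache through the helpers ...
}

// Each of the 7 private helpers takes the cache as an extra parameter, e.g.:
private static boolean parseOptionalBooleanAttribute(
    String line, Pattern pattern, boolean defaultValue, MatcherCacheState matcherCache) {
  Matcher matcher = matcherCache.matcher(pattern, line);
  // "YES" is the HLS boolean-true token; the real helper presumably compares
  // against the parser's BOOLEAN_TRUE constant.
  return matcher.find() ? "YES".equals(matcher.group(1)) : defaultValue;
}
```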

Instantiate a MatcherCacheState per parse(...) call and pass it through the
private parsing helpers.
Keeps regex matcher reuse without relying on ThreadLocal, aligning with
feedback and clarifying ownership/lifetime of cached matchers.
No functional change intended.
@marcbaechinger

Thanks a lot for the change. I wanted to import the pull into the internal source tree, but for that I had to merge quite some changes, mostly in RELEASE_NOTES.md. I'm not quite sure why this is the case, as you didn't actually touch these files. Part of taking a pull is to rebase your branch onto main, which is where the conflicts happened.

Would it be possible for you to update your branch and sync it with the main branch? I wonder whether this is easier on your end. If it fails for you in a similar way as it did for me, you could just sync to the head of main and then copy back the single file you changed. This may be easier, and this way we preserve the credit you get for the pull as I take the change from your branch. WDYT? Can you give this a try?

@marcbaechinger

Ah, sorry. I was just made aware that you've based your pull on the release branch. You'd need to take this from the main branch.

Sorry, I didn't notice this earlier. Can you drop this change and make a new one from main?

@mbolaris

Hi @marcbaechinger, oops, sure no problem. I recreated the change on top of main and opened a new PR here: #3008

@mbolaris mbolaris closed this Jan 13, 2026
@mbolaris

Closing this PR to avoid confusion; please review/land #3008 instead.

