Optimize RMT interrupt handler #5
Open
This code optimizes the RMT fill interrupt handler for execution time. It makes a few small C optimizations, but most of the gain comes from replacing the C inner loop with unrolled assembly.
The assembly code writes the RMT pattern for all 32 bits of pixeldata4 into the RMT buffer.
It achieves a jump-free 4 cycles per bit by operating as follows:
First, it shifts the target bit into the MSB of register %3 (not necessary for the first bit, which is already there). Then it executes two conditional move instructions that copy the correct RMT pattern into a working register, based on the sign of %3. Because the target bit is now the MSB, it determines the sign, so the instructions movgez (move if greater than or equal to zero) and movltz (move if less than zero) effectively mean "move if MSB is 0" and "move if MSB is 1", respectively.
Finally, it stores the working register to memory, indexed by pItem with a specified offset. If the ESP32 were big endian, the offsets would simply increment: 0, 4, 8, ... However, the ESP32 is little endian, which means the bytes within the 32-bit word are in reverse order while the bits within each byte are in forward order. Hence the non-incremental order of the store offsets.
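For readers who want to check the logic without reading Xtensa assembly, here is a minimal portable C model of the steps above. It is a sketch, not the code in this PR. The names `select_pattern`, `store_offset`, `encode_word`, `PATTERN_ZERO` and `PATTERN_ONE` are hypothetical, and the pattern values are placeholders. The offset formula assumes that each RMT item is 4 bytes and that the 32-bit word was loaded from four consecutive pixel bytes; the actual offsets in the assembly may be arranged differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical placeholder values; the real RMT items encode the LED timing. */
#define PATTERN_ZERO 0x11111111u
#define PATTERN_ONE  0x22222222u

/* Model of the movgez/movltz pair: choose a pattern from the MSB of data.
 * Testing the MSB is the same as testing the sign of the register. */
static uint32_t select_pattern(uint32_t data)
{
    uint32_t work = 0;
    int msb_clear = (data & 0x80000000u) == 0;
    if (msb_clear)  work = PATTERN_ZERO;  /* movgez: taken when value >= 0 */
    if (!msb_clear) work = PATTERN_ONE;   /* movltz: taken when value <  0 */
    return work;
}

/* Assumed byte offset of the store for the k-th bit processed
 * (k = 0 is the MSB of the word). */
static unsigned store_offset(unsigned k)
{
    assert(k < 32u);
    unsigned byte_in_word = 3u - k / 8u;  /* top byte of the word is last in memory */
    unsigned bit_in_byte  = k % 8u;       /* bits inside a byte stay MSB first */
    return 4u * (byte_in_word * 8u + bit_in_byte);
}

/* Encode one 32-bit word into 32 items, one item per bit. */
static void encode_word(uint32_t data, uint32_t *items)
{
    assert(items != 0);
    for (unsigned k = 0; k < 32u; k++) {
        items[store_offset(k) / 4u] = select_pattern(data);
        data <<= 1;  /* move the next target bit into the MSB */
    }
}
```

Under these assumptions the first eight stores land at offsets 96 through 124, the next eight at 64 through 92, and so on down to 0 through 28. The assembly unrolls this loop, so each bit costs only the shift, the two conditional moves and the store.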