Optimize Cartridge::Mbc1 #24
Uses the same style as: sacckey#24

main:

```
Ruby: 3.4.1
YJIT: true
1: 3.882817 sec
2: 3.83399 sec
3: 3.835251 sec
FPS: 389.5409804902295
Ruby: 3.4.1
YJIT: true
1: 3.981531 sec
2: 3.804446 sec
3: 3.823725 sec
FPS: 387.606848134431
```

This branch:

```
Ruby: 3.4.1
YJIT: true
1: 3.503513 sec
2: 3.437195 sec
3: 3.52252 sec
FPS: 430.07760129092094
```

```
Ruby: 3.4.1
YJIT: true
1: 3.620516 sec
2: 3.582359 sec
3: 3.484475 sec
FPS: 421.05854117250766
```
sacckey
left a comment
Awesome! I've noticed a significant performance boost on my end as well.
Thank you so much.
It would be really helpful if you could fix one minor issue. I also plan to take a look at the other pull request shortly.
lib/rubyboy/cartridge/mbc1.rb (outdated)

    when 0xa, 0xb
      if @ram_enable
        if @ram_banking_mode
          @ram.eram[addr - 0xa000 + @ram_bank * 0x800] = value
Please update this to (@ram_bank << 11) as well.
Done.
I think there might be other places where Integer#* could be replaced by << or >> but I haven't dug into it much.
Thank you!
I agree. I'm planning to create an issue to address these fixes all at once.
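For reference, a minimal sketch of the substitution discussed in this thread. The constants `0xa000` and `0x800` come from the reviewed line; the method names are hypothetical and exist only for this comparison. Since `0x800 == 1 << 11`, the two forms compute the same index for non-negative bank numbers.

```ruby
# Hypothetical helpers, not part of rubyboy: they isolate the index
# arithmetic from the reviewed line so the two forms can be compared.

# Original form: multiply the bank number by the bank stride (0x800).
def eram_index_mul(addr, ram_bank)
  addr - 0xa000 + ram_bank * 0x800
end

# Suggested form: shift left by 11, since 1 << 11 == 0x800.
# The parentheses are required: in Ruby, `+` binds tighter than `<<`,
# so `addr - 0xa000 + ram_bank << 11` would shift the whole sum.
def eram_index_shift(addr, ram_bank)
  addr - 0xa000 + (ram_bank << 11)
end
```

The same pattern would apply to other power-of-two multiplications, always keeping the shift parenthesized when it appears inside a sum.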
Closes: sacckey#23

Having to call a proc for every memory access isn't ideal; that's quite a bit of overhead.

If we instead use a `case` statement with only immediate values and a few bitwise operations, we can let YJIT generate very efficient code.

Two runs on main with Ruby 3.4.1 / YJIT:

```
Ruby: 3.4.1
YJIT: true
1: 4.051815 sec
2: 3.989457 sec
3: 3.963555 sec
FPS: 374.849216902501
Ruby: 3.4.1
YJIT: true
1: 3.872855 sec
2: 3.943231 sec
3: 4.024268 sec
FPS: 380.05620440064547
```

On this branch:

```
Ruby: 3.4.1
YJIT: true
1: 3.625805 sec
2: 3.615032 sec
3: 3.572852 sec
FPS: 416.1392102177157
Ruby: 3.4.1
YJIT: true
1: 3.543953 sec
2: 3.520738 sec
3: 3.506004 sec
FPS: 425.70521616601366
```
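To illustrate the idea, here is a rough sketch contrasting the two dispatch styles. It is a toy memory map, not the rubyboy code: the class names, the lambda table, and the `0xff` fallback are all assumptions made for the example. Only the `case` on the high nibble with immediate `when` values mirrors the style described above.

```ruby
# Toy example (assumed layout, not the rubyboy implementation).

# Proc-based dispatch: a table of lambdas indexed by the top 4 bits of
# the address. Every read pays for an array lookup plus a proc call.
class ProcDispatch
  def initialize(rom, ram)
    rom_reader = ->(addr) { rom[addr] }
    ram_reader = ->(addr) { ram[addr - 0xa000] }
    @readers = Array.new(16) { ->(_addr) { 0xff } }
    (0x0..0x7).each { |i| @readers[i] = rom_reader }
    @readers[0xa] = ram_reader
    @readers[0xb] = ram_reader
  end

  def read_byte(addr)
    @readers[addr >> 12].call(addr)
  end
end

# Case-based dispatch: the same mapping written as a `case` over
# integer literals, with no proc call on the read path.
class CaseDispatch
  def initialize(rom, ram)
    @rom = rom
    @ram = ram
  end

  def read_byte(addr)
    case addr >> 12
    when 0x0, 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7
      @rom[addr]
    when 0xa, 0xb
      @ram[addr - 0xa000]
    else
      0xff
    end
  end
end
```

Both classes return the same bytes for the same inputs; the difference is only in how the read is routed, which is where the benchmark numbers above suggest the gain comes from.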