-
Notifications
You must be signed in to change notification settings - Fork 898
Get Min and Max of two Vectors a little bit faster. #515
base: master
Are you sure you want to change the base?
Conversation
|
Too few bithacks in there! ;< Just kidding. Nice little patch, that also keeps the code easy to read. Is this hot path code, or was it just you having fun? |
|
Is the only real difference combining Min and Max into one method? That might save the (small) overhead of a second method call but it seems excessive to add a whole new method to Vector3 for one use case. |
|
There is a bit more overhead save then just a method call (though the cost would still be highly minor), but even then, I would think having both a min and max combined method would have a bit more need all around the source, this does seem like a rather tiny scoped improvement. If more cases can't be found, this will probably become random useless bloat. |
|
From my point of view this discussion will be waste of time if half of the involved can't see the lesser overhead. |
|
The reason we are telling you this overhead save is not worth much time is the .NET library is already a huge overhead situation, in comparison, this save is like comparing an atom to at the least a galaxy, if not the universe. |
|
Every little counts. Few more atoms like this in crucial places and you On 22 April 2016 at 12:12, Geroge notifications@github.com wrote:
|
|
@devu First, this has shown to have almost no use cases (so the cost of the PR is already unlikely to be worth time, if there were actually a couple use cases, then it wouldn't be such an easy decision), second, the most expensive overhead is the one less function call, which is worth nothing in a .NET context, even in C, function calls cost near nothing, unless you use a recursive function and can bypass the buffer cutoff, you will never experience an optimization effect from saving function calls. |
|
Such a simple function might even me inlined by the .Net compiler already. Please verify with performance testing before making such a pull request :) |
|
You all spend this much time in commenting my PR that it seems more than obvious that you can at least request a performance test and so on... But this is your logic and not mine. So you can do performance test to verify what I mean with "a little piece faster" and optimize your commenting ability by adding a argument. |
|
You do realize you never gave us a performance test (I mean did you not expect such small changes to actually get this kind of feedback?), also, some of us understand how a compiler works and how it optimizes, so logic would dictate that this type of PR wouldn't play outside the bounds of that understanding, which pretty much means this PR is unlikely to have visible effects (even with a large amount of reduced function calls, still unlikely as they cost near nothing) especially with a singular use case shown. That is why a performance test should have been supplied first. |
|
@Spartan322 you can be sure that I know, that I don't give you a performance test and it's nice that you think that some of the involved know how a compiler works. I just can tell you, that I'm not that smart and don't know how a compiler works, I just can imagine some or more parts. Now there is my piece of code that obviously half's the method calls and comparisons of floats what means that this is twice as fast -- against the hypothesis that the compiler do this cross method optimizations... @mze9412 let me know if this is your logic too, because then I will remember that you owe me one... Performance test: C# http://pastebin.com/4eqFy5ny 1=00:00:00.0010736 (First run with Min Max separated) |
|
I looked at your results. The second run is faster btw because the JIT compiler is getting better at predicting what you want to do and maybe even inlines the function at machine code level (hard to directly find out). If the game calls this stuff that often, merging it could be useful. If the call is only done <100 times per game tick, no one will notice a difference. :) Thanks for doing the effort! On my machine the numbers look more or less the same, but it jitters depending on what else I do. Best result I got for number 4 was 3.624ms, worst result was 7.something. |
Your welcome! Having it would bring more performance, that's the only thing that counts in my eyes. :-) |
|
It seems this will only be worth a pull if it is called often as stated by mze, and seeing as the use cases are still slim, I'm still skeptical on whether this is worth investigating, as the addition to Vector3 is otherwise unnecessary bloat. (which we should be trying to cut, not add) |
I think this is a little piece faster than comparing each value two times...