72 changes: 62 additions & 10 deletions engine/guidelines/optimization.rst
@@ -20,16 +20,13 @@ Choosing what to optimize
-------------------------

Predicting which code would benefit from optimization can be difficult without
using performance analysis tools.
using performance analysis `tools <#tools-for-optimization>`_.

Oftentimes code that looks slow has no impact on overall performance, and code
that looks like it should be fast has a huge impact on performance. Further,
reasoning about why a certain chunk of code is slow is often impossible to do
without detailed metrics (e.g. from a profiler).

Instructions on using some common profilers with Godot can be found `here
<https://docs.godotengine.org/en/stable/engine_details/development/debugging/using_cpp_profilers.html>`_.

As an example, you may optimize a chunk of code by caching intermediate values.
However, if that code was slow due to memory constraints, caching the values and
reading them later may be even slower than calculating them from scratch!
@@ -96,13 +93,73 @@ Once you have your baseline profile/benchmark, make your changes and rebuild the
engine with the exact same build settings you used before. Then profile again
and compare the results.

Tools for optimization
~~~~~~~~~~~~~~~~~~~~~~

Profilers
^^^^^^^^^

Profilers are the most important tool for anyone optimizing code. They show you which
parts of the code are responsible for slow execution or heavy CPU load. Profilers are
therefore excellent for identifying what needs to be optimized, and for testing whether
performance improved after making changes. Godot has a built-in profiler, but it
does not provide very detailed information. Instead, use dedicated C++ profilers, which are
`explained in the Godot documentation <https://docs.godotengine.org/en/stable/engine_details/development/debugging/using_cpp_profilers.html>`__.

Benchmarks
^^^^^^^^^^

Benchmarks can be a good tool to test the impact of your changes on an isolated piece
of code. However, they can be misleading because it's easy to write them in a way that
doesn't reflect real-world performance. When using benchmarks to test the performance
of your code, always be aware of their potential caveats, and familiarize yourself
with good benchmark practices.

To start writing benchmarks in Godot, you can use the following code templates:

.. tabs::
    .. code-tab:: gdscript GDScript

        var start = Time.get_ticks_msec()
        var s := "Lorem ipsum dolor sit amet"
        for i in range(10000):
            s.replace("e", "b") # Benchmarks the 'replace' function.
        print(Time.get_ticks_msec() - start, "ms")

    .. code-tab:: cpp

        // Requires #include <chrono> and #include <iostream>.
        String s = "Lorem ipsum dolor sit amet";

        auto t0 = std::chrono::high_resolution_clock::now();
        for (int i = 0; i < 100000; i++) {
            String s1 = s.replace("e", "b"); // Benchmarks the 'replace' function.
        }
        auto t1 = std::chrono::high_resolution_clock::now();
        std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count() << "ms\n";

.. note::

    Results will fluctuate, so you'll need to make your test project or
    benchmark intensive enough to isolate the code you're trying to optimize (ideally,
    go for at least 2 seconds of real-life runtime). Additionally, you should run the
    test multiple times, and observe how much the results fluctuate. Fluctuations of up
    to 10% are common and expected. The fastest run is usually the most accurate number.
    If you're not sure you understand the benchmark results, an assembly viewer
    can be useful.

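The advice in the note above (run the test several times, watch the spread, and
trust the fastest run) can be sketched as a small standalone harness. This is an
illustration only, not part of Godot: ``BenchResult``, ``run_benchmark`` and
``replace_workload`` are hypothetical names, ``std::string`` stands in for Godot's
``String``, and ``std::chrono::steady_clock`` is swapped in for
``high_resolution_clock`` because it is guaranteed to be monotonic.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not a Godot API): summary of several timed runs.
struct BenchResult {
    double fastest_ms;     // Least disturbed run; usually the number to report.
    double slowest_ms;
    double spread_percent; // (slowest - fastest) / fastest * 100.
};

// Calls `work` `runs` times and times each call separately.
template <typename Work>
BenchResult run_benchmark(Work work, int runs) {
    assert(runs > 0);
    using Clock = std::chrono::steady_clock;
    std::vector<double> times_ms;
    times_ms.reserve(static_cast<std::size_t>(runs));
    for (int r = 0; r < runs; r++) {
        const auto t0 = Clock::now();
        work();
        const auto t1 = Clock::now();
        times_ms.push_back(std::chrono::duration<double, std::milli>(t1 - t0).count());
    }
    const double fastest = *std::min_element(times_ms.begin(), times_ms.end());
    const double slowest = *std::max_element(times_ms.begin(), times_ms.end());
    const double spread = fastest > 0.0 ? (slowest - fastest) / fastest * 100.0 : 0.0;
    return { fastest, slowest, spread };
}

// Example workload: replaces 'e' with 'b' in a copy of the string, many times.
// It returns a value derived from the work so the caller can consume it, which
// helps keep the compiler from discarding the loop as dead code.
std::uint64_t replace_workload(int iterations) {
    const std::string s = "Lorem ipsum dolor sit amet";
    std::uint64_t checksum = 0;
    for (int i = 0; i < iterations; i++) {
        std::string copy = s;
        std::replace(copy.begin(), copy.end(), 'e', 'b');
        checksum += static_cast<std::uint64_t>(std::count(copy.begin(), copy.end(), 'b'));
    }
    return checksum;
}
```

A caller would typically wrap the workload in a lambda that accumulates the
returned checksum into a variable it later prints, pass that lambda to
``run_benchmark`` with a handful of runs, and report ``fastest_ms`` together with
``spread_percent``. If the spread is much larger than the roughly 10% mentioned
above, the workload is probably too short or the machine too busy.
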
Assembly viewers
^^^^^^^^^^^^^^^^

When making low-level optimizations, it can be a good idea to investigate the machine code
generated by the compiler. Assembly viewers make this possible by showing it in a
human-readable form called assembly. Viewing assembly allows you to compare the machine code
before and after your changes, to confirm hypotheses used to guide optimization, and to
see what the compiler is doing in general.
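
As a hypothetical illustration (these functions are not taken from Godot), the
pair below is the kind of snippet worth pasting into an assembly viewer before
proposing a micro-optimization. With optimizations enabled (for example ``-O2``
on GCC or Clang), mainstream compilers will typically emit the same shift
instruction for both functions, which would mean the hand-written shift brings
no benefit. Treat that as an expectation to verify in the viewer, not a guarantee.

```cpp
#include <cassert>
#include <cstdint>

// "Obvious" version: unsigned division by a power of two.
std::uint32_t divide_by_eight(std::uint32_t x) {
    return x / 8;
}

// Hand-"optimized" version: the same result written as a right shift.
std::uint32_t shift_by_three(std::uint32_t x) {
    return x >> 3;
}

// Sanity check: both versions agree on a few sample inputs, so any difference
// between them could only show up in the generated code, not in the results.
void check_equivalent() {
    const std::uint32_t samples[] = { 0u, 1u, 7u, 8u, 1000u, 0xFFFFFFFFu };
    for (std::uint32_t x : samples) {
        assert(divide_by_eight(x) == shift_by_three(x));
    }
}
```

Comparing the assembly of the two functions side by side is what confirms or
refutes the hypothesis; the equivalence check only shows that the rewrite is safe.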

You may find the following resources useful:

* Agner Fog's `software optimization resources <https://www.agner.org/optimize/>`__, especially his `C++ optimization guide <https://agner.org/optimize/optimizing_cpp.pdf>`__.
* `Compiler Explorer <https://godbolt.org>`__, a popular multi-architecture assembly viewer.

Pull request requirements
-------------------------
@@ -111,13 +168,8 @@ When making an optimization PR you should:

- Explain why you chose to optimize this code (e.g. include the profiling result, link the issue report, etc.).
- Show that you improved the code either by profiling again, or running systematic benchmarks.
  See `tools <#tools-for-optimization>`__ for more info.
- Test on multiple platforms where appropriate, especially mobile.
- When micro-optimizing, show assembly before / after where appropriate.
Member

I wouldn't remove this, this is part of the reason for the whole document.

We were getting people submitting PRs without assembly in cases where it was needed to show what was happening, to show that their PR worked as described. The "where appropriate" is vague enough to make clear that it's not needed in a lot of cases.

Member Author

I removed this point because I don't think it's relevant for most PRs, even most performance optimization PRs. Looking at assembly can be helpful, but it's far less important than actually measuring performance.


In particular, you should be aware that for micro-optimizations, C++ compilers will often
be aware of basic tricks and will already perform them in optimized builds. This is why
showing before / after assembly can be important in these cases.
(`godbolt <https://godbolt.org/>`_ can be particularly useful for this purpose.)

The most important point to get across in your PR is to highlight the source of
the performance issues, and have a clear explanation for how your PR fixes that