Making a RTS game #49: Optimisation tips & tricks (Unity/C#)

Let’s talk about possible optimisations for our Unity game!

This article is also available on Medium.

In the last episode, we talked about very down-to-earth fixes and refactors to improve our game. Those are of course essential, but they are not the only way to make your game better! In particular, as we are nearing the 1-year anniversary of this tutorial and have implemented a lot of systems, I thought it would be a good idea to take a step back and discuss optimisation.

Optimisation is obviously an important topic for video games since they usually require you to run code in real-time but with limited resources (especially on mobile platforms). It’s the key to running your game on low-end platforms, to meeting some app store criteria (like the time-to-first-display constraint) and overall just to using fewer resources (in terms of CPU, memory, energy…).

As explained in several official Unity talks (that I’ll try and sum up in this article), there are plenty of things you should be cautious about as you improve and expand your Unity project – when the time comes to actually build and optimise, there are various bottlenecks you should look out for (and those can vary a lot from one project to another)…

Important note: I won’t necessarily be applying all of these in my GitHub repository, because my RTS tutorial project is not really intended to be built and released in production; think of this article more as a “shopping list” of possible improvements that you can read through and apply to your own issues.

Foreword: debugging and finding the bottlenecks

Before we talk about how to optimise the project and fix some issues, it’s important to assess whether your project indeed needs all this extra work! Depending on your target platform and your current requirements, it might be a bit “overkill” and time better spent actually improving the logic – especially in a production context where you and your teammates have deadlines to meet and will often prioritise the contents of the game over its performance, at least at the beginning.

Something crucial is that, generally speaking, you should profile on the target device: it’s obviously not very relevant to inspect your app performance on a high-end PC if you ultimately plan on releasing the game for tablets and phones. The moment you start to dive into optimisation and therefore profiling, you should try your best to do all your tests and assessments on the expected platform so that the results are reliable.

To let you analyse on the build platform, the tools I discuss below can be linked to your dev builds by checking the “Autoconnect Profiler” build setting:

Unity’s main profiler (built-in)

The most intuitive way to quickly spot what’s truly bogging your Unity project down is to use the built-in profiler to inspect the CPU/memory usage and other interesting parameters of your project when it’s run.

This profiler can be opened just by going to the “Window > Analysis > Profiler” menu, and it is split into two main parts: the top half is where you choose what to profile and get a continuous bird’s-eye view of the chosen metrics, and the bottom half gives you detailed profiling for a specific selected frame.

To actually feed this window with data, make sure you enable the record button at the very top; as soon as you’ve started the recording/play mode, you will get a continuous analysis of your game performance while it’s running – usually, you record either the CPU or the memory usage.

CPU usage

Memory usage

Here, I clearly see that my GameScene takes a heavier toll on the CPU and memory, for example – which is to be expected!

If you pause the recording and click somewhere on the timeline, then you’ll get more detailed info on your metrics at this particular frame – you can even take a sample to fill the bottom half and get a more detailed view.

CPU usage

For the CPU usage, the detailed view can show us the info in various forms: a sorted list of the CPU usage of each process, or a timeline where we see the portion of the frame devoted to each of those processes…

We’ll see in just a second how to use this info to improve our game performance a bit! 🙂

Memory usage

If I show the detailed memory usage of my scene, Unity automatically sorts the result with the heaviest consumers at the top.

I see that some DLLs consume a lot, but so do some 2D sprites for my UI, like the background of the top bar. We’ll see in one of the following sections how to optimise this.

The frame debugger (built-in)

Another useful tool is the frame debugger. Again, you can open it by going to the “Window > Analysis > Frame debugger” menu – and this time, this tool lets you see how a particular frame of your game is drawn, one call at a time, so that you understand exactly what passes Unity uses to construct the final render:

The memory profiler (additional package)

If you want a more in-depth analysis of the memory usage of your game, you can also add another separate package, the memory profiler, to quickly get an intuitive overview of how much of the total memory each asset in your game consumes.

It can be installed by going to the Package Manager window and searching for “memory profiler” in the Unity registry:

Note: because the memory profiler package is still in preview, be sure to enable the preview packages in your Project Settings to get it in the list!

If you open it while your game is running, you’ll be able to take a “snapshot” of the memory at this exact time, and the window then lists all the recorded snapshots so that you can re-analyse them in the future:

So, if you have some snapshots recorded, you can just click the “Open” button to instantly get a global overview of how your memory was split at the moment of the snapshot. You can even click on the large blocks to get a more detailed split!

You can also use the “Diff” button in the bottom-left corner to get a list of all the differences in the memory usage between two given snapshots.

A very game-specific improvement: optimising our FogRendererToggler class

If we take a deeper look at our CPU usage profiling from before, we notice that each frame, a lot of the time is dedicated to the FogRendererToggler class, and more precisely its LateUpdate() function.

If we take another profiling record with the “Deep Profile” option enabled, we get an even more detailed split of the call stack and we see that it’s actually the ReadPixels() function in our GetColorAtPosition() method that is slowing things down:

The problem here is that every frame of the game, we are reading all the pixels of the RenderTexture that contains the unexplored areas to toggle the renderer of the current unit on or off.

The best solution would obviously be to implement some logic to notify the FogRendererToggler of the changes of the RenderTexture, so that it only updates when those happen, instead of “polling” the texture all the time.

But as a quick improvement (not perfect, but already valuable), we can take advantage of Unity coroutines to reduce the frequency of this costly pixel-reading. In the following code (my updated FogRendererToggler.cs script), I’m designating one arbitrary instance of the script as the “main instance” and giving this instance a specific coroutine that’s run every 0.5 seconds and that updates the shared static _shadowTexture.
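As a rough illustration of this pattern, here is a minimal sketch. It is a reconstruction under assumptions rather than the exact script from the repository: the fogCamera field, the threshold comparison and the helper names are hypothetical, and only the “main instance”, the 0.5-second coroutine and the shared static _shadowTexture come from the description above.

```csharp
using System.Collections;
using UnityEngine;

public class FogRendererToggler : MonoBehaviour
{
    // Assumed inspector fields (hypothetical names).
    public Renderer myRenderer;
    public Camera fogCamera;                      // camera rendering the unexplored areas into a RenderTexture
    [Range(0f, 1f)] public float threshold = 0.1f;

    private const float RefreshInterval = 0.5f;   // seconds between two pixel reads

    // Shared by all instances: the pixels are read once per tick, not once per unit.
    private static Texture2D _shadowTexture;
    private static FogRendererToggler _mainInstance;

    private void OnDisable()
    {
        // If the main instance goes away, stop its coroutine so another instance can take over.
        if (_mainInstance == this)
        {
            StopAllCoroutines();
            _mainInstance = null;
        }
    }

    private void LateUpdate()
    {
        // The first instance to get here (or the next one after the owner was disabled
        // or destroyed) becomes the main instance and owns the refresh coroutine.
        if (_mainInstance == null)
        {
            _mainInstance = this;
            StartCoroutine(RefreshShadowTexture());
        }
        if (_shadowTexture == null) return;

        // Cheap per-frame work: a single pixel lookup in the cached CPU-side copy.
        // Assumption: brighter pixels mean "explored", so the unit is shown.
        myRenderer.enabled = GetColorAtPosition().grayscale >= threshold;
    }

    private IEnumerator RefreshShadowTexture()
    {
        var wait = new WaitForSeconds(RefreshInterval);
        while (true)
        {
            ReadFogPixels();   // the expensive part, now run only twice per second
            yield return wait;
        }
    }

    private void ReadFogPixels()
    {
        RenderTexture source = fogCamera.targetTexture;
        if (_shadowTexture == null)
            _shadowTexture = new Texture2D(source.width, source.height, TextureFormat.RGB24, false);

        // ReadPixels copies from the currently active RenderTexture.
        RenderTexture previous = RenderTexture.active;
        RenderTexture.active = source;
        _shadowTexture.ReadPixels(new Rect(0, 0, source.width, source.height), 0, 0);
        RenderTexture.active = previous;
    }

    private Color GetColorAtPosition()
    {
        // Map the unit's world position to a pixel of the cached texture.
        Vector3 viewportPos = fogCamera.WorldToViewportPoint(transform.position);
        int x = Mathf.Clamp((int)(viewportPos.x * _shadowTexture.width), 0, _shadowTexture.width - 1);
        int y = Mathf.Clamp((int)(viewportPos.y * _shadowTexture.height), 0, _shadowTexture.height - 1);
        return _shadowTexture.GetPixel(x, y);
    }
}
```

In this sketch, every instance still does a per-frame lookup, but only the main instance pays for ReadPixels(), and only once per interval.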

Also, we can actually completely remove this script on our own units since, by definition, they will always be in sight and visible! 😉

For the players, the difference between an “every-frame update” and an “every-0.5 seconds update” should be negligible. If you find it a bit too visible, though, you can of course shorten this interval – just remember that the shorter it is, the more often the slow parts of the code will be called and so the less performant the game will be!

After this change, the CPU usage profiling shows me spikes and lower parts: every spike corresponds to the 0.5-second tick when the script does its heavy work of reading all the pixels; but at least the rest of the time, it’s far less of a CPU eater 😉

This is a short example of how profiling can help you identify and mitigate or even fix some CPU/memory usage issues in your game 🙂

Importing your images with the right settings (textures and sprites)

But the first real “quick-win” you can do in your project is to make sure you are not over-consuming memory because of overly heavy images!

By default, Unity imports all of your textures and sprites with a pretty high maximum size (2048×2048 pixels) – but sometimes, you don’t actually need that high a quality…

For example, lots of UI sprites are for buttons or icons that will ultimately be shown no more than 64 pixels wide on your screen; for all these “small” images, we can explicitly tell Unity to lower the size in the import settings so that they consume less memory upon loading 🙂

To do this, simply select one of your image assets and pick a lower res in the size dropdown:

On large projects with lots of images, this can lead to significant improvements in the memory requirements! For example, in my case, here is a before/after of the 2D texture memory tree map if I reduce the size of my “wood plank” background images:

Before

After

Here, the disk size for my textures has gone from 4.5 MB to roughly 0.2 MB – cool, right? 🙂

As explained in this video, you can further reduce the size of your images by:

  • using Atlas textures or sizes that are a power-of-two (to ensure all compression algorithms can be run)
  • removing the alpha channel for opaque images
  • toggling off the “Read/Write Enabled” option
  • using a 16bit colour format instead of 32bit
  • disabling mipmap for your UI elements

Of course, these optimisations require you to make some trade-offs between quality and size – so always ask yourself whether optimising this part of your project is the best solution, with regard to actual game experience!
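If you have many images, you may prefer to apply some of these settings automatically instead of clicking through each asset. The following editor script is a minimal sketch of that idea using Unity’s AssetPostprocessor; the folder path and the 256-pixel limit are assumptions to adapt to your own project, and the script has to live in an Editor/ folder.

```csharp
using UnityEditor;

// Editor-only script: place it in an "Editor" folder so it is excluded from builds.
public class UiTexturePostprocessor : AssetPostprocessor
{
    // Hypothetical folder holding the small UI icons and buttons.
    private const string UiFolder = "Assets/Sprites/UI/";

    // Called by Unity before each texture is imported (or re-imported).
    private void OnPreprocessTexture()
    {
        if (!assetPath.StartsWith(UiFolder)) return;

        var importer = (TextureImporter)assetImporter;
        importer.maxTextureSize = 256;   // assumed limit: plenty for icons shown at ~64 px
        importer.mipmapEnabled = false;  // UI elements are drawn at a fixed size
        importer.isReadable = false;     // same as unticking "Read/Write Enabled"
    }
}
```

Keep in mind that a postprocessor like this runs on every import and will override settings you changed by hand on the matching assets.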

Optimising your audio assets

Another problem I currently have in my project is with my audio files. If I look at the memory snapshot I took when my game scene was running, I see that the part devoted to AudioClips is actually quite big… and mostly because of this “construction site sound” that takes a lot of space!

And, even more interesting: it is not actually used right now. Yep, you heard me right (pun intended…): our sound isn’t playing anywhere, it has no referrer (see the table at the bottom), but it is still using more than 20 MB in memory!

Now, why is that? The problem is, again, with our import settings. If I take a look at my construction_site.wav file, I see that it is currently using the default settings and, in particular, it is “decompressed on load” and the audio data is “preloaded”.

In other words, a somewhat long and heavy audio file that is rarely used is just sitting there, waiting, from the very moment I start my game. That’s just too bad!

A possible solution in this case is to instead switch to the “streaming” mode:

This way, the file will be read one small chunk at a time while it’s played. If we take a new snapshot after this change, we see a significant reduction of the memory usage for AudioClips, and we indeed get a very small size for our “construction site” sound 😉
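The same switch can be made from an editor script if you want it applied to a whole folder of long clips. This is a minimal sketch under assumptions: the folder path is hypothetical, and only the load type is changed, leaving every other import setting as it is.

```csharp
using UnityEditor;
using UnityEngine;

// Editor-only script: place it in an "Editor" folder.
public class StreamingAudioPostprocessor : AssetPostprocessor
{
    // Hypothetical folder for long, rarely played sounds.
    private const string AmbientFolder = "Assets/Audio/Ambient/";

    // Called by Unity before each audio clip is imported (or re-imported).
    private void OnPreprocessAudio()
    {
        if (!assetPath.StartsWith(AmbientFolder)) return;

        var importer = (AudioImporter)assetImporter;

        // The sample settings are a struct: copy, modify, then assign back.
        AudioImporterSampleSettings settings = importer.defaultSampleSettings;
        settings.loadType = AudioClipLoadType.Streaming;
        importer.defaultSampleSettings = settings;
    }
}
```

Streaming trades memory for a bit of CPU and disk access during playback, so it is usually a better fit for long ambient tracks than for short sounds that are triggered all the time.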

Using public fields instead of properties when in prod

In this project, I’ve been using C# properties a lot: those are the getters and setters we created in our various classes and that allow us to interact with private fields from outside. They are pretty useful while in the dev phase because they allow you to neatly encapsulate your data in each class and control the accessing interfaces at a very granular level.

However, when you actually build your game and go into prod, these properties can cause a slight overhead. It’s usually not an issue, but it can be interesting to avoid them in tight loops. A nice solution can be to use preprocessor directives to conditionally use properties (in dev) or simple public variables (in prod):
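Here is a minimal sketch of what this could look like, assuming a hypothetical UnitData class with a Healthpoints member. UNITY_EDITOR and DEVELOPMENT_BUILD are Unity’s built-in scripting symbols; everything else is illustrative.

```csharp
using System;

public class UnitData
{
#if UNITY_EDITOR || DEVELOPMENT_BUILD
    // Dev: encapsulated field with a check that fails loudly on invalid values.
    private int _healthpoints;
    public int Healthpoints
    {
        get { return _healthpoints; }
        set
        {
            if (value < 0) throw new ArgumentOutOfRangeException(nameof(value));
            _healthpoints = value;
        }
    }
#else
    // Release: plain public field with the same name, so calling code compiles unchanged.
    public int Healthpoints;
#endif
}
```

Because both variants expose the same name, code such as `unit.Healthpoints -= 10;` works in either configuration. The catch is that the release build no longer runs the check, and trivial accessors are often inlined by the compiler anyway, so it is worth profiling before committing to this pattern.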

Using hash values instead of strings

Lots of Unity setters accept either a key string or an int hash to refer to the setting to change; for example, the Animator.SetTrigger() and Material.SetTexture() methods can be used in the following way:
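For instance, a string-based call could look like the sketch below; the “Attack” trigger and the component references are hypothetical, while “_MainTex” is the usual main texture property of Unity’s built-in shaders.

```csharp
using UnityEngine;

public class StringKeyExample : MonoBehaviour
{
    // Assumed references, assigned in the inspector.
    public Animator animator;
    public Material material;
    public Texture texture;

    private void Start()
    {
        // Both setters receive the key as a string.
        animator.SetTrigger("Attack");
        material.SetTexture("_MainTex", texture);
    }
}
```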

But they can also accept an int argument, where the int is the hash matching the string from before:
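A minimal sketch of the int-based variant: Animator.StringToHash() and Shader.PropertyToID() compute the ids once, and the setters then receive the cached values. The “Attack” trigger and the component references are hypothetical.

```csharp
using UnityEngine;

public class HashedKeyExample : MonoBehaviour
{
    // Computed once and reused by every call.
    private static readonly int AttackTrigger = Animator.StringToHash("Attack");
    private static readonly int MainTexId = Shader.PropertyToID("_MainTex");

    // Assumed references, assigned in the inspector.
    public Animator animator;
    public Material material;
    public Texture texture;

    private void Start()
    {
        // Same effect as passing the string keys, without hashing on each call.
        animator.SetTrigger(AttackTrigger);
        material.SetTexture(MainTexId, texture);
    }
}
```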

The thing is that, under the hood, Unity only works with hashes – so when you pass the function the string key, it has to re-hash every time. By “baking” the hash one time and then using this int value, you slightly reduce the required computation.

This tip is again pretty specific but it can be useful for parts of the codebase that are called very frequently.

The Resources/ folder

This is a somewhat well-known Unity gotcha, but the Resources/ folder, although it’s really handy, is not the most optimised way to store and load assets in your Unity project. One of the main problems with this folder is that whatever is inside it will get bundled in your game builds… even the unused assets!

What’s worse is that, because the Resources/ folder is actually converted to a large file upon build, with a look-up table at the beginning to reference the various assets, those unused assets will fill this table for nothing and slow down the start of your game.

Nowadays, it’s recommended to use the Addressable Assets system if possible because it offers better memory management and more optimised loading, and makes it easier to add bundles or DLCs to your game after a first release. It is, however, a bit harder to use than the plain old Resources/ folder so, if you’re working on a prototype or your project is not concerned with these issues, you can stick with the old method…
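To give an idea of the difference in usage, here is a minimal sketch of loading a prefab through the Addressables package. It assumes the package is installed and that a prefab has been marked as addressable with the hypothetical address “Units/Soldier”.

```csharp
using UnityEngine;
using UnityEngine.AddressableAssets;
using UnityEngine.ResourceManagement.AsyncOperations;

public class UnitPrefabLoader : MonoBehaviour
{
    // Hypothetical address given to the prefab in the Addressables Groups window.
    private const string SoldierAddress = "Units/Soldier";

    private AsyncOperationHandle<GameObject> _handle;

    private void Start()
    {
        // Resources/ equivalent: Resources.Load<GameObject>("Units/Soldier"),
        // which is synchronous and requires the asset to sit in a Resources/ folder.
        _handle = Addressables.LoadAssetAsync<GameObject>(SoldierAddress);
        _handle.Completed += OnPrefabLoaded;
    }

    private void OnPrefabLoaded(AsyncOperationHandle<GameObject> handle)
    {
        if (handle.Status == AsyncOperationStatus.Succeeded)
            Instantiate(handle.Result);
    }

    private void OnDestroy()
    {
        // Releasing the handle lets the system unload the asset once nobody uses it.
        if (_handle.IsValid())
            Addressables.Release(_handle);
    }
}
```

The loading is asynchronous and the asset is explicitly released, which is where the better memory management comes from, and also why it takes a bit more code than a single Resources.Load() call.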

For more info on the Addressable Assets package, I really recommend you check out this great post by The Gamedev Guru 🙂

Creating a top-notch UI hierarchy!

And now, for the real treat: the UI. As explained in the aforementioned talks, properly organising and splitting your UI canvases is usually the key to gaining performance! Or, more precisely, it’s because of bad UI setups that your code runs slowly and inefficiently 😉

In his 2016 conference, Ian Dundore spends about a third of the time discussing how Unity’s UI system is far from perfect and still relies on some weird design decisions that can quickly result in slow games.

I really encourage you to take a look at the talk for more details, but one of the main take-aways is to always batch as much as possible. In theory, the Canvases are supposed to do this for you… but the problem is that, at the moment, they’re not yet doing it perfectly.

In particular, any time a drawable object inside of a Unity UI Canvas changes (an image changes its Sprite reference, a text gets a different font colour or size…), the Canvas has to recalculate all its draw calls and analyse all of its drawable children to check which one should still be visible and rendered on-screen. To do this, it needs to sort them by depth because all of the UI is considered “Transparent” (for shader-savvies: it’s submitted to the “Transparent” queue on the GPU)… but sorting is always a tricky thing, because it scales worse than linearly 🙁

So, basically, when you start to have complex UIs with lots of elements that, statistically, change pretty often here and there, you have to (inefficiently) rebuild a massive hierarchy of game objects almost every single frame!

To fix things, you can try to:

  • merge sprites or text objects, if possible
  • or split up Canvases

Canvases can be nested inside one another, so you can create sub-Canvases to better organise and isolate the children. A clever trick is, for example, to move all the elements that change frequently to a Canvas separate from the static elements.

A few additional tips

Choosing the right data structure

Whenever you need to store a collection of objects, you’ll need to choose which data structure to use among the ones that C# offers. Usually, it comes down to three choices: arrays, dictionaries or lists.

There is no absolute right or wrong answer here: what you need to assess is what your code does with this collection most often, so that you can pick the best data structure for this task. For example, if you need quick indexing, then you should rely on arrays or lists. If you need to look up, add and remove items by key, dictionaries are probably more efficient.
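As a small illustration of this trade-off, the sketch below contrasts the two access patterns in plain C#. The UnitRegistry class and its method names are hypothetical; the complexity notes in the comments are the standard ones for List<T> and Dictionary<TKey, TValue>.

```csharp
using System.Collections.Generic;

public static class UnitRegistry
{
    // List: elements are stored contiguously, so reading by index is O(1),
    // but removing an element shifts everything after it (O(n)).
    public static string GetByIndex(List<string> units, int index)
    {
        return units[index];
    }

    // Dictionary: lookup, insertion and removal by key are O(1) on average,
    // at the cost of extra memory and no positional indexing.
    public static bool RemoveById(Dictionary<int, string> unitsById, int id)
    {
        return unitsById.Remove(id);
    }
}
```

So a fixed list of unit types that you iterate over or index is a good fit for an array or a list, whereas a set of live units that spawn and die all the time, each identified by an id, is a better fit for a dictionary.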

Don’t hesitate to check out the Microsoft C# docs for more info on the advantages and drawbacks of each data structure! 😉

Caching references

If your code needs to access the same object or component frequently, you should avoid re-computing the reference every time and instead cache it once at the beginning (or as soon as the object/component exists).

So, for example, if you’re using GetComponent(), cache the result into a variable so that you don’t have to re-call it again and again later on (because this function needs to iterate through all the components on your object and is very inefficient).

Similarly, you should avoid calling Camera.main (to get a reference to your main camera) too often, since as explained in the docs:

Internally, Unity caches all GameObjects with the “MainCamera” tag. When you access this property, Unity returns the first valid result from its cache. Accessing this property has a small CPU overhead, comparable to calling GameObject.GetComponent. Where CPU performance is important, consider caching this property.
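Both pieces of advice boil down to the same pattern, sketched below with a hypothetical script that pushes its object towards the camera: the references are fetched once in Awake() and reused afterwards.

```csharp
using UnityEngine;

public class CachedReferencesExample : MonoBehaviour
{
    // Cached once, reused every physics step.
    private Rigidbody _rigidbody;
    private Camera _mainCamera;

    private void Awake()
    {
        _rigidbody = GetComponent<Rigidbody>();
        _mainCamera = Camera.main;
    }

    private void FixedUpdate()
    {
        // Either reference can be missing (no Rigidbody attached, no camera tagged "MainCamera").
        if (_rigidbody == null || _mainCamera == null) return;

        // No GetComponent() or Camera.main lookup here, just the cached fields.
        Vector3 toCamera = _mainCamera.transform.position - transform.position;
        _rigidbody.AddForce(toCamera.normalized);
    }
}
```

If the main camera can change during the game, the cached field has to be refreshed when that happens; otherwise it keeps pointing at the old camera.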

Cleaning up unused Unity Events

When you create a new C# script in Unity, it is automatically filled with two methods: Start() and Update(). Those are of course empty, but they are nonetheless defined and ready to react to the “start” and “update” event calls Unity makes automatically.

So, if you’re not actually using those methods, remember to remove them from the script to avoid useless calls!

Avoiding nested/deep Game Objects hierarchy

To make it easier on the garbage collector and to avoid recomputing the child transforms too often, try to refrain as much as possible from creating nested hierarchies. If you do need some depth in your Game Object hierarchy, for example to transfer translations, rotations or scales, a first optimisation can be to cut it down into as many flattened hierarchies as possible.

Conclusion

Today’s article was fairly abstract and it doesn’t result in any new visual feature, or big bug fix… but it’s important to keep in mind that video games, being a real-time interactive medium, also have all these technical constraints associated with them, and that optimisation is a crucial step to building production-ready projects.

I’ve discussed various optimisations and Unity tricks that you could use to boost the performance of your project and make the most out of the game engine, but I obviously didn’t cover everything and if you really want to dive into optimisation, you should definitely have a look at the resources I gave at the very beginning (for starters):

Next time, we’ll have a short interlude to see how shaders can help us improve our healthbar system…
