What Are the Typical Game Performance Problems and How Optimization Solves Them?
A project runs smoothly on developer devices. On a five-year-old mid-range Android phone, it hits 20 fps and overheats after five minutes. On iPhone 11 it maintains 60 fps, but on iPhone XR it drops in heavy scenes. We encounter this every day. Optimization is not a task to leave for later: it is an architectural decision from the first commit. With eight years of experience, we have optimized over fifty games, from hypercasual to AAA on consoles. We guarantee: after our audit, you get not just a list of problems, but a concrete plan with measurable goals and deadlines. Order a performance audit: we evaluate your project in three days and show how to reduce draw calls by 40% without losing quality.
How to Profile Game Performance? Tools and Best Practices
Before any optimization, measure. Optimization without profiling is guesswork.
| Tool | Purpose |
|---|---|
| Unity Profiler | CPU/GPU time per system, GC allocations, audio |
| Frame Debugger | Inspect each draw call in a frame |
| Memory Profiler | Memory snapshot, asset dependency graph |
| RenderDoc | Deep GPU state analysis, relevant for PC/Console |
| Android GPU Inspector | GPU profiling on real Android devices |
| Xcode Instruments | GPU + memory on iOS (Metal Performance HUD) |
| Snapdragon Profiler | Qualcomm GPU—detailed shader statistics |
Profile on target hardware, not in the editor. The editor adds overhead, so Play Mode numbers are not representative. Distinguish between CPU and GPU bottlenecks: a CPU-bound frame usually comes from too many draw calls or heavy game logic, a GPU-bound frame from complex shaders or overdraw. According to the Unity documentation, profiling on device is mandatory for mobile games. Typical profiling time for one scenario is five to eight hours, including metrics collection on three to five devices of different generations.
Additional profiling tips
- Enable Deep Profile in the Unity Profiler only when needed – it instruments every method call and adds up to 50% overhead.
- Run at least three captures per scenario to get reliable averages.
- Add custom profiler markers (ProfilerMarker) around your own systems (AI, animation, network) to correlate spikes with them on the timeline.
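The advice to average several captures can be sketched as follows. This is a minimal, tool-agnostic example, assuming the per-frame times have already been exported from the profiler as lists of milliseconds; the function name `summarize_captures` and the sample numbers are hypothetical. Besides the average, it reports the 95th-percentile frame time, because spikes are easy to miss in an average.

```python
from statistics import mean


def summarize_captures(captures_ms):
    """Summarize several captures of per-frame times (in milliseconds)."""
    all_frames = sorted(t for capture in captures_ms for t in capture)
    avg_ms = mean(all_frames)
    # Nearest-rank 95th percentile: index = ceil(0.95 * n) - 1.
    rank = max(0, -(-95 * len(all_frames) // 100) - 1)
    return {
        "avg_ms": avg_ms,
        "avg_fps": 1000.0 / avg_ms,
        "p95_ms": all_frames[rank],
    }


# Hypothetical data: three captures of the same scenario on one device.
captures = [
    [16.0, 17.0, 16.0, 33.0],
    [16.0, 16.0, 18.0, 16.0],
    [17.0, 16.0, 16.0, 35.0],
]
summary = summarize_captures(captures)
```

In this made-up data the average frame time is about 19.3 ms, while the 95th percentile is 35 ms, which is the number a player actually feels as a hitch.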
How Does Game Performance Optimization Reduce Draw Calls? Static Batching to SRP Batcher
A draw call is a command from CPU to GPU to “draw this.” Each call has overhead regardless of geometry complexity. On mobile, 200–300 draw calls per frame is typical. The goal is to minimize their number by combining geometry that shares the same material.
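To make the idea of combining geometry by material concrete, here is a rough, engine-agnostic estimate in Python. It is a sketch under simplifying assumptions, not how Unity counts batches: every object is treated as one draw call, static objects that share a material are assumed to collapse into a single call, and dynamic objects are left unbatched. The function name and the scene data are hypothetical.

```python
def estimate_draw_calls(objects):
    """Return (unbatched, ideally_batched) draw-call counts for a scene.

    Each object is a dict with a "material" name and a "static" flag.
    Real batching is further limited by vertex counts, lightmaps, and
    render order, so the second number is an optimistic lower bound.
    """
    unbatched = len(objects)
    static_materials = {o["material"] for o in objects if o["static"]}
    dynamic_count = sum(1 for o in objects if not o["static"])
    return unbatched, len(static_materials) + dynamic_count


# Hypothetical scene: five static props on two materials and one moving enemy.
scene = [
    {"material": "rock", "static": True},
    {"material": "rock", "static": True},
    {"material": "rock", "static": True},
    {"material": "brick", "static": True},
    {"material": "brick", "static": True},
    {"material": "enemy", "static": False},
]
before, after = estimate_draw_calls(scene)
```

For this scene the estimate drops from six draw calls to three, which is why sharing materials across static props is usually the first thing to check.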
Static Batching combines stationary meshes during build. Requirement: Static flag on objects and the same material. Effective for static environments but increases memory usage – the combined mesh is stored separately. In scenes with thousands of static objects, monitor memory via Memory Profiler.
Dynamic Batching combines meshes at runtime with strict limits: fewer than 900 vertex attributes per mesh (about 300 vertices when the shader uses position, normal, and one UV set), same material and scale. In practice, it works only for small objects (particles, UI). Disabled by default in URP – replaced by SRP Batcher.
SRP Batcher is not classic batching: it reduces the CPU cost of preparing draw calls rather than their number. Instead of re-uploading material data every frame, SRP Batcher keeps it in persistent GPU memory and updates it only when it changes. The number of draw calls stays the same, but each takes less CPU time, sometimes 2–3 times less render CPU time than standard batching. Requirement: the shader must be compatible with SRP Batcher. Built-in per-object engine properties (such as unity_ObjectToWorld) belong in the UnityPerDraw CBUFFER, and all material properties must be declared in a single CBUFFER named UnityPerMaterial, that is, between CBUFFER_START(UnityPerMaterial) and CBUFFER_END. The UNITY_INSTANCING_BUFFER macros belong to GPU Instancing and are not needed for SRP Batcher compatibility. Standard URP Lit/Unlit shaders are compatible. For custom shaders, check the shader's Inspector, which reports whether the shader is SRP Batcher compatible. To enable it, make sure the SRP Batcher option is active in the URP Asset. After enabling, the RenderLoop.Draw CPU time should decrease in the Profiler.
GPU Instancing is for many copies of the same mesh with the same material (trees, grass, NPCs). It sends one draw call with an array of per-instance data. Enable on material: Enable GPU Instancing. Limitation: all instances in one batch must have the same material and mesh. Graphics.DrawMeshInstanced / Graphics.DrawMeshInstancedIndirect enable procedural rendering without GameObject overhead.
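How many draw calls instancing leaves behind is simple arithmetic. The sketch below assumes the limit of 1023 instances for a single Graphics.DrawMeshInstanced call and shows how a list of per-instance transforms would be split into batches; the helper names are hypothetical and the code is a Python model, not engine code.

```python
import math

# Assumed per-call limit of Graphics.DrawMeshInstanced.
MAX_INSTANCES_PER_CALL = 1023


def instanced_draw_calls(instance_count, batch_limit=MAX_INSTANCES_PER_CALL):
    """Number of instanced draw calls needed for instance_count copies."""
    return math.ceil(instance_count / batch_limit)


def split_into_batches(items, batch_limit=MAX_INSTANCES_PER_CALL):
    """Split per-instance data (e.g. transforms) into per-call chunks."""
    return [items[i:i + batch_limit] for i in range(0, len(items), batch_limit)]


# Hypothetical forest: 5000 trees sharing one mesh and one material.
tree_ids = list(range(5000))
calls = instanced_draw_calls(len(tree_ids))
batch_sizes = [len(batch) for batch in split_into_batches(tree_ids)]
```

Under these assumptions 5000 trees need five instanced calls instead of 5000 individual ones.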
The choice of method depends on scenario. Static Batching for static environments. SRP Batcher wins in projects with many unique materials. GPU Instancing is indispensable for mass objects (forests, crowds). Dynamic Batching only for small and rare cases. In practice, we combine all methods, starting with profiling.
| Method | Object Type | CPU Impact | GPU Impact | RAM Consumption |
|---|---|---|---|---|
| Static Batching | Static | Moderate reduction | No change | Increases |
| Dynamic Batching | Small (≤900 vertex attributes, about 300 vertices) | Reduction | No change | No change |
| SRP Batcher | Any (compatible shaders) | Significant reduction (2–3×) | No change | No change |
| GPU Instancing | Copies of same mesh | Significant reduction (one call per batch) | Minimal | Slight increase |
For more details on the technology, see Geometry instancing (Wikipedia).
Why Is Memory Optimization Critical for Mobile Games?
Mobile platforms have strict RAM limits. iOS terminates an app under memory pressure with little or no warning. Android behaves similarly: it signals pressure through the onLowMemory callback before its low-memory killer ends the process. Target budgets: under 1 GB on iOS for modern devices, under 512 MB for iPhone 8/X support; under 800 MB on Android for broad compatibility (the OS itself uses 400–600 MB). Typical savings after our optimization are 30–50% of RAM while maintaining quality.
Addressables and Asset Bundles: How to Stay Within Budget
Loading everything at startup is unacceptable for large projects. Addressables (a wrapper over Asset Bundles) provide addressable asynchronous asset loading. Explicit unload: Addressables.ReleaseInstance / Addressables.Release. Addressables do not automatically unload assets when objects are destroyed. A common mistake: Addressables.InstantiateAsync in a loop without Release – memory grows until crash. Reference counting: an asset is unloaded only when all its handles are freed. Architectural pattern: a service/manager holds the handle of the loaded asset and releases it during scene transitions.
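The reference-counting rule described above can be illustrated with a toy model. This is deliberately not the Addressables API: `AssetCache`, `acquire`, and `release` are hypothetical names, and the loader is a stand-in lambda. The point is only the bookkeeping: an asset stays in memory until every acquire has been matched by a release.

```python
class AssetCache:
    """Toy model of handle-based reference counting (not Addressables)."""

    def __init__(self, loader):
        self._loader = loader
        self._assets = {}  # key -> loaded asset
        self._refs = {}    # key -> number of outstanding handles

    def acquire(self, key):
        """Load the asset on first use and count one more handle."""
        if key not in self._assets:
            self._assets[key] = self._loader(key)
            self._refs[key] = 0
        self._refs[key] += 1
        return self._assets[key]

    def release(self, key):
        """Drop one handle; unload the asset when none remain."""
        if self._refs.get(key, 0) == 0:
            raise KeyError(f"release without matching acquire: {key}")
        self._refs[key] -= 1
        if self._refs[key] == 0:
            del self._assets[key]
            del self._refs[key]

    def loaded_keys(self):
        return sorted(self._assets)


cache = AssetCache(loader=lambda key: f"<asset {key}>")
cache.acquire("enemy")
cache.acquire("enemy")
cache.release("enemy")
still_loaded = cache.loaded_keys()   # one handle is still outstanding
cache.release("enemy")
after_release = cache.loaded_keys()  # last handle freed, asset unloaded
```

The loop mistake mentioned above corresponds to calling `acquire` repeatedly and never calling `release`: the count only grows and nothing is ever unloaded.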
Groups and Bundle Strategy: group assets by loading logic. For example, all assets of one level in one bundle, shared assets (UI, fonts) in a separate group with Prevent Updates. This strategy saves up to 30% memory.
Texture Memory: Where 70% of RAM Comes From
Textures are the main memory consumer. Analyze via Memory Profiler: All Of Memory -> Texture2D lists the heaviest textures immediately. In practice, we often find textures with an inflated Max Size (4096 for a mobile icon is a typical mistake). Measures: enable mipmaps for textures on 3D objects and disable them for UI; use Streaming Mipmaps for open worlds, which loads higher mip levels only as the camera approaches. A common problem: textures referenced by unused Materials remain in memory. Memory Profiler shows the reference chain, so the unnecessary materials can be found and removed. After replacing all RGBA32 textures with ASTC 6×6 on Android, savings reach 60% without visible quality loss.
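The arithmetic behind the RGBA32 versus ASTC 6×6 comparison can be sketched as follows. This is a back-of-the-envelope model, not a Unity API: the function names are hypothetical, the 4/3 mip-chain factor is an approximation, and the only format facts assumed are that RGBA32 stores 4 bytes per texel and that every ASTC block occupies 16 bytes regardless of its footprint.

```python
import math


def rgba32_bytes(width, height):
    """Uncompressed RGBA32: 4 bytes per texel."""
    return width * height * 4


def astc_bytes(width, height, block_w=6, block_h=6):
    """ASTC: each block is 16 bytes; partial blocks at the edges round up."""
    blocks = math.ceil(width / block_w) * math.ceil(height / block_h)
    return blocks * 16


def with_mip_chain(base_bytes):
    """A full mip chain adds roughly one third on top of the base level."""
    return base_bytes * 4 / 3


# Hypothetical 2048x2048 texture, base level only.
uncompressed = rgba32_bytes(2048, 2048)
compressed = astc_bytes(2048, 2048)
saving = 1 - compressed / uncompressed
```

For this single texture the model gives 16 MiB uncompressed against about 1.8 MiB in ASTC 6×6. That is a per-texture ratio; the saving across a whole project depends on how much of its memory such textures occupied in the first place.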
GC Allocations: How to Eliminate Freezes in Hot Path
Unity's C# garbage collector is stop-the-world; incremental mode spreads the pause across several frames but does not remove the cost. If heap memory is allocated every frame, GC pauses cause visible freezes. Goal: zero allocations in the hot path (Update, FixedUpdate, rendering). Typical sources and fixes: string concatenation in Update (reuse a StringBuilder); LINQ in the hot path (use manual loops over pre-allocated lists); GetComponent<T>() every frame (cache the reference in Awake/Start); boxing of value types passed as object parameters (use generic or typed overloads). After profiling with Unity Profiler, we reduce hot path allocations by 95% and the freezes disappear.
How Can LOD and Culling Cut 40% of Draw Calls?
LOD Group switches to simplified geometry as the object takes up less of the screen, typically because it moves away from the camera. Standard for 3D environment: LOD0 (100% triangles), LOD1 (30–50%), LOD2 (10–15%), Culled. For mobile, set the Culled threshold more aggressively so that less is drawn per frame.
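The selection logic can be modelled in a few lines. The sketch below is an illustration, not the LOD Group implementation: it assumes levels are chosen by the fraction of screen height an object covers, and the threshold values, the 60-degree field of view, and the function names are all hypothetical.

```python
import math

# (minimum screen-height ratio, level), ordered from nearest to farthest.
# Hypothetical thresholds; real values are tuned per asset.
DEFAULT_THRESHOLDS = [(0.60, "LOD0"), (0.30, "LOD1"), (0.10, "LOD2")]
# A more aggressive mobile profile culls small objects earlier.
MOBILE_THRESHOLDS = [(0.60, "LOD0"), (0.30, "LOD1"), (0.20, "LOD2")]


def screen_height_ratio(object_height, distance, vertical_fov_deg=60.0):
    """Fraction of the screen height covered by an object at a distance."""
    half_fov = math.radians(vertical_fov_deg) / 2.0
    visible_height = 2.0 * distance * math.tan(half_fov)
    return object_height / visible_height


def select_lod(ratio, thresholds=DEFAULT_THRESHOLDS):
    """Pick the first level whose threshold the ratio still meets."""
    for min_ratio, level in thresholds:
        if ratio >= min_ratio:
            return level
    return "Culled"


# A 2 m tall object seen from 2, 5, 10, and 50 m.
levels = [select_lod(screen_height_ratio(2.0, d)) for d in (2, 5, 10, 50)]
mobile_at_10m = select_lod(screen_height_ratio(2.0, 10), MOBILE_THRESHOLDS)
```

With these assumed thresholds the object steps through LOD0, LOD1, LOD2, and Culled as it recedes, and the mobile profile already culls it at 10 m.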
Occlusion Culling – Unity does not render objects behind walls. Requires baked occlusion data. For indoor scenes, reduces draw calls by 20–40%.
Frustum Culling works automatically: objects outside the camera's view frustum are not rendered. No draw call is issued for them, but the visibility test itself still costs CPU time for every object. For scenes with thousands of objects, use custom spatial partitioning (Quadtree, Octree) so that whole groups of objects can be rejected at once. In one of our projects, implementing occlusion culling and LOD reduced total draw calls from 2800 to 450 on Android.
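A minimal quadtree shows why spatial partitioning helps. This is an illustrative 2D sketch (a top-down view of the scene), not production code: the class, its capacity of four points per leaf, and the axis-aligned query rectangle standing in for the camera's footprint are all assumptions made for the example.

```python
class Quadtree:
    """Point quadtree over a square region given by min corner and size."""

    def __init__(self, x, y, size, capacity=4):
        self.x, self.y, self.size = x, y, size
        self.capacity = capacity
        self.points = []      # (px, py, payload) stored in this leaf
        self.children = None  # four sub-quadrants once the leaf is split

    def insert(self, px, py, payload):
        inside_x = self.x <= px < self.x + self.size
        inside_y = self.y <= py < self.y + self.size
        if not (inside_x and inside_y):
            return False
        if self.children is None:
            # Keep the point here while there is room (or the cell is tiny).
            if len(self.points) < self.capacity or self.size <= 1e-6:
                self.points.append((px, py, payload))
                return True
            self._split()
        return any(c.insert(px, py, payload) for c in self.children)

    def _split(self):
        half = self.size / 2
        self.children = [
            Quadtree(self.x + dx, self.y + dy, half, self.capacity)
            for dx in (0, half) for dy in (0, half)
        ]
        for point in self.points:
            any(c.insert(*point) for c in self.children)
        self.points = []

    def query(self, qx, qy, qw, qh, found=None):
        """Collect payloads inside the rectangle [qx, qx+qw) x [qy, qy+qh)."""
        found = [] if found is None else found
        # Reject the whole node if it does not overlap the query rectangle.
        if qx >= self.x + self.size or qx + qw <= self.x:
            return found
        if qy >= self.y + self.size or qy + qh <= self.y:
            return found
        for px, py, payload in self.points:
            if qx <= px < qx + qw and qy <= py < qy + qh:
                found.append(payload)
        for child in self.children or []:
            child.query(qx, qy, qw, qh, found)
        return found


# Hypothetical scene: 100 objects on a 10x10 grid inside a 100x100 area.
tree = Quadtree(0, 0, 100)
for i in range(10):
    for j in range(10):
        tree.insert(i * 10 + 5, j * 10 + 5, (i, j))
visible = sorted(tree.query(0, 0, 20, 20))
```

The query touches only the quadrants that overlap the rectangle, so most of the 100 objects are rejected without being tested individually. An octree applies the same idea in three dimensions.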
How to Maintain 72 FPS on Quest 3 with VR Optimization?
VR is a separate class of tasks. The headset's refresh rate of 72 or 90 Hz must be held on every frame; dropped frames cause motion sickness. In addition to standard methods: Single Pass Instanced Rendering – renders both eyes in one pass (halves draw calls); Fixed Foveated Rendering (Quest) – reduces peripheral resolution; Late Latching (Quest) – updates head and controller poses as late as possible before rendering; Dynamic Resolution in URP/HDRP – automatically lowers render resolution when the frame rate drops. For Quest, profile via OVR Metrics Tool, which displays CPU/GPU time directly in the headset. After applying these methods, frame rate on Quest 2 stabilizes at 72 FPS even in scenes with 1.5 million polygons.
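The refresh-rate requirement translates directly into a per-frame time budget. The sketch below shows the arithmetic; the 10% headroom, the function names, and the sample timings are assumptions for illustration, and the check treats CPU and GPU as running in parallel, so the slower of the two decides.

```python
def frame_budget_ms(refresh_hz):
    """Time available per frame at a given refresh rate."""
    return 1000.0 / refresh_hz


def fits_budget(cpu_ms, gpu_ms, refresh_hz, headroom=0.10):
    """True if the slower of CPU and GPU fits the budget with headroom."""
    budget = frame_budget_ms(refresh_hz) * (1.0 - headroom)
    return max(cpu_ms, gpu_ms) <= budget


budget_72 = frame_budget_ms(72)  # about 13.9 ms
budget_90 = frame_budget_ms(90)  # about 11.1 ms

# Hypothetical measurement: 9 ms CPU, 12 ms GPU.
ok_at_72 = fits_budget(9.0, 12.0, 72)
ok_at_90 = fits_budget(9.0, 12.0, 90)
```

The same hypothetical scene that fits comfortably at 72 Hz misses the budget at 90 Hz, which is why the target refresh rate should be fixed before optimization work starts.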
What Deliverables Do You Get from Game Performance Optimization? Stages and Timelines
We offer a comprehensive turnkey service. Here is exactly what you receive:
- Profiling report on target devices – metrics (FPS, draw calls, memory, GC) for 5–7 main scenarios. Delivery: 3–5 business days.
- Priority action plan – which problems are critical, which can be deferred. Priorities based on impact on gaming experience.
- Optimization implementation – batching, LOD, Addressables, shaders, occlusion culling. Average implementation cycle: 2–4 weeks.
- Re-profiling results – measured improvements. Typical FPS increase: 30–60% on mobile devices.
- Documentation – recommendations for maintenance and further development.
- Team training – how to prevent regressions. We conduct workshops on profiling and optimization.
Get a consultation – we evaluate your project for free and show the optimization potential. Contact us to learn exact timelines and details for your stack.





