tags: [vr-ar]
GPU resource profiling in VR games
When a VR game misses its target frame rate, the first reaction is usually to lower the resolution or remove objects. That helps, but it doesn't solve the problem: after the next sprint adds new content, the frame rate drops again. The better approach is to find the specific GPU bottlenecks through profiling and eliminate them.
VR has its own constraints: a standalone Quest with no PC connection does not offer the full GPU capture workflow available on desktop, so tools have to be chosen per platform. For standalone Quest, use OVR Metrics Tool and Snapdragon Profiler. For PCVR (SteamVR, OpenXR on PC), use RenderDoc, NVIDIA Nsight, or AMD Radeon GPU Profiler. For Quest via Link, combine the two: the frame is rendered on the PC, so GPU captures are taken with the desktop tools, while headset-side metrics come from the Quest tools.
RenderDoc: how to read a GPU capture in VR correctly
RenderDoc is the de facto standard for render pipeline analysis. For VR projects in Unity, it connects through the Editor's built-in RenderDoc integration, or you can launch a built player from RenderDoc directly. In Single Pass Instanced mode, a single draw call in RenderDoc covers both eyes' viewports. This is expected and not a sign of a problem.
What to look for first: with action durations enabled in the Event Browser, RenderDoc shows the GPU time of each pass and draw call. In a typical VR scene, the most expensive passes are shadow map rendering, opaque rendering, and transparent rendering. If the shadow map pass takes 3 ms or more out of a budget of 8–9 ms per frame, that is the first target.
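To put a pass duration in context, it helps to convert the refresh rate into a per-frame interval and express the pass as a share of the budget. The sketch below is only arithmetic; the function names are illustrative, and the 8.5 ms budget is an assumed midpoint of the 8–9 ms range mentioned above (the text does not say which refresh rate or how much headroom that range assumes).

```python
def frame_interval_ms(refresh_hz: float) -> float:
    """Time available per frame at a given display refresh rate."""
    return 1000.0 / refresh_hz


def pass_share(pass_ms: float, budget_ms: float) -> float:
    """Fraction of the frame budget consumed by one render pass."""
    return pass_ms / budget_ms


# Frame intervals for common headset refresh rates.
for hz in (72, 90, 120):
    print(f"{hz} Hz: {frame_interval_ms(hz):.2f} ms per frame")

# A 3 ms shadow pass against an assumed 8.5 ms GPU budget.
shadow_share = pass_share(3.0, 8.5)
print(f"Shadow pass uses {shadow_share:.0%} of the budget")
```

A pass that eats about a third of the budget on its own is a clear first candidate, whichever refresh rate the project targets.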
In the draw call list, look for repeating patterns: identical calls that were not batched. If the same mesh is drawn 50 times with separate calls instead of one instanced call, the cost is mostly CPU overhead, not GPU. Such runs can be spotted in RenderDoc's event list as many calls that each take very little GPU time.
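The search for unbatched repeats can be sketched as a simple grouping step. The example assumes the draw list has already been exported into records with a mesh name, a material name, and an instance count; the `Draw` type, the sample data, and the `min_repeats` threshold are all hypothetical and not part of any RenderDoc API.

```python
from collections import Counter
from typing import NamedTuple


class Draw(NamedTuple):
    mesh: str
    material: str
    instance_count: int


def unbatched_groups(draws, min_repeats=10, single_instance_count=1):
    """Count repeated non-instanced draws of the same mesh/material pair.

    single_instance_count is what a non-batched draw reports. In Single Pass
    Instanced stereo this may be 2 (one instance per eye), so adjust it to
    match the capture.
    """
    counts = Counter(
        (d.mesh, d.material)
        for d in draws
        if d.instance_count == single_instance_count
    )
    return {key: n for key, n in counts.items() if n >= min_repeats}


# Hypothetical capture: 50 separate rock draws, one instanced tree draw,
# and a single unique hero draw.
draws = (
    [Draw("Rock_A", "Stone", 1)] * 50
    + [Draw("Tree_B", "Bark", 40)]
    + [Draw("Hero", "Skin", 1)]
)
suspects = unbatched_groups(draws)
print(suspects)
```

Groups that appear in the result are candidates for GPU instancing or static batching; a single unique draw is not flagged.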
To locate a GPU bottleneck, use GPU debug markers, often called PIX markers (in Unity they are set via CommandBuffer.BeginSample/EndSample). They divide the frame into named blocks and show at once how much GPU time each block takes. Without markers, you have to guess which part of the code a given draw call belongs to.
Snapdragon Profiler for Meta Quest
The desktop RenderDoc workflow doesn't carry over directly to the Android-based Quest running without PCVR Link. The main tool here is Qualcomm's Snapdragon Profiler. It connects to the Quest over ADB (adb connect) and requires developer mode to be enabled on the headset.
Key metrics in Snapdragon Profiler: GPU Busy %, Fragment ALU Instructions/Vertex, Textures Fetches per Cycle, L2 Cache Hit Rate. If GPU Busy stays at 95–100%, the game is GPU-bound. If it is below 70% while frames are still being dropped, the game is CPU-bound, and optimizing the GPU is pointless.
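The GPU Busy rule of thumb can be written down as a small classifier. This is a minimal sketch that assumes the samples were copied out of the profiler by hand as percentages; the function name and the use of a plain average are illustrative choices, and the 95% and 70% thresholds are the ones given above.

```python
from statistics import mean


def classify_bottleneck(gpu_busy_samples, dropping_frames):
    """Classify a capture using the GPU Busy thresholds from the text.

    gpu_busy_samples: GPU Busy readings in percent (0-100).
    dropping_frames: True if the game misses its target frame rate.
    """
    avg = mean(gpu_busy_samples)
    if avg >= 95.0:
        return "gpu-bound"
    if avg < 70.0 and dropping_frames:
        return "cpu-bound"
    # Between the two thresholds the rule of thumb gives no clear answer.
    return "inconclusive"


print(classify_bottleneck([97, 99, 96, 98], dropping_frames=True))
print(classify_bottleneck([55, 62, 60, 58], dropping_frames=True))
print(classify_bottleneck([80, 85, 83, 78], dropping_frames=True))
```

An average hides short spikes, so a stricter variant could require that every sample, or a high percentile, stays above the threshold.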
Snapdragon Profiler reports overdraw as Visibility Ratio: the ratio of pixels rendered to pixels that end up visible. For mobile VR, 1.5–2x is normal. Above 3x indicates a serious problem with transparent object sorting or with the draw order of opaque geometry.
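The ratio and its thresholds are easy to check by hand. The sketch below assumes the two pixel counts are available for a frame; the function names, the sample numbers, and the "elevated" label for the 2–3x band (which the text leaves unnamed) are illustrative.

```python
def overdraw_ratio(shaded_pixels: int, visible_pixels: int) -> float:
    """Pixels rendered divided by pixels that end up visible."""
    if visible_pixels <= 0:
        raise ValueError("visible_pixels must be positive")
    return shaded_pixels / visible_pixels


def rate_overdraw(ratio: float) -> str:
    """Apply the mobile VR thresholds: up to 2x normal, above 3x serious."""
    if ratio <= 2.0:
        return "normal"
    if ratio <= 3.0:
        return "elevated"
    return "serious"


# Hypothetical frame: 9.0M pixels shaded for 2.5M pixels visible.
ratio = overdraw_ratio(9_000_000, 2_500_000)
print(f"{ratio:.1f}x overdraw: {rate_overdraw(ratio)}")
```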
The tile-based GPU architecture on Quest (tile-based rendering) differs fundamentally from desktop GPUs. Writes to a render target are not stored to RAM immediately; they are buffered in on-chip tile memory. If a shader or post-processing effect breaks this flow (for example, by reading from a render target that was just written), a tile flush occurs: the data is written out to RAM and loaded back. This is expensive. In Snapdragon Profiler it shows up as a spike in Tile Flush Count.
Case: 43% of GPU time spent on one fog effect
The project is an adventure VR game for Quest 2. The complaint: a steady 60–65 fps against a target of 72, with no obviously heavy objects in the scene.
Profiling with Snapdragon Profiler revealed that 43% of GPU time in outdoor scenes went to volumetric fog. The effect had been ported directly from the PC version: full-screen ray marching with 64 steps per pixel. On Quest this is a disaster, since every pixel takes 64 texture samples.
Solution: replace the volumetric fog with pseudo-volumetric fog, combining depth-based fog in the shader with several large particle planes that use a fog texture atlas close to the camera. GPU time for the effect dropped from 4.2 ms to 0.3 ms. Visually, the result is practically indistinguishable for the end user in VR, where perception of detail is reduced by distance and motion.
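The figures from this case can be cross-checked with a few lines of arithmetic. The sketch assumes that the 4.2 ms cost and the 43% share describe the same outdoor capture, which the case does not state explicitly; the derived totals are therefore estimates, not measured values.

```python
fog_before_ms = 4.2   # fog cost before the fix (from the case)
fog_after_ms = 0.3    # fog cost after the fix (from the case)
fog_share = 0.43      # fog share of total GPU time (from the case)

# Implied total GPU time per frame before the fix.
total_before_ms = fog_before_ms / fog_share

# GPU time saved per frame and the implied total after the fix.
saved_ms = fog_before_ms - fog_after_ms
total_after_ms = total_before_ms - saved_ms

# Relative reduction in the cost of the fog effect itself.
fog_reduction = saved_ms / fog_before_ms

print(f"Implied GPU time before: {total_before_ms:.1f} ms")
print(f"Saved per frame:         {saved_ms:.1f} ms")
print(f"Implied GPU time after:  {total_after_ms:.1f} ms")
print(f"Fog cost reduced by:     {fog_reduction:.0%}")
```

Under that assumption, the fix frees roughly 3.9 ms per frame, a large share of the 13.9 ms interval available at 72 Hz.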
Work stages
Profiling starts with identifying the platform and installing the tools. For Quest: a physical headset in developer mode, Snapdragon Profiler, and ADB. For PCVR: RenderDoc on the target GPU. Baseline metrics are then collected in several representative scenes.
After the analysis, a prioritized list of problems is compiled, ranked by contribution to GPU time and by difficulty of the fix. Usually 3–5 items account for 60–70% of the performance gain. The rest is work with diminishing returns.
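One simple way to build such a list is to rank each problem by GPU time recovered per day of effort. The sketch below is an assumption about how the ranking could be done, not the method used on any specific project; the `Issue` type, the issue names, and all effort estimates are hypothetical.

```python
from typing import NamedTuple


class Issue(NamedTuple):
    name: str
    gpu_ms: float        # GPU time the issue costs per frame
    effort_days: float   # estimated effort to fix it


def prioritize(issues):
    """Sort issues by GPU milliseconds recovered per day of effort."""
    return sorted(issues, key=lambda i: i.gpu_ms / i.effort_days, reverse=True)


# Hypothetical findings from a profiling pass.
issues = [
    Issue("volumetric fog", 4.2, 3),
    Issue("shadow map resolution", 1.5, 1),
    Issue("transparent overdraw", 1.0, 4),
]

for issue in prioritize(issues):
    print(f"{issue.name}: {issue.gpu_ms / issue.effort_days:.2f} ms/day")
```

A pure ratio favors cheap fixes, so a large item such as the fog may still deserve the top slot when the frame budget cannot be met without it.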
| Stage | Timeline |
|---|---|
| Profiling + report with recommendations | 3–5 days |
| Profiling + fixing top-3 problems | 1–2 weeks |
| Full optimization cycle with verification | 3–5 weeks |
The cost is calculated individually once we have access to the project and information about the target platform.





