Metrics view provides insight into how the skeletons may perform at runtime. The metrics shown are the sum for all visible skeletons.
The total number of bones in the skeletons.
At runtime, once a skeleton is posed by manipulating the local transforms of the bones (often by applying an animation), the world transforms for each bone are computed (typical code). Usually this happens when
Skeleton#updateWorldTransform is called.
Computing a bone's world transform uses the CPU. It is relatively cheap, but can add up with many skeletons on screen that each have many bones. For example, if each skeleton has 30 bones and 50 of these skeletons are on screen at once, bone 1,500 world transforms need to be computed.
On the desktop, many thousands of bones are generally not a problem. On mobile, more than a couple thousand bones may be an issue. For example, in some tests, a Nexus 4 can handle about 2000 bones before the framerate drops below 60 frames per second.
The total number of constraints in the skeletons.
Computing IK constraints uses the CPU. The computations are relatively cheap, even for 2 bone IK, but can add up with many skeletons on screen that are using IK.
The total number of slots in the skeletons.
The number of slots indicates the complexity of a skeleton, but has a negligible impact on performance. Since each slot can have 0 or 1 attachment visible, it also indicates the maximum number of attachments that can be visible at any given time.
The total number of timelines for the currently active animations.
At runtime, an animation is made up of a number of "timelines". Each timeline has a list of key frame values. When an animation is applied, each timeline for the animation is applied. The timeline looks through its list to find the key frame values for the current animation time, then manipulates the skeleton with those values (typical code). Even a timeline that has only 1 key at frame 0 is still applied every time the animation is applied to the skeleton.
Applying timelines uses the CPU. Applying a single timeline is relatively cheap, but can add up when many skeletons each have animations applied every frame. Also, multiple animations may be applied to a single skeleton each frame. For example, when running and shooting at the same time, or when animations are mixed for crossfading.
The total number of attachments in the skeletons.
The total number of attachments doesn't affect rendering performance and has a negligible affect on load time.
The number of visible attachments in the skeletons.
An attachment simply being visible may not affect rendering performance. For example, nothing is rendered for a bounding box attachment.
The number of vertices for all visible region and mesh attachments.
The number of vertices indicates how much geometry is sent to the GPU.
The number of vertices that are transformed for all visible region and mesh attachments.
Transforming a vertex uses the CPU. The computations are relatively cheap, but can add up with many skeletons on screen. The vertex count and how many bones affect each vertex determine the number of vertex transforms.
The number of triangles for all visible region and mesh attachments.
The number of triangles indicates how many vertex indices are sent to the GPU.
The number of pixels drawn when the skeletons are drawn at full size. This includes transparent pixels and pixels that are drawn multiple times due to overdraw.
The number of pixels drawn indicates how much GPU fill rate is used to render the skeletons.
When one or more attachments are selected, two sets of numbers are shown for some metrics. The numbers before the forward slash are for the selected attachments. The numbers after the forward slash are for all attachments.
Performance is interesting because it very often is not a problem. Even the worst application running on the most terrible hardware doesn't need optimizations as long as the performance is acceptable. The rule is: if improving performance would not be noticeable to the player, then any time spent improving performance is a complete waste. That time is always better spent on things noticeable to the player. Since optimization can hurt your visuals and your code's readability, it's important not to optimize needlessly.
This doesn't mean performance should be ignored during development, only that minimal performance considerations should be made as preemptive attempts to avoid performance problems. It's very easy to go too far and waste time, especially when so many variables are involved:
- what hardware the application is running on,
- everything done by the application and game toolkit besides rendering skeletons,
- how the game toolkit and Spine runtime perform rendering,
- how many skeletons are rendered,
- how those skeletons are configured,
- how many pixels are drawn,
- and much more.
When designing Spine animations, it's impossible for animators to take all of this into account and only deliver assets that perform well. Instead, reasonable effort should be made to avoid wasting performance, such as only using the number of mesh vertices required for the desired deformations. Only after a performance problem occurs should additional effort be made to simplify the skeletons.
Performance relies on so many variables that it isn't possible to generalize about which numbers in the
Metrics view are too high. If you intend to push performance to its limits, you will need to perform your own tests using your actual skeletons on a variety of hardware to get an idea of the limits in your specific environment. If you aren't pushing the CPU or GPU to its limits, then there is no need to waste time worrying about performance or to sacrifice the quality of your animations.
If a performance problem is identified, the first step is to identify the bottleneck that has the largest impact on performance, since focusing efforts there results in the largest gains. The most common bottlenecks when rendering Spine skeletons are CPU usage, fill rate, and draw calls.
When CPU usage is too high, it means your application uses so much CPU that rendering falls below an acceptable framerate (typically 60 or 30 frames per second). The most important metrics for CPU usage are bones, timelines, vertex transforms, and constraints. Any of these can be an issue if used in excess.
The number of timelines increase the processing needed to apply an animation. Animation Clean Up may reduce the number of timelines.
There may be some options besides reducing the complexity of your skeletons. If you have many skeletons, it may be acceptable to only update a subset of them each frame, in a round-robin fashion. Another option may be to only animate a small number of skeletons and use those poses to render a larger number of skeletons.
When GPU fill rate is exhausted, it means you are trying to draw more pixels than the GPU can draw at an acceptable framerate. Each time a pixel is drawn it counts against the fill rate. This is true even when the pixel is completely transparent or when the same pixel is drawn multiple times (this is called "overdraw"), for example when attachments overlap.
The solution to being fill rate limited is always the same: draw fewer pixels. Meshes can be used to avoid drawing transparent pixels, though this likely means more vertices to transform on the CPU. Completely transparent pixels still use up fill rate, so attachments should be hidden rather than their alpha set to zero. Sometimes art can be constructed in a way to minimize overdraw, for example by using a single, larger image in place of two or more overlapping images.
Asking the GPU to draw is an expensive operation. If asked to draw too many times, the GPU won't be able to complete drawing with an acceptable frame rate.
Most game toolkits batch up as much geometry as possible so the GPU is asked to draw fewer times. Usually a batch needs to be flushed when the GPU needs to draw from a different texture, so a texture atlas is used to reduce texture switches. Using additive blending for slots may also cause a batch flush for game toolkits that are not rendering with premultiplied alpha.