Hi, we're developing a new RPG/shooter side-scroller and we're using Spine for all the character animations (including mobs).
Our skeletons are quite simple with the legs and arms being one bone each (no knees or elbows) and the animations are quite simple with just a few key frames each.
We aimed to have 50 animated characters on screen, but we ran into performance issues. After some simple time profiling we found Spine to be responsible for most of the load.
In our case, on an iPad 3, each character consumed more than 1.5% CPU, and with Corona being single-threaded we ended up with lag once we put more than 30 mobs on screen.
We've been able to more than double Spine's performance! I'll post the changes we made here, starting with the changes to the generic Lua runtime, and I'll follow up with the Corona-specific changes.
The first thing we noticed is that bone:updateWorldTransform is responsible for most of the processing time, so we concentrated on optimizing this function as much as we could by following various Lua performance tips we found on the internet:
- localize - the Lua VM lets each function use up to 250 registers; local variables live in these registers, and operations on them, especially math operations, are much faster. So at the start of the function we copy anything we need into local variables, and at the end we copy whatever we want to keep back into the actual table structure.
- reduce property lookups - what we did for localization also reduces the number of Lua table lookups. For example, if you have a bone, accessing bone.x requires Lua to look up the 'x' property of the bone table, which is much slower than accessing a local variable, so by working on locals we reduce lookups as well. Aside from that, any math.xxx function should also be localized to avoid these lookups.
- eliminate ipairs - ipairs drives the loop through an iterator function, which costs a function call on every iteration. A simple numeric 'for' loop is much faster (by about 30%).
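As a rough illustration of the three tips above, here is a small sketch in plain Lua. The function names (totalLengthSlow, totalLengthFast, scaleAll) and the point data are made up for this example and are not part of the Spine runtime.

```lua
-- Hypothetical example, not Spine code: the same work written two ways.
local sqrt = math.sqrt  -- localize the library function once, outside the hot path

-- Baseline: ipairs plus a math.sqrt lookup on every iteration.
local function totalLengthSlow(points)
    local total = 0
    for _, p in ipairs(points) do
        total = total + math.sqrt(p.x * p.x + p.y * p.y)
    end
    return total
end

-- Tuned: numeric for loop, localized sqrt, each field read once into a local.
local function totalLengthFast(points)
    local total = 0
    for i = 1, #points do
        local p = points[i]
        local x, y = p.x, p.y
        total = total + sqrt(x * x + y * y)
    end
    return total
end

-- Read into locals, compute on the locals, copy back to the table once.
local function scaleAll(points, k)
    for i = 1, #points do
        local p = points[i]
        local x, y = p.x, p.y
        x, y = x * k, y * k
        p.x, p.y = x, y
    end
end

local points = { { x = 3, y = 4 }, { x = 6, y = 8 } }
print(totalLengthSlow(points), totalLengthFast(points))

local q = { { x = 1, y = 2 } }
scaleAll(q, 3)
print(q[1].x, q[1].y)
```

One thing to keep in mind: ipairs stops at the first nil entry, while a loop up to #points relies on the length operator, so the two are only interchangeable for arrays without holes.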
In addition to these we also decided to remove any runtime calls to math.sin and math.cos because they are slow. Instead we precompute all the cos/sin values for angles from -359 to 359 and only work in whole degrees. We didn't notice any difference in our animations, but you could decide to store more angles, for example in steps of 0.1 degrees; the only cost is memory. We also used a Lua trick to round the angle down, using (n - n%1) instead of math.floor, which is 28% faster.
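A minimal standalone sketch of the lookup-table idea, assuming whole-degree resolution. The helper name fastSinCos is hypothetical. This sketch only fills 0 to 359, because Lua's % operator with a positive divisor never returns a negative result, so the wrapped key always lands in that range.

```lua
local sin, cos, pi = math.sin, math.cos, math.pi

-- Precompute sine and cosine for every whole degree in [0, 359].
local SIN, COS = {}, {}
for deg = 0, 359 do
    local r = deg * pi / 180
    SIN[deg], COS[deg] = sin(r), cos(r)
end

-- Hypothetical helper: round the angle down to a whole degree
-- (n - n % 1 behaves like math.floor), then wrap it into [0, 359].
local function fastSinCos(degrees)
    local key = (degrees - degrees % 1) % 360
    return SIN[key], COS[key]
end

local s, c = fastSinCos(30.7)   -- uses the table entry for 30 degrees
local s2 = fastSinCos(-30.2)    -- rounds down to -31, which wraps to 329
print(s, c, s2)
```

Because the angle is rounded down by less than one degree, the sine and cosine values are off by less than pi/180 (about 0.0175). Whether that is visible depends on the animation.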
So here are the results from profiling the execution time of bone:updateWorldTransform, relative to the original runtime (lower is better):
original runtime = 100%
original runtime + localized math functions = 93%
optimized + localized math functions = 85%
optimized + const sin/cosine = 74%
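If you want to try this kind of comparison yourself, a simple timing loop along these lines may be enough. This is only a sketch using os.clock, not the profiler we used for the numbers above. The helper name timeIt and the iteration count are arbitrary, and the timings will vary by device and Lua version.

```lua
-- Hypothetical micro-benchmark helper based on os.clock (CPU time in seconds).
local clock = os.clock

local function timeIt(fn, iterations)
    local start = clock()
    for _ = 1, iterations do
        fn()
    end
    return clock() - start
end

-- Two ways of rounding a number down, as discussed above.
local floor = math.floor
local n = 123.456
local function viaFloor() return floor(n) end
local function viaModulo() return n - n % 1 end

local N = 1000000
local tFloor = timeIt(viaFloor, N)
local tModulo = timeIt(viaModulo, N)
print(("math.floor: %.3fs, n - n %% 1: %.3fs"):format(tFloor, tModulo))
```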
And here is the code for the optimized version:
local Bone = {}

-- Precomputed sine/cosine for whole-degree angles, so updateWorldTransform
-- can use table lookups instead of calling math.sin/math.cos.
-- (The lookup below wraps angles with % 360, so only keys 0..359 are read.)
local SIN, COS = {}, {}
for i = -359, 359 do
    SIN[i] = math.sin( math.pi / 180 * i )
    COS[i] = math.cos( math.pi / 180 * i )
end

function Bone.new (data, parent)
    if not data then error("data cannot be nil", 2) end

    local self = {
        data = data,
        parent = parent,
        x = 0, y = 0,
        rotation = 0,
        scaleX = 1, scaleY = 1,
        m00 = 0, m01 = 0, worldX = 0, -- a b x
        m10 = 0, m11 = 0, worldY = 0, -- c d y
        worldRotation = 0,
        worldScaleX = 1, worldScaleY = 1,
    }

    -- local rad, cos, sin = math.rad, math.cos, math.sin
    -- Read the inherit flags once per bone instead of on every update.
    local inheritScale, inheritRotation = self.data.inheritScale, self.data.inheritRotation

    function self:updateWorldTransform (flipX, flipY)
        -- Copy everything we need into locals up front.
        local parent, data = self.parent, self.data
        local x, y = self.x, self.y
        local rotation, scaleX, scaleY = self.rotation, self.scaleX, self.scaleY
        local m00, m01, worldX, m10, m11, worldY
        local worldRotation, worldScaleX, worldScaleY = rotation, scaleX, scaleY

        if parent then
            worldX = x * parent.m00 + y * parent.m01 + parent.worldX
            worldY = x * parent.m10 + y * parent.m11 + parent.worldY
            if inheritScale then
                worldScaleX = parent.worldScaleX * worldScaleX
                worldScaleY = parent.worldScaleY * worldScaleY
            end
            if inheritRotation then
                worldRotation = worldRotation + parent.worldRotation
            end
        else
            worldX = (flipX and -x) or x
            worldY = (flipY and -y) or y
        end

        -- Original trig calls, replaced by the table lookup below:
        -- local radians = rad(worldRotation)
        -- local cosV = cos(radians)
        -- local sinV = sin(radians)

        -- Round down to a whole degree (n - n % 1) and wrap into 0..359.
        local angle = (worldRotation - worldRotation % 1) % 360
        local cosV = COS[angle]
        local sinV = SIN[angle]

        m00 = cosV * worldScaleX
        m10 = sinV * worldScaleX
        m01 = -sinV * worldScaleY
        m11 = cosV * worldScaleY
        if flipX then
            m00 = -m00
            m01 = -m01
        end
        if flipY then
            m10 = -m10
            m11 = -m11
        end

        -- Copy the results back to the bone table in one go.
        self.m00, self.m01, self.worldX = m00, m01, worldX
        self.m10, self.m11, self.worldY = m10, m11, worldY
        self.worldRotation, self.worldScaleX, self.worldScaleY = worldRotation, worldScaleX, worldScaleY
    end

    function self:setToSetupPose ()
        local data = self.data
        self.x = data.x
        self.y = data.y
        self.rotation = data.rotation
        self.scaleX = data.scaleX
        self.scaleY = data.scaleY
    end

    return self
end

return Bone
So this is the first step in our optimization. When I get some free time I'll post about the other changes we made, mostly on the corona-spine side.