Creating a binary format for game replay data
One of the features we wanted to implement in 'Don't Miss the Bell' was the ability for players to view "ghosts" (in-game replays), allowing them to learn from top players' techniques and routes. This was to prevent one person working out the best route and dominating the leaderboard with nobody able to challenge them.
Data structure
I needed to decide what data had to be captured during gameplay and stored in the ghost file, along with how frequently I would capture changes to each data point.
A convenient solution for the frequency part of the problem is to put my recording code in Unity's MonoBehaviour FixedUpdate() function. This means I can capture at a reasonably high rate of 50 Hz (Unity's default fixed timestep of 0.02 seconds), with the added benefit that recording pauses when the player pauses their game, as FixedUpdate is not executed when the timeScale is set to zero.
In terms of data captured, I'd like to keep it to a minimum, which reduces external calls (and thus the impact on performance) as well as file size. The bare minimum we need in order to reproduce how someone moved through a level is just the player position and horizontal rotation. I'd like slightly more than that to keep it looking smooth, so let's add camera rotation and some movement states to record changes such as tweening the camera down when crouching, and hiding the player mesh when rolling. We don't need to store all of this data every frame; we only need to add it if it has changed since the last frame was recorded, so let's add a prefix byte to each block to specify whether the frame contains that data.
Okay, so here's an example of what one frame will look like where both the player position and the camera rotation have changed, and the player is standing upright:
0x00 (cameraState)
0x01, <float x3>, <float x3> (hasTransform, position, rotation)
0x01, <float x3> (hasCameraRotation, cameraRotation)
And here's what a frame will look like where the player hasn't moved at all, but is crouching:
0x01 (cameraState)
0x00 (hasTransform)
0x00 (hasCameraRotation)
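To make the layout concrete, here is a rough sketch of how a frame like the two above could be written out. It is not the game's actual code: FrameSketch and its field names are made up for illustration, System.Numerics.Vector3 stands in for Unity's Vector3 so the snippet runs outside the engine, and a null value is assumed to mean "unchanged since the last frame".

```csharp
using System;
using System.IO;
using System.Numerics;

// Hypothetical stand-in for the article's ReplayFrame (names are assumptions).
public sealed class FrameSketch
{
    public byte CameraState;        // 0x00 = standing, 0x01 = crouching, as in the examples
    public Vector3? Position;       // null when the transform has not changed
    public Vector3? Rotation;       // written together with Position
    public Vector3? CameraRotation; // null when the camera rotation has not changed

    public byte[] ToBytes()
    {
        using var stream = new MemoryStream();
        using var writer = new BinaryWriter(stream);

        writer.Write(CameraState);

        // Prefix byte, then the two vectors only if the transform changed.
        bool hasTransform = Position.HasValue && Rotation.HasValue;
        writer.Write(hasTransform ? (byte)0x01 : (byte)0x00);
        if (hasTransform)
        {
            WriteVector(writer, Position.Value);
            WriteVector(writer, Rotation.Value);
        }

        // Prefix byte, then the camera rotation only if it changed.
        writer.Write(CameraRotation.HasValue ? (byte)0x01 : (byte)0x00);
        if (CameraRotation.HasValue)
        {
            WriteVector(writer, CameraRotation.Value);
        }

        writer.Flush();
        return stream.ToArray();
    }

    private static void WriteVector(BinaryWriter writer, Vector3 v)
    {
        // BinaryWriter stores each float as 4 little-endian bytes.
        writer.Write(v.X);
        writer.Write(v.Y);
        writer.Write(v.Z);
    }
}
```

With this layout, a frame where everything changed takes 39 bytes (3 prefix/state bytes plus 9 floats), while a frame where nothing moved takes only 3.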
After all these frames, we need to add some metadata, both to help us play it back now, and to ensure future compatibility. Enter: the footer!
Let's start it off with a clear(ish) marker so we can check we're in the right place before reading it, and as a rough check that the file is the right format. 0xFF will do!
Now let's add the most important piece of data, the number of frames we've got stored in this replay. This is necessary to ensure we know when to stop playing back the data, so we don't accidentally try to interpret the footer data as a frame or run into an out of range exception. We'll store this length as an unsigned integer.
Next up, we want future compatibility, so we'll save a version code as another unsigned integer that we'll be able to check if we ever change the data structure of frames in the future. This means we'll still be able to play old replays that were recorded before potential new features and data sources were added.
We'll make sure the length of the footer is the last piece of data in the file (another uint), so we'll be able to process footers even if they change in size in the future. And lastly, let's add some blank reserved space just so we can check that the footer offset works properly.
Here's how it's looking:
0xFF (start marker)
<uint> (number of frames in ghost)
<uint> (version code)
0x00 * 48 (reserved space)
<uint> (footer start offset in bytes, equal to 64 at the moment)
EOF
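As a sketch of how that footer could be appended after the frames, assuming a plain .NET stream: FooterSketch and its member names are hypothetical, and the footer length is assumed to be measured back from the end of the file, including the trailing uint. The fields as listed add up to 61 bytes, so this sketch derives the value from the field sizes rather than hard-coding the figure quoted above.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper; names and the length convention are assumptions.
public static class FooterSketch
{
    public const byte StartMarker = 0xFF;
    public const int ReservedBytes = 48;

    // marker + frame count + version + reserved space + trailing length field
    public const uint FooterLength =
        1 + sizeof(uint) + sizeof(uint) + ReservedBytes + sizeof(uint);

    // Appends the footer to a stream that already contains all the frames.
    public static void WriteFooter(Stream output, uint frameCount, uint version)
    {
        // leaveOpen: true so the caller keeps ownership of the stream.
        using var writer = new BinaryWriter(output, Encoding.UTF8, leaveOpen: true);

        writer.Write(StartMarker);
        writer.Write(frameCount);
        writer.Write(version);
        writer.Write(new byte[ReservedBytes]); // zero-filled reserved space
        writer.Write(FooterLength);            // always the last 4 bytes of the file
        writer.Flush();
    }
}
```

Because a reader only ever trusts the last four bytes to find the start marker, the reserved block can grow or shrink in later versions without breaking older files.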
All of those potentially empty frames could become a bit of a space hog, so we GZip compress the data before uploading it to the server. In my testing this resulted in an average space saving of 55%, with no noticeable increase, from the player's perspective, in the time taken to finalize a recording!
I'm definitely happy with how I approached this, as it leaves plenty of room to expand functionality in the future without needing to break or convert old recordings. The custom binary format is also incredibly space efficient, clocking in at an average of a tiny 95 KB per 60 seconds of gameplay across multiple runs and gameplay styles.
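For reference, a minimal sketch of the compression step using .NET's built-in GZipStream. The class and method names are hypothetical, and the choice of CompressionLevel.Optimal is an assumption rather than a setting taken from the game.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical helper wrapping System.IO.Compression.GZipStream.
public static class GhostCompression
{
    public static byte[] Compress(byte[] raw)
    {
        using var output = new MemoryStream();
        using (var gzip = new GZipStream(output, CompressionLevel.Optimal))
        {
            gzip.Write(raw, 0, raw.Length);
        } // disposing the GZipStream writes the final block and trailer

        // MemoryStream.ToArray still works after the stream has been closed.
        return output.ToArray();
    }

    public static byte[] Decompress(byte[] compressed)
    {
        using var input = new MemoryStream(compressed);
        using var gzip = new GZipStream(input, CompressionMode.Decompress);
        using var output = new MemoryStream();
        gzip.CopyTo(output);
        return output.ToArray();
    }
}
```

Runs of identical three-byte "nothing changed" frames are exactly the kind of repetition GZip handles well, which fits the savings seen in testing.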
Playback
After saving all that beautiful data, we need to play it back somehow. This is fairly straightforward thanks to the constant frequency we recorded at: all we have to do is iterate through the frames in the file at the same speed, which we can do by putting our playback code in FixedUpdate too. This, again, has an extra benefit related to timeScale: viewers can pause the replay at any point without us even needing to edit our existing pause menu code.
Here's the process we'll follow to decode the ghost data:
- Request the ghost binary from the server
- GZip inflate the data in memory
- Read the last 4 bytes of the data to discover the length of the footer
- Seek backwards from the end of the data by that footer length (64 in this version)
- Read the first byte and check it matches our footer start marker (0xFF)
- Read the version code and check it's one we are capable of processing in this build of the game
- Store the replay length (the number of frames) as a variable; we'll be using this to stop playback at the right time
- Seek right to the beginning of the data, and parse all of our frames into an array of ReplayFrame objects
Oh, "What is a ReplayFrame?", I hear you ask. Well, it's a class I created while setting up the recording, with the ability to initialize an instance from binary data, and serialize itself to binary data for saving.
We used the serialization earlier, so now we'll be creating a ReplayFrame from a byte array consisting of a chunk of our replay file. This is as simple as ReplayFrame replayFrame = new(byteArray: frameData);
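The decoding steps listed above, from reading the footer onwards, could be sketched roughly like this. It works on data that has already been inflated, and everything named here is an assumption for illustration: DecodedFrame is a simplified stand-in for ReplayFrame, MaxSupportedVersion is a made-up version code, and the footer length is assumed to be measured back from the end of the data.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Simplified, hypothetical stand-in for ReplayFrame.
public sealed class DecodedFrame
{
    public byte CameraState;
    public float[] Transform;      // 6 floats (position, rotation), or null if unchanged
    public float[] CameraRotation; // 3 floats, or null if unchanged
}

public static class GhostDecoder
{
    private const byte FooterMarker = 0xFF;
    private const uint MaxSupportedVersion = 1; // hypothetical version code

    public static List<DecodedFrame> Decode(byte[] data)
    {
        if (data.Length < sizeof(uint))
            throw new InvalidDataException("Data is too short to contain a footer.");

        using var reader = new BinaryReader(new MemoryStream(data));
        Stream stream = reader.BaseStream;

        // The last 4 bytes hold the footer length.
        stream.Seek(-sizeof(uint), SeekOrigin.End);
        uint footerLength = reader.ReadUInt32();
        if (footerLength > data.Length)
            throw new InvalidDataException("Footer length is larger than the data.");

        // Jump back to the start of the footer and check the marker.
        stream.Seek(-(long)footerLength, SeekOrigin.End);
        if (reader.ReadByte() != FooterMarker)
            throw new InvalidDataException("Footer start marker not found.");

        uint frameCount = reader.ReadUInt32();
        uint version = reader.ReadUInt32();
        if (version > MaxSupportedVersion)
            throw new InvalidDataException($"Unsupported ghost version {version}.");

        // Frames start at byte 0; the frame count tells us when to stop.
        stream.Seek(0, SeekOrigin.Begin);
        var frames = new List<DecodedFrame>();
        for (uint i = 0; i < frameCount; i++)
        {
            var frame = new DecodedFrame { CameraState = reader.ReadByte() };
            if (reader.ReadByte() == 0x01) frame.Transform = ReadFloats(reader, 6);
            if (reader.ReadByte() == 0x01) frame.CameraRotation = ReadFloats(reader, 3);
            frames.Add(frame);
        }
        return frames;
    }

    private static float[] ReadFloats(BinaryReader reader, int count)
    {
        var values = new float[count];
        for (int i = 0; i < count; i++) values[i] = reader.ReadSingle();
        return values;
    }
}
```

Stopping after frameCount frames, rather than reading until something looks like a footer, is what keeps the footer bytes from being misread as frame data.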
Playing back our ghost is easy now: we can just block player input and apply the values of each ReplayFrame in order, recreating the original movement with 50 Hz accuracy. Once we hit the end of our data (once our current frame index is equal to the length of the replay), we run some code to clean up and take the spectator back to the main menu.
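A bare-bones sketch of that playback loop, kept free of Unity types so it can run on its own. PlaybackCursor is a hypothetical name, and the apply callback is an assumption standing in for whatever code moves the ghost; in the game, Step() would be called once per FixedUpdate.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: steps through decoded frames one fixed tick at a time.
public sealed class PlaybackCursor<TFrame>
{
    private readonly IReadOnlyList<TFrame> frames;
    private readonly Action<TFrame> apply;
    private int current;

    public PlaybackCursor(IReadOnlyList<TFrame> frames, Action<TFrame> apply)
    {
        this.frames = frames ?? throw new ArgumentNullException(nameof(frames));
        this.apply = apply ?? throw new ArgumentNullException(nameof(apply));
    }

    // True once the current frame index equals the length of the replay.
    public bool Finished => current >= frames.Count;

    // Applies the next frame. Returns false when there was nothing left to apply,
    // which is the caller's cue to clean up and return to the menu.
    public bool Step()
    {
        if (Finished) return false;
        apply(frames[current]);
        current++;
        return true;
    }
}
```

Because FixedUpdate stops running while timeScale is zero, a cursor driven this way pauses along with the game without any extra code.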
Reflection
For transparency, I've added this section based on some feedback I received in a workshop.
As soon as I managed to get the playback working in-game I was really pleased (and a little surprised) by how polished it already felt. I think that's a good example of my planning approach paying off, as being thorough up front definitely reduced the amount of code and systems I needed to go back and edit for the final touches. If there were two things I'd do differently, the first would be to integrate it with the movement script better, since I feel like my "movement state" implementation was quite bolted-on. The second would be to add an overlay showing which keys were pressed during each frame of the recording, similar to what you see on some game speedruns. This would remove any element of mystery or secrecy around optimal routes, which some people would argue makes it less satisfying to be first, but I think it creates a stronger community if everyone is able to learn from each other.