To preface: far too much went into creating this project and its features for me to list everything in one article. This page walks through the individual features and explains the general concept behind each one. The project is unfinished and has some architectural issues that I would approach differently, or refactor, today. The general principles still hold and are functional, because most of the issues come from the structure and general robustness of the codebase, not from the strategies themselves. I stopped working on this project due to legal ambiguity; regardless, in its current state it still has working versions of most features one would expect from this type of game.
When I started this project, I intended it to be a passion project I could share and play with my friends. I also have a more Minecraft-like clone made in DX11 whose features are closer to what one would expect from a Minecraft-like game, although that project lacks many of the features this project hoped to modernize according to my own personal taste.
Features of note include:
UI
VMA ("Vulkan Memory Allocator", basically GPU "malloc") for graphics memory
Skeletal Mesh animating and rendering
Infinite cubic chunk generation using PCG (the world is infinite vertically and horizontally)
HDR, Bloom, and Postprocessing
PBR lighting model
Audio Systems
Input Systems
Scene Graph for actors
Networking for actors and chunk modifications (Basic multiplayer)
Chunk save and load using packed region files with compression
Model loading using Assimp
Lua scripting (very barebones)
For this engine I wanted to use a modern graphics API that would eventually make cross-platform support easier. My previous engine and projects could only compile for Windows, and when I started this one I wanted an API that would behave the same on multiple platforms.
Vulkan is infamous for its complexity, both because of its low-level nature and because it exposes synchronization that the programmer did not have to worry about before. In older APIs such as OpenGL, the driver handles scheduling and synchronization implicitly behind the API. With Vulkan, on the other hand, the programmer controls that process explicitly.
I actually consider this a benefit, though, because it gives us transparency into how our operations get scheduled and finer-grained control over when we get that information back.
If I were to approach this problem today, I would start with an RHI (render hardware interface) consumed by my renderer, because an RHI presents the same guaranteed set of graphics capabilities regardless of which graphics backend sits underneath. That way my game-side code wouldn't have to use anything API-specific, only engine-specific. Porting gets easier because I would only have to satisfy the guarantees of that interface for each backend, not port the entire game.
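To make the idea a little more concrete, here is a minimal sketch of what such an interface could look like. Every name in it (`RenderDevice`, `NullDevice`, `BufferHandle`, `renderMesh`) is hypothetical and invented for illustration; none of it exists in this project, and a real RHI would cover far more (pipelines, textures, synchronization).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opaque handle; a real backend would map it to an API object.
using BufferHandle = std::uint32_t;

// Hypothetical RHI: the only surface the renderer and game code would see.
class RenderDevice {
public:
    virtual ~RenderDevice() = default;
    virtual BufferHandle createBuffer(std::size_t sizeBytes) = 0;
    virtual void draw(BufferHandle vertices, std::uint32_t vertexCount) = 0;
};

// Stand-in backend that only records calls. A Vulkan or DX11 backend would
// implement the same interface with real API calls.
class NullDevice final : public RenderDevice {
public:
    BufferHandle createBuffer(std::size_t sizeBytes) override {
        assert(sizeBytes > 0);
        bufferSizes.push_back(sizeBytes);
        return static_cast<BufferHandle>(bufferSizes.size() - 1);
    }
    void draw(BufferHandle vertices, std::uint32_t vertexCount) override {
        assert(vertices < bufferSizes.size());  // handle must come from createBuffer
        drawnVertices += vertexCount;
        ++drawCalls;
    }
    std::vector<std::size_t> bufferSizes;
    std::uint32_t drawCalls = 0;
    std::uint32_t drawnVertices = 0;
};

// Engine-side code depends on the interface, never on a specific API.
inline void renderMesh(RenderDevice& device, BufferHandle mesh,
                       std::uint32_t vertexCount) {
    device.draw(mesh, vertexCount);
}
```

The point of the split is that `renderMesh` compiles and behaves the same no matter which backend is plugged in, so a new platform only needs a new `RenderDevice` implementation.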
It may seem strange to use a skeletal mesh renderer for characters whose shapes are primarily individual free-moving blocks, but in my case skeletal mesh rendering was dramatically faster than issuing an individual draw call for each block. My bottleneck generally came from the draw calls themselves (more specifically the bandwidth between the CPU and GPU) rather than from the GPU processing the mesh. I realized that if I wanted more than 500 entities without the game borking or dropping under 20 fps, I would need a better way to render the entities while still being able to animate them.
There are multiple styles of skeletal mesh rendering depending on your use case and limitations. For most use cases there are two distinct calculations to worry about:
Bone chain calculation
Vertex skinning calculation
There are more calculations to worry about if you deal with things like mesh morphing, but that was not necessary for this project and is out of scope here.
The most confusing part of this for most developers is probably the concept of quaternions, or "4D" mathematical objects that describe a 3D rotation. I don't think they need to be nearly as complicated as they are usually presented, though, and I will link to this amazing article by Marc ten Bosch, who is currently working on the 4D game Miegakure and has a great explanation of how to think about these mathematical objects. (The article is titled "Let's remove Quaternions from every 3D Engine", but it is by far the best explanation I have seen online.) I believe the confusion comes from the ambiguous use of the word "dimension" and from the coincidence that complex numbers correspond to rotations in geometric algebra only in 2D.
Why care about these annoying mathematical objects? There are many reasons, but the most important one is that they are trivially mixable. Animations are composed of individual key frames, each representing the state of the animation at one moment in time. Storing the entire state of every bone for every rendered frame is impractical and takes a lot of memory, so games store a limited number of key frames and interpolate between the two nearest ones to get the pose at any given time.
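As a rough illustration of key frame interpolation (not this engine's actual code), the sketch below samples a translation track: it finds the two key frames that bracket the requested time, computes a blend factor between them, and linearly interpolates. The `Keyframe` layout and the `sampleTranslation` name are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical key frame layout for one bone's translation track.
struct Keyframe {
    float time;        // seconds from the start of the clip
    Vec3 translation;  // bone translation at that time
};

inline Vec3 lerp(const Vec3& a, const Vec3& b, float t) {
    return { a.x + t * (b.x - a.x), a.y + t * (b.y - a.y), a.z + t * (b.z - a.z) };
}

// Samples the track at `time`. Keys must be non-empty and sorted by time.
// Times outside the clip clamp to the first or last key.
inline Vec3 sampleTranslation(const std::vector<Keyframe>& keys, float time) {
    assert(!keys.empty());
    if (time <= keys.front().time) return keys.front().translation;
    if (time >= keys.back().time) return keys.back().translation;

    // First key strictly after `time`; the key before it is the other bracket.
    auto next = std::upper_bound(
        keys.begin(), keys.end(), time,
        [](float t, const Keyframe& k) { return t < k.time; });
    auto prev = next - 1;

    // Blend factor in [0, 1) between the two bracketing keys.
    float t = (time - prev->time) / (next->time - prev->time);
    return lerp(prev->translation, next->translation, t);
}
```

Rotations follow the same bracketing logic, but the final blend uses a rotation-aware interpolation instead of a plain `lerp`.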
Traditional mathematical representations of rotation such as Euler angles represent a 3D rotation as a sequence of rotations about three axes. Because each rotation is applied in order, the meaning of the later rotations depends on the earlier ones, especially when using local/body axes. Gimbal lock occurs when two of the three rotation axes become aligned, causing two Euler angle parameters to affect the same effective axis. This does not physically “lock” an object’s rotation, but it means the Euler-angle representation loses one degree of freedom at that orientation, making interpolation, editing, and control behave poorly.
Quaternions avoid gimbal lock because they do not represent orientation as a sequence of rotations around three ordered axes. Instead, a unit quaternion represents rotation using an axis and angle in a continuous 4D representation. When interpolated properly, such as with spherical linear interpolation, the rotation axis and rotation magnitude transition smoothly without two Euler parameters collapsing onto the same effective axis.
It's worth noting that in my engine the quaternion class is named Rotor3D.
I'm going to assume that the reader knows about matrices, vectors, and other common mathematical concepts in game development when describing the skeletal animation system.
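For reference, spherical linear interpolation can be sketched as follows. The `Quat` struct here is a hypothetical minimal stand-in, not the engine's Rotor3D class, and the 0.9995 threshold for falling back to a normalized lerp is a common convention rather than anything taken from this project.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal quaternion; not the engine's Rotor3D class.
struct Quat { float w, x, y, z; };

inline float dot(const Quat& a, const Quat& b) {
    return a.w * b.w + a.x * b.x + a.y * b.y + a.z * b.z;
}

inline Quat normalize(const Quat& q) {
    float len = std::sqrt(dot(q, q));
    assert(len > 0.0f);
    return { q.w / len, q.x / len, q.y / len, q.z / len };
}

// Spherical linear interpolation between two unit quaternions, t in [0, 1].
inline Quat slerp(Quat a, Quat b, float t) {
    float cosTheta = dot(a, b);

    // q and -q encode the same rotation; flip b to take the shorter arc.
    if (cosTheta < 0.0f) {
        b = { -b.w, -b.x, -b.y, -b.z };
        cosTheta = -cosTheta;
    }

    // Nearly parallel inputs: sin(theta) is close to 0, so fall back to a
    // normalized linear interpolation instead of dividing by it.
    if (cosTheta > 0.9995f) {
        return normalize({ a.w + t * (b.w - a.w), a.x + t * (b.x - a.x),
                           a.y + t * (b.y - a.y), a.z + t * (b.z - a.z) });
    }

    float theta = std::acos(cosTheta);
    float sinTheta = std::sin(theta);
    float wa = std::sin((1.0f - t) * theta) / sinTheta;
    float wb = std::sin(t * theta) / sinTheta;
    return { wa * a.w + wb * b.w, wa * a.x + wb * b.x,
             wa * a.y + wb * b.y, wa * a.z + wb * b.z };
}
```

Interpolating halfway between the identity and a 90 degree rotation about Z gives a 45 degree rotation about Z, which is the constant-speed behavior that makes slerp attractive for blending key frames.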
After calculating the blended rotation and translation between two key frames, we build a local transform matrix for each bone in the hierarchy. Each time the animation updates, these local transforms are then applied along the chain, parent before child, to calculate each bone's world matrix.
The world transforms aren't enough to properly skin the vertices, though. Remember that each bone is offset in space by default, so the transform we want to apply to each vertex is the transform relative to its bone. Before uploading each world transform, we therefore multiply it by the bone's inverse bind matrix, which takes a vertex from its bind-pose position into that bone's space, so that when the combined transform is applied the vertex moves relative to the bone. Without the inverse bind matrix, the bind-pose offset already baked into the vertex positions would be applied again by the current global bone transform.
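The two steps above (walking the bone chain, then applying the inverse bind matrix) can be sketched like this. It is an illustration under assumptions, not the engine's code: the `Mat4` and `Bone` types are hypothetical, matrices use the column-vector convention (`world = parentWorld * local`), and bones are assumed to be stored so that every parent comes before its children.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 4x4 matrix for column vectors, stored as m[row][col].
struct Mat4 {
    std::array<std::array<float, 4>, 4> m{};
    static Mat4 identity() {
        Mat4 r;
        for (int i = 0; i < 4; ++i) r.m[i][i] = 1.0f;
        return r;
    }
    static Mat4 translation(float x, float y, float z) {
        Mat4 r = identity();
        r.m[0][3] = x; r.m[1][3] = y; r.m[2][3] = z;
        return r;
    }
};

inline Mat4 operator*(const Mat4& a, const Mat4& b) {
    Mat4 r;  // starts zeroed
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
    return r;
}

struct Bone {
    int parent;        // index of the parent bone, -1 for the root
    Mat4 local;        // blended local transform for the current frame
    Mat4 inverseBind;  // inverse of the bone's bind-pose world transform
};

// Returns one skin matrix per bone, ready to upload to the GPU.
inline std::vector<Mat4> computeSkinMatrices(const std::vector<Bone>& bones) {
    std::vector<Mat4> world(bones.size());
    std::vector<Mat4> skin(bones.size());
    for (std::size_t i = 0; i < bones.size(); ++i) {
        const Bone& b = bones[i];
        assert(b.parent < static_cast<int>(i));  // parents must come first
        world[i] = (b.parent < 0)
            ? b.local
            : world[static_cast<std::size_t>(b.parent)] * b.local;
        // Undo the bind pose first, then apply the animated pose.
        skin[i] = world[i] * b.inverseBind;
    }
    return skin;
}
```

A useful sanity check is that when every bone sits exactly in its bind pose, every skin matrix comes out as the identity, so the mesh renders unmodified.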
Once the transforms are uploaded they are trivially applicable to each vertex.
GLSL snippet for skinned mesh shader
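For readers who want to see the per-vertex math outside of a shader, here is a CPU-side C++ sketch of linear blend skinning. It assumes up to four bone influences per vertex with weights that sum to 1, which is a common layout but an assumption on my part here; the `SkinnedVertex` and `Mat3x4` types are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical affine skin matrix (3 rows x 4 columns, row-major). The
// bottom row of the full 4x4 matrix is implicitly (0, 0, 0, 1).
using Mat3x4 = std::array<float, 12>;

inline Vec3 transformPoint(const Mat3x4& m, const Vec3& p) {
    return {
        m[0] * p.x + m[1] * p.y + m[2]  * p.z + m[3],
        m[4] * p.x + m[5] * p.y + m[6]  * p.z + m[7],
        m[8] * p.x + m[9] * p.y + m[10] * p.z + m[11],
    };
}

struct SkinnedVertex {
    Vec3 position;                   // bind-pose position
    std::array<int, 4> boneIndices;  // up to four influencing bones
    std::array<float, 4> weights;    // expected to sum to 1
};

// Linear blend skinning: the weighted sum of the vertex as moved by each bone.
inline Vec3 skinVertex(const SkinnedVertex& v, const std::vector<Mat3x4>& skin) {
    float total = v.weights[0] + v.weights[1] + v.weights[2] + v.weights[3];
    assert(std::fabs(total - 1.0f) < 1e-3f);

    Vec3 out{0.0f, 0.0f, 0.0f};
    for (std::size_t i = 0; i < 4; ++i) {
        if (v.weights[i] == 0.0f) continue;  // unused influence slot
        const Mat3x4& bone = skin[static_cast<std::size_t>(v.boneIndices[i])];
        Vec3 p = transformPoint(bone, v.position);
        out.x += v.weights[i] * p.x;
        out.y += v.weights[i] * p.y;
        out.z += v.weights[i] * p.z;
    }
    return out;
}
```

For blocky characters, most vertices would likely use a single bone with a weight of 1, in which case the loop reduces to one matrix multiply per vertex.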
I use multiple rendering techniques to achieve high-quality visuals, including HDR, bloom, and physically based rendering.
HDR, or High Dynamic Range rendering, allows the renderer to represent lighting values beyond the limited range of a standard 8-bit color buffer. Older rendering pipelines often stored each color channel using only 8 bits, giving each channel 256 possible values. My HDR implementation uses a 16-bit floating-point color buffer, which allows bright light sources and emissive materials to retain much more intensity before being displayed.
Bloom works especially well in combination with HDR. Because HDR preserves values brighter than pure white, bloom can use those high-intensity regions to create a convincing sense of overwhelming luminosity, such as glowing lights, fire, or bright reflections.
Before presenting the final image to the screen, the HDR buffer must be converted back into a standard displayable 8-bit image. This process is called tone mapping. In my implementation, I use Reinhard tone mapping to mathematically compress the high dynamic range image into a lower dynamic range while preserving the overall lighting feel.
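As a small illustration, the simplest form of the Reinhard operator maps each HDR value c to c / (1 + c). The sketch below applies it per channel and then quantizes to 8 bits. Whether the engine applies Reinhard per channel or on luminance, and whether it gamma-encodes manually or relies on an sRGB swapchain, is not stated above, so the per-channel form and the 2.2 gamma here are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Basic Reinhard operator for one channel: maps [0, inf) into [0, 1).
inline float reinhard(float hdr) {
    assert(hdr >= 0.0f);
    return hdr / (1.0f + hdr);
}

// Tone maps one HDR channel and quantizes it to a displayable 8-bit value.
// The gamma step is an assumption; an sRGB render target would do this
// encoding in hardware instead.
inline std::uint8_t toDisplay8(float hdr, float gamma = 2.2f) {
    float ldr = reinhard(hdr);
    float encoded = std::pow(ldr, 1.0f / gamma);
    float clamped = std::min(std::max(encoded, 0.0f), 1.0f);
    return static_cast<std::uint8_t>(std::lround(clamped * 255.0f));
}
```

Because the curve only approaches 1 asymptotically, very bright values are compressed heavily but never clip outright, which is why bright regions keep some gradation instead of flattening to pure white.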
OBB3 collision detection can be surprisingly tricky to implement correctly, especially when the object is rotated relative to the voxel grid. Performing collision checks against an entire voxel world would be far too expensive with a naïve approach, since most of the world is irrelevant to the object’s current position.
A more efficient solution is to take advantage of the fact that voxels are stored in a regular grid and can be indexed directly. First, compute an AABB3 that fully encloses the physics primitive, such as the oriented bounding box. This axis-aligned box gives a conservative region of space that is guaranteed to contain every voxel the primitive could possibly overlap.
Once that enclosing AABB is calculated, convert its minimum and maximum world-space coordinates into voxel grid coordinates. Then, iterate only over the voxels inside that rectangular region. For each occupied voxel, perform the actual narrow-phase collision test against the original physics primitive, such as an OBB-versus-AABB test. This avoids checking the entire voxel world while still ensuring that no possible collisions are missed.
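The broad-phase part of this can be sketched as below. The types (`Obb`, `Aabb`, `VoxelCoord`) and function names are hypothetical and are not the engine's OBB3/AABB3 classes; the OBB axes are assumed to be orthonormal, and the narrow-phase test is left to the `visit` callback.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <functional>

struct Vec3 { float x, y, z; };
struct Aabb { Vec3 min, max; };

// Hypothetical oriented box; not the engine's OBB3 class.
struct Obb {
    Vec3 center;
    std::array<Vec3, 3> axes;  // orthonormal local axes in world space
    Vec3 halfExtents;          // half-size along each local axis
};

// Smallest world-axis-aligned box that fully contains the OBB.
inline Aabb enclosingAabb(const Obb& o) {
    const float h[3] = { o.halfExtents.x, o.halfExtents.y, o.halfExtents.z };
    Vec3 e{0.0f, 0.0f, 0.0f};
    for (std::size_t i = 0; i < 3; ++i) {
        // Project each scaled local axis onto the world axes.
        e.x += std::fabs(o.axes[i].x) * h[i];
        e.y += std::fabs(o.axes[i].y) * h[i];
        e.z += std::fabs(o.axes[i].z) * h[i];
    }
    return { { o.center.x - e.x, o.center.y - e.y, o.center.z - e.z },
             { o.center.x + e.x, o.center.y + e.y, o.center.z + e.z } };
}

struct VoxelCoord { int x, y, z; };

// Visits every voxel cell the box touches and returns how many were visited.
// The caller's `visit` would skip empty voxels and run the narrow phase.
inline int forEachVoxelInAabb(const Aabb& box, float voxelSize,
                              const std::function<void(VoxelCoord)>& visit) {
    assert(voxelSize > 0.0f);
    // floor (not truncation) so negative coordinates map to the right cell.
    auto cell = [voxelSize](float v) {
        return static_cast<int>(std::floor(v / voxelSize));
    };
    int count = 0;
    for (int z = cell(box.min.z); z <= cell(box.max.z); ++z)
        for (int y = cell(box.min.y); y <= cell(box.max.y); ++y)
            for (int x = cell(box.min.x); x <= cell(box.max.x); ++x) {
                visit(VoxelCoord{x, y, z});
                ++count;
            }
    return count;
}
```

The cost of a query now depends only on the size of the object relative to a voxel, not on the size of the world.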
In practice, this turns a potentially world-sized collision query into a small localized search around the moving object, making voxel collision detection much more practical for real-time gameplay.
Physics calculations against voxel grid.