Vulkan Renderer

A high-performance rendering engine built on the Vulkan API, featuring PBR materials, advanced shadow techniques, and a custom shader pipeline.

vulkan
c++
pbr
rendering
graphics
shadows
shaders
Screenshot 1: PBR materials and dynamic lighting in an indoor scene

Overview

This Vulkan-based rendering engine targets real-time graphics applications. Its modular architecture supports modern rendering techniques, including physically based shading, cascaded shadow maps, and multi-threaded command generation, and it is tested across multiple GPU vendors and driver versions.

Physically Based Rendering

Implements a complete PBR pipeline with a metallic-roughness workflow, image-based lighting, and energy conservation.
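As a rough illustration of what a metallic-roughness workflow with energy conservation involves, the sketch below derives the specular and diffuse terms from a base color and a metallic factor, and uses Schlick's Fresnel approximation to split incoming light between the two. The names `Vec3`, `SurfaceTerms`, `splitMaterial`, and `diffuseWeight` are hypothetical, and the 0.04 dielectric reflectance is the common convention rather than a value taken from this renderer, whose actual BRDF is not described here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float r, g, b; };

// Linear interpolation between two colors.
inline Vec3 mix(Vec3 a, Vec3 b, float t) {
    return { a.r + (b.r - a.r) * t, a.g + (b.g - a.g) * t, a.b + (b.b - a.b) * t };
}

// Schlick's approximation of Fresnel reflectance for a single channel.
inline float fresnelSchlick(float f0, float cosTheta) {
    const float c = std::clamp(cosTheta, 0.0f, 1.0f);
    return f0 + (1.0f - f0) * std::pow(1.0f - c, 5.0f);
}

struct SurfaceTerms {
    Vec3 f0;       // specular reflectance at normal incidence
    Vec3 diffuse;  // diffuse albedo (zero for a pure metal)
};

// Metallic-roughness convention: metals tint the specular term with the
// base color and have no diffuse term; dielectrics use a fixed F0 of 0.04.
inline SurfaceTerms splitMaterial(Vec3 baseColor, float metallic) {
    assert(metallic >= 0.0f && metallic <= 1.0f);
    const Vec3 dielectricF0{0.04f, 0.04f, 0.04f};
    SurfaceTerms s;
    s.f0 = mix(dielectricF0, baseColor, metallic);
    const float k = 1.0f - metallic;
    s.diffuse = { baseColor.r * k, baseColor.g * k, baseColor.b * k };
    return s;
}

// Energy conservation: the diffuse lobe only receives the light that was
// not already reflected by the specular lobe.
inline float diffuseWeight(float f0, float cosTheta) {
    return 1.0f - fresnelSchlick(f0, cosTheta);
}
```

A shader would evaluate the same terms per fragment; the CPU version is only meant to make the bookkeeping visible.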

Advanced Shadow Techniques

Supports cascaded shadow mapping, percentage-closer filtering, and variance shadow maps for high-quality shadows.
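Cascaded shadow mapping needs a rule for where each cascade ends along the view depth. One widely used rule, sketched below, blends a logarithmic and a uniform distribution with a weight `lambda`. The function name `computeCascadeSplits` and its parameters are assumptions for illustration; the split scheme this renderer uses is not stated.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Returns the far distance of each cascade. lambda = 1 gives a purely
// logarithmic distribution, lambda = 0 a purely uniform one.
std::vector<float> computeCascadeSplits(float nearPlane, float farPlane,
                                        int cascadeCount, float lambda) {
    assert(nearPlane > 0.0f && farPlane > nearPlane && cascadeCount > 0);
    std::vector<float> splits(static_cast<std::size_t>(cascadeCount));
    for (int i = 1; i <= cascadeCount; ++i) {
        const float t = static_cast<float>(i) / static_cast<float>(cascadeCount);
        const float logSplit = nearPlane * std::pow(farPlane / nearPlane, t);
        const float uniformSplit = nearPlane + (farPlane - nearPlane) * t;
        splits[static_cast<std::size_t>(i - 1)] =
            lambda * logSplit + (1.0f - lambda) * uniformSplit;
    }
    return splits;
}
```

Higher `lambda` values concentrate shadow-map resolution close to the camera, where aliasing is most visible.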

Custom Shader Pipeline

Flexible shader system with hot-reloading, permutation management, and automatic resource binding.

Multi-threaded Rendering

Utilizes multiple CPU cores for command buffer generation, resource uploads, and scene management.
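The usual pattern behind multi-threaded command generation is that each worker records into its own command list, and the main thread then submits the lists in a fixed order. In Vulkan this matters because command pools must not be used from several threads at once. The sketch below models that pattern with plain vectors standing in for command buffers; `DrawCommand`, `CommandList`, and `recordInParallel` are hypothetical names, not part of the renderer's API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

struct DrawCommand { uint32_t meshId; };
using CommandList = std::vector<DrawCommand>;  // stand-in for a per-thread command buffer

// Splits the draw list into contiguous chunks and records each chunk on its
// own thread. Every worker writes only to its own list, so no locking is needed.
std::vector<CommandList> recordInParallel(const std::vector<uint32_t>& meshIds,
                                          unsigned workerCount) {
    assert(workerCount > 0);
    std::vector<CommandList> lists(workerCount);
    std::vector<std::thread> workers;
    const std::size_t chunk = (meshIds.size() + workerCount - 1) / workerCount;
    for (unsigned w = 0; w < workerCount; ++w) {
        workers.emplace_back([&, w] {
            const std::size_t begin = std::min(meshIds.size(), w * chunk);
            const std::size_t end = std::min(meshIds.size(), begin + chunk);
            for (std::size_t i = begin; i < end; ++i) {
                lists[w].push_back({meshIds[i]});
            }
        });
    }
    for (auto& t : workers) {
        t.join();
    }
    return lists;  // submitted in index order, so draw order stays deterministic
}
```

A production version would typically reuse a thread pool instead of spawning threads every frame.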

Project Details

Last Updated

3 months ago

Technologies

vulkan
c++
pbr
rendering
graphics

Screenshots

Screenshot 2: Cascaded shadow mapping with soft shadows
Screenshot 3: Screen space reflections and ambient occlusion

Implementation

Architecture

The renderer is built on a modular architecture with clear separation between the Vulkan abstraction layer, resource management, and rendering systems. This design allows for easy extension and maintenance while providing a clean API for client applications.

Vulkan Integration

The Vulkan API is wrapped in a thin abstraction layer that simplifies common operations while maintaining full access to Vulkan's features. This includes utilities for device selection, queue management, synchronization, and resource creation.
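Device selection is one of the utilities such a layer typically provides. A minimal sketch, assuming a simple policy: reject devices that lack a required extension, then prefer discrete GPUs over integrated ones. `DeviceCandidate`, `DeviceType`, and `selectDevice` are stand-ins that avoid a dependency on the Vulkan headers; a real implementation would fill them from `vkEnumeratePhysicalDevices`, `vkGetPhysicalDeviceProperties`, and `vkEnumerateDeviceExtensionProperties`. The scoring policy itself is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

enum class DeviceType { Other, Integrated, Discrete, Virtual, Cpu };

struct DeviceCandidate {
    std::string name;
    DeviceType type;
    std::vector<std::string> extensions;  // extensions the device reports
};

// True if every required extension appears in the device's extension list.
bool supportsAll(const DeviceCandidate& d, const std::vector<std::string>& required) {
    return std::all_of(required.begin(), required.end(), [&](const std::string& ext) {
        return std::find(d.extensions.begin(), d.extensions.end(), ext) != d.extensions.end();
    });
}

// Hypothetical preference order: discrete > integrated > virtual > everything else.
int score(DeviceType t) {
    switch (t) {
        case DeviceType::Discrete:   return 3;
        case DeviceType::Integrated: return 2;
        case DeviceType::Virtual:    return 1;
        default:                     return 0;
    }
}

// Returns the index of the best suitable device, or nullopt if none qualifies.
std::optional<std::size_t> selectDevice(const std::vector<DeviceCandidate>& devices,
                                        const std::vector<std::string>& required) {
    std::optional<std::size_t> best;
    for (std::size_t i = 0; i < devices.size(); ++i) {
        if (!supportsAll(devices[i], required)) continue;
        if (!best || score(devices[i].type) > score(devices[*best].type)) best = i;
    }
    assert(!best || supportsAll(devices[*best], required));
    return best;
}
```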

Resource Management

Resources like textures, buffers, and pipelines are managed through a centralized system that handles allocation, deallocation, and state tracking. This includes a custom memory allocator that efficiently manages Vulkan memory heaps and a descriptor set manager for optimal binding.
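Custom Vulkan allocators generally request a few large device memory blocks and hand out aligned sub-ranges, because drivers limit the number of live allocations. The simplest form of that idea is a linear sub-allocator, sketched below. The class name `LinearSubAllocator` is hypothetical and the renderer's actual allocation strategy is not documented here; a real allocator would also need per-resource freeing and memory-type selection.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hands out aligned offsets inside one large memory block by bumping a head
// pointer. Individual frees are not supported; reset() releases everything.
class LinearSubAllocator {
public:
    explicit LinearSubAllocator(uint64_t capacity) : m_capacity(capacity) {}

    // Returns the offset of the new sub-allocation, or nullopt if it does not fit.
    std::optional<uint64_t> allocate(uint64_t size, uint64_t alignment) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);  // power of two
        const uint64_t aligned = (m_head + alignment - 1) & ~(alignment - 1);
        if (aligned > m_capacity || size > m_capacity - aligned) {
            return std::nullopt;
        }
        m_head = aligned + size;
        return aligned;
    }

    void reset() { m_head = 0; }
    uint64_t used() const { return m_head; }

private:
    uint64_t m_capacity;
    uint64_t m_head = 0;
};
```

The returned offset is what would be passed to `vkBindBufferMemory` or `vkBindImageMemory`, with the alignment taken from the resource's `VkMemoryRequirements`.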

Rendering Pipeline

The rendering pipeline supports both forward and deferred rendering paths with a flexible material system. Post-processing effects are implemented using a composable graph-based approach that allows for easy addition and configuration of effects.
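A graph-based post-processing chain boils down to ordering passes so that each one runs after the passes whose output it reads. The sketch below does this with Kahn's topological sort. `PassNode` and `executionOrder` are hypothetical names, and the pass names in the usage note are only examples.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

struct PassNode {
    std::string name;
    std::vector<std::size_t> dependsOn;  // indices of passes whose output this pass reads
};

// Returns pass indices in a valid execution order (Kahn's algorithm).
// An empty result for a non-empty graph means the graph contains a cycle.
std::vector<std::size_t> executionOrder(const std::vector<PassNode>& passes) {
    std::vector<std::size_t> pending(passes.size(), 0);
    std::vector<std::vector<std::size_t>> consumers(passes.size());
    for (std::size_t i = 0; i < passes.size(); ++i) {
        for (std::size_t dep : passes[i].dependsOn) {
            assert(dep < passes.size());
            ++pending[i];
            consumers[dep].push_back(i);
        }
    }

    std::queue<std::size_t> ready;
    for (std::size_t i = 0; i < passes.size(); ++i) {
        if (pending[i] == 0) ready.push(i);
    }

    std::vector<std::size_t> order;
    while (!ready.empty()) {
        const std::size_t current = ready.front();
        ready.pop();
        order.push_back(current);
        for (std::size_t consumer : consumers[current]) {
            if (--pending[consumer] == 0) ready.push(consumer);
        }
    }
    if (order.size() != passes.size()) order.clear();  // cycle detected
    return order;
}
```

With passes declared as tonemap (reads bloom), bloom (reads scene), and scene, the function orders them scene, bloom, tonemap regardless of declaration order.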

Challenges & Solutions

Synchronization Complexity

Challenge

Vulkan's explicit synchronization model requires careful management of resource dependencies and command execution order. Missing barriers cause data hazards, while redundant barriers stall the GPU.

Solution

Developed a dependency tracking system that analyzes resource access patterns and automatically inserts the minimal set of barriers required for correct execution.
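The core of such a tracker can be sketched in a few lines: remember the last access to each resource and emit a barrier only when the next access forms a hazard (read-after-write, write-after-read, or write-after-write). The types `Access`, `Barrier`, and `BarrierTracker` are hypothetical, and this simplified model ignores pipeline stages, image layouts, and queue ownership, all of which a real Vulkan tracker must handle.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

enum class Access { None, Read, Write };

struct Barrier {
    uint32_t resource;
    Access before;
    Access after;
};

class BarrierTracker {
public:
    // Declares the next access to a resource. A barrier is recorded unless the
    // resource is untouched so far or both accesses are reads.
    void use(uint32_t resource, Access next) {
        assert(next != Access::None);
        Access& last = m_lastAccess[resource];  // new entries start as Access::None
        const bool hazard = last != Access::None &&
                            (last == Access::Write || next == Access::Write);
        if (hazard) {
            m_barriers.push_back({resource, last, next});
        }
        last = next;
    }

    const std::vector<Barrier>& barriers() const { return m_barriers; }

private:
    std::unordered_map<uint32_t, Access> m_lastAccess;
    std::vector<Barrier> m_barriers;
};
```

Consecutive reads of the same resource produce no barrier, which is where most of the savings over a conservative "barrier before every pass" approach come from.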

Shader Permutation Explosion

Challenge

Supporting various material features and rendering techniques led to a combinatorial explosion of shader variants, which increased compilation time and binary size.

Solution

Implemented a shader permutation system with runtime code generation and caching. Only the combinations that are actually used get compiled, and common code is shared between variants.
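A minimal sketch of the compile-on-demand idea, assuming each variant is identified by a bitmask of enabled features: the cache compiles a variant the first time it is requested and returns the stored result afterwards. The feature flags, the `PermutationCache` class, and the preprocessor defines are all hypothetical, and the `compile` function is a placeholder for a call into a real shader compiler.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

enum ShaderFeature : uint32_t {
    FeatureNormalMap = 1u << 0,
    FeatureSkinning  = 1u << 1,
    FeatureAlphaTest = 1u << 2,
    FeatureAll       = FeatureNormalMap | FeatureSkinning | FeatureAlphaTest
};

struct CompiledShader {
    std::string defines;  // placeholder for real compiled bytecode
};

class PermutationCache {
public:
    // Returns the variant for a feature mask, compiling it on first use only.
    const CompiledShader& get(uint32_t featureMask) {
        assert((featureMask & ~static_cast<uint32_t>(FeatureAll)) == 0);
        auto it = m_cache.find(featureMask);
        if (it == m_cache.end()) {
            it = m_cache.emplace(featureMask, compile(featureMask)).first;
            ++m_compileCount;
        }
        return it->second;
    }

    int compileCount() const { return m_compileCount; }

private:
    // Stand-in for invoking a shader compiler with generated defines.
    static CompiledShader compile(uint32_t mask) {
        CompiledShader shader;
        if (mask & FeatureNormalMap) shader.defines += "#define USE_NORMAL_MAP\n";
        if (mask & FeatureSkinning)  shader.defines += "#define USE_SKINNING\n";
        if (mask & FeatureAlphaTest) shader.defines += "#define USE_ALPHA_TEST\n";
        return shader;
    }

    std::unordered_map<uint32_t, CompiledShader> m_cache;
    int m_compileCount = 0;
};
```

With three features there are eight possible variants, but a scene that uses only two of them triggers only two compilations.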

Cross-Platform Compatibility

Challenge

Ensuring consistent behavior across different GPU vendors and driver versions was challenging due to varying interpretations of the Vulkan specification and driver bugs.

Solution

Created a comprehensive validation suite that tests all renderer features across different hardware configurations. Implemented workarounds for known driver issues that are conditionally enabled based on vendor and driver version detection.
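Conditional workarounds of this kind are often driven by a small table keyed on vendor and driver version. The sketch below shows one possible shape. The PCI vendor IDs for NVIDIA (0x10DE) and AMD (0x1002) are real, but the workaround names and version thresholds are invented for illustration. Note that Vulkan leaves the encoding of `driverVersion` to each vendor, so the plain numeric comparison used here is a simplification that only holds within a single vendor's scheme.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct DriverInfo {
    uint32_t vendorId;       // as reported in VkPhysicalDeviceProperties::vendorID
    uint32_t driverVersion;  // vendor-specific encoding
};

struct Workaround {
    std::string name;            // hypothetical identifier for the workaround
    uint32_t vendorId;           // vendor the issue applies to
    uint32_t firstFixedVersion;  // drivers below this version need the workaround
};

// Returns the names of all workarounds that apply to the given driver.
std::vector<std::string> activeWorkarounds(const DriverInfo& driver,
                                           const std::vector<Workaround>& table) {
    std::vector<std::string> active;
    for (const Workaround& w : table) {
        assert(!w.name.empty());
        if (w.vendorId == driver.vendorId && driver.driverVersion < w.firstFixedVersion) {
            active.push_back(w.name);
        }
    }
    return active;
}
```

Keeping the table as data rather than scattered `if` statements makes it easy to retire a workaround once the affected driver versions are no longer supported.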

Performance

Benchmarks
Sponza Scene (4K): 120 FPS

Classic Sponza scene with 262K triangles, PBR materials, and dynamic lighting

Forest Scene (4K): 95 FPS

Dense forest environment with 1.2M triangles and volumetric lighting

City Scene (4K): 85 FPS

Urban environment with 3.5M triangles, reflections, and global illumination
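Frame rates are easier to compare as frame times, since optimization budgets are spent in milliseconds. The helpers below are an illustrative conversion, not part of the renderer. The triangle throughput figure assumes every triangle in the scene is processed each frame, which overstates the real load whenever culling is active.

```cpp
#include <cassert>

// Converts a frame rate to the time available per frame, in milliseconds.
inline double frameTimeMs(double fps) {
    assert(fps > 0.0);
    return 1000.0 / fps;
}

// Upper bound on triangles processed per second for a given scene size.
inline double trianglesPerSecond(double trianglesPerFrame, double fps) {
    assert(trianglesPerFrame >= 0.0 && fps > 0.0);
    return trianglesPerFrame * fps;
}
```

Applied to the figures above, this gives roughly 8.3 ms per frame for Sponza, 10.5 ms for the forest scene, and 11.8 ms for the city scene, with upper-bound throughputs of about 31 million, 114 million, and 298 million triangles per second respectively.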

Optimizations
  • Hierarchical depth buffer for early fragment culling
  • Asynchronous compute for post-processing effects
  • Mesh clustering for efficient GPU culling
  • Texture streaming with priority-based loading
  • Shader hot reloading for rapid iteration
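To illustrate the first item, a hierarchical depth buffer is a mip chain in which each texel stores the farthest depth of the four texels beneath it, so one lookup can conservatively reject geometry hidden behind a whole screen region. The CPU reference below builds one level of that chain. It assumes a conventional depth range where larger values are farther away, and the names `DepthLevel`, `downsampleMax`, and `isOccluded` are hypothetical; on the GPU the same reduction would normally run in a compute shader.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct DepthLevel {
    std::size_t width;
    std::size_t height;
    std::vector<float> texels;  // row-major, larger value = farther away
};

// Builds the next mip level by keeping the farthest depth of each 2x2 block.
DepthLevel downsampleMax(const DepthLevel& src) {
    assert(src.width % 2 == 0 && src.height % 2 == 0);
    assert(src.texels.size() == src.width * src.height);
    DepthLevel dst{src.width / 2, src.height / 2, {}};
    dst.texels.resize(dst.width * dst.height);
    auto at = [&](std::size_t x, std::size_t y) { return src.texels[y * src.width + x]; };
    for (std::size_t y = 0; y < dst.height; ++y) {
        for (std::size_t x = 0; x < dst.width; ++x) {
            dst.texels[y * dst.width + x] = std::max({at(2 * x, 2 * y),
                                                      at(2 * x + 1, 2 * y),
                                                      at(2 * x, 2 * y + 1),
                                                      at(2 * x + 1, 2 * y + 1)});
        }
    }
    return dst;
}

// An object is hidden if even its nearest point lies behind the farthest
// depth already stored for the region it covers.
inline bool isOccluded(float objectNearestDepth, float regionFarthestDepth) {
    return objectNearestDepth > regionFarthestDepth;
}
```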

Code Snippets

```cpp
VulkanDevice::VulkanDevice(const DeviceCreateInfo& createInfo) {
    // Select physical device
    m_physicalDevice = selectPhysicalDevice(createInfo.instance, createInfo.requiredExtensions);

    // Query queue family properties
    uint32_t queueFamilyCount = 0;
    vkGetPhysicalDeviceQueueFamilyProperties(m_physicalDevice, &queueFamilyCount, nullptr);
    std::vector<VkQueueFamilyProperties> queueFamilies(queueFamilyCount);
    vkGetPhysicalDeviceQueueFamilyProperties(m_physicalDevice, &queueFamilyCount, queueFamilies.data());

    // Find queue families that support graphics, compute, and transfer operations
    QueueFamilyIndices indices = findQueueFamilies(queueFamilies, createInfo.surface);

    // Create logical device with requested queues and extensions
    std::vector<VkDeviceQueueCreateInfo> queueCreateInfos;
    std::set<uint32_t> uniqueQueueFamilies = {
        indices.graphicsFamily.value(),
        indices.computeFamily.value(),
        indices.transferFamily.value()
    };

    float queuePriority = 1.0f;
    for (uint32_t queueFamily : uniqueQueueFamilies) {
        VkDeviceQueueCreateInfo queueCreateInfo{};
        queueCreateInfo.sType = VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO;
        queueCreateInfo.queueFamilyIndex = queueFamily;
        queueCreateInfo.queueCount = 1;
        queueCreateInfo.pQueuePriorities = &queuePriority;
        queueCreateInfos.push_back(queueCreateInfo);
    }

    VkDeviceCreateInfo deviceCreateInfo{};
    deviceCreateInfo.sType = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO;
    deviceCreateInfo.pQueueCreateInfos = queueCreateInfos.data();
    deviceCreateInfo.queueCreateInfoCount = static_cast<uint32_t>(queueCreateInfos.size());
    deviceCreateInfo.pEnabledFeatures = &createInfo.enabledFeatures;
    deviceCreateInfo.enabledExtensionCount = static_cast<uint32_t>(createInfo.requiredExtensions.size());
    deviceCreateInfo.ppEnabledExtensionNames = createInfo.requiredExtensions.data();

    // Create the logical device
    VK_CHECK(vkCreateDevice(m_physicalDevice, &deviceCreateInfo, nullptr, &m_device));

    // Get queue handles
    vkGetDeviceQueue(m_device, indices.graphicsFamily.value(), 0, &m_graphicsQueue);
    vkGetDeviceQueue(m_device, indices.computeFamily.value(), 0, &m_computeQueue);
    vkGetDeviceQueue(m_device, indices.transferFamily.value(), 0, &m_transferQueue);

    // Initialize memory allocator
    initializeMemoryAllocator();
}
```
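The constructor relies on `findQueueFamilies`, which is not shown. One plausible selection policy, sketched here as an assumption, prefers a dedicated family for compute and transfer work and falls back to shared families when none exists. To stay free of the Vulkan headers, the sketch takes the raw queue flag bits (the constants match the numeric values of `VK_QUEUE_GRAPHICS_BIT`, `VK_QUEUE_COMPUTE_BIT`, and `VK_QUEUE_TRANSFER_BIT`) and omits the presentation-support check that the `surface` argument implies. The function name `pickQueueFamilies` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

constexpr uint32_t kGraphicsBit = 0x1;  // same value as VK_QUEUE_GRAPHICS_BIT
constexpr uint32_t kComputeBit  = 0x2;  // same value as VK_QUEUE_COMPUTE_BIT
constexpr uint32_t kTransferBit = 0x4;  // same value as VK_QUEUE_TRANSFER_BIT

struct QueueFamilyIndices {
    std::optional<uint32_t> graphicsFamily;
    std::optional<uint32_t> computeFamily;
    std::optional<uint32_t> transferFamily;

    bool isComplete() const {
        return graphicsFamily && computeFamily && transferFamily;
    }
};

// flags[i] holds the queueFlags of queue family i.
QueueFamilyIndices pickQueueFamilies(const std::vector<uint32_t>& flags) {
    assert(!flags.empty());  // a Vulkan device exposes at least one queue family

    // First family that has all required bits and none of the excluded bits.
    auto firstMatching = [&](uint32_t required, uint32_t excluded) -> std::optional<uint32_t> {
        for (uint32_t i = 0; i < flags.size(); ++i) {
            if ((flags[i] & required) == required && (flags[i] & excluded) == 0) return i;
        }
        return std::nullopt;
    };

    QueueFamilyIndices out;
    out.graphicsFamily = firstMatching(kGraphicsBit, 0);

    // Prefer a compute family without graphics (async compute), else any compute family.
    out.computeFamily = firstMatching(kComputeBit, kGraphicsBit);
    if (!out.computeFamily) out.computeFamily = firstMatching(kComputeBit, 0);

    // Prefer a transfer-only family, else any family that reports transfer.
    out.transferFamily = firstMatching(kTransferBit, kGraphicsBit | kComputeBit);
    if (!out.transferFamily) out.transferFamily = firstMatching(kTransferBit, 0);
    // Graphics and compute families support transfer even without the transfer bit.
    if (!out.transferFamily) {
        out.transferFamily = out.graphicsFamily ? out.graphicsFamily : out.computeFamily;
    }
    return out;
}
```

Checking `isComplete()` before calling `.value()` avoids the `std::bad_optional_access` exception that an empty optional would otherwise throw during device creation.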

Related Projects

Ray Tracing Framework

A real-time ray tracing framework with BVH acceleration structures.

DirectX 12 Engine

A modern DirectX 12 rendering engine with mesh shaders.