Add creation function for framegraphs and render targets

- The framegraph code currently does not attempt to alias render
  targets, but this should be pretty straightforward to add.
Kevin Trogant 2024-02-05 21:49:09 +01:00
parent 1d8051747e
commit bdd3db98bb
27 changed files with 21047 additions and 89 deletions

contrib/vma/CHANGELOG.md (new file, 179 lines)

@@ -0,0 +1,179 @@
# 3.0.1 (2022-05-26)
- Fixes in defragmentation algorithm.
- Fixes in GpuMemDumpVis.py regarding image height calculation.
- Other bug fixes, optimizations, and improvements in the code and documentation.
# 3.0.0 (2022-03-25)
It has been a long time since the previous official release, so hopefully everyone has been using the latest code from "master" branch, which is always maintained in a good state, not the old version. For completeness, here is the list of changes since v2.3.0. The major version number has changed, so there are some compatibility-breaking changes, but the basic API stays the same and is mostly backward-compatible.
Major features added (some compatibility-breaking):
- Added new API for selecting preferred memory type: flags `VMA_MEMORY_USAGE_AUTO`, `VMA_MEMORY_USAGE_AUTO_PREFER_DEVICE`, `VMA_MEMORY_USAGE_AUTO_PREFER_HOST`, `VMA_ALLOCATION_CREATE_HOST_ACCESS_SEQUENTIAL_WRITE_BIT`, `VMA_ALLOCATION_CREATE_HOST_ACCESS_RANDOM_BIT`, `VMA_ALLOCATION_CREATE_HOST_ACCESS_ALLOW_TRANSFER_INSTEAD_BIT`. Old values like `VMA_MEMORY_USAGE_GPU_ONLY` still work as before, for backward compatibility, but are not recommended.
- Added new defragmentation API and algorithm, replacing the old one. See structure `VmaDefragmentationInfo`, `VmaDefragmentationMove`, `VmaDefragmentationPassMoveInfo`, `VmaDefragmentationStats`, function `vmaBeginDefragmentation`, `vmaEndDefragmentation`, `vmaBeginDefragmentationPass`, `vmaEndDefragmentationPass`.
- Redesigned API for statistics, replacing the old one. See structures: `VmaStatistics`, `VmaDetailedStatistics`, `VmaTotalStatistics`. `VmaBudget`, functions: `vmaGetHeapBudgets`, `vmaCalculateStatistics`, `vmaGetPoolStatistics`, `vmaCalculatePoolStatistics`, `vmaGetVirtualBlockStatistics`, `vmaCalculateVirtualBlockStatistics`.
- Added "Virtual allocator" feature - possibility to use core allocation algorithms for allocation of custom memory, not necessarily Vulkan device memory. See functions like `vmaCreateVirtualBlock`, `vmaDestroyVirtualBlock` and many more.
- `VmaAllocation` now keeps both `void* pUserData` and `char* pName`. Added function `vmaSetAllocationName`, member `VmaAllocationInfo::pName`. Flag `VMA_ALLOCATION_CREATE_USER_DATA_COPY_STRING_BIT` is now deprecated.
- Clarified and cleaned up various ways of importing Vulkan functions. See macros `VMA_STATIC_VULKAN_FUNCTIONS`, `VMA_DYNAMIC_VULKAN_FUNCTIONS`, structure `VmaVulkanFunctions`. Added members `VmaVulkanFunctions::vkGetInstanceProcAddr`, `vkGetDeviceProcAddr`, which are now required when using `VMA_DYNAMIC_VULKAN_FUNCTIONS`.
Removed (compatibility-breaking):
- Removed whole "lost allocations" feature. Removed from the interface: `VMA_ALLOCATION_CREATE_CAN_BECOME_LOST_BIT`, `VMA_ALLOCATION_CREATE_CAN_MAKE_OTHER_LOST_BIT`, `vmaCreateLostAllocation`, `vmaMakePoolAllocationsLost`, `vmaTouchAllocation`, `VmaAllocatorCreateInfo::frameInUseCount`, `VmaPoolCreateInfo::frameInUseCount`.
- Removed whole "record & replay" feature. Removed from the API: `VmaAllocatorCreateInfo::pRecordSettings`, `VmaRecordSettings`, `VmaRecordFlagBits`, `VmaRecordFlags`. Removed VmaReplay application.
- Removed "buddy" algorithm - removed flag `VMA_POOL_CREATE_BUDDY_ALGORITHM_BIT`.
Minor but compatibility-breaking changes:
- Changes in `ALLOCATION_CREATE_STRATEGY` flags. Removed flags: `VMA_ALLOCATION_CREATE_STRATEGY_MIN_FRAGMENTATION_BIT`, `VMA_ALLOCATION_CREATE_STRATEGY_WORST_FIT_BIT`, `VMA_VIRTUAL_ALLOCATION_CREATE_STRATEGY_MIN_FRAGMENTATION_BIT`, which were aliases to other existing flags.
- Added a member `void* pUserData` to `VmaDeviceMemoryCallbacks`. Updated `PFN_vmaAllocateDeviceMemoryFunction`, `PFN_vmaFreeDeviceMemoryFunction` to use the new `pUserData` member.
- Removed function `vmaResizeAllocation` that was already deprecated.
Other major changes:
- Added new features to custom pools: support for dedicated allocations, new member `VmaPoolCreateInfo::pMemoryAllocateNext`, `minAllocationAlignment`.
- Added support for Vulkan 1.2, 1.3.
- Added support for VK_KHR_buffer_device_address extension - flag `VMA_ALLOCATOR_CREATE_BUFFER_DEVICE_ADDRESS_BIT`.
- Added support for VK_EXT_memory_priority extension - flag `VMA_ALLOCATOR_CREATE_EXT_MEMORY_PRIORITY_BIT`, members `VmaAllocationCreateInfo::priority`, `VmaPoolCreateInfo::priority`.
- Added support for VK_AMD_device_coherent_memory extension - flag `VMA_ALLOCATOR_CREATE_AMD_DEVICE_COHERENT_MEMORY_BIT`.
- Added member `VmaAllocatorCreateInfo::pTypeExternalMemoryHandleTypes`.
- Added function `vmaGetAllocatorInfo`, structure `VmaAllocatorInfo`.
- Added functions `vmaFlushAllocations`, `vmaInvalidateAllocations` for multiple allocations at once.
- Added flag `VMA_ALLOCATION_CREATE_CAN_ALIAS_BIT`.
- Added function `vmaCreateBufferWithAlignment`.
- Added convenience function `vmaGetAllocationMemoryProperties`.
- Added convenience functions: `vmaCreateAliasingBuffer`, `vmaCreateAliasingImage`.
Other minor changes:
- Implemented Two-Level Segregated Fit (TLSF) allocation algorithm, replacing previous default one. It is much faster, especially when freeing many allocations at once or when `bufferImageGranularity` is large.
- Renamed debug macro `VMA_DEBUG_ALIGNMENT` to `VMA_MIN_ALIGNMENT`.
- Added CMake support - CMakeLists.txt files. Removed Premake support.
- Changed `vmaInvalidateAllocation` and `vmaFlushAllocation` to return `VkResult`.
- Added nullability annotations for Clang: `VMA_NULLABLE`, `VMA_NOT_NULL`, `VMA_NULLABLE_NON_DISPATCHABLE`, `VMA_NOT_NULL_NON_DISPATCHABLE`, `VMA_LEN_IF_NOT_NULL`.
- JSON dump format has changed.
- Countless fixes and improvements, including performance optimizations, compatibility with various platforms and compilers, documentation.
# 2.3.0 (2019-12-04)
Major release after a year of development in the "master" branch and feature branches. Notable new features: support for Vulkan 1.1 and querying for memory budget.
Major changes:
- Added support for Vulkan 1.1.
- Added member `VmaAllocatorCreateInfo::vulkanApiVersion`.
- When Vulkan 1.1 is used, there is no need to enable VK_KHR_dedicated_allocation or VK_KHR_bind_memory2 extensions, as they are promoted to Vulkan itself.
- Added support for query for memory budget and staying within the budget.
- Added function `vmaGetBudget`, structure `VmaBudget`. This can also serve as simple statistics, more efficient than `vmaCalculateStats`.
- By default, the budget is estimated based on memory heap sizes. It may be queried from the system using the VK_EXT_memory_budget extension if you use the `VMA_ALLOCATOR_CREATE_EXT_MEMORY_BUDGET_BIT` flag and set the `VmaAllocatorCreateInfo::instance` member.
- Added flag `VMA_ALLOCATION_CREATE_WITHIN_BUDGET_BIT` that fails an allocation if it would exceed the budget.
- Added new memory usage options:
- `VMA_MEMORY_USAGE_CPU_COPY` for memory that is preferably not `DEVICE_LOCAL` but not guaranteed to be `HOST_VISIBLE`.
- `VMA_MEMORY_USAGE_GPU_LAZILY_ALLOCATED` for memory that is `LAZILY_ALLOCATED`.
- Added support for VK_KHR_bind_memory2 extension:
- Added `VMA_ALLOCATION_CREATE_DONT_BIND_BIT` flag that lets you create both buffer/image and allocation, but don't bind them together.
- Added flag `VMA_ALLOCATOR_CREATE_KHR_BIND_MEMORY2_BIT`, functions `vmaBindBufferMemory2`, `vmaBindImageMemory2` that let you specify additional local offset and `pNext` pointer while binding.
- Added functions `vmaSetPoolName`, `vmaGetPoolName` that let you assign string names to custom pools. JSON dump file format and VmaDumpVis tool is updated to show these names.
- Defragmentation is legal only on buffers and on images created with `VK_IMAGE_TILING_LINEAR`. This is due to the way it is currently implemented in the library and the restrictions of the Vulkan specification. Clarified documentation in this regard. See discussion in #59.
Minor changes:
- Made `vmaResizeAllocation` function deprecated, always returning failure.
- Made changes in the internal algorithm for the choice of memory type. Be careful! You may now get a type that is not `HOST_VISIBLE` or `HOST_COHERENT` if it's not stated as always ensured by some `VMA_MEMORY_USAGE_*` flag.
- Extended VmaReplay application with more detailed statistics printed at the end.
- Added macros `VMA_CALL_PRE`, `VMA_CALL_POST` that let you decorate declarations of all library functions if you want to e.g. export/import them as dynamically linked library.
- Optimized `VmaAllocation` objects to be allocated out of an internal free-list allocator. As a result, allocation and deallocation cause zero dynamic CPU heap allocations on average.
- Updated recording CSV file format version to 1.8, to support new functions.
- Many additions and fixes in documentation. Many compatibility fixes for various compilers and platforms. Other internal bugfixes, optimizations, updates, refactoring...
# 2.2.0 (2018-12-13)
Major release after many months of development in "master" branch and feature branches. Notable new features: defragmentation of GPU memory, buddy algorithm, convenience functions for sparse binding.
Major changes:
- New, more powerful defragmentation:
- Added structure `VmaDefragmentationInfo2`, functions `vmaDefragmentationBegin`, `vmaDefragmentationEnd`.
- Added support for defragmentation of GPU memory.
- Defragmentation of CPU memory now uses `memmove`, so it can move data to overlapping regions.
- Defragmentation of CPU memory is now available for memory types that are `HOST_VISIBLE` but not `HOST_COHERENT`.
- Added structure member `VmaVulkanFunctions::vkCmdCopyBuffer`.
- Major internal changes in defragmentation algorithm.
- VmaReplay: added parameters: `--DefragmentAfterLine`, `--DefragmentationFlags`.
- Old interface (structure `VmaDefragmentationInfo`, function `vmaDefragment`) is now deprecated.
- Added buddy algorithm, available for custom pools - flag `VMA_POOL_CREATE_BUDDY_ALGORITHM_BIT`.
- Added convenience functions for multiple allocations and deallocations at once, intended for sparse binding resources - functions `vmaAllocateMemoryPages`, `vmaFreeMemoryPages`.
- Added function that tries to resize existing allocation in place: `vmaResizeAllocation`.
- Added flags for allocation strategy: `VMA_ALLOCATION_CREATE_STRATEGY_BEST_FIT_BIT`, `VMA_ALLOCATION_CREATE_STRATEGY_WORST_FIT_BIT`, `VMA_ALLOCATION_CREATE_STRATEGY_FIRST_FIT_BIT`, and their aliases: `VMA_ALLOCATION_CREATE_STRATEGY_MIN_MEMORY_BIT`, `VMA_ALLOCATION_CREATE_STRATEGY_MIN_TIME_BIT`, `VMA_ALLOCATION_CREATE_STRATEGY_MIN_FRAGMENTATION_BIT`.
Minor changes:
- Changed behavior of allocation functions to return `VK_ERROR_VALIDATION_FAILED_EXT` when trying to allocate memory of size 0, create buffer with size 0, or image with one of the dimensions 0.
- VmaReplay: Added support for Windows end of lines.
- Updated recording CSV file format version to 1.5, to support new functions.
- Internal optimization: using read-write mutex on some platforms.
- Many additions and fixes in documentation. Many compatibility fixes for various compilers. Other internal bugfixes, optimizations, refactoring, added more internal validation...
# 2.1.0 (2018-09-10)
Minor bugfixes.
# 2.1.0-beta.1 (2018-08-27)
Major release after many months of development in the "development" branch and feature branches. Many new features added, some bugs fixed. API stays backward-compatible.
Major changes:
- Added linear allocation algorithm, accessible for custom pools, that can be used as free-at-once, stack, double stack, or ring buffer. See "Linear allocation algorithm" documentation chapter.
- Added `VMA_POOL_CREATE_LINEAR_ALGORITHM_BIT`, `VMA_ALLOCATION_CREATE_UPPER_ADDRESS_BIT`.
- Added feature to record sequence of calls to the library to a file and replay it using dedicated application. See documentation chapter "Record and replay".
- Recording: added `VmaAllocatorCreateInfo::pRecordSettings`.
- Replaying: added VmaReplay project.
- Recording file format: added document "docs/Recording file format.md".
- Improved support for non-coherent memory.
- Added functions: `vmaFlushAllocation`, `vmaInvalidateAllocation`.
- `nonCoherentAtomSize` is now respected automatically.
- Added `VmaVulkanFunctions::vkFlushMappedMemoryRanges`, `vkInvalidateMappedMemoryRanges`.
- Improved debug features related to detecting incorrect mapped memory usage. See documentation chapter "Debugging incorrect memory usage".
- Added debug macro `VMA_DEBUG_DETECT_CORRUPTION`, functions `vmaCheckCorruption`, `vmaCheckPoolCorruption`.
- Added debug macro `VMA_DEBUG_INITIALIZE_ALLOCATIONS` to initialize contents of allocations with a bit pattern.
- Changed behavior of `VMA_DEBUG_MARGIN` macro - it now adds margin also before first and after last allocation in a block.
- Changed format of JSON dump returned by `vmaBuildStatsString` (not backward compatible!).
- Custom pools and memory blocks now have IDs that don't change after sorting.
- Added properties: "CreationFrameIndex", "LastUseFrameIndex", "Usage".
- Changed VmaDumpVis tool to use these new properties for better coloring.
- Changed behavior of `vmaGetAllocationInfo` and `vmaTouchAllocation` to update `allocation.lastUseFrameIndex` even if allocation cannot become lost.
Minor changes:
- Changes in custom pools:
- Added new structure member `VmaPoolStats::blockCount`.
- Changed behavior of `VmaPoolCreateInfo::blockSize` = 0 (default) - it now means that pool may use variable block sizes, just like default pools do.
- Improved logic of `vmaFindMemoryTypeIndex` for some cases, especially integrated GPUs.
- VulkanSample application: Removed dependency on external library MathFu. Added own vector and matrix structures.
- Changes that improve compatibility with various platforms, including: Visual Studio 2012, 32-bit code, C compilers.
- Changed usage of "VK_KHR_dedicated_allocation" extension in the code to be optional, driven by macro `VMA_DEDICATED_ALLOCATION`, for compatibility with Android.
- Many additions and fixes in documentation, including description of new features, as well as "Validation layer warnings".
- Other bugfixes.
# 2.0.0 (2018-03-19)
A major release with many compatibility-breaking changes.
Notable new features:
- Introduction of `VmaAllocation` handle that you must retrieve from allocation functions and pass to deallocation functions next to normal `VkBuffer` and `VkImage`.
- Introduction of `VmaAllocationInfo` structure that you can retrieve from `VmaAllocation` handle to access parameters of the allocation (like `VkDeviceMemory` and offset) instead of retrieving them directly from allocation functions.
- Support for reference-counted mapping and persistently mapped allocations - see `vmaMapMemory`, `VMA_ALLOCATION_CREATE_MAPPED_BIT`.
- Support for custom memory pools - see `VmaPool` handle, `VmaPoolCreateInfo` structure, `vmaCreatePool` function.
- Support for defragmentation (compaction) of allocations - see function `vmaDefragment` and related structures.
- Support for "lost allocations" - see appropriate chapter on documentation Main Page.
# 1.0.1 (2017-07-04)
- Fixes for Linux GCC compilation.
- Changed "CONFIGURATION SECTION" to contain #ifndef so you can define these macros before including this header, not necessarily change them in the file.
# 1.0.0 (2017-06-16)
First public release.

contrib/vma/LICENSE.txt (new file, 19 lines)

@@ -0,0 +1,19 @@
Copyright (c) 2017-2022 Advanced Micro Devices, Inc. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

contrib/vma/README.md (new file, 175 lines)

@@ -0,0 +1,175 @@
# Vulkan Memory Allocator
Easy to integrate Vulkan memory allocation library.
**Documentation:** Browse online: [Vulkan Memory Allocator](https://gpuopen-librariesandsdks.github.io/VulkanMemoryAllocator/html/) (generated from Doxygen-style comments in [include/vk_mem_alloc.h](include/vk_mem_alloc.h))
**License:** MIT. See [LICENSE.txt](LICENSE.txt)
**Changelog:** See [CHANGELOG.md](CHANGELOG.md)
**Product page:** [Vulkan Memory Allocator on GPUOpen](https://gpuopen.com/gaming-product/vulkan-memory-allocator/)
**Build status:**
- Windows: [![Build status](https://ci.appveyor.com/api/projects/status/4vlcrb0emkaio2pn/branch/master?svg=true)](https://ci.appveyor.com/project/adam-sawicki-amd/vulkanmemoryallocator/branch/master)
- Linux: [![Build Status](https://app.travis-ci.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator.svg?branch=master)](https://app.travis-ci.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator)
[![Average time to resolve an issue](http://isitmaintained.com/badge/resolution/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator.svg)](http://isitmaintained.com/project/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator "Average time to resolve an issue")
# Problem
Memory allocation and resource (buffer and image) creation in Vulkan is difficult (compared to older graphics APIs like D3D11 or OpenGL) for several reasons:
- It requires a lot of boilerplate code, just like everything else in Vulkan, because it is a low-level and high-performance API.
- There is an additional level of indirection: `VkDeviceMemory` is allocated separately from creating `VkBuffer`/`VkImage`, and they must be bound together.
- The driver must be queried for supported memory heaps and memory types. Different GPU vendors provide different sets of them.
- It is recommended to allocate bigger chunks of memory and assign parts of them to particular resources, as there is a limit on the maximum number of memory blocks that can be allocated.
# Features
This library can help game developers to manage memory allocations and resource creation by offering some higher-level functions:
1. Functions that help to choose the correct and optimal memory type based on the intended usage of the memory.
   - Required or preferred traits of the memory are expressed using a higher-level description compared to raw Vulkan flags.
2. Functions that allocate memory blocks, then reserve and return parts of them (`VkDeviceMemory` + offset + size) to the user.
   - The library keeps track of allocated memory blocks and the used and unused ranges inside them, finds the best-matching unused range for a new allocation, and respects all rules of alignment and buffer/image granularity.
3. Functions that can create an image/buffer, allocate memory for it, and bind them together - all in one call.
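The alignment bookkeeping mentioned in point 2 boils down to rounding each suballocation offset up to the required alignment. A minimal sketch of that calculation (an illustration of the general rule, not VMA's actual implementation):

```cpp
#include <cassert>
#include <cstdint>

// Round `offset` up to the next multiple of `alignment`.
// Vulkan alignments are powers of two, which allows the bitmask trick.
inline uint64_t AlignUp(uint64_t offset, uint64_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

For example, placing a suballocation at offset 13 with a required alignment of 8 yields an aligned offset of 16; already-aligned offsets are unchanged.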
Additional features:
- Well-documented - description of all functions and structures provided, along with chapters that contain general description and example code.
- Thread-safety: Library is designed to be used in multithreaded code. Access to a single device memory block referred by different buffers and textures (binding, mapping) is synchronized internally. Memory mapping is reference-counted.
- Configuration: Fill optional members of `VmaAllocatorCreateInfo` structure to provide custom CPU memory allocator, pointers to Vulkan functions and other parameters.
- Customization and integration with custom engines: Predefine appropriate macros to provide your own implementation of all external facilities used by the library like assert, mutex, atomic.
- Support for memory mapping, reference-counted internally. Support for persistently mapped memory: Just allocate with appropriate flag and access the pointer to already mapped memory.
- Support for non-coherent memory. Functions that flush/invalidate memory. `nonCoherentAtomSize` is respected automatically.
- Support for resource aliasing (overlap).
- Support for sparse binding and sparse residency: Convenience functions that allocate or free multiple memory pages at once.
- Custom memory pools: Create a pool with desired parameters (e.g. fixed or limited maximum size) and allocate memory out of it.
- Linear allocator: Create a pool with linear algorithm and use it for much faster allocations and deallocations in free-at-once, stack, double stack, or ring buffer fashion.
- Support for Vulkan 1.0, 1.1, 1.2, 1.3.
- Support for extensions (and equivalent functionality included in new Vulkan versions):
- VK_KHR_dedicated_allocation: Just enable it and it will be used automatically by the library.
- VK_KHR_buffer_device_address: Flag `VK_MEMORY_ALLOCATE_DEVICE_ADDRESS_BIT_KHR` is automatically added to memory allocations where needed.
- VK_EXT_memory_budget: Used internally if available to query for current usage and budget. If not available, it falls back to an estimation based on memory heap sizes.
- VK_EXT_memory_priority: Set `priority` of allocations or custom pools and it will be set automatically using this extension.
- VK_AMD_device_coherent_memory
- Defragmentation of GPU and CPU memory: Let the library move data around to free some memory blocks and make your allocations better compacted.
- Statistics: Obtain brief or detailed statistics about the amount of memory used, unused, number of allocated blocks, number of allocations etc. - globally, per memory heap, and per memory type.
- Debug annotations: Associate custom `void* pUserData` and debug `char* pName` with each allocation.
- JSON dump: Obtain a string in JSON format with detailed map of internal state, including list of allocations, their string names, and gaps between them.
- Convert this JSON dump into a picture to visualize your memory. See [tools/GpuMemDumpVis](tools/GpuMemDumpVis/README.md).
- Debugging incorrect memory usage: Enable initialization of all allocated memory with a bit pattern to detect usage of uninitialized or freed memory. Enable validation of a magic number after every allocation to detect out-of-bounds memory corruption.
- Support for interoperability with OpenGL.
- Virtual allocator: Interface for using core allocation algorithm to allocate any custom data, e.g. pieces of one large buffer.
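The virtual allocator feature above runs only the core allocation algorithm, so it can be exercised without any Vulkan device. A minimal sketch based on the v3.0 API (not compiled here, since it requires `vk_mem_alloc.h` with `VMA_IMPLEMENTATION` defined in one translation unit):

```cpp
#include "vk_mem_alloc.h"  // VMA single-header library

// Suballocate one range out of a 1 MiB "virtual" block; no real memory
// is allocated - the block only tracks ranges of abstract address space.
void VirtualBlockDemo() {
    VmaVirtualBlockCreateInfo blockInfo = {};
    blockInfo.size = 1048576;  // 1 MiB

    VmaVirtualBlock block;
    vmaCreateVirtualBlock(&blockInfo, &block);

    VmaVirtualAllocationCreateInfo allocInfo = {};
    allocInfo.size = 4096;

    VmaVirtualAllocation alloc;
    VkDeviceSize offset;  // where the range was placed inside the block
    vmaVirtualAllocate(block, &allocInfo, &alloc, &offset);

    // ... use [offset, offset + 4096) of your own backing storage ...

    vmaVirtualFree(block, alloc);
    vmaDestroyVirtualBlock(block);
}
```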
# Prerequisites
- Self-contained C++ library in a single header file. No external dependencies other than the standard C and C++ library and, of course, Vulkan. Some features of C++14 are used. STL containers, RTTI, and C++ exceptions are not used.
- Public interface in C, in same convention as Vulkan API. Implementation in C++.
- Error handling implemented by returning `VkResult` error codes - same way as in Vulkan.
- Interface documented using Doxygen-style comments.
- Platform-independent, but developed and tested on Windows using Visual Studio. Continuous integration setup for Windows and Linux. Used also on Android, MacOS, and other platforms.
# Example
Basic usage of this library is very simple. Advanced features are optional. After you have created a global `VmaAllocator` object, the complete code needed to create a buffer may look like this:
```cpp
VkBufferCreateInfo bufferInfo = { VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO };
bufferInfo.size = 65536;
bufferInfo.usage = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | VK_BUFFER_USAGE_TRANSFER_DST_BIT;
VmaAllocationCreateInfo allocInfo = {};
allocInfo.usage = VMA_MEMORY_USAGE_AUTO;
VkBuffer buffer;
VmaAllocation allocation;
vmaCreateBuffer(allocator, &bufferInfo, &allocInfo, &buffer, &allocation, nullptr);
```
With this one function call:
1. `VkBuffer` is created.
2. `VkDeviceMemory` block is allocated if needed.
3. An unused region of the memory block is bound to this buffer.
`VmaAllocation` is an object that represents memory assigned to this buffer. It can be queried for parameters like `VkDeviceMemory` handle and offset.
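Continuing the example above, those parameters can be read back like this (a sketch assuming the `allocator` and `allocation` variables from the previous snippet):

```cpp
VmaAllocationInfo info = {};
vmaGetAllocationInfo(allocator, allocation, &info);

// info.deviceMemory + info.offset locate the buffer's storage inside the
// memory block - useful e.g. when interoperating with code that binds
// memory manually.
printf("block %p, offset %llu, size %llu\n",
       (void *)info.deviceMemory,
       (unsigned long long)info.offset,
       (unsigned long long)info.size);
```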
# How to build
On Windows it is recommended to use [CMake UI](https://cmake.org/runningcmake/). Alternatively you can generate a Visual Studio project using CMake on the command line: `cmake -B./build/ -DCMAKE_BUILD_TYPE=Debug -G "Visual Studio 16 2019" -A x64 ./`
On Linux:
```
mkdir build
cd build
cmake ..
make
```
The following targets are available:
| Target | Description | CMake option | Default setting |
| ------------- | ------------- | ------------- | ------------- |
| VmaSample | VMA sample application | `VMA_BUILD_SAMPLE` | `OFF` |
| VmaBuildSampleShaders | Shaders for VmaSample | `VMA_BUILD_SAMPLE_SHADERS` | `OFF` |
Please note that while VulkanMemoryAllocator library is supported on other platforms besides Windows, VmaSample is not.
These CMake options are available:
| CMake option | Description | Default setting |
| ------------- | ------------- | ------------- |
| `VMA_RECORDING_ENABLED` | Enable VMA memory recording for debugging | `OFF` |
| `VMA_USE_STL_CONTAINERS` | Use C++ STL containers instead of VMA's containers | `OFF` |
| `VMA_STATIC_VULKAN_FUNCTIONS` | Link statically with Vulkan API | `OFF` |
| `VMA_DYNAMIC_VULKAN_FUNCTIONS` | Fetch pointers to Vulkan functions internally (no static linking) | `ON` |
| `VMA_DEBUG_ALWAYS_DEDICATED_MEMORY` | Every allocation will have its own memory block | `OFF` |
| `VMA_DEBUG_INITIALIZE_ALLOCATIONS` | Automatically fill new allocations and destroyed allocations with some bit pattern | `OFF` |
| `VMA_DEBUG_GLOBAL_MUTEX` | Enable single mutex protecting all entry calls to the library | `OFF` |
| `VMA_DEBUG_DONT_EXCEED_MAX_MEMORY_ALLOCATION_COUNT` | Never exceed [VkPhysicalDeviceLimits::maxMemoryAllocationCount](https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#limits-maxMemoryAllocationCount) and return error | `OFF` |
# Binaries
The release comes with precompiled binary executable for "VulkanSample" application which contains test suite. It is compiled using Visual Studio 2019, so it requires appropriate libraries to work, including "MSVCP140.dll", "VCRUNTIME140.dll", "VCRUNTIME140_1.dll". If the launch fails with error message telling about those files missing, please download and install [Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017 and 2019](https://support.microsoft.com/en-us/help/2977003/the-latest-supported-visual-c-downloads), "x64" version.
# Read more
See **[Documentation](https://gpuopen-librariesandsdks.github.io/VulkanMemoryAllocator/html/)**.
# Software using this library
- **[X-Plane](https://x-plane.com/)**
- **[Detroit: Become Human](https://gpuopen.com/learn/porting-detroit-3/)**
- **[Vulkan Samples](https://github.com/LunarG/VulkanSamples)** - official Khronos Vulkan samples. License: Apache-style.
- **[Anvil](https://github.com/GPUOpen-LibrariesAndSDKs/Anvil)** - cross-platform framework for Vulkan. License: MIT.
- **[Filament](https://github.com/google/filament)** - physically based rendering engine for Android, Windows, Linux and macOS, from Google. Apache License 2.0.
- **[Atypical Games - proprietary game engine](https://developer.samsung.com/galaxy-gamedev/gamedev-blog/infinitejet.html)**
- **[Flax Engine](https://flaxengine.com/)**
- **[Godot Engine](https://github.com/godotengine/godot/)** - multi-platform 2D and 3D game engine. License: MIT.
- **[Lightweight Java Game Library (LWJGL)](https://www.lwjgl.org/)** - includes binding of the library for Java. License: BSD.
- **[PowerVR SDK](https://github.com/powervr-graphics/Native_SDK)** - C++ cross-platform 3D graphics SDK, from Imagination. License: MIT.
- **[Skia](https://github.com/google/skia)** - complete 2D graphic library for drawing Text, Geometries, and Images, from Google.
- **[The Forge](https://github.com/ConfettiFX/The-Forge)** - cross-platform rendering framework. Apache License 2.0.
- **[VK9](https://github.com/disks86/VK9)** - Direct3D 9 compatibility layer using Vulkan. License: Zlib.
- **[vkDOOM3](https://github.com/DustinHLand/vkDOOM3)** - Vulkan port of GPL DOOM 3 BFG Edition. License: GNU GPL.
- **[vkQuake2](https://github.com/kondrak/vkQuake2)** - vanilla Quake 2 with Vulkan support. License: GNU GPL.
- **[Vulkan Best Practice for Mobile Developers](https://github.com/ARM-software/vulkan_best_practice_for_mobile_developers)** from ARM. License: MIT.
- **[RPCS3](https://github.com/RPCS3/rpcs3)** - PlayStation 3 emulator/debugger. License: GNU GPLv2.
- **[PPSSPP](https://github.com/hrydgard/ppsspp)** - Playstation Portable emulator/debugger. License: GNU GPLv2+.
[Many other projects on GitHub](https://github.com/search?q=AMD_VULKAN_MEMORY_ALLOCATOR_H&type=Code) and some game development studios that use Vulkan in their games.
# See also
- **[D3D12 Memory Allocator](https://github.com/GPUOpen-LibrariesAndSDKs/D3D12MemoryAllocator)** - equivalent library for Direct3D 12. License: MIT.
- **[Awesome Vulkan](https://github.com/vinjn/awesome-vulkan)** - a curated list of awesome Vulkan libraries, debuggers and resources.
- **[VulkanMemoryAllocator-Hpp](https://github.com/malte-v/VulkanMemoryAllocator-Hpp)** - C++ binding for this library. License: CC0-1.0.
- **[PyVMA](https://github.com/realitix/pyvma)** - Python wrapper for this library. Author: Jean-Sébastien B. (@realitix). License: Apache 2.0.
- **[vk-mem](https://github.com/gwihlidal/vk-mem-rs)** - Rust binding for this library. Author: Graham Wihlidal. License: Apache 2.0 or MIT.
- **[Haskell bindings](https://hackage.haskell.org/package/VulkanMemoryAllocator)**, **[github](https://github.com/expipiplus1/vulkan/tree/master/VulkanMemoryAllocator)** - Haskell bindings for this library. Author: Ellie Hermaszewska (@expipiplus1). License BSD-3-Clause.
- **[vma_sample_sdl](https://github.com/rextimmy/vma_sample_sdl)** - SDL port of the sample app of this library (with the goal of running it on multiple platforms, including MacOS). Author: @rextimmy. License: MIT.
- **[vulkan-malloc](https://github.com/dylanede/vulkan-malloc)** - Vulkan memory allocation library for Rust. Based on version 1 of this library. Author: Dylan Ede (@dylanede). License: MIT / Apache 2.0.

contrib/vma/vk_mem_alloc.h (new file, 19558 lines)

File diff suppressed because it is too large.

meson.build

@ -135,8 +135,8 @@ runtime_lib = library('rt',
'src/runtime/error_report.c', 'src/runtime/error_report.c',
'src/runtime/file_tab.c', 'src/runtime/file_tab.c',
'src/runtime/fsutils.c', 'src/runtime/fsutils.c',
'src/runtime/gfx_framegraph.c',
'src/runtime/gfx_main.c', 'src/runtime/gfx_main.c',
'src/runtime/gfx_object_renderer.c',
'src/runtime/hashing.c', 'src/runtime/hashing.c',
'src/runtime/init.c', 'src/runtime/init.c',
'src/runtime/jobs.c', 'src/runtime/jobs.c',
@ -178,20 +178,27 @@ if vk_dep.found()
# Project Sources # Project Sources
'src/renderer/vk/gpu.h', 'src/renderer/vk/gpu.h',
'src/renderer/vk/pipelines.h', 'src/renderer/vk/pipelines.h',
'src/renderer/vk/render_targets.h',
'src/renderer/vk/swapchain.h', 'src/renderer/vk/swapchain.h',
'src/renderer/vk/helper.c',
'src/renderer/vk/init.c', 'src/renderer/vk/init.c',
'src/renderer/vk/pipelines.c', 'src/renderer/vk/pipelines.c',
'src/renderer/vk/render_targets.c',
'src/renderer/vk/swapchain.c', 'src/renderer/vk/swapchain.c',
# Contrib Sources # Contrib Sources
'contrib/volk/volk.h', 'contrib/volk/volk.h',
'contrib/volk/volk.c', 'contrib/volk/volk.c',
'contrib/vma/vk_mem_alloc.h',
'src/renderer/vk/vma_impl.cpp',
dependencies : [m_dep, vk_inc_dep, windowing_dep], dependencies : [m_dep, vk_inc_dep, windowing_dep],
include_directories : common_incdirs, include_directories : common_incdirs,
link_with : [runtime_lib], link_with : [runtime_lib],
c_pch : 'pch/vk_pch.h', c_pch : 'pch/vk_pch.h',
c_args : platform_defs) c_args : platform_defs,
cpp_pch : 'pch/vk_pch.h',
cpp_args : platform_defs)
static_renderer_lib = vk_renderer_lib static_renderer_lib = vk_renderer_lib
endif endif
@@ -207,7 +214,8 @@ endif
# Game # Game
executable('voyage', executable('voyage',
'src/game/voyage.c', 'src/game/entry.c',
'src/game/main.c',
include_directories : common_incdirs, include_directories : common_incdirs,
link_with : engine_link_libs, link_with : engine_link_libs,
win_subsystem : 'windows') win_subsystem : 'windows')

View File

@@ -1,18 +1,26 @@
#include "runtime/app.h" #include "runtime/app.h"
extern void Init(void);
extern void Shutdown(void);
static rt_app_callbacks _callbacks = {
.Init = Init,
.Shutdown = Shutdown,
};
#ifdef _WIN32 #ifdef _WIN32
#define WIN32_LEAN_AND_MEAN #define WIN32_LEAN_AND_MEAN
#include <Windows.h> #include <Windows.h>
int WINAPI wWinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, PWSTR pCmdLine, int nCmdShow) { int WINAPI wWinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, PWSTR pCmdLine, int nCmdShow) {
return rtWin32Entry(hInstance, hPrevInstance, pCmdLine, nCmdShow); return rtWin32Entry(hInstance, hPrevInstance, pCmdLine, nCmdShow, _callbacks);
} }
#elif defined(__linux__) #elif defined(__linux__)
int main(int argc, char **argv) { int main(int argc, char **argv) {
return rtXlibEntry(argc, argv); return rtXlibEntry(argc, argv, _callbacks);
} }
#endif #endif

65
src/game/main.c Normal file
View File

@@ -0,0 +1,65 @@
#include "runtime/gfx.h"
static rt_framegraph *_framegraph;
/* Called after the runtime has finished its initialization and before entering the main-loop */
void Init(void) {
rtLog("GAME", "Init");
rt_render_target_id rt_ids[4] = {rtCalculateRenderTargetID("rt0", sizeof("rt0")),
rtCalculateRenderTargetID("rt1", sizeof("rt1")),
rtCalculateRenderTargetID("rt2", sizeof("rt2")),
rtCalculateRenderTargetID("rt3", sizeof("rt3")),
};
rt_render_target_info rts[4] = {
{.id = rt_ids[0], .width = 1024, .height = 768, .format = RT_PIXEL_FORMAT_R8G8B8A8_SRGB, .sample_count = 4},
{.id = rt_ids[1], .width = 1024, .height = 768, .format = RT_PIXEL_FORMAT_R8G8B8A8_SRGB, .sample_count = 4},
{.id = rt_ids[2], .width = 1024, .height = 768, .format = RT_PIXEL_FORMAT_R8G8B8A8_SRGB, .sample_count = 4},
{.id = rt_ids[3], .width = 1024, .height = 768, .format = RT_PIXEL_FORMAT_R8G8B8A8_SRGB, .sample_count = 4}
};
rt_render_target_write pass0_writes[] = {{.render_target = rt_ids[0]},
{.render_target = rt_ids[1]}};
rt_render_target_read pass1_reads[] = {{.render_target = rt_ids[0]},
{.render_target = rt_ids[1]}};
rt_render_target_write pass1_writes[] = {{.render_target = rt_ids[2]}};
rt_render_target_read pass2_reads[] = {{.render_target = rt_ids[2]},
{.render_target = rt_ids[1]}};
rt_render_target_write pass2_writes[] = {{.render_target = rt_ids[3]}};
rt_render_target_write pass3_writes[] = {{.render_target = rt_ids[2]}};
rt_render_pass_info passes[4];
rtSetRelptr(&passes[0].read_render_targets, NULL);
rtSetRelptr(&passes[0].write_render_targets, pass0_writes);
passes[0].read_render_target_count = 0;
passes[0].write_render_target_count = RT_ARRAY_COUNT(pass0_writes);
rtSetRelptr(&passes[1].read_render_targets, pass1_reads);
rtSetRelptr(&passes[1].write_render_targets, pass1_writes);
passes[1].read_render_target_count = RT_ARRAY_COUNT(pass1_reads);
passes[1].write_render_target_count = RT_ARRAY_COUNT(pass1_writes);
rtSetRelptr(&passes[2].read_render_targets, pass2_reads);
rtSetRelptr(&passes[2].write_render_targets, pass2_writes);
passes[2].read_render_target_count = RT_ARRAY_COUNT(pass2_reads);
passes[2].write_render_target_count = RT_ARRAY_COUNT(pass2_writes);
rtSetRelptr(&passes[3].read_render_targets, NULL);
rtSetRelptr(&passes[3].write_render_targets, pass3_writes);
passes[3].read_render_target_count = 0;
passes[3].write_render_target_count = RT_ARRAY_COUNT(pass3_writes);
rt_framegraph_info info = {
.render_pass_count = 4,
.render_target_count = 4,
};
rtSetRelptr(&info.render_passes, passes);
rtSetRelptr(&info.render_targets, rts);
_framegraph = rtCreateFramegraph(&info);
}
/* Called after exiting the main-loop and before the runtime starts its shutdown */
void Shutdown(void) {
rtLog("GAME", "Shutdown");
rtDestroyFramegraph(_framegraph);
}

View File

@@ -3,6 +3,12 @@
#include <volk/volk.h> #include <volk/volk.h>
#define VMA_STATIC_VULKAN_FUNCTIONS 0
#define VMA_DYNAMIC_VULKAN_FUNCTIONS 0
#include <vma/vk_mem_alloc.h>
#include "runtime/renderer_api.h"
#ifdef _WIN32 #ifdef _WIN32
struct HINSTANCE__; struct HINSTANCE__;
struct HWND__; struct HWND__;
@@ -40,10 +46,18 @@ typedef struct {
VkPhysicalDeviceProperties phys_device_props; VkPhysicalDeviceProperties phys_device_props;
VkPhysicalDeviceDescriptorIndexingFeatures descriptor_indexing_features; VkPhysicalDeviceDescriptorIndexingFeatures descriptor_indexing_features;
VkPhysicalDeviceFeatures phys_device_features; VkPhysicalDeviceFeatures phys_device_features;
VmaAllocator allocator;
} rt_vk_gpu; } rt_vk_gpu;
#ifndef RT_VK_DONT_DEFINE_GPU_GLOBAL #ifndef RT_VK_DONT_DEFINE_GPU_GLOBAL
extern rt_vk_gpu g_gpu; extern rt_vk_gpu g_gpu;
#endif #endif
/* Helper functions */
VkFormat rtPixelFormatToVkFormat(rt_pixel_format format);
VkSampleCountFlagBits rtSampleCountToFlags(unsigned int count);
#endif #endif

25
src/renderer/vk/helper.c Normal file
View File

@@ -0,0 +1,25 @@
#include "gpu.h"
VkFormat rtPixelFormatToVkFormat(rt_pixel_format format) {
switch (format) {
case RT_PIXEL_FORMAT_R8G8B8A8_SRGB:
return VK_FORMAT_R8G8B8A8_SRGB;
case RT_PIXEL_FORMAT_DEPTH24_STENCIL8:
return VK_FORMAT_D24_UNORM_S8_UINT;
default:
return VK_FORMAT_UNDEFINED;
}
}
VkSampleCountFlagBits rtSampleCountToFlags(unsigned int count) {
/* Limit to what the gpu supports */
VkSampleCountFlags counts = g_gpu.phys_device_props.limits.framebufferColorSampleCounts &
g_gpu.phys_device_props.limits.framebufferDepthSampleCounts &
g_gpu.phys_device_props.limits.sampledImageColorSampleCounts &
g_gpu.phys_device_props.limits.sampledImageDepthSampleCounts;
while (count > 1 && (counts & count) == 0) {
count >>= 1;
}
return (VkSampleCountFlagBits)count;
}

View File

@@ -1,4 +1,5 @@
#include <malloc.h> #include <malloc.h>
#include <stdbool.h>
#include <stdlib.h> #include <stdlib.h>
#include <string.h> #include <string.h>
@@ -10,6 +11,8 @@
#include "runtime/renderer_api.h" #include "runtime/renderer_api.h"
#include "runtime/runtime.h" #include "runtime/runtime.h"
#define TARGET_API_VERSION VK_API_VERSION_1_2
RT_CVAR_I(r_VkEnableAPIAllocTracking, RT_CVAR_I(r_VkEnableAPIAllocTracking,
"Enable tracking of allocations done by the vulkan api. [0/1] Default: 0", "Enable tracking of allocations done by the vulkan api. [0/1] Default: 0",
0); 0);
@@ -95,7 +98,7 @@ static rt_result CreateInstance(void) {
} }
VkApplicationInfo app_info = { VkApplicationInfo app_info = {
.apiVersion = VK_API_VERSION_1_2, .apiVersion = TARGET_API_VERSION,
.applicationVersion = 0x00001000, .applicationVersion = 0x00001000,
.engineVersion = 0x00001000, .engineVersion = 0x00001000,
.pEngineName = "voyageEngine", .pEngineName = "voyageEngine",
@@ -464,8 +467,58 @@ static rt_result CreateDevice(void) {
return RT_SUCCESS; return RT_SUCCESS;
} }
static rt_result CreateAllocator(void) {
#define SET_FNC(name) fncs.name = name
#define SET_KHR_FNC(name) (fncs).name##KHR = name
VmaVulkanFunctions fncs = {NULL};
SET_FNC(vkGetInstanceProcAddr);
SET_FNC(vkGetDeviceProcAddr);
SET_FNC(vkGetPhysicalDeviceProperties);
SET_FNC(vkGetPhysicalDeviceMemoryProperties);
SET_FNC(vkAllocateMemory);
SET_FNC(vkFreeMemory);
SET_FNC(vkMapMemory);
SET_FNC(vkUnmapMemory);
SET_FNC(vkFlushMappedMemoryRanges);
SET_FNC(vkInvalidateMappedMemoryRanges);
SET_FNC(vkBindBufferMemory);
SET_FNC(vkBindImageMemory);
SET_FNC(vkGetBufferMemoryRequirements);
SET_FNC(vkGetImageMemoryRequirements);
SET_FNC(vkCreateBuffer);
SET_FNC(vkDestroyBuffer);
SET_FNC(vkCreateImage);
SET_FNC(vkDestroyImage);
SET_FNC(vkCmdCopyBuffer);
SET_KHR_FNC(vkGetBufferMemoryRequirements2);
SET_KHR_FNC(vkGetImageMemoryRequirements2);
SET_KHR_FNC(vkBindBufferMemory2);
SET_KHR_FNC(vkBindImageMemory2);
SET_KHR_FNC(vkGetPhysicalDeviceMemoryProperties2);
#undef SET_FNC
#undef SET_KHR_FNC
VmaAllocatorCreateInfo allocator_info = {
.instance = g_gpu.instance,
.physicalDevice = g_gpu.phys_device,
.device = g_gpu.device,
.pAllocationCallbacks = g_gpu.alloc_cb,
.vulkanApiVersion = TARGET_API_VERSION,
.pVulkanFunctions = &fncs,
};
return vmaCreateAllocator(&allocator_info, &g_gpu.allocator) == VK_SUCCESS ? RT_SUCCESS
: RT_UNKNOWN_ERROR;
}
static void DestroyAllocator(void) {
vmaDestroyAllocator(g_gpu.allocator);
}
extern rt_result InitPipelineManagement(void); extern rt_result InitPipelineManagement(void);
extern void ShutdownPipelineManagement(void); extern void ShutdownPipelineManagement(void);
extern rt_result InitRenderTargetManagement(void);
extern void ShutdownRenderTargetManagement(void);
rt_result RT_RENDERER_API_FN(Init)(const rt_renderer_init_info *info) { rt_result RT_RENDERER_API_FN(Init)(const rt_renderer_init_info *info) {
rtLog("vk", "Init"); rtLog("vk", "Init");
@@ -491,9 +544,15 @@ rt_result RT_RENDERER_API_FN(Init)(const rt_renderer_init_info *info) {
if (res != RT_SUCCESS) if (res != RT_SUCCESS)
return res; return res;
res = CreateDevice(); res = CreateDevice();
if (res != RT_SUCCESS)
return res;
res = CreateAllocator();
if (res != RT_SUCCESS) if (res != RT_SUCCESS)
return res; return res;
res = InitPipelineManagement(); res = InitPipelineManagement();
if (res != RT_SUCCESS)
return res;
res = InitRenderTargetManagement();
if (res != RT_SUCCESS) if (res != RT_SUCCESS)
return res; return res;
res = rtCreateSwapchain(); res = rtCreateSwapchain();
@@ -507,7 +566,9 @@ void RT_RENDERER_API_FN(Shutdown)(void) {
rtLog("vk", "Shutdown"); rtLog("vk", "Shutdown");
vkDeviceWaitIdle(g_gpu.device); vkDeviceWaitIdle(g_gpu.device);
rtDestroySwapchain(); rtDestroySwapchain();
ShutdownRenderTargetManagement();
ShutdownPipelineManagement(); ShutdownPipelineManagement();
DestroyAllocator();
vkDestroyDevice(g_gpu.device, g_gpu.alloc_cb); vkDestroyDevice(g_gpu.device, g_gpu.alloc_cb);
vkDestroySurfaceKHR(g_gpu.instance, g_gpu.surface, g_gpu.alloc_cb); vkDestroySurfaceKHR(g_gpu.instance, g_gpu.surface, g_gpu.alloc_cb);
vkDestroyInstance(g_gpu.instance, g_gpu.alloc_cb); vkDestroyInstance(g_gpu.instance, g_gpu.alloc_cb);

View File

@@ -3,7 +3,7 @@
#include <volk/volk.h> #include <volk/volk.h>
#include "runtime/gfx.h" #include "runtime/renderer_api.h"
typedef struct { typedef struct {
VkPipeline pipeline; VkPipeline pipeline;

View File

@@ -0,0 +1,298 @@
#include "runtime/config.h"
#include "runtime/renderer_api.h"
#include "runtime/threading.h"
#include "gpu.h"
#include "render_targets.h"
#include "swapchain.h"
#include <stdlib.h>
#include <volk/volk.h>
RT_CVAR_I(r_VkMaxRenderTargetCount, "Maximum number of render target objects. Default: 1024", 1024);
typedef struct rt_render_target_slot_s {
uint32_t version;
rt_render_target render_target;
struct rt_render_target_slot_s *next_free;
} rt_render_target_slot;
static rt_render_target_slot *_render_targets;
static rt_render_target_slot *_first_free;
static rt_rwlock _lock;
static void DestroyRenderTarget(rt_render_target_slot *slot) {
for (unsigned int i = 0; i < slot->render_target.image_count; ++i) {
vkDestroyImageView(g_gpu.device, slot->render_target.view[i], g_gpu.alloc_cb);
vmaDestroyImage(g_gpu.allocator,
slot->render_target.image[i],
slot->render_target.allocation[i]);
}
slot->next_free = _first_free;
_first_free = slot;
}
static bool CreateImageAndView(VkExtent2D extent,
VkFormat format,
VkSampleCountFlagBits sample_count,
VkImageUsageFlagBits usage,
VkImageAspectFlagBits aspect,
VkImage *p_image,
VmaAllocation *p_allocation,
VkImageView *p_view) {
uint32_t queue_families[3];
uint32_t distinct_queue_families = 1;
queue_families[0] = g_gpu.graphics_family;
if (g_gpu.compute_family != g_gpu.graphics_family)
queue_families[distinct_queue_families++] = g_gpu.compute_family;
if (g_gpu.present_family != g_gpu.graphics_family &&
g_gpu.present_family != g_gpu.compute_family)
queue_families[distinct_queue_families++] = g_gpu.present_family;
VkImageCreateInfo image_info = {
.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
.imageType = VK_IMAGE_TYPE_2D,
.format = format,
.extent = {.width = extent.width, .height = extent.height, .depth = 1},
.mipLevels = 1,
.arrayLayers = 1,
.samples = sample_count,
.tiling = VK_IMAGE_TILING_OPTIMAL,
.usage = usage,
.sharingMode =
(distinct_queue_families > 1) ? VK_SHARING_MODE_CONCURRENT : VK_SHARING_MODE_EXCLUSIVE,
.pQueueFamilyIndices = (distinct_queue_families > 1) ? queue_families : NULL,
.queueFamilyIndexCount = distinct_queue_families,
};
VmaAllocationCreateInfo alloc_info = {
.usage = VMA_MEMORY_USAGE_GPU_ONLY,
};
VkImage image;
VmaAllocation allocation;
if (vmaCreateImage(g_gpu.allocator, &image_info, &alloc_info, &image, &allocation, NULL) !=
VK_SUCCESS) {
return false;
}
VkImageViewCreateInfo view_info = {
.sType = VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
.image = image,
.viewType = VK_IMAGE_VIEW_TYPE_2D,
.format = format,
.components = {.r = VK_COMPONENT_SWIZZLE_IDENTITY,
.g = VK_COMPONENT_SWIZZLE_IDENTITY,
.b = VK_COMPONENT_SWIZZLE_IDENTITY,
.a = VK_COMPONENT_SWIZZLE_IDENTITY},
/* clang-format off */
.subresourceRange = {
.aspectMask = aspect,
.baseArrayLayer = 0,
.baseMipLevel = 0,
.layerCount = 1,
.levelCount = 1,
},
/* clang-format on */
};
VkImageView view;
if (vkCreateImageView(g_gpu.device, &view_info, g_gpu.alloc_cb, &view) != VK_SUCCESS) {
rtLog("VK", "Failed to create render target image view");
vmaDestroyImage(g_gpu.allocator, image, allocation);
return false;
}
*p_image = image;
*p_allocation = allocation;
*p_view = view;
return true;
}
rt_result InitRenderTargetManagement(void) {
rt_create_rwlock_result lock_res = rtCreateRWLock();
if (!lock_res.ok)
return RT_UNKNOWN_ERROR;
_lock = lock_res.lock;
_render_targets = calloc(r_VkMaxRenderTargetCount.i, sizeof(rt_render_target_slot));
if (!_render_targets) {
rtDestroyRWLock(&_lock);
return RT_OUT_OF_MEMORY;
}
/* Keep [0] unused to preserve 0 as the invalid handle */
_first_free = &_render_targets[1];
for (int i = 1; i < r_VkMaxRenderTargetCount.i - 1; ++i) {
_render_targets[i].next_free = &_render_targets[i + 1];
}
return RT_SUCCESS;
}
void ShutdownRenderTargetManagement(void) {
for (int i = 1; i < r_VkMaxRenderTargetCount.i; ++i) {
DestroyRenderTarget(&_render_targets[i]);
}
free(_render_targets);
rtDestroyRWLock(&_lock);
_first_free = NULL;
}
rt_render_target_handle RT_RENDERER_API_FN(CreateRenderTarget)(const rt_render_target_info *info) {
rt_render_target_handle handle = {0};
rtLockWrite(&_lock);
if (!_first_free) {
rtLog("VK", "No free render target slots!");
rtUnlockWrite(&_lock);
return handle;
}
rt_render_target_slot *slot = _first_free;
_first_free = slot->next_free;
slot->version = (slot->version + 1) & RT_GFX_HANDLE_MAX_VERSION;
/* No other thread that calls CreateRenderTarget gets the same slot.
* Another thread accessing the slot via rtGetRenderTarget would get a version mismatch.
* The same holds for DestroyRenderTarget.
*/
rtUnlockWrite(&_lock);
slot->render_target.match_swapchain = 0;
slot->render_target.image_count = g_swapchain.image_count;
for (unsigned int i = 0; i < g_swapchain.image_count; ++i) {
uint32_t width = info->width, height = info->height;
if (width == RT_RENDER_TARGET_SIZE_SWAPCHAIN) {
width = g_swapchain.extent.width;
slot->render_target.match_swapchain |= RT_RENDER_TARGET_MATCH_SWAPCHAIN_SIZE;
}
if (height == RT_RENDER_TARGET_SIZE_SWAPCHAIN) {
height = g_swapchain.extent.height;
slot->render_target.match_swapchain |= RT_RENDER_TARGET_MATCH_SWAPCHAIN_SIZE;
}
slot->render_target.extent = (VkExtent2D){.width = width, .height = height};
if (info->format != RT_PIXEL_FORMAT_SWAPCHAIN)
slot->render_target.format = rtPixelFormatToVkFormat(info->format);
else {
slot->render_target.format = g_swapchain.format;
slot->render_target.match_swapchain |= RT_RENDER_TARGET_MATCH_SWAPCHAIN_FORMAT;
}
if (info->format == RT_PIXEL_FORMAT_DEPTH24_STENCIL8) {
slot->render_target.usage = VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT |
VK_IMAGE_USAGE_SAMPLED_BIT |
VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT;
slot->render_target.aspect = VK_IMAGE_ASPECT_DEPTH_BIT;
} else {
slot->render_target.usage = VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT |
VK_IMAGE_USAGE_SAMPLED_BIT |
VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT;
slot->render_target.aspect = VK_IMAGE_ASPECT_COLOR_BIT;
}
slot->render_target.sample_count = rtSampleCountToFlags(info->sample_count);
if (!CreateImageAndView(slot->render_target.extent,
slot->render_target.format,
slot->render_target.sample_count,
slot->render_target.usage,
slot->render_target.aspect,
&slot->render_target.image[i],
&slot->render_target.allocation[i],
&slot->render_target.view[i])) {
slot->render_target.image_count = i;
DestroyRenderTarget(slot);
goto out;
}
}
handle.version = slot->version;
handle.index = (uint32_t)(slot - _render_targets);
out:
return handle;
}
void RT_RENDERER_API_FN(DestroyRenderTarget)(rt_render_target_handle handle) {
if (handle.index >= (uint32_t)r_VkMaxRenderTargetCount.i)
return;
rtLockWrite(&_lock);
if (_render_targets[handle.index].version == handle.version)
DestroyRenderTarget(&_render_targets[handle.index]);
else
rtLog("VK", "Tried to destroy a render target using an outdated handle.");
rtUnlockWrite(&_lock);
}
rt_render_target *rtGetRenderTarget(rt_render_target_handle handle) {
if (handle.index >= (uint32_t)r_VkMaxRenderTargetCount.i)
return NULL;
rtLockRead(&_lock);
rt_render_target *res = NULL;
if (_render_targets[handle.index].version == handle.version)
res = &_render_targets[handle.index].render_target;
else
rtLog("VK", "Tried to access a render target using an outdated handle.");
rtUnlockRead(&_lock);
return res;
}
void rtUpdateRenderTargetsFromSwapchain(uint32_t image_count, VkFormat format, VkExtent2D extent) {
rtLockWrite(&_lock);
for (uint32_t i = 1; i < (uint32_t)r_VkMaxRenderTargetCount.i; ++i) {
if (_render_targets[i].render_target.image_count == 0)
continue;
rt_render_target *render_target = &_render_targets[i].render_target;
if (render_target->match_swapchain != 0) {
for (uint32_t j = 0; j < render_target->image_count; ++j) {
vkDestroyImageView(g_gpu.device, render_target->view[j], g_gpu.alloc_cb);
vmaDestroyImage(g_gpu.allocator,
render_target->image[j],
render_target->allocation[j]);
}
if ((render_target->match_swapchain & RT_RENDER_TARGET_MATCH_SWAPCHAIN_FORMAT) != 0) {
render_target->format = format;
}
if ((render_target->match_swapchain & RT_RENDER_TARGET_MATCH_SWAPCHAIN_SIZE) != 0) {
render_target->extent = extent;
}
for (uint32_t j = 0; j < image_count; ++j) {
if (!CreateImageAndView(render_target->extent,
render_target->format,
render_target->sample_count,
render_target->usage,
render_target->aspect,
&render_target->image[j],
&render_target->allocation[j],
&render_target->view[j])) {
render_target->image_count = j;
DestroyRenderTarget(&_render_targets[i]);
rtReportError("VK", "Failed to recreate swapchain-matching render target");
break;
}
}
} else if (render_target->image_count < image_count) {
/* Create additional images */
for (uint32_t j = render_target->image_count; j < image_count; ++j) {
if (!CreateImageAndView(render_target->extent,
render_target->format,
render_target->sample_count,
render_target->usage,
render_target->aspect,
&render_target->image[j],
&render_target->allocation[j],
&render_target->view[j])) {
render_target->image_count = j;
DestroyRenderTarget(&_render_targets[i]);
rtReportError("VK", "Failed to create additional render target images");
break;
}
}
} else if (render_target->image_count > image_count) {
/* Delete unnecessary images */
for (uint32_t j = image_count; j < render_target->image_count; ++j) {
vkDestroyImageView(g_gpu.device, render_target->view[j], g_gpu.alloc_cb);
vmaDestroyImage(g_gpu.allocator,
render_target->image[j],
render_target->allocation[j]);
}
}
render_target->image_count = image_count;
}
rtUnlockWrite(&_lock);
}

View File

@@ -0,0 +1,32 @@
#ifndef RT_VK_RENDER_TARGETS_H
#define RT_VK_RENDER_TARGETS_H
#include "gpu.h"
#include "runtime/renderer_api.h"
/* Must match RT_VK_MAX_SWAPCHAIN_IMAGES */
#define RT_VK_RENDER_TARGET_MAX_IMAGES 3
typedef enum {
RT_RENDER_TARGET_MATCH_SWAPCHAIN_SIZE = 0x01,
RT_RENDER_TARGET_MATCH_SWAPCHAIN_FORMAT = 0x02,
} rt_render_target_match_swapchain_flags;
typedef struct {
VkImage image[RT_VK_RENDER_TARGET_MAX_IMAGES];
VkImageView view[RT_VK_RENDER_TARGET_MAX_IMAGES];
VmaAllocation allocation[RT_VK_RENDER_TARGET_MAX_IMAGES];
VkSampleCountFlagBits sample_count;
VkFormat format;
VkExtent2D extent;
VkImageUsageFlagBits usage;
VkImageAspectFlags aspect;
unsigned int image_count;
rt_render_target_match_swapchain_flags match_swapchain;
} rt_render_target;
rt_render_target *rtGetRenderTarget(rt_render_target_handle handle);
void rtUpdateRenderTargetsFromSwapchain(uint32_t image_count, VkFormat format, VkExtent2D extent);
#endif

View File

@@ -121,6 +121,12 @@ rt_result rtCreateSwapchain(void) {
return 50; return 50;
} }
g_swapchain.format = device_params.surface_format.format; g_swapchain.format = device_params.surface_format.format;
g_swapchain.extent = device_params.extent;
/* Retrieve images */ /* Retrieve images */
g_swapchain.image_count = 0; g_swapchain.image_count = 0;

View File

@@ -13,6 +13,7 @@ typedef struct {
VkImageView image_views[RT_VK_MAX_SWAPCHAIN_IMAGES]; VkImageView image_views[RT_VK_MAX_SWAPCHAIN_IMAGES];
uint32_t image_count; uint32_t image_count;
VkFormat format; VkFormat format;
VkExtent2D extent;
} rt_swapchain; } rt_swapchain;
#ifndef RT_VK_DONT_DEFINE_SWAPCHAIN_GLOBAL #ifndef RT_VK_DONT_DEFINE_SWAPCHAIN_GLOBAL

View File

@@ -0,0 +1,7 @@
#pragma warning(push, 0)
#include <volk/volk.h>
#define VMA_STATIC_VULKAN_FUNCTIONS 0
#define VMA_DYNAMIC_VULKAN_FUNCTIONS 0
#define VMA_IMPLEMENTATION
#include <vma/vk_mem_alloc.h>
#pragma warning(pop)

View File

@@ -5,6 +5,8 @@
#include "gfx.h" #include "gfx.h"
#include "renderer_api.h" #include "renderer_api.h"
#include <stdbool.h>
RT_CVAR_I(rt_Fullscreen, "Show window in fullscreen mode. [0/1] Default: 0", 0); RT_CVAR_I(rt_Fullscreen, "Show window in fullscreen mode. [0/1] Default: 0", 0);
RT_CVAR_I(rt_WindowWidth, "Window width. Default: 1024", 1024); RT_CVAR_I(rt_WindowWidth, "Window width. Default: 1024", 1024);
RT_CVAR_I(rt_WindowHeight, "Window height. Default: 768", 768); RT_CVAR_I(rt_WindowHeight, "Window height. Default: 768", 768);
@@ -24,7 +26,7 @@ static LRESULT CALLBACK win32WndProc(HWND hWnd, UINT uMsg, WPARAM wParam, LPARAM
} }
RT_DLLEXPORT int RT_DLLEXPORT int
rtWin32Entry(HINSTANCE hInstance, HINSTANCE hPrevInstance, PWSTR pCmdLine, int nCmdShow) { rtWin32Entry(HINSTANCE hInstance, HINSTANCE hPrevInstance, PWSTR pCmdLine, int nCmdShow, rt_app_callbacks app_callbacks) {
if (rtInitRuntime() != RT_SUCCESS) if (rtInitRuntime() != RT_SUCCESS)
return 1; return 1;
@@ -90,6 +92,8 @@ rtWin32Entry(HINSTANCE hInstance, HINSTANCE hPrevInstance, PWSTR pCmdLine, int n
return 1; return 1;
} }
app_callbacks.Init();
/* Main Loop */ /* Main Loop */
bool keep_running = true; bool keep_running = true;
while (keep_running) { while (keep_running) {
@@ -104,6 +108,8 @@ rtWin32Entry(HINSTANCE hInstance, HINSTANCE hPrevInstance, PWSTR pCmdLine, int n
} }
} }
app_callbacks.Shutdown();
rtShutdownGFX(); rtShutdownGFX();
DestroyWindow(wnd); DestroyWindow(wnd);
@@ -154,7 +160,7 @@ static void xlibSetFullscreen(Display *dpy, int screen, Window window, bool enab
#undef EVENT_SOURCE_APPLICATION #undef EVENT_SOURCE_APPLICATION
} }
RT_DLLEXPORT int rtXlibEntry(int argc, char **argv) { RT_DLLEXPORT int rtXlibEntry(int argc, char **argv, rt_app_callbacks app_callbacks) {
if (rtInitRuntime() != RT_SUCCESS) if (rtInitRuntime() != RT_SUCCESS)
return 1; return 1;
@@ -201,6 +207,8 @@ RT_DLLEXPORT int rtXlibEntry(int argc, char **argv) {
return 1; return 1;
} }
app_callbacks.Init();
/* Main Loop */ /* Main Loop */
bool keep_running = true; bool keep_running = true;
while (keep_running) { while (keep_running) {
@@ -226,6 +234,8 @@ RT_DLLEXPORT int rtXlibEntry(int argc, char **argv) {
} }
} }
app_callbacks.Shutdown();
rtShutdownGFX(); rtShutdownGFX();
XDestroyWindow(dpy, window); XDestroyWindow(dpy, window);
XCloseDisplay(dpy); XCloseDisplay(dpy);

View File

@@ -9,6 +9,18 @@
extern "C" { extern "C" {
#endif #endif
typedef void rt_app_init_fn(void);
typedef void rt_app_shutdown_fn(void);
typedef struct {
/* Called after the runtime has finished initialization and
* before entering the main-loop */
rt_app_init_fn *Init;
/* Called after the main-loop and before the runtime starts its shutdown. */
rt_app_shutdown_fn *Shutdown;
} rt_app_callbacks;
#ifdef _WIN32 #ifdef _WIN32
/* Forward declared here, to avoid including windows.h */ /* Forward declared here, to avoid including windows.h */
@@ -17,11 +29,12 @@ struct HINSTANCE__;
RT_DLLEXPORT int rtWin32Entry(struct HINSTANCE__ *hInstance, RT_DLLEXPORT int rtWin32Entry(struct HINSTANCE__ *hInstance,
struct HINSTANCE__ *hPrevInstance, struct HINSTANCE__ *hPrevInstance,
wchar_t *pCmdLine, wchar_t *pCmdLine,
int nCmdShow); int nCmdShow,
rt_app_callbacks app_callbacks);
#elif defined(RT_USE_XLIB) #elif defined(RT_USE_XLIB)
RT_DLLEXPORT int rtXlibEntry(int argc, char **argv); RT_DLLEXPORT int rtXlibEntry(int argc, char **argv, rt_app_callbacks app_callbacks);
#endif #endif

View File

@@ -9,16 +9,99 @@
* - object renderer (for static models) * - object renderer (for static models)
*/ */
#include <stdbool.h>
#include <stdint.h> #include <stdint.h>
#include "runtime.h" #include "runtime.h"
#include "resources.h"
#ifdef __cplusplus #ifdef __cplusplus
extern "C" { extern "C" {
#endif #endif
typedef enum {
RT_PIXEL_FORMAT_INVALID,
RT_PIXEL_FORMAT_R8G8B8A8_SRGB,
RT_PIXEL_FORMAT_DEPTH24_STENCIL8,
/* Special value indicating whichever format the swapchain uses */
RT_PIXEL_FORMAT_SWAPCHAIN,
RT_PIXEL_FORMAT_count,
} rt_pixel_format;
/* Special value for the .width and .height fields of rt_render_target_info
* to indicate that these should be set to the width or height of the swapchain, respectively. */
#define RT_RENDER_TARGET_SIZE_SWAPCHAIN 0
/* 32 bit string hashes */
typedef uint32_t rt_render_target_id;
typedef uint32_t rt_render_pass_id;
typedef struct {
rt_render_target_id id;
rt_pixel_format format;
uint32_t width;
uint32_t height;
uint32_t sample_count;
} rt_render_target_info;
typedef enum {
RT_RENDER_TARGET_READ_INPUT_ATTACHMENT,
RT_RENDER_TARGET_READ_SAMPLED,
RT_RENDER_TARGET_READ_count,
} rt_render_target_read_mode;
typedef struct {
rt_render_target_id render_target;
rt_render_target_read_mode mode;
} rt_render_target_read;
typedef enum {
/* Clears the render target with the clear value before executing the pass */
RT_RENDER_TARGET_WRITE_CLEAR = 0x01,
/* Discards the written values after the pass has finished executing */
RT_RENDER_TARGET_WRITE_DISCARD = 0x02,
} rt_render_target_write_flags;
typedef struct {
rt_render_target_id render_target;
union {
float color[4];
struct {
float depth;
int32_t stencil;
} depth_stencil;
} clear;
rt_render_target_write_flags flags;
} rt_render_target_write;
typedef struct {
rt_render_pass_id id;
/* list of rt_render_target_reads */
rt_relptr read_render_targets;
/* list of rt_render_target_writes */
rt_relptr write_render_targets;
uint32_t read_render_target_count;
uint32_t write_render_target_count;
} rt_render_pass_info;
typedef struct {
rt_relptr render_targets;
rt_relptr render_passes;
uint32_t render_target_count;
uint32_t render_pass_count;
} rt_framegraph_info;
typedef void rt_render_pass_prepare_fn(rt_render_pass_id id);
typedef void rt_render_pass_execute_fn(rt_render_pass_id id);
typedef void rt_render_pass_finalize_fn(rt_render_pass_id id);
typedef struct {
rt_render_pass_prepare_fn *Prepare;
rt_render_pass_execute_fn *Execute;
rt_render_pass_finalize_fn *Finalize;
} rt_render_pass_bind_fns;
/* In renderer_api.h -> Not necessary for almost all gfx usage */ /* In renderer_api.h -> Not necessary for almost all gfx usage */
typedef struct rt_renderer_init_info_s rt_renderer_init_info; typedef struct rt_renderer_init_info_s rt_renderer_init_info;
@@ -28,6 +111,26 @@ RT_DLLEXPORT rt_result rtInitGFX(rt_renderer_init_info *renderer_info);
RT_DLLEXPORT void rtShutdownGFX(void); RT_DLLEXPORT void rtShutdownGFX(void);
/* Framegraph API
*
* The framegraph is used to organize and schedule the work for a frame.
*/
typedef struct rt_framegraph_s rt_framegraph;
RT_DLLEXPORT rt_framegraph *rtCreateFramegraph(const rt_framegraph_info *info);
RT_DLLEXPORT void rtDestroyFramegraph(rt_framegraph *framegraph);
RT_DLLEXPORT void rtBindRenderPass(rt_framegraph *framegraph, rt_render_pass_id pass, const rt_render_pass_bind_fns *bind_fns);
/* Utility to turn a string into a usable render target id. */
RT_DLLEXPORT rt_render_target_id rtCalculateRenderTargetID(const char *name, size_t len);
/* Utility to turn a string into a usable render pass id. */
RT_DLLEXPORT rt_render_pass_id rtCalculateRenderPassID(const char *name, size_t len);
#ifdef __cplusplus #ifdef __cplusplus
} }
#endif #endif

View File

@@ -0,0 +1,398 @@
#include "config.h"
#include "gfx.h"
#include "handles.h"
#include "hashing.h"
#include "mem_arena.h"
#include "renderer_api.h"
#include "threading.h"
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
RT_CVAR_I(rt_MaxFramegraphs, "Maximum number of framegraphs. Default: 16", 16);
#define RT_FRAMEGRAPH_MAX_PASSES 32
#define RT_FRAMEGRAPH_MAX_RENDER_TARGETS 32
#define RT_RENDERPASS_MAX_READS 8
#define RT_RENDERPASS_MAX_WRITES 8
typedef struct {
rt_pixel_format format;
unsigned int width;
unsigned int height;
unsigned int sample_count;
rt_render_target_handle api_render_target;
} rt_render_target;
typedef struct {
int execution_level;
unsigned int read_count;
unsigned int write_count;
rt_render_pass_bind_fns bound_fns;
rt_render_target_read reads[RT_RENDERPASS_MAX_READS];
rt_render_target_write writes[RT_RENDERPASS_MAX_WRITES];
} rt_render_pass;
struct rt_framegraph_s {
uint32_t pass_count;
uint32_t render_target_count;
rt_framegraph *next_free;
rt_render_pass_id pass_ids[RT_FRAMEGRAPH_MAX_PASSES];
rt_render_pass passes[RT_FRAMEGRAPH_MAX_PASSES];
rt_render_target_id render_target_ids[RT_FRAMEGRAPH_MAX_RENDER_TARGETS];
rt_render_target render_targets[RT_FRAMEGRAPH_MAX_RENDER_TARGETS];
};
static rt_framegraph *_framegraphs;
static rt_framegraph *_first_free;
static rt_mutex *_free_list_lock;
static void ReturnFrameGraph(rt_framegraph *framegraph) {
rtLockMutex(_free_list_lock);
framegraph->next_free = _first_free;
_first_free = framegraph;
rtUnlockMutex(_free_list_lock);
}
rt_result InitFramegraphManager(void) {
_free_list_lock = rtCreateMutex();
if (!_free_list_lock)
return RT_UNKNOWN_ERROR;
_framegraphs = calloc((size_t)rt_MaxFramegraphs.i, sizeof(rt_framegraph));
if (!_framegraphs)
return RT_OUT_OF_MEMORY;
for (int i = 0; i < rt_MaxFramegraphs.i; ++i)
_framegraphs[i].next_free = (i < rt_MaxFramegraphs.i - 1) ? &_framegraphs[i + 1] : NULL;
_first_free = &_framegraphs[0];
return RT_SUCCESS;
}
void ShutdownFramegraphManager(void) {
free(_framegraphs);
rtDestroyMutex(_free_list_lock);
}
typedef struct {
unsigned int dependency_count;
int execution_level;
} rt_pass_construct;
static int CompareRenderPassExecutionLevels(const void *a, const void *b) {
const rt_render_pass *pass_a = a, *pass_b = b;
return pass_a->execution_level - pass_b->execution_level;
}
static bool
CreateRenderPasses(rt_framegraph *graph, const rt_framegraph_info *info, rt_arena *arena) {
uint32_t render_pass_count = info->render_pass_count;
bool result = false;
/* Pass A depends on pass B, if:
* B precedes A in the list of render passes AND
* B writes to a render target that A reads from. */
bool *dependency_matrix =
rtArenaPushZero(arena, render_pass_count * render_pass_count * sizeof(bool));
if (!dependency_matrix) {
rtLog("GFX",
"Not enough memory to allocate a %ux%u dependency matrix.",
render_pass_count,
render_pass_count);
goto out;
}
/* Checks if pass "dependent_idx" depends on pass "dependency_idx" */
#define PASS_DEPENDS(dependent_idx, dependency_idx) \
dependency_matrix[(dependency_idx)*render_pass_count + (dependent_idx)]
rt_pass_construct *construct_passes =
RT_ARENA_PUSH_ARRAY_ZERO(arena, rt_pass_construct, render_pass_count);
if (!construct_passes) {
rtLog("GFX",
"Not enough memory to allocate construction information for %u passes.",
render_pass_count);
goto out;
}
const rt_render_pass_info *pass_info = rtResolveConstRelptr(&info->render_passes);
for (uint32_t i = 0; i < render_pass_count; ++i) {
construct_passes[i].execution_level = -1; /* not scheduled yet */
const rt_render_target_write *writes_i =
rtResolveConstRelptr(&pass_info[i].write_render_targets);
for (uint32_t j = i + 1; j < render_pass_count; ++j) {
const rt_render_target_read *reads_j =
rtResolveConstRelptr(&pass_info[j].read_render_targets);
bool depends = false;
for (uint32_t read_idx = 0; read_idx < pass_info[j].read_render_target_count;
++read_idx) {
for (uint32_t write_idx = 0; write_idx < pass_info[i].write_render_target_count;
++write_idx) {
if (writes_i[write_idx].render_target == reads_j[read_idx].render_target)
depends = true;
}
}
PASS_DEPENDS(j, i) = depends;
if (depends)
++construct_passes[j].dependency_count;
}
}
/* Pass A can be executed concurrently with pass B if:
* 1. A and B don't write to the same render target AND
* 2. A's dependencies and B's dependencies have finished executing. */
/* We can have at most render_pass_count execution levels */
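/* Worked example (hypothetical passes): A and B have no dependencies and
 * write disjoint targets, so both land on level 0. C reads the outputs of A
 * and B; once level 0 is scheduled, C's dependency_count drops to 0 and it
 * lands on level 1. D reads C's output and lands on level 2. All passes that
 * share a level may execute concurrently. */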
uint32_t *level_passes = RT_ARENA_PUSH_ARRAY_ZERO(arena, uint32_t, render_pass_count);
if (!level_passes) {
rtLog("GFX", "Failed to allocate a temporary array for constructing execution levels.");
goto out;
}
uint32_t unscheduled_passes = render_pass_count;
for (int level = 0; level < (int)render_pass_count; ++level) {
unsigned int level_pass_count = 0;
for (uint32_t i = 0; i < render_pass_count; ++i) {
if (construct_passes[i].execution_level == -1 &&
construct_passes[i].dependency_count == 0) {
/* Check that no writes conflict */
bool write_conflict = false;
const rt_render_target_write *writes_i =
rtResolveConstRelptr(&pass_info[i].write_render_targets);
for (unsigned int j = 0; j < level_pass_count; ++j) {
uint32_t pass_idx = level_passes[j]; /* compare against passes already scheduled on this level */
const rt_render_target_write *pass_writes =
rtResolveConstRelptr(&pass_info[pass_idx].write_render_targets);
for (uint32_t k = 0; k < pass_info[i].write_render_target_count; ++k) {
for (uint32_t l = 0; l < pass_info[pass_idx].write_render_target_count;
++l) {
if (writes_i[k].render_target == pass_writes[l].render_target) {
write_conflict = true;
break;
}
}
if (write_conflict)
break;
}
if (write_conflict)
break;
}
if (!write_conflict) {
RT_ASSERT(level_pass_count < render_pass_count, "Scheduled more passes in one level than exist in the graph.");
level_passes[level_pass_count++] = i;
construct_passes[i].execution_level = level;
}
}
}
if (level_pass_count == 0) {
rtLog("GFX", "Failed to compute a valid schedule for the provided framegraph.");
goto out;
}
/* level_passes now contains the passes we can execute concurrently.
 * Decrement the dependency count of every pass that depends on a pass in this level. */
for (uint32_t i = 0; i < level_pass_count; ++i) {
for (uint32_t j = 0; j < render_pass_count; ++j) {
if (PASS_DEPENDS(j, level_passes[i]))
--construct_passes[j].dependency_count;
}
}
unscheduled_passes -= level_pass_count;
if (unscheduled_passes == 0)
break;
}
RT_ASSERT(unscheduled_passes == 0, "Did not schedule all passes");
/* construct_passes now contains the "execution level" for each pass.
 * We execute passes in ascending level order; passes that share an execution level can be
 * executed concurrently. */
graph->pass_count = render_pass_count;
/* Emit passes in execution-level order and fill pass_ids in sync with passes,
 * so that lookups by id (e.g. rtBindRenderPass) find the right slot. */
uint32_t out_idx = 0;
for (int level = 0; out_idx < render_pass_count && level < (int)render_pass_count; ++level) {
for (uint32_t i = 0; i < render_pass_count; ++i) {
if (construct_passes[i].execution_level != level)
continue;
graph->pass_ids[out_idx] = pass_info[i].id;
graph->passes[out_idx].execution_level = level;
const rt_render_target_write *writes =
rtResolveConstRelptr(&pass_info[i].write_render_targets);
const rt_render_target_read *reads =
rtResolveConstRelptr(&pass_info[i].read_render_targets);
memcpy(graph->passes[out_idx].writes,
writes,
pass_info[i].write_render_target_count * sizeof(rt_render_target_write));
memcpy(graph->passes[out_idx].reads,
reads,
pass_info[i].read_render_target_count * sizeof(rt_render_target_read));
graph->passes[out_idx].write_count = pass_info[i].write_render_target_count;
graph->passes[out_idx].read_count = pass_info[i].read_render_target_count;
++out_idx;
}
}
result = true;
out:
return result;
#undef PASS_DEPENDS
}
static bool
CreateRenderTargets(rt_framegraph *graph, const rt_framegraph_info *info, rt_arena *arena) {
bool result = false;
/* TODO(Kevin): determine aliasing opportunities */
const rt_render_target_info *render_targets = rtResolveConstRelptr(&info->render_targets);
graph->render_target_count = info->render_target_count;
for (uint32_t i = 0; i < info->render_target_count; ++i) {
graph->render_target_ids[i] = render_targets[i].id;
graph->render_targets[i].format = render_targets[i].format;
graph->render_targets[i].width = render_targets[i].width;
graph->render_targets[i].height = render_targets[i].height;
graph->render_targets[i].sample_count = render_targets[i].sample_count;
graph->render_targets[i].api_render_target =
g_renderer.CreateRenderTarget(&render_targets[i]);
if (!RT_IS_HANDLE_VALID(graph->render_targets[i].api_render_target)) {
rtReportError("GFX", "Failed to create render target %u of framegraph.", i);
for (uint32_t j = 0; j < i; ++j)
g_renderer.DestroyRenderTarget(graph->render_targets[j].api_render_target);
goto out;
}
}
result = true;
out:
return result;
}
static bool ValidateInfo(const rt_framegraph_info *info) {
if (info->render_pass_count > RT_FRAMEGRAPH_MAX_PASSES) {
rtReportError("GFX",
"Framegraph has too many passes: %u (maximum allowed is %u)",
info->render_pass_count,
RT_FRAMEGRAPH_MAX_PASSES);
return false;
}
if (info->render_target_count > RT_FRAMEGRAPH_MAX_RENDER_TARGETS) {
rtReportError("GFX",
"Framegraph has too many render targets: %u (maximum allowed is %u)",
info->render_target_count,
RT_FRAMEGRAPH_MAX_RENDER_TARGETS);
return false;
}
const rt_render_target_info *render_targets = rtResolveConstRelptr(&info->render_targets);
for (uint32_t i = 0; i < info->render_target_count; ++i) {
if (render_targets[i].id == 0) {
rtReportError("GFX", "Framegraph render target %u has invalid id 0", i);
return false;
} else if ((render_targets[i].width == RT_RENDER_TARGET_SIZE_SWAPCHAIN ||
render_targets[i].height == RT_RENDER_TARGET_SIZE_SWAPCHAIN) &&
(render_targets[i].width != render_targets[i].height)) {
rtReportError("GFX",
"Framegraph render target %u: If width or height is set to "
"SWAPCHAIN, both values must be set to SWAPCHAIN.",
i);
return false;
} else if (render_targets[i].format >= RT_PIXEL_FORMAT_count) {
rtReportError("GFX",
"Framegraph render target %u format is outside the allowed range.",
i);
return false;
}
}
const rt_render_pass_info *passes = rtResolveConstRelptr(&info->render_passes);
for (uint32_t i = 0; i < info->render_pass_count; ++i) {
if (passes[i].id == 0) {
rtReportError("GFX", "Framegraph pass %u has invalid id 0", i);
return false;
} else if (passes[i].read_render_target_count > RT_RENDERPASS_MAX_READS) {
rtReportError(
"GFX",
"Framegraph pass %u reads too many render targets: %u (maximum allowed is %u)",
i,
passes[i].read_render_target_count,
RT_RENDERPASS_MAX_READS);
return false;
} else if (passes[i].write_render_target_count > RT_RENDERPASS_MAX_WRITES) {
rtReportError(
"GFX",
"Framegraph pass %u writes too many render targets: %u (maximum allowed is %u)",
i,
passes[i].write_render_target_count,
RT_RENDERPASS_MAX_WRITES);
return false;
}
}
return true;
}
RT_DLLEXPORT rt_framegraph *rtCreateFramegraph(const rt_framegraph_info *info) {
if (!ValidateInfo(info)) {
return NULL;
}
rt_temp_arena temp = rtGetTemporaryArena(NULL, 0);
if (!temp.arena) {
rtReportError("GFX", "Failed to acquire a temporary arena for constructing a framegraph");
return NULL;
}
rt_framegraph *graph = NULL;
/* Acquire an unused framegraph */
rtLockMutex(_free_list_lock);
graph = _first_free;
if (graph)
_first_free = graph->next_free;
rtUnlockMutex(_free_list_lock);
if (!graph)
goto out;
memset(graph, 0, sizeof(*graph));
if (!CreateRenderPasses(graph, info, temp.arena)) {
ReturnFrameGraph(graph);
graph = NULL;
goto out;
}
if (!CreateRenderTargets(graph, info, temp.arena)) {
ReturnFrameGraph(graph);
graph = NULL;
goto out;
}
out:
rtReturnTemporaryArena(temp);
return graph;
}
RT_DLLEXPORT void rtDestroyFramegraph(rt_framegraph *framegraph) {
ReturnFrameGraph(framegraph);
}
RT_DLLEXPORT void rtBindRenderPass(rt_framegraph *framegraph,
rt_render_pass_id id,
const rt_render_pass_bind_fns *bind_fns) {
for (uint32_t i = 0; i < framegraph->pass_count; ++i) {
if (framegraph->pass_ids[i] == id) {
if (framegraph->passes[i].bound_fns.Execute)
rtLog("GFX", "Rebound pass %x to new functions", id);
framegraph->passes[i].bound_fns = *bind_fns;
return;
}
}
rtLog("GFX", "Tried to bind functions to unknown render pass %x", id);
}
RT_DLLEXPORT rt_render_target_id rtCalculateRenderTargetID(const char *name, size_t len) {
rt_render_target_id id = rtHashBytes32(name, len);
/* 0 is reserved as the invalid id; remap a zero hash to ~0u. */
if (id == 0)
id = ~id;
return id;
}
RT_DLLEXPORT rt_render_pass_id rtCalculateRenderPassID(const char *name, size_t len) {
rt_render_pass_id id = rtHashBytes32(name, len);
if (id == 0)
id = ~id;
return id;
}
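/* Usage sketch (illustrative only; "my_pass", MyExecutePass, and the bind-fns
 * initialization are hypothetical, and the rt_framegraph_info is assumed to be
 * filled in by the caller):
 *
 *     rt_framegraph *fg = rtCreateFramegraph(&info);
 *     if (fg) {
 *         rt_render_pass_bind_fns fns = {0};
 *         fns.Execute = MyExecutePass;
 *         rtBindRenderPass(fg, rtCalculateRenderPassID("my_pass", 7), &fns);
 *     }
 *     ...
 *     rtDestroyFramegraph(fg);
 */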

View File

@@ -1,3 +1,4 @@
+#include <stdbool.h>
#include <string.h>
#define RT_DONT_DEFINE_RENDERER_GLOBAL
@@ -19,17 +20,20 @@ static bool _renderer_loaded = false;
RT_CVAR_S(rt_Renderer, "Select the render backend. Available options: [vk], Default: vk", "vk");
-extern rt_result InitObjectRenderer(void);
-extern void ShutdownObjectRenderer(void);
#ifdef RT_STATIC_LIB
extern void RT_RENDERER_API_FN(RegisterCVars)(void);
extern rt_result RT_RENDERER_API_FN(Init)(const rt_renderer_init_info *);
extern void RT_RENDERER_API_FN(Shutdown)(void);
extern rt_pipeline_handle RT_RENDERER_API_FN(CompilePipeline)(const rt_pipeline_info *);
-extern void RT_RENDERER_API_FN(DestroyPipeline)(rt_pipeline_handle handle);
+extern void RT_RENDERER_API_FN(DestroyPipeline)(rt_pipeline_handle);
+extern rt_render_target_handle
+RT_RENDERER_API_FN(CreateRenderTarget)(const rt_render_target_info *);
+extern void RT_RENDERER_API_FN(DestroyRenderTarget)(rt_render_target_handle);
#endif
+extern rt_result InitFramegraphManager(void);
+extern void ShutdownFramegraphManager(void);
static bool LoadRenderer(void) {
#if !defined(RT_STATIC_LIB)
@@ -53,6 +57,8 @@ static bool LoadRenderer(void) {
RETRIEVE_SYMBOL(Shutdown, rt_shutdown_renderer_fn);
RETRIEVE_SYMBOL(CompilePipeline, rt_compile_pipeline_fn);
RETRIEVE_SYMBOL(DestroyPipeline, rt_destroy_pipeline_fn);
+RETRIEVE_SYMBOL(CreateRenderTarget, rt_create_render_target_fn);
+RETRIEVE_SYMBOL(DestroyRenderTarget, rt_destroy_render_target_fn);
} else {
rtReportError("GFX",
"Unsupported renderer backend: (%s) %s",
@@ -67,6 +73,8 @@ static bool LoadRenderer(void) {
g_renderer.Shutdown = &rtRenShutdown;
g_renderer.CompilePipeline = &rtRenCompilePipeline;
g_renderer.DestroyPipeline = &rtRenDestroyPipeline;
+g_renderer.CreateRenderTarget = &rtRenCreateRenderTarget;
+g_renderer.DestroyRenderTarget = &rtRenDestroyRenderTarget;
#endif
return true;
}
@@ -87,17 +95,18 @@ RT_DLLEXPORT rt_result rtInitGFX(rt_renderer_init_info *renderer_info) {
g_renderer.RegisterCVars();
}
-if (g_renderer.Init(renderer_info) != RT_SUCCESS)
-return RT_UNKNOWN_ERROR;
-rt_result result = RT_SUCCESS;
-if ((result = InitObjectRenderer()) != RT_SUCCESS)
+rt_result result;
+if ((result = g_renderer.Init(renderer_info)) != RT_SUCCESS)
return result;
-return RT_SUCCESS;
+if ((result = InitFramegraphManager()) != RT_SUCCESS)
+return result;
+return result;
}
RT_DLLEXPORT void rtShutdownGFX(void) {
-ShutdownObjectRenderer();
+ShutdownFramegraphManager();
g_renderer.Shutdown();
}

View File

@@ -1,59 +0,0 @@
#include "renderer_api.h"
#include "mem_arena.h"
#include "handles.h"
typedef struct {
rt_pipeline_handle pipeline;
} rt_object_renderer;
static rt_object_renderer _object_renderer;
#define PIPELINE_ID 0xdee414bba9b4f5bdLL
rt_result InitObjectRenderer(void) {
rt_result result = RT_SUCCESS;
rt_temp_arena temp = rtGetTemporaryArena(NULL, 0);
if (!temp.arena) {
result = RT_OUT_OF_MEMORY;
goto out;
}
/* Init the pipeline */
size_t pipeline_size = rtGetResourceSize(PIPELINE_ID);
if (pipeline_size == 0) {
rtReportError("GFX", "Failed to determine size of object pipeline %llx", PIPELINE_ID);
result = RT_INVALID_VALUE;
goto out;
}
rt_resource *pipeline_resource = rtArenaPush(temp.arena, pipeline_size);
if (!pipeline_resource) {
rtReportError("GFX", "Failed to allocate memory for object pipeline %llx", PIPELINE_ID);
result = RT_OUT_OF_MEMORY;
goto out;
}
result = rtGetResource(PIPELINE_ID, pipeline_resource);
if (result != RT_SUCCESS) {
rtReportError("GFX", "Failed to load the object pipeline %llx", PIPELINE_ID);
goto out;
}
rt_pipeline_info *info = pipeline_resource->data;
if (!info) {
rtReportError("GFX", "Malformed object pipeline %llx (missing pipeline_info)", PIPELINE_ID);
result = RT_INVALID_VALUE;
goto out;
}
_object_renderer.pipeline = g_renderer.CompilePipeline(info);
if (!RT_IS_HANDLE_VALID(_object_renderer.pipeline)) {
rtReportError("GFX", "Failed to compile the object pipeline %llx", PIPELINE_ID);
result = RT_UNKNOWN_ERROR;
goto out;
}
out:
rtReturnTemporaryArena(temp);
return result;
}
void ShutdownObjectRenderer(void) {
g_renderer.DestroyPipeline(_object_renderer.pipeline);
}

View File

@@ -3,8 +3,18 @@
#include <xxhash/xxhash.h>
#include <assert.h>
+/* XXH32 of "recreational.tech" */
+#define HASH32_SEED 0xd9035e35
static_assert(sizeof(rt_hash64) == sizeof(XXH64_hash_t), "Size mismatch between rt_hash64 and XXH64_hash_t!");
+static_assert(sizeof(rt_hash32) == sizeof(XXH32_hash_t),
+              "Size mismatch between rt_hash32 and XXH32_hash_t!");
RT_DLLEXPORT rt_hash64 rtHashBytes(const void *begin, size_t count) {
return XXH3_64bits(begin, count);
}
+RT_DLLEXPORT rt_hash32 rtHashBytes32(const void *begin, size_t count) {
+return XXH32(begin, count, HASH32_SEED);
+}

View File

@@ -10,9 +10,12 @@ extern "C" {
#endif
typedef uint64_t rt_hash64;
+typedef uint32_t rt_hash32;
RT_DLLEXPORT rt_hash64 rtHashBytes(const void *begin, size_t count);
+RT_DLLEXPORT rt_hash32 rtHashBytes32(const void *begin, size_t count);
#ifdef __cplusplus
}
#endif

View File

@@ -13,7 +13,10 @@ extern rt_cvar rt_FileTabCapacity;
extern rt_cvar rt_MaxConcurrentAsyncIO;
extern rt_cvar rt_ResourceDirectory;
extern rt_cvar rt_ResourceCacheSize;
+extern rt_cvar rt_MaxCachedResources;
extern rt_cvar rt_ResourceNamespaceSize;
+extern rt_cvar rt_DisableResourceNamespaceLoad;
+extern rt_cvar rt_MaxFramegraphs;
#ifdef RT_BUILD_ASSET_COMPILER
extern rt_cvar rt_AssetDirectory;
@@ -29,7 +32,10 @@ void RegisterRuntimeCVars(void) {
rtRegisterCVAR(&rt_MaxConcurrentAsyncIO);
rtRegisterCVAR(&rt_ResourceDirectory);
rtRegisterCVAR(&rt_ResourceCacheSize);
+rtRegisterCVAR(&rt_MaxCachedResources);
rtRegisterCVAR(&rt_ResourceNamespaceSize);
+rtRegisterCVAR(&rt_DisableResourceNamespaceLoad);
+rtRegisterCVAR(&rt_MaxFramegraphs);
#ifdef RT_BUILD_ASSET_COMPILER
rtRegisterCVAR(&rt_AssetDirectory);
#endif

View File

@@ -10,6 +10,7 @@
#include <assert.h>
#include <limits.h>
#include <string.h>
+#include <stdbool.h>
typedef struct {
rt_attribute_binding *uniform_bindings;

View File

@@ -6,8 +6,8 @@
#include <stddef.h>
#include "gfx.h"
-#include "runtime.h"
#include "resources.h"
+#include "runtime.h"
#ifdef __cplusplus
extern "C" {
@@ -86,7 +86,6 @@
size_t bytecode_length;
} rt_shader_info;
-
/* Handles for backend objects */
#define RT_GFX_HANDLE_MAX_VERSION 255
@@ -96,11 +95,18 @@
uint32_t index : 24;
} rt_pipeline_handle;
+typedef struct {
+uint32_t version : 8;
+uint32_t index : 24;
+} rt_render_target_handle;
typedef void rt_register_renderer_cvars_fn(void);
typedef rt_result rt_init_renderer_fn(const rt_renderer_init_info *info);
typedef void rt_shutdown_renderer_fn(void);
typedef rt_pipeline_handle rt_compile_pipeline_fn(const rt_pipeline_info *info);
typedef void rt_destroy_pipeline_fn(rt_pipeline_handle handle);
+typedef rt_render_target_handle rt_create_render_target_fn(const rt_render_target_info *info);
+typedef void rt_destroy_render_target_fn(rt_render_target_handle handle);
typedef struct {
rt_register_renderer_cvars_fn *RegisterCVars;
@@ -108,6 +114,8 @@ typedef struct {
rt_shutdown_renderer_fn *Shutdown;
rt_compile_pipeline_fn *CompilePipeline;
rt_destroy_pipeline_fn *DestroyPipeline;
+rt_create_render_target_fn *CreateRenderTarget;
+rt_destroy_render_target_fn *DestroyRenderTarget;
} rt_renderer_api;
#define RT_RENDERER_API_FN(name) RT_DLLEXPORT rtRen##name