Core concepts

Vuk lets you write normal imperative code where vuk::Value types stand in for ordinary variables, and passes created with vuk::make_pass() behave like regular functions. The key difference is that execution is lazy - work is deferred until you explicitly observe a result.

Think of it like this:

vuk::Value<T> = like a variable of type T
vuk::make_pass() = like defining a function
Function calls chain together like normal C++ code
Nothing actually executes until requested or required

Example - looks like imperative code:

// These look like normal variable declarations and function calls
auto uploaded_data = upload_vertices(allocator, vertex_data);
auto cleared_image = clear_pass(render_target);
auto rendered_image = draw_pass(uploaded_data, cleared_image);
auto post_processed = blur_pass(rendered_image);

// But nothing has executed yet! The GPU work is lazily evaluated.
// Only when we observe the result does execution happen:
post_processed.wait(allocator, compiler);  // Now everything runs

The idea is that you write straightforward, easy-to-read code, while vuk automatically takes care of translating it for the underlying graphics API. When you need to use a result on the CPU, you force the computation to happen. When you need a result on the GPU, you just make new passes that use the result, and vuk figures out the dependencies. This lazy evaluation model means you can build complex render graphs using familiar programming patterns, without worrying about the underlying complexity of GPU synchronization and resource management.
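To make the "record now, run on demand" idea concrete, here is a minimal CPU-only sketch. It is a toy model, not vuk's implementation: LazyValue, make_lazy_pass and demo_runs_after_two_gets are hypothetical names invented for this illustration, and the "passes" are ordinary functions on int rather than GPU work.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <optional>
#include <utility>

// Toy stand-in for a lazily evaluated value. This is NOT vuk::Value;
// it only mimics the "record now, run on demand" behaviour.
template <class T>
class LazyValue {
public:
    explicit LazyValue(std::function<T()> compute)
        : state_(std::make_shared<State>()) {
        state_->compute = std::move(compute);
    }

    // Force evaluation. The computation runs at most once; later calls
    // return the cached result.
    const T& get() const {
        if (!state_->result) {
            state_->result = state_->compute();
        }
        return *state_->result;
    }

private:
    struct State {
        std::function<T()> compute;
        std::optional<T> result;
    };
    std::shared_ptr<State> state_;  // shared so copies observe the same result
};

// Toy counterpart of a pass (hypothetical helper): calling the returned
// callable only records the work and yields a new LazyValue.
template <class F>
auto make_lazy_pass(F fn) {
    return [fn](LazyValue<int> input) {
        return LazyValue<int>([fn, input] { return fn(input.get()); });
    };
}

// Builds a two-step chain and reports how often the first step ran.
inline int demo_runs_after_two_gets() {
    auto runs = std::make_shared<int>(0);
    LazyValue<int> source([] { return 20; });
    auto add_one = make_lazy_pass([runs](int x) { ++*runs; return x + 1; });
    auto doubled = make_lazy_pass([](int x) { return x * 2; });

    auto result = doubled(add_one(source));
    assert(*runs == 0);         // chaining alone executes nothing
    int first = result.get();   // forces the whole chain
    int second = result.get();  // cached, no re-execution
    assert(first == 42 && second == 42);
    return *runs;
}
```

In this sketch the chain is only evaluated by the first get(), and the intermediate step runs exactly once even though the result is requested twice.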

Values

At the heart of vuk’s execution model is the vuk::Value type. A vuk::Value represents a resource (like an image or buffer, or even an integer) that will be available after some GPU work completes.

// Create a buffer with data - returns the buffer handle and a Value
auto [buf_handle, buffer_value] = create_buffer(
    allocator,
    MemoryUsage::eGPUonly,
    DomainFlagBits::eTransferOnTransfer,
    std::span(my_data)
);

// buffer_value is a Value<Buffer> representing the buffer after upload completes
// The actual upload hasn't happened yet!

vuk::Value types are composable - you can chain operations on them to build complex pipelines without explicitly managing dependencies or synchronization.

Building GPU computation with make_pass

The vuk::make_pass() function creates a render graph node that performs some GPU work. It automatically infers dependencies and synchronization from the resources you use. The first argument is a name for the pass (for debugging), followed by a lambda that records commands into a vuk::CommandBuffer. The lambda must always take a vuk::CommandBuffer& as its first parameter, followed by any number of resource parameters annotated with access patterns. The lambda returns the output resources; to return multiple resources, return them as a std::tuple.

After making the pass, you can call it like a normal function, passing in vuk::Values representing the input resources. The result is a vuk::Value representing the output resource, or a tuple of vuk::Values if there are several.

Note

Buffers and ImageAttachments can be mutated - think of the function as taking them by reference and modifying them in place.

For now we are focusing on how to create the whole program - see the section on CommandBuffer for details on what goes into the callback.

Basic Structure

auto my_pass = make_pass(
    "pass_name",
    [](CommandBuffer& cbuf, VUK_IA(eColorWrite) target) {
        // Record commands into cbuf
        cbuf.draw(3, 1, 0, 0);

        // Return the output resource
        return target;
    }
);

We have to annotate resource parameters with access patterns so vuk can manage synchronization. Use these macros for convenience:

VUK_IA(access) - vuk::ImageAttachment with specified access pattern
VUK_BA(access) - vuk::Buffer with specified access pattern
VUK_ARG(type, access) - Generic argument with access pattern
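One way such annotations can be made visible to a framework is to carry the access pattern in the parameter's type, so it can be read from the lambda's signature without running the lambda. The sketch below illustrates that general C++17 technique only; ToyAccess, Annotated, TOY_IA, PassTraits and the other names are hypothetical and do not describe how vuk's macros are actually defined.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical names throughout; not vuk's definitions.
enum class ToyAccess { eColorWrite, eFragmentSampled, eTransferWrite };

struct ToyCommandBuffer {};
struct ToyImage { int id = 0; };

// The access pattern is part of the parameter type, so it is known
// at compile time.
template <class T, ToyAccess A>
struct Annotated {
    T resource;
    static constexpr ToyAccess access = A;
};

#define TOY_IA(access) Annotated<ToyImage, ToyAccess::access>

// Primary template: look at the lambda's call operator.
template <class F>
struct PassTraits : PassTraits<decltype(&F::operator())> {};

// Specialization for non-mutable lambdas whose first parameter is the
// command buffer; the remaining parameters must be Annotated types.
template <class C, class R, class... Args>
struct PassTraits<R (C::*)(ToyCommandBuffer&, Args...) const> {
    static constexpr std::array<ToyAccess, sizeof...(Args)> accesses() {
        return {Args::access...};
    }
};

// Returns the declared accesses of a pass body, in parameter order.
template <class F>
constexpr auto declared_accesses(const F&) {
    return PassTraits<F>::accesses();
}

// A pass body written in the same shape as the examples in this document.
inline auto demo_accesses() {
    auto pass = [](ToyCommandBuffer&, TOY_IA(eColorWrite) target,
                   TOY_IA(eFragmentSampled) texture) {
        (void)texture;  // a real pass would bind and sample it
        return target;
    };
    return declared_accesses(pass);
}
```

A resource captured in the lambda's capture list never appears in this signature, so a framework built this way would have no annotation to inspect for it.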

Warning

While it is possible to capture resources into the lambda via captures (e.g., [my_buffer]), doing so bypasses vuk’s automatic synchronization. When you capture resources directly, you become responsible for managing synchronization manually. It is recommended to pass all resources as lambda parameters with proper access annotations instead, allowing vuk to handle synchronization automatically.

Building up a program

We have seen that Values represent resources that will be available after GPU work, and that vuk::make_pass() creates functions that operate on these resources. By combining these two concepts, we can build complex GPU computations in a straightforward way. First, we have to make the input resources available as vuk::Values. This is done via resource declaration and acquisition functions. After that, we can chain together passes using normal function call syntax. Finally, we force evaluation.

Resource Declaration and Acquisition

Vuk provides several functions to create and import resources:

declare_ia / declare_buf - Create placeholder resources that will be allocated later
acquire_ia / acquire_buf - Import existing resources (e.g., from previous frames)
discard_ia / discard_buf - Specify that previous contents don’t matter (will overwrite)

Note

The first argument to these functions is a debug name for the resource.

// Declare an image that will be allocated later
auto temp_image = declare_ia("temp", ImageAttachment{
    .format = Format::eR8G8B8A8Srgb,
    .extent = {1024, 768, 1}
});

// Import an existing image (e.g., from a previous frame)
auto imported = acquire_ia(
    "imported",
    existing_image_attachment,
    Access::eFragmentSampled  // Last known access
);

// We don't care about previous contents, will overwrite
auto fresh_target = discard_ia("target", render_target);

Resources

vuk::Buffer and vuk::ImageAttachment are the fundamental resource types in vuk:

Buffer - Represents GPU memory for storing arbitrary data (vertices, indices, uniforms, storage buffers). Contains information about size, memory usage, and optional device address for buffer device address features.
ImageAttachment - Represents a GPU image/texture with specified format, dimensions, sample count, and mip/array layer configuration. Used for render targets, textures, and framebuffer attachments.

Both types are handles to memory that can be copied freely. The actual GPU resources are managed by allocators.

Letting vuk allocate for you

When you use vuk::declare_ia() or vuk::declare_buf(), you’re creating resources without allocating actual GPU memory upfront. vuk will automatically allocate the necessary memory when the render graph executes, based on how the resources are used in your passes. This is great for transient resources that only exist within a single frame or render pass. For resources that need to persist across frames (like geometry buffers or persistent textures), it can be more convenient to create these upfront and then import them with vuk::acquire_ia() / vuk::acquire_buf().

// Transient resources - let vuk allocate and manage
auto temp_rt = declare_ia("temp_render_target", ImageAttachment{
    .format = Format::eR16G16B16A16Sfloat,
    .extent = {1920, 1080, 1}
});

// During startup:
// Persistent resources - allocate once, reuse across frames
auto [persistent_buf, upload_future] = create_buffer(
    allocator,
    MemoryUsage::eGPUonly,
    DomainFlagBits::eTransferOnGraphics,
    geometry_data
);
upload_future.wait(allocator, compiler);  // Ensure upload completes before use

// During frame rendering:
auto persistent_buf_val = acquire_buf(
        "persistent_geometry",
        persistent_buf,
        Access::eTransferWrite  // Last known access
);

Building up complex computations from passes

Passes created with vuk::make_pass() can be chained together like normal functions. Each pass takes vuk::Values as inputs and produces vuk::Values as outputs.

// Each pass takes inputs and produces outputs
auto pass1 = make_pass("clear",
    [](CommandBuffer& cbuf, VUK_IA(eColorWrite) img) {
        cbuf.clear_image(img, ClearColor{0.f, 0.f, 0.f, 1.f});
        return img;
    });

auto pass2 = make_pass("draw",
    [](CommandBuffer& cbuf, VUK_IA(eColorWrite) img) {
        cbuf.bind_graphics_pipeline("my_pipeline");
        cbuf.draw(3, 1, 0, 0);
        return img;
    });

// Chain them together
Value<ImageAttachment> result = pass2(pass1(my_image));

Presentation

To display rendered images on screen, use the swapchain functions to acquire images, render to them, and present:

// Acquire the swapchain as a Value
auto swapchain_val = acquire_swapchain(my_swapchain);

// Get the next image from the swapchain
auto swapchain_image = acquire_next_image("swapchain_img", swapchain_val);

// additional rendering passes...
// Render to the swapchain image
auto rendered = render_pass(swapchain_image);
// additional rendering passes...

// Mark the image ready for presentation
auto presentable = enqueue_presentation(rendered);

Execution

Once you have built up your computation using passes and values, you need to trigger execution. vuk::Value<T> provides several methods to control when work executes:

submit() - Queue work for execution without waiting (non-blocking)
wait() - Submit and wait for completion (blocking)
get() - Submit, wait, and retrieve the result (blocking with data retrieval)

// Build the graph
auto result = my_pass(input_image);

// Option 1: Submit without waiting (non-blocking)
result.submit(allocator, compiler);
// Do other work while GPU executes...

// Option 2: Submit and wait for completion
result.wait(allocator, compiler);

// Option 3: For CPU readback - submit, wait, and retrieve
auto final_buffer = download_buffer(gpu_buffer);
auto cpu_result = final_buffer.get(allocator, compiler);
auto data = std::span((uint32_t*)cpu_result->mapped_ptr, element_count);

In all cases, computation only happens once. Subsequent calls to submit(), wait(), or get() on the same vuk::Value do not re-execute the work; they simply ensure the result is ready.

Note

You don’t need to wait if you are only using the result on the GPU - just pass the vuk::Value to another pass and vuk will handle dependencies automatically. You can also submit, then use that value in other passes - this means the computation for that intermediate result will be scheduled independently.

Warning

Calling get() incurs CPU-GPU synchronization and should be avoided in throughput-critical paths. Prefer chaining passes when possible.

Warning

vuk can only see the computations up to the point where you call an execution method. If you rely on vuk to determine, e.g., image usage, make sure that vuk can see every use, or create the image yourself.

Advanced topics

Resources outside the render graph

vuk can only reason about resources that are part of the render graph. If you want to use a resource outside the graph, you can specify the desired final state using vuk::Value::as_released().

auto released = my_value.as_released(
    Access::eFragmentSampled,  // Future access outside the graph
    DomainFlagBits::eGraphicsQueue  // Queue that will use it
);

Conversely, if you have a resource created outside the graph that you want to use inside, use vuk::acquire_ia() / vuk::acquire_buf() to import it with its last known access pattern.

auto imported = acquire_buf(
    "imported_buf",
    existing_buffer,
    Access::eTransferWrite  // Last known access
);

Multi-queue execution

Vuk automatically schedules work across multiple queues:

auto transfer_pass = make_pass("upload",
    [](CommandBuffer& cbuf, VUK_BA(eTransferWrite) dst) {
        cbuf.fill_buffer(dst, 0);
        return dst;
    },
    DomainFlagBits::eTransferQueue  // Schedule on transfer queue
);

Resource inference

Vuk can infer resource properties like sizes and formats from other resources in the graph. This is particularly useful when you have resources whose dimensions or properties depend on other resources, but you don’t want to manually track these dependencies.

Inference methods available:

For vuk::Value<ImageAttachment>:

same_extent_as(src) - Infer width, height, and depth
same_2D_extent_as(src) - Infer width and height only
same_format_as(src) - Infer format
same_shape_as(src) - Infer extent, layers, and mip levels
similar_to(src) - Infer all properties (shape, format, sample count)

For vuk::Value<Buffer>:

same_size(src) - Infer buffer size
get_size() - Get size as a Value<uint64_t> for computation
set_size(size_value) - Set size from a computed value

How built-in functions use inference:

Many of vuk’s built-in functions automatically set up inference for you. For example, when copying between images:

auto src = acquire_ia("source", source_image, Access::eTransferRead);
auto dst = declare_ia("destination");  // No properties specified

// copy() automatically sets up inference
auto copied = copy(src, dst);
// dst now has the same extent as src

Similarly, download_buffer infers the required buffer size:

auto gpu_image = render_pass(target);

// download_buffer automatically creates a buffer with the right size
// based on the image format and dimensions
auto cpu_buffer = download_buffer(gpu_image);
auto result = cpu_buffer.get(allocator, compiler);

// result->size is automatically set to the image's data size
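As a rough illustration of what "the right size based on the image format and dimensions" means, the sketch below computes the byte size of a tightly packed, single-mip, single-layer image. ToyExtent, ToyFormat, texel_size and tightly_packed_size are hypothetical names, only the two formats used in this document are covered, and vuk's own computation may account for more than this.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper types for this sketch only.
struct ToyExtent {
    std::uint32_t width;
    std::uint32_t height;
    std::uint32_t depth;
};

enum class ToyFormat { eR8G8B8A8Srgb, eR16G16B16A16Sfloat };

// Bytes per texel for the formats listed above.
constexpr std::uint64_t texel_size(ToyFormat f) {
    switch (f) {
    case ToyFormat::eR8G8B8A8Srgb:       return 4;  // 4 channels x 8 bits
    case ToyFormat::eR16G16B16A16Sfloat: return 8;  // 4 channels x 16 bits
    }
    return 0;
}

// Size in bytes of one tightly packed mip level and array layer.
constexpr std::uint64_t tightly_packed_size(ToyExtent e, ToyFormat f) {
    return std::uint64_t{e.width} * e.height * e.depth * texel_size(f);
}
```

For the 1024x768 eR8G8B8A8Srgb image declared earlier this gives 3145728 bytes, and for a 1920x1080 eR16G16B16A16Sfloat target it gives 16588800 bytes.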

Practical example - Blur pipeline:

Here’s a complete example showing how inference simplifies a blur pipeline:

// Input image with some specific properties
auto input = acquire_ia("input", source_image, Access::eFragmentSampled);

// Horizontal blur - needs same size as input
auto h_blur = declare_ia("horizontal_blur");
h_blur.similar_to(input);  // Automatically matches all properties

// Vertical blur - needs same size as horizontal
auto v_blur = declare_ia("vertical_blur");
v_blur.similar_to(h_blur);  // Chain inference

// Perform the blur
auto h_blurred = horizontal_blur_pass(input, h_blur);
auto final = vertical_blur_pass(h_blurred, v_blur);

// If input changes size, everything adapts automatically!

Computing with inferred values:

You can also perform arithmetic on inferred values:

auto input_buf = acquire_buf("input", input_buffer, Access::eTransferRead);

// Create output buffer that's twice the size
auto output_buf = declare_buf("output");
output_buf->memory_usage = MemoryUsage::eGPUonly;
output_buf.set_size(input_buf.get_size() * 2);  // Arithmetic on sizes!

// Create another buffer based on image dimensions
auto image = acquire_ia("img", my_image, Access::eFragmentSampled);
auto pixel_buffer = declare_buf("pixels");
pixel_buffer->memory_usage = MemoryUsage::eGPUonly;

// Compute size from image dimensions
auto width = image.get_size().get_width();
auto height = image.get_size().get_height();
auto pixel_count = width * height;
pixel_buffer.set_size(pixel_count * 4);  // 4 bytes per pixel (RGBA8)

This approach makes your render graphs more maintainable - when input dimensions change, everything downstream adapts automatically without code changes.
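The reason downstream sizes can adapt is that arithmetic on an inferred value records an expression instead of computing a number immediately. The following CPU-only sketch shows that idea under stated assumptions: LazySize, ToyBuffer and doubled_size are hypothetical names, and the real Value<uint64_t> is part of vuk's graph rather than a std::function.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <memory>
#include <utility>

// Toy deferred integer: it stores a recipe for a number, not the number.
class LazySize {
public:
    explicit LazySize(std::function<std::uint64_t()> eval)
        : eval_(std::move(eval)) {}

    // Evaluate the recipe against the current state of its inputs.
    std::uint64_t resolve() const { return eval_(); }

    // Multiplying builds a new recipe; nothing is computed here.
    friend LazySize operator*(LazySize lhs, std::uint64_t factor) {
        return LazySize([lhs, factor] { return lhs.resolve() * factor; });
    }
    friend LazySize operator*(LazySize lhs, LazySize rhs) {
        return LazySize([lhs, rhs] { return lhs.resolve() * rhs.resolve(); });
    }

private:
    std::function<std::uint64_t()> eval_;
};

// Toy buffer whose size may be filled in or changed after the
// expression that depends on it has been built.
struct ToyBuffer {
    std::shared_ptr<std::uint64_t> size = std::make_shared<std::uint64_t>(0);

    LazySize get_size() const {
        auto slot = size;  // share the slot so later updates are observed
        return LazySize([slot] { return *slot; });
    }
};

// "Twice the input size", decided before the input size is final.
inline LazySize doubled_size(const ToyBuffer& input) {
    return input.get_size() * 2;
}
```

In this model, resolving the doubled size after the input buffer's size changes yields the new value without rebuilding the expression.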

Domains and Access

Domains: Where Work Happens

A vuk::DomainFlagBits specifies where work should execute. GPUs have different queues specialized for different types of work:

eGraphicsQueue - Graphics with rasterization, compute shaders, and transfer
eComputeQueue - Compute shaders and transfer
eTransferQueue - Memory transfers (uploads/downloads)
eHost - CPU-side operations
eAny - Let vuk decide

// This upload can happen in parallel with other graphics work
auto transfer_pass = make_pass("upload",
    [](CommandBuffer& cbuf, VUK_BA(eTransferWrite) dst) {
        cbuf.fill_buffer(dst, 0);
        return dst;
    },
    DomainFlagBits::eTransferQueue  // Execute on transfer queue
);

Most of the time you can use DomainFlagBits::eAny and let vuk infer the queue.
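The capability list above can be restated as a small lookup, which is sometimes handy when deciding whether a pass can be pinned to a specialized queue. This is only a sketch of the table as written in this section: ToyWork, ToyQueue and supports are hypothetical names, and it says nothing about how vuk actually chooses a queue for eAny.

```cpp
#include <cassert>

// Hypothetical enums mirroring the queue list in this section.
enum class ToyWork { eGraphics, eCompute, eTransfer };
enum class ToyQueue { eGraphicsQueue, eComputeQueue, eTransferQueue };

// Returns true if the given kind of work can run on the given queue,
// following the capabilities listed above.
constexpr bool supports(ToyQueue queue, ToyWork work) {
    switch (queue) {
    case ToyQueue::eGraphicsQueue:
        return true;  // graphics, compute and transfer
    case ToyQueue::eComputeQueue:
        return work != ToyWork::eGraphics;  // compute and transfer
    case ToyQueue::eTransferQueue:
        return work == ToyWork::eTransfer;  // transfers only
    }
    return false;
}
```

Under this table, the fill_buffer pass above is valid on any of the three queues, while a pass that draws is only valid on the graphics queue.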

Access: How Resources Are Used

A vuk::Access pattern tells vuk how a resource will be used in a pass. This is critical for:

Synchronization - Ensuring reads happen after writes
Layout transitions - Images need different layouts for different operations
Cache management - Proper invalidation and flushing

enum vuk::Access

Access patterns for GPU resources.

These flags specify how a resource will be accessed in a render pass. Vuk uses these to determine synchronization requirements, image layout transitions, and execution dependencies automatically.

Values:

enumerator eNone: No access - resource available without synchronization (initial) or doesn’t need sync (final)

enumerator eColorRead: Read as a framebuffer color attachment.

enumerator eColorWrite: Written as a framebuffer color attachment.

enumerator eColorRW: Read and write as a framebuffer color attachment.

enumerator eDepthStencilRead: Read as a framebuffer depth/stencil attachment.

enumerator eDepthStencilWrite: Written as a framebuffer depth/stencil attachment.

enumerator eDepthStencilRW: Read and write as a framebuffer depth/stencil attachment.

enumerator eVertexSampled: Sampled in a vertex shader.

enumerator eVertexRead: Read from an image or buffer in a vertex shader.

enumerator eAttributeRead: Read from a vertex attribute buffer.

enumerator eIndexRead: Read from an index buffer for indexed rendering.

enumerator eIndirectRead: Read from an indirect buffer for indirect rendering.

enumerator eVertexUniformRead: Read from a uniform buffer in a vertex shader.

enumerator eFragmentSampled: Sampled in a fragment shader.

enumerator eFragmentRead: Read from an image or buffer in a fragment shader.

enumerator eFragmentWrite: Written using image store or buffer write in a fragment shader.

enumerator eFragmentRW: Read and write in a fragment shader.

enumerator eFragmentUniformRead: Read from a uniform buffer in a fragment shader.

enumerator eCopyRead: vkCmdCopy* source

enumerator eCopyWrite: vkCmdCopy* destination

enumerator eCopyRW: vkCmdCopy* source and destination

enumerator eBlitRead: vkCmdBlitImage source

enumerator eBlitWrite: vkCmdBlitImage destination

enumerator eBlitRW: vkCmdBlitImage source and destination

enumerator eClear: vkCmdClear* destination

enumerator eResolveRead: vkCmdResolveImage source

enumerator eResolveWrite: vkCmdResolveImage destination

enumerator eResolveRW: vkCmdResolveImage source and destination

enumerator eTransferRead: All transfer read operations.

enumerator eTransferWrite: All transfer write operations.

enumerator eTransferRW: All transfer operations.

enumerator eComputeRead: Read from an image or buffer in a compute shader.

enumerator eComputeWrite: Written using image store or buffer write in a compute shader.

enumerator eComputeRW: Read and write in a compute shader.

enumerator eComputeSampled: Sampled in a compute shader.

enumerator eComputeUniformRead: Read from a uniform buffer in a compute shader.

enumerator eRayTracingRead: Read from an image or buffer in a ray tracing shader.

enumerator eRayTracingWrite: Written using image store or buffer write in a ray tracing shader.

enumerator eRayTracingRW: Read and write in a ray tracing shader.

enumerator eRayTracingSampled: Sampled in a ray tracing shader.

enumerator eRayTracingUniformRead: Read from a uniform buffer in a ray tracing shader.

enumerator eAccelerationStructureBuildRead: Read during acceleration structure build.

enumerator eAccelerationStructureBuildWrite: Written during acceleration structure build.

enumerator eAccelerationStructureBuildRW: Read and write during acceleration structure build.

enumerator eHostRead: Read by the host CPU.

enumerator eHostWrite: Written by the host CPU.

enumerator eHostRW: Read and write by the host CPU.

enumerator eMemoryRead: Any device access that reads.

enumerator eMemoryWrite: Any device access that writes.

enumerator eMemoryRW: Any device access (read or write)

enumerator ePresent: Presented to swapchain.

enumerator eTessellationSampled: Sampled in a tessellation shader.

enumerator eTessellationRead: Read from an image or buffer in a tessellation shader.

enumerator eTessellationUniformRead: Read from a uniform buffer in a tessellation shader.

Example showing different access patterns

// Clear an image (write access)
auto cleared = make_pass("clear",
    [](CommandBuffer& cbuf, VUK_IA(eColorWrite) target) {
        cbuf.clear_image(target, ClearColor{0.f, 0.f, 0.f, 1.f});
        return target;
    })(my_image);

// Read it as a texture (read access)
auto sampled = make_pass("sample",
    [](CommandBuffer& cbuf, VUK_IA(eFragmentSampled) texture) {
        // Use texture in shader
        return texture;
    })(cleared);

Vuk uses these access patterns to automatically:

Insert memory barriers
Transition image layouts (e.g., TRANSFER_DST → SHADER_READ_ONLY)
Determine execution dependencies
Order passes correctly
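The decision behind the first two items can be sketched as a hazard check between two consecutive uses of a resource. The code below is a deliberately simplified model of the usual Vulkan rules (a read or write after a write needs a memory barrier, a write after a read needs only an execution dependency, and a layout change behaves like a write). ToyUse, Sync and required_sync are hypothetical names, and vuk's real logic also tracks pipeline stages and queues.

```cpp
#include <cassert>

// One use of a resource in a pass (hypothetical, simplified).
struct ToyUse {
    bool writes;  // does this use write the resource?
    int layout;   // opaque image layout id; differing ids need a transition
};

enum class Sync { eNone, eExecutionOnly, eMemoryBarrier };

// Classify the synchronization needed between two consecutive uses.
constexpr Sync required_sync(ToyUse prev, ToyUse next) {
    if (prev.layout != next.layout) {
        return Sync::eMemoryBarrier;  // a layout transition acts as a write
    }
    if (prev.writes) {
        return Sync::eMemoryBarrier;  // read-after-write or write-after-write
    }
    if (next.writes) {
        return Sync::eExecutionOnly;  // write-after-read
    }
    return Sync::eNone;               // read-after-read
}
```

In this model, the clear-then-sample example above corresponds to a write followed by a read in a different layout, so it lands in the memory barrier case.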

API reference for Value and make_pass

template<class T> class Value : public vuk::UntypedValue

Represents a GPU resource that will be available after some work completes.

Template Parameters: T – Type of the resource (Buffer, ImageAttachment, etc.)

Public Functions

template<class U> inline Value<U> transmute(Ref new_head) noexcept

Internal: Transmute this Value to a different type.

Template Parameters: U – New type for the Value
Parameters: new_head – New IR reference
Returns: Value with new type

inline T *operator->() noexcept

Access the underlying resource (only after declare or wait/get)

Returns: Pointer to the resource

inline Result<T> get(Allocator &allocator, Compiler &compiler, RenderGraphCompileOptions options = {})

Submit, wait, and retrieve the resource value on the host.

Parameters:

allocator – Allocator to use for resource allocation
compiler – Compiler to use for graph compilation
options – Optional compilation options

Returns:

Result containing the resource, or an error

template<class U = T> inline Value<U> as_released(Access access = Access::eNone, DomainFlagBits domain = DomainFlagBits::eAny)

Mark this Value as released for use outside the render graph.

Template Parameters:

U – Type of the returned Value (defaults to T)

Parameters:

access – The access pattern for future use
domain – The domain where the resource will be used

Returns:

New Value representing the released resource

inline void same_extent_as(const Value<ImageAttachment> &src)

Infer extent (width, height, depth) from another image.

Parameters: src – Source image to copy extent from

inline void same_2D_extent_as(const Value<ImageAttachment> &src)

Infer 2D extent (width, height) from another image.

Parameters: src – Source image to copy 2D extent from

inline void same_format_as(const Value<ImageAttachment> &src)

Infer format from another image.

Parameters: src – Source image to copy format from

inline void same_shape_as(const Value<ImageAttachment> &src)

Infer shape (extent, layers, mip levels) from another image.

Parameters: src – Source image to copy shape from

inline void similar_to(const Value<ImageAttachment> &src)

Infer all properties (shape, format, sample count) from another image.

Parameters: src – Source image to copy properties from

inline Value<Buffer> subrange(uint64_t new_offset, uint64_t new_size)

Create a subrange view of this buffer.

Parameters:

new_offset – Offset in bytes from the start of the buffer
new_size – Size of the subrange in bytes

Returns:

Value representing the buffer subrange
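A subrange can be thought of as an (offset, size) window into a parent buffer. The sketch below shows that bookkeeping with a bounds check. ToyBufferView and toy_subrange are hypothetical names, and the assumption that offsets are relative to the parent view and compose when nested is an illustration choice, not something this reference states.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical view: a byte window into some underlying allocation.
struct ToyBufferView {
    std::uint64_t offset;  // bytes from the start of the allocation
    std::uint64_t size;    // bytes covered by this view
};

// Returns a narrower view, or std::nullopt if the requested range
// does not fit inside the parent. The comparison is written to avoid
// unsigned overflow in new_offset + new_size.
inline std::optional<ToyBufferView> toy_subrange(ToyBufferView parent,
                                                 std::uint64_t new_offset,
                                                 std::uint64_t new_size) {
    if (new_offset > parent.size || new_size > parent.size - new_offset) {
        return std::nullopt;
    }
    return ToyBufferView{parent.offset + new_offset, new_size};
}
```

With these assumptions, taking bytes 256..511 of a 1024-byte view and then bytes 64..127 of that result yields a view at offset 320 of size 64.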

inline void same_size(const Value<Buffer> &src)

Infer buffer size from another buffer.

Parameters: src – Source buffer to copy size from

inline Value<uint64_t> get_size()

Get the size of this buffer as a Value.

Returns: Value<uint64_t> representing the buffer size

inline void set_size(Value<uint64_t> arg)

Set the size of this buffer from another Value.

Parameters: arg – Value containing the size to set

inline auto operator[](size_t index)

Array subscript operator for array-typed Values.

Parameters: index – Index into the array
Returns: Value representing the array element

inline auto mip(uint32_t mip)

Get a specific mip level of this image.

Parameters: mip – Mip level to extract
Returns: Value representing the mip level

inline auto layer(uint32_t layer)

Get a specific array layer of this image.

Parameters: layer – Array layer to extract
Returns: Value representing the array layer

template<class F> auto vuk::make_pass(Name name, F &&body, SchedulingInfo scheduling_info = SchedulingInfo(DomainFlagBits::eAny), VUK_CALLSTACK)

Turn a lambda into a callable rendergraph computation (a pass)

Template Parameters:

F – Lambda type

Parameters:

name – Debug name for the pass
body – Callback lambda (body of the pass)
scheduling_info – Queue scheduling constraints