Stoner

Render Passes and Framebuffers

Posted on 2019-04-05 | In sdk , graphics , vulkan

Introduction

Contents

  • Specifying attachment descriptions
  • Specifying subpass descriptions
  • Specifying dependencies between subpasses
  • Creating a render pass
  • Creating a framebuffer
  • Preparing a render pass for geometry rendering and postprocess subpasses
  • Preparing a render pass and a framebuffer with color and depth attachments
  • Beginning a render pass
  • Progressing to the next subpass
  • Ending a render pass
  • Destroying a framebuffer
  • Destroying a render pass

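In Vulkan, draw calls are organized into render passes. A render pass is a collection of subpasses. Subpasses describe how image resources (color, depth/stencil, and input attachments) are used: what their layouts are and how those layouts transition between subpasses, when attachments are rendered to and when data is read from them, whether their contents are still needed after the render pass, or whether their usage is confined to a single render pass.

The data stored in a render pass is only a general description, or metadata. The actual resources involved in rendering are specified through framebuffers, which define the image views used for each rendering attachment.

All of this information has to be prepared in advance, before rendering commands can be issued (recorded). With it, the driver can control the drawing process efficiently, limit the amount of memory needed for rendering, or use a very fast cache for some attachments to improve performance further.

The following sections cover how to organize drawing operations into render passes and subpasses, how to prepare render targets, and how to create framebuffers, that is, the image views used as attachments.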

descriptor

specifying attachments descriptions

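A render pass works on a set of resources (images) called attachments, which are used during rendering operations. They are divided into color, depth/stencil, input, and resolve attachments. Before a render pass can be created, every attachment has to be described.
We create an array of attachment descriptions. Indices into this array are later used in the subpass descriptions. Likewise, when a framebuffer is created and the image resource for each attachment is specified, we provide a list in which each element corresponds to an element of the attachment descriptions array.

Drawing geometry usually needs at least one color attachment, and a depth attachment as well if depth testing is enabled.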

std::vector<VkAttachmentDescription> attachments_descriptions = 
{
{
0,
VK_FORMAT_R8G8B8A8_UNORM,
VK_SAMPLE_COUNT_1_BIT,
VK_ATTACHMENT_LOAD_OP_CLEAR,
VK_ATTACHMENT_STORE_OP_STORE,
VK_ATTACHMENT_LOAD_OP_DONT_CARE,
VK_ATTACHMENT_STORE_OP_DONT_CARE,
VK_IMAGE_LAYOUT_UNDEFINED,
VK_IMAGE_LAYOUT_PRESENT_SRC_KHR,
},
{
0,
VK_FORMAT_D16_UNORM,
VK_SAMPLE_COUNT_1_BIT,
VK_ATTACHMENT_LOAD_OP_CLEAR,
VK_ATTACHMENT_STORE_OP_DONT_CARE, // depth contents are not needed after the render pass
VK_ATTACHMENT_LOAD_OP_DONT_CARE,
VK_ATTACHMENT_STORE_OP_DONT_CARE,
VK_IMAGE_LAYOUT_UNDEFINED,
VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL,
}
};

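The example above specifies two attachments: one with the R8G8B8A8_UNORM format and one with D16_UNORM. Both are cleared when the render pass begins (similar to glClear). When the render pass finishes, we want to keep the contents of the first attachment but not those of the second. Both use UNDEFINED as their initial layout. UNDEFINED can always be used as an initial/old layout; it means we do not need the image's previous contents when the memory barrier is set up.

The final layout depends on how the image is used after the render pass. If we render directly into a swapchain image and want to display it on screen, we need the PRESENT_SRC layout. For the depth attachment, if the depth component is not used after the render pass (which is usually the case), we should set the same layout as the one specified in the last subpass of the render pass.

A render pass may also use no attachments at all, in which case no attachment descriptions are needed, but this is rare.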

pass

Specifying subpass descriptions

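Operations within a render pass are organized into subpasses. Each subpass represents one stage or phase of the rendering commands, in which a subset of the render pass's attachments is used.

A render pass always needs at least one subpass, which is started automatically when the render pass begins. A description has to be prepared for every subpass.

To reduce the number of parameters, we define a custom structure. It is a simplified version of the VkSubpassDescription structure defined in the Vulkan headers.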

struct SubpassParameters {
  // Pipeline type (graphics or compute) in which the subpass is used.
  VkPipelineBindPoint PipelineType;
  std::vector<VkAttachmentReference> InputAttachments;
  std::vector<VkAttachmentReference> ColorAttachments;
  // Color attachments that should be resolved at the end of the subpass
  // (changed from multisampled to non-multisampled/single-sampled images).
  std::vector<VkAttachmentReference> ResolveAttachments;
  // If used, the attachment that serves as the depth and/or stencil attachment.
  VkAttachmentReference const * DepthStencilAttachment;
  // Attachments not used in the subpass whose contents must be preserved
  // for the duration of the subpass.
  std::vector<uint32_t> PreserveAttachments;
};

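To summarize:

A Vulkan render pass must have at least one subpass. Subpass parameters are defined in an array of VkSubpassDescription elements, each describing how attachments are used in the corresponding subpass. There are separate lists for input, color, resolve, and preserved attachments, plus a single entry for the depth/stencil attachment. Any of these members may be empty (or null), in which case the subpass does not use attachments of that type.

Each entry in the lists just described is a reference into the list of attachments specified for the render pass in the attachment descriptions. Each entry also specifies the layout the image should be in during the subpass. The driver performs the transition to the specified layout automatically.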
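Because every attachment reference is just an index into the attachment descriptions array, a quick sanity check before creating the render pass can catch out-of-range indices. The sketch below is only an illustration: `kAttachmentUnused` is a local stand-in for VK_ATTACHMENT_UNUSED (defined as `~0U` in the Vulkan headers), and `ReferencesAreValid` is a hypothetical helper, not part of the Vulkan API or of the original listings.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Local stand-in for VK_ATTACHMENT_UNUSED, which the Vulkan headers define as (~0U).
constexpr uint32_t kAttachmentUnused = ~0u;

// Hypothetical helper: returns true when every referenced index is either
// "unused" or a valid index into the render pass's attachment descriptions.
bool ReferencesAreValid(const std::vector<uint32_t>& referenced_indices,
                        uint32_t attachment_count) {
  for (uint32_t index : referenced_indices) {
    if (index != kAttachmentUnused && index >= attachment_count) {
      return false;
    }
  }
  return true;
}
```

With two attachment descriptions, references to indices 0 and 1 pass the check, while a reference to index 2 is rejected.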

The following code sample builds the subpass descriptions from the custom SubpassParameters structure:

subpass_descriptions.clear();
for( auto & subpass_description : subpass_parameters ) {
subpass_descriptions.push_back( {
0,
subpass_description.PipelineType,
static_cast<uint32_t>(subpass_description.InputAttachments.size()),
subpass_description.InputAttachments.data(),
static_cast<uint32_t>(subpass_description.ColorAttachments.size()),
subpass_description.ColorAttachments.data(),
subpass_description.ResolveAttachments.data(),
subpass_description.DepthStencilAttachment,
static_cast<uint32_t>(subpass_description.PreserveAttachments.size()),
subpass_description.PreserveAttachments.data()
} );
}

Below is an example of a single subpass that uses one color attachment and a depth/stencil attachment:

VkAttachmentReference depth_stencil_attachment = {
1,//index
VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL,
};
std::vector<SubpassParameters> subpass_parameters = {
{
VK_PIPELINE_BIND_POINT_GRAPHICS,
{},
{
{
0,//index
VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL
}
},
{},
&depth_stencil_attachment,
{}
}
};

specifying dependencies between subpasses

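Subpass dependencies have to be specified when subpasses depend on one another.

Defining subpass dependencies is similar to setting up memory barriers.

Specifying dependencies between subpasses (or between a subpass and commands before or after the render pass) resembles setting an image memory barrier. It is needed when commands in one subpass depend on the results of another subpass. Dependencies are not needed for layout transitions, which are performed automatically based on the render pass attachment and subpass descriptions. Likewise, if an attachment is only read in both subpasses, no dependency needs to be specified.

Subpass dependencies are also required for setting up image memory barriers inside a render pass. Without a so-called "self-dependency" (source and destination subpass indices are the same) this is not possible; once such a dependency has been defined for a subpass, a memory barrier can be recorded in it. In all other cases, the source subpass index must be lower than the destination subpass index (VK_SUBPASS_EXTERNAL excepted).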
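The index rules above can be summarized in a small classification function. This is a minimal sketch under stated assumptions: `kSubpassExternal` is a local stand-in for VK_SUBPASS_EXTERNAL (defined as `~0U` in the Vulkan headers), and `DependencyKind` and `ClassifyDependency` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-in for VK_SUBPASS_EXTERNAL, which the Vulkan headers define as (~0U).
constexpr uint32_t kSubpassExternal = ~0u;

enum class DependencyKind { External, SelfDependency, Forward, Invalid };

// Hypothetical helper: classifies a (source, destination) subpass index pair
// for a render pass that has subpass_count subpasses.
DependencyKind ClassifyDependency(uint32_t src, uint32_t dst, uint32_t subpass_count) {
  const bool src_external = (src == kSubpassExternal);
  const bool dst_external = (dst == kSubpassExternal);
  // Both ends external describes nothing inside this render pass.
  if (src_external && dst_external) return DependencyKind::Invalid;
  // Non-external indices must refer to an existing subpass.
  if (!src_external && src >= subpass_count) return DependencyKind::Invalid;
  if (!dst_external && dst >= subpass_count) return DependencyKind::Invalid;
  if (src_external || dst_external) return DependencyKind::External;
  // Same index: a self-dependency, which permits barriers inside the subpass.
  if (src == dst) return DependencyKind::SelfDependency;
  // Otherwise the source must come earlier than the destination.
  return (src < dst) ? DependencyKind::Forward : DependencyKind::Invalid;
}
```

For a render pass with two subpasses, the pair (0, 1) is a forward dependency, (1, 0) is rejected, and (0, 0) is a self-dependency.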

The following example prepares a dependency between two subpasses: the first draws geometry into color and depth attachments, and the second uses the color data for postprocessing.

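std::vector<VkSubpassDependency> subpass_dependencies = {
  {
    0,                                              // srcSubpass: the first subpass
    1,                                              // dstSubpass: the second subpass
    VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,  // srcStageMask: stage for subpass 0
    VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,          // dstStageMask: stage for subpass 1
    VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,           // srcAccessMask: access mask for subpass 0
    VK_ACCESS_INPUT_ATTACHMENT_READ_BIT,            // dstAccessMask: access mask for subpass 1
    // The first subpass writes a value at a coordinate and the second
    // subpass reads that value at the same coordinate.
    VK_DEPENDENCY_BY_REGION_BIT
  }
};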

When using VK_DEPENDENCY_BY_REGION_BIT, we should not assume that a region is larger than a single pixel, because the region size may differ between hardware platforms.

creating a render pass

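Operations such as postprocessing have to be ordered into subpasses. We specify descriptions of all required attachments, all subpasses that organize the operations, and the necessary dependencies between those operations. Once this data is ready, the render pass can be created.

vkCreateRenderPass

The most important part of creating a render pass is preparing the data: descriptions of all attachments used, of the subpasses, and of the dependencies between subpasses.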

SpecifyAttachmentsDescriptions( attachments_descriptions );
std::vector<VkSubpassDescription> subpass_descriptions;
SpecifySubpassDescriptions( subpass_parameters, subpass_descriptions );
SpecifyDependenciesBetweenSubpasses( subpass_dependencies );

These are used as parameters when creating the render pass:

VkRenderPassCreateInfo render_pass_create_info = {
VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO,
nullptr,
0,
static_cast<uint32_t>(attachments_descriptions.size()),
attachments_descriptions.data(),
static_cast<uint32_t>(subpass_descriptions.size()),
subpass_descriptions.data(),
static_cast<uint32_t>(subpass_dependencies.size()),
subpass_dependencies.data()
};
VkResult result = vkCreateRenderPass( logical_device,
&render_pass_create_info, nullptr, &render_pass );
if( VK_SUCCESS != result ) {
std::cout << "Could not create a render pass." << std::endl;
return false;
}
return true;

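For a render pass to work properly, it also needs information about the specific resources used for all the defined attachments. That information is stored in framebuffers.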

creating a framebuffer

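Framebuffers are used together with render passes. They specify which image resources are used for the corresponding attachments defined in a render pass, and they also define the size of the renderable area.

Framebuffers are always created for a render pass. They define the specific image subresources used for the attachments specified in the render pass, so the two object types must correspond to each other.

When creating a framebuffer, we provide a render pass object with which the framebuffer can be used. It is not limited to that particular render pass, however: it can be used with any compatible render pass.

What are compatible render passes? First, they must have the same number of subpasses, and each subpass must have compatible input, color, resolve, and depth/stencil attachments. In addition, remember that it is up to us to make sure that pixels/fragments outside the specified area are not modified. To do so, the appropriate parameters (viewport and scissor test) have to be specified when creating the pipeline or when setting the corresponding dynamic state (see "Preparing viewport and scissor test state" in chapter 8, Graphics and Compute Pipelines, and "Setting a dynamic viewport and scissors state" in chapter 9, Command Recording and Drawing).

When a render pass is begun with a framebuffer, we must make sure that the image subresources specified in the framebuffer are not used for any other purpose. In other words, an image used as a framebuffer attachment cannot be used for anything else while that render pass is active.

Creating the framebuffer: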

VkFramebufferCreateInfo framebuffer_create_info = {
VK_STRUCTURE_TYPE_FRAMEBUFFER_CREATE_INFO,
nullptr,
0,
render_pass,
static_cast<uint32_t>(attachments.size()),
attachments.data(),
width,
height,
layers
};
VkResult result = vkCreateFramebuffer( logical_device,
&framebuffer_create_info, nullptr, &framebuffer );
if( VK_SUCCESS != result ) {
std::cout << "Could not create a framebuffer." << std::endl;
return false;
}
return true;

preparing a render pass for geometry rendering and postprocess subpasses

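This section walks through an example with two subpasses. The first has two attachments, color and depth. The second reads data from the first color attachment and renders into another color attachment (a swapchain image that can be displayed on screen).

Three attachments: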

std::vector<VkAttachmentDescription> attachments_descriptions = {
{
0,
VK_FORMAT_R8G8B8A8_UNORM,
VK_SAMPLE_COUNT_1_BIT,
VK_ATTACHMENT_LOAD_OP_CLEAR,
VK_ATTACHMENT_STORE_OP_DONT_CARE,
VK_ATTACHMENT_LOAD_OP_DONT_CARE,
VK_ATTACHMENT_STORE_OP_DONT_CARE,
VK_IMAGE_LAYOUT_UNDEFINED,
VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL,
},
{
0,
VK_FORMAT_D16_UNORM,
VK_SAMPLE_COUNT_1_BIT,
VK_ATTACHMENT_LOAD_OP_CLEAR,
VK_ATTACHMENT_STORE_OP_DONT_CARE,
VK_ATTACHMENT_LOAD_OP_DONT_CARE,
VK_ATTACHMENT_STORE_OP_DONT_CARE,
VK_IMAGE_LAYOUT_UNDEFINED,
VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL,
},
{
0,
VK_FORMAT_R8G8B8A8_UNORM,
VK_SAMPLE_COUNT_1_BIT,
VK_ATTACHMENT_LOAD_OP_CLEAR,
VK_ATTACHMENT_STORE_OP_STORE,
VK_ATTACHMENT_LOAD_OP_DONT_CARE,
VK_ATTACHMENT_STORE_OP_DONT_CARE,
VK_IMAGE_LAYOUT_UNDEFINED,
VK_IMAGE_LAYOUT_PRESENT_SRC_KHR,
},
};

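The first attachment is the color attachment of the first subpass, from which the second subpass reads. The second attachment holds depth data, and the third is the color attachment of the second subpass. The first and second attachments are not needed after the render pass, so VK_ATTACHMENT_STORE_OP_DONT_CARE is specified for their store operations. Their contents are not needed at the start of the render pass either, so an undefined initial layout is specified. All three attachments are cleared.

Defining the two subpasses: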

VkAttachmentReference depth_stencil_attachment = {
1,
VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL,
};
std::vector<SubpassParameters> subpass_parameters = {
// #0 subpass
{
VK_PIPELINE_BIND_POINT_GRAPHICS,
{},
{
{
0,
VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL
}
},
{},
&depth_stencil_attachment,
{}
},
// #1 subpass
{
VK_PIPELINE_BIND_POINT_GRAPHICS,
{
{
0,
VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL
}
},
{
{
2,
VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL
}
},
{},
nullptr,
{}
}
};

Finally, we define a dependency between the two subpasses for the first attachment, which is a color attachment in one and an input attachment in the other. Then we create the render pass:

std::vector<VkSubpassDependency> subpass_dependencies = {
{
0,
1,
VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
VK_ACCESS_INPUT_ATTACHMENT_READ_BIT,
VK_DEPENDENCY_BY_REGION_BIT
}
};
if( !CreateRenderPass( logical_device, attachments_descriptions,
subpass_parameters, subpass_dependencies, render_pass ) ) {
return false;
}
return true;

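preparing a render pass and a framebuffer with color and depth attachments

Rendering a 3D scene needs not only a color attachment but also a depth attachment for depth testing.

This section shows how to create images for color and depth data, a render pass with a single subpass that renders into the color and depth attachments, and a framebuffer that uses both images as the render pass attachments.

1. To serve as render targets in the render pass, the images need the COLOR_ATTACHMENT / DEPTH_STENCIL_ATTACHMENT usages.

2. To be sampled as textures after the render pass, they need the SAMPLED usage.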

if( !Create2DImageAndView( physical_device, logical_device,
VK_FORMAT_R8G8B8A8_UNORM, { width, height }, 1, 1, VK_SAMPLE_COUNT_1_BIT,
VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT | VK_IMAGE_USAGE_SAMPLED_BIT,
VK_IMAGE_ASPECT_COLOR_BIT, color_image, color_image_memory_object,
color_image_view ) ) {
return false;
}
if( !Create2DImageAndView( physical_device, logical_device,
VK_FORMAT_D16_UNORM, //format
{ width, height }, 1, 1, VK_SAMPLE_COUNT_1_BIT,
VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT | VK_IMAGE_USAGE_SAMPLED_BIT,//usage
VK_IMAGE_ASPECT_DEPTH_BIT, depth_image, depth_image_memory_object,
depth_image_view ) ) {
return false;
}

Next, the two attachments are specified for the render pass. Both are cleared at the beginning of the render pass and both keep their contents after it:

std::vector<VkAttachmentDescription> attachments_descriptions = {
{
0,
VK_FORMAT_R8G8B8A8_UNORM,
VK_SAMPLE_COUNT_1_BIT,
VK_ATTACHMENT_LOAD_OP_CLEAR,
VK_ATTACHMENT_STORE_OP_STORE,
VK_ATTACHMENT_LOAD_OP_DONT_CARE,
VK_ATTACHMENT_STORE_OP_DONT_CARE,
VK_IMAGE_LAYOUT_UNDEFINED,
VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL,
},
{
0,
VK_FORMAT_D16_UNORM,
VK_SAMPLE_COUNT_1_BIT,
VK_ATTACHMENT_LOAD_OP_CLEAR,
VK_ATTACHMENT_STORE_OP_STORE,
VK_ATTACHMENT_LOAD_OP_DONT_CARE,
VK_ATTACHMENT_STORE_OP_DONT_CARE,
VK_IMAGE_LAYOUT_UNDEFINED,
VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL,
}
};

The next step is to define a single subpass that uses the first attachment as color and the second as depth/stencil:

VkAttachmentReference depth_stencil_attachment = {
1,
VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL,
};
std::vector<SubpassParameters> subpass_parameters = {
{
VK_PIPELINE_BIND_POINT_GRAPHICS,
{},
{
{
0,
VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL
}
},
{},
&depth_stencil_attachment,
{}
}
};

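Finally, we define a dependency between the subpass and the commands executed after the render pass.

This is necessary because we do not want other commands to start reading from the images while the render pass is still writing to them. The render pass and the framebuffer are created as well: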

std::vector<VkSubpassDependency> subpass_dependencies = {
{
0,
VK_SUBPASS_EXTERNAL,
VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
VK_ACCESS_SHADER_READ_BIT,
0
}
};
if( !CreateRenderPass( logical_device, attachments_descriptions,
subpass_parameters, subpass_dependencies, render_pass ) ) {
return false;
}
if( !CreateFramebuffer( logical_device, render_pass,
{ color_image_view,depth_image_view },
width, height, 1, framebuffer ) ) {
return false;
}
return true;

beginning a render pass

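Once the render pass and framebuffer have been created and we are ready to record the commands that draw geometry, we must record an operation that begins the render pass. This also automatically starts its first subpass. Before doing so, a VkRenderPassBeginInfo structure has to be prepared: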

VkRenderPassBeginInfo render_pass_begin_info = {
VK_STRUCTURE_TYPE_RENDER_PASS_BEGIN_INFO,
nullptr,
render_pass,
framebuffer,
render_area,
static_cast<uint32_t>(clear_values.size()),
clear_values.data()
};

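The array of clear values must have at least enough elements to cover the highest-indexed attachment that is cleared. Values are only needed for the attachments that are actually cleared. If no attachment is cleared, nullptr can be provided.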
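The minimum length of the clear values array follows directly from the attachments' load operations. The sketch below illustrates that rule with a simplified model: `LoadOp` is a local stand-in for VkAttachmentLoadOp, `RequiredClearValueCount` is a hypothetical helper, and each attachment is assumed to have a single load operation (a separate stencil load operation is ignored here).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Local stand-in for the relevant VkAttachmentLoadOp values.
enum class LoadOp { Load, Clear, DontCare };

// Hypothetical helper: smallest clear-value array length that still covers
// the highest-indexed attachment cleared at the start of the render pass.
uint32_t RequiredClearValueCount(const std::vector<LoadOp>& load_ops) {
  uint32_t required = 0;
  for (uint32_t i = 0; i < load_ops.size(); ++i) {
    if (load_ops[i] == LoadOp::Clear) {
      required = i + 1;  // the array must reach index i
    }
  }
  return required;  // 0 means no clear values are needed at all
}
```

If only attachment 2 of three is cleared, three clear values are still required, because the values are looked up by attachment index.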

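When beginning a render pass, we also provide the dimensions of the render area. It can be as large as the framebuffer or smaller. It is up to us to confine rendering to the specified area; otherwise pixels outside of it may become undefined.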
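A simple way to guard against an oversized render area is to compare it with the framebuffer dimensions before recording the begin command. This is a minimal sketch: `Offset2D`, `Extent2D`, and `Rect2D` are local stand-ins that mirror the layout of VkOffset2D, VkExtent2D, and VkRect2D, and `RenderAreaFits` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins mirroring VkOffset2D, VkExtent2D and VkRect2D.
struct Offset2D { int32_t x; int32_t y; };
struct Extent2D { uint32_t width; uint32_t height; };
struct Rect2D { Offset2D offset; Extent2D extent; };

// Hypothetical helper: true when the render area lies entirely inside a
// framebuffer of the given dimensions.
bool RenderAreaFits(const Rect2D& area, const Extent2D& framebuffer) {
  if (area.offset.x < 0 || area.offset.y < 0) {
    return false;
  }
  // 64-bit sums cannot overflow when adding a 32-bit offset and extent.
  const uint64_t right = static_cast<uint64_t>(area.offset.x) + area.extent.width;
  const uint64_t bottom = static_cast<uint64_t>(area.offset.y) + area.extent.height;
  return right <= framebuffer.width && bottom <= framebuffer.height;
}
```

A 640x480 area at offset (0, 0) fits a 640x480 framebuffer, but the same area shifted by one pixel does not.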

vkCmdBeginRenderPass( command_buffer, &render_pass_begin_info,
subpass_contents );

progressing to the next subpass

The commands recorded in a render pass are divided among its subpasses. When the set of commands for a given subpass has been recorded and we want to record commands for another subpass of the same render pass, we need to switch (or progress) to the next subpass.

During this operation the appropriate layout transitions are performed, and memory and execution dependencies are introduced (similar to those in memory barriers). The driver does all of this automatically, so that the new subpass can use the attachments in the way specified when the render pass was created. Moving to the next subpass also performs multisample resolve operations on the specified color attachments.

Commands in the subpass can be recorded directly, by inlining them in the command buffer, or indirectly by executing a secondary command buffer.

vkCmdNextSubpass( command_buffer, subpass_contents );

ending a render pass

vkCmdEndRenderPass( command_buffer );

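Recording this function in a command buffer performs several operations. Execution and memory dependencies are introduced (like those in memory barriers), and image layout transitions are performed from the layouts specified for the last subpass to the final layouts (see "specifying attachments descriptions"). Multisample resolving is also performed on color attachments for which resolving was specified in the last subpass. Additionally, for attachments whose contents should be preserved after the render pass, attachment data may be transferred from the cache to the image's memory.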

destroy

destroy a framebuffer

if( VK_NULL_HANDLE != framebuffer ) {
vkDestroyFramebuffer( logical_device, framebuffer, nullptr );
framebuffer = VK_NULL_HANDLE;
}

Before destroying it, make sure that no commands using it are still executing.

destroying a render pass

if( VK_NULL_HANDLE != render_pass ) {
vkDestroyRenderPass( logical_device, render_pass, nullptr );
render_pass = VK_NULL_HANDLE;
}

Image Presentation

Posted on 2019-04-05 | In sdk , graphics , vulkan

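Contents

  • Creating a Vulkan Instance with WSI extensions enabled
  • Creating a presentation surface
  • Selecting a queue family that supports presentation to a given surface
  • Creating a logical device with WSI extensions enabled
  • Selecting a desired presentation mode
  • Getting the capabilities of a presentation surface
  • Choosing the size of swapchain images
  • Selecting desired usage scenarios of swapchain images
  • Selecting a transformation of swapchain images
  • Selecting a format of swapchain images
  • Creating a swapchain
  • Getting handles of swapchain images
  • Creating a swapchain with the R8G8B8A8 format and the mailbox presentation mode
  • Acquiring a swapchain image
  • Destroying a swapchain
  • Destroying a presentation surface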

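Introduction

For portability, the Vulkan API itself has no interface for displaying generated images in a window. That capability is provided by a set of extensions called Windowing System Integration (WSI). Each operating system that supports Vulkan has its own extensions.

The most important extension is the one that lets us create a swapchain. A swapchain is a set of images that can be presented (displayed) to the user.

Steps

instance with WSI extensions

WSI extensions are divided into instance-level and device-level extensions.

The first step is to create a Vulkan Instance with the extensions enabled that allow a presentation surface to be created.

Instance-level extensions are responsible for managing, creating, and destroying a presentation surface, which is a (cross-platform) representation of an application's window. Through it we can check whether we are able to draw to the window (presenting images is an additional property of a queue family), and query its parameters and what it supports (for example, whether vertical sync can be enabled or disabled).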

// Supported on every OS; used to manage and destroy presentation surfaces.
desired_extensions.emplace_back( VK_KHR_SURFACE_EXTENSION_NAME );
desired_extensions.emplace_back(
#ifdef VK_USE_PLATFORM_WIN32_KHR
  VK_KHR_WIN32_SURFACE_EXTENSION_NAME // Windows
#elif defined VK_USE_PLATFORM_XCB_KHR
  VK_KHR_XCB_SURFACE_EXTENSION_NAME // Linux, XCB
#elif defined VK_USE_PLATFORM_XLIB_KHR
  VK_KHR_XLIB_SURFACE_EXTENSION_NAME // Linux, Xlib
#endif
);
return CreateVulkanInstance( desired_extensions, application_name, instance );

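Creating a presentation surface

A presentation surface represents the application's window and lets us query the window's parameters (such as dimensions, supported color formats, the required number of images, or presentation modes).

Prerequisite: the window has already been created.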

struct WindowParameters {
#ifdef VK_USE_PLATFORM_WIN32_KHR
HINSTANCE HInstance;
HWND HWnd;
#elif defined VK_USE_PLATFORM_XLIB_KHR
Display * Dpy;
Window Window;
#elif defined VK_USE_PLATFORM_XCB_KHR
xcb_connection_t * Connection;
xcb_window_t Window;
#endif
};

#ifdef VK_USE_PLATFORM_WIN32_KHR
VkWin32SurfaceCreateInfoKHR surface_create_info = {
VK_STRUCTURE_TYPE_WIN32_SURFACE_CREATE_INFO_KHR,
nullptr,
0,
window_parameters.HInstance,
window_parameters.HWnd
};
VkResult result = vkCreateWin32SurfaceKHR( instance, &surface_create_info,
nullptr, &presentation_surface );
#elif defined VK_USE_PLATFORM_XLIB_KHR
VkXlibSurfaceCreateInfoKHR surface_create_info = {
VK_STRUCTURE_TYPE_XLIB_SURFACE_CREATE_INFO_KHR,
nullptr,
0,
window_parameters.Dpy,
window_parameters.Window
};
VkResult result = vkCreateXlibSurfaceKHR( instance, &surface_create_info,
nullptr, &presentation_surface );
#elif defined VK_USE_PLATFORM_XCB_KHR
VkXcbSurfaceCreateInfoKHR surface_create_info = {
VK_STRUCTURE_TYPE_XCB_SURFACE_CREATE_INFO_KHR,
nullptr,
0,
window_parameters.Connection,
window_parameters.Window
};
VkResult result = vkCreateXcbSurfaceKHR( instance, &surface_create_info,
nullptr, &presentation_surface );
#endif

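Selecting a queue family that supports presentation to a given surface

Images are displayed by submitting a special command to a device queue, so the queue has to support it.

The capabilities a queue family may currently support are: image presentation, graphics, compute, transfer, and sparse operations.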

for( uint32_t index = 0; index <
static_cast<uint32_t>(queue_families.size()); ++index ) {
VkBool32 presentation_supported = VK_FALSE;
VkResult result = vkGetPhysicalDeviceSurfaceSupportKHR( physical_device,
index, presentation_surface, &presentation_supported );
if( (VK_SUCCESS == result) &&
(VK_TRUE == presentation_supported) ) {
queue_family_index = index;
return true;
}
}
return false;

The check is performed with vkGetPhysicalDeviceSurfaceSupportKHR.

logical device with WSI extensions

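A device-level WSI extension allows a swapchain to be created: a set of images managed by the presentation engine.

VK_KHR_swapchain

A swapchain specifies the image format, the number of images (double or triple buffering), and the presentation mode (v-sync enabled/disabled). The images created with the swapchain are owned and managed by the presentation engine. To use one, we acquire it, draw into it, and hand it back to the presentation engine.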

desired_extensions.emplace_back( VK_KHR_SWAPCHAIN_EXTENSION_NAME );
return CreateLogicalDevice( physical_device, queue_infos,
desired_extensions, desired_features, logical_device );

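Selecting a desired presentation mode

The most important feature of a Vulkan swapchain is displaying images on screen, which is what the swapchain is designed for.

There are four modes.

The simplest is IMMEDIATE mode, in which screen tearing can occur.

FIFO is the mode that every Vulkan API implementation must support.

FIFO RELAXED is a slight variation of FIFO. The difference is that in RELAXED mode images are displayed in sync with the blanking periods only when they are presented fast enough, faster than the refresh rate. If the application presents an image and the time since the last presentation is longer than the refresh time between two blanking periods (the FIFO queue is empty), the image is displayed immediately. So if rendering is fast enough there is no tearing, but if it is slower than the screen refresh, tearing becomes visible. This mode is similar to OpenGL's EXT_swap_control_tear extension.

The last mode is MAILBOX. It can be regarded as triple buffering. There is a queue holding only one element. An image waits in this queue to be displayed in sync with the blanking period (v-sync enabled). When the application presents a new image, it replaces the one waiting in the queue. The presentation engine therefore always displays the latest image, and there is no screen tearing.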
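The single-entry queue behind MAILBOX mode can be illustrated with a toy model. The class below is not Vulkan code and does not reflect how any presentation engine is actually implemented; `MailboxQueue` and its methods are hypothetical names, and the sketch assumes a C++17 compiler for `std::optional`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Toy model of a one-element presentation queue (assumed behavior, for
// illustration only).
class MailboxQueue {
 public:
  // Presents an image. Returns the index of the image that was waiting and
  // got replaced (and is therefore free for reuse), if there was one.
  std::optional<uint32_t> Present(uint32_t image_index) {
    std::optional<uint32_t> replaced = pending_;
    pending_ = image_index;
    return replaced;
  }

  // Called at each vertical blank: the waiting image, if any, goes on screen.
  // Returns the image that was displayed before the swap, if a swap happened.
  std::optional<uint32_t> OnVerticalBlank() {
    if (!pending_) {
      return std::nullopt;
    }
    std::optional<uint32_t> previous = displayed_;
    displayed_ = pending_;
    pending_.reset();
    return previous;
  }

  std::optional<uint32_t> displayed() const { return displayed_; }

 private:
  std::optional<uint32_t> pending_;    // the single queue slot
  std::optional<uint32_t> displayed_;  // image currently on screen
};
```

Presenting images 0 and then 1 before a vertical blank leaves only image 1 waiting; image 0 is handed back without ever being shown, which is why the newest image always wins.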

The available modes are queried with vkGetPhysicalDeviceSurfacePresentModesKHR.

uint32_t present_modes_count = 0;
VkResult result = VK_SUCCESS;
result = vkGetPhysicalDeviceSurfacePresentModesKHR( physical_device,
presentation_surface, &present_modes_count, nullptr );
if( (VK_SUCCESS != result) ||
(0 == present_modes_count) ) {
std::cout << "Could not get the number of supported present modes." <<
std::endl;
return false;
}
std::vector<VkPresentModeKHR> present_modes( present_modes_count );
result = vkGetPhysicalDeviceSurfacePresentModesKHR( physical_device,
presentation_surface, &present_modes_count, &present_modes[0] );
if( (VK_SUCCESS != result) ||
(0 == present_modes_count) ) {
std::cout << "Could not enumerate present modes." << std::endl;
return false;
}

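Once all supported modes are known, select the desired one. If it is not supported, fall back to the default FIFO mode, which is always supported.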

for( auto & current_present_mode : present_modes ) {
if( current_present_mode == desired_present_mode ) {
present_mode = desired_present_mode;
return true;
}
}
std::cout << "Desired present mode is not supported. Selecting default FIFO mode." << std::endl;
for( auto & current_present_mode : present_modes ) {
if( current_present_mode == VK_PRESENT_MODE_FIFO_KHR ) {
present_mode = VK_PRESENT_MODE_FIFO_KHR;
return true;
}
}

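Getting the capabilities of a presentation surface

When creating a swapchain, we cannot simply pass whatever parameter values we like. The values must lie within the limits supported by the presentation surface. So, to create a swapchain correctly, we first need to query the surface's capabilities.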

VkResult result = vkGetPhysicalDeviceSurfaceCapabilitiesKHR(
physical_device, presentation_surface, &surface_capabilities );
if( VK_SUCCESS != result ) {
std::cout << "Could not get the capabilities of a presentation surface."
<< std::endl;
return false;
}
return true;

VkSurfaceCapabilitiesKHR

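  • the minimum and maximum allowed number of swapchain images
  • the minimum, maximum, and current extent of the surface
  • the supported image transformations (applied before presentation), such as rotation or mirroring, along with the current transformation
  • the maximum number of image layers
  • the supported usages
  • the supported composite alpha modes (how the image's alpha component affects compositing of the application's window with the desktop)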

This part covers:

  • creating a presentation surface
  • selecting the number of swapchain images
  • choosing the size of swapchain images
  • selecting the desired usage of swapchain images
  • selecting a transformation of swapchain images
  • selecting a format of swapchain images
  • creating a swapchain

selecting the number of swapchain images

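When an application wants to render into a swapchain image, it has to acquire it from the presentation engine. An application can acquire several images; it is not limited to one at a time. The number of available images (those not in use by the presentation engine), however, depends on the presentation mode, the application's current state (its rendering/presentation history), and the number of images specified (as a minimum) when the swapchain was created.

Related structure

VkSurfaceCapabilitiesKHR

.minImageCount: minImageCount + 1 is typically used as the requested number.

.maxImageCount: if greater than 0, there is an upper limit on the number of images that can be created, and the requested number has to be clamped to it.

Images created with a swapchain are mainly used for presentation, but they also allow the presentation engine to work properly. The application cannot use the image that is currently displayed until it has been replaced. A presented image replaces the displayed one immediately or waits in a queue to replace it (v-sync), depending on the selected mode. An application can only acquire images in the unused state (and it can acquire all of them). Once it has done so, it must present at least one image, otherwise the acquire operation may block indefinitely.

The number of unused images depends mainly on the presentation mode and the total number of images created with the swapchain. The number of images to create should therefore be chosen based on the rendering scenario we want to implement (how many images the application wants to own at the same time) and the selected presentation mode.

Requesting the minimum number of images: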

number_of_images = surface_capabilities.minImageCount + 1;
if( (surface_capabilities.maxImageCount > 0) &&
(number_of_images > surface_capabilities.maxImageCount) ) {
number_of_images = surface_capabilities.maxImageCount;
}
return true;

Choose the number of images that the application's requirements call for.

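choose swapchain images size

Usually the images should match the window size; the supported dimensions are available in the presentation surface's capabilities. On some operating systems, however, the size of the images determines the final size of the window.

Also remember to check that the chosen dimensions of the swapchain images are valid.

Related structures

VkSurfaceCapabilitiesKHR

VkExtent2D

The check: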

// If surface_capabilities.currentExtent.width is 0xFFFFFFFF (i.e. -1),
// the size of the images determines the size of the window.
if( 0xFFFFFFFF == surface_capabilities.currentExtent.width )
{
  // In this case, pick a size and adjust it to the surface's limits.
  size_of_images = { 640, 480 };
  // Clamp to the supported range.
  if( size_of_images.width < surface_capabilities.minImageExtent.width )
  {
    size_of_images.width = surface_capabilities.minImageExtent.width;
  }
  else if( size_of_images.width > surface_capabilities.maxImageExtent.width )
  {
    size_of_images.width = surface_capabilities.maxImageExtent.width;
  }

  if( size_of_images.height < surface_capabilities.minImageExtent.height )
  {
    size_of_images.height = surface_capabilities.minImageExtent.height;
  }
  else if( size_of_images.height > surface_capabilities.maxImageExtent.height )
  {
    size_of_images.height = surface_capabilities.maxImageExtent.height;
  }
}
else
{
  // Otherwise use the surface's size as the size of the images.
  size_of_images = surface_capabilities.currentExtent; // currentExtent is the size of the created window
}
return true;

Normally the window size is used as the size of the images, but on some operating systems the size of the swapchain images determines the window size.

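selecting desired usage scenarios of swapchain images

Images created with a swapchain are usually used as color attachments (VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT), meaning we render into them (as render targets). They are not limited to that, though: they can be sampled, used as the source of a copy operation, or used as a copy destination. These are different usages specified when the swapchain is created, and we need to check that the desired usages are supported.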

image_usage = desired_usages & surface_capabilities.supportedUsageFlags;
return desired_usages == image_usage;

select a transformation of swapchain images

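On some devices (mobile devices in particular), images can be viewed from different orientations, and sometimes we need to control how an image is oriented when it is displayed on screen. Vulkan makes this possible by letting us specify a transformation applied to the image before presentation.

Related structure

VkSurfaceTransformFlagBitsKHR

Transformations define how an image is rotated or mirrored before it is displayed on screen. At swapchain creation we can specify the desired transformation, and the presentation engine applies it as part of the presentation process.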

if( surface_capabilities.supportedTransforms & desired_transform ) {
surface_transform = desired_transform;
} else {
surface_transform = surface_capabilities.currentTransform;
}

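selecting a format of swapchain images

The format defines the number of color components and the number of bits and data type of each component. When creating a swapchain, we need to decide:

  • which color channels to use
  • whether to use an unsigned integer or floating-point type
  • the precision
  • linear or nonlinear color space

Only supported values can be chosen.

Related structures

VkFormat

VkColorSpaceKHR

VkSurfaceFormatKHR

VkSurfaceKHR

To get all supported formats, call vkGetPhysicalDeviceSurfaceFormatsKHR twice and store the results in a list of VkSurfaceFormatKHR elements. If only one element is returned and its format is VK_FORMAT_UNDEFINED, there are no restrictions on the format.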

if( (1 == surface_formats.size()) &&
(VK_FORMAT_UNDEFINED == surface_formats[0].format) ) {
image_format = desired_surface_format.format;
image_color_space = desired_surface_format.colorSpace;
return true;
}

When a list of VkSurfaceFormatKHR elements is returned, we look for an entry that matches both the desired image format and the desired color space.

for( auto & surface_format : surface_formats ) {
if( (desired_surface_format.format == surface_format.format) &&
(desired_surface_format.colorSpace == surface_format.colorSpace) ) {
image_format = desired_surface_format.format;
image_color_space = desired_surface_format.colorSpace;
return true;
}
}

Finally, if the desired format is not supported, select the first available one:

image_format = surface_formats[0].format;
image_color_space = surface_formats[0].colorSpace;
std::cout << "Desired format is not supported. Selecting available format - colorspace combination." << std::endl;
return true;

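Creating a swapchain

A swapchain is used to display images on screen. It is a set of images that the application can acquire and then present in its window. All of them share the same properties. Once all the parameters are prepared (the number, size, format, and usage of the swapchain images, plus a supported presentation mode), the swapchain can be created.

A swapchain is a set of images that are created and destroyed automatically along with the swapchain.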

VkSwapchainCreateInfoKHR swapchain_create_info = {
VK_STRUCTURE_TYPE_SWAPCHAIN_CREATE_INFO_KHR,
nullptr,
0,
presentation_surface,//surface
image_count,//minImageCount
surface_format.format,
surface_format.colorSpace,//imageColorSpace
image_size,
1,//imageArrayLayers
image_usage,
VK_SHARING_MODE_EXCLUSIVE,//imageSharingMode
0,//queueFamilyIndexCount
nullptr,
surface_transform,
VK_COMPOSITE_ALPHA_OPAQUE_BIT_KHR,
present_mode,
VK_TRUE,
old_swapchain
};
VkResult result = vkCreateSwapchainKHR( logical_device,
&swapchain_create_info, nullptr, &swapchain );
if( (VK_SUCCESS != result) ||
(VK_NULL_HANDLE == swapchain) ) {
std::cout << "Could not create a swapchain." << std::endl;
return false;
}

vkDestroySwapchainKHR

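Only one swapchain can be associated with an application window. When a new swapchain is created, the swapchain previously created for that window has to be destroyed.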

if( VK_NULL_HANDLE != old_swapchain ) {
vkDestroySwapchainKHR( logical_device, old_swapchain, nullptr );
old_swapchain = VK_NULL_HANDLE;
}

Getting handles of swapchain images

vkGetSwapchainImagesKHR

uint32_t images_count = 0;
VkResult result = VK_SUCCESS;
result = vkGetSwapchainImagesKHR( logical_device, swapchain, &images_count,
nullptr );
if( (VK_SUCCESS != result) ||
(0 == images_count) ) {
std::cout << "Could not get the number of swapchain images." <<
std::endl;
return false;
}
swapchain_images.resize( images_count );
result = vkGetSwapchainImagesKHR( logical_device, swapchain, &images_count,
&swapchain_images[0] );
if( (VK_SUCCESS != result) ||
(0 == images_count) ) {
std::cout << "Could not enumerate swapchain images." << std::endl;
return false;
}
return true;

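The driver may create more images than the number passed when creating the swapchain. We specify a minimum, but the Vulkan implementation is allowed to create more.

In Vulkan, rendering into an image requires its handle. An image view wrapping the image has to be created and is then used when creating a framebuffer. A framebuffer is a set of images used during rendering.

When a swapchain image is acquired, what we get is a number rather than the handle itself. This number is an index into the array of images obtained with vkGetSwapchainImagesKHR. Knowing the total number of images, their order, and their handles is therefore necessary for using the swapchain and its images correctly.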

Creating a swapchain with the R8G8B8A8 format and the mailbox presentation mode

No transformations; standard color attachment image usage.

Given:

VkPhysicalDevice physical_device;
VkSurfaceKHR presentation_surface;
VkDevice logical_device;
VkSwapchainKHR old_swapchain;
VkPresentModeKHR desired_present_mode;
VkSurfaceCapabilitiesKHR surface_capabilities;

uint32_t number_of_images;
VkExtent2D image_size;
VkImageUsageFlags image_usage = VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT;

VkPresentModeKHR desired_present_mode;
if( !SelectDesiredPresentationMode( physical_device, presentation_surface,
VK_PRESENT_MODE_MAILBOX_KHR, desired_present_mode ) ) {
return false;
}
VkSurfaceCapabilitiesKHR surface_capabilities;
if( !GetCapabilitiesOfPresentationSurface( physical_device,
presentation_surface, surface_capabilities ) ) {
return false;
}
uint32_t number_of_images;
if( !SelectNumberOfSwapchainImages( surface_capabilities, number_of_images
) ) {
return false;
}
VkExtent2D image_size;
if( !ChooseSizeOfSwapchainImages( surface_capabilities, image_size ) ) {
return false;
}
if( (0 == image_size.width) ||
(0 == image_size.height) ) {
return true;
}
VkImageUsageFlags image_usage;
if( !SelectDesiredUsageScenariosOfSwapchainImages( surface_capabilities,
VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT, image_usage ) ) {
return false;
}
VkSurfaceTransformFlagBitsKHR surface_transform;
SelectTransformationOfSwapchainImages( surface_capabilities,
VK_SURFACE_TRANSFORM_IDENTITY_BIT_KHR, surface_transform );
VkFormat image_format;
VkColorSpaceKHR image_color_space;
if( !SelectFormatOfSwapchainImages( physical_device, presentation_surface,
{ VK_FORMAT_R8G8B8A8_UNORM, VK_COLOR_SPACE_SRGB_NONLINEAR_KHR },
image_format, image_color_space ) ) {
return false;
}


if( !CreateSwapchain( logical_device, presentation_surface,
number_of_images, { image_format, image_color_space }, image_size,
image_usage, surface_transform, desired_present_mode, old_swapchain,
swapchain ) ) {
return false;
}
if( !GetHandlesOfSwapchainImages( logical_device, swapchain,
swapchain_images ) ) {
return false;
}
return true;

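Acquiring a swapchain image

vkAcquireNextImageKHR

semaphores and fences

Semaphores synchronize the device's queues with each other. They cannot be used to synchronize the application with the submitted commands.

Fences synchronize the application with the device.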

VkResult result;
result = vkAcquireNextImageKHR( logical_device, swapchain, 2000000000,
semaphore, fence, &image_index );
switch( result ) {
case VK_SUCCESS:
case VK_SUBOPTIMAL_KHR:
return true;
default:
return false;
}

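In immediate mode it can happen that no image is available, so the acquire call may fail. The third argument is a timeout in nanoseconds; the value 2000000000 used above is two seconds.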
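
Because the timeout is a raw nanosecond count, it is easy to get the number of zeros wrong. A minimal sketch of a conversion helper, assuming the caller prefers std::chrono durations; the name ToTimeoutNs is hypothetical and not part of Vulkan:

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: converts a std::chrono duration into the
// nanosecond count expected by timeout parameters such as the third
// argument of vkAcquireNextImageKHR.
template <typename Rep, typename Period>
uint64_t ToTimeoutNs(std::chrono::duration<Rep, Period> timeout) {
  assert(timeout.count() >= 0);  // negative timeouts make no sense here
  return static_cast<uint64_t>(
      std::chrono::duration_cast<std::chrono::nanoseconds>(timeout).count());
}
```

With such a helper the call site can say ToTimeoutNs( std::chrono::seconds( 2 ) ) instead of spelling out 2000000000.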

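A semaphore is used when the driver should wait before it starts processing commands that use the image.

Waiting on the application side (with a fence) has a larger performance impact.

The return value matters as well.

VK_SUBOPTIMAL_KHR means the image can still be used, but the swapchain no longer matches the presentation engine optimally. The swapchain should be recreated, though not necessarily right away.

VK_ERROR_OUT_OF_DATE_KHR means the image can no longer be used and the swapchain has to be recreated immediately.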
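
The return-value handling described here can be sketched as a small decision function. This is only an illustration: the two enums are stand-ins invented for this example so that it builds without the Vulkan headers; real code would switch on the VkResult returned by vkAcquireNextImageKHR.

```cpp
// Stand-in for the VkResult values discussed above (assumption: real code
// would use VK_SUCCESS, VK_SUBOPTIMAL_KHR and VK_ERROR_OUT_OF_DATE_KHR).
enum class AcquireResult { Success, Suboptimal, OutOfDate, OtherError };

// Hypothetical set of actions a frame loop could take after acquiring.
enum class FrameAction { Render, RenderThenRecreate, RecreateNow, Fail };

inline FrameAction DecideFrameAction(AcquireResult result) {
  switch (result) {
    case AcquireResult::Success:
      return FrameAction::Render;              // image is fine, just draw
    case AcquireResult::Suboptimal:
      return FrameAction::RenderThenRecreate;  // usable now, recreate later
    case AcquireResult::OutOfDate:
      return FrameAction::RecreateNow;         // image must not be used
    default:
      return FrameAction::Fail;                // any other error
  }
}
```

Note that the acquire function shown earlier treats VK_ERROR_OUT_OF_DATE_KHR like any other failure; separating it out lets the caller recreate the swapchain instead of aborting.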

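One last thing to keep in mind about swapchains: before an image can be used, its layout has to be changed (transitioned). The layout is the image's internal memory organization, and the current one may not suit the intended purpose. To use an image for a different purpose, change its layout.

For example, images handed to the presentation engine must be in the VK_IMAGE_LAYOUT_PRESENT_SRC_KHR layout, while images being rendered to must be in VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL. The operation that changes a layout is called a transition.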

present an image

struct PresentInfo {
VkSwapchainKHR Swapchain;
uint32_t ImageIndex; // index of the swapchain image to present
};

VkPresentInfoKHR present_info = {
VK_STRUCTURE_TYPE_PRESENT_INFO_KHR,
nullptr,
static_cast<uint32_t>(rendering_semaphores.size()),
rendering_semaphores.size() > 0 ? &rendering_semaphores[0] : nullptr,
static_cast<uint32_t>(swapchains.size()),
swapchains.size() > 0 ? &swapchains[0] : nullptr,
swapchains.size() > 0 ? &image_indices[0] : nullptr,
nullptr
};
result = vkQueuePresentKHR( queue, &present_info );
switch( result ) {
case VK_SUCCESS:
return true;
default:
return false;
}

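Before an image is presented, its layout has to be changed to VK_IMAGE_LAYOUT_PRESENT_SRC_KHR; otherwise the presentation engine may not be able to display it.

When the present command is submitted, rendering_semaphores are used for synchronization: presentation waits on them.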

destroy

if( swapchain ) {
vkDestroySwapchainKHR( logical_device, swapchain, nullptr );
swapchain = VK_NULL_HANDLE;
}

if( presentation_surface ) {
vkDestroySurfaceKHR( instance, presentation_surface, nullptr );
presentation_surface = VK_NULL_HANDLE;
}

Command Recording and Drawing

Posted on 2019-04-05 | In sdk , graphics , vulkan

9.Command Recording and Drawing

[TOC]

Introduction

Contents

  • Clearing a color image
  • Clearing a depth-stencil image
  • Clearing render pass attachments
  • Binding vertex buffers
  • Binding an index buffer
  • Providing data to shaders through push constants
  • Setting viewport state dynamically
  • Setting scissor state dynamically
  • Setting line width state dynamically
  • Setting depth bias state dynamically
  • Setting blend constants state dynamically
  • Drawing a geometry
  • Drawing an indexed geometry
  • Dispatching compute work
  • Executing a secondary command buffer inside a primary command buffer
  • Recording a command buffer that draws a geometry with a dynamic viewport
    and scissor states
  • Recording command buffers on multiple threads
  • Preparing a single frame of animation
  • Increasing performance through increasing the number of separately rendered
    frames

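Overview

Vulkan is designed as a graphics and compute API. Its main purpose is to let us generate dynamic images on graphics hardware produced by many different vendors.

We already know how to create and manage resources and how to use them in shaders. We have seen the different shader stages and the pipeline objects that control rendering state or dispatch computational work. The last piece is knowing how to actually draw images.

This post covers commands. We first learn about drawing commands and how to organize them in our source code for the best performance, and finish with one of the most powerful capabilities of the Vulkan API: recording command buffers on multiple threads.

Preparation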

Clearing a color image

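In Vulkan, an attachment is usually cleared by setting loadOp to VK_ATTACHMENT_LOAD_OP_CLEAR in the render pass's attachment description.

Sometimes that is not what we want, and the image has to be cleared explicitly.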

vkCmdClearColorImage( command_buffer, image, image_layout, &clear_color,
static_cast<uint32_t>(image_subresource_ranges.size()),
image_subresource_ranges.data() );

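We provide the image handle, its layout, and an array of subresource ranges (mipmap levels and/or array layers).

This command can only clear color images, and only those created with the transfer destination usage (VK_IMAGE_USAGE_TRANSFER_DST_BIT).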

Clearing a depth-stencil image

vkCmdClearDepthStencilImage( command_buffer, image, image_layout,
&clear_value, static_cast<uint32_t>(image_subresource_ranges.size()),
image_subresource_ranges.data() );

VkClearDepthStencilValue

  • depth when a depth aspect should be cleared
  • stencil for a value used to clear the stencil aspect

Clearing render pass attachments

vkCmdClearAttachments

Sometimes the attachments of a subpass need to be cleared.

vkCmdClearAttachments( command_buffer,
static_cast<uint32_t>(attachments.size()), attachments.data(),
static_cast<uint32_t>(rects.size()), rects.data() );

VkClearAttachment

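  • aspectMask: the aspect of the attachment to clear (color, depth, stencil)
  • colorAttachment: when aspectMask is VK_IMAGE_ASPECT_COLOR_BIT, the index of the color attachment in the current subpass; ignored otherwise
  • clearValue: the value to clear with

VkClearRect

  • rect: top-left corner, width and height of the cleared area
  • baseArrayLayer, layerCount: the range of layers to clear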

Binding vertex buffers

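Drawing geometry requires vertex data. At a minimum we need vertex positions (strictly speaking even those are optional, since they can be generated in the shader). Other common attributes are normals, tangents/bitangents, colors and texture coordinates. This data comes from buffers created with the vertex buffer usage, and those buffers have to be bound before the draw call.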

VertexBufferParameters

struct VertexBufferParameters {
VkBuffer Buffer;
VkDeviceSize MemoryOffset;
};

// One entry per vertex buffer to bind.
std::vector<VertexBufferParameters> buffers_parameters;
// ...

std::vector<VkBuffer> buffers;
std::vector<VkDeviceSize> offsets;
for( auto & buffer_parameters : buffers_parameters ) {
buffers.push_back( buffer_parameters.Buffer );
offsets.push_back( buffer_parameters.MemoryOffset );
}
vkCmdBindVertexBuffers( command_buffer, first_binding,
static_cast<uint32_t>(buffers_parameters.size()), buffers.data(),
offsets.data() );

Binding an index buffer

An index buffer is created with the index buffer usage. The index type is given when the buffer is bound, for example VK_INDEX_TYPE_UINT16 or VK_INDEX_TYPE_UINT32.

vkCmdBindIndexBuffer( command_buffer, buffer, memory_offset, index_type );

Providing data to shaders through push constants

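Most of the time, larger amounts of data are provided through descriptor sets, via buffers or images. Push constants are a quick and convenient way to hand a small amount of data to shaders.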

vkCmdPushConstants( command_buffer,
pipeline_layout,
pipeline_stages,
offset, // must be a multiple of 4
size, // must be a multiple of 4
data // const void*
);

The hardware is guaranteed to support at least 128 bytes of push constants.
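
The constraints on offset and size can be checked before recording the command. The sketch below is an illustration under stated assumptions: IsValidPushConstantRange is a hypothetical helper, and the default limit is the guaranteed minimum of 128 bytes rather than the maxPushConstantsSize value a real application would query from the device.

```cpp
#include <cstdint>

// 128 bytes is the minimum guaranteed for maxPushConstantsSize;
// a real application would query the actual device limit.
constexpr uint32_t kGuaranteedPushConstantsSize = 128;

// Hypothetical helper: checks an (offset, size) pair before it is
// passed to vkCmdPushConstants.
inline bool IsValidPushConstantRange(
    uint32_t offset, uint32_t size,
    uint32_t limit = kGuaranteedPushConstantsSize) {
  if (size == 0) return false;                             // nothing to push
  if ((offset % 4) != 0 || (size % 4) != 0) return false;  // multiples of 4
  if (offset >= limit) return false;                       // starts past the limit
  return size <= limit - offset;                           // range must fit
}
```

Four floats at offset 0, as in the color example that follows, pass this check because 16 bytes fit comfortably in the guaranteed 128.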

An example

std::array<float, 4> color = { 0.0f, 0.7f, 0.4f, 0.1f };
ProvideDataToShadersThroughPushConstants( CommandBuffer, *PipelineLayout,
VK_SHADER_STAGE_FRAGMENT_BIT, 0, static_cast<uint32_t>(sizeof( color[0] ) *
color.size()), &color[0] );

ProvideDataToShadersThroughPushConstants(...)
{
vkCmdPushConstants( command_buffer, pipeline_layout, pipeline_stages,
offset, size, data );
}

settings

Setting viewport state dynamically

VkViewport

  • x: x coordinate of the upper-left corner
  • y: y coordinate of the upper-left corner
  • width
  • height
  • minDepth, maxDepth: the depth range of the viewport

vkCmdSetViewport( command_buffer, first_viewport,
static_cast<uint32_t>(viewports.size()), viewports.data() );

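Declaring the viewport state as dynamic only makes its values dynamic; the number of viewports is still fixed when the pipeline is created.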

Setting scissor state dynamically

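The scissor test defines an additional rectangle inside the viewport dimensions to which rendering is restricted. It is always enabled. The rectangle can be set statically at pipeline creation or dynamically in the command buffer.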

VkRect2D

  • offset.x: horizontal offset (in pixels) of the rectangle's upper-left corner
  • offset.y: vertical offset (in pixels) of the rectangle's upper-left corner
  • extent.width
  • extent.height

vkCmdSetScissor

vkCmdSetScissor( command_buffer, first_scissor,
static_cast<uint32_t>(scissors.size()), scissors.data() );

Setting line width state dynamically

vkCmdSetLineWidth( command_buffer, line_width );

Setting depth bias state dynamically

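Depth bias modifies the depth value computed for a fragment.

It offsets the fragment's depth and is typically used for objects drawn very close to each other, such as pictures or posters on a wall. Without it, such objects suffer from z-fighting.

Depth bias changes the depth value stored in the depth attachment, but it does not affect the rendered image itself, so the perceived distance stays the same. The offset is built from a constant factor and the fragment's slope. A clamp value specifies the maximum (or minimum) bias that may be applied.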
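
How the three parameters combine can be written down as a short function. This is a sketch of the formula as the Vulkan specification states it, not driver code: max_slope and min_resolvable are inputs the hardware derives from the polygon and the depth format, and ComputeDepthBias is a name made up for this example.

```cpp
#include <algorithm>

// Sketch of the depth-bias computation:
//   bias = max_slope * slope_factor + min_resolvable * constant_factor
// followed by an optional clamp.
// max_slope:      maximum depth slope of the polygon being rasterized
// min_resolvable: smallest depth difference the depth format can resolve
inline float ComputeDepthBias(float constant_factor, float clamp,
                              float slope_factor, float max_slope,
                              float min_resolvable) {
  const float bias =
      max_slope * slope_factor + min_resolvable * constant_factor;
  if (clamp > 0.0f) return std::min(bias, clamp);  // upper bound
  if (clamp < 0.0f) return std::max(bias, clamp);  // lower bound
  return bias;                                     // clamp == 0: no clamping
}
```

A positive clamp caps how far steep polygons can be pushed away, which is why the clamp is described as a maximum or a minimum depending on its sign.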

vkCmdSetDepthBias( command_buffer, constant_factor, clamp, slope_factor );

Setting blend constants states dynamically

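Blending is used to simulate transparent objects. The final color is controlled through blend factors and blend operations. A constant color can also take part in the computation, and that constant can be set dynamically.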
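
As an illustration of where the constant color enters the computation, the sketch below evaluates one possible blend configuration on the CPU: source factor "constant color", destination factor "one minus constant color", and an add operation. The choice of factors and the names Color and BlendWithConstant are assumptions made for this example.

```cpp
#include <array>
#include <cstddef>

using Color = std::array<float, 4>;  // r, g, b, a

// out = src * constant + dst * (1 - constant), evaluated per component.
inline Color BlendWithConstant(const Color& src, const Color& dst,
                               const Color& constant) {
  Color out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    out[i] = src[i] * constant[i] + dst[i] * (1.0f - constant[i]);
  }
  return out;
}
```

Changing the constant between draw calls, which is what the dynamic state allows, fades an object in or out without creating a new pipeline.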

vkCmdSetBlendConstants( command_buffer, blend_constants.data() );

drawing

Drawing a geometry

vkCmdDraw( command_buffer,
vertex_count,
instance_count,
first_vertex, // useful when several models are stored in one vertex buffer
first_instance
);
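
The first_vertex parameter matters when several models share one vertex buffer. A minimal sketch of how the per-model offsets and counts could be computed while packing; PackModels and the Part struct are hypothetical names, and the data is assumed to be tightly packed positions with three floats per vertex.

```cpp
#include <cstdint>
#include <vector>

// Where one model lives inside the shared vertex data.
struct Part {
  uint32_t VertexOffset;  // would be passed as first_vertex
  uint32_t VertexCount;   // would be passed as vertex_count
};

// Appends every model to `packed` and records its offset and count.
inline std::vector<Part> PackModels(
    const std::vector<std::vector<float>>& models,
    std::vector<float>& packed, uint32_t floats_per_vertex = 3) {
  std::vector<Part> parts;
  for (const std::vector<float>& model : models) {
    Part part;
    part.VertexOffset =
        static_cast<uint32_t>(packed.size() / floats_per_vertex);
    part.VertexCount =
        static_cast<uint32_t>(model.size() / floats_per_vertex);
    packed.insert(packed.end(), model.begin(), model.end());
    parts.push_back(part);
  }
  return parts;
}
```

Each resulting Part then maps directly onto one draw call that reads only its own slice of the buffer.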

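Instancing is useful for drawing the same mesh many times without changing the vertex data (see Specifying pipeline vertex binding description, attribute description, and input state in Chapter 8, Graphics and Compute Pipelines).

Vulkan has no default state.

Take descriptor sets or dynamic pipeline states as examples. Every time a command buffer is recorded, all required descriptor sets must be bound to it. Likewise, every pipeline state declared as dynamic must be given a value through the corresponding function, and a render pass must be started in the command buffer.

Drawing can be performed only inside a render pass.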

Drawing an indexed geometry

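This is the most commonly used way of drawing.

vkCmdDrawIndexed()

Indexed drawing removes duplicated vertices at the cost of an additional index buffer. It pays off especially when each vertex carries a lot of extra data (normal, tangent, bitangent, two sets of texture coordinates).

$\color{red}{\text{New concept (vertex reuse)}}$: indexed drawing lets the hardware reuse vertices that have already been processed and are still in the vertex cache. If the vertex for a given index has already been computed, the result is reused.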
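
To make the saving concrete, the sketch below turns a plain triangle list into a vertex array without duplicates plus an index array. It is a CPU-side illustration only: the Vertex type is reduced to a position, and BuildIndexedMesh is a name chosen for this example.

```cpp
#include <cstdint>
#include <map>
#include <tuple>
#include <vector>

struct Vertex {
  float x, y, z;
  bool operator<(const Vertex& o) const {
    return std::tie(x, y, z) < std::tie(o.x, o.y, o.z);
  }
};

struct IndexedMesh {
  std::vector<Vertex> vertices;   // each unique vertex stored once
  std::vector<uint32_t> indices;  // one entry per original vertex
};

inline IndexedMesh BuildIndexedMesh(const std::vector<Vertex>& triangle_list) {
  IndexedMesh mesh;
  std::map<Vertex, uint32_t> seen;  // vertex -> index of its first occurrence
  for (const Vertex& v : triangle_list) {
    auto it = seen.find(v);
    if (it == seen.end()) {
      const uint32_t index = static_cast<uint32_t>(mesh.vertices.size());
      it = seen.emplace(v, index).first;
      mesh.vertices.push_back(v);
    }
    mesh.indices.push_back(it->second);
  }
  return mesh;
}
```

A quad drawn as two triangles shrinks from six vertices to four, and the two repeated indices are the ones a vertex cache can serve without running the vertex shader again.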

vkCmdDrawIndexed( command_buffer, index_count, instance_count, first_index,
vertex_offset, first_instance );

Dispatching compute work

compute pipeline

Resources can be provided to compute shaders only through descriptor sets.

Compute work can be used for image post-processing, color correction or blur, and for physics calculations.

Compute shaders are dispatched in groups.

vkCmdDispatch( command_buffer, x_size, y_size, z_size );

workgroups

maxComputeWorkGroupCount[3]

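The hardware is guaranteed to support at least 65,535 workgroups in each dimension.

Compute work cannot be dispatched inside a render pass.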
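
The x_size, y_size and z_size arguments are counts of workgroups, not of individual invocations, so they are usually derived from the problem size and the local size declared in the shader. A minimal sketch with hypothetical helper names; the limit constant is the guaranteed minimum, whereas real code would read maxComputeWorkGroupCount from the device.

```cpp
#include <cassert>
#include <cstdint>

// Guaranteed minimum for each element of maxComputeWorkGroupCount.
constexpr uint32_t kGuaranteedMaxGroupCount = 65535;

// Number of workgroups needed to cover item_count items when each
// workgroup handles local_size items along that dimension (rounds up).
inline uint32_t GroupCountFor(uint32_t item_count, uint32_t local_size) {
  assert(local_size > 0);
  return item_count / local_size + (item_count % local_size != 0 ? 1u : 0u);
}

inline bool FitsGuaranteedLimit(uint32_t group_count) {
  return group_count <= kGuaranteedMaxGroupCount;
}
```

For a 1920x1080 image and a 16x16 local size this gives 120 by 68 groups; the last row of groups is only partly covered, so the shader has to bounds-check its invocation ID.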

Executing a secondary command buffer inside a primary command buffer

Vulkan lets us record two kinds of command buffers: primary and secondary. Primary command buffers can be submitted directly to queues. Secondary command buffers can only be executed from within a primary command buffer.

vkCmdExecuteCommands

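Primary command buffers are usually enough for rendering or compute work. Sometimes, though, the work needs to be split across both kinds of command buffers. When we want the graphics hardware to execute secondary command buffers, we do this from a primary command buffer: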

vkCmdExecuteCommands( command_buffer,
static_cast<uint32_t>(secondary_command_buffers.size()),
secondary_command_buffers.data() );

example

Recording a command buffer that draws a geometry with a dynamic viewport and scissor states

struct Mesh {
std::vector<float> Data;
struct Part {
uint32_t VertexOffset;
uint32_t VertexCount;
};
std::vector<Part> Parts; // accessed below as geometry.Parts[i]
};

if( !BeginCommandBufferRecordingOperation( command_buffer,
VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT, nullptr ) ) {
return false;
}

image memory barrier

if( present_queue_family_index != graphics_queue_family_index ) {
ImageTransition image_transition_before_drawing = {
swapchain_image,
VK_ACCESS_MEMORY_READ_BIT,
VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
VK_IMAGE_LAYOUT_PRESENT_SRC_KHR,
VK_IMAGE_LAYOUT_PRESENT_SRC_KHR,
present_queue_family_index,
graphics_queue_family_index,
VK_IMAGE_ASPECT_COLOR_BIT
};
SetImageMemoryBarrier( command_buffer, VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,
VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT, {
image_transition_before_drawing } );
}

start render pass,bind pipeline object

BeginRenderPass( command_buffer, render_pass, framebuffer, { { 0, 0 },
framebuffer_size }, clear_values, VK_SUBPASS_CONTENTS_INLINE );
BindPipelineObject( command_buffer, VK_PIPELINE_BIND_POINT_GRAPHICS,
graphics_pipeline );

Set the dynamic states (viewport and scissor) and bind a buffer with vertex data

VkViewport viewport = {
0.0f,
0.0f,
static_cast<float>(framebuffer_size.width),
static_cast<float>(framebuffer_size.height),
0.0f,
1.0f,
};
SetViewportStateDynamically( command_buffer, 0, { viewport } );
VkRect2D scissor = {
{
0,
0
},
{
framebuffer_size.width,
framebuffer_size.height
}
};
SetScissorStateDynamically( command_buffer, 0, { scissor } );
BindVertexBuffers( command_buffer, first_vertex_buffer_binding,
vertex_buffers_parameters );

Bind the descriptor sets so that shaders can access the resources

BindDescriptorSets( command_buffer, VK_PIPELINE_BIND_POINT_GRAPHICS,
pipeline_layout, index_for_first_descriptor_set, descriptor_sets, {} );

Now the geometry can be drawn. Optionally, an index buffer can be bound and push constant values provided as well.

for( size_t i = 0; i < geometry.Parts.size(); ++i ) {
DrawGeometry( command_buffer, geometry.Parts[i].VertexCount,
instance_count, geometry.Parts[i].VertexOffset, first_instance );
}

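Before we stop recording the command buffer, the render pass has to be ended. After that the swapchain image needs another transition. When we have finished drawing a single frame of animation, we want to show it on the swapchain image, so its layout must be VK_IMAGE_LAYOUT_PRESENT_SRC_KHR, which is what the presentation engine requires to display the image correctly. This step has to be performed explicitly.

$\color{red}{\text{Note}}$: if the queues used for graphics operations and for presentation are different, a queue ownership transfer is needed. It is done with another image memory barrier. After that, we can stop recording the command buffer.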

EndRenderPass( command_buffer );
if( present_queue_family_index != graphics_queue_family_index ) {
ImageTransition image_transition_before_present = {
swapchain_image,
VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT,
VK_ACCESS_MEMORY_READ_BIT,
VK_IMAGE_LAYOUT_PRESENT_SRC_KHR,
VK_IMAGE_LAYOUT_PRESENT_SRC_KHR,
graphics_queue_family_index,
present_queue_family_index,
VK_IMAGE_ASPECT_COLOR_BIT
};
SetImageMemoryBarrier( command_buffer,
VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT, { image_transition_before_present }
);
}
if( !EndCommandBufferRecordingOperation( command_buffer ) ) {
return false;
}
return true;

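This command buffer can now be submitted to a (graphics) queue. It can be submitted only once, because it was recorded with the VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT flag.

After the command buffer has been submitted, the swapchain image can be presented. Keep in mind that the submission and presentation operations need to be synchronized.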

advanced

*Recording command buffers on multiple threads

A custom structure

struct CommandBufferRecordingThreadParameters {
VkCommandBuffer CommandBuffer;
std::function<bool( VkCommandBuffer )> RecordingFunction;
};

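There is one such structure per thread that records command buffers. RecordingFunction is the function that records the command buffer on its own thread.

A few rules need to be remembered when using Vulkan from multiple threads.

First, the same object must not be modified from multiple threads at the same time. For example, command buffers must not be allocated from the same pool on several threads, and a descriptor set must not be updated from several threads.

Resources can be accessed from multiple threads only when they are read-only or when each thread accesses separate resources. Tracking which resource was created by which thread is hard, though, so resources are usually created and modified on the main (rendering) thread.

The most common use of multithreading in Vulkan is recording command buffers concurrently. Recording takes a lot of time, so spreading it across threads makes sense.

Recording on multiple threads requires a separate command pool for each thread, in addition to the threads themselves.

Recording a command buffer does not affect other resources (apart from the pool). It only prepares commands for submission to a queue, so we can record any operation that uses any resource. For example, several command buffers can record operations that access the same images or descriptor sets. The same pipeline can be bound in different command buffers at the same time, and we can record operations that draw into the same attachments.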
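
The one-pool-per-thread rule can be illustrated without any Vulkan objects. In the sketch below, FakeCommandPool is a stand-in invented for this example (it only stores integers where a real pool would back command buffers), and RecordInParallel is a hypothetical helper; the point is that each thread touches only its own pool, so no locking is needed.

```cpp
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Stand-in for a command pool plus the command buffer allocated from it
// (assumption: real code would hold VkCommandPool / VkCommandBuffer).
struct FakeCommandPool {
  std::vector<uint32_t> recorded;  // "commands" recorded by the owner thread
};

struct RecordingJob {
  FakeCommandPool* pool;                         // used by exactly one thread
  std::function<void(FakeCommandPool&)> record;  // the recording function
};

// Runs every job on its own thread and waits for all of them to finish.
inline void RecordInParallel(std::vector<RecordingJob>& jobs) {
  std::vector<std::thread> threads;
  threads.reserve(jobs.size());
  for (RecordingJob& job : jobs) {
    threads.emplace_back([&job] { job.record(*job.pool); });
  }
  for (std::thread& t : threads) {
    t.join();  // recording must be complete before anything is submitted
  }
}
```

Only after RecordInParallel returns would a single thread gather the results and submit them to the queue.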

std::vector<std::thread> threads( threads_parameters.size() );
for( size_t i = 0; i < threads_parameters.size(); ++i ) {
threads[i] = std::thread(
threads_parameters[i].RecordingFunction,
threads_parameters[i].CommandBuffer );
}

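After all threads have finished recording, the command buffers are gathered and submitted to the queue.

A real application would avoid creating and destroying threads this way. It would instead use an existing job/task system to record the required command buffers.

Submission can be done from one thread only (queues, similarly to other resources, cannot be accessed concurrently), so we have to wait for all threads to finish.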

std::vector<VkCommandBuffer> command_buffers( threads_parameters.size() );
for( size_t i = 0; i < threads_parameters.size(); ++i ) {
threads[i].join();
command_buffers[i] = threads_parameters[i].CommandBuffer;
}
if( !SubmitCommandBuffersToQueue( queue, wait_semaphore_infos,
command_buffers, signal_semaphores, fence ) ) {
return false;
}
return true;

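Command buffers can be submitted to a given queue from only one thread at a time.

The same holds for the swapchain object: swapchain images can be acquired and presented from only one thread at a time.

We also need to remember to transition the image layout from VK_IMAGE_LAYOUT_PRESENT_SRC_KHR (or VK_IMAGE_LAYOUT_UNDEFINED) to VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL before rendering, and back to VK_IMAGE_LAYOUT_PRESENT_SRC_KHR before presenting.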

Preparing a single frame of animation

Preparing a single frame of animation can be divided into five steps:

  1. Acquiring a swapchain image.
  2. Creating a framebuffer.
  3. Recording a command buffer.
  4. Submitting the command buffer to the queue.
  5. Presenting an image.

uint32_t image_index;
if( !AcquireSwapchainImage( logical_device, swapchain,
image_acquired_semaphore, VK_NULL_HANDLE, image_index ) ) {
return false;
}
std::vector<VkImageView> attachments = { swapchain_image_views[image_index]
};
if( VK_NULL_HANDLE != depth_attachment ) {
attachments.push_back( depth_attachment );
}
if( !CreateFramebuffer( logical_device, render_pass, attachments,
swapchain_size.width, swapchain_size.height, 1, *framebuffer ) ) {
return false;
}
if( !record_command_buffer( command_buffer, image_index, *framebuffer ) ) {
return false;
}

std::vector<WaitSemaphoreInfo> wait_semaphore_infos = wait_infos;
wait_semaphore_infos.push_back( {
image_acquired_semaphore,
VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT
} );
if( !SubmitCommandBuffersToQueue( graphics_queue, wait_semaphore_infos, {
command_buffer }, { ready_to_present_semaphore }, finished_drawing_fence )
) {
return false;
}
PresentInfo present_info = {
swapchain,
image_index
};
if( !PresentImage( present_queue, { ready_to_present_semaphore }, {
present_info } ) ) {
return false;
}
return true;

The fence lets the application find out when the GPU has finished executing the command buffer.

*Increasing performance through increasing the number of separately rendered frames

The time spent waiting for a command buffer to finish executing is wasted. That is why multiple frames of animation should be rendered independently.

A custom structure

struct FrameResources {
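VkCommandBuffer CommandBuffer; // command buffer dedicated to this frame
VkDestroyer<VkSemaphore> ImageAcquiredSemaphore; // signaled when the presentation engine hands over the acquired image
VkDestroyer<VkSemaphore> ReadyToPresentSemaphore; // signaled when the queue has finished executing the command buffer
VkDestroyer<VkFence> DrawingFinishedFence; // signaled once the GPU has finished this frame's work
VkDestroyer<VkImageView> DepthAttachment;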
VkDestroyer<VkFramebuffer> Framebuffer;
};

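It groups the resources whose lifetime is tied to a single frame.

Rendering an animation is a loop: one frame is drawn while another is displayed.

That requires several sets of these resources.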

Tests have shown that increasing the number of frame resources from one to two may increase the performance by 50%.

Adding a third set increases the performance further, but the growth isn’t as big this time.

So, the performance gain is smaller with each additional set of frame resources. Three sets of rendering resources seems like a good choice, but we should perform our own tests and see what is best for our specific needs.
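
Cycling through the sets is plain modular arithmetic. A minimal sketch, assuming the number of sets is fixed at startup; FrameRing is a hypothetical name and stands in for the static frame_index used in the code that follows.

```cpp
#include <cassert>
#include <cstddef>

// Hands out the index of the frame-resource set to use for each new frame.
class FrameRing {
 public:
  explicit FrameRing(std::size_t frame_count) : count_(frame_count) {
    assert(frame_count > 0);
  }
  std::size_t Current() const { return index_; }
  void Advance() { index_ = (index_ + 1) % count_; }  // wraps around

 private:
  std::size_t count_;
  std::size_t index_ = 0;
};
```

With three sets, a given set is reused only every third frame, so the wait on its fence at the start of a frame rarely has to block.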

check

static uint32_t frame_index = 0;
FrameResources & current_frame = frame_resources[frame_index];
if( !WaitForFences( logical_device, { *current_frame.DrawingFinishedFence
}, false, 2000000000 ) ) {
return false;
}
if( !ResetFences( logical_device, { *current_frame.DrawingFinishedFence } )
) {
return false;
}

InitVkDestroyer( logical_device, current_frame.Framebuffer );
if( !PrepareSingleFrameOfAnimation( logical_device, graphics_queue,
present_queue, swapchain, swapchain_size, swapchain_image_views,
*current_frame.DepthAttachment, wait_infos,
*current_frame.ImageAcquiredSemaphore,
*current_frame.ReadyToPresentSemaphore,
*current_frame.DrawingFinishedFence, record_command_buffer,
current_frame.CommandBuffer, render_pass, current_frame.Framebuffer ) ) {
return false;
}
frame_index = (frame_index + 1) % frame_resources.size();
return true;

Descriptor Sets

Posted on 2019-04-05 | In sdk , graphics , vulkan

Descriptor Sets

[TOC]

Creating a sampler
Creating a sampled image
Creating a combined image sampler
Creating a storage image
Creating a uniform texel buffer
Creating a storage texel buffer
Creating a uniform buffer
Creating a storage buffer
Creating an input attachment
Creating a descriptor set layout
Creating a descriptor pool
Allocating descriptor sets
Updating descriptor sets
Binding descriptor sets
Creating descriptors with a texture and a uniform buffer
Freeing descriptor sets
Resetting a descriptor pool
Destroying a descriptor pool
Destroying a descriptor set layout
Destroying a sampler

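In modern computer graphics, most rendering and processing of image data (vertices, pixels, fragments, voxels) is done with programmable pipelines and shaders. These need resources: textures, samplers, buffers and uniform variables. In Vulkan they are provided through descriptor sets.

Descriptors are opaque structures that represent shader resources. They are organized into groups, or sets, whose contents are specified by descriptor set layouts. To provide resources to shaders, descriptor sets are bound to pipelines; several sets can be bound at once. To access a resource from within a shader, we specify the set it comes from and the location within that set (called a binding).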

sampler

create a sampler

A sampler defines a set of parameters that control how image data is loaded inside shaders. These include address calculations (wrapping or repeating), filtering (linear or nearest) and the use of mipmaps.
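
What addressing and filtering mean can be shown on a one-dimensional row of texels. The sketch below is a simplified CPU model written for this post, not what the hardware does exactly: it ignores mipmaps, treats texel centers as sitting at i + 0.5, and clamps at the edges when filtering. All function names are made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Address calculations: map any coordinate back into [0, 1].
inline float AddressRepeat(float u) { return u - std::floor(u); }
inline float AddressClampToEdge(float u) {
  return std::min(std::max(u, 0.0f), 1.0f);
}

inline int ClampIndex(int i, int count) {
  return std::min(std::max(i, 0), count - 1);
}

// Nearest filtering: pick the texel that contains the coordinate.
inline float SampleNearest(const std::vector<float>& texels, float u) {
  assert(!texels.empty());
  const int count = static_cast<int>(texels.size());
  const int i = static_cast<int>(std::floor(u * count));
  return texels[ClampIndex(i, count)];
}

// Linear filtering: blend the two texels whose centers surround u.
inline float SampleLinear(const std::vector<float>& texels, float u) {
  assert(!texels.empty());
  const int count = static_cast<int>(texels.size());
  const float x = u * count - 0.5f;  // shift so texel centers are integers
  const float base = std::floor(x);
  const float t = x - base;          // blend weight of the right texel
  const int i0 = ClampIndex(static_cast<int>(base), count);
  const int i1 = ClampIndex(static_cast<int>(base) + 1, count);
  return texels[i0] * (1.0f - t) + texels[i1] * t;
}
```

For a two-texel row holding 0 and 10, nearest filtering at u = 0.5 jumps to one of the two values, while linear filtering returns their average.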

VkSamplerCreateInfo

VkSamplerCreateInfo sampler_create_info = {
VK_STRUCTURE_TYPE_SAMPLER_CREATE_INFO,
nullptr,
0,
mag_filter,
min_filter,
mipmap_mode,
u_address_mode,
v_address_mode,
w_address_mode,
lod_bias,
anisotropy_enable,
max_anisotropy,
compare_enable,
compare_operator,
min_lod,
max_lod,
border_color,
unnormalized_coords
};

VkResult result = vkCreateSampler( logical_device, &sampler_create_info,
nullptr, &sampler );
if( VK_SUCCESS != result ) {
std::cout << "Could not create sampler." << std::endl;
return false;
}
return true;

To use a sampler in a shader, declare a uniform variable of type sampler, for example:

layout (set=m, binding=n) uniform sampler <variable name>;

create a sampled image

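Sampled images are used to read data from images (textures) inside shaders. Images and samplers are usually created together; the image needs the VK_IMAGE_USAGE_SAMPLED_BIT usage.

In shaders, several samplers can read the same image in different ways, and one sampler can be used with several images. On some platforms, however, the two are combined into a single object.

Not every image format supports sampled images; it depends on the platform the application runs on. The following formats can always be used for sampled images and for linearly filtered sampled images (the list is not exhaustive):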

VK_FORMAT_B4G4R4A4_UNORM_PACK16
VK_FORMAT_R5G6B5_UNORM_PACK16
VK_FORMAT_A1R5G5B5_UNORM_PACK16
VK_FORMAT_R8_UNORM and VK_FORMAT_R8_SNORM
VK_FORMAT_R8G8_UNORM and VK_FORMAT_R8G8_SNORM
VK_FORMAT_R8G8B8A8_UNORM, VK_FORMAT_R8G8B8A8_SNORM, and
VK_FORMAT_R8G8B8A8_SRGB
VK_FORMAT_B8G8R8A8_UNORM and VK_FORMAT_B8G8R8A8_SRGB
VK_FORMAT_A8B8G8R8_UNORM_PACK32, VK_FORMAT_A8B8G8R8_SNORM_PACK32,
and VK_FORMAT_A8B8G8R8_SRGB_PACK32
VK_FORMAT_A2B10G10R10_UNORM_PACK32
VK_FORMAT_R16_SFLOAT
VK_FORMAT_R16G16_SFLOAT
VK_FORMAT_R16G16B16A16_SFLOAT
VK_FORMAT_B10G11R11_UFLOAT_PACK32
VK_FORMAT_E5B9G9R9_UFLOAT_PACK32

Support for other formats has to be checked:

VkFormatProperties format_properties;
vkGetPhysicalDeviceFormatProperties( physical_device, format,
&format_properties );
if( !(format_properties.optimalTilingFeatures &
VK_FORMAT_FEATURE_SAMPLED_IMAGE_BIT) ) {
std::cout << "Provided format is not supported for a sampled image." <<
std::endl;
return false;
}
if( linear_filtering &&
!(format_properties.optimalTilingFeatures &
VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT) ) {
std::cout << "Provided format is not supported for a linear image filtering." << std::endl;
return false;
}

If the requirements are met, we can create an image, a memory object and an image view (in Vulkan, images are represented by image views most of the time). The image is created with the VK_IMAGE_USAGE_SAMPLED_BIT usage.

if( !CreateImage( logical_device, type, format, size, num_mipmaps,
num_layers, VK_SAMPLE_COUNT_1_BIT, usage | VK_IMAGE_USAGE_SAMPLED_BIT,
false, sampled_image ) ) {
return false;
}
if( !AllocateAndBindMemoryObjectToImage( physical_device, logical_device,
sampled_image, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, memory_object ) ) {
return false;
}
if( !CreateImageView( logical_device, sampled_image, view_type, format,
aspect, sampled_image_view ) ) {
return false;
}
return true;

Before an image is used as a sampled image and its data is read in shaders, its layout has to be transitioned to VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL.

In shaders:

layout (set=m, binding=n) uniform texture2D <variable name>;

create a combined image sampler

The sampler and the image are created exactly as when they are used separately; only the shader side differs. The descriptor type is VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER.

if( !CreateSampler( logical_device, mag_filter, min_filter, mipmap_mode,
u_address_mode, v_address_mode, w_address_mode, lod_bias,
anisotropy_enable, max_anisotropy, compare_enable, compare_operator,
min_lod, max_lod, border_color, unnormalized_coords, sampler ) ) {
return false;
}
bool linear_filtering = (mag_filter == VK_FILTER_LINEAR) || (min_filter ==
VK_FILTER_LINEAR) || (mipmap_mode == VK_SAMPLER_MIPMAP_MODE_LINEAR);
if( !CreateSampledImage( physical_device, logical_device, type, format,
size, num_mipmaps, num_layers, usage, view_type, aspect, linear_filtering,
sampled_image, sampled_image_view ) ) {
return false;
}
return true;

In shaders, use the sampler keyword with a dimensionality suffix:

layout (set=m, binding=n) uniform sampler2D <variable name>;

On some platforms combined image samplers perform better.

storage

create a storage image

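A storage image lets us load data from an image inside pipelines and also store data into the image from shaders. Such images are created with the VK_IMAGE_USAGE_STORAGE_BIT usage.

Data can be loaded from these images, but the loads are unfiltered (so samplers cannot be used with them).

The descriptor type is VK_DESCRIPTOR_TYPE_STORAGE_IMAGE.

A suitable format has to be chosen, since not every format supports storage images. Support is platform dependent, but the following formats are always supported (the list is not exhaustive):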

VK_FORMAT_R8G8B8A8_UNORM, VK_FORMAT_R8G8B8A8_SNORM,
VK_FORMAT_R8G8B8A8_UINT, and VK_FORMAT_R8G8B8A8_SINT
VK_FORMAT_R16G16B16A16_UINT, VK_FORMAT_R16G16B16A16_SINT and
VK_FORMAT_R16G16B16A16_SFLOAT
VK_FORMAT_R32_UINT, VK_FORMAT_R32_SINT and VK_FORMAT_R32_SFLOAT
VK_FORMAT_R32G32_UINT, VK_FORMAT_R32G32_SINT and
VK_FORMAT_R32G32_SFLOAT
VK_FORMAT_R32G32B32A32_UINT, VK_FORMAT_R32G32B32A32_SINT and
VK_FORMAT_R32G32B32A32_SFLOAT

Atomic operations are guaranteed only for the following formats:

VK_FORMAT_R32_UINT
VK_FORMAT_R32_SINT

Other formats have to be checked for storage image support and, if needed, for atomic operation support.

VkFormatProperties format_properties;
vkGetPhysicalDeviceFormatProperties( physical_device, format,
&format_properties );
if( !(format_properties.optimalTilingFeatures &
VK_FORMAT_FEATURE_STORAGE_IMAGE_BIT) ) {
std::cout << "Provided format is not supported for a storage image." <<
std::endl;
return false;
}
if( atomic_operations &&
!(format_properties.optimalTilingFeatures &
VK_FORMAT_FEATURE_STORAGE_IMAGE_ATOMIC_BIT) ) {
std::cout << "Provided format is not supported for atomic operations on storage images." << std::endl;
return false;
}

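If the format is supported, the image is created as usual with the VK_IMAGE_USAGE_STORAGE_BIT usage; then a memory object is created and bound to the image, and finally an image view is created.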

if( !CreateImage( logical_device, type, format, size, num_mipmaps,
num_layers, VK_SAMPLE_COUNT_1_BIT, usage | VK_IMAGE_USAGE_STORAGE_BIT,
false, storage_image ) ) {
return false;
}
if( !AllocateAndBindMemoryObjectToImage( physical_device, logical_device,
storage_image, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, memory_object ) ) {
return false;
}
if( !CreateImageView( logical_device, storage_image, view_type, format,
aspect, storage_image_view ) ) {
return false;
}
return true;

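Before data is loaded from or stored to a storage image, its layout has to be set to VK_IMAGE_LAYOUT_GENERAL. It is the only layout that supports these operations.

An example storage image definition in GLSL:

layout (set=m, binding=n, r32f) uniform image2D <variable name>;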

create uniform texel buffer

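Uniform texel buffers let us read data in a way similar to reading from images: their contents are interpreted not as an array of single (scalar) values but as formatted pixels (texels) with one, two, three or four components. They also give access to much more data than images can hold.

A buffer used as a uniform texel buffer is created with the VK_BUFFER_USAGE_UNIFORM_TEXEL_BUFFER_BIT usage.

The following formats can be used for uniform texel buffers (the list is not exhaustive):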

VK_FORMAT_R8_UNORM, VK_FORMAT_R8_SNORM, VK_FORMAT_R8_UINT, and
VK_FORMAT_R8_SINT
VK_FORMAT_R8G8_UNORM, VK_FORMAT_R8G8_SNORM, VK_FORMAT_R8G8_UINT,
and VK_FORMAT_R8G8_SINT
VK_FORMAT_R8G8B8A8_UNORM, VK_FORMAT_R8G8B8A8_SNORM,
VK_FORMAT_R8G8B8A8_UINT, and VK_FORMAT_R8G8B8A8_SINT
VK_FORMAT_B8G8R8A8_UNORM
VK_FORMAT_A8B8G8R8_UNORM_PACK32, VK_FORMAT_A8B8G8R8_SNORM_PACK32,
VK_FORMAT_A8B8G8R8_UINT_PACK32, and
VK_FORMAT_A8B8G8R8_SINT_PACK32
VK_FORMAT_A2B10G10R10_UNORM_PACK32 and
VK_FORMAT_A2B10G10R10_UINT_PACK32
VK_FORMAT_R16_UINT, VK_FORMAT_R16_SINT and VK_FORMAT_R16_SFLOAT
VK_FORMAT_R16G16_UINT, VK_FORMAT_R16G16_SINT and
VK_FORMAT_R16G16_SFLOAT
VK_FORMAT_R16G16B16A16_UINT, VK_FORMAT_R16G16B16A16_SINT and
VK_FORMAT_R16G16B16A16_SFLOAT
VK_FORMAT_R32_UINT, VK_FORMAT_R32_SINT and VK_FORMAT_R32_SFLOAT
VK_FORMAT_R32G32_UINT, VK_FORMAT_R32G32_SINT and
VK_FORMAT_R32G32_SFLOAT
VK_FORMAT_R32G32B32A32_UINT, VK_FORMAT_R32G32B32A32_SINT and
VK_FORMAT_R32G32B32A32_SFLOAT
VK_FORMAT_B10G11R11_UFLOAT_PACK32

Checking whether a format is supported:

VkFormatProperties format_properties;
vkGetPhysicalDeviceFormatProperties( physical_device, format,
&format_properties );
if( !(format_properties.bufferFeatures &
VK_FORMAT_FEATURE_UNIFORM_TEXEL_BUFFER_BIT) ) {
std::cout << "Provided format is not supported for a uniform texel buffer." << std::endl;
return false;
}

Then create a buffer, allocate a memory object and bind it to the buffer, and create a buffer view:

if( !CreateBuffer( logical_device, size, usage |
VK_BUFFER_USAGE_UNIFORM_TEXEL_BUFFER_BIT, uniform_texel_buffer ) ) {
return false;
}
if( !AllocateAndBindMemoryObjectToBuffer( physical_device, logical_device,
uniform_texel_buffer, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, memory_object )
) {
return false;
}
if( !CreateBufferView( logical_device, uniform_texel_buffer, format, 0,
VK_WHOLE_SIZE, uniform_texel_buffer_view ) ) {
return false;
}
return true;

For uniform texel buffers the data format has to be specified so that shaders can interpret the buffer's contents correctly. This is what the buffer view is for.

GLSL

layout (set=m, binding=n) uniform samplerBuffer <variable name>;

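create a storage buffer

To store data into a buffer from shaders, storage buffers are needed. They are created with the VK_BUFFER_USAGE_STORAGE_BUFFER_BIT usage.

Descriptor types: VK_DESCRIPTOR_TYPE_STORAGE_BUFFER or VK_DESCRIPTOR_TYPE_STORAGE_BUFFER_DYNAMIC

Storage buffers require attention to alignment. Following GLSL's std430 layout is the easiest approach. The base alignment rules are similar to those of uniform buffers, except for arrays and structures: their offsets do not need to be rounded up to a multiple of 16. The rules are: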

  • A scalar variable of size N must be placed at offsets that are a multiple of N
  • A vector with two components, where each component has a size of N, must be placed at offsets that are a multiple of 2N
  • A vector with three or four components, where each component has a size of N, must be placed at offsets that are a multiple of 4N
  • An array with elements of size N must be placed at offsets that are a multiple of N
  • A structure must be placed at offsets that are a multiple of the biggest offset of any of its members (a member with the biggest offset requirement)
  • A row-major matrix must be placed at an offset equal to the offset of a vector with the number of components equal to the number of columns in the matrix
  • A column-major matrix must be placed at the same offsets as its columns
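
Applying the scalar and vector rules by hand is error-prone, so a small sketch may help. It only walks a flat list of members and rounds each offset up to the member's alignment; the Member struct and the function names are hypothetical, and arrays, nested structures and matrices are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Size and required alignment of one member, in bytes. For example a
// float is {4, 4}, a vec2 is {8, 8}, a vec3 is {12, 16}, a vec4 is {16, 16}.
struct Member {
  std::size_t size;
  std::size_t alignment;
};

// Rounds value up to the next multiple of alignment (a power of two).
inline std::size_t AlignUp(std::size_t value, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (value + alignment - 1) & ~(alignment - 1);
}

// Returns the byte offset of every member when laid out in order.
inline std::vector<std::size_t> ComputeOffsets(
    const std::vector<Member>& members) {
  std::vector<std::size_t> offsets;
  std::size_t cursor = 0;
  for (const Member& m : members) {
    cursor = AlignUp(cursor, m.alignment);  // skip padding if needed
    offsets.push_back(cursor);
    cursor += m.size;
  }
  return offsets;
}
```

For the sequence float, vec3, float, vec2, vec4 this yields offsets 0, 16, 28, 32 and 48: the vec3 is pushed to 16, and the float after it fits into the remaining four bytes of that 16-byte slot.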

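Dynamic storage buffers differ in how their base memory offset is defined. For a normal storage buffer, the offset and range specified when the descriptor set is updated stay unchanged until the next update. For the dynamic variant, the specified offset becomes a base address that is then modified by the dynamic offset given when the descriptor set is bound to a command buffer.

In GLSL, storage buffers are declared with the buffer keyword: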

layout (set=m, binding=n) buffer <variable name>
{
vec4 <member 1 name>;
mat4 <member 2 name>;
// ...
};

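One crucial piece of information is missing here: how the alignment is computed. The example below uses minUniformBufferOffsetAlignment, which applies to dynamic uniform buffers; for dynamic storage buffers the corresponding limit is minStorageBufferOffsetAlignment.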

size_t minUboAlignment = device->properties.limits.minUniformBufferOffsetAlignment;
dynamicAlignment = sizeof (customstruct);
if (minUboAlignment > 0) {
dynamicAlignment = (dynamicAlignment + static_cast<uint32_t>(minUboAlignment - 1)) & ~(static_cast<uint32_t>(minUboAlignment - 1));
}
size_t bufferSize = count * dynamicAlignment;
ptr = (customstruct*)tl::alignedAlloc (bufferSize, dynamicAlignment);
std::cout << "minUniformBufferOffsetAlignment = " << minUboAlignment << std::endl;
std::cout << "dynamicAlignment = " << dynamicAlignment << std::endl;
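
The rounding in the snippet above can be isolated and checked without a device. The sketch below repeats the same bit trick and also derives the dynamic offset of each object; AlignedStride and DynamicOffsets are hypothetical names, and the alignment is assumed to be a power of two, as Vulkan's offset alignment limits are.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Rounds size up to the next multiple of alignment (a power of two).
// An alignment of 0 leaves the size unchanged, mirroring the
// `minUboAlignment > 0` check in the snippet above.
inline std::size_t AlignedStride(std::size_t size, std::size_t alignment) {
  if (alignment == 0) return size;
  return (size + alignment - 1) & ~(alignment - 1);
}

// Dynamic offset of every object when `count` objects are stored back to
// back in one buffer, each starting at a multiple of the aligned stride.
inline std::vector<uint32_t> DynamicOffsets(std::size_t struct_size,
                                            std::size_t min_alignment,
                                            std::size_t count) {
  const std::size_t stride = AlignedStride(struct_size, min_alignment);
  std::vector<uint32_t> offsets;
  offsets.reserve(count);
  for (std::size_t i = 0; i < count; ++i) {
    offsets.push_back(static_cast<uint32_t>(i * stride));
  }
  return offsets;
}
```

An 80-byte structure on a device reporting a 256-byte alignment therefore occupies 256 bytes per object, and the offsets passed at bind time are 0, 256, 512 and so on.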

create an input attachment

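Attachments are the render targets that draw calls render into during a render pass.

Input attachments are usually color or depth/stencil attachments, but other images can be used too.

usage: VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT

descriptor type: VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT

A render pass in Vulkan has one or more subpasses. An attachment written in one subpass can be read in a later subpass. This is also the only way to read attachments in shaders.

When reading from an input attachment, we are limited to the location that corresponds to the location of the fragment being processed. Even so, this approach may be preferable to rendering into an attachment, ending the render pass, binding the image to a descriptor set as a sampled image (texture) and starting another render pass that does not use that image as any of its attachments.

Other images (not used as color or depth/stencil attachments) can serve as input attachments too. They only need to be created with the VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT usage and a suitable format. Support for the following color formats as input attachments is mandatory: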

VK_FORMAT_R5G6B5_UNORM_PACK16
VK_FORMAT_A1R5G5B5_UNORM_PACK16
VK_FORMAT_R8_UNORM, VK_FORMAT_R8_UINT and VK_FORMAT_R8_SINT
VK_FORMAT_R8G8_UNORM, VK_FORMAT_R8G8_UINT, and VK_FORMAT_R8G8_SINT
VK_FORMAT_R8G8B8A8_UNORM, VK_FORMAT_R8G8B8A8_UINT,
VK_FORMAT_R8G8B8A8_SINT, and VK_FORMAT_R8G8B8A8_SRGB
VK_FORMAT_B8G8R8A8_UNORM and VK_FORMAT_B8G8R8A8_SRGB
VK_FORMAT_A8B8G8R8_UNORM_PACK32, VK_FORMAT_A8B8G8R8_UINT_PACK32,
VK_FORMAT_A8B8G8R8_SINT_PACK32, and
VK_FORMAT_A8B8G8R8_SRGB_PACK32
VK_FORMAT_A2B10G10R10_UNORM_PACK32 and
VK_FORMAT_A2B10G10R10_UINT_PACK32
VK_FORMAT_R16_UINT, VK_FORMAT_R16_SINT and VK_FORMAT_R16_SFLOAT
VK_FORMAT_R16G16_UINT, VK_FORMAT_R16G16_SINT and
VK_FORMAT_R16G16_SFLOAT
VK_FORMAT_R16G16B16A16_UINT, VK_FORMAT_R16G16B16A16_SINT, and
VK_FORMAT_R16G16B16A16_SFLOAT
VK_FORMAT_R32_UINT, VK_FORMAT_R32_SINT, and VK_FORMAT_R32_SFLOAT
VK_FORMAT_R32G32_UINT, VK_FORMAT_R32G32_SINT, and
VK_FORMAT_R32G32_SFLOAT
VK_FORMAT_R32G32B32A32_UINT, VK_FORMAT_R32G32B32A32_SINT, and
VK_FORMAT_R32G32B32A32_SFLOAT

Mandatory depth/stencil formats:

VK_FORMAT_D16_UNORM
VK_FORMAT_X8_D24_UNORM_PACK32 or VK_FORMAT_D32_SFLOAT (at least one of
these two formats must be supported)
VK_FORMAT_D24_UNORM_S8_UINT or VK_FORMAT_D32_SFLOAT_S8_UINT (at
least one of these two formats must be supported)

Other formats have to be checked:

VkFormatProperties format_properties;
vkGetPhysicalDeviceFormatProperties( physical_device, format,
&format_properties );
if( (aspect & VK_IMAGE_ASPECT_COLOR_BIT) &&
!(format_properties.optimalTilingFeatures &
VK_FORMAT_FEATURE_COLOR_ATTACHMENT_BIT) ) {
std::cout << "Provided format is not supported for an input attachment."
<< std::endl;
return false;
}
if( (aspect & (VK_IMAGE_ASPECT_DEPTH_BIT | VK_IMAGE_ASPECT_STENCIL_BIT)) &&
!(format_properties.optimalTilingFeatures &
VK_FORMAT_FEATURE_DEPTH_STENCIL_ATTACHMENT_BIT) ) {
std::cout << "Provided format is not supported for an input attachment."
<< std::endl;
return false;
}

Then create the image, allocate a memory object (or reuse an existing one), bind it to the image, and create an image view.

if( !CreateImage( logical_device, type, format, size, 1, 1,
VK_SAMPLE_COUNT_1_BIT, usage | VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT, false,
input_attachment ) ) {
return false;
}
if( !AllocateAndBindMemoryObjectToImage( physical_device, logical_device,
input_attachment, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, memory_object ) ) {
return false;
}
if( !CreateImageView( logical_device, input_attachment, view_type, format,
aspect, input_attachment_image_view ) ) {
return false;
}
return true;

We also need to prepare a suitable render pass description and include the image view in the framebuffer.

GLSL

layout (input_attachment_index=i, set=m, binding=n) uniform subpassInput
<variable name>;

descriptor

create a descriptor set layout

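Descriptor sets gather many resources (descriptors) into one object, which is later used in a pipeline to form the interface between the application and the shaders. For the hardware to know which resources a set contains, how many of each type there are, and in what order, we need to create a descriptor set layout.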

A descriptor set layout specifies the internal structure of a descriptor set and, at the same time, strictly defines which resources can be bound to it.

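To create a layout we need to know which resources (descriptor types) will be used and in what order. The order is given by bindings, which are the indices used within the set in shaders: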

layout (set=m, binding=n) // variable definition

First specify the list of all resources (bindings):

VkDescriptorSetLayoutCreateInfo descriptor_set_layout_create_info = {
VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO,
nullptr,
0,
static_cast<uint32_t>(bindings.size()),
bindings.data()
};

create layout

VkResult result = vkCreateDescriptorSetLayout( logical_device,
&descriptor_set_layout_create_info, nullptr, &descriptor_set_layout );
if( VK_SUCCESS != result ) {
std::cout << "Could not create a layout for descriptor sets." <<
std::endl;
return false;
}
return true;

Descriptor set layouts also form a pipeline layout, which defines the types of resources a given pipeline can access. The created layouts are needed both to create a pipeline layout and to allocate descriptor sets.

create a descriptor pool

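Descriptors are allocated from descriptor pools. When creating a pool we must specify which descriptor types, and how many of each, can be allocated from it, along with the maximum number of sets.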
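Deciding how many descriptors of each type a pool needs is simple bookkeeping. The sketch below is one hedged way to do it, assuming every set allocated from the pool shares a single layout. `DescriptorType`, `Binding`, `PoolSize` and `PoolSizesFor` are hypothetical stand-ins invented for illustration (they loosely mirror VkDescriptorType, VkDescriptorSetLayoutBinding and VkDescriptorPoolSize) so that the example compiles without the Vulkan headers.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-ins for VkDescriptorType, a layout binding and
// VkDescriptorPoolSize; they are not the real Vulkan types.
enum class DescriptorType { CombinedImageSampler, UniformBuffer, StorageBuffer };
struct Binding  { DescriptorType type; uint32_t count; };
struct PoolSize { DescriptorType type; uint32_t count; };

// Sums the descriptor counts per type for `set_count` sets that all use
// the same layout. The result is ordered by descriptor type.
std::vector<PoolSize> PoolSizesFor(const std::vector<Binding>& layout,
                                   uint32_t set_count) {
    assert(set_count > 0);
    std::map<DescriptorType, uint32_t> totals;
    for (const Binding& binding : layout) {
        totals[binding.type] += binding.count * set_count;
    }
    std::vector<PoolSize> sizes;
    for (const auto& entry : totals) {
        sizes.push_back({entry.first, entry.second});
    }
    return sizes;
}
```

In a real application the resulting list would be converted to VkDescriptorPoolSize entries, and `set_count` would be passed as the maximum number of sets.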

VkDescriptorPoolCreateInfo

VkDescriptorPoolCreateInfo descriptor_pool_create_info = {
VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO,
nullptr,
free_individual_sets ?
VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT : 0,
max_sets_count,
static_cast<uint32_t>(descriptor_types.size()),
descriptor_types.data()
};

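Be careful with multithreading: a descriptor pool must not be used from several threads at the same time without external synchronization.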

allocating descriptor sets

Descriptor sets gather shader resources (descriptors) into one container object. Their contents, types, and number of resources are defined by a descriptor set layout; their storage is taken from a pool.

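Descriptor sets provide resources to shaders. They form the interface between the application and the programmable pipeline stages, and the structure of that interface is defined by descriptor set layouts. The actual data is supplied when we update descriptor sets with image or buffer resources and then bind the sets to a command buffer during recording.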

VkDescriptorSetAllocateInfo descriptor_set_allocate_info = {
VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO,
nullptr,
descriptor_pool,
static_cast<uint32_t>(descriptor_set_layouts.size()),
descriptor_set_layouts.data()
};

Then allocate the descriptor sets:

descriptor_sets.resize( descriptor_set_layouts.size() );
VkResult result = vkAllocateDescriptorSets( logical_device,
&descriptor_set_allocate_info, descriptor_sets.data() );
if( VK_SUCCESS != result ) {
std::cout << "Could not allocate descriptor sets." << std::endl;
return false;
}
return true;

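Unfortunately, when we allocate and free individual descriptor sets, the pool's memory may become fragmented. When that happens we may be unable to allocate new sets even though the pool's limits have not been reached.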

The first time descriptor sets are allocated from a pool, fragmentation cannot occur.

The problem also does not arise if every descriptor set uses the same number of resources of the same types.

To avoid the problem, free all descriptor sets at once (by resetting the pool); otherwise the only option is to create a new pool.

updating descriptor sets

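Now we want to provide the specific resources (samplers, image views, buffers, buffer views) that will later be bound to a pipeline through descriptor sets. Specifying which resources to use is done by updating the descriptor sets.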

Some custom helper structures:

samplers

struct ImageDescriptorInfo {
VkDescriptorSet TargetDescriptorSet;
uint32_t TargetDescriptorBinding;
uint32_t TargetArrayElement;
VkDescriptorType TargetDescriptorType;
std::vector<VkDescriptorImageInfo> ImageInfos;
};

uniform and storage buffers

struct BufferDescriptorInfo {
VkDescriptorSet TargetDescriptorSet;
uint32_t TargetDescriptorBinding;
uint32_t TargetArrayElement;
VkDescriptorType TargetDescriptorType;
std::vector<VkDescriptorBufferInfo> BufferInfos;
};

uniform and storage texel buffer

struct TexelBufferDescriptorInfo {
VkDescriptorSet TargetDescriptorSet;
uint32_t TargetDescriptorBinding;
uint32_t TargetArrayElement;
VkDescriptorType TargetDescriptorType;
std::vector<VkBufferView> TexelBufferViews;
};

Descriptors can also be copied from another descriptor set:

struct CopyDescriptorInfo {
VkDescriptorSet TargetDescriptorSet;
uint32_t TargetDescriptorBinding;
uint32_t TargetArrayElement;
VkDescriptorSet SourceDescriptorSet;
uint32_t SourceDescriptorBinding;
uint32_t SourceArrayElement;
uint32_t DescriptorCount;
};

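All of the structures above specify the handle of the descriptor set to update, the binding index of the descriptor within that set, and the starting index in the array when the binding is accessed as an array. The remaining members are type-specific.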

std::vector<VkWriteDescriptorSet> write_descriptors;
for( auto & image_descriptor : image_descriptor_infos ) {
write_descriptors.push_back( {
VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,
nullptr,
image_descriptor.TargetDescriptorSet,
image_descriptor.TargetDescriptorBinding,
image_descriptor.TargetArrayElement,
static_cast<uint32_t>(image_descriptor.ImageInfos.size()),
image_descriptor.TargetDescriptorType,
image_descriptor.ImageInfos.data(),
nullptr,
nullptr
} );
}
for( auto & buffer_descriptor : buffer_descriptor_infos ) {
write_descriptors.push_back( {
VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,
nullptr,
buffer_descriptor.TargetDescriptorSet,
buffer_descriptor.TargetDescriptorBinding,
buffer_descriptor.TargetArrayElement,
static_cast<uint32_t>(buffer_descriptor.BufferInfos.size()),
buffer_descriptor.TargetDescriptorType,
nullptr,
buffer_descriptor.BufferInfos.data(),
nullptr
} );
}
for( auto & texel_buffer_descriptor : texel_buffer_descriptor_infos ) {
write_descriptors.push_back( {
VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET,
nullptr,
texel_buffer_descriptor.TargetDescriptorSet,
texel_buffer_descriptor.TargetDescriptorBinding,
texel_buffer_descriptor.TargetArrayElement,
static_cast<uint32_t>(texel_buffer_descriptor.TexelBufferViews.size()),
texel_buffer_descriptor.TargetDescriptorType,
nullptr,
nullptr,
texel_buffer_descriptor.TexelBufferViews.data()
} );
}

We can also reuse descriptors from other sets by copying them, which may be faster than writing new ones.

std::vector<VkCopyDescriptorSet> copy_descriptors;
for( auto & copy_descriptor : copy_descriptor_infos ) {
copy_descriptors.push_back( {
VK_STRUCTURE_TYPE_COPY_DESCRIPTOR_SET,
nullptr,
copy_descriptor.SourceDescriptorSet,
copy_descriptor.SourceDescriptorBinding,
copy_descriptor.SourceArrayElement,
copy_descriptor.TargetDescriptorSet,
copy_descriptor.TargetDescriptorBinding,
copy_descriptor.TargetArrayElement,
copy_descriptor.DescriptorCount
} );
}

update descriptor sets

vkUpdateDescriptorSets( logical_device,
static_cast<uint32_t>(write_descriptors.size()), write_descriptors.data(),
static_cast<uint32_t>(copy_descriptors.size()), copy_descriptors.data() );

binding descriptor sets

Once a descriptor set is ready, it must be bound to a command buffer during recording.

VkCommandBuffer command_buffer;
VkPipelineBindPoint pipeline_type;
VkPipelineLayout pipeline_layout;
std::vector<VkDescriptorSet> descriptor_sets;
uint32_t index_for_first_set;
std::vector<uint32_t> dynamic_offsets;
vkCmdBindDescriptorSets( command_buffer, pipeline_type,
pipeline_layout, index_for_first_set, static_cast<uint32_t>
(descriptor_sets.size()), descriptor_sets.data(),
static_cast<uint32_t>(dynamic_offsets.size()),
dynamic_offsets.data() );

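When we start recording a command buffer, its state is undefined. Before recording drawing operations that reference image or buffer resources, we must bind the appropriate resources to the command buffer. This is done by binding descriptor sets with vkCmdBindDescriptorSets().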

create descriptors with a texture and a uniform buffer

First create a combined image sampler and a uniform buffer, the resources the descriptors will refer to:

if( !CreateCombinedImageSampler( physical_device, logical_device,
VK_IMAGE_TYPE_2D, VK_FORMAT_R8G8B8A8_UNORM, sampled_image_size, 1, 1,
VK_IMAGE_USAGE_TRANSFER_DST_BIT,
VK_IMAGE_VIEW_TYPE_2D, VK_IMAGE_ASPECT_COLOR_BIT, VK_FILTER_LINEAR,
VK_FILTER_LINEAR, VK_SAMPLER_MIPMAP_MODE_NEAREST,
VK_SAMPLER_ADDRESS_MODE_REPEAT,
VK_SAMPLER_ADDRESS_MODE_REPEAT, VK_SAMPLER_ADDRESS_MODE_REPEAT, 0.0f,
false, 1.0f, false, VK_COMPARE_OP_ALWAYS, 0.0f, 0.0f,
VK_BORDER_COLOR_FLOAT_OPAQUE_BLACK, false,
sampler, sampled_image, sampled_image_memory_object, sampled_image_view )
) {
return false;
}
if( !CreateUniformBuffer( physical_device, logical_device,
uniform_buffer_size, VK_BUFFER_USAGE_TRANSFER_DST_BIT, uniform_buffer,
uniform_buffer_memory_object ) ) {
return false;
}

Next, prepare the layout that defines the internal structure of the descriptor set:

std::vector<VkDescriptorSetLayoutBinding> bindings = {
{
0,
VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER,
1,
VK_SHADER_STAGE_FRAGMENT_BIT,
nullptr
},
{
1,
VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER,
1,
VK_SHADER_STAGE_VERTEX_BIT | VK_SHADER_STAGE_FRAGMENT_BIT,
nullptr
}
};
if( !CreateDescriptorSetLayout( logical_device, bindings,
descriptor_set_layout ) ) {
return false;
}

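Finally, once a descriptor pool has been created and the descriptor set allocated from it as described earlier, update the descriptor set with the resources created at the start: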

std::vector<ImageDescriptorInfo> image_descriptor_infos = {
{
descriptor_sets[0],
0,
0,
VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER,
{
{
sampler,
sampled_image_view,
VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL
}
}
}
};
std::vector<BufferDescriptorInfo> buffer_descriptor_infos = {
{
descriptor_sets[0],
1,
0,
VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER,
{
{
uniform_buffer,
0,
VK_WHOLE_SIZE
}
}
}
};
UpdateDescriptorSets( logical_device, image_descriptor_infos,
buffer_descriptor_infos, {}, {} );
return true;

destroy

free descriptor sets

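To return a descriptor set's memory to the pool, free the set. The returned memory can be used to allocate another set, but that allocation may fail because the pool's memory is fragmented.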

Several descriptor sets can be freed at once, provided they all come from the same pool:

VkResult result = vkFreeDescriptorSets( logical_device, descriptor_pool,
static_cast<uint32_t>(descriptor_sets.size()), descriptor_sets.data() );
if( VK_SUCCESS != result ) {
std::cout << "Error occurred during freeing descriptor sets." <<
std::endl;
return false;
}
descriptor_sets.clear();
return true;

reset a descriptor pool

Resetting a pool frees all descriptor sets allocated from it at once.

If the pool was created without the VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT flag, resetting the pool is the only way to free its descriptor sets, short of destroying the pool.

VkResult result = vkResetDescriptorPool( logical_device, descriptor_pool, 0
);
if( VK_SUCCESS != result ) {
std::cout << "Error occurred during descriptor pool reset." << std::endl;
return false;
}
return true;

destroy a descriptor pool

if( VK_NULL_HANDLE != descriptor_pool ) {
vkDestroyDescriptorPool( logical_device, descriptor_pool, nullptr );
descriptor_pool = VK_NULL_HANDLE;
}

destroy a descriptor set layout

if( VK_NULL_HANDLE != descriptor_set_layout ) {
vkDestroyDescriptorSetLayout( logical_device, descriptor_set_layout,
nullptr );
descriptor_set_layout = VK_NULL_HANDLE;
}

destroy a sampler

if( VK_NULL_HANDLE != sampler ) {
vkDestroySampler( logical_device, sampler, nullptr );
sampler = VK_NULL_HANDLE;
}

Resources and Memory

Posted on 2019-04-05 | In sdk , graphics , vulkan

Resources and Memory

[TOC]

Contents

Creating a buffer
Allocating and binding a memory object for a buffer
Setting a buffer memory barrier
Creating a buffer view
Creating an image
Allocating and binding a memory object to an image
Setting an image memory barrier
Creating an image view
Creating a 2D image and view
Creating a layered 2D image with a CUBEMAP view
Mapping, updating, and unmapping host-visible memory
Copying data between buffers
Copying data from a buffer to an image
Copying data from an image to a buffer
Using a staging buffer to update a buffer with a device-local memory bound
Using a staging buffer to update an image with a device-local memory bound
Destroying an image view
Destroying an image
Destroying a buffer view
Freeing a memory object
Destroying a buffer

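Buffers and images are the two most important kinds of resources for storing data in Vulkan. Buffers hold linear arrays of data. Images, like OpenGL textures, can be 1D, 2D, or 3D. Both serve many purposes: shaders can read or sample data from them, or store data in them. Images can be used as color or depth/stencil attachments (render targets), meaning we can render into them. Buffers can also hold vertex data, indices, and the parameters for indirect drawing.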

Importantly, every one of these usages must be specified when the resource is created (several can be given at once).

Buffers and images in Vulkan have no storage of their own; we must create memory objects and bind them.

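This chapter covers how to use these resources, how to allocate memory and bind it, how to upload data to the GPU, and how to copy data between resources.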

buffer

creating a buffer

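Buffers can be used for many purposes. Through descriptor sets they can serve in pipelines as the backing store for uniform buffers, storage buffers, texel buffers, and so on. They can be the source of vertex indices or attributes, or act as staging resources that hold data temporarily on its way from the CPU to the GPU. For any of these purposes we need to create a buffer and specify its usage.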

A buffer can be used only for the usages specified at creation.

Supported buffer usages:

| flag | description |
| --- | --- |
| VK_BUFFER_USAGE_TRANSFER_SRC_BIT | specifies that the buffer can be a source of data for copy operations |
| VK_BUFFER_USAGE_TRANSFER_DST_BIT | specifies that we can copy data to the buffer |
| VK_BUFFER_USAGE_UNIFORM_TEXEL_BUFFER_BIT | indicates that the buffer can be used in shaders as a uniform texel buffer |
| VK_BUFFER_USAGE_STORAGE_TEXEL_BUFFER_BIT | specifies that the buffer can be used in shaders as a storage texel buffer |
| VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT | indicates that the buffer can be used in shaders as a source of values for uniform variables |
| VK_BUFFER_USAGE_STORAGE_BUFFER_BIT | indicates that we can store data in the buffer from within shaders |
| VK_BUFFER_USAGE_INDEX_BUFFER_BIT | specifies that the buffer can be used as a source of vertex indices during drawing |
| VK_BUFFER_USAGE_VERTEX_BUFFER_BIT | indicates that the buffer can be a source of data for vertex attributes specified during drawing |
| VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT | indicates that the buffer can contain data that will be used during indirect drawing |

VkBufferCreateInfo buffer_create_info = {
VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO,
nullptr,
0,
size,
usage,
VK_SHARING_MODE_EXCLUSIVE,
0,
nullptr
};

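The sharing mode passed above, VK_SHARING_MODE_EXCLUSIVE, is an important parameter. It specifies whether queues from multiple families can access the buffer at the same time. Exclusive sharing mode tells the driver that the buffer is referenced by queues from only one family at a time. To use the buffer in commands submitted to a queue from another family, we must explicitly tell the driver when ownership changes from one family to the other. This gives better performance at the cost of more work.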

We can also specify VK_SHARING_MODE_CONCURRENT. Queues from multiple families can then access the buffer at the same time with no ownership transfers, but concurrent access may be slower.

Create the buffer:

VkResult result = vkCreateBuffer( logical_device, &buffer_create_info,
nullptr, &buffer );
if( VK_SUCCESS != result ) {
std::cout << "Could not create a buffer." << std::endl;
return false;
}
return true;

allocating and binding a memory object for a buffer

Buffers and images in Vulkan have no memory of their own; we must allocate a memory object and bind it.

On memory management, see:

https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator

https://www.youtube.com/watch?v=rXSdDE7NWmA

Procedure:

1. Take VkPhysicalDevice physical_device.

2. Create VkPhysicalDeviceMemoryProperties physical_device_memory_properties.

3. Call vkGetPhysicalDeviceMemoryProperties( physical_device, &physical_device_memory_properties ). It stores the memory parameters (the number of heaps, their sizes, and the memory types).

4. Take VkDevice logical_device.

5. Take VkBuffer buffer.

6. Create VkMemoryRequirements memory_requirements.

7. Call vkGetBufferMemoryRequirements( logical_device, buffer, &memory_requirements ).

8. Create VkDeviceMemory memory_object = VK_NULL_HANDLE.

9. Create VkMemoryPropertyFlagBits memory_properties.

10. Iterate over the memory types of the physical device (the memoryTypes array of physical_device_memory_properties), using type as the index. In each iteration:

    1. Check that the bit at position type is set in memory_requirements.memoryTypeBits.

    2. Check that every bit set in memory_properties is also set in the propertyFlags member of memoryTypes[type].

    3. If either check fails, continue with the next memory type.

    4. Create VkMemoryAllocateInfo buffer_memory_allocate_info with

       .allocationSize = memory_requirements.size,

       .memoryTypeIndex = type

    5. Call vkAllocateMemory( logical_device, &buffer_memory_allocate_info, nullptr, &memory_object ).

    6. If the result is VK_SUCCESS, stop iterating.

11. Make sure a memory object was actually allocated (memory_object is not VK_NULL_HANDLE).

12. Bind it: call vkBindBufferMemory( logical_device, buffer, memory_object, 0 ).

13. Make sure the call succeeded.

To allocate memory for a buffer, we need to know which memory types the physical device offers and how many there are:

VkPhysicalDeviceMemoryProperties physical_device_memory_properties;
vkGetPhysicalDeviceMemoryProperties( physical_device,
&physical_device_memory_properties );

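Next, we need to know how much storage the buffer requires (it may be more than the buffer's size) and which memory types are compatible with it. This information is stored in a VkMemoryRequirements structure: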

VkMemoryRequirements memory_requirements;
vkGetBufferMemoryRequirements(logical_device, buffer, &memory_requirements);

Next, we check which memory type satisfies the buffer's memory requirements:

memory_object = VK_NULL_HANDLE;
for( uint32_t type = 0; type < physical_device_memory_properties.memoryTypeCount; ++type )
{
if( (memory_requirements.memoryTypeBits & (1u << type)) &&
((physical_device_memory_properties.memoryTypes[type].propertyFlags &
memory_properties) == memory_properties) )
{
VkMemoryAllocateInfo buffer_memory_allocate_info =
{
VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
nullptr,
memory_requirements.size,
type
};
VkResult result = vkAllocateMemory( logical_device,
&buffer_memory_allocate_info, nullptr, &memory_object );
if( VK_SUCCESS == result )
{
break;
}
}
}

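Here we iterate over all available memory types and check whether each can be used for our buffer. We can also request additional properties: for example, to upload data directly from the CPU, memory mapping must be supported, so the memory type must be host-visible.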
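The selection test at the heart of that loop can be isolated and exercised without a device. The following is a minimal sketch under stated assumptions: `FindMemoryType` is a hypothetical helper, and the `k...` constants are illustrative stand-ins whose values are not the real Vulkan flag values. Each memory type is represented only by its property flags.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative stand-ins for memory property flags (not the Vulkan values).
constexpr uint32_t kDeviceLocal  = 0x1;
constexpr uint32_t kHostVisible  = 0x2;
constexpr uint32_t kHostCoherent = 0x4;

// Returns the index of the first memory type that is allowed by type_bits
// and has every property in `required`, or -1 when no type qualifies.
int FindMemoryType(uint32_t type_bits, uint32_t required,
                   const std::vector<uint32_t>& property_flags) {
    assert(property_flags.size() <= 32);  // one bit per type in type_bits
    for (uint32_t type = 0; type < property_flags.size(); ++type) {
        const bool allowed = (type_bits & (1u << type)) != 0;
        const bool has_all = (property_flags[type] & required) == required;
        if (allowed && has_all) {
            return static_cast<int>(type);
        }
    }
    return -1;
}
```

Note that the property test compares against `required` rather than checking for a non-zero result: a type that has only some of the requested properties must be rejected.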

Once we find a suitable memory type, we allocate a memory object from it and break out of the loop. Then we can bind it:

if( VK_NULL_HANDLE == memory_object ) {
std::cout << "Could not allocate memory for a buffer." << std::endl;
return false;
}
VkResult result = vkBindBufferMemory( logical_device, buffer,
memory_object, 0 );
if( VK_SUCCESS != result ) {
std::cout << "Could not bind memory object to a buffer." << std::endl;
return false;
}
return true;

When binding, we also specify an offset into the memory object. This parameter is useful for memory management.

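In general we do not use a separate memory object for each buffer. Instead we allocate large memory objects and let many buffers each use a part of one. Similarly, vkGetPhysicalDeviceMemoryProperties returns the physical device's available memory types, but for performance we do not call it every time a memory object is allocated: we call it once, after choosing a physical device, and reuse the stored result.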
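Sharing one large memory object among several buffers comes down to handing out offsets that respect each resource's alignment requirement (the alignment member of VkMemoryRequirements). Below is a minimal sketch of that arithmetic, assuming power-of-two alignments. `AlignUp` and `LinearBlock` are hypothetical names invented for illustration; a production allocator would also handle freeing and reuse.

```cpp
#include <cassert>
#include <cstdint>

// Rounds value up to the next multiple of alignment (a power of two).
uint64_t AlignUp(uint64_t value, uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

// Hypothetical linear sub-allocator over one large memory object.
struct LinearBlock {
    uint64_t capacity;  // size of the memory object in bytes
    uint64_t used;      // first byte that has not been handed out yet

    explicit LinearBlock(uint64_t cap) : capacity(cap), used(0) {}

    // Returns the offset to pass when binding a resource,
    // or UINT64_MAX when the block has no room left.
    uint64_t Allocate(uint64_t size, uint64_t alignment) {
        const uint64_t offset = AlignUp(used, alignment);
        if (offset > capacity || size > capacity - offset) {
            return UINT64_MAX;
        }
        used = offset + size;
        return offset;
    }
};
```

The returned offset is what would be passed as the last argument of vkBindBufferMemory in place of the 0 used above.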

set a buffer memory barrier

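We must inform the driver about every way a buffer is used, not only when the buffer is created but also before each intended use. When we have been using a buffer for one purpose and want to use it differently from now on, we must tell the driver that the usage has changed. This is done with buffer memory barriers, which are set as part of pipeline barriers while recording a command buffer.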

Define a custom structure:

struct BufferTransition {
VkBuffer Buffer;
// How the buffer has been accessed so far and how it will be accessed next
VkAccessFlags CurrentAccess;
VkAccessFlags NewAccess;
// Used to move the buffer between queue families (exclusive sharing mode)
uint32_t CurrentQueueFamily;
uint32_t NewQueueFamily;
};

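In Vulkan, operations submitted to a queue are executed in order but independently of one another. Sometimes an operation must wait until earlier operations have finished; this is where memory barriers come in.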

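Memory barriers define points in a command buffer's execution at which later commands must wait for earlier commands to finish their work. They also make the results of those operations visible to other operations.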

image

Image usages:

| type | description |
| --- | --- |
| VK_IMAGE_USAGE_TRANSFER_SRC_BIT | specifies that the image can be used as a source of data for copy operations |
| VK_IMAGE_USAGE_TRANSFER_DST_BIT | specifies that we can copy data to the image |
| VK_IMAGE_USAGE_SAMPLED_BIT | indicates that we can sample data from the image inside shaders |
| VK_IMAGE_USAGE_STORAGE_BIT | specifies that the image can be used as a storage image inside shaders |
| VK_IMAGE_USAGE_COLOR_ATTACHMENT_BIT | specifies that we can render into an image (use it as a color render target/attachment in a framebuffer) |
| VK_IMAGE_USAGE_DEPTH_STENCIL_ATTACHMENT_BIT | indicates that the image can be used as a depth and/or stencil buffer (as a depth render target/attachment in a framebuffer) |
| VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT | indicates that the memory bound to the image will be allocated lazily (on demand) |
| VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT | specifies that the image can be used as an input attachment inside shaders |

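Different usage scenarios require different image layouts, which are changed with image memory barriers. At creation we can specify only VK_IMAGE_LAYOUT_UNDEFINED, if we do not care about the initial contents, or VK_IMAGE_LAYOUT_PREINITIALIZED, if we want to upload data by mapping host-visible memory. The image must always be transitioned to another layout before it is used.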

VkImageCreateInfo image_create_info = {
VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO,
nullptr,
cubemap ? VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT : 0u,
type,
format,
size,
num_mipmaps,
cubemap ? 6 * num_layers : num_layers,
samples,
VK_IMAGE_TILING_OPTIMAL,
usage_scenarios,
VK_SHARING_MODE_EXCLUSIVE,
0,
nullptr,
VK_IMAGE_LAYOUT_UNDEFINED
};

When creating an image we must specify its tiling:

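  • linear tiling: the data is stored linearly in memory, so we can map the image's memory and read or initialize it directly from the application because we know how it is organized. Its usages are severely restricted, however (for example, it may not be usable as a depth texture or a cubemap), and it can reduce performance.
  • optimal tiling: usable for all purposes and faster, at the price of not knowing how the image's memory is organized.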

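Different hardware stores image data in different ways, so with optimal tiling the application cannot map the image's memory and initialize or read it directly. In that case, use staging resources.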

VkResult result = vkCreateImage( logical_device, &image_create_info,
nullptr, &image );
if( VK_SUCCESS != result ) {
std::cout << "Could not create an image." << std::endl;
return false;
}
return true;

allocating and binding a memory object to an image

vkGetImageMemoryRequirements

vkBindImageMemory

First check the available memory types:

VkPhysicalDeviceMemoryProperties physical_device_memory_properties;
vkGetPhysicalDeviceMemoryProperties( physical_device,
&physical_device_memory_properties );

Then query the memory requirements of the image. They can differ from image to image, depending on the format, the size, the number of mipmaps and layers, and other properties.

VkMemoryRequirements memory_requirements;
vkGetImageMemoryRequirements( logical_device, image, &memory_requirements);

The next step is to find a memory type that matches the image's memory requirements:

memory_object = VK_NULL_HANDLE;
for( uint32_t type = 0; type <physical_device_memory_properties.memoryTypeCount; ++type ) {
if( (memory_requirements.memoryTypeBits & (1u << type)) &&
((physical_device_memory_properties.memoryTypes[type].propertyFlags &
memory_properties) == memory_properties) ) {
VkMemoryAllocateInfo image_memory_allocate_info = {
VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
nullptr,
memory_requirements.size,
type
};
VkResult result = vkAllocateMemory( logical_device,
&image_memory_allocate_info, nullptr, &memory_object );
if( VK_SUCCESS == result ) {
break;
}
}
}

Then check that a memory object was allocated and bind it to the image:

if( VK_NULL_HANDLE == memory_object ) {
std::cout << "Could not allocate memory for an image." << std::endl;
return false;
}
VkResult result = vkBindImageMemory( logical_device, image, memory_object,
0 );
if( VK_SUCCESS != result ) {
std::cout << "Could not bind memory object to an image." << std::endl;
return false;
}
return true;

Allocating large memory objects and sharing them among many images can improve performance and reduce wasted memory.

setting an image memory barrier

Images are used as textures (through descriptor sets), as render targets, as swapchain images, and as the source or destination of copy operations.

Define a structure:

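struct ImageTransition {
VkImage Image;
// How the image has been accessed so far and how it will be accessed next
VkAccessFlags CurrentAccess;
VkAccessFlags NewAccess;
// Different image usages need different layouts, so a change of usage
// usually comes with a layout transition
VkImageLayout CurrentLayout;
VkImageLayout NewLayout;
uint32_t CurrentQueueFamily;
uint32_t NewQueueFamily;
VkImageAspectFlags Aspect; // which aspect is affected: color, depth, or stencil
};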

When we do not want to transfer ownership, we can use VK_QUEUE_FAMILY_IGNORED for both queue family indices.

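Memory barriers define points in a command buffer's execution at which later commands must wait for earlier commands to finish their tasks. They also make the results of those operations visible to other operations.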

In other words, barriers make the results of memory operations visible to subsequent commands.

For performance, it is best to use the image layout designed for each specific usage, although we should take care not to transition layouts too frequently.

VkImageMemoryBarrier

std::vector<VkImageMemoryBarrier> image_memory_barriers;
for( auto & image_transition : image_transitions ) {
image_memory_barriers.push_back( {
VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER,
nullptr,
image_transition.CurrentAccess,
image_transition.NewAccess,
image_transition.CurrentLayout,
image_transition.NewLayout,
image_transition.CurrentQueueFamily,
image_transition.NewQueueFamily,
image_transition.Image,
{
image_transition.Aspect,
0,
VK_REMAINING_MIP_LEVELS,
0,
VK_REMAINING_ARRAY_LAYERS
}
} );
}

We also need to specify the pipeline stages:

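In one case (on the left in the original figure), the vertex stage of later commands waits for the fragment stage of earlier commands to finish.

In the other (on the right), the fragment stage waits for the vertex stage to finish. Reducing the number of barriers is important; when a barrier is needed, order the drawing commands properly and choose appropriate pipeline stages for it.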

if( image_memory_barriers.size() > 0 ) {
vkCmdPipelineBarrier( command_buffer, generating_stages,
consuming_stages, 0, 0, nullptr, 0, nullptr,
static_cast<uint32_t>(image_memory_barriers.size()),
&image_memory_barriers[0] );
}

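If an image is used for the same purpose many times in a row, the barrier does not need to be set again. A barrier marks a change of usage, not the usage itself.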

create an image view

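Some Vulkan commands use images directly, but framebuffers and shaders (through descriptor sets) access images through image views. An image view selects a part of the image's memory and specifies additional information needed to read the image's data.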

VkImageViewCreateInfo
VkImageView

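An image view defines additional data used to access an image. Through it we can specify which parts of the image commands may access; for example, when rendering into an image we can specify that only one mipmap level should be updated.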

An image view also defines how the image's memory is interpreted. Images with multiple layers are a good example: we can define a view that interprets the image as a layered image, or one that builds a cubemap from it.

VkImageViewCreateInfo image_view_create_info = {
VK_STRUCTURE_TYPE_IMAGE_VIEW_CREATE_INFO,
nullptr,
0,
image,
view_type,
format,
{
VK_COMPONENT_SWIZZLE_IDENTITY,
VK_COMPONENT_SWIZZLE_IDENTITY,
VK_COMPONENT_SWIZZLE_IDENTITY,
VK_COMPONENT_SWIZZLE_IDENTITY
},
{
aspect,
0,
VK_REMAINING_MIP_LEVELS,
0,
VK_REMAINING_ARRAY_LAYERS
}
};

vkCreateImageView

VkResult result = vkCreateImageView( logical_device,
&image_view_create_info, nullptr, &image_view );
if( VK_SUCCESS != result ) {
std::cout << "Could not create an image view." << std::endl;
return false;
}
return true;

create a 2D image and view

A 2D texture with four RGBA components and 32 bits per texel is the most commonly used kind of image.

Creating one takes three steps:

  • create an image
  • create a memory object (or use an existing one) and bind it to the image
  • create an image view

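The image must be created with type VK_IMAGE_TYPE_2D and format VK_FORMAT_R8G8B8A8_UNORM. The remaining properties depend on the image's size (when creating a texture from an existing image file, we must match its dimensions), the kind of filtering (whether mipmapping is wanted), the number of samples (whether multisampling is needed), and the intended usages.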
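When mipmapping is wanted, the number of mipmap levels for a complete chain follows from the image's dimensions: each level halves the larger dimension (rounding down) until it reaches 1. A minimal sketch of that calculation, where `FullMipChainLevels` is a hypothetical helper name:

```cpp
#include <cassert>
#include <cstdint>

// Number of levels in a complete mipmap chain for a width x height image:
// level 0 is the full image and the last level is 1 x 1.
uint32_t FullMipChainLevels(uint32_t width, uint32_t height) {
    assert(width > 0 && height > 0);
    uint32_t largest = width > height ? width : height;
    uint32_t levels = 1;
    while (largest > 1) {
        largest >>= 1;  // each level halves the dimension, rounding down
        ++levels;
    }
    return levels;
}
```

The result could be passed as the num_mipmaps argument used in the listings of this section; a smaller value is also valid when a partial chain is enough.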

Image creation is wrapped in CreateImage:

if( !CreateImage( logical_device, VK_IMAGE_TYPE_2D, format, { size.width,
size.height, 1 }, num_mipmaps, num_layers, samples, usage, false, image ) )
{
return false;
}

Allocating and binding a memory object is wrapped in AllocateAndBindMemoryObjectToImage:

if( !AllocateAndBindMemoryObjectToImage( physical_device, logical_device,
image, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, memory_object ) ) {
return false;
}

An existing memory object can also be used.

Then create the image view:

if( !CreateImageView( logical_device, image, VK_IMAGE_VIEW_TYPE_2D, format,
aspect, image_view ) ) {
return false;
}

create a layered 2D image with a CUBEMAP view

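Applications commonly use cubemaps to simulate objects reflecting their environment. There is no dedicated cubemap image type: we create a layered image and, through an image view, tell the hardware to treat its layers as the six cubemap faces.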

Creation is the same as for an ordinary image, except that a cubemap needs six layers and cannot use more than one sample per texel.

if( !CreateImage( logical_device, VK_IMAGE_TYPE_2D,
VK_FORMAT_R8G8B8A8_UNORM, { size, size, 1 }, num_mipmaps, 6,
VK_SAMPLE_COUNT_1_BIT, usage, true, image ) ) {
return false;
}

Allocate and bind a memory object:

if( !AllocateAndBindMemoryObjectToImage( physical_device, logical_device,
image, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, memory_object ) ) {
return false;
}

Create the image view, specifying the cubemap view type:

if( !CreateImageView( logical_device, image, VK_IMAGE_VIEW_TYPE_CUBE,
VK_FORMAT_R8G8B8A8_UNORM, aspect, image_view ) ) {
return false;
}

The layers map to cubemap faces in the order +X, -X, +Y, -Y, +Z, -Z.
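When uploading face data it helps to turn this order into an explicit mapping between faces and array layers. The sketch below assumes the layers of each cube are stored consecutively; `CubeFace`, `CubeFaceLayer` and `FaceOfLayer` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Faces listed in the order of a cube view's six consecutive layers.
enum class CubeFace : uint32_t { PosX = 0, NegX, PosY, NegY, PosZ, NegZ };

// Array layer that holds `face` of the cube with index cube_index
// (cube_index is 0 for a single cubemap).
uint32_t CubeFaceLayer(uint32_t cube_index, CubeFace face) {
    return cube_index * 6u + static_cast<uint32_t>(face);
}

// Inverse mapping: which face a given layer represents.
CubeFace FaceOfLayer(uint32_t layer, uint32_t layer_count) {
    assert(layer_count % 6u == 0 && layer < layer_count);
    return static_cast<CubeFace>(layer % 6u);
}
```

For example, the data for the -Y face of a single cubemap goes into layer 3.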

data

mapping,updating and unmapping host-visible memory

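Memory bound to images and buffers used for rendering usually lives on the graphics hardware (device-local memory). It is fast, but the application often cannot access it directly, so we use intermediate (staging) resources to move data between the GPU (device) and the CPU (host).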

Staging resources must be host-visible. To upload or read data, we map their memory.

Mapping memory is the simplest way to upload data. We specify which part of the memory to map (offset and size):

VkResult result;
void * local_pointer;
result = vkMapMemory( logical_device, memory_object, offset, data_size, 0,
&local_pointer );
if( VK_SUCCESS != result ) {
std::cout << "Could not map memory object." << std::endl;
return false;
}

Mapping gives us a pointer that is used like any other C++ pointer, for reading or for writing:

std::memcpy( local_pointer, data, data_size );

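After updating a mapped memory range, we must tell the driver that the memory's contents changed; otherwise the new data may not be immediately visible to operations submitted to queues. This is called flushing: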

std::vector<VkMappedMemoryRange> memory_ranges = {
{
VK_STRUCTURE_TYPE_MAPPED_MEMORY_RANGE,
nullptr,
memory_object,
offset,
data_size
}
};
result = vkFlushMappedMemoryRanges( logical_device,
static_cast<uint32_t>(memory_ranges.size()), &memory_ranges[0] );
if( VK_SUCCESS != result ) {
std::cout << "Could not flush mapped memory." << std::endl;
return false;
}
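One detail the listing above leaves out: for memory that is not host-coherent, the Vulkan specification requires the flushed offset to be a multiple of VkPhysicalDeviceLimits::nonCoherentAtomSize, and the size to be a multiple of it as well unless the range ends at the end of the allocation. The sketch below shows one way to widen a range accordingly. `FlushRange` and `AlignFlushRange` are hypothetical names, and the caller is assumed to have mapped at least the widened range.

```cpp
#include <cassert>
#include <cstdint>

struct FlushRange {
    uint64_t offset;
    uint64_t size;
};

// Expands [offset, offset + size) outwards to atom boundaries and clamps
// the end to the size of the allocation.
FlushRange AlignFlushRange(uint64_t offset, uint64_t size,
                           uint64_t atom_size, uint64_t allocation_size) {
    assert(atom_size != 0 && offset + size <= allocation_size);
    const uint64_t begin = (offset / atom_size) * atom_size;
    uint64_t end = ((offset + size + atom_size - 1) / atom_size) * atom_size;
    if (end > allocation_size) {
        end = allocation_size;
    }
    return {begin, end - begin};
}
```

The widened offset and size would replace the `offset` and `data_size` values in the VkMappedMemoryRange above. Memory types with the host-coherent property do not need an explicit flush.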

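Once we are done with the mapped memory we can unmap it. Keeping memory mapped should not hurt the application's performance, but the memory should be unmapped before the application shuts down and releases its resources.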

if( unmap ) {
vkUnmapMemory( logical_device, memory_object );
} else if( nullptr != pointer ) {
*pointer = local_pointer;
}
return true;

copy data between buffers

Besides mapping, Vulkan supports copying data between buffers, including buffers bound to different memory types.

Such operations are recorded in a command buffer:

if( regions.size() > 0 ) {
vkCmdCopyBuffer( command_buffer, source_buffer, destination_buffer,
static_cast<uint32_t>(regions.size()), &regions[0] );
}

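For best performance, resources used during rendering should be bound to device-local memory, but such memory usually cannot be mapped. With vkCmdCopyBuffer we can copy data into such a buffer from another buffer bound to host-visible memory, which the application can map and update directly.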

The buffer that data is copied from must be created with the VK_BUFFER_USAGE_TRANSFER_SRC_BIT usage.
The buffer that data is copied to must be created with the VK_BUFFER_USAGE_TRANSFER_DST_BIT usage.

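Before using a buffer as the destination of a copy, set a memory barrier telling the driver that from now on the buffer is accessed with VK_ACCESS_TRANSFER_WRITE_BIT (VK_ACCESS_TRANSFER_READ_BIT for the source buffer). After the copy, to use the buffer for its actual purpose, set another memory barrier.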

copy data from a buffer to an image

Images can be bound to memory objects of different memory types, but only host-visible memory can be mapped and updated directly by the application. To update an image bound to device-local memory, copy the data from a buffer.

Copying data from a buffer to an image is done through a command buffer:

if( regions.size() > 0 ) {
vkCmdCopyBufferToImage( command_buffer, source_buffer, destination_image,image_layout, static_cast<uint32_t>(regions.size()), &regions[0] );
}

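We need to know how the image data is laid out in the buffer: the memory offset, the length of a data row, and the height of the data in the buffer. Row length and height can be set to 0, which means the data is tightly packed and matches the dimensions of the copied image region.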
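To make the layout rule concrete, the sketch below computes where a given texel of the copied region sits in the buffer, treating a row length or height of 0 as "tightly packed". It assumes an uncompressed format with a fixed texel size in bytes. `BufferLayout` and `TexelOffset` are hypothetical names; the three fields correspond to bufferOffset, bufferRowLength and bufferImageHeight of VkBufferImageCopy.

```cpp
#include <cassert>
#include <cstdint>

// Buffer-side description of a copy region. Lengths are in texels.
struct BufferLayout {
    uint64_t buffer_offset;  // byte offset of the first texel
    uint32_t row_length;     // 0 means tightly packed: use the region width
    uint32_t image_height;   // 0 means tightly packed: use the region height
};

// Byte offset in the buffer of texel (x, y, z) of a region that is
// width x height texels per slice, for an uncompressed format.
uint64_t TexelOffset(const BufferLayout& layout, uint32_t width,
                     uint32_t height, uint32_t texel_size,
                     uint32_t x, uint32_t y, uint32_t z) {
    assert(x < width && y < height);
    const uint64_t row  = layout.row_length ? layout.row_length : width;
    const uint64_t rows = layout.image_height ? layout.image_height : height;
    assert(row >= width && rows >= height);
    return layout.buffer_offset + ((z * rows + y) * row + x) * texel_size;
}
```

A non-zero row length larger than the region width is how padded rows in the source data are skipped.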

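Copying in the other direction, from an image to a buffer, works the same way. The source image must be created with the VK_IMAGE_USAGE_TRANSFER_SRC_BIT usage, and before the transfer its layout must be VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL.

The destination buffer must be created with the VK_BUFFER_USAGE_TRANSFER_DST_BIT usage.

Before copying data from the image, set a memory barrier that changes the image's layout to VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL and its access type to VK_ACCESS_TRANSFER_READ_BIT. If the image is used for another purpose after the copy, set another barrier.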

staging buffer

using a staging buffer to update a buffer bound to device-local memory

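Staging resources are used to update memory that is not host-visible. Such memory cannot be mapped, so we need an intermediate buffer that can be mapped and updated, and we transfer the data through it.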

We need a buffer that can be mapped; it can come from a pool of reusable staging buffers.

VkBuffer staging_buffer;
if( !CreateBuffer( logical_device, data_size,
VK_BUFFER_USAGE_TRANSFER_SRC_BIT, staging_buffer ) ) {
return false;
}
VkDeviceMemory memory_object;
if( !AllocateAndBindMemoryObjectToBuffer( physical_device, logical_device,
staging_buffer, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, memory_object ) ) {
return false;
}

Then map it and update its contents:

if( !MapUpdateAndUnmapHostVisibleMemory( logical_device, memory_object, 0,
data_size, data, true, nullptr ) ) {
return false;
}

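Then start recording the command buffer. First set a memory barrier for the destination buffer that changes its usage to the target of a copy operation. The staging buffer needs no barrier: once we have mapped and updated its memory, the contents are visible to other commands, because an implicit barrier for host writes applies when the command buffer is submitted to a queue.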

if( !BeginCommandBufferRecordingOperation( command_buffer,
VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT, nullptr ) ) {
return false;
}
SetBufferMemoryBarrier( command_buffer,
destination_buffer_generating_stages, VK_PIPELINE_STAGE_TRANSFER_BIT, { {
destination_buffer, destination_buffer_current_access,
VK_ACCESS_TRANSFER_WRITE_BIT, VK_QUEUE_FAMILY_IGNORED,
VK_QUEUE_FAMILY_IGNORED } } );

Then we can record the copy operation:

CopyDataBetweenBuffers( command_buffer, staging_buffer, destination_buffer,
{ { 0, destination_offset, data_size } } );

Afterwards, set another barrier for the target buffer that changes its usage to the one expected by later commands:

SetBufferMemoryBarrier( command_buffer, VK_PIPELINE_STAGE_TRANSFER_BIT,
destination_buffer_consuming_stages, { { destination_buffer,
VK_ACCESS_TRANSFER_WRITE_BIT, destination_buffer_new_access,
VK_QUEUE_FAMILY_IGNORED, VK_QUEUE_FAMILY_IGNORED } } );
if( !EndCommandBufferRecordingOperation( command_buffer ) ) {
return false;
}

Then create a fence and use it when submitting the command buffer to the queue:

VkFence fence;
if( !CreateFence( logical_device, false, fence ) ) {
return false;
}
if( !SubmitCommandBuffersToQueue( queue, {}, { command_buffer },
signal_semaphores, fence ) ) {
return false;
}

If the staging buffer is no longer needed, destroy it, but only after the submitted commands have finished using it (which the fence tells us):

if( !WaitForFences( logical_device, { fence }, VK_FALSE, 500000000 ) ) {
return false;
}
DestroyBuffer( logical_device, staging_buffer );
FreeMemoryObject( logical_device, memory_object );
return true;

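Real applications usually keep a pool of staging buffers and reuse them instead of creating and destroying one per transfer. This avoids stalling on the fence and improves efficiency.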

使用staging buffer更新device-local memory image

与上面的类似.

VkBuffer staging_buffer;
if( !CreateBuffer( logical_device, data_size,
VK_BUFFER_USAGE_TRANSFER_SRC_BIT, staging_buffer ) ) {
return false;
}
VkDeviceMemory memory_object;
if( !AllocateAndBindMemoryObjectToBuffer( physical_device, logical_device,
staging_buffer, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, memory_object ) ) {
return false;
}
if( !MapUpdateAndUnmapHostVisibleMemory( logical_device, memory_object, 0,
data_size, data, true, nullptr ) ) {
return false;
}

Set a barrier and record the copy:

if( !BeginCommandBufferRecordingOperation( command_buffer,
VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT, nullptr ) ) {
return false;
}
SetImageMemoryBarrier( command_buffer, destination_image_generating_stages,
VK_PIPELINE_STAGE_TRANSFER_BIT,
{
{
destination_image,
destination_image_current_access,
VK_ACCESS_TRANSFER_WRITE_BIT,
destination_image_current_layout,
VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
VK_QUEUE_FAMILY_IGNORED,
VK_QUEUE_FAMILY_IGNORED,
destination_image_aspect
} } );
CopyDataFromBufferToImage( command_buffer, staging_buffer,
destination_image, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
{
{
0,
0,
0,
destination_image_subresource,
destination_image_offset,
destination_image_size,
} } );

Change the image's usage again:

SetImageMemoryBarrier( command_buffer, VK_PIPELINE_STAGE_TRANSFER_BIT,
destination_image_consuming_stages,
{
{
destination_image,
VK_ACCESS_TRANSFER_WRITE_BIT,
destination_image_new_access,
VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
destination_image_new_layout,
VK_QUEUE_FAMILY_IGNORED,
VK_QUEUE_FAMILY_IGNORED,
destination_image_aspect
} } );
if( !EndCommandBufferRecordingOperation( command_buffer ) ) {
return false;
}

Submit with a fence, then wait and release the staging resources:

VkFence fence;
if( !CreateFence( logical_device, false, fence ) ) {
return false;
}
if( !SubmitCommandBuffersToQueue( queue, {}, { command_buffer },
signal_semaphores, fence ) ) {
return false;
}
if( !WaitForFences( logical_device, { fence }, VK_FALSE, 500000000 ) ) {
return false;
}
DestroyBuffer( logical_device, staging_buffer );
FreeMemoryObject( logical_device, memory_object );
return true;

With a pool of staging buffers, the fence wait is not needed here.

destroy

Destroy the buffer view:

if( VK_NULL_HANDLE != buffer_view ) {
vkDestroyBufferView( logical_device, buffer_view, nullptr );
buffer_view = VK_NULL_HANDLE;
}

memory object

if( VK_NULL_HANDLE != memory_object ) {
vkFreeMemory( logical_device, memory_object, nullptr );
memory_object = VK_NULL_HANDLE;
}

buffer

if( VK_NULL_HANDLE != buffer ) {
vkDestroyBuffer( logical_device, buffer, nullptr );
buffer = VK_NULL_HANDLE;
}

Graphics and Compute Pipelines

Posted on 2019-04-05 | In sdk , graphics , vulkan

Graphics and Compute Pipelines

[TOC]

Overview

Contents

  • Creating a shader module
  • Specifying pipeline shader stages
  • Specifying a pipeline vertex binding description, attribute description, and input
    state
  • Specifying a pipeline input assembly state
  • Specifying a pipeline tessellation state
  • Specifying a pipeline viewport and scissor test state
  • Specifying a pipeline rasterization state
  • Specifying a pipeline multisample state
  • Specifying a pipeline depth and stencil state
  • Specifying a pipeline blend state
  • Specifying pipeline dynamic states
  • Creating a pipeline layout
  • Specifying graphics pipeline creation parameters
  • Creating a pipeline cache object
  • Retrieving data from a pipeline cache
  • Merging multiple pipeline cache objects
  • Creating a graphics pipeline
  • Creating a compute pipeline
  • Binding a pipeline object
  • Creating a pipeline layout with a combined image sampler, a buffer, and push
    constant ranges
  • Creating a graphics pipeline with vertex and fragment shaders, depth test
    enabled, and with dynamic viewport and scissor tests
  • Creating multiple graphics pipelines on multiple threads
  • Destroying a pipeline
  • Destroying a pipeline cache
  • Destroying a pipeline layout
  • Destroying a shader module

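Introduction

This topic is one of the core parts of Vulkan.

Operations recorded in command buffers and submitted to queues are executed by the hardware. Compute pipelines perform mathematical computations; graphics pipelines draw geometry.

Pipeline objects control how geometry is drawn and how computations are performed; they govern the hardware's behavior. They are one of the biggest differences between Vulkan and OpenGL. OpenGL lets us change rendering or computing parameters at any time: we can set state, activate a shader program, draw geometry, then activate another shader program and draw other geometry. In Vulkan this is not possible, because the entire rendering or computing state is stored in a single monolithic object. To use different shaders we must prepare and bind a separate pipeline; we cannot simply switch shaders.

This may seem intimidating at first, because many shader variations (not to mention the rest of the pipeline state) can lead to a large number of pipeline objects. But it serves two goals. The first is performance: the driver knows the whole state in advance and can optimize the execution of subsequent operations. The second is stability: changing state at arbitrary times can force the driver to perform extra work, such as shader recompilation. In Vulkan, everything that must be prepared in advance, including shader compilation, is done at pipeline creation.

This article discusses how to set the parameters of graphics and compute pipelines: preparing shader modules and deciding which shader stages are active, setting up depth/stencil tests, enabling blending, and specifying vertex attributes and how they are provided during drawing. Finally, it shows how to create multiple pipelines and how to speed up their creation.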

Shader Module

Creating a shader module

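The first step is to prepare shader modules for the pipeline object from SPIR-V assembly. One module may contain code for multiple shader stages.

A shader module contains the source code of the selected shader programs in the form of SPIR-V assembly. It may contain multiple stages, but each stage needs an associated entry point. These entry points are among the parameters provided when creating a pipeline object.

Load the SPIR-V code, then fill in the create info: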

VkShaderModuleCreateInfo shader_module_create_info = {
VK_STRUCTURE_TYPE_SHADER_MODULE_CREATE_INFO,
nullptr,
0,
source_code.size(),
reinterpret_cast<uint32_t const *>(source_code.data())
};

Call vkCreateShaderModule:

VkResult result = vkCreateShaderModule( logical_device,
&shader_module_create_info, nullptr, &shader_module );
if( VK_SUCCESS != result ) {
std::cout << "Could not create a shader module." << std::endl;
return false;
}
return true;

Keep in mind that shaders are not compiled and linked when the shader module is created; that happens when the pipeline object is created.

pipeline states

Specifying pipeline shader stages

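Compute pipelines can use only compute shaders, but a graphics pipeline can include several shader stages: vertex, geometry, tessellation control and evaluation, and fragment. To create a pipeline correctly, we need to specify which programmable shader stages are active when the pipeline is bound in a command buffer, and provide the code for all active shaders.

Define a custom structure: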

struct ShaderStageParameters {
VkShaderStageFlagBits ShaderStage;
VkShaderModule ShaderModule;
char const * EntryPointName;
VkSpecializationInfo const * SpecializationInfo;
};

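VkSpecializationInfo supplies values for specialization constants. It may be nullptr.

To define the set of shader stages a pipeline activates, prepare an array of VkPipelineShaderStageCreateInfo. Each shader stage needs its own entry, which names the shader module and the entry point implementing that stage's behavior.

Specialization information can also be provided, so that constant values are changed at pipeline creation (at runtime) rather than when the shader is compiled. This allows the same shader to be used several times with small variations.

Both graphics and compute pipelines require pipeline shader stage information.

Assume only vertex and fragment shaders are used: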

std::vector<ShaderStageParameters>shader_stage_params = {
{
VK_SHADER_STAGE_VERTEX_BIT,
*vertex_shader_module,
"main",
nullptr
},
{
VK_SHADER_STAGE_FRAGMENT_BIT,
*fragment_shader_module,
"main",
nullptr
}
};
shader_stage_create_infos.clear();
for( auto & shader_stage : shader_stage_params ) {
shader_stage_create_infos.push_back( {
VK_STRUCTURE_TYPE_PIPELINE_SHADER_STAGE_CREATE_INFO,
nullptr,
0,
shader_stage.ShaderStage,
shader_stage.ShaderModule,
shader_stage.EntryPointName,
shader_stage.SpecializationInfo
} );
}

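Each shader stage may be specified only once.

Specifying a pipeline vertex binding description, attribute description, and input state

When drawing geometry, we prepare vertices together with additional attributes such as normal vectors, colors, and texture coordinates. This vertex data is chosen freely by us, so for the hardware to use it correctly we must specify how many attributes there are, how they are laid out in memory, and where they are fetched from. This is provided through the vertex binding descriptions and attribute descriptions when creating the graphics pipeline.

A vertex binding defines a collection of data taken from the vertex buffer bound to a selected index. The binding serves as a numbered data source for vertex attributes. At least 16 separate bindings are available, to which we can bind separate vertex buffers or different memory ranges of the same buffer.

Through the binding description we specify where the data comes from (which binding), how it is laid out (the stride between consecutive elements in the buffer), and how it is read ($\color{red}{\text{per vertex or per instance}}$).

Here is an example with a vec3 position, a vec2 texcoord, and a vec3 color: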

std::vector<VkVertexInputBindingDescription> binding_descriptions = {
{
0,
8 * sizeof( float ),
VK_VERTEX_INPUT_RATE_VERTEX
}
};

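Through the vertex attribute descriptions we define the attributes fetched from a given binding. Each attribute needs a shader location (the same as in layout(location = <index>)), the data format of the attribute, and the memory offset at which the attribute starts. The number of attribute description entries is the total number of attributes used during rendering.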

std::vector<VkVertexInputAttributeDescription> attribute_descriptions = {
{
0,
0,
VK_FORMAT_R32G32B32_SFLOAT,
0
},
{
1,
0,
VK_FORMAT_R32G32_SFLOAT,
3 * sizeof( float )
},
{
2,
0,
VK_FORMAT_R32G32B32_SFLOAT,
5 * sizeof( float )
}
};
vertex_input_state_create_info = {
VK_STRUCTURE_TYPE_PIPELINE_VERTEX_INPUT_STATE_CREATE_INFO,
nullptr,
0,
static_cast<uint32_t>(binding_descriptions.size()),
binding_descriptions.data(),
static_cast<uint32_t>(attribute_descriptions.size()),
attribute_descriptions.data()
};
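
The stride and offsets used above can also be derived from a C++ vertex struct instead of being written by hand. The following is only a sketch: the `Vertex` struct and the helper names are hypothetical (they do not appear in the original code) and assume the tightly packed vec3 position, vec2 texcoord, vec3 color layout of the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical vertex layout matching the example above:
// vec3 position, vec2 texcoord, vec3 color, packed as 8 floats.
struct Vertex {
    float position[3];
    float texcoord[2];
    float color[3];
};

// Stride between consecutive vertices (value for the binding description).
inline std::size_t VertexStride() { return sizeof(Vertex); }

// Byte offsets of each attribute (values for the attribute descriptions).
inline std::size_t PositionOffset() { return offsetof(Vertex, position); }
inline std::size_t TexcoordOffset() { return offsetof(Vertex, texcoord); }
inline std::size_t ColorOffset()    { return offsetof(Vertex, color); }

// Checks that the derived values equal the hand-written ones in the article.
inline void VerifyVertexLayout() {
    assert(VertexStride()   == 8 * sizeof(float));
    assert(PositionOffset() == 0);
    assert(TexcoordOffset() == 3 * sizeof(float));
    assert(ColorOffset()    == 5 * sizeof(float));
}
```

Deriving the numbers from the struct keeps the descriptions in sync if an attribute is later added or reordered.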

Specifying a pipeline input assembly state

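Drawing geometry involves specifying the type of primitives to draw, which is done through the input assembly state.

VkPipelineInputAssemblyStateCreateInfo

The input assembly state defines how vertices are assembled into polygons. The most common topologies are triangle strips and triangle lists.

Notes:

  • List primitives cannot be used with the primitive restart option.
  • Primitives with adjacency can be used only with geometry shaders. The geometryShader feature must be enabled when creating the logical device.
  • When tessellation shaders are used, only patch primitives can be used. The tessellationShader feature must be enabled when creating the logical device.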

VkPipelineInputAssemblyStateCreateInfo:

input_assembly_state_create_info = {
VK_STRUCTURE_TYPE_PIPELINE_INPUT_ASSEMBLY_STATE_CREATE_INFO,
nullptr,
0,
topology,
primitive_restart_enable
};

Specifying a pipeline tessellation state

To use tessellation shaders, we need to:

  • Enable the tessellationShader feature when creating the logical device
  • Write code for the tessellation control and evaluation shaders
  • Create a shader module (or two) for them
  • Prepare the pipeline tessellation state, VkPipelineTessellationStateCreateInfo

VkPipelineTessellationStateCreateInfo

tessellation_state_create_info = {
VK_STRUCTURE_TYPE_PIPELINE_TESSELLATION_STATE_CREATE_INFO,
nullptr,
0,
patch_control_points_count
};

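In the tessellation state we only provide the number of control points (vertices) that form a patch. Patches of at least 32 vertices are supported.

A patch is just a set of points (vertices) used by the tessellation stages to generate points, lines, or polygons such as triangles. As an example, the vertices that form a triangle can be drawn as a patch.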


Specifying a pipeline viewport and scissor test state

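Drawing on screen requires specifying screen parameters. Creating a swapchain is not enough, because we do not always draw to the entire image area. Sometimes we draw to a smaller part of an image, for example for mirror reflections or split-screen multiplayer. The image area to draw into is defined through the pipeline viewport and scissor test states.

Viewport and scissor test parameters are provided separately, but the number of viewports and the number of scissor rectangles must be equal. Define a custom structure: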

struct ViewportInfo {
std::vector<VkViewport> Viewports;
std::vector<VkRect2D> Scissors;
};

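Rendering to multiple viewports requires enabling the multiViewport feature when creating the logical device.

Vertices are transformed from local space to clip space; the hardware then performs the perspective divide, producing normalized device coordinates (NDC). Polygons are then assembled and rasterized, which produces fragments. Each fragment has its own position, defined in framebuffer coordinates. For this position to be computed correctly, a viewport transformation is needed; its parameters are specified in the viewport state.

The viewport and scissor test state is optional, although it is usually provided. It is not needed when rasterization is disabled.

In the viewport state we define the $\color{red}{\text{upper-left corner, width, and height}}$ of the viewport in framebuffer coordinates (pixels on screen). We also define the minimum and maximum viewport depth values (floating-point values $\in$[0,1]). It is valid for the maximum depth to be smaller than the minimum depth.

The scissor test additionally clips generated fragments against a specified rectangle. If no clipping is wanted, specify a rectangle the size of the viewport.

In Vulkan the scissor test is always enabled.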
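
To make the viewport transformation and the scissor test concrete, here is a small CPU-side sketch of both. It is an illustration only: `Viewport`, `Rect2D`, `Vec3`, and the two functions are hypothetical stand-ins (so that the code compiles without the Vulkan headers), and the math follows the viewport transform as described by the Vulkan specification, assuming NDC x and y in [-1, 1] and z in [0, 1].

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins for VkViewport and VkRect2D (hypothetical names).
struct Viewport { float x, y, width, height, minDepth, maxDepth; };
struct Rect2D   { int32_t x, y; uint32_t width, height; };
struct Vec3     { float x, y, z; };

// Viewport transform: normalized device coordinates -> framebuffer coordinates.
inline Vec3 NdcToFramebuffer(const Viewport& vp, const Vec3& ndc) {
    assert(vp.width > 0.0f);
    return {
        vp.x + (ndc.x + 1.0f) * 0.5f * vp.width,
        vp.y + (ndc.y + 1.0f) * 0.5f * vp.height,
        vp.minDepth + ndc.z * (vp.maxDepth - vp.minDepth)
    };
}

// Scissor test: a fragment survives only if its pixel lies inside the rectangle.
inline bool PassesScissor(const Rect2D& s, int32_t px, int32_t py) {
    const int64_t right  = static_cast<int64_t>(s.x) + s.width;
    const int64_t bottom = static_cast<int64_t>(s.y) + s.height;
    return px >= s.x && px < right && py >= s.y && py < bottom;
}
```

With a 512 x 512 viewport at the origin, the NDC center (0, 0) lands on pixel (256, 256), and a scissor rectangle of the same size accepts every fragment the viewport can produce.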

An example:

ViewportInfo viewport_infos = {
{
{
0.0f,
0.0f,
512.0f,
512.0f,
0.0f,
1.0f
},
},
{
{
{
0,
0
},
{
512,
512
}
}
}
};

The variable above can be used to create the viewport and scissor test state defined in this recipe. The implementation is as follows:

uint32_t viewport_count =
static_cast<uint32_t>(viewport_infos.Viewports.size());
uint32_t scissor_count =
static_cast<uint32_t>(viewport_infos.Scissors.size());
viewport_state_create_info = {
VK_STRUCTURE_TYPE_PIPELINE_VIEWPORT_STATE_CREATE_INFO,
nullptr,
0,
viewport_count,
viewport_infos.Viewports.data(),
scissor_count,
viewport_infos.Scissors.data()
};

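Changing viewport or scissor test parameters normally requires recreating the pipeline. However, at pipeline creation we can declare these parameters dynamic; they can then be changed without recreating the pipeline, by setting them during command buffer recording. The number of viewports and scissor tests is still fixed at pipeline creation and cannot be changed later.

Unless the multiViewport feature was enabled when creating the logical device, no more than one viewport and scissor test can be provided.

The index of the viewport transformation used for rasterization can be changed only inside geometry shaders.

Specifying a pipeline rasterization state

The rasterization process generates fragments (pixels) from assembled polygons. The viewport state is used here: fragments are generated in framebuffer coordinates. To decide how fragments are generated, we need to prepare the rasterization state.

The rasterization state controls the parameters of rasterization. First and most importantly, it defines whether rasterization is enabled at all. It specifies which side of a polygon is the front: the one whose vertices appear on screen in clockwise order or the one whose vertices appear in counterclockwise order. It also controls whether front faces, back faces, or both are culled. In OpenGL, counterclockwise faces are front-facing and culling is disabled by default; Vulkan has no default values.

A rasterization state is always required when creating a graphics pipeline.

The rasterization state also controls how polygons are drawn. Usually they are fully rendered (filled), but we can also draw only their edges (lines) or points (vertices). The line and point modes can be used only if the fillModeNonSolid feature was enabled when creating the logical device.

We also need to define how a fragment's depth value is computed. Depth bias can be enabled: a process that offsets the generated depth value by a constant value and an additional slope factor. We also specify the maximum (or minimum) offset that can be added to the depth value when depth bias is enabled.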
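
The depth bias computation can be sketched on the CPU as follows. This is only an illustration of the formula in the Vulkan specification, not driver code: `DepthBiasOffset` is a hypothetical helper, `m` (the maximum depth slope of the polygon) and `r` (the minimum resolvable difference of the depth format) are values the implementation computes itself, and the remaining parameters correspond to `depth_bias_constant_factor`, `depth_bias_slope_factor`, and `depth_bias_clamp` from the rasterization state.

```cpp
#include <algorithm>
#include <cassert>

// Offset added to a fragment's depth when depth bias is enabled.
// m: maximum depth slope of the polygon (computed by the implementation).
// r: minimum resolvable depth difference of the depth format.
inline float DepthBiasOffset(float m, float r,
                             float constantFactor,
                             float slopeFactor,
                             float clamp) {
    assert(r >= 0.0f);
    const float offset = m * slopeFactor + r * constantFactor;
    if (clamp > 0.0f) return std::min(offset, clamp);  // upper limit
    if (clamp < 0.0f) return std::max(offset, clamp);  // lower limit
    return offset;                                     // clamp == 0: no clamping
}
```

A positive clamp caps the offset from above and a negative clamp caps it from below, which is why the text speaks of a maximum (or minimum) offset.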

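Next, we define what happens when a depth value falls outside the range given in the viewport state. When depth clamp is enabled, the value is clamped to that range; otherwise the fragment is discarded. (Depth clamp is typically left disabled.)

Finally, we define the width of drawn lines. It is normally 1; values greater than 1 can be used if the wideLines feature is enabled.

Point size is handled differently: it is not a member of this structure but is written per vertex in the shader.

VkPipelineRasterizationStateCreateInfo.

An example: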

VkPipelineRasterizationStateCreateInfo rasterization_state_create_info = {
VK_STRUCTURE_TYPE_PIPELINE_RASTERIZATION_STATE_CREATE_INFO,
nullptr,
0,
depth_clamp_enable,
rasterizer_discard_enable,
polygon_mode,
culling_mode,
front_face,
depth_bias_enable,
depth_bias_constant_factor,
depth_bias_clamp,
depth_bias_slope_factor,
line_width
};

Specifying a pipeline multisample state

Multisampling is a process that eliminates jagged edges when drawing primitives. In other words, it anti-aliases polygons, lines, and points. It is controlled through the multisample state.

VkPipelineMultisampleStateCreateInfo

multisample_state_create_info = {
VK_STRUCTURE_TYPE_PIPELINE_MULTISAMPLE_STATE_CREATE_INFO,
nullptr,
0,
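sample_count, // number of samples generated per fragment
per_sample_shading_enable, // whether per-sample shading is enabled
min_sample_shading, // minimum fraction of uniquely shaded samples
sample_masks, // coverage mask applied to the fragment
alpha_to_coverage_enable, // whether coverage is generated from the alpha component
alpha_to_one_enable // whether the alpha component is replaced with 1.0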
};

Specifying a pipeline depth and stencil state

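depth test compare operators: never, less, less or equal, equal, greater or equal, greater, not equal, always

stencil test compareOp: never, less, less or equal, equal, greater or equal, greater, not equal, always

The depth and stencil state is not required when rasterization is disabled or when the subpass in which the pipeline is used has no depth/stencil attachment.

We need to specify how depth values are compared and whether fragments that pass the test are written to the depth attachment.

When the depthBounds feature is enabled, an additional depth bounds test can be used. It checks whether the depth value stored in the depth attachment at the fragment's location lies within the minDepthBounds to maxDepthBounds range; if not, the fragment is discarded as if it had failed the depth test.

The stencil test performs an additional test of each fragment against an integer value. It serves many purposes: defining complex shapes that decide which areas are rendered, deciding which areas are lit in deferred shading/lighting, rendering the outline (highlight) of an object selected with the mouse, and rendering the outline of objects hidden behind other objects.

When the stencil test is enabled, we define its parameters separately for front-facing and back-facing polygons. These parameters specify the action taken in three situations: the fragment fails the stencil test; it passes the stencil test but fails the depth test; it passes both the stencil and depth tests. For each situation we choose one operation: keep the current value, reset it to 0, replace it with the reference value, increment or decrement it with clamping (saturation) or wrapping, or invert it bitwise. We also specify the comparison operator used by the test (as for the depth test), the compare and write masks that select which bits of the stencil value take part in the test or are updated in the stencil attachment, and the reference value.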
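
The three situations and the available operations can be modelled with a short CPU-side sketch. Everything here is a hypothetical illustration: the `StencilOp` enum and `StencilOps` struct are local stand-ins mirroring `VkStencilOp` and the `failOp` / `passOp` / `depthFailOp` members of `VkStencilOpState`, an 8-bit stencil value is assumed, and the write mask is left out for brevity.

```cpp
#include <cassert>
#include <cstdint>

// Local mirror of the VkStencilOp operations (no Vulkan headers needed).
enum class StencilOp {
    Keep, Zero, Replace, IncrementAndClamp, DecrementAndClamp,
    Invert, IncrementAndWrap, DecrementAndWrap
};

// Operations chosen for the three possible test outcomes.
struct StencilOps { StencilOp failOp, passOp, depthFailOp; };

// Picks the operation that applies to a fragment given its test results.
inline StencilOp SelectStencilOp(const StencilOps& ops,
                                 bool stencilPassed, bool depthPassed) {
    if (!stencilPassed) return ops.failOp;
    return depthPassed ? ops.passOp : ops.depthFailOp;
}

// Applies an operation to an 8-bit stencil value (write mask ignored).
inline uint8_t ApplyStencilOp(StencilOp op, uint8_t stored, uint8_t reference) {
    switch (op) {
    case StencilOp::Keep:              return stored;
    case StencilOp::Zero:              return 0;
    case StencilOp::Replace:           return reference;
    case StencilOp::IncrementAndClamp: return stored == 0xFF ? stored : static_cast<uint8_t>(stored + 1);
    case StencilOp::DecrementAndClamp: return stored == 0x00 ? stored : static_cast<uint8_t>(stored - 1);
    case StencilOp::Invert:            return static_cast<uint8_t>(~stored);
    case StencilOp::IncrementAndWrap:  return static_cast<uint8_t>(stored + 1);
    case StencilOp::DecrementAndWrap:  return static_cast<uint8_t>(stored - 1);
    }
    return stored;
}
```

For an outline effect, for example, the object would first be drawn with `Replace` as the pass operation, and the enlarged silhouette would then be drawn only where the stored value differs from the reference.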

VkPipelineDepthStencilStateCreateInfo

VkPipelineDepthStencilStateCreateInfo depth_and_stencil_state_create_info =
{
VK_STRUCTURE_TYPE_PIPELINE_DEPTH_STENCIL_STATE_CREATE_INFO,
nullptr,
0,
depth_test_enable,
depth_write_enable,
depth_compare_op,
depth_bounds_test_enable,
stencil_test_enable,
front_stencil_test_parameters,
back_stencil_test_parameters,
min_depth_bounds,
max_depth_bounds
};

Specifying a pipeline blend state

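To simulate transparent objects, the hardware blends the color of the fragment being processed with the color already stored in the framebuffer. This operation is set up through the blend state of the graphics pipeline.

VkPipelineColorBlendAttachmentState

VkPipelineColorBlendStateCreateInfo

The blend state is optional; it is not required when rasterization is disabled or when the subpass in which the graphics pipeline is used has no color attachments.

The blend state mainly defines the parameters of the blending operation, but it serves other purposes too. It specifies a color mask that selects which color components are updated (written to) during rendering. It also controls the logical operation state: when enabled, the specified logical operation is performed between the color of the current fragment and the color already written to the framebuffer.

Logical operations are performed only for attachments with integer and normalized integer formats.

Supported logical operations: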

  • CLEAR: Setting the color to zero
  • AND: Bitwise AND operation between the source (fragment’s) color and a
    destination color (already stored in an attachment)
  • AND_REVERSE: Bitwise AND operation between source and inverted destination
    colors
  • COPY: Copying the source (fragment’s) color without any modifications
  • AND_INVERTED: Bitwise AND operation between destination and inverted source
    colors
  • NO_OP: Leaving the already stored color intact
  • XOR: Bitwise excluded OR between source and destination colors
  • OR: Bitwise OR operation between the source and destination colors
  • NOR: Inverted bitwise OR
  • EQUIVALENT: Inverted XOR
  • INVERT: Inverted destination color
  • OR_REVERSE: Bitwise OR between the source color and inverted destination color
  • COPY_INVERTED: Copying bitwise inverted source color
  • OR_INVERTED: Bitwise OR operation between destination and inverted source
    color
  • NAND: Inverted bitwise AND operation
  • SET: Setting all color bits to ones

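Blending is controlled separately for each color attachment of the subpass in which the graphics pipeline is used, so blending parameters must be specified for every color attachment. If the independentBlend feature is not enabled, however, the blending parameters of all attachments must be identical.

For blending, we specify source and destination factors separately for the color components and for the alpha component. Supported blend factors include:

  • ZERO: 0
  • ONE: 1
  • SRC_COLOR: <source component>
  • ONE_MINUS_SRC_COLOR: 1 - <source component>
  • DST_COLOR: <destination component>
  • ONE_MINUS_DST_COLOR: 1 - <destination component>
  • SRC_ALPHA: <source alpha>
  • ONE_MINUS_SRC_ALPHA: 1 - <source alpha>
  • DST_ALPHA: <destination alpha>
  • ONE_MINUS_DST_ALPHA: 1 - <destination alpha>
  • CONSTANT_COLOR: <constant color component>
  • ONE_MINUS_CONSTANT_COLOR: 1 - <constant color component>
  • CONSTANT_ALPHA: <alpha of the constant color>
  • ONE_MINUS_CONSTANT_ALPHA: 1 - <alpha of the constant color>
  • SRC_ALPHA_SATURATE: min( <source alpha>, 1 - <destination alpha> )
  • SRC1_COLOR: <component of a source’s second color> (used in dual
    source blending)
  • ONE_MINUS_SRC1_COLOR: 1 - <component of a source’s second color>
    (from dual source blending)
  • SRC1_ALPHA: <alpha component of a source’s second color> (in dual
    source blending)
  • ONE_MINUS_SRC1_ALPHA: 1 - <alpha component of a source’s second
    color> (from dual source blending)

Some blending factors use a constant color instead of the fragment's (source) color or the color stored in the attachment (destination). This constant color can be specified statically at pipeline creation or set dynamically during command buffer recording with vkCmdSetBlendConstants().

Factors that use the source's second color (SRC1) are valid only when the dualSrcBlend feature is enabled.

The blending function that controls how blending is performed can also be specified separately for the color and alpha components. Blending operators include:

  • ADD: <source component> * <source factor> + <destination component> * <destination factor>
  • SUBTRACT: <source component> * <source factor> - <destination component> * <destination factor>
  • REVERSE_SUBTRACT: <destination component> * <destination factor> - <source component> * <source factor>
  • MIN: min( <source component>, <destination component> )
  • MAX: max( <source component>, <destination component> )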

Enabling a logical operation disables blending.
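
As a worked example of how factors and operators combine, the sketch below evaluates classic alpha blending on the CPU. It is an illustration under stated assumptions, not part of the original recipe: the `Color` alias and `AlphaBlend` function are hypothetical, and the chosen configuration is SRC_ALPHA / ONE_MINUS_SRC_ALPHA with ADD for the color components and ONE / ZERO with ADD for the alpha component.

```cpp
#include <array>
#include <cassert>

using Color = std::array<float, 4>;  // r, g, b, a

// Alpha blending evaluated by hand:
//   color: src * SRC_ALPHA + dst * ONE_MINUS_SRC_ALPHA   (op ADD)
//   alpha: src * ONE       + dst * ZERO                  (op ADD)
inline Color AlphaBlend(const Color& src, const Color& dst) {
    const float a = src[3];
    assert(a >= 0.0f && a <= 1.0f);
    return {
        src[0] * a + dst[0] * (1.0f - a),
        src[1] * a + dst[1] * (1.0f - a),
        src[2] * a + dst[2] * (1.0f - a),
        src[3] * 1.0f + dst[3] * 0.0f
    };
}
```

A half-transparent red fragment drawn over an opaque blue pixel yields an even mix of the two colors, with the fragment's alpha written to the attachment.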

Below is an example of a blend state with both the logical operation and blending disabled:

 std::vector<VkPipelineColorBlendAttachmentState> attachment_blend_states =
{
{
false,
VK_BLEND_FACTOR_ONE,
VK_BLEND_FACTOR_ONE,
VK_BLEND_OP_ADD,
VK_BLEND_FACTOR_ONE,
VK_BLEND_FACTOR_ONE,
VK_BLEND_OP_ADD,
VK_COLOR_COMPONENT_R_BIT |
VK_COLOR_COMPONENT_G_BIT |
VK_COLOR_COMPONENT_B_BIT |
VK_COLOR_COMPONENT_A_BIT
}
};
VkPipelineColorBlendStateCreateInfo blend_state_create_info;
SpecifyPipelineBlendState( false, VK_LOGIC_OP_COPY,
attachment_blend_states, { 1.0f, 1.0f, 1.0f, 1.0f },
blend_state_create_info );

The recipe fills VkPipelineColorBlendStateCreateInfo as follows:

blend_state_create_info = {
VK_STRUCTURE_TYPE_PIPELINE_COLOR_BLEND_STATE_CREATE_INFO,
nullptr,
0,
logic_op_enable,
logic_op,
static_cast<uint32_t>(attachment_blend_states.size()),
attachment_blend_states.data(),
{
blend_constants[0],
blend_constants[1],
blend_constants[2],
blend_constants[3]
}
};

Specifying pipeline dynamic states

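Creating a graphics pipeline requires providing many parameters that cannot be modified afterwards. This improves performance and gives the driver a stable, predictable environment. Unfortunately, it is also inconvenient for developers, who may need to create many pipeline objects that differ only in small details.

To avoid this problem, dynamic states were introduced. They let us control some pipeline parameters dynamically by recording specific functions in command buffers. To do this, we must specify which parts of the pipeline are dynamic, which is done through the pipeline dynamic states.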

VkDynamicState

  • VK_DYNAMIC_STATE_VIEWPORT
  • VK_DYNAMIC_STATE_SCISSOR
  • VK_DYNAMIC_STATE_LINE_WIDTH
  • VK_DYNAMIC_STATE_DEPTH_BIAS
  • VK_DYNAMIC_STATE_BLEND_CONSTANTS
  • VK_DYNAMIC_STATE_DEPTH_BOUNDS
  • VK_DYNAMIC_STATE_STENCIL_COMPARE_MASK
  • VK_DYNAMIC_STATE_STENCIL_WRITE_MASK
  • VK_DYNAMIC_STATE_STENCIL_REFERENCE

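Dynamic pipeline states were introduced to allow some of a pipeline object's state to be set during command buffer recording. Not many parts of the pipeline can be set this way, but the selection is a compromise between performance, driver simplicity, the capabilities of modern hardware, and the ease of use of the API.

Dynamic state is optional.

The following parts can be set dynamically: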

  • Viewport
  • Scissor
  • Line width
  • Depth bias
  • Stencil compare mask
  • Stencil write mask
  • Stencil reference value
  • Blend constants

Specify which states are set dynamically with an array of VkDynamicState values, then record it in a VkPipelineDynamicStateCreateInfo structure:

VkPipelineDynamicStateCreateInfo dynamic_state_creat_info = {
VK_STRUCTURE_TYPE_PIPELINE_DYNAMIC_STATE_CREATE_INFO,
nullptr,
0,
static_cast<uint32_t>(dynamic_states.size()),
dynamic_states.data()
};

pipeline

Creating a pipeline layout

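Pipeline layouts are similar to descriptor set layouts. Descriptor set layouts define what types of resources form a descriptor set; pipeline layouts define what types of resources a pipeline can access. They are created from descriptor set layouts and push constant ranges.

Pipeline layouts are needed at pipeline creation because they specify the interface between shader stages and shader resources through a set, binding, and array element address. Shaders use the same address (through a layout qualifier) to access a given resource. Even if a pipeline uses no descriptor resources, we still need to create a pipeline layout to inform the driver that no such interface is needed.

A pipeline layout defines the set of resources that a pipeline's shaders can access. When recording command buffers, we bind descriptor sets to selected indices (see Binding descriptor sets). The index of a descriptor set corresponds to the index of its descriptor set layout in the array used to create the pipeline layout. The same index is specified in shaders through the layout(set = <index>, binding = <number>) qualifier to access a given resource.

Usually, multiple pipelines access different resources. During command buffer recording we bind a pipeline and descriptor sets; only then can we issue draw calls. When we switch pipelines, we need to bind new descriptor sets according to the new pipeline's needs, but frequently binding different descriptor sets hurts application performance. That is why it is good to create pipelines with similar (or compatible) layouts and to bind descriptor sets that rarely change (those common to many pipelines) at indices close to 0, the start of the layout. This way, when switching pipelines, the descriptor sets near the start of the pipeline layout (from index 0 to some index N) can remain bound and need no update; only the differing descriptor sets at higher indices (after index N) must be rebound. Note, however, that for layouts to be similar (or compatible) they must use the same push constant ranges.

We should bind descriptor sets common to many pipelines near the start of the pipeline layout (near the $0^{th}$ index).

Pipeline layouts also define the ranges of push constants, which provide a small set of constant values to shaders. Push constants are faster than updating descriptor sets, but the available memory is much smaller: the guaranteed minimum is only 128 bytes (per pipeline layout).

For example, we can give each stage of a graphics pipeline its own range. With five stages, that is about 128 / 5 ≈ 25 bytes per stage (24 bytes once rounded down to a multiple of 4). Multiple shader stages can also share the same range. Each shader stage, however, can access only one push constant range.

Usually we do not need a separate range for every stage, so the example above is a worst case. In general there is enough memory for several vec4 values or one or two matrices.

Note that the size and offset of a push constant range must be multiples of 4.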
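
The constraints on push constant ranges can be checked before creating the layout. The following is a sketch with hypothetical names: `PushConstantRange` stands in for the `offset` and `size` members of `VkPushConstantRange`, and the default limit of 128 bytes is the guaranteed minimum mentioned above (a real application would query `maxPushConstantsSize` from the physical device limits).

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the offset/size members of VkPushConstantRange.
struct PushConstantRange { uint32_t offset; uint32_t size; };

// Returns true if the range satisfies the rules described above:
// non-empty, offset and size multiples of 4, and fully inside the limit.
inline bool IsValidPushConstantRange(const PushConstantRange& range,
                                     uint32_t maxPushConstantsSize = 128) {
    assert(maxPushConstantsSize >= 128);  // guaranteed minimum
    if (range.size == 0) return false;
    if (range.offset % 4 != 0 || range.size % 4 != 0) return false;
    if (range.offset >= maxPushConstantsSize) return false;
    return range.size <= maxPushConstantsSize - range.offset;
}
```

For instance, a 4x4 float matrix (64 bytes) at offset 0 followed by a second one at offset 64 exactly fills the guaranteed 128 bytes.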

VkPipelineLayoutCreateInfo pipeline_layout_create_info = {
VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO,
nullptr,
0,
static_cast<uint32_t>(descriptor_set_layouts.size()),
descriptor_set_layouts.data(),
static_cast<uint32_t>(push_constant_ranges.size()),
push_constant_ranges.data()
};
VkResult result = vkCreatePipelineLayout( logical_device,
&pipeline_layout_create_info, nullptr, &pipeline_layout );
if( VK_SUCCESS != result ) {
std::cout << "Could not create pipeline layout." << std::endl;
return false;
}
return true;

Specifying graphics pipeline creation parameters

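Creating a graphics pipeline requires filling in VkGraphicsPipelineCreateInfo, which provides many parameters controlling different aspects of the pipeline.

Several VkGraphicsPipelineCreateInfo structures can be provided at pipeline creation; each describes the properties of a single pipeline to be created.

After a graphics pipeline is created, it can be bound to a command buffer before recording a draw call. A graphics pipeline can be used for drawing only inside a render pass. At pipeline creation we specify in which render pass the pipeline will be used; the same pipeline can also be used with other render passes as long as they are compatible.

Pipelines rarely share no state at all. To speed up creation, $\color{red}{\text{one pipeline can be designated as the parent of other pipelines (allow derivatives)}}$ through the basePipelineHandle or basePipelineIndex member of VkGraphicsPipelineCreateInfo.

basePipelineHandle lets us specify the handle of an already existing pipeline to serve as the parent.

basePipelineIndex is used when several pipelines are created at once: it is an index into the array of VkGraphicsPipelineCreateInfo elements passed to vkCreateGraphicsPipelines(). The index points to a parent pipeline that is created together with the child in the same function call; because they are created together, no handle is available yet. The parent's index must be smaller than the child's index, that is, the parent must come first in the array.

basePipelineHandle and basePipelineIndex cannot be used at the same time.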
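
The two rules above (parent before child, and never both a handle and an index) are easy to check on the CPU before calling vkCreateGraphicsPipelines(). This is only a sketch with hypothetical names: `BasePipelineRef` models just the two derivative-related fields of each create info, with -1 meaning that no index is used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Models the derivative-related fields of one VkGraphicsPipelineCreateInfo.
struct BasePipelineRef {
    bool    hasBaseHandle;  // true if basePipelineHandle is not VK_NULL_HANDLE
    int32_t baseIndex;      // basePipelineIndex, -1 when unused
};

// Checks every element of a create-info array against the rules above.
inline bool AreBasePipelineRefsValid(const std::vector<BasePipelineRef>& infos) {
    for (std::size_t i = 0; i < infos.size(); ++i) {
        const BasePipelineRef& info = infos[i];
        if (info.baseIndex < -1) return false;
        // A handle and an index must not be used together.
        if (info.hasBaseHandle && info.baseIndex != -1) return false;
        // The parent must appear earlier in the same array.
        if (info.baseIndex != -1 &&
            static_cast<std::size_t>(info.baseIndex) >= i) return false;
    }
    return true;
}
```

In a real application the parent would additionally have to be created with the flag that allows derivatives; that check is omitted here.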

Here is an example:

VkGraphicsPipelineCreateInfo graphics_pipeline_create_info = {
VK_STRUCTURE_TYPE_GRAPHICS_PIPELINE_CREATE_INFO,
nullptr,
additional_options,
static_cast<uint32_t>(shader_stage_create_infos.size()),
shader_stage_create_infos.data(),
&vertex_input_state_create_info,
&input_assembly_state_create_info,
&tessellation_state_create_info,
&viewport_state_create_info,
&rasterization_state_create_info,
&multisample_state_create_info,
&depth_and_stencil_state_create_info,
&blend_state_create_info,
&dynamic_state_creat_info,
pipeline_layout,
render_pass,
subpass,
base_pipeline_handle,
base_pipeline_index
};

$\color {red}{Creating\ a\ pipeline\ cache\ object}$

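A pipeline object is more than a wrapper around parameters. Creating it involves preparing all programmable and fixed-function pipeline stages, setting up the interface between shaders and descriptor resources, compiling and linking shader programs, and error checking (for example, checking that shaders are linked correctly). The results can be stored in a cache, which can be reused to speed up the creation of pipeline objects with similar properties.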

VkPipelineCacheCreateInfo

VkPipelineCache

vkCreatePipelineCache

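A pipeline cache stores the results of the pipeline preparation process. It is optional and can be omitted, but it can significantly speed up the creation of pipeline objects.

To use a cache during pipeline creation, first create a pipeline cache object and pass it to the pipeline creation function. The driver caches the results automatically. If the cache already contains data, the driver automatically tries to use it when creating pipelines.

The most common scenario for a pipeline cache object is to store its contents in a file and reuse them across separate runs of the same application. On the first run, we create an empty cache and all the pipelines we need, then retrieve the cache data and store it in a file. On the next run we create the cache again, but this time initialize it with the data read from the file. If only a few pipelines are created, this may not be worth the effort, but modern 3D applications need a large number of pipelines, and this technique can greatly shorten initialization.

Assume the cache data is stored in a cache_data array, which may be empty or initialized with previously retrieved data. The pipeline cache is then created as follows: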

VkPipelineCacheCreateInfo pipeline_cache_create_info = {
VK_STRUCTURE_TYPE_PIPELINE_CACHE_CREATE_INFO,
nullptr,
0,
static_cast<uint32_t>(cache_data.size()),
cache_data.data()
};
VkResult result = vkCreatePipelineCache( logical_device,
&pipeline_cache_create_info, nullptr, &pipeline_cache );
if( VK_SUCCESS != result ) {
std::cout << "Could not create pipeline cache." << std::endl;
return false;
}
return true;

Retrieving data from a pipeline cache

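To reuse a pipeline cache across runs, we need to store its contents and load them again later. To do so, we retrieve the data held in the cache.

vkGetPipelineCacheData

Retrieving the contents of a pipeline cache uses the double call of a single function that is typical in Vulkan: the first call queries the size, the second fetches the data.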

size_t data_size = 0;
VkResult result = VK_SUCCESS;
result = vkGetPipelineCacheData( logical_device, pipeline_cache,
&data_size, nullptr );
if( (VK_SUCCESS != result) ||
(0 == data_size) ) {
std::cout << "Could not get the size of the pipeline cache." <<
std::endl;
return false;
}
pipeline_cache_data.resize( data_size );
result = vkGetPipelineCacheData( logical_device, pipeline_cache,
&data_size, pipeline_cache_data.data());
if( (VK_SUCCESS != result) ||
(0 == data_size) ) {
std::cout << "Could not acquire pipeline cache data." << std::endl;
return false;
}
return true;
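
Once the data has been retrieved, it can be written to a file and read back on the next run to initialize the cache. The helpers below are a minimal sketch using only the C++ standard library; the function names and the file name used in any call are hypothetical, and the byte vector is assumed to be a std::vector<unsigned char> like pipeline_cache_data and cache_data in this article.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Writes the retrieved pipeline cache bytes to a binary file.
inline bool SavePipelineCacheData(const std::string& path,
                                  const std::vector<unsigned char>& data) {
    std::ofstream file(path, std::ios::binary | std::ios::trunc);
    if (!file) return false;
    file.write(reinterpret_cast<const char*>(data.data()),
               static_cast<std::streamsize>(data.size()));
    return file.good();
}

// Reads the bytes back; returns an empty vector if the file does not exist,
// which corresponds to starting with an empty cache on the first run.
inline std::vector<unsigned char> LoadPipelineCacheData(const std::string& path) {
    std::ifstream file(path, std::ios::binary);
    if (!file) return {};
    return std::vector<unsigned char>(std::istreambuf_iterator<char>(file),
                                      std::istreambuf_iterator<char>());
}
```

The loaded vector can be passed as cache_data when creating the pipeline cache; an empty vector simply produces an empty cache.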

Merging multiple pipeline cache objects

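Because a large number of pipelines must be created, creation can be split across multiple threads to shorten the total time. Each thread uses its own pipeline cache. When all threads have finished, the caches need to be merged into a single cache object so that it can be reused.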

vkMergePipelineCaches

VkResult result = vkMergePipelineCaches( logical_device,
target_pipeline_cache,
static_cast<uint32_t>(source_pipeline_caches.size()),
source_pipeline_caches.data() );
if( VK_SUCCESS != result ) {
std::cout << "Could not merge pipeline cache objects." << std::endl;
return false;
}
return true;

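Note that the target cache object must not appear in the vector of source caches.

Creating a graphics pipeline

The graphics pipeline controls all drawing-related operations. Through it we specify the shader programs used during drawing, the parameters of the various tests (depth, stencil), and how the final color is computed and written to any of the subpass attachments. It is one of the most important objects. Pipelines can be created one at a time or several at once.

vkCreateGraphicsPipelines

In the figure below, white blocks are programmable stages and gray blocks are fixed-function parts of the pipeline.

Some stages are optional. If rasterization is disabled, the fragment stage is not needed. If the tessellation stage is enabled, both tessellation control and evaluation shaders must be provided.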

VkGraphicsPipelineCreateInfo

VkPipeline

The array of create infos and the array of resulting pipelines have the same size:

graphics_pipelines.resize( graphics_pipeline_create_infos.size() );
VkResult result = vkCreateGraphicsPipelines( logical_device,
pipeline_cache,
static_cast<uint32_t>(graphics_pipeline_create_infos.size()),
graphics_pipeline_create_infos.data(), nullptr, graphics_pipelines.data()
);
if( VK_SUCCESS != result ) {
std::cout << "Could not create a graphics pipeline." << std::endl;
return false;
}
return true;

Creating a compute pipeline

VkPipelineShaderStageCreateInfo

VkComputePipelineCreateInfo

VkPipeline

vkCreateComputePipelines

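A compute pipeline has only one stage, the compute shader stage (though the hardware may implement additional stages).

Compute shaders have only some built-in variables and no user-defined inputs or outputs. Data can be read and written only through uniform variables (buffers or images). This makes compute shaders more general purpose; they can, for example, perform mathematical operations on images.

As with graphics pipelines, derivatives (parent pipelines) are supported.

Here is a simple example: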

VkComputePipelineCreateInfo compute_pipeline_create_info = {
VK_STRUCTURE_TYPE_COMPUTE_PIPELINE_CREATE_INFO,
nullptr,
additional_options,
compute_shader_stage,
pipeline_layout,
base_pipeline_handle,
-1
};
VkResult result = vkCreateComputePipelines( logical_device, pipeline_cache,
1, &compute_pipeline_create_info, nullptr, &compute_pipeline );
if( VK_SUCCESS != result ) {
std::cout << "Could not create compute pipeline." << std::endl;
return false;
}
return true;

Binding a pipeline object

Before issuing a draw call or dispatching computational work, all required state must be set. One such state is the pipeline object, graphics or compute, bound to the command buffer.

VkCommandBuffer

vkCmdBindPipeline

vkCmdBindPipeline( command_buffer, pipeline_type, pipeline );

example

Creating a pipeline layout with a combined image sampler, a buffer,and push constant ranges

The fragment shader accesses one image and the vertex shader accesses one uniform buffer:

std::vector<VkDescriptorSetLayoutBinding> descriptor_set_layout_bindings =
{
{
0,
VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE,
1,
VK_SHADER_STAGE_FRAGMENT_BIT,
nullptr
},
{
1,
VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER,
1,
VK_SHADER_STAGE_VERTEX_BIT,
nullptr
}
};
if( !CreateDescriptorSetLayout( logical_device,
descriptor_set_layout_bindings, descriptor_set_layout ) ) {
return false;
}

ranges of push constants

if( !CreatePipelineLayout( logical_device, { descriptor_set_layout },
push_constant_ranges, pipeline_layout ) ) {
return false;
}
return true;

Creating a graphics pipeline with vertex and fragment shaders, depth test enabled, and with dynamic viewport and scissor tests

desctroy

This section walks through a typical graphics pipeline creation: vertex and fragment shaders, depth test enabled, and viewport and scissor test specified dynamically.

Prepare the vertex and fragment shader stages:

std::vector<unsigned char> vertex_shader_spirv;
if( !GetBinaryFileContents( vertex_shader_filename, vertex_shader_spirv ) )
{
return false;
}
VkDestroyer<VkShaderModule> vertex_shader_module( logical_device );
if( !CreateShaderModule( logical_device, vertex_shader_spirv,
*vertex_shader_module ) ) {
return false;
}
std::vector<unsigned char> fragment_shader_spirv;
if( !GetBinaryFileContents( fragment_shader_filename, fragment_shader_spirv
) ) {
return false;
}
VkDestroyer<VkShaderModule> fragment_shader_module( logical_device );
if( !CreateShaderModule( logical_device, fragment_shader_spirv,
*fragment_shader_module ) ) {
return false;
}
std::vector<ShaderStageParameters> shader_stage_params = {
{
VK_SHADER_STAGE_VERTEX_BIT,
*vertex_shader_module,
"main",
nullptr
},
{
VK_SHADER_STAGE_FRAGMENT_BIT,
*fragment_shader_module,
"main",
nullptr
}
};
std::vector<VkPipelineShaderStageCreateInfo> shader_stage_create_infos;
SpecifyPipelineShaderStages( shader_stage_params, shader_stage_create_infos
);

Then specify the vertex bindings and vertex attributes, followed by the input assembly state:

VkPipelineVertexInputStateCreateInfo vertex_input_state_create_info;
SpecifyPipelineVertexInputState( vertex_input_binding_descriptions,
vertex_attribute_descriptions, vertex_input_state_create_info );
VkPipelineInputAssemblyStateCreateInfo input_assembly_state_create_info;
SpecifyPipelineInputAssemblyState( primitive_topology,
primitive_restart_enable, input_assembly_state_create_info );

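Viewport and scissor test parameters are important, but because they are set dynamically here, only the number of viewports matters at pipeline creation.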

ViewportInfo viewport_infos = {
  {   // viewports: x, y, width, height, minDepth, maxDepth
    { 0.0f, 0.0f, 500.0f, 500.0f, 0.0f, 1.0f }
  },
  {   // scissor rectangles: offset, extent
    { { 0, 0 }, { 500, 500 } }
  }
};
VkPipelineViewportStateCreateInfo viewport_state_create_info;
SpecifyPipelineViewportAndScissorTestState( viewport_infos,
  viewport_state_create_info );
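Because the viewport and scissor are dynamic, the real values have to be supplied while recording the command buffer, usually from the current render target size. The sketch below only computes those values. Viewport and Rect2D are stand-ins that mirror the fields of VkViewport and (flattened) VkRect2D so the example compiles without the Vulkan headers, and the function names are hypothetical.

```cpp
#include <cstdint>

// Stand-ins mirroring VkViewport and a flattened VkRect2D (offset + extent).
struct Viewport { float x, y, width, height, minDepth, maxDepth; };
struct Rect2D   { int32_t x, y; uint32_t width, height; };

// Viewport covering the whole render target with the usual [0, 1] depth range.
Viewport MakeFullViewport( uint32_t width, uint32_t height ) {
  return { 0.0f, 0.0f, static_cast<float>( width ), static_cast<float>( height ), 0.0f, 1.0f };
}

// Scissor rectangle covering the whole render target.
Rect2D MakeFullScissor( uint32_t width, uint32_t height ) {
  return { 0, 0, width, height };
}
```

In real code the equivalent VkViewport and VkRect2D values would be passed to vkCmdSetViewport and vkCmdSetScissor during command buffer recording.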

Then prepare the parameters for the rasterization and multisample states.

VkPipelineRasterizationStateCreateInfo rasterization_state_create_info;
SpecifyPipelineRasterizationState( false, false, polygon_mode,
  culling_mode, front_face, false, 0.0f, 1.0f, 0.0f, 1.0f,
  rasterization_state_create_info );
VkPipelineMultisampleStateCreateInfo multisample_state_create_info;
SpecifyPipelineMultisampleState( VK_SAMPLE_COUNT_1_BIT, false, 0.0f,
  nullptr, false, false, multisample_state_create_info );

Depth test. We generally want to keep the fragments closest to the camera, so VK_COMPARE_OP_LESS_OR_EQUAL is used as the comparison operator. The stencil test is assumed to be disabled here.

// The stencil test is disabled, so these values are placeholders.
VkStencilOpState stencil_test_parameters = {
  VK_STENCIL_OP_KEEP,     // failOp
  VK_STENCIL_OP_KEEP,     // passOp
  VK_STENCIL_OP_KEEP,     // depthFailOp
  VK_COMPARE_OP_ALWAYS,   // compareOp
  0,                      // compareMask
  0,                      // writeMask
  0                       // reference
};
VkPipelineDepthStencilStateCreateInfo depth_and_stencil_state_create_info;
SpecifyPipelineDepthAndStencilState( true, true,
  VK_COMPARE_OP_LESS_OR_EQUAL, false, 0.0f, 1.0f, false,
  stencil_test_parameters, stencil_test_parameters,
  depth_and_stencil_state_create_info );
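To make the choice of comparison operator concrete, here is a tiny model of the depth comparison. It assumes the conventional setup where smaller depth values are closer to the camera. CompareOp and DepthTestPasses are illustrative stand-ins, not Vulkan types or functions.

```cpp
// Stand-in for the two VkCompareOp values discussed here.
enum class CompareOp { Less, LessOrEqual };

// Returns true if a fragment with depth `incoming` passes the test against
// the value `stored` in the depth attachment.
bool DepthTestPasses( CompareOp op, float incoming, float stored ) {
  return op == CompareOp::Less ? incoming < stored : incoming <= stored;
}
```

The only difference between the two operators is the equal-depth case: with LessOrEqual, geometry drawn a second time at exactly the same depth (for example in a later pass) still passes, while with Less it is rejected.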

blending parameters

VkPipelineColorBlendStateCreateInfo blend_state_create_info;
SpecifyPipelineBlendState( logic_op_enable, logic_op,
  attachment_blend_states, blend_constants, blend_state_create_info );

list of dynamic states

std::vector<VkDynamicState> dynamic_states = {
  VK_DYNAMIC_STATE_VIEWPORT,
  VK_DYNAMIC_STATE_SCISSOR
};
VkPipelineDynamicStateCreateInfo dynamic_state_create_info;
SpecifyPipelineDynamicStates( dynamic_states, dynamic_state_create_info );

Create the pipeline

VkGraphicsPipelineCreateInfo graphics_pipeline_create_info;
// nullptr: no tessellation state is used.
SpecifyGraphicsPipelineCreationParameters( additional_options,
  shader_stage_create_infos, vertex_input_state_create_info,
  input_assembly_state_create_info, nullptr, &viewport_state_create_info,
  rasterization_state_create_info, &multisample_state_create_info,
  &depth_and_stencil_state_create_info, &blend_state_create_info,
  &dynamic_state_create_info, pipeline_layout, render_pass,
  subpass, base_pipeline_handle, -1, graphics_pipeline_create_info );
if( !CreateGraphicsPipelines( logical_device, { graphics_pipeline_create_info },
    pipeline_cache, graphics_pipeline ) ) {
  return false;
}
return true;

multiple thread

Creating multiple graphics pipelines on multiple threads

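Creating a graphics pipeline can take a long time. Shaders are compiled and linked during pipeline creation, and the driver checks whether the state specified for them is valid. When many pipelines have to be created, it is therefore best to spread the work across multiple threads.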

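With that many pipelines, a cache should also be used to speed up creation. This section shows how to use caches when creating pipelines concurrently on multiple threads, and how to merge the caches afterwards.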

This section uses the VkDestroyer<> template to destroy unused resources automatically.
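The idea behind such a wrapper is plain RAII: the object owns a handle and releases it when it goes out of scope. The sketch below illustrates that idea only. ScopedHandle is a hypothetical class and not the actual VkDestroyer<> implementation, whose interface may differ.

```cpp
#include <functional>
#include <utility>

// Hypothetical RAII wrapper: owns a handle and calls the supplied deleter
// exactly once when the owning object is destroyed.
template <typename Handle>
class ScopedHandle {
public:
  ScopedHandle( Handle handle, std::function<void( Handle )> deleter )
    : handle_( handle ), deleter_( std::move( deleter ) ), owns_( true ) {}

  // Moving transfers ownership; the source no longer destroys the handle.
  ScopedHandle( ScopedHandle && other )
    : handle_( other.handle_ ), deleter_( std::move( other.deleter_ ) ), owns_( other.owns_ ) {
    other.owns_ = false;
  }

  ScopedHandle( ScopedHandle const & ) = delete;
  ScopedHandle & operator=( ScopedHandle const & ) = delete;

  ~ScopedHandle() {
    if( owns_ && deleter_ ) {
      deleter_( handle_ );
    }
  }

  Handle Get() const { return handle_; }

private:
  Handle handle_;
  std::function<void( Handle )> deleter_;
  bool owns_;
};
```

With Vulkan, the deleter would typically be a lambda that captures the logical device and calls the matching destroy function, such as vkDestroyPipelineCache.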

Steps

  • Store the cache file name in std::string pipeline_cache_filename.
  • Load the cache contents from the file into std::vector<unsigned char> cache_data.
  • Create std::vector<VkPipelineCache> pipeline_caches. Create a pipeline cache object for each thread and store its handle in pipeline_caches.
  • Create std::vector<std::thread> threads and resize it to the number of threads.
  • Create std::vector<std::vector<VkGraphicsPipelineCreateInfo>> graphics_pipelines_create_infos. For each thread, add a vector holding one VkGraphicsPipelineCreateInfo per pipeline that thread will create.
  • Create std::vector<std::vector<VkPipeline>> graphics_pipelines. Resize each inner vector to the number of pipelines its thread creates.
  • Start the desired number of threads. Each thread creates its pipelines using logical_device, the cache assigned to it (pipeline_caches[i]), and the VkGraphicsPipelineCreateInfo vector assigned to it (graphics_pipelines_create_infos[i]).
  • Wait for all threads to finish.
  • Create a VkPipelineCache target_cache.
  • Merge the caches in pipeline_caches into target_cache.
  • Retrieve the contents of target_cache and store them in cache_data.
  • Save cache_data to the file pipeline_cache_filename.
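The steps above assume the create infos are already grouped per thread. One way to build that grouping from a flat list is sketched below. SplitIntoBatches is a hypothetical helper, and the element type T stands in for VkGraphicsPipelineCreateInfo.

```cpp
#include <cstddef>
#include <vector>

// Distributes `items` over `thread_count` batches as evenly as possible,
// keeping the original order. Returns an empty result for zero threads.
template <typename T>
std::vector<std::vector<T>> SplitIntoBatches( std::vector<T> const & items, std::size_t thread_count ) {
  std::vector<std::vector<T>> batches( thread_count );
  if( thread_count == 0 ) {
    return batches;
  }
  std::size_t const base  = items.size() / thread_count;
  std::size_t const extra = items.size() % thread_count;  // first `extra` batches get one more
  std::size_t next = 0;
  for( std::size_t i = 0; i < thread_count; ++i ) {
    std::size_t const count = base + ( i < extra ? 1 : 0 );
    batches[i].assign( items.begin() + next, items.begin() + next + count );
    next += count;
  }
  return batches;
}
```

A natural choice for thread_count is std::thread::hardware_concurrency(), keeping in mind that it may return 0 when the value cannot be determined.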

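Creating multiple graphics pipelines means supplying many parameters for many different pipelines.

A pipeline cache speeds this up considerably. First read the previously stored cache from a file, if one exists. Then create a cache object for each thread, initializing each one with the contents loaded from the file.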

// A missing cache file is not an error; cache_data simply stays empty.
std::vector<unsigned char> cache_data;
GetBinaryFileContents( pipeline_cache_filename, cache_data );

// One cache object per thread, each initialized from the loaded data.
std::vector<VkDestroyer<VkPipelineCache>> pipeline_caches(
  graphics_pipelines_create_infos.size() );
for( size_t i = 0; i < graphics_pipelines_create_infos.size(); ++i ) {
  pipeline_caches[i] = VkDestroyer< VkPipelineCache >( logical_device );
  if( !CreatePipelineCacheObject( logical_device, cache_data,
      *pipeline_caches[i] ) ) {
    return false;
  }
}
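Cache data loaded from disk may come from a different GPU or driver version. Pipeline cache data starts with a 32-byte header: header length, header version, vendor ID, device ID, and a 16-byte cache UUID, with the integers stored least significant byte first. The sketch below parses that header so an application can decide whether to reuse the file. CacheHeader, ParseCacheHeader and CacheMatchesDevice are hypothetical names.

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#include <vector>

struct CacheHeader {
  uint32_t header_size;
  uint32_t header_version;
  uint32_t vendor_id;
  uint32_t device_id;
  std::array<uint8_t, 16> uuid;
};

// Reads a 32-bit little-endian value regardless of host byte order.
static uint32_t ReadU32LE( unsigned char const * p ) {
  return uint32_t( p[0] ) | ( uint32_t( p[1] ) << 8 ) |
         ( uint32_t( p[2] ) << 16 ) | ( uint32_t( p[3] ) << 24 );
}

// Returns false if the data is too short or is not a version-one header.
bool ParseCacheHeader( std::vector<unsigned char> const & data, CacheHeader & header ) {
  if( data.size() < 32 ) {
    return false;
  }
  header.header_size    = ReadU32LE( data.data() );
  header.header_version = ReadU32LE( data.data() + 4 );
  header.vendor_id      = ReadU32LE( data.data() + 8 );
  header.device_id      = ReadU32LE( data.data() + 12 );
  std::memcpy( header.uuid.data(), data.data() + 16, 16 );
  return header.header_size >= 32 && header.header_version == 1;
}

// Compares the header against the values reported for the current device.
bool CacheMatchesDevice( CacheHeader const & header, uint32_t vendor_id,
                         uint32_t device_id, std::array<uint8_t, 16> const & uuid ) {
  return header.vendor_id == vendor_id && header.device_id == device_id && header.uuid == uuid;
}
```

In real code the expected values would come from the vendorID, deviceID and pipelineCacheUUID fields of VkPhysicalDeviceProperties. If the check fails, the cache objects can simply be created with empty initial data.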

The next step is to prepare storage for the pipeline handles created by each thread, and to start all the threads, each creating its pipelines with its own cache object.

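// One inner vector of pipeline handles per thread.
graphics_pipelines.resize( graphics_pipelines_create_infos.size() );
std::vector<std::thread> threads( graphics_pipelines_create_infos.size() );
for( size_t i = 0; i < graphics_pipelines_create_infos.size(); ++i ) {
  graphics_pipelines[i].resize( graphics_pipelines_create_infos[i].size() );
  // std::thread copies its arguments. Assuming CreateGraphicsPipelines takes
  // the create infos by const reference and the output vector by reference,
  // they must be wrapped in std::cref / std::ref (from <functional>).
  threads[i] = std::thread( CreateGraphicsPipelines, logical_device,
    std::cref( graphics_pipelines_create_infos[i] ), *pipeline_caches[i],
    std::ref( graphics_pipelines[i] ) );
}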

Wait for all threads to finish, then merge all the cache objects into one and write the new contents to the cache file, replacing the old ones.

for( size_t i = 0; i < graphics_pipelines_create_infos.size(); ++i ) {
  threads[i].join();
}

// Use the last cache as the merge target and all the others as sources.
VkPipelineCache target_cache = *pipeline_caches.back();
std::vector<VkPipelineCache> source_caches( pipeline_caches.size() - 1 );
for( size_t i = 0; i < pipeline_caches.size() - 1; ++i ) {
  source_caches[i] = *pipeline_caches[i];
}
if( !MergeMultiplePipelineCacheObjects( logical_device, target_cache,
    source_caches ) ) {
  return false;
}

// Retrieve the merged contents and overwrite the cache file.
if( !RetrieveDataFromPipelineCache( logical_device, target_cache,
    cache_data ) ) {
  return false;
}
if( !SaveBinaryFile( pipeline_cache_filename, cache_data ) ) {
  return false;
}
return true;

Destroy

Destroy pipeline

if( VK_NULL_HANDLE != pipeline ) {
  vkDestroyPipeline( logical_device, pipeline, nullptr );
  pipeline = VK_NULL_HANDLE;
}

Before destroying a pipeline, make sure the commands that use it have finished executing (for example, by waiting on fences).
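One common way to satisfy this requirement without stalling is to postpone destruction until the frame that last used the object is known to be complete. The sketch below shows that bookkeeping only. DeferredDeleter is a hypothetical helper, and it assumes objects are enqueued in non-decreasing frame order.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical helper: runs each destroy callback only after the frame that
// may still use the object has finished on the GPU.
class DeferredDeleter {
public:
  // `frame` is the index of the last submitted frame that may use the object.
  void Enqueue( uint64_t frame, std::function<void()> destroy ) {
    pending_.push_back( { frame, std::move( destroy ) } );
  }

  // Call after a fence has shown that every frame <= completed_frame is done.
  void Flush( uint64_t completed_frame ) {
    while( !pending_.empty() && pending_.front().frame <= completed_frame ) {
      pending_.front().destroy();
      pending_.pop_front();
    }
  }

  std::size_t PendingCount() const { return pending_.size(); }

private:
  struct Entry {
    uint64_t frame;
    std::function<void()> destroy;
  };
  std::deque<Entry> pending_;
};
```

In real code the completed frame index would be derived from vkWaitForFences or vkGetFenceStatus, and each callback would call the matching destroy function such as vkDestroyPipeline.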

Destroy a pipeline cache

A pipeline cache can be destroyed once it has been used to create pipelines, merged into another cache, or had its contents retrieved.

if( VK_NULL_HANDLE != pipeline_cache ) {
  vkDestroyPipelineCache( logical_device, pipeline_cache, nullptr );
  pipeline_cache = VK_NULL_HANDLE;
}

Destroying a pipeline layout

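A pipeline layout can be destroyed when it is no longer needed: we do not intend to create more pipelines with it, bind descriptor sets or update push constants that use it, and all operations that use it have finished.

Pipeline layouts are only useful in three situations: creating pipelines, binding descriptor sets, and updating push constants. In the first case the layout can be destroyed right after use; in the other two it can be destroyed once the hardware has finished executing the command buffers involved.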

if( VK_NULL_HANDLE != pipeline_layout ) {
  vkDestroyPipelineLayout( logical_device, pipeline_layout, nullptr );
  pipeline_layout = VK_NULL_HANDLE;
}

Destroying a shader module

Shader modules are only used to create pipeline objects, so they can be destroyed as soon as pipeline creation is done.

if( VK_NULL_HANDLE != shader_module ) {
  vkDestroyShaderModule( logical_device, shader_module, nullptr );
  shader_module = VK_NULL_HANDLE;
}

Porting UE4's Film ACES to Unity

Posted on 2019-03-16 | In render , Postprocess

This post briefly describes how to port UE4's post-processing to Unity.


nolife

A game programmer's blog
© 2019 nolife