Vulkan validation error when I try to reset a commandPool after vkQueueWaitIddle - vulkan

I have a small Vulkan program that runs a compute shader in a loop.
There is only one commandBuffer that is allocated from the only commandPool I have.
After the commandBuffer is built, I submit it to the queue, and wait for it to comple with vkQueueWaitIddle. I does indeed wait for a while in that line of code. After that, I call vkResetCommandPool, which should reset all commandBuffer allocated with that pool (there is only one anyways).
...
vkEndCommandBuffer(commandBuffer);
{
VkSubmitInfo info = {};
info.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
info.commandBufferCount = 1;
info.pCommandBuffers = &commandBuffer;
vkQueueSubmit(queue, 1, &info, VK_NULL_HANDLE);
}
vkQueueWaitIdle(queue);
vkResetCommandPool(device, commandPool, VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT);
When it tries to reset the commandPool the validation gives me the following error.
VUID-vkResetCommandPool-commandPool-00040(ERROR / SPEC): msgNum: -1254218959
- Validation Error: [ VUID-vkResetCommandPool-commandPool-00040 ]
Object 0: handle = 0x20d2ce0b718, type = VK_OBJECT_TYPE_COMMAND_BUFFER; |
MessageID = 0xb53e2331 |
Attempt to reset command pool with VkCommandBuffer 0x20d2ce0b718[] which is in use.
The Vulkan spec states: All VkCommandBuffer objects allocated from commandPool must not be in the pending state
(https://vulkan.lunarg.com/doc/view/1.2.176.1/windows/1.2-extensions/vkspec.html#VUID-vkResetCommandPool-commandPool-00040)
Objects: 1
[0] 0x20d2ce0b718, type: 6, name: NULL
But I don't understand why, since I'm already waiting with vkQueueWaitIdle. According to the documentation, once the commandBuffer is done executing, it should go to the invalid state, and I should be able to reset it.
Here's the relevan surrounding code:
VkCommandBufferBeginInfo beginInfo = {};
beginInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
beginInfo.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT;
beginInfo.pInheritanceInfo = nullptr;
for (i64 i = 0; i < numIterations; i++)
{
vkBeginCommandBuffer(commandBuffer, &beginInfo);
vkCmdBindPipeline(commandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, pipeline);
vkCmdBindDescriptorSets(commandBuffer, VK_PIPELINE_BIND_POINT_COMPUTE, pipelineLayout,
0, 2, descriptorSets, 0, nullptr);
uniforms.start = i * numThreads;
vkCmdUpdateBuffer(commandBuffer, unifsBuffer, 0, sizeof(uniforms), &uniforms);
vkCmdPipelineBarrier(commandBuffer,
VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT, 0,
0, nullptr,
1, &memBarriers[0],
0, nullptr);
vkCmdDispatch(commandBuffer, numThreads, 1, 1);
vkCmdPipelineBarrier(commandBuffer,
VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, 0,
0, nullptr,
1, &memBarriers[1],
0, nullptr);
VkBufferCopy copyInfo = {};
copyInfo.srcOffset = 0;
copyInfo.dstOffset = 0;
copyInfo.size = sizeof(i64) * numThreads;
vkCmdCopyBuffer(commandBuffer,
buffer, stagingBuffer, 1, &copyInfo);
vkEndCommandBuffer(commandBuffer);
{
VkSubmitInfo info = {};
info.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
info.commandBufferCount = 1;
info.pCommandBuffers = &commandBuffer;
vkQueueSubmit(queue, 1, &info, VK_NULL_HANDLE);
}
vkQueueWaitIdle(queue);
vkResetCommandPool(device, commandPool, VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT);
i64* result;
vkMapMemory(device, stagingBufferMem, 0, sizeof(i64) * numThreads, 0, (void**)&result);
for (int i = 0; i < numThreads; i++)
{
if (result[i]) {
auto res = result[i];
vkUnmapMemory(device, stagingBufferMem);
return res;
}
}
vkUnmapMemory(device, stagingBufferMem);
}

I have found my problem. In vkCmdDispatch, I thought the paremeters specify the global size (number of compute shader invocations) but it's actually the number of work groups. Therefore, I was dispatching more threads than I intended, and my buffer wasn't big enough, so the threads were writing out of bounds.
I believe the validation layer wasn't giving me the right hints though.

Related

<Vulkan> Use rendered vkImage as Texture

I want to use a vkImage rendered at a previous render pass as Texture to do the composite operation in a fragment shader. From here I learned vkCmdPipelineBarrier is used to wait for GPU finish a rendering operation and I write this code. It works well on Snapdragon devices. But not on Mali-G52. The Write-after-write error is partly happed. Is this code not enough? Any suggestions?
vkCmdEndRenderPass(cb);
vkCmdBeginRenderPass(cb, &renderPassBeginInfo, VK_SUBPASS_CONTENTS_INLINE);
VkViewport viewport = vks::initializers::viewport((float)offscreenPass.width, (float)offscreenPass.height, 0.0f, 1.0f);
vkCmdSetViewport(cb, 0, 1, &viewport);
VkRect2D scissor = vks::initializers::rect2D(offscreenPass.width, offscreenPass.height, 0, 0);
vkCmdSetScissor(cb, 0, 1, &scissor);
// https://github.com/KhronosGroup/Vulkan-Samples/blob/master/samples/performance/pipeline_barriers/pipeline_barriers.cpp
VkImageMemoryBarrier imageMemoryBarrier = vks::initializers::imageMemoryBarrier();
imageMemoryBarrier.oldLayout = VK_IMAGE_LAYOUT_UNDEFINED;
imageMemoryBarrier.newLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
imageMemoryBarrier.srcAccessMask = 0;
imageMemoryBarrier.dstAccessMask = 0;
imageMemoryBarrier.image = offscreenPass.color[drawframe].image;
imageMemoryBarrier.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
imageMemoryBarrier.subresourceRange.baseMipLevel = 0;
imageMemoryBarrier.subresourceRange.levelCount = 1;
imageMemoryBarrier.subresourceRange.baseArrayLayer = 0;
imageMemoryBarrier.subresourceRange.layerCount = 1;
vkCmdPipelineBarrier(
cb,
VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,
0, 0, nullptr, 0, nullptr, 1, &imageMemoryBarrier);
imageMemoryBarrier.oldLayout = VK_IMAGE_LAYOUT_UNDEFINED;
imageMemoryBarrier.newLayout = VK_IMAGE_LAYOUT_DEPTH_STENCIL_READ_ONLY_OPTIMAL;
imageMemoryBarrier.image = offscreenPass.depth.image;
imageMemoryBarrier.srcAccessMask = 0;
imageMemoryBarrier.dstAccessMask = 0;
vkCmdPipelineBarrier(
cb,
VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,
0, 0, nullptr, 0, nullptr, 1, &imageMemoryBarrier);
I have tried every pattern written here.
If you want to synchronize render passes then your pipeline barrier must be outside of the render pass in the command stream. I.e. it must be after the vkCmdEndRenderPass() of the first pass, and before the vkCmdBeginRenderPass() of the second pass. Pipeline barriers issued inside a render pass, as you are currently doing, are used for synchronization only within the current subpass.
Also, try to avoid:
srcStage=VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT
dstStage=VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT
... for pipeline barriers when you only consume the output of the first pass as a fragment shader input in the second. This is overly conservative and needlessly serializes execution of the geometry processing too. In this case, you should use:
srcStage=VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT
dstStage=VK_PIPELINE_STAGE_FRAGMENT_BIT
... which allows the non-dependent vertex shading and binning for the second pass to run in parallel to the first pass.
Self solved.
The difference in the precision of sampler2D between Adreno and Mali causes this issue. I can read correct data using "precision highp sampler2D".

The way that copy data to a linear tiled image(not using stage buffer)when the format of image is VK_FORMAT_R8G8B8_UNORM seems not work correctly?

There are two ways that can copy data to image(using stage buffer or not).In the first way that using stage buffer, when the image format is VK_FORMAT_R8G8B8A8_UNORM or VK_FORMAT_R8G8B8_UNORM, it works correctly.But in the way that not using stage buffer, the image format is VK_FORMAT_R8G8B8A8_UNORM, it works well. While changing the format to VK_FORMAT_R8G8B8_UNORM, the result of sample is not correct.The source data can be assured correct when setting different image format .
The code used is from [https://github.com/SaschaWillems/Vulkan/blob/master/examples/texture/texture.cpp](https://www.stackoverflow.com/
if (0/*useStaging*/) {
// Copy data to an optimal tiled image
// This loads the texture data into a host local buffer that is copied to the optimal tiled image on the device
// Create a host-visible staging buffer that contains the raw image data
// This buffer will be the data source for copying texture data to the optimal tiled image on the device
VkBuffer stagingBuffer;
VkDeviceMemory stagingMemory;
VkBufferCreateInfo bufferCreateInfo = vks::initializers::bufferCreateInfo();
bufferCreateInfo.size = ktxTextureSize;
// This buffer is used as a transfer source for the buffer copy
bufferCreateInfo.usage = VK_BUFFER_USAGE_TRANSFER_SRC_BIT;
bufferCreateInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
VK_CHECK_RESULT(vkCreateBuffer(device, &bufferCreateInfo, nullptr, &stagingBuffer));
// Get memory requirements for the staging buffer (alignment, memory type bits)
vkGetBufferMemoryRequirements(device, stagingBuffer, &memReqs);
memAllocInfo.allocationSize = memReqs.size;
// Get memory type index for a host visible buffer
memAllocInfo.memoryTypeIndex = vulkanDevice->getMemoryType(memReqs.memoryTypeBits, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT);
VK_CHECK_RESULT(vkAllocateMemory(device, &memAllocInfo, nullptr, &stagingMemory));
VK_CHECK_RESULT(vkBindBufferMemory(device, stagingBuffer, stagingMemory, 0));
// Copy texture data into host local staging buffer
uint8_t *data;
VK_CHECK_RESULT(vkMapMemory(device, stagingMemory, 0, memReqs.size, 0, (void **)&data));
memcpy(data, ktxTextureData, ktxTextureSize);
vkUnmapMemory(device, stagingMemory);
// Setup buffer copy regions for each mip level
std::vector<VkBufferImageCopy> bufferCopyRegions;
uint32_t offset = 0;
for (uint32_t i = 0; i < texture.mipLevels; i++) {
// Calculate offset into staging buffer for the current mip level
ktx_size_t offset;
KTX_error_code ret = ktxTexture_GetImageOffset(ktxTexture, i, 0, 0, &offset);
assert(ret == KTX_SUCCESS);
// Setup a buffer image copy structure for the current mip level
VkBufferImageCopy bufferCopyRegion = {};
bufferCopyRegion.imageSubresource.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
bufferCopyRegion.imageSubresource.mipLevel = i;
bufferCopyRegion.imageSubresource.baseArrayLayer = 0;
bufferCopyRegion.imageSubresource.layerCount = 1;
bufferCopyRegion.imageExtent.width = ktxTexture->baseWidth >> i;
bufferCopyRegion.imageExtent.height = ktxTexture->baseHeight >> i;
bufferCopyRegion.imageExtent.depth = 1;
bufferCopyRegion.bufferOffset = offset;
bufferCopyRegions.push_back(bufferCopyRegion);
}
// Create optimal tiled target image on the device
VkImageCreateInfo imageCreateInfo = vks::initializers::imageCreateInfo();
imageCreateInfo.imageType = VK_IMAGE_TYPE_2D;
imageCreateInfo.format = format;
imageCreateInfo.mipLevels = texture.mipLevels;
imageCreateInfo.arrayLayers = 1;
imageCreateInfo.samples = VK_SAMPLE_COUNT_1_BIT;
imageCreateInfo.tiling = VK_IMAGE_TILING_OPTIMAL;
imageCreateInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
// Set initial layout of the image to undefined
imageCreateInfo.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
imageCreateInfo.extent = { texture.width, texture.height, 1 };
imageCreateInfo.usage = VK_IMAGE_USAGE_TRANSFER_DST_BIT | VK_IMAGE_USAGE_SAMPLED_BIT;
VK_CHECK_RESULT(vkCreateImage(device, &imageCreateInfo, nullptr, &texture.image));
vkGetImageMemoryRequirements(device, texture.image, &memReqs);
memAllocInfo.allocationSize = memReqs.size;
memAllocInfo.memoryTypeIndex = vulkanDevice->getMemoryType(memReqs.memoryTypeBits, VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT);
VK_CHECK_RESULT(vkAllocateMemory(device, &memAllocInfo, nullptr, &texture.deviceMemory));
VK_CHECK_RESULT(vkBindImageMemory(device, texture.image, texture.deviceMemory, 0));
VkCommandBuffer copyCmd = vulkanDevice->createCommandBuffer(VK_COMMAND_BUFFER_LEVEL_PRIMARY, true);
// Image memory barriers for the texture image
// The sub resource range describes the regions of the image that will be transitioned using the memory barriers below
VkImageSubresourceRange subresourceRange = {};
// Image only contains color data
subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
// Start at first mip level
subresourceRange.baseMipLevel = 0;
// We will transition on all mip levels
subresourceRange.levelCount = texture.mipLevels;
// The 2D texture only has one layer
subresourceRange.layerCount = 1;
// Transition the texture image layout to transfer target, so we can safely copy our buffer data to it.
VkImageMemoryBarrier imageMemoryBarrier = vks::initializers::imageMemoryBarrier();;
imageMemoryBarrier.image = texture.image;
imageMemoryBarrier.subresourceRange = subresourceRange;
imageMemoryBarrier.srcAccessMask = 0;
imageMemoryBarrier.dstAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
imageMemoryBarrier.oldLayout = VK_IMAGE_LAYOUT_UNDEFINED;
imageMemoryBarrier.newLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL;
// Insert a memory dependency at the proper pipeline stages that will execute the image layout transition
// Source pipeline stage is host write/read execution (VK_PIPELINE_STAGE_HOST_BIT)
// Destination pipeline stage is copy command execution (VK_PIPELINE_STAGE_TRANSFER_BIT)
vkCmdPipelineBarrier(
copyCmd,
VK_PIPELINE_STAGE_HOST_BIT,
VK_PIPELINE_STAGE_TRANSFER_BIT,
0,
0, nullptr,
0, nullptr,
1, &imageMemoryBarrier);
// Copy mip levels from staging buffer
vkCmdCopyBufferToImage(
copyCmd,
stagingBuffer,
texture.image,
VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
static_cast<uint32_t>(bufferCopyRegions.size()),
bufferCopyRegions.data());
// Once the data has been uploaded we transfer to the texture image to the shader read layout, so it can be sampled from
imageMemoryBarrier.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
imageMemoryBarrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
imageMemoryBarrier.oldLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL;
imageMemoryBarrier.newLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
// Insert a memory dependency at the proper pipeline stages that will execute the image layout transition
// Source pipeline stage is copy command execution (VK_PIPELINE_STAGE_TRANSFER_BIT)
// Destination pipeline stage fragment shader access (VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT)
vkCmdPipelineBarrier(
copyCmd,
VK_PIPELINE_STAGE_TRANSFER_BIT,
VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
0,
0, nullptr,
0, nullptr,
1, &imageMemoryBarrier);
// Store current layout for later reuse
texture.imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
vulkanDevice->flushCommandBuffer(copyCmd, queue, true);
// Clean up staging resources
vkFreeMemory(device, stagingMemory, nullptr);
vkDestroyBuffer(device, stagingBuffer, nullptr);
} else {
// Copy data to a linear tiled image
VkImage mappableImage;
VkDeviceMemory mappableMemory;
// Load mip map level 0 to linear tiling image
VkImageCreateInfo imageCreateInfo = vks::initializers::imageCreateInfo();
imageCreateInfo.imageType = VK_IMAGE_TYPE_2D;
imageCreateInfo.format = format;
imageCreateInfo.mipLevels = 1;
imageCreateInfo.arrayLayers = 1;
imageCreateInfo.samples = VK_SAMPLE_COUNT_1_BIT;
imageCreateInfo.tiling = VK_IMAGE_TILING_LINEAR;
imageCreateInfo.usage = VK_IMAGE_USAGE_SAMPLED_BIT;
imageCreateInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
imageCreateInfo.initialLayout = VK_IMAGE_LAYOUT_PREINITIALIZED;
imageCreateInfo.extent = { texture.width, texture.height, 1 };
VK_CHECK_RESULT(vkCreateImage(device, &imageCreateInfo, nullptr, &mappableImage));
// Get memory requirements for this image like size and alignment
vkGetImageMemoryRequirements(device, mappableImage, &memReqs);
// Set memory allocation size to required memory size
memAllocInfo.allocationSize = memReqs.size;
// Get memory type that can be mapped to host memory
memAllocInfo.memoryTypeIndex = vulkanDevice->getMemoryType(memReqs.memoryTypeBits, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT);
VK_CHECK_RESULT(vkAllocateMemory(device, &memAllocInfo, nullptr, &mappableMemory));
VK_CHECK_RESULT(vkBindImageMemory(device, mappableImage, mappableMemory, 0));
// Map image memory
void *data;
VK_CHECK_RESULT(vkMapMemory(device, mappableMemory, 0, memReqs.size, 0, &data));
// Copy image data of the first mip level into memory
memcpy(data, ktxTextureData, memReqs.size);
vkUnmapMemory(device, mappableMemory);
// Linear tiled images don't need to be staged and can be directly used as textures
texture.image = mappableImage;
texture.deviceMemory = mappableMemory;
texture.imageLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
// Setup image memory barrier transfer image to shader read layout
VkCommandBuffer copyCmd = vulkanDevice->createCommandBuffer(VK_COMMAND_BUFFER_LEVEL_PRIMARY, true);
// The sub resource range describes the regions of the image we will be transition
VkImageSubresourceRange subresourceRange = {};
subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
subresourceRange.baseMipLevel = 0;
subresourceRange.levelCount = 1;
subresourceRange.layerCount = 1;
// Transition the texture image layout to shader read, so it can be sampled from
VkImageMemoryBarrier imageMemoryBarrier = vks::initializers::imageMemoryBarrier();;
imageMemoryBarrier.image = texture.image;
imageMemoryBarrier.subresourceRange = subresourceRange;
imageMemoryBarrier.srcAccessMask = VK_ACCESS_HOST_WRITE_BIT;
imageMemoryBarrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
imageMemoryBarrier.oldLayout = VK_IMAGE_LAYOUT_PREINITIALIZED;
imageMemoryBarrier.newLayout = VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL;
// Insert a memory dependency at the proper pipeline stages that will execute the image layout transition
// Source pipeline stage is host write/read execution (VK_PIPELINE_STAGE_HOST_BIT)
// Destination pipeline stage fragment shader access (VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT)
vkCmdPipelineBarrier(
copyCmd,
VK_PIPELINE_STAGE_HOST_BIT,
VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
0,
0, nullptr,
0, nullptr,
1, &imageMemoryBarrier);
vulkanDevice->flushCommandBuffer(copyCmd, queue, true);
}
)

How to read R32G32_SFLOAT image from gpu in Vulkan

I am able to dump stuff from R32G32B32A32 image for screenshot. I would like to read out a pixel from R32G32_SFLOAT image as well. But the result look weird.
below is my working image dump code(no validation error)
void DumpImageToFile(VkTool::VulkanDevice &device, VkQueue graphics_queue, VkTool::Wrapper::CommandBuffers &command_buffer, VkImage image, uint32_t width, uint32_t height, const char *filename)
{
auto image_create_info = VkTool::Initializer::GenerateImageCreateInfo(VK_IMAGE_TYPE_2D, VK_FORMAT_R8G8B8A8_UNORM, {width, height, 1},
VK_IMAGE_USAGE_TRANSFER_SRC_BIT | VK_IMAGE_USAGE_TRANSFER_DST_BIT, VK_SAMPLE_COUNT_1_BIT);
VkTool::Wrapper::Image staging_image(device, image_create_info, VK_MEMORY_HEAP_DEVICE_LOCAL_BIT);
auto buffer_create_info = VkTool::Initializer::GenerateBufferCreateInfo(width * height * 4, VK_BUFFER_USAGE_TRANSFER_DST_BIT);
VkTool::Wrapper::Buffer staging_buffer(device, buffer_create_info, VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT | VK_MEMORY_PROPERTY_HOST_COHERENT_BIT);
// Copy texture to buffer
command_buffer.Begin();
auto image_memory_barrier = VkTool::Initializer::GenerateImageMemoryBarrier(VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
{ VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 }, staging_image.Get());
device.vkCmdPipelineBarrier(command_buffer.Get(), VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, 0
, 0, nullptr, 0, nullptr, 1, &image_memory_barrier);
image_memory_barrier = VkTool::Initializer::GenerateImageMemoryBarrier(VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL,
{ VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 }, image);
device.vkCmdPipelineBarrier(command_buffer.Get(), VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, 0
, 0, nullptr, 0, nullptr, 1, &image_memory_barrier);
// Copy!!
VkImageBlit region = {};
region.srcSubresource = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 0, 1 };
region.srcOffsets[0] = { 0, 0, 0 };
region.srcOffsets[1] = { static_cast<int32_t>(width), static_cast<int32_t>(height), 1};
region.dstSubresource = { VK_IMAGE_ASPECT_COLOR_BIT, 0, 0, 1 };
region.dstOffsets[0] = { 0, 0, 0 };
region.dstOffsets[1] = { static_cast<int32_t>(width), static_cast<int32_t>(height), 1 };
device.vkCmdBlitImage(command_buffer.Get(), image, VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, staging_image.Get(), VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, 1, &region, VK_FILTER_LINEAR);
image_memory_barrier = VkTool::Initializer::GenerateImageMemoryBarrier(VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, VK_IMAGE_LAYOUT_PRESENT_SRC_KHR,
{ VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 }, image);
device.vkCmdPipelineBarrier(command_buffer.Get(), VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT, 0
, 0, nullptr, 0, nullptr, 1, &image_memory_barrier);
image_memory_barrier = VkTool::Initializer::GenerateImageMemoryBarrier(VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL,
{ VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 }, staging_image.Get());
device.vkCmdPipelineBarrier(command_buffer.Get(), VK_PIPELINE_STAGE_TRANSFER_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, 0
, 0, nullptr, 0, nullptr, 1, &image_memory_barrier);
auto buffer_image_copy = VkTool::Initializer::GenerateBufferImageCopy({ VK_IMAGE_ASPECT_COLOR_BIT , 0, 0, 1 }, { width, height, 1 });
device.vkCmdCopyImageToBuffer(command_buffer.Get(), staging_image.Get(), VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, staging_buffer.Get(), 1, &buffer_image_copy);
command_buffer.End();
std::vector<VkCommandBuffer> raw_command_buffers = command_buffer.GetAll();
auto submit_info = VkTool::Initializer::GenerateSubmitInfo(raw_command_buffers);
VkTool::Wrapper::Fence fence(device);
device.vkQueueSubmit(graphics_queue, 1, &submit_info, fence.Get());
fence.Wait();
fence.Destroy();
const uint8_t *mapped_address = reinterpret_cast<const uint8_t *>(staging_buffer.MapMemory());
lodepng::encode(filename, mapped_address, width, height);
staging_buffer.UnmapMemory();
staging_image.Destroy();
staging_buffer.Destroy();
}
Sorry for the ugly self-made wrapper, there was no official wrapper. Basically, it creates a staging image and buffer. first copy from source image to staging image with vkCmdBlitImage. then use vkCmdCopyImageToBuffer and map the buffer to host memory. This method works on multiple gpus and it does not need to worry about padding.(I guess, correct me if I am wrong).
However, I have no luck to use this method to read R32G32_SFLOAT. at first I thought it was because of endianness until I dump the whole image out.
The image above is I directly convert R32G32_SFLOAT to R8G8B8A8_UNORM, I know it does not make sense. But without changing format, there's still a lot of "hole" in the image and values are deadly wrong.
I am not really sure if it is THE problem, but if I understand your code, you want to put image into filename.
So you want to read from this image. However, you said that the old layout for this image (not the staging one) is UNDEFINED layout. The implementation is free to assume you do not care about data that are stored in it. Use the real layout instead (I think it is COLOR_ATTACHMENT or something like that).
Moreover, you are using one staging image and one staging buffer. I do not really understand why are you doing such a thing? Why not simply use vkCmdCopyImageToBuffer function with image to staging_buffer?
BTW, with Vulkan it is not because one code works on some GPUs that this code is correct.
Also, I think you must use a memory barrier after your transfer to the buffer that implies HOST_STAGE and HOST_READ. In the specification, it is write :
Signaling a fence and waiting on the host does not guarantee that the results of memory accesses will be visible to the host, as the access scope of a memory dependency defined by a fence only includes device access. A memory barrier or other memory dependency must be used to guarantee this. See the description of host access types for more information.
This part of your code seems weird:
image_memory_barrier = VkTool::Initializer::GenerateImageMemoryBarrier(VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, { VK_IMAGE_ASPECT_COLOR_BIT, 0, 1, 0, 1 }, image);
device.vkCmdPipelineBarrier(command_buffer.Get(), VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT, VK_PIPELINE_STAGE_TRANSFER_BIT, 0, 0, nullptr, 0, nullptr, 1, &image_memory_barrier);
This basically means that after the barrier your source image may not have any data. UNDEFINED value used as a source layout doesn't guarantee that the contents of an image are preserved.

generating of mipmaps using vkCmdBlitImage for cubemap textures

what should be the parameters of VkImageBlit.dstOffsets and VkImageBlit.srcOffsets when we are doing dynamic generation of mipmaps?
I am doing layer by layer and for each mipmap level but somewhere it is going wrong, mostly i think offsets. So i have data which has all the six faces with 0th mipmap level.
for(int j=0; j< bufferCopyRegions.size(); j++) {
for (int32_t i = 1; i < mipLevels; i++)
{
VkImageBlit imageBlit{};
// Source
imageBlit.srcSubresource.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
imageBlit.srcSubresource.layerCount = 1;
imageBlit.srcSubresource.mipLevel = 0;
imageBlit.srcOffsets[1].x = bitmapInfos[j].width;
imageBlit.srcOffsets[1].y = bitmapInfos[j].height;
imageBlit.srcOffsets[1].z = 1;
// Destination
imageBlit.dstSubresource.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
imageBlit.dstSubresource.layerCount = 1;
imageBlit.dstSubresource.mipLevel = i;
imageBlit.dstOffsets[1].x = int32_t(bitmapInfos[j].width >> (i) == 0 ? 1 : int32_t(bitmapInfos[j].width >> (i )));
imageBlit.dstOffsets[1].y = int32_t(bitmapInfos[j].height >> (i) == 0 ? 1 : int32_t(bitmapInfos[j].height >> (i)));
imageBlit.dstOffsets[1].z = 1;
VkImageMemoryBarrier imageMemoryBarrier = {};
imageMemoryBarrier.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
imageMemoryBarrier.pNext = NULL;
imageMemoryBarrier.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
imageMemoryBarrier.subresourceRange.baseMipLevel = i;
imageMemoryBarrier.subresourceRange.levelCount = 1;
imageMemoryBarrier.subresourceRange.baseArrayLayer = j;
imageMemoryBarrier.subresourceRange.layerCount = 1;
// change layout of current mip level to transfer dest
setImageLayout(imageMemoryBarrier,
blitCmd,
image,
VK_IMAGE_ASPECT_COLOR_BIT,
VK_IMAGE_LAYOUT_UNDEFINED,
VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, imageMemoryBarrier.subresourceRange,
VK_PIPELINE_STAGE_TRANSFER_BIT,
VK_PIPELINE_STAGE_HOST_BIT);
// Do blit operation from previous mip level
vkCmdBlitImage(blitCmd, image, VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, image,
VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, 1, &imageBlit, VK_FILTER_LINEAR);
setImageLayout(imageMemoryBarrier, blitCmd, image, VK_IMAGE_ASPECT_COLOR_BIT,
VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL, imageMemoryBarrier.subresourceRange,
VK_PIPELINE_STAGE_HOST_BIT,
VK_PIPELINE_STAGE_TRANSFER_BIT);
}
}
I don't see baseArrayLayer of the imageBlit.srcSubresource and imageBlit.dstSubresource set to j. Which is probably your immediate problem.
Also your barriers seem bad to me. Only the top mip needs to be synchronized with host. But even so VK_PIPELINE_STAGE_HOST_BIT should not be necessary, because there is an exception for vkQueueSubmit saying it does this kind of synchronization implicitly if host writes ended before it being called (6.9. Host Write Ordering Guarantees and reminded in the Note in 6.1.3. Access Types).

Setting up effect audio units for CoreAudio

I'm trying to setup a high-pass filter but AUGraphStart gives me -10863 when I try. I cannot find much documntation at all. Here is my attent to set up the filter:
- (void)initializeAUGraph{
AUNode outputNode;
AUNode mixerNode;
AUNode effectNode;
NewAUGraph(&mGraph);
// Create AudioComponentDescriptions for the AUs we want in the graph
// mixer component
AudioComponentDescription mixer_desc;
mixer_desc.componentType = kAudioUnitType_Mixer;
mixer_desc.componentSubType = kAudioUnitSubType_AU3DMixerEmbedded;
mixer_desc.componentFlags = 0;
mixer_desc.componentFlagsMask = 0;
mixer_desc.componentManufacturer = kAudioUnitManufacturer_Apple;
// output component
AudioComponentDescription output_desc;
output_desc.componentType = kAudioUnitType_Output;
output_desc.componentSubType = kAudioUnitSubType_RemoteIO;
output_desc.componentFlags = 0;
output_desc.componentFlagsMask = 0;
output_desc.componentManufacturer = kAudioUnitManufacturer_Apple;
//effect component
AudioComponentDescription effect_desc;
effect_desc.componentType = kAudioUnitType_Effect;
effect_desc.componentSubType = kAudioUnitSubType_HighPassFilter;
effect_desc.componentFlags = 0;
effect_desc.componentFlagsMask = 0;
effect_desc.componentManufacturer = kAudioUnitManufacturer_Apple;
// Add nodes to the graph to hold our AudioUnits
AUGraphAddNode(mGraph, &output_desc, &outputNode);
AUGraphAddNode(mGraph, &mixer_desc, &mixerNode);
AUGraphAddNode(mGraph, &effect_desc, &effectNode);
// Connect the nodes
AUGraphConnectNodeInput(mGraph, mixerNode, 0, effectNode, 0);
AUGraphConnectNodeInput(mGraph, effectNode, 0, outputNode, 0);
//Open Graph
AUGraphOpen(mGraph);
// Get a link to the mixer AU
AUGraphNodeInfo(mGraph, mixerNode, NULL, &mMixer);
// Get a link to the effect AU
AUGraphNodeInfo(mGraph, effectNode, NULL, &mEffect);
//Setup buses
size_t numbuses = track_count;
UInt32 size = sizeof(numbuses);
AudioUnitSetProperty(mMixer, kAudioUnitProperty_ElementCount, kAudioUnitScope_Input, 0, &numbuses, size);
//Setup Stream Format Data
AudioStreamBasicDescription desc;
size = sizeof(desc);
// Setup Stream Format
desc.mSampleRate = kGraphSampleRate;
desc.mFormatID = kAudioFormatLinearPCM;
desc.mFormatFlags = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
desc.mBitsPerChannel = sizeof(AudioSampleType) * 8; // AudioSampleType == 16 bit signed ints
desc.mChannelsPerFrame = 1;
desc.mFramesPerPacket = 1;
desc.mBytesPerFrame = sizeof(AudioSampleType);
desc.mBytesPerPacket = desc.mBytesPerFrame;
// Loop through and setup a callback for each source you want to send to the mixer.
for (int i = 0; i < numbuses; ++i) {
// Setup render callback struct
AURenderCallbackStruct renderCallbackStruct;
renderCallbackStruct.inputProc = &renderInput;
renderCallbackStruct.inputProcRefCon = self;
// Connect the callback to the mixer input channel
AUGraphSetNodeInputCallback(mGraph, mixerNode, i, &renderCallbackStruct);
// Apply Stream Data
AudioUnitSetProperty(mMixer, kAudioUnitProperty_StreamFormat,kAudioUnitScope_Input,i,&desc,size);
AudioUnitSetParameter(mMixer, k3DMixerParam_Distance, kAudioUnitScope_Input, i, rand() % 6, 0);
rotation[i] = rand() % 360;
rotation_speed[i] = rand() % 5;
AudioUnitSetParameter(mMixer, k3DMixerParam_Azimuth, kAudioUnitScope_Input, i, rotation[i], 0);
AudioUnitSetParameter(mMixer, k3DMixerParam_Elevation, kAudioUnitScope_Input, i, 30, 0);
}
// Reset stream fromat data to 0
memset (&desc, 0, sizeof (desc));
// Setup output stream format
desc.mSampleRate = kGraphSampleRate;
// Apply Stream Data to Output
AudioUnitSetProperty(mEffect,kAudioUnitProperty_StreamFormat,kAudioUnitScope_Input,0,&desc,size);
AudioUnitSetProperty(mEffect,kAudioUnitProperty_StreamFormat,kAudioUnitScope_Output,0,&desc,size);
AudioUnitSetProperty(mMixer,kAudioUnitProperty_StreamFormat,kAudioUnitScope_Output,0,&desc,size);
//All done so initialise
AUGraphInitialize(mGraph);
}
It works when I remove the high pass filter. How do I get the filter working?
Thank you.
PS: Is the 3D elevation supposed to do nothing?
if u still have the problem...u should add a converter unit between the mixernode and the effect node and set the input format of it as the mixer output and the output format to the format u get from audiounitgetproperty (converternode)
Not all Audio Units that are available on OSX are available on iOS. In fact, only a few are. According to below documentation the highpassfilter effect is not supported on iOS : http://developer.apple.com/library/ios/#documentation/MusicAudio/Conceptual/AudioUnitHostingGuide_iOS/UsingSpecificAudioUnits/UsingSpecificAudioUnits.html#//apple_ref/doc/uid/TP40009492-CH17-SW1
"The iPod EQ unit (subtype kAudioUnitSubType_AUiPodEQ) is the only effect unit provided in iOS 4."
Note that it mentions iOS4. But I am unable to find any documentation on this for later versions of iOS.