Can Vulkan synchronization source and destination stage masks be the same?

The following code from the Vulkan Tutorial seems to conflict with how synchronization scopes work.
// <dependency> is a subpass dependency.
dependency.srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
...
dependency.dstStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
The above code is trying to set both the srcStageMask and dstStageMask to be the same pipeline stage: VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT.
According to Vulkan Specification:
If a synchronization command includes a source stage mask, its first synchronization scope only includes execution of the pipeline stages specified in that mask, ...
If a synchronization command includes a destination stage mask, its second synchronization scope only includes execution of the pipeline stages specified in that mask, ...
In other words, srcStageMask and dstStageMask create a first synchronization scope with specified stage(s) and a second one with the specified stage(s), respectively.
Also, according to the following:
... for two sets of operations, the first set must happen before the second set.
My confusion is that, since the source and destination stages are the same, the subpass dependency seems to require that this pipeline stage complete before the exact same stage starts to execute.
The color attachment output stage is already guaranteed to be finished (the first scope). How can you ask for the same finished stage to start executing again (the second scope)?
So what is this dependency trying to say?

A stage only exists within an action command that executes some portion of itself within that stage. Synchronization scopes are based on commands first. Once you have defined which commands are in the scope, stage masks can specify which stages within those commands are affected by the synchronization.
As such, all synchronization operations define a set of commands that happen before the synchronization and the set of commands that happen after. These represent the "first synchronization scope" and "second synchronization scope".
The source stage mask applies to the commands in the "first synchronization scope". The destination stage mask applies to commands in the "second synchronization scope". The commands in one scope are a distinct set from the other scope. So even if you're talking about the same pipeline stages, they're stages in different commands that execute at different times.
So what that does is exactly what it says: it creates a dependency between all executions of the color attachment output stage from the source subpass (aka: the "first synchronization scope") and all executions of the color attachment output stage from the destination subpass (aka: the "second synchronization scope").
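To make the "commands first, stages second" idea concrete, here is a minimal toy model in C. It is only a sketch and not the Vulkan API: the stage bits, the function names, and the idea of numbering commands by recording position are all assumptions made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stage bits for this sketch only; NOT the real Vulkan values. */
enum {
    TOY_STAGE_VERTEX       = 1u << 0,
    TOY_STAGE_FRAGMENT     = 1u << 1,
    TOY_STAGE_COLOR_OUTPUT = 1u << 2
};

/* A unit of work is "stage `stage` executing inside command number `command`".
   `barrier_pos` is the number of commands recorded before the dependency. */

/* First scope: commands recorded before the dependency, limited to src stages. */
bool toy_in_first_scope(size_t command, unsigned stage,
                        size_t barrier_pos, unsigned src_mask) {
    return command < barrier_pos && (stage & src_mask) != 0;
}

/* Second scope: commands recorded after the dependency, limited to dst stages. */
bool toy_in_second_scope(size_t command, unsigned stage,
                         size_t barrier_pos, unsigned dst_mask) {
    return command >= barrier_pos && (stage & dst_mask) != 0;
}
```

With one command on each side of the dependency (barrier_pos of 1) and the toy color output bit in both masks, command 0's color output work falls only in the first scope and command 1's only in the second. The masks are identical, but the two scopes never contain the same piece of work.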

Related

Is this memory access barrier flag not sufficient?

In a Vulkan example, the author sets dstStageMask to VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT and dstAccessMask to VK_ACCESS_MEMORY_READ_BIT. Now, from my previous questions asked here, the access mask flags apply specifically and explicitly to each stage flag provided (not to all logically earlier stages). So, for example, if I have an access mask of memory read for the fragment shader stage, then that access mask (cache invalidation) doesn't apply to the vertex shader or vertex input; rather, I would have to specify those stage flags separately.
It seems to me, in light of this, that having a dstAccessMask of VK_ACCESS_MEMORY_READ_BIT with a dstStageMask of VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT wouldn't do anything, as the access mask doesn't apply to "logically earlier stages", but only to the "BOTTOM_OF_PIPE" stage. Here is Sascha Willems' code from the multisampling example:
dependencies[0].srcSubpass = VK_SUBPASS_EXTERNAL;
dependencies[0].dstSubpass = 0;
dependencies[0].srcStageMask = VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT;
dependencies[0].dstStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
dependencies[0].srcAccessMask = VK_ACCESS_MEMORY_READ_BIT;
dependencies[0].dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_READ_BIT |
VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
dependencies[0].dependencyFlags = VK_DEPENDENCY_BY_REGION_BIT;
dependencies[1].srcSubpass = 0;
dependencies[1].dstSubpass = VK_SUBPASS_EXTERNAL;
dependencies[1].srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT;
dependencies[1].dstStageMask = VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT;
dependencies[1].srcAccessMask = VK_ACCESS_COLOR_ATTACHMENT_READ_BIT | VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT;
dependencies[1].dstAccessMask = VK_ACCESS_MEMORY_READ_BIT;
dependencies[1].dependencyFlags = VK_DEPENDENCY_BY_REGION_BIT;
And here is the part of the spec that says the access mask only applies to the exact stages as given by each flag explicitly, not logically earlier stages:
Including a particular pipeline stage in the first synchronization
scope of a command implicitly includes logically earlier pipeline
stages in the synchronization scope. Similarly, the second
synchronization scope includes logically later pipeline stages.
However, note that access scopes are not affected in this way - only
the precise stages specified are considered part of each access scope.
VK_PIPELINE_STAGE_NONE (and its equivalents) perform no memory accesses, so the current practice is to just write 0 (or VK_ACCESS_NONE) in the corresponding access mask. Early-day Vulkan practices were messy, and you will find plenty of code that was never updated...

What does "VkImageMemoryBarrier::srcAccessMask = 0" mean?

I just read the Images chapter of the Vulkan tutorial, and I don't understand "VkImageMemoryBarrier::srcAccessMask = 0".
Code:
barrier.srcAccessMask = 0;
barrier.dstAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
And the tutorial says:
Since the transitionImageLayout function executes a command buffer with only a single command, you could use this implicit synchronization and set srcAccessMask to 0 if you ever needed a VK_ACCESS_HOST_WRITE_BIT dependency in a layout transition.
Q1: If the function's command buffer contained multiple commands, could this implicit synchronization no longer be used?
Q2: According to the manual page, VK_ACCESS_HOST_WRITE_BIT is 0x00004000, but the tutorial uses "0". Why?
Does "0" mean implicit,
and "VK_ACCESS_HOST_WRITE_BIT" mean explicit?
Am I understanding this correctly?
0 access mask means "nothing". As in, there is no memory dependency the barrier introduces.
Implicit synchronization means Vulkan does it for you. As the tutorial says:
One thing to note is that command buffer submission results in implicit VK_ACCESS_HOST_WRITE_BIT synchronization
Specifically, this is the host write ordering guarantee.
Implicit means you don't have to do anything. Any host write to mapped memory is already automatically visible to any device access of any vkQueueSubmit called after the mapped memory write.
Explicit in this case would mean to submit a barrier with VK_PIPELINE_STAGE_HOST_BIT and VK_ACCESS_HOST_*_BIT.
Note the sync guarantees only work one way. So CPU → GPU will be automatic/implicit. But GPU → CPU always needs to be explicit (you need a barrier with dst = VK_PIPELINE_STAGE_HOST_BIT to perform the memory domain operation).

Do the _WRITE_BIT destination access masks imply _READ_BIT access scope?

I think the question is clear, but in case the answer is no I'll describe the conundrum I have:
Minimal setup: a single render pass with a single subpass. Two attachments, color and depth, rendering a cube. The depth attachment layouts (initial, mid, final) are:
VK_IMAGE_LAYOUT_UNDEFINED
VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL
VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL
So there's one automatic layout transition. I know that because of my .loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR, I'll get a write-after-write warning if I don't make it visible. So I'll use this subpass dependency:
constexpr VkSubpassDependency in_dependency{
.srcSubpass = VK_SUBPASS_EXTERNAL,
.dstSubpass = 0,
.srcStageMask = VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT,
.dstStageMask = VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT,
.srcAccessMask = 0,
.dstAccessMask = VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT
};
This targets the early fragment test because that's where the depth att gets clear-loaded. But: Don't I also need to include the _READ_BIT in my .dstAccessMask? Sync validation doesn't seem to care, but I think I do unless I missed some rule about the write visibility implying a read visibility?
In case there is such a thing, a pointer to the spec would be nice.
WRITE does not include READ. This is simply a matter of the operation in question.
Clearing an image uses the WRITE access mode. It does not use the READ access mode. So there is no further hazard as far as clearing is concerned.
Once the image is cleared, the subpass can begin executing. Since subpass execution happens-after the clearing operation, there's no need for any further dependency.

VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT VkAccessFlags set to 0?

In the Vulkan spec it defines:
VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT is equivalent to VK_PIPELINE_STAGE_ALL_COMMANDS_BIT with
VkAccessFlags set to 0 when specified in the second synchronization scope, but specifies no
stages in the first scope.
and similarly:
VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT is equivalent to VK_PIPELINE_STAGE_ALL_COMMANDS_BIT with
VkAccessFlags set to 0 when specified in the first synchronization scope, but specifies no stages
in the second scope.
I'm unclear what it means by "with VkAccessFlags set to 0" in this context?
Technically VkAccessFlags is a type, not a variable, so it can't be set to anything.
(It seems to be adjusting the definitions of TOP/BOTTOM_OF_PIPE for some special property of VK_PIPELINE_STAGE_ALL_COMMANDS_BIT with respect to VkAccessFlags, but I can't quite see what that special property is or where it is specified.)
Anyone know what it's talking about?
(or, put another way: If we removed those two utterances of "with VkAccessFlags set to 0" from the spec, what would break?)
It is a roundabout way of saying that the interpretation of the stage flag is different for a memory dependency.
For an execution dependency, src takes the stage bits you provide, and logically earlier stages are included automatically. Similarly, for dst, logically later stages are included automatically.
But this applies only to the execution dependency. For a memory dependency, only the stage flags you provide count (and none are added automatically).
For example, let's say you have VK_PIPELINE_STAGE_ALL_COMMANDS_BIT + VK_ACCESS_MEMORY_WRITE_BIT in src. That means all memory writes from all previous commands will be made available. But if you have VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT + VK_ACCESS_MEMORY_WRITE_BIT in src, that means only memory writes from the BOTTOM_OF_PIPE stage are made available, so in effect no memory writes are made available (because that particular stage doesn't perform any).
Either way IMO, for code clarity it is better to always state all pipeline stages explicitly whenever one can.
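The difference between the two scopes can be sketched with a toy model in C. This is an illustration under stated assumptions, not the Vulkan API: the five-stage pipeline, the `S_*` bits, the function names, and the choice of which toy stages perform writes are all hypothetical.

```c
#include <stdint.h>

/* Hypothetical, simplified pipeline in logical order (lowest bit = earliest).
   These are NOT the real VkPipelineStageFlagBits values. */
enum {
    S_TOP             = 1u << 0,
    S_VERTEX_SHADER   = 1u << 1,
    S_FRAGMENT_SHADER = 1u << 2,
    S_COLOR_OUTPUT    = 1u << 3,
    S_BOTTOM          = 1u << 4,
    S_ALL             = (1u << 5) - 1u
};

/* Assumption for this sketch: only these toy stages ever write memory. */
enum { S_WRITERS = S_VERTEX_SHADER | S_FRAGMENT_SHADER | S_COLOR_OUTPUT };

/* Execution scope of a source mask: named stages plus all logically earlier ones. */
uint32_t src_execution_scope(uint32_t src_stages) {
    uint32_t scope = 0;
    for (uint32_t bit = S_BOTTOM; bit != 0; bit >>= 1) {
        if (src_stages & bit) {
            scope |= bit | (bit - 1u);   /* this stage and every lower bit */
        }
    }
    return scope;
}

/* Execution scope of a destination mask: named stages plus all logically later ones. */
uint32_t dst_execution_scope(uint32_t dst_stages) {
    uint32_t scope = 0;
    for (uint32_t bit = S_TOP; bit <= S_BOTTOM; bit <<= 1) {
        if (dst_stages & bit) {
            scope |= S_ALL & ~(bit - 1u); /* this stage and every higher bit */
        }
    }
    return scope;
}

/* Access scope: only the stages named explicitly, and only those that can
   actually perform the access. Nothing is added automatically. */
uint32_t src_write_access_scope(uint32_t src_stages) {
    return src_stages & S_WRITERS;
}
```

In this model, `src_execution_scope(S_BOTTOM)` covers every stage, just like naming all stages would, yet `src_write_access_scope(S_BOTTOM)` is empty because the toy bottom stage writes nothing. That empty access scope is what the spec's "with VkAccessFlags set to 0" wording appears to be describing.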

Are draw calls sequenced in command buffers? [duplicate]

Suppose I have two VkPipelines and within a VkCommandBuffer I record...
vkCmdBeginRenderPass(cmd, /*...*/);
vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline1);
vkCmdDraw(cmd, /*...*/); // [1]
vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline2);
vkCmdDraw(cmd, /*...*/); // [2]
vkCmdEndRenderPass(cmd);
When the command buffer is queued and executed, will it be as if the rendering operations of [1] to the framebuffer attachments are fully realized before [2] starts executing?
I.e., will [2] draw over [1]?
Most stages in Vulkan execute in an arbitrary order relative to each other. However, rasterization order is respected with regard to framebuffer attachment processes within a subpass (between subpasses, you have to use subpass dependencies, and outside of the renderpass, you'll need either external subpass dependencies or a barrier). Each primitive is ordered relative to each other primitive, and the implementation must respect rasterization order when doing reordering.
The stages that follow rasterization order atomically include depth/stencil test, blending, write masking, and the like, but they do not include the fragment shader itself. That is, FS outputs have to go through rasterization order, but FS side effects (ie: writes via image store or SSBOs) do not.
There's a set of rules defined in 24.2. Rasterization Order regarding primitive drawing in a single subpass. According to these rules, blending operations and color writes of the second primitive should happen strictly after blending operations and color writes of the first primitive.
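To see why the ordering of blending matters, here is a small CPU-side sketch in C. It is not Vulkan code: `blend_over` and `draw_two` are hypothetical helpers that model a single 8-bit color channel of one pixel with an integer "source over" blend.

```c
#include <stdint.h>

/* Integer "source over" blend of one 8-bit channel; alpha 255 is fully opaque. */
uint8_t blend_over(uint8_t dst, uint8_t src, uint8_t alpha) {
    return (uint8_t)((src * alpha + dst * (255 - alpha)) / 255);
}

/* Apply two draws to the same pixel, strictly in the order given,
   which is how blending behaves under rasterization order. */
uint8_t draw_two(uint8_t framebuffer,
                 uint8_t color1, uint8_t alpha1,
                 uint8_t color2, uint8_t alpha2) {
    framebuffer = blend_over(framebuffer, color1, alpha1);
    return blend_over(framebuffer, color2, alpha2);
}
```

Starting from a black pixel, an opaque white draw followed by a roughly half-transparent black draw gives `draw_two(0, 255, 255, 0, 128) == 127`, while the reverse order gives `draw_two(0, 0, 128, 255, 255) == 255`. The results differ, so the implementation is not free to reorder these attachment operations, and [2] does end up blended over [1].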