mirror of https://github.com/Qortal/Brooklyn
.. _context: |
|
|
|
Context |
|
======= |
|
|
|
A Gallium rendering context encapsulates the state which affects 3D
rendering such as blend state, depth/stencil state, texture samplers,
etc.
|
|
|
Note that resource/texture allocation is not per-context but per-screen. |
|
|
|
|
|
Methods |
|
------- |
|
|
|
CSO State |
|
^^^^^^^^^ |
|
|
|
All Constant State Object (CSO) state is created, bound, and destroyed, |
|
with triplets of methods that all follow a specific naming scheme. |
|
For example, ``create_blend_state``, ``bind_blend_state``, and |
|
``destroy_blend_state``. |
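
As a rough illustration of the create/bind/destroy triplet, the sketch below
uses a mock context rather than the real Gallium interface. ``mock_context``,
``mock_blend_template`` and ``mock_context_init`` are hypothetical names, and
the rule that a bound object is not destroyed is an assumption of this sketch;
only the three method names follow the naming scheme described above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical template; a real blend state has many more fields. */
struct mock_blend_template {
   int blend_enable;
};

/* Hypothetical context exposing one CSO triplet as function pointers. */
struct mock_context {
   void *bound_blend;
   void *(*create_blend_state)(struct mock_context *ctx,
                               const struct mock_blend_template *templ);
   void (*bind_blend_state)(struct mock_context *ctx, void *cso);
   void (*destroy_blend_state)(struct mock_context *ctx, void *cso);
};

static void *mock_create_blend(struct mock_context *ctx,
                               const struct mock_blend_template *templ)
{
   struct mock_blend_template *cso = malloc(sizeof(*cso));

   (void)ctx;
   /* The mock "driver" copies the template into its own object. */
   if (cso)
      *cso = *templ;
   return cso;
}

static void mock_bind_blend(struct mock_context *ctx, void *cso)
{
   /* Binding only swaps a pointer; NULL unbinds. */
   ctx->bound_blend = cso;
}

static void mock_destroy_blend(struct mock_context *ctx, void *cso)
{
   /* Assumption of this sketch: callers unbind before destroying. */
   assert(ctx->bound_blend != cso);
   free(cso);
}

void mock_context_init(struct mock_context *ctx)
{
   ctx->bound_blend = NULL;
   ctx->create_blend_state = mock_create_blend;
   ctx->bind_blend_state = mock_bind_blend;
   ctx->destroy_blend_state = mock_destroy_blend;
}
```

A caller would create the object once, bind it as often as needed, bind NULL
(or another object) before destroying it, and then destroy it.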
|
|
|
CSO objects handled by the context object: |
|
|
|
* :ref:`Blend`: ``*_blend_state`` |
|
* :ref:`Sampler`: Texture sampler states are bound separately for fragment, |
|
vertex, geometry and compute shaders with the ``bind_sampler_states`` |
|
function. The ``start`` and ``num_samplers`` parameters indicate a range |
|
of samplers to change. NOTE: at this time, start is always zero and |
|
the CSO module will always replace all samplers at once (no sub-ranges). |
|
This may change in the future. |
|
* :ref:`Rasterizer`: ``*_rasterizer_state`` |
|
* :ref:`depth-stencil-alpha`: ``*_depth_stencil_alpha_state`` |
|
* :ref:`Shader`: These are create, bind and destroy methods for vertex, |
|
fragment and geometry shaders. |
|
* :ref:`vertexelements`: ``*_vertex_elements_state`` |
|
|
|
|
|
Resource Binding State |
|
^^^^^^^^^^^^^^^^^^^^^^ |
|
|
|
This state describes how resources in various flavors (textures, |
|
buffers, surfaces) are bound to the driver. |
|
|
|
|
|
* ``set_constant_buffer`` sets a constant buffer to be used for a given shader |
|
type. index is used to indicate which buffer to set (some APIs may allow |
|
multiple ones to be set, and binding a specific one later, though drivers |
|
are mostly restricted to the first one right now). |
|
If take_ownership is true, the buffer reference is passed to the driver, so |
|
that the driver doesn't have to increment the reference count. |
|
|
|
* ``set_inlinable_constants`` sets inlinable constants for constant buffer 0. |
|
|
|
These are constants that the driver would like to inline in the IR |
|
of the current shader and recompile it. Drivers can determine which |
|
constants they prefer to inline in finalize_nir and store that |
|
information in shader_info::*inlinable_uniform*. When the state tracker |
|
or frontend uploads constants to a constant buffer, it can pass |
|
inlinable constants separately via this call. |
|
|
|
Any ``set_constant_buffer`` call invalidates inlinable constants, so |
|
``set_inlinable_constants`` must be called after it. Binding a shader also |
|
invalidates this state. |
|
|
|
There is no ``PIPE_CAP`` for this. Drivers shouldn't set the shader_info |
|
fields if they don't implement ``set_inlinable_constants``. |
|
|
|
* ``set_framebuffer_state`` |
|
|
|
* ``set_vertex_buffers`` |
|
|
|
|
|
Non-CSO State |
|
^^^^^^^^^^^^^ |
|
|
|
These pieces of state are too small, variable, and/or trivial to have CSO |
|
objects. They all follow simple, one-method binding calls, e.g. |
|
``set_blend_color``. |
|
|
|
* ``set_stencil_ref`` sets the stencil front and back reference values |
|
which are used as comparison values in stencil test. |
|
* ``set_blend_color`` |
|
* ``set_sample_mask`` sets the per-context multisample sample mask. Note |
|
that this takes effect even if multisampling is not explicitly enabled if |
|
the framebuffer surface(s) are multisampled. Also, this mask is AND-ed |
|
with the optional fragment shader sample mask output (when emitted). |
|
* ``set_sample_locations`` sets the sample locations used for rasterization.
  ``get_sample_position`` still returns the default locations. When NULL,
  the default locations are used.
|
* ``set_min_samples`` sets the minimum number of samples that must be run. |
|
* ``set_clip_state`` |
|
* ``set_polygon_stipple`` |
|
* ``set_scissor_states`` sets the bounds for the scissor test, which culls |
|
pixels before blending to render targets. If the :ref:`Rasterizer` does |
|
not have the scissor test enabled, then the scissor bounds never need to |
|
be set since they will not be used. Note that scissor xmin and ymin are |
|
inclusive, but xmax and ymax are exclusive. The inclusive ranges in x |
|
and y would be [xmin..xmax-1] and [ymin..ymax-1]. The number of scissors |
|
should be the same as the number of set viewports and can be up to |
|
PIPE_MAX_VIEWPORTS. |
|
* ``set_viewport_states`` |
|
* ``set_window_rectangles`` sets the window rectangles to be used for |
|
rendering, as defined by GL_EXT_window_rectangles. There are two |
|
modes - include and exclude, which define whether the supplied |
|
rectangles are to be used for including fragments or excluding |
|
them. All of the rectangles are ORed together, so in exclude mode, |
|
any fragment inside any rectangle would be culled, while in include |
|
mode, any fragment outside all rectangles would be culled. xmin/ymin |
|
are inclusive, while xmax/ymax are exclusive (same as scissor states |
|
above). Note that this only applies to draws, not clears or |
|
blits. (Blits have their own way to pass the requisite rectangles |
|
in.) |
|
* ``set_tess_state`` configures the default tessellation parameters: |
|
|
|
* ``default_outer_level`` is the default value for the outer tessellation |
|
levels. This corresponds to GL's ``PATCH_DEFAULT_OUTER_LEVEL``. |
|
* ``default_inner_level`` is the default value for the inner tessellation |
|
levels. This corresponds to GL's ``PATCH_DEFAULT_INNER_LEVEL``. |
|
* ``set_patch_vertices`` sets the number of vertices per input patch |
|
for tessellation. |
|
|
|
* ``set_debug_callback`` sets the callback to be used for reporting |
|
various debug messages, eventually reported via KHR_debug and |
|
similar mechanisms. |
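
The bounds conventions for scissors and window rectangles described in the
list above can be sketched in plain C. This is only a model: ``struct rect``,
``rect_contains`` and ``window_rects_pass`` are hypothetical names, not Gallium
types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rectangle: min is inclusive, max is exclusive. */
struct rect {
   int xmin, ymin;
   int xmax, ymax;
};

/* True if (x, y) lies in [xmin..xmax-1] x [ymin..ymax-1]. */
bool rect_contains(const struct rect *r, int x, int y)
{
   return x >= r->xmin && x < r->xmax &&
          y >= r->ymin && y < r->ymax;
}

/*
 * True if a fragment at (x, y) survives the window rectangle test.
 * The rectangles are ORed together: include mode keeps fragments inside
 * any rectangle, exclude mode keeps fragments outside all of them.
 */
bool window_rects_pass(const struct rect *rects, size_t count,
                       bool include, int x, int y)
{
   bool inside_any = false;

   assert(rects != NULL || count == 0);
   for (size_t i = 0; i < count; i++) {
      if (rect_contains(&rects[i], x, y)) {
         inside_any = true;
         break;
      }
   }
   return include ? inside_any : !inside_any;
}
```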
|
|
|
Samplers |
|
^^^^^^^^ |
|
|
|
pipe_sampler_state objects control how textures are sampled |
|
(coordinate wrap modes, interpolation modes, etc). Note that unless |
|
``PIPE_CAP_TEXTURE_BUFFER_SAMPLER`` is enabled, samplers are not used for |
|
texture buffer objects. That is, pipe_context::bind_sampler_views() |
|
will not bind a sampler if the corresponding sampler view refers to a |
|
PIPE_BUFFER resource. |
|
|
|
Sampler Views |
|
^^^^^^^^^^^^^ |
|
|
|
These are the means to bind textures to shader stages. To create one, specify
its format, swizzle and LOD range in a sampler view template.

If the texture format differs from the template format, the texture is said
to be cast to another format. Casting is only possible between compatible
formats, that is, formats that have matching component order and sizes.

Swizzle fields specify the way in which fetched texel components are placed
in the result register. For example, ``swizzle_r`` specifies what is going to be
placed in the first component of the result register.
|
|
|
The ``first_level`` and ``last_level`` fields of sampler view template specify |
|
the LOD range the texture is going to be constrained to. Note that these |
|
values are in addition to the respective min_lod, max_lod values in the |
|
pipe_sampler_state (that is if min_lod is 2.0, and first_level 3, the first mip |
|
level used for sampling from the resource is effectively the fifth). |
|
|
|
The ``first_layer`` and ``last_layer`` fields specify the layer range the |
|
texture is going to be constrained to. Similar to the LOD range, this is added |
|
to the array index which is used for sampling. |
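
A minimal sketch of how the level and layer ranges offset what the shader
asks for is given below. ``struct view_range`` and both functions are
hypothetical; truncating a fractional LOD and clamping to the last level or
layer are assumptions of this sketch, not behavior stated above.

```c
#include <assert.h>

/* Hypothetical subset of a sampler view template. */
struct view_range {
   unsigned first_level, last_level;
   unsigned first_layer, last_layer;
};

/* Resource mip level selected for a given sampler LOD (e.g. min_lod). */
unsigned view_resource_level(const struct view_range *v, float lod)
{
   unsigned level;

   assert(v->first_level <= v->last_level);
   if (lod < 0.0f)
      lod = 0.0f;
   /* The view's first_level is added on top of the sampler LOD. */
   level = v->first_level + (unsigned)lod;
   return level > v->last_level ? v->last_level : level;
}

/* Resource layer selected for a given array index used for sampling. */
unsigned view_resource_layer(const struct view_range *v, unsigned array_index)
{
   unsigned layer;

   assert(v->first_layer <= v->last_layer);
   layer = v->first_layer + array_index;
   return layer > v->last_layer ? v->last_layer : layer;
}
```

With ``first_level`` 3 and a ``min_lod`` of 2.0, ``view_resource_level``
yields 3 + 2 = 5, matching the example in the text.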
|
|
|
* ``set_sampler_views`` binds an array of sampler views to a shader stage. |
|
Every binding point acquires a reference |
|
to a respective sampler view and releases a reference to the previous |
|
sampler view. |
|
|
|
Sampler views outside of ``[start_slot, start_slot + num_views)`` are |
|
unmodified. If ``views`` is NULL, the behavior is the same as if |
|
``views[n]`` was NULL for the entire range, i.e. releasing the reference |
|
for all the sampler views in the specified range. |
|
|
|
* ``create_sampler_view`` creates a new sampler view. ``texture`` is associated |
|
with the sampler view which results in sampler view holding a reference |
|
to the texture. Format specified in template must be compatible |
|
with texture format. |
|
|
|
* ``sampler_view_destroy`` destroys a sampler view and releases its reference |
|
to associated texture. |
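
The reference handling of ``set_sampler_views`` can be modeled as follows.
This is a sketch with hypothetical names (``struct view``, ``struct stage``,
``bind_views``, ``MAX_VIEWS``) and a plain integer reference count; it is not
the driver implementation.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VIEWS 8

/* Hypothetical reference-counted sampler view. */
struct view {
   int refcount;
};

/* Hypothetical per-shader-stage binding points. */
struct stage {
   struct view *slots[MAX_VIEWS];
};

/*
 * Bind views[0..num_views) to slots [start_slot, start_slot + num_views).
 * Slots outside that range are left untouched. A NULL views array behaves
 * like an array of NULL entries, i.e. it unbinds the whole range.
 */
void bind_views(struct stage *s, unsigned start_slot, unsigned num_views,
                struct view **views)
{
   assert(start_slot + num_views <= MAX_VIEWS);
   for (unsigned i = 0; i < num_views; i++) {
      struct view *new_view = views ? views[i] : NULL;
      struct view *old_view = s->slots[start_slot + i];

      /* Acquire before release so rebinding the same view is safe. */
      if (new_view)
         new_view->refcount++;
      if (old_view)
         old_view->refcount--;
      s->slots[start_slot + i] = new_view;
   }
}
```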
|
|
|
Hardware Atomic buffers |
|
^^^^^^^^^^^^^^^^^^^^^^^ |
|
|
|
Buffers containing hw atomics are required to support the feature |
|
on some drivers. |
|
|
|
Drivers that require this need to fill the ``set_hw_atomic_buffers`` method. |
|
|
|
Shader Resources |
|
^^^^^^^^^^^^^^^^ |
|
|
|
Shader resources are textures or buffers that may be read or written |
|
from a shader without an associated sampler. This means that they |
|
have no support for floating point coordinates, address wrap modes or |
|
filtering. |
|
|
|
There are 2 types of shader resources: buffers and images. |
|
|
|
Buffers are specified using the ``set_shader_buffers`` method. |
|
|
|
Images are specified using the ``set_shader_images`` method. When binding |
|
images, the ``level``, ``first_layer`` and ``last_layer`` pipe_image_view |
|
fields specify the mipmap level and the range of layers the image will be |
|
constrained to. |
|
|
|
Surfaces |
|
^^^^^^^^ |
|
|
|
These are the means to use resources as color render targets or depth/stencil
attachments. To create one, specify the mip level, the range of layers, and
the bind flags (either PIPE_BIND_DEPTH_STENCIL or PIPE_BIND_RENDER_TARGET).
Note that layer values are in addition to what is indicated by the geometry
shader output variable XXX_FIXME (that is, if first_layer is 3 and the geometry
shader indicates index 2, the 5th layer of the resource will be used). These
first_layer and last_layer parameters will only be used for 1d array, 2d array,
cube, and 3d textures; otherwise they are 0.
|
|
|
* ``create_surface`` creates a new surface. |
|
|
|
* ``surface_destroy`` destroys a surface and releases its reference to the |
|
associated resource. |
|
|
|
Stream output targets |
|
^^^^^^^^^^^^^^^^^^^^^ |
|
|
|
Stream output, also known as transform feedback, allows writing the primitives |
|
produced by the vertex pipeline to buffers. This is done after the geometry |
|
shader or vertex shader if no geometry shader is present. |
|
|
|
The stream output targets are views into buffer resources which can be bound |
|
as stream outputs and specify a memory range where it's valid to write |
|
primitives. The pipe driver must implement memory protection such that any |
|
primitives written outside of the specified memory range are discarded. |
|
|
|
Two stream output targets can use the same resource at the same time, but |
|
with a disjoint memory range. |
|
|
|
Additionally, the stream output target internally maintains the offset |
|
into the buffer which is incremented every time something is written to it. |
|
The internal offset is equal to how much data has already been written. |
|
It can be stored in device memory and the CPU actually doesn't have to query |
|
it. |
|
|
|
The stream output target can be used in a draw command to provide |
|
the vertex count. The vertex count is derived from the internal offset |
|
discussed above. |
|
|
|
* ``create_stream_output_target`` creates a new target.
|
|
|
* ``stream_output_target_destroy`` destroys a target. Users of this should |
|
use pipe_so_target_reference instead. |
|
|
|
* ``set_stream_output_targets`` binds stream output targets. The parameter |
|
offset is an array which specifies the internal offset of the buffer. The |
|
internal offset is, besides writing, used for reading the data during the |
|
draw_auto stage, i.e. it specifies how much data there is in the buffer |
|
for the purposes of the draw_auto stage. -1 means the buffer should |
|
be appended to, and everything else sets the internal offset. |
|
|
|
* ``stream_output_target_offset`` retrieves the internal stream offset from
  a streamout target. This is used to implement Vulkan pause/resume support,
  which needs to pass the internal offset to the API.
|
|
|
NOTE: The currently-bound vertex or geometry shader must be compiled with |
|
the properly-filled-in structure pipe_stream_output_info describing which |
|
outputs should be written to buffers and how. The structure is part of |
|
pipe_shader_state. |
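
The internal offset bookkeeping described in this section might be modeled
like this. All names (``struct so_target``, ``so_bind``, ``so_write``,
``so_draw_auto_count``) are hypothetical, and deriving the vertex count by
dividing the offset by a vertex stride is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stream output target covering buffer_size bytes. */
struct so_target {
   unsigned buffer_size;
   unsigned internal_offset;   /* bytes already written */
};

/* Binding with offset -1 appends; any other value sets the offset. */
void so_bind(struct so_target *t, unsigned offset)
{
   if (offset != (unsigned)-1)
      t->internal_offset = offset;
}

/*
 * Models memory protection: a write that would leave the target's range
 * is discarded and the internal offset does not advance.
 */
bool so_write(struct so_target *t, unsigned bytes)
{
   if (t->internal_offset > t->buffer_size ||
       bytes > t->buffer_size - t->internal_offset)
      return false;
   t->internal_offset += bytes;
   return true;
}

/* Vertex count a draw could derive from the data written so far. */
unsigned so_draw_auto_count(const struct so_target *t, unsigned vertex_stride)
{
   assert(vertex_stride > 0);
   return t->internal_offset / vertex_stride;
}
```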
|
|
|
Clearing |
|
^^^^^^^^ |
|
|
|
Clear is one of the most difficult concepts to nail down to a single |
|
interface (due to both different requirements from APIs and also driver/hw |
|
specific differences). |
|
|
|
``clear`` initializes some or all of the surfaces currently bound to |
|
the framebuffer to particular RGBA, depth, or stencil values. |
|
Currently, this does not take into account color or stencil write masks (as |
|
used by GL), and always clears the whole surfaces (no scissoring as used by |
|
GL clear or explicit rectangles like d3d9 uses). It can, however, also clear |
|
only depth or stencil in a combined depth/stencil surface. |
|
If a surface includes several layers then all layers will be cleared. |
|
|
|
``clear_render_target`` clears a single color rendertarget with the specified |
|
color value. While it is only possible to clear one surface at a time (which can |
|
include several layers), this surface need not be bound to the framebuffer. |
|
If render_condition_enabled is false, any current rendering condition is ignored |
|
and the clear will be unconditional. |
|
|
|
``clear_depth_stencil`` clears a single depth, stencil or depth/stencil surface |
|
with the specified depth and stencil values (for combined depth/stencil buffers, |
|
it is also possible to only clear one or the other part). While it is only |
|
possible to clear one surface at a time (which can include several layers), |
|
this surface need not be bound to the framebuffer. |
|
If render_condition_enabled is false, any current rendering condition is ignored |
|
and the clear will be unconditional. |
|
|
|
``clear_texture`` clears a non-PIPE_BUFFER resource's specified level |
|
and bounding box with a clear value provided in that resource's native |
|
format. |
|
|
|
``clear_buffer`` clears a PIPE_BUFFER resource with the specified clear value |
|
(which may be multiple bytes in length). Logically this is a memset with a |
|
multi-byte element value starting at offset bytes from resource start, going |
|
for size bytes. It is guaranteed that size % clear_value_size == 0. |
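
The "memset with a multi-byte element" description of ``clear_buffer`` can be
written out on the CPU as a sketch. ``clear_buffer_model`` is a hypothetical
helper operating on ordinary memory, not the Gallium entry point.

```c
#include <assert.h>
#include <string.h>

/*
 * Fill size bytes of buf, starting at offset, with repeated copies of a
 * clear value that is clear_value_size bytes long.
 */
void clear_buffer_model(unsigned char *buf, size_t offset, size_t size,
                        const void *clear_value, size_t clear_value_size)
{
   assert(clear_value_size > 0);
   /* Guaranteed by the interface: the range holds whole elements only. */
   assert(size % clear_value_size == 0);

   for (size_t i = 0; i < size; i += clear_value_size)
      memcpy(buf + offset + i, clear_value, clear_value_size);
}
```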
|
|
|
Evaluating Depth Buffers |
|
^^^^^^^^^^^^^^^^^^^^^^^^ |
|
|
|
``evaluate_depth_buffer`` is a hint to decompress the current depth buffer |
|
assuming the current sample locations to avoid problems that could arise when |
|
using programmable sample locations. |
|
|
|
If a depth buffer is rendered with different sample location state than
what is current at the time of reading the depth buffer, the values may differ
because depth buffer compression can depend on the sample locations.
|
|
|
|
|
Uploading |
|
^^^^^^^^^ |
|
|
|
For simple single-use uploads, use ``pipe_context::stream_uploader`` or |
|
``pipe_context::const_uploader``. The latter should be used for uploading |
|
constants, while the former should be used for uploading everything else. |
|
PIPE_USAGE_STREAM is implied in both cases, so don't use the uploaders |
|
for static allocations. |
|
|
|
Usage: |
|
|
|
Call u_upload_alloc or u_upload_data as many times as you want. After you are |
|
done, call u_upload_unmap. If the driver doesn't support persistent mappings, |
|
u_upload_unmap makes sure the previously mapped memory is unmapped. |
|
|
|
Gotchas: |
|
- Always fill the memory immediately after u_upload_alloc. Any following call |
|
to u_upload_alloc and u_upload_data can unmap memory returned by previous |
|
u_upload_alloc. |
|
- Don't interleave calls using stream_uploader and const_uploader. If you use |
|
one of them, do the upload, unmap, and only then can you use the other one. |
|
|
|
|
|
Drawing |
|
^^^^^^^ |
|
|
|
``draw_vbo`` draws a specified primitive. The primitive mode and other |
|
properties are described by ``pipe_draw_info``. |
|
|
|
The ``mode``, ``start``, and ``count`` fields of ``pipe_draw_info`` specify
the mode of the primitive and the vertices to be fetched, in the range from
``start`` to ``start + count - 1``, inclusive.

Every instance with instanceID in the range from ``start_instance`` to
``start_instance + instance_count - 1``, inclusive, will be drawn.
|
|
|
If ``index_size`` != 0, all vertex indices will be looked up from the index |
|
buffer. |
|
|
|
In an indexed draw, ``min_index`` and ``max_index`` respectively provide a lower
and upper bound of the indices contained in the index buffer inside the range
from ``start`` to ``start + count - 1``. This allows the driver to
determine which subset of vertices will be referenced during the draw call
without having to scan the index buffer. Providing an overestimation of
the true bounds, for example, a ``min_index`` and ``max_index`` of 0 and
0xffffffff respectively, must give exactly the same rendering, albeit with less
performance due to unreferenced vertex buffers being unnecessarily DMA'ed or
processed. Providing an underestimation of the true bounds will result in
undefined behavior, but should not result in program or system failure.
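
For a frontend that has the index data in CPU memory, exact bounds could be
computed with a scan like the one below. This is a sketch assuming 32-bit
indices; ``index_bounds`` is a hypothetical helper, not part of the interface.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute the smallest and largest index referenced by a draw that reads
 * indices[start .. start + count - 1].
 */
void index_bounds(const uint32_t *indices, unsigned start, unsigned count,
                  uint32_t *min_index, uint32_t *max_index)
{
   assert(count > 0);

   *min_index = UINT32_MAX;
   *max_index = 0;
   for (unsigned i = start; i < start + count; i++) {
      if (indices[i] < *min_index)
         *min_index = indices[i];
      if (indices[i] > *max_index)
         *max_index = indices[i];
   }
}
```

Passing wider bounds than the scan returns is still valid, only potentially
slower; passing narrower ones is what the text calls undefined behavior.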
|
|
|
In the case of a non-indexed draw, ``min_index`` should be set to
``start`` and ``max_index`` should be set to ``start + count - 1``.
|
|
|
``index_bias`` is a value added to every vertex index after lookup and before |
|
fetching vertex attributes. |
|
|
|
When drawing indexed primitives, the primitive restart index can be |
|
used to draw disjoint primitive strips. For example, several separate |
|
line strips can be drawn by designating a special index value as the |
|
restart index. The ``primitive_restart`` flag enables/disables this |
|
feature. The ``restart_index`` field specifies the restart index value. |
|
|
|
When primitive restart is in use, array indexes are compared to the |
|
restart index before adding the index_bias offset. |
|
|
|
If a given vertex element has ``instance_divisor`` set to 0, it is said |
|
it contains per-vertex data and effective vertex attribute address needs |
|
to be recalculated for every index. |
|
|
|
attribAddr = ``stride`` * index + ``src_offset`` |
|
|
|
If a given vertex element has ``instance_divisor`` set to a non-zero value,
it is said to contain per-instance data and the effective vertex attribute
address needs to be recalculated for every ``instance_divisor``-th instance.
|
|
|
attribAddr = ``stride`` * instanceID / ``instance_divisor`` + ``src_offset`` |
|
|
|
In the above formulas, ``src_offset`` is taken from the given vertex element |
|
and ``stride`` is taken from a vertex buffer associated with the given |
|
vertex element. |
|
|
|
The calculated attribAddr is used as an offset into the vertex buffer to |
|
fetch the attribute data. |
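
Both address formulas can be combined into one helper, sketched below.
``attrib_addr`` is a hypothetical function, and reading the per-instance
formula as ``stride * (instanceID / instance_divisor)`` with integer division
is an interpretation based on the "every ``instance_divisor``-th instance"
wording.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Byte offset into the vertex buffer for one attribute fetch.
 * instance_divisor == 0 selects per-vertex addressing; any other value
 * selects per-instance addressing.
 */
size_t attrib_addr(unsigned stride, unsigned src_offset,
                   unsigned instance_divisor,
                   unsigned index, unsigned instance_id)
{
   if (instance_divisor == 0)
      return (size_t)stride * index + src_offset;

   /* Integer division: the address only advances every Nth instance. */
   return (size_t)stride * (instance_id / instance_divisor) + src_offset;
}
```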
|
|
|
The value of ``instanceID`` can be read in a vertex shader through a system |
|
value register declared with INSTANCEID semantic name. |
|
|
|
|
|
Queries |
|
^^^^^^^ |
|
|
|
Queries gather some statistic from the 3D pipeline over one or more |
|
draws. Queries may be nested, though not all gallium frontends exercise this. |
|
|
|
Queries can be created with ``create_query`` and deleted with |
|
``destroy_query``. To start a query, use ``begin_query``, and when finished, |
|
use ``end_query`` to end the query. |
|
|
|
``create_query`` takes a query type (``PIPE_QUERY_*``), as well as an index, |
|
which is the vertex stream for ``PIPE_QUERY_PRIMITIVES_GENERATED`` and |
|
``PIPE_QUERY_PRIMITIVES_EMITTED``, and allocates a query structure. |
|
|
|
``begin_query`` will clear/reset previous query results. |
|
|
|
``get_query_result`` is used to retrieve the results of a query. If |
|
the ``wait`` parameter is TRUE, then the ``get_query_result`` call |
|
will block until the results of the query are ready (and TRUE will be |
|
returned). Otherwise, if the ``wait`` parameter is FALSE, the call |
|
will not block and the return value will be TRUE if the query has |
|
completed or FALSE otherwise. |
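
The query lifecycle and the ``wait`` semantics can be modeled on the CPU as
follows. Everything here is hypothetical (``struct query_model`` and the
``model_*`` functions); in particular, blocking is modeled by simply marking
the result as ready.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical query object. */
struct query_model {
   uint64_t value;   /* statistic gathered so far */
   bool active;      /* between begin and end */
   bool ready;       /* result is available to the CPU */
};

/* Starting a query clears any previous result. */
void model_begin_query(struct query_model *q)
{
   q->value = 0;
   q->active = true;
   q->ready = false;
}

/* Stand-in for the pipeline producing n countable events. */
void model_record(struct query_model *q, uint64_t n)
{
   if (q->active)
      q->value += n;
}

void model_end_query(struct query_model *q)
{
   assert(q->active);
   q->active = false;
}

/* Returns true and writes *result once the result is available. */
bool model_get_query_result(struct query_model *q, bool wait,
                            uint64_t *result)
{
   if (!q->ready) {
      if (!wait)
         return false;     /* non-blocking call: not ready yet */
      q->ready = true;     /* blocking call: modeled as completion */
   }
   *result = q->value;
   return true;
}
```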
|
|
|
``get_query_result_resource`` is used to store the result of a query into |
|
a resource without synchronizing with the CPU. This write will optionally |
|
wait for the query to complete, and will optionally write whether the value |
|
is available instead of the value itself. |
|
|
|
``set_active_query_state`` sets whether all current non-driver queries except
TIME_ELAPSED are active or paused.
|
|
|
The interface currently includes the following types of queries: |
|
|
|
``PIPE_QUERY_OCCLUSION_COUNTER`` counts the number of fragments which |
|
are written to the framebuffer without being culled by |
|
:ref:`depth-stencil-alpha` testing or shader KILL instructions. |
|
The result is an unsigned 64-bit integer. |
|
This query can be used with ``render_condition``. |
|
|
|
In cases where a boolean result of an occlusion query is enough, |
|
``PIPE_QUERY_OCCLUSION_PREDICATE`` should be used. It is just like |
|
``PIPE_QUERY_OCCLUSION_COUNTER`` except that the result is a boolean |
|
value of FALSE for cases where COUNTER would result in 0 and TRUE |
|
for all other cases. |
|
This query can be used with ``render_condition``. |
|
|
|
In cases where a conservative approximation of an occlusion query is enough, |
|
``PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE`` should be used. It behaves |
|
like ``PIPE_QUERY_OCCLUSION_PREDICATE``, except that it may return TRUE in |
|
additional, implementation-dependent cases. |
|
This query can be used with ``render_condition``. |
|
|
|
``PIPE_QUERY_TIME_ELAPSED`` returns the amount of time, in nanoseconds, |
|
the context takes to perform operations. |
|
The result is an unsigned 64-bit integer. |
|
|
|
``PIPE_QUERY_TIMESTAMP`` returns a device/driver internal timestamp, |
|
scaled to nanoseconds, recorded after all commands issued prior to |
|
``end_query`` have been processed. |
|
This query does not require a call to ``begin_query``. |
|
The result is an unsigned 64-bit integer. |
|
|
|
``PIPE_QUERY_TIMESTAMP_DISJOINT`` can be used to check the |
|
internal timer resolution and whether the timestamp counter has become |
|
unreliable due to things like throttling etc. - only if this is FALSE |
|
a timestamp query (within the timestamp_disjoint query) should be trusted. |
|
The result is a 64-bit integer specifying the timer resolution in Hz, |
|
followed by a boolean value indicating whether the timestamp counter |
|
is discontinuous or disjoint. |
|
|
|
``PIPE_QUERY_PRIMITIVES_GENERATED`` returns a 64-bit integer indicating |
|
the number of primitives processed by the pipeline (regardless of whether |
|
stream output is active or not). |
|
|
|
``PIPE_QUERY_PRIMITIVES_EMITTED`` returns a 64-bit integer indicating |
|
the number of primitives written to stream output buffers. |
|
|
|
``PIPE_QUERY_SO_STATISTICS`` returns 2 64-bit integers corresponding to |
|
the result of |
|
``PIPE_QUERY_PRIMITIVES_EMITTED`` and |
|
the number of primitives that would have been written to stream output buffers |
|
if they had infinite space available (primitives_storage_needed), in this order. |
|
XXX the 2nd value is equivalent to ``PIPE_QUERY_PRIMITIVES_GENERATED`` but it is |
|
unclear if it should be increased if stream output is not active. |
|
|
|
``PIPE_QUERY_SO_OVERFLOW_PREDICATE`` returns a boolean value indicating |
|
whether a selected stream output target has overflowed as a result of the |
|
commands issued between ``begin_query`` and ``end_query``. |
|
This query can be used with ``render_condition``. The output stream is |
|
selected by the stream number passed to ``create_query``. |
|
|
|
``PIPE_QUERY_SO_OVERFLOW_ANY_PREDICATE`` returns a boolean value indicating |
|
whether any stream output target has overflowed as a result of the commands |
|
issued between ``begin_query`` and ``end_query``. This query can be used |
|
with ``render_condition``, and its result is the logical OR of multiple |
|
``PIPE_QUERY_SO_OVERFLOW_PREDICATE`` queries, one for each stream output |
|
target. |
|
|
|
``PIPE_QUERY_GPU_FINISHED`` returns a boolean value indicating whether |
|
all commands issued before ``end_query`` have completed. However, this |
|
does not imply serialization. |
|
This query does not require a call to ``begin_query``. |
|
|
|
``PIPE_QUERY_PIPELINE_STATISTICS`` returns an array of the following |
|
64-bit integers: |
|
Number of vertices read from vertex buffers. |
|
Number of primitives read from vertex buffers. |
|
Number of vertex shader threads launched. |
|
Number of geometry shader threads launched. |
|
Number of primitives generated by geometry shaders. |
|
Number of primitives forwarded to the rasterizer. |
|
Number of primitives rasterized. |
|
Number of fragment shader threads launched. |
|
Number of tessellation control shader threads launched. |
|
Number of tessellation evaluation shader threads launched. |
|
If a shader type is not supported by the device/driver, |
|
the corresponding values should be set to 0. |
|
|
|
``PIPE_QUERY_PIPELINE_STATISTICS_SINGLE`` returns a single counter from |
|
the ``PIPE_QUERY_PIPELINE_STATISTICS`` group. The specific counter must |
|
be selected when calling ``create_query`` by passing one of the |
|
``PIPE_STAT_QUERY`` enums as the query's ``index``. |
|
|
|
Gallium does not guarantee the availability of any query types; one must |
|
always check the capabilities of the :ref:`Screen` first. |
|
|
|
|
|
Conditional Rendering |
|
^^^^^^^^^^^^^^^^^^^^^ |
|
|
|
A drawing command can be skipped depending on the outcome of a query |
|
(typically an occlusion query, or streamout overflow predicate). |
|
The ``render_condition`` function specifies the query which should be checked |
|
prior to rendering anything. Functions always honoring render_condition include |
|
(and are limited to) draw_vbo and clear. |
|
The blit, clear_render_target and clear_depth_stencil functions (but |
|
not resource_copy_region, which seems inconsistent) can also optionally honor |
|
the current render condition. |
|
|
|
If ``render_condition`` is called with ``query`` = NULL, conditional |
|
rendering is disabled and drawing takes place normally. |
|
|
|
If ``render_condition`` is called with a non-null ``query``, subsequent
drawing commands will be predicated on the outcome of the query.
Commands will be skipped if ``condition`` is equal to the predicate result
(for non-boolean queries such as ``PIPE_QUERY_OCCLUSION_COUNTER``, zero counts
as FALSE, non-zero as TRUE).
|
|
|
If ``mode`` is PIPE_RENDER_COND_WAIT the driver will wait for the |
|
query to complete before deciding whether to render. |
|
|
|
If ``mode`` is PIPE_RENDER_COND_NO_WAIT and the query has not yet |
|
completed, the drawing command will be executed normally. If the query |
|
has completed, drawing will be predicated on the outcome of the query. |
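
The decision a driver makes for the two non-region modes might be summarized
by the following sketch. ``struct cond_query``, ``enum cond_mode`` and
``should_draw`` are hypothetical, and waiting is modeled by requiring that the
caller supplies a completed query in wait mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the WAIT and NO_WAIT modes. */
enum cond_mode {
   COND_WAIT,
   COND_NO_WAIT
};

/* Hypothetical query snapshot. */
struct cond_query {
   bool completed;
   uint64_t result;
};

/* Returns true if a drawing command should be executed. */
bool should_draw(const struct cond_query *query, bool condition,
                 enum cond_mode mode)
{
   bool predicate;

   /* A NULL query disables conditional rendering. */
   if (query == NULL)
      return true;

   /* NO_WAIT with a pending query: draw normally. */
   if (!query->completed && mode == COND_NO_WAIT)
      return true;

   /* WAIT: the driver would block here until the result exists. */
   assert(query->completed);

   /* Non-boolean results: zero is FALSE, non-zero is TRUE. */
   predicate = query->result != 0;

   /* The command is skipped when condition equals the predicate. */
   return predicate != condition;
}
```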
|
|
|
If ``mode`` is PIPE_RENDER_COND_BY_REGION_WAIT or
PIPE_RENDER_COND_BY_REGION_NO_WAIT, rendering will be predicated as above
for the non-REGION modes, but in the case that an occlusion query returns
a non-zero result, regions which were occluded may be omitted by subsequent
drawing commands. This can result in better performance with some GPUs.
Normally, if the occlusion query returned a non-zero result, subsequent
drawing happens normally, so fragments may be generated, shaded and
processed even where they're known to be obscured.
|
|
|
The ``render_condition_mem`` function specifies that drawing is dependent
on a value in memory. A buffer resource and offset denote which 32-bit
value to use for the query. This is used for the Vulkan API.
|
|
|
Flushing |
|
^^^^^^^^ |
|
|
|
``flush`` |
|
|
|
PIPE_FLUSH_END_OF_FRAME: Whether the flush marks the end of frame. |
|
|
|
PIPE_FLUSH_DEFERRED: It is not required to flush right away, but it is required |
|
to return a valid fence. If fence_finish is called with the returned fence |
|
and the context is still unflushed, and the ctx parameter of fence_finish is |
|
equal to the context where the fence was created, fence_finish will flush |
|
the context. |
|
|
|
PIPE_FLUSH_ASYNC: The flush is allowed to be asynchronous. Unlike |
|
``PIPE_FLUSH_DEFERRED``, the driver must still ensure that the returned fence |
|
will finish in finite time. However, subsequent operations in other contexts of |
|
the same screen are no longer guaranteed to happen after the flush. Drivers |
|
which use this flag must implement pipe_context::fence_server_sync. |
|
|
|
PIPE_FLUSH_HINT_FINISH: Hints to the driver that the caller will immediately |
|
wait for the returned fence. |
|
|
|
Additional flags may be set together with ``PIPE_FLUSH_DEFERRED`` for even |
|
finer-grained fences. Note that as a general rule, GPU caches may not have been |
|
flushed yet when these fences are signaled. Drivers are free to ignore these |
|
flags and create normal fences instead. At most one of the following flags can |
|
be specified: |
|
|
|
PIPE_FLUSH_TOP_OF_PIPE: The fence should be signaled as soon as the next |
|
command is ready to start executing at the top of the pipeline, before any of |
|
its data is actually read (including indirect draw parameters). |
|
|
|
PIPE_FLUSH_BOTTOM_OF_PIPE: The fence should be signaled as soon as the previous |
|
command has finished executing on the GPU entirely (but data written by the |
|
command may still be in caches and inaccessible to the CPU). |
|
|
|
|
|
``flush_resource`` |
|
|
|
Flush the resource cache, so that the resource can be used |
|
by an external client. Possible usage: |
|
- flushing a resource before presenting it on the screen |
|
- flushing a resource if some other process or device wants to use it |
|
This shouldn't be used to flush caches if the resource is only managed |
|
by a single pipe_screen and is not shared with another process. |
|
(i.e. you shouldn't use it to flush caches explicitly if you want to e.g. |
|
use the resource for texturing) |
|
|
|
Fences |
|
^^^^^^ |
|
|
|
``pipe_fence_handle``, and related methods, are used to synchronize |
|
execution between multiple parties. Examples include CPU <-> GPU synchronization, |
|
renderer <-> windowing system, multiple external APIs, etc. |
|
|
|
A ``pipe_fence_handle`` can either be 'one time use' or 're-usable'. A 'one time use' |
|
fence behaves like a traditional GPU fence. Once it reaches the signaled state it |
|
is forever considered to be signaled. |
|
|
|
Once a re-usable ``pipe_fence_handle`` becomes signaled, it can be reset |
|
back into an unsignaled state. The ``pipe_fence_handle`` will be reset to |
|
the unsignaled state by performing a wait operation on said object, i.e. |
|
``fence_server_sync``. As a corollary to this behavior, a re-usable |
|
``pipe_fence_handle`` can only have one waiter. |
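
The difference between the two kinds of fence can be illustrated with a toy
state machine. ``struct fence_model`` and the ``model_fence_*`` functions are
hypothetical; a real wait would block or be queued on the GPU, whereas this
sketch returns ``false`` when the fence is not yet signaled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical fence with only the state discussed above. */
struct fence_model {
   bool reusable;
   bool signaled;
};

void model_fence_signal(struct fence_model *f)
{
   assert(f != NULL);
   f->signaled = true;
}

/*
 * Returns true if the wait is satisfied. Waiting on a re-usable fence
 * consumes the signal, which is why it can only have one waiter.
 */
bool model_fence_wait(struct fence_model *f)
{
   assert(f != NULL);
   if (!f->signaled)
      return false;
   if (f->reusable)
      f->signaled = false;
   return true;
}
```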
|
|
|
This behavior is useful in producer <-> consumer chains. It helps avoid |
|
unnecessarily sharing a new ``pipe_fence_handle`` each time a new frame is |
|
ready. Instead, the fences are exchanged once ahead of time, and access is synchronized |
|
through GPU signaling instead of direct producer <-> consumer communication. |
|
|
|
``fence_server_sync`` inserts a wait command into the GPU's command stream. |
|
|
|
``fence_server_signal`` inserts a signal command into the GPU's command stream. |
|
|
|
There are no guarantees that the wait/signal commands will be flushed when |
|
calling ``fence_server_sync`` or ``fence_server_signal``. An explicit |
|
call to ``flush`` is required to make sure the commands are emitted to the GPU. |
|
|
|
The Gallium implementation may implicitly ``flush`` the command stream during a |
|
``fence_server_sync`` or ``fence_server_signal`` call if necessary. |
|
|
|
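
The difference between the two fence kinds can be sketched as a small state
model. This is only an illustration of the semantics described above: the
``toy_*`` types and functions are hypothetical and are not part of the Gallium
interface, and a real wait would block rather than return ``false``.

.. code-block:: c

   #include <stdbool.h>

   /* Toy model only: hypothetical names, not Gallium API. */
   enum toy_fence_kind { TOY_FENCE_ONE_TIME, TOY_FENCE_REUSABLE };

   struct toy_fence {
      enum toy_fence_kind kind;
      bool signaled;
   };

   /* Models the effect of a signal command being executed. */
   static void toy_fence_signal(struct toy_fence *f)
   {
      f->signaled = true;
   }

   /* Models a wait completing. Returns false if the fence is not yet
    * signaled. A re-usable fence is reset by a successful wait, while a
    * one-time fence stays signaled forever. */
   static bool toy_fence_wait(struct toy_fence *f)
   {
      if (!f->signaled)
         return false;
      if (f->kind == TOY_FENCE_REUSABLE)
         f->signaled = false;
      return true;
   }

In this model a second wait on a re-usable fence fails until the fence is
signaled again, which is why such a fence can only have one waiter.
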
Resource Busy Queries
^^^^^^^^^^^^^^^^^^^^^

``is_resource_referenced``


Blitting
^^^^^^^^

These methods emulate classic blitter controls.

These methods operate directly on ``pipe_resource`` objects, and stand
apart from any 3D state in the context. Blitting functionality may be
moved to a separate abstraction at some point in the future.

``resource_copy_region`` blits a region of a resource to a region of another
resource, provided that both resources have the same format, or compatible
formats, i.e., formats for which copying the bytes from the source resource
unmodified to the destination resource will achieve the same effect as a
textured quad blitter. The source and destination may be the same resource,
but overlapping blits are not permitted.
This can be considered the equivalent of a CPU memcpy.

``blit`` blits a region of a resource to a region of another resource, including
scaling, format conversion, and up-/downsampling, as well as a destination clip
rectangle (scissors) and window rectangles. It can also optionally honor the
current render condition (but either way the blit itself never contributes
anything to queries currently gathering data).
As opposed to manually drawing a textured quad, this lets the pipe driver choose
the optimal method for blitting (like using a special 2D engine), and usually
offers, for example, accelerated stencil-only copies even where
PIPE_CAP_SHADER_STENCIL_EXPORT is not available.

Transfers
^^^^^^^^^

These methods are used to get data to/from a resource.

``transfer_map`` creates a memory mapping and the transfer object
associated with it.
The returned pointer points to the start of the mapped range according to
the box region, not the beginning of the resource. If transfer_map fails,
the returned pointer to the buffer memory is NULL, and the pointer
to the transfer object remains unchanged (i.e. it can be non-NULL).

``transfer_unmap`` removes the memory mapping for the transfer object and
destroys the transfer object. The pointer into the resource should be
considered invalid and discarded.

``texture_subdata`` and ``buffer_subdata`` perform a simplified
transfer for simple writes. Basically transfer_map, data write, and
transfer_unmap all in one.


The box parameter to some of these functions defines a 1D, 2D or 3D
region of pixels. This is self-explanatory for 1D, 2D and 3D texture
targets.

For PIPE_TEXTURE_1D_ARRAY and PIPE_TEXTURE_2D_ARRAY, the box::z and box::depth
fields refer to the array dimension of the texture.

For PIPE_TEXTURE_CUBE, the box::z and box::depth fields refer to the
faces of the cube map (z + depth <= 6).

For PIPE_TEXTURE_CUBE_ARRAY, the box::z and box::depth fields refer to both
the face and array dimension of the texture (face = z % 6, array = z / 6).

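
The cube map rules above reduce to a little integer arithmetic. The helpers
below are a sketch with hypothetical names (they do not exist in Gallium) and
only restate the two formulas given in the text.

.. code-block:: c

   /* Hypothetical helper types, not part of Gallium. */
   struct toy_cube_layer {
      int array; /* index of the cube within the cube array */
      int face;  /* face of that cube, 0..5 */
   };

   /* Split a PIPE_TEXTURE_CUBE_ARRAY layer index (box::z) into the
    * cube index and the face index: face = z % 6, array = z / 6. */
   static struct toy_cube_layer toy_decode_cube_array_z(int z)
   {
      struct toy_cube_layer l;
      l.array = z / 6;
      l.face = z % 6;
      return l;
   }

   /* For PIPE_TEXTURE_CUBE, the selected faces must stay within the six
    * faces of the cube map: z + depth <= 6. */
   static int toy_cube_box_is_valid(int z, int depth)
   {
      return z >= 0 && depth >= 0 && z + depth <= 6;
   }

For example, layer 7 of a cube array decodes to face 1 of cube 1.
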
.. _transfer_flush_region:

transfer_flush_region
%%%%%%%%%%%%%%%%%%%%%

If a transfer was created with ``FLUSH_EXPLICIT``, it will not automatically
be flushed on write or unmap. Flushes must be requested with
``transfer_flush_region``. Flush ranges are relative to the mapped range, not
the beginning of the resource.

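
Because flush ranges are relative to the mapped range, a caller that tracks
dirty data in resource coordinates has to subtract the start of the mapping
first. The following sketch shows that conversion for a one-dimensional
(buffer) case. ``toy_range`` and ``toy_flush_range`` are hypothetical names
used only for this illustration.

.. code-block:: c

   /* Hypothetical 1D range type, used only for this illustration. */
   struct toy_range {
      unsigned offset;
      unsigned size;
   };

   /* Convert a dirty range given in resource coordinates into the
    * map-relative range expected by a flush request.
    * Assumes the dirty range lies entirely inside the mapped range. */
   static struct toy_range toy_flush_range(struct toy_range mapped,
                                           struct toy_range dirty)
   {
      struct toy_range r;
      r.offset = dirty.offset - mapped.offset;
      r.size = dirty.size;
      return r;
   }

With a mapping that starts at byte 256 of the buffer, data written at buffer
offset 320 would be flushed with a relative offset of 64.
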
.. _texture_barrier:

texture_barrier
%%%%%%%%%%%%%%%

This function flushes all pending writes to the currently-set surfaces and
invalidates all read caches of the currently-set samplers. This can be used
both for regular textures and for framebuffers read via FBFETCH.


.. _memory_barrier:

memory_barrier
%%%%%%%%%%%%%%

This function flushes caches according to which of the PIPE_BARRIER_* flags
are set.

.. _resource_commit:

resource_commit
%%%%%%%%%%%%%%%

This function changes the commit state of a part of a sparse resource. Sparse
resources are created by setting the ``PIPE_RESOURCE_FLAG_SPARSE`` flag when
calling ``resource_create``. Initially, sparse resources only reserve a virtual
memory region that is not backed by memory (i.e., it is uncommitted). The
``resource_commit`` function can be called to commit or uncommit parts (or all)
of a resource. The driver manages the underlying backing memory.

The contents of newly committed memory regions are undefined. Calling this
function to commit an already committed memory region is allowed and leaves its
content unchanged. Similarly, calling this function to uncommit an already
uncommitted memory region is allowed.

For buffers, the given box must be aligned to multiples of
``PIPE_CAP_SPARSE_BUFFER_PAGE_SIZE``. As an exception to this rule, if the size
of the buffer is not a multiple of the page size, changing the commit state of
the last (partial) page requires a box that ends at the end of the buffer
(i.e., box->x + box->width == buffer->width0).

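
The alignment rule for buffers, including the exception for a trailing partial
page, can be expressed as a small predicate. This is a sketch under the
assumption that ``x`` and ``width`` stand for ``box->x`` and ``box->width``,
``buffer_size`` for ``buffer->width0``, and ``page_size`` for the value
reported for ``PIPE_CAP_SPARSE_BUFFER_PAGE_SIZE``. The function name is
hypothetical.

.. code-block:: c

   #include <stdbool.h>

   /* Hypothetical check, not part of Gallium: returns true if the 1D
    * box [x, x + width) satisfies the buffer alignment rule. */
   static bool toy_commit_box_is_valid(unsigned x, unsigned width,
                                       unsigned buffer_size,
                                       unsigned page_size)
   {
      unsigned end;

      /* Reject a zero page size and boxes that leave the buffer. */
      if (page_size == 0 || x > buffer_size || width > buffer_size - x)
         return false;

      /* The start of the box must always be page aligned. */
      if (x % page_size != 0)
         return false;

      /* The end must be page aligned, or coincide with the end of the
       * buffer when the last page is only partially covered. */
      end = x + width;
      return end % page_size == 0 || end == buffer_size;
   }

For a 100000-byte buffer and a 65536-byte page, the box starting at 65536 with
width 34464 is accepted because it ends exactly at the end of the buffer,
while a width of 1000 at the same offset is rejected.
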
.. _pipe_transfer:

PIPE_MAP
^^^^^^^^

These flags control the behavior of a transfer object.

``PIPE_MAP_READ``
  Resource contents read back (or accessed directly) at transfer create time.

``PIPE_MAP_WRITE``
  Resource contents will be written back at transfer_unmap time (or modified
  as a result of being accessed directly).

``PIPE_MAP_DIRECTLY``
  A transfer should directly map the resource. May return NULL if not supported.

``PIPE_MAP_DISCARD_RANGE``
  The memory within the mapped region is discarded. Cannot be used with
  ``PIPE_MAP_READ``.

``PIPE_MAP_DISCARD_WHOLE_RESOURCE``
  Discards all memory backing the resource. It should not be used with
  ``PIPE_MAP_READ``.

``PIPE_MAP_DONTBLOCK``
  Fail if the resource cannot be mapped immediately.

``PIPE_MAP_UNSYNCHRONIZED``
  Do not synchronize pending operations on the resource when mapping. The
  interaction of any writes to the map and any operations pending on the
  resource is undefined. Cannot be used with ``PIPE_MAP_READ``.

``PIPE_MAP_FLUSH_EXPLICIT``
  Written ranges will be notified later with :ref:`transfer_flush_region`.
  Cannot be used with ``PIPE_MAP_READ``.

``PIPE_MAP_PERSISTENT``
  Allows the resource to be used for rendering while mapped.
  PIPE_RESOURCE_FLAG_MAP_PERSISTENT must be set when creating
  the resource.
  If COHERENT is not set, memory_barrier(PIPE_BARRIER_MAPPED_BUFFER)
  must be called to ensure the device can see what the CPU has written.

``PIPE_MAP_COHERENT``
  If PERSISTENT is set, this ensures any writes done by the device are
  immediately visible to the CPU and vice versa.
  PIPE_RESOURCE_FLAG_MAP_COHERENT must be set when creating
  the resource.

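
Several of the flags above are documented as incompatible with
``PIPE_MAP_READ``. A caller could check a flag combination before mapping,
roughly as in the sketch below. The ``TOY_MAP_*`` constants use placeholder
bit values chosen for this example only. Real code should use the
``PIPE_MAP_*`` definitions from the Gallium headers. The function name is
hypothetical as well.

.. code-block:: c

   #include <stdbool.h>

   /* Placeholder bit values for this sketch only; they are not the
    * values of the real PIPE_MAP_* flags. */
   enum {
      TOY_MAP_READ                   = 1 << 0,
      TOY_MAP_WRITE                  = 1 << 1,
      TOY_MAP_DISCARD_RANGE          = 1 << 2,
      TOY_MAP_DISCARD_WHOLE_RESOURCE = 1 << 3,
      TOY_MAP_UNSYNCHRONIZED         = 1 << 4,
      TOY_MAP_FLUSH_EXPLICIT         = 1 << 5
   };

   /* Returns false if a flag documented as unusable with READ is
    * combined with READ. No other rule is checked here. */
   static bool toy_map_flags_are_valid(unsigned flags)
   {
      const unsigned not_with_read = TOY_MAP_DISCARD_RANGE |
                                     TOY_MAP_DISCARD_WHOLE_RESOURCE |
                                     TOY_MAP_UNSYNCHRONIZED |
                                     TOY_MAP_FLUSH_EXPLICIT;

      if ((flags & TOY_MAP_READ) && (flags & not_with_read))
         return false;
      return true;
   }
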
Compute kernel execution
^^^^^^^^^^^^^^^^^^^^^^^^

A compute program can be defined, bound or destroyed using
``create_compute_state``, ``bind_compute_state`` or
``destroy_compute_state`` respectively.

Any of the subroutines contained within the compute program can be
executed on the device using the ``launch_grid`` method. This method
will execute as many instances of the program as there are elements in the
specified N-dimensional grid, ideally in parallel.

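
Taking the sentence above literally, the number of program instances is the
product of the grid extents. The helper below is a hypothetical illustration
of that count and does not model the actual ``launch_grid`` parameters.

.. code-block:: c

   /* Hypothetical helper: number of instances for a grid with `dims`
    * dimensions whose extents are stored in `grid`. */
   static unsigned long long toy_grid_instances(const unsigned *grid,
                                                unsigned dims)
   {
      unsigned long long n = 1;
      unsigned i;

      for (i = 0; i < dims; i++)
         n *= grid[i];
      return n;
   }
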
The compute program has access to four special resources:

* ``GLOBAL`` represents a memory space shared among all the threads
  running on the device. An arbitrary buffer created with the
  ``PIPE_BIND_GLOBAL`` flag can be mapped into it using the
  ``set_global_binding`` method.

* ``LOCAL`` represents a memory space shared among all the threads
  running in the same working group. The initial contents of this
  resource are undefined.

* ``PRIVATE`` represents a memory space local to a single thread.
  The initial contents of this resource are undefined.

* ``INPUT`` represents a read-only memory space that can be
  initialized at ``launch_grid`` time.

These resources use a byte-based addressing scheme, and they can be
accessed from the compute program by means of the LOAD/STORE TGSI
opcodes. Additional resources to be accessed using the same opcodes
may be specified by the user with the ``set_compute_resources``
method.

In addition, normal texture sampling is allowed from the compute
program: ``bind_sampler_states`` may be used to set up texture
samplers for the compute stage and ``set_sampler_views`` may
be used to bind a number of sampler views to it.

Mipmap generation
^^^^^^^^^^^^^^^^^

If PIPE_CAP_GENERATE_MIPMAP is true, ``generate_mipmap`` can be used
to generate mipmaps for the specified texture resource.
It replaces texel image levels base_level+1 through
last_level for the layer range first_layer through last_layer.
It returns TRUE if mipmap generation succeeds, otherwise it
returns FALSE. Mipmap generation may fail when it is not supported
for particular texture types or formats.

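
To make the level and layer ranges concrete, the sketch below counts the
images that such a call would replace and computes the size of a dimension at
a given level using the usual halve-and-clamp-to-one rule. Both helpers are
hypothetical and exist only for this illustration.

.. code-block:: c

   #include <limits.h>

   /* Hypothetical helper: size of one dimension at a mip level, halved
    * per level and clamped to a minimum of 1. */
   static unsigned toy_minify(unsigned size, unsigned level)
   {
      unsigned s;

      /* Avoid shifting by the full width of the type. */
      if (level >= sizeof(size) * CHAR_BIT)
         return 1;
      s = size >> level;
      return s ? s : 1;
   }

   /* Hypothetical helper: number of (level, layer) images in levels
    * base_level+1..last_level and layers first_layer..last_layer. */
   static unsigned toy_mipmap_images_replaced(unsigned base_level,
                                              unsigned last_level,
                                              unsigned first_layer,
                                              unsigned last_layer)
   {
      if (last_level <= base_level || last_layer < first_layer)
         return 0;
      return (last_level - base_level) * (last_layer - first_layer + 1);
   }

For a cube map (layers 0 through 5) with base_level 0 and last_level 4, four
levels on each of six layers are replaced, 24 images in total.
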
Device resets
^^^^^^^^^^^^^

Gallium frontends can query or request notifications of when the GPU
is reset for whatever reason (application error, driver error). When
a GPU reset happens, the context becomes unusable and all related state
should be considered lost and undefined. Despite that, context
notifications are single-shot, i.e. subsequent calls to
``get_device_reset_status`` will return PIPE_NO_RESET.

* ``get_device_reset_status`` queries whether a device reset has happened
  since the last call or since the last notification by callback.
* ``set_device_reset_callback`` sets a callback which will be called when
  a device reset is detected. The callback is only called synchronously.

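
The single-shot behavior of the status query can be modeled as a value that is
cleared when it is read. The following is a toy model with hypothetical
``toy_*`` names. It does not use the real Gallium types or enum values.

.. code-block:: c

   /* Hypothetical stand-ins for the reset status values. */
   enum toy_reset_status { TOY_NO_RESET = 0, TOY_RESET_DETECTED = 1 };

   struct toy_context {
      enum toy_reset_status pending;
   };

   /* Models the driver noticing a device reset. */
   static void toy_record_reset(struct toy_context *ctx)
   {
      ctx->pending = TOY_RESET_DETECTED;
   }

   /* Models the query: the pending status is reported once, and later
    * calls report that no reset happened since the previous call. */
   static enum toy_reset_status
   toy_get_device_reset_status(struct toy_context *ctx)
   {
      enum toy_reset_status status = ctx->pending;

      ctx->pending = TOY_NO_RESET;
      return status;
   }

A frontend therefore has to remember a reported reset itself, since asking
again yields the "no reset" value.
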
Bindless
^^^^^^^^

If PIPE_CAP_BINDLESS_TEXTURE is TRUE, the following ``pipe_context`` functions
are used to create/delete bindless handles, and to make them resident in the
current context when they are going to be used by shaders.

* ``create_texture_handle`` creates a 64-bit unsigned integer texture handle
  that is going to be directly used in shaders.
* ``delete_texture_handle`` deletes a 64-bit unsigned integer texture handle.
* ``make_texture_handle_resident`` makes a 64-bit unsigned integer texture
  handle resident in the current context to be accessible by shaders for
  texture mapping.
* ``create_image_handle`` creates a 64-bit unsigned integer image handle that
  is going to be directly used in shaders.
* ``delete_image_handle`` deletes a 64-bit unsigned integer image handle.
* ``make_image_handle_resident`` makes a 64-bit unsigned integer image handle
  resident in the current context to be accessible by shaders for image loads,
  stores and atomic operations.

Using several contexts
----------------------

Several contexts from the same screen can be used at the same time. Objects
created on one context cannot be used in another context, but the objects
created by the screen methods can be used by all contexts.

Transfers
^^^^^^^^^

A transfer on one context is not expected to synchronize properly with
rendering on other contexts, thus only areas not yet used for rendering should
be locked.

A flush is required after transfer_unmap for other contexts to see the
uploaded data, unless:

* Using persistent mapping. When combined with coherent mapping, the resource
  does not even need to be unmapped before it is used in other contexts.
  Without coherent mapping, memory_barrier(PIPE_BARRIER_MAPPED_BUFFER) should
  be called on the context that has mapped the resource. No flush is required.

* Mapping the resource with PIPE_MAP_DIRECTLY.