naga/back/hlsl/
mod.rs

/*!
Backend for [HLSL][hlsl] (High-Level Shading Language).

# Supported shader model versions:
- 5.0
- 5.1
- 6.0 through 6.7

# Layout of values in `uniform` buffers

WGSL's ["Internal Layout of Values"][ilov] rules specify how each WGSL
type should be stored in `uniform` and `storage` buffers. The HLSL we
generate must access values in that form, even when it is not what
HLSL would use normally.

The rules described here only apply to WGSL `uniform` variables. WGSL
`storage` buffers are translated as HLSL `ByteAddressBuffers`, for
which we generate `Load` and `Store` method calls with explicit byte
offsets. WGSL pipeline inputs must be scalars or vectors; they cannot
be matrices, which is where the interesting problems arise.

## Row- and column-major ordering for matrices

WGSL specifies that matrices in uniform buffers are stored in
column-major order. This matches HLSL's default, so one might expect
things to be straightforward. Unfortunately, WGSL and HLSL disagree on
what indexing a matrix means: in WGSL, `m[i]` retrieves the `i`'th
*column* of `m`, whereas in HLSL it retrieves the `i`'th *row*. We
want to avoid translating `m[i]` into some complicated reassembly of a
vector from individually fetched components, so this is a problem.

However, with a bit of trickery, it is possible to use HLSL's `m[i]`
as the translation of WGSL's `m[i]`:

- We declare all matrices in uniform buffers in HLSL with the
  `row_major` qualifier, and transpose the row and column counts: a
  WGSL `mat3x4<f32>`, say, becomes an HLSL `row_major float3x4`. (Note
  that WGSL and HLSL type names put the row and column in reverse
  order.) Since the HLSL type is the transpose of how WebGPU directs
  the user to store the data, HLSL will load all matrices transposed.

- Since matrices are transposed, an HLSL indexing expression retrieves
  the "columns" of the intended WGSL value, as desired.

- For vector-matrix multiplication, since `mul(transpose(m), v)` is
  equivalent to `mul(v, m)` (note the reversal of the arguments), and
  `mul(v, transpose(m))` is equivalent to `mul(m, v)`, we can
  translate WGSL `m * v` and `v * m` to HLSL by simply reversing the
  arguments to `mul`.
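
The argument-reversal identities above can be checked numerically. A
standalone sketch (hypothetical helper names, 2x2 case only, using the
`m[row][col]` convention; not naga code):

```rust
type Mat2 = [[f32; 2]; 2]; // m[row][col]

fn transpose(m: Mat2) -> Mat2 {
    [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]
}

// mul(m, v): matrix times column vector.
fn mul_mv(m: Mat2, v: [f32; 2]) -> [f32; 2] {
    [
        m[0][0] * v[0] + m[0][1] * v[1],
        m[1][0] * v[0] + m[1][1] * v[1],
    ]
}

// mul(v, m): row vector times matrix.
fn mul_vm(v: [f32; 2], m: Mat2) -> [f32; 2] {
    [
        v[0] * m[0][0] + v[1] * m[1][0],
        v[0] * m[0][1] + v[1] * m[1][1],
    ]
}

fn main() {
    let m = [[1.0, 2.0], [3.0, 4.0]];
    let v = [5.0, 6.0];
    // mul(transpose(m), v) == mul(v, m), and vice versa with the
    // transpose on the other side.
    assert_eq!(mul_mv(transpose(m), v), mul_vm(v, m));
    assert_eq!(mul_vm(v, transpose(m)), mul_mv(m, v));
}
```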

## Padding in two-row matrices

An HLSL `row_major floatKx2` matrix has padding between its rows that
the WGSL `matKx2<f32>` matrix it represents does not. HLSL stores all
matrix rows [aligned on 16-byte boundaries][16bb], whereas WGSL says
that the columns of a `matKx2<f32>` need only be [aligned as required
for `vec2<f32>`][ilov], which is [eight-byte alignment][8bb].

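The divergence is easy to see by computing the byte offsets under each
rule; a sketch (not naga's actual code) for a WGSL `mat3x2<f32>`:

```rust
// WGSL packs the three vec2<f32> columns with an 8-byte stride; an HLSL
// `row_major float3x2` pads each row out to a 16-byte boundary.
fn wgsl_col_offset(i: u32) -> u32 {
    i * 8 // vec2<f32>: 8-byte alignment and size
}

fn hlsl_row_offset(i: u32) -> u32 {
    i * 16 // constant-buffer rows are 16-byte aligned
}

fn main() {
    assert_eq!((0..3).map(wgsl_col_offset).collect::<Vec<_>>(), [0, 8, 16]);
    assert_eq!((0..3).map(hlsl_row_offset).collect::<Vec<_>>(), [0, 16, 32]);
    // Every column after the first lands at a different offset, so the
    // WGSL layout cannot be described by a single HLSL matrix member.
}
```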
To compensate for this, any time a `matKx2<f32>` appears in a WGSL
`uniform` variable, whether directly as the variable's type or as part
of a struct/array, we actually emit `K` separate `float2` members, and
assemble/disassemble the matrix from its columns (in WGSL; rows in
HLSL) upon load and store.

For example, the following WGSL struct type:

```ignore
struct Baz {
    m: mat3x2<f32>,
}
```

is rendered as the HLSL struct type:

```ignore
struct Baz {
    float2 m_0; float2 m_1; float2 m_2;
};
```

The `wrapped_struct_matrix` functions in `help.rs` generate HLSL
helper functions to access such members, converting between the stored
form and the HLSL matrix types appropriately. For example, for reading
the member `m` of the `Baz` struct above, we emit:

```ignore
float3x2 GetMatmOnBaz(Baz obj) {
    return float3x2(obj.m_0, obj.m_1, obj.m_2);
}
```

We also emit an analogous `Set` function, as well as functions for
accessing individual columns by dynamic index.
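
For illustration, the corresponding `Set` helper for the `Baz` struct
above might look like the following (a sketch only; the exact name and
body are produced by the helpers in `help.rs`):

```ignore
void SetMatmOnBaz(Baz obj, float3x2 mat) {
    obj.m_0 = mat[0];
    obj.m_1 = mat[1];
    obj.m_2 = mat[2];
}
```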

## Sampler Handling

Due to limitations in how sampler heaps work in D3D12, we need to access samplers
through a layer of indirection. Instead of directly binding samplers, we bind the entire
sampler heap as both a standard and a comparison sampler heap. We then use a sampler
index buffer for each bind group. This buffer is accessed in the shader to get the actual
sampler index within the heap. See the wgpu_hal dx12 backend documentation for more
information.
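
In rough outline, the generated HLSL reaches a sampler through the heap
plus the per-group index buffer. A hedged sketch of the shape of that
code (all names, array sizes, and register assignments here are
illustrative, not the backend's exact output):

```ignore
SamplerState SamplerHeap[2048] : register(s0, space0);
StructuredBuffer<uint> Group0SamplerIndexArray : register(t0, space1);

// A WGSL `textureSample(t, s, uv)` then becomes, roughly:
//     t.Sample(SamplerHeap[Group0SamplerIndexArray[SAMPLER_INDEX]], uv)
```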

[hlsl]: https://docs.microsoft.com/en-us/windows/win32/direct3dhlsl/dx-graphics-hlsl
[ilov]: https://gpuweb.github.io/gpuweb/wgsl/#internal-value-layout
[16bb]: https://github.com/microsoft/DirectXShaderCompiler/wiki/Buffer-Packing#constant-buffer-packing
[8bb]: https://gpuweb.github.io/gpuweb/wgsl/#alignment-and-size
*/

mod conv;
mod help;
mod keywords;
mod ray;
mod storage;
mod writer;

use alloc::{string::String, vec::Vec};
use core::fmt::Error as FmtError;

use thiserror::Error;

use crate::{back, ir, proc};

#[derive(Copy, Clone, Debug, Default, PartialEq, Eq, Hash)]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
pub struct BindTarget {
    pub space: u8,
    /// For regular bindings this is the register number.
    ///
    /// For sampler bindings, this is the index to use into the bind group's sampler index buffer.
    pub register: u32,
    /// If the binding is an unsized binding array, this overrides the size.
    pub binding_array_size: Option<u32>,
    /// This is the index in the buffer at [`Options::dynamic_storage_buffer_offsets_targets`].
    pub dynamic_storage_buffer_offsets_index: Option<u32>,
    /// This is a hint that we need to restrict indexing of vectors, matrices and arrays.
    ///
    /// If [`Options::restrict_indexing`] is also `true`, we will restrict indexing.
    #[cfg_attr(any(feature = "serialize", feature = "deserialize"), serde(default))]
    pub restrict_indexing: bool,
}

/// [`BindTarget`] for dynamic storage buffer offsets.
#[derive(Clone, Debug, Default, PartialEq, Eq, Hash)]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
pub struct OffsetsBindTarget {
    pub space: u8,
    pub register: u32,
    pub size: u32,
}

#[cfg(any(feature = "serialize", feature = "deserialize"))]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
struct BindingMapSerialization {
    resource_binding: crate::ResourceBinding,
    bind_target: BindTarget,
}

#[cfg(feature = "deserialize")]
fn deserialize_binding_map<'de, D>(deserializer: D) -> Result<BindingMap, D::Error>
where
    D: serde::Deserializer<'de>,
{
    use serde::Deserialize;

    let vec = Vec::<BindingMapSerialization>::deserialize(deserializer)?;
    let mut map = BindingMap::default();
    for item in vec {
        map.insert(item.resource_binding, item.bind_target);
    }
    Ok(map)
}

// Using `BTreeMap` instead of `HashMap` so that the map itself can be hashed.
pub type BindingMap = alloc::collections::BTreeMap<crate::ResourceBinding, BindTarget>;

/// An HLSL shader model version.
#[allow(non_snake_case, non_camel_case_types)]
#[derive(Copy, Clone, Debug, Hash, Eq, PartialEq, PartialOrd)]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
pub enum ShaderModel {
    V5_0,
    V5_1,
    V6_0,
    V6_1,
    V6_2,
    V6_3,
    V6_4,
    V6_5,
    V6_6,
    V6_7,
}

impl ShaderModel {
    pub const fn to_str(self) -> &'static str {
        match self {
            Self::V5_0 => "5_0",
            Self::V5_1 => "5_1",
            Self::V6_0 => "6_0",
            Self::V6_1 => "6_1",
            Self::V6_2 => "6_2",
            Self::V6_3 => "6_3",
            Self::V6_4 => "6_4",
            Self::V6_5 => "6_5",
            Self::V6_6 => "6_6",
            Self::V6_7 => "6_7",
        }
    }
}

impl crate::ShaderStage {
    pub const fn to_hlsl_str(self) -> &'static str {
        match self {
            Self::Vertex => "vs",
            Self::Fragment => "ps",
            Self::Compute => "cs",
            Self::Task | Self::Mesh => unreachable!(),
        }
    }
}
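
Together, a stage string and a model string form a D3D target profile
such as `vs_5_1`. A standalone sketch of that combination (the
`profile` helper here is hypothetical, standing in for whatever the
caller does with `to_hlsl_str` and `to_str`):

```rust
// Hypothetical: join a stage prefix ("vs", "ps", "cs") with a shader
// model suffix ("5_1", "6_0", ...) into a compile-target profile name.
fn profile(stage: &str, model: &str) -> String {
    format!("{stage}_{model}")
}

fn main() {
    assert_eq!(profile("vs", "5_1"), "vs_5_1");
    assert_eq!(profile("cs", "6_0"), "cs_6_0");
}
```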

impl crate::ImageDimension {
    const fn to_hlsl_str(self) -> &'static str {
        match self {
            Self::D1 => "1D",
            Self::D2 => "2D",
            Self::D3 => "3D",
            Self::Cube => "Cube",
        }
    }
}

#[derive(Clone, Copy, Debug, Hash, Eq, Ord, PartialEq, PartialOrd)]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
pub struct SamplerIndexBufferKey {
    pub group: u32,
}

#[derive(Clone, Debug, Hash, PartialEq, Eq)]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
#[cfg_attr(feature = "deserialize", serde(default))]
pub struct SamplerHeapBindTargets {
    pub standard_samplers: BindTarget,
    pub comparison_samplers: BindTarget,
}

impl Default for SamplerHeapBindTargets {
    fn default() -> Self {
        Self {
            standard_samplers: BindTarget {
                space: 0,
                register: 0,
                binding_array_size: None,
                dynamic_storage_buffer_offsets_index: None,
                restrict_indexing: false,
            },
            comparison_samplers: BindTarget {
                space: 1,
                register: 0,
                binding_array_size: None,
                dynamic_storage_buffer_offsets_index: None,
                restrict_indexing: false,
            },
        }
    }
}

#[cfg(any(feature = "serialize", feature = "deserialize"))]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
struct SamplerIndexBufferBindingSerialization {
    group: u32,
    bind_target: BindTarget,
}

#[cfg(feature = "deserialize")]
fn deserialize_sampler_index_buffer_bindings<'de, D>(
    deserializer: D,
) -> Result<SamplerIndexBufferBindingMap, D::Error>
where
    D: serde::Deserializer<'de>,
{
    use serde::Deserialize;

    let vec = Vec::<SamplerIndexBufferBindingSerialization>::deserialize(deserializer)?;
    let mut map = SamplerIndexBufferBindingMap::default();
    for item in vec {
        map.insert(
            SamplerIndexBufferKey { group: item.group },
            item.bind_target,
        );
    }
    Ok(map)
}

// We use a `BTreeMap` here so that the map itself can be hashed.
pub type SamplerIndexBufferBindingMap =
    alloc::collections::BTreeMap<SamplerIndexBufferKey, BindTarget>;

#[cfg(any(feature = "serialize", feature = "deserialize"))]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
struct DynamicStorageBufferOffsetTargetSerialization {
    index: u32,
    bind_target: OffsetsBindTarget,
}

#[cfg(feature = "deserialize")]
fn deserialize_storage_buffer_offsets<'de, D>(
    deserializer: D,
) -> Result<DynamicStorageBufferOffsetsTargets, D::Error>
where
    D: serde::Deserializer<'de>,
{
    use serde::Deserialize;

    let vec = Vec::<DynamicStorageBufferOffsetTargetSerialization>::deserialize(deserializer)?;
    let mut map = DynamicStorageBufferOffsetsTargets::default();
    for item in vec {
        map.insert(item.index, item.bind_target);
    }
    Ok(map)
}

pub type DynamicStorageBufferOffsetsTargets = alloc::collections::BTreeMap<u32, OffsetsBindTarget>;

/// Shorthand result used internally by the backend.
type BackendResult = Result<(), Error>;

#[derive(Clone, Debug, PartialEq, thiserror::Error)]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
pub enum EntryPointError {
    #[error("mapping of {0:?} is missing")]
    MissingBinding(crate::ResourceBinding),
}

/// Configuration used in the [`Writer`].
#[derive(Clone, Debug, Hash, PartialEq, Eq)]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
#[cfg_attr(feature = "deserialize", serde(default))]
pub struct Options {
    /// The HLSL shader model to be used.
    pub shader_model: ShaderModel,
    /// Mapping of resources to binding locations.
    #[cfg_attr(
        feature = "deserialize",
        serde(deserialize_with = "deserialize_binding_map")
    )]
    pub binding_map: BindingMap,
    /// Don't panic on missing bindings; instead, generate a plausible bind
    /// target from the resource's group and binding numbers.
    pub fake_missing_bindings: bool,
    /// Add special constants for adjusting `SV_VertexIndex` and
    /// `SV_InstanceIndex`, to make them work like in Vulkan/Metal, with the
    /// help of the host.
    pub special_constants_binding: Option<BindTarget>,
    /// Bind target of the push constant buffer.
    pub push_constants_target: Option<BindTarget>,
    /// Bind targets of the sampler heap and comparison sampler heap.
    pub sampler_heap_target: SamplerHeapBindTargets,
    /// Mapping of each bind group's sampler index buffer to a bind target.
    #[cfg_attr(
        feature = "deserialize",
        serde(deserialize_with = "deserialize_sampler_index_buffer_bindings")
    )]
    pub sampler_buffer_binding_map: SamplerIndexBufferBindingMap,
    /// Bind targets for dynamic storage buffer offsets.
    #[cfg_attr(
        feature = "deserialize",
        serde(deserialize_with = "deserialize_storage_buffer_offsets")
    )]
    pub dynamic_storage_buffer_offsets_targets: DynamicStorageBufferOffsetsTargets,
    /// Should workgroup variables be zero initialized (by polyfilling)?
    pub zero_initialize_workgroup_memory: bool,
    /// Should we restrict indexing of vectors, matrices and arrays?
    pub restrict_indexing: bool,
    /// If set, loops will have code injected into them, forcing the compiler
    /// to think the number of iterations is bounded.
    pub force_loop_bounding: bool,
}

impl Default for Options {
    fn default() -> Self {
        Options {
            shader_model: ShaderModel::V5_1,
            binding_map: BindingMap::default(),
            fake_missing_bindings: true,
            special_constants_binding: None,
            sampler_heap_target: SamplerHeapBindTargets::default(),
            sampler_buffer_binding_map: alloc::collections::BTreeMap::default(),
            push_constants_target: None,
            dynamic_storage_buffer_offsets_targets: alloc::collections::BTreeMap::new(),
            zero_initialize_workgroup_memory: true,
            restrict_indexing: true,
            force_loop_bounding: true,
        }
    }
}

impl Options {
    fn resolve_resource_binding(
        &self,
        res_binding: &crate::ResourceBinding,
    ) -> Result<BindTarget, EntryPointError> {
        match self.binding_map.get(res_binding) {
            Some(target) => Ok(*target),
            None if self.fake_missing_bindings => Ok(BindTarget {
                space: res_binding.group as u8,
                register: res_binding.binding,
                binding_array_size: None,
                dynamic_storage_buffer_offsets_index: None,
                restrict_indexing: false,
            }),
            None => Err(EntryPointError::MissingBinding(*res_binding)),
        }
    }
}

/// Reflection info for entry point names.
#[derive(Default)]
pub struct ReflectionInfo {
    /// Mapping of the entry point names.
    ///
    /// Each item in the array corresponds to an entry point index. The real
    /// entry point name may be different if one of the reserved words is used.
    ///
    /// Note: Some entry points may fail translation because of missing bindings.
    pub entry_point_names: Vec<Result<String, EntryPointError>>,
}

/// A subset of options that are meant to be changed per pipeline.
#[derive(Debug, Default, Clone)]
#[cfg_attr(feature = "serialize", derive(serde::Serialize))]
#[cfg_attr(feature = "deserialize", derive(serde::Deserialize))]
#[cfg_attr(feature = "deserialize", serde(default))]
pub struct PipelineOptions {
    /// The entry point to write.
    ///
    /// Entry points are identified by a shader stage specification
    /// and a name.
    ///
    /// If `None`, all entry points will be written. If `Some` and the entry
    /// point is not found, an error will be returned while writing.
    pub entry_point: Option<(ir::ShaderStage, String)>,
}

#[derive(Error, Debug)]
pub enum Error {
    #[error(transparent)]
    IoError(#[from] FmtError),
    #[error("A scalar with an unsupported width was requested: {0:?}")]
    UnsupportedScalar(crate::Scalar),
    #[error("{0}")]
    Unimplemented(String), // TODO: Error used only during development
    #[error("{0}")]
    Custom(String),
    #[error("overrides should not be present at this stage")]
    Override,
    #[error(transparent)]
    ResolveArraySizeError(#[from] proc::ResolveArraySizeError),
    #[error("entry point with stage {0:?} and name '{1}' not found")]
    EntryPointNotFound(ir::ShaderStage, String),
}

#[derive(PartialEq, Eq, Hash)]
enum WrappedType {
    ZeroValue(help::WrappedZeroValue),
    ArrayLength(help::WrappedArrayLength),
    ImageSample(help::WrappedImageSample),
    ImageQuery(help::WrappedImageQuery),
    ImageLoadScalar(crate::Scalar),
    Constructor(help::WrappedConstructor),
    StructMatrixAccess(help::WrappedStructMatrixAccess),
    MatCx2(help::WrappedMatCx2),
    Math(help::WrappedMath),
    UnaryOp(help::WrappedUnaryOp),
    BinaryOp(help::WrappedBinaryOp),
    Cast(help::WrappedCast),
}

#[derive(Default)]
struct Wrapped {
    types: crate::FastHashSet<WrappedType>,
    /// If true, the sampler heaps have been written out.
    sampler_heaps: bool,
    /// Mapping from [`SamplerIndexBufferKey`] to the name the namer returned.
    sampler_index_buffers: crate::FastHashMap<SamplerIndexBufferKey, String>,
}

impl Wrapped {
    /// Records that a helper for `r#type` is needed, returning `true` if it
    /// was not already recorded.
    fn insert(&mut self, r#type: WrappedType) -> bool {
        self.types.insert(r#type)
    }

    fn clear(&mut self) {
        self.types.clear();
    }
}

/// A fragment entry point to be considered when generating HLSL for the output interface of vertex
/// entry points.
///
/// This is provided as an optional parameter to [`Writer::write`].
///
/// If this is provided, vertex outputs will be removed if they are not inputs of this fragment
/// entry point. This is necessary for generating correct HLSL when some of the vertex shader
/// outputs are not consumed by the fragment shader.
pub struct FragmentEntryPoint<'a> {
    module: &'a crate::Module,
    func: &'a crate::Function,
}

impl<'a> FragmentEntryPoint<'a> {
    /// Returns `None` if the entry point with the provided name can't be found or isn't a fragment
    /// entry point.
    pub fn new(module: &'a crate::Module, ep_name: &'a str) -> Option<Self> {
        module
            .entry_points
            .iter()
            .find(|ep| ep.name == ep_name)
            .filter(|ep| ep.stage == crate::ShaderStage::Fragment)
            .map(|ep| Self {
                module,
                func: &ep.function,
            })
    }
}

pub struct Writer<'a, W> {
    out: W,
    names: crate::FastHashMap<proc::NameKey, String>,
    namer: proc::Namer,
    /// HLSL backend options
    options: &'a Options,
    /// Per-stage backend options
    pipeline_options: &'a PipelineOptions,
    /// Information about entry point arguments and result types.
    entry_point_io: crate::FastHashMap<usize, writer::EntryPointInterface>,
    /// Set of expressions that have associated temporary variables
    named_expressions: crate::NamedExpressions,
    wrapped: Wrapped,
    written_committed_intersection: bool,
    written_candidate_intersection: bool,
    continue_ctx: back::continue_forward::ContinueCtx,

    /// A reference to some part of a global variable, lowered to a series of
    /// byte offset calculations.
    ///
    /// See the [`storage`] module for background on why we need this.
    ///
    /// Each [`SubAccess`] in the vector is a lowering of some [`Access`] or
    /// [`AccessIndex`] expression to the level of byte strides and offsets. See
    /// [`SubAccess`] for details.
    ///
    /// This field is a member of [`Writer`] solely to allow re-use of
    /// the `Vec`'s dynamic allocation. The value is no longer needed
    /// once HLSL for the access has been generated.
    ///
    /// [`Storage`]: crate::AddressSpace::Storage
    /// [`SubAccess`]: storage::SubAccess
    /// [`Access`]: crate::Expression::Access
    /// [`AccessIndex`]: crate::Expression::AccessIndex
    temp_access_chain: Vec<storage::SubAccess>,
    need_bake_expressions: back::NeedBakeExpressions,