forked from Qortal/Brooklyn
You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
282 lines
12 KiB
282 lines
12 KiB
ISASPEC - XML Based ISA Specification |
|
===================================== |
|
|
|
isaspec provides a mechanism to describe an instruction set in xml, and |
|
generate a disassembler and assembler (eventually). The intention is |
|
to describe the instruction set more formally than hand-coded assembler |
|
and disassembler, and better decouple the shader compiler from the |
|
underlying instruction encoding to simplify dealing with instruction |
|
encoding differences between generations of GPU. |
|
|
|
Benefits of a formal ISA description, compared to hand-coded assemblers |
|
and disassemblers, include easier detection of new bit combinations that |
|
were not seen before in previous generations due to more rigorous |
|
description of bits that are expect to be '0' or '1' or 'x' (dontcare) |
|
and verification that different encodings don't have conflicting bits |
|
(ie. that the specification cannot result in more than one valid |
|
interpretation of any bit pattern). |
|
|
|
The isaspec tool and xml schema are intended to be generic (not specific |
|
to ir3), although there are currently a couple limitations due to short- |
|
cuts taken to get things up and running (which are mostly not inherent to |
|
the xml schema, and should not be too difficult to remove from the py and |
|
decode/disasm utility): |
|
|
|
* Maximum "field" size is 64b |
|
* Fixed instruction size |
|
|
|
Often times, especially when new functionality is added in later gens |
|
while retaining (or at least mostly retaining) backwards compatibility |
|
with encodings used in earlier generations, the actual encoding can be |
|
rather messy to describe. To support this, isaspec provides many flexible |
|
mechanism, such as conditional overrides and derived fields. This not |
|
only allows for describing an irregular instruction encoding, but also |
|
allows matching an existing disasm syntax (which might not have been |
|
design around the idea of disassembly based on a formal ISA description). |
|
|
|
Bitsets |
|
------- |
|
|
|
The fundamental concept of matching a bit-pattern to an instruction |
|
decoding/encoding is the concept of a hierarchial tree of bitsets. |
|
This is intended to match how the hw decodes instructions, where certain |
|
bits describe the instruction (and sub-encoding, and so on), and other |
|
bits describe various operands to the instruction. |
|
|
|
Bitsets can also be used recursively as the type of a field described |
|
in another bitset. |
|
|
|
The leaves of the tree of instruction bitsets represent every possible |
|
instruction. Deciding which instruction a bitpattern is amounts to: |
|
|
|
.. code-block:: c |
|
|
|
m = (val & bitsets[n]->mask) & ~bitsets[n]->dontcare; |
|
|
|
if (m == bitsets[n]->match) { |
|
/* we've found the instruction description */ |
|
} |
|
|
|
For example, the starting point to decode an ir3 instruction is a 64b |
|
bitset: |
|
|
|
.. code-block:: xml |
|
|
|
<bitset name="#instruction" size="64"> |
|
<doc> |
|
Encoding of an ir3 instruction. All instructions are 64b. |
|
</doc> |
|
</bitset> |
|
|
|
In the first level of instruction encoding hierarchy, the high three bits |
|
group things into instruction "categories": |
|
|
|
.. code-block:: xml |
|
|
|
<bitset name="#instruction-cat2" extends="#instruction"> |
|
<field name="DST" low="32" high="39" type="#reg-gpr"/> |
|
<field name="REPEAT" low="40" high="41" type="#rptN"/> |
|
<field name="SAT" pos="42" type="bool" display="(sat)"/> |
|
<field name="SS" pos="44" type="bool" display="(ss)"/> |
|
<field name="UL" pos="45" type="bool" display="(ul)"/> |
|
<field name="DST_CONV" pos="46" type="bool"> |
|
<doc> |
|
Destination register is opposite precision as source, ie. |
|
if {FULL} is true then destination is half precision, and |
|
visa versa. |
|
</doc> |
|
</field> |
|
<derived name="DST_HALF" expr="#dest-half" type="bool" display="h"/> |
|
<field name="EI" pos="47" type="bool" display="(ei)"/> |
|
<field name="FULL" pos="52" type="bool"> |
|
<doc>Full precision source registers</doc> |
|
</field> |
|
<field name="JP" pos="59" type="bool" display="(jp)"/> |
|
<field name="SY" pos="60" type="bool" display="(sy)"/> |
|
<pattern low="61" high="63">010</pattern> <!-- cat2 --> |
|
<!-- |
|
NOTE, both SRC1_R and SRC2_R are defined at this level because |
|
SRC2_R is still a valid bit for (nopN) (REPEAT==0) for cat2 |
|
instructions with only a single src |
|
--> |
|
<field name="SRC1_R" pos="43" type="bool" display="(r)"/> |
|
<field name="SRC2_R" pos="51" type="bool" display="(r)"/> |
|
<derived name="ZERO" expr="#zero" type="bool" display=""/> |
|
</bitset> |
|
|
|
The ``<pattern>`` elements are the part(s) that determine which leaf-node |
|
bitset matches against a given bit pattern. The leaf node's match/mask/ |
|
dontcare bitmasks are a combination of those defined at the leaf node and |
|
recursively each parent bitclass. |
|
|
|
For example, cat2 instructions (ALU instructions with up to two src |
|
registers) can have either one or two source registers: |
|
|
|
.. code-block:: xml |
|
|
|
<bitset name="#instruction-cat2-1src" extends="#instruction-cat2"> |
|
<override expr="#cat2-cat3-nop-encoding"> |
|
<display> |
|
{SY}{SS}{JP}{SAT}(nop{NOP}) {UL}{NAME} {EI}{DST_HALF}{DST}, {SRC1} |
|
</display> |
|
<derived name="NOP" expr="#cat2-cat3-nop-value" type="uint"/> |
|
<field name="SRC1" low="0" high="15" type="#multisrc"> |
|
<param name="ZERO" as="SRC_R"/> |
|
<param name="FULL"/> |
|
</field> |
|
</override> |
|
<display> |
|
{SY}{SS}{JP}{SAT}{REPEAT}{UL}{NAME} {EI}{DST_HALF}{DST}, {SRC1} |
|
</display> |
|
<pattern low="16" high="31">xxxxxxxxxxxxxxxx</pattern> |
|
<pattern low="48" high="50">xxx</pattern> <!-- COND --> |
|
<field name="SRC1" low="0" high="15" type="#multisrc"> |
|
<param name="SRC1_R" as="SRC_R"/> |
|
<param name="FULL"/> |
|
</field> |
|
</bitset> |
|
|
|
<bitset name="absneg.f" extends="#instruction-cat2-1src"> |
|
<pattern low="53" high="58">000110</pattern> |
|
</bitset> |
|
|
|
In this example, ``absneg.f`` is a concrete cat2 instruction (leaf node of |
|
the bitset inheritance tree) which has a single src register. At the |
|
``#instruction-cat2-1src`` level, bits that are used for the 2nd src arg |
|
and condition code (for cat2 instructions which use a condition code) are |
|
defined as 'x' (dontcare), which matches our understanding of the hardware |
|
(but also lets the disassembler flag cases where '1' bits show up in places |
|
we don't expect, which may signal a new instruction (sub)encoding). |
|
|
|
You'll notice that ``SRC1`` refers back to a different bitset hierarchy |
|
that describes various different src register encoding (used for cat2 and |
|
cat4 instructions), ie. GPR vs CONST vs relative GPR/CONST. For fields |
|
which have bitset types, parameters can be "passed" in via ``<param>`` |
|
elements, which can be referred to by the display template string, and/or |
|
expressions. For example, this helps to deal with cases where other fields |
|
outside of that bitset control the encoding/decoding, such as in the |
|
``#multisrc`` example: |
|
|
|
.. code-block:: xml |
|
|
|
<bitset name="#multisrc" size="16"> |
|
<doc> |
|
Encoding for instruction source which can be GPR/CONST/IMMED |
|
or relative GPR/CONST. |
|
</doc> |
|
</bitset> |
|
|
|
... |
|
|
|
<bitset name="#multisrc-gpr" extends="#multisrc"> |
|
<display> |
|
{ABSNEG}{SRC_R}{HALF}{SRC} |
|
</display> |
|
<derived name="HALF" expr="#multisrc-half" type="bool" display="h"/> |
|
<field name="SRC" low="0" high="7" type="#reg-gpr"/> |
|
<pattern low="8" high="13">000000</pattern> |
|
<field name="ABSNEG" low="14" high="15" type="#absneg"/> |
|
</bitset> |
|
|
|
At some level in the bitset inheritance hiearchy, there is expected to be a |
|
``<display>`` element specifying a template string used during bitset |
|
decoding. The display template consists of references to fields (which may |
|
be derived fields) specified as ``{FIELDNAME}`` and other characters |
|
which are just echoed through to the resulting decoded bitset. |
|
|
|
It is possible to define a line column alignment value per field to influence |
|
the visual output. It needs to be pecified as ``{FIELDNAME:align=xx}``. |
|
|
|
The ``<override>`` element will be described in the next section, but it |
|
provides for both different decoded instruction syntax/mnemonics (when |
|
simply providing a different display template string) as well as instruction |
|
encoding where different ranges of bits have a different meaning based on |
|
some other bitfield (or combination of bitfields). In this example it is |
|
used to cover the cases where ``SRCn_R`` has a different meaning and a |
|
different disassembly syntax depending on whether ``REPEAT`` equals zero. |
|
|
|
Overrides |
|
--------- |
|
|
|
In many cases, a bitset is not convenient for describing the expected |
|
disasm syntax, and/or interpretation of some range of bits differs based |
|
on some other field or combination of fields. These *could* be modeled |
|
as different derived bitsets, at the expense of a combinatorical explosion |
|
of the size of the bitset inheritance tree. For example, *every* cat2 |
|
(and cat3) instruction has both a ``(nopN)`` interpretation in addtion to |
|
the ``(rptN`)`` interpretation. |
|
|
|
An ``<override>`` in a bitset allows to redefine the display string, and/or |
|
field definitions from the default case. If the override's expr(ession) |
|
evaluates to non-zero, ``<display>``, ``<field>``, and ``<derived>`` |
|
elements take precedence over what is defined in the toplevel of the |
|
bitset (ie. the default case). |
|
|
|
Expressions |
|
----------- |
|
|
|
Both ``<override>`` and ``<derived>`` fields make use of ``<expr>`` elements, |
|
either defined inline, or defined and named at the top level and referred to |
|
by name in multiple other places. An expression is a simple 'C' expression |
|
which can reference fields (including other derived fields) with the same |
|
``{FIELDNAME}`` syntax as display template strings. For example: |
|
|
|
.. code-block:: xml |
|
|
|
<expr name="#cat2-cat3-nop-encoding"> |
|
(({SRC1_R} != 0) || ({SRC2_R} != 0)) && ({REPEAT} == 0) |
|
</expr> |
|
|
|
In the case of ``<override>`` elements, the override applies if the expression |
|
evaluates to non-zero. In the case of ``<derived>`` fields, the expression |
|
evaluates to the value of the derived field. |
|
|
|
Encoding |
|
-------- |
|
|
|
To facilitate instruction encoding, ``<encode>`` elements can be provided |
|
to teach the generated instruction packing code how to map from data structures |
|
representing the IR to fields. For example: |
|
|
|
.. code-block:: xml |
|
|
|
<bitset name="#instruction" size="64"> |
|
<doc> |
|
Encoding of an ir3 instruction. All instructions are 64b. |
|
</doc> |
|
<gen min="300"/> |
|
<encode type="struct ir3_instruction *" case-prefix="OPC_"> |
|
<!-- |
|
Define mapping from encode src to individual fields, |
|
which are common across all instruction categories |
|
at the root instruction level |
|
|
|
Not all of these apply to all instructions, but we |
|
can define mappings here for anything that is used |
|
in more than one instruction category. For things |
|
that are specific to a single instruction category, |
|
mappings should be defined at that level instead. |
|
--> |
|
<map name="DST">src->regs[0]</map> |
|
<map name="SRC1">src->regs[1]</map> |
|
<map name="SRC2">src->regs[2]</map> |
|
<map name="SRC3">src->regs[3]</map> |
|
<map name="REPEAT">src->repeat</map> |
|
<map name="SS">!!(src->flags & IR3_INSTR_SS)</map> |
|
<map name="JP">!!(src->flags & IR3_INSTR_JP)</map> |
|
<map name="SY">!!(src->flags & IR3_INSTR_SY)</map> |
|
<map name="UL">!!(src->flags & IR3_INSTR_UL)</map> |
|
<map name="EQ">0</map> <!-- We don't use this (yet) --> |
|
<map name="SAT">!!(src->flags & IR3_INSTR_SAT)</map> |
|
</encode> |
|
</bitset> |
|
|
|
The ``type`` attribute specifies that the input to encoding an instruction |
|
is a ``struct ir3_instruction *``. In the case of bitset hierarchies with |
|
multiple possible leaf nodes, a ``case-prefix`` attribute should be supplied |
|
along with a function that maps the bitset encode source to an enum value |
|
with the specified prefix prepended to uppercase'd leaf node name. Ie. in |
|
this case, "add.f" becomes ``OPC_ADD_F``. |
|
|
|
Individual ``<map>`` elements teach the encoder how to map from the encode |
|
source to fields in the encoded instruction.
|
|
|