===============================================
The irq_domain interrupt number mapping library
===============================================

The current design of the Linux kernel uses a single large number
space where each separate IRQ source is assigned a different number.
This is simple when there is only one interrupt controller, but in
systems with multiple interrupt controllers the kernel must ensure
that each one gets assigned non-overlapping allocations of Linux
IRQ numbers.

The number of interrupt controllers registered as unique irqchips
keeps rising: for example, subdrivers of different kinds
such as GPIO controllers avoid reimplementing the callback
mechanisms of the IRQ core system by modelling their interrupt
handlers as irqchips, i.e. in effect cascading interrupt controllers.

Here the interrupt number loses all correspondence to
hardware interrupt numbers: whereas in the past, IRQ numbers could
be chosen so they matched the hardware IRQ line into the root
interrupt controller (i.e. the component actually firing the
interrupt line to the CPU), nowadays this number is just a number.

For this reason we need a mechanism to separate controller-local
interrupt numbers, called hardware IRQs (hwirqs), from Linux IRQ numbers.

The irq_alloc_desc*() and irq_free_desc*() APIs provide allocation of
IRQ numbers, but they don't provide any support for reverse mapping of
the controller-local IRQ (hwirq) number into the Linux IRQ number
space.

The irq_domain library adds mapping between hwirq and IRQ numbers on
top of the irq_alloc_desc*() API. An irq_domain to manage mapping is
preferred over interrupt controller drivers open coding their own
reverse mapping scheme.

irq_domain also implements translation from an abstract irq_fwspec
structure to hwirq numbers (Device Tree and ACPI GSI so far), and can
be easily extended to support other IRQ topology data sources.

irq_domain usage
================

An interrupt controller driver creates and registers an irq_domain by
calling one of the irq_domain_add_*() or irq_domain_create_*() functions
(each mapping method has a different allocator function, more on that later).
The function will return a pointer to the irq_domain on success. The caller
must provide the allocator function with an irq_domain_ops structure.

In most cases, the irq_domain will begin empty without any mappings
between hwirq and IRQ numbers. Mappings are added to the irq_domain
by calling irq_create_mapping() which accepts the irq_domain and a
hwirq number as arguments. If a mapping for the hwirq doesn't already
exist then it will allocate a new Linux irq_desc, associate it with
the hwirq, and call the .map() callback so the driver can perform any
required hardware setup.

Once a mapping has been established, it can be retrieved or used via a
variety of methods:

- irq_resolve_mapping() returns a pointer to the irq_desc structure
  for a given domain and hwirq number, and NULL if there was no
  mapping.
- irq_find_mapping() returns a Linux IRQ number for a given domain and
  hwirq number, and 0 if there was no mapping.
- irq_linear_revmap() is now identical to irq_find_mapping(), and is
  deprecated.
- generic_handle_domain_irq() handles an interrupt described by a
  domain and a hwirq number.

Note that irq domain lookups must happen in contexts that are
compatible with an RCU read-side critical section.

The irq_create_mapping() function must be called *at least once*
before any call to irq_find_mapping(); otherwise the descriptor will
not be allocated.

If the driver has the Linux IRQ number or the irq_data pointer, and
needs to know the associated hwirq number (such as in the irq_chip
callbacks) then it can be directly obtained from irq_data->hwirq.

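The mapping life cycle described above can be sketched as a small
user-space model. This is not kernel code: every ``toy_*`` name, the
table sizes and the counting callback are assumptions made for
illustration, and the real irq_create_mapping() and irq_find_mapping()
do considerably more (descriptor allocation, locking, RCU).

```c
#include <assert.h>
#include <stddef.h>

#define TOY_DOMAIN_SIZE 8   /* hypothetical: valid hwirqs are 0..7 */
#define TOY_MAX_VIRQ    32  /* hypothetical: size of the Linux IRQ space */

struct toy_domain;

/* Stand-in for the driver's .map() callback; returns 0 on success. */
typedef int (*toy_map_fn)(struct toy_domain *d, unsigned int virq,
                          unsigned long hwirq);

struct toy_domain {
    unsigned int revmap[TOY_DOMAIN_SIZE]; /* hwirq -> virq, 0 = unmapped */
    unsigned long hwirq_of[TOY_MAX_VIRQ]; /* virq -> hwirq, like irq_data->hwirq */
    unsigned int next_virq;               /* next free Linux IRQ number */
    toy_map_fn map;
    int map_calls;                        /* how often .map() was invoked */
};

void toy_domain_init(struct toy_domain *d, toy_map_fn map)
{
    assert(d != NULL);
    for (size_t i = 0; i < TOY_DOMAIN_SIZE; i++)
        d->revmap[i] = 0;
    for (size_t i = 0; i < TOY_MAX_VIRQ; i++)
        d->hwirq_of[i] = 0;
    d->next_virq = 1;   /* 0 is reserved to mean "no mapping" */
    d->map = map;
    d->map_calls = 0;
}

/* Lookup only: never allocates, returns 0 if the hwirq is unmapped. */
unsigned int toy_find_mapping(const struct toy_domain *d, unsigned long hwirq)
{
    if (hwirq >= TOY_DOMAIN_SIZE)
        return 0;
    return d->revmap[hwirq];
}

/* Return the existing mapping, or allocate one and call .map(). */
unsigned int toy_create_mapping(struct toy_domain *d, unsigned long hwirq)
{
    unsigned int virq;

    if (hwirq >= TOY_DOMAIN_SIZE)
        return 0;
    virq = d->revmap[hwirq];
    if (virq)
        return virq;                    /* mapping already exists */
    if (d->next_virq >= TOY_MAX_VIRQ)
        return 0;                       /* IRQ space exhausted */
    virq = d->next_virq;
    if (d->map && d->map(d, virq, hwirq) != 0)
        return 0;                       /* driver hardware setup failed */
    d->next_virq++;
    d->revmap[hwirq] = virq;
    d->hwirq_of[virq] = hwirq;
    return virq;
}

/* Example .map() callback that only counts its invocations. */
int toy_count_map(struct toy_domain *d, unsigned int virq, unsigned long hwirq)
{
    (void)virq;
    (void)hwirq;
    d->map_calls++;
    return 0;
}
```

In this model a lookup before the first toy_create_mapping() call yields
0, a second toy_create_mapping() call for the same hwirq returns the
existing number without invoking the callback again, and the hwirq can be
read back from the per-IRQ data.
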
Types of irq_domain mappings
============================

There are several mechanisms available for reverse mapping from hwirq
to Linux irq, and each mechanism uses a different allocation function.
Which reverse map type should be used depends on the use case. Each
of the reverse map types is described below:

Linear
------

::

    irq_domain_add_linear()
    irq_domain_create_linear()

The linear reverse map maintains a fixed size table indexed by the
hwirq number. When a hwirq is mapped, an irq_desc is allocated for
the hwirq, and the IRQ number is stored in the table.

The Linear map is a good choice when the maximum number of hwirqs is
fixed and relatively small (fewer than roughly 256). The advantages of
this map are fixed time lookup for IRQ numbers, and irq_descs are only
allocated for in-use IRQs. The disadvantage is that the table must be
as large as the largest possible hwirq number.

irq_domain_add_linear() and irq_domain_create_linear() are functionally
equivalent, except that the first argument differs: the former
accepts an Open Firmware specific 'struct device_node', while the latter
accepts a more general abstraction 'struct fwnode_handle'.

The majority of drivers should use the linear map.

Tree
----

::

    irq_domain_add_tree()
    irq_domain_create_tree()

The irq_domain maintains a radix tree map from hwirq numbers to Linux
IRQs. When a hwirq is mapped, an irq_desc is allocated and the
hwirq is used as the lookup key for the radix tree.

The tree map is a good choice if the hwirq number can be very large
since it doesn't need to allocate a table as large as the largest
hwirq number. The disadvantage is that hwirq to IRQ number lookup
depends on how many entries are in the table.

irq_domain_add_tree() and irq_domain_create_tree() are functionally
equivalent, except that the first argument differs: the former
accepts an Open Firmware specific 'struct device_node', while the latter
accepts a more general abstraction 'struct fwnode_handle'.

Very few drivers should need this mapping.

No Map
------

::

    irq_domain_add_nomap()

The No Map mapping is to be used when the hwirq number is
programmable in the hardware. In this case it is best to program the
Linux IRQ number into the hardware itself so that no mapping is
required. Calling irq_create_direct_mapping() will allocate a Linux
IRQ number and call the .map() callback so that the driver can program
the Linux IRQ number into the hardware.

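The idea of programming the Linux IRQ number into the hardware can be
sketched with a user-space model. All names here (``toy_nomap_hw``,
``vector_reg``, the ``source`` parameter and the size limits) are
hypothetical; the real irq_create_direct_mapping() takes only the domain
and leaves the register write to the driver's .map() callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NOMAP_SOURCES  4   /* hypothetical: interrupt sources in the device */
#define TOY_MAX_DIRECT_IRQ 16  /* hypothetical: largest number a register can hold */

/* Hypothetical hardware: one programmable number register per source. */
struct toy_nomap_hw {
    uint8_t vector_reg[TOY_NOMAP_SOURCES];
};

struct toy_nomap_domain {
    unsigned int next_virq;     /* next free Linux IRQ number */
    struct toy_nomap_hw *hw;
};

/*
 * Allocate a Linux IRQ number and use that same number as the hwirq by
 * writing it into the hardware. Returns 0 on failure.
 */
unsigned int toy_create_direct_mapping(struct toy_nomap_domain *d,
                                       size_t source)
{
    unsigned int virq;

    assert(d != NULL && d->hw != NULL);
    if (source >= TOY_NOMAP_SOURCES)
        return 0;
    virq = d->next_virq;
    if (virq == 0 || virq >= TOY_MAX_DIRECT_IRQ)
        return 0;               /* number would not fit the hardware */
    d->next_virq++;
    d->hw->vector_reg[source] = (uint8_t)virq;  /* the ".map()" step */
    return virq;
}

/* The number reported by the hardware already is the Linux IRQ number. */
unsigned int toy_nomap_lookup(const struct toy_nomap_domain *d, size_t source)
{
    if (source >= TOY_NOMAP_SOURCES)
        return 0;
    return d->hw->vector_reg[source];
}
```

Because the value read back from the device is the Linux IRQ number
itself, the model needs no reverse map table at all.
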
Most drivers cannot use this mapping, and it is now gated on the
CONFIG_IRQ_DOMAIN_NOMAP option. Please refrain from introducing new
users of this API.

Legacy
------

::

    irq_domain_add_simple()
    irq_domain_add_legacy()
    irq_domain_create_simple()
    irq_domain_create_legacy()

The Legacy mapping is a special case for drivers that already have a
range of irq_descs allocated for the hwirqs. It is used when the
driver cannot be immediately converted to use the linear mapping. For
example, many embedded system board support files use a set of #defines
for IRQ numbers that are passed to struct device registrations. In that
case the Linux IRQ numbers cannot be dynamically assigned and the legacy
mapping should be used.

As the name implies, the \*_legacy() functions are deprecated and only
exist to ease the support of ancient platforms. No new users should be
added. The same goes for the \*_simple() functions when their use results
in the legacy behaviour.

The legacy map assumes a contiguous range of IRQ numbers has already
been allocated for the controller and that the IRQ number can be
calculated by adding a fixed offset to the hwirq number, and
vice versa. The disadvantage is that it requires the interrupt
controller to manage IRQ allocations and it requires an irq_desc to be
allocated for every hwirq, even if it is unused.

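The fixed-offset arithmetic of the legacy map can be written down in a
few lines. The following is a user-space sketch only; the structure and
function names are hypothetical, and the field names ``size``,
``first_irq`` and ``first_hwirq`` are chosen here to describe the
pre-allocated range.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a pre-allocated, contiguous IRQ range. */
struct toy_legacy_domain {
    unsigned int size;          /* number of interrupts in the range */
    unsigned int first_irq;     /* first Linux IRQ number of the range */
    unsigned long first_hwirq;  /* first hwirq of the range */
};

/*
 * hwirq -> Linux IRQ: add the fixed offset.
 * Returns 0 when the hwirq is outside the range, so this toy is only
 * unambiguous for ranges with first_irq > 0.
 */
unsigned int toy_legacy_to_irq(const struct toy_legacy_domain *d,
                               unsigned long hwirq)
{
    assert(d != NULL);
    if (hwirq < d->first_hwirq || hwirq - d->first_hwirq >= d->size)
        return 0;
    return d->first_irq + (unsigned int)(hwirq - d->first_hwirq);
}

/*
 * Linux IRQ -> hwirq: subtract the same offset.
 * Returns 1 and stores the result in *hwirq, or 0 if out of range.
 */
int toy_legacy_to_hwirq(const struct toy_legacy_domain *d, unsigned int irq,
                        unsigned long *hwirq)
{
    assert(d != NULL && hwirq != NULL);
    if (irq < d->first_irq || irq - d->first_irq >= d->size)
        return 0;
    *hwirq = d->first_hwirq + (irq - d->first_irq);
    return 1;
}
```

With a range of 16 interrupts starting at Linux IRQ 32 and hwirq 0,
hwirq 5 corresponds to Linux IRQ 37 in this model, and neither direction
needs a table lookup.
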
The legacy map should only be used if fixed IRQ mappings must be
supported. For example, ISA controllers would use the legacy map for
mapping Linux IRQs 0-15 so that existing ISA drivers get the correct IRQ
numbers.

Most users of legacy mappings should use irq_domain_add_simple() or
irq_domain_create_simple() which will use a legacy domain only if an IRQ range
is supplied by the system and will otherwise use a linear domain mapping.
The semantics of this call are such that if an IRQ range is specified then
descriptors will be allocated on-the-fly for it, and if no range is
specified it will fall through to irq_domain_add_linear() or
irq_domain_create_linear() which means *no* irq descriptors will be allocated.

A typical use case for simple domains is where an irqchip provider
is supporting both dynamic and static IRQ assignments.

In order to avoid ending up in a situation where a linear domain is
used and no descriptor gets allocated it is very important to make sure
that the driver using the simple domain calls irq_create_mapping()
before any irq_find_mapping() since the latter will actually work
for the static IRQ assignment case.

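The difference between the static and the dynamic case of a simple
domain can be illustrated with a user-space model. The ``toy_simple_*``
names and the table size are hypothetical, and the model ignores
descriptor allocation entirely; it only shows why a lookup succeeds
immediately in the static case but returns 0 in the dynamic case until a
mapping has been created.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SIMPLE_SIZE 4  /* hypothetical: hwirqs handled by the controller */

struct toy_simple_domain {
    unsigned int revmap[TOY_SIMPLE_SIZE]; /* hwirq -> Linux IRQ, 0 = unmapped */
    unsigned int next_virq;               /* next number for dynamic allocation */
};

/*
 * first_irq != 0: static assignment, every hwirq is mapped up front at a
 *                 fixed offset (legacy behaviour).
 * first_irq == 0: dynamic assignment, the domain starts empty (linear
 *                 behaviour).
 */
void toy_simple_init(struct toy_simple_domain *d, unsigned int first_irq)
{
    assert(d != NULL);
    for (size_t i = 0; i < TOY_SIMPLE_SIZE; i++)
        d->revmap[i] = first_irq ? first_irq + (unsigned int)i : 0;
    d->next_virq = 1;   /* 0 is reserved to mean "no mapping" */
}

/* Lookup only: never creates a mapping. */
unsigned int toy_simple_find(const struct toy_simple_domain *d, size_t hwirq)
{
    return hwirq < TOY_SIMPLE_SIZE ? d->revmap[hwirq] : 0;
}

/* Return the existing mapping or allocate a new number for the hwirq. */
unsigned int toy_simple_create(struct toy_simple_domain *d, size_t hwirq)
{
    if (hwirq >= TOY_SIMPLE_SIZE)
        return 0;
    if (d->revmap[hwirq] == 0)
        d->revmap[hwirq] = d->next_virq++;
    return d->revmap[hwirq];
}
```

A driver that only ever calls the lookup function would appear to work
when tested with a static range and then fail once the same code runs
with dynamic assignment, which is the pitfall described above.
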
irq_domain_add_simple() and irq_domain_create_simple() as well as
irq_domain_add_legacy() and irq_domain_create_legacy() are functionally
equivalent, except that the first argument differs: the
irq_domain_add_*() variants accept an Open Firmware specific
'struct device_node', while the irq_domain_create_*() variants accept a
more general abstraction 'struct fwnode_handle'.

Hierarchy IRQ domain
--------------------

On some architectures, there may be multiple interrupt controllers
involved in delivering an interrupt from the device to the target CPU.
Let's look at a typical interrupt delivery path on x86 platforms::

    Device --> IOAPIC -> Interrupt remapping Controller -> Local APIC -> CPU

There are three interrupt controllers involved:

1) IOAPIC controller
2) Interrupt remapping controller
3) Local APIC controller

To support such a hardware topology and make software architecture match
hardware architecture, an irq_domain data structure is built for each
interrupt controller and those irq_domains are organized into a hierarchy.
When building the irq_domain hierarchy, the irq_domain nearest to the
device is the child and the irq_domain nearest to the CPU is the parent.
So a hierarchy structure as below will be built for the example above::

    CPU Vector irq_domain (root irq_domain to manage CPU vectors)
        ^
        |
    Interrupt Remapping irq_domain (manage irq_remapping entries)
        ^
        |
    IOAPIC irq_domain (manage IOAPIC delivery entries/pins)

There are four major interfaces to use hierarchy irq_domain:

1) irq_domain_alloc_irqs(): allocate IRQ descriptors and interrupt
   controller related resources to deliver these interrupts.
2) irq_domain_free_irqs(): free IRQ descriptors and interrupt controller
   related resources associated with these interrupts.
3) irq_domain_activate_irq(): activate interrupt controller hardware to
   deliver the interrupt.
4) irq_domain_deactivate_irq(): deactivate interrupt controller hardware
   to stop delivering the interrupt.

The following changes are needed to support hierarchy irq_domain:

1) a new field 'parent' is added to struct irq_domain; it's used to
   maintain irq_domain hierarchy information.
2) a new field 'parent_data' is added to struct irq_data; it's used to
   build hierarchy irq_data to match hierarchy irq_domains. The irq_data
   is used to store irq_domain pointer and hardware irq number.
3) new callbacks are added to struct irq_domain_ops to support hierarchy
   irq_domain operations.

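How the 'parent' and 'parent_data' fields fit together can be sketched
in user space. This is a simplified model with hypothetical ``toy_*``
names: in the kernel each level's .alloc callback asks its parent to
allocate, whereas the loop below flattens that recursion and simply
hands out the next free hwirq at every level.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LEVELS 4  /* hypothetical: deepest hierarchy supported */

struct toy_hdomain {
    const char *name;
    struct toy_hdomain *parent;  /* NULL for the root domain */
    unsigned long next_hwirq;    /* next free resource of this controller */
};

/* One entry per domain level for a single Linux IRQ. */
struct toy_irq_data {
    struct toy_hdomain *domain;
    unsigned long hwirq;
    struct toy_irq_data *parent_data;  /* NULL at the root level */
};

/*
 * Build the irq_data chain for one interrupt, starting at the domain
 * nearest to the device. chain[0] belongs to the child domain and each
 * entry points to the entry of its parent domain.
 * Returns the number of levels filled in, or 0 if the hierarchy is
 * empty or deeper than TOY_MAX_LEVELS.
 */
size_t toy_hier_alloc(struct toy_hdomain *child,
                      struct toy_irq_data chain[TOY_MAX_LEVELS])
{
    size_t depth = 0;
    size_t n = 0;

    assert(chain != NULL);
    for (struct toy_hdomain *d = child; d != NULL; d = d->parent)
        depth++;
    if (depth == 0 || depth > TOY_MAX_LEVELS)
        return 0;

    for (struct toy_hdomain *d = child; d != NULL; d = d->parent) {
        chain[n].domain = d;
        chain[n].hwirq = d->next_hwirq++;   /* controller-local number */
        chain[n].parent_data = NULL;
        if (n > 0)
            chain[n - 1].parent_data = &chain[n];
        n++;
    }
    return n;
}
```

For the x86 example above, one allocation through the IOAPIC domain
yields three linked entries in this model: an IOAPIC pin, an interrupt
remapping entry and a CPU vector, each stored as the hwirq of its own
level.
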
With support for hierarchy irq_domain and hierarchy irq_data in place, an
irq_domain structure is built for each interrupt controller, and an
irq_data structure is allocated for each irq_domain associated with an
IRQ. Now we can go one step further to support a stacked (hierarchy)
irq_chip. That is, an irq_chip is associated with each irq_data along
the hierarchy. A child irq_chip may implement a required action by
itself or by cooperating with its parent irq_chip.

With a stacked irq_chip, an interrupt controller driver only needs to deal
with the hardware managed by itself and may ask for services from its
parent irq_chip when needed. This results in a much cleaner
software architecture.

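The cooperation between a child irq_chip and its parent can be modelled
in user space as follows. All ``toy_*`` names are hypothetical and only
a mask operation is shown; the forwarding helper plays a role similar in
spirit to the kernel's irq_chip_mask_parent().

```c
#include <assert.h>
#include <stddef.h>

struct toy_sdata;

/* Stand-in for an irq_chip with a single operation. */
struct toy_chip {
    const char *name;
    void (*mask)(struct toy_sdata *data);
};

/* Stand-in for one level of hierarchical irq_data. */
struct toy_sdata {
    const struct toy_chip *chip;
    struct toy_sdata *parent_data;  /* NULL at the root level */
    int masked;                     /* models this level's hardware state */
};

/* Forward the mask request to the parent level's chip. */
void toy_chip_mask_parent(struct toy_sdata *data)
{
    struct toy_sdata *p = data->parent_data;

    assert(p != NULL && p->chip != NULL && p->chip->mask != NULL);
    p->chip->mask(p);
}

/* Root chip: only touches its own hardware. */
void toy_root_mask(struct toy_sdata *data)
{
    data->masked = 1;
}

/* Child chip: handles its own hardware, then asks the parent for help. */
void toy_child_mask(struct toy_sdata *data)
{
    data->masked = 1;
    toy_chip_mask_parent(data);
}

const struct toy_chip toy_root_chip = { "root", toy_root_mask };
const struct toy_chip toy_child_chip = { "child", toy_child_mask };
```

Calling the child's mask operation in this model updates the child's own
state and then the parent's, without the child knowing anything about
the parent's hardware.
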
For an interrupt controller driver to support hierarchy irq_domain, it
needs to:

1) Implement irq_domain_ops.alloc and irq_domain_ops.free.
2) Optionally implement irq_domain_ops.activate and
   irq_domain_ops.deactivate.
3) Optionally implement an irq_chip to manage the interrupt controller
   hardware.
4) Leave irq_domain_ops.map and irq_domain_ops.unmap unimplemented;
   they are unused with hierarchy irq_domain.

Hierarchy irq_domain is in no way x86 specific, and is heavily used to
support other architectures, such as ARM, ARM64 etc.

Debugging
=========

Most of the internals of the IRQ subsystem are exposed in debugfs by
turning CONFIG_GENERIC_IRQ_DEBUGFS on.