mirror of https://github.com/Qortal/Brooklyn
You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
159 lines
7.4 KiB
159 lines
7.4 KiB
================================================================= |
|
Intel Omni-Path (OPA) Virtual Network Interface Controller (VNIC) |
|
================================================================= |
|
|
|
Intel Omni-Path (OPA) Virtual Network Interface Controller (VNIC) feature |
|
supports Ethernet functionality over Omni-Path fabric by encapsulating |
|
the Ethernet packets between HFI nodes. |
|
|
|
Architecture |
|
============= |
|
The patterns of exchanges of Omni-Path encapsulated Ethernet packets |
|
involves one or more virtual Ethernet switches overlaid on the Omni-Path |
|
fabric topology. A subset of HFI nodes on the Omni-Path fabric are |
|
permitted to exchange encapsulated Ethernet packets across a particular |
|
virtual Ethernet switch. The virtual Ethernet switches are logical |
|
abstractions achieved by configuring the HFI nodes on the fabric for |
|
header generation and processing. In the simplest configuration all HFI |
|
nodes across the fabric exchange encapsulated Ethernet packets over a |
|
single virtual Ethernet switch. A virtual Ethernet switch, is effectively |
|
an independent Ethernet network. The configuration is performed by an |
|
Ethernet Manager (EM) which is part of the trusted Fabric Manager (FM) |
|
application. HFI nodes can have multiple VNICs each connected to a |
|
different virtual Ethernet switch. The below diagram presents a case |
|
of two virtual Ethernet switches with two HFI nodes:: |
|
|
|
+-------------------+ |
|
| Subnet/ | |
|
| Ethernet | |
|
| Manager | |
|
+-------------------+ |
|
/ / |
|
/ / |
|
/ / |
|
/ / |
|
+-----------------------------+ +------------------------------+ |
|
| Virtual Ethernet Switch | | Virtual Ethernet Switch | |
|
| +---------+ +---------+ | | +---------+ +---------+ | |
|
| | VPORT | | VPORT | | | | VPORT | | VPORT | | |
|
+--+---------+----+---------+-+ +-+---------+----+---------+---+ |
|
| \ / | |
|
| \ / | |
|
| \/ | |
|
| / \ | |
|
| / \ | |
|
+-----------+------------+ +-----------+------------+ |
|
| VNIC | VNIC | | VNIC | VNIC | |
|
+-----------+------------+ +-----------+------------+ |
|
| HFI | | HFI | |
|
+------------------------+ +------------------------+ |
|
|
|
|
|
The Omni-Path encapsulated Ethernet packet format is as described below. |
|
|
|
==================== ================================ |
|
Bits Field |
|
==================== ================================ |
|
Quad Word 0: |
|
0-19 SLID (lower 20 bits) |
|
20-30 Length (in Quad Words) |
|
31 BECN bit |
|
32-51 DLID (lower 20 bits) |
|
52-56 SC (Service Class) |
|
57-59 RC (Routing Control) |
|
60 FECN bit |
|
61-62 L2 (=10, 16B format) |
|
63 LT (=1, Link Transfer Head Flit) |
|
|
|
Quad Word 1: |
|
0-7 L4 type (=0x78 ETHERNET) |
|
8-11 SLID[23:20] |
|
12-15 DLID[23:20] |
|
16-31 PKEY |
|
32-47 Entropy |
|
48-63 Reserved |
|
|
|
Quad Word 2: |
|
0-15 Reserved |
|
16-31 L4 header |
|
32-63 Ethernet Packet |
|
|
|
Quad Words 3 to N-1: |
|
0-63 Ethernet packet (pad extended) |
|
|
|
Quad Word N (last): |
|
0-23 Ethernet packet (pad extended) |
|
24-55 ICRC |
|
56-61 Tail |
|
62-63 LT (=01, Link Transfer Tail Flit) |
|
==================== ================================ |
|
|
|
Ethernet packet is padded on the transmit side to ensure that the VNIC OPA |
|
packet is quad word aligned. The 'Tail' field contains the number of bytes |
|
padded. On the receive side the 'Tail' field is read and the padding is |
|
removed (along with ICRC, Tail and OPA header) before passing packet up |
|
the network stack. |
|
|
|
The L4 header field contains the virtual Ethernet switch id the VNIC port |
|
belongs to. On the receive side, this field is used to de-multiplex the |
|
received VNIC packets to different VNIC ports. |
|
|
|
Driver Design |
|
============== |
|
Intel OPA VNIC software design is presented in the below diagram. |
|
OPA VNIC functionality has a HW dependent component and a HW |
|
independent component. |
|
|
|
The support has been added for IB device to allocate and free the RDMA |
|
netdev devices. The RDMA netdev supports interfacing with the network |
|
stack thus creating standard network interfaces. OPA_VNIC is an RDMA |
|
netdev device type. |
|
|
|
The HW dependent VNIC functionality is part of the HFI1 driver. It |
|
implements the verbs to allocate and free the OPA_VNIC RDMA netdev. |
|
It involves HW resource allocation/management for VNIC functionality. |
|
It interfaces with the network stack and implements the required |
|
net_device_ops functions. It expects Omni-Path encapsulated Ethernet |
|
packets in the transmit path and provides HW access to them. It strips |
|
the Omni-Path header from the received packets before passing them up |
|
the network stack. It also implements the RDMA netdev control operations. |
|
|
|
The OPA VNIC module implements the HW independent VNIC functionality. |
|
It consists of two parts. The VNIC Ethernet Management Agent (VEMA) |
|
registers itself with IB core as an IB client and interfaces with the |
|
IB MAD stack. It exchanges the management information with the Ethernet |
|
Manager (EM) and the VNIC netdev. The VNIC netdev part allocates and frees |
|
the OPA_VNIC RDMA netdev devices. It overrides the net_device_ops functions |
|
set by HW dependent VNIC driver where required to accommodate any control |
|
operation. It also handles the encapsulation of Ethernet packets with an |
|
Omni-Path header in the transmit path. For each VNIC interface, the |
|
information required for encapsulation is configured by the EM via VEMA MAD |
|
interface. It also passes any control information to the HW dependent driver |
|
by invoking the RDMA netdev control operations:: |
|
|
|
+-------------------+ +----------------------+ |
|
| | | Linux | |
|
| IB MAD | | Network | |
|
| | | Stack | |
|
+-------------------+ +----------------------+ |
|
| | | |
|
| | | |
|
+----------------------------+ | |
|
| | | |
|
| OPA VNIC Module | | |
|
| (OPA VNIC RDMA Netdev | | |
|
| & EMA functions) | | |
|
| | | |
|
+----------------------------+ | |
|
| | |
|
| | |
|
+------------------+ | |
|
| IB core | | |
|
+------------------+ | |
|
| | |
|
| | |
|
+--------------------------------------------+ |
|
| | |
|
| HFI1 Driver with VNIC support | |
|
| | |
|
+--------------------------------------------+
|
|
|