=================================================
Linux API for read access to z/VM Monitor Records
=================================================

Date : 2004-Nov-26

Author: Gerald Schaefer ([email protected])


Description
===========
This item delivers a new Linux API in the form of a misc char device that is
usable from user space and allows read access to the z/VM Monitor Records
collected by the `*MONITOR` System Service of z/VM.


User Requirements
=================
The z/VM guest on which you want to access this API needs to be configured in
order to allow IUCV connections to the `*MONITOR` service, i.e. it needs the
IUCV `*MONITOR` statement in its user entry. If the monitor DCSS to be used is
restricted (likely), you also need the NAMESAVE <DCSS NAME> statement.
This item will use the IUCV device driver to access the z/VM services, so you
need a kernel with IUCV support. You also need z/VM version 4.4 or 5.1.

There are two options for being able to load the monitor DCSS (examples assume
that the monitor DCSS begins at 144 MB and ends at 152 MB). You can query the
location of the monitor DCSS with the Class E privileged CP command Q NSS MAP
(the values BEGPAG and ENDPAG are given in units of 4K pages).
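
As an illustration of the page arithmetic, the following sketch converts a
BEGPAG/ENDPAG pair into byte addresses and a segment size. It assumes that both
values are page numbers printed in hexadecimal and that ENDPAG names the last
page belonging to the segment; the helper name and the sample values are
hypothetical and were chosen to match the 144 MB to 152 MB range used in the
examples of this document.

```python
PAGE_SIZE = 4096  # BEGPAG and ENDPAG are given in units of 4K pages


def dcss_range(begpag_hex: str, endpag_hex: str) -> tuple[int, int, int]:
    """Return (start address, exclusive end address, size in bytes)."""
    # Assumption: both values are hexadecimal page numbers and ENDPAG is
    # the last page that still belongs to the segment.
    start = int(begpag_hex, 16) * PAGE_SIZE
    end = (int(endpag_hex, 16) + 1) * PAGE_SIZE
    return start, end, end - start


if __name__ == "__main__":
    # Hypothetical values for a DCSS from 144 MB to 152 MB.
    mb = 1024 * 1024
    start, end, size = dcss_range("9000", "97FF")
    print(f"DCSS spans {start // mb} MB to {end // mb} MB ({size // mb} MB)")
```

The resulting size is also an upper bound for the buffer an application needs
when reading from the device.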

See also "CP Command and Utility Reference" (SC24-6081-00) for more information
on the DEF STOR and Q NSS MAP commands, as well as "Saved Segments Planning
and Administration" (SC24-6116-00) for more information on DCSSes.

1st option:
-----------
You can use the CP command DEF STOR CONFIG to define a "memory hole" in your
guest virtual storage around the address range of the DCSS.

Example: DEF STOR CONFIG 0.140M 200M.200M

This defines two blocks of storage: the first is 140MB in size and begins at
address 0MB, the second is 200MB in size and begins at address 200MB,
resulting in a total storage of 340MB. Note that the first block should
always start at 0 and be at least 64MB in size.

2nd option:
-----------
Your guest virtual storage has to end below the starting address of the DCSS
and you have to specify the "mem=" kernel parameter in your parmfile with a
value greater than the ending address of the DCSS.

Example::

    DEF STOR 140M

This defines a storage size of 140MB for your guest; the parameter "mem=160M"
is added to the parmfile.
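
Both options come down to simple address arithmetic, which the following sketch
checks for the example values above. The function names are hypothetical and
the DCSS range (144 MB to 152 MB) is the one assumed throughout this document;
this is only a plausibility check, not something the driver does.

```python
MB = 1024 * 1024


def overlaps(block_start: int, block_size: int,
             dcss_start: int, dcss_end: int) -> bool:
    """True if the block [start, start + size) intersects the DCSS range."""
    return block_start < dcss_end and dcss_start < block_start + block_size


def check_option1(blocks, dcss_start: int, dcss_end: int) -> bool:
    """DEF STOR CONFIG: no storage block may overlap the DCSS."""
    return not any(overlaps(s, n, dcss_start, dcss_end) for s, n in blocks)


def check_option2(guest_size: int, mem_param: int,
                  dcss_start: int, dcss_end: int) -> bool:
    """DEF STOR plus mem=: storage ends below the DCSS, mem= lies above it."""
    return guest_size <= dcss_start and mem_param > dcss_end


if __name__ == "__main__":
    dcss = (144 * MB, 152 * MB)
    # DEF STOR CONFIG 0.140M 200M.200M as (address, size) pairs
    blocks = [(0, 140 * MB), (200 * MB, 200 * MB)]
    print("total storage:", sum(n for _, n in blocks) // MB, "MB")
    print("option 1 ok:", check_option1(blocks, *dcss))
    # DEF STOR 140M with mem=160M
    print("option 2 ok:", check_option2(140 * MB, 160 * MB, *dcss))
```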


User Interface
==============
The char device is implemented as a kernel module named "monreader",
which can be loaded via the modprobe command, or it can be compiled into the
kernel instead. There is one optional module (or kernel) parameter, "mondcss",
to specify the name of the monitor DCSS. If the module is compiled into the
kernel, the kernel parameter "monreader.mondcss=<DCSS NAME>" can be specified
in the parmfile.

The default name for the DCSS is "MONDCSS" if none is specified. If other
users are already connected to the `*MONITOR` service (e.g. Performance
Toolkit), the monitor DCSS is already defined and you have to use
the same DCSS. The CP command Q MONITOR (Class E privileged) shows the name
of the monitor DCSS, if already defined, and the users connected to the
`*MONITOR` service.
Refer to the "z/VM Performance" book (SC24-6109-00) on how to create a monitor
DCSS if your z/VM doesn't have one already. You need Class E privileges to
define and save a DCSS.

Example:
--------

::

    modprobe monreader mondcss=MYDCSS

This loads the module and sets the DCSS name to "MYDCSS".

NOTE:
-----
This API provides no interface to control the `*MONITOR` service, e.g. to
specify which data should be collected. This can be done by the CP command
MONITOR (Class E privileged), see "CP Command and Utility Reference".

Device nodes with udev:
-----------------------
After loading the module, a char device will be created along with the device
node /<udev directory>/monreader.

Device nodes without udev:
--------------------------
If your distribution does not support udev, a device node will not be created
automatically and you have to create it manually after loading the module.
To do so, you need to know the major and minor numbers of the device. These
numbers can be found in /sys/class/misc/monreader/dev.

Typing cat /sys/class/misc/monreader/dev will give an output of the form
<major>:<minor>. The device node can be created via the mknod command: enter
mknod <name> c <major> <minor>, where <name> is the name of the device node
to be created.

Example:
--------

::

    # modprobe monreader
    # cat /sys/class/misc/monreader/dev
    10:63
    # mknod /dev/monreader c 10 63

This loads the module with the default monitor DCSS (MONDCSS) and creates a
device node.
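
The same steps can also be scripted. The sketch below parses the
<major>:<minor> string and creates the node with os.mknod; the function names
and the 0600 permission are assumptions made for illustration, and creating the
node requires root privileges just like the mknod command.

```python
import os
import stat


def parse_dev(text: str) -> tuple[int, int]:
    """Split the '<major>:<minor>' string from sysfs into two integers."""
    major, minor = text.strip().split(":")
    return int(major), int(minor)


def create_node(sysfs_dev: str = "/sys/class/misc/monreader/dev",
                node: str = "/dev/monreader") -> None:
    """Create the char device node for the numbers found in sysfs."""
    with open(sysfs_dev) as f:
        major, minor = parse_dev(f.read())
    # S_IFCHR marks a character device; mode 0600 is an assumption here.
    os.mknod(node, stat.S_IFCHR | 0o600, os.makedev(major, minor))
```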

File operations:
----------------
The following file operations are supported: open, release, read, poll.
There are two alternative methods for reading: either non-blocking read in
conjunction with polling, or blocking read without polling. IOCTLs are not
supported.
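
A minimal sketch of the first method (non-blocking read in conjunction with
polling) might look as follows. The function names, the timeout, and the
/dev/monreader path are assumptions; the code only relies on generic poll and
read semantics and has not been tied to driver internals.

```python
import os
import select


def open_nonblocking(path: str = "/dev/monreader") -> int:
    """Open the device for non-blocking reads (path is an assumption)."""
    return os.open(path, os.O_RDONLY | os.O_NONBLOCK)


def read_when_ready(fd: int, max_bytes: int, timeout_ms: int = 1000):
    """Poll fd once and perform a single non-blocking read.

    Returns the bytes read (b"" for a 0 byte read), or None if nothing
    became readable within the timeout or the read would have blocked.
    """
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    if not poller.poll(timeout_ms):
        return None
    try:
        return os.read(fd, max_bytes)
    except BlockingIOError:  # EAGAIN: no data available at the moment
        return None
```

For the second method, the device is opened without O_NONBLOCK and os.read is
called directly; it then blocks until data is available.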

Read:
-----
Reading from the device provides a 12 Byte monitor control element (MCE),
followed by a set of one or more contiguous monitor records (similar to the
output of the CMS utility MONWRITE without the 4K control blocks). The MCE
contains information on the type of the following record set (sample/event
data), the monitor domains contained within it and the start and end address
of the record set in the monitor DCSS. The start and end address can be used
to determine the size of the record set; the end address is the address of the
last byte of data. The start address is needed to handle "end-of-frame" records
correctly (domain 1, record 13), i.e. it can be used to determine the record
start offset relative to a 4K page (frame) boundary.
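
The size and offset calculations described above can be sketched as follows.
Note that the field order used here (4 bytes of type and domain information,
followed by the start and end address as 32-bit big-endian integers) is an
assumption that must be verified against the MCE layout in the "z/VM
Performance" book; the type and function names are hypothetical.

```python
import struct
from typing import NamedTuple

MCE_SIZE = 12      # size of a monitor control element in bytes
FRAME_SIZE = 4096  # 4K page (frame)


class MCE(NamedTuple):
    info: bytes  # type and domain information, left undecoded here
    start: int   # address of the first byte of the record set
    end: int     # address of the last byte of the record set


def parse_mce(raw: bytes) -> MCE:
    # Assumed layout: 4 info bytes, then start and end address as 32-bit
    # big-endian integers. Check Appendix A before relying on this.
    info, start, end = struct.unpack(">4sII", raw[:MCE_SIZE])
    return MCE(info, start, end)


def record_set_size(mce: MCE) -> int:
    """The end address is the last byte of data, hence the + 1."""
    return mce.end - mce.start + 1


def frame_offset(mce: MCE) -> int:
    """Offset of the record set start relative to its 4K frame boundary."""
    return mce.start % FRAME_SIZE
```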

See "Appendix A: `*MONITOR`" in the "z/VM Performance" document for a description
of the monitor control element layout. The layout of the monitor records can
be found here (z/VM 5.1): https://www.vm.ibm.com/pubs/mon510/index.html

The layout of the data stream provided by the monreader device is as follows::

    ...
    <0 byte read>
    <first MCE>              \
    <first set of records>    |
    ...                       |- data set
    <last MCE>                |
    <last set of records>    /
    <0 byte read>
    ...

There may be more than one combination of MCE and corresponding record set
within one data set and the end of each data set is indicated by a successful
read with a return value of 0 (0 byte read).
Any received data must be considered invalid until a complete set was
read successfully, including the closing 0 byte read. Therefore you should
always read the complete set into a buffer before processing the data.
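
The buffering rule can be sketched as a small loop that collects chunks until
the closing 0 byte read. The helper name, the chunk size, and the idea of
passing the read function as a parameter are assumptions made so the example
stays self-contained; any exception raised by the read discards the partial
buffer, as required for invalid data.

```python
from typing import Callable


def read_data_set(read_chunk: Callable[[int], bytes],
                  chunk_size: int = 65536) -> bytes:
    """Collect one complete data set before it is processed.

    read_chunk(n) should behave like a blocking read on the device and
    return b"" for the 0 byte read that closes a data set.
    """
    buf = bytearray()
    while True:
        chunk = read_chunk(chunk_size)
        if not chunk:  # closing 0 byte read: the set is complete
            return bytes(buf)
        buf += chunk
```

With a device opened in blocking mode via fd = os.open("/dev/monreader",
os.O_RDONLY), a call could look like read_data_set(lambda n: os.read(fd, n)).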

The maximum size of a data set can be as large as the size of the
monitor DCSS, so size the buffer accordingly or use dynamic memory allocation.
The size of the monitor DCSS will be printed into syslog after loading the
module. You can also use the (Class E privileged) CP command Q NSS MAP to
list all available segments and information about them.

As with most char devices, error conditions are indicated by returning a
negative value for the number of bytes read. In this case, the errno variable
indicates the error condition:

EIO:
     reply failed, read data is invalid and the application
     should discard the data read since the last successful read with 0 size.
EFAULT:
     copy_to_user failed, read data is invalid and the application should
     discard the data read since the last successful read with 0 size.
EAGAIN:
     occurs on a non-blocking read if there is no data available at the
     moment. There is no data missing or corrupted, just try again or rather
     use polling for non-blocking reads.
EOVERFLOW:
     message limit reached, the data read since the last successful
     read with 0 size is valid but subsequent records may be missing.

In the last case (EOVERFLOW) there may be missing data, in the first two cases
(EIO, EFAULT) there will be missing data. It is up to the application whether
to continue reading subsequent data or to exit.
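
One possible way for an application to act on these error codes is sketched
below. The three action names and the function are hypothetical; the mapping
itself simply restates the list above.

```python
import errno

DISCARD = "discard"            # drop data read since the last 0 byte read
RETRY = "retry"                # nothing lost, poll or read again
KEEP_BUT_GAP = "keep-but-gap"  # data so far is valid, later records may be missing


def classify_read_error(exc: OSError) -> str:
    """Map the errno of a failed read to the action described above."""
    if exc.errno in (errno.EIO, errno.EFAULT):
        return DISCARD
    if exc.errno == errno.EAGAIN:
        return RETRY
    if exc.errno == errno.EOVERFLOW:
        return KEEP_BUT_GAP
    raise exc  # any other error is not covered by this document
```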

Open:
-----
Only one user is allowed to open the char device. If it is already in use, the
open function will fail (return a negative value) and set errno to EBUSY.
The open function may also fail if an IUCV connection to the `*MONITOR` service
cannot be established. In this case errno will be set to EIO and an error
message with an IPUSER SEVER code will be printed into syslog. The IPUSER SEVER
codes are described in the "z/VM Performance" book, Appendix A.
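
A caller may want to tell the "already in use" case apart from other failures.
The following is a minimal sketch under the assumption that the device node is
/dev/monreader; the function name and the injectable opener parameter are
hypothetical and only exist to keep the example testable.

```python
import errno
import os


def open_monreader(path: str = "/dev/monreader", opener=os.open):
    """Return a file descriptor, or None if another user holds the device."""
    try:
        return opener(path, os.O_RDONLY)
    except OSError as exc:
        if exc.errno == errno.EBUSY:
            return None  # only one user may open the device at a time
        raise  # e.g. EIO: no IUCV connection, see syslog for the SEVER code
```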

NOTE:
-----
As soon as the device is opened, incoming messages will be accepted and they
will count towards the message limit, i.e. opening the device without reading
from it will eventually provoke the "message limit reached" error (EOVERFLOW
error code).