dropdown menu

I/O - AIO, DIO, CIO, RAW


Filesystem I/O

AIX has special features to enhance the performance of of filesystem I/O for general-purpose file access. These features include read ahead, write behind and I/O buffering. Oracle employs its own I/O optimization and buffering that  in most cases are redundant to those provided by AIX file systems. Oracle uses buffer cache management (data blocks buffered to shared memory), and AIX uses virtual memory management (data buffered in virtual memory). If both try to manage data caching, it result in wasted memory, cpu and suboptimal performance.

Generally better to allow Oracle to manage I/O buffering, because it has information regarding the context, so can better optimze memory usage.


Asynchronous I/O

A read can be considered to be synchronous if a disk operation is required to read the data into memory. In this case, application processing cannot continue until the I/O operation is complete. Asynchronous I/O allows applications to initate read or write operations without being blocked, since all I/O operations are done in background. This can improve performance, because I/O operations and appl. processing can run simultaenously.

Asynchronous I/O on filesystems is handled through a kernel process called: aioserver (in this case each I/O is handled by a single kproc)

The minimum number of servers (aioserver) configured, when asynchronous I/O is enabled, is 1 (minservers). Additional aioservers are started when more asynchronous I/O is requested. Tha maximum number of servers is controlled by maxservers. aioserver kernel threads do not go away once started, until the system reboots (so with "ps -k" we can see what was the max number of aio servers that were needed concurrently at some time in the past)

How many should you configure?
The rule of thumb is to set the maximum number of servers (maxservers) equal to ten times the amount of disk or ten times the amount of processors. MinServers would be set at half of this amount. Other than having some more kernel processes hanging out that really don't get used (using a small amount of kernel memory), there really is little risk in oversizing the amount of MaxServers, so don't be afraid to bump it up.

root@aix31: / # lsattr -El aio0
autoconfig defined STATE to be configured at system restart True
fastpath   enable  State of fast path                       True
kprocprio  39      Server PRIORITY                          True
maxreqs    4096    Maximum number of REQUESTS               True   
maxservers 10      MAXIMUM number of servers per cpu        True   
minservers 1       MINIMUM number of servers                True   

maxreqs     <-max number of aio requests that can  be outstanding at one time
maxservers  <-if you have 4 CPU then tha max count of aio kernel threds would be 40
minservers  <-this amount will start at boot (this is not per CPU)

Oracle takes full advantage of Asynchronous I/O provided by AIX, resulting in faster database access.

on AIX 6.1: ioo is handling aio (ioo -a)

mkdev -l aio0                 enables the AIO device driver (smitty aio)
aioo -a                       shows the value of minservers, maxservers...(or lsattr -El aio0)
chdev -l aio0 -a maxservers='30'    changes the maxserver value to 30 (it will show the new value, butit will be active only after reboot)
ps -k    | grep aio | wc -l   shows how many aio servers are running
                              (these are not necessarily are in use, maybe many of them are just hanging there)
pstat -a                      shows the asynchronous I/O servers by name

iostat -AQ 2 2                it will show if any aio is in use by filesystems
iostat -AQ 1 | grep -v "            0              "    it will omit the empty lines
                              it will show which filesystems are active in regard to aio.
                              under the count column will show the specified fs requested how much aio...
                              (it is good to see which fs is aio intensive)

root@aix10: /root # ps -kf | grep aio        <--it will show the accumulated CPU time of each aio process
    root  127176       1   0   Mar 24      - 15:07 aioserver
    root  131156       1   0   Mar 24      - 14:40 aioserver
    root  139366       1   0   Mar 24      - 14:51 aioserver
    root  151650       1   0   Mar 24      - 14:02 aioserver

It is good to compare these times of each process to see if more aioservers are needed or not. If the times are identical (only few minutes differences) it means all of them are used maximally so more precesses are needed.

------------------------------------

iostat -A                     reports back asynchronous I/O statistics

aio: avgc avfc maxgc maxfc maxreqs avg-cpu: % user % sys % idle % iowait
     10.2  0.0     5     0    4096            20.6   4.5   64.7     10.3


avgc: This reports back the average global asynchronous I/O request per second of the interval you specified.
avfc: This reports back the average fastpath request count per second for your interval.

------------------------------------

Changing aio parameters:
You can set the values online, with no interruption of service – BUT – they will not take affect until the next time the kernel is booted

1. lsattr -El aio0                          <-- check current setting for aio0 device
2. chdev -l aio0 -a maxreqs=<value> -P      <-- set the value of maxreqs permanently for next reboot
3. restart server

------------------------------------

0509-036 Cannot load program aioo because...

if you receive this:
root@aix30: / # aioo -a
exec(): 0509-036 Cannot load program aioo because of the following errors:
        0509-130 Symbol resolution failed for aioo because:
    ....

probably aioserver is in defined state:

root@aix30: / # lsattr -El aio0
autoconfig defined STATE to be configured at system restart True

You should make it available with: mkdev -l aio0
(and also change it for future restarts: chdev -l aio0 -a autoconfig=available, or with 'smitty aio')

------------------------------------



DIRECT I/O (DIO)

Direct I/O is an alternative non-caching policy which causes file data to be transferred directly to the disk from the application or directly from the disk to the application without going through the VMM file cache.

Direct I/O reads cause synchrounous reads from the disk whereas with the normal cached policy the reads may be satisfied from the cache. This can result in poor performance if the data was likely to be in memory under the normal caching policy.

Direct I/O can be enabled: mount -o dio

If JFS2 DIO or CIO options are active, no filesystem cache is being used for Oracle .dbf and/or online redo logs files.

Databases normally manage data caching at application level, so the do not need the filesystem to implement this service for them. The use of the file buffer cache result in undesirable overhead, since data is first moved from the disk to the file buffer cache  and from there to the application buffer. This "double-copying" of data results in additional CPU and memory consumption.

JFS2 supports DIO as well CIO. The CIO model is built on top of the DIO. For JFS2 based environments, CIO should always be used (instead of DIO) for those situations where the bypass of filesystem cache is appropriate.

JFS DIO should only be used:
On Oracle data (.dbf) files, where DB_BLOCK_SIZE is 4k or graeter. (Use of JFS DIO on any other files (e.g redo logs, control files) is likely to result in a severe performance penalty.

------------------------------

CONCURRENT I/O (CIO)

The inode lock imposes write serialization at the file level. JFS2 (by default) employs serialization mechanisms to ensure the integrity of data being updated. An inode lock is used to ensure that there is at most one outstanding write I/O to a file at any point in time, reads are not allowed because they may result in reading stale data.
Oracle implements its own I/O serialization mechanisms to ensure data integrity, so JFS2 offers Concurrent I/O option. Under CIO, multiple threads may simulteanously perform reads and writes on a shared file. Applications that do not enforce serialization should not use CIO (data corruption or perf. issues can occure).

CIO invokes direct I/O, so it has all the other performance considerations associated with direct I/O. With standard direct I/O, inodes are locked to prevent a condition where multiple threads might try to change the consults of a file simultaneously. Concurrent I/O bypasses the inode lock, which allows multiple threads to read and write data concurrently to the same file.

CIO includes the performance benefits previously available with DIO, plus the elimination of the contention on the inode lock.

Concurrent I/O should only be used:
Oracle .dbf files, online redo logs and/or control files.

When used for online redo logs or control files, these files should be isolated in their own JFS2 filesystems, with agblksize= 512.
Filesystems containing .dbf files, should be created with:
    -agblksize=2048 if DB_BLOCK_SIZE=2k
    -agblksize=4096 if DB_BLOCK_SIZE>=4k

(Failure to implement these agblksize values is likely to result in a severe performance penalty.)

Do not under aby circumstances, use CIO mount option for the filesystem containing the Oracle binaries (!!!).
Additionaly, do not use DIO/CIO options for filesystems containing archive logs or any other files do not discussed here.

Applications that use raw logical volumes fo data storage don't encounter inode lock contention since they don't access files.


fsfastpath should be enabled to initiate aio requestes directly to LVM or disk, for maximum performance (aioo -a)
------------------------------

RAW I/O

When using raw devices with Oracle, the devices are either raw logical volumes or raw disks. When using raw disks, the LVM layer is bypassed. The use of raw lv's is recommended for Oracle data files, unless ASM is used. ASM has the capability to create data files, which do not need to be mapped directly to disks. With ASM, using raw disks is preferred.

15 comments:

  1. No non-sense post.. Cheers!

    ReplyDelete
  2. Thanks!! Very informative...

    ReplyDelete
  3. Excellent friend , you can update this blog .... very good

    ReplyDelete
  4. This is the best blog I ever saw for AIX, you should start Video demonstrations.

    ReplyDelete
    Replies
    1. Thanks...I'll think about that.

      Delete
  5. HI Can you please tell me how to create raw file systems

    ReplyDelete
  6. HI Can you please tell me how to create raw file systems or raw devices in an AIX server

    ReplyDelete
    Replies
    1. If a logical volume has no file system in it , that is consider as raw device

      Delete
  7. raw filesystem is non existent, raw lv means it has not been formatted by any filesystem type (e.g. jfs, jfs2, jfs2log)

    ReplyDelete