HDK Technical Reference


DMA (Direct-Memory Access) refers to the I/O strategy where data is transferred between system memory and the device without the aid of the CPU. There are two main categories of DMA devices:

Allocating DMA memory (DDI)

DDI drivers specify the physical requirements for DMA memory in a physreq(D4) structure. The structure is allocated with the physreq_alloc(D3) function, populated by the driver, and then prepared with the physreq_prep(D3) function. The memory is then allocated with one of the following functions, each of which uses the physreq specifications:

Allocate memory that can be accessed by the device. This is mostly used for memory that is used for control and status information.

Same as kmem_alloc_phys( ) except that the allocated memory is zeroed on allocation.

kmem_alloc_phys(D3) (DDI 8 only)
Allocate memory that can be accessed by the device.

Free memory obtained with kmem_alloc_physreq( ), kmem_zalloc_physreq( ), or kmem_alloc_phys( ).

Allocate STREAMS message memory using a physreq structure.

Check that the transmitted message satisfies the specified physical requirements.

Concatenate the bytes of the message if necessary to create a new message that matches the physical requirements.


Allocating DMA memory (ODDI)

Allocate kernel memory. By default in ODDI, the memory allocated is below the 16MB boundary and so can be used for DMA for any device. Use the KM_NO_DMA flag when the allocated memory can be above the 16MB limit.

Similar to kmem_alloc( ) but the contents of the buffer are zeroed out before returning.

Allocate either physical or virtual memory depending on the flags provided. Use the DMAABLE flag if the memory allocated must be under 16MB.

Allocate STREAMS memory.

Considerations for allocating DMA memory

The term ``DMA-able memory'' is a bit ambiguous since each device has a different definition of what constitutes DMA-able memory. Generally, devices must be concerned with physical contiguity and physical addresses.

Devices that do not support scatter/gather operations can only address one chunk of physically-contiguous memory, identified by the physical base address and the size of the chunk. Other devices may be able to address more than one chunk of physically-contiguous memory, but still have limits as to how many chunks there can be. If a buffer contains more memory fragments than the device can address, the driver must copy the extra fragments into the last buffer the device can address. Some drivers that can address unlimited chunks of memory may perform better if they copy data rather than manage a large set of DMA descriptors.
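The fragment limit described above can be illustrated with a small user-space sketch. It is not HDK code: the struct frag type and the bytes_to_copy( ) helper are hypothetical names introduced only for this illustration. It assumes the driver already has a list of physically contiguous fragments and knows the maximum number of scatter/gather entries the device accepts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one physically contiguous fragment. */
struct frag {
    unsigned long paddr;  /* physical base address of the fragment */
    size_t        len;    /* length of the fragment in bytes */
};

/*
 * Return the number of bytes held in fragments beyond the device limit.
 * These are the "extra" fragments that would have to be copied into the
 * last buffer the device can address. Returns 0 when every fragment
 * fits in its own scatter/gather entry.
 */
size_t bytes_to_copy(const struct frag *frags, size_t nfrags, size_t max_segs)
{
    size_t i, total = 0;

    assert(max_segs > 0);
    for (i = max_segs; i < nfrags; i++)
        total += frags[i].len;
    return total;
}
```

A driver could compare this byte count against the cost of managing more descriptors when deciding whether copying is worthwhile.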

Devices are also limited by the number of address lines they have. Devices with 24 address lines can address up to 16MB of memory, devices with 32 address lines can address any memory below 4 gigabytes, and devices with 36 address lines can address up to 64 gigabytes of memory.
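The relationship between address lines and reachable memory is simple arithmetic: n address lines reach 2^n bytes. The following user-space sketch shows the calculation and a range check for a buffer. The function names are hypothetical and are not part of DDI or ODDI.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes reachable with the given number of address lines. */
uint64_t addressable_bytes(unsigned lines)
{
    assert(lines < 64);  /* keep the shift well defined */
    return (uint64_t)1 << lines;
}

/* Return 1 if the buffer [paddr, paddr + len) lies entirely within reach. */
int device_can_reach(uint64_t paddr, uint64_t len, unsigned lines)
{
    uint64_t limit = addressable_bytes(lines);

    /* Written this way so that paddr + len cannot overflow. */
    return len <= limit && paddr <= limit - len;
}
```

With 24 lines the limit is 16MB, with 32 lines 4 gigabytes, and with 36 lines 64 gigabytes, matching the figures above.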

The following list summarizes these issues and points to the functions and/or structures that the driver writer can use to satisfy them.

These considerations apply only to memory that is accessed by the device as well as the kernel. DMA device drivers may also allocate memory that is used only by the driver software; this memory has fewer constraints and can be allocated as discussed in the ``Memory allocation'' article.

Memory ranges
Some DMA devices have severe restrictions on the physical memory to and from which I/O can be performed. Other devices can access a large range of physical memory and performance is greatly enhanced when they are allowed to access the full range. DDI 7 and ODDI drivers can access memory up to 4GB; DDI 8 drivers can access memory up to 64GB.

Specify the memory range for the device with the phys_dmasize member of the physreq(D4) structure. Set to 24 for ISA devices, 32 for devices that can access up to 4GB memory, and 64 for devices that can access up to 64GB memory. See ``DMA up to 64 bits (DDI only)'' for more information about DMA that uses 64-bit addressing.

Specify the DMAABLE flag to the sptalloc(D3oddi) or getcpages(D3oddi) function or call the kmem_alloc(D3oddi) function without the KM_NO_DMA flag to allocate memory below the 16MB threshold. Otherwise, the allocated memory can come from anywhere in the first 4GB of system memory.
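As a rough illustration of the phys_dmasize guidance above, the following user-space sketch maps a device's address-line count to one of the three values mentioned in the text (24, 32, or 64). The pick_dmasize( ) helper is hypothetical and is not an HDK function; it assumes that a device with fewer than 24 address lines has no suitable value among those three. Consult physreq(D4) for the authoritative meaning of the member.

```c
#include <assert.h>

/*
 * Hypothetical helper: choose the largest phys_dmasize value named in
 * the text that does not exceed what the device can address.
 * Returns -1 if the device has fewer than 24 address lines.
 */
int pick_dmasize(unsigned address_lines)
{
    assert(address_lines <= 64);
    if (address_lines >= 36)
        return 64;   /* devices that can access up to 64GB */
    if (address_lines >= 32)
        return 32;   /* devices that can access up to 4GB */
    if (address_lines >= 24)
        return 24;   /* ISA devices, up to 16MB */
    return -1;       /* assumption: none of the listed values applies */
}
```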

DMA operations require that the system memory being used is physically contiguous. The memory that is used for command/status operations often falls into this category. Each scatter/gather region also needs to be physically contiguous, although the different regions do not all need to be physically contiguous, nor does the list of scatter/gather regions.

Drivers that use DDI versions prior to version 8 can specify the PREQ_PHYSCONTIG flag for the phys_flags member of the physreq(D4) structure to specify that the allocated memory must be physically contiguous.

Use the getcpages(D3oddi) function to allocate physically contiguous memory segments that are larger than a page. kmem_alloc(D3oddi) and sptalloc(D3oddi) can be used if the physically contiguous segment is required to be 1 page (4KB) or smaller.
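To make the contiguity requirement concrete, the following user-space sketch counts how many physically contiguous regions a buffer occupies, given the physical address of each of its pages in order. A result of 1 means the whole buffer is physically contiguous; a larger result is the number of scatter/gather regions it would need. The helper name and the 4KB page size constant are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_PAGE_SIZE 4096u  /* assumed page size (4KB) */

/*
 * Count physically contiguous runs in an array that holds the
 * physical address of each page of a buffer, in virtual order.
 */
size_t count_contig_runs(const uint64_t *pages, size_t npages)
{
    size_t i, runs;

    if (npages == 0)
        return 0;
    runs = 1;
    for (i = 1; i < npages; i++) {
        assert(pages[i] % DMA_PAGE_SIZE == 0);  /* page-aligned input */
        /* A new run starts whenever a page does not follow its predecessor. */
        if (pages[i] != pages[i - 1] + DMA_PAGE_SIZE)
            runs++;
    }
    return runs;
}
```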

Alignment and boundary
Some devices require that an address be word-aligned or that the data buffer used for the I/O data not cross some boundary.

Specify alignment requirements in the phys_alignment member of the physreq structure. Note that the memory used for command/status operations may have different alignment requirements than the memory used for the actual I/O data buffer. For example, phys_alignment may be set to 512 (the size of a physical disk block) for data transfers on a scatter/gather device, but set to 4 for the command/status operations.

Specify boundary requirements in the phys_boundary member of the physreq structure. If this member is set to a non-zero value, buffers will not span addresses that are multiples of this value. For example, if the data buffer used for the I/O data must not cross a 64KB boundary, set phys_boundary to 64.

The physio(D3oddi) and dma_breakup(D3oddi) functions can be used to handle many alignment and boundary requirements. ODDI is not as powerful as DDI in this case, so, in some cases, the driver writer must code these checks manually in a driver-specific subroutine.
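A manual alignment and boundary check of the kind mentioned above might look like the following user-space sketch. The function names are hypothetical, and both the alignment and the boundary are assumed to be powers of two expressed in bytes; this does not describe how the phys_alignment or phys_boundary members encode their values.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if paddr is a multiple of align (a power of two, in bytes). */
int is_aligned(uint64_t paddr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (paddr & (align - 1)) == 0;
}

/*
 * Return 1 if the buffer [paddr, paddr + len) spans an address that is
 * a multiple of boundary (a power of two, in bytes).
 */
int crosses_boundary(uint64_t paddr, uint64_t len, uint64_t boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0);
    assert(len > 0);
    /* The first and last byte must fall in the same boundary-sized block. */
    return (paddr / boundary) != ((paddr + len - 1) / boundary);
}
```

A driver-specific subroutine could reject or split any buffer for which is_aligned( ) returns 0 or crosses_boundary( ) returns 1.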

Device-specific units
I/O requires mapping a user process's view of a device into a set of device-specific units (for example, sectors for disks). For example, when a user process performs direct I/O from a random access device, with reads and writes interspersed with lseeks, mapping is needed to translate the relative location of I/O as seen by the process into a series of disk sectors that can be used directly by the driver.
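The translation from a process's byte offset to device sectors can be sketched as follows. This is a user-space illustration only: the struct sector_range type, the offset_to_sectors( ) helper, and the 512-byte sector size are assumptions, not HDK definitions.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u  /* assumed sector size in bytes */

/* Hypothetical result type: the sectors touched by one request. */
struct sector_range {
    uint64_t first;  /* first sector touched */
    uint64_t count;  /* number of sectors touched */
};

/* Map a byte offset and length, as seen by the process, to sectors. */
struct sector_range offset_to_sectors(uint64_t offset, uint64_t len)
{
    struct sector_range r;

    assert(len > 0);
    r.first = offset / SECTOR_SIZE;
    /* The last byte of the request determines the last sector touched. */
    r.count = (offset + len - 1) / SECTOR_SIZE - r.first + 1;
    return r;
}
```

A request that starts or ends in the middle of a sector still touches the whole sector, which is why the count is computed from the first and last byte rather than from the length alone.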

Locality (NUMA systems only)
DMA support for drivers running on NUMA systems requires that the kernel knows the ``locality'' of memory, which is expressed in terms of the CPU group with which the memory used for DMA buffers is associated. Drivers must also know the physical requirements of the individual device's DMA engine.

Only SVR5 supports NUMA so only DDI drivers are equipped to control locality. The kernel determines this information from the buf(D4) structure (or other memory handle) and the physreq(D4) structure that is associated with the driver's resource manager key for DDI 8 drivers.

Programming the system DMA controller (DDI)

The following DDI functions are used to program the system motherboard DMA controller for a DMA device that does not have a DMA controller on the device controller itself. These functions acquire the command and data blocks needed to describe the type of transfer and program the system DMA controller and device to complete the transfer.

Program an ISA-style DMA channel for cascade mode

Disable recognition of hardware requests on an ISA-style DMA channel

Enable the ISA-style channel to respond to DMA requests from the device

Free a previously-allocated DMA buffer descriptor

Free a previously-allocated DMA command block

Determine best transfer mode for an ISA-style DMA channel

Allocate a DMA buffer descriptor

Allocate a DMA command block that can be used to identify all the DMA requirements in terms of:


Program a particular ISA-style DMA channel based on driver input expressed by the control and data blocks.

Stop software-initiated DMA operation on a channel and release it

Program a DMA operation for a subsequent software request.

Initiate a DMA operation via software request

Programming the system DMA controller (ODDI)

The following ODDI functions are used to program the system motherboard DMA controller for a DMA device that does not have a DMA controller on the device controller itself. These functions acquire the command and data blocks needed to describe the type of transfer and program the system DMA controller and device to complete the transfer.

Allocate a DMA channel.

Size DMA request into 512-byte blocks.

Begin DMA transfer.

Set up the system DMA controller for a DMA transfer.

Release previously allocated DMA channel.

Return the number of bytes not transferred during a DMA request.

Queue the DMA request.

© 2005 The SCO Group, Inc. All rights reserved.
OpenServer 6 and UnixWare (SVR5) HDK - June 2005