
Handling Insufficient Hardware Resources



To avoid shared, interlocked data structures, each DPC should have adequate dedicated resources. For example, if a network adapter implements option 1a and supports only four receive descriptor queues, but the system has eight processors, the adapter does not have enough resources to dedicate a queue to each processor. The miniport driver does not have to support more than four parallel DPCs, because doing so would require it to share hardware resources between multiple DPCs. The overhead of sharing those resources is expected to outweigh the benefits gained from RSS. Consequently, in some system configurations a miniport driver might support fewer DPCs than the number of CPUs. In this case, the following question must be answered: Which CPUs should the miniport driver use to schedule its DPCs?

A straightforward answer that provides good performance and localizes cache-thrashing of host memory data structures to a small set of CPUs is to mask the output of the Indirection Table down to the number of receive queues that are supported. For example, assume an implementation supports only four queues but the network adapter is installed on an eight-processor system. In this case, masking the Indirection Table output means that only the least-significant two bits are used. Thus all network receive processing is limited to CPUs zero through three if the BaseCPUNumber is set to zero. If the BaseCPUNumber is nonzero, the masked value is added to the BaseCPUNumber. Thus if the BaseCPUNumber is four, CPUs four through seven are used.
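The masking rule above can be sketched in a few lines. This is only an illustration, not driver code: the function name `select_cpu` and its parameters are hypothetical, and the sketch assumes the number of supported queues is a power of two, as in the four-queue example.

```python
def select_cpu(indirection_output: int, num_queues: int, base_cpu_number: int = 0) -> int:
    """Map an Indirection Table output to a CPU number (illustrative only).

    Assumes num_queues is a power of two, so that masking is a bitwise AND
    with (num_queues - 1). With four queues, that keeps the low two bits.
    """
    if num_queues <= 0 or num_queues & (num_queues - 1):
        raise ValueError("num_queues must be a power of two")
    masked = indirection_output & (num_queues - 1)
    # The masked value is an offset from BaseCPUNumber.
    return base_cpu_number + masked


# Eight-processor system, adapter with four queues.
cpus_base0 = sorted({select_cpu(v, 4, 0) for v in range(8)})
cpus_base4 = sorted({select_cpu(v, 4, 4) for v in range(8)})
print(cpus_base0)  # [0, 1, 2, 3]
print(cpus_base4)  # [4, 5, 6, 7]
```

Whatever values the Indirection Table produces, only as many CPUs as there are queues ever receive DPCs, and BaseCPUNumber shifts that window.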

RSS Limitations

Receive-Side Scaling (RSS) requires a significant number of CPU cycles if the algorithm is implemented in software on the host CPU, because RSS is cryptographically secure. Thus a software implementation of RSS could make the system perform worse than if RSS were not enabled. As a result, implementations should not support RSS if the network adapter cannot generate the hash result.

The types of protocols that are received limit RSS load balancing. Load balancing on a per-connection basis is supported only for TCP. Depending on the hash type setting, other protocols such as the User Datagram Protocol (UDP), IPsec, IGMP, and ICMP are hashed on the source and destination IP addresses. Incoming packets that are not IP packets (for example, on Ethernet, packets whose EtherType is not one of those assigned to IPv4 or IPv6) cannot be classified. They are handled in a fashion similar to the NDIS 5.1 method, where no hash value is set and all packets are indicated on a single CPU DPC.
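As a rough sketch of the classification described above, the following function picks the fields that would feed the hash for a given packet. The function name, its parameters, and the `tcp_hash_enabled` flag (standing in for the hash type setting) are assumptions made for illustration. The EtherType and protocol constants are the standard assigned values.

```python
from typing import Optional, Tuple

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD
IPPROTO_TCP = 6
IPPROTO_UDP = 17


def hash_input_fields(
    ethertype: int,
    ip_protocol: int,
    src_ip: str,
    dst_ip: str,
    src_port: Optional[int] = None,
    dst_port: Optional[int] = None,
    tcp_hash_enabled: bool = True,  # hypothetical stand-in for the hash type setting
) -> Optional[Tuple]:
    """Return the fields to hash, or None if the packet cannot be classified."""
    if ethertype not in (ETHERTYPE_IPV4, ETHERTYPE_IPV6):
        # Non-IP traffic: no hash value is set; handled on a single CPU DPC.
        return None
    if tcp_hash_enabled and ip_protocol == IPPROTO_TCP \
            and src_port is not None and dst_port is not None:
        # Per-connection balancing is available only for TCP.
        return (src_ip, dst_ip, src_port, dst_port)
    # UDP, IPsec, IGMP, ICMP, and so on: hash on the address pair only.
    return (src_ip, dst_ip)
```

For example, a UDP datagram yields only the address pair even when ports are supplied, so all UDP traffic between two hosts lands on the same CPU.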

If an application is not running on the CPU on which RSS has scheduled its receive traffic to be processed, some cache optimizations may not occur. To eliminate this issue, a mechanism will be provided to allow the application to query the processor that it should run on for best performance. If the application is not modified in this fashion, application performance on RSS is still expected to be significantly better than performance on the current NDIS 5.1 infrastructure. However, in extremely rare conditions on nonuniform-memory-access (NUMA) systems, the system administrator may want to disable RSS.

Finally, RSS may need to be configured on systems where network processing is restricted to a subset of the processors in the system. Systems with large processor counts (for example, 16- and 32-way systems) may not want all processors simultaneously processing network traffic. In such cases, RSS should be restricted to a small subset of the processors. To limit hardware and software complexity, the administrator can restrict only processor ranges that start at zero and contain a power-of-two number of processors. The set of processors that RSS hashes to must also be a power of two in size. For example, if the system is a seven-processor system and the administrator wants to forbid CPUs 0 through 2 from participating in RSS, the administrator must instead restrict CPUs 0 through 3. That leaves CPUs 4 through 6, but because the number of RSS CPUs must be a power of two, only CPUs 4 and 5 can be used. (CPU 6 will not be used by RSS either.)
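The arithmetic in the seven-processor example can be checked with a short sketch. The function `rss_cpu_range` is hypothetical. It assumes the restricted range is rounded up to the next power of two and the RSS set is rounded down to the largest power of two that fits in the remaining CPUs.

```python
from typing import List, Tuple


def rss_cpu_range(total_cpus: int, cpus_to_exclude: int) -> Tuple[int, List[int]]:
    """Return (restricted_count, rss_cpus) under the power-of-two rules.

    The restricted range starts at CPU 0 and is rounded up to a power of two.
    The RSS set follows it and is rounded down to a power of two.
    """
    if cpus_to_exclude <= 0:
        restricted = 0
    else:
        # Smallest power of two >= cpus_to_exclude.
        restricted = 1 << (cpus_to_exclude - 1).bit_length()
    remaining = total_cpus - restricted
    if remaining < 1:
        return restricted, []
    # Largest power of two <= remaining.
    rss_count = 1 << (remaining.bit_length() - 1)
    return restricted, list(range(restricted, restricted + rss_count))


restricted, rss_cpus = rss_cpu_range(total_cpus=7, cpus_to_exclude=3)
print(restricted)  # 4  (CPUs 0 through 3 are restricted)
print(rss_cpus)    # [4, 5]  (CPU 6 is left unused by RSS)
```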






