This section discusses Class Based Queues (CBQ) in detail: the terms commonly
used in the CBQ context and the user-level syntax for setting up these queues.
Let us first define some basic terms. In CBQ, every class has the variables
idle and avgidle and the parameters maxidle and minidle, which are used in
computing the limit status of the class, and the parameter offtime, which
determines how long to restrict throughput for an overlimit class.
1. idle: The variable idle is the difference between the desired time and
the measured actual time between the transmissions of the last two packets
sent from this class. When the connection is sending more than its allocated
bandwidth, idle is negative. When the connection is sending exactly at its
allotted rate, idle is zero.
2. avgidle: The variable avgidle is the average of idle, and it is computed
using an exponentially weighted moving average (EWMA). When avgidle is zero
or lower, the class is overlimit (the class has been exceeding its
allocated bandwidth in a recent short time interval).
3. maxidle: The parameter maxidle gives an upper bound for avgidle. Thus
maxidle limits the credit given to a class that has recently been under its
allocation.
4. offtime: The parameter offtime gives the time interval that an overlimit
class must wait before sending another packet. This parameter determines the
steady-state burst size for a class when the class is running over its limit.
5. minidle: The parameter minidle gives a (negative) lower bound for
avgidle. Thus, a negative minidle lets the scheduler remember that a class has
recently used more than its allocated bandwidth.
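The bookkeeping described by these terms can be sketched in a few lines of
Python. This is an illustration only, not the kernel code: the function names
are hypothetical, and the weight w = 2**-LOG (with LOG taken from the ewma
parameter) is an assumption about how the moving average is parameterized.

```python
def update_avgidle(avgidle, idle, ewma_log=5, maxidle=None, minidle=None):
    """Apply one EWMA step to avgidle and clamp it to [minidle, maxidle].

    Assumption: the averaging weight is w = 2**-ewma_log.
    """
    w = 2.0 ** -ewma_log
    avgidle = (1.0 - w) * avgidle + w * idle
    if maxidle is not None:
        avgidle = min(avgidle, maxidle)   # cap the credit of an idle class
    if minidle is not None:
        avgidle = max(avgidle, minidle)   # bound the remembered debt
    return avgidle


def is_overlimit(avgidle):
    """A class is overlimit when avgidle is zero or lower."""
    return avgidle <= 0


# A class that keeps sending too fast (negative idle) drifts overlimit.
avg = 4.0
for _ in range(3):
    avg = update_avgidle(avg, idle=-8.0, ewma_log=2, maxidle=10.0, minidle=-5.0)
print(avg, is_overlimit(avg))
```

A larger ewma value makes the average react more slowly, so short bursts are
tolerated for longer before the class is declared overlimit.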
There are three types of classes, namely leaf classes (such as a video class)
that have directly assigned connections; nonleaf classes used for link-sharing;
and the root class that represents the entire output link.
The syntax to create a CBQ is shown below:
tc qdisc [ add | del | replace | change | get ] dev STRING \
cbq bandwidth BPS [ avpkt BYTES ] [ mpu BYTES ] [ cell BYTES ] [ ewma LOG ]
The interpretation of the fields:
- bandwidth represents the maximum bandwidth available to the device to
which the queue is attached.
- avpkt represents the average packet size. This is used in determining the
transmission time of an average packet, that is, avpkt divided by bandwidth.
- mpu represents the minimum packet size, in bytes, that will be accounted for.
Packets smaller than mpu are treated as being mpu bytes long. This is done
because on Ethernet-like interfaces the minimum packet size is 64 bytes, so
this value is usually set to 64.
- cell represents the granularity, in bytes, at which packet sizes are
distinguished. It is used to index into the rate table (rtab), which holds the
packet transmission times for various packet sizes.
For example:
tc qdisc add dev eth0 root handle 1: cbq bandwidth 10Mbit allot 1514 cell 8 \
avpkt 1000 mpu 64
In the above example, a class based queue is created and attached to device
eth0. The handle for the queue is 1: (that is, 1:0), where 1 represents the
major number and 0 represents the minor number. The bandwidth available on the
outgoing link is 10 Mbit/s. allot is a parameter that is used by the
link-sharing scheduler. A cell value of 8 indicates that packet transmission
times are tabulated in steps of 8 bytes of packet size.
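How avpkt, mpu and cell interact can be sketched as follows. This is a
simplified model rather than the actual tc or kernel code: the function names,
the 256-slot table size and the exact bucket rounding are assumptions made for
illustration.

```python
def transmission_time_us(size_bytes, bandwidth_bps):
    """Time in microseconds to send size_bytes at bandwidth_bps (bits/s)."""
    return size_bytes * 8 * 1_000_000 / bandwidth_bps


def build_rtab(bandwidth_bps, cell=8, mpu=64, slots=256):
    """Precompute one transmission time per cell-sized step of packet length."""
    table = []
    for i in range(slots):
        size = (i + 1) * cell   # upper edge of the i-th size bucket
        size = max(size, mpu)   # packets below mpu are charged as mpu bytes
        table.append(transmission_time_us(size, bandwidth_bps))
    return table


def lookup(table, packet_len, cell=8):
    """Return the tabulated transmission time for a packet of packet_len bytes."""
    index = min(packet_len // cell, len(table) - 1)
    return table[index]


rtab = build_rtab(10_000_000, cell=8, mpu=64)  # the 10Mbit link from the example
print(lookup(rtab, 1000))  # an average-sized packet (avpkt 1000)
print(lookup(rtab, 40))    # shorter than mpu, so charged as 64 bytes
```

Looking up a precomputed table avoids a division per packet, at the cost of
rounding every packet length up to the next multiple of cell.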
Let us now discuss the syntax for creating a class for a CBQ.
tc class [ add | del | replace | change ] dev STRING cbq bandwidth BPS rate BPS maxburst PKTS \
[ avpkt BYTES ] [ minburst PKTS ] [ bounded ] [ isolated ] [ allot BYTES ] \
[ mpu BYTES ] [ weight RATE ] [ prio NUMBER ] [ cell BYTES ] [ ewma LOG ] \
[ estimator INTERVAL TIME_CONSTANT ] [ split CLASSID ] [ defmap MASK/CHANGE ]
The interpretation of the fields:
- bandwidth represents the maximum bandwidth that is available to the queuing
discipline owned by this class.
- rate represents the bandwidth that is allocated to this class. The
kernel does not use this directly. It uses pre-calculated rate translation
tables.
- maxburst represents the number of packets that will be sent in the
longest possible burst.
- avpkt represents the average number of bytes in a packet belonging to
this class.
- minburst represents the number of packets that will be sent in the
shortest possible burst.
- bounded indicates that the class cannot borrow unused bandwidth from
its ancestors. If this is not specified, then the class can borrow unused
bandwidth from the parent.
- isolated indicates that the class will not share bandwidth with any
non-descendant classes.
- allot, cell, mpu, estimator and ewma have already been explained.
- weight should be made proportional to the rate.
- The split field is used for fast access. This is normally the root of the CBQ
tree. It can be set to any node in the hierarchy, thereby enabling the use of a
simple and fast classifier, which is configured only for a limited set of keys
to point to this node. Only classes with the split node set to this node will
be matched. The type of service (TOS in the IP header) and sk->priority are not
used for this purpose.
- prio represents the priority that is assigned to this class.
- defmap is, again, concerned with classification. It is intended to provide
fallback classification: when a packet does not match any classifier, this
fallback classification is used. It works in the following manner. The TOS byte
in incoming packets, or SO_PRIORITY in locally generated packets, is used as a
logical priority. A class that is ready to serve a logical priority declares
this with the defmap option. If a packet matches a classifier, this logical
priority is not used.
For example:
tc class add dev eth1 parent 1:1 classid 1:2 cbq bandwidth 10Mbit rate 1Mbit \
allot 1514 cell 8 weight 100Kbit prio 3 maxburst 20 avpkt 1000 \
split 1:0 defmap c0
In this example, a CBQ class with handle 1:2 is created. Its parent is
identified by the handle 1:1. The priority assigned to it is 3, and the average
packet size is 1000 bytes. The split node is 1:0, which represents the root
of the link-sharing structure. The defmap is c0; that is, packets with this
TOS (for incoming packets) or SO_PRIORITY (for locally generated packets)
that do not match any classifier are considered to belong to the class
with handle 1:2. This was an excellent implementation innovation by Alexey
Kuznetsov.
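To make the defmap value concrete, the sketch below decodes it under the
assumption that defmap is a hexadecimal bitmask in which bit n stands for
logical priority n. The helper names are hypothetical, and only the MASK part
of a MASK/CHANGE pair is examined.

```python
def defmap_priorities(defmap):
    """List the logical priorities selected by a hexadecimal defmap mask.

    Assumption: bit n of the mask corresponds to logical priority n.
    """
    mask = int(defmap.split("/")[0], 16)   # ignore an optional /CHANGE part
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


def serves_priority(defmap, priority):
    """True if a class with this defmap is the fallback for the priority."""
    return priority in defmap_priorities(defmap)


print(defmap_priorities("c0"))    # bits set in 0xc0
print(serves_priority("c0", 0))   # priority 0 is not covered by this mask
```

Under this reading, the class in the example would act as the fallback for
logical priorities 6 and 7 only.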
Saravanan Radhakrishnan
1999-09-30