
NETLINK_ROUTE Family

Let us now look at how the kernel side of a netlink socket is created, focusing on the socket for the NETLINK_ROUTE family.

The function of interest is rtnetlink_init in net/core/rtnetlink.c:

__initfunc(void rtnetlink_init(void))
{
#ifdef RTNL_DEBUG
    printk("Initializing RT netlink socket\n");
#endif
    rtnl = netlink_kernel_create(NETLINK_ROUTE, rtnetlink_rcv);
    if (rtnl == NULL)
        panic("rtnetlink_init: cannot initialize rtnetlink\n");
    register_netdevice_notifier(&rtnetlink_dev_notifier);
    rtnetlink_links[PF_UNSPEC] = link_rtnetlink_table;
    rtnetlink_links[PF_PACKET] = link_rtnetlink_table;
}

This function is called as part of the sock_init function in net/socket.c. It creates a netlink socket in the kernel that handles the user requests. The relevant code of netlink_kernel_create is:

struct sock *
netlink_kernel_create(int unit, void (*input)(struct sock *sk, int len))
{
    .
    .
    if (netlink_create(sock, unit) < 0) {
        sock_release(sock);
        return NULL;
    }
    sk = sock->sk;
    if (input)
        sk->data_ready = input;

    netlink_insert(sk);
    .
    .
}

The function creates a netlink socket and then makes an entry for it in the nl_table. In fact, since this socket is created when the system comes up, it will be the first entry in that table. The socket is created with pid = 0, which is why every user netlink socket that wants to perform NETLINK_ROUTE related functions has to address its messages to pid 0. Also note that netlink_kernel_create is called with the function pointer rtnetlink_rcv, and the socket's data_ready pointer is set to this value. This makes rtnetlink_rcv the entry point into the kernel for NETLINK_ROUTE messages.
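
Seen from user space, addressing the kernel socket could look like the sketch below. It assumes a Linux system with the standard netlink headers; the helper names fill_kernel_addr and probe_route_socket are made up for this illustration and do not appear in the kernel or in any library.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/* Hypothetical helper: fill in the destination address of the
 * kernel-side netlink socket. nl_pid = 0 addresses the kernel,
 * nl_groups = 0 asks for plain unicast delivery. */
void fill_kernel_addr(struct sockaddr_nl *addr)
{
    assert(addr != NULL);
    memset(addr, 0, sizeof(*addr));
    addr->nl_family = AF_NETLINK;
    addr->nl_pid = 0;
    addr->nl_groups = 0;
}

/* Hypothetical helper: check that a user-space NETLINK_ROUTE socket
 * can be opened. Returns 0 on success, -1 if socket() fails. */
int probe_route_socket(void)
{
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    if (fd < 0)
        return -1;
    close(fd);
    return 0;
}
```

The address filled in here would be passed as the destination (msg_name) of a sendmsg() call on the user's netlink socket.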

The link_rtnetlink_table is a table of structures

struct rtnetlink_link
{
    int (*doit)(struct sk_buff *, struct nlmsghdr*, void *attr);
    int (*dumpit)(struct sk_buff *, struct netlink_callback *cb);
};

Each entry consists of the doit and dumpit function pointers. The table is indexed by the action to be performed (RTM_NEWQDISC, RTM_DELQDISC, etc., offset by RTM_BASE), and the corresponding function is called.
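
The indexing scheme can be tried out in user space with a small model. This is only a sketch: every demo_ name and every DEMO_ constant below is a stand-in invented for the illustration, the handlers take a simplified message instead of an sk_buff, and the numeric values are not meant to match the kernel headers.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in message types; the real RTM_* values live in the kernel headers. */
enum {
    DEMO_RTM_BASE     = 16,
    DEMO_RTM_NEWQDISC = 36,
    DEMO_RTM_GETQDISC = 38,
    DEMO_RTM_MAX      = 47
};

struct demo_msg { int type; };          /* simplified stand-in for the request */

struct demo_link {
    int (*doit)(const struct demo_msg *msg);
    int (*dumpit)(const struct demo_msg *msg);
};

static struct demo_link demo_table[DEMO_RTM_MAX - DEMO_RTM_BASE + 1];

int demo_modify_qdisc(const struct demo_msg *msg) { (void)msg; return 1; }
int demo_get_qdisc(const struct demo_msg *msg)    { (void)msg; return 2; }
int demo_dump_qdisc(const struct demo_msg *msg)   { (void)msg; return 3; }

/* Return the table slot for a message type (type must be in range). */
static struct demo_link *demo_slot(int type)
{
    assert(type >= DEMO_RTM_BASE && type <= DEMO_RTM_MAX);
    return &demo_table[type - DEMO_RTM_BASE];
}

/* Fill in the slots, in the same style as the kernel excerpts. */
void demo_register(void)
{
    demo_slot(DEMO_RTM_NEWQDISC)->doit   = demo_modify_qdisc;
    demo_slot(DEMO_RTM_GETQDISC)->doit   = demo_get_qdisc;
    demo_slot(DEMO_RTM_GETQDISC)->dumpit = demo_dump_qdisc;
}

/* Look up the handler for a message and call it; -1 if none is set. */
int demo_dispatch(const struct demo_msg *msg)
{
    const struct demo_link *link;

    if (msg->type < DEMO_RTM_BASE || msg->type > DEMO_RTM_MAX)
        return -1;
    link = &demo_table[msg->type - DEMO_RTM_BASE];
    if (link->doit == NULL)
        return -1;
    return link->doit(msg);
}
```

Subtracting the base value keeps the table small: only the range of valid message types needs a slot, and an empty slot simply means that no handler is registered for that action.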

This table is further filled in by net/sched/sch_api.c:

        link_p[RTM_NEWQDISC-RTM_BASE].doit = tc_modify_qdisc;
        link_p[RTM_DELQDISC-RTM_BASE].doit = tc_get_qdisc;
        link_p[RTM_GETQDISC-RTM_BASE].doit = tc_get_qdisc;
        link_p[RTM_GETQDISC-RTM_BASE].dumpit = tc_dump_qdisc;
        link_p[RTM_NEWTCLASS-RTM_BASE].doit = tc_ctl_tclass;
        link_p[RTM_DELTCLASS-RTM_BASE].doit = tc_ctl_tclass;
        link_p[RTM_GETTCLASS-RTM_BASE].doit = tc_ctl_tclass;
        link_p[RTM_GETTCLASS-RTM_BASE].dumpit = tc_dump_tclass;

The route-related function pointers are stored in net/ipv4/devinet.c:

static struct rtnetlink_link inet_rtnetlink_table[RTM_MAX-RTM_BASE+1] =
{
    .
    .    
    { inet_rtm_newroute,    NULL,           },
    { inet_rtm_delroute,    NULL,           },
    { inet_rtm_getroute,    inet_dump_fib,  },
    .
    .
};
rtnetlink_links[PF_INET] = inet_rtnetlink_table;

Now let us trace how a netlink packet from user space finds its way into the kernel. The sendmsg() call is mapped to sys_sendmsg, which in turn calls netlink_sendmsg() in our case. That function calls netlink_unicast() or netlink_broadcast(), as the case may be. netlink_unicast() identifies the netlink socket to which the message has to be passed by comparing the destination pid against the pids of the netlink sockets in the nl_table, and then calls the data_ready function of that socket, which is rtnetlink_rcv() in the NETLINK_ROUTE case. The relevant section of the code is:

int netlink_unicast(struct sock *ssk, struct sk_buff *skb, u32 pid, int nonblock)
{
    .
    .
    for (sk = nl_table[protocol]; sk; sk = sk->next) {
        if (sk->protinfo.af_netlink.pid != pid)
            continue;
    .
    sk->data_ready(sk, len);
}
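
The pid matching can be modelled outside the kernel as a walk over a per-protocol linked list. The sketch below is a simplified illustration, not the kernel code: struct demo_sock, demo_nl_table and the demo_ functions are invented names, and the last_len field exists only so that a delivery can be observed.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_PROTOCOLS 4

/* Simplified stand-in for struct sock. */
struct demo_sock {
    unsigned int pid;
    void (*data_ready)(struct demo_sock *sk, int len);
    int last_len;                    /* set by demo_rcv for inspection */
    struct demo_sock *next;
};

/* One singly linked list of sockets per protocol, like nl_table. */
static struct demo_sock *demo_nl_table[DEMO_MAX_PROTOCOLS];

/* Put a socket at the head of its protocol's list. */
void demo_insert(int protocol, struct demo_sock *sk)
{
    assert(protocol >= 0 && protocol < DEMO_MAX_PROTOCOLS);
    sk->next = demo_nl_table[protocol];
    demo_nl_table[protocol] = sk;
}

/* Example receive callback: just remember the message length. */
void demo_rcv(struct demo_sock *sk, int len)
{
    sk->last_len = len;
}

/* Deliver to the socket whose pid matches; -1 if no socket matches. */
int demo_unicast(int protocol, unsigned int pid, int len)
{
    struct demo_sock *sk;

    if (protocol < 0 || protocol >= DEMO_MAX_PROTOCOLS)
        return -1;
    for (sk = demo_nl_table[protocol]; sk; sk = sk->next) {
        if (sk->pid != pid)
            continue;
        if (sk->data_ready)
            sk->data_ready(sk, len);
        return 0;
    }
    return -1;
}
```

With a socket of pid 0 registered for a protocol, a call such as demo_unicast(protocol, 0, len) reaches that socket's callback, which mirrors how a message addressed to pid 0 ends up in rtnetlink_rcv().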

From rtnetlink_rcv(), the skb is dequeued and passed on to rtnetlink_rcv_skb(), which in turn calls rtnetlink_rcv_msg(). This function extracts the operation to be performed from the netlink packet and calls the corresponding doit function by indexing into the rtnetlink_links array according to the family. For example, for queue and class related operations the family is AF_UNSPEC and the lookup goes through the link_rtnetlink_table, whereas for route modifications the family is AF_INET and the lookup goes through the inet_rtnetlink_table. The appropriate function is thus reached, the necessary action is taken, and success or failure is reported to the user.
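
The family-based selection described above can be sketched as a two-level lookup: first pick the table for the family, then pick the handler for the operation. All names and values below are invented for this illustration (DEMO_AF_*, DEMO_OP_*, demo_links and so on), and the handlers take no arguments to keep the model short.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in family and operation numbers for this sketch. */
enum { DEMO_AF_UNSPEC = 0, DEMO_AF_INET = 2, DEMO_NPROTO = 4 };
enum { DEMO_OP_QDISC = 0, DEMO_OP_ROUTE = 1, DEMO_NOPS = 2 };

struct demo_link { int (*doit)(void); };

int demo_qdisc(void) { return 10; }   /* queue/class style handler */
int demo_route(void) { return 20; }   /* route style handler */

/* Per-family tables, playing the roles of link_rtnetlink_table
 * and inet_rtnetlink_table. */
static struct demo_link demo_link_table[DEMO_NOPS] = {
    [DEMO_OP_QDISC] = { demo_qdisc },
};
static struct demo_link demo_inet_table[DEMO_NOPS] = {
    [DEMO_OP_ROUTE] = { demo_route },
};

/* Family-indexed array of tables, playing the role of rtnetlink_links. */
static struct demo_link *demo_links[DEMO_NPROTO] = {
    [DEMO_AF_UNSPEC] = demo_link_table,
    [DEMO_AF_INET]   = demo_inet_table,
};

/* Select the table by family, then the handler by operation.
 * Returns the handler's result, or -1 if nothing is registered. */
int demo_rcv_msg(int family, int op)
{
    const struct demo_link *table;

    assert(op >= 0 && op < DEMO_NOPS);
    if (family < 0 || family >= DEMO_NPROTO)
        return -1;
    table = demo_links[family];
    if (table == NULL || table[op].doit == NULL)
        return -1;
    return table[op].doit();
}
```

In this model a queue request carries DEMO_AF_UNSPEC and a route request carries DEMO_AF_INET, so the same dispatch routine reaches different handlers without knowing anything about them.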


Gowri Dhandapani
1999-10-03