Saturday, February 26, 2022

[SOLVED] ebpf: drop ICMP packet in socket filter program on lo interface

Issue

Consider a very simple eBPF program of type BPF_PROG_TYPE_SOCKET_FILTER:

struct bpf_insn prog[] = {
   BPF_MOV64_IMM(BPF_REG_0, -1),
   BPF_EXIT_INSN(),
};

The code snippets below from net/core/filter.c and net/core/sock.c show how the filter is invoked:

static inline int pskb_trim(struct sk_buff *skb, unsigned int len)
{
    return (len < skb->len) ? __pskb_trim(skb, len) : 0;
}

...

int sk_filter_trim_cap(struct sock *sk, struct sk_buff *skb, unsigned int cap)
{
        int err;
        ...

        if (filter) {
                pkt_len = bpf_prog_run_save_cb(filter->prog, skb);
                skb->sk = save_sk;
                err = pkt_len ? pskb_trim(skb, max(cap, pkt_len)) : -EPERM;
        }
        ...
        return err;
}
...

static inline int sk_filter(struct sock *sk, struct sk_buff *skb)
{
    return sk_filter_trim_cap(sk, skb, 1);
}

Eventually sk_filter() is called by sock_queue_rcv_skb(), i.e. a packet reaching the socket is run through the filter and queued only if sk_filter() returns 0.

If I understand this code correctly, a return code of 0xffffffff should result in the packet being dropped. However, my simple eBPF program, when attached to an AF_PACKET raw socket (bound to the lo interface), does not drop ICMP packets sent across the loopback interface. Does this have anything to do with eBPF, or with the behaviour of ICMP on the loopback interface?

UPDATE: As pchaigno has pointed out, socket filter programs operate on copies of packets. In my case, the eBPF application creates a tap socket (an AF_PACKET raw socket), which is handed ingress packets before they are delivered up to the protocol layer. I did some investigation in the kernel code and found that a received packet eventually lands in the __netif_receive_skb_core() function, which does the following:

list_for_each_entry_rcu(ptype, &skb->dev->ptype_all, list) {
        if (pt_prev)
            ret = deliver_skb(skb, pt_prev, orig_dev);
        pt_prev = ptype;
}

This passes the packet to the AF_PACKET handler, which eventually runs the eBPF filter.

As a way to confirm that the ebpf filter actually filters on raw AF_PACKET sockets, one can dump stats as follows:

struct tpacket_stats stats;
...
len = sizeof(stats);
err = getsockopt(sock, SOL_PACKET, PACKET_STATISTICS, &stats, &len);

These stats indicate the filter's behaviour.
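A slightly fuller sketch of the stats query, for reference. The helper names read_packet_stats() and print_packet_stats() are hypothetical, and sock is assumed to be an already opened AF_PACKET socket. As far as I can tell, packets rejected by the filter are not counted in tp_packets, and tp_drops counts packets lost because the receive buffer was full, so a filter that rejects everything should leave both counters at 0. Note that, per packet(7), reading the statistics resets the counters.

```c
#include <stdio.h>
#include <sys/socket.h>
#include <linux/if_packet.h>

#ifndef SOL_PACKET
#define SOL_PACKET 263   /* fallback for C libraries that do not define it */
#endif

/* Hypothetical helper: fetch the counters of an AF_PACKET socket.
 * Returns 0 on success, -1 on failure with errno set by getsockopt(). */
static int read_packet_stats(int sock, struct tpacket_stats *stats)
{
    socklen_t len = sizeof(*stats);

    if (getsockopt(sock, SOL_PACKET, PACKET_STATISTICS, stats, &len) < 0)
        return -1;
    return 0;
}

/* Hypothetical helper: print both counters on one line. */
static void print_packet_stats(const struct tpacket_stats *stats)
{
    printf("tp_packets: %u, tp_drops: %u\n",
           stats->tp_packets, stats->tp_drops);
}
```

Calling read_packet_stats() once before and once after sending a few pings over lo should make it visible whether the tap socket saw them.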


Solution

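It's the other way around: the packet is dropped if the program returns 0. Any nonzero return value is treated as the number of bytes to keep, so returning -1 (0xffffffff) keeps the whole packet. From the code: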

*   sk_filter_trim_cap - run a packet through a socket filter
*   [...]
*
* Run the eBPF program and then cut skb->data to correct size returned by
* the program. If pkt_len is 0 we toss packet. If skb->len is smaller
* than pkt_len we keep whole skb->data. [...]
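To make the verdict logic concrete, here is a small user-space model of the quoted kernel snippets. It is only a sketch: struct model_skb and the model_* functions are made up for illustration, and the trim is assumed to always succeed (the real __pskb_trim() can fail).

```c
#include <errno.h>

/* Illustrative stand-in for struct sk_buff: only the length matters here. */
struct model_skb {
    unsigned int len;
};

/* Mirrors pskb_trim(): shorten the packet only if len is smaller than
 * the current length. Assumes the trim itself never fails. */
static int model_pskb_trim(struct model_skb *skb, unsigned int len)
{
    if (len < skb->len)
        skb->len = len;
    return 0;
}

/* Mirrors the verdict line of sk_filter_trim_cap():
 *   err = pkt_len ? pskb_trim(skb, max(cap, pkt_len)) : -EPERM;
 * pkt_len is the value the filter program returned in R0. */
static int model_sk_filter_trim_cap(struct model_skb *skb,
                                    unsigned int pkt_len, unsigned int cap)
{
    unsigned int keep = pkt_len > cap ? pkt_len : cap;

    return pkt_len ? model_pskb_trim(skb, keep) : -EPERM;
}
```

With a 98-byte packet and cap = 1, a filter result of 0xffffffff yields 0 (packet kept whole), a result of 14 yields 0 with the packet trimmed to 14 bytes, and a result of 0 yields -EPERM (packet tossed). So changing the immediate in BPF_MOV64_IMM(BPF_REG_0, -1) to 0 is what makes the program drop.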


Answered By - pchaigno
Answer Checked By - Willingham (WPSolving Volunteer)