I need to use socket(), but the arguments it takes confuse me.

I have a school exercise where I must use a socket to intercept Ethernet frames (more specifically, for ARP spoofing).

First, I need to specify the domain, which must correspond to an address family. In my case, that would be AF_PACKET (which isn't really an address family, right?), and from what I understand, it provides direct access to Ethernet frames.

Second, I need to set the type of protocol, in my case SOCK_RAW, which, as far as I understand, gives access to all raw frames without being tied to a specific protocol like TCP/UDP. But doesn’t this parameter basically do the same thing as the first one?

Finally, there’s the third argument: the protocol to use. Why does this one, unlike the others, need to be converted to network byte order (and what is this parameter for)?

Reading the manuals doesn't help me at all. The Cisco "network basics" course did (I stopped at the ARP exercise).

This is all quite confusing to me, so thank you for any help you can provide.

If someone speaks French and can explain it to me in French, that might help.

1 Answer

Raw sockets are a "special case" for the socket API, and their parameters won't necessarily make complete sense in and of themselves, although they do make sense in the context of regular socket usage.

So in order to explain the parameters of socket() in raw-sockets usage, you first need to understand their regular use for stream and datagram sockets, and expand from there. For example, the 3rd parameter is easier to understand in the context of AF_INET – so you should be familiar with TCP and UDP first, then I'd expand to e.g. SCTP and raw IP, and finally go from raw IP to raw 'packet'.

Also: Use Wireshark. Or any other packet capture tool (tcpdump or tshark or Microsoft Network Monitor can do the job), but Wireshark is the most commonly used. It is much easier to understand protocols when you can see visually the packets sent and received, e.g. to see the correspondence between the 'protocol' API parameter and the respective Ethernet header field, or between a close() call and a TCP FIN packet being sent.

First, I need to specify the domain, which must correspond to an address family. In my case, that would be AF_PACKET (which isn't really an address family, right?), and from what I understand it provides access directly to Ethernet frames.

Sure, it's not a "definite" address family, just a pseudo-family that indicates "all kinds of link layer" networking. (Most other AF_ constants correspond 1:1 to a protocol, and yes, it would probably make more sense to have individual AF_ETHER and AF_TOKENRING and such, but someone decided to make raw packet sockets somewhat an exception and it's what we have now.)

It does of course still involve addresses; the actual address format just depends on which kind of link you're dealing with. If you bind the socket to an Ethernet interface, you will be using Ethernet addresses (the 48-bit "MAC" addresses), and so on.

Second, I need to set the type of protocol, in my case SOCK_RAW, which, as I understand it, gives access to all raw frames without being tied to a specific protocol like TCP/UDP. But doesn’t this parameter basically do the same thing as the first one?

Not necessarily. AF_PACKET only says "work on Ethernet layer", but doesn't necessarily say how to work there.

For example, you also have the option of using AF_PACKET with SOCK_DGRAM, which makes the socket behave a lot like any other datagram socket: the OS will process the Ethernet header for you, and you will be specifying addresses through 'struct sockaddr_ll' instead of crafting the raw headers yourself – just like you would with IP/UDP for example.
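As a rough illustration of that SOCK_DGRAM mode, here is a minimal sketch in C of how a destination could be described through struct sockaddr_ll instead of a hand-built Ethernet header. The helper name make_ll_addr and its parameters are made up for this example; the struct fields are the ones listed in the Linux packet(7) manual.

```c
#include <arpa/inet.h>        /* htons */
#include <assert.h>
#include <linux/if_packet.h>  /* struct sockaddr_ll */
#include <net/ethernet.h>     /* ETH_ALEN, ETH_P_* constants */
#include <string.h>
#include <sys/socket.h>       /* AF_PACKET */

/* Hypothetical helper (not part of any API): describe "this MAC address,
 * on this interface, with this EtherType" for an AF_PACKET socket.
 * ethertype is given in host byte order and converted here. */
struct sockaddr_ll make_ll_addr(int ifindex, unsigned short ethertype,
                                const unsigned char mac[ETH_ALEN])
{
    struct sockaddr_ll addr;

    assert(mac != NULL);
    memset(&addr, 0, sizeof addr);
    addr.sll_family   = AF_PACKET;
    addr.sll_protocol = htons(ethertype);  /* stored in network byte order */
    addr.sll_ifindex  = ifindex;           /* e.g. from if_nametoindex() */
    addr.sll_halen    = ETH_ALEN;          /* 6 bytes for Ethernet */
    memcpy(addr.sll_addr, mac, ETH_ALEN);
    return addr;
}
```

With a socket created as socket(AF_PACKET, SOCK_DGRAM, ...), the result would be passed to sendto() cast to (struct sockaddr *), and the kernel would build the Ethernet header from it. With SOCK_RAW you would write those same 14 header bytes into the buffer yourself.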

And there actually was a connection-oriented transport protocol for Ethernet, called "LLC2" and used largely for IBM mainframes (and I think also for X.25). Although I don't know any real implementations, conceptually it would certainly have fit into a SOCK_STREAM slot at the same AF_PACKET layer.

You will not encounter LLC2 today, but the socket API dates back to an era when there was a huge variety of network protocols as well as physical network types.

And side note: When you say "without being tied to TCP", you might be thinking of AF_INET-level SOCK_RAW, which is different from AF_PACKET-level. Both are "raw" but at different layers – with AF_PACKET it would be more correct to say "without being tied to IPv4".

Finally, there’s the third argument: the protocol to use. Why does this one, unlike the others, need to be converted to network byte order

The value here is compared directly against the value in the packet without conversion, and Ethernet uses the 'network' byte order, so you need to pre-convert it yourself. (Just like you need to do with port numbers in TCP/UDP sockets, if I remember correctly.)
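To make the byte-order point concrete, here is a small sketch in C that looks at the two bytes of an EtherType as they sit in memory after htons(). Both helper names are invented for this example.

```c
#include <arpa/inet.h>  /* htons */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the two bytes of `value` exactly as they are
 * stored in memory, which is how they would be compared against a frame. */
void bytes_in_memory(uint16_t value, unsigned char out[2])
{
    assert(out != NULL);
    memcpy(out, &value, sizeof value);
}

/* Hypothetical helper: the on-the-wire bytes of an EtherType that is
 * given in host byte order. */
void ethertype_wire_bytes(uint16_t ethertype, unsigned char out[2])
{
    bytes_in_memory(htons(ethertype), out);
}
```

For 0x0806 (ARP) this yields the bytes 08 06 on any host, which is the order in which they appear in the Ethernet header. Without htons(), a little-endian machine would hold 06 08 in memory and the comparison against the frame would never match.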

I don't know if there is a good reason that the BSD socket API was designed in this way, but that's how it was designed.

the protocol to use. […] (and what is this parameter for) ?

  • Some network protocols (AFs/PFs) have more than one upper-layer protocol of a given type.

    For example, AF_INET has two transport protocols – TCP and SCTP – capable of being used as SOCK_STREAM transports, so you have a parameter that lets you specify this. (Although SCTP is natively more of a SOCK_SEQPACKET protocol, it can as well be used in stream mode.)

    There used to be even more, e.g. Bell Labs' IL was also a SOCK_STREAM protocol that ran over AF_INET (though its native OS did not use the BSD sockets API).

  • The protocol number specified here isn't only used by the OS; it is also written into the IP header of each packet. So if you were to use AF_INET with SOCK_RAW – bypassing the OS's TCP implementation – then you could still use the third parameter to indicate in the IP header that you are sending hand-crafted TCP packets, for example. Conversely, the OS would filter incoming packets so that your socket only receives the desired protocol.

  • And in the case of AF_PACKET (that is, Ethernet), you also have upper-layer protocols: the Ethernet header has a 16-bit field (called 'Ethertype') which allows the OS to distinguish between frames carrying IPv4, IPv6, ARP, CLNP, LAT, and many others. If you specify htons(ETH_P_IP) in the 3rd parameter, then you will receive only Ethernet frames carrying IPv4. (This would be especially useful with SOCK_DGRAM socket mode.)
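For the ARP exercise in the question, the points above could be put together roughly as follows. This is only a sketch for Linux: open_arp_socket and frame_ethertype are names chosen for this example, and creating the socket is assumed to require CAP_NET_RAW (typically root).

```c
#include <arpa/inet.h>     /* htons, ntohs */
#include <assert.h>
#include <net/ethernet.h>  /* struct ether_header, ETH_P_ARP */
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>    /* socket, AF_PACKET, SOCK_RAW */

/* Open a raw packet socket that is only handed frames whose EtherType
 * is ARP. Returns -1 and sets errno on failure (e.g. no CAP_NET_RAW). */
int open_arp_socket(void)
{
    return socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ARP));
}

/* Hypothetical helper: return the EtherType of a raw Ethernet frame in
 * host byte order, or -1 if the buffer is shorter than the header. */
int frame_ethertype(const unsigned char *frame, size_t len)
{
    struct ether_header hdr;

    assert(frame != NULL);
    if (len < sizeof hdr)
        return -1;
    memcpy(&hdr, frame, sizeof hdr);  /* avoids unaligned access */
    return ntohs(hdr.ether_type);     /* the field is big-endian on the wire */
}
```

A buffer read with recv() from such a socket starts at the Ethernet header, so frame_ethertype() applied to it should give ETH_P_ARP (0x0806). Passing htons(ETH_P_ALL) as the third argument instead would deliver frames of every EtherType.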

2 Comments

I find this to be an amazing answer, especially the historical aspect that you add, which is pretty important considering we're talking about C and Sockets networking
Thank you very much for your comprehensive explanation!
