Data Link Layer

The data link layer handles the node-to-node transfer of data between directly connected devices. These nodes can be computers, routers, switches, and similar devices.

The data link layer is concerned only with moving data from one node to the next. A higher-layer protocol then chains these short hops together to transmit messages across networks. The data link layer also abstracts away the physical layer, so that other layers need not care what hardware (twisted pair, fiber, or wireless) is in use.

In this example there are four independent data link layer exchanges necessary to send a message from Host 1 to Host 2.

Within the OSI model, the data link layer can be further divided into two sublayers: the Logical Link Control (or LLC sublayer) and the Media Access Control (or MAC sublayer).

Media Access Control

The Media Access Control sublayer controls the hardware that interacts with the wired, optical, or wireless transmission medium. It translates data into signals.

When receiving a signal from the physical layer, the Media Access Control sublayer is in charge of converting that physical signal (whether it be electrical current, optical light, or wireless radio frequencies) into bits and arranging those bits into frames for the Logical Link Control.

It must also do this in reverse. When transmitting data, the Media Access Control sublayer receives a frame from the Logical Link Control and is in charge of converting that frame into a physical signal.

Logical Link Control

On top of the Media Access Control sublayer is the Logical Link Control.

The LLC sublayer acts as an interface between the media access control and the network layer.

  1. It receives a message from an upper layer protocol (usually an IP packet).
  2. It chooses which network interface (or NIC) to send the data out on (for example, on a laptop with both wireless and Ethernet capabilities, the operating system typically chooses Ethernet).
  3. If necessary, it performs multiplexing, which allows multiple network layer protocols to coexist on the same link. Multiplexing is the process of combining multiple signals from different sources into a single usable one.
  4. Finally, it wraps the message with the appropriate header and passes it to the MAC sublayer to be sent out.
  5. The entire process is performed in reverse when receiving a frame from the MAC sublayer.

The transmission medium will determine the required header. The encapsulated packet is called a frame.

In modern applications there are two major families of protocols, each with its own headers: Ethernet (IEEE 802.3) and Wi-Fi (IEEE 802.11).

Each of these protocols has gone through multiple iterations, with each upgrade changing a one- or two-letter suffix, for example 802.11 a/b/g/n/ac and, more recently, 802.11ax, more commonly referred to as Wi-Fi 6.

Historically there was also a protocol specifically for fiber connections called Fiber Distributed Data Interface (FDDI), but it is now obsolete, replaced by Ethernet over fiber.

 

Why do we use frames?

In the next section we will go into the details of the Ethernet frame, but it's important to keep in mind at a high level how frames work and why we need them. We've already mentioned that the data link layer deals with the node-to-node transmission of data, and that multiple exchanges are used in conjunction with a higher layer protocol to deliver packets across the Internet.

In order to get a message from one device to the next we need to specify additional information, for example who the sender is and who the receiver is. This type of meta-information is what the frame is composed of. It is just header information that is added by the LLC sublayer in order to ensure the delivery of a message from one node to the next.

At each hop along its path the header information is discarded. This is because the job of the data link layer frame is complete once it arrives at the next device. The information about where its next hop will be is stored one layer above, inside the network layer header. (We will come back to that in a later lecture on routing, so we can treat it as a black box for now.) Once the next hop has been chosen, a new data link layer frame is generated (whether it be Ethernet or 802.11) and all the fields are recomputed using the current hardware device's properties.

This is for scalability reasons: our network may have hundreds or even thousands of devices connected together, but the frame header stays the same size no matter how many hops are required.
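The per-hop behaviour described above can be sketched in a few lines of Python. This is only an illustration: the `Frame` class, the `forward_along_path` helper, and the MAC addresses are hypothetical names invented for the example, and the path of five nodes is an assumption chosen so that it produces four hops.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    src_mac: str    # address of the node sending on this hop
    dst_mac: str    # address of the node receiving on this hop
    payload: bytes  # the network layer packet, carried unchanged


def forward_along_path(packet: bytes, hop_macs: list) -> list:
    """Build one fresh frame per hop; the packet itself is never modified."""
    frames = []
    for sender, receiver in zip(hop_macs, hop_macs[1:]):
        frames.append(Frame(src_mac=sender, dst_mac=receiver, payload=packet))
    return frames


# Hypothetical path: two hosts with three intermediate devices in between.
path = [
    "02:00:00:00:00:01",
    "02:00:00:00:00:02",
    "02:00:00:00:00:03",
    "02:00:00:00:00:04",
    "02:00:00:00:00:05",
]
packet = b"network layer packet"
frames = forward_along_path(packet, path)

for frame in frames:
    print(frame.src_mac, "->", frame.dst_mac, len(frame.payload), "payload bytes")
```

Each frame only ever names the two ends of its own hop, which is why the header does not grow with the length of the path.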

Ethernet (IEEE 802.3)

Ethernet defines a family of frame formats used for wired connections, both copper and fiber.

It contains 6 or 7 fields with each frame separated by an interframe gap of 12 bytes for synchronization purposes.
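To get a feel for how short a 12-byte gap is, the arithmetic can be worked through as below. This is a rough sketch: the function name is made up for the example, and the three link speeds are simply common Ethernet rates picked for illustration.

```python
INTERFRAME_GAP_BYTES = 12  # value quoted in the text above


def interframe_gap_seconds(link_bits_per_second: float) -> float:
    """Time needed to 'transmit' the idle gap at a given link speed."""
    return INTERFRAME_GAP_BYTES * 8 / link_bits_per_second


for name, rate in [("10 Mbit/s", 10e6), ("100 Mbit/s", 100e6), ("1 Gbit/s", 1e9)]:
    gap_ns = interframe_gap_seconds(rate) * 1e9
    print(f"{name}: {gap_ns:.0f} ns")
```

The gap is a fixed number of bit times (12 bytes is 96 bits), so it shrinks in absolute time as the link gets faster.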

We will be looking at the Ethernet II frame structure, but there are minor changes in the IEEE Ethernet standard used today which we need not concern ourselves with.

 

 

Figure: Ethernet Type II frame format

 

MAC address

The next two fields are addressing fields.

Addressing on the data link layer is done via the MAC address.

The MAC address is intended to be a globally unique identifier for each network interface.

It is a 48-bit number (which gives us about 281 trillion combinations). Each network device is assigned a unique MAC address by its manufacturer. To prevent collisions between manufacturers, each MAC address is composed of two parts: a 24-bit Organizationally Unique Identifier (OUI) followed by a 24-bit vendor-assigned portion. Manufacturers of networking hardware coordinate with the IEEE and are allocated blocks of OUI numbers, under the condition that each networking device they produce has a unique MAC address.
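The two-part structure can be illustrated with a short Python sketch. The helper name `split_mac` and the sample address are hypothetical, made up for this example; the address does not refer to any particular device.

```python
def split_mac(mac: str) -> tuple:
    """Split a MAC address string into its OUI and vendor-assigned halves."""
    octets = mac.replace("-", ":").lower().split(":")
    if len(octets) != 6 or any(len(octet) != 2 for octet in octets):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    int("".join(octets), 16)  # raises ValueError if any digit is not hex
    return ":".join(octets[:3]), ":".join(octets[3:])


# Size of the 48-bit address space mentioned above.
print(2 ** 48)  # 281474976710656

oui, vendor_part = split_mac("00-1A-2B-3C-4D-5E")  # made-up sample address
print("OUI:", oui)
print("Vendor-assigned:", vendor_part)
```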

It is possible to tell who manufactured a device by looking up its OUI.

Open up PowerShell and type in

If on a Mac, type into Terminal

 

You are looking for a 48-bit number called the Physical Address, typically represented as six pairs of hexadecimal digits separated by dashes or colons. The first half of that number is the OUI. We can put that number into a search engine or an OUI repository website (like https://hwaddress.com/) to find the manufacturer of the network card.
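As a cross-platform alternative, Python's standard library can report a hardware address through `uuid.getnode()`. The sketch below is a convenience, not a replacement for the operating system tools: `format_mac` is a helper written for this example, and according to the Python documentation `uuid.getnode()` may fall back to a random 48-bit number when no hardware address can be found, so the result should be treated with some caution.

```python
import uuid


def format_mac(value: int) -> str:
    """Render a 48-bit integer as six colon-separated hexadecimal pairs."""
    raw = value.to_bytes(6, "big")
    return ":".join(f"{byte:02x}" for byte in raw)


node = uuid.getnode()  # 48-bit integer; may be random if lookup fails
mac = format_mac(node)
print("MAC address:", mac)
print("OUI (first three pairs):", mac[:8])
```

On a machine with several network interfaces, this reports only one of their addresses.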

In this example Micro-Star International was the creator of my Ethernet NIC.

 

Many devices have multiple MAC addresses if they support simultaneous communication on more than one interface. For example, routers and switches have a separate MAC address for each physical port on the device. A computer that can communicate via both Ethernet and Wi-Fi has a separate MAC address for each network card.

You may also have one or more software based MAC addresses if you are running a virtual machine or VPN.

In the past there were other addressing mechanisms on the data link layer (like DECnet), but because these addresses were assigned by hardware manufacturers, having multiple competing standards was not a good idea, and the MAC address became the standard format used by all data link layer protocols.


 

Special MAC Addresses

Most traffic is unicast, with a single sender and a single receiver, but there are two special categories of addresses:
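The two categories are not listed in this copy of the notes. The sketch below assumes they are the standard Ethernet ones, broadcast and multicast, and shows one way to tell them apart from unicast addresses. The function name `classify_mac` and the sample addresses are made up for the example.

```python
def classify_mac(mac: str) -> str:
    """Classify a MAC address as 'broadcast', 'multicast' or 'unicast'."""
    octets = bytes(int(part, 16) for part in mac.replace("-", ":").split(":"))
    if len(octets) != 6:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    if octets == b"\xff" * 6:
        return "broadcast"  # all 48 bits set: addressed to every device
    if octets[0] & 0x01:
        return "multicast"  # lowest bit of the first octet marks a group address
    return "unicast"


for sample in ["FF:FF:FF:FF:FF:FF", "01:00:5E:00:00:01", "00:1A:2B:3C:4D:5E"]:
    print(sample, "->", classify_mac(sample))
```

The broadcast check has to come first, because the all-ones address also has the group bit set.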

IEEE 802.1Q

 

Figure: Ethernet frame

EtherType (2 Bytes)

Used to indicate which protocol is encapsulated in the payload of the frame

An EtherType value of:

Payload (46-1500 Bytes)

This is the content that is being delivered, typically an IP packet.
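The addressing fields, EtherType, and payload can be pulled out of a raw frame with Python's `struct` module, roughly as follows. This is a minimal sketch: it assumes the bytes start at the destination MAC address (no preamble) and end before the FCS, the function name and sample frame are invented for the example, and the small EtherType table lists only a few well-known values that are not taken from the text above.

```python
import struct

# A few well-known EtherType values (illustrative, not exhaustive).
ETHERTYPE_NAMES = {0x0800: "IPv4", 0x0806: "ARP", 0x86DD: "IPv6"}


def parse_ethernet_ii(frame: bytes) -> dict:
    """Split an Ethernet II frame into destination, source, type and payload."""
    if len(frame) < 14:
        raise ValueError("frame is shorter than the 14-byte header")
    # 6-byte destination, 6-byte source, 2-byte EtherType, network byte order.
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "dst": dst.hex(":"),
        "src": src.hex(":"),
        "ethertype": ethertype,
        "protocol": ETHERTYPE_NAMES.get(ethertype, "unknown"),
        "payload": frame[14:],
    }


# Made-up frame: broadcast destination, sample source, IPv4, 46 zero bytes.
sample = bytes.fromhex("ffffffffffff" "001a2b3c4d5e" "0800") + bytes(46)
parsed = parse_ethernet_ii(sample)
print(parsed["dst"], parsed["src"], parsed["protocol"], len(parsed["payload"]))
```

Looking up the EtherType is what lets the receiver hand the payload to the right network layer protocol.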

Frame Check Sequence (FCS) (4 Bytes)

Provides data integrity.

The FCS uses a 32-bit Cyclic Redundancy Check algorithm (CRC32).

https://en.wikipedia.org/wiki/Cyclic_redundancy_check

Before a frame is transmitted on the wire, its contents are passed through the CRC32 algorithm, which produces a digest. This digest is then added to the frame.

On the receiving end, the CRC32 algorithm is run again on the received data, and the computed digest is compared with the one in the FCS. If they don't match, the data was corrupted in transit and the frame is dropped.

The frame check sequence only detects corruption; there is no recovery mechanism built into it. Recovery is handled by a higher layer protocol.
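The compute-and-compare procedure can be sketched with `zlib.crc32` from the Python standard library, which implements the same CRC-32 polynomial. The helper names are invented for this example, and storing the digest in little-endian byte order is an assumption about how the FCS is usually represented in software; real network cards do all of this in hardware.

```python
import struct
import zlib


def append_fcs(frame_without_fcs: bytes) -> bytes:
    """Compute the CRC32 digest and append it as a 4-byte trailer."""
    crc = zlib.crc32(frame_without_fcs) & 0xFFFFFFFF
    return frame_without_fcs + struct.pack("<I", crc)


def fcs_is_valid(frame_with_fcs: bytes) -> bool:
    """Recompute the digest on the receiving side and compare it."""
    body, received_fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    expected = struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF)
    return expected == received_fcs


frame = append_fcs(b"header and payload bytes")
print(fcs_is_valid(frame))      # the intact frame passes the check

# Flip a single bit to simulate corruption on the wire.
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]
print(fcs_is_valid(corrupted))  # the corrupted frame fails and would be dropped
```

Note that the check only reports a mismatch. It gives no indication of which bits were damaged, which is why recovery is left to a higher layer.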

 

Switches

Now that we have a basic understanding of what is inside an Ethernet frame we can discuss switches in more detail.

A network switch is a data link layer hardware device that has the ability to inspect frames and forward them only to the applicable party and not the entire network.

 

Figure: network diagram (d1.png)

 

In this network diagram, if all the devices were connected via repeaters, AA:AA:AA:AA:AA:AA transmitting to BB:BB:BB:BB:BB:BB would prevent DD:DD:DD:DD:DD:DD from transmitting to EE:EE:EE:EE:EE:EE, as each signal would be duplicated to all endpoints at each repeater. Without CSMA, the messages that B and E receive would likely be corrupted.

The switch however is able to inspect each frame and forward it only to the physical port associated with the destination MAC address.

This ability to inspect the contents of the frame is what makes the switch a data link layer hardware device, in contrast to the hub, which is a physical layer hardware device. In our next lecture we will introduce the router, which is a network layer hardware device and, as you may have guessed, can inspect the contents of the network layer packet.

 

Switches maintain a memory of attached devices called the Forwarding Information Base (FIB) or MAC address table. It stores each known device on the local network. This table has four columns: VLAN, MAC Address, Type (which can be either static or dynamic), and Physical Port. The job of the Forwarding Information Base is to map MAC addresses to physical ports. That way, when a device wants to send a frame to a specified Ethernet address, the switch knows which port to send it out on.
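A table with those four columns could be modelled as below. This is a simplified sketch rather than how any real switch stores its table: the class, the lookup helper, and the sample entries (addresses, VLAN number, port names) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MacTableEntry:
    vlan: int
    mac: str
    entry_type: str  # "static" or "dynamic"
    port: str        # physical port on the switch


# Made-up entries, keyed by (VLAN, MAC address) for fast lookup.
entries = [
    MacTableEntry(1, "aa:aa:aa:aa:aa:aa", "dynamic", "Fa0/4"),
    MacTableEntry(1, "dd:dd:dd:dd:dd:dd", "dynamic", "Fa0/7"),
]
mac_table = {(entry.vlan, entry.mac): entry for entry in entries}


def lookup_port(vlan: int, mac: str) -> Optional[str]:
    """Return the physical port for a MAC address, or None if it is unknown."""
    entry = mac_table.get((vlan, mac.lower()))
    return entry.port if entry else None


print(lookup_port(1, "AA:AA:AA:AA:AA:AA"))
print(lookup_port(1, "ee:ee:ee:ee:ee:ee"))
```

A lookup that returns nothing corresponds to the unknown-destination case, which the switch has to handle separately.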

 

Port here refers to the physical connection on the switch. (When we get to the transport layer, the term port is unfortunately used again, but the two are unrelated. To avoid confusion I will use the term Physical Ports when referring to switches and Logical Ports when referring to transport layer sockets.)

 

The switch uses the Destination MAC address in the frame combined with the MAC Address Table to look up the Physical Port, where it then forwards the message. These forwards are chained together until the final destination is reached.

 

Example of a MAC Address table inside a switch

 

Switches learn about attached devices through two processes.

When a frame is received, the switch will inspect the contents, and if the source address is not currently in the Forwarding information base, the MAC Address and physical port on which the frame was received will be added.

 

In this example we have two switches with empty MAC address tables

Machine A attempts to send a message to machine D. The first thing the switch does upon receiving a message from A is inspect the frame's source address and associate A's MAC address with the port Fa0/4. (Note that port names are arbitrary and vendor dependent.)

 

If a switch receives a frame with an unknown destination address, it uses a technique called unicast flooding to reach the device.

The switch forwards a copy of the frame out of every physical port except the one it arrived on. Devices that do not own the destination MAC address ignore the frame.

Other switches on the network first check whether they have a record for the destination MAC address in their FIB. If they do, they forward the frame out of the recorded port; otherwise they flood the frame in turn.

The computer that owns the destination MAC address accepts the frame. Computers with different MAC addresses ignore it. When the destination sends a reply, each switch along the path adds its MAC address and the port the reply arrived on to its MAC address table.

That is, in order to get a message from A to D, the data must first be received on switch 1 on port Fa0/4, then forwarded out of port Fa0/7, received on switch 2 on port Ga0/3, and forwarded out of port Ga0/1. Any future frames from A to D or D to A will be forwarded directly, without flooding, now that there are entries for both devices in the MAC address tables.
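The learning and flooding behaviour can be simulated for a single switch in a few lines of Python. This is a toy model under simplifying assumptions (no VLANs, no entry ageing, one switch only); the class name is invented, and the port names loosely follow the example, with Fa0/5 added as a hypothetical extra port.

```python
class LearningSwitch:
    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> physical port

    def receive(self, src_mac, dst_mac, in_port):
        """Learn the source address, then return the ports the frame goes out on."""
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the ingress port.
            return [port for port in self.ports if port != in_port]
        if out_port == in_port:
            return []  # destination is on the segment the frame came from
        return [out_port]


switch = LearningSwitch(["Fa0/4", "Fa0/5", "Fa0/7"])
a, d = "aa:aa:aa:aa:aa:aa", "dd:dd:dd:dd:dd:dd"

first = switch.receive(a, d, "Fa0/4")   # D is unknown, so the frame is flooded
reply = switch.receive(d, a, "Fa0/7")   # D answers; the switch learns D's port
second = switch.receive(a, d, "Fa0/4")  # D is now known, so no flooding

print(first, reply, second)
```

The switch never asks anyone where a device is. It only learns from the source address of frames that pass through it.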

 

Wi-Fi (IEEE 802.11)

802.11, or Wi-Fi, operates in half duplex (Wi-Fi messages are exchanged in Data + Ack pairs). Wi-Fi can be extremely lossy depending on how far apart the sender and receiver are (a connection may still be usable with up to 10% loss). This is generally what the number of bars or the signal quality of your Wi-Fi connection refers to.

Because of this high loss rate, the frame used for Wi-Fi differs from the one used for Ethernet: it includes mechanisms for acknowledging the receipt of messages.
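The Data + Ack exchange can be illustrated with a small simulation in which the sender keeps retransmitting until an acknowledgement comes back. This is a rough model only: the function name, the `max_attempts` limit of 7, and the idea that data and acknowledgement are lost independently with the same probability are all assumptions made for the example, not values taken from the 802.11 standard.

```python
import random
from typing import Optional


def send_with_ack(
    loss_probability: float,
    max_attempts: int = 7,  # hypothetical retry limit
    rng: Optional[random.Random] = None,
) -> Optional[int]:
    """Return the number of attempts needed, or None if every attempt failed."""
    rng = rng or random.Random()
    for attempt in range(1, max_attempts + 1):
        data_arrived = rng.random() >= loss_probability
        ack_arrived = data_arrived and rng.random() >= loss_probability
        if ack_arrived:
            return attempt  # sender saw the Ack, so it can stop retransmitting
    return None


# Simulate 1000 frames on a link with 10% loss in each direction.
rng = random.Random(42)
results = [send_with_ack(0.10, rng=rng) for _ in range(1000)]
delivered = [attempts for attempts in results if attempts is not None]
print("delivered:", len(delivered), "of", len(results))
print("average attempts:", sum(delivered) / len(delivered))
```

Even with noticeable loss, a handful of retransmissions is usually enough to get a frame across, which is why a lossy Wi-Fi link can remain usable.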

Each wireless station has a Service Set Identifier (SSID), used as the human-readable name for a wireless access point, and a Basic Service Set Identifier (BSSID), which is the station's unique wireless identifier (the MAC address of the wireless station).

Wi-Fi Frame

Figure: 802.11 frame format
Image: Buhadram / Wikicommons CC BY-SA 4.0
 

MAC Address

There are four MAC address fields on each 802.11 frame.

The last two are optional and depend on the type of frame.


 

 

Our next topic will be the network layer where we will learn how millions of messages are routed from our local machines to a server possibly thousands of miles away, all in a few seconds.

 
