TCP/IP (Transmission Control Protocol/Internet Protocol) is a set of protocols used for data transmission over computer networks. The TCP/IP model covers the main functionalities of the theoretical OSI model. The image below presents the corresponding layers of both the TCP/IP and OSI models.
Every message sent by an application has to pass through all the TCP/IP layers, from the application layer down to the lowest one, the network interface layer. Then it is transmitted over the network to another computer. Finally, it moves all the way up to the application layer and then to the target application.
While data is passed down from the application to the network, each layer adds its own header to every message. Each header is then handled by a corresponding layer on the receiving computer (where, as we said earlier, messages are passed from the network up to the application layer and beyond). Both the content and the size of each header depend on the protocol that has been used in the layer.
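The header wrapping described above can be pictured with a small sketch. It is only an illustration: the text labels stand in for real binary headers, and the names `LAYERS`, `encapsulate` and `decapsulate` are made up for this example.

```python
# Illustrative only: real headers are binary structures, not text labels.
LAYERS = ["transport", "internet", "network-interface"]  # order on the way down


def encapsulate(message: bytes) -> bytes:
    """Prepend one placeholder header per layer, from top to bottom."""
    for layer in LAYERS:
        message = f"[{layer}-hdr]".encode() + message
    return message


def decapsulate(frame: bytes) -> bytes:
    """Strip the headers in reverse order, as the receiving stack would."""
    for layer in reversed(LAYERS):
        header = f"[{layer}-hdr]".encode()
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame


frame = encapsulate(b"GET /")
print(frame)               # outermost header belongs to the lowest layer
print(decapsulate(frame))  # the original application message
```

The header added last (by the lowest layer) ends up outermost, which is why the receiving side removes headers in the opposite order.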
The first of the four TCP/IP layers, the application layer, handles the communication between computer programs and the lower layer protocols, thus allowing applications to use networks. The programs can use one of many application layer protocols to request different kinds of actions.
There are a lot of application layer protocols that use TCP/IP data transmission. Some of the popular protocols are:
- HTTP, HTTPS - for web browsing,
- FTP, TFTP, NFS - for file transfer,
- SMTP - for sending email messages,
- POP3 - for receiving email messages,
- IMAP - for managing email messages on the server,
- Telnet, rLogin - for accessing remote computers,
- SNMP - for network management,
- DNS - for finding IP addresses assigned to domain names,
- IRC - for online chats.
Application layer messages vary depending on the protocol that has been used. Each protocol requires different input data and produces different queries that are to be sent to the transport layer. Irrespective of what was produced by the application layer, the transport layer treats every received message as data and doesn't care about its content.
Internet sockets are structures that are used for communication between the application and transport layers. Every process or application trying to connect to the network has to associate its input and output channels with the corresponding internet socket objects.
An internet socket contains an IP address, a port number and a transport layer protocol name. A unique combination of those three values determines which process should deal with the message.
The port number can be assigned automatically by the operating system, set manually by the user, or taken from the defaults used by some popular applications. The port number is a 16-bit integer (0 - 65535).
Some popular application layer protocols use predefined, well-known port numbers by default. For example, HTTP uses port 80, HTTPS uses port 443, SMTP port 25, Telnet port 23, and FTP uses two ports: 20 for data transmission and 21 for transmission control. The list of such default port numbers is managed by the Internet Assigned Numbers Authority (IANA).
The process of associating an application with a socket is called binding. After successful binding the application doesn't need to care about network management, because all further operations are handled by the protocols of the lower TCP/IP layers.
In some operating systems special privileges are required for applications to bind to port numbers lower than 1024. Therefore, many processes use higher port numbers that are allocated for short-term use. Such ports are called ephemeral ports.
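As a minimal sketch of binding, Python's standard `socket` module can ask the operating system for any free port by binding to port 0. The loopback address `127.0.0.1` is an assumption made so that the example does not depend on a particular network setup.

```python
import socket

# Create a TCP socket and bind it to the loopback interface.
# Port 0 asks the operating system to pick a free port automatically.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    sock.bind(("127.0.0.1", 0))
    address, port = sock.getsockname()  # the (IP address, port) actually bound
    print(address, port)
```

The printed port differs from run to run; it always fits in 16 bits, and on most systems it falls in the range reserved for short-term use.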
A user can specify a port number in a URL, after the host name and a colon. Such a URL forces the browser to try to reach the website using, for example, port 8080 instead of the default HTTP port 80.
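How a client might work out which port to contact can be sketched with Python's `urllib.parse`. The URLs under `example.com` are hypothetical stand-ins, and `DEFAULT_PORTS` and `effective_port` are names invented for this illustration.

```python
from urllib.parse import urlsplit

# Defaults for the two schemes discussed in the text.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str) -> int:
    """Return the explicit port from the URL, or the scheme's default port."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


print(effective_port("http://example.com:8080/index.html"))  # explicit port
print(effective_port("http://example.com/index.html"))       # default for HTTP
```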
The transport layer receives messages from the application layer. It divides them into smaller packets, adds a header to each packet, and sends the packets down to the internet layer. The header contains several pieces of control information, most importantly the source and target port numbers.
Port numbers are used by the transport layer when it handles packets coming up from the internet layer (that is, when receiving data). The target port number determines which application layer protocol should receive the incoming message. For example, a packet with the target port number equal to 25 will be delivered to the protocol connected to this port, usually SMTP. In this case, SMTP will provide the data to the email application that requested it.
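The port-based delivery described above amounts to a lookup from target port to receiver. The sketch below is a simplified model, not how an operating system implements it; the `handlers` table and `deliver` function are hypothetical names.

```python
from typing import Callable, Dict

# Hypothetical table of listeners, keyed by target port number.
handlers: Dict[int, Callable[[bytes], str]] = {
    25: lambda payload: f"SMTP got {len(payload)} bytes",
    80: lambda payload: f"HTTP got {len(payload)} bytes",
}


def deliver(dest_port: int, payload: bytes) -> str:
    """Hand the payload to whichever handler is registered for the port."""
    handler = handlers.get(dest_port)
    if handler is None:
        return f"no listener on port {dest_port}"
    return handler(payload)


print(deliver(25, b"HELO"))
print(deliver(8080, b"ping"))
```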
The most common protocol used in the transport layer is TCP (Transmission Control Protocol). This is a connection-oriented protocol. TCP offers reliable, peer-acknowledged, ordered, session-based connectivity between two hosts.
All the features mentioned above are provided by the TCP layer itself. This means that it may operate with other, unreliable, protocols in the lower layers and that this shouldn't affect the communication from the application layer's perspective.
When sending data, TCP makes sure that the data has reached the recipient. The receiver checks whether the received packet arrived intact (by verifying the checksum of the data) and, if so, confirms it by sending an acknowledgement to the sender. If the sender doesn't receive the acknowledgement for a message within some time period, it will resend the packet, assuming it was lost.
After several unsuccessful attempts, TCP assumes that the receiver is unreachable and informs the application layer that the transmission has failed.
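The integrity check mentioned above uses the 16-bit one's complement "Internet checksum". The sketch below computes it over raw bytes only; real TCP also includes a pseudo-header with the IP addresses in the calculation, which is left out here for brevity. The function name `internet_checksum` is chosen for this example.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


data = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(data)
print(hex(checksum))

# The receiver recomputes over data plus checksum; an intact packet yields 0.
print(internet_checksum(data + checksum.to_bytes(2, "big")))
```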
The TCP header contains a field with the sequence number. The sequence number identifies the position of the packet's first data byte in the stream, so it advances by the number of bytes sent. When receiving data, TCP rearranges incoming packets and puts them in the right order. Thanks to that, the application layer doesn't need to care about the ordering of network packets.
The TCP header consists of 20 or more bytes. The size depends on whether the optional options field is used. The maximum size of the options field is 40 bytes, thus the maximum size of the whole header is 60 bytes.
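The 20 to 60 byte range follows from the header's 4-bit data offset field, which counts the header length in 32-bit words (5 to 15). The sketch below builds a minimal header with `struct` and reads the length back; the port, sequence and window values are arbitrary sample data.

```python
import struct


def tcp_header_length(segment: bytes) -> int:
    """Return the TCP header length in bytes, read from the data offset field."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    data_offset_words = segment[12] >> 4  # upper 4 bits of byte 12
    return data_offset_words * 4          # the field counts 32-bit words


# Fields: source port, target port, sequence number, acknowledgement number,
# data offset/reserved, flags, window, checksum, urgent pointer.
header = struct.pack("!HHIIBBHHH", 49152, 80, 1000, 0, 5 << 4, 0x02, 65535, 0, 0)
print(len(header), tcp_header_length(header))
```

A data offset of 5 means no options (20 bytes); the largest value, 15, gives the 60-byte maximum.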
Two applications need to establish a session to exchange data. TCP requires three messages to create the session:
- SYN - the first application (the client) sends a synchronize packet to the host. The message contains a random sequence number, which has been set by the client.
- SYN-ACK - the host responds to the client. It increases the sequence number from the client by one and sends it back in the message as an acknowledgement number. Also, the response message contains another sequence number chosen randomly by the host.
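- ACK - the client sends an acknowledgement message to the host. Its sequence number is the client's initial sequence number increased by one (the value the host has just acknowledged), and its acknowledgement number is the host's sequence number increased by one.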
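The numbers exchanged in the three messages can be traced with a short simulation. It only models the arithmetic, not real network traffic, and the function name `three_way_handshake` is made up for this sketch.

```python
import random


def three_way_handshake(client_isn: int, server_isn: int):
    """Return (name, sequence number, acknowledgement number) for each message."""
    mod = 2 ** 32  # sequence numbers are 32-bit values and wrap around
    syn = ("SYN", client_isn, None)
    syn_ack = ("SYN-ACK", server_isn, (client_isn + 1) % mod)
    ack = ("ACK", (client_isn + 1) % mod, (server_isn + 1) % mod)
    return [syn, syn_ack, ack]


# Both sides pick their initial sequence numbers at random.
for name, seq, ack_no in three_way_handshake(random.getrandbits(32),
                                             random.getrandbits(32)):
    print(name, seq, ack_no)
```

For example, with initial sequence numbers 100 (client) and 300 (host), the messages carry SYN seq=100, SYN-ACK seq=300 ack=101, and ACK seq=101 ack=301.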
When the transmission is completed, the session should be terminated. Either side can terminate the session. The other side is supposed to acknowledge that.
TCP is widely used by protocols and applications that require high reliability. It is not as fast as UDP but, if configured properly, it still provides quite good speed together with reliable delivery of the transmitted data.
There are a lot of application layer protocols that are mostly used together with TCP. Some of the most popular ones are:
- HTTP, HTTPS
The second popular protocol used in the transport layer is UDP (User Datagram Protocol, sometimes informally called Universal Datagram Protocol), a simpler, connectionless protocol. One program just sends some packets to another, without creating any kind of relation between them.
Due to its simplicity UDP is faster than TCP. On the other hand, it doesn't provide the reliability of TCP. There is no guarantee that the messages will reach the receiver, and UDP doesn't guarantee that packets arrive in the order in which they were sent. It is up to the application to check that the received messages are intact and to deal with data in the correct order.
The UDP header is 8 bytes long. It is much shorter and simpler than the corresponding TCP header.
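The 8 bytes are four 16-bit fields: source port, target port, length and checksum. A minimal sketch of building such a header with `struct` follows; the helper name `build_udp_datagram` and the sample ports are assumptions, and the checksum is left at 0 (which in IPv4 means "not computed") to keep the example short.

```python
import struct


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes,
                       checksum: int = 0) -> bytes:
    """Prepend an 8-byte UDP header to the payload."""
    length = 8 + len(payload)  # the length field covers header plus data
    header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
    return header + payload


datagram = build_udp_datagram(5353, 53, b"hi")
print(len(datagram), struct.unpack("!HHHH", datagram[:8]))
```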
UDP is preferred if unimportant data is transmitted or the communication has to be really fast. For example, UDP is used for DNS requests (because of a huge number of clients sending many short messages to relatively few DNS servers). Similarly, during audio and video transmission the loss of some packets is not so damaging to the receiver.
There are a lot of application layer protocols that use UDP, for example:
Datagram Congestion Control Protocol (DCCP) is a protocol that allows applications to use congestion control mechanisms and provides reliable connection setup and teardown. It doesn't provide reliable in-order delivery of the data itself.
DCCP is used by applications which work with quickly changing data (streaming media, online games, VoIP). In such situations it is often better to use a new piece of available data than to ask for retransmission of an old, damaged packet.
Resource Reservation Protocol (RSVP) allows for the reservation of resources across a network. It is mainly used by routers and hosts to deliver specific levels of quality of service (QoS) to clients.
RSVP can reserve bandwidth for one-to-one and one-to-many transmissions. The protocol is initiated by the client (receiver), which asks the router to reserve some resources.
Stream Control Transmission Protocol (SCTP) allows sending multiple streams of data through one connection. It ensures reliable and in-order transmission with congestion control, similarly to TCP, but allows sending related data streams together in the same messages.
In general, SCTP is quite a powerful protocol. However, due to poor support in routers and operating systems, at present it is not widely used.
The internet layer adds another header to the messages received from the transport layer. The most important fields in the new header are the IP addresses of both the source and target machines. The IP address is a unique logical number that allows the device to be found in the network.
Each network device also has another special number assigned to it, called a MAC address. This number is assigned by the manufacturer, is normally stored in the device's hardware, and is intended to identify the device uniquely throughout the world. However, locating a device by its MAC address in a global network is practically impossible, because this number is strictly hardware related and tells us nothing about the position of the device. IP addresses, on the other hand, reflect the position of the device in the network, and they can be found by using DNS servers: every computer can query a DNS server to obtain the IP address of a target device from its name.
In general, messages travel through several routers before reaching the target server (pointed out by the target IP address). To find out a way between the computer and the server, one could use the Windows command:
There are a few protocols that work in the internet layer. The most important, and the most popular, of them is IP (Internet Protocol). Other internet layer protocols include:
- ARP (Address Resolution Protocol)
- RARP (Reverse Address Resolution Protocol)
- ICMP (Internet Control Message Protocol)
IP is used for transmitting data packets over the network. At present two versions of this protocol are in use: IPv4 (IP version 4) and IPv6 (IP version 6).
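One visible difference between the two versions is the size of the address: 32 bits in IPv4 and 128 bits in IPv6. Python's standard `ipaddress` module can show this; the addresses below come from ranges reserved for documentation, and `describe` is a helper name invented for this sketch.

```python
import ipaddress


def describe(text: str):
    """Return (IP version, address size in bits) for a textual address."""
    addr = ipaddress.ip_address(text)
    return addr.version, addr.max_prefixlen


for text in ("192.0.2.1", "2001:db8::1"):
    version, bits = describe(text)
    print(f"{text}: IPv{version}, {bits}-bit address")
```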
IP doesn't provide any acknowledgement mechanism, which means that it is unreliable. It is up to TCP, operating in the transport layer, to make sure that all the requested data has been delivered. Therefore the TCP/IP connection as a whole will be reliable.
The data packets taken from the transport layer are wrapped into datagrams. Every datagram consists of the IP header and the bytes received from the transport layer. The maximum size of a datagram depends on the IP version: 2^16−1 bytes (65,535) for IPv4, and 2^32−1 bytes for IPv6 (when the jumbo payload option is used). If the transport layer packet is too large, it will be divided into several smaller datagrams.
Usually the data is divided into even smaller datagrams because of the limited capabilities of physical networks. For example, the maximum payload of an Ethernet frame is 1,500 bytes, so datagrams created in the internet layer for Ethernet will usually be at most 1,500 bytes, or slightly smaller when the lower layers add their own headers. The maximum datagram size for a network is called the MTU (Maximum Transmission Unit).
IP allows a datagram to be divided into smaller fragments if it has to pass through a different kind of network with a smaller MTU. The fragments are later re-assembled into the original datagram (in IPv4, this happens at the destination host). A special field in the IP header, called Fragment Offset, supports this operation.
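As a rough sketch of how the Fragment Offset values come about in IPv4: the offset is counted in units of 8 bytes, so every fragment except the last carries a multiple of 8 data bytes. The function below assumes a 20-byte IP header without options; the name `fragment` and the sample sizes are chosen for illustration.

```python
def fragment(payload_len: int, mtu: int, header_len: int = 20):
    """Return (offset in 8-byte units, data length, more-fragments flag) tuples."""
    # Largest data size per fragment that is a multiple of 8 bytes.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    sent = 0
    while sent < payload_len:
        size = min(max_data, payload_len - sent)
        more = sent + size < payload_len  # True for all but the last fragment
        fragments.append((sent // 8, size, more))
        sent += size
    return fragments


for offset, size, more in fragment(4000, 1500):
    print(offset, size, more)
```

With 4,000 bytes of data and an MTU of 1,500, each full fragment carries 1,480 bytes, giving offsets 0, 185 and 370.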
The network interface layer allows datagrams from the internet layer to be sent via a physical network to another computer, where they are passed up through the corresponding network interface layer to the internet layer and beyond. At present, most computers are connected to Ethernet networks, which may be either wired or wireless. Therefore, the TCP/IP protocols of the upper layers are usually used together with the set of Ethernet protocols.
There are three Ethernet layers. The first two, Logical Link Control (LLC) and Media Access Control (MAC), correspond to the data link layer of the OSI reference model. The lowest layer is called the physical layer, as in the OSI reference model.
The main functionality of the first Ethernet layer is to inform the target machine which internet layer protocol ought to be used to properly deal with the incoming message.
The layer simply adds the information about the protocol used in the internet layer, and about the protocol that is intended to receive the message. This allows the LLC layer on the target computer to deliver datagrams correctly.
The layer is defined by the IEEE 802.2 standard.
The media access control layer creates the final message (Ethernet frame) that will be sent over the network.
The layer creates its own header, like the other layers. The header contains the source MAC address and the target MAC address, that is, the physical addresses of the two machines that want to exchange information. If the target machine is located beyond a router, in a different network, the target MAC address will be the router's MAC address (and the router will replace it with another one while forwarding the message).
The media access control layer also appends a 4-byte CRC (cyclic redundancy check) value, which is used for error detection: a frame whose CRC doesn't match is discarded.
The physical layer is responsible for converting messages into electrical signals, light or radio waves (depending on the type of the network) and for transmitting them over the physical network between the communicating machines.
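The effect of the 4 CRC bytes can be sketched with `zlib.crc32`, which uses the same CRC-32 polynomial as Ethernet. The helper names and the little-endian byte order of the appended value are assumptions made for this illustration, not a description of the exact on-wire format.

```python
import zlib


def append_fcs(frame: bytes) -> bytes:
    """Append a 4-byte CRC-32 of the frame contents."""
    return frame + zlib.crc32(frame).to_bytes(4, "little")


def fcs_ok(frame_with_fcs: bytes) -> bool:
    """Recompute the CRC on the receiving side and compare it."""
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return zlib.crc32(body).to_bytes(4, "little") == fcs


sent = append_fcs(b"hello over ethernet")
damaged = bytes([sent[0] ^ 0x01]) + sent[1:]  # flip one bit "in transit"
print(fcs_ok(sent), fcs_ok(damaged))
```

The check tells the receiver that the frame was damaged, but not which bits to repair, so the damaged frame is simply dropped.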