.. _firewalling:

Firewalling
###########

Concept
=======

A *firewall* is a basic defense tool for hosts connected to the internet. The basic concept is that network traffic (usually in the form of IP packets, see :ref:`ip`) is allowed or disallowed based on the rules configured in the firewall. Disallowed traffic is either silently discarded ("dropped") or an error message is returned to the sender ("rejected"). Firewalls often also have other features such as re-writing parts of packets (e.g. for Network Address Translation).

On Linux systems, firewalling is handled in the kernel and can be configured from userspace as root via iptables or nftables, or via tools built on either of those. Generally, firewalls are configured with very different philosophical approaches based on where they are employed. A firewall on a router may have very different requirements than a firewall on a server or end-user system.

Firewalls mostly work on the transport layer and below. Layers above are rarely taken into account, and when they are, the technique is usually called Deep Packet Inspection.

There are two classes of firewalls: stateless and stateful. Stateless firewalls look at each packet entirely in isolation; no information from other packets is taken into account. These days, entirely stateless firewalls are rare on end-user and server systems, because they have difficulties filtering connection-oriented transports such as TCP: from a single TCP packet, it is not trivial to tell, for example, whether it belongs to an inbound or an outbound connection. The Linux packet filtering used by iptables and nftables is stateful. An example of a stateful mechanism with iptables is conntrack, which tracks the state of TCP connections. We will discuss an example of that later on.

iptables
========

iptables rules are organised in chains, which in turn are organised in tables. We will only discuss the ``filter`` table in this document; it is responsible for accepting and rejecting inbound, outbound and forwarded traffic. Other tables are ``mangle`` and ``nat``, which are used for more advanced scenarios and for which I personally avoid writing rules by hand.

The ``filter`` table has three chains, ``INPUT``, ``OUTPUT`` and ``FORWARD``. The ``INPUT`` chain processes traffic directed at the host itself. The ``OUTPUT`` chain processes traffic *originating* from the host and the ``FORWARD`` chain processes traffic which is only forwarded by the host. Traffic which goes through the ``FORWARD`` chain does not pass through ``INPUT`` or ``OUTPUT``, since it is neither directed to the host itself nor originating from there.

Practical Approach
==================

As hinted at above, a firewall is only as useful as the rules programmed into it. To create these rules, there are two basic approaches: blacklisting and whitelisting. With blacklisting, all traffic is allowed by default and only unwanted traffic is filtered out. With whitelisting, all traffic is disallowed by default and only known good traffic is passed on (the extent to which the firewall can decide whether traffic is "good" depends on how deeply it inspects the packets; see above).

.. note::

   Generally, I recommend the blacklisting approach for routers and the whitelisting approach for server and end-user systems. Some people argue that firewalls should not run at all on server systems; my counter-argument is that a firewall is a good defense-in-depth measure. Of course, a (not too stateful) firewall does not help against an exploit in OpenSSH or any other daemon which is deliberately running on the system and listening on the open internet.

   However, a firewall can very well help with additionally protecting services, e.g. by adding IP-address-based filters (which are at least partially useful with connection-oriented protocols).

Configuration
=============

As mentioned, firewalls on Linux are configured (on the lowest level within userspace) with iptables or nftables. The most common tool at the time of writing is iptables; nftables is gaining traction, and I recommend reading up on it yourself.

Maintaining iptables rules by hand is cumbersome, which is why many users resort to wrappers around iptables, such as ferm. If you are familiar with the iptables syntax, ferm will be easy to learn. It lets you factor out common parts of iptables rules, making them much easier to read and maintain. In addition, it usually comes with a service definition which takes care of applying those rules at boot time.

A simple example of a ferm ruleset is shown below::

   domain (ip ip6) table filter {
       chain INPUT {
           policy DROP;

           # connection tracking
           mod state state INVALID DROP;
           mod state state (ESTABLISHED RELATED) ACCEPT;

           # allow local traffic on the loopback interface
           interface lo ACCEPT;

           # respond to ping
           proto icmp ACCEPT;

           # allow SSH connections
           proto tcp dport ssh ACCEPT;
       }

       chain OUTPUT {
           policy ACCEPT;

           # connection tracking
           #mod state state INVALID DROP;
           mod state state (ESTABLISHED RELATED) ACCEPT;
       }
       chain FORWARD {
           policy DROP;

           # connection tracking
           mod state state INVALID DROP;
           mod state state (ESTABLISHED RELATED) ACCEPT;
       }
   }

Let us dissect that step by step. First of all, the braces (``{`` and ``}``) group rules together; those can be nested, too. So the first line essentially says "everything between the outer pair of braces applies to both IPv4 and IPv6 and to the filter table of iptables".

Then there are three blocks, one for inbound traffic (started by ``chain INPUT``), one for outbound traffic (started by ``chain OUTPUT``) and one for forwarded traffic (``chain FORWARD``). The ``table`` and ``chain`` directives of ferm directly relate to the tables and chains of iptables.

The ``policy`` statement tells iptables what to do with traffic which is not matched by any rule. In this case, the ``OUTPUT`` chain is set to accept all traffic by default, while ``INPUT`` and ``FORWARD`` chains use whitelisting, i.e. they drop all traffic by default.

.. note::

   In my opinion, there is rarely a use-case for filtering on the ``OUTPUT`` chain. One prominent exception, however, is preventing a system from sending mail to any host other than a few specific ones.
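
   As a rough sketch of that mail use-case (the relay address ``192.0.2.25`` is a placeholder from the documentation address range, and the block is limited to IPv4 because of it), such a restriction could look like this in ferm::

      domain ip table filter chain OUTPUT {
          # allow SMTP only towards the designated relay (placeholder address)
          proto tcp dport 25 daddr 192.0.2.25 ACCEPT;

          # refuse SMTP to every other host; REJECT makes local clients fail fast
          proto tcp dport 25 REJECT;
      }

   IPv6 would need an analogous block with ``domain ip6``.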

The ``mod state state INVALID DROP`` line in the ``INPUT`` chain can be understood as follows:

* ``mod state``: use the ``state`` iptables module (see the ``iptables-extensions`` manpage for more information on iptables modules)
* ``state INVALID``: select packets in ``INVALID`` state according to the ``state`` module
* ``DROP``: apply the ``DROP`` action (simply discard the packets)

The next line of the ``INPUT`` chain uses a ferm feature which I personally consider one of the most important ones: each "value" in a rule can be parenthesised to make the rule apply to multiple values without repeating it. So the line ``mod state state (ESTABLISHED RELATED) ACCEPT`` tells iptables to accept traffic which is in ``ESTABLISHED`` or ``RELATED`` state.
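
As a small illustration of that feature (ports 80 and 443 are arbitrary example values, not part of the ruleset above), a rule with a parenthesised list should behave like the same rule written out once per value::

   # one rule with a list of values ...
   proto tcp dport (80 443) ACCEPT;

   # ... matches the same traffic as these two rules
   proto tcp dport 80 ACCEPT;
   proto tcp dport 443 ACCEPT;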

In general, all ferm statements are constructed by concatenating keywords (such as ``chain``, ``mod``, ``state``) followed by arguments (such as ``INPUT``, ``state`` and ``ESTABLISHED``) to form a rule for matching packets and finished with an action (such as ``ACCEPT``, ``DROP`` or ``REJECT``).
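
To see that pattern at work on a rule which is not in the ruleset above: the ``state`` module is, as far as I know, a legacy subset of the newer ``conntrack`` module, so the connection tracking rules could presumably also be written as follows (a sketch, assuming your ferm and iptables versions provide ``conntrack``)::

   # keyword "mod" + argument "conntrack", keyword "ctstate" + argument(s), then the action
   mod conntrack ctstate INVALID DROP;
   mod conntrack ctstate (ESTABLISHED RELATED) ACCEPT;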

With that in mind, the ``interface lo ACCEPT`` statement should be clear: all traffic arriving on the ``lo`` interface shall be accepted.

Another important rule in that piece of ferm configuration is ``proto tcp dport ssh ACCEPT``. Two things are important about it. First, a rule like that should *always* be in your firewall: it allows traffic to the SSH daemon. Second, ``dport ssh`` does *not* mean to accept SSH traffic. It in fact means to accept traffic on the port number associated with the ssh service; note that those associations are defined by the IANA and have nothing to do with your SSH daemon's configuration.

If you change your SSH config to use port 1234, you will have to adapt the rule to ``dport 1234``. This is a reason why I prefer numeric ports over port names: it avoids misleading the reader with the name (I could also have an HTTP daemon listening on port 22 to confuse people).
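
A minimal sketch of such an adapted rule, combined with the address-based filtering mentioned earlier. The variable names and the networks (taken from the documentation address ranges) are made up for illustration, and the block is IPv4-only because of those addresses::

   @def $SSH_PORT = 1234;
   @def $ADMIN_NETS = (192.0.2.0/24 198.51.100.0/24);

   domain ip table filter chain INPUT {
       # replaces "proto tcp dport ssh ACCEPT": only the listed networks may reach SSH
       saddr $ADMIN_NETS proto tcp dport $SSH_PORT ACCEPT;
   }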

The above piece of ferm configuration can "safely" be deployed on any system whose SSH daemon listens on port 22: you will still be able to connect to it after applying this piece of configuration.
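
One caveat for hosts reachable over IPv6: ICMPv6 is a protocol of its own, so to my knowledge ``proto icmp`` does not match IPv6 pings or neighbour discovery. A hedged sketch of an additional block which accepts ICMPv6 (assuming the protocol name ``ipv6-icmp`` is known on your system, as it usually is via ``/etc/protocols``)::

   domain ip6 table filter chain INPUT {
       # ICMPv6 carries ping as well as neighbour discovery
       proto ipv6-icmp ACCEPT;
   }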

* Debugging

  * iptables-save