You’ve probably heard about how hard it is to operate networks. The complexity of the devices, the diversity of protocols operating simultaneously, and the sheer size of the network are all well-known reasons for getting something wrong. As a result, network changes require stressful, carefully planned change windows to ensure the desired outcome, with no unintended consequences.
In this post, I want to discuss an additional dimension to the complexity of the problem: cross-vendor, cross-platform and cross-version behavior differences. If not well understood, or if glossed over by the network operator, these little-known differences may result in security breaches, connectivity loss or network downtime, even when they affect a single device.
Imagine you are the network operator for an enterprise network. All the hosts in your network are firewalled and NATed by a Palo Alto Security Platform. All your hosts have a private IP address and their access is restricted by the firewall to match the security policy of the organization. In particular, the security policy dictates that all web communication to the outside from the host with IP address 10.1.1.10, which holds sensitive financial data, must be encrypted. The configuration you might put on the Palo Alto box simply translates the private IP address 10.1.1.10 to the public IP address 2.2.2.10, and denies any web browsing application other than SSL-based ones when the traffic comes from 10.1.1.10.
After a few months, as part of a hardware refresh, you are asked to replace this firewall with a Cisco ASA. The box you purchased is running ASA OS version 8.2. You do a bit of research online and translate each piece of config in your Palo Alto device to equivalent ASA configuration:
Pretty simple, right? Not so much…
We just introduced a security hole.
The subtle catch here is that the order in which a device applies NAT and firewall rules can change the outcome of the configuration above. If the device applies firewall rules first and then NATs the traffic, the configuration works as intended. Palo Alto firewalls do exactly that: firewall first, then NAT. On ASA versions 8.2 and lower, however, firewall rules are applied after the translation, so the rules matching on 10.1.1.10 as the source IP address have no effect. Instead, we should have applied those rules to the post-translation address, 2.2.2.10.
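To make the ordering problem concrete, here is a minimal Python sketch of the two pipelines. It is a toy model, not vendor code: the `Packet` class, the function names and the application labels ("ssl", "web-browsing") are all made up for illustration, and only the two IP addresses come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src_ip: str
    app: str  # illustrative label, e.g. "ssl" or "web-browsing"


# Source NAT from the example: private address -> public address.
NAT_TABLE = {"10.1.1.10": "2.2.2.10"}


def apply_nat(packet):
    """Rewrite the source address if a translation exists."""
    return Packet(NAT_TABLE.get(packet.src_ip, packet.src_ip), packet.app)


def firewall_permits(packet, protected_ip):
    """Deny non-SSL web traffic from protected_ip; permit everything else."""
    if packet.src_ip == protected_ip and packet.app != "ssl":
        return False
    return True


def forwarded(packet, protected_ip, nat_first):
    """Run the packet through the firewall, with NAT before it or after it."""
    if nat_first:
        packet = apply_nat(packet)
    # With firewall-first ordering, NAT happens after this decision
    # and cannot change it, so it is not modeled here.
    return firewall_permits(packet, protected_ip)


clear_web = Packet("10.1.1.10", "web-browsing")

# Firewall first, rule written for the private address: blocked as intended.
print(forwarded(clear_web, "10.1.1.10", nat_first=False))  # False

# NAT first, same rule: the source is already 2.2.2.10, so the rule never matches.
print(forwarded(clear_web, "10.1.1.10", nat_first=True))   # True

# NAT first, rule rewritten for the public address: blocked again.
print(forwarded(clear_web, "2.2.2.10", nat_first=True))    # False
```

The same rule text gives opposite answers depending only on `nat_first`, which is exactly the kind of difference a syntax check will never flag.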
Six months later, Cisco comes out with ASA version 8.4 and you decide to upgrade. You may think this time everything should be fine. After all, it’s the same vendor! But, not quite… To start, you notice some syntax differences in how you define NATs, which you fix with the help of the CLI. But at this point, can you be sure that the new configuration behaves the way you want?
As it turns out, the answer is again, no! Going from version 8.2 to 8.4, the order in which NAT and firewall rules are applied changed, and it now matches the original Palo Alto firewall, so the rules once again need to match on the pre-translation address, 10.1.1.10. Bummer!
As the example above illustrates, different vendors, and even different OS versions or platforms from the same vendor, may have subtle differences in the implementation of the same concept, protocol or standard.
While syntactical differences can be caught by the device’s CLI or management UI, behavioral differences are very hard to catch.
What makes it even worse is that sometimes these subtleties are not well-documented. They can lead to error-prone configurations, security breaches, or network downtime if not well understood by network operators.
Even the simplest configuration constructs have surprisingly large differences between implementations. We have seen these subtleties mislead network operators in some of our customer networks. For example, some vendors require each VLAN to be explicitly declared before it is used on an access or trunk interface (e.g. Cisco IOS and NX-OS). If you forget to do so, the device silently drops any traffic arriving on that interface with the undefined VLAN tag.
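A check for this kind of mistake is easy to sketch once the configuration has been parsed. The snippet below assumes a hypothetical, already-parsed representation (a set of declared VLAN IDs and a mapping from interface name to the VLANs it uses); the function name and interface names are invented for illustration and do not correspond to any vendor tool.

```python
def undeclared_vlans(declared, interfaces):
    """Return {interface: [vlan, ...]} for VLANs that are used but never declared."""
    declared = set(declared)
    missing = {}
    for name, vlans in interfaces.items():
        gap = sorted(set(vlans) - declared)
        if gap:
            missing[name] = gap
    return missing


# Hypothetical parsed config: VLANs 10 and 20 are declared, 30 and 40 are not.
declared = {10, 20}
interfaces = {
    "Ethernet1": [10],          # access port, fine
    "Ethernet2": [10, 20, 30],  # trunk carrying an undeclared VLAN
    "Ethernet3": [40],          # access port on an undeclared VLAN
}

print(undeclared_vlans(declared, interfaces))
# {'Ethernet2': [30], 'Ethernet3': [40]}
```

Whether an undeclared VLAN is actually a problem depends on the platform, so a real check would only raise this for devices that require explicit declaration.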
Similarly, some vendors enforce referential consistency, which means that every object referenced in an ACL must be defined, while others have no such requirement (e.g. many Cisco routers). On devices that do not enforce it, an ACL rule that references an undefined object is simply ignored. As a result, a simple typo in the name of an object may open up a security hole in the network.
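On a device that does not enforce referential consistency, you can approximate the check yourself. The sketch below assumes the ACL has already been parsed into a list of dictionaries with `src` and `dst` object names; that structure, the function name and the object names are hypothetical and chosen only to illustrate the idea.

```python
def undefined_references(defined_objects, acl_rules):
    """Return (rule_index, object_name) for every reference to an undefined object."""
    defined = set(defined_objects)
    problems = []
    for index, rule in enumerate(acl_rules):
        for field in ("src", "dst"):
            name = rule[field]
            if name not in defined:
                problems.append((index, name))
    return problems


# Hypothetical object definitions and ACL; note the typo in the first rule.
defined_objects = {"FINANCE_SERVER", "ANY"}
acl_rules = [
    {"action": "deny", "src": "FINANCE_SERVR", "dst": "ANY"},
    {"action": "permit", "src": "FINANCE_SERVER", "dst": "ANY"},
]

print(undefined_references(defined_objects, acl_rules))
# [(0, 'FINANCE_SERVR')]
```

Here the deny rule is the one with the typo, so on a device that ignores it, only the permit rule would take effect.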
In the best case, not fully understanding these nuances can waste hours of your time; in the worst case, it can lead to a security breach that puts your company in the news.