Asymmetric routing with Cisco ASA firewalls

Last month I installed a new Cisco ASA 5510 for a client and came across an issue where traffic was hitting the “inside” interface of the firewall before travelling back out the same interface and into another router on the internal LAN – an issue as reported in this article Cisco ASA Deny TCP (no connection)

The diagram below demonstrates the network setup with PC1 trying to communicate with PC2. When the traffic leaves the MPLS router (RED line) it does not traverse the ASA and the next packet will follow the original route (GREEN then ORANGE lines) to get to PC2

Representing the traffic flow

Long term the resolution is to place the extra routers into their own DMZ networks on the perimeter network but as this didn’t exist at the time I needed to disable the TCP SYN checking for the traffic being routed to the MPLS routers – a process described in this article by Cisco – Configuring TCP State Bypass

First thing we do is create an ACL for all the items we want to bypass the SYN check

Now we create a class map to match the ACL

Then apply this to a policy map

Finally we assign that policy to the inside interface on the firewall

Traffic that hits the inside interface of the firewall that matches the rules on the ACL will not be checked for their tcp state and traffic should now flow.

In the long term it is recommended that this isnt the adopted approach and the firewall is configured to have the traffic traverse through from the inside to a DMZ interface to prevent the issues with the TCP SYN issue

Opsview – patch for check_route plugin

I was playing around with the check_route plugin and noticed a few issues with it not running. In order to get it to work on my Opsview boxes I had to install a new package, change some settings on the traceroute program and then make a patch in the script itself.

First thing you need to do is download the traceroute package if its not already installed

Once installed you will find that the plugin will fail and show the following error:

Googling the first line I found that you have to setuid root for the traceroute binary

Trying the plugin again you get the following error

To get around this you need the plugin to ignore the first line of the output from the traceroute which can be done with the following patch

http://snipt.net/mattywhi/opsview-check_route-diff/

Now the script runs as expected and you get the following output

 

Configuring Juniper SSG Firewalls to failover between Internet connections

I have been working with the Netscreen, and then Juniper firewall products for the past five years and am still learning new and interesting features they offer. One thing that I have been configuring more and more recently are secondary Internet connections and fail-over between them for clients. This post runs through the steps required to configure an SSG firewall to use track-IP to monitor IP addresses on the Internet and then automatically fail-over and fail-back an Internet connection.

The first thing we need to do is move the interfaces that will contain the Internet connections so each is in their own virtual router. This will allow us to have an active default route for each connection and they can behave independently of each other.

For this example I am using the 192.0.2.0/24 address range for my WAN connections – this was defined by the IETF as a subnet to be used for testing and documentation in RFC 5735. As these interfaces are both public facing I am also going to restrict the management to secure protocols only

Now we need to setup the default routes out of each virtual router so that each connection can communicate with the rest of the Internet

We need to ensure that our internal users are able to route to both the untrust-vr and adsl-vr. This can be done by exporting the default static route from the untrust-vr and adsl-vr

This will import both default routes to the trust-vr and set maintain the preference of the export from the untrust-vr at 20 whilst setting the metric of the adsl-vr export to 140.

Now that our users can connect to the Internet we need to make sure that should there be an issue with the primary internet circuit the backup circuit can be used for Internet access. This is achieved by using track-ip to monitor a number of hosts on the Internet and should they become unreachable shut the interface down.

In this example we are using the IP address of some of the root DNS servers as the addresses the firewall will use to check for a valid Internet connection but they could be any IP addresses that you expect to remain online and will respond to PING requests

This will PING the three addresses every second and will consider the address to have failed when the test has failed 25 times consecutively. Summing these three failures together will hit the weight and threshold limits of 75 needed to shut down the interface.

UPDATE: Since this was written Juniper released newer firmware that allowed you to specify the interface threshold for failure in addition to the Track-IP threshold. This would mean that track-ip would fail at 75 but the interface default was set to 255 for failover, the config above has been amended accordingly to reflect this change in behaviour.

If you want to test the status of the track-ip monitoring you can issue the following commands

and you will be able to see the failure statistics as well as whether the interface is failed or not.

When the interface is shut down the default route no longer becomes valid in the untrust-vr and will be deleted in the trust-vr leaving the export from the adsl-vr active and Internet traffic will continue to function as normal. In the background, the management address on the primary connection will continue to poll the IP addresses configured and when they become available the weight and threshold will be below the failure values, the interface comes back up and the untrust-vr route export re-appaers in the trust-vr.

The only other thing to consider here is inbound services on the backup line such as MX records to permit mail delivery to a MIP or VIP on the secondary circuit

If this is all configured correctly the only things the user should notice is that any websites/services that login and use session data (eg online banking) will need to login after fail-over or fail-back as their existing session will no longer be valid.

The only remaining task is to commit the changes you have made to flash

 

Opsview Labs

I had an unexpected, but much welcomed, tweet today from the team at Opsview who would like to make use of some of my writing about Opsview on their own Labs blog. I can honestly say that I wasn’t expecting the blog to be read and picked up in this way but I am pleased that I can hopefully reach a few more people with the data appearing on the Opsview Labs blog as well.

 

Monitoring HP ESXi Hosts using Insight Remote Support

This is just a direct link to the HP Blog article itself but worth a read if you are looking at monitoring any HP server running ESX or ESXi. The main bit that I have always found is that you need to install the HP extensions for ESXi installed as this greatly improves what you can see from remote tools such as Insight Remote Support, Nagios/Opsview or from the vSphere client itself.

The link to the article can be found here – http://h30507.www3.hp.com/t5/Technical-Support-Services-Blog/6-Simple-Steps-to-Monitoring-ESXi-with-Insight-Remote-Support/ba-p/100789

RANCID: Backing up Juniper EX switches

As part of my drive to backup all my switch/firewall configs I have been trying to get RANCID to backup the remaining devices on my network. The latest devices we added to the network were a pair of Juniper EX switches that are part of an iSCSI network and until now I have not had a backup of the configs. Looking at the documentation there is a set of commands to backup other JunOS devices so thought I would give it a go.

RANCID is running on an Ubuntu 10.04 server and is running version 2.3.3. and has the jlogin scripts in place. After adding the device information to the .cloginrc file I tested jlogin to check that it could connect as root to the device – it did. When I performed rancid_run however the device did not backup as expected and Rancid hung until it timed out. Upon closer inspection the issue came down to the fact that the root account will ssh to the BSD shell on the switch and not directly to the JunOS command line. To get around this I needed to setup a new user on the switches with the correct permissions and then get this to perform the backup of the switches. The command to add the config is as follows:

You will be prompted to choose a password and then confirm it before writing it to configuration

Now you can specify the details in RANCID:

The last thing that I did was to take a copy of jlogin and jrancid from an installation of RANCID 2.3.6 and everything seems to be working as expected.

RANCID: Issue backing up Cisco Aironet access points

I have had RANCID setup to backup switch and firewall config for a while now but not I had always had issues with backups of my Cisco access points which I had thought was an issue with the version of RANCID or the slight differences in IOS run on the WAPs versus the Switches. Turns out after revisiting it yesterday it was more a PEBKAC or ID-10-T error on my part!

What I had in my .cloginrc file was:

when I ran bin/clogin ip_address the device would login and get me to the enable prompt as expected but when run as part of rancid_run nothing was coming back for the config. After a bit of reading and searching the solution was simple enough and it wasnt a problem with RANCID or the Aironets….

should have been used instead of the noenable line.

I also managed to get RANCID to backup the config on my Juniper EX switches but that is a story for another post

check_equallogic volumes bug

I have been playing arond with the check_equallogic Nagios plugin written by Claudio Kuenzler (http://www.claudiokuenzler.com) to monitor some performance and utilisation values for a client and I came across a bug with the code in the latest release which I thought I would share.

The latest release allows you to monitor the size of a single volume as well as a single check to monitor all volumes. I setup the check in Opsview as normal and then proceeded to configure the Host Attributes for the SAN host for each volume on the SAN (there were 75 volumes to monitor). Having added all the checks and reloading Opsview I started to see a large number of OK checks for the volumes but also a number of UNKNOWN outputs from the plugin. Closer inspection showed that when you have two volumes that have the similar names (e.g. BES01-D and DR-BES01-D) the more generic name, BES01-D in this example will match for both volumes and the script will return an unknown value. The DR-BES01-D volume returned the correct stats as the volume name only matched one entry.

Looking through the code in the plugin the line that is causing the issue is:

When it grep’s the list of volumes from the SNMP walk it returns two values and the script cannot cope so exits. After some playing around (and remembering the basics of writing bash scripts) I managed to work around the problem and changed the line to the following:

The change adds the quotation marks that are surrounding the string value that is returned from the SNMPwalk so GREP should only return the exact matches. Having updated the script and re-run the checks the UNKNOWN status was gone and the checks all returned the correct data.

RSA Authentication Manager 7.1 upgrade issue

Following on from my article on the SQL files bug in RSA Authentication Manager 7.1 we were looking to carry out the upgrade to the client’s server in a maintenance window last weekend however the engineer carrying out the work was unable to login to the Operations Manager console to carry out certain parts of the upgrade task.

It turns out that since RSA was installed the Security Console Super Admin account had its password changed and in the updated documentation we lost the details of the password for the Operations Console as the two passwords are not linked. In order for us to get back into the Operations Console we had to run through the following:

  1. Create a new Super Admin from the Security Console in the Internal Database
  2. Run the RSA command line utility (C:Program FilesRSA SecurityRSA Authentication ManagerutilsRSAutil) to create a new Operations Console user account

Unfortunately it wasnt that easy to complete!

Initially when we ran RSAutil as one of the admin accounts we received an error stating that only one account could run it, the account that originally installed RSA! Luckily the account was still listed and we just needed to enable this and perform a swift “runas” to bring up a command prompt as that user.

Next we sent a good bit of time running through various commands to work out how we create a new Operations Console admin account. The final command that we needed to run was as follows:

rsautil manage-oc-administrators -a create -u UserCreatedEarlier -p PasswordForUserCreatedEarlier -g OperationsConsole-Administrators NewOperationsConsoleUsername NewOperationsConsolePassword

We were now able to login to the Operations Console using the account we created. Now to find another maintenance window to patch the RSA server