Using OSPF to maintain site-site VPN across multiple WAN links

Having a Single Point of Failure (SPoF) on your network is never desirable. I recently implemented a multi-site set-up where each site had two Internet connections and the satellite office had to be able to connect to the head office at all times. Each site has a Juniper SSG5-SB firewall, a 10Mbit leased line as its primary Internet circuit and an ADSL backup.

With the Juniper SSG firewalls it is possible to use policy-based VPNs to maintain multiple tunnels and have the firewalls switch between them as required. However, you end up with four policies on each firewall and you cannot tell from the routing table where the traffic is flowing. In this instance I decided to use OSPF to route the traffic dynamically depending on the availability of the VPNs at each site.

The first thing we need to do in order to implement this is to put each Internet connection into its own virtual router so they can run independently of each other. I have covered this in a recent blog post on configuring SSG firewalls to fail over between Internet connections.

Once you have the two firewalls set up with each Internet connection in its own virtual router, we need to set up the VPNs. This is done with a new zone in the trust-vr, and we will need four numbered tunnel interfaces on each firewall.

On the second site firewall you will need to repeat the commands but using the other IP address in the /30 in each case.
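The addressing plan for the tunnel interfaces can be sketched in a few lines of Python. This is only an illustration: the 10.255.0.x ranges and the tunnel.1 to tunnel.4 names are assumptions of mine, not the values from the original configuration. Each tunnel gets its own /30, site A takes the first usable address and site B the second.

```python
import ipaddress

# Hypothetical /30 networks, one per tunnel interface; substitute your own.
TUNNEL_NETS = {
    "tunnel.1": "10.255.0.0/30",
    "tunnel.2": "10.255.0.4/30",
    "tunnel.3": "10.255.0.8/30",
    "tunnel.4": "10.255.0.12/30",
}


def tunnel_endpoints(cidr):
    """Return the (site A, site B) addresses: the two usable hosts in a /30."""
    first, second = ipaddress.ip_network(cidr).hosts()
    return str(first), str(second)


# Build the full plan so both firewalls can be configured from one table.
plan = {name: tunnel_endpoints(cidr) for name, cidr in TUNNEL_NETS.items()}

for name, (site_a, site_b) in plan.items():
    print(f"{name}: site A {site_a}/30, site B {site_b}/30")
```

Keeping the plan in one place makes it harder to give both ends of a tunnel the same address by mistake.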

Now we need to set up the VPN tunnels. You may want to change these settings to suit your requirements, but I have used them regularly and they work well. NB: you will need to apply similar settings on the Site B firewall, using the endpoint addresses for Site A.

From the GUI you should be able to check that these have come up by going to VPN -> Monitor Status.

We now need to enable OSPF on the trust-vr and configure the interfaces to communicate using OSPF. This should be completed on both the primary and secondary site firewalls.

You can check the OSPF status by running the following command:

Finally we need to set up policies to allow traffic to flow across the VPN between the two sites.

You should now see all four IPsec VPN tunnels show as active, and the route between sites will be via the Layer 3 tunnel interface for the relevant tunnel.
To test this, take down the Internet connections one by one and watch the routing table update on each firewall.
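Before pulling cables it can help to write down which tunnel you expect to carry traffic in each failure case. The Python sketch below is a toy model, not how ScreenOS implements OSPF: the link names, the mapping of tunnels to WAN links and the costs are all hypothetical, and it simply picks the lowest-cost tunnel whose two underlying links are both up.

```python
# Hypothetical mapping of each tunnel to the WAN links it depends on,
# with an assumed OSPF cost (lower is preferred).
TUNNELS = {
    "tunnel.1": {"needs": {"A-leased", "B-leased"}, "cost": 10},
    "tunnel.2": {"needs": {"A-leased", "B-adsl"}, "cost": 50},
    "tunnel.3": {"needs": {"A-adsl", "B-leased"}, "cost": 50},
    "tunnel.4": {"needs": {"A-adsl", "B-adsl"}, "cost": 100},
}

ALL_LINKS = {"A-leased", "B-leased", "A-adsl", "B-adsl"}


def best_tunnel(links_up):
    """Return the lowest-cost tunnel whose two WAN links are both up, or None."""
    usable = [
        (spec["cost"], name)
        for name, spec in TUNNELS.items()
        if spec["needs"] <= links_up  # subset test: both links available
    ]
    return min(usable)[1] if usable else None


# Walk through the failure cases you would test by hand.
for failed in [set(), {"A-leased"}, {"B-leased"}, {"A-leased", "B-leased"}]:
    print(sorted(failed), "->", best_tunnel(ALL_LINKS - failed))
```

If the routing table on the firewalls disagrees with the table this prints for your own costs, that is a good place to start troubleshooting.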

JunOS: Logout stale edit sessions

I have been bitten enough times: my SSH session to a JunOS switch or router is disconnected for being idle, and when I reconnect I get a warning that another user is editing the configuration.

The easiest thing to do is to log out the other session once you have reconnected to the device, using the PID of the stale session (in this case 28439) with the following command:

request system logout pid 28439

You should now no longer see that message when you log back into the switch.

Monitoring Alarm Status on Juniper EX Switches

I am in the process of installing a number of Juniper EX2200, EX3200 and EX4200 switches for a client and, as part of the setup, need to be able to monitor the switches for any alarms (e.g. switch management interface down, or switch booted from backup partition) and have them dealt with accordingly.

Having a look at the SNMP OID tree for the EX switches I came across the following useful table:

http://www.oidview.com/mibs/2636/JUNIPER-ALARM-MIB.html

Object Name                 Object Identifier
jnxAlarms                   1.3.6.1.4.1.2636.3.4
jnxCraftAlarms              1.3.6.1.4.1.2636.3.4.2
jnxAlarmRelayMode           1.3.6.1.4.1.2636.3.4.2.1
jnxYellowAlarms             1.3.6.1.4.1.2636.3.4.2.2
jnxYellowAlarmState         1.3.6.1.4.1.2636.3.4.2.2.1
jnxYellowAlarmCount         1.3.6.1.4.1.2636.3.4.2.2.2
jnxYellowAlarmLastChange    1.3.6.1.4.1.2636.3.4.2.2.3
jnxRedAlarms                1.3.6.1.4.1.2636.3.4.2.3
jnxRedAlarmState            1.3.6.1.4.1.2636.3.4.2.3.1
jnxRedAlarmCount            1.3.6.1.4.1.2636.3.4.2.3.2
jnxRedAlarmLastChange       1.3.6.1.4.1.2636.3.4.2.3.3

I have used the jnxRedAlarmCount and jnxYellowAlarmCount OID values as basic Opsview SNMP Service Checks to give me an initial overview. In the long term I will be looking to combine these into a full service check script that can be used to check a number of different things.
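As a starting point for that combined script, here is a minimal sketch of the decision logic in Python. It assumes the two counters have already been fetched over SNMP by whatever means you prefer (the fetching is deliberately left out), and it returns the usual Nagios-style exit codes that Opsview plugins use. The function name and message wording are my own.

```python
# Scalar SNMP objects are read at instance .0 of the OIDs in the table above.
RED_ALARM_COUNT_OID = "1.3.6.1.4.1.2636.3.4.2.3.2.0"
YELLOW_ALARM_COUNT_OID = "1.3.6.1.4.1.2636.3.4.2.2.2.0"

# Standard Nagios-style plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def alarm_status(red_count, yellow_count):
    """Map the two alarm counters to an (exit code, message) pair.

    Pass None for a counter that could not be read.
    """
    if red_count is None or yellow_count is None:
        return UNKNOWN, "UNKNOWN - no SNMP response"
    if red_count > 0:
        return CRITICAL, f"CRITICAL - {red_count} red, {yellow_count} yellow alarm(s)"
    if yellow_count > 0:
        return WARNING, f"WARNING - {yellow_count} yellow alarm(s)"
    return OK, "OK - no active alarms"


code, message = alarm_status(red_count=1, yellow_count=1)
print(code, message)
```

Treating red as CRITICAL and yellow as WARNING is a design choice rather than anything Opsview mandates; adjust it to match how you want the alarms escalated.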

The setup of the Service Check in Opsview is fairly simple and below are screenshots of the config that I have for each service check.

All you need to configure on your hosts is the SNMP community string and you can apply these checks individually or via a Host Template.

Once I performed a reload I could see the following in Opsview for one of my switches:

A bit of inspection showed that the Red Alarm was for the Management Interface being down (it wasn't being used on this switch) and the Yellow Alarm was due to not setting a rescue configuration. I cleared the alarms by issuing the following commands:

Now when I refresh the checks in Opsview I get an OK state for both checks

Juniper EX view pending changes

When making changes to Juniper EX switches yesterday I wanted to check the changes that I had made to my configuration before committing them. A quick look in the reference manual gave me the following command (run from configuration mode):

show | compare rollback 0

This will show the edited candidate config and pipe that into the compare function and look at the changes to the specified version (rollback 0). I could look at the changes compared to a previous config by replacing 0 with another number in the rollback sequence.
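If the idea of comparing a candidate against a saved version is unfamiliar, the following Python sketch shows the same concept with the standard difflib module. It is an analogy only and not how JunOS produces its output; the host name and VLAN lines are invented for the example.

```python
import difflib

# Invented stand-ins for the active config (rollback 0) and the candidate.
active = [
    "set system host-name ex-sw1",
    "set vlans servers vlan-id 10",
]
candidate = [
    "set system host-name ex-sw1",
    "set vlans servers vlan-id 10",
    "set vlans iscsi vlan-id 20",
]

# Lines prefixed with "+" exist only in the candidate, "-" only in the active config.
diff = list(
    difflib.unified_diff(
        active, candidate, fromfile="rollback 0", tofile="candidate", lineterm=""
    )
)

for line in diff:
    print(line)
```

Swapping the first list for an older saved version corresponds to replacing 0 with another number in the rollback sequence.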

 

Configuring Juniper SSG Firewalls to failover between Internet connections

I have been working with the Netscreen, and then Juniper, firewall products for the past five years and am still learning new and interesting features they offer. One thing that I have been configuring more and more recently for clients is secondary Internet connections and fail-over between them. This post runs through the steps required to configure an SSG firewall to use track-ip to monitor IP addresses on the Internet and then automatically fail over and fail back an Internet connection.

The first thing we need to do is move the interfaces that will carry the Internet connections so that each is in its own virtual router. This will allow us to have an active default route for each connection, and they can behave independently of each other.

For this example I am using the 192.0.2.0/24 address range for my WAN connections – this is reserved by the IETF for testing and documentation (TEST-NET-1; see RFC 5735 and RFC 5737). As these interfaces are both public facing I am also going to restrict management to secure protocols only.

Now we need to set up the default routes out of each virtual router so that each connection can communicate with the rest of the Internet.

We need to ensure that our internal users are able to route to both the untrust-vr and adsl-vr. This can be done by exporting the default static route from the untrust-vr and adsl-vr.

This will import both default routes into the trust-vr, maintaining the preference of the export from the untrust-vr at 20 whilst setting the metric of the adsl-vr export to 140.

Now that our users can connect to the Internet, we need to make sure that should there be an issue with the primary Internet circuit, the backup circuit can be used for Internet access. This is achieved by using track-ip to monitor a number of hosts on the Internet and, should they become unreachable, shut the interface down.

In this example we are using the IP addresses of some of the root DNS servers as the addresses the firewall will use to check for a valid Internet connection, but they could be any IP addresses that you expect to remain online and to respond to PING requests.

This will PING the three addresses every second and will consider the address to have failed when the test has failed 25 times consecutively. Summing these three failures together will hit the weight and threshold limits of 75 needed to shut down the interface.
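The arithmetic is easier to see in a small model. The Python sketch below assumes each tracked address carries a weight of 25 (my reading of the description above, not a value taken from the config) and that the interface is failed once the summed weight of failed addresses reaches 75. The 198.51.100.x addresses are documentation placeholders, not the hosts I actually track.

```python
from dataclasses import dataclass


@dataclass
class TrackedIP:
    address: str
    weight: int = 25               # assumed weight contributed when this address fails
    threshold: int = 25            # consecutive failed pings before it counts as failed
    consecutive_failures: int = 0

    @property
    def failed(self):
        return self.consecutive_failures >= self.threshold


def interface_should_fail(tracked, track_ip_threshold=75):
    """Sum the weights of the failed addresses and compare with the threshold."""
    total = sum(ip.weight for ip in tracked if ip.failed)
    return total >= track_ip_threshold


tracked = [TrackedIP(f"198.51.100.{n}") for n in (1, 2, 3)]

# Two of the three addresses stop answering: 50 is below 75, so no fail-over.
tracked[0].consecutive_failures = 25
tracked[1].consecutive_failures = 25
print(interface_should_fail(tracked))

# The third address fails as well: 75 reaches the threshold.
tracked[2].consecutive_failures = 25
print(interface_should_fail(tracked))
```

With one ping per second and a per-address threshold of 25, a complete outage takes roughly 25 seconds to trigger fail-over in this model, and a single unreachable host never shuts the interface down on its own.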

UPDATE: Since this was written Juniper released newer firmware that allows you to specify the interface threshold for failure in addition to the track-ip threshold. This meant that track-ip would fail at 75 while the interface default for failover was 255. The config above has been amended to reflect this change in behaviour.

If you want to test the status of the track-ip monitoring you can issue the following commands:

and you will be able to see the failure statistics as well as whether the interface is failed or not.

When the interface is shut down, the default route in the untrust-vr is no longer valid and is removed from the trust-vr, leaving the export from the adsl-vr active, so Internet traffic continues to flow as normal. In the background, the management address on the primary connection continues to poll the configured IP addresses. When they become reachable again the summed weight drops below the failure threshold, the interface comes back up and the untrust-vr route export reappears in the trust-vr.

The only other thing to consider here is inbound services on the backup line, such as MX records to permit mail delivery to a MIP or VIP on the secondary circuit.

If this is all configured correctly, the only thing users should notice is that any websites or services that rely on login session data (e.g. online banking) will require them to log in again after fail-over or fail-back, as their existing session will no longer be valid.

The only remaining task is to commit the changes you have made to flash:

 

RANCID: Backing up Juniper EX switches

As part of my drive to back up all my switch and firewall configs I have been trying to get RANCID to back up the remaining devices on my network. The latest devices we added to the network were a pair of Juniper EX switches that are part of an iSCSI network, and until now I have not had a backup of their configs. Looking at the documentation there is a set of commands to back up other JunOS devices, so I thought I would give it a go.

RANCID version 2.3.3 is running on an Ubuntu 10.04 server and has the jlogin scripts in place. After adding the device information to the .cloginrc file I tested jlogin to check that it could connect to the device as root, which it did. When I ran rancid_run, however, the device was not backed up as expected and RANCID hung until it timed out. On closer inspection the issue was that the root account lands in the BSD shell on the switch when it connects over ssh, not directly at the JunOS command line. To get around this I needed to set up a new user on the switches with the correct permissions and have RANCID use it to perform the backup. The command to add the user is as follows:

You will be prompted to choose a password and then confirm it before it is written to the configuration.

Now you can specify the details in RANCID:

The last thing that I did was to take a copy of jlogin and jrancid from an installation of RANCID 2.3.6 and everything seems to be working as expected.

World IPv6 day got me thinking…

… and playing with IPv6 at home and I now have a partial setup of IPv6 on my home network and my parents will be going fully IPv6 from the weekend 🙂

The issues I had to overcome were partly down to my own stupidity and partly down to needing to understand how IPv6 works. First of all, my ISP (BeThere) doesn’t currently support native IPv6 on their DSL connections, so I needed to get an IPv6 tunnel. Hurricane Electric’s Tunnel Broker service (http://www.tunnelbroker.net) came in very handy here, as it allowed me to have a routed /64 and a routed /48 address range, which seems like a whole load of addresses that I can play with.

To get the firewall configured they actually give you a predefined sample config based on your IPv6 allocation which needed a bit of modifying to work with the setup on my firewall. The trouble was adding the /48 range to the Trust/Internal side of my network. I had configured Router Advertisement and also set the interface to be in Router mode instead of Host mode but my PC wasn’t getting anything other than the link local fe80:: address.

Following some hair pulling and discussion with a colleague I realised the issue was that the link from my PC to the firewall had a device in between that wasn’t IPv6 enabled. I should point out here that the PC and firewall are in different rooms and, because it’s a rented property, I am unable to run a nice CAT6 cable between the two. So I improvised: I took an old laptop which I wasn’t using, plugged it into the PC, connected the wireless on the laptop to my network and bridged the two connections. This works great (for the most part) with IPv4 but was unable to bridge any of the IPv6 traffic on the network.

I added the IPv6 stack to the Windows XP machine and this broke the IPv4 bridge, so I lost my Internet connection and ability to communicate with the world. Swiftly disabling IPv6 brought this back, and I am going to have to resort to buying a wireless PCI card for my PC.

Undeterred by this minor setback I looked at what other devices on my network I could set up with IPv6 that don’t have the same issue. I am running a number of test VMs in an ESXi lab, including an Ubuntu server and a number of Windows Server 2003 boxes. Starting with the Win2K3 box I added the IPv6 stack to the network card and the server got an IP from the /48 I had been allocated. All I had to do was manually set the DNS servers using the OpenDNS IPv6 DNS Sandbox and I was online.

After the success with Server 2003 I logged into my Ubuntu 10.04 LTS server and ifconfig showed that it had automatically picked up an address from my router. All that was left for me to do was to add the OpenDNS entries to my /etc/resolv.conf and I was good to go.

IPv6 works and is clearly the way forward. What I now need to do is to fully understand the address assignment and subnetting so that I can allocate networks more clearly and understand what is happening 🙂
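As a first step towards understanding the subnetting, Python's ipaddress module makes it easy to see how a /48 breaks down into /64 networks. The prefix below comes from the 2001:db8::/32 documentation range and stands in for a real allocation.

```python
import ipaddress
from itertools import islice

# Documentation prefix used as a placeholder for a real routed /48.
routed = ipaddress.ip_network("2001:db8:1234::/48")

# Each extra prefix bit doubles the number of subnets: 2^(64-48) of them.
subnet_count = 2 ** (64 - routed.prefixlen)
print(f"{routed} contains {subnet_count} /64 networks")

# subnets() is a generator, so taking the first three does not build them all.
first_three = [str(net) for net in islice(routed.subnets(new_prefix=64), 3)]
for net in first_three:
    print(net)
```

The fourth group of the address is effectively the subnet number, which is what makes a /48 convenient: one /64 per VLAN or room with plenty to spare.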

If you want to learn more about IP addressing then take a look at the following page from RIPE (http://www.ripe.net/internet-coordination/press-centre/understanding-ip-addressing) or alternatively the Wikipedia page on IPv6 (http://en.wikipedia.org/wiki/IPv6)

Multiple Juniper SSG clusters on same network

I ran into an issue recently with a client where we were seeing a large level of packet loss to their newly installed SSG140 cluster. There were three clients sharing the same 100Mbit Internet circuit and they all connected directly into a pair of Juniper SRX210 routers.

All three clients had a firewall cluster which was either made up of a pair of Juniper SSG 140s or Juniper SSG 5s and we were seeing the packet loss on the two SSG 140 clusters.

After some investigation and troubleshooting the following KB article from the Juniper website seemed to demonstrate what the problem was: http://kb.juniper.net/InfoCenter/index?page=content&id=KB7435

The virtual MAC address on the public-facing interfaces of both firewall clusters was the same.

Resolution? Rebuild one of the clusters to use a different cluster ID, so that the virtual MAC address generated for its firewalls is different.
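A simple sanity check before putting several clusters on one shared segment is to make sure no two of them use the same cluster ID. The Python sketch below does just that from a hand-maintained inventory; the cluster names and IDs are hypothetical, and it does not attempt to reproduce the actual virtual MAC format, which is described in the KB article.

```python
from collections import defaultdict


def clashing_clusters(clusters):
    """Return {cluster_id: [names]} for every cluster ID used more than once."""
    by_id = defaultdict(list)
    for name, cluster_id in clusters.items():
        by_id[cluster_id].append(name)
    return {cid: names for cid, names in by_id.items() if len(names) > 1}


# Hypothetical inventory of clusters sharing one Internet-facing segment.
segment = {
    "client-a-ssg140": 1,
    "client-b-ssg140": 1,
    "client-c-ssg5": 2,
}

print(clashing_clusters(segment))
```

An empty result means every cluster on the segment has a unique ID, which by the reasoning above should give each one a distinct virtual MAC.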

New Qualification – JNCIA-FWV

Today I sat and passed, after a long time of putting it off, my JNCIA (Juniper Networks Certified Internet Associate) Firewall/VPN Exam.

This now means that I have a qualification in the firewall technology that we are using at work. Hopefully I can play with some of the more funky stuff they use and work towards my JNCIS now 🙂