Thursday, January 29, 2009

How does DHCP relay redundancy work?

Consider the following configuration:

interface X
ip helper-address 1.1.1.101
ip helper-address 1.1.1.102

This configures DHCP relay on the router. The router receives a broadcast DHCP request on that interface and converts it into unicast messages, one directed to each of the two DHCP servers.
The router "sees" the DHCPDISCOVER packet and forwards it to both addresses simultaneously. Both DHCP servers then make an offer (DHCPOFFER), provided they are up and received the request.
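
As a rough illustration of the relay step described above, here is a minimal Python sketch. It is not how the router is implemented; the message fields are simplified, and the interface address 10.0.0.1 and the client MAC are made-up values. It only shows that one client broadcast becomes one unicast copy per helper address, with the relay's own interface address written into giaddr so the servers know which subnet the request came from.

```python
from dataclasses import dataclass, replace

BROADCAST = "255.255.255.255"


@dataclass(frozen=True)
class DhcpMessage:
    msg_type: str              # e.g. "DHCPDISCOVER"
    client_mac: str
    dst_ip: str                # broadcast address when sent by the client
    giaddr: str = "0.0.0.0"    # relay agent address, filled in by the relay
    hops: int = 0


def relay(msg, interface_ip, helpers):
    """Return one unicast copy of a client broadcast per helper address."""
    if msg.dst_ip != BROADCAST:
        return []  # this sketch only relays client broadcasts
    # Keep an existing giaddr if an earlier relay already set it.
    giaddr = msg.giaddr if msg.giaddr != "0.0.0.0" else interface_ip
    return [replace(msg, dst_ip=h, giaddr=giaddr, hops=msg.hops + 1)
            for h in helpers]


# Hypothetical client and interface address; helper addresses from the config above.
discover = DhcpMessage("DHCPDISCOVER", "00:11:22:33:44:55", BROADCAST)
copies = relay(discover, "10.0.0.1", ["1.1.1.101", "1.1.1.102"])
for c in copies:
    print(c.msg_type, "->", c.dst_ip, "giaddr", c.giaddr)
```

Each server receives its own copy independently, which is why both can answer with an offer.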
The client receives the offers one at a time, one after the other. If it finds an offer agreeable, it sends another broadcast, a DHCPREQUEST, specifically requesting those particular IP parameters. Why does the client broadcast the request instead of unicasting it to the server? A broadcast is used because the first message, the DHCPDISCOVER, may have reached more than one DHCP server. If more than one server makes an offer, the broadcast DHCPREQUEST lets the other servers know which offer was accepted. The accepted offer is usually the first one received.
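
The selection logic can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the client always takes the first offer to arrive (real clients may apply other criteria), and the arrival times and offered addresses are invented. The point is that the broadcast DHCPREQUEST names the chosen server, so every server that made an offer can tell whether it was picked.

```python
def choose_offer(offers):
    """Pick the offer that arrived first.

    Each offer is a tuple (arrival_ms, server_id, offered_ip).
    """
    return min(offers, key=lambda o: o[0])


def build_request(offer):
    """Build a broadcast DHCPREQUEST naming the chosen server and address."""
    _, server_id, offered_ip = offer
    return {
        "type": "DHCPREQUEST",
        "dst": "255.255.255.255",
        "server_id": server_id,
        "requested_ip": offered_ip,
    }


def server_reaction(my_id, request):
    """A server confirms the lease if it was chosen, otherwise drops its offer."""
    return "DHCPACK" if request["server_id"] == my_id else "withdraw offer"


# Hypothetical arrival times (ms) and offered addresses.
offers = [(12, "1.1.1.102", "10.0.0.130"), (8, "1.1.1.101", "10.0.0.20")]
request = build_request(choose_offer(offers))
for server in ("1.1.1.101", "1.1.1.102"):
    print(server, server_reaction(server, request))
```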
Sometimes, in order to make one server respond faster than the other, you can configure a response delay on the DHCP server. Also, for synchronization between the DHCP servers, either they support this feature, or you have to divide your subnet between the two servers without overlapping address ranges.
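
For the second option, dividing the subnet between the two servers, the following Python sketch shows one way to compute two non-overlapping pools. The subnet 10.0.0.0/24 and the 50/50 split are assumptions chosen for illustration; in practice you would also exclude the gateway and any statically assigned addresses before configuring the ranges on each server.

```python
import ipaddress


def split_scope(network, first_share=0.5):
    """Split the usable hosts of a subnet into two non-overlapping pools."""
    hosts = list(ipaddress.ip_network(network).hosts())
    cut = int(len(hosts) * first_share)
    return hosts[:cut], hosts[cut:]


# Hypothetical client subnet; one pool per DHCP server.
pool_a, pool_b = split_scope("10.0.0.0/24")
print("server 1.1.1.101:", pool_a[0], "-", pool_a[-1])
print("server 1.1.1.102:", pool_b[0], "-", pool_b[-1])
```

Because the pools never share an address, the two servers can hand out leases independently without any synchronization between them.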

Wednesday, January 21, 2009

Continuous reload of standby unit - FWSM Failover Configuration Synchronization problem

The following error message can be printed on the standby unit:
Config Sync Error: Following command could not be executed on standby

<>Context: <>
******REPLICATION OF CONFIGURATION FROM ACTIVE TO STANDBY UNIT IS INCOMPLETE, TO PREVENT THE STANDBY UNIT TAKING OVER AS ACTIVE WITH A PARTIAL CONFIGURATION, THE STANDBY UNIT WILL NOW REBOOT*******

The problem is that, for some reason, failover replication stops because one of the commands was not accepted on the standby unit. Because of that, and to avoid an inconsistent state, the standby unit reloads.
This happens on version 2.3(2); on versions 3.X.X I believe the problem does not occur. The problem is related to the configuration state on the standby unit. When the maximum ACL capacity is reached on the blade (you can check it with the command "sh resource acl"), the standby unit will also get to this state when correctly synced. In this case, however, when the active unit tried to replicate a configuration that was at the ACL limit, the standby unit did not accept some of the rules and rejected at least one of the lines. That rejection is what triggered the reload.

How to fix it:

- Optimize your config and reduce your config size.

- If it does not sync correctly, clear the configuration on standby and sync it again.