Troubleshooting Checkpoint ClusterXL
I recently came across an issue where SmartView Monitor showed an error for ClusterXL on a freshly rebuilt Checkpoint IP565 firewall. Both Synchronization and Filter were stuck in an initializing state. We initially tried the following troubleshooting steps, to no avail:
- cphastop followed by cphastart
- cpstop followed by cpstart
- reboot of the affected firewall
On digging deeper, we noticed that one of the firewalls was configured to use multicast for cluster communications (CCP) and the other to use broadcast. We identified this with the command ‘cphaprob -a if‘, which presents the following output:
eth-s1p3c0 non sync(non secured)
eth-s4p3c0 non sync(non secured)
eth-s4p4c0 non sync(non secured)
eth-s1p1c0 non sync(non secured)
eth-s1p4c0 sync(secured), multicast
eth-s1p2c0 non sync(non secured)
eth-s4p1c0 non sync(non secured)
eth-s4p2c0 non sync(non secured)
Virtual cluster interfaces: 7
eth-s1p3c0 xx.xx.xx.xx
eth-s4p3c0 xx.xx.xx.xx
eth-s4p4c0 xx.xx.xx.xx
eth-s1p1c0 xx.xx.xx.xx
eth-s1p2c0 xx.xx.xx.xx
eth-s4p1c0 xx.xx.xx.xx
eth-s4p2c0 xx.xx.xx.xx
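If you save this output from each cluster member, the comparison can be scripted rather than done by eye. The following is only a minimal Python sketch: the function names are my own, and it assumes a member running in broadcast mode prints the word "broadcast" in the same position as "multicast" in the output above.

```python
import re

# Matches lines such as "eth-s1p4c0 sync(secured), multicast".
# Assumption: broadcast mode is reported as "..., broadcast" in the same place.
SYNC_LINE = re.compile(r"\s*(\S+)\s+sync\(secured\),\s*(multicast|broadcast)\b")


def ccp_modes(output):
    """Return {interface: ccp_mode} for every sync interface in the output."""
    modes = {}
    for line in output.splitlines():
        match = SYNC_LINE.match(line)
        if match:
            modes[match.group(1)] = match.group(2)
    return modes


def same_ccp_mode(output_a, output_b):
    """True only if both members report sync interfaces using the same mode(s)."""
    modes_a = set(ccp_modes(output_a).values())
    modes_b = set(ccp_modes(output_b).values())
    return bool(modes_a) and modes_a == modes_b
```

Feed each function the text captured from ‘cphaprob -a if‘ on the relevant member; a False result from same_ccp_mode points to the kind of mismatch described here.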
Both firewalls must be configured to use the same method of communication, which can be changed using ‘cphaconf set_ccp multicast‘ or ‘cphaconf set_ccp broadcast‘. Provided your switching infrastructure supports multicast, you should use that mode because of the performance overhead of broadcast communication. In our case the command failed to change the method of communication, which left us with no option other than to perform the following steps:
- Set the Checkpoint packages as inactive, then delete them, ensuring that the Connectra package is removed first.
- Re-install the Checkpoint R65 IPSO Wrapper
- Re-install HFA 70
- Re-establish SIC via CPConfig and SmartDashboard
- Unassign and re-assign license via SmartUpdate
- Push policy from the SmartDashboard
After performing these steps the cluster CCP was back to multicast (bizarre really…). We had to reboot the second device once this was completed, at which point both nodes of the cluster reported no ClusterXL errors and ‘cphaprob list‘ showed the following output:
# cphaprob list
Registered Devices:
Device Name: Synchronization
Registration number: 0
Timeout: none
Current state: OK
Time since last report: 213003 sec

Device Name: Filter
Registration number: 1
Timeout: none
Current state: OK
Time since last report: 213003 sec

Device Name: cphad
Registration number: 2
Timeout: 5 sec
Current state: OK
Time since last report: 0.7 sec

Device Name: fwd
Registration number: 3
Timeout: 5 sec
Current state: OK
Time since last report: 0.5 sec
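To check this on several nodes without reading every block, the output can be parsed for any device whose state is not OK. This is a rough Python sketch with hypothetical function names; it assumes one "Field: value" pair per line, as in the listing above.

```python
def parse_devices(output):
    """Split saved `cphaprob list` output into one dict per registered device."""
    devices = []
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # no colon, e.g. the "# cphaprob list" prompt line
        key, value = key.strip(), value.strip()
        if key == "Device Name":
            devices.append({"Device Name": value})
        elif devices:
            # Attach every other field to the most recently seen device.
            devices[-1][key] = value
    return devices


def problem_devices(output):
    """Names of devices whose 'Current state' is anything other than OK."""
    return [
        device["Device Name"]
        for device in parse_devices(output)
        if device.get("Current state") != "OK"
    ]
```

An empty list from problem_devices means every registered device reported OK, which is what we saw on both nodes after the rebuild.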
‘fw ctl pstat‘ should also list the Sync status as ‘Able to Send/Receive sync packets’:
# fw ctl pstat
Machine Capacity Summary:
Memory used: 14% (90MB out of 637MB) – below low watermark
Concurrent Connections: 26% (17876 out of 67900) – below low watermark
Aggressive Aging is in monitor only

Hash kernel memory (hmem) statistics:
Total memory allocated: 200278016 bytes in 48894 4KB blocks using 2 pools
Initial memory allocated: 20971520 bytes (Hash memory extended by 179306496 bytes)
Memory allocation limit: 536870912 bytes using 10 pools
Total memory bytes used: 23487660 unused: 176790356 (88.27%) peak: 34170776
Total memory blocks used: 7126 unused: 41768 (85%) peak: 9164
Allocations: 1183931215 alloc, 0 failed alloc, 1183678473 free

System kernel memory (smem) statistics:
Total memory bytes used: 250335916 peak: 300842432
Blocking memory bytes used: 1865892 peak: 2596156
Non-Blocking memory bytes used: 248470024 peak: 298246276
Allocations: 160033475 alloc, 0 failed alloc, 160032829 free, 0 failed free

Kernel memory (kmem) statistics:
Total memory bytes used: 73389696 peak: 101169940
Allocations: 1184023246 alloc, 0 failed alloc, 1183769860 free, 0 failed free
External Allocations: 0 for packets, 0 for SXL

Kernel stacks:
0 bytes total, 0 bytes stack size, 0 stacks,
0 peak used, 0 max stack bytes used, 0 min stack bytes used,
0 failed stack calls

INSPECT:
1029526467 packets, -2128289516 operations, 373013811 lookups,
2035 record, 183665476 extract

Cookies:
-1649393933 total, 0 alloc, 0 free,
4607 dup, -1525329462 get, 138972711 put,
-1565092568 len, 217535 cached len, 0 chain alloc,
0 chain free

Connections:
54513276 total, 52537755 TCP, 1898998 UDP, 76506 ICMP,
17 other, 49485065 anticipated, 1 recovered, 17882 concurrent,
24286 peak concurrent

Fragments:
213594 fragments, 105472 packets, 389 expired, 0 short,
0 large, 0 duplicates, 0 failures

NAT:
23444077/0 forw, 29804768/0 bckw, 53234829 tcpudp,
14016 icmp, 702040-723136 alloc

Sync:
Version: new
Status: Able to Send/Receive sync packets
Sync packets sent:
total : 78286072, retransmitted : 16171, retrans reqs : 20, acks : 3
Sync packets received:
total : 17030603, were queued : 16591, dropped by net : 15
retrans reqs : 8840, received 3 acks
retrans reqs for illegal seq : 0
dropped updates as a result of sync overload: 0
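Since the Sync section sits at the very bottom of a long output, a small script can pull out just the status line. Again, this is only a sketch in Python with hypothetical names; it assumes the section header ends in "Sync:" and that a "Status:" line follows it, as in the output above.

```python
EXPECTED_STATUS = "Able to Send/Receive sync packets"


def sync_status(output):
    """Return the Status value from the Sync section, or None if not found."""
    in_sync_section = False
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.endswith("Sync:"):
            # Start of the Sync section of `fw ctl pstat`.
            in_sync_section = True
        elif in_sync_section and stripped.startswith("Status:"):
            return stripped.split(":", 1)[1].strip()
    return None


def sync_is_healthy(output):
    """True if the Sync section reports the expected status string."""
    return sync_status(output) == EXPECTED_STATUS
```

Run it against the saved output of ‘fw ctl pstat‘ from each node; anything other than True is worth a closer look at the sync interface and the CCP mode.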