VM : High Number of VHDs – Boot Order Issues

Encountered an issue with a VM that had around 15 VHDs and multiple SCSI buses. After testing this a few times, I found that adding the 8th drive changed the boot order to 1:0, 1:1, 1:2 … 1:7. This happened automatically; I never changed the boot order myself.

To make matters worse, the VM BIOS only shows 8 VHDs, so it is impossible to set the primary boot disk back to 0:0!

Workaround – I figured, let's move the boot volume to SCSI channel 1 and the data disks to another controller. I moved the data disks to channels 2 and 3 and the system booted. So my system disk was on 1:0 and the additional disks were on channels 2:X and 3:X.

Update 02/02/2012: You can also set the boot drive in the vmx file. Power off the VM and add the following line to force the VM to boot from disk 0:0:

bios.hddOrder = "scsi0:0"
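
If the same change has to be made on several VMs, the edit can be scripted. The following is a minimal Python sketch, not a tested tool: the helper name set_vmx_option is hypothetical, and it assumes the vmx file holds one key = "value" setting per line. It only transforms text; reading and writing the file (with the VM powered off) is left to the caller.

```python
def set_vmx_option(vmx_text, key, value):
    """Return vmx_text with `key = "value"` set, replacing any existing entry.

    Hypothetical helper: assumes one `key = "value"` setting per line.
    Key names are compared case-insensitively here as a precaution.
    """
    new_line = '{} = "{}"'.format(key, value)
    out = []
    replaced = False
    for line in vmx_text.splitlines():
        name = line.split("=", 1)[0].strip()
        if "=" in line and name.lower() == key.lower():
            # Keep only one copy of the setting, with the new value
            if not replaced:
                out.append(new_line)
                replaced = True
        else:
            out.append(line)
    if not replaced:
        out.append(new_line)
    return "\n".join(out) + "\n"
```

Calling set_vmx_option(text, "bios.hddOrder", "scsi0:0") on the contents of a vmx file returns the new contents with the boot disk entry appended, or updated if it was already present.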

Windows : NetLogon Service Missing

Encountered a machine today where the user was unable to connect to network printers. On further investigation, users without cached credentials were unable to log on to the machine and received an error saying the NetLogon service was not running. When I checked, the NetLogon service was in fact missing!

To resolve this, simply install ‘Client for Microsoft Networks’ from the LAN/Wireless network adapter configuration and reboot.

Cluster: Generic Script – Rogue Process Cleanup/Terminate

I came across an interesting issue whilst clustering a CODA application server on a Windows Server 2003 Enterprise Microsoft Cluster. CODA spawns multiple processes after service startup; these processes are not services, so the cluster is not aware of them. When the CODA services are stopped, the additional processes are not always cleaned up. If the services are then started again, they fail with a bind error.

I created a Cluster Generic Script to clean up ‘rogue’ processes on resource group stop and start. Initially the script below called process.Terminate(), but this does not work for SYSTEM-owned processes. I therefore switched to taskkill.exe, which is built into Windows, so no additional components are required to get this script working.

'Chris' CODA Fix Cluster Script - Created 14/01/2011

Dim WshShell, oExec, oLooksAlive, oIsAlive, oWait, objWMIService, colProcess, objProcess, strComputer, objShell
Set WshShell = CreateObject("WScript.Shell")

' Image name of the process to clean up
processName = "oasasv.exe"

'On Error Resume Next

Function Online( )
    Resource.LogInformation "Entering Online"
    On Error Resume Next

    ' Kill any leftover processes, then check again to confirm they are gone
    If CheckProcess > 0 Then KillProcess

    If CheckProcess > 0 Then
        Resource.LogInformation "Rogue '" & processName & "' process still present, FAILED to kill..."
        Online = False
    Else
        Online = True
    End If

    ' Err has no Details property; Description holds the message text.
    ' Error numbers can be negative, so test for non-zero.
    If Err.Number <> 0 Then
        Resource.LogInformation Err.Description
    End If
End Function

Function Offline( )
    Resource.LogInformation "Entering Offline"

    If CheckProcess > 0 Then KillProcess

    ' Err has no Details property; Description holds the message text
    If Err.Number <> 0 Then
        Resource.LogInformation Err.Description
    End If

    Offline = True
End Function

Function LooksAlive( )
     Resource.LogInformation "Entering LooksAlive"
     LooksAlive = True
End Function

Function IsAlive( )
     Resource.LogInformation "Entering IsAlive"
     IsAlive = True
End Function

Function Open( )
     Open = 0
End Function

Function Close( )
     Close = 0
End Function

Function Terminate( )
    Resource.LogInformation "Entering Terminate"

    ' Err has no Details property; Description holds the message text
    If Err.Number <> 0 Then
        Resource.LogInformation Err.Description
    End If
    Terminate = True
End Function

Function CheckProcess ()
    CheckProcess = 0
    strComputer = "."

    Set objWMIService = GetObject("winmgmts:" & "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")
    Set colProcess = objWMIService.ExecQuery("Select * from Win32_Process where Name = '" & processName & "'")

    On Error Resume Next
    ' Count the running processes that match processName
    For Each objProcess in colProcess
        CheckProcess = CheckProcess + 1
    Next
End Function

Function KillProcess ()
    strComputer = "."
    Set objWMIService = GetObject("winmgmts:" & "{impersonationLevel=impersonate}!\\" & strComputer & "\root\cimv2")
    Set colProcess = objWMIService.ExecQuery("Select * from Win32_Process where Name = '" & processName & "'")

    On Error Resume Next
    For Each objProcess in colProcess
        Set objShell = CreateObject("WScript.Shell")
        ' Run hidden (0) and wait for taskkill to finish (True), so the
        ' follow-up CheckProcess call does not race the kill
        objShell.Run "taskkill.exe /F /IM " & processName, 0, True

        Resource.LogInformation "Killed rogue " & processName & " process..."
    Next
End Function
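
The check, kill, re-check pattern used by Online can also be tried outside the cluster before wiring it into a resource. The following is a rough Python sketch of that pattern, not a port of the script above: count_processes, taskkill_command and cleanup are hypothetical names, and it assumes the process list comes from `tasklist /FO CSV /NH` rather than WMI. The listing and kill steps are passed in as functions so the logic can be exercised without touching real processes.

```python
import csv
import io


def count_processes(tasklist_csv, image_name):
    """Count rows of `tasklist /FO CSV /NH` output whose image name matches."""
    reader = csv.reader(io.StringIO(tasklist_csv))
    wanted = image_name.lower()
    return sum(1 for row in reader if row and row[0].lower() == wanted)


def taskkill_command(image_name):
    """Build the same call the script makes: force-kill by image name."""
    return ["taskkill.exe", "/F", "/IM", image_name]


def cleanup(image_name, list_tasks, kill):
    """Kill leftovers if any are found, then report whether none remain.

    list_tasks() must return tasklist CSV text; kill(cmd) must run the
    command and only return once it has finished.
    """
    if count_processes(list_tasks(), image_name) > 0:
        kill(taskkill_command(image_name))
    return count_processes(list_tasks(), image_name) == 0
```

On a Windows host, list_tasks could wrap subprocess.run(["tasklist", "/FO", "CSV", "/NH"], capture_output=True, text=True).stdout and kill could be subprocess.run, which waits for the command to finish before returning.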

 

Windows 7: Mapped Drives Fail At Logon/Startup

I came across an issue recently where mapped drives were not connecting at user logon. These drives were attached via the user account in AD (home drive) and via a VBS logon script.

This was resolved by setting the following registry value on the Windows 7 client:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System]
"EnableLinkedConnections"=dword:00000001
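
Where importing a .reg file on every client is impractical, the same value can be set from the command line with reg.exe. The following is a small Python sketch that only builds the `reg add` argument list; the function name reg_add_command is hypothetical, and actually running the command (from an elevated prompt on the client) is left to the caller.

```python
# Same key as in the .reg file above, using reg.exe's HKLM abbreviation
REG_KEY = r"HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System"


def reg_add_command(value_name="EnableLinkedConnections", data=1):
    """Build a `reg add` call that writes a REG_DWORD value under REG_KEY.

    /f overwrites an existing value without prompting.
    """
    return [
        "reg", "add", REG_KEY,
        "/v", value_name,
        "/t", "REG_DWORD",
        "/d", str(data),
        "/f",
    ]
```

On the client this could be run with subprocess.run(reg_add_command(), check=True).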

Checkpoint : Troubleshooting Checkpoint ClusterXL

I recently came across an issue where SmartView Monitor showed an error for ClusterXL on a freshly rebuilt Checkpoint IP565 firewall. Both Synchronization and Filter were stuck in an initializing state. We initially tried the following troubleshooting steps, to no avail:

  1. cphastop followed by cphastart
  2. cpstop followed by cpstart
  3. reboot of the affected firewall

On digging deeper, we noticed that one of the firewalls was configured to use multicast for cluster communications and the other to use broadcast. We identified this with the command ‘cphaprob -a if’, which presents the following output:

  eth-s1p3c0      non sync(non secured)
  eth-s4p3c0      non sync(non secured)
  eth-s4p4c0      non sync(non secured)
  eth-s1p1c0      non sync(non secured)
  eth-s1p4c0      sync(secured), multicast
  eth-s1p2c0      non sync(non secured)
  eth-s4p1c0      non sync(non secured)
  eth-s4p2c0      non sync(non secured)

  Virtual cluster interfaces: 7

  eth-s1p3c0      xx.xx.xx.xx
  eth-s4p3c0      xx.xx.xx.xx
  eth-s4p4c0      xx.xx.xx.xx
  eth-s1p1c0      xx.xx.xx.xx
  eth-s1p2c0      xx.xx.xx.xx
  eth-s4p1c0      xx.xx.xx.xx
  eth-s4p2c0      xx.xx.xx.xx
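
Comparing this output by eye across two members is easy to get wrong, so the check can be scripted. The following is a minimal Python sketch under stated assumptions: ccp_modes and same_ccp_mode are hypothetical names, the input is the captured text of ‘cphaprob -a if’ from each member, and the broadcast wording is assumed to mirror the multicast line shown above, since only the multicast form appears in this post.

```python
import re

# Matches lines such as: "eth-s1p4c0      sync(secured), multicast"
_SYNC_LINE = re.compile(r"\s*(\S+)\s+sync\(secured\),\s*(multicast|broadcast)")


def ccp_modes(cphaprob_if_output):
    """Map each sync interface to the CCP mode reported for it."""
    modes = {}
    for line in cphaprob_if_output.splitlines():
        match = _SYNC_LINE.match(line)
        if match:
            modes[match.group(1)] = match.group(2)
    return modes


def same_ccp_mode(output_a, output_b):
    """True when both members report sync interfaces with identical modes."""
    a = set(ccp_modes(output_a).values())
    b = set(ccp_modes(output_b).values())
    return bool(a) and a == b
```

Non-sync interfaces are ignored, so a result of False means either a mode mismatch or that no sync interface was found in one of the captures.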

Both firewalls must be configured to use the same method of communication, which can be changed with ‘cphaconf set_ccp multicast’ or ‘cphaconf set_ccp broadcast’. Provided your switching infrastructure supports multicast, you should use that mode because of the performance overhead of broadcast communication. In our case the command failed to change the method of communication, which left us with no option other than to perform the following steps:

  1. Set the Checkpoint packages as inactive, then delete them, ensuring that the Connectra package is removed first.
  2. Re-install the Checkpoint R65 IPSO Wrapper
  3. Re-install HFA 70
  4. Re-establish SIC via CPConfig and SmartDashboard
  5. Unassign and re-assign license via SmartUpdate
  6. Push policy from the SmartDashboard

After performing these steps the cluster CCP was back to multicast (bizarre really…). Once this was completed we had to reboot the second device, at which point both nodes of the cluster reported no ClusterXL errors. ‘cphaprob list’ showed the following output:

# cphaprob list

Registered Devices:

Device Name: Synchronization
Registration number: 0
Timeout: none
Current state: OK
Time since last report: 213003 sec

Device Name: Filter
Registration number: 1
Timeout: none
Current state: OK
Time since last report: 213003 sec

Device Name: cphad
Registration number: 2
Timeout: 5 sec
Current state: OK
Time since last report: 0.7 sec

Device Name: fwd
Registration number: 3
Timeout: 5 sec
Current state: OK
Time since last report: 0.5 sec
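
A quick way to confirm that every registered device is healthy on each node is to parse this output. The following is a small Python sketch, assuming the ‘Device Name:’ / ‘Current state:’ layout shown above; device_states and problem_devices are hypothetical names, not part of any Checkpoint tooling.

```python
def device_states(cphaprob_list_output):
    """Map each registered device name to its reported current state."""
    states = {}
    name = None
    for line in cphaprob_list_output.splitlines():
        line = line.strip()
        if line.startswith("Device Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("Current state:") and name is not None:
            states[name] = line.split(":", 1)[1].strip()
            name = None
    return states


def problem_devices(cphaprob_list_output):
    """Return the sorted names of devices whose state is anything but OK."""
    states = device_states(cphaprob_list_output)
    return sorted(n for n, s in states.items() if s != "OK")
```

For the output above, problem_devices returns an empty list; a device stuck in another state would be listed by name.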

‘fw ctl pstat’ should also list the Sync status as ‘Able to Send/Receive sync packets’:

# fw ctl pstat

Machine Capacity Summary:
  Memory used: 14% (90MB out of 637MB) – below low watermark
  Concurrent Connections: 26% (17876 out of 67900) – below low watermark
  Aggressive Aging is in monitor only

Hash kernel memory (hmem) statistics:
  Total memory allocated: 200278016 bytes in 48894 4KB blocks using 2 pools
  Initial memory allocated: 20971520 bytes (Hash memory extended by 179306496 bytes)
  Memory allocation  limit: 536870912 bytes using 10 pools
  Total memory bytes  used: 23487660   unused: 176790356 (88.27%)   peak: 34170776
  Total memory blocks used:     7126   unused:    41768 (85%)   peak:     9164
  Allocations: 1183931215 alloc, 0 failed alloc, 1183678473 free

System kernel memory (smem) statistics:
  Total memory  bytes  used: 250335916   peak: 300842432
    Blocking  memory  bytes   used:  1865892   peak:  2596156
    Non-Blocking memory bytes used: 248470024   peak: 298246276
  Allocations: 160033475 alloc, 0 failed alloc, 160032829 free, 0 failed free

Kernel memory (kmem) statistics:
  Total memory  bytes  used: 73389696   peak: 101169940
        Allocations: 1184023246 alloc, 0 failed alloc, 1183769860 free, 0 failed free
        External Allocations: 0 for packets, 0 for SXL

Kernel stacks:
        0 bytes total, 0 bytes stack size, 0 stacks,
        0 peak used, 0 max stack bytes used, 0 min stack bytes used,
        0 failed stack calls

INSPECT:
        1029526467 packets, -2128289516 operations, 373013811 lookups,
        2035 record, 183665476 extract

Cookies:
        -1649393933 total, 0 alloc, 0 free,
        4607 dup, -1525329462 get, 138972711 put,
        -1565092568 len, 217535 cached len, 0 chain alloc,
        0 chain free

Connections:
        54513276 total, 52537755 TCP, 1898998 UDP, 76506 ICMP,
        17 other, 49485065 anticipated, 1 recovered, 17882 concurrent,
        24286 peak concurrent

Fragments:
        213594 fragments, 105472 packets, 389 expired, 0 short,
        0 large, 0 duplicates, 0 failures

NAT:
        23444077/0 forw, 29804768/0 bckw, 53234829 tcpudp,
        14016 icmp, 702040-723136 alloc

Sync:
        Version: new
        Status: Able to Send/Receive sync packets
        Sync packets sent:
         total : 78286072,  retransmitted : 16171, retrans reqs : 20,  acks : 3
        Sync packets received:
         total : 17030603,  were queued : 16591, dropped by net : 15
         retrans reqs : 8840, received 3 acks
         retrans reqs for illegal seq : 0
         dropped updates as a result of sync overload: 0