Jones Lang LaSalle/Bank of America: CFMS Hardware Failure
When the Critical Facilities Monitoring System (CFMS) unexpectedly failed for their Bank of America client, Jones Lang LaSalle turned to Affinity Energy for a quick turnaround to get the system back online.
Jones Lang LaSalle & Bank of America
Jones Lang LaSalle (JLL) is the property manager for Bank of America and is responsible for managing the CFMS. The CFMS monitors all critical equipment in 17 buildings owned by Bank of America in the Charlotte, NC area. It covers 113 different device types, including AC units, generators, UPSs, PDUs, transfer switches, switchgear, meters, batteries, and fire alarm panels.
Affinity Energy came to our aid without hesitation and as soon as humanly possible. They were flexible enough to work late hours on the weekend, and finished an entire day faster than we anticipated.
–Walt Perryman, Operating Engineer, Jones Lang LaSalle
In 2015, JLL knew their Critical Facilities Monitoring System (CFMS) needed an upgrade. The hardware had hit its end of life, some software components were nearing the end of their lifecycle, and servers were running on an unsupported version of Windows. Unfortunately, JLL was unable to secure the budget required for a complete overhaul, and instead started working with Affinity Energy in 2016 to develop an improved disaster recovery strategy. This involved installing a backup virtual machine (VM) host server and using VMware to run backup VM images in the event of a primary server failure. While not the long-term solution the team had hoped for, it was the best option given the existing budget constraints.
Midway through the implementation of the backup VM host server, the scope of the project changed when JLL finally received the budget to replace their outdated servers, Fibre Channel switch, and Storage Area Network (SAN) with Nutanix, a hyper-converged, high-availability solution that aims to reduce power and space consumption and eliminate storage complexities.
On December 1, less than a week from the planned installation of the new Nutanix VM host system, the unexpected happened. The existing VM host environment experienced a catastrophic failure. JLL could no longer access the CFMS due to total equipment failure.
With no remote visibility to the critical environments, JLL was required to deploy personnel in shifts to make physical rounds in all 17 buildings to ensure equipment continued to operate as normal. This is standard operating procedure whenever the monitoring system is down for any reason. Next, JLL contacted Affinity Energy to help get their systems back online ASAP.
Affinity Energy support personnel were unable to access the system remotely and dispatched personnel to the CFMS host server location to troubleshoot. Affinity Energy determined that the existing storage area network (SAN) had been compromised due to hard drive failure. They were able to reconfigure the SAN to allow the system to work again; however, less than an hour later, JLL was once again unable to access the system via remote access.
Within the hour, Affinity Energy engineers again arrived onsite. Due to the strain on the system, yet another SAN drive had failed, rendering the entire disk pool faulty. Half the VMs were inaccessible and unrecoverable.
Affinity Energy engineers determined the existing hardware could no longer be used.
Luckily, Affinity Energy had already ordered, received, and set up the Nutanix host. A decision was made to place the new Nutanix host in service and run the existing VMs on it so JLL could have visibility to the critical environments again.
On December 2, Affinity Energy installed the new servers and started the Nutanix configuration, including reconfiguring IP addresses, setting up networking, adding hosts, restoring backups, and salvaging VMs not affected by the second disk pool's failure.
Late into the night, remote access was achieved, and the network was successfully reconfigured on the Nutanix host.
JLL started using the CFMS again around 11 a.m. on Saturday and could finally send home the staff members who had been working around the clock to keep tabs on all of the critical spaces.
JLL expected the site to be back up by Monday, December 5. Affinity Energy engineers cut that downtime in half through quick troubleshooting and decision-making during a time of high stress and catastrophic failure. Ultimately, Affinity Energy completed the task of setting up the new Nutanix server and associated virtual machine environment in just one day.
Additionally, the system data salvaged from the VMs included the system’s historian, which meant the only data lost was the data that would have been collected during the downtime, from Thursday until Saturday morning.
Affinity Energy will continue to work on the new Nutanix software on an expedited schedule and will complete the final software migration in early January 2017.