Picture the scene. It’s Friday and, in the UK, at last, the final day of school for many children. Holidays are booked, and much-needed time off from work stretches ahead of you. Except if you work in IT. The CrowdStrike-originated global outage for Windows users – plus, it would seem, parts of Microsoft’s cloud infrastructure itself – means it’s all hands on deck to patch the issue. And this is where the challenge lies.
The fix requires booting the affected device into Safe Mode and manually deleting a CrowdStrike-originated file. Furthermore, if the hard drive is encrypted with BitLocker, access to the stored recovery key may be necessary. This means either a support engineer visiting each affected device or finding a way to communicate the fix to end users so they can resolve it themselves. The latter is fraught with its own risks, as it may require a non-technical user to go digging around in Windows system directories.
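For illustration, the widely reported workaround amounts to removing the faulty channel file from the CrowdStrike driver directory. The sketch below, in Python, shows only that file-matching logic; the directory and file pattern reflect CrowdStrike’s published guidance at the time and should be verified against the vendor’s current advice, and in practice the step is carried out by hand (or via a recovery-environment script) in Safe Mode or WinRE, where a Python interpreter would not normally be available.

```python
import glob
import os

# Sketch only: the directory and file pattern below reflect CrowdStrike's
# published remediation guidance for this incident and should be verified
# against the vendor's current advice before use.
DRIVER_DIR = r"C:\Windows\System32\drivers\CrowdStrike"
FAULTY_PATTERN = "C-00000291*.sys"

def remove_faulty_channel_files(driver_dir: str = DRIVER_DIR) -> None:
    """Find and delete channel files matching the faulty pattern."""
    matches = glob.glob(os.path.join(driver_dir, FAULTY_PATTERN))
    if not matches:
        print("No matching channel files found – nothing to do.")
        return
    for path in matches:
        os.remove(path)
        print(f"Removed {path}")

if __name__ == "__main__":
    remove_faulty_channel_files()
```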
But before we can deploy the fix, how do we know which hardware is affected? And, in the modern world of hybrid working, how do we know where each device is located? How do we know whether it received the faulty patch before CrowdStrike issued a fix? Answers to all of these questions are vital for a speedy resolution of this issue.
IT departments might also choose (or be forced) to simply swap out the affected machine. In this case, they need to know where they have spare machines and the state of their configuration. Robust Hardware Asset Management (HAM) answers these questions.
Hardware Asset Management is vital in situations like this because it provides abundant data about the physical aspects of each endpoint. IT departments may also need effective and secure logistics in place to deploy replacement machines and ensure the return of the affected ones.
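As a rough illustration of how that data drives the response, the sketch below assumes a hypothetical inventory export (ham_inventory.csv) with fields such as os, location, security_agent and bitlocker_key_escrowed, and groups potentially affected Windows endpoints by site so engineers can be dispatched efficiently.

```python
import csv
from collections import defaultdict

# Illustrative only: the inventory file name and field names (os, location,
# security_agent, bitlocker_key_escrowed) are hypothetical and will differ
# between HAM tools; real platforms expose this data via their own APIs and
# reports.
def affected_devices_by_location(inventory_csv: str) -> dict[str, list[dict]]:
    """Group potentially affected Windows endpoints by location, for dispatch planning."""
    by_location: dict[str, list[dict]] = defaultdict(list)
    with open(inventory_csv, newline="") as f:
        for row in csv.DictReader(f):
            if row["os"].startswith("Windows") and row["security_agent"] == "CrowdStrike Falcon":
                by_location[row["location"]].append(row)
    return by_location

if __name__ == "__main__":
    for location, devices in sorted(affected_devices_by_location("ham_inventory.csv").items()):
        missing_keys = sum(1 for d in devices if d.get("bitlocker_key_escrowed", "").lower() != "yes")
        print(f"{location}: {len(devices)} devices to visit, {missing_keys} need BitLocker recovery keys located")
```

In practice the same questions would be answered directly within a HAM or CMDB tool rather than from a CSV export, but the principle – location, configuration and encryption status held in one place – is the same.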
All too often, HAM is seen as a clerical function – affixing a barcode to a new device, scanning it, and deploying it to its end user. It has the potential to be much more – highlighting when devices are approaching end of warranty or support, keeping a focus on often expensive assets, and providing a first-class hardware service for busy employees.
It remains to be seen how long the disruption from this outage will last. What is clear is that it will be a busy weekend for IT support teams – perhaps whilst users, unable to work, take advantage of an early and extended weekend.
Alongside the HAM issues, the incident also highlights (yet again) how fragile our electronic world is. Rather like the butterfly effect, the addition of a single faulty file to an update has left people in Australia unable to pay for supermarket essentials and forced hospitals to cancel operations.
For many years in the Software Asset Management (SAM) world, we have talked about standardisation. We have focused heavily on reducing the variety of software applications we use, in order to reduce risk. However, the CrowdStrike outage highlights the risks posed by a lack of software diversity within organisations, and even across entire industries.
In the modern world of cloud software, there is a huge amount of choice, and yet we find ourselves so heavily reliant on a single piece of software from a single vendor that its failure can ground flights across the world.
In times gone by, having a diverse range of software was a no-no, and to this day it is common for businesses to remain loyal to their preferred vendors. This incident turns that viewpoint on its head. The SAM industry may now need to consider the possibility that limited software diversity is itself a risk.
We should be mindful that over-reliance on a single piece of software, or a single vendor, carries its own risk. Robust vendor and supplier management is a crucial element of any SAM practice. We must acknowledge that having a variety of suppliers would reduce the impact of major outages. This means reconsidering the notion that fewer vendors is always better, and understanding that limiting vendor and supplier diversity may actually increase risk in an outage scenario.