I'm just interested in feedback from people here. My background is sysadmin, and over the last couple of decades I've had MANY people come screaming at me when a zero day comes out, demanding I cancel all my plans and suddenly patch the entire infrastructure.
I've had a few arguments with cybersecurity people who haven't worked sysadmin or similar, who insist that the patches need to be installed STRAIGHT AWAY and who scare the C-suite about it.
However, I've seen it enough times, with Microsoft especially but also VMware back in the day, where an entire infrastructure was taken down by a bad patch. And don't say "you should test it first" - there are PLENTY of clients out there with no test environments, and some of the patches that went out didn't show a problem until after a reboot, or even a couple of days later.
I'm a BIG fan of letting someone else take the risk first. Definitely DON'T do it on a Friday night & leave at least 48 hours before even thinking about patching a 0 day, as the patches are usually rushed, badly written & VERY likely to be faulty.
Thoughts?
For things like OS and infrastructure patches I fully agree. If you have a test environment, feel free to start testing the zero-day patch there immediately, but anything that needs to stay up should wait at least a few days, and then start small with your testing. Most of these zero days would need an attacker to already have some access to your environment and would require some level of expertise to exploit.
Now, zero days for things like browsers are a different story. The level of exposure to things outside of your control makes them a little more critical to start updating on endpoints that are browsing the web. My feeling is that browsers and similar software should be patched as soon as possible, and in some cases the rollout should start the same day the zero day is announced. I would add the caveat that you should start patching your preplanned test users first and let the updates bake for a bit before rolling out across your entire environment, but I would not take as much time to get those patches out as I would for patches on mission-critical equipment.
This is where update rings come into play. Whatever you are using to push the updates out should have this feature, and if it doesn't you need to find something that does. It's simply pushing the update to larger and larger groups of users on a set schedule. Your first ring would probably be a core set of users, including IT, so if there is a problem with an update they will know what's going on and you can pause updates going out to anyone else before it is addressed. From what I have seen with breaches, it seems to be less about the 0 day and more about things being exploited where a patch has been out for a year and was just never applied...
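Just to illustrate the ring idea (this isn't any particular MDM's API, and the ring names, group sizes and day offsets are my own made-up numbers), a minimal sketch might look like this:

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Ring:
    name: str          # e.g. "Ring 0 - IT and core users"
    day_offset: int    # days after patch release before this ring gets it
    paused: bool = False

# Assumed ring layout: small, clued-in group first, everyone else last.
RINGS = [
    Ring("Ring 0 - IT and core users", day_offset=0),
    Ring("Ring 1 - pilot departments", day_offset=2),
    Ring("Ring 2 - everyone else",     day_offset=5),
]

def rollout_schedule(release: date, rings=RINGS):
    """Return (ring name, deploy date) for every ring that is not paused."""
    return [(r.name, release + timedelta(days=r.day_offset))
            for r in rings if not r.paused]

if __name__ == "__main__":
    release = date(2024, 1, 9)   # hypothetical patch release date
    for name, when in rollout_schedule(release):
        print(f"{when}: deploy to {name}")
    # If Ring 0 reports breakage, pause the later rings until it's addressed:
    RINGS[1].paused = RINGS[2].paused = True
```

The point is just that the "pause" lever only helps if a small ring has already had the patch long enough to surface problems before the big rings are due.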
Thoughts?
John-
To an extent, they are right. One should generally keep equipment patched. But like any advice, it needs to be moderated with a dose of reality. There is risk to patching; there is risk to not patching.
The goal is to find the middle ground where everyone can sleep at night and everyone is equally unhappy with the plan.
We have a schedule stating how many "wait days" before each wave gets updates (as @JKWiniger mentions), including escalation based on CVSS score. And if there is evidence we are under attack, the schedule escalates further.
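To make that concrete, here's a rough sketch of how such a schedule could be expressed. The wave names, wait days and CVSS cut-offs are assumptions for illustration, not our actual policy:

```python
# Baseline "wait days" per wave before it receives a patch (assumed values).
BASE_WAIT_DAYS = {
    "wave-1-test": 0,
    "wave-2-general": 7,
    "wave-3-critical-systems": 14,
}

def wait_days(wave: str, cvss: float, under_attack: bool = False) -> int:
    """Shrink the wait as severity rises; collapse it if we're being attacked."""
    days = BASE_WAIT_DAYS[wave]
    if cvss >= 9.0:          # critical severity: halve the normal wait
        days //= 2
    elif cvss >= 7.0:        # high severity: trim a couple of days
        days = max(days - 2, 0)
    if under_attack:         # evidence of active exploitation against us
        days = min(days, 1)
    return days

print(wait_days("wave-2-general", cvss=9.8))                               # -> 3
print(wait_days("wave-3-critical-systems", cvss=9.8, under_attack=True))   # -> 1
```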
@akkem But you seem to miss one thing... you don't know if a 0 day is being exploited until it's too late! So while many things are not actively exploited, you need to act as if they all are. Even if 1 in 100 gets exploited, I would much rather apply 100 patches than not patch for that 1 that matters...
John-
I think my issue is that I've seen more patches go wrong than exploits.
As someone who's done the 2 or 3 days in a row in the office and the 6am finishes because someone else did something stupid, and who has still been made redundant or had my contract not renewed, I think I'm at the point in my career where I'd rather take a 99% risk than cancel a plan with friends.
I do genuinely dislike being expected to cancel plans or stay late with no extra pay by people who've never done it or have never worked sysadmin. Especially if the 0 day is on a product I didn't want to buy in the first place, or a PM or cybersecurity guy ignored advice I gave 6 months before. Usually a PM. ;o)
That's if you have testing environments. To be fair, if a security team came to me at 4:55 expecting me to cancel plans, I wouldn't even roll out test patches unless I could press a button, leave the office and worry about it in the morning.
I asked this question on the r/sysadmin reddit & there is definitely a split between the junior guys who haven't been mentally destroyed yet and the more senior guys like me that have been screwed over by corporate decisions repeatedly ;o)
I'm now strictly no overtime without 1 month's notice and double time, regardless of risk.
@vishybear wrote: I think my issue is that I've seen more patches go wrong than exploits.
The missing bit is that we don't know how many exploits were foiled by patching prior to exploitation. Part of that may be because success does not create headlines. The classic example is fear of flying due to airplane accidents, despite the fact that all objective measurements show driving has more fatalities.
@vishybear wrote: I do genuinely dislike being expected to cancel plans or stay late with no extra pay by people who've never done it or have never worked sysadmin.
You might check your local labor laws regarding working "off the clock". But that is just a symptom. The true problem is not aligning staffing levels with requirements. If the organization requires 1-hour response, the organization needs to staff for 1-hour response.
@vishybear wrote: insist that the patches need to be installed STRAIGHT AWAY ... [vs.] ... letting someone else take the risk first.
Both of these are immature positions. The mature position recognizes that this is not a binary choice. Notable to me is that all those advising a layered approach are CISSPs, which indicates 5+ years of demonstrable security experience, often including sysadmin work (at least for their own boxes) and an ability to understand the management perspective.
Back to the tiers/waves, a "test environment" is not mandatory. One possible approach would be something like this:
We take a mix-and-match approach focusing on all pertinent risks, prioritising based on CVSS score and how exposed particular systems are. You'd test in non-prod if feasible, but if not, take one of two approaches to derisk the deployment: wait a few days, or start patching less critical systems first and monitor for ill effects. Once you're happy, you can accelerate the deployment.
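As a rough illustration of that mix-and-match idea (the weighting, the exposure scale and the example inventory below are my own assumptions, not a standard), you can rank systems by severity times exposure and start with the non-critical ones while the mission-critical boxes wait for the all-clear:

```python
# Hypothetical inventory: (system, CVSS of the relevant flaw, exposure 0-1, business-critical?)
SYSTEMS = [
    ("internet-facing-proxy", 9.8, 1.0, True),
    ("internal-file-server",  9.8, 0.3, True),
    ("kiosk-pc",              9.8, 0.6, False),
]

def urgency(cvss: float, exposure: float) -> float:
    """Crude risk proxy: severity scaled by how reachable the system is."""
    return cvss * exposure

# First pass: non-critical systems, most urgent first, to shake out bad patches.
first_pass = sorted(
    (s for s in SYSTEMS if not s[3]),
    key=lambda s: urgency(s[1], s[2]),
    reverse=True,
)
# Later pass: business-critical systems, once the first pass looks healthy.
later_pass = sorted(
    (s for s in SYSTEMS if s[3]),
    key=lambda s: urgency(s[1], s[2]),
    reverse=True,
)

for name, cvss, exposure, _ in first_pass + later_pass:
    print(f"{name}: urgency {urgency(cvss, exposure):.1f}")
```

The ordering is the whole point: the most exposed, most severe combinations get attention first within each pass, and the "wait a few days" option just means delaying the later pass until the early patches have proven themselves.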