Five o’clock, one evening at work. It’s dark. A message comes in from a customer to my boss:
The home pages for the international stores on UAT(*) have suddenly all gone blank!
(*) UAT = User Acceptance Testing. When we produce sites for our customers they’re developed locally, then pushed to a virtual copy of the live environment called UAT which allows the customer to agree to the site’s functionality as per specifications, identify bugs, or request further changes; when everyone’s happy the site then gets moved to the Web Staging environment which is a duplicate of the live environment running off the same data as a last chance sanity check before it gets transferred finally to live.
You don’t get to be boss by fixing issues like this so a message goes from the boss to me first thing the following morning:
The home pages for the international stores on UAT have suddenly all gone blank. Can you take a look?
Strange, I think. I was on UAT at four o’clock that evening and it all looked fine.
I take a look and yes, the pages are blank. Very odd. I load up the local environment from which the UAT system was deployed. The pages are fine as I’d (if not expected, at least) hoped. So, something happened on UAT after four that caused all the home pages to go blank. I hope it was nothing I did but, then again, I didn’t do anything; I was just double-checking the layout conformed to certain requirements so as to properly reply to an email. Okay then. I “take a look” as asked/instructed.
Has someone tampered with the home page views, I wonder? I look and they’re all indicating that they’re untouched. I compare them against the local versions and, yes, they’re matching just fine. I compare against the current live views since they shouldn’t have changed and confirm that everything is as it should be.
Has the data been erased somehow? I compare with a copy of the site and check some URLs that should be referenced on the home pages and they’re all showing up just fine. I launch a remote session onto the virtual machine and run some smoke test queries against the data. Nothing obvious comes up.
Maybe it’s the access permissions! I could check individual access rights for the user I’m logging in as against the content that should be displayed or I could simply hit the “Rebuild access control permissions” button in the configuration section and see if that fixes the problem; it wouldn’t be the first time they’ve needed a kick up the backside like this. The latter option appeals to me. It’ll only take a few minutes to run. I do that.
It doesn’t even get one tenth of the way through before it fails. That shouldn’t happen, I think to myself. I like to think obvious statements to myself. This is something I need to tell my boss.
“Tried to rebuild access control and it failed,” I say. I’m a fan of brevity when it comes to verbal communication.
“Yeah, it did for me too.”
“Er… when? You weren’t just trying it now were you?”
“Good, because both of us trying at the same time wouldn’t be sensible.”
“So, when then?”
“Oh, yesterday evening.”
“About four thirty, quarter to five, that sort of time.”
“I just wanted to make sure we could. I was thinking of doing it on live.”
“So in the short window of time between the site working absolutely fine and then all the home pages showing blank you saw fit to run a procedure that failed and decided to withhold that information from me because it didn’t seem appropriate to share when you subsequently wanted the problem looked into? Why did you not think that having a procedure fail on the site shortly before things going wrong on the site wasn’t pertinent information for someone to know when tasked with determining the cause of a problem? At what point did we slip into the parallel dimension in which all work has to involve elements of puzzles in it necessitating supplying hints to the issue but not the full list of events preceding the moment of error? Do you have any idea how much time I’ve wasted on this this morning when had I known that you’d just had a database-altering procedure fail moments before all hell breaking loose on the site I could have told you ‘hey, you know that thing that just failed? That’s the thing that’s caused the problem! Investigation over!’ and we’d have reached this point in the issue-resolution process within just three sentences?”
I don’t say any of that out loud; that’s the sort of speech that sits in your head and festers while you fantasise about murder.