Sysadmins troubleshoot using mental abstractions. Our knowledge of a system can never be complete; we are always missing information when troubleshooting. Abstractions are used to cover gaps in our knowledge, and allow us to fix problems without knowing all of the details.
Abstractions are a sysadmin's most powerful tool and potentially our biggest weakness.
We can take abstractions for granted and forget to see the obvious. When we abstract away the portion of our knowledge that contains the problem, we come to a point where we cannot troubleshoot even the simplest issue. This can be embarrassing, and it can undermine our confidence in our work.
A co-worker (let's call him Ralph) had been troubleshooting a user's computer for two hours when he finally called me. I walked to where Ralph was working. He looked frustrated and exhausted.
"I've tried everything I can think of, but this system will not stay on the network."
I took a deep breath and thought about what he said.
"Really? What have you tried?" I asked.
"Updating the drivers, a reboot, re-seating the NIC, loopback tests, I checked the switch config, and I even updated the BIOS."
"Really?" I questioned.
"Yeah. No dice."
I turned around and left the room without saying a word.
I walked down to the communications closet, found the right cable and pulled it out. I bent the plastic clip back and inserted it back into the switchport with a satisfying *click*. The link-light went green.
I walked back into the room, sat down at the computer, and executed a few well-practiced keyboard maneuvers (ipconfig /release and /renew). The network connection was established.
The user was very grateful (probably had an eBay auction). I mumbled something about a switch configuration error and said we'd be sure to look into it. After we left, I told Ralph what happened.
"#&%$! How did I miss that?"
"It's simple. You took something for granted."
This happens to everyone, even senior sysadmins. Tom Limoncelli has an excellent List of Dumb Things To Check that's filled with simple (and some complex) things that have cost sysadmins hours of wasted time.
All too often when troubleshooting it's easy to think of every possible thing that could go wrong. We get caught up in our own abstractions and forget about reality. We must focus on the moment, and deliberately acknowledge where we've created abstractions.
This is a deliberate form of thinking, and it takes some practice. In Zen this is called 初心 (shoshin), the Beginner's Mind: seeing everything fresh, as if it were the first time you've seen it. Being in the moment. Being deliberate.
The next time a complex problem occurs, take a minute (and a deep breath) and deliberately choose your abstractions. If you don't know why you've chosen one ("Is the network cable plugged in?"), question it, observe it, and understand it.
Deliberately choosing your thoughts will not only help you troubleshoot, it will bring a vitality and freshness to your work. You will see things that you haven't seen before and understand things few others do. Your work will feel more like play, and you will enjoy the simple as well as the complex problems.