I wrote this post for a good friend who asked me for some questions he could use for interviewing a Systems Engineer. They are also easily adapted for System Administrators and Network Engineers (and there is a PDF of the questions at the end of the post that you can print out if desired).
When I think of the great systems engineers, network engineers or systems administrators I have worked with, they all have the following qualities:
- Great teamwork
- Reliable – they always follow through and are good at juggling a lot of things.
- Intellectually curious – a learner seeking knowledge
- Understands technology basics required for their job (and usually much more than just the basics)
- Can figure almost anything out (and takes enjoyment in solving problems – big or small)
- Pragmatic, hard working, and tenacious.
Of course there are many other traits, but for this type of role a person needs to be willing and able to work on any type of problem thrown their way. In a small company this can mean removing viruses from an employee’s (or an employee’s wife’s) computer, building out 100 new servers in a colo, or managing 1,000 EC2 instances in the cloud. This person needs to be patient, reliable, helpful and, most of all, able to learn quickly and adapt – because you can’t always plan and account for all the things that can go wrong.
Therefore it is important to focus most of your questions on a candidate’s reliability, resourcefulness, and ability to learn and adapt to new technologies and situations. You should still cover some technical and knowledge-based questions, but focus those on the job expectations and on things the candidate has learned in the past.
Below is a list of questions I like to ask that will help you assess the candidate’s potential in the above areas.
- Have you ever been in a situation where you found yourself having to learn a technology in order to perform a task essential to your job responsibilities? What did you do?
Every good candidate should have a response to this line of questions. A lot of systems engineering involves debugging or solving complex problems – often with software you didn’t write or have prior knowledge of. Being able to learn quickly, be resourceful and come out victorious are important to any role.
- What blogs/sites do you read to stay abreast of current trends?
Most great candidates will not just care about technology, but have an active passion for new and important trends. They should have a few sites or thoughts on how they stay on top of current industry trends. Most of the really great IT candidates I interviewed also had a list of sites they watch for virus/security news to make sure their teams have the latest patches for vulnerabilities etc.
- Have you ever been on call before? Did you like it? Did you ever miss a page?
- When have you gone over and above at work? If you had to do it over, would you do it again? Why or why not?
- Have you ever done something on the job that you were really proud of, but no one else really knew about?
- What does being awesome at your job mean to you?
You are looking for someone who values excellence, knows what “operations” means and is willing to sign up for the role. You should be transparent in the interview about the following:
- Frequency of on-call responsibilities
- Will they need to be available on holidays to support systems?
- How often do pages/escalations happen now?
- If they work a lot of unusual or off hours, are the expectations for their normal job duties adjusted?
Make sure you are transparent about the role and expectations, and that they agree in the interview process that it is reasonable and acceptable given their personal obligations and life outside of work.
Best Practices – Monitoring
- Based on our current website (or ask about a past project or site they worked on – any example will do) what is the minimum set of things you should monitor? What is the optimum set? What has worked well in practice?
Most great systems engineers will want to monitor everything. (In my opinion you can never have too much monitoring, unless it slows down your systems or you don’t have an easy way to see the forest for the trees and pick out the key health signals.) But a good candidate should identify the basics right away: external uptime monitoring (à la Pingdom or Keynote) and basic server health monitoring such as CPU, memory, I/O, and disk space. Most candidates should rattle off a whole bunch of other things as well, like application and network monitoring. Some great follow-up questions to this one are:
- Have you ever needed more monitoring and didn’t have it? If so how did you get around the problem?
- Have you ever set up a monitoring system like Nagios, Munin, Zabbix, etc? If so, what were some of the things you learned from the process? Any changes or improvements you would have made?
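To make the "basics" concrete, here is a minimal Python sketch of a local health check covering disk space and load average. It is only an illustration, not a replacement for a real monitoring system: the function name `health_snapshot` and the 90% disk threshold are assumptions made up for this example, and memory and I/O are left out because the standard library has no portable call for them.

```python
import os
import shutil


def health_snapshot(path="/", disk_warn_pct=90.0):
    """Collect a few basic health signals for the local machine.

    The name and the default threshold are illustrative assumptions.
    """
    usage = shutil.disk_usage(path)
    # Guard against a zero-sized filesystem to avoid dividing by zero.
    disk_used_pct = 100.0 * usage.used / usage.total if usage.total else 0.0

    snapshot = {
        "disk_used_pct": round(disk_used_pct, 1),
        "disk_alert": disk_used_pct >= disk_warn_pct,
        "cpu_count": os.cpu_count(),
    }

    # os.getloadavg() exists only on Unix-like systems and can raise OSError.
    try:
        snapshot["load_1min"] = os.getloadavg()[0]
    except (AttributeError, OSError):
        snapshot["load_1min"] = None

    return snapshot
```

A candidate who has run real monitoring will usually point out what a check like this misses: trends over time, alert routing, and a view of the service from outside the machine.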
Best Practices – Disaster Recovery
- What is a disaster recovery plan? Have you ever created one? In your opinion, how (and how often) should you test your plan?
- How do offsite backups play into such a plan?
You are looking for a candidate who really understands disaster recovery policies and can make the right tradeoffs when implementing one. Generally the “right” answer depends on your business (if you are doing air traffic control, you had better have a well tested and rehearsed plan, whereas if you are just operating a personal blog, failover and backups to S3 may be sufficient).
- What is a 3-way handshake? Hint: TCP uses it.
- How do TCP/IP networks work?
- How does DNS work? Have you ever had DNS go down? When should you have backup DNS – have you ever had to set this up for a website?
- How does traceroute work?
- You have a MySQL DB, once you login how could you find out the schema of the db? (or any other set of basic SQL commands)
Anything here is fair game. The best way to cover these is to pepper them in with the other questions, or to tie them to projects the candidate has worked on in the past. Definitely don’t sit and go through each one in turn, because most people find it hard to context switch between knowledge questions asked one right after another – try to group them with other related questions.
The best questions are ones that are directly applicable to the job for which the candidate is interviewing. If this is more of a network engineer role, going deeper on networking topics is fair. If it is more of an IT role, the traditional IT topics (like setting up office networks, removing viruses, and setting up password policies) are all fair game. Tailor them to the role and candidate.
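If you want something hands-on to anchor the DNS and TCP questions, a short Python sketch like the one below can help. It resolves a name through the system resolver and then times a TCP connection, which only succeeds once the SYN, SYN-ACK, ACK exchange has completed. The function names `resolve` and `tcp_connect_time` are made up for this illustration.

```python
import socket
import time


def resolve(hostname):
    """Return the sorted, de-duplicated addresses the system resolver gives."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return sorted({info[4][0] for info in infos})


def tcp_connect_time(host, port, timeout=3.0):
    """Return the seconds taken to open a TCP connection to host:port.

    create_connection() returns only after the three-way handshake
    completes. If host is a name rather than an IP address, the
    measured time also includes the DNS lookup.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        return time.perf_counter() - start
```

For example, `resolve("example.com")` followed by `tcp_connect_time("example.com", 443)` gives a candidate something real to explain: where the addresses came from, and what had to happen on the wire before the call returned.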
- What was the last piece of software you installed on a server? How did you do it? Had you done it before? Do updates/installs always go this smoothly for you?
This should be a gimme question for most candidates. You are just looking for someone who installs things often and can tell you about it. Ideally they have some stories when things haven’t worked as well and can share an anecdote about incompatible versions or something else interesting.
- What are the pros and cons of using a cloud like Amazon?
Not all roles will require knowledge of cloud computing, but I would expect all engineers to have some opinions on these technologies and know some of the basics of what is offered. And I expect both pros and cons.
- Write a regular expression that can find all the [ phone numbers | websites | email addresses ] in a file.
I think regular expressions are a must for anyone in this role. If the candidate is a bit rusty you can also let them use the man pages on the server, or print out basic regular expression syntax for them.
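For reference when grading answers, here is one possible pair of patterns in Python. They are deliberately simple sketches: the email pattern is a practical approximation rather than a full RFC-compliant matcher, and the phone pattern assumes 10-digit North American style numbers. The names `find_emails` and `find_phones` are made up for this example.

```python
import re

# Practical approximation of an email address, not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Assumes 10-digit numbers such as 555-123-4567 or (555) 123-4567.
PHONE_RE = re.compile(r"\(?\b\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}\b")


def find_emails(text):
    """Return every email-like string in text, in order of appearance."""
    return EMAIL_RE.findall(text)


def find_phones(text):
    """Return every phone-like string in text, in order of appearance."""
    return PHONE_RE.findall(text)


sample = "Contact ops@example.com or admin@example.org, phone 555-123-4567 or (555) 987-6543."
print(find_emails(sample))  # ['ops@example.com', 'admin@example.org']
print(find_phones(sample))  # ['555-123-4567', '(555) 987-6543']
```

A strong candidate will usually volunteer the limits of whatever pattern they write, such as international phone formats or unusual but valid email addresses.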
Troubleshooting & Debugging
- What is the worst mistake you have made at work? How did you fix it? What did you learn from it?
- Someone says the website is slow. How do you troubleshoot the problem? What steps do you take? At what point do you ask for help?
I love these two questions. I am looking for someone who is curious, asks good questions, and is capable of learning from his or her mistakes. The second one is a great one to dive into. I usually make the cause some database issue, like too many connections, but what I want to see is the candidate asking lots of questions and walking me through how they would diagnose such a problem. Feel free to choose any root cause (even better if you have had to solve it yourself in real life, so you know the steps you took to get there), but pick it ahead of time so you can lead the candidate in the right direction.
Technical Working Problems
- You have an application running on its own server and you want to host it at www.yourdomain.com/newservice – what do you need to do so that when users go to that URL it will resolve to the right host?
Note: there are several different ways to solve this, and they can depend on the web server you are using, so you may also want to state the web server that the candidate has worked with or that you use – e.g. Apache, Tomcat, etc.
If they don’t know off the top of their head, this is a great one to give them a computer with access to Google and watch them figure it out.
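One common family of answers is a reverse proxy on the www host that forwards requests under a path prefix to the application server. The Python sketch below shows only the routing decision such a proxy makes; it is not how any particular web server implements it. The backend addresses and the name `pick_backend` are hypothetical, and stripping the prefix before forwarding is a design choice that depends on what the application expects.

```python
# Hypothetical mapping of URL path prefixes to backend servers.
ROUTES = {
    "/newservice": "http://app-host.internal:8080",
}

# Hypothetical backend that serves the rest of the site.
DEFAULT_BACKEND = "http://127.0.0.1:8000"


def pick_backend(path):
    """Return (backend, forwarded_path) for a request path.

    Longer prefixes are checked first, and a prefix only matches on a
    path-segment boundary, so /newservices does not match /newservice.
    """
    for prefix in sorted(ROUTES, key=len, reverse=True):
        if path == prefix or path.startswith(prefix + "/"):
            # Strip the prefix so the application sees paths from its own root.
            return ROUTES[prefix], path[len(prefix):] or "/"
    return DEFAULT_BACKEND, path
```

With this table, `pick_backend("/newservice/status")` forwards `/status` to the application host, while `/about` stays on the main site. In a real deployment the same idea is normally expressed in the web server's own proxy configuration rather than in application code.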
- Write a script that will ping each of the following websites (insert list of 3-4 websites of your choice), report the time each one takes to respond, and output these responses to the screen. It should do this every 10 seconds for a minute before terminating.
Feel free to let them use Google and their own laptop – whatever makes them comfortable. Note: not all candidates are comfortable with scripting, so you should only ask this if they claim proficiency on their resume.
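To calibrate what a reasonable answer to the scripting question looks like, here is one possible Python sketch. It makes a substitution worth naming: instead of ICMP ping, which usually needs elevated privileges, it times a TCP connection to port 443. The function names `tcp_probe` and `monitor` and the default port are assumptions for this example, and the `probe` and `sleep` parameters exist only so the loop can be exercised without a network or a real one-minute wait.

```python
import socket
import time


def tcp_probe(host, port=443, timeout=3.0):
    """Return seconds taken to open a TCP connection, or None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return None


def monitor(hosts, interval=10.0, duration=60.0, probe=tcp_probe, sleep=time.sleep):
    """Probe every host once per interval until duration has elapsed.

    Prints one line per probe and returns (round, host, seconds) tuples.
    """
    results = []
    rounds = int(duration // interval)
    for round_no in range(rounds):
        for host in hosts:
            elapsed = probe(host)
            results.append((round_no, host, elapsed))
            if elapsed is None:
                print(f"{host}: no response")
            else:
                print(f"{host}: {elapsed * 1000:.1f} ms")
        # No need to wait after the final round.
        if round_no < rounds - 1:
            sleep(interval)
    return results
```

Calling `monitor(["example.com", "example.org"])` runs six rounds, ten seconds apart. A good follow-up is to ask the candidate what is wrong with a version like this one: the probes run one after another, and the time they take is not subtracted from the sleep, so the rounds drift slightly past ten seconds.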
- What is the best way to keep documentation up to date?
- What is the minimum amount of documentation you need to support a service/system?
If documentation is part of their job, the candidate should know what good documentation looks like and what they need to be responsible for a service. Experienced candidates should have some strong opinions on this since most of them have had to be the customers of good/bad documentation.
- How do you balance customer service skills and technical skills?
- Have you ever had to create a schedule for a project? What was the project? How did it go? Did you always hit your deadlines? If not, what did you learn from the experience?
- How do you like to work? By yourself, as a team? At home? At work? What are the conditions under which you do your best work?
- Tell me about a time when you “wowed” a customer.
- Are there any coworker behaviors that really drive you crazy? How do you deal with them?
- What makes you a great teammate? How do you think you could improve your soft skills to be an even better teammate?
With this line of questioning you are really trying to figure out if you would like to work with this person. Would they fit well with your team, and your culture, and the role? Sometimes systems engineers or systems administrators have a very big customer service portion helping employees or real customers, so being able to interface with people is important. And of course in any operations role if something goes wrong, hopefully the candidate will be able to handle the situation with grace – or at least not freak out.
- Where do you want to be in 5 years? How does this job/role fit into that vision?
- What would your last manager say about you? The good and the bad? How are you working to improve or mitigate the bad parts?
- When have you had the most fun at work?
- Tell me about a time you have gone over and above the call of duty.
- Which project is your biggest success and why?
- Have you ever read a book that made a big impact on you? What was it and what did you learn?
- Do you consider yourself weird? Why or why not?
These are a pretty standard set of questions I like to ask; of course you should pick and choose which ones make sense for your company and culture.
I hope this helps – feel free to add any other questions you know and like in the comments!
Oh, and for those of you who just want the list of questions – you can download them as a PDF here!