Site Reliability Engineer: Why Experience Will Get You the Job

Getting hired for any position depends on having the right education and experience. For site reliability engineers (SREs), more than for many other positions, the right experience matters more than the right degree.

The reason for this is that site reliability engineers need to keep critical applications running. This requires a level of understanding that only comes from real-world experience; book smarts just aren’t enough to provide the insights that let SREs resolve problems quickly.

Build Technical Experience

The SRE position also requires an understanding of systems that is both deep and broad. SREs need to understand networking, systems administration, databases, application development, and all the interactions between them. SREs can be involved in architecting the environment where the applications will run, so they need to know how to bring all the components together effectively.

Experience working with programming languages, including both compiled languages like Java and scripting languages like Python, is necessary to develop the tools SREs use. Writing a few "hello world"-level programs doesn't give you the depth of experience needed to build and debug complex applications.

Develop Troubleshooting Skills

The more problems you solve, the better you get at solving them. A large part of the SRE role is solving problems, and they are often solved by recognizing similarities to a previous issue. The more experience you have, the bigger the dataset you can apply pattern recognition to. You don't have to start solving every problem from first principles; you can jump right to the most likely sources of trouble. Shortcutting the problem-solving process shortens the time the problem persists.

Develop Serenity

Another reason companies look for experience in their SREs is that the position requires interacting with other internal organizations to investigate and resolve problems as quickly as possible. Things get stressful fast: when a core application is down, business doesn't get done, and the company loses money and may take a hit to its image. Developing the ability to stay calm and focused, and to work through analyses when managers are losing their cool around you, is key to getting the job done. This kind of battle hardening only comes from living through complicated real-world problems, and it makes you more effective at your SRE job.