Site Reliability Engineer

Facebook is seeking talented operations engineers to join the Site Reliability Engineering team in Palo Alto, CA. Telecommuting is not an option. The ideal candidate will have strong communication skills, a passion for tinkering with Linux, and an almost insane fondness for fast-paced, seat-of-your-pants troubleshooting and crisis management. This position reports to the Lead Site Reliability Engineer.

Responsibilities

Monitor the stability and performance of all aspects of the site and initiate corrective action if needed
Remotely troubleshoot and diagnose hardware problems
Debug issues with the operating system, applications, databases, and network
Respond to utilization variances across multiple data centers
Identify and triage all outage-related events
Track issues and run reports

Requirements

5-7+ years of Linux support/sysadmin experience in an Internet operations environment
BA/BS in Computer Science or a related field, or equivalent experience
Good working knowledge of Linux OS fundamentals (preferably from the Red Hat branch), Linux fault-finding methodologies and tools, TCP/IP, Apache, and MySQL
Previous experience with, or an understanding of, memcache is a definite plus
Demonstrable understanding of network load-balancing principles - F5 experience is a plus
Experience working with network management systems and monitoring tools, such as Nagios, Ganglia, and Cacti
Competency in Bash scripting - PHP, Perl, or Python are a plus
Solid understanding of the functional principles of the LAMP stack
A sense of urgency in responding to, owning, and resolving all critical issues that relate to the performance of the site and/or core infrastructure
Obsessive-compulsive attention to detail, and demonstrable lateral thinking - we live outside the box!
Excellent verbal and written communication skills
Willingness to work shifts

IF INTERESTED, PLEASE SEND YOUR RESUME TO MICHELLE BOSTOCK (mbostock-at-facebook.com).

Michelle Bostock | Facebook Recruiting | 1601 S. California Ave. | Palo Alto, CA | 94304
SRE Awesomeness: http://www.facebook.com/notes/facebook-engineering/site-reliability-engineering-at-facebook/291616313919