Already filled
Senior Site Reliability Engineer
Requirements
Must have:
- BSc in Engineering, Computer Science, or equivalent practical experience.
- 5+ years of experience in Site Reliability Engineering.
- Strong background in a technical or IT-focused role.
- Hands-on experience with configuration management tools such as Ansible, Puppet, Chef, or similar platforms.
- Professional experience working in a public cloud environment such as Azure, AWS, or Google Cloud Platform.
- Solid troubleshooting and support experience with Linux and Windows servers.
- Experience with system and application monitoring tools such as Prometheus, Grafana, Nagios, or CloudWatch.
- Familiarity with source control systems such as Git or SVN.
- Ability to design cloud architecture and technical solutions that support business priorities.
- Broad technical skill set and a proactive, enthusiastic approach to technology.
- Excellent verbal and written communication skills.
- Ability to serve as a technical point of reference, share best practices, and coach colleagues.
Preferred:
- Azure or AWS certifications.
- Experience using infrastructure-as-code and orchestration tools such as Terraform, Ansible, or CloudFormation.
- Experience moving applications from on-premises infrastructure to the public cloud.
- Familiarity with blue-green deployment methods.
- Experience with continuous integration and delivery tools such as GitLab or Jenkins.
- Experience working with containerized environments such as Docker.
- Familiarity with log management tools such as Elastic Stack, Graylog, or Splunk.
- Experience with enterprise databases such as MySQL or Microsoft SQL Server.
- Understanding of change control processes and related procedures.
- Experience using secret management services such as HashiCorp Vault.
- Familiarity with a high-level programming language.
Responsibilities
- Deliver resilient application platforms using Infrastructure as Code and other DevOps practices.
- Monitor and support mission-critical, high-revenue business applications on an ongoing basis.
- Investigate, diagnose, and resolve complex system and application incidents.
- Collaborate closely with development, QA, IT operations, customer operations, and project management teams.
- Create and maintain technical documentation for both technical and non-technical audiences.
- Participate in an on-call rotation to help maintain round-the-clock, year-round system availability.
- Work across a diverse range of technologies as a leading member of the team.
Company
We are seeking a remote professional to join a friendly team working across a diverse technology landscape. We invest in our people and offer comprehensive benefits to eligible employees, including medical, dental, and vision insurance, HSA, FSA, 401(k), and life, disability, and AD&D insurance. Salaried employees receive paid time off, while hourly employees may receive paid sick leave where required by law. This role does not include bonuses, incentives, or commissions. Compensation is determined by experience, skills, education, certifications, seniority, location, performance, and business needs. We are an equal opportunity employer committed to fair consideration for all qualified applicants.