Site Reliability Engineer

Ref: 76047

  • 55,000-60,000
  • 09 Nov 2021
  • Cork (Centre)
  • Permanent

An opportunity for an experienced Site Reliability Engineer has become available with a global tech firm in Cork.

This is an opportunity for an experienced SRE or DevOps engineer to join a new team at an early stage, with the ability to shape the future of the systems and services of the business. The SRE team is responsible for the following: operations, availability, latency, performance, monitoring, change management, emergency response and capacity management.

• Incident management and conducting post incident reviews

• Design and implementation of software to assist in operations and support

• Work closely with developers to ensure designed solutions meet non-functional requirements such as availability, performance, security and maintainability

• Improving the internal and external processes and systems within the organization, moving from ad-hoc practices to infrastructure and configuration as code throughout the organization

• Implementing monitoring and alerting throughout in-house and cloud-based systems

• Improving the reliability of systems throughout the organization and working with product teams to help improve their products

• Support on production and in-house systems

• Define and manage release budgets

• Managing the in-house development and CI/CD systems

• Responsible for the design, implementation and maintainability of robust, scalable, high-quality software and systems within the SRE domain

• Take responsibility for complex project tasks and contribute to technical decisions to ensure successful delivery

• Contribute to the site architecture for all products

• Active member of the best-practice SRE function, striving for and achieving higher standards of individual and team performance


What is needed to do the job:


• BSc in a related field such as Computer Science, Computer Eng, Elec Eng or equivalent, and 5-7 years of professional development or operations experience

• 2-3 years in a DevOps engineering or SRE role
• 2-3 years proven experience in cloud computing
• Experience in VMware ESXi
• Experience in AWS
• Configuration Management Technologies, such as Ansible, Puppet, Chef or Salt
• Infrastructure as Code Technologies, such as Terraform and CloudFormation
• Experience with, and high-level understanding of, multiple operating systems, including Ubuntu, macOS and Windows
• Scripting in languages such as Bash or Python
• CI/CD and build pipelines in both Jenkins and GitLab
• Experience working with multiple teams to facilitate orderly project and release plans
• Experience in issue analysis in a cloud environment

Nice to have:

• Databases such as MySQL / Postgres
• Experience in cloud-based multi-tenancy
• Defining SLOs and SLAs, defining and monitoring release budgets
• Monitoring and Alerting with Prometheus & CloudWatch
• Logging with CloudWatch, ELK / EFK
• Experience in an SRE team and understanding of SRE principles
• Experience in backup and restore processes and procedures