Senior Site Reliability Engineer

Ref: 78372

  • A generous overall package in line with experience will be on offer to the right candidate
  • 01 Dec 2022
  • Cork (Centre)
  • Permanent

About the Company

My client is a highly successful, dynamic, international company specialising in the provision of safety engineering services and solutions to a wide range of industry sectors. The employees operate in over 40 countries worldwide, contributing their commitment, creativity, and knowledge, which enable the client to maintain their position as one of the leading brands in safety.
Based in Cork, the Global Software Product Research and Development Centre is responsible for the creation of innovative, high-quality software tools. Their highly skilled workforce specialises in research and advanced development, product development, mobile development and cloud computing. The profiles within the Software team include Software Architects, Software Developers, Software Testers, Requirements Engineers, Product Owners, Scrum Masters and Agile Coaches.

About the Position

This client contributes to the development of innovative, industry-leading software products that provide the human interface into the control of industrial automation systems. They are currently looking to add a Senior Site Reliability Engineer to their team. This is an opportunity for an experienced SRE or DevOps engineer to join a new team at an early stage, with the ability to shape the future of the systems and services.
The SRE team is responsible for the following broad areas within the organization: operations, availability, latency, performance, monitoring, change management, emergency response and capacity management.

Key Responsibilities

• Incident management and conducting post incident reviews
• Design and implementation of software to assist in operations and support
• Work closely with developers to ensure designed solutions meet non-functional requirements such as availability, performance, security and maintainability
• Improving the internal and external processes and systems within the organization. Moving from ad-hoc practices to infrastructure and configuration as code throughout the organization
• Implementing monitoring and alerting throughout in-house and cloud-based systems
• Improving the reliability of systems throughout the organization and working with product teams to help improve their products
• Support on production and in-house systems
• Define SLOs and SLAs for Products
• Define and manage release budgets
• Managing the in-house development and CI/CD systems
• Responsible for the design, implementation and maintainability of robust, scalable, high-quality software and systems within the SRE domain
• Take responsibility for complex project tasks and contribute to technical decisions to ensure successful delivery
• Contribute to the site architecture for all products
• Active member of the best practice SRE function striving for and achieving higher standards of individual and team performance
• Build relationships external to the team
• Drive and achieve knowledge sharing across all products
• Continuous education and development of technical skill set, with demonstrated application to the domain
• Identify personal development opportunities, set goals and deliver on them
• Mentor and train other engineers throughout the company and drive company-wide improvement


Experience/Requirements

Essential Criteria
• BSc in a related field such as Computer Science, Computer Eng, Elec Eng or equivalent, and 5-7 years’ professional development or operations experience
• 2-3 years in a DevOps engineering or SRE role
• 2-3 years proven experience in cloud computing
• Experience in VMware ESXi
• Experience in AWS
• Configuration Management Technologies, such as Ansible, Puppet, Chef or Salt
• Infrastructure as Code Technologies, such as Terraform and CloudFormation
• Experience with, and high-level understanding of, multiple operating systems, including Ubuntu, macOS and Windows
• Scripting in languages such as Bash or Python
• CI/CD and build pipelines in both Jenkins and GitLab
• Experience working with multiple teams to facilitate orderly project and release plans
• Experience in issue analysis in a cloud environment

Desired Criteria
• Databases such as MySQL / Postgres
• Experience in cloud-based multi-tenancy
• Defining SLOs and SLAs, defining and monitoring release budgets
• Experience in emergency response and on-call
• Understanding of the current best principles around Security Management
• Experience working in an agile development team using Scrum
• Monitoring and Alerting with Prometheus & CloudWatch
• Logging with CloudWatch, ELK / EFK
• Experience in an SRE team and understanding of SRE principles
• Experience in backup and restore processes and procedures


Remuneration Package
A generous overall package in line with experience will be on offer to the right candidate.

Contact
Please contact Charlie Bigger by phone on 01 5927861 or by email, or simply click the apply button. To view all live jobs with Brightwater and market insights, please visit our website: www.brightwater.ie