Site Reliability Engineer

Ref: 80371

  • A generous overall package in line with experience will be on offer to the right candidate.
  • 07 Dec 2023
  • Cork (Centre)
  • Permanent

About the Company

My client is a highly successful, dynamic, international company specialising in the provision of safety engineering services and solutions to a wide range of industry sectors. The employees operate in over 40 countries worldwide, contributing their commitment, creativity, and knowledge, which enable the client to maintain their position as one of the leading brands in safety.
Based in Cork, the Global Software Product Research and Development Centre is responsible for the creation of innovative, high-quality software tools. Its highly skilled workforce specialises in research and advanced development, product development, mobile development and cloud computing. The Software team includes Software Architects, Software Developers, Software Testers, Requirements Engineers, Product Owners, Scrum Masters and Agile Coaches.


About the Position

This client contributes to the development of innovative, industry-leading software products that provide the human interface into the control of industrial automation systems. They are currently looking to add a Senior Site Reliability Engineer to their team. This is an opportunity for an experienced SRE or DevOps engineer to join a new team at an early stage, with the ability to shape the future of its systems and services.
The SRE team is responsible for the following broad areas within the organization: operations, availability, latency, performance, monitoring, change management, emergency response and capacity management.


Key Responsibilities

• Design, implement and maintain robust, scalable, high-quality software and systems within the SRE domain
• Management and support of the in-house development systems, CI/CD pipelines and tools, monitoring and alerting, leaning on automation to streamline activities and to reduce toil
• Work closely with developers and architects to ensure designed solutions meet non-functional requirements such as availability, performance, security and maintainability
• Incident management, post-mortem reviews and continuous improvement initiatives, contributing to the evolution of processes and systems within the organization
• Contribute to the definition of key metrics and technical decisions driving products and their delivery: SLOs and SLAs, architecture, best practices, cost optimization
• Take responsibility for complex project tasks, strive for and achieve higher standards of individual and team performance
• Build relationships external to the team
• Drive and achieve knowledge sharing across all products
• Identify personal development opportunities, set goals and deliver against them, turn learning into impactful on-the-job contributions
• Mentor and train other engineers throughout the company and drive company-wide improvement


Experience/Requirements

• 3-5 years using the AWS platform and services
• Experience with common tools and technologies used within CI/CD and build pipelines: Terraform, Jenkins, GitLab, Nexus, Ansible, Maven, Docker, Helm
• BSc in an IT-related field (e.g. Computer Science, Cloud Computing, Engineering) or 3-5 years’ professional experience in cloud operations and/or cloud platforms in a DevOps engineering or SRE role
• Experience with, and high-level understanding of, common operating systems including Ubuntu and Windows
• Experience scripting in Bash, Python or PowerShell
• Experience troubleshooting issues in a cloud environment
• Experience working with multiple teams to facilitate orderly project and release plans
• Familiarity with VMware vSphere (ESXi, vCenter) desirable

Essential Criteria

• Experience with as many AWS services as possible: compute (EC2, Lambda) and containerization (EKS, ECR), storage (S3), database (RDS, DynamoDB), networking (ELB, VPC), automation (CloudFormation), IAM (Cognito), security (Security Hub, Shield, GuardDuty, Control Tower, KMS…), monitoring and logging (Prometheus, CloudWatch, CloudTrail…), backup and configuration management (Backup, Config)
• Experience in an SRE team and understanding of SRE principles
• Experience in backup and restore processes and procedures
• Experience in cloud-based multi-tenancy
• Experience in emergency response and on-call
• Understanding of current best practices around security management and patching, branching strategy, release management and Linux administration
• Experience working in an agile development team using Scrum

Remuneration Package

60K-70K negotiable along with an outstanding benefits package.

Contact
Please contact Charlie Bigger on 01 5927861 or by email, or simply click the apply button. To view all live jobs with Brightwater and market insights, please visit our website: www.brightwater.ie

#LI-Hybrid #LI-CB1