If you like a challenging environment where you’re working with the best and are encouraged to learn and experiment daily, there’s no better place — guaranteed! :)
What you will do
- Operations and Service Availability: Participate in a 24/7 operations team to guarantee service availability, managing day-to-day alerts, system checks, and issue escalations;
- Monitoring and Troubleshooting: Actively monitor and troubleshoot alerts and issues within SaaS environments, using custom dashboards for effective troubleshooting as needed;
- Infrastructure Knowledge: Gain proficiency in our existing infrastructure, particularly Docker Swarm, to effectively manage and support the environment;
- Root Cause Analysis (RCA): Conduct thorough RCAs to identify the root causes of issues and implement corrective actions to prevent future occurrences;
- Alert Management: Investigate alerts, create action plans, and delegate tasks to the appropriate team members;
- Support and Communication: Handle support requests, engage in customer calls to explain RCAs, and communicate effectively with managers, teams, and customers about product monitoring risks, issues, and changes;
- Automation and Feedback: Identify automation opportunities to streamline RCAs and provide valuable feedback to the product and engineering teams to enhance product performance, logging, tracing, and monitoring;
- Documentation and Compliance: Maintain process and procedure documentation and conduct internal audits to ensure SaaS infrastructure security and compliance;
- Collaboration and Improvement: Work collaboratively with support teams and customers to identify and resolve SaaS environment issues. Contribute to the improvement of monitoring, alerting, and overall system health.
What we are looking for
- Ability to operate independently and collaboratively in a team environment;
- Proficient in EKS, Terraform, Helm, Docker, and Docker Swarm;
- Strong sense of responsibility and accountability for delivering high-quality work;
- Excellent communication skills, with the ability to effectively convey issues and RCAs to customers;
- Experience with AWS, cloud and network administration, and SaaS product/application support;
- Knowledgeable in infrastructure, security, compliance, Prometheus, Grafana, Linux, and scripting (Python, shell);
- Understanding of APIs, databases, systems architecture, and design.
What we offer
- Professional growth
- Competitive compensation
- A selection of exciting projects
- Flextime