Site Reliability Engineer

Hybrid
English
Banking
Expert/Senior
Agile/Scrum

Krakow

19 320 - 26 040 zł B2B

Join us, and ensure seamless system performance every day!

Krakow-based opportunity with the possibility to work 80% remotely!

As a Site Reliability Engineer, you will be working for our client, a leading financial institution heavily investing in Agile culture, DevOps processes, and Cloud Technologies. The new development team in Krakow, part of a long-term strategy to support a European platform, offers an exciting opportunity to contribute to the foundational stages of a critical project. This role involves ensuring system reliability, availability, and performance while supporting a dynamic, high-impact environment.

Your main responsibilities:

Managing application support operations, focusing on resiliency, availability, and monitoring system health and performance
Coordinating the resolution of production incidents and conducting post-mortems/RCA to identify root causes and improve processes
Investigating, triaging, and resolving production incidents with a focus on technical signals and root cause analysis
Documenting post-incident recovery steps, contributing to process improvements, identifying deviations, and creating a Knowledge Base
Actively participating in the service management community, engaging in Incident Management, Problem Management, and Service Delivery
Defining and delivering tactical and strategic service improvements across the technical and process landscape
Applying SRE principles to continuously improve platform reliability, capacity, and performance, reducing toil and enhancing observability
Developing observability tools and techniques for monitoring, alerting, incident detection, response, capacity management, and release safety

You’re ideal for this role if you have:

4+ years of experience developing and supporting distributed systems written in Java
Experience with Disaster Recovery methods and processes
A methodical approach to troubleshooting and strong problem-solving skills
Experience in application lifecycle management tooling: JIRA/Confluence, Ansible, Vulnerability Remediation, CI/CD automation
Experience implementing and managing Logging, Monitoring, and Alerting frameworks for hybrid cloud using tools such as Geneos, Grafana, InfluxDB, Splunk, Loki, or similar
Understanding of relational databases (RDBMS), cloud technology, Unix/Linux, and job scheduling tools, e.g., Control-M or AutoSys
Ability to lead technical conversations with various technical support groups
Excellent communication skills and experience working in an Agile environment

It is a strong plus if you have:

Experience with Apache Beam, Apache Flink, GCP, Redis, REST APIs
Familiarity with Spring Boot and Spring Cloud
Knowledge of Ansible and Jenkins for automation and deployment

#GETREADY to meet with us!

We would like to meet you. If you are interested, please apply and attach your CV in English or Polish, including a statement that you agree to our processing and storing of your personal data. You can also apply by sending us an email at recruitment@itds.pl.

Internal number #5627