Machine Learning Data Engineer

  • Hybrid
  • English
  • Banking
  • Regular
  • Agile/Scrum

Join us, and enhance data solutions with the latest technologies and tools!

Krakow-based opportunity with the possibility of working 80% remotely.

As a Machine Learning Data Engineer, you will be working for our client, a leading global financial institution, known for building innovative digital solutions and transforming the banking industry. You will play a key role in supporting their data and digital transformation initiatives by developing and optimizing data engineering processes. Working with cutting-edge technologies, you’ll contribute to the development of robust and scalable data solutions for critical financial services, handling everything from data pipelines to cloud integrations. You’ll be part of a dynamic team working on both greenfield projects and established banking applications.

Your main responsibilities:

  • Developing and optimizing data engineering processes
  • Building robust, fault-tolerant data solutions for both cloud and on-premise environments
  • Automating data pipelines to ensure seamless data flow from ingestion to serving (see the sketch after this list)
  • Creating well-tested, clean code in line with modern software engineering principles
  • Working with cloud technologies (AWS, Azure, GCP) to support large-scale data operations
  • Supporting data transformation and migration efforts from on-premise to cloud ecosystems
  • Designing and implementing scalable data models and schemas
  • Maintaining and enhancing big data technologies such as Hadoop, HDFS, Spark, and Cloudera
  • Collaborating with cross-functional teams to solve complex technical problems
  • Contributing to the development of CI/CD pipelines and version control practices
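
As an illustration of the pipeline work described above, here is a minimal PySpark sketch of a batch job that ingests raw data, applies a simple transformation, and writes a partitioned serving-layer table. It is a sketch only: the paths, table, and column names are hypothetical, and it assumes a configured Spark environment with access to the underlying storage.

```python
from pyspark.sql import SparkSession, functions as F

# Hypothetical locations -- adjust to the actual platform.
RAW_PATH = "s3a://raw-zone/transactions/"      # ingestion landing zone
CURATED_TABLE = "curated.daily_transactions"   # serving-layer table

spark = SparkSession.builder.appName("daily-transactions").getOrCreate()

# Ingest: read raw Parquet files from the landing zone.
raw = spark.read.parquet(RAW_PATH)

# Transform: deduplicate, filter, and build a daily aggregate.
daily = (
    raw.dropDuplicates(["transaction_id"])
       .filter(F.col("amount") > 0)
       .groupBy(F.to_date("event_time").alias("day"), "account_id")
       .agg(F.sum("amount").alias("total_amount"),
            F.count("*").alias("tx_count"))
)

# Serve: write a partitioned table for downstream consumers.
daily.write.mode("overwrite").partitionBy("day").saveAsTable(CURATED_TABLE)
```

In practice a job like this would be scheduled and monitored by an orchestrator, with the final write targeting whatever serving store the platform standardizes on.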

You’re ideal for this role if you have:

  • Strong experience in the Data Engineering Lifecycle, especially in building data pipelines
  • Proficiency in Python, PySpark, and the Python ecosystem
  • Experience with cloud platforms such as AWS, Azure, or GCP (preferably GCP)
  • Expertise in Hadoop on-premise distributions, particularly Cloudera
  • Experience with big data tools such as Spark, HDFS, Hive, and Databricks
  • Knowledge of data lake formation, data warehousing, and schema design
  • Strong understanding of SQL and NoSQL databases
  • Ability to work with data formats like Parquet, ORC, and Avro (see the sketch after this list)
  • Familiarity with CI/CD pipelines and version control tools like Git
  • Strong communication skills to collaborate with diverse teams
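
As a small illustration of the schema-design and file-format items above, the sketch below defines an explicit schema rather than relying on inference, then persists the same DataFrame in the formats named in the posting. The field names and paths are hypothetical, and writing Avro additionally requires the spark-avro package on the classpath.

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import (StructType, StructField, StringType,
                               DoubleType, TimestampType)

spark = SparkSession.builder.appName("schema-example").getOrCreate()

# Explicit schema instead of inference -- hypothetical fields.
schema = StructType([
    StructField("account_id", StringType(), nullable=False),
    StructField("amount", DoubleType(), nullable=True),
    StructField("event_time", TimestampType(), nullable=True),
])

df = spark.read.schema(schema).json("/data/landing/transactions/")

# Persist in Parquet and ORC (built in) and Avro (needs spark-avro).
df.write.mode("overwrite").parquet("/data/curated/parquet/")
df.write.mode("overwrite").orc("/data/curated/orc/")
df.write.mode("overwrite").format("avro").save("/data/curated/avro/")
```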

It is a strong plus if you have:

  • Experience with ML models and MLOps
  • Exposure to building real-time event streaming pipelines with tools like Kafka or Apache Flink (see the sketch after this list)
  • Familiarity with containerization and DevOps practices
  • Experience in data modeling and handling semi-structured data
  • Knowledge of modern ETL and ELT processes
  • Understanding of the trade-offs between different data storage technologies
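
For the event-streaming item above, a minimal Spark Structured Streaming sketch that consumes from Kafka might look as follows. The broker address, topic, and checkpoint path are hypothetical, and the job assumes the spark-sql-kafka connector is on the classpath.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("events-stream").getOrCreate()

# Source: subscribe to a Kafka topic (hypothetical broker and topic).
events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("subscribe", "payment-events")
         .load()
)

# Kafka delivers key/value as binary; decode the value before use.
decoded = events.select(F.col("value").cast("string").alias("payload"))

# Sink: console is used here for brevity; a real pipeline would write
# to a durable sink and parse the payload against a known schema.
query = (
    decoded.writeStream.format("console")
           .option("checkpointLocation", "/tmp/checkpoints/payment-events")
           .start()
)
query.awaitTermination()
```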

#GETREADY to meet with us!

We would like to meet you. If you are interested, please apply and attach your CV in English or Polish, including a statement consenting to our processing and storing of your personal data. You can also apply by sending us an email at recruitment@itds.pl.

Internal number #6506

Benefits

Access to 100+ projects
Access to Healthcare
Access to Multisport
Training platforms
Access to Pluralsight
Make your CV shine
B2B or Permanent Contract
Flexible hours and remote work
