The vacancy is well defined but gives no compensation details, which makes it less attractive to applicants.
Salary: not specified
Job description
Alfa-Bank is looking for a Data Engineer to implement high-load data processing pipelines and prepare data for machine learning models.
Responsibilities
- Implement high-load data processing pipelines to ensure reliable data replication from the Bank's IT systems.
- Prepare data in the target analytical storage layers (DataLake, SandBox, FeatureStore) for building the features required by machine learning models.
- Write and maintain documentation for the functionality you develop.
- Keep task statuses in Jira up to date.
- Review code written by data engineers and junior data engineers.
Requirements
- Python: strong knowledge of data structures and algorithms; effective application of OOP and FP principles; experience writing unit and integration tests; hands-on experience with data processing and analysis libraries (numpy, pandas).
- Experience developing and deploying services that load and process unstructured and semi-structured data (text, XML, JSON) from external sources.
- Ability to understand data provider APIs using available documentation.
- SQL: ability to write complex queries using analytical window functions and to optimize their performance with profiling tools; experience with Oracle, PostgreSQL, and Greenplum databases.
- Strong knowledge of and experience with tools for developing, scheduling, and monitoring batch data processing (workflow engines), such as Airflow.
- Spark: experience developing complex, high-load data processing applications based on PySpark; strong knowledge of Spark settings and their impact on application performance.
About Alfa-Bank
Alfa-Bank is one of Russia's largest private banks. It provides a wide range of financial services, including retail and corporate banking and digital solutions, and runs data processing initiatives. The bank actively hires for IT roles, including Python developers who build data integration systems, RESTful APIs, and recommendation engines within its banking ecosystem.