Володимир
Data engineer
Contact information
The job seeker has entered a phone number.
You can get this candidate's contact information from https://www.work.ua/resumes/15706963/
Data Engineer
ABOUT ME
Python Data Engineer with over 6 years of experience designing and maintaining ETL pipelines, real-time data streams, and workflow automation.
Skilled in Apache Airflow, PySpark, Kafka, BigQuery, and cloud-based data solutions. Passionate about optimizing data processes, ensuring data
quality, and delivering scalable and reliable systems. Enjoy tackling complex challenges and driving improvements through collaboration and
innovative solutions.
WORK EXPERIENCE
Python Data Engineer | Murka Games | 2021 - Present
▪ Designed and implemented robust real-time data pipelines using Python and Apache Airflow to process high-volume streams.
▪ Integrated and processed data streams with Apache Kafka (Kafka Connect, kSQL, Flink) and Google Pub/Sub.
▪ Built scalable ETL pipelines with PySpark and Dataproc, delivering data into BigQuery as the primary analytical database (a minimal DAG sketch follows this list).
▪ Developed and maintained data schemas to normalize incoming streaming data and ensure compatibility with existing database structures for
analytics and reporting.
▪ Implemented data quality checks, monitoring, and logging with Google Cloud Monitoring.
▪ Optimized data retrieval and processing in BigQuery, ensuring cost efficiency and performance.
▪ Leveraged cloud storage in GCS and AWS S3 for data ingestion and delivery.
▪ Automated deployment and operations using GitLab CI/CD, Ansible Playbooks, and Docker for containerization of jobs and services.
▪ Wrote and maintained unit and integration tests to ensure code reliability (pytest, mocks).
▪ Developed and maintained integrations with partner APIs, retrieving and processing cost/traffic data from external systems.
▪ Documented data engineering workflows and collaborated with cross-functional teams to ensure reliable data delivery.
▪ Continuously refined data engineering practices by adopting new technologies and industry best practices.
▪ Monitored and troubleshot data pipelines to ensure data integrity and availability.
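To make the Airflow/Dataproc/BigQuery pattern above concrete, here is a minimal, hedged sketch of such a DAG: a scheduled PySpark job submitted to a Dataproc cluster, with the job script assumed to load its output into BigQuery. This is illustrative only, not the production code; the project, region, cluster, bucket, and script names are hypothetical placeholders.

from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import DataprocSubmitJobOperator

PROJECT_ID = "example-project"  # hypothetical placeholder
REGION = "europe-west1"         # hypothetical placeholder
CLUSTER_NAME = "etl-cluster"    # hypothetical placeholder

# Job definition: the referenced script would read raw events from GCS and
# write the normalized result to BigQuery via the Spark BigQuery connector.
PYSPARK_JOB = {
    "reference": {"project_id": PROJECT_ID},
    "placement": {"cluster_name": CLUSTER_NAME},
    "pyspark_job": {
        "main_python_file_uri": "gs://example-bucket/jobs/normalize_events.py",  # hypothetical
        "jar_file_uris": ["gs://spark-lib/bigquery/spark-bigquery-latest_2.12.jar"],
    },
}

with DAG(
    dag_id="events_etl",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@hourly",
    catchup=False,
) as dag:
    DataprocSubmitJobOperator(
        task_id="normalize_events",
        project_id=PROJECT_ID,
        region=REGION,
        job=PYSPARK_JOB,
    )

In a real deployment this task would be surrounded by the data quality checks, monitoring, and retries mentioned in the bullets above.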
Python Backend Developer | PrivatBank | 2017 - 2021
▪ Engineered and managed ETL pipelines using Python, Apache Airflow, PySpark, and Pandas for critical banking data operations.
▪ Consolidated, validated, and cleaned data from diverse sources, including RDBMS (Sybase ASE, PostgreSQL, Oracle), files, APIs, and web services (a sketch of this batch pattern follows this list).
▪ Developed stored procedures and Python scripts to support both batch and real-time data processing workflows.
▪ Led database migration projects, rewriting procedures and data handlers from Sybase ASE to PostgreSQL/Oracle.
▪ Automated scheduling, monitoring, and troubleshooting of ETL workflows, improving process efficiency and system reliability.
▪ Applied software engineering best practices, performance optimizations, and design patterns to enhance data integrity and maintainability.
▪ Mentored colleagues on ETL architecture, coding standards, and workflow design, fostering a collaborative and knowledgeable team environment.
▪ Ensured data preparation and integration for core banking systems, including balances, account positions, and other critical financial operations.
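As a rough illustration of the consolidation bullets above, the following sketch extracts a slice from one RDBMS, validates and cleans it with Pandas, and loads it into PostgreSQL. It is a pattern sketch, not the bank's actual code; the DSNs, query, and table names are hypothetical.

import pandas as pd
from sqlalchemy import create_engine

# Both DSNs are hypothetical placeholders; real credentials would come from
# configuration or a secrets store, never from source code.
SOURCE_DSN = "oracle+cx_oracle://user:pass@source-db:1521/?service_name=ORCL"
TARGET_DSN = "postgresql+psycopg2://user:pass@target-db:5432/dwh"

def run_batch() -> None:
    source = create_engine(SOURCE_DSN)
    target = create_engine(TARGET_DSN)

    # Extract: pull the slice to be consolidated (query is illustrative).
    df = pd.read_sql("SELECT account_id, position, as_of FROM positions", source)

    # Validate and clean: drop rows missing the key, deduplicate on the grain.
    df = df.dropna(subset=["account_id"]).drop_duplicates(["account_id", "as_of"])

    # Load: append the cleaned slice into the consolidated warehouse table.
    df.to_sql("account_positions", target, if_exists="append", index=False)

if __name__ == "__main__":
    run_batch()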
ACHIEVEMENTS
▪ Developed an Airflow pipeline using AWS Athena to synchronize and align data between S3 and GCS in BigQuery. Conducted research and
benchmarking on performance and cost, enabling the removal of a dedicated cluster and reducing operational complexity.
▪ Replaced legacy Java-based senders with Kafka Connect pipelines that write directly from Kafka into BigQuery, simplifying data flows and
reducing maintenance overhead.
▪ Introduced HashiCorp Vault to securely manage secrets and credentials across pipelines, improving reliability and security of real-time data
processing.
▪ Implemented kSQL-based streaming transformations to generate real-time insights from Kafka topics.
▪ Developed a data validation process to ensure stability and accuracy of streaming pipelines, preventing errors and data inconsistencies.
▪ Configured monitoring and alerting for all critical processes, including synchronization pipelines and streaming jobs.
▪ Implemented automatic restart of streaming jobs via a Cloud Function that checks jobs on a schedule and resubmits failed ones, reducing reliance on manual intervention or alerts (a minimal sketch follows this list).
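A minimal sketch of the auto-restart mechanism described in the last bullet, assuming the streaming jobs run on Dataproc and carry an identifying label (both are assumptions; the resume does not name the runtime): a scheduled Cloud Function lists finished jobs and resubmits any that ended in error.

from google.cloud import dataproc_v1

PROJECT_ID = "example-project"  # hypothetical placeholder
REGION = "europe-west1"         # hypothetical placeholder

def restart_failed_jobs(request):
    """HTTP entry point, invoked on a schedule (e.g. by Cloud Scheduler)."""
    client = dataproc_v1.JobControllerClient(
        client_options={"api_endpoint": f"{REGION}-dataproc.googleapis.com:443"}
    )
    # Look only at finished jobs carrying an identifying label (label assumed).
    jobs = client.list_jobs(
        request={
            "project_id": PROJECT_ID,
            "region": REGION,
            "filter": "status.state = NON_ACTIVE AND labels.role = streaming",
        }
    )
    for job in jobs:
        if job.status.state == dataproc_v1.JobStatus.State.ERROR:
            # Rebuild a fresh job from the failed definition and resubmit;
            # Dataproc assigns a new job id automatically.
            new_job = dataproc_v1.Job(
                placement=job.placement,
                pyspark_job=job.pyspark_job,
                labels=dict(job.labels),
            )
            client.submit_job(
                request={"project_id": PROJECT_ID, "region": REGION, "job": new_job}
            )
    return "ok"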
EDUCATION
MASTER OF STATE MANAGEMENT, ACADEMY OF CUSTOMS SERVICE OF UKRAINE, DNIPRO | 2014
MASTER OF ECONOMICS, DNIPRO STATE FINANCIAL ACADEMY, DNIPRO | 2012
SKILLS
Technologies: Apache Airflow, PySpark, Apache Kafka (Kafka Connect, kSQL, Flink), Google Pub/Sub, BigQuery, Dataproc, GCS, AWS S3, Docker, GitLab CI/CD, Ansible, PostgreSQL, Oracle, Sybase ASE, Kubernetes, Apache Hadoop, Pandas, Vault, Graylog, Datadog, Prometheus
Languages: Python, SQL, Bash