
VOLODYMYR MEDIANYI
Data Engineer
[contact info hidden] | LinkedIn: vladimir-medyanyiy-474152223 | [contact info hidden] | Dnipro, Ukraine

ABOUT ME

Python Data Engineer with over 6 years of experience designing and maintaining ETL pipelines, real-time data streams, and workflow automation.
Skilled in Apache Airflow, PySpark, Kafka, BigQuery, and cloud-based data solutions. Passionate about optimizing data processes, ensuring data
quality, and delivering scalable and reliable systems. Enjoy tackling complex challenges and driving improvements through collaboration and
innovative solutions.

WORK EXPERIENCE

Python Data Engineer | Murka Games | 2021 - Present

▪ Designed and implemented robust real-time data pipelines using Python and Apache Airflow to process high-volume streams (an illustrative sketch follows this list).
▪ Integrated and processed data streams with Apache Kafka (Kafka Connect, kSQL, Flink) and Google Pub/Sub.
▪ Built scalable ETL pipelines with PySpark and Dataproc, delivering data into BigQuery as the primary analytical database.
▪ Developed and maintained data schemas to normalize incoming streaming data and ensure compatibility with existing database structures for
analytics and reporting.
▪ Implemented data quality checks, monitoring, and logging with Google Cloud Monitoring.
▪ Optimized data retrieval and processing in BigQuery, ensuring cost efficiency and performance.
▪ Leveraged cloud storage in GCS and AWS S3 for data ingestion and delivery.
▪ Automated deployment and operations using GitLab CI/CD, Ansible Playbooks, and Docker for containerization of jobs and services.
▪ Wrote and maintained unit and integration tests to ensure code reliability (pytest, mocks).
▪ Developed and maintained integrations with partner APIs, retrieving and processing cost/traffic data from external systems.
▪ Documented data engineering workflows and collaborated with cross-functional teams to ensure reliable data delivery.
▪ Continuously refined data engineering practices by adopting new technologies and best practices.
▪ Monitored and troubleshot data pipelines to ensure data integrity and availability.
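
A purely illustrative sketch of such a pipeline, assuming a daily Airflow DAG that loads a batch into BigQuery; the DAG id, task, and load step are hypothetical, not taken from the resume:

```python
# Hypothetical sketch: a daily Airflow DAG that loads one batch of events
# into BigQuery. All names and the load logic are illustrative placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def load_events_to_bigquery(**context):
    # Placeholder for the real work: pull the day's batch from the stream
    # buffer and write it to a BigQuery table (e.g. via google-cloud-bigquery).
    pass


with DAG(
    dag_id="events_to_bigquery",   # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",             # Airflow >= 2.4; older versions use schedule_interval
    catchup=False,
) as dag:
    PythonOperator(
        task_id="load_events",
        python_callable=load_events_to_bigquery,
    )
```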

Python Backend Developer | PrivatBank | 2017 - 2021

▪ Engineered and managed ETL pipelines using Python, Apache Airflow, PySpark, and Pandas for critical banking data operations (see the sketch after this list).
▪ Consolidated, validated, and cleaned data from diverse sources, including RDBMS (Sybase ASE, PostgreSQL, Oracle), files, APIs, and web services.
▪ Developed stored procedures and Python scripts to support both batch and real-time data processing workflows.
▪ Led database migration projects, rewriting procedures and data handlers from Sybase ASE to PostgreSQL/Oracle.
▪ Automated scheduling, monitoring, and troubleshooting of ETL workflows, improving process efficiency and system reliability.
▪ Applied software engineering best practices, performance optimizations, and design patterns to enhance data integrity and maintainability.
▪ Mentored colleagues on ETL architecture, coding standards, and workflow design, fostering a collaborative and knowledgeable team environment.
▪ Ensured data preparation and integration for core banking systems, including balances, account positions, and other critical financial operations.
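
As a purely illustrative sketch of one such consolidation step, assuming a PySpark batch job reading a PostgreSQL table over JDBC; the connection details, table, and cleaning rules are hypothetical:

```python
# Hypothetical sketch: read a table over JDBC, drop duplicate accounts and
# null balances, and persist the cleaned result. All names are placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("consolidate_positions").getOrCreate()

raw = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/bank")  # placeholder URL
    .option("dbtable", "public.account_positions")         # placeholder table
    .option("user", "etl_user")
    .option("password", "change-me")
    .load()
)

clean = (
    raw.dropDuplicates(["account_id"])      # one row per account
    .filter(F.col("balance").isNotNull())   # drop incomplete records
)
clean.write.mode("overwrite").parquet("/data/clean/account_positions")
```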

ACHIEVEMENTS

▪ Developed an Airflow pipeline using AWS Athena to synchronize and align data between S3 and GCS in BigQuery. Conducted research and
benchmarking on performance and cost, enabling the removal of a dedicated cluster and reducing operational complexity.
▪ Replaced legacy Java-based senders with Kafka Connect pipelines that write directly from Kafka into BigQuery, simplifying data flows and
reducing maintenance overhead.
▪ Introduced HashiCorp Vault to securely manage secrets and credentials across pipelines, improving reliability and security of real-time data
processing.
▪ Implemented kSQL-based streaming transformations to generate real-time insights from Kafka topics.
▪ Developed a data validation process to ensure stability and accuracy of streaming pipelines, preventing errors and data inconsistencies.
▪ Configured monitoring and alerting for all critical processes, including synchronization pipelines and streaming jobs.
▪ Implemented automatic restarts of streaming jobs via a Cloud Function that checks the jobs on a schedule and resubmits any that fail, reducing reliance on manual intervention or alerts (a hypothetical sketch follows this list).
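
A minimal sketch of how such a scheduled watchdog might look, assuming an HTTP-triggered Cloud Function invoked by Cloud Scheduler; the job names and the check/resubmit helpers are hypothetical placeholders, as the resume does not specify the mechanism:

```python
# Hypothetical watchdog sketch: a Cloud Function, triggered on a schedule,
# checks streaming jobs and resubmits any that are no longer healthy.
import functions_framework

MONITORED_JOBS = ["events_stream", "purchases_stream"]  # illustrative names


def is_job_healthy(name: str) -> bool:
    # Placeholder: in practice, query the job runner (Dataproc, Flink, ...).
    return True


def resubmit_job(name: str) -> None:
    # Placeholder: resubmit the streaming job to the cluster.
    pass


@functions_framework.http
def restart_failed_jobs(request):
    restarted = []
    for name in MONITORED_JOBS:
        if not is_job_healthy(name):
            resubmit_job(name)
            restarted.append(name)
    return {"restarted": restarted}
```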

EDUCATION

MASTER OF STATE MANAGEMENT, ACADEMY OF CUSTOMS SERVICE OF UKRAINE, DNIPRO | 2014

MASTER OF ECONOMICS, DNIPRO STATE FINANCIAL ACADEMY, DNIPRO | 2012

SKILLS

Technologies: Apache Airflow, PySpark, Apache Kafka (Kafka Connect, kSQL, Flink), Google Pub/Sub, BigQuery, Dataproc, GCS, AWS S3, Docker, GitLab CI/CD, Ansible, PostgreSQL, Oracle, Sybase ASE, Kubernetes, Apache Hadoop, Pandas, Vault, Graylog, Datadog, Prometheus

Languages: Python, SQL, Bash
