Hi, I'm Oleksandr Tkachenko

Data Engineer

Results-driven Data Engineer with 4+ years of specialized experience building scalable data infrastructure and robust ETL pipelines. Proven expertise in Python, SQL, Apache Airflow, and major cloud platforms, with a track record of improving data processing efficiency by 15% and ensuring 99.5%+ data accuracy on enterprise and U.S.-based projects.

About Me

I'm a results-driven Data Engineer with 4+ years of specialized experience in building scalable data infrastructure, robust ETL pipelines, and optimized cloud data warehouses. My journey began with a BSc in Data Science and Engineering from Kaunas University of Technology (KTU) in Lithuania, where I focused on data engineering solutions and parallel computing.

Throughout my career, I've worked on enterprise projects from Lithuania to Chicago, IL, successfully architecting systems that improved data processing efficiency by 15% and ensured over 99.5% data accuracy. I combine technical expertise in Python, SQL, Apache Airflow, and cloud platforms (AWS, GCP, Azure) with strong DevOps practices using Docker, Kubernetes, and Terraform.

My experience ranges from building real-time streaming capabilities with Apache Kafka to migrating legacy Hadoop clusters to cloud-native solutions. I've also led freelance projects, delivering full-stack data solutions that eliminated manual processes and gave clients self-service analytics capabilities.

Scalable Data Pipelines
Cloud Data Architecture
Real-time Data Processing

Technical Skills

Programming

Python, SQL, Scala, Java, Bash

Pipeline & Orchestration

Apache Airflow, Apache Spark, PySpark, Apache Kafka, ETL/ELT Design, Hadoop HDFS

Data Warehousing

Snowflake, Amazon Redshift, Google BigQuery, Star Schema, Dimensional Modeling

Cloud Platforms & Infrastructure as Code

AWS, Google Cloud, Azure, Terraform, CloudFormation

Databases

PostgreSQL, MySQL, SQL Server, MongoDB

DevOps & CI/CD

Docker, Kubernetes, Jenkins, GitLab CI, GitHub Actions, ArgoCD

Analytics & Monitoring

Pandas, NumPy, Tableau, Power BI, Prometheus, Grafana

Featured Projects

E-Commerce Analytics Pipeline

Architected a full-stack data solution automating financial reporting and inventory management. Built an end-to-end ETL pipeline that extracts daily sales from the Shopify and PayPal APIs, replacing 20+ hours of manual Excel work per month with report-ready data delivered automatically by 9 AM each day.

Python, Pandas, PostgreSQL, Shopify API, Power BI
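The extract, transform, and load steps described for this project could be sketched roughly as follows. This is an illustrative sketch only, not the project's actual code: the sample records, the `daily_sales` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for PostgreSQL so the example runs without any external services.

```python
import sqlite3
from collections import defaultdict
from decimal import Decimal

# Hypothetical sample records standing in for Shopify/PayPal API responses.
RAW_ORDERS = [
    {"source": "shopify", "date": "2024-03-01", "amount": "19.99"},
    {"source": "shopify", "date": "2024-03-01", "amount": "5.01"},
    {"source": "paypal", "date": "2024-03-01", "amount": "40.00"},
    {"source": "paypal", "date": "2024-03-02", "amount": "12.50"},
]


def transform(orders):
    """Aggregate order amounts into one total per (date, source)."""
    totals = defaultdict(Decimal)  # Decimal avoids float rounding on money
    for order in orders:
        totals[(order["date"], order["source"])] += Decimal(order["amount"])
    return sorted((d, s, str(t)) for (d, s), t in totals.items())


def load(conn, rows):
    """Upsert daily totals so that re-running a day does not duplicate rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_sales ("
        "sale_date TEXT, source TEXT, total TEXT, "
        "PRIMARY KEY (sale_date, source))"
    )
    conn.executemany("INSERT OR REPLACE INTO daily_sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(conn, orders):
    """Run transform and load for one batch of extracted orders."""
    rows = transform(orders)
    load(conn, rows)
    return rows
```

Keying the table on `(sale_date, source)` keeps the load idempotent, which matters for a job that has to produce the same report-ready data every morning even if a run is retried.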

Legacy Database Modernization

Led the migration of 50+ GB of operational data from a legacy SQL Server to PostgreSQL with zero downtime. Optimized customer analytics dashboard performance, reducing report generation time from 2 minutes to under 15 seconds through query refactoring.

PostgreSQL, SQL Server, Python, Data Migration
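One common way to gain confidence in a migration like this is to reconcile source and target tables after each load. The sketch below is an assumption about how such a check might look, not a description of the project's actual validation: `table_fingerprint` and `tables_match` are hypothetical helpers that compare a row count and an order-independent checksum over rows fetched from each database.

```python
import hashlib

MASK_64 = (1 << 64) - 1  # keep the running checksum within 64 bits


def table_fingerprint(rows):
    """Return (row_count, checksum) for an iterable of row tuples.

    Per-row hashes are summed modulo 2**64, so the result does not
    depend on the order in which the rows are returned.
    """
    count = 0
    checksum = 0
    for row in rows:
        # Simplification: values are compared by their string form, so
        # both drivers must render a given value identically.
        encoded = "\x1f".join("<NULL>" if v is None else str(v) for v in row)
        digest = hashlib.sha256(encoded.encode("utf-8")).digest()
        checksum = (checksum + int.from_bytes(digest[:8], "big")) & MASK_64
        count += 1
    return count, checksum


def tables_match(source_rows, target_rows):
    """True when both row sets have the same count and checksum."""
    return table_fingerprint(source_rows) == table_fingerprint(target_rows)
```

In practice the two arguments would be cursors over the same table in each database. Type differences between drivers (for example decimals versus floats) would need normalizing before hashing.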

Scalable Data Infrastructure - Project Roseland

Engineered enterprise-scale data pipelines supporting analytics and BI for a Chicago-based project. Reduced query costs by 15% through optimization, implemented real-time streaming with Kafka, and achieved 99.5%+ data accuracy through automated quality frameworks.

Snowflake, Apache Kafka, Apache Airflow, Python, BigQuery

Multi-Client Data Integration Framework

Delivered reliable data infrastructure for 5+ clients across e-commerce, marketing, and professional services. Built a reusable Python framework for automated data quality checks, maintaining >99% data accuracy and proactively identifying discrepancies.

Python, ETL Framework, Data Quality, Multi-Client
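A reusable data quality framework of the kind described here can be built around small, named checks applied to every row. The following is a minimal sketch under that assumption. The `Check` class, `run_checks` function, and the two example rules are hypothetical and are not taken from the actual framework.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Check:
    """A named rule; the predicate returns True when a row passes."""
    name: str
    predicate: Callable[[dict], bool]


def run_checks(rows, checks):
    """Return, per check, the pass rate and the indexes of failing rows."""
    rows = list(rows)
    report = {}
    for check in checks:
        failed = [i for i, row in enumerate(rows) if not check.predicate(row)]
        passed = len(rows) - len(failed)
        report[check.name] = {
            # An empty batch is treated as passing.
            "pass_rate": passed / len(rows) if rows else 1.0,
            "failed_rows": failed,
        }
    return report


def amount_non_negative(row):
    """Pass when 'amount' is a number greater than or equal to zero."""
    amount = row.get("amount")
    return isinstance(amount, (int, float)) and amount >= 0


# Hypothetical rules; each client could supply its own list.
CHECKS = [
    Check("order_id_present", lambda r: r.get("order_id") is not None),
    Check("amount_non_negative", amount_non_negative),
]
```

Because each client only supplies a list of `Check` objects, the same runner can be reused across datasets, and the per-check pass rate can be compared against an accuracy threshold such as 99%.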

Professional Experience

Data Engineer

Project Roseland, Chicago, IL 2022 - 2025
  • Engineered scalable data pipelines supporting analytics and BI for a U.S.-based project
  • Optimized data warehouse performance, reducing query costs by 15%
  • Implemented automated quality frameworks achieving 99.5%+ data accuracy
  • Built real-time streaming with Apache Kafka for high-velocity transactional data
  • Led migration from on-premises Hadoop to cloud-native solutions

Data Engineer

Cogniteq, Vilnius, Lithuania 2020 - 2022
  • Built and optimized ETL pipelines for enterprise clients
  • Deployed change data capture (CDC) reducing reporting latency from 24 hours to 30 minutes
  • Developed reusable Python libraries accelerating pipeline development by 25%
  • Optimized Apache Spark jobs improving processing speeds by 35%

Software Developer (DevOps Focus)

Emerline, Vilnius, Lithuania 2018 - 2020
  • Developed web applications with a focus on DevOps practices
  • Containerized services with Docker and orchestrated them with Kubernetes
  • Built CI/CD pipelines using Jenkins/GitLab CI
  • Managed AWS infrastructure with Terraform (IaC)

Education

BSc Data Science and Engineering

Kaunas University of Technology (KTU) 2014 - 2018
  • Specialized in data engineering solutions and parallel computing
  • Key modules: Programming for Data Processing, Distributed Databases, Machine Learning

Let's Connect

I'm always interested in discussing data engineering challenges, new opportunities, or potential collaborations. Feel free to reach out!