Kenantkurt/README.md

👋 Hi, I'm Kenan Kurt

Data Engineer | Building end-to-end ELT pipelines | Python • SQL • BigQuery • dbt • Apache Airflow • Docker

🎯 I build end-to-end ELT pipelines, design analytics data warehouses, and optimize data transformations using Apache Airflow, dbt, and BigQuery. I bring a strong analytics background (Power BI, SQL) and keep deepening my data engineering skills through hands-on projects and the DataTalks.Club Data Engineering Zoomcamp.


🎯 About Me

I'm a Data Engineer with a strong analytics background, building real-world data infrastructure projects. My focus is on end-to-end ELT pipelines, dbt transformations, and cloud data warehouses (BigQuery, PostgreSQL). I believe in writing clean, tested, and documented code, and I practice this in every project.

Current Location: Utrecht, Netherlands
Learning: DataTalks.Club Data Engineering Zoomcamp (in progress)


🛠️ Tech Stack

Core Data Engineering

  • Languages: Python 3.11, SQL (BigQuery, PostgreSQL)
  • Transformation: dbt (dbt-bigquery adapter), SQL for analytics
  • Orchestration: Apache Airflow 2.9+
  • Cloud Warehousing: Google BigQuery, PostgreSQL
  • Storage: Parquet, Google Cloud Storage (GCS)
  • Containerization: Docker, Docker Compose

Data Formats & APIs

  • Parquet, CSV, JSON
  • REST APIs (Python Requests)
  • Open-Meteo API (weather data enrichment)
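As an illustration of the weather enrichment step, here is a minimal sketch that builds a request URL for Open-Meteo's historical archive endpoint using only the standard library. Which endpoint and variables the projects actually use is not stated here, so the endpoint choice, the `build_weather_url` helper, the coordinates, and the selected daily variables are all assumptions.

```python
from urllib.parse import urlencode

# Assumption: the historical "archive" endpoint; a project may use a different one.
ARCHIVE_URL = "https://archive-api.open-meteo.com/v1/archive"


def build_weather_url(latitude, longitude, start_date, end_date, daily_vars):
    """Return a request URL for daily weather at one location and date range."""
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "start_date": start_date,  # ISO dates, e.g. "2024-01-01"
        "end_date": end_date,
        "daily": ",".join(daily_vars),  # comma-separated variable names
        "timezone": "GMT",
    }
    return f"{ARCHIVE_URL}?{urlencode(params)}"


# Hypothetical example: coordinates roughly matching a large U.S. airport.
url = build_weather_url(
    33.64, -84.43, "2024-01-01", "2024-01-31",
    ["temperature_2m_max", "precipitation_sum"],
)
print(url)
```

The resulting URL can then be passed to `requests.get(url, timeout=30)` and the JSON response joined to flight records by date and location.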

Other Tools

  • Git & GitHub (version control, CI/CD)
  • Linux / Bash scripting
  • Jupyter Notebooks (EDA & prototyping)

🚀 Highlighted Projects

1. End-to-End Flight Delay Pipeline ⭐

Technologies: Airflow 2.9 | dbt | BigQuery | PostgreSQL | Python | Docker

A complete ELT analytics pipeline for U.S. domestic flight delays using real data from BTS (Bureau of Transportation Statistics).

What I Built:

  • ✅ Python ingestion scripts (Parquet generation, API enrichment, data cleaning)
  • ✅ PostgreSQL staging (local data landing zone)
  • ✅ BigQuery transformation (dbt: 5 staging + 5 mart models)
  • ✅ Apache Airflow orchestration (daily at 09:00 UTC)
  • ✅ Data quality layer (15+ dbt tests: unique, not_null, accepted_range, composite keys)
  • ✅ BI dashboard with 4 analytical views
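The stages above form a dependency graph that an orchestrator resolves into an execution order. The sketch below shows that idea with the standard library's `graphlib` rather than Airflow itself. The task names and the exact dependencies are hypothetical and are not taken from the repository's DAG.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each key maps to the set of tasks it depends on.
dependencies = {
    "load_postgres_staging": {"ingest_flights", "fetch_weather"},
    "load_bigquery": {"load_postgres_staging"},
    "dbt_run": {"load_bigquery"},
    "dbt_test": {"dbt_run"},
}


def execution_order(deps):
    """Return one valid order in which every task runs after its upstream tasks."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(" -> ".join(order))
```

The two ingestion tasks have no dependency on each other, so either may come first. Everything downstream is strictly ordered, which is the same guarantee an Airflow DAG gives when tasks are chained.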

Key Metrics:

  • Dataset size: 2-3 GB/month, 20M+ flight records analyzed
  • Pipeline runtime: ~15 minutes
  • Query optimization: Leveraged BigQuery partitioning & clustering

Key Findings:

  • Identified airport-level inefficiencies as primary delay driver (stronger than weather/holidays)
  • Applied BigQuery partitioning & clustering to reduce query scan costs
  • Weather moderately affects delays; holiday periods show stable scheduling

👉 View Repository


2. BigQuery Taxi Data Warehouse

Technologies: BigQuery | GCS | SQL

Hands-on project comparing 3 table strategies for NYC Yellow Taxi data (20M+ trips, Jan–Jun 2024).

What I Learned:

  • External tables vs. regular tables (storage vs. query trade-offs)
  • Columnar storage impact on query costs
  • Partitioning effectiveness: about 12x less data scanned on filtered queries (310 MB vs. 26 MB)
  • When to use clustering with partitions
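To make the partitioning point concrete, here is a toy model of partition pruning in plain Python. It counts rows touched rather than bytes billed, so it only approximates how BigQuery behaves. The row layout and the `pickup_date` and `fare` field names are assumptions for illustration.

```python
from collections import defaultdict
from datetime import date


def build_partitions(rows):
    """Group rows by pickup date, mimicking a date-partitioned table."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row["pickup_date"]].append(row)
    return partitions


def scan_full(rows, day):
    """Unpartitioned table: every row is read to evaluate the filter."""
    matches = [r for r in rows if r["pickup_date"] == day]
    return matches, len(rows)


def scan_partitioned(partitions, day):
    """Partitioned table: only the matching partition is read."""
    matches = partitions.get(day, [])
    return matches, len(matches)


# Synthetic data: 10 days with 100 trips each.
rows = [
    {"pickup_date": date(2024, 1, d), "fare": 10.0 + i}
    for d in range(1, 11)
    for i in range(100)
]
partitions = build_partitions(rows)
target = date(2024, 1, 5)

full_matches, full_scanned = scan_full(rows, target)
part_matches, part_scanned = scan_partitioned(partitions, target)
print(full_scanned, part_scanned)  # 1000 100
```

Both scans return the same rows, but the partitioned scan reads a tenth of the data because the filter column is also the partition key. A filter on any other column would not benefit in this way.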

Results:

  • Same query on non-partitioned table: 310 MB scanned
  • Same query on partitioned table: 26 MB scanned
  • Practical demonstration of BigQuery optimization
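The reduction factor follows directly from the two scan sizes reported above. A short worked calculation (the `scan_reduction` helper name is illustrative only):

```python
def scan_reduction(unpartitioned_mb, partitioned_mb):
    """Return (reduction factor, percent of scanned data saved)."""
    factor = unpartitioned_mb / partitioned_mb
    percent_saved = (1 - partitioned_mb / unpartitioned_mb) * 100
    return factor, percent_saved


factor, saved = scan_reduction(310, 26)
print(f"{factor:.1f}x less data scanned, {saved:.1f}% saved")
# 11.9x less data scanned, 91.6% saved
```

Under on-demand pricing, where queries are billed by bytes scanned, the cost for this query would fall by roughly the same factor.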

👉 View Repository


3. Data Engineering Practice Solutions

Technologies: Python | SQL | dbt | Docker | Pandas

Step-by-step solutions to the data-engineering-practice exercises (Daniel Beach's course).

Coverage:

  • ✅ File I/O & data ingestion (Python)
  • ✅ Web scraping & API integration
  • ✅ AWS S3 & cloud storage basics
  • ✅ JSON/CSV transformations
  • ✅ SQL joins & aggregations
  • ✅ Data modeling for Postgres
  • ✅ PySpark fundamentals
  • ✅ DuckDB for analytics
  • ✅ Polars lazy computation
  • ✅ Data quality with Great Expectations

👉 View Repository


📊 What I Focus On

  • ELT/ETL Pipelines: From raw data to production-ready analytics
  • dbt Best Practices: Staging layers, mart models, comprehensive testing
  • SQL Performance: Query optimization, BigQuery partitioning/clustering
  • Data Quality: dbt tests, schema validation, anomaly detection
  • Orchestration: Airflow DAG design, error handling, monitoring
  • Docker: Local reproducibility, image optimization
  • Code Quality: Clean SQL/Python, documentation, version control
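As a rough illustration of the data quality checks mentioned above, the sketch below re-implements three common test types in plain Python. Like dbt tests, each check returns the failing rows, so an empty result means the test passes. These are analogues written for this example, not the dbt tests from the projects, and the column names and range bounds are assumptions.

```python
def not_null(rows, column):
    """Rows where the column is missing or None."""
    return [r for r in rows if r.get(column) is None]


def unique(rows, columns):
    """Rows whose (possibly composite) key was already seen."""
    seen, duplicates = set(), []
    for r in rows:
        key = tuple(r.get(c) for c in columns)
        if key in seen:
            duplicates.append(r)
        seen.add(key)
    return duplicates


def accepted_range(rows, column, min_value, max_value):
    """Rows with a non-null value outside [min_value, max_value]."""
    return [
        r for r in rows
        if r.get(column) is not None
        and not (min_value <= r[column] <= max_value)
    ]


# Hypothetical sample with one violation of each kind.
flights = [
    {"flight_date": "2024-01-01", "carrier": "AA", "flight_num": 100, "dep_delay": 12},
    {"flight_date": "2024-01-01", "carrier": "AA", "flight_num": 100, "dep_delay": 15},
    {"flight_date": "2024-01-01", "carrier": "DL", "flight_num": 200, "dep_delay": None},
    {"flight_date": "2024-01-02", "carrier": "UA", "flight_num": 300, "dep_delay": 99999},
]

failures = {
    "not_null": not_null(flights, "dep_delay"),
    "unique_key": unique(flights, ["flight_date", "carrier", "flight_num"]),
    "accepted_range": accepted_range(flights, "dep_delay", -120, 3000),
}
for name, bad_rows in failures.items():
    print(name, len(bad_rows))
```

Passing a list of columns to `unique` covers the composite-key case: uniqueness is checked on the combination of values, not on each column separately.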

📚 Learning Path

Currently working through DataTalks.Club Data Engineering Zoomcamp modules:

  • ✅ Module 1: Docker & GCP setup
  • ✅ Module 2: dbt fundamentals
  • ✅ Module 3: BigQuery & data warehousing
  • 🔄 Module 4: Orchestration with Airflow (in progress)
  • 🔄 Currently going deeper into PySpark, Databricks (Delta Lake), and CI/CD for data pipelines

🎓 Skills Summary

| Skill | Proficiency | Evidence |
|---|---|---|
| Python | Intermediate+ | Ingestion scripts, API integration, data transformation |
| SQL | Advanced | Complex joins, window functions, query optimization |
| dbt | Intermediate+ | 10+ production models, comprehensive tests, staging/mart patterns |
| BigQuery | Intermediate+ | External/regular/partitioned tables, cost optimization, real data at scale |
| Apache Airflow | Intermediate | DAG design, error handling, email alerts, daily orchestration |
| Docker | Intermediate | Docker Compose, multi-container setups, local dev environments |
| PostgreSQL | Intermediate | Schema design, indexing, data loading pipelines |
| Git | Intermediate | Version control, branching, commit hygiene |

🔗 Featured Work

Best for: Data engineering interviews, portfolio reviews, hiring managers

📌 Pinned Repositories:

  1. end-to-end-flight-delay-pipeline: End-to-end ELT pipeline (Airflow + dbt + BigQuery)
  2. bigquery-taxi-data-warehouse: Data warehouse design & optimization
  3. gz-dbt-repository: dbt project examples
  4. sql-financial-analytics-pipeline: Advanced SQL transformations

💡 What Makes My Work Stand Out

✅ Real-world projects: Not toy datasets. Real data, real pipelines, real trade-offs.
✅ Data quality-first: Every pipeline includes comprehensive testing.
✅ Clear documentation: READMEs that explain the "why," not just the "what."
✅ Optimization mindset: BigQuery cost reduction, SQL efficiency, Airflow reliability.
✅ Learning in public: Active in DataTalks.Club community, documenting learnings.


📧 Get In Touch

💼 LinkedIn: kenan-tufan-k-263000308
📬 Email: kenantkurt@gmail.com
🐙 GitHub: @Kenantkurt
📍 Location: Utrecht, Netherlands


🎯 Open To

  • 💼 Data Engineering roles: Junior to mid-level
  • 🤝 Collaborations: Open-source data projects, learning groups
  • 💬 Discussions: Data pipelines, dbt best practices, SQL optimization

Let's talk data! Feel free to reach out on LinkedIn or email. I'm always eager to discuss data infrastructure, ask questions, and collaborate on interesting problems.


Last Updated: May 2, 2026
Status: Actively learning & building 🚀

Pinned

  1. end-to-end-flight-delay-pipeline (Public)

    End-to-end flight delay data pipeline using Airflow, dbt, and BigQuery

    Jupyter Notebook

  2. gz-dbt-repository (Public)

  3. sql-financial-analytics-pipeline (Public)

    End-to-end SQL analytics pipeline built in BigQuery to join sales, product, and shipping data, compute financial KPIs, and analyze monthly profitability.

  4. airbnb-analytics (Public)

    Airbnb demand and occupancy analysis using SQL and Looker Studio

  5. bigquery-taxi-data-warehouse (Public)

    Hands-on GCS and BigQuery project using NYC Yellow Taxi data with external tables, regular tables, partitioning, and clustering.

  6. data-engineering-practice-solutions (Public)

    My step-by-step solutions and notes for data engineering practice exercises.

    Python