Data Engineer | Building end-to-end ELT pipelines | Python • SQL • BigQuery • dbt • Apache Airflow • Docker
🎯 I build end-to-end ELT pipelines, design analytics data warehouses, and optimize data transformations using Apache Airflow, dbt, and BigQuery. I bring a strong analytics background (Power BI, SQL) and keep deepening my data engineering skills through hands-on projects and the DataTalks.Club Data Engineering Zoomcamp.
I'm a Data Engineer with a strong analytics background, building real-world data infrastructure projects. My focus is on end-to-end ELT pipelines, dbt transformations, and data warehouses (BigQuery in the cloud, PostgreSQL locally). I believe in writing clean, tested, and documented code, and I practice this in every project.
Current Location: Utrecht, Netherlands
Learning: DataTalks.Club Data Engineering Zoomcamp (in progress)
- Languages: Python 3.11, SQL (BigQuery, PostgreSQL)
- Transformation: dbt (dbt-BigQuery), SQL for analytics
- Orchestration: Apache Airflow 2.9+
- Cloud Warehousing: Google BigQuery, PostgreSQL
- Storage: Parquet, Google Cloud Storage (GCS)
- Containerization: Docker, Docker Compose
- Parquet, CSV, JSON
- REST APIs (Python Requests)
- Open-Meteo API (weather data enrichment)
- Git & GitHub (version control, CI/CD)
- Linux / Bash scripting
- Jupyter Notebooks (EDA & prototyping)
Technologies: Airflow 2.9 | dbt | BigQuery | PostgreSQL | Python | Docker
A complete ELT analytics pipeline for U.S. domestic flight delays using real data from BTS (Bureau of Transportation Statistics).
What I Built:
- ✅ Python ingestion scripts (Parquet generation, API enrichment, data cleaning)
- ✅ PostgreSQL staging (local data landing zone)
- ✅ BigQuery transformation (dbt: 5 staging + 5 mart models)
- ✅ Apache Airflow orchestration (daily @ 9 AM UTC)
- ✅ Data quality layer (15+ dbt tests: unique, not_null, accepted_range, composite keys)
- ✅ BI dashboard with 4 analytical views
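To illustrate what the data quality layer checks, here is a minimal sketch in plain Python rather than dbt, so it runs without a warehouse. It is not the project's actual test code: the sample rows, column names, and delay bounds are hypothetical, and in the real pipeline these checks would be declared as dbt tests on the BigQuery models.

```python
from collections import Counter

# Hypothetical sample rows; column names are illustrative only.
flights = [
    {"flight_id": 1, "carrier": "AA", "flight_date": "2024-01-01", "flight_num": 100, "dep_delay": 12},
    {"flight_id": 2, "carrier": "AA", "flight_date": "2024-01-01", "flight_num": 100, "dep_delay": 5},
    {"flight_id": 3, "carrier": "DL", "flight_date": "2024-01-01", "flight_num": 200, "dep_delay": None},
    {"flight_id": 4, "carrier": "UA", "flight_date": "2024-01-02", "flight_num": 300, "dep_delay": 5000},
]


def not_null(rows, column):
    """Return the rows where the column is missing or None."""
    return [r for r in rows if r.get(column) is None]


def unique(rows, columns):
    """Return key tuples that occur more than once (single or composite key)."""
    counts = Counter(tuple(r.get(c) for c in columns) for r in rows)
    return [key for key, n in counts.items() if n > 1]


def accepted_range(rows, column, min_value, max_value):
    """Return rows whose non-null value lies outside [min_value, max_value]."""
    return [
        r for r in rows
        if r.get(column) is not None and not (min_value <= r[column] <= max_value)
    ]


# Each check returns the failing records; an empty list means the test passes.
failures = {
    "unique_flight_id": unique(flights, ["flight_id"]),
    "unique_carrier_date_num": unique(flights, ["carrier", "flight_date", "flight_num"]),
    "not_null_dep_delay": not_null(flights, "dep_delay"),
    # The bounds below are assumed for illustration, not taken from the project.
    "accepted_range_dep_delay": accepted_range(flights, "dep_delay", -120, 1500),
}

for name, failed in failures.items():
    print(f"{name}: {'PASS' if not failed else 'FAIL'} ({len(failed)} failing)")
```

Returning the failing records instead of a boolean mirrors how a dbt test works: the test query selects the offending rows, and zero rows means success.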
Key Metrics:
- Dataset size: 2-3 GB/month, 20M+ flight records analyzed
- Pipeline runtime: ~15 minutes
- Query optimization: Leveraged BigQuery partitioning & clustering
Key Findings:
- Identified airport-level inefficiencies as primary delay driver (stronger than weather/holidays)
- Applied BigQuery partitioning & clustering to reduce query scan costs
- Weather moderately affects delays; holiday periods show stable scheduling
🔗 View Repository
Technologies: BigQuery | GCS | SQL
Hands-on project comparing 3 table strategies for NYC Yellow Taxi data (20M+ trips, Jan–Jun 2024).
What I Learned:
- External tables vs. regular tables (storage vs. query trade-offs)
- Columnar storage impact on query costs
- Partitioning effectiveness: roughly 12x less data scanned (and billed) on filtered queries
- When to use clustering with partitions
Results:
- Same query on non-partitioned table: 310 MB scanned
- Same query on partitioned table: 26 MB scanned
- Practical demonstration of BigQuery optimization
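The arithmetic behind the partitioning result can be sketched as follows. The two scan sizes are the figures reported above. Everything else is an assumption for illustration: the price per TiB is a placeholder parameter (check current BigQuery on-demand pricing), "MB" is treated as 10^6 bytes, and the helper names are hypothetical.

```python
TIB = 1024 ** 4   # bytes in one tebibyte
MB = 10 ** 6      # assumption: the reported "MB" figures are treated as 10^6 bytes


def scan_reduction(bytes_before, bytes_after):
    """How many times fewer bytes the optimized query scans."""
    return bytes_before / bytes_after


def on_demand_cost(bytes_scanned, usd_per_tib):
    """Cost under a simple bytes-scanned pricing model (ignores minimums and free tier)."""
    return bytes_scanned / TIB * usd_per_tib


non_partitioned = 310 * MB   # reported scan on the non-partitioned table
partitioned = 26 * MB        # reported scan on the partitioned table

ratio = scan_reduction(non_partitioned, partitioned)
print(f"Scan reduction: {ratio:.1f}x")

# Placeholder price, not an authoritative figure.
ASSUMED_USD_PER_TIB = 6.25
cost_ratio = (
    on_demand_cost(non_partitioned, ASSUMED_USD_PER_TIB)
    / on_demand_cost(partitioned, ASSUMED_USD_PER_TIB)
)
print(f"Cost reduction under linear pricing: {cost_ratio:.1f}x")
```

Because this pricing model is linear in bytes scanned, the cost ratio equals the scan ratio regardless of the assumed price, which is why the 310 MB vs. 26 MB comparison translates directly into the roughly 12x figure.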
🔗 View Repository
Technologies: Python | SQL | dbt | Docker | Pandas
Step-by-step solutions to the data-engineering-practice exercises (Daniel Beach's course).
Coverage:
- ✅ File I/O & data ingestion (Python)
- ✅ Web scraping & API integration
- ✅ AWS S3 & cloud storage basics
- ✅ JSON/CSV transformations
- ✅ SQL joins & aggregations
- ✅ Data modeling for Postgres
- ✅ PySpark fundamentals
- ✅ DuckDB for analytics
- ✅ Polars lazy computation
- ✅ Data quality with Great Expectations
🔗 View Repository
- ELT/ETL Pipelines: From raw data to production-ready analytics
- dbt Best Practices: Staging layers, mart models, comprehensive testing
- SQL Performance: Query optimization, BigQuery partitioning/clustering
- Data Quality: dbt tests, schema validation, anomaly detection
- Orchestration: Airflow DAG design, error handling, monitoring
- Docker: Local reproducibility, image optimization
- Code Quality: Clean SQL/Python, documentation, version control
Currently working through DataTalks.Club Data Engineering Zoomcamp modules:
- ✅ Module 1: Docker & GCP setup
- ✅ Module 2: dbt fundamentals
- ✅ Module 3: BigQuery & data warehousing
- 🔄 Module 4: Orchestration with Airflow (in progress)
- 📚 Currently going deeper into PySpark, Databricks (Delta Lake), and CI/CD for data pipelines
| Skill | Proficiency | Evidence |
|---|---|---|
| Python | Intermediate+ | Ingestion scripts, API integration, data transformation |
| SQL | Advanced | Complex joins, window functions, query optimization |
| dbt | Intermediate+ | 10+ production models, comprehensive tests, staging/mart patterns |
| BigQuery | Intermediate+ | External/regular/partitioned tables, cost optimization, real data at scale |
| Apache Airflow | Intermediate | DAG design, error handling, email alerts, daily orchestration |
| Docker | Intermediate | Docker Compose, multi-container setups, local dev environments |
| PostgreSQL | Intermediate | Schema design, indexing, data loading pipelines |
| Git | Intermediate | Version control, branching, commit hygiene |
Best for: Data engineering interviews, portfolio reviews, hiring managers
📌 Pinned Repositories:
- end-to-end-flight-delay-pipeline: End-to-end ELT pipeline (Airflow + dbt + BigQuery)
- bigquery-taxi-data-warehouse: Data warehouse design & optimization
- gz-dbt-repository: dbt project examples
- sql-financial-analytics-pipeline: Advanced SQL transformations
- ✅ Real-world projects: Not toy datasets. Real data, real pipelines, real trade-offs.
- ✅ Data quality-first: Every pipeline includes comprehensive testing.
- ✅ Clear documentation: READMEs that explain the "why," not just the "what."
- ✅ Optimization mindset: BigQuery cost reduction, SQL efficiency, Airflow reliability.
- ✅ Learning in public: Active in DataTalks.Club community, documenting learnings.
💼 LinkedIn: kenan-tufan-k-263000308
📬 Email: kenantkurt@gmail.com
🐙 GitHub: @Kenantkurt
📍 Location: Utrecht, Netherlands
- 💼 Data Engineering roles: Junior to mid-level
- 🤝 Collaborations: Open-source data projects, learning groups
- 💬 Discussions: Data pipelines, dbt best practices, SQL optimization
Let's talk data! Feel free to reach out on LinkedIn or email. I'm always eager to discuss data infrastructure, ask questions, and collaborate on interesting problems.
Last Updated: May 2, 2026
Status: Actively learning & building 🚀
