Hi, I'm Bita

Data Engineer focused on building reliable, scalable data systems

🔗 LinkedIn • 🌐 myHub


About Me

I’m a Data Engineer with 5+ years of experience working with data systems that are often fragmented, inconsistent, and difficult to trust.

My work focuses on bringing structure to those environments by designing pipelines that are reliable, observable, and maintainable. I’ve worked across healthcare, retail, and enterprise systems where data is business-critical and small inconsistencies have real impact.

I’m particularly interested in how data platforms evolve from legacy ETL toward modern, cloud-based architectures.


What I Focus On

  • Designing end-to-end data pipelines from ingestion to reporting
  • Structuring data using medallion and dimensional modeling approaches
  • Improving data quality through validation, monitoring, and clear transformation logic
  • Integrating multiple systems into consistent, analysis-ready datasets
  • Building toward scalable data platforms using Azure, Fabric, and Databricks
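The medallion idea above can be sketched in a few lines of plain Python. This is only an illustration — the records, field names, and validation rules are hypothetical, and in real projects these steps run as Spark/Databricks transformations over Delta tables rather than in-memory lists:

```python
# Illustrative medallion layering: Bronze (raw) -> Silver (validated) -> Gold (aggregated).
bronze = [  # raw ingested rows, kept exactly as received
    {"order_id": "1", "amount": "19.99", "region": "ca"},
    {"order_id": "2", "amount": "bad", "region": "CA"},
    {"order_id": "3", "amount": "5.00", "region": "wa"},
]

def to_silver(rows):
    """Validate and standardize: drop unparseable amounts, normalize region codes."""
    silver = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # reject (or quarantine) records that fail validation
        silver.append({"order_id": row["order_id"],
                       "amount": amount,
                       "region": row["region"].upper()})
    return silver

def to_gold(rows):
    """Aggregate validated rows into an analysis-ready summary: revenue per region."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals
```

The point of the split is that each layer has a clear contract: Bronze preserves the source, Silver enforces quality rules in one place, and Gold serves reporting without re-deriving logic.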

Tech Stack

Languages
Python · SQL · PySpark

Data Platforms & Orchestration
Azure Data Factory · Databricks · Apache Airflow

Cloud & Storage
Azure · AWS S3

Data Warehousing & Modeling
Redshift · PostgreSQL · Dimensional Modeling

Analytics & Reporting
Power BI

Other
Git · Docker · API Integration


Selected Projects

TransLink GTFS Data Warehouse

End-to-end data warehouse built on real GTFS transit data using a medallion architecture.

  • Structured raw transit feeds into validated Bronze, Silver, and Gold layers
  • Addressed domain-specific challenges such as time values beyond 24:00
  • Built dimensional models to support time-based analysis and reporting
  • Embedded data quality checks across pipeline layers

🔗 https://github.com/bashoori/transit_data_warehouse
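The "time values beyond 24:00" issue is a real quirk of the GTFS spec: `stop_times` uses `HH:MM:SS` measured from noon-minus-12h on the service day, so a trip running past midnight can carry a time like `25:30:00`. A minimal, hedged sketch of how such values might be normalized (function names are illustrative, not from the project code):

```python
from datetime import timedelta

def parse_gtfs_time(value: str) -> timedelta:
    """Parse a GTFS HH:MM:SS string, where hours may exceed 24
    for trips running past midnight (e.g. '25:30:00')."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)

def normalize_gtfs_time(value: str) -> tuple[str, int]:
    """Return a wall-clock HH:MM:SS plus the number of service days rolled over."""
    days, remainder = divmod(parse_gtfs_time(value), timedelta(days=1))
    total_seconds = int(remainder.total_seconds())
    h, rest = divmod(total_seconds, 3600)
    m, s = divmod(rest, 60)
    return f"{h:02d}:{m:02d}:{s:02d}", days
```

Keeping the rollover count separate lets downstream models attribute a 25:30 stop to the correct service day instead of silently shifting it to the next calendar day.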


Global Retail Lakehouse (Microsoft Fabric)

Designed a lakehouse platform for a multi-region retail scenario.

  • Implemented medallion architecture to standardize ingestion and transformation
  • Built unified data models for customers, products, and sales
  • Focused on creating consistent datasets across regions for scalable reporting

🔗 https://github.com/bashoori/Global-Retail-Lakehouse-on-Microsoft-Fabric


Databricks End-to-End Pipeline

Medallion-based pipeline using Delta Lake and Unity Catalog.

  • Designed for scalable processing and governed data access
  • Structured transformations for clarity, reuse, and maintainability

🔗 https://github.com/bashoori/data-engineering-portfolio/tree/main/databricks-end-to-end


Airflow + Spark + AWS Pipeline

Containerized ETL pipeline reflecting production patterns.

  • Implemented orchestration, retries, and scheduling
  • Focused on reliability and operational behavior of pipelines

🔗 https://github.com/bashoori/airflow-spark-aws-etl-pipeline
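In Airflow itself, retries are configured declaratively through task-level parameters (`retries`, `retry_delay`). Purely to illustrate the underlying pattern, a hypothetical standalone retry helper with exponential backoff might look like this (names and defaults are assumptions, not the project's code):

```python
import time

def run_with_retries(task, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Run `task`, retrying with exponential backoff on failure —
    the same idea Airflow applies via task-level `retries`/`retry_delay`."""
    for attempt in range(max_retries + 1):
        try:
            return task()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the failure to the scheduler
            sleep(base_delay * (2 ** attempt))  # back off before the next attempt
```

Injecting `sleep` keeps the helper testable; in an orchestrator the equivalent knobs live in the task definition rather than in pipeline code.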


Direction

  • Microsoft Certified: Azure Data Fundamentals (DP-900)
  • Preparing for: Microsoft Fabric Data Engineer (DP-700)
  • Building hands-on projects focused on cloud-based data platforms

Build systems that remain reliable as complexity grows.
