Description: We are a growing healthcare platform looking for an experienced ETL/Data Engineer to build a secure, scalable data integration pipeline. The goal is to consolidate data from multiple healthcare data sources (EHRs, patient management systems, claims data) into a unified data warehouse to support analytics, reporting, and improved care outcomes. This project will require strong expertise in healthcare data handling, compliance (e.g., HIPAA), and ETL best practices.

Project Goals:
- Extract data from multiple healthcare data sources, including:
  - Electronic Health Records (EHRs) such as Epic, Cerner, or Athenahealth
  - Claims data from insurance providers
  - Patient management systems
  - CSV/Excel flat files from labs and imaging centers
- Normalize and transform data to a unified schema for analytics and reporting
- Load data into our AWS Redshift (or Snowflake) data warehouse
- Automate the pipeline using Apache Airflow or AWS Glue (see the orchestration sketch after this posting)
- Implement robust data quality checks, logging, and error handling (see the validation sketch after this posting)
- Ensure compliance with HIPAA and other relevant data security standards

Key Deliverables:
- ETL pipeline code (Python preferred) with clear, maintainable documentation
- Automated scheduling and monitoring (e.g., via Airflow or Glue workflows)
- Data validation scripts to ensure data integrity
- Source-to-target data mapping documentation
- Deployment guide and knowledge handover

To Apply, Please Share:
- A brief summary of your healthcare data integration experience
- Examples of similar ETL pipelines you've built (especially healthcare-related)
- Your approach to ensuring data privacy and HIPAA compliance
- Your proposed strategy and timeline for this project
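For illustration, a minimal sketch of the kind of Airflow DAG the automation goal above describes, assuming Airflow 2.4+ (for the `schedule` argument); the task names, S3 staging path, and warehouse details are hypothetical placeholders, shown for one source (claims) out of the several listed:

```python
# Hypothetical sketch of the orchestration layer; not a production pipeline.
from datetime import datetime, timedelta

from airflow.decorators import dag, task


@dag(
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=10)},
    tags=["healthcare", "etl"],
)
def healthcare_claims_etl():
    @task
    def extract_claims() -> str:
        # Pull the latest payer claims extract to staging and return its
        # path. Never log PHI here: log record counts, not row contents.
        return "s3://example-staging/claims/latest.parquet"  # hypothetical

    @task
    def transform(staging_path: str) -> str:
        # Apply the source-to-target mapping into the unified schema.
        return staging_path.replace("staging", "conformed")

    @task
    def load(conformed_path: str) -> None:
        # COPY the conformed file into Redshift/Snowflake in one transaction.
        print(f"would load {conformed_path} into the warehouse")

    load(transform(extract_claims()))


healthcare_claims_etl()
```

If AWS Glue were chosen instead, a workflow of Glue jobs and crawlers would replace the DAG, but the extract/transform/load split and the retry settings carry over either way.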
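Similarly, a minimal sketch of what the data validation deliverable could look like, assuming pandas-based checks over a claims extract on Python 3.9+; the column names are illustrative, not taken from the posting:

```python
# Hypothetical post-extract quality checks; real checks would run against
# the warehouse and feed the pipeline's logging and alerting.
import pandas as pd

REQUIRED_COLUMNS = {"patient_id", "claim_id", "service_date", "amount"}


def validate_claims(df: pd.DataFrame) -> list[str]:
    """Return human-readable data quality failures (empty list = pass)."""
    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        # Bail out early: the remaining checks assume these columns exist.
        return [f"missing columns: {sorted(missing)}"]

    failures = []
    if df["claim_id"].duplicated().any():
        failures.append("duplicate claim_id values")
    if df["patient_id"].isna().any():
        failures.append("null patient_id values")
    if pd.to_datetime(df["service_date"], errors="coerce").isna().any():
        failures.append("unparseable service_date values")
    if (pd.to_numeric(df["amount"], errors="coerce") < 0).any():
        failures.append("negative claim amounts")
    return failures


if __name__ == "__main__":
    sample = pd.DataFrame(
        {
            "patient_id": ["p1", None],
            "claim_id": ["c1", "c1"],
            "service_date": ["2024-01-05", "not-a-date"],
            "amount": [120.0, -5.0],
        }
    )
    for failure in validate_claims(sample):
        print(failure)
```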
Skills: Python, Apache Airflow, AWS Glue, Snowflake, BigQuery, ETL Pipeline, HIPAA