Build a Data Warehouse: Kafka → PostgreSQL ELT Pipeline w/ dbt + Metabase (Multi-Tenant SaaS)


Project Overview:
We are a SaaS company with a multi-tenant architecture and Kafka-based event pipelines. We are looking for a skilled data engineer to implement an ELT system using:
- Kafka (event source)
- Kafka Connect (for streaming data into a central warehouse)
- PostgreSQL (as the reporting warehouse)
- dbt (for transformation and normalization)
- Metabase (for dashboards and non-developer-friendly reporting)

Project Goals:
- Deliver a reliable Kafka → PostgreSQL pipeline using Kafka Connect
- Build clean analytics models using dbt
- Enable our non-technical staff to build reusable, tenant-aware reports in Metabase

Current Architecture:
- Each of the 23 microservices (Account, Commerce, etc.) has its own PostgreSQL database
- Each tenant has its own schema/namespace
- All services emit Kafka events when database records change
- We are deploying a dedicated PostgreSQL instance as the reporting warehouse

Scope of Work

Phase 0 – Warehouse Schema Design
- Propose and define the initial schema structure for the central reporting PostgreSQL warehouse, covering both raw Kafka-ingested tables and normalized dbt models
- Support the multi-tenant structure (e.g., tenant_id column, single schema vs. schema-per-tenant)
- Design for scalability across services (e.g., account_users, commerce_orders)
- Set up the required schemas/namespaces (e.g., raw, staging, analytics)
- Ensure naming conventions and structure are dbt- and Metabase-friendly
- Review the schema plan with our team before implementation
- Document decisions and structure for handoff (see the Phase 0 sketch below)

Phase 1 – Initial Backfill (One-Time Load)
- Connect to each service PostgreSQL database (e.g., Account, Commerce)
- Extract historical data from each tenant schema within those databases
- Load the extracted data into the appropriate tables in the reporting PostgreSQL warehouse
- Normalize field types and formats to match Kafka-ingested data for consistency
- Ensure tenant_id and source_service fields are included
- Automate this step as a repeatable script or process in case re-runs are needed
- Document the process clearly (see the Phase 1 sketch below)

Phase 2 – Kafka Connect to PostgreSQL
- Set up Kafka Connect using open-source connectors
- Configure sinks for multiple topics into the central PostgreSQL warehouse
- Ensure tenant context (tenant_id) is preserved in target tables
- Document topic-to-table mappings (see the Phase 2 sketch below)

Phase 3 – dbt Modeling
- Create and configure a clean, modular dbt project
- Write initial models to transform raw event data into curated tables (e.g., users, orders, subscriptions)
- Normalize fields across services where applicable
- Add documentation and basic data tests (see the Phase 3 sketch below)

Phase 4 – Metabase Dashboards
- Connect Metabase to the reporting warehouse
- Create example dashboards with filters for:
  - Date ranges
  - Tenant selection
  - Service-specific views
- Recommend best practices for access control (row-level restrictions or embedding strategies) (see the Phase 4 sketch below)

Desired Experience
- Kafka Connect (sink connectors, JDBC, PostgreSQL)
- dbt (including incremental models, Jinja templating, modular structure)
- PostgreSQL (schema design and indexing in a warehouse context)
- Metabase configuration and dashboard setup
- Experience with multi-tenant data structures is highly preferred
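Illustrative Sketches

Phase 0 – a minimal DDL sketch of the kind of layout the schema design could produce, assuming the single-schema multi-tenant option with a tenant_id column; every table, column, and index name here is illustrative, not a spec:

```sql
-- Separate namespaces for raw ingests, intermediate models, and curated marts.
CREATE SCHEMA IF NOT EXISTS raw;
CREATE SCHEMA IF NOT EXISTS staging;
CREATE SCHEMA IF NOT EXISTS analytics;

-- One raw table per Kafka topic; every row carries tenant and source context
-- so downstream dbt models and Metabase filters can scope by tenant.
CREATE TABLE IF NOT EXISTS raw.account_users (
    tenant_id      text        NOT NULL,
    source_service text        NOT NULL DEFAULT 'account',
    event_payload  jsonb       NOT NULL,
    event_ts       timestamptz NOT NULL,
    ingested_at    timestamptz NOT NULL DEFAULT now()
);

-- Reporting queries tend to filter by tenant first, so the index leads with it.
CREATE INDEX IF NOT EXISTS account_users_tenant_ts_idx
    ON raw.account_users (tenant_id, event_ts);
```

The main alternative is schema-per-tenant; a single shared schema keeps dbt models and Metabase questions reusable across tenants, at the cost of needing a tenant filter on every query.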
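Phase 1 – one way to keep the backfill repeatable is to run it inside the warehouse itself via postgres_fdw; the hosts, credentials, schema names, and the updated_at column below are placeholders, and a small driver script would loop these statements over services and tenants:

```sql
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Point at one service database (here: Account); repeated per service.
CREATE SERVER account_svc FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'account-db.internal', dbname 'account');
CREATE USER MAPPING FOR CURRENT_USER SERVER account_svc
    OPTIONS (user 'reporting_ro', password '...');

-- Expose one tenant schema locally, then copy rows with tenant context stamped on.
CREATE SCHEMA IF NOT EXISTS backfill_src;
IMPORT FOREIGN SCHEMA tenant_acme LIMIT TO (users)
    FROM SERVER account_svc INTO backfill_src;

-- to_jsonb(u) serializes whole rows into the same jsonb shape the
-- Kafka-ingested events use, keeping both paths consistent.
INSERT INTO raw.account_users (tenant_id, source_service, event_payload, event_ts)
SELECT 'acme', 'account', to_jsonb(u), u.updated_at
FROM backfill_src.users AS u;
```

For re-runs, the backfill_src schema can be dropped and re-imported, and the INSERT made idempotent (e.g., ON CONFLICT DO NOTHING once keys are settled).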
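Phase 2 – an illustrative JDBC sink configuration (Confluent's open-source kafka-connect-jdbc connector) as it would be posted to the Kafka Connect REST API; the connector name, topic, table, and connection values are placeholders that would come from the topic-to-table mapping document:

```json
{
  "name": "account-users-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "topics": "account.users",
    "connection.url": "jdbc:postgresql://warehouse.internal:5432/reporting",
    "connection.user": "connect_writer",
    "connection.password": "...",
    "insert.mode": "insert",
    "pk.mode": "none",
    "table.name.format": "raw.account_users",
    "auto.create": "false",
    "auto.evolve": "false"
  }
}
```

Insert mode suits an append-only raw event table, with deduplication left to dbt. This sketch assumes tenant_id is already a field in the event value; if it only appears in the topic name, a Single Message Transform (e.g., InsertField) can copy it into the record before the sink writes.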
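Phase 3 – a minimal incremental staging model in dbt SQL, assuming the raw layout from the Phase 0 sketch and a sources.yml entry for the raw schema; the fields extracted from event_payload are illustrative:

```sql
-- models/staging/stg_account_users.sql
-- user_id alone is not unique across tenants, hence the composite key.
{{ config(materialized='incremental', unique_key=['tenant_id', 'user_id']) }}

select
    tenant_id,
    source_service,
    (event_payload ->> 'id')::bigint as user_id,
    event_payload ->> 'email'        as email,
    event_ts                         as updated_at
from {{ source('raw', 'account_users') }}

{% if is_incremental() %}
  -- On incremental runs, only pick up events newer than what the model holds.
  where event_ts > (select max(updated_at) from {{ this }})
{% endif %}
```

A schema.yml beside the model would carry the documentation and basic tests (e.g., not_null on tenant_id); dbt_utils offers a combined-uniqueness test for the tenant_id/user_id pair.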
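Phase 4 – tenant selection in Metabase can ride on a SQL-question variable; here {{tenant}} is a Metabase variable (resolved by Metabase, not dbt) wired to a dashboard filter, and analytics.users assumes the curated models from Phase 3:

```sql
-- Daily new users for whichever tenant the dashboard filter selects.
select
    date_trunc('day', updated_at) as day,
    count(*)                      as new_users
from analytics.users
where tenant_id = {{tenant}}
group by 1
order by 1;
```

For hard row-level guarantees (viewers who must never see other tenants' rows), Metabase's paid tiers offer data sandboxing; on the open-source edition, per-tenant PostgreSQL roles with row-level security, or signed embedding with locked parameters, are the usual fallbacks.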


Skills: dbt, PostgreSQL, ETL Pipeline, Apache Kafka, Metabase

 
