Do you want to apply for this freelance job vacancy?

Build a Data Warehouse: Kafka → PostgreSQL ELT Pipeline w/ dbt + Superset (Multi-Tenant SaaS)

Project Overview: We are a SaaS company with a multi-tenant architecture and Kafka-based event pipelines. We are looking for a skilled data engineer to implement an ELT system using:
- Kafka (event source)
- Kafka Connect (for streaming data into a central warehouse)
- PostgreSQL (as the reporting warehouse)
- dbt (for transformation and normalization)
- Superset (for dashboards and reporting that non-developers can use)

Project Goals:
- Deliver a reliable Kafka → PostgreSQL pipeline using Kafka Connect
- Build clean analytics models using dbt
- Enable our non-technical staff to build reusable, tenant-aware reports in Superset

Current Architecture:
- Each of our 23 microservices (Account, Commerce, etc.) has its own PostgreSQL database
- Each tenant has its own schema/namespace
- All services emit Kafka events when database records change
- We are deploying a dedicated PostgreSQL instance as the reporting warehouse

Scope of Work
(Illustrative sketches for each phase are appended at the end of this posting.)

Phase 0 – Warehouse Schema Design
- Propose and define the initial schema structure for the central reporting PostgreSQL warehouse
- Includes raw Kafka-ingested tables and normalized dbt models
- Must support the multi-tenant structure (e.g., tenant_id column, single shared schema vs. schema-per-tenant)
- Design for scalability across services (e.g., account_users, commerce_orders)
- Set up the required schemas/namespaces (e.g., raw, staging, analytics)
- Ensure naming conventions and structure are dbt- and Superset-friendly
- Review the schema plan with our team before implementation
- Document decisions and structure for handoff

Phase 1 – Initial Backfill (One-Time Load)
- Connect to each service's PostgreSQL database (e.g., Account, Commerce)
- Extract historical data from each tenant schema within those databases
- Load the extracted data into the appropriate tables in the reporting PostgreSQL warehouse
- Normalize field types and formats to match Kafka-ingested data for consistency
- Ensure tenant_id and source_service fields are included
- Automate this step as a repeatable script or process in case re-runs are needed
- Document the process clearly

Phase 2 – Kafka Connect to PostgreSQL
- Set up Kafka Connect using open-source connectors
- Configure sinks for multiple topics into the central PostgreSQL warehouse
- Ensure tenant context (tenant_id) is preserved in target tables
- Document topic-to-table mappings

Phase 3 – dbt Modeling
- Create and configure a clean, modular dbt project
- Write initial models to transform raw event data into curated tables (e.g., users, orders, subscriptions)
- Normalize fields across services where applicable
- Add documentation and basic data tests

Phase 4 – Superset Charts/Table Reports
- Connect Superset to the reporting warehouse
- Create example dashboards with filters for:
  - Date ranges
  - Tenant selection
  - Service-specific views
- Recommend best practices for access control (row-level security or embedding strategies)

Desired Experience
- Kafka Connect (sink connectors, JDBC, PostgreSQL)
- dbt (including incremental models, Jinja templating, modular project structure)
- PostgreSQL (schema design and indexing in a warehouse context)
- Superset configuration and dashboard setup
- Experience with multi-tenant data structures is highly preferred
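Phase 0 sketch. A minimal example of the kind of layout we have in mind, assuming a single shared schema per layer with a tenant_id column (rather than schema-per-tenant); every schema, table, and column name below is illustrative, not final.

```sql
-- Illustrative only: one shared schema per layer, tenants discriminated by tenant_id.
CREATE SCHEMA IF NOT EXISTS raw;        -- landed as-is from Kafka Connect / backfill
CREATE SCHEMA IF NOT EXISTS staging;    -- dbt staging models (renamed, typed, deduplicated)
CREATE SCHEMA IF NOT EXISTS analytics;  -- curated dbt marts consumed by Superset

-- Example raw landing table following a <service>_<entity> naming convention.
CREATE TABLE IF NOT EXISTS raw.account_users (
    event_id        uuid        NOT NULL,   -- event identifier from the Kafka message
    tenant_id       text        NOT NULL,   -- tenant discriminator carried on every row
    source_service  text        NOT NULL,   -- e.g. 'account', 'commerce'
    payload         jsonb       NOT NULL,   -- raw event body, parsed later by dbt
    emitted_at      timestamptz NOT NULL,
    loaded_at       timestamptz NOT NULL DEFAULT now(),
    PRIMARY KEY (event_id)
);

-- Most reporting queries filter by tenant and time, so index accordingly.
CREATE INDEX IF NOT EXISTS account_users_tenant_emitted_idx
    ON raw.account_users (tenant_id, emitted_at);
```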
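Phase 1 sketch. One way to keep the backfill repeatable is to run it as plain SQL over postgres_fdw from inside the warehouse. The host, credentials, tenant schema name (tenant_acme), and source column names (id, updated_at, password_hash) below are assumptions for illustration only.

```sql
-- Illustrative backfill via postgres_fdw, executed inside the reporting warehouse.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Hypothetical connection details for the Account service database.
CREATE SERVER IF NOT EXISTS account_src
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'account-db.internal', dbname 'account', port '5432');

CREATE USER MAPPING IF NOT EXISTS FOR CURRENT_USER
    SERVER account_src
    OPTIONS (user 'readonly_backfill', password 'change-me');

-- Import one tenant schema at a time into a scratch schema.
CREATE SCHEMA IF NOT EXISTS backfill_src;
IMPORT FOREIGN SCHEMA tenant_acme
    LIMIT TO (users)
    FROM SERVER account_src
    INTO backfill_src;

-- Re-runnable load: deterministic ids plus ON CONFLICT make repeat runs no-ops.
INSERT INTO raw.account_users (event_id, tenant_id, source_service, payload, emitted_at)
SELECT
    md5('account:tenant_acme:users:' || u.id)::uuid,  -- synthetic but stable event id
    'tenant_acme',
    'account',
    to_jsonb(u) - 'password_hash',                    -- match the Kafka payload shape, drop secrets
    u.updated_at
FROM backfill_src.users AS u
ON CONFLICT (event_id) DO NOTHING;
```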
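Phase 2 sketch. The Kafka Connect sink configuration itself (e.g., one JDBC sink per topic or topic group) is a deliverable of this phase; the SQL below only illustrates how topic-to-table mappings and tenant context could be kept visible inside the warehouse. The topic names and the mapping table are assumptions, not an existing convention on our side.

```sql
-- Illustrative: record each sink's topic-to-table mapping where analysts can see it.
CREATE TABLE IF NOT EXISTS raw.topic_table_map (
    kafka_topic   text PRIMARY KEY,   -- e.g. 'account.users.changed'
    target_table  text NOT NULL,      -- e.g. 'raw.account_users'
    notes         text
);

INSERT INTO raw.topic_table_map (kafka_topic, target_table, notes) VALUES
    ('account.users.changed',   'raw.account_users',   'tenant_id taken from the event payload'),
    ('commerce.orders.changed', 'raw.commerce_orders', 'tenant_id taken from the event payload')
ON CONFLICT (kafka_topic) DO UPDATE
    SET target_table = EXCLUDED.target_table,
        notes        = EXCLUDED.notes;

-- Table comments carry over into dbt docs and Superset metadata.
COMMENT ON TABLE raw.account_users IS
    'Landed by Kafka Connect from topic account.users.changed; one row per change event, tenant_id preserved.';
```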
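Phase 3 sketch. A rough idea of the staging layer we would expect in dbt, assuming raw events land as jsonb as in the Phase 0 sketch and that a 'raw' source is declared in sources.yml; the model name, payload keys, and column casts are placeholders.

```sql
-- models/staging/stg_account_users.sql (illustrative)
{{
    config(
        materialized = 'incremental',
        unique_key = 'event_id'
    )
}}

with events as (

    select
        event_id,
        tenant_id,
        source_service,
        payload,
        emitted_at
    from {{ source('raw', 'account_users') }}

    {% if is_incremental() %}
      -- only pick up events newer than what this model already holds
      where emitted_at > (select coalesce(max(emitted_at), '1970-01-01') from {{ this }})
    {% endif %}

)

select
    event_id,
    tenant_id,
    source_service,
    (payload ->> 'user_id')::bigint          as user_id,          -- payload keys are assumptions
    payload ->> 'email'                      as email,
    (payload ->> 'created_at')::timestamptz  as user_created_at,
    emitted_at
from events
```

Curated marts in the analytics schema (e.g., an orders model joining across services) would build on staging models like this one, always carrying tenant_id through so Superset can filter on it.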
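Phase 4 sketch. For row-level access control, Superset's Row Level Security filters attach a SQL predicate to queries it generates against a dataset for selected roles. The simplest tenant-aware setup we can picture is one role per tenant with a static clause like the one below (role name, tenant value, and dataset are illustrative); whether to use per-tenant roles, templated clauses, or embedded dashboards is exactly the recommendation we are asking for.

```sql
-- Illustrative RLS filter clause, attached in the Superset UI to a 'Tenant: Acme' role
-- for the analytics datasets; Superset appends it to the WHERE clause of generated queries:
--   tenant_id = 'tenant_acme'

-- Quick sanity check of what that role would see for a given dashboard dataset:
SELECT tenant_id, count(*) AS rows_visible
FROM analytics.orders            -- dataset name is illustrative
WHERE tenant_id = 'tenant_acme'
GROUP BY tenant_id;
```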

Keyword: E-commerce Development

Skills: dbt, PostgreSQL, ETL Pipeline, Apache Kafka, Apache Superset