Minimal RAG Solution

$60.00
Hourly: $60.00 - $100.00

I am looking for a response that includes a project plan for accomplishing this work. I need someone who has done this before to help me build it out. I prefer a fixed fee but am open to hourly with a good project plan you can deliver to. The right person will assist with the additional phases of this project. This is only phase 1 and it is extremely minimal. I want to have someone start no later than the 31st of March. If I can find the right individual, we will start sooner.

Objective: Build out a minimal version of an AI RAG service
Time Frame: 2 weeks

Scope of Project: Phase 1 - build out a minimal version of RAG in two weeks.

Success criteria
1) JWT (username/password) and OAuth 2.0 (Google and Microsoft) for authentication
2) From the Web UI, query the private LLM and get a response
3) From the Web UI, attach a file as part of the query
4) History per chat session is stored and leveraged for context within that chat session
5) Call the API from make.com, query the LLM, and return the response to make.com
6) Call the API from make.com and select a document from the Vector DB as part of the query to the LLM
7) Put a document in blob storage and have it ingested and chunked into the DB (sizes will be provided)

High Level Architecture

Layer                        Recommended Tool
LLM Host                     Ollama (version will be provided)
GPU                          H100
RAG Framework                LlamaIndex
Vector Store                 Weaviate
Frontend/API                 FastAPI & React for UI & Document Ingestion & Chunking
Docs Loader / Orchestration  LlamaIndex
Storage/Auth                 Blob storage
Cloud                        Azure
Container                    Kubernetes

Data Sources Layer:
• Document ingestion processes
• Storage using Azure Blob Storage
• Document loading via LlamaIndex

Processing Pipeline:
• Handles document chunking and indexing
• Uses LlamaIndex as the RAG framework for orchestration

Vector Database:
• Weaviate as the vector store for embeddings
• Provides semantic search capabilities

LLM Engine:
• Ollama as the LLM host
• H100 GPU for acceleration
• Handles query processing and response generation

Frontend:
• React-based user interface
• Provides chat interaction capabilities and document upload

API Gateway:
• Built with FastAPI
• Manages REST endpoints and routes requests

Auth/Storage Layer:
• Handles user management and permissions
• Connects to Azure Blob Storage for document persistence

There will be a phase 2 of this project, which is outside the scope of this request. The second phase will make this a multitenant version, improve availability, and tune for performance. It will also include enhancements to the interface. Phase 1 is extremely minimal, with no focus on features, performance, scaling, or availability.
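To make success criterion 7 concrete, here is a minimal sketch of the chunking step applied to an ingested document before it is written to the vector DB. It is an illustration only: the function name `chunk_text` and the default sizes are hypothetical placeholders (the actual sizes will be provided), and in the real build LlamaIndex would most likely handle this step rather than hand-written code.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character chunks that overlap.

    chunk_size and overlap are placeholder values (assumptions);
    the real sizes are to be provided by the client.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(chunk)
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from either side, at the cost of storing some text twice.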

Keyword: Software Development

Web Development, Kubernetes, React, Python, Retrieval Augmented Generation, NLP, Tokenization, FastAPI, LLM, Prompt Engineering
