NLP Expert for Fake Review Detection (3-Class Classification with RoBERTa & GAN Models)

We are working on a research-based fake review detection project involving advanced Natural Language Processing and deep learning models. The goal is to develop a multi-class classification system that can distinguish between:

1) Original reviews (OR)
2) Computer-generated fake reviews (CG)
3) Human-generated fake reviews (HG)

We have so far collected a dataset of 5,000 human-generated reviews via 100 Google Forms (50 per form) and plan to expand it to around 20,000 reviews. We also have matching samples of computer-generated and original reviews.

We have already tested a RoBERTa-based binary classification model (FakeRoBERTa) on a sample dataset for trial purposes. You will be provided with this working trial code to give you a clearer idea of the current setup. The next step is to upgrade this code to handle 3-class classification, tune hyperparameters to get better results, and ensure the architecture is robust and scalable.

Main responsibilities:

1) Modify and optimize the current RoBERTa-based implementation for 3-class classification
2) Train and evaluate the model using our current dataset (≈15,000 rows: 5,000 for each class)
3) Ensure the PyTorch-based training logic is correct and efficient
4) Design and suggest hyperparameter tuning strategies to improve model performance
5) Prepare the codebase to scale easily to larger datasets (~60,000 rows)
6) Assist with planning or implementing a GAN-based model as the next phase
7) Provide clear guidance, documentation where required, and best practices

We are looking for a skilled freelancer who has:

- Strong background in NLP and deep learning (especially transformer models like RoBERTa)
- Experience with PyTorch and Hugging Face Transformers
- Familiarity with multi-class classification problems
- Prior experience working on text classification or fake review detection (preferred)
- Ability to write clean, well-structured, and optimized code

We will provide:

1. Sample working code for FakeRoBERTa (binary classification)
2. Labeled dataset (currently 5,000 human-generated, 5,000 computer-generated, and 5,000 original samples)

Final expected deliverables:

1) Final working code (for 3-class classification) that runs reliably on a large dataset
2) Evaluation metrics and confusion matrix
3) Guidance for scaling to 20k+ samples
4) Clean documentation and comments

Note: The deadline will be updated soon.

Thank you for your time.