We are working on a research-based fake review detection project involving advanced Natural Language Processing and deep learning models. The goal is to develop a multi-class classification system that can distinguish between:
1) Original reviews (OR)
2) Computer-generated fake reviews (CG)
3) Human-generated fake reviews (HG)

We have so far collected a dataset of 5,000 human-generated reviews via 100 Google Forms (50 per form), and plan to expand it to around 20,000 reviews. We also have matching samples of computer-generated and original reviews.

We have already tested a RoBERTa-based binary classification model (FakeRoBERTa) on a sample dataset for trial purposes. You will be provided with this working trial code to get a clearer idea of the current setup. The next step is to modify this code to handle 3-class classification, tune hyperparameters to get better results, and ensure the architecture is robust and scalable.

Main responsibilities:-
1) Modify and optimize the current RoBERTa-based implementation for 3-class classification
2) Train and evaluate the model using our current dataset (≈15,000 rows: 5,000 for each class)
3) Ensure the PyTorch-based training logic is correct and efficient
4) Design and suggest hyperparameter tuning strategies to improve model performance
5) Prepare the codebase to scale easily to larger datasets (~60,000 rows)
6) Assist with planning or implementing a GAN-based model as the next phase
7) Provide clear guidance, documentation where required, and best practices

We are looking for a skilled freelancer who has:-
- Strong background in NLP and deep learning (especially transformer models like RoBERTa)
- Experience with PyTorch and Hugging Face Transformers
- Familiarity with multi-class classification problems
- Prior experience working on text classification or fake review detection (preferred)
- Ability to write clean, well-structured, and optimized code

We will provide:-
1) Sample working code for FakeRoBERTa (binary classification)
2) Labeled dataset (currently 5,000 human-generated, 5,000 computer-generated, and 5,000 original samples)

Final expected deliverables:-
1) Final working code (for 3-class classification) that runs reliably on a large dataset
2) Evaluation metrics and confusion matrix
3) Guidance for scaling to 20k+ samples
4) Clean documentation and comments

Note:- The deadline will be updated soon.

Thank you for your time.
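To make the "evaluation metrics and confusion matrix" deliverable concrete, here is a minimal sketch of how a 3-class confusion matrix and per-class precision, recall, and F1 could be computed for the OR, CG, and HG labels. It is plain Python and independent of the FakeRoBERTa code. The function names and the toy label lists are hypothetical and only illustrate the expected shape of the output; they are not results from the project.

```python
from collections import Counter

# Class codes taken from the posting: original, computer-generated, human-generated.
LABELS = ["OR", "CG", "HG"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Return a matrix where rows are true labels and columns are predicted labels."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def per_class_metrics(matrix, labels=LABELS):
    """Derive precision, recall and F1 for each class from the confusion matrix."""
    metrics = {}
    for i, label in enumerate(labels):
        true_pos = matrix[i][i]
        predicted = sum(row[i] for row in matrix)  # column sum: predicted as this class
        actual = sum(matrix[i])                    # row sum: truly this class
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def macro_f1(metrics):
    """Unweighted mean of per-class F1, a reasonable summary when classes are balanced."""
    return sum(m["f1"] for m in metrics.values()) / len(metrics)


# Hypothetical toy labels, only to show the call pattern.
toy_true = ["OR", "OR", "CG", "CG", "HG", "HG"]
toy_pred = ["OR", "HG", "CG", "CG", "HG", "OR"]

toy_matrix = confusion_matrix(toy_true, toy_pred)
toy_metrics = per_class_metrics(toy_matrix)

for label, row in zip(LABELS, toy_matrix):
    print(label, row)
print("macro F1:", round(macro_f1(toy_metrics), 3))
```

With 5,000 rows per class the dataset is balanced, so macro-averaged F1 and accuracy should tell a similar story. The off-diagonal cells are where most of the insight is likely to be, for example how often human-generated fakes are mistaken for original reviews.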
Keyword: Data Processing
Price: $20.0
Natural Language Processing Data Science Python PyTorch TensorFlow Deep Neural Network Artificial Neural Network Machine Learning Deep Learning Neural Network