TIME SENSITIVE PROJECT

Description: I have approximately 200 reports in Excel and PDF format. These reports contain tables or structured/semi-structured data, but the formatting, field names, and file naming conventions vary significantly across files. I'm looking for a skilled data analyst or Python developer who can help me compare these reports and identify which ones are at least 60% similar in content. This will require fuzzy matching techniques and possibly data normalization.

Responsibilities:
- Extract data from PDF and Excel reports (some may require OCR or table parsing).
- Clean and normalize the data across all files.
- Compare the reports and determine which are ≥60% similar based on data content.
- Deliver a summary of matched report pairs or groups with similarity scores....
Keyword: Data Processing
Delivery Time: 2 days left
Price: $481.0
Data Mining, Data Processing, Excel, Python, Software Architecture
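For the report-comparison project above, the "at least 60% similar" check could be sketched roughly as follows. This is only an illustration under stated assumptions: the posting does not define how similarity is measured, so Jaccard overlap of normalized cell values is used here as a stand-in, and the function names (`normalize`, `report_tokens`, `jaccard`, `similar_pairs`) and the input shape (a dict mapping file name to rows of cells, already extracted from Excel or PDF) are hypothetical.

```python
import re
from itertools import combinations


def normalize(cell):
    """Lowercase a cell value, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]+", " ", str(cell).lower())
    return " ".join(text.split())


def report_tokens(rows):
    """Turn one report (a list of rows of cells) into a set of normalized values."""
    return {normalize(c) for row in rows for c in row if normalize(c)}


def jaccard(a, b):
    """Share of distinct values common to both sets (assumed similarity metric)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_pairs(reports, threshold=0.6):
    """Return (name_a, name_b, score) for every pair scoring at or above threshold."""
    tokens = {name: report_tokens(rows) for name, rows in reports.items()}
    pairs = []
    for x, y in combinations(sorted(tokens), 2):
        score = jaccard(tokens[x], tokens[y])
        if score >= threshold:
            pairs.append((x, y, round(score, 2)))
    return pairs


# Hypothetical sample data standing in for already-extracted tables.
reports = {
    "a.xlsx": [["Name", "Total"], ["Acme Corp.", "100"]],
    "b.pdf": [["name", "TOTAL"], ["ACME corp", "100"], ["extra", "5"]],
    "c.xlsx": [["foo", "bar"]],
}
print(similar_pairs(reports))
```

Comparing every pair is manageable at this scale (200 reports give 19,900 pairs). Exact matching of normalized values will miss near-duplicates such as misspelled field names, so a real solution would likely add a fuzzy string comparison on top of this.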
I need a simple, functional Excel dashboard to track contracts. Requirements: - Replicate the attached image layout. - Sections: Contract Count, Open Tasks, Pending/Expiring Contracts, Contract Amount vs Budget (Bar Chart), Donut Charts (Status, Type, Department), Con...
Promoting data sharing and developing a robust metric to reward exemplary data sharers. Mandatory Registration: Registration period is now closed! Thank you everyone for expressing their interest. Submission period for Phase 1 entries will be open starting Apr 21,...
I'm seeking an experienced developer to create a tailored ERP system for the retail sector. The core focus of this ERP will be on sales and customer management, and I want to incorporate advanced AI capabilities to enhance its functionality. Key Requirements: - S...