We are seeking a highly skilled and experienced Playwright Web Scraping Developer to join our team. In this role, you will design, develop, and maintain sophisticated web scraping scripts using Playwright and TypeScript, and store and manage the extracted data in a PostgreSQL database using Drizzle as the Object-Relational Mapping (ORM) tool.

You will tackle complex challenges, including navigating dynamic websites, overcoming anti-scraping measures such as CAPTCHAs, extracting intricate data objects, automating document downloads, and integrating the scraped data into our PostgreSQL database using Drizzle's data modeling and query capabilities.

The ideal candidate has deep expertise in advanced Playwright techniques, a strong understanding of modern web technologies, experience with database management in PostgreSQL, and familiarity with Drizzle for interaction between application code and the database. The role requires the ability to design and implement scalable data storage solutions that ensure data integrity, consistency, and performance.

Key Responsibilities:
- Design, code, test, and deploy robust and scalable web scraping solutions using Playwright and TypeScript.
- Implement advanced scraping techniques to handle dynamic content loading (SPAs, AJAX), complex user interactions, and intricate website structures.
- Develop and integrate strategies for bypassing various CAPTCHA challenges and other anti-bot mechanisms.
- Scrape and parse complex data structures (nested objects, tables, lists) from HTML and dynamically generated content.
- Implement functionality to reliably download the various document types (PDFs, CSVs, images, etc.) encountered during scraping.
- Use PostgreSQL to design and implement efficient database schemas for storing scraped data, ensuring data normalization and optimizing query performance.
- Use Drizzle to interact with the PostgreSQL database, defining models that represent the scraped data and performing CRUD (Create, Read, Update, Delete) operations efficiently.
- Monitor, maintain, and optimize existing scraping scripts and database integrations for performance, reliability, and efficiency.
- Troubleshoot and resolve issues related to script failures, website changes, blocking mechanisms, or database connectivity problems.
- Collaborate with relevant teams (e.g., data analysts, backend engineers) to understand data requirements, ensure data quality, and align database design with project needs.
- Stay current with the latest developments in Playwright, TypeScript, web scraping best practices, anti-scraping technologies, PostgreSQL features, and Drizzle capabilities.
- Document code, methodologies, and processes clearly, including database schema designs and API interactions.

Required Skills and Qualifications:
- Proven professional experience building complex web scrapers specifically with Playwright.
- Advanced proficiency in the Playwright API, including complex selectors, browser contexts, page interactions, network interception, and navigation strategies.
- Strong programming skills in TypeScript and its ecosystem (Node.js).
- Demonstrated experience implementing CAPTCHA bypassing techniques (familiarity with recognition services or advanced interaction simulation).
- Proven ability to scrape and structure data from complex, nested web elements (object scraping).
- Experience building reliable document downloading capabilities within scraping workflows.
- Solid understanding of web fundamentals: HTML, CSS, JavaScript (ES6+), DOM manipulation, browser developer tools, HTTP/S protocols.
- Familiarity with common anti-scraping techniques (IP rotation, user-agent spoofing, fingerprinting, etc.) and strategies to mitigate them.
- Experience with PostgreSQL database management, including schema design, query optimization, and data modeling.
- Proficiency in Drizzle ORM for interacting with PostgreSQL databases, including defining models, performing queries, and managing transactions.
- Experience with version control systems, particularly Git.
- Strong analytical and problem-solving skills with meticulous attention to detail.
- Excellent communication skills.
Keyword: Node.js
Price: $20.0