Java Developer with Web Crawler Experience
Role: Java Developer with Web Crawler Experience
Location: Austin, TX (Hybrid)
Responsibilities:
1. Web Crawler Development: Design and implement efficient and scalable web crawlers in Java to collect data from various online sources.
2. Data Extraction: Develop and maintain systems for structured data extraction, handling various data formats (HTML, JSON, XML, etc.).
3. Data Storage and Processing: Design data storage and processing pipelines, ensuring extracted data is clean, structured, and easily accessible.
4. Performance Optimization: Optimize web crawling processes for speed, efficiency, and accuracy, while ensuring minimal impact on source websites.
5. Error Handling and Logging: Implement error-handling mechanisms and logging systems to detect and resolve issues during crawling operations.
6. Data Integrity and Compliance: Ensure data collection practices are ethical, legal, and compliant with relevant regulations (e.g., robots.txt, copyright laws).
Requirements:
Proficiency in Java and experience with Java-based web scraping libraries (e.g., Jsoup, Apache
Knowledge of web crawling frameworks and tools, such as Scrapy, Selenium, or Puppeteer.
Strong understanding of HTML, CSS, JavaScript, and web data structures.
Familiarity with data parsing and handling techniques for JSON, XML, and other common formats.
Experience with database technologies (SQL, NoSQL) to store and manage scraped data.
Knowledge of protocols, headers, proxies, and load handling.