Java Developer with Web Crawler Experience
Role: Java Developer with Web Crawler Experience
Location: Austin, TX (Hybrid)
Responsibilities:
1. Web Crawler Development: Design and implement efficient and scalable web crawlers in Java to collect data from various online sources.
2. Data Extraction: Develop and maintain systems for structured data extraction, handling various data formats (HTML, JSON, XML, etc.).
3. Data Storage and Processing: Design data storage and processing pipelines, ensuring extracted data is clean, structured, and easily accessible.
4. Performance Optimization: Optimize web crawling processes for speed, efficiency, and accuracy, while ensuring minimal impact on source websites.
5. Error Handling and Logging: Implement error-handling mechanisms and logging systems to detect and resolve issues during crawling operations.
6. Data Integrity and Compliance: Ensure data collection practices are ethical, legal, and compliant with site policies and relevant regulations (e.g., robots.txt directives, copyright laws).
Requirements:
Proficiency in Java and experience with Java-based web scraping libraries (e.g., Jsoup, Apache
Knowledge of web crawling frameworks and tools, such as Scrapy, Selenium, or Puppeteer.
Strong understanding of HTML, CSS, JavaScript, and web data structures.
Familiarity with data parsing and handling techniques for JSON, XML, and other common formats.
Experience with database technologies (SQL, NoSQL) to store and manage scraped data.
Knowledge of network protocols (e.g., HTTP), request headers, proxies, and load handling.