Python Operations Engineer (Web Scraping Focus)

【Location】Ho Chi Minh City

【Job Overview】
We are seeking an experienced Python Operations Engineer to develop and maintain web scraping systems. Candidates need solid Python programming skills and extensive hands-on experience building scrapers.

【Key Responsibilities】
1. Design, develop, and maintain distributed web scraping systems
2. Perform multi-platform information extraction, cleaning, and analysis
3. Optimize scraping strategies to improve extraction efficiency across platforms
4. Monitor scraping progress in real time and act on alert feedback
5. Solve anti-scraping technical challenges to ensure data collection stability
6. Participate in scraping-related architecture design and development work

【Qualifications】
1. Bachelor’s degree or above, preferably in Computer Science or related fields
2. Proficient in Python programming with 3+ years of relevant work experience

【Essential Skills】
1. Familiarity with Linux operating systems and strong system operations capabilities
2. In-depth understanding of the HTTP protocol and of web scraping principles and techniques
3. Expertise in common scraping frameworks such as Scrapy and pyspider
4. Proficiency in HTML, DOM structure, and data extraction techniques like XPath, regular expressions, and CSS selectors
5. Knowledge of common anti-scraping techniques and ability to counter them
6. Experience with distributed scraping architectures and large-scale data processing

【Preferred Skills】
1. Familiarity with web front-end technologies and understanding of JavaScript dynamic rendering
2. Experience in data mining and machine learning
3. Proficiency with MySQL, MongoDB, and other databases
4. Experience with link analysis (e.g., PageRank, TrustRank)
5. Ability to perform feature extraction (e.g., page quality evaluation, topic analysis, LDA)
6. Ability to resolve complex issues such as account bans, IP blocks, and CAPTCHA recognition

【Tech Stack】
– Programming Languages: Python (required), Shell scripting (preferred)
– Operating System: Linux
– Databases: MySQL, MongoDB
– Scraping Frameworks: Scrapy, pyspider
– Version Control: Git
– Other Tools: Regular expressions, XPath, BeautifulSoup

【Soft Skills】
1. Strong desire to learn and strong problem-solving abilities
2. Excellent teamwork and communication skills
3. Ability to work under pressure, with a responsible and proactive work attitude
4. Innovative mindset and the ability to continuously optimize work processes and technical solutions

【Work Environment】
– Dynamic technical team
– Competitive salary package
– Flexible working hours
– Opportunities for continuous learning and growth