Labor Chronicle Crawler

Overview

The Labor Chronicle Crawler is a component of the Labor Chronicle application, designed to gather labor-related news articles from news platforms and labor organizations. This crawler is specifically tuned to identify and retrieve news content that pertains to labor issues.

Purpose

This crawler is developed to support public knowledge related to labor rights and developments. It helps consolidate news from various sources into a single, accessible platform, making it easier for users to stay informed about significant labor-related issues.

Operational Details

Target Content: The crawler is programmed to search for and retrieve articles that explicitly relate to labor topics. It uses predefined keywords and categories (such as "labor rights," "unions," "wages," "employment law") to filter content during the crawling process.
Frequency: To minimize server load and respect each target site's bandwidth, the crawler runs once daily during off-peak hours.
User-Agent String: The crawler identifies itself with: LaborChronicleCrawler/1.0 (+https://github.com/LaborChronicle/crawling)
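
As a rough illustration of the keyword filtering described above, the sketch below checks whether an article mentions any of the example keywords. It is a minimal sketch and not the crawler's actual code: the function name `matches_labor_topic`, the `LABOR_KEYWORDS` tuple, and the choice of case-insensitive substring matching are all assumptions made for this example.

```python
# Hypothetical keyword filter. The keyword list is taken from the examples
# in "Target Content"; the real crawler's list and matching rules may differ.
LABOR_KEYWORDS = ("labor rights", "unions", "wages", "employment law")


def matches_labor_topic(title: str, body: str, keywords=LABOR_KEYWORDS) -> bool:
    """Return True if any keyword appears in the title or body.

    Matching is a case-insensitive substring test (an assumption made
    for this sketch).
    """
    text = f"{title}\n{body}".casefold()
    return any(keyword.casefold() in text for keyword in keywords)


if __name__ == "__main__":
    print(matches_labor_topic("Teachers' unions vote on new contract", ""))
    print(matches_labor_topic("Local bakery opens", "Fresh bread daily."))
```

Substring matching is deliberately simple and will over-match in some cases (for example, "unions" inside "reunions"); a word-boundary or category-based check would be stricter.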

Compliance with `robots.txt`

Adherence to Directives: This crawler strictly adheres to the directives outlined in the robots.txt files of all target websites. It is configured to respect all Disallow and Allow rules to ensure compliance with each site's policy on automated access.
Respect for Site Architecture: The crawler is designed to navigate and parse websites without causing undue strain or impact on their operational performance.
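
One way a robots.txt check like the one described above could be done is with Python's standard-library `urllib.robotparser`. This is a sketch under assumptions, not the crawler's confirmed implementation: the helper names `build_parser` and `is_allowed`, and the `example.org` URLs, are hypothetical. Only the User-Agent string comes from this README.

```python
from urllib.robotparser import RobotFileParser

# User-Agent string as documented in "Operational Details".
USER_AGENT = "LaborChronicleCrawler/1.0 (+https://github.com/LaborChronicle/crawling)"


def build_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, url: str, user_agent: str = USER_AGENT) -> bool:
    """Return True if the parsed rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical robots.txt content, for illustration only.
    sample = "User-agent: *\nDisallow: /private/\n"
    rules = build_parser(sample)
    print(is_allowed(rules, "https://example.org/news/strike-update"))
    print(is_allowed(rules, "https://example.org/private/drafts"))
```

In a live crawl, `RobotFileParser.set_url()` followed by `read()` would fetch the file from the target site instead of parsing a string. Note that the standard-library parser applies the first matching rule in file order, so sites that rely on longest-match precedence between Allow and Disallow may need a different parser.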

Contact Information

For any inquiries, feedback, or concerns about the Labor Chronicle Crawler, please contact:

Email: litaliencaleb@gmail.com
GitHub Repository: https://github.com/LaborChronicle/crawling
