invictusaman/Indeed-WebScraper

Indeed Scraper

I created a web scraper 🕸️ tool to fetch Indeed data. It returns the job title, company name, job ID, job URL, salary (if present), and the full description of each job.


Step 1: Install dependencies

Install required dependencies in your project folder.

pip install -r requirements.txt

Step 2: Run Indeed_Scraper.py

Make sure the latest version of Chrome ⬇️ is installed on your system. This step creates scraped_job_file.csv; at this point the file does not yet contain job descriptions.

Step 3: Run Extract_Description_Indeed.py

Recommended: Remove duplicate rows from scraped_job_file.csv before running this script.
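One possible way to do that cleanup is sketched below, using only the standard library. The column name `job_id` is an assumption (the scraper returns a job id, but the exact CSV header is not stated here), and `drop_duplicate_rows` and `dedupe_csv` are hypothetical helper names, not part of this repo.

```python
import csv


def drop_duplicate_rows(rows, key="job_id"):
    """Keep only the first row seen for each value of `key`.

    `key` defaults to "job_id", which is an assumed column name.
    """
    seen = set()
    unique = []
    for row in rows:
        value = row.get(key)
        if value in seen:
            continue  # a row with this key was already kept
        seen.add(value)
        unique.append(row)
    return unique


def dedupe_csv(in_path, out_path, key="job_id"):
    """Read a CSV, drop duplicate rows by `key`, and write the result.

    Returns the number of rows that were removed.
    """
    with open(in_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames or []
        rows = list(reader)

    unique = drop_duplicate_rows(rows, key)

    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(unique)
    return len(rows) - len(unique)
```

For example, `dedupe_csv("scraped_job_file.csv", "scraped_job_file_clean.csv")` would write a cleaned copy and leave the original file untouched; adjust `key` to whichever column uniquely identifies a job in your file.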

This step extracts each job_description and assigns it to the corresponding row. It takes a good amount of time, so go grab a coffee ☕. The output is an updated scraped_job_file.csv with the job descriptions merged in.

I did not implement multithreading 🧵 (which would have saved you a lot of time) because of limited time and knowledge. Feel free to fork this repo and implement it. Good luck. 🤓
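If you want to try the multithreaded version, here is a minimal sketch using `concurrent.futures.ThreadPoolExecutor` from the standard library. It is not how Extract_Description_Indeed.py currently works: `fetch_all_descriptions` is a hypothetical helper, and `fetch_description` stands in for whatever function you write to load one job URL and return its description text.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all_descriptions(urls, fetch_description, max_workers=4):
    """Call fetch_description(url) for every URL on a pool of threads.

    Results come back in the same order as `urls`. A URL whose fetch
    raises an exception yields None instead of stopping the whole run.
    `fetch_description` is supplied by the caller (hypothetical here).
    """

    def safe_fetch(url):
        try:
            return fetch_description(url)
        except Exception:
            return None  # leave the description empty for this row

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with CSV rows
        return list(pool.map(safe_fetch, urls))
```

Because results keep the input order, they can be assigned straight back to the rows the URLs came from. If your fetch function drives a browser, each worker would likely need its own browser instance rather than a shared one, and a small `max_workers` value is kinder to the site.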


Further Work:

Implement a pretrained NER model to extract information from the description column, such as the programming languages mentioned, the type of work (remote, hybrid, in-person), and salaries. Alternatively, use simple keyword-matching logic.
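The simple-logic option could look something like the sketch below, which uses regular expressions instead of an NER model. The keyword lists, the function names `find_languages` and `find_work_types`, and the choice to treat "on-site" as in-person are all illustrative assumptions; extend them to suit your data.

```python
import re

# Illustrative patterns only; "on-site" is treated as in-person by assumption.
WORK_TYPES = {
    "remote": r"\bremote\b",
    "hybrid": r"\bhybrid\b",
    "in-person": r"\b(?:in[- ]person|on[- ]?site)\b",
}

# Illustrative list of language names to look for.
LANGUAGES = ["Python", "SQL", "Java", "JavaScript", "C++"]


def find_languages(description, languages=LANGUAGES):
    """Return the names from `languages` that appear as whole tokens."""
    found = []
    for name in languages:
        # Lookarounds keep "Java" from matching inside "JavaScript"
        # and let names ending in "+" (such as "C++") match correctly.
        pattern = r"(?<![\w+#])" + re.escape(name) + r"(?![\w+#])"
        if re.search(pattern, description, flags=re.IGNORECASE):
            found.append(name)
    return found


def find_work_types(description):
    """Return the work-type labels whose pattern occurs in the text."""
    return [
        label
        for label, pattern in WORK_TYPES.items()
        if re.search(pattern, description, flags=re.IGNORECASE)
    ]
```

Applied to each value of the description column, these functions return lists that can be stored in new columns. Keyword matching will miss paraphrases and can produce false positives, which is where an NER model would help.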

Follow my data-analyst journey: Portfolio_Link
