jaypyles/Scraperr

Scraperr Logo

A powerful self-hosted web scraping solution

MongoDB, FastAPI, Next.js, TailwindCSS

📋 Overview

Scrape websites without writing a single line of code.

📚 Check out the docs for a comprehensive quickstart guide and detailed information.

Scraperr Main Interface

✨ Key Features

  • XPath-Based Extraction: Precisely target page elements
  • Queue Management: Submit and manage multiple scraping jobs
  • Domain Spidering: Option to scrape all pages within the same domain
  • Custom Headers: Add JSON headers to your scraping requests
  • Media Downloads: Automatically download images, videos, and other media
  • Results Visualization: View scraped data in a structured table format
  • Data Export: Export your results in Markdown and CSV formats
  • Notification Channels: Send completion notifications through various channels
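
To make the XPath-based extraction feature concrete, here is a minimal sketch of what targeting page elements with XPath looks like. It is not Scraperr's own code: it uses Python's standard-library `xml.etree.ElementTree`, which supports only a limited XPath subset and needs well-formed markup. The sample page and the `extract` helper are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page (an assumption for this sketch).
PAGE = """
<html>
  <body>
    <h1 class="title">Example Store</h1>
    <ul>
      <li class="item">Widget</li>
      <li class="item">Gadget</li>
      <li class="note">Out of stock</li>
    </ul>
  </body>
</html>
"""


def extract(markup, xpath):
    """Return the stripped text of every element matching the XPath expression."""
    root = ET.fromstring(markup)
    return [(el.text or "").strip() for el in root.findall(xpath)]


# Select elements by tag name and attribute value.
titles = extract(PAGE, ".//h1[@class='title']")
items = extract(PAGE, ".//li[@class='item']")

print(titles)
print(items)
```

In a tool like Scraperr, expressions such as `.//li[@class='item']` would be entered through the interface rather than written in code. Real-world HTML is often not well-formed, so a full XPath engine with a lenient HTML parser is usually needed in practice.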

🚀 Getting Started

Docker

make up

Helm

Refer to the docs for helm deployment: https://scraperr-docs.pages.dev/guides/helm-deployment

βš–οΈ Legal and Ethical Guidelines

When using Scraperr, please remember to:

  1. Respect robots.txt: Always check a website's robots.txt file to verify which pages permit scraping
  2. Terms of Service: Adhere to each website's Terms of Service regarding data extraction
  3. Rate Limiting: Implement reasonable delays between requests to avoid overloading servers
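
The first and third guidelines can be checked programmatically. The sketch below is independent of Scraperr and uses Python's standard-library `urllib.robotparser`. The robots.txt content, the URLs, and the helper names (`build_parser`, `allowed_urls`, `polite_iter`) are hypothetical. In real use the rules would be fetched from the target site rather than hard-coded.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for this sketch).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def build_parser(robots_text):
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def allowed_urls(parser, urls, user_agent="*"):
    """Keep only the URLs that the rules permit for this user agent."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


def polite_iter(urls, delay_seconds):
    """Yield URLs one at a time, pausing between consecutive requests."""
    for index, url in enumerate(urls):
        if index > 0:
            time.sleep(delay_seconds)
        yield url


parser = build_parser(ROBOTS_TXT)
candidates = [
    "https://example.com/",
    "https://example.com/private/data",
    "https://example.com/blog",
]
to_scrape = allowed_urls(parser, candidates)

# Fall back to a one-second pause when the site declares no Crawl-delay.
delay = parser.crawl_delay("*") or 1.0
```

Iterating with `polite_iter(to_scrape, delay)` would then space out requests by the declared delay. Passing this check does not replace reading the site's Terms of Service.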

Disclaimer: Scraperr is intended for use only on websites that explicitly permit scraping. The creator accepts no responsibility for misuse of this tool.

💬 Join the Community

Get support, report bugs, and chat with other users and contributors.

👉 Join the Scraperr Discord

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.

πŸ‘ Contributions

Development made easier with the webapp template.

To get started, simply run make build up-dev.