This project serves as an example of a Python Scrapy project. It scrapes book data from https://books.toscrape.com/.
To use this scraper, you need to install the Apify CLI. Follow the installation instructions at https://docs.apify.com/cli.
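If you have Node.js and npm available, one common way to install the CLI globally is:

npm install -g apify-cli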
Make sure you have Python installed. If not, download it from https://www.python.org/downloads/. Any version supported by the Apify SDK and Scrapy should be fine.
Additionally, install Virtualenv using the following command:
pip install virtualenv
Create a Python virtual environment by running the following command (replace python3.12 with whichever supported Python version you have installed):
python3.12 -m virtualenv .venv
Activate the virtual environment (on Windows, run .venv\Scripts\activate instead):
source .venv/bin/activate
Install Python dependencies:
pip install -r requirements.txt -r requirements-dev.txt
The project can still be run as a plain Scrapy project, without the Apify platform. To crawl the site and export the scraped books to books.json, execute the following command:
scrapy crawl book_spider -o books.json
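For orientation, a minimal spider for books.toscrape.com could look roughly like the sketch below. The class name and CSS selectors are illustrative and may differ from the spider shipped in this project; only the spider name book_spider is taken from the command above.

import scrapy


class BookSpider(scrapy.Spider):
    name = "book_spider"  # the name used by `scrapy crawl book_spider`
    start_urls = ["https://books.toscrape.com/"]

    def parse(self, response):
        # Each book on a listing page is wrapped in <article class="product_pod">.
        for book in response.css("article.product_pod"):
            yield {
                "title": book.css("h3 a::attr(title)").get(),
                "price": book.css("p.price_color::text").get(),
                "url": response.urljoin(book.css("h3 a::attr(href)").get()),
            }
        # Follow pagination until the last listing page is reached.
        next_page = response.css("li.next a::attr(href)").get()
        if next_page is not None:
            yield response.follow(next_page, callback=self.parse)

With a spider like this, the command above collects every scraped book into books.json.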
To run the scraper locally as an Apify Actor, use the following command (the --purge flag clears the storage of any previous local run first):
apify run --purge
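When the project runs as an Actor, scraped items typically end up in an Apify dataset rather than only a local file. The snippet below is a minimal sketch, not this project's actual code, of a Scrapy item pipeline that forwards items to the dataset via the Apify SDK for Python; it assumes the Actor context has already been initialized by the Actor's entrypoint and that Scrapy is running on the asyncio reactor.

from apify import Actor


class ApifyDatasetPushPipeline:
    # Hypothetical pipeline: stores each scraped item in the default dataset
    # of the current Actor run.
    async def process_item(self, item, spider):
        await Actor.push_data(dict(item))
        return item

Such a pipeline would be enabled through the ITEM_PIPELINES setting of the Scrapy project.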
Before deploying, log in to your Apify account. You will need to provide your Apify API token to complete this step:
apify login
The following command will build and deploy the Actor on the Apify Platform. Once it finishes, you can find your newly created Actor in the Apify Console under Actors -> My Actors:
apify push
To learn more about Apify and Actors, take a look at the following resources:
Apify SDK for Python documentation: https://docs.apify.com/sdk/python/
Apify platform documentation: https://docs.apify.com/platform
Scrapy documentation: https://docs.scrapy.org/