Scrape product information from the product details page, product category page, and search page. The product category page includes `productgrouplist`, `featureproductlist`, `productlist`, etc.
A Python template for scraping data from websites, starting from the start URLs and following the links enqueued from them. The start URLs are passed in via the Actor input, which is defined by the input schema. The template uses HTTPX to get the HTML of each page and Beautiful Soup to parse the data from it. Enqueued URLs are available in the request queue. The extracted data is then stored in a dataset, where you can easily access it.
This code is a Python script that uses HTTPX and Beautiful Soup to scrape web pages and extract data from them. Here's a brief overview of how it works:
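- The script reads the Actor input, which is expected to contain a `start_urls` key with a list of URLs to scrape and a `max_depth` key with the maximum depth of nested links to follow.
- The extracted data is pushed to the dataset using the `push_data` method of the Actor instance.
- Exceptions raised during scraping are logged using the `Actor.log.exception` method.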
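The workflow described above can be sketched roughly as follows. This is a minimal illustration rather than the Actor's actual source: it uses only the Python standard library (`html.parser` in place of Beautiful Soup, and an injected `fetch` callable in place of an HTTPX request) so that it runs without network access. The names `crawl`, `fetch`, and `TitleAndLinkParser`, the choice to extract page titles, and the convention that start URLs sit at depth 0 are all assumptions made for this sketch.

```python
import logging
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

logger = logging.getLogger("crawl_sketch")


class TitleAndLinkParser(HTMLParser):
    """Collects the <title> text and all <a href> targets from one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def crawl(actor_input, fetch):
    """Depth-limited crawl.

    `fetch(url) -> str` is a hypothetical stand-in for an HTTPX GET, and the
    returned list stands in for the dataset that push_data would fill.
    """
    max_depth = actor_input.get("max_depth", 1)
    queue = deque()
    for item in actor_input.get("start_urls", []):
        # Accept plain URL strings or {"url": ...} objects (assumed shapes).
        url = item["url"] if isinstance(item, dict) else item
        queue.append((url, 0))  # start URLs are at depth 0 in this sketch
    seen = {url for url, _ in queue}
    dataset = []

    while queue:
        url, depth = queue.popleft()
        try:
            parser = TitleAndLinkParser()
            parser.feed(fetch(url))
            # Follow nested links only while below the maximum depth.
            if depth < max_depth:
                for href in parser.links:
                    target = urljoin(url, href)
                    if target.startswith(("http://", "https://")) and target not in seen:
                        seen.add(target)
                        queue.append((target, depth + 1))
            dataset.append({"url": url, "title": parser.title.strip()})
        except Exception:
            # Plays the role of Actor.log.exception: log and continue.
            logger.exception("Cannot extract data from %s", url)
    return dataset
```

In a real Actor, `fetch` would wrap an HTTPX request, each record would be passed to `push_data` instead of being appended to a list, and the queue would be the Actor's request queue. Keeping the fetcher injectable also makes the traversal logic easy to exercise against in-memory pages.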
For complete information see this article. In short, you will:
If you would like to develop locally, you can pull the existing Actor from Apify Console using the Apify CLI:

1. Install `apify-cli`
   - Using Homebrew: `brew install apify-cli`
   - Using NPM: `npm -g install apify-cli`
2. Pull the Actor by its unique `<ActorId>`, which is either the Actor's unique name or its Actor ID. You can find both by clicking on the Actor title at the top of the page, which will open a modal containing both the Actor unique name and the Actor ID.

   This command will copy the Actor into the current directory on your local machine:

   `apify pull <ActorId>`
To learn more about Apify and Actors, take a look at the following resources:
Yes, if you're scraping publicly available data for personal or internal use. Always review the website's Terms of Service before large-scale use or redistribution.
No. This is a no-code tool: just enter a job title and location, then run the scraper directly from your dashboard or the Apify Actor page.
It extracts job titles, companies, salaries (if available), descriptions, locations, and post dates. You can export all of it to Excel or JSON.
Yes, you can scrape multiple pages and refine results by job title, location, keyword, or other criteria, depending on the input settings you use.
You can use the Try Now button on this page to go to the scraper. You’ll be guided to input a search term and get structured results. No setup needed!