Transform any website into structured data with AI-powered extraction. This versatile tool combines advanced web scraping with intelligent content analysis to deliver clean, customized JSON output - perfect for automating data collection from any web source.
Transform any website into structured data effortlessly! This powerful Apify actor revolutionizes web scraping by combining AI with precise data extraction. Simply specify what data you need, and watch as advanced AI models intelligently parse web content into clean, structured JSON - saving you countless hours of manual data collection and processing. Perfect for businesses and developers who need reliable, automated data extraction without complex coding or maintenance.
- urls (array): List of web pages to scrape.
- fields (array): Specification of fields to extract, each containing:
  - name: Field name in the output.
  - description: Description to guide the AI. Be as specific and descriptive as possible.
  - type: Data type (string, number, boolean, array, object).
- cssSelector (string, optional): CSS selector to target specific elements on the page. This can greatly reduce the AI cost by reducing the number of input tokens, and it can also improve accuracy. If provided, only text from elements matching this selector is extracted. If not provided, the default content extraction is used. This is an advanced option: if you are not familiar with CSS selectors, do not provide one. Inspect the HTML of a page to find the correct CSS selector.
  Example CSS selectors:
  - main: selects elements with tag "main".
  - #price: selects elements with id "price".
  - .product-details-container .price, .product-details-container .description: selects elements with class "price" and "description" that are descendants of elements with class "product-details-container".
  - article.main-story, .article-body > p: selects elements with tag "article" and class "main-story", as well as direct child "p" elements under elements with class "article-body".
  - .documentation-content h2, .documentation-content .method-signature: selects "h2" elements and elements with class "method-signature" that are descendants of elements with class "documentation-content".
  - .post-container[data-type="user-post"] .content: selects elements with class "content" that are descendants of elements with both class "post-container" and data-type attribute "user-post".
  - #product-listing div.item:not(.ad) .details h3, .price-info span.current-price: selects "h3" elements under elements with class "details" that are descendants of "div" elements with class "item" but not class "ad" under the element with ID "product-listing", as well as "span" elements with class "current-price" under elements with class "price-info".

You can either use one of our predefined models, which we have verified to work well, or specify your own model from OpenRouter. If you use a predefined model, you don't have to bring your own API key: we cover the AI cost and you are charged for it through Apify usage. If you bring your own OpenRouter API key, you are not charged for the AI cost. Your API key is stored securely and encrypted with Apify.
After some testing we found Google Gemini Flash 2.0 to give the best quality for the lowest price.
Free Apify users can only process 1 URL every 24 hours using predefined models to test out this functionality. If you are a free user you will have to upgrade your Apify account to a paying subscription tier to use predefined models or bring your own OpenRouter API key.
- predefinedModel (string): Choose from the supported models.
- useCustomModel (boolean): Toggle to use your own model.
- custom model (string): OpenRouter model identifier, e.g. google/gemini-2.0-flash-001.
- API key (string): Your API key for custom model access (stored encrypted).

Make sure your model supports structured outputs. Check model compatibility at: https://openrouter.ai/models?supported_parameters=structured_outputs
- proxyConfiguration (object): Configure proxy settings for web scraping.

Example input:

{
  "urls": [
    "https://apify.com/clockworks/free-tiktok-scraper"
  ],
  "fields": [
    {
      "name": "name",
      "description": "The name/title of the scraper tool",
      "type": "string"
    },
    {
      "name": "price",
      "description": "The price per 1000 results, only the number",
      "type": "number"
    },
    {
      "name": "author",
      "description": "The author or maintainer of the scraper",
      "type": "string"
    }
  ],
  "cssSelector": "main > article",
  "useCustomModel": false,
  "predefinedModel": "google/gemini-2.0-flash-001",
  "proxyConfiguration": {
    "useApifyProxy": true,
    "apifyProxyGroups": [
      "RESIDENTIAL"
    ]
  }
}
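Before starting a run, it can help to sanity-check the input locally. The following is a minimal sketch, not part of the actor: the helper name build_input is hypothetical, and it only assumes the keys that appear in the example input (urls, fields, cssSelector) and the five data types listed for fields.

```python
import json

# Data types the field specification accepts, as listed in the input description.
ALLOWED_TYPES = {"string", "number", "boolean", "array", "object"}


def build_input(urls, fields, css_selector=None):
    """Assemble an input object and fail early on obvious mistakes.

    Hypothetical helper for illustration; the actor does its own validation.
    """
    if not urls:
        raise ValueError("at least one URL is required")
    for field in fields:
        missing = {"name", "description", "type"} - field.keys()
        if missing:
            raise ValueError(f"field {field!r} is missing keys: {sorted(missing)}")
        if field["type"] not in ALLOWED_TYPES:
            raise ValueError(f"unsupported type: {field['type']!r}")
    payload = {"urls": list(urls), "fields": list(fields)}
    if css_selector:
        # Optional: only include the selector when one is provided.
        payload["cssSelector"] = css_selector
    return payload


actor_input = build_input(
    urls=["https://apify.com/clockworks/free-tiktok-scraper"],
    fields=[
        {"name": "name", "description": "The name/title of the scraper tool", "type": "string"},
        {"name": "price", "description": "The price per 1000 results, only the number", "type": "number"},
    ],
    css_selector="main > article",
)
print(json.dumps(actor_input, indent=2))
```

The printed JSON can be pasted into the actor's input editor. Model and proxy settings are left out here and would need to be added as shown in the example input.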
The actor outputs a dataset where each item contains:
- url: The source URL

Example output:

{
  "url": "https://apify.com/clockworks/free-tiktok-scraper",
  "author": "Clockworks",
  "name": "TikTok Data Extractor",
  "price": 4
}
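Because the output is produced by an AI model, it may be worth checking each dataset item against the field types you requested. This is a rough sketch under assumptions: check_item is a hypothetical helper, and it treats a missing or null field as a problem, which may or may not match how the actor reports values it could not find.

```python
# Map the declared field types to the Python types that json.loads produces.
PYTHON_TYPES = {
    "string": str,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}


def check_item(item, fields):
    """Return a list of human-readable problems found in one dataset item."""
    problems = []
    if "url" not in item:
        problems.append("missing 'url'")
    for field in fields:
        name = field["name"]
        expected = PYTHON_TYPES[field["type"]]
        value = item.get(name)
        if value is None:
            problems.append(f"missing or null field: {name}")
        # bool is a subclass of int in Python, so reject it for "number" explicitly.
        elif not isinstance(value, expected) or (
            field["type"] == "number" and isinstance(value, bool)
        ):
            problems.append(f"{name}: expected {field['type']}, got {type(value).__name__}")
    return problems


fields = [
    {"name": "name", "type": "string"},
    {"name": "price", "type": "number"},
    {"name": "author", "type": "string"},
]
item = {
    "url": "https://apify.com/clockworks/free-tiktok-scraper",
    "author": "Clockworks",
    "name": "TikTok Data Extractor",
    "price": 4,
}
print(check_item(item, fields))  # prints []
```

Items that come back with problems are good candidates for a more specific field description or a narrower CSS selector.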
There are 3 costs to using this model: startup cost, cost per result and AI cost. We split it up like this to make our pricing as competitive as possible.
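You can check how many tokens are in a given text by using the OpenAI Tokenizer: https://platform.openai.com/tokenizer. As a rough rule of thumb for English text, 1 token is about 4 characters, or roughly three-quarters of a word. The exact count depends on the model's tokenizer.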
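For a quick estimate without opening the tokenizer, a character-based heuristic is often close enough to compare two inputs. The sketch below assumes about 4 characters per token, a common rule of thumb for English text rather than an exact count; estimate_tokens and the sample strings are made up for illustration. It shows why narrowing the page content with a CSS selector lowers the number of input tokens.

```python
def estimate_tokens(text, chars_per_token=4.0):
    """Rough token estimate. chars_per_token is a heuristic, not a real tokenizer."""
    if not text:
        return 0
    return max(1, round(len(text) / chars_per_token))


# Hypothetical page text: lots of navigation noise around the one part we care about.
selected = "Price: $4 per 1000 results"
full_page = "Navigation Home Pricing Docs " * 50 + selected

print("full page:", estimate_tokens(full_page))
print("selected element only:", estimate_tokens(selected))
```

For billing-sensitive runs, use the tokenizer that matches your chosen model instead of this approximation.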
Yes, if you're scraping publicly available data for personal or internal use. Always review the website's Terms of Service before large-scale use or redistribution.
No. This is a no-code tool — just enter a job title, location, and run the scraper directly from your dashboard or Apify actor page.
It extracts job titles, companies, salaries (if available), descriptions, locations, and post dates. You can export all of it to Excel or JSON.
Yes, you can scrape multiple pages and refine by job title, location, keyword, or more depending on the input settings you use.
You can use the Try Now button on this page to go to the scraper. You’ll be guided to input a search term and get structured results. No setup needed!