Extract clean article content, metadata and structured information from any web page. Supports multiple URLs and returns well-formatted JSON with title, description, content, author, publish date and more.
Extract clean article content and metadata from any web page automatically. This actor helps you get structured content from news sites, blogs, and other article-based websites.
Features
Extract article content and metadata from any URL
Support batch processing of multiple URLs
Clean and structured JSON output
Built-in rate limiting to avoid overloading target sites
Robust error handling and validation
Fast and efficient processing
Output Data Structure
The actor extracts the following information from each article:
Title
Description
Main content (both HTML and plain text)
Author
Publication date
Source domain
Featured image URL
Related links
Tags
Scraping timestamp
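If you consume the output in code, the fields above can be modeled as a typed record. The following is a minimal sketch in Python. The key names (`url`, `publishedDate`, `scrapedAt`, and so on) are taken from the sample JSON output on this page; the class name `ArticleRecord` and the helper `from_item` are hypothetical and not part of the actor.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ArticleRecord:
    """One dataset item. Key names follow the sample JSON output."""

    url: str
    title: str = ""
    description: str = ""
    content: str = ""        # main content (HTML in the sample output)
    author: str = ""         # empty string when no author was found
    publishedDate: str = ""  # empty string when no date was found
    source: str = ""         # source domain, e.g. "fancode.com"
    image: str = ""          # featured image URL
    links: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)
    scrapedAt: str = ""      # ISO 8601 timestamp string

    @classmethod
    def from_item(cls, item: Dict[str, Any]) -> "ArticleRecord":
        # Keep only the keys this sketch knows about and ignore the rest,
        # so an unexpected extra field does not raise an error.
        known = {name: item[name] for name in cls.__dataclass_fields__ if name in item}
        return cls(**known)
```

Fields that the actor could not find (such as `author` in the sample) arrive as empty strings, so the defaults here mirror that behavior.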
Use Cases
Content aggregation and syndication
News monitoring and analysis
Research and data collection
Content migration
SEO analysis
Digital archiving
Limitations
Respects robots.txt and implements polite scraping
2-second delay between requests to avoid overwhelming target servers
URLs must be valid and accessible
Content extraction quality depends on page structure
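To illustrate what the robots.txt check and the fixed delay mean in practice, here is a minimal sketch in Python. It is not the actor's actual implementation: the function names `allowed_by_robots` and `process_politely` are hypothetical, and the robots.txt text is passed in as a string so the example runs without network access.

```python
import time
from typing import Callable, Iterable, List
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text permits fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def process_politely(
    urls: Iterable[str],
    handler: Callable[[str], object],
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[object]:
    """Call handler on each URL, pausing delay_seconds between calls."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request except the first one.
            sleep(delay_seconds)
        results.append(handler(url))
    return results
```

With a 2-second delay, a batch of N URLs spends at least 2 × (N − 1) seconds waiting, which is worth keeping in mind when sizing large runs.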
Tips for Best Results
Provide valid, accessible URLs
Use for public content only
Consider target website's terms of service
Monitor execution logs for any issues
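One way to follow the first tip is to screen a URL list before submitting it. The sketch below is an assumption about a reasonable pre-check, not something the actor requires: `is_plausible_url` and `split_urls` are hypothetical helpers, and they only test syntax (an http or https scheme plus a host), not whether the page is actually reachable.

```python
from typing import Iterable, List, Tuple
from urllib.parse import urlparse


def is_plausible_url(candidate: str) -> bool:
    """Syntactic check only: http(s) scheme and a non-empty host."""
    parsed = urlparse(candidate.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def split_urls(candidates: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Separate candidates into (valid, rejected) lists, preserving order."""
    valid: List[str] = []
    rejected: List[str] = []
    for candidate in candidates:
        (valid if is_plausible_url(candidate) else rejected).append(candidate)
    return valid, rejected
```

Reviewing the rejected list before a run catches common mistakes such as a missing `https://` prefix.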
Need help or have questions? Feel free to reach out!
The results will be wrapped into a dataset which you can always find in the Storage tab. Here's an excerpt from the data you'd get if you apply the input parameters above:
And here is the same data but in JSON. You can choose in which format to download your data: JSON, JSONL, Excel spreadsheet, HTML table, CSV, or XML.
[
  {
    "url": "https://www.fancode.com/pickleball/schedule",
    "title": "Pickleball Schedule - Check International and Domestic matches on FanCode",
    "description": "ABOUT FANCODEIndia's Premium Live Streaming, Live Scores & Sports Merchandise Shopping platform FanCode has grown to become one of the most loved and followed all-sports destination in the last few years....",
    "content": "<div><p><label>ABOUT FANCODE</label><label>India's Premium Live Streaming, Live Scores & Sports Merchandise Shopping platform FanCode has grown to become one of the most loved and followed all-sports destination in the last few years. The FanCode app has been downloaded by more than 3+ crore users. It offers interactive live streaming of all major sporting events, premier cricket tournaments, women's cricket, live football, basketball, baseball, wrestling, badminton, and other major sports. It also offer real-time match highlights, match videos, cricket videos, India cricket highlights, highlights of today's match, highlights of yesterday's match, cricket data, statistics, cricket analysis, fantasy insights, cricket updates, breaking news from India cricket and world of sports. It also offers sports merchandise for all major sporting leagues and teams from across the world.</label></p></div>",
    "author": "",
    "publishedDate": "",
    "source": "fancode.com",
    "image": "https://www.fancode.com/skillup-uploads/fc-web/home-page-new-arc/hero-image/v1/hero-image-dweb-v4.png",
    "links": [
      "https://www.fancode.com/pickleball/schedule"
    ],
    "tags": [],
    "scrapedAt": "2025-02-05T07:19:26.119Z"
  },
  ...
]
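A downloaded JSON export can be post-processed with a few lines of code. The sketch below, in Python, loads items, reports which optional fields came back empty, and parses the `scrapedAt` timestamp. The helper names are hypothetical, and the inline `sample` is an abbreviated copy of the record above, used only so the example is self-contained.

```python
import json
from datetime import datetime
from typing import Any, Dict, List, Sequence


def load_items(raw_json: str) -> List[Dict[str, Any]]:
    """Parse a JSON export (a list of article objects)."""
    return json.loads(raw_json)


def missing_fields(
    item: Dict[str, Any],
    names: Sequence[str] = ("author", "publishedDate"),
) -> List[str]:
    """List the named fields that are absent or empty in this item."""
    return [name for name in names if not item.get(name)]


def parse_scraped_at(item: Dict[str, Any]) -> datetime:
    """Convert the ISO 8601 'scrapedAt' string to an aware datetime."""
    # Replace the trailing "Z" so this also works on Python older than 3.11.
    return datetime.fromisoformat(item["scrapedAt"].replace("Z", "+00:00"))


# Abbreviated copy of the sample record, for illustration only.
sample = """[
  {
    "url": "https://www.fancode.com/pickleball/schedule",
    "author": "",
    "publishedDate": "",
    "source": "fancode.com",
    "tags": [],
    "scrapedAt": "2025-02-05T07:19:26.119Z"
  }
]"""

items = load_items(sample)
for item in items:
    print(item["url"], missing_fields(item), parse_scraped_at(item).date())
```

Checking for empty fields up front is useful because, as the sample shows, `author` and `publishedDate` can be empty strings when the page does not expose them.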
Related Actors
URL Metadata Crawler - Extract comprehensive metadata from web pages including meta tags, favicons, and Open Graph tags.
Google News Scraper - Collect up to 5000 news articles with flexible search options and language support.
arXiv Search Scraper - Extract comprehensive research paper data including titles, authors, and abstracts.
Is it legal to scrape articles or other public data?
Yes, if you're scraping publicly available data for personal or internal use. Always review the website's Terms of Service before large-scale use or redistribution.
Do I need to code to use this scraper?
No. This is a no-code tool: just enter one or more URLs and run the scraper directly from your dashboard or Apify actor page.
What data does it extract?
It extracts the title, description, main content, author, publication date, source domain, featured image URL, related links, and tags, plus a scraping timestamp. You can export all of it to Excel or JSON.
Can I scrape multiple pages?
Yes, the actor supports batch processing, so you can submit multiple URLs in a single run.
How do I get started?
You can use the Try Now button on this page to go to the scraper. You'll be guided to enter your URLs and get structured results. No setup needed!