Article Text Extractor

Extracts the article text and other metadata from a given URL. Uses https://github.com/ageitgey/node-unfluff, which is a Node.js implementation of https://github.com/grangier/python-goose.

See also lukaskrivka/article-extractor-smart.

The output is saved to the default key-value store under the OUTPUT key. The HTML of the given page is stored under the page.html key.
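As a rough illustration of reading those two records after a run, the sketch below builds record URLs for the Apify API v2 key-value store endpoint and downloads them with the Python standard library. The function names, the store ID argument, and the optional token are assumptions made for this example and are not part of the actor itself; whether a token is required depends on how the store is accessed.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional, Tuple

# Apify API v2 base URL (assumed to be the public API host).
API_BASE = "https://api.apify.com/v2"


def record_url(store_id: str, key: str) -> str:
    """Build the URL of a single record in a key-value store."""
    return (
        f"{API_BASE}/key-value-stores/"
        f"{urllib.parse.quote(store_id, safe='')}/records/"
        f"{urllib.parse.quote(key, safe='')}"
    )


def fetch_record(store_id: str, key: str, token: Optional[str] = None) -> bytes:
    """Download the raw bytes of one record (hypothetical helper)."""
    request = urllib.request.Request(record_url(store_id, key))
    if token:
        # Assumption: the token is sent as a bearer token when one is needed.
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


def fetch_article(store_id: str, token: Optional[str] = None) -> Tuple[dict, str]:
    """Return the parsed OUTPUT record and the stored page HTML."""
    output = json.loads(fetch_record(store_id, "OUTPUT", token))
    html = fetch_record(store_id, "page.html", token).decode("utf-8", errors="replace")
    return output, html
```

Calling fetch_article with the ID of the run's default key-value store would return the extracted metadata as a dictionary together with the raw HTML as a string.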

Example output:

{
  "title": "Sánchez no logra extender su poder territorial pese al triunfo del 26-M",
  "softTitle": "Sánchez no logra extender su poder territorial pese al triunfo del 26-M",
  "date": "16/06/2019 22:03",
  "author": [
    "Madrid"
  ],
  "publisher": "La Vanguardia",
  "copyright": "La Vanguardia Ediciones Todos los derechos reservados",
  "favicon": "https://www.lavanguardia.com/rsc/images/ico/favicon.ico",
  "description": "El PSOE ganó el pasado 26 de mayo las elecciones municipales y autonómicas de manera 'clara y rotunda', según celebró el propio Pedro Sánchez aquella misma noche. Aunque la victoria socialista se tiñó...",
  "lang": "es",
  "canonicalLink": "https://www.lavanguardia.com/politica/20190617/462906149711/psoe-pedro-sanchez-elecciones-26m-alcaldias-gobiernos-espana.html",
  "tags": [],
  "image": "https://www.lavanguardia.com/r/GODO/LV/p6/WebSite/2019/06/17/Recortada/20190614-636961455890161857_20190614215051428-kvhE-U462903686315FDE-992x558@LaVanguardia-Web.jpg",
  "videos": [],
  "links": [],
  "text": "..."
}
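A record shaped like the example can be loaded into a small typed structure for downstream use. The sketch below is only one way to do that: the Article class and both helper functions are hypothetical names, and the date format is inferred from the single example value ("16/06/2019 22:03", day first). Other sites may produce different date strings, in which case the parsed date is left as None.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Article:
    """Subset of the OUTPUT fields shown in the example (illustrative only)."""
    title: str
    authors: List[str]
    publisher: Optional[str]
    lang: Optional[str]
    canonical_link: Optional[str]
    published: Optional[datetime]
    text: str


def parse_date(raw: Optional[str]) -> Optional[datetime]:
    """Parse a date like "16/06/2019 22:03"; return None for anything else."""
    if not raw:
        return None
    try:
        # Assumption: day/month/year order, as in the example output.
        return datetime.strptime(raw, "%d/%m/%Y %H:%M")
    except ValueError:
        return None


def parse_article(raw_json: str) -> Article:
    """Convert the JSON text of an OUTPUT record into an Article."""
    data = json.loads(raw_json)
    return Article(
        title=data.get("title") or data.get("softTitle") or "",
        authors=list(data.get("author") or []),
        publisher=data.get("publisher"),
        lang=data.get("lang"),
        canonical_link=data.get("canonicalLink"),
        published=parse_date(data.get("date")),
        text=data.get("text") or "",
    )
```

Missing keys fall back to None or an empty value, so the function does not fail on pages where the extractor finds less metadata than in the example.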

Frequently Asked Questions

Is it legal to scrape articles or other public data?

Yes, if you're scraping publicly available data for personal or internal use. Always review the website's Terms of Service before large-scale use or redistribution.

Do I need to code to use this scraper?

No. This is a no-code tool: just enter the article URL and run the scraper directly from your dashboard or the Apify actor page.

What data does it extract?

It extracts the article title, publication date, author, publisher, copyright notice, favicon, description, language, canonical link, tags, main image, videos, links, and the full article text. The result is stored as JSON.

Can I scrape multiple pages or filter by location?

Yes, you can scrape multiple pages and refine by job title, location, keyword, or more depending on the input settings you use.

How do I get started?

You can use the Try Now button on this page to go to the scraper. There you enter the article URL and get structured results. No setup needed!