The /llms.txt Generator extracts website content to create an llms.txt file for AI apps like LLM fine-tuning and indexing. Output is available in the Key-Value Store for easy download and integration into workflows.
The /llms.txt Generator is an Apify Actor that extracts essential website content and generates an /llms.txt file, making your content ready for AI-powered applications such as fine-tuning, indexing, and integrating large language models (LLMs) like GPT-4, ChatGPT, or LLaMA. This Actor uses the Website Content Crawler Actor to perform deep crawls and extract text content from web pages, ensuring comprehensive data collection. The Website Content Crawler is particularly useful because it supports output in multiple formats, including markdown, which is the format /llms.txt uses.
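The /llms.txt format is a markdown-based standard for providing AI-friendly content. It contains a title, an optional description, optional details, and named sections of links, the last of which may be an "Optional" section.
Proposed structure: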
```markdown
# Title

> Optional description

Optional details go here

## Section name

- [Link title](https://link_url): Optional link details

## Optional

- [Link title](https://link_url)
```
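The proposed structure can be produced mechanically from a title, an optional description, and a list of link sections. The following is a minimal sketch of that idea, not the Actor's actual implementation; the `Link`, `Section`, and `render_llms_txt` names are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    title: str
    url: str
    details: str = ""  # optional text shown after the colon


@dataclass
class Section:
    name: str
    links: list[Link] = field(default_factory=list)


def render_llms_txt(title: str, sections: list[Section],
                    description: str = "", details: str = "") -> str:
    """Render an H1 title, optional blockquote, optional details, and H2 link lists."""
    blocks = [f"# {title}"]
    if description:
        blocks.append(f"> {description}")
    if details:
        blocks.append(details)
    for section in sections:
        lines = [f"## {section.name}", ""]
        for link in section.links:
            entry = f"- [{link.title}]({link.url})"
            if link.details:
                entry += f": {link.details}"
            lines.append(entry)
        blocks.append("\n".join(lines))
    # Blocks are separated by one blank line, matching the structure above.
    return "\n\n".join(blocks) + "\n"
```

Calling `render_llms_txt` with the placeholder values from the proposed structure reproduces that structure, including the trailing `## Optional` section.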
By adding an /llms.txt file to your website, you make it easy for AI systems to understand, index, and use your content effectively.
Our Actor is designed to simplify and automate the creation of /llms.txt files.
Input example:

```json
{
  "startUrl": "https://docs.apify.com",
  "maxCrawlDepth": 1
}
```
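For programmatic use, the same input can be built and sent with the `apify-client` Python package. This is a rough sketch under stated assumptions: the Actor ID and the Key-Value Store record key are not given in this document, so both are passed in as parameters (the `"llms.txt"` default is a guess), and the sanity checks in `build_run_input` are illustrative rather than the Actor's own validation.

```python
from urllib.parse import urlparse


def build_run_input(start_url: str, max_crawl_depth: int = 1) -> dict:
    """Build the Actor input shown above, with basic sanity checks (assumed, not official)."""
    parsed = urlparse(start_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"startUrl must be an absolute http(s) URL: {start_url!r}")
    if (isinstance(max_crawl_depth, bool)
            or not isinstance(max_crawl_depth, int)
            or max_crawl_depth < 0):
        raise ValueError("maxCrawlDepth must be a non-negative integer")
    return {"startUrl": start_url, "maxCrawlDepth": max_crawl_depth}


def run_generator(token: str, actor_id: str, run_input: dict,
                  record_key: str = "llms.txt"):
    """Run the Actor and fetch the generated file from its default Key-Value Store.

    actor_id and record_key are placeholders: check the Actor page for the real values.
    """
    from apify_client import ApifyClient  # third-party: pip install apify-client

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    if run is None:
        return None
    store = client.key_value_store(run["defaultKeyValueStoreId"])
    record = store.get_record(record_key)
    # get_record returns None when the key does not exist in the store.
    return record["value"] if record else None
```

Keeping input construction separate from the network call makes the input easy to check locally before spending any platform credits on a run.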
Output example (/llms.txt):

```markdown
# docs.apify.com

## Index

- [Home | Platform | Apify Documentation](https://docs.apify.com/platform): Apify is your one-stop shop for web scraping, data extraction, and RPA. Automate anything you can do manually in a browser.
- [Web Scraping Academy | Academy | Apify Documentation](https://docs.apify.com/academy): Learn everything about web scraping and automation with our free courses that will turn you into an expert scraper developer.
- [Apify Documentation](https://docs.apify.com/api)
- [API scraping | Academy | Apify Documentation](https://docs.apify.com/academy/api-scraping): Learn all about how the professionals scrape various types of APIs with various configurations, parameters, and requirements.
- [API client for JavaScript | Apify Documentation](https://docs.apify.com/api/client/js/)
- [Apify API | Apify Documentation](https://docs.apify.com/api/v2)
- [API client for Python | Apify Documentation](https://docs.apify.com/api/client/python/)
...
```
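Because the output is plain markdown with one link per list item, it is straightforward to read back into structured records for indexing. The sketch below assumes the output follows the `- [title](url): description` pattern shown above; the `Entry` type and `parse_llms_txt` function are hypothetical helpers, not part of the Actor.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    section: str
    title: str
    url: str
    description: Optional[str]


# Matches "- [title](url)" with an optional ": description" suffix.
LINK_RE = re.compile(
    r"^- \[(?P<title>[^\]]+)\]\((?P<url>\S+?)\)(?:: (?P<desc>.*))?$"
)


def parse_llms_txt(text: str) -> list[Entry]:
    """Collect link entries, remembering the H2 section each one appears under."""
    entries = []
    section = ""
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("## "):
            section = line[3:].strip()
            continue
        match = LINK_RE.match(line)
        if match:
            entries.append(
                Entry(section, match["title"], match["url"], match["desc"])
            )
    return entries
```

Lines that are not link items, such as the title, blank lines, or a trailing `...`, are skipped. Link titles that themselves contain `]` are not handled by this pattern.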
Start generating /llms.txt files today and empower your AI applications with clean, structured, and AI-friendly data!
Yes, if you're scraping publicly available data for personal or internal use. Always review the website's Terms of Service before large-scale use or redistribution.
No. This is a no-code tool: just enter a job title and location, then run the scraper directly from your dashboard or the Apify Actor page.
It extracts job titles, companies, salaries (if available), descriptions, locations, and post dates. You can export all of it to Excel or JSON.
Yes, you can scrape multiple pages and refine by job title, location, keyword, or more depending on the input settings you use.
You can use the Try Now button on this page to go to the scraper. You'll be guided to input a search term and get structured results. No setup needed!