Split Markdown into structured chunks using header hierarchy. Built with LangChain, it preserves metadata for RAG, documentation, and analysis. Configure which headers to split on, optionally strip them from chunk content, and integrate with vector databases. Ideal for AI workflows.
Split Markdown content into structured chunks using header hierarchy
This actor intelligently splits Markdown documents into semantically meaningful chunks based on header hierarchy. Built with LangChain's `MarkdownHeaderTextSplitter`, it preserves metadata and structure for downstream applications like RAG systems, documentation processing, and content analysis.
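To show how header-based splitting behaves, here is a minimal sketch using LangChain's `MarkdownHeaderTextSplitter` directly. It is an illustration, not the actor's exact code; the sample text and header mapping are placeholders, and the `strip_headers` argument assumes a recent LangChain version.

```python
# Minimal sketch of header-based splitting with LangChain's
# MarkdownHeaderTextSplitter. Sample text and header mapping are illustrative.
from langchain_text_splitters import MarkdownHeaderTextSplitter

markdown_text = (
    "# Title\n"
    "## Section 1\n"
    "Content...\n"
    "## Section 2\n"
    "More content..."
)

# Each tuple maps a header prefix to the metadata key it populates.
headers_to_split_on = [
    ("#", "Header 1"),
    ("##", "Header 2"),
]

splitter = MarkdownHeaderTextSplitter(
    headers_to_split_on=headers_to_split_on,
    strip_headers=True,  # drop the header line from the chunk body
)

for doc in splitter.split_text(markdown_text):
    print(doc.metadata, doc.page_content)
    # e.g. {'Header 1': 'Title', 'Header 2': 'Section 1'} Content...
```

Each resulting chunk carries the headers above it as metadata, which is what makes the output useful for context tracking downstream.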
✅ Header-Based Chunking
Split content at specified header levels (e.g., `#`, `##`, `###`).
✅ Metadata Preservation
Each chunk includes hierarchical header metadata for context tracking.
✅ Flexible Configuration
Choose which header levels to split on and whether headers are stripped from chunk content.
✅ RAG-Ready Output
Chunks are formatted for direct use in vector databases or LLM pipelines.
RAG Systems
Split long documents into context-aware chunks for retrieval-augmented generation pipelines.
Documentation Processing
Break technical docs into well-scoped sections for downstream processing.
Content Analysis
Analyze or report on document structure (e.g., API reference parsing), as sketched below.
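As a small illustration of the content-analysis use case, the header metadata attached to each chunk can be tallied to summarize a document's structure. The sample `chunks` list below mirrors the output format documented later on this page and is purely illustrative.

```python
# Illustrative only: tally chunks per top-level section using the
# "Header 1" metadata produced by the splitter (output format shown below).
from collections import Counter

chunks = [
    {"content": "Section content here",
     "metadata": {"Header 1": "Title", "Header 2": "Section 1"}},
    {"content": "More content...",
     "metadata": {"Header 1": "Title", "Header 2": "Section 2"}},
]

sections = Counter(c["metadata"].get("Header 1", "(no header)") for c in chunks)
for header, count in sections.items():
    print(f"{header}: {count} chunk(s)")
```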
markdown-splitter
```json
{
  "markdown_text": "# Title\n## Section 1\nContent...\n## Section 2\nMore content...",
  "headers_to_split_on": ["#", "##"],
  "strip_headers": true
}
```
| Field | Type | Description |
|---|---|---|
| `markdown_text` | string | Markdown content to split (required) |
| `headers_to_split_on` | array | Header levels to split on (default: `["#", "##", "###", "####", "#####", "######"]`) |
| `strip_headers` | boolean | Remove headers from chunk content (default: `true`) |
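For programmatic use, the input above can be passed to the actor through the Apify Python client. This is a sketch: the actor ID `your-username/markdown-splitter` and the token placeholder are assumptions to replace with your own values.

```python
# Sketch of calling the actor via the Apify Python client.
# The actor ID and token below are placeholders.
from apify_client import ApifyClient

client = ApifyClient("<YOUR_APIFY_TOKEN>")

run_input = {
    "markdown_text": "# Title\n## Section 1\nContent...\n## Section 2\nMore content...",
    "headers_to_split_on": ["#", "##"],
    "strip_headers": True,
}

# Start the actor and wait for it to finish.
run = client.actor("your-username/markdown-splitter").call(run_input=run_input)

# Read the results from the run's default dataset.
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)
```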
```json
{
  "chunks": [
    {
      "content": "Section content here",
      "metadata": {
        "Header 1": "Title",
        "Header 2": "Section 1"
      }
    },
    ...
  ]
}
```
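Because each chunk pairs content with its header metadata, the output maps naturally onto LangChain `Document` objects for RAG use. The snippet below is a sketch that assumes `output` holds the parsed result shown above; the vector-store step is left to whichever database you use.

```python
# Sketch: turn the actor's output chunks into LangChain Documents.
# "output" is assumed to be the parsed JSON shown above.
from langchain_core.documents import Document

output = {
    "chunks": [
        {"content": "Section content here",
         "metadata": {"Header 1": "Title", "Header 2": "Section 1"}},
    ]
}

docs = [
    Document(page_content=chunk["content"], metadata=chunk["metadata"])
    for chunk in output["chunks"]
]

# docs can now be embedded and added to any LangChain-compatible
# vector store, e.g. vector_store.add_documents(docs).
print(docs[0].metadata)
```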