Legacy PhantomJS Crawler – Automate Web Scraping with Ease
Intro:
The Legacy PhantomJS Crawler is a powerful web scraping tool designed to help users extract data from websites using front-end JavaScript code and a headless browser environment. It is especially useful for developers needing to scrape data from dynamic web pages where traditional scraping methods may fall short.
🔍 What Is Legacy PhantomJS Crawler?
The Legacy PhantomJS Crawler uses the deprecated but still functional PhantomJS headless browser to recursively navigate websites and collect data. It loads each page much like a user's browser would, then runs your JavaScript on that page to extract the information you need. This makes it a good fit for JavaScript-heavy sites whose content only appears after scripts have run.
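To make the idea of recursive crawling concrete, here is a minimal sketch of the general pattern: visit a page, run a page function on it, then queue any links not seen before. It is not the actor's actual implementation. The in-memory `site` object, `loadPage`, and `crawl` are hypothetical stand-ins for a real headless browser.

```javascript
// Hypothetical in-memory "site": each URL maps to a title and its outgoing links.
const site = {
  'http://www.example.com/': {
    title: 'Home',
    links: ['http://www.example.com/a', 'http://www.example.com/b'],
  },
  'http://www.example.com/a': { title: 'Page A', links: ['http://www.example.com/b'] },
  'http://www.example.com/b': { title: 'Page B', links: ['http://www.example.com/'] },
};

// Stand-in for loading a page in a headless browser (assumption, not a real API).
function loadPage(url) {
  return site[url] || null;
}

// Breadth-first crawl: visit each URL once, run pageFunction on it,
// and enqueue every link that has not been seen yet.
function crawl(startUrls, pageFunction) {
  const queue = [...startUrls];
  const seen = new Set(startUrls);
  const results = [];

  while (queue.length > 0) {
    const url = queue.shift();
    const page = loadPage(url);
    if (!page) continue; // skip URLs that fail to load

    results.push({ url, pageFunctionResult: pageFunction(page) });

    for (const link of page.links) {
      if (!seen.has(link)) {
        seen.add(link);
        queue.push(link);
      }
    }
  }
  return results;
}

const results = crawl(['http://www.example.com/'], (page) => ({ title: page.title }));
console.log(results.map((r) => r.pageFunctionResult.title));
// prints [ 'Home', 'Page A', 'Page B' ]
```

The `seen` set is what keeps the crawl from looping forever when pages link back to each other.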
✨ Features
- Headless browser capabilities with PhantomJS
- JavaScript-based data extraction
- Recursive crawling of web pages
- Customizable page function for tailored data extraction
- Support for cookies and session management
- Ability to use proxies for anonymity
- Finish webhook notifications
- Compatibility with legacy Apify Crawler settings
🛠️ How to Use It
- Go to the tool’s page: Legacy PhantomJS Crawler
- Click “Try for free” or “Run actor”
- Fill in the required input fields:
  - Start URLs
  - Page function JavaScript code
  - Optional configurations (proxy settings, cookies, etc.)
- Click “Run” and wait for results to be processed.
- Download the results or have them sent to a webhook you provide.
🧪 Sample Input (JSON)
    {
      "startUrls": [ "http://www.example.com" ],
      "pageFunction": "function pageFunction(context) { return { title: document.title }; }"
    }
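The sample input passes the page function as a single string, which gets hard to read as the function grows. One way around this, sketched below, is to write the function as ordinary JavaScript and convert it to a string when building the input. This assumes the input accepts only the two fields shown in the sample; the variable names `input` and `inputJson` are illustrative.

```javascript
// Written as a normal function so it stays readable and can be linted.
// It is meant to run inside the loaded page, where `document` is the page's DOM.
function pageFunction(context) {
  return { title: document.title };
}

const input = {
  startUrls: ['http://www.example.com'],
  // The sample input carries the function's source code as a string.
  pageFunction: pageFunction.toString(),
};

// Serialize to JSON, ready to paste into the actor's input.
const inputJson = JSON.stringify(input, null, 2);
console.log(inputJson);
```

`Function.prototype.toString()` returns the function's source text, so the page function only has to be maintained in one place.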
📤 Output Data (Fields)
- id: Unique identifier for the request
- url: The requested URL
- loadedUrl: Final URL loaded after processing any redirects
- requestedAt: Timestamp of the request
- responseStatus: HTTP status code of the loaded page
- pageFunctionResult: Result from the user-defined JavaScript code
- errorInfo: Description of any errors encountered
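Once the results are downloaded, a common next step is to separate pages that were scraped successfully from those that failed. The sketch below does this using the output fields listed above. The sample records are made up for illustration, and it assumes `errorInfo` is empty or null when a page loads without errors, which the actor's actual output may not match exactly.

```javascript
// Hypothetical sample records using the output fields; all values are made up.
const records = [
  {
    id: '1',
    url: 'http://www.example.com',
    loadedUrl: 'http://www.example.com/',
    requestedAt: '2020-01-01T00:00:00.000Z',
    responseStatus: 200,
    pageFunctionResult: { title: 'Example Domain' },
    errorInfo: null,
  },
  {
    id: '2',
    url: 'http://www.example.com/missing',
    loadedUrl: 'http://www.example.com/missing',
    requestedAt: '2020-01-01T00:00:01.000Z',
    responseStatus: 404,
    pageFunctionResult: null,
    errorInfo: 'Page not found',
  },
];

// Split records into successes (flattened page function results)
// and failures (URL, status code, and error description).
function summarize(items) {
  const succeeded = [];
  const failed = [];
  for (const r of items) {
    if (r.errorInfo) {
      failed.push({ url: r.url, status: r.responseStatus, error: r.errorInfo });
    } else {
      succeeded.push({ url: r.loadedUrl, ...r.pageFunctionResult });
    }
  }
  return { succeeded, failed };
}

const summary = summarize(records);
console.log(summary.succeeded);
console.log(summary.failed);
```

Successes are keyed by `loadedUrl` rather than `url` so that redirected pages are reported under the address that was actually scraped.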
💰 Pricing
This actor is priced at $0.50/hour. It also provides a free tier for limited usage.
👨‍💻 Built By
Apify — from Apify.com
✅ Final Thoughts
The Legacy PhantomJS Crawler can still scrape data from complex, JavaScript-heavy websites, and its compatibility with legacy Apify Crawler settings is useful if you already have such a configuration. However, because PhantomJS is no longer actively developed, users are encouraged to explore more modern alternatives such as the Web Scraper actor.
🔗 Try the Actor Now
👉 Legacy PhantomJS Crawler