Rust Input Function Example

Dynamically compile and run an input-provided page function. Like Cheerio Scraper, but in Rust.

Example actor showcasing how to run a user-provided function in a statically typed, compiled language.

How does it work?

  1. Reads the input from disk or via the Apify API
  2. Extracts the page_function string from the input
  3. Stores the page_function string to disk
  4. Spawns a system process that runs cargo to compile the page_function into a dynamic library
  5. Dynamically links the library and converts the page_function into a regular Rust function; it must adhere to predefined input/output types
  6. The example code fetches the HTML from the input-provided URL and parses it into a document using the scraper library
  7. The user-provided page_function receives the document as an input parameter and returns a JSON Value built with the json! macro (the full flow is sketched below)

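The sketch below shows, under several assumptions, how the host side of this flow can be wired together in Rust. It assumes the input sits at ./apify_storage/key_value_stores/default/INPUT.json with url and page_function fields, that the user code is written into a wrapper crate at ./page_function that builds libpage_function.so, and that the serde (with derive), serde_json, reqwest (blocking), scraper and libloading crates are available; none of these names or paths are taken from the actual actor.

use libloading::{Library, Symbol};
use scraper::Html;
use serde::Deserialize;
use serde_json::Value;
use std::{fs, process::Command};

// 1.-2. The actor input: a target URL and the page_function source code.
//       The field names are an assumption, not the actor's actual input schema.
#[derive(Deserialize)]
struct Input {
    url: String,
    page_function: String,
}

fn main() {
    // 1. Read the input from disk (on the platform it could also be fetched via the Apify API).
    let raw = fs::read_to_string("./apify_storage/key_value_stores/default/INPUT.json")
        .expect("failed to read INPUT.json");
    let input: Input = serde_json::from_str(&raw).expect("invalid input JSON");

    // 3. Store the page_function source into the wrapper crate that will be compiled.
    fs::write("./page_function/src/lib.rs", &input.page_function)
        .expect("failed to write page_function source");

    // 4. Spawn cargo to compile the wrapper crate into a dynamic library.
    let status = Command::new("cargo")
        .args(["build", "--release"])
        .current_dir("./page_function")
        .status()
        .expect("failed to spawn cargo");
    assert!(status.success(), "compilation of the page_function failed");

    // 6. Fetch the page and parse it into a scraper document.
    let html = reqwest::blocking::get(input.url.as_str())
        .and_then(|res| res.text())
        .expect("failed to fetch the page");
    let document = Html::parse_document(&html);

    // 5. + 7. Dynamically load the library, resolve the page_function symbol and
    //         call it with the parsed document. Treating the symbol as a plain Rust
    //         fn is only sound when the library is built with the same compiler and
    //         crate versions as the host, which compiling on the fly makes easy to guarantee.
    unsafe {
        let lib = Library::new("./page_function/target/release/libpage_function.so")
            .expect("failed to load the compiled library");
        let page_function: Symbol<fn(&Html) -> Value> =
            lib.get(b"page_function").expect("page_function symbol not found");
        let output = page_function(&document);
        println!("page_function output: {output}");
    }
}
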
Page function

The page function can use a predefined set of Rust libraries; currently only the scraper library and serde_json (for the JSON Value type) are provided.

TODO

Technically, thanks to the dynamic compilation, users could be allowed to provide a list of libraries to be used in the page_function; a hypothetical sketch of this is shown below.

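One hypothetical way to do that would be to generate the wrapper crate's Cargo.toml from a dependency list supplied in the input before invoking cargo. The crate name, path, and (name, version) pair format below are illustrative assumptions, not part of the current actor.

use std::fs;

// Hypothetical sketch: write a Cargo.toml for the page_function wrapper crate
// from a user-supplied list of (name, version) dependency pairs, e.g.
// write_manifest(&[("scraper", "0.17"), ("serde_json", "1.0")]).
fn write_manifest(dependencies: &[(&str, &str)]) -> std::io::Result<()> {
    let mut manifest = String::from(
        r#"[package]
name = "page_function"
version = "0.1.0"
edition = "2021"

[lib]
crate-type = ["dylib"]

[dependencies]
"#,
    );
    for (name, version) in dependencies {
        manifest.push_str(&format!("{name} = \"{version}\"\n"));
    }
    fs::write("./page_function/Cargo.toml", manifest)
}
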
Example page_function

use serde_json::{Value, json};
use scraper::{Html, Selector};

// Helper: return the text of the first element matching a CSS selector, if any.
fn selector_to_text(document: &Html, selector: &str) -> Option<String> {
    document
        .select(&Selector::parse(selector).unwrap())
        .next()
        .map(|el| el.text().next().unwrap().into())
}

// Exported without name mangling so the host actor can find the symbol
// after loading the compiled dynamic library.
#[no_mangle]
pub fn page_function(document: &Html) -> Value {
    println!("page_function starting");

    let title = selector_to_text(&document, "title");
    println!("extracted title: {:?}", title);

    let header = selector_to_text(&document, "h1");
    println!("extracted header: {:?}", header);

    // Collect the alt texts of all logos inside the .Logos__container element.
    let companies_using_apify = document
        .select(&Selector::parse(".Logos__container").unwrap())
        .next().unwrap()
        .select(&Selector::parse("img").unwrap())
        .map(|el| el.value().attr("alt").unwrap().to_string())
        .collect::<Vec<String>>();

    println!("extracted companies_using_apify: {:?}", companies_using_apify);

    let output = json!({
        "title": title,
        "header": header,
        "companies_using_apify": companies_using_apify,
    });
    println!("inside pageFunction output: {:?}", output);
    output
}
