GetOData

Archive.org Advanced Search – Effortlessly Find Any Archive Content

3 min read

Intro:

The Archive.org Advanced Search Actor from Apify.com is a powerful tool for accessing the wealth of data available in the Internet Archive. Perfect for researchers, historians, and data analysts, it allows users to conduct highly customizable searches through millions of archived web pages, audio files, books, and more.

The Archive.org Advanced Search API leverages the vast digital repository of the Internet Archive to extract a diverse range of data types. Users can search by title, creator, description, collection, media type, and also utilize custom fields. This tool is particularly useful for individuals needing specific historical data, students conducting research, or professionals looking to analyze or scrape digital content.

✨ Features

  • Advanced Search Filters: Use operators like "contains" and "not" to fine-tune your searches.
  • Custom Fields Support: Specify up to 5 custom fields to meet niche searching needs.
  • Date Precision: Search by exact dates or ranges for more accurate results.
  • Pagination Support: Easily retrieve results in batches (up to 1000 per page) across multiple pages.
  • Sorting Options: Sort results by fields like publicdate or downloads.
  • Detailed Output: Receive structured JSON data that includes metadata such as titles, URLs, and dates.

🛠️ How to Use It

Step-by-step tutorial:

  1. Go to the tool’s page: Archive.org Advanced Search
  2. Click “Try for free” or “Run actor”.
  3. Fill in the required input fields:
    • any_field_value
    • title_value
    • creator_value
    • description_value
    • collection_value
    • mediatype_value
    • Up to 5 custom fields (name, value, operator)
    • Date and range fields (date, date_range_start, date_range_end)
    • Pagination and sorting fields (hits_per_page, page, sort_name, sort_value)
  4. Click “Run” and wait for the results.
  5. Download results or send to a webhook.

🧪 Sample Input (JSON)

json { "any_field_value": "", "any_field_operator": "contains", "title_value": "learn spanish", "title_operator": "contains", "creator_value": "", "creator_operator": "not", "description_value": "", "description_operator": "not", "collection_value": "", "collection_operator": "contains", "mediatype_value": "", "mediatype_operator": "is", "custom_field_1_name": "", "custom_field_1_value": "", "custom_field_1_operator": "contains", "custom_field_2_name": "", "custom_field_2_value": "", "custom_field_2_operator": "contains", "custom_field_3_name": "", "custom_field_3_value": "", "custom_field_3_operator": "contains", "custom_field_4_name": "", "custom_field_4_value": "", "custom_field_4_operator": "contains", "custom_field_5_name": "", "custom_field_5_value": "", "custom_field_5_operator": "contains", "date": "2019-05-10", "date_range_start": "1994-06-06", "date_range_end": "2024-02-06", "hits_per_page": 50, "page": 1, "sort_name": "downloads", "sort_value": "desc" }

📤 Output Data (Fields)

  • index
  • service_backend
  • hit_type
  • identifier
  • filename
  • publicdate
  • downloads
  • title
  • url
  • Additional metadata fields

💰 Pricing This actor is priced at $0.002 per request. There is a free tier available for trying out the actor before committing.

👨‍💻 Built By Maged120 — from Apify.com

✅ Final Thoughts The Archive.org Advanced Search is an essential tool for researchers, historians, and anyone interested in accessing archived digital content. Its flexibility and precision make it a must-try for anyone needing deep and tailored searches across the Internet Archive’s extensive collection.

🔗 Try the Actor Now 👉 Archive.org Advanced Search