Ubuntu Images Scraper

This scraper enables you to retrieve Ubuntu image IDs available on various public clouds. Ubuntu is a complete Linux operating system, freely available with both community and professional support. If you want to know more about Ubuntu, check the following docs.

Motivation

Imagine you have tons of deployments running in public clouds (AWS, Microsoft Azure, Google Cloud, ...) with Ubuntu as your base operating system. These images need to be periodically updated for reasons such as:

  • new features
  • bug fixes
  • security patches

This scraper saves you time by extracting the relevant image IDs for your cloud provider. You can even hook a post-processor to it (for example, opening a pull request against your repository).

Usage

Run it as an actor on the Apify platform. One run of this actor consumes approximately 0.003 CU with a memory size of 1024 MB. Apify Proxy is not required for this actor.
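One way to run the actor programmatically is through the Apify HTTP API's run-sync endpoint, which returns the dataset items directly. A minimal sketch using only the standard library; the actor ID and token below are placeholders, not real values:

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"

def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Build a POST request that runs the actor and returns its dataset items."""
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items?token={token}"
    body = json.dumps(run_input).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )

request = build_run_request(
    "EXAMPLE_USER~ubuntu-images-scraper",  # placeholder actor ID
    "YOUR_APIFY_TOKEN",                    # placeholder API token
    {"cloud": "Amazon AWS", "zone": "us-east-1", "numberOfResults": 1},
)
# items = json.load(urllib.request.urlopen(request))  # performs the actual call
```

The call itself is left commented out so the snippet can be inspected without network access or a valid token.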

Input and Output

This scraper fetches data from https://cloud-images.ubuntu.com/locator/. The dropdowns at the bottom of that page correspond to the inputs below.

Name              | Description                                       | Example
Cloud             | Select images only for a specific cloud provider  | Amazon AWS
Zone              | Zone ~ group of data centers                      | us-east-1
Name              | Ubuntu friendly name                              | focal
Version           | Ubuntu version                                    | 20.04
Architecture      | Processor architecture                            | amd64
Instance type     | Instance type, depending on the cloud provider    | hvm-ssd
Release           | Ubuntu release                                    | 20200729
ID                | Image ID                                          | ami-0758470213bdd23b1
Number of results | Number of image descriptions to be fetched        | 1

All of the inputs above are optional. However, running the actor without any of them is rarely useful, since they enable you to filter through all the available images.
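Conceptually, the optional inputs act as filters over the image records. A small sketch of that behavior; the field names match the actor's output, and treating the value "Any" as a wildcard is an assumption based on the input example below:

```python
def matches(record: dict, filters: dict) -> bool:
    """Return True if the record satisfies every non-wildcard filter."""
    return all(
        value in (None, "Any") or record.get(key) == value
        for key, value in filters.items()
    )

# Two illustrative records; only the first matches name=focal.
records = [
    {"cloud": "Amazon AWS", "zone": "us-east-1", "name": "focal", "arch": "amd64"},
    {"cloud": "Amazon AWS", "zone": "eu-west-1", "name": "bionic", "arch": "arm64"},
]
focal_any_arch = [r for r in records if matches(r, {"name": "focal", "arch": "Any"})]
```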

With the following example input you will get the latest release of Ubuntu 20.04 in the AWS Northern Virginia region.

Input:

{
    "cloud": "Amazon AWS",
    "zone": "us-east-1",
    "name": "focal",
    "version": "20.04",
    "arch": "Any",
    "instanceType": "hvm-ssd",
    "numberOfResults": 1
}

Output:

[
    {
        "cloud": "Amazon AWS",
        "zone": "us-east-1",
        "name": "focal",
        "version": "20.04",
        "arch": "amd64",
        "instanceType": "hvm-ssd",
        "release": "20200729",
        "id": "ami-0758470213bdd23b1"
    }
]
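As an example of the post-processing mentioned earlier, here is a sketch that turns the output items into a Terraform variable definition. The variable name and file layout are illustrative assumptions, not part of the actor:

```python
def to_tfvars(items: list) -> str:
    """Render actor output items as a Terraform map of zone -> AMI ID."""
    lines = ["ubuntu_amis = {"]
    for item in items:
        lines.append(f'  "{item["zone"]}" = "{item["id"]}"')
    lines.append("}")
    return "\n".join(lines)

# The example output from above, reduced to the fields we use.
output = [{"cloud": "Amazon AWS", "zone": "us-east-1",
           "release": "20200729", "id": "ami-0758470213bdd23b1"}]
print(to_tfvars(output))
```

A CI job could write this string to a `.tfvars` file and open a pull request whenever the rendered content changes.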

Limitations / Caveats

  • The maximum number of results is limited to 100 due to source page limitations.

Versioning

Versioning follows Apify standards. See the docs for more information.

Frequently Asked Questions

Is it legal to scrape public data?

Yes, if you're scraping publicly available data for personal or internal use. Always review the website's Terms of Service before large-scale use or redistribution.

Do I need to code to use this scraper?

No. This is a no-code tool: just set the input filters (cloud, zone, version, and so on) and run the scraper directly from your dashboard or the Apify actor page.

What data does it extract?

It extracts the cloud provider, zone, Ubuntu name and version, architecture, instance type, release, and image ID. You can export all of it to Excel or JSON.

Can I fetch multiple results or filter them?

Yes, you can fetch up to 100 results at once and refine them by cloud, zone, name, version, architecture, instance type, or release, depending on the input settings you use.

How do I get started?

You can use the Try Now button on this page to go to the scraper. You'll be guided to set the inputs and get structured results. No setup needed!