Programming Language Detector

📝 Overview

The Programming Language Detector is a powerful and efficient tool built for developers, data scientists, and automation enthusiasts. Whether you're analyzing a snippet of code or a file hosted online, this Actor identifies the programming language with high accuracy, providing confidence scores. Powered by advanced pattern matching and heuristic analysis, it supports over 100 programming languages and frameworks, making it an essential tool for code analysis, repository indexing, or educational purposes.

🚀 Why Choose This Actor?

  • Accuracy: Detects languages with precision, even in ambiguous or mixed-language files (e.g., HTML with embedded JavaScript and CSS).
  • Speed: Optimized for performance with dynamic sampling, pattern prioritization, and early stopping—handles large files efficiently.
  • Versatility: Accepts raw source code or file URLs as input, making it flexible for various use cases.
  • Detailed Insights: Provides confidence scores and candidate languages.

🎯 Use Cases

  • Code Analysis: Identify the language of code snippets in documentation, forums, or repositories.
  • Repository Indexing: Automatically tag files in a codebase with their programming languages.
  • Educational Tools: Help students and educators identify languages in code samples.
  • Automation Pipelines: Integrate language detection into your Apify workflows for processing code-related data.

🔧 How It Works

The Actor uses the LanguageDetector class, which employs a combination of:

  • Pattern Matching: Identifies language-specific keywords and syntax patterns (e.g., def for Python, <?php for PHP).
  • Heuristic Analysis: Resolves ambiguities between similar languages (e.g., JavaScript vs. TypeScript, C vs. C++).
  • Mixed-Language Support: Detects multiple languages in a single file (e.g., HTML with JavaScript and CSS).

The Actor accepts either raw source code or a file URL as input, processes it, and outputs a simplified result containing the detected language, its file extension, a confidence score, the candidate languages, and a short analysis.
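
The combination of weighted pattern matching and a tie-breaking heuristic described above can be sketched as follows. This is only an illustration under stated assumptions: the pattern table, the weights, the scoring formula, and the function names `score_languages` and `detect` are hypothetical and are not taken from the Actor's LanguageDetector class.

```python
import re

# Hypothetical pattern table: language -> list of (regex, weight).
# The Actor's real patterns and weights are not documented here.
PATTERNS = {
    "python": [
        (r"^\s*def \w+\(.*\):", 3.0),
        (r"^\s*import \w+", 1.0),
        (r"__name__", 2.0),
    ],
    "php": [
        (r"<\?php", 5.0),
        (r"\$\w+\s*=", 1.0),
    ],
    "javascript": [
        (r"\bfunction\s+\w+\s*\(", 2.0),
        (r"\b(?:const|let|var)\s+\w+", 2.0),
        (r"console\.log\(", 2.0),
    ],
    "typescript": [
        (r"\binterface\s+\w+", 2.0),
        (r":\s*(?:string|number|boolean)\b", 3.0),
    ],
}


def score_languages(source):
    """Return {language: score in [0, 1]} for every language with a match."""
    scores = {}
    for language, patterns in PATTERNS.items():
        # Sum the weights of the patterns that occur anywhere in the source.
        matched = sum(
            weight
            for regex, weight in patterns
            if re.search(regex, source, flags=re.MULTILINE)
        )
        if matched > 0:
            possible = sum(weight for _, weight in patterns)
            scores[language] = matched / possible
    # Assumed heuristic: TypeScript-only syntax outweighs generic JavaScript
    # matches, so the JavaScript score is halved when both are present.
    if "typescript" in scores and "javascript" in scores:
        scores["javascript"] *= 0.5
    return scores


def detect(source):
    """Pick the best-scoring language and report all candidates."""
    scores = score_languages(source)
    if not scores:
        return {"language": "unknown", "confidence": 0.0, "candidates": {}}
    best = max(scores, key=scores.get)
    candidates = {lang: round(value, 2) for lang, value in scores.items()}
    return {"language": best, "confidence": candidates[best], "candidates": candidates}
```

Scoring each language as the matched fraction of its total pattern weight keeps every score between 0 and 1, which mirrors the confidence range the Actor reports.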


Input Requirements

  • Provide at least one of sourceCode or fileUrl. If both are provided, sourceCode takes precedence over fileUrl.
  • sourceCode: A string containing the raw code to analyze.
  • fileUrl: A publicly accessible URL to a file containing the code (e.g., a GitHub raw file URL).
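
The precedence rule above can be sketched as a small input resolver. The helper names `resolve_source` and `download_text`, the 30-second timeout, and the use of `urllib` are assumptions made for illustration; the Actor's actual implementation may differ.

```python
import urllib.request

# Error text quoted from this README's "Output Error (Empty Input)" section.
ERROR_MESSAGE = "No input provided. Please provide either 'sourceCode' or 'fileUrl'."


def download_text(url, timeout=30):
    """Fetch a publicly accessible file and decode it as UTF-8 text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def resolve_source(actor_input, fetch=download_text):
    """Return the code to analyze, preferring sourceCode over fileUrl."""
    source_code = actor_input.get("sourceCode")
    if source_code:
        # sourceCode wins even when fileUrl is also present.
        return source_code
    file_url = actor_input.get("fileUrl")
    if file_url:
        return fetch(file_url)
    raise ValueError(ERROR_MESSAGE)
```

Passing `fetch` as a parameter makes the precedence logic easy to exercise without network access, for example `resolve_source({"fileUrl": "https://example.com/sample.cr"}, fetch=lambda url: "puts 1")`.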

📥 Input Examples

Example 1: Raw Source Code (Python)

{
  "sourceCode": "def hello():\n    print(\"Hello, world!\")\n\nif __name__ == \"__main__\":\n    hello()"
}

Example 2: File URL (Crystal)

{
  "fileUrl": "https://example.com/sample.cr"
}

📤 Output Examples

The Actor outputs a simplified result in the following format:

  • language: The detected primary language.
  • extension: The file extension associated with the detected language (e.g., .js).
  • confidence: Confidence score for the primary language (0 to 1).
  • candidates: A dictionary of all detected languages with their confidence scores.
  • analysis: A textual description of the detection process.

Sample Output (JavaScript detected)

{
  "language": "javascript",
  "extension": ".js",
  "confidence": 0.85,
  "candidates": {
    "javascript": 0.85,
    "dart": 0.35,
    "elixir": 0.35
  },
  "analysis": "High confidence detection: javascript"
}
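
A downstream consumer might post-process a result of this shape as sketched below. The function name `summarize_result` and the 0.5 threshold are assumptions chosen for illustration; only the field names (language, extension, confidence, candidates) come from the output format above.

```python
def summarize_result(result, threshold=0.5):
    """Rank candidates and flag whether the top detection is trustworthy.

    `threshold` is an assumed cut-off, not a value defined by the Actor.
    """
    candidates = result.get("candidates", {})
    # Highest score first; ties keep their original order.
    ranked = sorted(candidates.items(), key=lambda item: item[1], reverse=True)
    runner_up = ranked[1] if len(ranked) > 1 else None
    return {
        "language": result["language"],
        "extension": result.get("extension"),
        "is_confident": result["confidence"] >= threshold,
        "runner_up": runner_up[0] if runner_up else None,
        # Gap between the winner and the next candidate; a small gap
        # suggests an ambiguous or mixed-language input.
        "margin": result["confidence"] - runner_up[1] if runner_up else result["confidence"],
    }


sample = {
    "language": "javascript",
    "extension": ".js",
    "confidence": 0.85,
    "candidates": {"javascript": 0.85, "dart": 0.35, "elixir": 0.35},
    "analysis": "High confidence detection: javascript",
}
summary = summarize_result(sample)
```

Looking at the margin in addition to the raw confidence is one way to decide when a file should be routed to manual review.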

Output Error (Empty Input)

The Actor will fail with the message:

No input provided. Please provide either 'sourceCode' or 'fileUrl'.

🌐 Supported Languages and Text Formats

The Actor supports over 100 programming languages, frameworks, and text formats, including:

Supported

  • python
  • javascript
  • typescript
  • html
  • css
  • java
  • c
  • cpp
  • csharp
  • php
  • ruby
  • rust
  • scala
  • kotlin
  • swift
  • sql
  • bash
  • powershell
  • matlab
  • perl
  • lua
  • haskell
  • dart
  • groovy
  • elixir
  • clojure
  • vba
  • julia
  • fortran
  • shell
  • objective-c
  • pascal
  • ada
  • d
  • nim
  • crystal
  • arduino
  • assembly
  • verilog
  • vhdl
  • latex
  • markdown
  • yaml
  • json
  • xml
  • makefile
  • dockerfile
  • graphql
  • hcl
  • postscript
  • G-code
  • GLSL
  • Haxe
  • Racket
  • sass
  • Tcl
  • Gherkin
  • PromQL
  • INI
  • Logtalk
  • ABAP
  • APL/J
  • COBOL
  • Erlang
  • F#
  • Gradle
  • Kotlin/Native
  • Lisp
  • OCaml/SML
  • R
  • VBScript
  • XSLT
  • Zsh
  • Django
  • Flask
  • Spring Boot
  • Ruby on Rails
  • Angular
  • React
  • Vue.js
  • ASP.NET Core
  • Express.js
  • Laravel
  • laravelblade
  • ... and more

Frameworks and Variants

  • Laravel (PHP)
  • Ruby on Rails (Ruby)
  • Angular (HTML/JavaScript)
  • Vue.js (HTML/JavaScript)
  • Laravel Blade (HTML/PHP)
  • Kotlin/Native
  • Gradle

Markup and Configuration Formats

  • HTML
  • CSS
  • LaTeX
  • Markdown
  • YAML
  • JSON
  • XML
  • Makefile
  • Dockerfile
  • GraphQL
  • HCL (HashiCorp Configuration Language)
  • PostScript
  • Sass
  • Gherkin
  • PromQL
  • INI
  • XSLT

Special Cases

  • Mixed-language files (e.g., HTML with embedded JavaScript and CSS).
  • Shebang-based scripts (e.g., #!/bin/bash for Bash).
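
Shebang handling can be sketched as a check on the first line of the input. The interpreter table and the function name `language_from_shebang` are hypothetical; the labels on the right-hand side reuse names from the supported list above.

```python
import posixpath

# Hypothetical interpreter -> language table, for illustration only.
INTERPRETERS = {
    "bash": "bash",
    "sh": "shell",
    "zsh": "zsh",
    "python": "python",
    "python3": "python",
    "node": "javascript",
    "ruby": "ruby",
    "perl": "perl",
}


def language_from_shebang(source):
    """Return a language name if the first line is a known shebang, else None."""
    first_line = source.split("\n", 1)[0].strip()
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    if not parts:
        return None
    program = posixpath.basename(parts[0])
    if program == "env":
        # "#!/usr/bin/env python3" names the interpreter after env;
        # skip option flags such as -S.
        args = [part for part in parts[1:] if not part.startswith("-")]
        if not args:
            return None
        program = posixpath.basename(args[0])
    return INTERPRETERS.get(program)
```

A shebang is a strong signal, so a detector could reasonably consult it before running the more expensive pattern scan.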

🌟 Try the Programming Language Detector today and simplify your code analysis tasks! 🌟

