Subscan is a powerful subdomain enumeration tool built with Rust, specifically designed for penetration testing purposes. It combines various discovery techniques into a single, lightweight binary, making subdomain hunting easier and faster for security researchers

Features

🕵️ Smart Discovery Tricks
- Use multiple search engines (Google, Yahoo, Bing, DuckDuckGo, etc.)
- Integrate with APIs like Shodan, Censys, VirusTotal and more
- Perform zone transfer checks
- Subdomain brute-forcing with optimized wordlists
🔍 Resolve IP addresses for all subdomains
📎 Export reports in CSV, HTML, JSON, or TXT formats
🛠️ Configurable
- Customize HTTP requests (user-agent, timeout, etc.)
- Rotate requests via proxies (--proxy argument)
- Fine-tune IP resolver with --resolver arguments
- Filter and run specific modules with --skips and --modules
🐳 Docker Friendly
- Native support for amd64 and arm64 Linux platforms
- A tiny container that won't eat up your storage — under 1GB and ready to roll 🚀
💻 Compatible with multiple platforms and easy to install as a single binary

User Guide

This chapter provides an overview of the basic usage of Subscan, designed to help end users get started quickly and effectively

Here’s a quick overview of the sections included

Quickstart
- Install
- Usage
  - CLI
  - Docker
  - Crate
Commands
- scan
- brute
- module
  - list
  - get
  - run
Environments

Quickstart

Subscan is a fast and efficient subdomain enumeration tool designed for penetration testers and security researchers. In this chapter, you'll learn how to quickly set up and start using Subscan to discover subdomains and improve your security assessments

Install

There are several ways to install Subscan, depending on your preferences. You can install it via Cargo (Rust's package manager), use Docker for containerized environments, or download prebuilt cross-platform binaries. Choose the method that works best for your setup

Install With Cargo

🦀 Install the subscan tool using Cargo, Rust's package manager. Make sure you have Rust installed on your system. Then, run:

~$ cargo install subscan

Pull Docker Image

🐳 For containerized usage, you can pull the subscan Docker image directly from Docker Hub

~$ docker pull eredotpkfr/subscan:latest

Download Prebuilt Binaries

📦 Prebuilt cross-platform binaries are available on the releases page. Download the one compatible with your operating system

Usage

This chapter shows how to use Subscan to discover subdomains efficiently. Subdomain discovery features are implemented as modular SubscanModule components, which are executed automatically when a scan is initiated. For technical insights, check out the Development chapter

✨ In this section, you'll find detailed instructions for different usage methods

- Use the Subscan CLI for quick and effective subdomain enumeration
- Run Subscan in a Docker container for a lightweight, portable setup
- Integrate Subscan as a crate in your Rust project

Subscan CLI Usage

🛠️ The Subscan CLI is a versatile tool that provides the following functionalities

- Start a scan to discover subdomains associated with a specific domain
- Perform a brute force attack on a domain using a specified wordlist
- Manage registered modules (see the module command for details)

✨ Here's a quick overview of how to use it

~$ subscan
            _
           | |
  ___ _   _| |__  ___  ___ __ _ _ __
 / __| | | | '_ \/ __|/ __/ _` | '_ \
 \__ \ |_| | |_) \__ \ (_| (_| | | | |
 |___/\__,_|_.__/|___/\___\__,_|_| |_|


Usage: subscan [OPTIONS] <COMMAND>

Commands:
  scan    Start scan on any domain address
  brute   Start brute force attack with a given wordlist
  module  Subcommand to manage implemented modules
  help    Print this message or the help of the given subcommand(s)

Options:
  -v, --verbose...  Increase logging verbosity
  -q, --quiet...    Decrease logging verbosity
  -h, --help        Print help (see more with '--help')
  -V, --version     Print version

Start Scan

To scan a domain using all available modules, use the following command:

~$ subscan scan -d example.com

You can also choose specific modules to run or skip using the --skips and --modules arguments. Module names should be provided as a comma-separated list¹

~$ # skip the commoncrawl and google modules during the scan
~$ subscan scan -d example.com --skips=commoncrawl,google

~$ # run only the virustotal module
~$ subscan scan -d example.com --modules=virustotal

If the module you’re using requires authentication, you can provide the necessary credentials, such as an API key, through module-specific environment variables. For more details about environment variables, refer to the Environments chapter

~$ SUBSCAN_VIRUSTOTAL_APIKEY=foo subscan scan -d example.com --modules=virustotal

Brute Force

Use the brute command to start a brute force attack with a specific wordlist

~$ subscan brute -d example.com --wordlist file.txt

To pass a wordlist into a Docker container, see the Docker Usage section

¹ If a module is included in both the --skips and --modules arguments, it is skipped and not executed

Docker Usage

Once you’ve pulled the pre-built image from Docker Hub, you can easily run the container to perform subdomain enumeration

~$ docker run -it --rm eredotpkfr/subscan scan -d example.com

Pass an environment variable to the container with docker's --env flag

~$ docker run -it --rm \
    --env SUBSCAN_VIRUSTOTAL_APIKEY=foo \
    eredotpkfr/subscan scan -d example.com --modules=virustotal

To use a .env file from your host machine, mount it into the /data folder

~$ docker run -it --rm \
    --volume="$PWD/.env:/data/.env" \
    eredotpkfr/subscan scan -d example.com --skips=commoncrawl

To save output reports to the host machine, mount a host directory as the /data folder

~$ docker run -it --rm \
    --volume="$PWD/data:/data" \
    eredotpkfr/subscan scan -d example.com

To make a wordlist available inside the container, mount it into the /data folder

~$ docker run -it --rm \
    --volume="$PWD/wordlist.txt:/data/wordlist.txt" \
    eredotpkfr/subscan brute -d example.com \
    -w wordlist.txt --print

Build a Docker Image

To build a Docker image locally, run the following command

~$ docker build -t subscan .

If you encounter memory issues while building on an Apple Silicon machine, you can run Colima with the following parameters

~$ colima start --cpu 11 --memory 16

Crate Usage

You can easily add Subscan to your code and use its results in your application. Since Subscan works asynchronously, you need to use it in async code blocks. We recommend using Tokio as the async runtime

This chapter provides step-by-step guidance on how to integrate Subscan into your code. For more detailed usage and additional code examples, visit the project's docs.rs page or check the examples/ folder in the repository

Add the subscan crate to your project dependencies
```
~$ cargo add subscan
```

Create a new instance and start using it

 #[tokio::main]
 async fn main() {
     // set module concurrency to 1
     // set HTTP timeout to 120 seconds
     let config = SubscanConfig {
         concurrency: 1,
         filter: CacheFilter::FilterByName(ModuleNameFilter {
             modules: vec!["alienvault".into()],
             skips: vec![],
         }),
         requester: RequesterConfig {
             timeout: Duration::from_secs(120),
             ..Default::default()
         },
         ..Default::default()
     };

     let subscan = Subscan::from(config);
     let result = subscan.scan("domain.com").await;

     for item in result.items {
         // do something with item
     }
 }

Commands

This chapter provides a comprehensive guide to the Subscan CLI commands. Below is a list of available commands. For detailed information on usage and arguments, refer to the corresponding sections

`scan`

This command starts a scan by running registered modules for subdomain discovery. See the module command to manage registered modules

Argument List

All arguments below can be used with the scan command to customize a scan according to your needs. See Common Use Cases below for examples

Name	Short	Description
`--domain`	`-d`	Target domain to be scanned
`--user-agent`	`-u`	Set a `User-Agent` header
`--http-timeout`	`-t`	HTTP timeout in seconds
`--proxy`	`-p`	Set HTTP proxy
`--output`	`-o`	Set output format (`txt`, `csv`, `json`, `html`)
`--print`		If set, output is logged to stdout
`--module-concurrency`	`-c`	Module runner concurrency level
`--resolver-timeout`		IP resolver timeout
`--resolver-concurrency`		IP resolver concurrency level
`--resolver-list`		A text file containing list of resolvers. See `resolverlist.template`
`--disable-ip-resolve`		Disable IP address resolve process
`--modules`	`-m`	Comma separated list of modules to run
`--skips`	`-s`	Comma separated list of modules to skip
`--help`	`-h`	Print help

Common Use Cases

Adjust HTTP request timeouts for slow networks
```
~$ subscan scan -d example.com -t 120
```

Use a proxy server to bypass anti-bot systems

~$ subscan scan -d example.com -t 120 --proxy 'http://my.prox:4444'

Increase concurrency to speed up the scan
```
~$ subscan scan -d example.com -c 10
```

Fine-tune IP address resolver component according to your network

~$ subscan scan -d example.com --resolver-timeout 1 --resolver-concurrency 100

Disable the IP resolution process

~$ subscan scan -d example.com --disable-ip-resolve

Customize the scan by filtering modules

~$ # skip the commoncrawl and google modules during the scan
~$ subscan scan -d example.com --skips=commoncrawl,google

~$ # run only the virustotal module
~$ subscan scan -d example.com --modules=virustotal

If a module is included in both the --skips and --modules arguments, it will be skipped and not executed
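
The precedence rule above can be pictured with a small Rust sketch. The function names and slice-based signatures are assumptions made for illustration, not Subscan's actual filtering code, and the sketch treats an empty --modules list as "run every registered module"

```rust
/// Decide whether a module should run for the given `--modules` and `--skips` lists.
/// Hypothetical helper, not part of the Subscan API.
fn should_run(name: &str, modules: &[&str], skips: &[&str]) -> bool {
    // A skip always wins, even if the module is also listed in `modules`
    if skips.contains(&name) {
        return false;
    }
    // Assumption: an empty `modules` list means "run everything that is not skipped"
    modules.is_empty() || modules.contains(&name)
}

/// Apply the filter to a list of registered module names, keeping their order
fn select<'a>(registered: &[&'a str], modules: &[&str], skips: &[&str]) -> Vec<&'a str> {
    registered
        .iter()
        .copied()
        .filter(|name| should_run(name, modules, skips))
        .collect()
}
```

For example, calling `select` with `modules = ["google"]` and `skips = ["google"]` yields an empty list, because the skip takes priority.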

`brute`

With this command you can use the brute force technique to discover subdomains on a domain

Argument List

All arguments below can be used with the brute command. See Common Use Cases below for examples

Name	Short	Description
`--domain`	`-d`	Target domain to be scanned
`--wordlist`	`-w`	Wordlist file to be used during attack
`--print`		If set, output is logged to stdout
`--stream-to-txt`	`-s`	Optional `txt` file into which discovered subdomains are streamed. If set, the `--output` parameter is disabled
`--output`	`-o`	Set output format (`txt`, `csv`, `json`, `html`)
`--resolver-timeout`		IP resolver timeout
`--resolver-concurrency`		IP resolver concurrency level
`--resolver-list`		A text file containing list of resolvers. See `resolverlist.template`
`--help`	`-h`	Print help

Common Use Cases

Run a basic brute force attack with default settings
```
~$ subscan brute -d example.com -w wordlist.txt
```

Increase resolver concurrency to improve attack speed

~$ subscan brute -d example.com -w wordlist.txt --resolver-concurrency 200

Fine-tune IP address resolver component according to your network

~$ subscan brute -d example.com -w wordlist.txt --resolver-timeout 1 --resolver-concurrency 100

Skip creating a report and print results directly to stdout
```
~$ subscan brute -d example.com -w wordlist.txt --print
```

`module`

Subscan is designed with an extensible architecture, where each subdomain discovery component is referred to as a SubscanModule. In Subscan terminology, any component involved in subdomain discovery is considered a module. You can create your own custom modules and integrate them into Subscan. Modules can also include additional components. Details on how to develop and integrate your own modules are available in the Development chapter

The module command allows you to list the modules registered in Subscan, view their details, and run any module. The subcommands below serve these purposes

`list`

Lists the modules registered on Subscan as a table with their details. The output looks like the following

~$ subscan module list

+--------------------+---------------+----------------+-------------+
| Name               | Requester     | Extractor      | Is Generic? |
+--------------------+---------------+----------------+-------------+
| bing               | HTTPClient    | HTMLExtractor  | true        |
| duckduckgo         | ChromeBrowser | HTMLExtractor  | true        |
| google             | HTTPClient    | HTMLExtractor  | true        |
| yahoo              | HTTPClient    | HTMLExtractor  | true        |
| alienvault         | HTTPClient    | JSONExtractor  | true        |
| anubis             | HTTPClient    | JSONExtractor  | true        |
+--------------------+---------------+----------------+-------------+

`get`

Gets a single module with details

~$ subscan module get zonetransfer

+--------------+-----------+-----------+-------------+
| Name         | Requester | Extractor | Is Generic? |
+--------------+-----------+-----------+-------------+
| zonetransfer | None      | None      | false       |
+--------------+-----------+-----------+-------------+

`run`

This command runs the specified module and is primarily used to quickly test a new module during its implementation. It has a similar set of arguments as the scan command

Argument List

Name	Short	Description
`--domain`	`-d`	Target domain to be scanned
`--output`	`-o`	Set output format (`txt`, `csv`, `json`, `html`)
`--print`		If set, output is logged to stdout
`--user-agent`	`-u`	Set a `User-Agent` header
`--http-timeout`	`-t`	HTTP timeout in seconds
`--proxy`	`-p`	Set HTTP proxy
`--resolver-timeout`		IP resolver timeout
`--resolver-concurrency`		IP resolver concurrency level
`--resolver-list`		A text file containing list of resolvers. See `resolverlist.template`
`--disable-ip-resolve`		Disable IP address resolve process
`--help`	`-h`	Print help

Common Use Cases

Run module by name

~$ # runs google module on example.com
~$ subscan module run google -d example.com

Run module by name without IP resolve

~$ # runs shodan module on example.com without IP resolve
~$ subscan module run shodan -d example.com --disable-ip-resolve

If the module requires authentication, provide the credentials as an environment variable
```
~$ # runs censys module on example.com
~$ SUBSCAN_CENSYS_APIKEY=foo subscan module run censys -d example.com --user-agent 'subscan' -t 120
```
For more details about environment variables, refer to the Environments chapter

Environments

Subscan can read your environment variables from the .env file in your working directory. To learn how to define them in the .env file, refer to the .env.template file. All Subscan environment variables use the SUBSCAN namespace as a prefix

There are two types of environment variables:

- Dynamic: These environment variables follow a specific format (e.g., SUBSCAN_<MODULE_NAME>_FOO), and Subscan reads them automatically
- Static: These are predefined environment variables with fixed names

Statics

Name	Required	Description
`SUBSCAN_CHROME_PATH`	`false`	Specify your Chrome executable. If not specified, the Chrome binary will be fetched automatically by headless_chrome based on your system architecture

Dynamics

Name	Required	Description
`SUBSCAN_<MODULE_NAME>_HOST`	`false`	Some API integration modules talk to a user-specific host. In these cases, set the module-specific host
`SUBSCAN_<MODULE_NAME>_APIKEY`	`false`	Some modules may include API integration and require an API key for authentication. Set the API key in these cases
`SUBSCAN_<MODULE_NAME>_USERNAME`	`false`	Set the username for a module if it uses HTTP basic authentication
`SUBSCAN_<MODULE_NAME>_PASSWORD`	`false`	Set the password for a module if it uses HTTP basic authentication
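
As a rough illustration of the naming scheme, the sketch below builds a dynamic variable name from a module name and a suffix, then looks it up in the process environment. The helper names are hypothetical, and upper-casing the module name is an assumption inferred from the examples in this guide; Subscan's own lookup lives in its SubscanModuleEnvs type and may differ

```rust
use std::env;

/// Build a dynamic variable name such as `SUBSCAN_VIRUSTOTAL_APIKEY`.
/// Hypothetical helper; assumes module names are upper-cased.
fn env_key(module: &str, suffix: &str) -> String {
    format!("SUBSCAN_{}_{}", module.to_uppercase(), suffix.to_uppercase())
}

/// Read a module-specific value from the process environment, if it is set
fn module_env(module: &str, suffix: &str) -> Option<String> {
    env::var(env_key(module, suffix)).ok()
}
```

With this sketch, `env_key("censys", "apikey")` produces `SUBSCAN_CENSYS_APIKEY`, matching the variable used in the module run example.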

Creating `.env` File

Please see the .env.template file in the project repository. Your .env file should follow a format similar to the one shown below

SUBSCAN_BEVIGIL_APIKEY=foo
SUBSCAN_BINARYEDGE_APIKEY=bar
SUBSCAN_BUFFEROVER_APIKEY=baz
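
To make the file format concrete, here is a simplified Rust sketch that parses `KEY=VALUE` lines like the ones above. It is not how Subscan loads the file; real .env loaders also handle quoting, `export` prefixes, and escapes, which this sketch deliberately ignores. Both function names are assumptions

```rust
use std::collections::BTreeMap;

/// Parse `KEY=VALUE` lines, ignoring blank lines and `#` comments.
/// Simplified: no quoting or escape handling.
fn parse_dotenv(contents: &str) -> BTreeMap<String, String> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .filter_map(|line| line.split_once('='))
        .map(|(key, value)| (key.trim().to_string(), value.trim().to_string()))
        .collect()
}

/// Keep only the variable names under the SUBSCAN namespace, in sorted order
fn subscan_vars(vars: &BTreeMap<String, String>) -> Vec<&str> {
    vars.keys()
        .filter(|key| key.starts_with("SUBSCAN_"))
        .map(String::as_str)
        .collect()
}
```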

Development

This chapter provides an in-depth guide for developers on how to contribute to and extend Subscan. It covers everything from setting up the development environment to understanding the core architecture and adding new features or modules.

Setup Development Environment

This section covers topics like setting up a development environment and running tests for those who want to contribute to Subscan

To set up your development environment, please follow the instructions below

Clone repository

~$ git clone https://github.com/eredotpkfr/subscan && cd subscan

Install pre-commit and its hooks

~$ # Install pre-commit Mac or Linux
~$ make install-pre-commit-mac
~$ # Install pre-commit hooks
~$ make install-pre-commit-hooks
~$ # Check everything is OK
~$ pre-commit run -a

Install required cargo tools for development

~$ # Install cargo tools
~$ make install-cargo-tools

Create .env file from .env.template
```
~$ cp .env.template .env
```

Finally build the project and run CLI

~$ cargo build && target/debug/subscan --help

Running Tests

There are several ways to run the tests. The commands below show the most common ones

~$ # run all tests
~$ cargo test # or `make test`
~$ # show test output instead of capturing it
~$ cargo test -- --nocapture
~$ # run only doc tests
~$ cargo test --doc
~$ # run a single test
~$ cargo test -- engines::bing_test::bing_run_test
~$ # run only integration tests
~$ cargo test --tests modules::integrations

To run tests via nextest, run the following command

~$ make nextest

Create coverage report with cargo-llvm-cov

~$ make coverage

Building Docs

To build the documentation, run the following command

~$ make doc # or `cargo doc`

To serve the project book with hot reload, use the following command

~$ # run book tests and serve
~$ make live-book

Components

This chapter provides detailed information about the functionality of the core components that make up Subscan. These components are reusable structures designed to simplify repetitive tasks, such as organizing HTTP requests or facilitating subdomain extraction operations. By using these components, you can streamline your workflow and avoid redundant code

You can also create custom components tailored to your specific needs and integrate them into the subdomain discovery process. These components add modularity to Subscan, allowing it to be easily extended and customized

The core components in Subscan are listed below. Follow the links for more details

Requesters

Requesters are components designed to manage HTTP requests through a unified interface. Each requester offers unique features, and when HTTP requests are needed during subdomain discovery, the appropriate requester can be selected based on the requirements.

Subscan includes predefined requesters like

ChromeBrowser

This requester component runs a Chrome process in the background and sends HTTP requests through the browser. It offers advantages such as rendering JavaScript and bypassing anti-bot systems

HTTPClient

A standard HTTP client requester, implemented on top of the reqwest crate's client

Create Your Custom Requester

Each requester component should be implemented following the interface below. For a better understanding, you can explore the docs.rs page and the async_trait and enum_dispatch crates that the interface relies on

#[async_trait]
#[enum_dispatch]
pub trait RequesterInterface: Sync + Send {
    // Returns requester configurations as a RequesterConfig object
    async fn config(&mut self) -> &mut RequesterConfig;
    // Configure current requester object by using new RequesterConfig object
    async fn configure(&mut self, config: RequesterConfig);
    // HTTP GET method implementation to fetch HTML content from given source URL
    async fn get_content(&self, url: Url) -> Result<Content>;
}

Below is a simple example of a custom requester. For more examples, you can check the examples/ folder on the project's GitHub page. You can also refer to the source code of predefined requester implementations for a better understanding

pub struct CustomRequester {
    config: RequesterConfig,
}

#[async_trait]
impl RequesterInterface for CustomRequester {
    async fn config(&mut self) -> &mut RequesterConfig {
        &mut self.config
    }

    async fn configure(&mut self, config: RequesterConfig) {
        self.config = config;
    }

    async fn get_content(&self, _url: Url) -> Result<Content> {
        Ok(Content::Empty)
    }
}

Extractors

Extractor components are responsible for parsing subdomain addresses from any Content object

The extractor components already implemented in Subscan are as follows

HTMLExtractor

Extracts subdomain addresses from inner text selected by a given XPath or CSS selector

JSONExtractor

Extracts subdomain addresses from JSON content. A JSON parsing function must be supplied to this extractor

RegexExtractor

Generates a subdomain pattern from the given domain address and extracts subdomains that match this pattern
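
The idea behind pattern-based extraction can be sketched in plain Rust. To stay dependency-free, this sketch swaps the regular expression for a simple token scan: it splits the text on characters that cannot appear in a hostname and keeps the tokens that end with `.<domain>`. The function name is hypothetical and this is not the RegexExtractor implementation

```rust
use std::collections::BTreeSet;

/// Collect every hostname in `text` that ends with `.<domain>`.
/// Stand-in for a regex match: tokens are split on non-hostname characters.
fn extract_subdomains(text: &str, domain: &str) -> BTreeSet<String> {
    let suffix = format!(".{}", domain.to_lowercase());

    text.split(|c: char| !(c.is_ascii_alphanumeric() || c == '-' || c == '.'))
        // Drop stray dots such as a sentence-ending period
        .map(|token| token.trim_matches('.').to_lowercase())
        // Require at least one label in front of the domain itself
        .filter(|token| token.len() > suffix.len() && token.ends_with(&suffix))
        .collect()
}
```

The bare domain and look-alike hosts such as `notexample.com` are rejected because the suffix check includes the leading dot.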

Create Your Custom Extractor

Each extractor component should be implemented following the interface below. For a better understanding, you can explore the docs.rs page and the async_trait and enum_dispatch crates that the interface relies on

#[async_trait]
#[enum_dispatch]
pub trait SubdomainExtractorInterface: Send + Sync {
    // Generic extract method, it should extract subdomain addresses
    // from given Content
    async fn extract(&self, content: Content, domain: &str) -> Result<BTreeSet<Subdomain>>;
}

Below is a simple example of a custom extractor. For more examples, you can check the examples/ folder on the project's GitHub page. You can also refer to the source code of predefined extractor implementations for a better understanding

pub struct CustomExtractor {}

#[async_trait]
impl SubdomainExtractorInterface for CustomExtractor {
    async fn extract(&self, content: Content, _domain: &str) -> Result<BTreeSet<Subdomain>> {
        let subdomain = content.as_string().replace("-", "");

        Ok([subdomain].into())
    }
}

Subscan Module

SubscanModule components are the core components that can be executed by Subscan. Each module capable of performing subdomain discovery is named a SubscanModule, and when the subscan scan command is run, these modules are read from an in-memory cache and executed asynchronously. This architecture makes Subscan extensible and modular

A SubscanModule may contain various components such as Requester and Extractor. Most modules implemented in Subscan use these components. You can list the implemented modules with their details using the subscan module list command. If you'd like to view the in-memory cache, you can check the CacheManager struct, which is another component designed for operations like filtering the cache or accessing a specific module
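
The "run every module and gather results through a channel" pattern can be sketched with the standard library alone. This is an analogy, not Subscan's runner: it uses OS threads and `std::sync::mpsc` where Subscan uses async tasks, and `FakeModule` and `run_all` are hypothetical names

```rust
use std::collections::BTreeSet;
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Hypothetical module: a name plus the subdomains it "discovers"
struct FakeModule {
    name: &'static str,
    finds: Vec<&'static str>,
}

impl FakeModule {
    /// Send every discovery through the shared channel, tagged with the module name
    fn run(&self, results: Sender<(String, String)>) {
        for sub in &self.finds {
            results.send((self.name.to_string(), sub.to_string())).unwrap();
        }
    }
}

/// Run all modules concurrently and merge their results into one sorted set
fn run_all(modules: Vec<FakeModule>) -> BTreeSet<String> {
    let (tx, rx) = channel();

    let handles: Vec<_> = modules
        .into_iter()
        .map(|module| {
            let tx = tx.clone();
            thread::spawn(move || module.run(tx))
        })
        .collect();

    // Drop the original sender so the receiver stops once every module is done
    drop(tx);

    // Duplicates reported by different modules collapse in the set
    let found = rx.iter().map(|(_module, sub)| sub).collect();

    for handle in handles {
        handle.join().unwrap();
    }

    found
}
```

Passing each module its own clone of the sender mirrors the `results: Sender<...>` parameter of the `run` method in the interface shown later in this chapter.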

Create Your Own Module

Each SubscanModule component should be implemented following the interface below. For a better understanding, you can explore the docs.rs page and the async_trait and enum_dispatch crates that the interface relies on

#[async_trait]
#[enum_dispatch]
pub trait SubscanModuleInterface: Sync + Send {
    /// Returns the module name; the name should make clear what the module does
    async fn name(&self) -> &str;
    /// Loads `.env` file and fetches module environment variables with variable name.
    /// If a system environment variable is set with the same name, it overrides the `.env` file value
    /// See the [`SubscanModuleEnvs`](crate::types::env::SubscanModuleEnvs) for details
    async fn envs(&self) -> SubscanModuleEnvs {
        self.name().await.into()
    }
    /// Returns module requester address as a mutable reference if available
    async fn requester(&self) -> Option<&Mutex<RequesterDispatcher>>;
    /// Returns module extractor reference if available
    async fn extractor(&self) -> Option<&SubdomainExtractorDispatcher>;
    /// Configure module requester instance
    async fn configure(&self, rconfig: RequesterConfig) {
        if let Some(requester) = self.requester().await {
            requester.lock().await.configure(rconfig).await;
        }
    }
    /// Just like a `main` method, this `run` method is called when the module runs.
    /// So this method should do everything
    async fn run(&mut self, domain: &str, results: Sender<OptionalSubscanModuleResult>);
    /// Builds [`OptionalSubscanModuleResult`](crate::enums::result::OptionalSubscanModuleResult)
    /// with any [`Subdomain`](crate::types::core::Subdomain)
    async fn item(&self, sub: &Subdomain) -> OptionalSubscanModuleResult {
        (self.name().await, sub).into()
    }
    /// Builds [`OptionalSubscanModuleResult`](crate::enums::result::OptionalSubscanModuleResult)
    /// with any [`SubscanModuleStatus`](crate::types::result::status::SubscanModuleStatus)
    async fn status(&self, status: SubscanModuleStatus) -> OptionalSubscanModuleResult {
        (self.name().await, status).into()
    }
    /// Builds [`OptionalSubscanModuleResult`](crate::enums::result::OptionalSubscanModuleResult)
    /// with custom error message
    async fn error(&self, msg: &str) -> OptionalSubscanModuleResult {
        (self.name().await, msg).into()
    }
}

Below is a simple example of a custom module. For more examples, you can check the examples/ folder on the project's GitHub page. You can also refer to the source code of predefined module implementations for a better understanding

pub struct CustomModule {
    pub requester: Mutex<RequesterDispatcher>,
    pub extractor: SubdomainExtractorDispatcher,
}

#[async_trait]
impl SubscanModuleInterface for CustomModule {
    async fn name(&self) -> &str {
        "name"
    }

    async fn requester(&self) -> Option<&Mutex<RequesterDispatcher>> {
        Some(&self.requester)
    }

    async fn extractor(&self) -> Option<&SubdomainExtractorDispatcher> {
        Some(&self.extractor)
    }

    async fn run(&mut self, _domain: &str, results: Sender<OptionalSubscanModuleResult>) {
        let subdomains = BTreeSet::from_iter([Subdomain::from("bar.foo.com")]);

        for subdomain in &subdomains {
            results.send((self.name().await, subdomain).into()).unwrap();
        }
    }
}

Generic Modules

Some module implementations are very similar to each other, and sometimes we can use the same logic and algorithms while performing subdomain discovery. For example, during an API integration, the following steps will almost always be the same for most modules

1. Make an API call to the endpoint
2. Parse the subdomains from the incoming JSON content
3. Check if there is pagination
   - If pagination exists, go back to step 1 for the next page
   - If there is no pagination, break the loop
4. Return the discovered subdomains to Subscan
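
The steps above can be sketched as a small loop. This is a minimal, synchronous illustration under stated assumptions: `Page`, `collect_all`, and the `fetch` callback are hypothetical names, and the callback stands in for the real HTTP request and JSON parsing

```rust
use std::collections::BTreeSet;

/// One page of a hypothetical API response: the subdomains it lists
/// and the URL of the next page, if any
struct Page {
    subdomains: Vec<String>,
    next: Option<String>,
}

/// Walk every page starting at `first_url`, collecting subdomains
/// until no next page is reported
fn collect_all<F>(first_url: &str, mut fetch: F) -> BTreeSet<String>
where
    F: FnMut(&str) -> Option<Page>,
{
    let mut found = BTreeSet::new();
    let mut url = Some(first_url.to_string());

    while let Some(current) = url {
        // Steps 1 and 2: call the endpoint and parse the page; stop on a failed request
        let Some(page) = fetch(&current) else { break };
        found.extend(page.subdomains);
        // Step 3: continue with the next page, or end the loop when there is none
        url = page.next;
    }

    // Step 4: hand the discovered subdomains back to the caller
    found
}
```

A production loop would usually also guard against an API that keeps returning the same next URL, for example by tracking visited URLs or capping the page count.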

To reduce the implementation time and avoid code duplication in Subscan, there are generic modules. Some of the registered modules in Subscan use these generic implementations, which can be viewed with the subscan module list command. Below are details of two modules, one using a generic module and one not

~$ subscan module get alienvault
+------------+------------+---------------+-------------+
| Name       | Requester  | Extractor     | Is Generic? |
+------------+------------+---------------+-------------+
| alienvault | HTTPClient | JSONExtractor | true        |
+------------+------------+---------------+-------------+

~$ subscan module get zonetransfer
+--------------+-----------+-----------+-------------+
| Name         | Requester | Extractor | Is Generic? |
+--------------+-----------+-----------+-------------+
| zonetransfer | None      | None      | false       |
+--------------+-----------+-----------+-------------+

The zonetransfer module is a highly specialized discovery method that performs DNS queries, so it cannot be defined generically. As the output shows, it has no Requester or Extractor component. In contrast, a module that makes API calls and parses the resulting output, such as the alienvault module, can build a GenericIntegrationModule within its implementation and return that instance

The following generic modules are defined within Subscan. For more details, follow the links provided

Generic Integration Module

The GenericIntegrationModule is primarily used for simple API integrations. To understand how it works, check the source code on the docs.rs page. Additionally, looking at the source code of other modules that use this implementation will help you understand how to utilize it

A module that uses this one internally would look like the following

pub const EXAMPLE_MODULE_NAME: &str = "example";
pub const EXAMPLE_URL: &str = "https://api.example.com/api/v1";

pub struct ExampleModule {}

impl ExampleModule {
    pub fn dispatcher() -> SubscanModuleDispatcher {
        let requester: RequesterDispatcher = HTTPClient::default().into();
        let extractor: JSONExtractor = JSONExtractor::new(Box::new(Self::extract));

        let generic = GenericIntegrationModule {
            name: EXAMPLE_MODULE_NAME.into(),
            auth: AuthenticationMethod::NoAuthentication,
            funcs: GenericIntegrationCoreFuncs {
                url: Box::new(Self::get_query_url),
                next: Box::new(Self::get_next_url),
            },
            components: SubscanModuleCoreComponents {
                requester: requester.into(),
                extractor: extractor.into(),
            },
        };

        generic.into()
    }

    pub fn get_query_url(domain: &str) -> String {
        format!("{EXAMPLE_URL}/{domain}/subdomains")
    }

    pub fn get_next_url(_url: Url, _content: Content) -> Option<Url> {
        None
    }

    pub fn extract(content: Value, _domain: &str) -> Result<BTreeSet<Subdomain>> {
        if let Some(items) = content["items"].as_array() {
            let filter = |item: &Value| Some(item["hostname"].as_str()?.to_string());

            return Ok(items.iter().filter_map(filter).collect());
        }

        Err(JSONExtract.into())
    }
}

Generic Search Engine Module

The GenericSearchEngineModule is primarily used for search engine integrations. It performs subdomain discovery by conducting dork searches on search engines and provides a generic implementation for search engines that use the same dork structure. To understand how it works, review the source code on the docs.rs page. Additionally, the source code of other module implementations that use this implementation can help guide you in its usage
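
For readers unfamiliar with dork searches, the sketch below builds one plausible query: it restricts results to the target domain and excludes hosts that are already known, so that later result pages surface new subdomains. The `site:` and `-site:` operators are common search engine syntax, but the exact query Subscan sends is an assumption here, and `build_dork` is a hypothetical helper

```rust
/// Build a search query restricted to `domain`, excluding hosts that were already found.
/// The result still needs URL encoding before it is placed in a query parameter.
fn build_dork(domain: &str, known: &[&str]) -> String {
    let mut query = format!("site:{domain}");

    for host in known {
        query.push_str(&format!(" -site:{host}"));
    }

    query
}
```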

A search engine module that uses this internally would look like the example below

pub const EXAMPLE_MODULE_NAME: &str = "example";
pub const EXAMPLE_SEARCH_URL: &str = "https://www.example.com/search";
pub const EXAMPLE_SEARCH_PARAM: &str = "q";
pub const EXAMPLE_CITE_TAG: &str = "cite";

pub struct ExampleModule {}

impl ExampleModule {
    pub fn dispatcher() -> SubscanModuleDispatcher {
        let url = Url::parse(EXAMPLE_SEARCH_URL);

        let extractor: HTMLExtractor = HTMLExtractor::new(EXAMPLE_CITE_TAG.into(), vec![]);
        let requester: RequesterDispatcher = HTTPClient::default().into();

        let generic = GenericSearchEngineModule {
            name: EXAMPLE_MODULE_NAME.into(),
            param: EXAMPLE_SEARCH_PARAM.into(),
            url: url.unwrap(),
            components: SubscanModuleCoreComponents {
                requester: requester.into(),
                extractor: extractor.into(),
            },
        };

        generic.into()
    }
}

Integrate Your Module Step by Step

This chapter provides a step-by-step guide on how to convert your custom subdomain discovery module into a SubscanModule component and integrate it with Subscan

Follow the steps below to integrate your module with Subscan

1. Create Your Custom Module

First, implement a module that follows the SubscanModuleInterface so that Subscan can run it. If you're unsure how to implement it, see the Create Your Own Module section. If your module will use a generic implementation, refer to the Generic Modules chapter

The file should be created in the src/modules directory, where the modules are organized by their functionality: generics are stored in generics/, integrations in integrations/, and search engines in engines/. Since we are integrating a custom module, we can create our file as src/modules/example.rs

Here is an example module that is compatible with the SubscanModuleInterface. Let's integrate it into Subscan

pub struct ExampleModule {
    pub name: String,
}

#[async_trait]
impl SubscanModuleInterface for ExampleModule {
    async fn name(&self) -> &str {
        &self.name
    }

    async fn requester(&self) -> Option<&Mutex<RequesterDispatcher>> {
        None
    }

    async fn extractor(&self) -> Option<&SubdomainExtractorDispatcher> {
        None
    }

    async fn run(&mut self, _domain: &str, results: Sender<OptionalSubscanModuleResult>) {
        let subdomains = BTreeSet::from_iter([
            Subdomain::from("bar.foo.com"),
            Subdomain::from("baz.foo.com"),
        ]);

        for subdomain in subdomains {
            results.send(self.item(&subdomain).await).unwrap()
        }
    }
}
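
The run method above reports each discovered subdomain through the results channel instead of returning a collection. The following sketch mirrors that flow with the standard library's mpsc channel and synchronous code. ModuleResult, run, and collect are simplified stand-ins invented for this illustration, and the channel and result types used by Subscan itself may differ

```rust
use std::collections::BTreeSet;
use std::sync::mpsc::{channel, Sender};

/// Hypothetical stand-in for a single module result item.
#[derive(Debug, PartialEq)]
pub struct ModuleResult {
    pub module: String,
    pub subdomain: String,
}

/// Producer side: sends one optional result per discovered subdomain.
pub fn run(module: &str, results: Sender<Option<ModuleResult>>) {
    // A BTreeSet removes duplicates and iterates in sorted order
    let subdomains = BTreeSet::from([
        "baz.foo.com".to_string(),
        "bar.foo.com".to_string(),
        "bar.foo.com".to_string(),
    ]);

    for subdomain in subdomains {
        let item = ModuleResult {
            module: module.to_string(),
            subdomain,
        };

        results.send(Some(item)).unwrap();
    }
    // `results` is dropped here, which closes the channel
}

/// Consumer side: drains the channel until every sender has been dropped.
pub fn collect(module: &str) -> Vec<String> {
    let (tx, rx) = channel();

    run(module, tx);

    // `flatten()` skips any `None` items
    rx.iter().flatten().map(|result| result.subdomain).collect()
}
```

Sending results one by one lets the consumer start processing subdomains while the module is still running, which matters more in the asynchronous setting than in this synchronous sketch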

2. Define Your Module as `SubscanModule`

To define this module as a SubscanModule, we need to wrap it with a SubscanModuleDispatcher, as shown in the implementation of the SubscanModule type below

/// `SubscanModule` type wrapper
pub type SubscanModule = Arc<Mutex<SubscanModuleDispatcher>>;

impl From<SubscanModuleDispatcher> for SubscanModule {
    fn from(module: SubscanModuleDispatcher) -> Self {
        Self::new(Mutex::new(module))
    }
}

2.1. Add a New Dispatcher Variant

Dispatchers are enums defined in the src/enums/dispatchers.rs file. Instead of boxing modules as trait objects and dispatching dynamically at runtime, Subscan stores each module type as a variant of an enum. Because every variant is known at compile time, calls made through the dispatcher are resolved with static dispatch. For more detailed technical information, refer to the enum_dispatch crate

Now, let's add a dispatcher variant for our module. If we are using a generic implementation, we don't need to do this, as a variant has already been created for generic implementations, as shown below

#[enum_dispatch(SubscanModuleInterface)]
pub enum SubscanModuleDispatcher {
    /// Enum variant for generic API integrations. It can be shared by all generic API
    /// modules. The only requirement is that the module is implemented as
    /// a [`GenericIntegrationModule`]
    GenericIntegrationModule(GenericIntegrationModule),
    /// Another generic variant, this time for search engines. It can be shared by all
    /// generic search engine modules. The only requirement is that the module is
    /// implemented as a [`GenericSearchEngineModule`]
    GenericSearchEngineModule(GenericSearchEngineModule),
    /// Non-generic `ExampleModule` module variant
    ExampleModule(ExampleModule), // Add this line
}
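
To show what the enum_dispatch attribute saves you from writing, here is a hand-written sketch of the same pattern using only the standard library. ModuleInterface, GenericModule, and ModuleDispatcher are simplified, synchronous stand-ins for the real Subscan types and exist only for this illustration

```rust
/// Simplified, synchronous stand-in for `SubscanModuleInterface`.
pub trait ModuleInterface {
    fn name(&self) -> &str;
    fn run(&mut self, domain: &str) -> Vec<String>;
}

pub struct GenericModule {
    pub name: String,
}

pub struct ExampleModule {
    pub name: String,
}

impl ModuleInterface for GenericModule {
    fn name(&self) -> &str {
        &self.name
    }

    fn run(&mut self, _domain: &str) -> Vec<String> {
        Vec::new()
    }
}

impl ModuleInterface for ExampleModule {
    fn name(&self) -> &str {
        &self.name
    }

    fn run(&mut self, domain: &str) -> Vec<String> {
        vec![format!("bar.{domain}"), format!("baz.{domain}")]
    }
}

/// One variant per concrete module type, no `Box<dyn ModuleInterface>` needed.
pub enum ModuleDispatcher {
    GenericModule(GenericModule),
    ExampleModule(ExampleModule),
}

/// The trait is implemented for the enum by matching on the variant and
/// forwarding the call, so every call target is known at compile time.
impl ModuleInterface for ModuleDispatcher {
    fn name(&self) -> &str {
        match self {
            Self::GenericModule(inner) => inner.name(),
            Self::ExampleModule(inner) => inner.name(),
        }
    }

    fn run(&mut self, domain: &str) -> Vec<String> {
        match self {
            Self::GenericModule(inner) => inner.run(domain),
            Self::ExampleModule(inner) => inner.run(domain),
        }
    }
}

/// Conversion that lets a concrete module turn itself into a dispatcher
/// variant with `.into()`.
impl From<ExampleModule> for ModuleDispatcher {
    fn from(module: ExampleModule) -> Self {
        Self::ExampleModule(module)
    }
}
```

The enum_dispatch macro generates the forwarding impl and the From conversions for every variant, which is why adding a single variant line to the enum is enough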

2.2. Implement the `dispatcher()` Method for Your Module

After adding the dispatcher variant, we can add a method named dispatcher() to our module that returns it as a dispatcher variant

impl ExampleModule {
    pub fn dispatcher() -> SubscanModuleDispatcher {
        let example = Self {
            name: "example".into(),
        };

        example.into()
    }
}

3. Add Your Module to the In-Memory Cache

The only thing left to do is add our module to the in-memory cache as a SubscanModule so that the CacheManager component can use it. To do this, add the module to the in-memory cache named MODULE_CACHE in the cache.rs file

/// All `Subscan` modules are stored in an in-memory [`Vec`] as a [`SubscanModule`](crate::types::core::SubscanModule)
static MODULE_CACHE: LazyLock<Vec<SubscanModule>> = LazyLock::new(|| {
    vec![
        // Search engines
        SubscanModule::from(bing::Bing::dispatcher()),
        SubscanModule::from(duckduckgo::DuckDuckGo::dispatcher()),
        SubscanModule::from(google::Google::dispatcher()),
        SubscanModule::from(yahoo::Yahoo::dispatcher()),
        // Integrations
        SubscanModule::from(alienvault::AlienVault::dispatcher()),
        SubscanModule::from(example::ExampleModule::dispatcher()), // Add this line
    ]
});
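
The cache pattern above can be tried in isolation with the standard library alone. In this sketch, Module is a simplified stand-in for a dispatcher, SharedModule plays the role of SubscanModule, and find_module is a hypothetical lookup helper that is not part of the CacheManager API. It also assumes std::sync::Mutex, while the real type may use an async mutex

```rust
use std::sync::{Arc, LazyLock, Mutex};

/// Simplified stand-in for a module dispatcher.
pub struct Module {
    pub name: String,
}

/// Shared, lockable module handle, playing the role of `SubscanModule`.
pub type SharedModule = Arc<Mutex<Module>>;

impl From<Module> for SharedModule {
    fn from(module: Module) -> Self {
        Self::new(Mutex::new(module))
    }
}

/// Small constructor used to keep the cache definition readable.
fn module(name: &str) -> SharedModule {
    SharedModule::from(Module { name: name.into() })
}

/// Built once, on first access, and then shared for the rest of the program.
pub static MODULE_CACHE: LazyLock<Vec<SharedModule>> = LazyLock::new(|| {
    vec![
        module("bing"),
        module("google"),
        module("example"), // the newly registered module
    ]
});

/// Hypothetical lookup helper: returns a cheap `Arc` clone of the cached
/// module with the given name, if any.
pub fn find_module(name: &str) -> Option<SharedModule> {
    MODULE_CACHE
        .iter()
        .find(|cached| cached.lock().unwrap().name == name)
        .cloned()
}
```

Because the cache holds Arc handles, a lookup hands out another reference to the same module rather than a copy, so any state kept by a module is shared by everyone who retrieves it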

4. Run Your Module

Build the project, then run your module by name with the module run command

~$ cargo build && target/debug/subscan module run example -d example.com

5. Write Tests for Your Module

Please write unit tests for your module. You can use the tests/ folder as a reference
