Brumbelow/gh-secret

gh-secret

gh-secret is a Python CLI that uses gh search code plus local filtering and ranking heuristics to find likely secret exposures tied to a literal target such as example.com.

It is designed around the current gh code-search API, with a website-search fallback for GitHub rate-limit failures. The tool runs six exact query variants for the target, extracts assignment-like matches from returned snippets, and prints the highest-confidence findings.

Prerequisites

  • Python 3.13 or newer
  • GitHub CLI (gh) available in PATH
  • Valid GitHub authentication:
gh auth login -h github.com

Installation

python3 -m pip install -e .

On Kali/Debian systems with an externally managed Python environment, use a virtual environment instead:

python3 -m venv .venv
source .venv/bin/activate
python -m pip install -e .

Alternatively:

pipx install --editable .

Usage

Basic command:

gh-secret example.com

CLI synopsis:

gh-secret TARGET [--top N] [--per-query-limit N] [--exclude-docs] [--redact]

Arguments and flags:

  • TARGET: required literal target string such as example.com
  • --top N: maximum number of ranked findings to print, default 10
  • --per-query-limit N: maximum raw results to request for each query variant, default 500, maximum 500
  • --exclude-docs: exclude documentation-like files and search-instruction content
  • --redact: mask extracted candidate values in the output
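The synopsis and defaults above map naturally onto `argparse`. The following is a minimal sketch of such a parser, not the project's actual code: the helper names `bounded_limit` and `build_parser` are hypothetical, and the lower bound of 1 for `--per-query-limit` is an assumption (only the maximum of 500 is documented).

```python
import argparse

MAX_PER_QUERY_LIMIT = 500  # documented ceiling for --per-query-limit


def bounded_limit(raw: str) -> int:
    """Parse --per-query-limit; the lower bound of 1 is an assumption."""
    value = int(raw)
    if not 1 <= value <= MAX_PER_QUERY_LIMIT:
        raise argparse.ArgumentTypeError(
            f"must be between 1 and {MAX_PER_QUERY_LIMIT}"
        )
    return value


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented synopsis."""
    parser = argparse.ArgumentParser(prog="gh-secret")
    parser.add_argument("target", metavar="TARGET", help="literal target string")
    parser.add_argument("--top", type=int, default=10, metavar="N")
    parser.add_argument(
        "--per-query-limit", type=bounded_limit, default=500, metavar="N"
    )
    parser.add_argument("--exclude-docs", action="store_true")
    parser.add_argument("--redact", action="store_true")
    return parser


args = build_parser().parse_args(["example.com", "--top", "5", "--redact"])
print(args.target, args.top, args.per_query_limit, args.redact)
```

Using a custom `type` callable keeps the range check inside argument parsing, so an out-of-range limit fails with a normal usage error before any search is issued.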

Examples:

gh-secret example.com --redact
gh-secret example.com --top 5 --redact
gh-secret example.com --per-query-limit 100 --exclude-docs --redact

Optional environment variables:

  • GH_SECRET_WEB_COOKIE: GitHub web-session cookie string used only when gh search code hits an HTTP 403 rate limit and gh-secret attempts the website JSON fallback

How It Works

For each run, gh-secret issues these six upstream query variants:

  • "TARGET" "secret="
  • "TARGET" "secret ="
  • "TARGET" "token="
  • "TARGET" "token ="
  • "TARGET" "key="
  • "TARGET" "key ="
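The six variants are the product of three keywords and two spacing styles. A minimal sketch of how they could be generated follows; `build_queries` and the two constants are hypothetical names, and how the strings are actually handed to `gh search code` is not shown here.

```python
KEYWORDS = ("secret", "token", "key")
SUFFIXES = ("=", " =")  # with and without a space before the equals sign


def build_queries(target: str) -> list[str]:
    """Return the six quoted query variants for a literal target."""
    return [
        f'"{target}" "{keyword}{suffix}"'
        for keyword in KEYWORDS
        for suffix in SUFFIXES
    ]


for query in build_queries("example.com"):
    print(query)
```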

It then:

  • parses textMatches fragments from the gh search code response
  • falls back to the equivalent website JSON search for that exact query, using GH_SECRET_WEB_COOKIE, if gh search code returns an HTTP 403 rate-limit error
  • treats the upstream gh search code hit as proof that the target domain and keyword family matched somewhere in the same file
  • extracts only assignment-like candidates such as secret=..., token = ..., or api_key: ... when an actual value is present
  • optionally excludes obvious documentation and tutorial noise when --exclude-docs is set
  • removes duplicates across result fragments
  • scores candidates based on value shape, keyword context, and path hints
  • prints only findings with a score of at least 40

The target domain does not need to appear right next to the detected secret in the returned snippet. The domain/file match comes from the GitHub search query, while value extraction comes from the snippet fragments that gh search code returns.
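To illustrate the extraction and de-duplication steps, here is a minimal sketch that pulls assignment-like candidates out of snippet fragments. The regular expression and the `extract_candidates` helper are assumptions for illustration; the real patterns and the scoring heuristics are not documented in this README, so scoring is left out.

```python
import re

# A name containing secret/token/key, then '=' or ':', then a non-empty value.
# Matching stays on one line, so "token = " with nothing after it is skipped.
ASSIGNMENT_RE = re.compile(
    r"""([A-Za-z0-9_.-]*(?:secret|token|key)[A-Za-z0-9_.-]*)[ \t]*[:=][ \t]*["']?([^\s"',;]+)""",
    re.IGNORECASE,
)


def extract_candidates(fragments: list[str]) -> list[tuple[str, str]]:
    """Return unique (name, value) pairs in first-seen order."""
    seen: set[tuple[str, str]] = set()
    candidates: list[tuple[str, str]] = []
    for fragment in fragments:
        for match in ASSIGNMENT_RE.finditer(fragment):
            name, value = match.group(1), match.group(2)
            key = (name.lower(), value)
            if key not in seen:  # drop duplicates across fragments
                seen.add(key)
                candidates.append((name, value))
    return candidates


fragments = [
    "DOMAIN=example.com SECRET=sEc0000123",  # obviously fake value
    'api_key: "abc123"',
    "SECRET=sEc0000123",  # duplicate of the first fragment's match
    "token = ",  # no value present, so not a candidate
]
print(extract_candidates(fragments))
```

Note that `DOMAIN=example.com` is not extracted: the domain match is assumed to come from the upstream query, not from the snippet.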

Output

Successful runs print either ranked findings or:

No high-confidence findings.

Ranked findings use this structure (the example shows output produced with --redact):

Found 1 high-confidence finding(s) for example.com.

1. score 82 | secret | owner/repo/.env
value: sEc******123
snippet: DOMAIN=example.com SECRET=sEc******123
url: https://github.com/owner/repo/blob/main/.env

By default, gh-secret prints full extracted values. Pass --redact to mask both the value: line and the matching content in the snippet: line.
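The exact masking rule is not documented. As a sketch, assuming the tool keeps the first and last three characters and replaces each hidden character with an asterisk (which is consistent with the sample output), redaction could look like this. The names `redact` and `redact_snippet` are hypothetical.

```python
def redact(value: str, keep: int = 3) -> str:
    """Mask the middle of a value; fully mask values too short to split."""
    if len(value) <= keep * 2:
        return "*" * len(value)
    hidden = len(value) - keep * 2
    return value[:keep] + "*" * hidden + value[-keep:]


def redact_snippet(snippet: str, value: str) -> str:
    """Replace every occurrence of the raw value inside a snippet."""
    return snippet.replace(value, redact(value))


raw = "sEcABCDEF123"  # obviously fake value
print(redact(raw))
print(redact_snippet(f"DOMAIN=example.com SECRET={raw}", raw))
```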

Limitations

  • Searches public repositories only. Private repositories are ignored by the current implementation.
  • Wildcard characters such as *, ?, [, ], {, and } are rejected in v1.
  • gh search code uses legacy GitHub code-search semantics, so punctuation-sensitive website-style queries do not map cleanly to the CLI.
  • The website-search fallback depends on a private GitHub JSON route and a valid GH_SECRET_WEB_COOKIE; if it is unavailable or unusable, gh-secret preserves the original 403 error from gh.
  • Results are heuristic. False positives and missed findings are both possible.
  • Searching all file types by default improves recall but also increases noise.
  • Repeated searches can hit GitHub API rate limits, including HTTP 403 responses from gh search code.
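The wildcard restriction above can be enforced with a simple character check before any query is built. This is a minimal sketch; `validate_target` is a hypothetical helper, and rejecting empty targets is an added assumption.

```python
WILDCARD_CHARS = frozenset("*?[]{}")  # characters rejected in v1


def validate_target(target: str) -> str:
    """Return the trimmed target, or raise ValueError if it is unusable."""
    cleaned = target.strip()
    if not cleaned:
        # Assumption: an empty target is treated as invalid.
        raise ValueError("TARGET must not be empty")
    found = sorted(set(cleaned) & WILDCARD_CHARS)
    if found:
        raise ValueError(
            f"wildcard characters are not supported: {' '.join(found)}"
        )
    return cleaned


print(validate_target("example.com"))
```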

Troubleshooting

If gh is missing:

`gh` is required but was not found in PATH.

Install the GitHub CLI and retry.

If authentication is invalid or missing:

`gh` authentication is not healthy. Run `gh auth login -h github.com` and retry.
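A preflight check that produces the two messages above might look like the following sketch. It assumes the health check is done with `gh auth status -h github.com`, which is not confirmed by this README; `check_gh` is a hypothetical helper, and the `which` and `run` parameters exist only so the check can be exercised without a real `gh` install.

```python
import shutil
import subprocess
from typing import Optional


def check_gh(which=shutil.which, run=subprocess.run) -> Optional[str]:
    """Return an error message, or None when gh looks usable."""
    if which("gh") is None:
        return "`gh` is required but was not found in PATH."
    # Assumption: a non-zero exit from `gh auth status` means unhealthy auth.
    result = run(
        ["gh", "auth", "status", "-h", "github.com"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return (
            "`gh` authentication is not healthy. "
            "Run `gh auth login -h github.com` and retry."
        )
    return None
```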

If GitHub rate limits your searches, gh-secret first tries the website JSON fallback for the same exact query when GH_SECRET_WEB_COOKIE is set. If that fallback is not configured or does not return usable authenticated results, the original gh search code HTTP 403 error is surfaced. In that case, wait for the rate-limit window to reset, avoid repeated runs in a short period, or lower --per-query-limit.
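The fallback behavior described above can be summarized as: try `gh`, and on a rate-limit failure try the website search only if a cookie is configured, otherwise surface the original error. The sketch below expresses that control flow. `RateLimitError`, `search_with_fallback`, and the `primary`/`fallback` callables are hypothetical stand-ins, and treating a `None` result as "unusable" is an assumption.

```python
import os
from typing import Callable, Mapping, Optional


class RateLimitError(RuntimeError):
    """Hypothetical error for an HTTP 403 rate limit from gh search code."""


def search_with_fallback(
    query: str,
    primary: Callable[[str], list],
    fallback: Callable[[str, str], Optional[list]],
    env: Optional[Mapping[str, str]] = None,
) -> list:
    """Run primary; on a rate limit, try fallback or re-raise the original."""
    env = os.environ if env is None else env
    try:
        return primary(query)
    except RateLimitError as original:
        cookie = env.get("GH_SECRET_WEB_COOKIE")
        if not cookie:
            raise  # fallback not configured: surface the original 403
        try:
            results = fallback(query, cookie)
        except Exception:
            raise original from None  # fallback failed: keep the original 403
        if results is None:  # assumption: None means no usable results
            raise original
        return results
```

Passing the two search functions in as arguments keeps the decision logic separate from the network calls, which makes the error-preservation rule easy to test.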

If the tool prints "No high-confidence findings.", either no matching results were found or every raw hit was filtered out as low-confidence noise.

Development

Run the test suite from the repo root:

python3 -m pytest

About

leaklayer secret detection for broad gh CLI searches, based on a specified domain or wildcard
