🧠 Master Guide to Using NVIDIA Garak for LLM Vulnerability Testing

https://github.com/NVIDIA/garak | https://garak.ai/garak_aiv_slides.pdf | https://garak.ai | https://reference.garak.ai/en/latest/

Author: Harshit Rajpal, Security Engineer, Bureau Veritas Cybersecurity North America

Introduction

In this guide, we will explore Garak – an open-source Generative AI Red-teaming and Assessment Kit by NVIDIA – and how to use it for scanning Large Language Models (LLMs) for vulnerabilities. Garak checks whether an LLM can be made to fail in a way we don't want: it probes for hallucination, data leakage, prompt injection, misinformation, toxicity generation, jailbreaks, and many other weaknesses, combining static, dynamic, and adaptive probes. If you know nmap or the Metasploit Framework, Garak does somewhat similar things, but for LLMs. We'll cover everything from installation and setup to running scans, focusing on key features like connecting Garak to different LLM interfaces (including a REST API chatbot), using specific probes (e.g. jailbreaking attacks), customizing prompts, speeding up scans, understanding Garak's components, writing your own plugin, and interpreting Garak's output reports. This comprehensive, step-by-step walkthrough reads like a technical whitepaper, complete with code examples, command-line usage, and references to official documentation and community insights.

Table of Contents

  1. Installation and Environment Setup
  2. Getting Started With Garak
  3. Connecting Garak to LLM Interfaces
  4. Running Garak Tests
  5. Selective Probes for Targeted Testing
  6. Custom Prompt Sources and External Datasets
  7. Speeding Up Scans
  8. Understanding Garak's Plugin Architecture
  9. Writing Your Own Plugin
  10. Evaluating and Reading Garak Reports
  11. Appendix: CLI Reference and Troubleshooting

1. Installation and Environment Setup

Garak pulls in a sizeable set of Python dependencies, so to keep them isolated from your system packages this guide uses Conda. Conda is a powerful command-line tool for package and environment management that runs on Windows, macOS, and Linux. Requirements:

  • Python: 3.10 (any version from 3.10 through 3.12 works)

  • OS: Windows 10+, Linux, or macOS

  • RAM: Minimum 8GB (more if using local LLMs via Ollama or transformers)


Let's get started with the setup.

I will be using a Windows 10 host in this guide; if you are on a different OS, use the equivalent commands for your platform. The key alternatives are given below.

First, let's get Conda up and running. Choose your installer from the Anaconda archive at https://repo.anaconda.com/archive/, then download and install it with the following commands.

Windows
# Navigate to your project folder. I am creating a 'Downloads' folder within it.
mkdir Downloads

wget "https://repo.anaconda.com/archive/Anaconda3-2025.06-0-Windows-x86_64.exe" -outfile "./Downloads/Anaconda3-2025.06-0-Windows-x86_64.exe"

# Run the installer via the GUI
Linux
# Navigate to your project folder. I am creating a 'Downloads' folder within it.
mkdir Downloads && cd Downloads

wget https://repo.anaconda.com/archive/Anaconda3-2025.06-1-Linux-x86_64.sh

chmod +x Anaconda3-2025.06-1-Linux-x86_64.sh && ./Anaconda3-2025.06-1-Linux-x86_64.sh

Follow the standard installation process. Once done, from within your project folder (mine is C:\Users\hex\Desktop\Garak), verify the Anaconda installation by running the conda command.
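
For example, either of these standard Conda commands will confirm a working install:

conda --version
conda info --envs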

We are ready to set up a new environment for Garak.

Windows and Linux
conda create --name garak python=3.10
conda activate garak
git clone https://github.com/NVIDIA/garak.git
cd garak
python -m pip install -e .

Once installed, Garak provides a command-line interface. To see basic usage, run garak -h

garak LLM vulnerability scanner v0.13.2.pre1 ( https://github.com/NVIDIA/garak ) at 2025-10-13T21:40:19.630483
usage: python -m garak [-h] [--verbose] [--report_prefix REPORT_PREFIX] [--narrow_output]
                       [--parallel_requests PARALLEL_REQUESTS] [--parallel_attempts PARALLEL_ATTEMPTS]
                       [--skip_unknown] [--seed SEED] [--deprefix] [--eval_threshold EVAL_THRESHOLD]
                       [--generations GENERATIONS] [--config CONFIG] [--target_type TARGET_TYPE]
                       [--target_name TARGET_NAME] [--probes PROBES] [--probe_tags PROBE_TAGS] [--detectors DETECTORS]
                       [--extended_detectors] [--buffs BUFFS] [--buff_option_file BUFF_OPTION_FILE |
                       --buff_options BUFF_OPTIONS] [--detector_option_file DETECTOR_OPTION_FILE |
                       --detector_options DETECTOR_OPTIONS] [--generator_option_file GENERATOR_OPTION_FILE |
                       --generator_options GENERATOR_OPTIONS] [--harness_option_file HARNESS_OPTION_FILE |
                       --harness_options HARNESS_OPTIONS] [--probe_option_file PROBE_OPTION_FILE |
                       --probe_options PROBE_OPTIONS] [--taxonomy TAXONOMY] [--plugin_info PLUGIN_INFO]
                       [--list_probes] [--list_detectors] [--list_generators] [--list_buffs] [--list_config]
                       [--version] [--report REPORT] [--interactive] [--generate_autodan] [--fix]

LLM safety & security scanning tool

options:
  -h, --help            show this help message and exit
  --verbose, -v         add one or more times to increase verbosity of output during runtime
  --report_prefix REPORT_PREFIX
                        Specify an optional prefix for the report and hit logs
  --narrow_output       give narrow CLI output
  --parallel_requests PARALLEL_REQUESTS
                        How many generator requests to launch in parallel for a given prompt. Ignored for models that
                        support multiple generations per call.
  --parallel_attempts PARALLEL_ATTEMPTS
                        How many probe attempts to launch in parallel. Raise this for faster runs when using non-local
                        models.
  --skip_unknown        allow skip of unknown probes, detectors, or buffs
  --seed, -s SEED       random seed
  --deprefix            remove the prompt from the front of generator output
  --eval_threshold EVAL_THRESHOLD
                        minimum threshold for a successful hit
  --generations, -g GENERATIONS
                        number of generations per prompt
  --config CONFIG       YAML config file for this run
  --target_type, -t, --model_type, -m TARGET_TYPE
                        module and optionally also class of the generator, e.g. 'huggingface', or 'openai'
  --target_name, --model_name, -n TARGET_NAME
                        name of the target, e.g. 'timdettmers/guanaco-33b-merged'
  --probes, -p PROBES   list of probe names to use, or 'all' for all (default).
  --probe_tags PROBE_TAGS
                        only include probes with a tag that starts with this value (e.g. owasp:llm01)
  --detectors, -d DETECTORS
                        list of detectors to use, or 'all' for all. Default is to use the probe's suggestion.
  --extended_detectors  If detectors aren't specified on the command line, should we run all detectors? (default is
                        just the primary detector, if given, else everything)
  --buffs, -b BUFFS     list of buffs to use. Default is none
  --buff_option_file, -B BUFF_OPTION_FILE
                        path to JSON file containing options to pass to buff
  --buff_options BUFF_OPTIONS
                        options to pass to buff, formatted as a JSON dict
  --detector_option_file, -D DETECTOR_OPTION_FILE
                        path to JSON file containing options to pass to detector
  --detector_options DETECTOR_OPTIONS
                        options to pass to detector, formatted as a JSON dict
  --generator_option_file, -G GENERATOR_OPTION_FILE
                        path to JSON file containing options to pass to generator
  --generator_options GENERATOR_OPTIONS
                        options to pass to generator, formatted as a JSON dict
  --harness_option_file, -H HARNESS_OPTION_FILE
                        path to JSON file containing options to pass to harness
  --harness_options HARNESS_OPTIONS
                        options to pass to harness, formatted as a JSON dict
  --probe_option_file, -P PROBE_OPTION_FILE
                        path to JSON file containing options to pass to probe
  --probe_options PROBE_OPTIONS
                        options to pass to probe, formatted as a JSON dict
  --taxonomy TAXONOMY   specify a MISP top-level taxonomy to be used for grouping probes in reporting. e.g. 'avid-
                        effect', 'owasp'
  --plugin_info PLUGIN_INFO
                        show info about one plugin; format as type.plugin.class, e.g. probes.lmrc.Profanity
  --list_probes         list all available probes. Usage: combine with --probes/-p to filter for probes that will be
                        activated based on a `probe_spec`, e.g. '--list_probes -p dan' to show only active 'dan'
                        family probes.
  --list_detectors      list available detectors. Usage: combine with --detectors/-d to filter for detectors that will
                        be activated based on a `detector_spec`, e.g. '--list_detectors -d misleading.Invalid' to show
                        only that detector.
  --list_generators     list available generation model interfaces
  --list_buffs          list available buffs/fuzzes
  --list_config         print active config info (and don't scan)
  --version, -V         print version info & exit
  --report, -r REPORT   process garak report into a list of AVID reports
  --interactive, -I     Enter interactive probing mode
  --generate_autodan    generate AutoDAN prompts; requires --prompt_options with JSON containing a prompt and target
  --fix                 Update provided configuration with fixer migrations; requires one of --config /
                        --*_option_file, / --*_options

See https://github.com/NVIDIA/garak

2. Getting Started With Garak

To quickly verify the setup, run Garak end-to-end with its built-in test components:

  • Generator: test.Blank – a mock model that returns blank outputs, specified with --target_type

  • Probe: test.Test – sends a set of dummy test prompts, specified with --probes

garak --target_type test.Blank --probes test.Test

As you may have observed from the run output, the JSONL report and HTML summary have been written to the default directory ~\.local\share\garak\garak_runs\ (the same path with forward slashes on Linux and macOS).

Modules

Now, a bit about various modules in Garak. The major components are as follows:

  • Probes – the attackers. Probes generate specific prompts or input scenarios to test the LLM for vulnerabilities or behavioral weaknesses; each probe targets a particular issue such as jailbreaks, injections, or bias. Example: the dan family (e.g. dan.Dan_11_0) sends "Do Anything Now" jailbreak prompts to test guardrail bypasses.

  • Generators – the LLM interfaces that Garak queries. They handle sending prompts and retrieving model responses, abstracting away API calls or local model inference. Example: ollama.OllamaGenerator connects Garak to a locally running LLaMA2 via Ollama's REST API.

  • Detectors – the judges. Detectors analyze model outputs to decide whether a failure or unsafe behavior occurred, checking for toxicity, leakage, or rule-breaking via string matching or ML classifiers. Example: dan.DAN flags responses in which the model has adopted the jailbreak persona.

  • Evaluators – the scorekeepers. Evaluators summarize and score test outcomes, turning raw detector results into the pass/fail numbers that feed Garak's reports. Example: base.ThresholdEvaluator counts a response as a hit once a detector's score crosses the threshold (cf. --eval_threshold).

  • Harnesses – the orchestrators. Harnesses control how tests are executed, managing probes, generators, detectors, and parallelization, and coordinating test scheduling and repeatability. Example: probewise.ProbewiseHarness runs each selected probe with that probe's recommended detectors.

  • Resources – helper files, datasets, and lookup tables used by probes, detectors, and evaluators, including wordlists, pattern definitions, and canned prompts. Example: the garak/resources/ directory ships data such as the AutoDAN prompt material used by the autodan probes.

By combining these, we can tailor our scans.
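
To see how these pieces combine in practice, here is a hypothetical one-liner built only from flags in the help output above: it points the ollama generator at a locally pulled model (llama3 is a placeholder for whatever model you have installed), selects the dan jailbreak probe family, and requests two generations per prompt:

garak --target_type ollama --target_name llama3 --probes dan --generations 2

Garak picks each probe's recommended detectors automatically (per the help text for --detectors), and the harness handles scheduling; you only need to name the generator and the probes.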

3. Connecting Garak to LLM Interfaces

Garak can connect to many different LLM interfaces (see --list_generators). The most common in assessments of deployed applications is an HTTP REST API endpoint that returns JSON or plaintext output.

Let's assume an LLM is replying on the /api/v1/ai/chat endpoint of the host example.com, and that the API request looks like the following:

POST /api/v1/ai/chat HTTP/1.1
Host: example.com
Cookie: Bearer AUTH TOKEN
Content-Length: 349
User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36
Content-Type: application/json
Accept: */*
Origin: example.com
Accept-Encoding: gzip, deflate, br
Priority: u=1, i
Connection: keep-alive

{"datetime":"2025-09-25T19:58:25.344+00:00","domain_id":"example.com","user_id":"1","content_id":"randomcontent","item_id":"123456789","prompt":"query","question":"What is the weather in jersey"}

Let's also assume the response looks like:

HTTP/1.1 200 OK
Date: Thu, 25 Sep 2025 19:58:26 GMT
Content-Type: application/json
Content-Length: 518
Connection: keep-alive
Server: nginx
Access-Control-Allow-Methods: POST, GET, OPTIONS

{"version": "1", "response": [{"text": "### Weather in Jersey\n\nUnfortunately, the provided content does not contain information about the weather in Jersey. If you are looking for weather updates, it is recommended to check a reliable weather website or app for the most current information.", "rts": 0.8001093830025638, "logged": []}], "model": "gpt-35-turbo-16k-1106"}

You then need to manually create a JSON config file for Garak's REST generator; the full option reference is at https://reference.garak.ai/en/latest/garak.generators.rest.html. For our case, the config (saved as api_web_config.json) becomes:

{
   "rest": {
      "RestGenerator": {
         "name": "Example Content Copilot",
         "uri": "https://example.com/api/v1/ai/chat",
         "method": "post",
         "headers": {
            "Cookie": "Bearer AUTH TOKEN",
            "Content-Type": "application/json",
            "Accept": "*/*",
            "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36",
            "Origin": "example.com",
            "Referer": "https://example.com/items/123456789",
            "Accept-Encoding": "gzip, deflate, br",
            "Priority": "u=1, i",
            "Connection": "keep-alive"
         },
         "req_template_json_object": {
            "datetime": "2025-09-25T15:49:27.950+00:00",
            "domain_id": "example.com",
            "user_id": "1",
            "content_id": "randomcontent",
            "item_id": "123456789",
            "prompt": "query",
            "question": "$INPUT"
         },
         "response_json": true,
         "response_json_field": "$.response[0].text",
         "skip_codes": [500, 504, 422],
         "request_timeout": 3000
      }
   }
}

The two things to pin down in this config are which request parameter carries the user's query and which response field carries the LLM's answer. Here, the "question" parameter in the request holds the user's query and the "text" field inside the "response" array holds the LLM's reply, so these two fields are specially marked in api_web_config.json:

  • "$INPUT" tells Garak where to inject its attack prompts. Put it in whichever parameter is user-controlled in your case.

  • "response_json_field" tells Garak where to look for the LLM's response when judging whether an attack worked. It is defined with JSONPath syntax: here the reply sits in the "text" field of the first element of the "response" array, hence "$.response[0].text".
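
Before wiring this config into Garak, it can be worth confirming the two field mappings by hand. A minimal sketch with curl and jq (bash syntax), assuming the same example endpoint, auth cookie, and request body as above:

# send one hand-built request, then extract the same field Garak will read
curl -s "https://example.com/api/v1/ai/chat" \
  -H "Cookie: Bearer AUTH TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"datetime":"2025-09-25T15:49:27.950+00:00","domain_id":"example.com","user_id":"1","content_id":"randomcontent","item_id":"123456789","prompt":"query","question":"ping"}' \
  | jq -r '.response[0].text'

If this prints a sensible model reply, the "question" injection point and the "$.response[0].text" extraction path are both correct.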

4. Running Garak Tests

Once the config is ready, you are free to run Garak against the endpoint (--model_type is an alias for --target_type; both work):

garak --model_type rest -G api_web_config.json

With no --probes argument this runs the full default probe set, which can take a long time. Before committing to that, it is worth a quick connectivity check using the harmless test probe from earlier:

garak --model_type rest -G api_web_config.json --probes test.Test

5. Selective Probes for Targeted Testing

Let's say you only want specific tests, such as prompt injection. Use garak --list_probes to enumerate the available probes, then pass a probe family (or an individual probe) via --probes:

garak --model_type rest -G api_web_config.json --probes promptinject --parallel_attempts 20
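
Probes can also be selected by tag rather than by name. Per the help output above, --probe_tags keeps only probes whose tags start with a given value, for example the OWASP LLM Top 10 category for prompt injection:

garak --model_type rest -G api_web_config.json --probe_tags owasp:llm01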

6. Custom Prompt Sources and External Datasets

Garak's probes ship with their own prompt sets, but many accept options via --probe_options or --probe_option_file (see the help output above), formatted as a JSON dict; depending on the probe, these options can include custom prompt material or external datasets. Use --plugin_info probes.<module>.<Class> to check what a given probe supports.

7. Speeding Up Scans

You can play with Garak's parallelization knobs as well. --parallel_attempts launches multiple probe attempts at once, which mainly helps against remote (non-local) targets:

garak --model_type rest -G api_web_config.json --parallel_attempts 20

Per the help output above, --parallel_requests is the companion option: it controls how many generator requests are launched in parallel for a single prompt, and is ignored for models that natively support multiple generations per call.

8. Understanding Garak’s Plugin Architecture
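
Garak discovers its plugins (probes, detectors, generators, buffs) dynamically, and the CLI can enumerate and inspect them directly; all of the following commands come straight from the help output above, including the probes.lmrc.Profanity example:

# enumerate installed plugins by type
garak --list_probes
garak --list_detectors
garak --list_generators
garak --list_buffs

# inspect a single plugin, formatted as type.plugin.class
garak --plugin_info probes.lmrc.Profanity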

9. Writing Your Own Plugin

Because we installed Garak from a cloned repository with python -m pip install -e ., the source tree is live: probes, detectors, and generators are Python classes under garak/probes/, garak/detectors/, and garak/generators/, and a new module placed there is picked up by --list_probes and friends once it follows the base-class conventions. The plugin reference at https://reference.garak.ai/en/latest/ documents the base classes to subclass.

10. Evaluating and Reading Garak Reports

Each run writes a JSONL report (one JSON object per line) plus an HTML summary to the garak_runs directory shown in section 2. The JSONL report can be flattened into a CSV, one row per attempt with its UUID, probe, prompts, and outputs, using jq:

# header
echo 'uuid,probe_classname,prompt.turns.content.text,outputs.text' > out.csv

# extract rows, skipping those without a valid uuid
jq -r '
  # stash id and filter: require a string uuid with length > 0
  (.uuid // .UUID) as $id
  | select($id != null and ($id | type) == "string" and ($id | length) > 0)
  | [
      $id,
      (.probe_classname // ."probe classname" // ""),
      ((.prompt.turns  // []) | map(.content?.text // "") | join(" | ")),
      ((.outputs       // []) | map(.text // (.content?.text // "")) | join(" | "))
    ]
  | @csv
' data-output-report.jsonl >> out.csv
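
Beyond manual parsing, the help output above notes that --report post-processes a Garak report into a list of AVID (AI Vulnerability Database) reports; the filename below is a placeholder for your own run's report:

garak --report data-output-report.jsonl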

11. Appendix: CLI Reference and Troubleshooting

Quick Reference and Common Fixes

A concise cheatsheet of Garak's key CLI options (--target_type/-t, --target_name/-n, --probes/-p, --detectors/-d, --parallel_attempts, --parallel_requests), plus common issues such as encoding errors, missing plugins, and REST connection failures, with PowerShell vs. Linux equivalents where they differ. Keep it as a back-pocket reference when setting up new scans.
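
As a starting point, here is a minimal version of that cheatsheet, built only from flags shown in the help output earlier plus the REST config file used in this guide:

# smoke-test the install
garak --target_type test.Blank --probes test.Test

# enumerate plugins
garak --list_probes
garak --list_generators

# scan a REST endpoint with one probe family, 20 attempts in parallel
garak --target_type rest -G api_web_config.json --probes dan --parallel_attempts 20

# version and verbose output
garak --version
garak -v --target_type test.Blank --probes test.Test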


