
How to Set Service Recovery Options (Failure Actions) in Batch Script

Creating a Windows Service is only half the battle. To ensure high availability and reliability, you must also configure the service's behavior in the event of a crash. Windows allows you to define "Recovery Options" (or Failure Actions) that can automatically restart the service, reboot the computer, or even run a separate script when the service fails.

This guide will explain how to use the sc failure command in a Batch script to automate these recovery settings, ensuring your critical background processes remain resilient.

The SC FAILURE Command

The sc (Service Control) utility includes a sub-command called failure specifically for managing these recovery rules. Using this command, you can set actions for the first, second, and subsequent failures.

Basic Syntax

sc failure "ServiceName" reset= [Seconds] actions= [Action1]/[Delay1]/[Action2]/[Delay2]/...
Warning: The Space Requirement. As with many sc commands, the equals sign belongs to the option name: write it directly after the name, then put a space before the value (e.g., actions= restart/60000). Without this space, the command will return a syntax error.
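As a minimal sketch of the spacing rule, the lines below contrast the accepted form with the variants the warning describes. The service name MyCriticalSvc is a placeholder, and the rejected variants are left as comments so the script stays runnable.

```batch
@echo off
rem Hypothetical service name used only for illustration
set "SvcName=MyCriticalSvc"

rem Correct: no space before "=", one space after it
sc failure "%SvcName%" reset= 86400 actions= restart/60000

rem Variants the warning above describes, shown commented out:
rem   sc failure "%SvcName%" reset=86400 actions=restart/60000
rem   sc failure "%SvcName%" reset = 86400 actions = restart/60000
```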

Configuring the Actions

The actions= parameter is a slash-separated string that defines what happens and when.

Available Action Types:

  • restart: Restarts the service.
  • run: Runs a program or script.
  • reboot: Restarts the computer.
  • "": Does nothing (no action).
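The action types can be mixed in one actions= string. The sketch below assumes a placeholder service named MyCriticalSvc and an illustrative message text. It restarts the service twice and then reboots the machine, using the reboot= parameter of sc failure to set the broadcast message.

```batch
@echo off
rem Hypothetical service name; replace it with a real one
set "SvcName=MyCriticalSvc"

rem 1st failure: restart the service after 1 minute
rem 2nd failure: restart the service after 1 minute
rem 3rd and later: reboot the computer after 5 minutes
rem reboot= sets the message that is broadcast when the reboot action fires
sc failure "%SvcName%" reset= 86400 reboot= "MyCriticalSvc failed repeatedly. Rebooting." actions= restart/60000/restart/60000/reboot/300000
```

A reboot action affects everything on the machine, so it is usually reserved for the last slot in the string.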

Example: Standard Restart Logic

Here is a script that configures a service to restart after 1 minute on the first failure, and after 2 minutes on every failure after that.

@echo off
set "SvcName=MyCriticalSvc"

:: Verify the service exists before configuring recovery
sc query "%SvcName%" >nul 2>&1
if %errorlevel% neq 0 (
echo [ERROR] Service '%SvcName%' does not exist.
pause
exit /b 1
)

echo [ACTION] Configuring recovery options for %SvcName%...

:: reset= 86400: Reset the failure count after 1 day (86400 seconds)
:: actions= restart/60000/restart/120000:
:: 1st failure: Restart after 60 seconds (60000ms)
:: 2nd and subsequent: Restart after 120 seconds (120000ms)
sc failure "%SvcName%" reset= 86400 actions= restart/60000/restart/120000

if %errorlevel% equ 0 (
echo [SUCCESS] Recovery options applied.
) else (
echo [ERROR] Failed to configure recovery. Ensure you are running as Administrator.
)

pause
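To confirm what was stored, the settings can be read back with sc qfailure. This is a small sketch that assumes the same placeholder service name as the script above.

```batch
@echo off
set "SvcName=MyCriticalSvc"

rem Show the stored reset period, reboot message, command line, and failure actions
sc qfailure "%SvcName%"
```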

Running a Custom Script on Failure

One of the most powerful options is the run action. It tells Windows to launch a program of your choice, such as a "Sanity Check" or "Cleanup" script, when the service crashes.

@echo off
set "TargetSvc=MyDatabaseProxy"
set "FixScript=C:\Scripts\fix_proxy.bat"

:: Verify the service exists
sc query "%TargetSvc%" >nul 2>&1
if %errorlevel% neq 0 (
echo [ERROR] Service '%TargetSvc%' does not exist.
pause
exit /b 1
)

:: Verify the recovery script exists
if not exist "%FixScript%" (
echo [ERROR] Recovery script not found: %FixScript%
pause
exit /b 1
)

:: Set up a failure action that runs a cleanup script after 5 seconds
:: The command= parameter specifies which program to run
:: reset= is included because sc expects it whenever actions= is given
sc failure "%TargetSvc%" reset= 86400 command= "%FixScript%" actions= run/5000

if %errorlevel% equ 0 (
echo [SUCCESS] Service will now run %FixScript% 5 seconds after a crash.
) else (
echo [ERROR] Failed to configure recovery. Ensure you are running as Administrator.
)

pause
Info: The command= parameter. When you use the run action, specify the program or script to execute with command= on the same sc failure command line.
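The source does not show what the recovery script contains, so the following is only a hypothetical sketch of C:\Scripts\fix_proxy.bat. The log path and the cleanup placeholder are assumptions. The script records the failure and then starts the service again, because a run action on its own leaves the service stopped.

```batch
@echo off
rem Hypothetical contents of C:\Scripts\fix_proxy.bat
rem The log path is an assumption; choose a folder the service account can write to
set "LogFile=C:\Scripts\fix_proxy.log"
set "TargetSvc=MyDatabaseProxy"

echo [%date% %time%] %TargetSvc% failed. Recovery script started.>>"%LogFile%"

rem Placeholder for real cleanup work, for example deleting a stale lock file

rem The run action does not restart the service by itself, so start it here
sc start "%TargetSvc%" >>"%LogFile%" 2>&1
```

The recovery program runs without an interactive console, so writing to a log file is the practical way to see what it did.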

Best Practices and Rules for Recovery Configuration

1. Administrative Privileges

Recovery options are stored with the service's configuration in the registry under HKLM\SYSTEM\CurrentControlSet\Services, which standard users cannot modify. Your Batch script must therefore be run as an Administrator from an elevated prompt.
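A script can check for elevation before it touches any service. The sketch below relies on the common idiom that net session fails in a non-elevated prompt. This assumes the Server service is running, which is the default on most systems.

```batch
@echo off
rem "net session" normally succeeds only in an elevated prompt
net session >nul 2>&1
if %errorlevel% neq 0 (
    echo [ERROR] This script must be run as Administrator.
    exit /b 1
)
echo [OK] Running elevated.
```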

2. Time in Milliseconds

In the actions= string, the delays are measured in milliseconds, not seconds.

  • 1000 = 1 second
  • 60000 = 1 minute
  • 3600000 = 1 hour

3. Reset Counter

The reset= value is in seconds. It is the length of the failure-free period after which the failure counter returns to zero. A reasonable reset time (such as 24 hours) means a service that crashes only once a week always gets its "First Failure" action instead of escalating to the "Subsequent Failure" action.
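Because reset= uses seconds and the action delays use milliseconds, it can help to compute both values with set /a rather than typing them by hand. The service name and the chosen durations below are illustrative assumptions.

```batch
@echo off
rem Hypothetical service name and durations chosen for illustration
set "SvcName=MyCriticalSvc"

rem 24 hours expressed in seconds for reset=
set /a "ResetSec=24 * 60 * 60"
rem 2 minutes expressed in milliseconds for the action delay
set /a "DelayMs=2 * 60 * 1000"

echo reset= %ResetSec% seconds, restart delay= %DelayMs% ms
sc failure "%SvcName%" reset= %ResetSec% actions= restart/%DelayMs%
```

Here ResetSec evaluates to 86400 and DelayMs to 120000, matching the conversion list above.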

How to Avoid Common Errors

Wrong Way: Formatting the actions string incorrectly

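The actions string must strictly follow the type/time format, with the time given in milliseconds. The sc failure command also expects reset= whenever actions= is given.

Wrong Approach:

:: Space instead of a slash, and a delay that looks like seconds
sc failure "Svc" reset= 86400 actions= restart 60

Correct Way:

sc failure "Svc" reset= 86400 actions= restart/60000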

Best Practice: Combining with "Automatic Start"

Recovery options are most effective when the service is also set to start= auto. This ensures that the service starts again after a computer reboot, so its recovery logic has something to protect.
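A minimal sketch of the combination, assuming a placeholder service named MyCriticalSvc, sets the start type with sc config and then applies the recovery rules:

```batch
@echo off
rem Hypothetical service name; replace it with a real one
set "SvcName=MyCriticalSvc"

rem Start the service automatically at boot
sc config "%SvcName%" start= auto

rem Restart after 1 minute, then after 2 minutes for later failures
sc failure "%SvcName%" reset= 86400 actions= restart/60000/restart/120000
```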

Real-World Use Case: The "Unstoppable" Agent

Imagine a security agent that must be running at all times.

@echo off
set "SvcName=SafeAgent"

:: Verify the service exists
sc query "%SvcName%" >nul 2>&1
if %errorlevel% neq 0 (
echo [ERROR] Service '%SvcName%' does not exist. Install it first.
pause
exit /b 1
)

echo [STEP 1] Setting 3-tier recovery...
:: 1st failure: Restart after 1 second (1000ms)
:: 2nd failure: Restart after 10 seconds (10000ms)
:: 3rd and subsequent: Restart after 1 minute (60000ms)
:: Reset the failure counter after 12 hours (43200 seconds)
sc failure "%SvcName%" reset= 43200 actions= restart/1000/restart/10000/restart/60000

if %errorlevel% neq 0 (
echo [ERROR] Failed to set recovery options. Ensure you are running as Administrator.
pause
exit /b 1
)

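echo [STEP 2] Enabling recovery actions for stops that report an error...
:: failureflag 1: also trigger the actions when the service stops with a non-zero exit code
sc failureflag "%SvcName%" 1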

if %errorlevel% equ 0 (
echo.
echo [SUCCESS] Recovery hardening complete for '%SvcName%'.
echo 1st failure: Restart after 1 second
echo 2nd failure: Restart after 10 seconds
echo 3rd+ failure: Restart after 1 minute
) else (
echo [WARNING] Recovery options were set but the failure flag could not be configured.
)

pause
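The failure flag set in step 2 can be read back with sc qfailureflag. This sketch assumes the same SafeAgent service name used above.

```batch
@echo off
set "SvcName=SafeAgent"

rem Show whether failure actions also apply to stops that report an error
sc qfailureflag "%SvcName%"
```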

Conclusions

Configuring Service Recovery Options is a small step that pays off whenever a service crashes unattended. By using the sc failure command, you transform your Windows Services from fragile processes into resilient, self-healing background agents. Always test your failure triggers in a development environment to confirm that the actions fire as expected and that any script launched by the run action has the permissions it needs when the main service dies.
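One way to exercise a failure trigger on a test machine is to kill the service's process and watch whether it comes back. The sketch below assumes the SafeAgent service from the earlier example and that it runs in its own process. For a service that shares a host process, taskkill would stop the other services in that process too.

```batch
@echo off
rem Run only on a test machine: this forcibly kills the service process
set "SvcName=SafeAgent"

rem Kill the process hosting the service to simulate a crash
taskkill /f /fi "SERVICES eq %SvcName%"

rem Wait longer than the first restart delay (1 second in the example above)
timeout /t 5 /nobreak >nul

rem The service should report RUNNING again if the restart action fired
sc query "%SvcName%" | find "STATE"
```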