How to Create a Watchdog Script That Monitors a Process in Batch Script

In a perfect world, software would never crash. In the real world, background agents, sync tools, and server processes occasionally hang or close unexpectedly. A "Watchdog" is a simple but powerful script that stays active in the background, periodically checking whether your mission-critical application is still running. If the application disappears, the Watchdog restarts it at the next check, keeping downtime short without manual intervention.

This guide will explain how to build a resilient Watchdog script using native Batch commands and the tasklist utility.

The Logic: The "Monitor and React" Loop

A Watchdog script operates in a continuous loop:

  1. Check: Is the process name visible in the system task list?
  2. React: If not found, log the failure and trigger a restart.
  3. Rest: Wait for a few seconds before checking again (Polling).

Method 1: The Basic Process Watchdog

This script monitors a hypothetical application called MyService.exe.

@echo off
setlocal

set "ProcessName=MyService.exe"
set "AppPath=C:\Apps\MyService\MyService.exe"
set "LogFile=%~dp0watchdog_log.txt"
set "PollInterval=10"

:: Verify the application executable exists
if not exist "%AppPath%" (
echo [ERROR] Application not found: %AppPath%
pause
exit /b 1
)

title Watchdog: Monitoring %ProcessName%

:: Initial startup delay to allow the system to fully load
echo [%date% %time%] Watchdog started. Monitoring %ProcessName% every %PollInterval% seconds.
echo [%date% %time%] Watchdog started >> "%LogFile%"
timeout /t 10 /nobreak >nul

:WatchLoop
:: Check if the process is running
tasklist /fi "imagename eq %ProcessName%" /nh 2>nul | findstr /i "%ProcessName%" >nul

if errorlevel 1 (
    echo [%date% %time%] ALERT: %ProcessName% is not running!

    rem Re-check just before restarting in case the process came back (race condition guard).
    rem "if errorlevel 1" is evaluated at run time, so it works inside this block
    rem without delayed expansion.
    tasklist /fi "imagename eq %ProcessName%" /nh 2>nul | findstr /i "%ProcessName%" >nul
    if errorlevel 1 (
        echo [%date% %time%] Attempting to restart...
        start "" "%AppPath%"
        echo [%date% %time%] Restarted %ProcessName% >> "%LogFile%"

        rem Wait for the application to initialize before the next check
        timeout /t %PollInterval% /nobreak >nul
    )
)

:: Wait before the next check
timeout /t %PollInterval% /nobreak >nul
goto :WatchLoop

Method 2: The "Aggressive" Watchdog (Handle Hanging Apps)

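Sometimes an application is "Running" (its process exists) but "Not Responding" (frozen). This enhanced watchdog restarts a missing process and also detects, force-kills, and restarts a frozen one. Note that tasklist generally reports a responsiveness status only for processes that own a window; windowless processes and services tend to show as Unknown, so the frozen check may never fire for them.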

@echo off
setlocal enabledelayedexpansion

set "ProcessName=MyService.exe"
set "AppPath=C:\Apps\MyService\MyService.exe"
set "LogFile=%~dp0watchdog_log.txt"
set "PollInterval=30"

title Watchdog: Monitoring %ProcessName% (with hang detection)

echo [%date% %time%] Aggressive watchdog started for %ProcessName%.
echo [%date% %time%] Watchdog started >> "%LogFile%"

:: Initial startup delay
timeout /t 15 /nobreak >nul

:HeavyWatch

:: Phase 1: Check if the process exists at all
tasklist /fi "imagename eq %ProcessName%" /nh 2>nul | findstr /i "%ProcessName%" >nul

if !errorlevel! neq 0 (
echo [!date! !time!] ALERT: %ProcessName% is not running.
echo [!date! !time!] Starting %ProcessName%...
start "" "%AppPath%"
echo [!date! !time!] Restarted %ProcessName% (was stopped^) >> "%LogFile%"
timeout /t %PollInterval% /nobreak >nul
goto :HeavyWatch
)

:: Phase 2: Check if the process is frozen (NOT RESPONDING)
tasklist /fi "imagename eq %ProcessName%" /fi "status eq NOT RESPONDING" /nh 2>nul | findstr /i "%ProcessName%" >nul

if !errorlevel! equ 0 (
echo [!date! !time!] WARNING: %ProcessName% is frozen. Force-killing...
taskkill /f /im "%ProcessName%" >nul 2>&1
echo [!date! !time!] Killed frozen %ProcessName% >> "%LogFile%"

rem Wait for the process to fully terminate
timeout /t 5 /nobreak >nul

echo [!date! !time!] Restarting %ProcessName%...
start "" "%AppPath%"
echo [!date! !time!] Restarted %ProcessName% (was frozen^) >> "%LogFile%"
timeout /t %PollInterval% /nobreak >nul
goto :HeavyWatch
)

:: All clear, wait and check again
timeout /t %PollInterval% /nobreak >nul
goto :HeavyWatch
Warning: Avoid CPU Overload. Do not set your polling interval too low. Checking every 1 second can consume unnecessary CPU cycles and might even interfere with the application's ability to start correctly. A delay of 10 to 60 seconds is recommended for most watchdogs.

Best Practices and Rules

1. Unique Process Names

If you are monitoring a generic process (like python.exe or java.exe), the tasklist check will succeed as long as any process with that image name exists, even if the script you care about has stopped and a different one is running.

Best Practice: For multi-script environments, check for the Window Title using tasklist /fi "windowtitle eq ..." or use a specific service wrapper if possible.
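
As a rough sketch of the window-title approach, the check below combines the image name filter with a windowtitle filter. It assumes the monitored script was launched in its own console with a known title (for example via start "InventorySync" python.exe ...); the title InventorySync is a hypothetical name chosen for illustration, and whether tasklist sees the title depends on how the process was started.

```batch
@echo off
setlocal

rem Hypothetical values: adjust to the process and title you actually use
set "ProcessName=python.exe"
set "WindowTitle=InventorySync"

rem Both filters must match: same image name AND same window title
tasklist /fi "imagename eq %ProcessName%" /fi "windowtitle eq %WindowTitle%" /nh 2>nul | findstr /i "%ProcessName%" >nul

if errorlevel 1 (
    echo %WindowTitle% is not running.
) else (
    echo %WindowTitle% is running.
)
```

If the title filter never matches in your environment, a service wrapper or a dedicated, uniquely named executable is likely the more reliable option.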

2. Logging

A good watchdog should keep a record of how many times it had to restart the app. If you see 10 restarts in one hour, the problem isn't the watchdog; the application itself needs debugging.
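
One way to get that number is to count the log lines the watchdog writes on each restart. The sketch below assumes the log format used in the scripts above, where every restart line contains the word "Restarted"; it reports the total across the whole log, not a per-hour figure.

```batch
@echo off
setlocal

set "LogFile=%~dp0watchdog_log.txt"
set "RestartCount=0"

rem find /c prints only the number of matching lines when reading from a pipe.
rem The match is case-sensitive, so "Watchdog started" lines are not counted.
if exist "%LogFile%" (
    for /f %%C in ('type "%LogFile%" ^| find /c "Restarted"') do set "RestartCount=%%C"
)

echo Total restarts recorded: %RestartCount%
```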

3. Handle Boot Timing

If you set the Watchdog to start automatically when Windows boots, add a startup delay (as shown in both methods). This gives the system time to load drivers and network connections before the Watchdog starts complaining that the app is missing.
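
If the Watchdog is launched by Task Scheduler, the delay can also be applied by the scheduler itself. The following is a minimal sketch, assuming the script is saved at the hypothetical path C:\Scripts\watchdog.bat and the command is run from an elevated prompt; the task name ProcessWatchdog is likewise an assumption.

```batch
rem Register the watchdog to start one minute after any user logs on.
rem /delay uses the mmmm:ss format and applies to onstart, onlogon, and onevent triggers.
schtasks /create /tn "ProcessWatchdog" /tr "C:\Scripts\watchdog.bat" /sc onlogon /delay 0001:00
```

The in-script timeout remains useful as a fallback when the script is started some other way, such as from the Startup folder.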

How to Avoid Common Errors

Wrong Way: Launching multiple instances

Failing to check whether the app is already running before starting it can launch a new instance on every polling cycle, leaving you with 10 copies of your application open at once.

Correct Way: Use the tasklist | findstr check as the absolute guard before every start command (as shown in both methods).

Problem: Watchdog window closing

If the user closes the Watchdog window, the protection stops.

Solution: Run the Watchdog as a Windows Service (using a tool like NSSM) to ensure the monitor itself is resilient and hidden.
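
A possible NSSM setup is sketched below. It assumes nssm.exe is on the PATH, the commands are run from an elevated prompt, and the script lives at the hypothetical path C:\Scripts\watchdog.bat; the service name ProcessWatchdog is also an assumption.

```batch
rem Register cmd.exe as the service program and pass the watchdog script to it
nssm install ProcessWatchdog "%ComSpec%" /c "C:\Scripts\watchdog.bat"

rem Run the script from its own folder so relative paths resolve as expected
nssm set ProcessWatchdog AppDirectory "C:\Scripts"

rem Start the service now; it will also start with Windows
nssm start ProcessWatchdog
```

Keep in mind that services run in a separate, non-interactive session, so an application the Watchdog starts from there will generally not show a window on the user's desktop. This setup suits background agents better than GUI applications, and the pause in Method 1's error path should be removed since nobody can press a key.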

Conclusions

Building a Watchdog script in Batch is one of the most cost-effective ways to increase the reliability of your system. By combining the tasklist filter with a controlled loop and a logging mechanism, you transform fragile manual tasks into self-healing background operations. Whether you are managing a shop-floor display or a remote server agent, a well-written watchdog ensures that your software stays online, even when things go wrong.