Watching Files and Directories in Python with watchdog
React to config changes, new log lines, and dropped files in real time. A practical guide to the watchdog library for event-driven Python automation.
- #python
- #bash
- #automation
A surprising number of automation problems are really “do something when a file changes.” Reload a service when its config is edited. Process a file the moment it lands in a drop directory. Tail a log and alert on a pattern. For years I solved these with a while true; sleep 5 loop that re-scanned a directory, which is wasteful and laggy. The right answer is to let the operating system tell you when something changed, and in Python that means the watchdog library.
watchdog wraps the platform’s native file-system event APIs (inotify on Linux, FSEvents on macOS) behind one clean interface. Here is how I use it for real event-driven automation, and where I keep an AI assistant on a short leash.
Why polling is the wrong default
A polling loop re-stats every file on an interval. With a few files it is fine; with thousands it burns CPU and still reacts up to one interval late. Native events are push-based: the kernel notifies you the instant a file is created, modified, or deleted, with essentially zero idle cost.
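For contrast, here is a minimal sketch of what the polling approach has to do on every tick, using only the standard library. The names `snapshot` and `diff` are mine, purely for illustration. Each pass stats every file in the directory whether or not anything changed, which is the cost native events avoid.

```python
import os

def snapshot(directory):
    """Map each regular file in `directory` to its (mtime_ns, size)."""
    state = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                info = entry.stat()
                state[entry.path] = (info.st_mtime_ns, info.st_size)
    return state

def diff(old, new):
    """Return sorted (created, modified, deleted) paths between two snapshots."""
    created = sorted(p for p in new if p not in old)
    deleted = sorted(p for p in old if p not in new)
    modified = sorted(p for p in new if p in old and new[p] != old[p])
    return created, modified, deleted
```

A polling loop would call `snapshot()` every few seconds and act on `diff(previous, current)`. The work grows with the number of files, and a change is noticed only on the next tick.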
pip install watchdog
That installs both the library and a handy watchmedo CLI, which I will come back to.
The core pattern: handler plus observer
watchdog splits the job in two. An event handler defines what to do when something happens; an observer watches a path and dispatches events to the handler.
import time

from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

class ConfigHandler(FileSystemEventHandler):
    def on_modified(self, event):
        if event.src_path.endswith("config.yaml"):
            print(f"Config changed: {event.src_path}")
            reload_service()  # your own reload function

observer = Observer()
observer.schedule(ConfigHandler(), path="/etc/myapp", recursive=False)
observer.start()
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()
The handler subclasses FileSystemEventHandler and overrides the events it cares about: on_created, on_modified, on_deleted, on_moved. The observer runs in a background thread, so the main thread just waits.
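To see which callback receives which event, a handler can be exercised without an observer by passing event objects to `dispatch()`, the same method the observer thread calls. This is a sketch: `AuditHandler` and the paths are hypothetical, and nothing touches the disk.

```python
from watchdog.events import (
    FileCreatedEvent,
    FileMovedEvent,
    FileSystemEventHandler,
)

class AuditHandler(FileSystemEventHandler):
    """Hypothetical handler that records one line per event it receives."""

    def __init__(self):
        super().__init__()
        self.log = []

    def on_created(self, event):
        self.log.append(f"created {event.src_path}")

    def on_modified(self, event):
        self.log.append(f"modified {event.src_path}")

    def on_deleted(self, event):
        self.log.append(f"deleted {event.src_path}")

    def on_moved(self, event):
        # Move events carry both the old and the new path.
        self.log.append(f"moved {event.src_path} -> {event.dest_path}")

handler = AuditHandler()
handler.dispatch(FileCreatedEvent("/tmp/inbox/a.tmp"))
handler.dispatch(FileMovedEvent("/tmp/inbox/a.tmp", "/tmp/inbox/a.csv"))
```

Driving `dispatch()` by hand like this is also a convenient way to unit-test handler logic before wiring it to a real directory.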
Processing files dropped into a directory
A classic pipeline: another system drops files into an inbox and you process each one. Watch for on_created and act:
from watchdog.events import FileSystemEventHandler

class InboxHandler(FileSystemEventHandler):
    def on_created(self, event):
        if event.is_directory:
            return
        process_drop(event.src_path)  # your own processing function
There is a subtle trap here: on_created may fire while the file is still being written. If the producer copies a large file, you can start reading a half-written file. Two common fixes:
- Have the producer write to a temp name and rename into place when done. The rename is atomic, so you watch for on_moved.
- Wait for the file size to stop changing before processing.
import os
import time

def wait_until_stable(path, interval=0.5):
    # Return once two consecutive size checks agree.
    last = -1
    while True:
        size = os.path.getsize(path)
        if size == last:
            return
        last = size
        time.sleep(interval)
Pro Tip: Prefer the atomic-rename approach when you control the producer. Size-polling is a heuristic and will occasionally misfire on slow or paused transfers. Watching for the rename is deterministic.
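A minimal sketch of the atomic-rename approach, covering both sides. The `.part` suffix, `publish_atomically`, and `RenamedIntoInboxHandler` are names I made up for illustration. It assumes the temp file is written inside the watched directory; as far as I know, a file renamed in from outside the watched tree tends to surface as a create rather than a move, so verify that on your platform.

```python
import os

from watchdog.events import FileSystemEventHandler

def publish_atomically(data: bytes, final_path: str) -> None:
    """Producer side: write under a temp name, then rename into place."""
    tmp_path = final_path + ".part"  # hypothetical temp suffix
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before the rename
    # os.replace is atomic when both paths are on the same filesystem.
    os.replace(tmp_path, final_path)

class RenamedIntoInboxHandler(FileSystemEventHandler):
    """Consumer side: act only when a finished file is renamed into place."""

    def __init__(self, process):
        super().__init__()
        self.process = process  # callable taking the final path

    def on_moved(self, event):
        if event.is_directory or event.dest_path.endswith(".part"):
            return
        self.process(event.dest_path)
```

Because the consumer never reacts to the `.part` file, it cannot start reading a half-written upload.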
Debouncing noisy events
Editors and some tools fire several on_modified events for a single save. If your handler does real work, you will run it multiple times. Debounce by ignoring events that arrive within a short window:
import time

from watchdog.events import FileSystemEventHandler

class DebouncedHandler(FileSystemEventHandler):
    def __init__(self, delay=1.0):
        super().__init__()
        self.delay = delay
        self._last = 0.0  # monotonic time of the last handled event

    def on_modified(self, event):
        now = time.monotonic()
        if now - self._last < self.delay:
            return  # still inside the quiet window; drop this event
        self._last = now
        self.handle(event)

    def handle(self, event):
        reload_service()  # your own reload function
This collapses a burst of events into one action. Tune delay to be longer than the burst but short enough to feel responsive.
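The handler above acts on the first event of a burst and drops the rest. If the action should instead run after the burst has settled (for example, once an editor has finished writing), a trailing-edge debounce is one alternative. This is a sketch built on `threading.Timer`; `TrailingDebouncer` is a hypothetical helper, not part of watchdog.

```python
import threading

class TrailingDebouncer:
    """Run `action` once, `delay` seconds after the most recent trigger()."""

    def __init__(self, action, delay=1.0):
        self.action = action
        self.delay = delay
        self._timer = None
        self._lock = threading.Lock()

    def trigger(self):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()  # restart the quiet-period countdown
            self._timer = threading.Timer(self.delay, self.action)
            self._timer.daemon = True  # do not keep the process alive
            self._timer.start()
```

In a handler, `on_modified` would call `debouncer.trigger()` instead of doing the work directly. The action then runs on the timer's thread, so it should be safe to call from there.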
Filtering events with patterns
Watching a busy directory floods your handler with events for files you do not care about. PatternMatchingEventHandler filters by glob before your code ever runs:
from watchdog.events import PatternMatchingEventHandler

class YamlHandler(PatternMatchingEventHandler):
    def __init__(self):
        super().__init__(
            patterns=["*.yaml", "*.yml"],
            ignore_patterns=["*.swp", "*~", "*.tmp"],
            ignore_directories=True,
        )

    def on_modified(self, event):
        reload_config(event.src_path)  # your own reload function
The ignore_patterns list is doing real work here: editors create .swp and ~ backup files constantly, and without ignoring them your handler fires on noise. Filtering at the handler level keeps your logic clean and your CPU idle.
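A quick way to check a pattern set before deploying it is to feed synthetic events through `dispatch()` and see which ones reach the callback. This is a sketch; `RecordingYamlHandler` and the paths are made up, and no files need to exist.

```python
from watchdog.events import FileModifiedEvent, PatternMatchingEventHandler

class RecordingYamlHandler(PatternMatchingEventHandler):
    """Hypothetical handler that records which paths pass the filters."""

    def __init__(self):
        super().__init__(
            patterns=["*.yaml", "*.yml"],
            ignore_patterns=["*.swp", "*~", "*.tmp"],
            ignore_directories=True,
        )
        self.seen = []

    def on_modified(self, event):
        self.seen.append(event.src_path)

handler = RecordingYamlHandler()
for path in (
    "/etc/myapp/config.yaml",       # matches *.yaml
    "/etc/myapp/.config.yaml.swp",  # editor swap file, filtered out
    "/etc/myapp/notes.txt",         # matches no include pattern
):
    handler.dispatch(FileModifiedEvent(path))
```

Only the real YAML file should end up in `handler.seen`; the swap file and the text file never reach `on_modified`.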
Recursive watches and their cost
Setting recursive=True on schedule() watches an entire subtree, which is convenient but not free. On Linux, inotify registers a watch per directory, and there is a per-user limit on the number of watches (fs.inotify.max_user_watches). Watch a tree with tens of thousands of directories and you can exhaust it, at which point the observer can no longer see events in the directories it failed to register.
observer.schedule(handler, path="/data", recursive=True)
Before deploying a deep recursive watch, I check the limit and the directory count:
cat /proc/sys/fs/inotify/max_user_watches   # 8192 on older kernels; newer ones may default higher
find /data -type d | wc -l                  # how many watches you'll need
If the tree is large, narrow the watched path or raise the sysctl deliberately rather than discovering the limit through missed events in production.
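The same pre-flight check can live in the watcher script itself. This is a sketch under assumptions: the helper names and the 50% `headroom` figure are mine, the `/proc` path exists only on Linux, and the count is approximate because other processes running as the same user share the limit.

```python
import os

def count_directories(root):
    """Count `root` plus every directory below it (roughly one watch each)."""
    return 1 + sum(len(dirnames) for _, dirnames, _ in os.walk(root))

def read_watch_limit(proc_path="/proc/sys/fs/inotify/max_user_watches"):
    """Return the inotify watch limit, or None where it cannot be read."""
    try:
        with open(proc_path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None

def watch_budget_ok(root, headroom=0.5):
    """True if the tree needs at most `headroom` of the limit (hypothetical policy)."""
    limit = read_watch_limit()
    if limit is None:
        return True  # not Linux, or limit unreadable: nothing to compare against
    return count_directories(root) <= limit * headroom
```

Calling `watch_budget_ok("/data")` before `observer.schedule(...)` turns a missed-events surprise into an explicit startup failure.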
The watchmedo CLI for quick jobs
For one-off “run this command when files change” needs, you do not even need to write Python. The watchmedo CLI that ships with watchdog does it:
watchmedo shell-command \
    --patterns="*.py" \
    --recursive \
    --command='echo "${watch_src_path} changed"; pytest' \
    ./src
I use this constantly for local dev loops — re-run tests on save, rebuild on change. It is the quickest way to get event-driven behavior without a script.
Letting AI scaffold the handler
The handler/observer scaffolding is boilerplate, and an AI assistant writes it well. I will describe the trigger and action to Claude or Cursor — “watch /etc/myapp for changes to config.yaml and call reload_service, debounced” — and get a working skeleton fast.
I treat that output as a fast junior engineer’s draft and review it before it runs unattended, because a file watcher reacting incorrectly can trigger destructive actions in a tight loop. I check:
- That the handler is debounced or otherwise guarded against event storms, so it cannot hammer a reload or processing step.
- That partial-write handling exists for any drop-directory processing.
- That event.src_path is validated before it is used in a shell command or a delete. Never pass it straight into subprocess with shell=True.
- That no real secrets were pasted into the prompt. A file watcher needs none of your credentials.
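The path-validation item on that checklist can be sketched as follows. The helper names are hypothetical and `command` stands in for whatever tool the watcher runs. The sketch assumes that tool accepts `--` as an end-of-options marker, which many command-line tools do but not all.

```python
import os
import subprocess

def is_inside(root, candidate):
    """True if `candidate` resolves to a location at or under `root`."""
    root_real = os.path.realpath(root)
    cand_real = os.path.realpath(candidate)
    return os.path.commonpath([root_real, cand_real]) == root_real

def run_on_event_path(src_path, watched_root, command=("ls", "-l")):
    """Run `command` on an event path without ever involving a shell."""
    if not is_inside(watched_root, src_path):
        raise ValueError(f"refusing path outside watched root: {src_path!r}")
    # Argument list, no shell=True: the path is never parsed as shell syntax.
    # "--" keeps a file name that starts with "-" from being read as an option.
    return subprocess.run([*command, "--", src_path], check=True)
```

Resolving both paths first means symlinks and `..` segments cannot smuggle the action outside the directory being watched.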
I keep that checklist in the prompt workspace and route watchers that take destructive actions through the code review dashboard first.
Conclusion
watchdog replaces wasteful polling loops with push-based file-system events: subclass FileSystemEventHandler, schedule it on an Observer, and react to creates, modifies, moves, and deletes. Handle partial writes with atomic renames, debounce noisy editor events, and reach for the watchmedo CLI for quick jobs. Let an AI scaffold the handler, then review it closely — a watcher that fires the wrong action in a loop does damage fast.