Troubleshooting NFS and Samba Shares on Linux with an AI Copilot
Stale handles, permission mismatches, and hung mounts make file shares miserable. Here's a diagnostic workflow for NFS and Samba with AI decoding the errors.
File shares are where Linux, Windows, permissions, and the network all conspire to ruin your morning. A “Stale file handle” error, a Samba share that mounts but shows everything as owned by nobody, an NFS mount that hangs and takes df down with it — these problems are notoriously hard to diagnose because the failure is spread across the client, the server, the network, and two different permission models. This is fertile ground for an AI copilot acting as a fast junior engineer: it’s read thousands of NFS and Samba errors and can translate a cryptic message into a concrete next check, while I run the commands and own the conclusions. Here’s the diagnostic flow I follow.
Confirming the export exists and is reachable
Before touching the client, verify the server is actually offering what you think.
# On an NFS client, list what the server exports
showmount -e nfs-server
# Check the RPC services are up
rpcinfo -p nfs-server
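If showmount times out, suspect the network or a firewall before the share itself: showmount talks to rpcbind (port 111) and mountd, so NFSv3 needs more than just port 2049. An NFSv4-only server may not answer showmount at all, even though mounts over 2049 work. Paste the rpcinfo output into an AI and ask which services should be present; it knows the difference between mountd, nfs, and rpcbind (listed as portmapper in rpcinfo output) and will spot a missing one.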
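The same check can be scripted as a first pass before asking the AI. This is a minimal sketch, assuming an NFSv3-style setup where portmapper, mountd, and nfs should all be registered; the function name missing_rpc_services is made up for illustration.

```shell
# Report which services an NFSv3 client normally needs are missing
# from `rpcinfo -p` output read on stdin.
# Note: rpcinfo lists rpcbind under its historical name "portmapper".
missing_rpc_services() {
    input=$(cat)
    missing=""
    for svc in portmapper mountd nfs; do
        # The service name is the last whitespace-separated column.
        if ! printf '%s\n' "$input" |
            awk -v s="$svc" '$NF == s { found = 1 } END { exit !found }'; then
            missing="$missing $svc"
        fi
    done
    # Strip the leading space; prints an empty line if nothing is missing.
    printf '%s\n' "${missing# }"
}
```

Run it as rpcinfo -p nfs-server | missing_rpc_services. An empty result only says the services are registered, not that a firewall lets you reach them.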
Diagnosing the dreaded stale file handle
“Stale NFS file handle” (ESTALE; newer kernels print just “Stale file handle”) means the handle the client is holding no longer refers to anything on the server, usually because the export was recreated or the underlying filesystem changed.
ls /mnt/share # often hangs or errors here
sudo umount -f -l /mnt/share
sudo mount -a
The two flags do different jobs: -f forces the unmount, and -l makes it lazy, detaching the mount from the directory tree immediately and cleaning up once nothing is using it. Together they are the escape hatch when a normal umount hangs. When you’re unsure why it went stale, describe the sequence to the AI (“the share worked, the storage team rebuilt the export, now I get stale handle”) and it’ll explain that the file handle includes a filesystem identifier that changed, and that a remount is the fix. That kind of mechanism-level explanation is what keeps you from cargo-culting.
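Because a plain ls on a sick mount can hang your shell, it can help to probe with a time limit first. The sketch below assumes GNU coreutils (timeout and stat); the function name probe_mount and the 5-second default are illustrative choices, and a process stuck deep in the kernel may not always be interruptible.

```shell
# Probe a mount point without risking a hung shell.
# Prints one of: ok, stale, hung, error.
probe_mount() {
    mountpoint_path=$1
    limit=${2:-5}   # seconds to wait before giving up (assumed default)
    # Capture stderr only; LC_ALL=C keeps the error text in English.
    err=$(LC_ALL=C timeout "$limit" stat -- "$mountpoint_path" 2>&1 >/dev/null)
    status=$?
    if [ "$status" -eq 0 ]; then
        echo ok
    elif [ "$status" -eq 124 ]; then
        echo hung       # timeout(1) exits 124 when the limit is reached
    elif printf '%s' "$err" | grep -qi 'stale'; then
        echo stale
    else
        echo error
    fi
}
```

For example, probe_mount /mnt/share 3 reports stale or hung without touching the share with an unbounded command.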
Pro Tip: An NFS mount that hangs can wedge any command that touches it, including df and tab-completion in that directory. Mount with the soft and timeo= options on non-critical shares so a dead server returns an error instead of freezing your shell forever.
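One detail that is easy to get wrong is that timeo is counted in tenths of a second, not seconds. Here is a small sketch that builds the option string from a timeout in seconds; the helper name nfs_soft_opts and its defaults are assumptions for illustration, and retrans (the retry count) is added alongside the two options named above.

```shell
# Build a soft-mount option string from a per-attempt timeout in seconds.
# timeo is expressed in deciseconds, so 5 seconds becomes timeo=50.
nfs_soft_opts() {
    secs=${1:-5}      # per-attempt timeout in seconds (assumed default)
    retries=${2:-3}   # retransmissions before the soft failure (assumed default)
    printf 'soft,timeo=%s,retrans=%s\n' "$(( secs * 10 ))" "$retries"
}
```

It could be used as sudo mount -t nfs -o "$(nfs_soft_opts 5)" nfs-server:/export/share /mnt/share, where /export/share is a placeholder path. Keep soft away from shares that take important writes, since a timed-out write surfaces as an I/O error to the application.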
Untangling permission mismatches
This is the single most common file-share complaint: “I can see the files but I can’t write them.” For NFS, the cause is usually UID/GID mismatch between client and server, or root squashing.
id myuser # check UID/GID on the client
ls -ln /mnt/share # see numeric owners
ls -ln shows numeric UIDs instead of names, which immediately reveals a mismatch — files owned by 1005 that your user (UID 1002) can’t touch. Paste both into the AI and ask it to explain the mapping problem and whether root_squash, an idmapd misconfig, or a plain UID drift is the likely cause. It reasons through the permission model faster than I untangle it by hand. I keep these permission-diagnosis prompts in my prompt workspace so the team triages the same way.
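To make the mismatch jump out before pasting anything, the numeric listing can be filtered against the user's UID. A rough sketch, assuming standard ls -ln column layout and file names without spaces; foreign_owned is a hypothetical helper name.

```shell
# Read `ls -ln` output on stdin and print "UID name" for every entry
# whose numeric owner differs from the UID given as the first argument.
foreign_owned() {
    want_uid=$1
    awk -v uid="$want_uid" '
        $1 == "total" { next }                  # skip the "total N" header
        NF >= 9 && $3 != uid { print $3, $NF }  # column 3 is the owner UID
    '
}
```

Typical use would be ls -ln /mnt/share | foreign_owned "$(id -u myuser)". If every entry comes back with the same unexpected UID, that pattern is worth mentioning in the prompt, because it hints at squashing or ID mapping rather than a handful of stray files.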
Debugging Samba authentication and mapping
Samba adds a second auth system on top of POSIX permissions, so failures are doubly confusing.
smbclient -L //smb-server -U myuser
sudo mount -t cifs //smb-server/share /mnt/smb \
-o username=myuser,uid=1002,gid=1002,vers=3.0
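The vers= option trips people up constantly: a server with old dialects disabled will reject a client that only offers one of them. When the mount fails with a terse mount error(13): Permission denied, hand it to the AI; it knows error 13 (EACCES) usually means authentication or share permissions, error 112 (Host is down) is what a dialect mismatch often looks like even though the server is up, and error 115 (Operation now in progress) typically means the connection never completed, for example a blocked port 445. It’ll tell you which -o option to adjust.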
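The numbers in mount error(N) are ordinary Linux errno values, so a small lookup covers the common cases. The errno names below are standard; the hints are rules of thumb rather than guarantees, and cifs_mount_hint is a made-up helper name.

```shell
# Translate the number in "mount error(N)" into an errno name and a
# rule-of-thumb next step. Hints are heuristics, not certainties.
cifs_mount_hint() {
    case "$1" in
        2)   echo "ENOENT: share or path not found; recheck the //server/share name" ;;
        13)  echo "EACCES: credentials or share permissions; check username, domain, password" ;;
        95)  echo "EOPNOTSUPP: option or dialect not supported; try a different vers=" ;;
        112) echo "EHOSTDOWN: often a dialect mismatch despite the name; try a newer vers=" ;;
        115) echo "EINPROGRESS: connection never completed; check routing and port 445" ;;
        *)   echo "unknown: look the number up in errno(3) and read the kernel log" ;;
    esac
}
```

After a failed mount, the kernel log (dmesg) usually carries a more specific CIFS message than the one-line mount error, so it is worth pasting both.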
Reading the server logs
The real answers usually live in the server’s logs, and they’re verbose.
# NFS server
journalctl -u nfs-server -n 100
# Samba server
sudo tail -100 /var/log/samba/log.smbd
These logs are dense and full of false-positive noise, which makes them ideal to summarize with AI. Paste a chunk and ask: “Summarize the actual errors here for the client at 10.0.4.12 and ignore routine reconnects.” The model filters the signal. Redact internal hostnames and IPs first — a share log is a map of your internal network, and that’s not something to hand a third-party tool unscrubbed.
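Scrubbing by hand is error-prone, so a filter helps. This is a minimal sketch, assuming IPv4 addresses and a single internal DNS suffix; redact_log and the default suffix corp.example.com are placeholders, and it will not catch short hostnames, IPv6, or usernames.

```shell
# Mask IPv4 addresses and hostnames under one internal DNS suffix.
# Reads log text on stdin; $1 is the suffix to mask (placeholder default).
redact_log() {
    domain=${1:-corp.example.com}
    # Escape dots so the suffix matches literally inside the regex.
    escaped=$(printf '%s' "$domain" | sed 's/\./\\./g')
    sed -E \
        -e 's/([0-9]{1,3}\.){3}[0-9]{1,3}/<IP>/g' \
        -e "s/[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*\.${escaped}/<HOST>/g"
}
```

Because every address becomes the same placeholder, filter the log down to the client you care about before redacting, for instance with grep -F on its address piped into redact_log, and then read the result once yourself before pasting it anywhere.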
Verifying the fix end to end
Once you’ve applied a change, prove it with a real read and write, not just a mount that succeeds.
touch /mnt/share/.write-test && rm /mnt/share/.write-test && echo OK
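A clean OK means the permission model actually allows what the user needs, provided you run it as the affected user rather than root: with root squashing in play, root’s result tells you nothing about theirs. The AI suggested the hypothesis; this command confirms it on the real system.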
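The one-liner proves create and delete; a slightly longer version also reads the data back. A sketch under the assumption that a small hidden probe file in the share is acceptable; share_rw_test is a hypothetical name.

```shell
# Write a token to a probe file, read it back, compare, and clean up.
# Prints OK or FAIL; returns non-zero on failure.
share_rw_test() {
    dir=$1
    probe="$dir/.write-test.$$"   # $$ keeps concurrent runs from colliding
    token="rw-check-$$"
    if { printf '%s\n' "$token" > "$probe"; } 2>/dev/null \
        && [ "$(cat "$probe" 2>/dev/null)" = "$token" ]; then
        rm -f "$probe"
        echo OK
    else
        rm -f "$probe" 2>/dev/null
        echo FAIL
        return 1
    fi
}
```

Run share_rw_test /mnt/share from the affected user's own session so the result reflects their identity and not yours.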
Diagnosing slow shares versus broken ones
“The share is slow” is a different problem from “the share is down,” and people conflate them. For NFS performance, nfsstat and mountstats expose where the time actually goes.
nfsstat -c # client-side RPC call counts
mountstats /mnt/share | head -40 # per-operation latency
mountstats breaks down average latency per operation type — READ, WRITE, GETATTR — which tells you whether you’re bottlenecked on data transfer or on metadata chatter. Paste that into the AI and ask it to interpret: a high GETATTR count with low READ usually means a workload doing lots of small stat calls, which points at an application pattern rather than a network problem. That distinction changes who owns the fix — you or the app team.
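For a quick number to put in the prompt, the raw counters behind mountstats can be reduced to a single ratio. This sketch reads the kernel's /proc/self/mountstats format, assuming per-operation lines of the form "GETATTR: ops transmissions ..." with the operation count in the second field. It sums across every NFS mount in the input, so it is most meaningful on a client with one NFS mount; getattr_share is an illustrative name.

```shell
# Print GETATTR as an integer percentage of all NFS operations found in
# /proc/self/mountstats-style input on stdin.
getattr_share() {
    awk '
        # Per-op lines start with an upper-case name and a colon;
        # field 2 is the number of operations issued.
        $1 ~ /^[A-Z0-9_]+:$/ {
            total += $2
            if ($1 == "GETATTR:") getattr += $2
        }
        END {
            if (total > 0) printf "%d\n", 100 * getattr / total
            else print "no per-op data"
        }
    '
}
```

Invoke it as getattr_share < /proc/self/mountstats. A high percentage supports the metadata-chatter hypothesis, but it is a count, not a latency figure, so pair it with the per-operation timings from mountstats.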
Pro Tip: Before blaming the network for a slow NFS share, check whether the workload is metadata-heavy. Thousands of tiny stat and open calls feel like “slow storage” but are really a chatty access pattern, and no amount of network tuning fixes them — caching options like actimeo do.
Testing throughput cleanly
When you do suspect raw transfer speed, measure it with a controlled write rather than guessing.
dd if=/dev/zero of=/mnt/share/speedtest bs=1M count=1024 conv=fdatasync
rm /mnt/share/speedtest
conv=fdatasync forces the data to actually hit the server before dd reports, so you get a real number instead of a buffered lie. Hand the result and the mount options to the AI and ask whether the throughput is reasonable for the protocol version and rsize/wsize in use — it knows the rough expectations and will flag when undersized rsize/wsize values are throttling you.
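When handing over the mount options, it helps to pull out just the fields that bear on throughput. A small sketch, assuming a comma-separated options string such as the fourth column of /proc/mounts; nfs_xfer_sizes is a hypothetical helper.

```shell
# Extract the protocol version and transfer sizes from a mount options
# string, one key=value per line, in the order they appear.
nfs_xfer_sizes() {
    printf '%s\n' "$1" | tr ',' '\n' | grep -E '^(vers|rsize|wsize)='
}
```

For a live mount, nfs_xfer_sizes "$(findmnt -no OPTIONS /mnt/share)" prints the values the client actually negotiated, which can differ from what was requested in fstab.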
Keeping it safe
The discipline here is the same as everywhere: the AI decodes errors and proposes checks, but it works from logs and command output you’ve scrubbed, it never runs mount or unmount commands against your servers, and it never gets credentials for the file servers or the domain. It’s a fast junior engineer fluent in NFS and Samba’s error dialects — the human runs the commands and decides the fix. Force-unmounting a busy share or changing export options can disrupt every client connected, so a human verifies on a single client before touching the export. If you’re triaging a share outage under pressure, the incident-response dashboard keeps the timeline and decisions in one place.
Conclusion
NFS and Samba are hard not because any one piece is complex but because the failure is smeared across client, server, network, and two permission models — and that’s exactly the kind of cross-layer error an AI copilot is good at decoding. Confirm reachability, diagnose stale handles and UID mismatches, decode the mount errors, and verify with a real write. The model translates; the human runs and decides. More in the Linux admin category, and the Linux admin prompt pack includes the share-triage prompts I lean on.