Hardening SSH Access to Production Servers: A Practical Checklist

SSH is the front door to nearly everything I run. After 25 years of securing production fleets, I’ve learned that most “sophisticated” breaches start with a boring SSH mistake: a password-auth box exposed to the internet, a stale key that should have been revoked two jobs ago, or an sshd_config nobody has read since the server was provisioned.

This is the checklist I actually apply, plus how I use AI to audit a config before I reload the daemon — because the one mistake that ends careers is locking yourself out of every box at once.

Rule zero: never reload sshd without a safety net

Before you touch sshd_config, open a second SSH session and keep it open. Validate the config syntax, reload in the first session, and confirm you can open a third session before you close anything. If the new session fails, you back out from the still-open second one.

sudo sshd -t                 # validate the config, exits non-zero on error
sudo systemctl reload ssh    # unit is "ssh" on Debian/Ubuntu, "sshd" on RHEL/Fedora

This single habit has saved me from countless 2am bastion-rebuild adventures.
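The same habit can be partly scripted. What follows is a minimal sketch, not a replacement for the open fallback session: it assumes you edit a candidate copy rather than the live file, and the names apply_sshd_config, SSHD_CONFIG, VALIDATE_CMD and RELOAD_CMD are hypothetical knobs I made up for illustration.

```shell
# Validate a candidate config, install it, reload, and restore the old file
# if the reload fails. All variable names here are illustrative assumptions.
apply_sshd_config() {
  local candidate="$1"
  local live="${SSHD_CONFIG:-/etc/ssh/sshd_config}"
  local validate="${VALIDATE_CMD:-sshd -t -f}"          # sshd -t -f FILE tests FILE
  local reload="${RELOAD_CMD:-systemctl reload ssh}"    # "sshd" on RHEL-family hosts

  # Reject a bad candidate before it ever replaces the live file.
  if ! $validate "$candidate"; then
    echo "candidate failed validation; live config untouched" >&2
    return 1
  fi

  cp -p "$live" "$live.bak" || return 1
  cp "$candidate" "$live" || return 1

  if ! $reload; then
    echo "reload failed; restoring previous config" >&2
    cp -p "$live.bak" "$live"
    return 1
  fi
  echo "reloaded; open a NEW session before closing this one"
}
```

Run it as root with the path to your edited copy, for example apply_sshd_config /root/sshd_config.new. It only catches errors the daemon itself reports; a config that is valid but locks you out still needs the second session.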

Disable password authentication entirely

Passwords are guessable, reusable, and phishable. A private key can’t be brute-forced over the network or typed into a fake login page. The single highest-value change you can make is going key-only.

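# /etc/ssh/sshd_config
PasswordAuthentication no
# Deprecated alias of KbdInteractiveAuthentication since OpenSSH 8.7; harmless to keep for older hosts
ChallengeResponseAuthentication no
KbdInteractiveAuthentication no
PubkeyAuthentication yes
PermitEmptyPasswords no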

Before flipping this, confirm every account that needs access already has a working key in ~/.ssh/authorized_keys. I’ve watched teams lock out a service account because nobody checked.
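That check is easy to script as a first pass. This sketch assumes local accounts in a passwd-format file and the default ~/.ssh/authorized_keys location; it won't see keys served by AuthorizedKeysCommand or a non-default AuthorizedKeysFile, and a non-empty file doesn't prove the key works. The function name accounts_without_keys is my own.

```shell
# Print login-capable accounts that have no non-empty authorized_keys file.
# Reads a passwd-format file (default /etc/passwd).
accounts_without_keys() {
  local passwd_file="${1:-/etc/passwd}"
  while IFS=: read -r user _ _ _ _ home shell; do
    case "$shell" in
      */nologin|*/false) continue ;;   # account cannot log in interactively
    esac
    [ -s "$home/.ssh/authorized_keys" ] || printf '%s\n' "$user"
  done < "$passwd_file"
}
```

Run it with sudo so it can read every home directory. Anything it prints is either an account that should not have SSH access at all or one that will be locked out when passwords go away.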

Never let root log in directly

Root login over SSH means a single compromised credential is game over, with no audit trail of who did it.

PermitRootLogin no

Engineers log in as themselves, then escalate with sudo. Now your audit log tells you which human ran the destructive command.
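To confirm the daemon actually sees these values, I compare against sshd -T, which prints the effective configuration with lowercase keywords. A minimal sketch follows; the function name check_effective and the expected values baked into it are assumptions drawn from the settings in this checklist.

```shell
# Read `sshd -T` output on stdin and report any setting that differs from
# the value this checklist expects. Exits non-zero if anything mismatches.
check_effective() {
  awk '
    BEGIN {
      want["passwordauthentication"]       = "no"
      want["kbdinteractiveauthentication"] = "no"
      want["permitemptypasswords"]         = "no"
      want["pubkeyauthentication"]         = "yes"
      want["permitrootlogin"]              = "no"
    }
    ($1 in want) && ($2 != want[$1]) {
      printf "MISMATCH %s: got %s, want %s\n", $1, $2, want[$1]
      bad = 1
    }
    END { exit bad + 0 }
  '
}
```

Typical use is sudo sshd -T | check_effective. To evaluate Match blocks for a specific connection, pass a connection spec such as sudo sshd -T -C user=deploy,host=example,addr=10.0.0.5 | check_effective, where the user, host and address are placeholders.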

Tighten the cryptographic primitives

Default cipher suites carry legacy algorithms for backwards compatibility you almost certainly don’t need. Restrict to modern, strong choices:

KexAlgorithms curve25519-sha256,curve25519-sha256@libssh.org
Ciphers chacha20-poly1305@openssh.com,aes256-gcm@openssh.com
MACs hmac-sha2-512-etm@openssh.com,hmac-sha2-256-etm@openssh.com
HostKeyAlgorithms ssh-ed25519,rsa-sha2-512

Pair this with Ed25519 user keys (ssh-keygen -t ed25519). They’re short, fast, and resistant to the implementation footguns that plague RSA.
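Before restricting algorithms, it helps to check that the clients you rely on support them. The sketch below compares a comma-separated list against whatever ssh -Q reports; the helper name unsupported_algos is hypothetical, and it only tells you about the OpenSSH build it runs on, not other client software.

```shell
# Print every entry of a comma-separated algorithm list that is missing
# from the supported list supplied on stdin (one algorithm per line).
unsupported_algos() {
  local wanted="$1" supported
  supported="$(cat)"
  printf '%s\n' "$wanted" | tr ',' '\n' | while read -r algo; do
    printf '%s\n' "$supported" | grep -qxF -- "$algo" || printf '%s\n' "$algo"
  done
}
```

For example, ssh -Q kex | unsupported_algos "curve25519-sha256,curve25519-sha256@libssh.org" prints nothing when the local client supports both; ssh -Q cipher and ssh -Q mac work the same way for the other two lists.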

Limit who can connect, from where

Defense in depth. Constrain access at the daemon level and the network level:

AllowGroups sshusers
MaxAuthTries 3
MaxSessions 4
LoginGraceTime 20
ClientAliveInterval 300
ClientAliveCountMax 2
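AllowGroups is a common way to lock yourself out: if your own account is not in the group when the daemon reloads, your next login fails. A small pre-flight check, assuming the group name sshusers from the config above (the function name in_ssh_group is mine):

```shell
# Succeed if USER belongs to GROUP. Defaults: current user, group "sshusers".
in_ssh_group() {
  local user="${1:-$(id -un)}" group="${2:-sshusers}"
  id -nG "$user" | tr ' ' '\n' | grep -qxF -- "$group"
}
```

I run something like in_ssh_group "$USER" || echo "add yourself to sshusers first" before reloading, and repeat it for each service account that needs access.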

Then put production SSH behind a bastion host so individual servers aren’t internet-facing at all. Workloads accept SSH only from the bastion’s security group, and the bastion is the single audited choke point.

# ~/.ssh/config on your laptop
Host prod-*
  ProxyJump bastion.example.com
  User jjoyner
  IdentitiesOnly yes

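IdentitiesOnly yes matters more than people realize: it makes ssh offer only the identities configured for that host (IdentityFile entries or the default key files) instead of every key your agent holds. Offering everything tells each server which keys you own, and because every offered key counts as an authentication attempt, an agent with more than three keys can exhaust MaxAuthTries 3 before it reaches the right one.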

Rotate and inventory keys

Static keys that live forever are a liability. Two practices fix most of the risk:

Inventory every authorized key. You can’t revoke access you don’t know exists. Periodically collect every authorized_keys file across the fleet and reconcile against your current roster.
Move to short-lived certificates where you can. An SSH CA issues certificates that expire in hours, so a leaked credential is worthless by morning and there’s nothing to manually revoke.
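As a rough illustration of the certificate approach, ssh-keygen can sign a user's public key with a CA key and a validity window. This is a sketch under assumptions: the function name, the key ID format and the eight-hour default are mine, and a real deployment would keep the CA key somewhere far safer than a file on a shared host.

```shell
# Sign a user public key with an SSH CA so the certificate expires on its own.
# Args: CA private key, user public key, principal (login name), validity.
issue_short_lived_cert() {
  local ca_key="$1" user_pub="$2" principal="$3" validity="${4:-+8h}"
  # -s CA key, -I key identity (appears in server logs), -n allowed principals,
  # -V validity interval. Writes <name>-cert.pub next to the public key.
  ssh-keygen -q -s "$ca_key" \
    -I "${principal}-$(date +%Y%m%d%H%M%S)" \
    -n "$principal" \
    -V "$validity" \
    "$user_pub"
}
```

On the server side, sshd trusts these certificates once TrustedUserCAKeys in sshd_config points at the CA's public key. Inspect an issued certificate with ssh-keygen -L -f to confirm the principal and expiry.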

If you’re not ready for a CA, at minimum automate key rotation the same way you’d handle any other secret rotation — on a schedule, not “when someone leaves and we remember.”
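The inventory step can start as something this small. The sketch assumes you have already collected the authorized_keys files into one place and keep a roster file with one approved key blob per line; list_keys, unknown_keys and the roster format are all hypothetical, and the parser is deliberately naive about unusual option strings.

```shell
# Print "file<TAB>type<TAB>blob<TAB>comment" for each key in the given
# authorized_keys files, skipping any leading options such as from="...".
list_keys() {
  awk '
    /^[[:space:]]*(#|$)/ { next }              # skip comments and blank lines
    {
      for (i = 1; i <= NF; i++)
        if ($i ~ /^(ssh-|ecdsa-sha2-|sk-)/ && (i + 1) <= NF) {
          comment = (i + 2 <= NF) ? $(i + 2) : "-"
          printf "%s\t%s\t%s\t%s\n", FILENAME, $i, $(i + 1), comment
          break
        }
    }
  ' "$@"
}

# Print only the keys whose blob does not appear in the roster file.
unknown_keys() {
  local roster="$1"; shift
  list_keys "$@" | awk -F '\t' -v roster="$roster" '
    BEGIN { while ((getline line < roster) > 0) ok[line] = 1 }
    !($3 in ok)
  '
}
```

Calling unknown_keys roster.txt collected/*/authorized_keys then lists every key nobody has vouched for, which is the set to investigate or revoke.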

Using AI to audit the config safely

This is where AI earns its keep — as a reviewer, never as the thing that runs systemctl reload. I paste the full sshd_config and prompt:

“Here is a production sshd_config. Identify every setting that weakens security against current best practice, explain the risk of each, and give the corrected line. Do not suggest any change that could lock out existing key-based sessions, and flag any change that requires verifying authorized_keys first.”

The model is genuinely good at catching the line you skimmed past — a lingering PasswordAuthentication yes inside a Match block, a weak MAC you forgot to remove, an overly broad AllowUsers. It reads the whole file more carefully than a tired human at the end of a maintenance window.

What it must not do is touch the box. AI reads and reasons; you validate with sshd -t and reload while a second session stays open as your way back in. For a structured second opinion on config diffs, our Code Review tool applies the same read-only, risk-classified approach.
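I also like a deterministic cross-check next to the model's review, so neither is the only line of defense. This sketch flags a few risky directives and reports which Match block they sit in. It assumes whitespace-separated keyword/value pairs and does not follow Include directives; scan_risky is a made-up name.

```shell
# Flag risky directives in an sshd_config and show the block they belong to.
scan_risky() {
  awk '
    BEGIN { ctx = "global" }
    /^[[:space:]]*(#|$)/ { next }                # skip comments and blank lines
    {
      key = tolower($1)                          # keywords are case-insensitive
      val = tolower($2)
      if (key == "match") {                      # remember the current Match block
        ctx = $0
        sub(/^[[:space:]]+/, "", ctx)
        next
      }
      risky = 0
      if ((key == "passwordauthentication" || key == "kbdinteractiveauthentication" || key == "permitemptypasswords") && val == "yes") risky = 1
      if (key == "permitrootlogin" && val != "no") risky = 1
      if (risky) printf "line %d [%s]: %s %s\n", FNR, ctx, $1, $2
    }
  ' "$@"
}
```

Running scan_risky /etc/ssh/sshd_config on a clean file prints nothing. If it and the AI disagree, that disagreement is exactly the line worth reading slowly.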

Watch the logs after every change

Hardening isn’t done when the config reloads — it’s done when you’ve confirmed nothing legitimate broke and you can see attacks bouncing off. Tail the auth log:

sudo journalctl -u ssh -f    # use -u sshd on RHEL/Fedora
# look for: "Failed password" (should drop to zero once key-only),
# "Invalid user", "Connection closed by authenticating user"

Feed a sample to AI and ask it to distinguish background internet noise from a targeted attempt against a real username. The two need very different responses — one is the weather, the other is someone who knows your account names.
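A quick pre-filter makes that triage easier: count only the attempts aimed at usernames that really exist. The sketch below assumes the common OpenSSH log phrasings quoted above and a hypothetical roster file with one real account name per line; targeted_attempts is my own name for it.

```shell
# Count auth attempts against real account names.
# Reads log lines on stdin; $1 is a file with one real username per line.
targeted_attempts() {
  awk -v roster="$1" '
    BEGIN { while ((getline name < roster) > 0) if (name != "") real[name] = 1 }
    {
      user = ""
      for (i = 1; i < NF; i++) {
        # "Connection closed by authenticating user NAME ..."
        if ($i == "authenticating" && $(i + 1) == "user") { user = $(i + 2); break }
        # "Failed password for NAME ..." (the "invalid user" variant is skipped)
        if ($i == "Failed" && $(i + 1) == "password" && $(i + 2) == "for" && $(i + 3) != "invalid") { user = $(i + 3); break }
      }
      if (user != "" && (user in real)) hits[user]++
    }
    END { for (u in hits) print hits[u], u }
  ' | sort -rn
}
```

Something like sudo journalctl -u ssh --since today | targeted_attempts roster.txt gives a per-account count. Anything above zero for a real name is worth a closer look than the usual stream of guesses at admin and test.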

The short version

Go key-only, kill root login, modernize the ciphers, gate access behind a bastion, inventory and rotate keys, and never reload without an escape session open. Use AI to audit the config and triage the logs — but keep a human on the keyboard for anything that touches a running daemon. SSH is the one service where a confident-but-wrong automated change can lock you out of everything at once.

AI-generated config audits are assistive, not authoritative. Always validate with sshd -t and verify access from a second session before closing your existing connection.
