Article № 023 · Workflow

Legacy site manifests: a 20-minute audit blueprint

You inherit a 9-year-old WordPress site at 4pm Friday. By 4:20 you should have a manifest that tells the next person, or you in six months, what's actually running.

A Dutch agency we work with handed over a WordPress site last month: nine years old, around 140 posts, three custom plugins nobody remembered writing, a Magento 1 store stapled onto a subdomain, and a PHP version the hosting panel quietly listed as 7.2. The brief was "small content update." Two hours in, we were still reverse-engineering what the site actually was.

This is the part of legacy site work nobody writes down. The site has been running for a decade. Half the institutional knowledge left with the developer who built it. The other half lives in a Slack channel nobody can find anymore. Before you touch a single line of code, you need a manifest.

What a manifest actually is

A manifest is one plain-text file, checked into the site repo or stored alongside your project notes, that captures everything a successor would ask you in their first hour. Not the code: the context around the code. Versions, paths, cron jobs, hardcoded URLs, the email address the contact form posts to, the cert renewal cadence, the staging URL nobody else has access to.

Call it MANIFEST.md, SITE.md, README.ops, whatever the team will actually read. The name matters less than the discipline of writing one before you do anything else.

The 20-minute capture, in order

You can write a usable manifest in 20 minutes if you stop trying to make it pretty. Open a markdown file. Set a timer. Go through the following six steps in order, with no detours.

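1. Runtime versions. SSH or FTP in, then check for an existing phpinfo file or upload one, and delete it when you're done, because it exposes server details to anyone who finds the URL. With shell access, the command line is faster: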

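php -v
mysql --version                     # client version; SELECT VERSION(); gives the server's
nginx -v 2>&1 || apache2 -v 2>&1    # apache2 is called httpd on RHEL-family hosts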

If you only have FTP, upload a one-line PHP probe, request it in a browser, note the response, and delete the file afterwards:

<?php echo phpversion(), "\n", PHP_OS, "\n"; ?>

Record exact versions, not ranges. "PHP 7.4.33" not "PHP 7.x." The patch number matters when CVE references come up later. The official php.net supported versions table tells you which line is still receiving security fixes.
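If you'd rather have the versions land in the manifest already formatted, a small wrapper can do it. This is a minimal sketch, not a fixed recipe: the function names first_line and runtime_section are ours, and it assumes a POSIX shell with the same binaries as above on the PATH. Anything missing is reported as "not found" instead of aborting the run.

```shell
#!/bin/sh
# first_line CMD [ARGS...]: print the first line of the command's combined
# stdout/stderr, or "not found" when the binary is not on PATH.
first_line() {
  if command -v "$1" >/dev/null 2>&1; then
    "$@" 2>&1 | head -n 1
  else
    echo "not found"
  fi
}

# runtime_section: emit a "## Runtime" block, one bullet per component.
runtime_section() {
  echo "## Runtime"
  echo "- PHP: $(first_line php -v)"
  echo "- MySQL client: $(first_line mysql --version)"
  echo "- nginx: $(first_line nginx -v)"
  echo "- Apache: $(first_line apache2 -v)"
}

# Usage (hypothetical file name):
#   runtime_section >> MANIFEST.md
```

Trim the raw version banners by hand afterwards; the point is only to get the exact patch numbers on the page.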

2. Database connection. Find wp-config.php, settings.php, configuration.php, or app/etc/local.xml depending on the CMS. Note the database name, user, host, and whether it points at localhost or a separate host. If it's a separate host, write down the IP, not just the hostname. DNS records drift.

grep -E "DB_(NAME|USER|HOST)" wp-config.php

Do not paste the password into the manifest. A reference like "see vault entry kv/clients/acme/db" is enough.
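One way to make the no-password rule hard to break is to extract only the three safe constants. The sketch below assumes a standard WordPress wp-config.php with one define() per line; db_section is a name we made up, and other CMS config files would need their own pattern.

```shell
#!/bin/sh
# db_section FILE: print DB_NAME, DB_USER and DB_HOST from a wp-config.php
# as manifest bullets. DB_PASSWORD is not in the pattern, so it is never printed.
db_section() {
  echo "## Database"
  sed -n -E "s/^[[:space:]]*define\([[:space:]]*['\"](DB_NAME|DB_USER|DB_HOST)['\"][[:space:]]*,[[:space:]]*['\"]([^'\"]*)['\"].*/- \1: \2/p" "$1"
}

# Usage (hypothetical file name):
#   db_section wp-config.php >> MANIFEST.md
```

Add the vault reference as a fourth bullet by hand, so the manifest says where the credential lives without containing it.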

3. .htaccess and rewrite rules. This is where the bodies are buried. Every legacy site has a .htaccess block someone added at 2am to fix a redirect loop and never documented. Cat the file, paste it verbatim into the manifest under its own heading, and annotate the non-obvious blocks. Even a note like "this block forces HTTPS for /shop/ only, reason unknown" is useful.

RewriteEngine On
RewriteCond %{HTTP_HOST} ^oldbrand\.nl$ [NC]
RewriteRule ^(.*)$ https://newbrand.nl/$1 [R=301,L]

Apache's own mod_rewrite documentation is the reference you want open in another tab when decoding a 12-rule chain.
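Pasting verbatim is easy to get subtly wrong when an editor "helpfully" reflows long lines. A sketch of doing it from the shell instead, assuming the file is readable where you run it; htaccess_section is a hypothetical helper name, and the four-space indent is just so a markdown renderer leaves the rules untouched.

```shell
#!/bin/sh
# htaccess_section FILE: print the file verbatim under a dated heading,
# indented four spaces, followed by a count of RewriteRule lines.
htaccess_section() {
  echo "## .htaccess (verbatim, captured $(date +%Y-%m-%d))"
  echo
  sed 's/^/    /' "$1"
  echo
  echo "RewriteRule count: $(grep -c -E '^[[:space:]]*RewriteRule' "$1")"
}

# Usage (hypothetical file name):
#   htaccess_section .htaccess >> MANIFEST.md
```

The rule count is a cheap tripwire: if it changes between two captures, someone edited the file and the annotations need another look.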

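4. Cron jobs. Run crontab -l on the server. Check /etc/cron.d/ for system-level jobs. For WordPress, scheduled tasks are stored in the cron option in the wp_options table; also note whether DISABLE_WP_CRON is set in wp-config.php, because that means something external (a system cron entry or an HTTP loopback request) is expected to trigger wp-cron.php. Magento 1 ships its own cron.sh, which should be documented separately.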
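Crontabs on old servers tend to be mostly comments. A rough sketch for capturing only the live entries, assuming a POSIX shell; active_jobs and cron_section are our own names, and files in /etc/cron.d that your user cannot read are skipped silently. If WP-CLI is installed, wp cron event list covers the WordPress side.

```shell
#!/bin/sh
# active_jobs: read crontab text on stdin, drop blank lines and comments.
active_jobs() {
  grep -v -E '^[[:space:]]*(#|$)'
}

# cron_section: emit the current user's crontab and every readable file
# under /etc/cron.d, comments stripped.
cron_section() {
  echo "## Cron"
  echo "### User crontab ($(id -un))"
  crontab -l 2>/dev/null | active_jobs
  echo "### /etc/cron.d"
  for f in /etc/cron.d/*; do
    [ -f "$f" ] && [ -r "$f" ] || continue
    echo "# $f"
    active_jobs < "$f"
  done
}

# Usage (hypothetical file name):
#   cron_section >> MANIFEST.md
```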

5. Plugin and module inventory. On WordPress with WP-CLI installed:

wp plugin list --format=csv > plugins.csv

If WP-CLI isn't there, a directory listing of wp-content/plugins is enough. Record version numbers next to each. Mark plugins last updated more than 24 months ago. Those are your audit candidates.
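Without WP-CLI, the version numbers are still sitting in each plugin's header comment. The following is a sketch under assumptions: it expects the usual layout of one folder per plugin with a "Plugin Name:" and "Version:" header in a top-level PHP file, and plugin_inventory is a name we invented. Single-file plugins dropped directly into wp-content/plugins are not covered.

```shell
#!/bin/sh
# plugin_inventory DIR: for each plugin folder, find the PHP file carrying a
# "Plugin Name:" header and print "folder,version" as CSV. Folders with no
# recognisable header are listed with version "unknown".
plugin_inventory() {
  echo "plugin,version"
  for dir in "$1"/*/; do
    [ -d "$dir" ] || continue
    main=$(grep -l 'Plugin Name:' "$dir"*.php 2>/dev/null | head -n 1)
    version="unknown"
    if [ -n "$main" ]; then
      found=$(sed -n -E 's/^[[:space:]*]*Version:[[:space:]]*([^[:space:]]+).*/\1/p' "$main" | head -n 1)
      [ -n "$found" ] && version=$found
    fi
    echo "$(basename "$dir"),$version"
  done
}

# Usage (hypothetical paths):
#   plugin_inventory wp-content/plugins > plugins.csv
```

An "unknown" row is worth a line in the manifest on its own: it usually means a hand-rolled plugin or a half-deleted one.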

6. Mail flow. Where does the contact form send to? Through PHP mail()? An SMTP plugin? A transactional service like Postmark or SendGrid? This is the question that always comes up six weeks in, when a customer asks why their order confirmation never arrived. Find the answer once, write it down.
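When nobody can tell you how mail leaves the site, the code usually can. A deliberately noisy sketch: mail_clues is a hypothetical helper, the search terms are our guess at the common cases named above, and it assumes a grep that supports --include (GNU or BSD). Expect false positives and read the hits rather than trusting the count.

```shell
#!/bin/sh
# mail_clues DIR: list PHP lines that call a mail function or mention SMTP,
# Postmark or SendGrid, as file:line:text. Case-insensitive on purpose.
mail_clues() {
  grep -rniE --include='*.php' 'mail[[:space:]]*\(|smtp|postmark|sendgrid' "$1"
}

# Usage (hypothetical path):
#   mail_clues wp-content/ | less
```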

Hardcoded paths and the long tail

The hardest part of a manifest is not the obvious entries. It's the long tail of small assumptions baked into the codebase that nobody flagged. A few that recur on every legacy site we audit:

  • Absolute paths in includes (require '/home/oldowner/public_html/lib.php') that break the moment the site moves hosts.
  • Hardcoded API keys in plugin files, not in wp-config.
  • SSL certificate paths in custom snippets, often pointing at a Let's Encrypt directory symlinked through three layers.
  • The one cron job that runs on a developer's laptop, not the server, and has done so for two years.

Grep for the obvious patterns and add the hits to the manifest:

grep -r "/home/" wp-content/ --include="*.php"
grep -rE "https?://[a-z0-9.-]+\.[a-z]{2,}" wp-content/themes/ --include="*.php"
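To get those hits into the manifest without retyping them, the grep output can be reshaped into bullets. A minimal sketch, assuming file paths without colons and a grep with --include; landmine_bullets is our own name for it.

```shell
#!/bin/sh
# landmine_bullets DIR: find PHP lines containing "/home/" and print each as
# "- <file> line <n>: <code>", ready to paste under a landmines heading.
landmine_bullets() {
  grep -rn --include='*.php' '/home/' "$1" |
    sed -E 's/^([^:]+):([0-9]+):[[:space:]]*(.*)$/- \1 line \2: \3/'
}

# Usage (hypothetical paths):
#   landmine_bullets wp-content/ >> MANIFEST.md
```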

Where to store it

The manifest belongs in three places at once. First, in the repo root if the site is in version control. Second, in your password manager or vault under the client entry, so a colleague who needs database access doesn't have to clone anything. Third, as a printed PDF in the project folder of whoever's invoicing the client. That last one sounds excessive until the client's domain registrar account goes silent for two weeks and you need to prove what you inherited.

Update the manifest every time you change something material. A manifest untouched for 18 months is worse than no manifest, because it lies confidently. OWASP's guidance on legacy application inventories says the same thing in more words: documentation has to be a living artifact or it becomes a liability.
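A staleness check is easy to automate if the manifest carries a "Last updated: YYYY-MM-DD" line. This sketch makes several assumptions: GNU date (the -d flag), that exact line format, and 548 days as a rough stand-in for 18 months. Both function names are hypothetical.

```shell
#!/bin/sh
# manifest_age_days FILE [TODAY]: print whole days between the file's
# "Last updated:" date and TODAY (YYYY-MM-DD, defaults to the current date).
manifest_age_days() {
  stamp=$(sed -n -E 's/^Last updated:[[:space:]]*([0-9]{4}-[0-9]{2}-[0-9]{2}).*/\1/p' "$1" | head -n 1)
  [ -n "$stamp" ] || { echo "no Last updated line in $1" >&2; return 2; }
  today=${2:-$(date +%Y-%m-%d)}
  then_s=$(date -u -d "$stamp" +%s) || return 2
  now_s=$(date -u -d "$today" +%s) || return 2
  echo $(( (now_s - then_s) / 86400 ))
}

# manifest_is_stale FILE [TODAY]: succeed when the manifest is older than
# roughly 18 months (548 days, an approximation).
manifest_is_stale() {
  age=$(manifest_age_days "$@") || return 2
  [ "$age" -gt 548 ]
}

# Usage (hypothetical file name):
#   manifest_is_stale MANIFEST.md && echo "MANIFEST.md needs a review"
```

It only tells you the date is old, not that the contents are wrong, so treat a hit as a prompt to re-run the capture.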

A worked example

Here's a trimmed manifest from a real WordPress site we picked up earlier this year, names changed:

# acme-nl manifest
Last updated: 2026-04-12

## Runtime
- PHP 7.4.33 (host panel allows up to 8.1)
- MySQL 5.7.39 (MariaDB-compatible)
- Apache 2.4.41
- WordPress 6.2.3 (auto-updates disabled)

## Database
- Host: 145.220.x.x (not localhost)
- DB: acme_prod, user: acme_wp
- Credentials: vault://clients/acme/db
- Daily mysqldump to /backups/acme-YYYY-MM-DD.sql.gz, 30-day retention

## .htaccess quirks
- Forces HTTPS for /winkel/ only (legacy SEO redirect chain)
- Blocks /wp-login.php to non-NL IPs (geoblock added 2024-08)

## Cron
- wp-cron disabled; system cron runs /usr/local/bin/php wp-cron.php every 10m
- nightly mysqldump at 02:30 CET

## Mail
- WP Mail SMTP plugin pointed at Postmark, API key in wp-config (move to env)

## Known landmines
- Theme contains hardcoded path /home/acmenl1/ in functions.php line 312
- Custom plugin "acme-orders" written by previous dev, no source repo, 1.8MB
- SSL cert auto-renews via host panel, not certbot

That's it. 30 lines, fits on a screen, makes the next person 80% faster.

The audit that pays for itself

Every legacy site we've worked with that had a manifest got handed off in days. Every one that didn't burned a week of someone's life reconstructing missing context. The 20 minutes you spend writing it is the cheapest insurance the project will buy.

When we built Pier for editing legacy sites by chat, we ran into this exact problem on almost every site we docked with. The way we ended up handling it: Pier writes a baseline manifest the first time you connect, captures PHP and MySQL versions, schema, .htaccess, and the plugin inventory automatically, and stamps a new entry into version history every time something material changes. The manifest stops being something you remember to update.

The smallest thing you could do today: pick the oldest site in your portfolio, open a terminal, and start a MANIFEST.md with the three version commands at the top of this post. You'll have something useful before your coffee gets cold.

— Questions —

How is a site manifest different from a README?

A README explains what the site does. A manifest captures what's running it: versions, paths, cron jobs, dependencies, mail flow. Successors need both, but they need the manifest first.

Where should the manifest file live?

Three places at once: the repo root for developers, your password vault for anyone needing credentials, and a PDF in the project folder of whoever invoices the client.

How often should I update it?

Every time you change something material: PHP upgrade, plugin install, cert renewal, cron edit, host migration. An outdated manifest is worse than none because it lies confidently.

What if I inherit a site with zero documentation?

Run the 20-minute capture before touching code, in the same order as above: versions first, database second, .htaccess third, then cron, plugins, and mail flow. Cron and mail flow can wait until last only because the earlier steps usually hint at where to look for them.