— Article — № 003

Joomla migration without SEO loss: a case study of a 12-year-old site

A 12-year-old Joomla 2.5 site, three SEF plugins deep, moving to PHP 8.2 hosting. Here's how the redirects, the database, and the rankings survived the week.

The call came in from a Belgian agency we work with: a travel publisher, 3,400 articles, Joomla 2.5.28, sitting on a shared host that had just emailed the client about PHP 5.6 being retired at the end of the quarter. The site still pulled around 180,000 organic sessions a month, most of it on long-tail route guides that had been ranking since 2014. The agency had quoted a rebuild in October, the client had declined, and now the hosting deadline was six weeks out.

The brief was narrow: move the site to modern hosting, get it onto a supported PHP, and do not lose the SEO juice. No redesign. No CMS jump to WordPress. The Joomla migration had to be boring, reversible, and finished before the PHP cutoff. This is what that looked like, step by step, including the parts that went sideways.

The starting inventory

Before touching anything, we spent half a day cataloguing what was actually on disk and in the database. On a site this old, the installed extensions are the migration. Core Joomla upgrades are documented; the three SEF plugins stacked on top of each other are not.

The shared host gave us SSH, barely. First pass:

ssh client@oldhost
php -v
# PHP 5.6.40 (cli)

cd ~/public_html
grep -E 'dbtype|host|user|db =' configuration.php
# mysqli / localhost / [redacted]

ls administrator/components/ | wc -l
# 47

ls plugins/system/ | wc -l
# 22

Forty-seven admin components and twenty-two system plugins on a 2.5 install is archaeology, not software. We exported the extensions table to see what was actually enabled:

SELECT name, element, type, enabled
FROM jos_extensions
WHERE enabled = 1
  AND type IN ('component','plugin')
ORDER BY type, name;

Three SEF (search engine friendly URL) plugins were active at once: sh404SEF, Joomla's native SEF, and something called JoomSEF that hadn't been updated since 2015. Each was rewriting URLs on top of the last. The live URL structure the search engines had indexed was the output of all three, in order, and none of the developers who set it up were reachable. That's the SEO juice: a decade of Googlebot learning a specific URL shape that no single component on the server could reproduce from scratch.

Capturing the URL truth before moving anything

The mistake on this kind of legacy site migration is trying to understand the redirect logic from source. You won't. The rewrites are emergent. What you want instead is a ground-truth list of every URL Google actually knows about, and the HTTP status and final destination each one currently returns.

We pulled three sources and merged them:

  1. The XML sitemap the site was serving (~3,400 URLs).
  2. A Search Console export of every URL with at least one impression in the last 16 months (~11,800 URLs — many more than the sitemap, as expected).
  3. Access logs from the last 90 days, grep'd for 200-status GETs with a Googlebot or Bingbot user agent.
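
The merge of those three lists can be sketched roughly as below. This is a minimal illustration, not the script used on the project: the function names, the example.com hostname, and the sample URLs are all hypothetical, and the normalization rules (lowercase the scheme and host, drop fragments, leave path and query untouched) are an assumption about what counts as "the same URL".

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Lowercase scheme and host, drop the fragment, keep path and query as-is."""
    parts = urlsplit(url.strip())
    path = parts.path or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def merge_sources(*sources):
    """Union several iterables of URLs, skipping blanks, and return a sorted list."""
    seen = set()
    for source in sources:
        for url in source:
            if url.strip():
                seen.add(normalize(url))
    return sorted(seen)


# Hypothetical sample data standing in for the three real exports.
sitemap = ["https://example.com/reisgidsen/frankrijk/provence-verborgen-dorpjes.html"]
search_console = [
    "https://EXAMPLE.com/reisgidsen/frankrijk/provence-verborgen-dorpjes.html#kaart",
    "https://example.com/content/view/427/53/",
]
access_logs = ["https://example.com/content/view/427/53/", ""]

urls = merge_sources(sitemap, search_console, access_logs)
print(len(urls))  # 2
```

Paths are deliberately left case-sensitive and trailing slashes are preserved, since on a site like this either one can be a distinct indexed URL.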

Deduplicated, that produced 14,210 canonical URLs the search engines cared about. We then crawled all of them from our side and captured status, final URL after redirects, and the page's canonical tag:

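# Record the requested URL, final status, final URL, and redirect hop count.
# HEAD requests only; the canonical tag needs a separate GET pass.
while IFS= read -r url; do
  printf '%s\t' "$url"
  curl -sI -L -o /dev/null -w '%{http_code}\t%{url_effective}\t%{num_redirects}\n' "$url"
done < urls.txt > baseline.tsv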

That baseline.tsv is the contract. After the migration, the same crawl on the new host has to produce the same final URLs with the same status codes (or a 301 to the same destination). Anything else is ranking loss. Google's own guidance on site moves with URL changes is worth re-reading before you start — the section on redirect chains is the one people forget.
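
Checking that contract is a plain comparison. The sketch below assumes each crawl has already been parsed into a dict keyed by request path, with a (final status, final URL) tuple as the value; that structure, the function names, and the sample hostnames are illustrative assumptions, not the tooling used on the project. Final URLs are compared on path and query only, so a staging hostname does not register as a difference.

```python
from typing import Dict, List, Optional, Tuple
from urllib.parse import urlsplit

Result = Tuple[int, str]  # (final HTTP status, final URL after redirects)


def path_of(url: str) -> str:
    """Reduce a URL to path plus query so the hostname is ignored."""
    parts = urlsplit(url)
    return parts.path + ("?" + parts.query if parts.query else "")


def diff_crawls(
    baseline: Dict[str, Result], after: Dict[str, Result]
) -> List[Tuple[str, Result, Optional[Result]]]:
    """Return every request path whose final status or final path changed."""
    mismatches = []
    for path, (status, final_url) in sorted(baseline.items()):
        new = after.get(path)
        if new is None or new[0] != status or path_of(new[1]) != path_of(final_url):
            mismatches.append((path, (status, final_url), new))
    return mismatches


# Hypothetical sample rows.
article = "/reisgidsen/frankrijk/provence-verborgen-dorpjes.html"
baseline = {
    article: (200, "https://example.com" + article),
    "/content/view/427/53/": (200, "https://example.com" + article),
}
after = {
    article: (404, "https://staging.example.com" + article),
    "/content/view/427/53/": (200, "https://staging.example.com" + article),
}

mismatches = diff_crawls(baseline, after)
for path, before, now in mismatches:
    print(path, before[0], "->", now[0] if now else "missing")
```

Because both crawls follow redirects, a legacy URL that now reaches the same destination through a 301 compares as equal, which is the behaviour the contract allows.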

Staging the new host

The new hosting was a small VPS running PHP 8.2, MariaDB 10.6, and Apache 2.4 with mod_rewrite. Joomla 2.5 does not run on PHP 8.2. It doesn't run on PHP 7.4 either, cleanly. The supported upgrade path is 2.5 → 3.10 → 4.x → 5.x, and even the first hop changes enough of the database schema that half the third-party extensions break.

We chose the pragmatic route: upgrade to Joomla 3.10 LTS, which does run on PHP 8.1 with the compatibility plugin and will run acceptably on 8.2 with a handful of patches. Joomla 4 would have meant rewriting three of the custom components, and the budget didn't exist. 3.10 bought the client another two years and kept the template, the menus, and — crucially — the SEF URLs intact.

The migration ran on staging first, obviously:

# On old host
mysqldump --single-transaction --quick --default-character-set=utf8 \
  -u dbuser -p oldjoomla > dump.sql
tar czf files.tar.gz public_html/

# On new host
mysql -u newuser -p newjoomla < dump.sql
tar xzf files.tar.gz -C /var/www/staging/

Two things bit us here. The old database was utf8 (three-byte), not utf8mb4. A straight dump-and-restore into a utf8mb4 database mangled a handful of articles containing emoji and old Windows-1252 smart quotes. We reimported with the original charset, ran the Joomla 3.10 upgrade, and then converted tables to utf8mb4 after confirming the upgrade didn't touch text columns.
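
Why those two kinds of character misbehave can be shown in a few lines. This is only an illustration of the byte-level problem, not the method used to find the affected articles; the helper names and sample strings are hypothetical.

```python
def needs_utf8mb4(text: str) -> bool:
    """True if any character sits outside the Basic Multilingual Plane,
    i.e. takes four bytes in UTF-8 and does not fit MySQL's three-byte utf8."""
    return any(ord(ch) > 0xFFFF for ch in text)


def is_valid_utf8(raw: bytes) -> bool:
    """True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


sunflower = "\U0001F33B"
print(len(sunflower.encode("utf-8")))                 # 4
print(needs_utf8mb4("Provence " + sunflower))         # True
print(needs_utf8mb4("Provence: verborgen dorpjes"))   # False

# A right single quote saved as Windows-1252 is the single byte 0x92,
# which is not valid UTF-8 on its own.
smart_quote = "\u2019".encode("cp1252")
print(smart_quote)                                    # b'\x92'
print(is_valid_utf8(b"l" + smart_quote + b"abbaye"))  # False
```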

The second bite: jos_session had 2.1 million rows. The upgrade script tried to alter it and stalled. We truncated it first. Sessions are disposable; nobody has ever missed one.

The SEF untangling

With 3.10 running on staging, we pointed a local hosts entry at it and re-ran the baseline crawl. 4,100 URLs returned 404.

All of them were sh404SEF-generated URLs with the old slug format (/reisgidsen/frankrijk/provence-verborgen-dorpjes.html — trailing .html, Dutch slug, category-then-slug pattern). The Joomla 3.10 native router was producing /reisgidsen/frankrijk/427-provence-verborgen-dorpjes. Different path, ID in the slug, no suffix. If we shipped that, we'd lose a decade of backlinks.
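
The difference between the two shapes is mechanical enough to express in code, which is handy for spot-checking a crawl report. The sketch below is an illustration built from the single example pair above; it is not how sh404SEF resolves URLs (that comes from its stored URL table), and the function name is hypothetical.

```python
import re

# Native-router shape: last segment is "<numeric id>-<alias>", no suffix.
NATIVE_LAST_SEGMENT = re.compile(r"^(\d+)-(.+)$")


def native_to_legacy(path: str) -> str:
    """Map a native-router path to the legacy shape: drop the ID, add .html."""
    head, _, last = path.rstrip("/").rpartition("/")
    match = NATIVE_LAST_SEGMENT.match(last)
    if match is None:
        return path  # not an ID-prefixed article path; leave untouched
    return f"{head}/{match.group(2)}.html"


print(native_to_legacy("/reisgidsen/frankrijk/427-provence-verborgen-dorpjes"))
# /reisgidsen/frankrijk/provence-verborgen-dorpjes.html
```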

The fix had three parts.

Reinstall sh404SEF. The 3.x-compatible version exists and, mercifully, can import its old URL database from the 2.5 version's tables. We exported jos_sh404sef_urls from the old database, imported it into the new one, and let sh404SEF take over routing again. That recovered about 3,700 of the 404s.

Static redirect map for the rest. The remaining 400 URLs were old JoomSEF leftovers from before sh404SEF was installed in 2016. They're not in any current plugin's database. We built the map by hand from the baseline file and dropped it into .htaccess above the Joomla rewrite block:

RewriteEngine On

# Legacy JoomSEF redirects (pre-2016)
RewriteRule ^content/view/(\d+)/\d+/?$ /index.php?option=com_content&view=article&id=$1 [R=301,L]
RewriteRule ^nl/reisgids-(.+)\.htm$ /reisgidsen/$1 [R=301,L,NC]

# Trailing .html on category pages (sh404SEF v2 shape)
RewriteRule ^([a-z-]+)/([a-z0-9-]+)\.html$ /$1/$2 [R=301,L]

# Joomla standard
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .* index.php [L]
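
Rules like these are easy to sanity-check offline before they go near a live .htaccess. The sketch below re-expresses the three legacy patterns as Python regular expressions and reports which rule, if any, would redirect a given path. It is a rough model under stated assumptions: it only finds the first matching rule and its target, it does not reproduce Apache's full rule processing, and the sample paths are hypothetical.

```python
import re

# (pattern, substitution, case_insensitive): Python renderings of the legacy rules.
RULES = [
    (r"^content/view/(\d+)/\d+/?$", r"/index.php?option=com_content&view=article&id=\1", False),
    (r"^nl/reisgids-(.+)\.htm$", r"/reisgidsen/\1", True),  # NC flag
    (r"^([a-z-]+)/([a-z0-9-]+)\.html$", r"/\1/\2", False),
]


def first_redirect(path):
    """Return the target of the first matching rule, or None if none match.
    In .htaccess context the pattern sees the path without its leading slash."""
    candidate = path.lstrip("/")
    for pattern, target, nocase in RULES:
        match = re.match(pattern, candidate, re.IGNORECASE if nocase else 0)
        if match:
            return match.expand(target)
    return None


print(first_redirect("/content/view/427/53/"))
# /index.php?option=com_content&view=article&id=427
print(first_redirect("/NL/reisgids-provence.htm"))
# /reisgidsen/provence
print(first_redirect("/reisgidsen/frankrijk.html"))
# /reisgidsen/frankrijk
print(first_redirect("/reisgidsen/frankrijk/provence-verborgen-dorpjes.html"))
# None
```

The last case is the important one: a three-segment article URL ending in .html matches none of the redirect rules, so it falls through to Joomla and sh404SEF untouched.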

See the Apache mod_rewrite reference for the flag semantics; the [L] on the legacy rules matters because without it the trailing .html rule catches the already-rewritten internal requests and you get a loop.

Canonical enforcement. Joomla 3.10's native SEF was still producing an alternative URL for every article (the ID-based form). We set sh404SEF to 301 all non-canonical variants to the canonical one, and added a canonical tag template override to guarantee the <link rel="canonical"> in the page head matched. Google will eventually figure it out from one signal, but three matching signals converge faster.
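
Verifying that the canonical tag in the page head agrees with the URL being served can be done with the standard library alone. This is a minimal sketch assuming the page HTML has already been fetched; the class and function names and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonicals.append(attr["href"])


def canonical_matches(html: str, expected_url: str) -> bool:
    """True only if the page declares exactly one canonical and it equals expected_url."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals == [expected_url]


# Hypothetical page head.
page = """<html><head>
<link rel="canonical" href="https://example.com/reisgidsen/frankrijk/provence-verborgen-dorpjes.html" />
</head><body></body></html>"""

print(canonical_matches(page, "https://example.com/reisgidsen/frankrijk/provence-verborgen-dorpjes.html"))  # True
print(canonical_matches(page, "https://example.com/reisgidsen/frankrijk/427-provence-verborgen-dorpjes"))   # False
```

Requiring exactly one canonical is a deliberate choice here: a template override stacked on an SEF extension can end up emitting two, and that is worth flagging rather than tolerating.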

Cutover

DNS cutover happened on a Tuesday at 04:00 CET. TTL had been dropped to 300 seconds the week before. The sequence:

  1. Put the old site in read-only mode (disable user registration, comments, any form submission) at 03:00.
  2. Final catch-up mysqldump of just the article and user tables, restored onto the new host.
  3. Swap the A record.
  4. Watch the access log on the new host until Googlebot arrives (it took 47 minutes).
  5. Re-run the baseline crawl against the live domain.
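
Step 4 does not need anyone staring at a terminal. The sketch below scans access log lines for the first request whose user agent mentions Googlebot. It assumes Apache's "combined" log format and an English locale for the month names; the function name and the sample log lines are invented for illustration. Matching on the user-agent string alone does not prove the request really came from Google.

```python
import re
from datetime import datetime

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LINE = re.compile(
    r'^\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def first_crawler_hit(lines, agent_substring="Googlebot"):
    """Return (timestamp, path, status) for the first request whose user agent
    contains agent_substring, or None if there is no such line."""
    for line in lines:
        match = LINE.match(line)
        if match and agent_substring in match.group("agent"):
            when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
            return when, match.group("path"), int(match.group("status"))
    return None


# Invented sample lines, not taken from the real logs.
log = [
    '203.0.113.7 - - [02/Jan/2024:04:10:00 +0100] "GET / HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (X11; Linux x86_64)"',
    '192.0.2.10 - - [02/Jan/2024:04:31:45 +0100] '
    '"GET /reisgidsen/frankrijk/provence-verborgen-dorpjes.html HTTP/1.1" 200 18432 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

hit = first_crawler_hit(log)
print(hit[1], hit[2])
# /reisgidsen/frankrijk/provence-verborgen-dorpjes.html 200
```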

The crawl came back with 11 non-matching URLs. All eleven were articles that had been edited on the old site between the last full dump and the cutover; the content matched but the modified timestamp didn't. Acceptable.

What the rankings did

The honest numbers, from Search Console, 90 days post-cutover:

  • Total clicks: −4.2% vs. the 90 days before (within normal seasonal variance for this vertical).
  • Indexed pages: 3,380 → 3,340. Forty dropped URLs, all of which were pre-2014 articles with no backlinks and near-zero impressions.
  • Average position on the top 500 queries: unchanged to +0.3.
  • Core Web Vitals: moved from 18% "good" to 71% "good" on mobile, purely from the hosting change.

No ranking collapse. The work happened before the cutover, not after.

The tool question

The slowest part of this job wasn't the dump or the rewrites. It was running small SQL queries against the old database to understand what was there — which extensions were actually used, which articles had non-ASCII characters, which user accounts had been dormant since 2017 — and keeping notes on what we changed and why, so the rollback plan stayed honest.

When we built Pier, the Joomla and Drupal migrations we'd done in the years before were exactly the use case. A docked MySQL editor next to the file tree, with version history on every edit, turns out to be what this work actually needs: not a framework, not a dashboard, just the two panes open at once with an audit trail underneath.

One thing you can do today

If you've got a legacy Joomla or Drupal site on a host that's nagging you about PHP versions, pull a baseline crawl now — sitemap plus Search Console export plus 90 days of access logs, deduplicated, with status codes captured. The migration is easier when you already know the shape of the thing you're not allowed to break.

— Questions —

Can you migrate Joomla 2.5 directly to Joomla 5?

Not cleanly. The supported path is 2.5 → 3.10 → 4.x → 5.x, and most third-party extensions break between majors. For older sites, 3.10 LTS is often the pragmatic stop.

Will 301 redirects preserve PageRank?

Google has stated that 301 redirects pass full link equity. In practice, ranking stability depends more on keeping the URL set complete and the content unchanged than on the redirect type itself.

Do I need to convert utf8 to utf8mb4 during migration?

Yes, eventually — utf8 in MySQL is three-byte and can't store emoji or some CJK characters. Do it after the Joomla version upgrade completes, not before, to avoid collation conflicts mid-upgrade.

How long should DNS TTL be before cutover?

Drop it to 300 seconds at least 48 hours before the switch. That way a rollback propagates in minutes, not hours, if something breaks after the A record change.