051 —Magento
Magento CSV import without downtime: a 9,000-row swap
A 23:41 Loom, a 9,043-row supplier CSV, and a storefront that could not go down. The post-mortem on how a Dutch agency staged the import without losing a minute of uptime.
The 23:41 Loom
A Loom link landed in our inbox at 23:41 on a Tuesday. Twelve minutes of screen recording from the lead developer at a Dutch agency, voice tight, narrating his terminal. Their largest Magento 2.4.5 client (about €4M/year in B2B parts and accessories) had just received a 9,043-row product CSV from the supplier. New pricing, new stock figures, eighteen new SKUs, plus description rewrites on roughly half the catalogue. The client wanted it live before 06:00, when their first wholesale buyers logged in.
The previous attempt, two weeks earlier, had killed the storefront for 38 minutes. Category pages 502'd. Cart updates timed out. Admin sessions hung. Eventually they restarted MySQL, lost two of the import rows mid-flight, and spent the next day reconciling. The lead's question on the Loom was direct: how do we replace 9,000 rows of Magento product data without taking the site down?
This is the post-mortem on that night. Names left out, every command and SQL fragment is the real thing we ran or recommended.
Where Magento's native importer gives up
Magento 2's Magento_ImportExport module is built for this job. bin/magento import:products or the Admin UI both work fine for moderate-sized CSVs. The trouble starts when the import is wide (lots of attributes per row) and the catalogue has live URL rewrites, multi-store, MSI inventory, and tier prices all wired up at the same time. Each saved row triggers writes against:
catalog_product_entityand its_int,_varchar,_text,_decimalEAV tablesurl_rewrite, on every URL key change (and Magento rewrites the rewrite on every save, even when the key did not actually change)cataloginventory_stock_itemand the MSIinventory_source_itemtablecatalog_product_index_pricepartitions, one per customer group- The full-text search index (
catalogsearch_fulltext_scope1on MySQL, the Elasticsearch index when configured)
On a busy shop, those writes collide with reads from category pages, layered navigation, and the cart. The InnoDB deadlock log fills with the same line:
SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying
to get lock; try restarting transaction, query was: UPDATE
`catalog_product_index_price` SET ...
Once enough deadlocks pile up, the import job retries, fails, and either restarts from row 1 or skips the failed rows entirely depending on which mode you picked. By then the front-end is already gasping.
Splitting the CSV and the import job
The first fix is unglamorous. Stop trying to import 9,000 rows in one transaction. Magento's async bulk API exists precisely for this. We split the supplier CSV into chunks of 300 rows each with a quick shell loop:
cd /var/www/magento/var/import
mkdir chunks
tail -n +2 supplier-2026-05.csv | split -l 300 -d - chunks/products-
head -n 1 supplier-2026-05.csv > header.csv
for f in chunks/products-*; do
cat header.csv "$f" > "$f.csv" && rm "$f"
done
ls chunks/ | wc -l
# 31
Thirty-one chunks of about 300 rows each. Each chunk takes 40 to 90 seconds to import on this client's box (8 vCPU, 16 GB RAM, RDS MySQL 8.0). Run them sequentially with a small sleep between them and the front-end stays responsive.
For the actual import we used the async bulk REST endpoints rather than the CLI. The async path drops jobs into RabbitMQ and a consumer processes them one at a time, which is exactly what we wanted. Setup is two blocks in app/etc/env.php:
'queue' => [
'consumers_wait_for_messages' => 0,
],
'cron_consumers_runner' => [
'cron_run' => true,
'max_messages' => 100,
'consumers' => ['async.operations.all'],
],
Then start the consumer once, with a single worker so chunks process in order:
bin/magento queue:consumers:start async.operations.all \
--single-thread --max-messages=10000 &
Indexer modes and the staging dance
The single biggest win was flipping every indexer to manual mode before the import started. By default Magento's indexers are set to Update on Save, which means every row write triggers a partial reindex of price, stock, EAV, and search. Multiply that by 9,000 and the math gets ugly. The indexer documentation covers the modes but is quiet on the operational implications.
bin/magento indexer:set-mode manual
bin/magento indexer:status
# Design Config Grid Manual Updates
# Customer Grid Manual Updates
# Category Products Manual Updates
# Product Categories Manual Updates
# Product Price Manual Updates
# Product EAV Manual Updates
# Stock Manual Updates
# Catalog Rule Product Manual Updates
# Catalog Product Rule Manual Updates
# Catalog Search Manual Updates
# Inventory Manual Updates
With indexers off, the import writes only to the source tables. Storefront reads continue to hit the previously-built index tables, which still hold last night's data. The customer sees last night's prices for the duration of the import. For this client, who explicitly said stale prices for 30 minutes is fine, downtime is not, that was the right trade.
Reindexing without taking the catalogue down
When the last chunk landed we had 9,043 rows updated, indexers untouched, storefront still serving fine. The next problem was reindexing without recreating the 38-minute outage.
The default bin/magento indexer:reindex runs every indexer in sequence and rebuilds them from scratch. On a catalogue this size, the price reindex alone takes 6 to 9 minutes and locks catalog_product_index_price for most of that. Category page queries that join against it queue up behind the rebuild.
Instead we reindexed selectively, in dependency order, and ran each indexer individually so we could pause between them:
bin/magento indexer:reindex catalog_product_attribute
bin/magento indexer:reindex catalog_product_price
bin/magento indexer:reindex inventory
bin/magento indexer:reindex cataloginventory_stock
bin/magento indexer:reindex catalogrule_product
bin/magento indexer:reindex catalog_category_product
bin/magento indexer:reindex catalogsearch_fulltext
For the price indexer, which is the heaviest, we used a partial reindex against only the SKUs that changed. Magento ships a _cl changelog table per indexer (catalog_product_price_cl) that's normally only used in scheduled mode, but you can populate it manually:
INSERT INTO catalog_product_price_cl (entity_id)
SELECT entity_id FROM catalog_product_entity
WHERE sku IN (SELECT sku FROM tmp_changed_skus);
-- Then trigger a partial reindex via the MView
UPDATE mview_state SET status = 'idle', mode = 'enabled'
WHERE view_id = 'catalog_product_price';
The partial reindex took 41 seconds against 9,043 SKUs versus 7 minutes for a full rebuild. Most of that 41 seconds was the customer-group fan-out: this client has four groups (Guest, Retail, Wholesale, B2B-Tier), so the index table grows to roughly 36,000 rows on every cycle.
Last, cache. The temptation is bin/magento cache:flush, which nukes everything including the cart and session caches. Don't. Targeted invalidation by tag is enough:
bin/magento cache:clean config block_html full_page
# leave layout, collections, db_ddl, eav, integration alone
What we would do differently next time
It worked. The whole job ran from 02:18 to 03:04. Storefront p95 response times stayed under 800ms throughout. The supplier CSV landed, prices and stock were live by 03:10, the lead developer slept until 11:00.
Two things we would change. First, the URL rewrite problem. Magento rewrote the entire url_rewrite table for products whose URL keys had not actually changed, because the importer sees the column in the CSV and writes it regardless. The fix is to strip the url_key column from the import CSV entirely when keys are not changing. We did that on chunks 4 through 31 and saved roughly 80,000 row writes against url_rewrite.
Second, the supplier should not be sending the full catalogue every month when only a fraction of rows changed. A pre-import diff against the previous month's CSV cut the actual import down to 1,247 rows the following cycle. awk is enough for this:
awk -F',' 'NR==FNR{a[$1]=$0; next} a[$1]!=$0' \
prev-month.csv supplier-2026-06.csv > delta.csv
wc -l delta.csv
# 1248 (header + 1247 changes)
When we built Pier we kept running into exactly this kind of work: small agencies who need to poke at a live Magento or WordPress MySQL editor without locking it, and who want a reliable version history when something goes sideways at 02:00. The way we ended up handling it was to wrap every edit in a per-change snapshot that rolls back with one click, which is what the agency above now uses to stage the monthly delta CSV and the inevitable one-off SQL tweaks without ever opening phpMyAdmin.
The smallest thing to try tomorrow
Open a staging Magento, run bin/magento indexer:set-mode manual, import a small CSV, and watch what happens to the storefront response time. If you have never seen the indexer mode flip in action, the difference is the kind of thing you remember the next time someone hands you a 9,000-row supplier dump.
— Questions —
Can I do this with bin/magento import:products instead of the async API?
Yes, but the CLI runs synchronously and a 9,000-row import in one process tends to deadlock. Chunking the CSV and using the async bulk endpoint at least lets you pause between chunks.
Does flipping indexers to manual mode lose data?
No. Entity changes queue in the mview changelog tables (the _cl tables). When you reindex, partial mode replays them. Just remember to flip indexers back to Update by Schedule afterwards.
What about Elasticsearch reindex time on a big catalogue?
catalogsearch_fulltext can take several minutes against a large index. Run it last, accept stale search results during the window, or run a partial reindex against only the changed SKUs.
Will any of this work on Magento 1?
Indexer mode flipping works similarly via Mage_Index, but the schema and bulk API do not exist. For Magento 1 the safer path is a direct SQL staging table with manual EAV writes, then a swap.
How big can a single chunk safely be?
On an 8 vCPU box with MySQL 8.0 and indexers in manual mode, 300 rows runs in under 90 seconds without contention. Above 500 rows we started to see lock waits on catalog_product_entity_varchar.