The first question a publisher should ask before moving off Arc XP or Brightspot is not “can we move?” It is “what happens to twenty years of archive on the way out?”
I get the platform question a lot now. A media organization has been on Arc XP™ — the system The Washington Post built and then licensed to other publishers — or on Brightspot™, and the renewal is coming up, and someone in the room has done the math on what another three-year term costs. So they call and ask whether WordPress can do what their enterprise CMS does. It usually can. That is the easy part of the conversation. The hard part is the archive, and the archive is where these migrations are won or lost.
If you run editorial operations, digital product, or technology at a publisher or broadcaster and you are weighing a move off Arc XP or Brightspot, this is for you. You need to understand the risk before you authorize the work. You do not need to write the import scripts yourself, but you do need to know what to insist on from whoever does.
Why publishers leave Arc XP and Brightspot
The reasons are rarely about the software being bad. Arc XP and Brightspot are capable platforms built by people who understand newsrooms. The reasons are about what owning an enterprise CMS licence actually costs an organization over time, and they tend to arrive together.
Cost is usually the one that starts the conversation. An enterprise CMS licence at publisher scale is a six-figure annual line, and it does not shrink when your newsroom does. The second reason is talent. When you need to hire a developer who can extend Arc XP, you are fishing in a small pond and paying pond-specific rates. When you need a WordPress developer, you are fishing in the largest pool of CMS talent on the open web, and you can hire one in most cities in the country. The third reason is control. Editorial teams want to change a template, add a field to a content type, or ship a new section without filing a vendor ticket and waiting on someone else’s release schedule. Owning your own platform means owning your own roadmap, and for a lot of publishers that is the real prize — not the licence saving, but the end of waiting in someone else’s queue.
None of that is a reason to leave on its own. Together, on a renewal date, they are why my phone rings.
What makes these migrations harder than a normal one
A standard CMS migration moves a few hundred pages and a contact form. An Arc XP or Brightspot migration at a real publisher moves the institutional record of a newsroom, and three things about that record make the job a different discipline.
The archive is enormous, and it is the asset. A regional daily that has been digital since the late nineties has somewhere between several hundred thousand and a few million articles. A network has more. Every one of those articles is a URL that earns traffic, a story a reader might link to, and a piece of the organization’s memory. The scale alone changes the engineering — you cannot eyeball a million-row import — but the scale is not the hard part. The hard part is that the archive is not disposable. It is the reason the organization has authority in search and trust with readers. A business site can afford to lose its old blog posts. A newsroom that loses its archive has lost the thing that made it a newsroom.
The metadata is deep, and most of it stays invisible until it breaks. Arc XP and Brightspot model content richly. A single article carries a canonical author that is a structured entity, not a text string — linked to a beat, a profile, an archive. It carries an original publish timestamp and a separate updated timestamp. It carries section taxonomy that maps to the print history, to the ad inventory, and to the editorial structure. It carries bylines that might credit a staff writer, a wire service, and a photographer separately. When that metadata flattens into plain post fields during an import, nothing throws an error. The articles look fine. Then the author archives are wrong, the section feeds are empty, the syndication credit is missing, and you find each problem one complaint at a time, weeks after launch.
The SEO is audience-critical and unforgiving. For most businesses, a migration that dents search traffic is an inconvenience that recovers in a quarter. For a publisher, search and link traffic is the audience, and a migration that breaks it breaks the revenue the same week. Every old URL that 404s is a story that used to rank and now sends readers nowhere. The stakes that make these migrations worth doing carefully are the same stakes that punish doing them fast.
The process I run
The work moves through five stages, and the order matters more than any single step inside it. Skipping ahead is how archives get lost.
It starts with a content audit — not a count of articles, but a true inventory of every content type, every URL pattern, and every metadata field the old system holds. This is where you find the formats nobody mentioned: the photo galleries, the live blogs, the data tables, the three generations of article template left over from past redesigns. You cannot migrate what you have not found, and the audit is where the real scope of the job stops hiding.
Then comes the migration methodology: the mapping from the old data model to the WordPress one. This is the engineering heart of the project, where an Arc XP author entity becomes a WordPress author with the right schema, where a Brightspot section becomes a taxonomy term, where a publish timestamp lands in the right field and keeps its timezone. I never validate this against the whole archive first. I migrate a representative slice — roughly ten percent, spread across the full date range so it touches the oldest formats and the newest — and I check that slice completely before scaling up. The reason for staging it this way is specific: the failures that lose an archive are the quiet ones. A silently skipped gallery, a publish date overwritten with today’s import date, a redirect that never got written — at ten percent you can still see each of those and fix the mapping. Multiply the same flaw across a million articles and you are no longer fixing a bug, you are reconstructing a record. It is far cheaper to learn that on a tenth of your content than on all of it the week before launch.
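As a rough illustration of how a representative slice can be drawn, here is a minimal Python sketch. It assumes the old system can export one record per article with an ISO 8601 publish date; the `published` field name and the `pick_slice` helper are my own placeholders, not part of Arc XP, Brightspot, or WordPress.

```python
import random
from collections import defaultdict


def pick_slice(articles, fraction=0.10, seed=42):
    """Pick roughly `fraction` of the articles from every publish year."""
    rng = random.Random(seed)  # fixed seed so the same slice can be re-drawn
    by_year = defaultdict(list)
    for article in articles:
        # Assumption: ISO 8601 dates, so the first four characters are the year.
        by_year[article["published"][:4]].append(article)

    chosen = []
    for year in sorted(by_year):
        group = by_year[year]
        # Take at least one article per year so no era of the archive is skipped.
        k = max(1, round(len(group) * fraction))
        chosen.extend(rng.sample(group, k))
    return chosen
```

Sampling within each year, rather than across the archive as a whole, is what stops the slice from being dominated by recent content and missing the oldest templates. Grouping by content type as well as by year would be a reasonable extension.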
URL mapping runs alongside it and deserves its own discipline, because permanent links are editorial currency. Every URL pattern the old site produced goes on a list, built by looking at the live site section by section, not by trusting the obvious paths the discovery call surfaced. Each pattern maps to where it will live in the new structure. Then you build the redirects and test a real sample of them weeks before cutover. The redirect map is not really a launch-day SEO chore. It is what stands between a URL that earned a reader’s trust for a decade and a dead page where the institutional record used to be. When one of those links goes dark, what disappears is not a ranking — it is a citation a researcher followed, a source another newsroom linked to, a piece of the public record the organization was the custodian of. You build and test the map to keep that record reachable.
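A pattern-to-pattern map of this kind can be sketched in a few lines of Python. The legacy patterns and new paths below are invented for illustration; a real map would come out of the audit, and the production redirects would normally live in the web server or a redirect plugin rather than in a script like this.

```python
import re

# Hypothetical legacy URL patterns paired with hypothetical new path templates.
RULES = [
    (
        re.compile(r"^/news/(?P<y>\d{4})/(?P<m>\d{2})/(?P<d>\d{2})/(?P<slug>[a-z0-9-]+)/?$"),
        r"/\g<y>/\g<m>/\g<slug>/",
    ),
    (
        re.compile(r"^/photos/gallery/(?P<slug>[a-z0-9-]+)\.html$"),
        r"/galleries/\g<slug>/",
    ),
]


def new_path(old_path):
    """Return the new path for a legacy path, or None if no rule covers it."""
    for pattern, template in RULES:
        match = pattern.match(old_path)
        if match:
            return match.expand(template)
    return None


def unmapped(sample_paths):
    """List the sampled legacy paths that no rule covers yet."""
    return [path for path in sample_paths if new_path(path) is None]
```

Running `unmapped` over a sample of real URLs from the live site is the useful part: every path it returns is a pattern the map has not accounted for yet.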
The editorial workflow is the stage teams underestimate most. A newsroom that has published in Arc XP for years runs on a rhythm that predates the platform, with custom post statuses baked into how the desk works. WordPress can match most of that workflow, but matching it is design work, not a default. The honest goal is parity on the workflows the desk uses every day, and a clear, agreed list of what changes — decided before launch, not discovered during it.
Last is the cutover. The archive imports in stages while the old site stays live and carries the traffic. The redirects go in. The team rehearses a full publishing day in the new system before the switch, not a demo — a complete day, every story and section and photo workflow, while the old site is still the one readers see. In an archive migration the rehearsal is doing a second job beyond proving editorial readiness: it is the first time anyone exercises the imported archive under real use. Publishing a story surfaces a broken template the import left behind. Linking to an archived piece surfaces a redirect that never got written. Pulling an author archive surfaces metadata that flattened on the way in. None of those show up when you spot-check a few articles by hand; they show up the moment someone works the way the desk actually works.
What actually causes archive loss — and how to prevent it
Archive loss is rarely a dramatic failure. It is almost always quiet — which is exactly why it survives a cursory check and surfaces months later, one reader complaint at a time. Most of it traces back to three causes, and naming them before you start is what makes them preventable.
The first is the missing redirect. The article migrated fine, but its old URL was never mapped to its new one, so the link that ranked for a decade now returns a 404. The reader is gone and the search authority bleeds out over the following weeks. The prevention is a complete redirect map, tested before launch, with a catch-all rule and a monitored 404 log for the long tail you could not test by hand. On a media migration, the redirect map is the migration.
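Monitoring the 404 log can be as simple as counting which missing paths are requested most often. The sketch below assumes access log lines in the common "combined" format that Apache and nginx can write; the `top_missing` helper is a hypothetical name, and a real setup might read the same data from a log service instead.

```python
import re
from collections import Counter

# Matches the request and status portion of a combined-format access log line,
# for example: "GET /some/path HTTP/1.1" 404 512
REQUEST = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def top_missing(log_lines, limit=20):
    """Return the most-requested paths that answered 404, with hit counts."""
    misses = Counter()
    for line in log_lines:
        match = REQUEST.search(line)
        if match and match.group("status") == "404":
            # Drop the query string so variants of one URL are counted together.
            misses[match.group("path").split("?", 1)[0]] += 1
    return misses.most_common(limit)
```

The paths at the top of that list are the redirects worth writing first, because they are the dead links readers are actually hitting.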
The second is canonical and metadata erosion. The body text came across, but the canonical author flattened to a name, the original publish date got stamped with today’s import date, or the canonical tag now points at the staging domain. The article exists, but it has lost the structured truth that made it trustworthy to a search engine and a reader. The prevention is to treat metadata as first-class cargo, not a side effect — to validate authors, timestamps, canonicals, and section taxonomy on the ten-percent slice before anything scales, and to confirm the original dates survived the trip.
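A per-article check along these lines is one way to make that validation concrete. This is a sketch under assumptions: the field names (`author_id`, `published`, `canonical`, `sections`) are hypothetical stand-ins for whatever the source export and the WordPress side actually expose, and timestamps are assumed to be ISO 8601 strings with an explicit UTC offset.

```python
from datetime import datetime
from urllib.parse import urlparse


def metadata_problems(source, migrated, production_host):
    """Compare one source record with its migrated copy and list what eroded."""
    problems = []

    # The author must still be the same structured entity, not just a name.
    if source["author_id"] != migrated.get("author_id"):
        problems.append("author entity lost or changed")

    # Offset-aware datetimes compare as instants, so a timezone conversion
    # passes while an overwrite with the import date does not.
    if datetime.fromisoformat(source["published"]) != datetime.fromisoformat(migrated["published"]):
        problems.append("original publish time changed")

    # A canonical that still points at staging is a common leftover.
    if urlparse(migrated["canonical"]).hostname != production_host:
        problems.append("canonical points off the production host")

    if set(source["sections"]) != set(migrated["sections"]):
        problems.append("section taxonomy differs")

    return problems
```

Run over every article in the slice, an empty list for each record is the bar to clear before anything scales.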
The third is the silent skip. The importer hit a format it did not understand (an old gallery type, an embedded data table, a story with an unusual attachment) and moved on without flagging it. The headline article count can still look right. The content is not all there. The prevention is reconciliation: count what left the old system and what arrived in the new one, per content type and not just in total, and investigate every gap rather than rounding it off. A staged import makes this visible. A single overnight bulk import hides it until a reader finds the hole for you.
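Reconciliation can be sketched as two comparisons: counts per content type, and the specific source IDs that never arrived. The record shapes here are assumptions for illustration. I am assuming each source item has an `id` and a `type`, and that the importer stored the original ID on each imported item as `source_id`.

```python
from collections import Counter


def reconcile(source_items, imported_items):
    """Return per-type count gaps and the source IDs that never arrived."""
    source_counts = Counter(item["type"] for item in source_items)
    imported_counts = Counter(item["type"] for item in imported_items)

    # Positive gap: items of that type left the old system but did not arrive.
    gaps = {
        kind: source_counts[kind] - imported_counts[kind]
        for kind in source_counts
        if source_counts[kind] != imported_counts[kind]
    }

    arrived = {item["source_id"] for item in imported_items}
    missing_ids = sorted(
        item["id"] for item in source_items if item["id"] not in arrived
    )
    return gaps, missing_ids
```

The per-type breakdown matters because a total can look close enough while one whole format, such as an old gallery type, is almost entirely absent.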
A realistic timeline, and the VIP question
For a publisher with a serious archive, a migration of this class runs six to nine months from audit to cutover, sometimes longer for a network moving multiple properties at once. Anyone quoting six weeks for a million-article archive with full metadata fidelity is quoting for a job that is not the job you have. The audit alone takes weeks if it is done honestly. The redirect map and the staged validation cannot be rushed without reintroducing exactly the risks they exist to remove.
On where to host it, I will give you the honest split. WordPress VIP earns its price when you are running a national network with the traffic and editorial volume to match, and when you need Automattic’s infrastructure and support standing behind the platform during a high-stakes cutover. I have worked on VIP directly through a national newspaper network, and for that class of organization it is the right answer. But a great many publishers leaving Arc XP or Brightspot are not at national-network scale. For them, the right architecture is a well-run managed WordPress host with aggressive caching, a lean plugin set, and someone who knows which plugins to take out. The mistake is assuming you must replace one expensive enterprise contract with another. Sometimes you should. Often the saving you came for is real, and the path to it is WordPress on that kind of managed host, run by people who know what they are doing.
If you are early in this decision, the most useful thing you can do before talking to anyone is run the audit’s first step yourself: list every URL pattern your current site produces and every content type behind them. That single document tells you more about the true size of the job than any vendor demo will, and it is the first thing I will ask you for anyway.
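If you have a list of live URLs to hand, for example from a sitemap export, a short script can collapse them into patterns for that document. This is a heuristic sketch, not a definitive tool: the placeholder names and the rule that any hyphenated segment counts as a slug are my assumptions, and the output still needs a human read.

```python
import re
from collections import Counter
from urllib.parse import urlparse


def pattern_of(url):
    """Collapse one URL into a rough pattern such as /news/{year}/{nn}/{slug}."""
    segments = []
    for segment in urlparse(url).path.strip("/").split("/"):
        # Keep a file extension such as .html, since it is part of the pattern.
        stem, dot, ext = segment.rpartition(".")
        suffix = ""
        if dot and ext.isalpha():
            segment, suffix = stem, "." + ext

        if re.fullmatch(r"\d{4}", segment):
            segment = "{year}"
        elif re.fullmatch(r"\d{1,2}", segment):
            segment = "{nn}"
        elif re.fullmatch(r"\d+", segment):
            segment = "{id}"
        elif "-" in segment:
            segment = "{slug}"  # heuristic: hyphenated segments are slugs
        segments.append(segment + suffix)
    return "/" + "/".join(segments)


def inventory(urls):
    """Return each URL pattern with how many URLs fall under it, largest first."""
    return Counter(pattern_of(url) for url in urls).most_common()
```

The patterns with only a handful of URLs behind them are usually the interesting ones, because they tend to be the formats nobody mentioned.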
Written from a decade of newsroom migration work, with structural help from Claude.
Product names referenced on this page — including WordPress and Claude — are trademarks or registered trademarks of their respective owners. Training offered here is independent and is not affiliated with, endorsed by, or sponsored by any of these companies.