The Daily Johannesburg

Joburg's Digital Archive Crisis: The Numbers Behind a Flood of Duplicate Images Choking City Systems

From the City of Johannesburg's planning portals to Metrorail's maintenance databases, duplicated image files are consuming server space, inflating IT costs and slowing the digital services that residents depend on.

By Johannesburg News Desk · Published 4 July 2026, 8:45 pm

Photo: Magda Ehlers on Pexels

The City of Johannesburg is sitting on a digital storage problem it can barely quantify. Across municipal departments, from the Development Planning directorate in Braamfontein to the Johannesburg Roads Agency on Grayston Drive in Sandton, duplicate image files have accumulated over years of poor digital governance, eating into budgets that officials say are already under strain. Preliminary internal audits conducted earlier this year flagged that an estimated 30 to 40 percent of image assets stored across the city's shared network drives may be redundant copies of files already held elsewhere in the same system.

The timing matters. The ANC-DA coalition running Gauteng has made digital service delivery a centrepiece of its governance pitch since 2024, promising faster turnaround on building permits, rezoning applications and infrastructure queries — all document-heavy processes that rely on image-rich PDF submissions and photograph libraries. When back-end storage is bloated with duplicates, upload speeds slow, retrieval times lengthen and staff waste hours hunting for the correct file version. That inefficiency has a rand value. Industry benchmarks used by South African IT consultancies suggest that unmanaged duplicate data typically costs medium-to-large organisations between R180 000 and R450 000 annually in excess storage licensing, staff time and retrieval errors, depending on the size of the archive.

What the Data Actually Shows

The scale of the problem becomes clearer when you look at specific systems. Joburg Metrorail's asset-management platform, which stores photographs of track infrastructure, station conditions and rolling stock at depots including the Braamfontein yard and the Naledi station in Soweto, receives hundreds of new images per inspection cycle. Engineers on the reform programme, which is being pushed by both the Passenger Rail Agency of South Africa and the Gauteng provincial government, have noted that field teams often upload the same site photograph multiple times because of poor connectivity during load-shedding windows, creating clusters of identical files. A single inspection at Park Station in 2025 reportedly generated three separate uploads of the same 47-image set before network stability was confirmed: 141 stored files for 47 photographs. Multiply that across 19 active inspection teams and the duplication compounds rapidly.
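The identical re-uploads described above are the easiest kind of duplicate to catch, because the files match byte for byte. A minimal Python sketch, assuming the photographs sit in an ordinary folder tree (the function names and folder layout are illustrative and not taken from any Metrorail or city system), groups files by their SHA-256 digest:

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read 1 MiB at a time so large images do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Group files under `root` whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[sha256_of_file(path)].append(path)
    # Only digests seen more than once represent duplicates.
    return [group for group in by_digest.values() if len(group) > 1]
```

Each returned group holds paths with identical contents, so one copy can be kept and the rest reviewed before deletion. This approach only catches exact copies: a resized or re-compressed version of the same photograph would not be grouped with the original.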

At Johannesburg City Parks and Zoo on Zoo Lake Drive, a parallel issue emerged during the digitisation of its botanical records. Staff scanning decades of physical archive photographs discovered that a single digitisation contractor had delivered duplicate image batches on at least four separate occasions between March and November 2024, inflating what should have been a 12 000-image archive to over 21 000 files, about 75 percent more than planned, before a manual reconciliation exercise was ordered in January 2025.

Globally, the problem is well-documented. Research published by the Storage Networking Industry Association in 2023 found that duplicate and redundant data accounts for roughly 32 percent of all enterprise storage consumption worldwide. For South African municipalities operating on constrained IT budgets — Johannesburg's Information and Communications Technology department received an allocation of approximately R1.2 billion in the 2025/26 financial year — that figure represents a substantial portion of capacity that could be redirected to active service delivery infrastructure.

What Needs to Happen Next

Several practical remedies are already within reach. Perceptual hashing tools, software that derives a compact fingerprint from each image's visual content rather than from its filename or metadata, can flag both exact and near-duplicate files in large archives within hours. The Gauteng Department of e-Government has piloted such tools on a small subset of the provincial records system, though a city-wide rollout for Johannesburg's own departments has not yet been formally scheduled.
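To illustrate the idea behind perceptual hashing, here is a minimal Python sketch of one simple variant, the average hash. It is not the tool piloted by the province, which the reporting does not name. The sketch assumes each image has already been decoded, converted to grayscale and shrunk to a small grid such as 8x8 (a step that in practice needs an imaging library), and the distance threshold of 5 bits is an arbitrary illustrative choice:

```python
def average_hash(pixels: list[list[int]]) -> int:
    """Pack one bit per pixel: 1 if the pixel is brighter than the grid mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: list[list[int]], b: list[list[int]],
                      max_distance: int = 5) -> bool:
    """Treat two grids as near-duplicates if their hashes differ in few bits.

    `max_distance` is an assumed threshold; a real archive would need tuning.
    """
    return hamming_distance(average_hash(a), average_hash(b)) <= max_distance


# Hypothetical 8x8 grayscale grids standing in for downscaled photographs.
original = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]
brighter = [[value + 3 for value in row] for row in original]   # same scene, lighter
inverted = [[252 - value for value in row] for row in original]  # different picture

print(is_near_duplicate(original, brighter))  # True
print(is_near_duplicate(original, inverted))  # False
```

Because the fingerprint depends on the pattern of light and dark rather than on the exact bytes, a slightly brightened or re-compressed copy lands on the same or a nearby hash, while an unrelated image ends up many bits away.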

For residents submitting documents to city portals — whether for a rezoning application in Northcliff or a business licence renewal in the Fordsburg commercial district — the practical advice is straightforward: compress images before upload, use consistent filenames, and check your submission portal history before re-uploading a file you believe may not have processed. Duplicate submissions from the public side compound whatever duplication already exists internally.

The City of Johannesburg's Chief Information Officer's office has not publicly committed to a deduplication audit timeline. Without one, the numbers will keep growing — and so will the cost of doing nothing.

About this article

Published by The Daily Johannesburg

This article was produced by The Daily Johannesburg editorial desk and covers news in Johannesburg. See our editorial standards for how we use AI.