The City of Joburg, Johannesburg's metropolitan municipality, is in the middle of a quiet but consequential clean-up operation: removing hundreds of thousands of duplicate images that have accumulated across its digital asset management systems since at least 2009. The problem did not emerge overnight. It is the compounded result of rushed digitisation drives, incompatible software platforms, and years of under-investment in data governance.
Understanding why this matters requires a brief look at how the city's record-keeping evolved. Through the 2000s and into the early 2010s, municipal departments — from the Johannesburg Development Agency to the Pikitup waste-management entity — each ran their own image archives. Photographs of infrastructure, heritage sites, service delivery projects, and public events were captured and stored independently, with no centralised deduplication protocol in place. When the city eventually moved to consolidate these silos into a shared content management environment, the duplicates came with them.
Where the Problem Took Root
The consolidation push gathered pace around 2014 and 2015, when the City of Joburg began rolling out its Smart City initiative, which aimed to centralise data across departments. That process pulled image libraries from separate units — including those documenting projects in Soweto's Vilakazi Street precinct, the Sandton CBD streetscape upgrades along West Street, and the Rea Vaya BRT corridor running through Soweto and Braamfontein. Because metadata standards differed between departments, automated deduplication tools failed to flag images that were identical in content but different in file name, resolution, or timestamp.
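To illustrate why copies that differ in file name are easy to catch while copies that differ in resolution are not, here is a minimal sketch of exact-content matching. It is not the city's actual tooling, which the source does not describe. The file names and byte strings are hypothetical stand-ins for real image files.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Return a SHA-256 digest of the raw file bytes."""
    return hashlib.sha256(data).hexdigest()


def exact_duplicates(records):
    """Group file names whose bytes are identical.

    `records` is a list of (file_name, raw_bytes) pairs. File name and
    timestamp play no part in the comparison, only the bytes do.
    """
    groups = {}
    for name, data in records:
        groups.setdefault(content_digest(data), []).append(name)
    return [names for names in groups.values() if len(names) > 1]


# Hypothetical records: the byte strings stand in for encoded image data.
original = ("vilakazi_street_2014.jpg", bytes([10, 20, 30, 40] * 4))
renamed = ("IMG_0042.jpg", bytes([10, 20, 30, 40] * 4))       # same bytes, new name
resized = ("vilakazi_small.jpg", bytes([10, 20, 30, 40] * 2))  # stand-in for a lower-resolution copy

print(exact_duplicates([original, renamed, resized]))
```

The renamed copy is grouped with the original, but the resized copy is not, because any change to the encoded bytes produces a different digest. Catching that second kind of duplicate requires comparing visual content rather than bytes.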
The Johannesburg Heritage Foundation, which has collaborated with the city on photographic documentation of buildings in areas like Newtown and Fordsburg, flagged the duplication problem internally as far back as 2017, according to correspondence cited in a City of Joburg Digital Infrastructure Review document from 2022. The review noted that certain ward-level infrastructure photographs existed in as many as fourteen separate copies across the system. Storage costs for the city's data infrastructure climbed accordingly, a problem that became harder to ignore once the ANC-DA coalition government that took shape in Gauteng after the 2024 elections made municipal cost efficiency a stated governance objective.
There is a broader pattern here that Johannesburg shares with other large African municipalities. A 2023 report by the African Centre for City Studies, based in Cape Town, found that among twelve major sub-Saharan cities surveyed, municipal digital archives had average duplication rates of between 18 and 34 percent for photographic assets. Johannesburg's internal audit, completed in March 2025, placed the city's own duplication rate at approximately 27 percent of indexed image files, or roughly 340,000 duplicated assets out of a total archive of around 1.26 million records.
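As a quick consistency check on the audit figures quoted above, the short sketch below recomputes the rate from the two rounded counts. The variable names are illustrative, and the inputs are the article's approximate figures, not exact audit data.

```python
# Rounded figures as reported in the audit summary above.
total_records = 1_260_000
duplicated_assets = 340_000

duplication_rate = duplicated_assets / total_records
print(f"Duplication rate: {duplication_rate:.1%}")

# Range reported for the twelve surveyed cities (18 to 34 percent).
survey_low, survey_high = 0.18, 0.34
print("Within surveyed range:", survey_low <= duplication_rate <= survey_high)
```

The two counts are consistent with the stated rate of approximately 27 percent, which sits inside the range reported for the surveyed cities.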
The Fix, and What Comes Next
The remediation project, contracted to a local technology firm through a City of Joburg Supply Chain Management tender gazetted in November 2025, involves a phased deduplication sweep using perceptual hashing software. The technique derives a compact fingerprint from what an image looks like, so copies can be matched even when their file names, resolutions, or timestamps differ. Phase one, covering records linked to the Johannesburg Roads Agency and City Parks Johannesburg, was completed in April 2026. Phase two, which addresses archives related to community facilities from Alexandra township to Roodepoort, is scheduled for completion by October 2026.
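The source does not say which perceptual hashing algorithm the contractor uses. As a rough illustration of the general idea, the sketch below implements one simple variant, the average hash, on plain grids of grayscale values. The function names, the 8x8 fingerprint size, and the threshold of 5 differing bits are all assumptions made for this example. A real pipeline would first decode image files with an imaging library.

```python
def downsample(pixels, size=8):
    """Shrink a 2D grayscale grid to size x size by averaging blocks.

    Assumes the grid is at least `size` pixels in each dimension.
    """
    height, width = len(pixels), len(pixels[0])
    small = []
    for by in range(size):
        y0, y1 = by * height // size, (by + 1) * height // size
        row = []
        for bx in range(size):
            x0, x1 = bx * width // size, (bx + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def average_hash(pixels, size=8):
    """Return an integer fingerprint: one bit per cell, set if the cell is brighter than the mean."""
    flat = [value for row in downsample(pixels, size) for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def is_near_duplicate(pixels_a, pixels_b, threshold=5):
    """Treat two images as duplicates if their fingerprints differ in at most `threshold` bits."""
    return hamming_distance(average_hash(pixels_a), average_hash(pixels_b)) <= threshold


# Synthetic test images: a gradient, the same gradient at double resolution,
# and a visually different gradient.
base = [[x * 16 for x in range(16)] for y in range(16)]
upscaled = [[base[y // 2][x // 2] for x in range(32)] for y in range(32)]
other = [[y * 16 for x in range(16)] for y in range(16)]

print(is_near_duplicate(base, upscaled))
print(is_near_duplicate(base, other))
```

In this sketch the higher-resolution copy produces the same fingerprint as the original, while the visually different image does not. The threshold trades missed duplicates against false matches and would need tuning on real archive data.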
The practical stakes extend beyond storage costs. Duplicate images have caused delays in legal processes, including land use adjudications before the City of Joburg's Development Planning tribunal, where multiple versions of the same site photograph were submitted as separate pieces of evidence. Councillors on the City's Infrastructure and Environment Portfolio Committee raised the issue in a public sitting in February 2026, noting that deduplication was a prerequisite for any credible move toward the AI-assisted urban planning tools the city has been piloting in the Sandton Growth Node area along Rivonia Road.
For residents and ward committees tracking service delivery, the clean-up has direct consequences, particularly in areas like Diepsloot and Orange Farm, where photographic documentation of infrastructure projects directly informs community oversight. Once phase two concludes in late 2026, the city says it will publish a consolidated, publicly searchable version of its image archive through the Joburg Open Data Portal. That would be the first time a unified, deduplicated photographic record of municipal activity has been accessible to the public in a single place.