Johannesburg businesses and public-sector offices are collectively wasting millions of rands annually on cloud and on-premises storage bloated by duplicate image files, and most organisations have no active programme to address it. That is the central finding emerging from audits conducted this year by several IT service providers operating out of Sandton's Fredman Drive technology corridor and the Rosebank office district.
The problem sounds mundane. It is not. Since late 2024, reduced load shedding has pushed more Joburg organisations toward cloud-first infrastructure, and the server consolidation that followed has exposed just how badly image libraries have grown out of control. Every time a power cut interrupted an upload, workers simply re-sent the file. Every campaign refresh at a marketing agency on Oxford Road meant downloading, renaming and re-uploading the same product photograph three or four times. Over five years, those habits compound.
What the Data Actually Shows
Industry benchmarks from global storage analytics firms suggest duplicate files typically account for between 20 and 30 percent of total unstructured data in any medium-sized organisation. For image-heavy sectors — retail, property, media — that figure climbs above 40 percent. Apply even the conservative end of that range to Joburg's commercial property sector alone, where estate agencies along Jan Smuts Avenue in Parktown North maintain enormous photo archives of listings, and the redundant storage runs into hundreds of gigabytes per firm.
Cloud storage pricing in South Africa has fallen but remains meaningful. Microsoft Azure and Amazon Web Services both run South African cloud regions, and their local-region pricing currently sits in a range that means a single terabyte of hot-tier cloud storage costs a Joburg business roughly R400 to R600 per month, depending on access patterns and contract terms. For a mid-sized agency holding 10 terabytes of images, with 30 percent duplicates, eliminating redundant files could free three terabytes and save between R1 200 and R1 800 every month, or R14 400 to R21 600 a year. Those are small savings per firm, but across the Sandton CBD's estimated 6 000 registered businesses the aggregate runs into tens of millions of rands a year, even if only a portion of those firms hold image libraries on that scale.
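The arithmetic behind those figures can be checked in a few lines. The sketch below uses only the numbers quoted above (R400 to R600 per terabyte per month, a 10-terabyte library, 30 percent duplicates, 6 000 businesses). The `share_of_firms` parameter is a hypothetical addition, not a figure from the audits: it scales the aggregate for readers who assume only some firms hold libraries of this size. Setting it to 1.0 gives an upper bound, not a forecast.

```python
# Figures quoted in the article; share_of_firms is a hypothetical assumption.
PRICE_PER_TB_MONTH = (400, 600)  # rand per terabyte per month, hot tier
LIBRARY_TB = 10                  # size of the example agency's image library
DUPLICATE_PERCENT = 30           # share of the library that is duplicated
FIRMS = 6_000                    # estimated registered businesses, Sandton CBD


def monthly_saving(library_tb, duplicate_percent, price_per_tb):
    """Rand saved per month by removing the duplicated share of a library."""
    return library_tb * duplicate_percent * price_per_tb / 100


def annual_aggregate(per_firm_monthly, firms, share_of_firms=1.0):
    """Rand saved per year if share_of_firms of all firms save this much."""
    return per_firm_monthly * 12 * firms * share_of_firms


low, high = (
    monthly_saving(LIBRARY_TB, DUPLICATE_PERCENT, price)
    for price in PRICE_PER_TB_MONTH
)
print(f"Per firm: R{low:,.0f} to R{high:,.0f} per month")

for share in (1.0, 0.5):
    total_low = annual_aggregate(low, FIRMS, share)
    total_high = annual_aggregate(high, FIRMS, share)
    print(f"{share:.0%} of firms: R{total_low:,.0f} to R{total_high:,.0f} per year")
```

Swapping in a firm's own library size and invoice rate gives a quick estimate of what a clean-up is worth before any scan is run.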
The City of Johannesburg's own digital infrastructure is not immune. The Pikitup waste management entity and the Joburg Water billing system both underwent partial digital audits in 2025 as part of the Gauteng ANC-DA coalition's broader government efficiency drive. Neither has publicly disclosed results of image-specific storage reviews, and neither was available to confirm figures for this article. But IT consultants who work on municipal contracts — speaking in general terms about the sector rather than named clients — say duplicate image accumulation inside municipal content management systems is a recognised but low-priority problem.
What Organisations Can Do, and When
Automated deduplication tools have existed for years, but adoption in Joburg's small and medium business community has been slow. Tools such as Rclone and dupeGuru, along with commercial equivalents offered through local resellers on Commissioner Street in the Johannesburg CBD, can scan and flag duplicates without deleting them automatically. That safeguard matters when misidentifying a near-duplicate image could mean losing a legally important document scan.
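For readers curious what "scan and flag" means in practice, the following is a minimal sketch of one common approach, not a description of how Rclone or dupeGuru work internally. It assumes duplicates are byte-for-byte identical copies, groups image files by size, and then confirms matches with a SHA-256 hash. Near-duplicates such as resized or re-compressed photographs are not detected. The function names and the list of file extensions are illustrative choices, and nothing is ever deleted.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Illustrative list of extensions; adjust for the library being scanned.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff", ".webp"}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large images are not loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_images(root):
    """Return groups of image paths under root with identical contents."""
    # Cheap first pass: files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            by_size[path.stat().st_size].append(path)

    # Second pass: hash only the files that share a size with another file.
    by_hash = defaultdict(list)
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        for path in same_size:
            by_hash[sha256_of(path)].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]


def reclaimable_bytes(groups):
    """Bytes freed if one copy from each duplicate group were kept."""
    return sum(group[0].stat().st_size * (len(group) - 1) for group in groups)
```

Calling `find_duplicate_images` on a folder returns the flagged groups for a person to review, and `reclaimable_bytes` turns them into a storage figure that can be set against the monthly invoice. Leaving deletion as a manual step mirrors the safeguard described above.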
The Joburg Centre for Software Engineering, based at the University of the Witwatersrand's East Campus in Braamfontein, has been running a digital literacy programme for township-based small businesses in Soweto and Alexandra since 2023. Participants in that programme report that storage hygiene, including duplicate file management, is now part of the curriculum, though image-specific deduplication training was added only in the first quarter of 2026.
Practically, any Johannesburg business paying monthly cloud storage invoices should pull a storage report before the end of July 2026, when most annual IT budget reviews begin for financial years ending in February. A free scan with an open-source tool takes under two hours for libraries below 500 gigabytes. The return, in rand terms and in faster system performance during the remaining load-shedding risk window, is immediate. The longer organisations wait, the deeper the redundancy goes, and the higher the bill climbs with every new campaign, every re-sent file, every interrupted upload.