The digital revolution’s latest frontier, artificial intelligence (AI), is creating a mounting environmental problem that few are talking about. While much attention has focused on the energy demands of AI processing, an equally concerning issue is hiding in plain sight: the exponential accumulation of AI-related data. As companies worldwide embrace AI, many are further burdening data centers with unnecessary storage requirements.
The scale of the problem is increasing daily. New projections indicate a 160% spike in data center power demand by 2030, with associated social costs, including resource depletion, ecosystem damage and detrimental effects on public health, potentially reaching $149 billion. The massive environmental toll isn’t just from running AI computing loads; it also stems from the energy-intensive storage of countless data fragments that many companies are reluctant to delete.
Just as the Industrial Revolution left us grappling with physical waste management, the AI Revolution is creating a crisis of digital waste. Companies have adopted a “save everything” approach to data storage, the digital equivalent of never taking out the trash.
This digital hoarding takes many forms. Consider a retail company building a tool to predict shopping patterns. Rather than storing only the final, refined dataset, such a company typically keeps multiple versions of the same training data, along with years of raw surveillance footage that serve no ongoing purpose. Similarly, a chatbot service might store complete transcripts of millions of routine customer interactions when, in practice, only unique cases are truly valuable for improving the service.
During AI model development, companies often save multiple complete copies of model states, each of which can be hundreds of gigabytes, even though only the final version and perhaps a few key checkpoints are typically needed for practical use. This unnecessary redundancy, multiplied across thousands of companies, contributes significantly to data center energy consumption.
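As a rough illustration of keeping only the final version and a few key checkpoints, the sketch below plans which saved model states could be deleted. The `Checkpoint` record, its `is_key` flag and the sizes are hypothetical names and values chosen for this example, not taken from any particular training framework, and the function only proposes deletions rather than performing them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    # Hypothetical record describing one saved model state.
    step: int             # training step at which the state was saved
    size_gb: float        # size on disk, in gigabytes
    is_key: bool = False  # manually flagged as worth keeping


def plan_pruning(checkpoints):
    """Split checkpoints into (keep, delete) lists.

    Keeps the checkpoint with the highest step (the final version) and any
    checkpoint flagged as key; everything else is proposed for deletion.
    """
    if not checkpoints:
        return [], []
    final = max(checkpoints, key=lambda c: c.step)
    keep = [c for c in checkpoints if c is final or c.is_key]
    delete = [c for c in checkpoints if c is not final and not c.is_key]
    return keep, delete


def reclaimed_gb(delete):
    """Total storage, in gigabytes, freed by deleting the given checkpoints."""
    return sum(c.size_gb for c in delete)


# Example with made-up sizes: four saved states, one flagged as key.
saved = [
    Checkpoint(step=1000, size_gb=200.0),
    Checkpoint(step=2000, size_gb=200.0, is_key=True),
    Checkpoint(step=3000, size_gb=200.0),
    Checkpoint(step=4000, size_gb=200.0),
]
keep, delete = plan_pruning(saved)
```

Separating the plan from the actual deletion leaves room for a human review step before anything is removed.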
Environmental managers and chief sustainability officers can take charge of addressing this crisis. A first step is working with IT and compliance teams to assess an organization’s data footprint through systematic reviews of data storage practices, treating digital waste with the same seriousness as physical waste streams.
Organizations often accumulate what is known as “dark data”: information that was collected but is no longer, or was never, used to generate insights or drive decisions, much like forgotten items in a storage unit. AI systems are particularly prone to generating dark data due to their iterative nature: training datasets become outdated as models improve, test runs produce temporary files that never get deleted, and development teams save the same information across several systems. What starts as valuable training data can quickly become digital waste, silently consuming energy in data centers while providing no business value.
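One way a review might surface the same information saved in several places is to group files by a hash of their contents. The following is a minimal sketch under that assumption; the function names are illustrative, and a real audit would also need to cover databases, object stores and near-duplicates that differ by only a few bytes.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return lists of paths under `root` whose contents are byte-identical."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            groups[file_digest(path)].append(path)
    # Only groups with more than one file represent redundant copies.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Each returned group is a candidate for keeping one copy and removing the rest, subject to whatever retention rules apply.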
While regulatory bodies work to establish clear guidelines around AI data retention, organizations can take immediate steps to minimize their digital waste. Smart data minimization starts with asking key questions: Is this data still relevant for our AI systems? Do we need to keep multiple versions of the same information? Can we sample or compress data without losing value? Companies can then implement practical solutions like setting clear expiration dates for AI training data, regularly archiving or deleting outdated model versions, and designing AI systems that only collect and retain essential information from the start. By treating data storage as an environmental concern rather than just an IT issue, companies can significantly reduce their AI systems' carbon footprint while maintaining operational efficiency.
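Setting clear expiration dates can be as simple as recording a retention period alongside each dataset and checking it on a schedule. The sketch below assumes a hypothetical `DatasetRecord` with a creation date and a retention period in days; the dataset names and periods are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DatasetRecord:
    # Hypothetical catalog entry for one stored dataset.
    name: str
    created: date
    retention_days: int

    @property
    def expires_on(self):
        """Date on which the dataset reaches the end of its retention period."""
        return self.created + timedelta(days=self.retention_days)


def expired(records, today):
    """Return the records whose retention period has ended as of `today`."""
    return [r for r in records if r.expires_on <= today]


# Example with invented datasets and retention periods.
catalog = [
    DatasetRecord("raw_transcripts_2024", date(2024, 1, 1), retention_days=365),
    DatasetRecord("refined_training_set", date(2024, 12, 1), retention_days=90),
]
due = expired(catalog, today=date(2025, 1, 1))
```

Records returned by `expired` would then be reviewed for archiving or deletion, depending on the organization's compliance requirements.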
The environmental community has a vital role to play in shaping how organizations approach AI data management. By treating digital waste with the same seriousness as physical waste, we can help prevent a climate crisis before it becomes irreversible.
In the meantime, organizations must proactively manage their digital footprint through robust data classification systems and sustainable data storage practices. The convergence of AI technology and environmental stewardship presents an opportunity to ensure technological progress aligns with conservation principles. The environmental sector can lead this transformation now, before the mounting data crisis compromises both our digital ecosystem and our planet’s health.
Soniya Bopache, vice president and general manager of data compliance and governance at Arctera, leads the vision, strategy, and delivery of its data compliance portfolio and has extensive experience in cloud migration, cloud deliveries, and managing cloud-based hosted offerings. She earned a master's degree in software engineering from the Birla Institute of Technology and Science in Pilani, India.