UC Berkeley Press Release
Amount of new information doubled in last three years, UC Berkeley study finds
BERKELEY – If you feel like you're experiencing information overload, a team of University of California, Berkeley, researchers has a good idea why.
Worldwide information production increased by about 30 percent each year between 1999 and 2002, according to the team led by professors Peter Lyman and Hal Varian of the School of Information Management and Systems.
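The headline's "doubled" figure follows directly from that annual rate: 30 percent growth compounded over three years is roughly a doubling. A minimal back-of-the-envelope check (the variable names below are illustrative, not from the report):

```python
# Compounding check: ~30% annual growth over the 1999-2002 period.
annual_growth = 0.30               # ~30% per year, per the study
years = 3                          # 1999 through 2002

factor = (1 + annual_growth) ** years
print(f"Three-year growth factor: {factor:.2f}x")  # ~2.20x, i.e. about a doubling
```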
"All of a sudden, almost every aspect of life around the world is being recorded and stored in some information format," said Lyman. "That's a real change in our human ecology."
The researchers' report, which will be presented today (Tuesday, Oct. 28) at an information storage industry conference in Orlando, Fla., is supported by Microsoft Research, Intel, HP and EMC.
According to the researchers, the amount of new information stored on paper, film, optical and magnetic media has doubled in the last three years. New information produced in those forms during 2002 was equal in size to half a million new libraries, each containing a digitized version of the print collections of the entire Library of Congress, they added.
The researchers also report that electronic channels - such as TV, radio, the telephone and the Internet - carried three and a half times more new information in 2002 than was recorded on storage media.
It's no surprise that the development of effective, reliable and cost-efficient strategies to store data is of increasing interest, and not just for commercial companies or for students downloading music.
Such storage is of growing importance to government agencies and institutions ranging from the Library of Congress, the Department of Homeland Security and the National Archives and Records Administration to the National Weather Service and NASA officials planning a mission to Mars.
"This study shows what an enormous challenge we and the rest of the information technology industry face in organizing, summarizing and presenting the vast amount of information mankind is accumulating," said Jim Gray, a Microsoft Bay Area Research Group distinguished engineer.
At Intel's storage components division, general manager Mike Wall agreed: "This calls for technology that can access and manage blocks of data the size of the Library of Congress to and from devices ranging from personal computers to PDAs anytime, anywhere, without losing as much as a bit."
Roy Sanford, EMC's vice president of content-addressed storage, said: "The study highlights the challenge organizations face in managing all their information according to its value at every stage of its life - from creation and protection to archiving and disposal. It calls for application-level integration directly into the storage infrastructure to allow for a policy-based, proactive management of the relentless growth of structured and unstructured data."
And Jeff Jenkins, director of marketing for storage, networking, and infrastructure with HP Industry Standard Servers, noted that today's enterprises face increasing complexity in managing and storing the explosive data volumes generated by new financial legislation, disaster recovery implementations, real-time business communications and the Internet economy.
Because most new information is stored in digital form, and older formats are either giving way to digital media or being digitally archived, the researchers chose the terabyte as their standard gauge. A terabyte is a unit of computer data storage equal to a million megabytes, or roughly the text content of a million books. Because the quantities involved are so massive, however, the team also reports figures in exabytes; an exabyte is a million terabytes.
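As a rough illustration of how these yardsticks relate, the sketch below works the release's library comparison through in these units. The roughly 10-terabyte size assumed here for one Library of Congress print collection is consistent with the study's own comparison but is not a figure quoted in this release:

```python
# Decimal storage units, as used in the report.
MEGABYTE = 10**6               # bytes
TERABYTE = 10**12              # a million megabytes
EXABYTE  = 10**18              # a million terabytes

# Assumption: one LoC print collection digitizes to roughly 10 terabytes.
loc_print_collection = 10 * TERABYTE

# "Half a million new libraries" worth of new stored information in 2002.
new_stored_2002 = 500_000 * loc_print_collection

print(new_stored_2002 / EXABYTE, "exabytes")  # 5.0 exabytes of new stored information
```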
"Remember, it's not knowledge, just data," cautioned Lyman. "It takes thoughtful people using smart technologies to figure out how to make sense of all this information."
Among key findings in the report: