
Rapid massive data validation


IBM scientists in Zurich, Dr. Costas Bekas (left) and Dr. Alessandro Curioni, stand next to a Blue Gene/P system. Photo by Michael Lowry.

Every aspect of technology continues to get better.

On February 25, 2010, IBM Research unveiled a method, based on a mathematical algorithm, that reduces the computational complexity, cost, and energy usage of analyzing the quality of massive amounts of data by two orders of magnitude.

IBM researchers used the fourth most powerful supercomputer in the world -- a Blue Gene/P system at the Forschungszentrum Jülich in Germany -- to validate nine terabytes of data in less than twenty minutes without compromising accuracy. Ordinarily this would take more than a day. The process also used just one percent of the energy that would typically be required.
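As a rough check on these figures, the short sketch below works out the implied data rate and speedup. It assumes decimal terabytes (10^12 bytes) and treats "less than twenty minutes" and "more than a day" as exactly twenty minutes and exactly one day, so both results are lower bounds rather than measured values. The function names are illustrative only.

```python
# Back-of-the-envelope arithmetic from the figures quoted in the article.
# Assumption: 1 TB = 10**12 bytes (decimal terabytes).
DATA_BYTES = 9 * 10**12        # nine terabytes
NEW_SECONDS = 20 * 60          # "less than twenty minutes" (upper bound on time)
OLD_SECONDS = 24 * 60 * 60     # "more than a day" (lower bound on time)


def throughput_gb_per_s(num_bytes, seconds):
    """Average data rate in gigabytes (10**9 bytes) per second."""
    return num_bytes / seconds / 10**9


def speedup(old_seconds, new_seconds):
    """How many times faster the new run is than the old one."""
    return old_seconds / new_seconds


print(f"{throughput_gb_per_s(DATA_BYTES, NEW_SECONDS):.1f} GB/s")  # 7.5 GB/s
print(f"{speedup(OLD_SECONDS, NEW_SECONDS):.0f}x")                 # 72x
```

Because the new run took less than twenty minutes and the conventional run more than a day, the actual data rate was at least 7.5 GB/s and the actual speedup at least 72 times.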

One of the most computationally intensive yet critical tasks in analytics is measuring the quality of the data, which shows how reliable the data used by a model, and the data the model generates, actually is. This method could pave the way for more powerful, complex, and accurate models with greater predictive power.

The amount of digital data is growing at an enormous rate, driven in part by the increasingly ubiquitous presence of sensors, actuators, RFID tags, and GPS tracking devices. These miniature computers measure everything from ocean water pollution to traffic patterns to food supply chains.

With all of this data come new challenges: organizations are struggling not only to extract the relevant information from it, but also to make sure it is accurate. IBM researchers are pursuing leading-edge research and actively engaging in client projects to extend the ability of analytics to predict outcomes and to improve the speed and quality of business decisions.

The new method demonstrated by the IBM scientists reduces computational complexity and scales very well, up to the full size of the JUGENE supercomputer at the Forschungszentrum Jülich, which comprises 72 racks of IBM's Blue Gene/P system with 294,912 processors and a peak performance of one petaflop.
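To put the machine's size in perspective, the sketch below derives per-rack and per-processor figures from the numbers above. It assumes that "one petaflop" means 10^15 floating-point operations per second and that processors are spread evenly across racks. The variable names are illustrative.

```python
# Derived figures for the system described in the article.
# Assumptions: 1 petaflop = 1e15 FLOP/s; processors are evenly divided across racks.
RACKS = 72
PROCESSORS = 294_912
PEAK_FLOPS = 1e15

processors_per_rack = PROCESSORS // RACKS
gflops_per_processor = PEAK_FLOPS / PROCESSORS / 1e9

print(processors_per_rack)              # 4096
print(f"{gflops_per_processor:.2f}")    # 3.39
```

Under these assumptions, each rack holds 4,096 processors and each processor contributes roughly 3.4 gigaflops of peak performance.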
