With the proliferation of big data hardware and software solutions in industry and research, there is a pressing, unmet need for benchmarks that can objectively evaluate alternative technologies and approaches for solving a given big data problem.
There is a strong tradition of industry standards for computer system and database benchmarks, for example from standards groups such as the Transaction Processing Performance Council (TPC) and the Standard Performance Evaluation Corporation (SPEC), and from supercomputing community activities such as the TOP500.
Without benchmarking, in a rapidly developing technological field, it is difficult to assess the quality and the utility of the new solutions. Are they good enough to meet the fast-changing requirements of the Big Data challenges? Are they scalable and robust? Can they handle, say, high velocity stream data from finance? Information-rich unstructured data such as text or video from multimedia? Complex and large network data from telecommunications?
Traditional large-scale industry standard benchmarks only test systems using database sizes up to 100 Terabytes. Benchmarks are thus needed to model real-world processing pipelines that incorporate realistic features of big data applications.
Benchmarks can be defined at multiple levels — from micro-benchmarks, for low-level system operations, to application-level benchmarks for testing scenarios based on end-user applications, whose “end-to-end” performance is more directly relevant to end users in a specific application domain.
Big data benchmarks can provide objective measures quantifying performance, scalability, elasticity, and price/performance of systems designed to support big data applications, to facilitate evaluation of alternative solutions. They can also characterize the new feature sets, enormous data sizes, and shifting loads of big data applications, and the large-scale and evolving system configurations and heterogeneous technologies of big data platforms.
The Seventh International Workshop on Big Data Benchmarking (WBDB 2015) will be held at the India Habitat Centre, New Delhi, on December 14-15, 2015. WBDB 2015 will bring together experts and outstanding researchers from international and Indian academia, industry and government administration. Professor Michael J. Franklin, Thomas M. Siebel Professor and Chair of the Computer Science Division at the University of California, Berkeley, USA, and Director of the Berkeley Algorithms, Machines, and People Laboratory (the renowned “AMPLab” – the birthplace of Spark Streaming, SparkR, etc.), will deliver the first keynote address at WBDB 2015.
The workshop is jointly organized by the San Diego Supercomputer Center, University of California San Diego, the Indian Statistical Institute and the Public Health Foundation of India. Besides big data benchmarking, the workshop will also focus on big data analytics in health systems, air quality management and agriculture. Papers and posters on cutting-edge themes will be presented by researchers from institutes nationwide, such as the Indian Institute of Science, the Indian Institutes of Technology and the Indian Institute of Public Health, as well as from institutions abroad.
For registration and further details, visit the workshop website: http://clds.sdsc.edu/wbdb2015.in