"The Handbook of Massive Data Sets" comprises articles written by experts on selected topics, each dealing with a major aspect of massive data sets. It contains chapters on information retrieval, both on the internet and in the traditional sense, as well as web crawlers, massive graphs, string processing, data compression, clustering methods, wavelets, optimization, external memory algorithms and data structures, the US national cluster project, high performance computing, data warehouses, data cubes, semi-structured data, data squashing, data quality, billing in the large, and fraud detection, together with data processing in astrophysics, air pollution, biomolecular data, earth observation, and the environment. The proliferation of massive data sets brings with it a series of special computational challenges. This "data avalanche" arises in a wide range of scientific and commercial applications.
Preface.- Part I: Internet and the World Wide Web.- Part II: Massive Graphs.- Part III: String Processing and Data Compression.- Part IV: External Memory Algorithms and Data Structures.- Part V: Optimization.- Part VI: Data Management.- Part VII: Architecture Issues.- Part VIII: Applications.- Index.