Open Access

The unique strengths and storage access characteristics of discard-based search

  • Mahadev Satyanarayanan1,
  • Rahul Sukthankar2,
  • Lily Mummert2,
  • Adam Goode1,
  • Jan Harkes1 and
  • Steve Schlosser3
Journal of Internet Services and Applications 2010, 1:1

https://doi.org/10.1007/s13174-010-0001-z

Received: 26 January 2010

Accepted: 2 February 2010

Published: 24 February 2010

Abstract

Discard-based search is a new approach to searching the content of complex, unlabeled, nonindexed data such as digital photographs, medical images, and real-time surveillance data. The essence of this approach is query-specific content-based computation, pipelined with human cognition. In this approach, query-specific parallel computation shrinks a search task down to human scale, thus allowing the expertise, judgment, and intuition of an expert to be brought to bear on the specificity and selectivity of the search. In this paper, we report on the lessons learned in the Diamond project from applying discard-based search to a variety of applications in the health sciences. From the viewpoint of a user, discard-based search offers unique strengths. From the viewpoint of server hardware and software, it offers unique opportunities for optimization that contradict long-established tenets of storage design. Together, these distinctive end-to-end attributes herald a new genre of Internet applications.

Keywords

Data-intensive computing, Non-text search technology, Medical image processing, Interactive search, Computer vision, Pattern recognition, Distributed systems, ImageJ, MATLAB, Parallel processing, Human-in-the-loop, Diamond, OpenDiamond, Storage systems, I/O workloads, RAID