Although sepsis was described more than 2,000 years ago, clinicians still struggle to define it, there is no “gold standard,” and multiple competing approaches and terms exist. Challenges include the ever-changing knowledge base that informs our understanding of sepsis, competing views on which aspects of any potential definition are most important, and the tendency of most potential criteria to be distributed in at-risk populations in such a way as to hinder separation into discrete sets of patients. We propose that the development and evaluation of any definition or diagnostic criteria should follow four steps: 1) define the epistemologic underpinning, 2) agree on all relevant terms used to frame the exercise, 3) state the intended purpose for any proposed set of criteria, and 4) adopt a scientific approach to assess their usefulness with regard to the intended purpose. Usefulness can be measured across six domains: 1) reliability (stability of criteria during retesting, between raters, over time, and across settings), 2) content validity (similar to face validity), 3) construct validity (whether criteria measure what they purport to measure), 4) criterion validity (how new criteria fare compared to standards), 5) measurement burden (cost, safety, and complexity), and 6) timeliness (whether criteria are available concurrent with care decisions). The relative importance of these domains of usefulness depends on the intended purpose, of which there are four broad categories: 1) clinical care, 2) research, 3) surveillance, and 4) quality improvement and audit. This proposed methodologic framework is intended to aid understanding of the strengths and weaknesses of different approaches, provide a mechanism for explaining differences in epidemiologic estimates generated by different approaches, and guide the development of future definitions and diagnostic criteria.