What the Epworth Sleepiness Scale is and how to use it

The ESS is a self-administered questionnaire with 8 questions. It provides a measure of a person’s general level of daytime sleepiness, or their average sleep propensity in daily life. It has become the world standard method for making this assessment.

The ESS asks people to rate, on a 4-point scale (0 – 3), their usual chances of dozing off or falling asleep in 8 different situations or activities that most people engage in as part of their daily lives, although not necessarily every day. It does not ask people how often they doze off in each situation. That would depend very much on how often they happened to be in those situations. Rather it asks what the chances are that they would doze off whenever they were in each situation. This requires a mental judgment which, it seems, most people are able to make in a meaningful way. The total ESS score is the sum of 8 item-scores and can range between 0 and 24. The higher the score, the higher the person’s level of daytime sleepiness. Most people can answer the ESS, without assistance, in 2 or 3 minutes.

The total ESS score provides an estimate of a general characteristic of each person – their average level of sleepiness in daily life. This can be influenced by many factors, and the ESS does not distinguish which factor(s) have caused any particular level of daytime sleepiness. It is not a diagnostic tool in itself, but is a very useful tool for measuring one important aspect of a person’s sleep-wake health status.

There are other subjective and objective methods for measuring sleepiness, but the ESS has several advantages, not the least of which is the fact that it is very cheap to use and very simple to administer to large numbers of people.

There is more information below about the conceptual framework within which the ESS was developed, its validation, reliability and item-structure, how to score it, and its reference range of normal scores.

Conceptual Basis of the Epworth Sleepiness Scale (ESS)

This is summarized here, but has been described in detail elsewhere (see References). Whether a person is awake or asleep at any particular time depends not just on the time of day or how long they have been awake, but also on many other variables, including what that person is doing, their posture and activity, and their environment at the time.

Simply to lie down rather than stand up increases the likelihood of falling asleep, i.e. increases one’s sleep propensity at the time. Thus, lying down can be considered a more soporific activity than standing up. Sleep propensity can be measured only within the context of the subject’s situation and activity, both physical and mental, at the time. A person’s usual sleep propensity when in the same situation repeatedly can be called their situational sleep propensity (SSP), e.g. when sitting and watching TV. This is partly situation-specific for each subject.

The ESS asks the respondent to rate on a 4-point scale (0-3) his or her chances of dozing off in each of 8 different situations that differ in their somnificity, or sleep-inducing characteristics (4). Responses to the questionnaire depend on retrospective reports of dozing behaviour, mostly during activities while sitting, in the course of daily life in “recent times”. These may not be very accurate assessments, but they are reliable in a test-retest sense over periods of months (4,5). Most can be reported reliably and independently by a spouse or partner who would be likely to observe the dozing behaviour (eyes closing, head tilting forward, then up again upon arousal).

When the 8 ESS item-scores representing different situational sleep propensities are added together they give a total ESS score which is a measure of the subject’s average sleep propensity (ASP) in those 8 situations.

Total ESS scores can vary between zero and 24 in different subjects. The ASP is not synonymous with fatigue or tiredness, as reported in some other scales such as the Stanford Sleepiness Scale. Subjects with a moderately high ASP do not necessarily fall asleep during the day if they avoid soporific situations by keeping physically and mentally active, e.g. by not sitting down. By contrast, other subjects who, for various reasons, often lie down and consequently doze during the day, may not have a high ASP. The ESS does not assess how much sleep the subject has during the day. Nor is the ESS a measure of “sleep debt” as some may claim. While ESS scores are related to the usual duration of sleep at night and increase with relative sleep deprivation, they are not a useful measure of hours of “sleep debt”.

The Questions used in the ESS

The particular questions included in the ESS were chosen on a priori grounds to vary in what Johns has since called their somnificity. We are much more likely to doze off when engaged in activities with a high somnificity (such as Item-5, ‘lying down to rest in the afternoon when circumstances permit’) than during other activities with a lower somnificity (such as Item-6, ‘sitting and talking to someone’). The higher a person’s average sleep propensity, the higher their chances of dozing off in situations with a relatively low somnificity, and the higher their ESS scores will be.

It was important during the development of the ESS that only those activities be included that most people have experienced in their daily lives, although not necessarily on a daily basis. To ask how likely someone was to fall asleep at the wheel while driving may yield a very useful piece of information by itself. However, a significant minority of people in the general community do not have a driver’s license and do not drive. The ESS items 4 and 8 must therefore allow for some respondents to be passengers rather than drivers of a vehicle. This has caused some confusion among respondents that cannot easily be avoided.

Validity and Reliability of ESS Scores

The Multiple Sleep Latency Test (MSLT) has been regarded by some as the gold standard against which other measurements of “sleepiness” should be compared (7,8). ESS scores in different people are related significantly, but not very closely, to their mean sleep latency in the MSLT (e.g. rho = -0.42, n = 44, p < 0.01) (4,7). However, within the conceptual framework of the ESS, the MSLT measures only one situational sleep propensity, albeit more accurately and objectively than item-scores of the ESS can. A subject’s sleep propensity in any one situation is not always closely related to that in a different situation (4,5). In fact, the ESS has been shown to be more accurate than the MSLT, and about as accurate as the Maintenance of Wakefulness Test (MWT), in distinguishing the sleepiness of narcoleptics from that of normal subjects (9). Currently we do not have an objective gold standard method for measuring a subject’s ASP in daily life, which the ESS is believed to do, based on subjective reports.

Subjective reports of any kind can be subject to bias and inaccuracy. That possibility must be borne in mind whenever the ESS is used, particularly if the answers provided are likely to have legal or other implications. However, that does not mean that all subjective reports are inaccurate or invalid.

There is good evidence for the validity of total ESS scores as a measure of ASP. ESS scores differ between normal subjects and patients with obstructive sleep apnea, a disorder that is known to increase “sleepiness” (1,2). The higher-than-normal ESS scores of such patients return to normal after successful treatment of their disorder with nasal continuous positive airway pressure (CPAP) when they sleep (2,10,12). The severity of obstructive sleep apnea, defined either by the frequency of apneas and hypopneas or by the level of arterial oxygen desaturation during sleep, has been correlated significantly with ESS scores in some, but not all, investigations. The same is also true for the “sleepiness” of such patients when measured by the MSLT. ESS scores alone do not diagnose the nature of any sleep disorder.

Total ESS scores are reliable in a test-retest sense over a period of months (rho = 0.82, n = 87, p < 0.001) (6). There is a high level of internal consistency within the ESS, as assessed by Cronbach’s alpha statistic (alpha = 0.74 – 0.88 in 4 different groups of subjects). Factor analysis performed on ESS item-scores for separate groups of adult subjects has usually revealed only one factor for each (4,5,6), but exploratory factor analysis in some groups has shown more than one factor.
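For readers who want to check internal consistency in their own data, the following is a minimal Python sketch of the standard Cronbach's alpha formula. It is not taken from the ESS publications. The function name `cronbach_alpha` and the example responses are hypothetical and are used only for illustration.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(responses: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a table of item-scores.

    responses[r][i] is respondent r's score on item i. All rows must
    have the same number of items (8 for the ESS), there must be at
    least 2 respondents, and total scores must not all be identical.
    """
    k = len(responses[0])                      # number of items
    columns = list(zip(*responses))            # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical item-scores for 4 respondents (illustration only, not real data).
example = [
    [0, 1, 0, 1, 2, 0, 1, 0],
    [1, 1, 0, 2, 3, 0, 1, 1],
    [2, 2, 1, 2, 3, 1, 2, 1],
    [3, 2, 1, 3, 3, 1, 2, 2],
]
print(round(cronbach_alpha(example), 2))
```

Values closer to 1 indicate that the items vary together across respondents. A real analysis would need a much larger sample than this toy example.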

The ESS is not suitable for measuring rapid changes in sleep propensity over periods of hours, e.g. to demonstrate the sedative effect of a single dose of a drug or to reveal a circadian rhythm of “sleepiness”. By contrast, the MSLT or the MWT is able to make such comparisons within the context of the situational sleep propensity that they measure.

The Reference Range of Normal ESS Scores

Data from Australia show that “normal” adults (N = 72) who do not have evidence of a chronic sleep disorder (including snoring) have a mean ESS score of 4.6 (95% confidence interval 3.9 – 5.3) with a standard deviation of 2.8 and a range from zero to 10. The normal range defined by the 2.5 and 97.5 percentiles is also zero to 10 (13). This is different from the results first published in 1991, in which the normal range was reported as 2-10 for a small group of subjects (1). Nevertheless, many people with a variety of sleep disorders have normal ESS scores. Very similar results have been reported from the United Kingdom (mean = 4.5 ± 3.3, n = 188) (11) and from Italy (mean = 4.4 ± 2.8, n = 54) (12). However, it is not yet clear whether the ESS scores of normal subjects in other cultures are the same. ESS scores usually do not differ significantly between normal men and women (13), nor do they change much with age. About 10 – 20 percent of the general population have ESS scores > 10 (i.e. 11 +).

Providing Information to Respondents about “normal” and “abnormal” ESS scores

The ESS scores that represent the normal range and degrees of excessive daytime sleepiness should not be reproduced with the questionnaire given to subjects. That information could well influence their answers. The ESS was never intended to be used as a self-diagnosing tool. Respondents should not be given an interpretation of their ESS scores until after completing the questionnaire.

How to Score the ESS

Scoring the ESS is very simple. The total ESS scores are the only numbers that most investigations will require. The ESS score is the sum of 8 item-scores. Most people can answer the ESS without difficulty in a few minutes, but some cannot decide on one number (0-3), and instead write down ½ or 1½, etc. for some answers. It is recommended that these scores be taken at face value, adding up all 8 item-scores, including halves. If the total ESS score includes a half (e.g. 6½), that score should be rounded up to the next whole number. If one or more item-scores is missing, that ESS is invalid. It is not feasible to interpolate missing item-scores.
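The scoring rules above can be sketched in a few lines of Python. This is an illustrative sketch, not an official scoring program. The function name `total_ess_score`, the use of `None` to mark a missing answer, and the decision to accept only whole and half values between 0 and 3 are all assumptions made for this example.

```python
import math
from typing import Optional, Sequence

N_ITEMS = 8  # the ESS has 8 items


def total_ess_score(item_scores: Sequence[Optional[float]]) -> Optional[int]:
    """Return the total ESS score (0-24), or None if the ESS is invalid.

    A missing answer is represented by None (an assumption of this sketch).
    Missing item-scores are never interpolated.
    """
    if len(item_scores) != N_ITEMS or any(s is None for s in item_scores):
        return None  # one or more item-scores missing: the ESS is invalid
    for s in item_scores:
        # Accept 0, 0.5, 1, ..., 3 only (assumed reading of "halves").
        if not 0 <= s <= 3 or (2 * s) % 1 != 0:
            raise ValueError(f"item-score out of range: {s!r}")
    # Halves are taken at face value; a total ending in .5 is rounded up.
    return math.ceil(sum(item_scores))


print(total_ess_score([1, 0.5, 0, 2, 1, 0, 1, 1]))   # 6.5 rounds up to 7
print(total_ess_score([1, 2, None, 3, 1, 0, 2, 1]))  # None (invalid ESS)
```

Returning `None` for an incomplete questionnaire, rather than a partial sum, keeps invalid responses from being mistaken for low scores in later analysis.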

What Does it Mean to ask about ‘Recent Times’?

Respondents to the ESS rate their chances of dozing off in particular situations ‘in recent times’. It was a deliberate decision not to specify this time scale more accurately. It was intended to mean long enough for the subject to have experienced each situation referred to and to have formed an estimate of his/her chances of dozing in each. This may be a few weeks to a few months. However, experience with the rapid changes in sleep propensity that occur when patients with obstructive sleep apnea are treated with nasal CPAP suggests that periods of recall as short as a week or two may be possible to use with the ESS.

Format of the Answers to ESS Questions

It is essential that the words used in the ESS are not changed, but their spelling may differ between countries, e.g. ‘theatre’ in the UK and Australia becomes ‘theater’ in the USA. ESS item-scores can be recorded as a number from 0-3 written in a single box for each question, as originally described (1). Alternatively, 4 boxes, labelled 0 to 3, can be used for each ESS-item, the subject ticking the appropriate box for each. The ESS scores for these 2 formats are similar. ESS scores derived from telephone interviews may be valid, but this needs confirmation.

The Instructions to Respondents

It is just as important that the words of instruction to respondents are used in a standardized way as are the words in the questions. It appears that some researchers and others who have used the ESS have omitted or altered these instructions, in which case those ESS scores would be invalid.

The ESS in languages other than English

The ESS has been translated and used in many different languages. It is important that translations be checked by several people without detailed knowledge of the ESS who could independently translate the questionnaire back to English. Some degree of standardization of translations has been achieved, but this has not always been the case. Dr Johns was involved in some translations of the ESS, but not others, and cannot vouch for their accuracy. Investigations using translations of the ESS into various languages have been published, including Spanish, Portuguese, Italian, German, Swedish, Finnish, Greek, French, Mandarin, Japanese and Turkish. Translations have been made for many other languages that have been required for international investigations, but not necessarily for publication.