
Measuring quality of life

Measurement Principles

Measurement is the process of assigning numbers to objects or events according to rules.

It is the process of linking abstract concepts to empirical indicators, that is, to observable responses. Measurement is an explicit, organized plan for classifying and quantifying data in terms of a general concept. Instruments are the devices used to record data. Measurement is usually conceptualized within levels: nominal, ordinal, interval, and ratio.

A scale is a set of symbols or numerals constructed so that the symbols or numerals can be assigned by rules to characteristics of individuals. Scales contain components (items) which describe a concept and a series of responses that are the ratings of the item.

Several different scales are common in quality of life research.

1. Discrete responses - uses categories of response such as excellent - good - fair - poor.

2. Likert scaling - uses descriptions of opinion by rating agreement or disagreement with a series of statements. Some scales are considered to be Likert-like in that they ask for ratings of opinion on other dimensions such as satisfaction or importance.

3. Visual analogue - uses a line of fixed length anchored by words only at the extremes and no words along the line.

4. Adjectival - uses a continuum of responses along a line. It is like a visual analogue scale except that it has words along the line.
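As a rough illustration of how responses on these scales become numbers, the sketch below scores a discrete response, a set of Likert ratings, and a visual analogue mark. The label-to-number codes, the five Likert labels, the 100 mm line length, and all function names are assumptions made for the example; real instruments define their own scoring rules.

```python
# Hypothetical scoring helpers. The codes and labels below are assumptions
# for illustration, not the scoring rules of any particular instrument.

DISCRETE_POINTS = {"poor": 1, "fair": 2, "good": 3, "excellent": 4}

LIKERT_POINTS = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}


def score_discrete(response):
    """Map a category label such as 'good' to its numeric code."""
    return DISCRETE_POINTS[response.strip().lower()]


def score_likert(responses):
    """Sum the numeric codes of one respondent's ratings across statements."""
    return sum(LIKERT_POINTS[r.strip().lower()] for r in responses)


def score_visual_analogue(mark_mm, line_length_mm=100.0):
    """Convert the position of a mark on the line to a 0-100 score."""
    if not 0 <= mark_mm <= line_length_mm:
        raise ValueError("mark must lie on the line")
    return 100.0 * mark_mm / line_length_mm
```

For instance, `score_likert(["agree", "strongly agree", "neutral"])` adds 4, 5, and 3 under the assumed coding.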

Measurement Theory centers around the concepts of reliability and validity. For every score there is an obtained score (what the person scored on the test) and a true score (what the person would have gotten if the test had been perfect). The difference between these two is the error of measurement. Random error is a reliability problem. Non-random (systematic) error is a validity problem.
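The relationship between obtained score, true score, and error can be written compactly. The symbols X (obtained score), T (true score), and E (error) are notation introduced here, and the variance decomposition assumes the classical test theory case in which the error is random, averages zero, and is uncorrelated with the true score.

```latex
X = T + E, \qquad
\sigma_X^2 = \sigma_T^2 + \sigma_E^2, \qquad
\text{reliability} = \frac{\sigma_T^2}{\sigma_X^2} = 1 - \frac{\sigma_E^2}{\sigma_X^2}
```

Under these assumptions, if the obtained scores have a variance of 100 and the error variance is 20, the reliability is 1 - 20/100 = 0.80.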

Measurement error arises when unwanted factors are measured along with the concept of interest. Situational contaminants include the person's awareness of being observed and environmental factors such as an unfriendly atmosphere. There is also the problem of response-set bias: relatively enduring tendencies of people to respond in certain ways, such as extreme responses, acquiescence, and social desirability. Transitory personal factors like fatigue, hunger, anxiety, and pain can also affect measurement. Administration variations, such as the method of data collection, and instrument clarity also contribute to measurement error.

Reliability of an instrument is the extent to which an instrument or any measuring procedure yields the same results on repeated trials. Reliability is also described as stability, consistency, and dependability. In stability (test-retest) reliability, an instrument is administered to the same people at two different points in time and the results are compared. Consistency reliability refers to all items in the scale measuring the same concept. Reliability may also refer to the level of agreement when two or more raters use the instrument (interrater reliability) or when one rater uses the instrument at two or more different times (intrarater reliability). A reliable measure maximizes the true-score component and minimizes the error component. Reliability is not a fixed property of an instrument but only the property of an instrument when administered to a particular sample under certain conditions.
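Consistency reliability is often summarized with Cronbach's alpha, a coefficient not named in the text above but widely used for this purpose. The sketch below is a minimal version using only the Python standard library; the function name and the ratings are made up for illustration.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(item_scores[0])
    if n_items < 2 or len(item_scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [
        variance([row[i] for row in item_scores]) for i in range(n_items)
    ]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in item_scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Made-up ratings: five respondents (rows) by four items (columns).
ratings = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 3, 4],
]
alpha = cronbach_alpha(ratings)
```

Values of alpha closer to 1 indicate that the items vary together. Because reliability depends on the sample, the same instrument can yield a different alpha in a different group.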

Validity of an instrument describes the degree to which a test or instrument measures what it is supposed to measure. An instrument is valid only to the extent that it measures the concept it was designed to measure. Validity is more difficult to assess than reliability because there is often no criterion (gold standard) or even an agreed-upon definition for many concepts of interest to nurses and other health professionals. This is particularly a problem in quality of life measurement.

Major types of validity include:

• Face – Does the instrument look like it measures the concept?

• Content – Are the items within the instrument an adequate and representative sampling of the concept?

• Criterion – Is the measure related to another measure of the same concept or phenomenon? In order to measure this, another equally valid instrument is needed.

• Concurrent – Is there a correlation between two measures of the same concept at the same point in time?

• Predictive – Is there a correlation between the measure and some future measurement of the same or similar concept?

• Construct – Is the construct under investigation adequately measured with the instrument?
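Concurrent and predictive validity are both described above as correlations between measures. A minimal sketch of that calculation follows, using the Pearson correlation coefficient as one possible choice; the function name and the scores are made up for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


# Made-up scores for six respondents on two measures given at the same visit.
new_measure = [62, 75, 48, 90, 55, 70]
established_measure = [58, 80, 50, 85, 60, 66]
r_concurrent = pearson_r(new_measure, established_measure)
```

For predictive validity the second list would hold scores collected at a later time. How large a correlation counts as adequate is a judgment the analyst has to make and is not settled by the code.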

Sensitivity of an instrument is crucial for evaluative purposes. Sensitivity, or responsiveness, refers to the ability of a measure to detect hypothesized changes such as treatment effects over time. Sensitivity to change can be explored through either experimental or longitudinal designs.
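One index sometimes used to express responsiveness is the standardized response mean: the mean change in score divided by the standard deviation of the change scores. The text above does not name a specific index, so this choice, the function name, and the scores are assumptions for illustration.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the standard deviation of the change scores."""
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need paired scores for at least two respondents")
    # Change score for each respondent: follow-up minus baseline.
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Made-up scores for five patients before and after a treatment.
baseline_scores = [40, 55, 38, 60, 47]
follow_up_scores = [52, 60, 45, 66, 58]
srm = standardized_response_mean(baseline_scores, follow_up_scores)
```

A larger absolute value means the change stands out more clearly against its own variability. The index is undefined when every respondent changes by the same amount, because the standard deviation of the changes is then zero.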

An excellent on-line resource for understanding design and measurement issues is The Research Methods Knowledge Base. The above two figures are examples from this site. (Citation: Trochim, William M. The Research Methods Knowledge Base, 2nd edition. Internet page at URL (cited above). Version current as of August 2, 2000.)

Also see Streiner and Norman (1995) and Bowling (1997). Both are excellent books dealing with measurement issues in quality of life and health status.

Types of Instruments

Global, generic, and specific instruments represent three different types of measures for the assessment of quality of life.

Global measures are those designed to measure quality of life in the most comprehensive or overall manner. This may be a single question that asks the person to rate his/her overall quality of life or an instrument such as the Flanagan Quality of Life Scale (Flanagan, 1978, 1982) that asks people to rate their satisfaction on 15 domains of life.

Generic measures have much in common with global measures and were designed primarily for descriptive purposes. In health care they delineate as comprehensively as possible the full impact of a disease or its symptoms on the patient’s life. Generic measures are applicable to a wide range of populations. The main advantage is their broad coverage and the fact that they allow comparisons of different patient populations or across studies. A disadvantage is that they may not address topics of particular relevance for a given disease. Importantly, evidence suggests that they are less responsive to treatment-induced changes than disease-specific measures.

Disease-specific measures were developed to monitor the response to treatment in a particular condition. These measures are confined to addressing the problems of selected patient groups. They tend to have high sensitivity to change but often lack a conceptual link to quality of life definitions.

Dimension-specific measures focus on a particular problem within a patient group, such as pain, fatigue, or physical functioning. These measures are useful for monitoring specific problems that are to be addressed by an intervention.

Instruments may also vary in the method of administration. Standardized questionnaires allow uniform administration and unbiased quantification of data, as the response options are predetermined and thus equal for all respondents. Increasingly, the emphasis has been on self-administered questionnaires. However, these may exclude certain groups of patients, for example, those who cannot read or write, the elderly, and those with severe somatic conditions. Another problem is that the use of self-administered questionnaires can mean the possible loss of data if patients do not fill out every question. Quality control can minimize this problem.
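One simple form of quality control is to check each returned questionnaire for unanswered items while the patient can still be asked to complete them. The sketch below assumes responses are stored as a dictionary keyed by item identifier; the function name and the identifiers are hypothetical.

```python
def find_missing_items(responses, expected_items):
    """Return the identifiers of items a respondent skipped or left blank."""
    missing = []
    for item in expected_items:
        answer = responses.get(item)
        # Treat an absent key, None, or an empty string as unanswered.
        if answer is None or str(answer).strip() == "":
            missing.append(item)
    return missing


# Hypothetical questionnaire with four items; q2 is blank and q3 was skipped.
expected = ["q1", "q2", "q3", "q4"]
returned = {"q1": 3, "q2": "", "q4": 5}
to_follow_up = find_missing_items(returned, expected)
```

The returned list tells staff which questions to revisit with the patient before the data are entered.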

Interviews have the advantages that most patients can be assessed and the completeness of the data is ensured. These advantages tend to be outweighed by the disadvantages of time and expense.

Last updated 5 September 2000

Carol Burckhardt
