Measures Applied to the Assessment of Fibromyalgia

Fibromyalgia Impact Questionnaire (FIQ), Brief Pain Inventory (BPI), the Multidimensional Fatigue Inventory (MFI-20), the MOS Sleep Scale, and the Multiple Ability Self-Report Questionnaire (MASQ; cognitive dysfunction)

David A. Williams, Ph.D., Professor and Lesley M. Arnold, M.D., Professor

David A. Williams

Anesthesiology, Medicine, Psychiatry, and Psychology, The University of Michigan

Lesley M. Arnold

Psychiatry and Behavioral Neuroscience, University of Cincinnati College of Medicine, Cincinnati, Ohio

Address all correspondence to: David A. Williams, Ph.D., Chronic Pain & Fatigue Research Center, University of Michigan, 24 Frank Lloyd Wright Drive, Lobby M, Ann Arbor, MI 48106, Phone: 734 998-6961, Fax: 734 998-6900, Email: daveawms@umich.edu

The publisher's final edited version of this article is available free at Arthritis Care Res (Hoboken)

Introduction

The assessment of fibromyalgia (FM) is challenging because there are no biomarkers for this condition. Clinicians must rely upon patient-reported symptoms to understand its complexities. Although the American College of Rheumatology (ACR) developed research classification criteria involving tender point counts in 1990, only within the past year has the ACR proposed clinical diagnostic criteria 1 . Historically, many symptoms have been thought to be associated with FM. To narrow the field to those symptoms with the greatest clinical relevance, a working group within OMERACT (Outcome Measures in Rheumatology) conducted several Delphi exercises among both patients and clinicians to obtain consensus regarding which domains should be assessed in clinical trials for FM 2,3 . The instruments reviewed in this paper reflect the clinically relevant domains defined by this OMERACT working group.

A wide variety of instruments have been used to index the OMERACT domains for FM. Many of the instruments were developed for use generically or have been borrowed from other clinical populations. In recent phase 2 and 3 clinical trials of medications for FM, wide variation was observed in the selection of domain indices (see Table 1 ). While many of these measures are reviewed elsewhere in this special issue, we have selected a representative measure from each of the following domains of relevance: pain (Brief Pain Inventory), fatigue (Multidimensional Fatigue Inventory), sleep disturbance (MOS Sleep Scale), and cognitive dysfunction (Multiple Ability Self-Report Questionnaire). Mood and functional status are also important domains for FM; however, the instruments most commonly used to assess these domains (e.g., the Hospital Anxiety and Depression Scale for mood and the SF-36 for functional status) are reviewed elsewhere in this special issue and will not be repeated here. Recent work on the development of responder indices suggests that either these specific instruments or other measurement tools from within the same domain can be used to differentiate responders from non-responders in clinical treatment trials for FM 4 . The precision with which these domains can be assessed is likely to improve as newer measures are developed using either classical test construction methods or methods such as item response theory and computer adaptive testing, as is being done in the NIH-sponsored Patient-Reported Outcomes Measurement Information System (PROMIS) 5 .

Table 1

Outcome measures in fibromyalgia trials of FDA-approved medications

| Fibromyalgia Domain | Outcome Measure |
|---|---|
| Pain | Visual Analog Scale (VAS) (daily diary) |
| | Numeric Rating Scale (NRS) (0–10) (daily diary) |
| | Fibromyalgia Impact Questionnaire (FIQ) pain (0–10) |
| | Brief Pain Inventory (BPI) pain severity scores (0–10) |
| | SF-36 bodily pain |
| Tenderness | Dolorimetry (tender point threshold) |
| Fatigue | VAS (0–100) (daily diary) |
| | FIQ fatigue (0–10) |
| | SF-36 vitality |
| | Multidimensional Fatigue Inventory (MFI) |
| | Multidimensional Assessment of Fatigue (MAF) |
| Sleep | NRS (0–10) daily diary of sleep quality |
| | FIQ morning rested feelings (0–10) |
| | Medical Outcomes Study sleep scale (MOS-Sleep scale) |
| Depression | Beck Depression Inventory (BDI) |
| | Hamilton Depression Rating Scale (HAM-D) |
| | FIQ depression (0–10) |
| | Hospital Anxiety and Depression Scale (HADS) depression |
| Anxiety | FIQ anxiety |
| | HADS anxiety |
| Cognition | Multiple Abilities Self-Report Questionnaire (MASQ) |
| Stiffness | FIQ stiffness (0–10) |
| Physical function | SF-36 physical function |
| | FIQ physical function |

FIBROMYALGIA IMPACT QUESTIONNAIRE

General Description

Purpose

The Fibromyalgia Impact Questionnaire (FIQ) was developed in the late 1980s by clinicians at Oregon Health & Science University (OHSU) to assess the total spectrum of problems related to FM and associated responses to therapy 6 . The FIQ was first published in 1991 7 and modified in both 1997 and 2002 to refine items and to clarify the scoring system 6 . The FIQ was revised in 2009 (FIQR) to better reflect current understanding of FM and to address limitations of the original FIQ while retaining its essential properties 8 .

Content

The original FIQ (1991) covered 3 domains: function, overall impact, and symptoms. The first domain, “function”, contained 10 physical functioning items related to the ability to perform large muscle tasks, including the ability to do shopping, do laundry, prepare meals, wash dishes by hand, vacuum a rug, make beds, walk several blocks, visit friends or relatives, do yard work, and drive a car. The “overall impact” domain contained 2 items asking about the number of days individuals felt well and the number of days they were unable to work because of FM symptoms. The “symptoms” domain contained 7 items using 10 cm visual analog scales on which patients rate work difficulties, pain, fatigue, morning tiredness, stiffness, anxiety, and depression. The 1997 version modified the items about “work” to include “housework” and added a new item about “climbing stairs” to the “function” domain. Finally, the 1997 version added hash marks (i.e., vertical lines) every 1 cm on all visual analog scales.

The 2009 FIQR has the same 3 domains as the original FIQ (function, overall impact, and symptoms), but differs in several ways. First, the physical functioning domain was reduced to 9 items and modified to better balance large-muscle activities of the upper and lower extremities and to reduce gender and ethnicity bias. The physical functioning items include the ability to brush or comb hair, walk continuously for 20 minutes, prepare a homemade meal, vacuum, scrub, or sweep floors, lift and carry a bag full of groceries, climb one flight of stairs, sit in a chair for 45 minutes, and go shopping for groceries. The overall impact domain was completely revised to reflect the overall impact of FM on functional ability and the overall impact of FM on the perception of reduced function. The symptom domain retained items on pain, fatigue, morning tiredness, stiffness, anxiety, and depression and added four items on tenderness, memory, balance, and environmental sensitivity.

Number of items

The original FIQ (1991) had 19 items capturing 3 domains. The 1997 version of the FIQ retained the same domains but added an additional item for a total of 20 items. In the 2009 FIQR, the first domain (physical function) has 9 items; the second domain (overall impact) has 2 items; and the third domain (symptoms) has 10 items for a total of 21 items.

Response options/scales

The physical functioning items in the 1991 and 1997 versions of the FIQ are rated on a 0–3 scale that best reflects the patient’s ability to do the activity (0=always; 1=most; 2=occasionally; 3=never). The overall impact items are rated on a 0–7 scale for the number of days the patient felt well and the number of days the patient missed work, respectively. The symptom items are visual analog scales (0–10 cm), with higher numbers indicating greater symptomatology. All of the items in the 2009 FIQR are 0–10 numeric rating scales using 11 boxes, with higher numbers reflecting greater severity.

Recall period for items

The recall period is over the past week.

Endorsements/ Examples of use

Since 1991, the FIQ has been one of the most frequently used assessment tools in the evaluation of FM, and has been particularly useful as an outcome measure in FM clinical trials. The FIQ has been cited in over 300 articles between 1991 and 2010 (see www.myalgia.com/FIQ/FIQ_REFS_2010.htm for a complete listing of article abstracts). The use of the FIQR in clinical studies has not yet been published.

Practical Application

How to obtain

The FIQ and the FIQR are free for academic and clinical use. An online license to use the FIQ is available by registering at www.myalgia.com/FIQ/FIQ_academic_agreement.htm. The original FIQ is published in 7 . The 1997 version with the 2002 scoring revision was published in 2005 6 and is also available at www.myalgia.com/FIQ/FIQ_B.htm. The FIQR is available at this same website and was published in 2009 8 .

Method of administration

The FIQ and FIQR are administered as self-report questionnaires.

Scoring

The 1991 and 1997 FIQ versions have similar scoring. The final scores for each item of the FIQ should range from 0 (no impairment) to 10 (maximum impairment).

The physical functioning items are rated on a 4 point Likert type scale. Raw scores on each question can range from 0 (always) to 3 (never). Because some patients may not do some of the tasks listed, they are given the option of deleting questions from scoring. The scores for the items that the patient has rated are summed and divided by the number of questions answered. An average raw score between 0 and 3 is obtained. This value is then multiplied by 3.33.

The first impact item that asks the number of days in the past week the patient felt well is reverse scored so that a higher number indicates impairment. Raw scores range from 0 to 7 and are then multiplied by 1.43.

The second impact item is scored as the number of days the patient was unable to do regular work activities. Raw scores range from 0–7 and are then multiplied by 1.43.

Symptom items are visual analog scales. In the 1991 version, the items are scored in number of centimeters from 0–10. Because the 1997 version added hash-marks to all of the visual analog scales, these items are scored in numerical increments from 0–10, allowing scores to include 0.5 if the patient marks the space between 2 vertical lines.

In the 1991 version, patients were instructed to cross out items 3 and 4 if they did not work. Therefore, the total maximum FIQ score was reduced from 100 to 80. With the 1997 revision, in which questions 3 and 4 were modified to include housework, the total FIQ score should always range from 0–100. In 2002, a modification of the scoring was recommended to address incomplete data. To maintain homogeneity on a 0 to 100 continuum, the final score is adjusted to reflect a final maximum score of 100. For example, if a patient missed 2 questions, the total recorded score should be adjusted by a factor of 10/8.
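The FIQ scoring steps described above can be sketched in a few lines of Python. This is an illustrative sketch, not the developers' scoring program: the function name `score_fiq` and its argument layout are hypothetical, and it assumes that the physical function scale counts as a single 0–10 component, so that the total is built from 10 components (function, the 2 impact items, and the 7 symptom scales) and the 2002 adjustment rescales by 10 divided by the number of components answered.

```python
from typing import List, Optional, Sequence


def score_fiq(
    function_items: Sequence[Optional[int]],   # raw 0-3 each; None = item deleted by patient
    days_felt_well: Optional[int],             # raw 0-7
    days_work_missed: Optional[int],           # raw 0-7
    symptom_scales: Sequence[Optional[float]], # seven 0-10 ratings; None = missing
) -> float:
    """Return a FIQ total on a 0-100 continuum (hypothetical helper)."""
    components: List[Optional[float]] = []

    # Physical function: mean of the answered items (0-3), multiplied by 3.33.
    answered = [x for x in function_items if x is not None]
    components.append(sum(answered) / len(answered) * 3.33 if answered else None)

    # "Felt well" is reverse scored so that higher means more impairment.
    components.append(None if days_felt_well is None else (7 - days_felt_well) * 1.43)

    # Days unable to do regular work, multiplied by 1.43.
    components.append(None if days_work_missed is None else days_work_missed * 1.43)

    # Symptom scales are already on a 0-10 metric.
    components.extend(symptom_scales)

    present = [c for c in components if c is not None]
    if not present:
        raise ValueError("no scorable components")

    # Assumed form of the 2002 adjustment: rescale so the maximum stays at 100.
    return sum(present) * len(components) / len(present)
```

With two of the seven symptom scales missing, the sum of the remaining components is multiplied by 10/8, matching the example in the text. Because 3.33 and 1.43 are rounded multipliers, a patient at the ceiling on every item scores approximately, rather than exactly, 100.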

The FIQR has 21 individual items and all items are based on an 11-point numeric rating scale of 0 to 10, with 10 being the ‘worst.’ The summed score for the function domain, which contains 9 items (range 0 to 90) is divided by 3, the summed score for overall impact, which contains 2 items (range 0 to 20) is not changed, and the summed score for symptoms, which contains 10 items (range 0–100) is divided by 2. As in the FIQ, the total maximum score for the FIQR is 100. The weighting of the 3 domains is different from the FIQ in that function accounts for 30% of the total score as opposed to 10% in the FIQ, the symptom domain makes up 50% of the score instead of 70% in the FIQ, and the overall impact domain remains the same as the FIQ at 20% 8 .
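The FIQR weighting described above reduces to a single line of arithmetic. The following Python sketch is illustrative only; the function name `score_fiqr` and the decision to reject incomplete questionnaires are assumptions, since the text does not describe how missing FIQR items should be handled.

```python
from typing import Sequence


def score_fiqr(
    function: Sequence[int],  # 9 items, each 0-10
    overall: Sequence[int],   # 2 items, each 0-10
    symptoms: Sequence[int],  # 10 items, each 0-10
) -> float:
    """Return the FIQR total (0-100) from complete item responses (hypothetical helper)."""
    if (len(function), len(overall), len(symptoms)) != (9, 2, 10):
        raise ValueError("expected 9 function, 2 overall impact, and 10 symptom items")
    for value in (*function, *overall, *symptoms):
        if not 0 <= value <= 10:
            raise ValueError("each item must be rated from 0 to 10")

    # Function sum (0-90) / 3 -> 0-30; overall impact sum stays 0-20;
    # symptom sum (0-100) / 2 -> 0-50.
    return sum(function) / 3 + sum(overall) + sum(symptoms) / 2
```

The three terms have maxima of 30, 20, and 50, which is where the 30%, 20%, and 50% domain weights in the text come from.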

Score interpretation

The final scores for each of the FIQ and FIQR items range from 0 (no impairment) to 10 (maximum impairment). The total maximum score for both the FIQ and the FIQR is 100, which represents the maximum impact of FM on the patient.

Respondent burden

It takes approximately 3–5 minutes to complete the FIQ. The FIQR is estimated to take just over 1 minute to complete.

Administration burden

The FIQ and FIQR are easily administered by handing the questionnaires to the participant. The scales include simple instructions for the respondents. No formal training is required for the FIQ or FIQR. Scoring is relatively simple for both the FIQ and the FIQR but the use of numeric rating scoring for all of the FIQR items further simplifies the scoring and allows for use of electronic versions of the FIQR that can be administered online as was done in the validation study 8 .

Translations/adaptations

The FIQ has been translated into 13 languages: Czech (Czech Republic), Dutch (The Netherlands), French (France, Canada), German (Germany), Hebrew (Israel), Italian (Italy), Korean (Korea), Polish (Poland), Romanian (Romania), Spanish (Argentina, Spain), Swedish (Sweden), Turkish (Turkey) (see www.myalgia.com/FIQ/FIQ_B.htm for more information on translations).

Psychometric Information

Method of development

The initial version of the FIQ was based on an intake questionnaire used by the Oregon Health Sciences University (OHSU) rheumatology clinic and informal discussions with FM patients. This FIQ was mailed at weekly intervals for a total of 6 weeks to 64 women with FM, along with the Arthritis Impact Measurement Scale (AIMS). A second group of 25 women with FM attending the OHSU Fibromyalgia Treatment Clinic completed the FIQ as part of their routine clinical evaluation. The construct validity, test-retest reliability, and content relevance of the FIQ were assessed in these 2 groups of patients 6,7 .

The FIQR was based on previous experience with the FIQ and patients’ evaluation of important symptoms 8 . The new questionnaire was tested in a focus group of 10 female patients with FM. Following discussions among the patients and investigators, agreement was reached on the final version of the FIQR. The FIQR was then tested in an online survey that was completed by patients with FM, patients with rheumatoid arthritis (RA), lupus (SLE), or major depressive disorder (MDD), and healthy controls. The participants also completed the original FIQ and the 36-item Short Form Health Survey (SF-36).

Acceptability

The FIQ was originally developed to assess the current health status of women with FM and may therefore have a gender bias, particularly in the functional items, several of which relate to activities that are more likely to be performed by women. The functional questions were intended for a relatively affluent patient who was assumed to have a car, a vacuum cleaner, and a washing machine, and may therefore not generalize to all patients with FM. The FIQ also has problems related to the deletion of physical function items deemed “not applicable” by the respondent, which may result in an underestimation of functional severity. Some patients report difficulty understanding the scoring of the physical function questions and note that the questions do not allow them to rate the degree of difficulty in performing the activity. For example, a patient may report that they were “always” able to do shopping even though it took a great deal of time and effort to complete the task. The FIQ functional items are oriented toward high levels of disability, resulting in a potential floor effect. For example, in one study, 12% of patients scored a zero on the FIQ physical function score (i.e., no dysfunction) 9 . The FIQR was developed to correct some of the problems with the FIQ. In particular, the physical functioning items were revised to have less gender and ethnicity bias than the FIQ and to improve the ease of scoring the functional activities on a 0–10 scale ranging from “no difficulty” to “very difficult” 8 .

Reliability

In the original 1991 study to evaluate the FIQ, the test-retest reliability (Pearson’s r) was assessed by the weekly recording of data over 6 weeks. The reliability ranged from 0.56 on the pain score to 0.95 for physical function 7 . The internal consistency (Cronbach’s alpha) was not reported in the original analysis. The Cronbach’s alpha for the FIQR was 0.95, with item-total correlations ranging from 0.56 to 0.93. Test-retest reliability was not determined for the FIQR 8 .

Validity

The content validity of the original FIQ was assessed from an analysis of missing data for each item. Missing data from the physical functioning items were limited to 11% of patients who did not do dishes by hand and 20% who did no yard work. Because many patients were not working outside the home, the 2 work items of the original FIQ were not relevant for 38% of the patients 6,7 . In the validation study of the FIQR, patient suggestions about content and wording of the instrument during the focus group meeting contributed to the face validity of the final version of the FIQR. Content validity of the FIQR was suggested by strong correlation between the FIQR and the SF-36. For example, the FIQR function domain was most highly correlated with the SF-36 physical functioning subscale 8 .

The construct validity of the 1991 FIQ was determined by measuring the correlation of the FIQ individual items with the AIMS. The FIQ physical functioning items had a significant correlation (r=0.67) with the AIMS lower extremity physical function component score. The pain, depression, and anxiety items of the FIQ showed significant correlations with the corresponding AIMS scales (0.69, 0.73, and 0.76, respectively). The AIMS visual analog of syndrome impact correlated least robustly with the FIQ items, the highest correlation being with pain (r=0.48). Item correlations with the AIMS syndrome activity question tended to be higher, ranging from 0.28 to 0.83. A principal components analysis yielded 5 factors. The 10 physical functioning questions loaded on the first factor with component loading ranging from 0.50 to 0.95. Factor 2 consisted of work difficulty, feeling good, pain, fatigue, rest, and stiffness. Anxiety, depression, and days of work missed all loaded on separate factors 6,7 .

Convergent validity was assessed by comparing the FIQR to both the SF-36 and the FIQ. The three domains of the FIQR and the associated individual items correlated closely with the corresponding subscales on the SF-36. Each of the three FIQR domains was also highly correlated with the total FIQR score. There was a strong correlation (0.88) between the FIQR and the FIQ, suggesting that the questionnaires are capturing similar information about the impact of FM. The mean total score of the FIQR was about 4 points lower than the mean FIQ total score, which was attributed to the change of the weighting in the FIQR scoring 8 . Each of the three FIQR domains predicted unique variance in SF-36 domains, providing evidence for discriminant validity. Discriminant validity was also evaluated by comparing the FIQR total scores in FM patients with the scores in healthy controls, patients with RA or SLE, and patients with MDD. The FM FIQR scores were significantly higher than in the other three groups 8 .

Ability to detect change

The FIQ has been most commonly used as an outcome measure in treatment trials and, in general, has demonstrated an ability to detect clinical change 6 . The FIQ total score was also included as an outcome measure in trials of the three US Food and Drug Administration (FDA)-approved medications for FM: pregabalin, duloxetine, and milnacipran 10–12 . For example, in a pooled analysis of 4 placebo-controlled, double-blind studies of duloxetine in FM, the total FIQ scores improved significantly in the duloxetine groups compared with placebo, with a mean (SE) reduction of 12.62 (0.61) in the duloxetine patients compared with a mean reduction of 8.2 (0.69) in the placebo group (P<0.001) 13 . A recent study suggested that a 14% change, or an absolute change of 8.1 (95% CI, 7.6; 8.5), in the FIQ total score represented a clinically meaningful change in FM status (i.e., the minimal clinically important difference, MCID). The MCID was determined by calculating the percentage change in the FIQ total score from baseline and linking this to each patient’s Patient Global Impression of Change (PGIC) score 14 .
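As a worked illustration of the two MCID expressions reported above, the sketch below computes the absolute and percentage change in a FIQ total score and compares each with the published thresholds (8.1 points and 14%). The names `FiqChange` and `fiq_change` are hypothetical, and the two flags are reported separately because the text does not state how the thresholds should be combined for an individual patient.

```python
from typing import NamedTuple


class FiqChange(NamedTuple):
    absolute: float             # baseline minus follow-up; positive = improvement
    percent: float              # improvement as a percentage of baseline
    meets_absolute_mcid: bool
    meets_percent_mcid: bool


def fiq_change(
    baseline: float,
    followup: float,
    abs_mcid: float = 8.1,   # absolute MCID reported in the text
    pct_mcid: float = 14.0,  # percentage MCID reported in the text
) -> FiqChange:
    """Summarize improvement in FIQ total score against the reported MCID values."""
    if baseline <= 0:
        raise ValueError("baseline FIQ total must be greater than 0")
    absolute = baseline - followup
    percent = 100.0 * absolute / baseline
    return FiqChange(absolute, percent, absolute >= abs_mcid, percent >= pct_mcid)
```

For example, a drop from 40 to 34 is a 15% improvement, which exceeds the percentage threshold, but only a 6-point absolute change, which does not reach 8.1 points. The two expressions can therefore disagree for patients with low baseline scores.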

References

The validation of the original FIQ is published in 7 . A review of the development, operating characteristics and uses of the FIQ is found in 6 and the validation study of the FIQR is found in the Bennett et al publication in 2009 8 .

Critical appraisal of overall value to rheumatology community

Strengths

FM is associated with multiple symptoms and functional impairment. The FIQ and FIQR are useful assessment tools in FM because they evaluate the total spectrum of problems related to FM, including functional impairment, overall impact, and FM-related symptoms. The FIQ total score has proved to be a useful outcome measure in key clinical trials of FM.

Caveats, cautions

The FIQ functional items are oriented toward high levels of disability, resulting in a possible floor effect. Because the FIQ was originally developed in a patient population of relatively affluent women, there is a potential problem with gender and ethnicity bias. Although the individual domains and/or items on the FIQ were not originally intended to be used in isolation, some recent studies have reported single item or domain scores from this instrument. The internal consistency (Cronbach’s alpha) was not reported in the original analysis of the FIQ.

The FIQR was designed to correct some of the problems with the FIQ, but has not yet been tested in the context of clinical trials. Test-retest reliability was not determined for the FIQR.

Clinical usability

The FIQ and FIQR are brief, self-report questionnaires that assess the impact of FM on patients. The FIQ has most commonly been used in clinical studies, but has the potential for use in the clinical setting to monitor patients’ response to treatment over time.

Research usability

The FIQ has been used in large scale clinical trials of therapeutics for FM, supporting its ability to assess and detect change in FM.

BRIEF PAIN INVENTORY

General Description

Purpose

The Brief Pain Inventory (BPI) was designed to measure multiple clinically relevant aspects of pain, such as pain intensity and interference from pain, in cancer populations 15 . The BPI was originally called the Wisconsin Brief Pain Questionnaire 16 . Subsequently, support for its valid use in non-cancer populations such as musculoskeletal, neuropathic, and other central pain conditions has been established 17,18 . There are two versions; the short version is the most commonly used and is often included in the context of clinical trials. It is also the version with the most foreign-language translations. A longer, less frequently used version is available that includes more pain descriptors and may have clinical utility; however, the developers recommend the short form for most applications. Only the short form will be considered here.

Content

The BPI assesses the presence of pain, pain intensity (i.e., worst, least, average, current), and functional interference from pain (i.e., activity, mood, walking ability, normal work, relations with others, sleep, and life enjoyment). It also catalogues the types of pain medications being used and the percentage of pain relief obtained from medications, and it assesses the distribution of pain via a body map.

Number of items

The BPI contains a total of 15 items.

Response options/scales

The BPI uses a mixture of item types. Item 1, which asks about the presence of pain, is a dichotomous “yes” or “no”. Item 2, the body map, asks that areas of pain be shaded and an “x” placed on the body region that hurts the most. Items 3–6 (intensity items) use an 11-point rating scale from 0 (no pain) to 10 (pain as bad as you can imagine). Item 7 is an open-ended response to list pain medications. Item 8 (percentage of pain relief) ranges from 0% (no relief) to 100% (complete relief). Item 9 (a–g) asks about interference using an 11-point numeric rating scale, with each item ranging from 0 (does not interfere) to 10 (completely interferes).

Recall period for items

The time frame for the BPI is typically based upon “the past week” but some versions allow for the past 24 hours.

Endorsements / Examples of use

The BPI is widely used in clinical trials for pain and in pain research generally. It is one of the instruments recommended by the Initiative on Methods, Measurement, and Pain Assessment in Clinical Trials (IMMPACT) group 19 for inclusion in any clinical trial evaluating pain.

Practical Application

How to obtain

The BPI is available through the following address:

The Department of Symptom Research
Attn: Assessment Tools
The University of Texas MD Anderson Cancer Center
1515 Holcombe Boulevard, Unit 1450
Houston, TX 77030
713-745-3805

The BPI is available free of charge for non-funded academic research. For funded academic research there is a charge per project (e.g., $300) and a charge for commercial research (e.g., $800 per project).

Method of administration

The BPI can be administered as a self-report questionnaire or as an interview.

Scoring

While some of the items represent single item values, pain intensity, indexed by the “Pain Severity Score” is calculated by obtaining the mean of the 4 pain intensity items. The Pain Interference Score is obtained by calculating the mean of the 7 interference items.
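The two BPI summary scores described above are simple means, as the following Python sketch shows. The function name `score_bpi` is hypothetical, and the sketch assumes complete responses because the text does not describe a rule for missing items.

```python
from statistics import mean
from typing import Sequence, Tuple


def score_bpi(
    intensity: Sequence[float],     # items 3-6: worst, least, average, current pain (0-10)
    interference: Sequence[float],  # items 9a-9g: seven interference ratings (0-10)
) -> Tuple[float, float]:
    """Return (Pain Severity Score, Pain Interference Score) as item means."""
    if len(intensity) != 4 or len(interference) != 7:
        raise ValueError("expected 4 intensity items and 7 interference items")
    return mean(intensity), mean(interference)
```

Both scores stay on the 0 to 10 metric of the individual items, which is why the instrument can also be scored easily by hand.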

Score interpretation

The “Pain Severity Score” ranges from a minimum of 0 (i.e., “No pain”) to a maximum of 10 (i.e., “pain as bad as you can imagine”). The Pain Interference Score similarly ranges from 0 (i.e., “Does not Interfere”) to 10 (i.e., “Completely Interferes”). The BPI is easily scored by hand.

Respondent burden

It takes approximately 5 minutes to complete the BPI.

Administration burden

The BPI is easily administered by handing the questionnaire to the participant or by asking each question verbally. Scoring requires calculating 2 means, which can be done in less than 5 minutes.

Translations/adaptations

Validated translations are available for the following languages: English, Spanish, Italian, Russian, Norwegian, Greek, German, Japanese, Chinese, Arabic, Bulgarian, Cebuano, Croatian, Czech, Filipino, French, Hindi, Korean, Malay, Slovak, Slovenian, and Thai.

Psychometric Information

Method of development

Prior to the development of the BPI, there was no specific instrument designed to assess the intensity and impact of cancer pain that was brief and that could be administered repeatedly over time to monitor the effects of treatment. Existing measures at the time (e.g., the McGill Pain Questionnaire) were developed for non-cancer pain. Patient interviews revealed that existing questionnaires were too ambiguous, irrelevant, or lengthy for the assessment of cancer pain. The questionnaire was developed in accordance with the best guidelines for test construction available at the time (i.e., the 1970s): the Standards for Educational and Psychological Tests published by the American Psychological Association, the American Educational Research Association, and the National Council on Measurement in Education. Item development was informed by patient interviews and by field testing of items. Even though this questionnaire was developed 30 years ago, the approach conforms to the more recently published Draft Guidance for Industry, Patient-reported Outcome Measures: Use in Medical Product Development to Support Labeling Claims by the FDA. The BPI has since been validated for use as a brief and meaningful pain assessment tool for non-cancer pain conditions as well 17,18 .

Acceptability

Acceptability was assessed in a non-cancer pain population. The BPI was readily accepted by patients, was not associated with excessive missing data, and did not have problematic floor/ceiling effects 20 .

Reliability

Internal consistency for the Pain Severity Score and for the Interference scale has been reported as 0.85 and 0.88, respectively, in non-cancer pain populations 18 . Test-retest reliability has been assessed for both cancer and non-cancer forms of pain and over varying time frames. For very short time intervals (e.g., 30–60 minutes), the test-retest reliability was 0.98 for pain severity and 0.97 for pain interference 21 . Test-retest reliability for daily administration ranges from 0.83 to 0.88 for pain severity and from 0.83 to 0.93 for pain interference 22 . FM is considered to be a form of non-cancer or musculoskeletal pain, and as such these metrics could be applied to FM; however, formal assessment of the reliability of the BPI in FM is not available.

Validity

Item analysis has consistently revealed a two-factor structure (severity or intensity, and interference) in more than 36 studies of the BPI across multiple languages for both cancer and non-cancer pain populations 23 . Construct validity of the BPI has been supported for the generic assessment of pain as well as specifically for low back pain, rheumatoid arthritis 17 , and osteoarthritis 20 . In a sample of patients with arthritis, the BPI pain severity score correlated with the bodily pain scale of the SF-36, a generic measure of pain intensity (r=0.74), and with the Chronic Pain Grade Intensity scale, another generic pain intensity measure (r=0.77). The BPI Interference scale from this same sample correlated with the Chronic Pain Grade disability scale (r=0.81) and with the HAQ disability index, a disease-specific measure of functional interference (r=0.69) 17 .

Ability to detect change

The BPI has demonstrated responsiveness to change following many forms of pharmacologic and non-pharmacologic treatment 23 . In chronic pain states generally, an improvement of 30% or of 2–3 points is considered to be a clinically meaningful change 24–26 . In a pooled analysis across 12 weeks of treatment from four randomized controlled trials of duloxetine for FM, the BPI “average pain” item and the “Pain Severity Score” were anchored against the Patient Global Impression of Improvement scale (PGI-I). Anchor-based MCIDs for “average pain” and for the “Pain Severity Score” were calculated from the difference in mean change from baseline to endpoint, yielding values of 2.1 and 2.2 points, respectively. These amounts of change corresponded to 32% and 34% reductions in pain from the baseline scores 27 .

References

The User Manual for the BPI contains a reference listing of 72 studies supporting the valid use of the BPI across a wide variety of chronic pain conditions including FM 23 .

Critical appraisal of overall value to rheumatology community

Strengths

The BPI was designed to monitor change in pain (and its impact) over time. Numerous studies support its validity to function in this capacity.

Caveats, cautions

The BPI is an industry standard for the generic assessment of both cancer and non-cancer pain conditions and contains few flaws in terms of psychometrics, ease of administration, or utility. Far more is known about the psychometrics of the Pain Severity scale and the Pain Interference scale than about the other features of the questionnaire (pain relief, body map, etc.). These other features are often not reported in trials using this instrument. Reports specifically focused upon the psychometric evaluation of the BPI in FM are not available; however, FM is classified as a chronic non-cancer musculoskeletal pain condition, and the validity of the BPI is supported for the generic assessment of pain intensity and interference.

Clinical usability

The BPI is recommended for use in clinical settings to monitor the severity and impact of pain generically.

Research usability

The BPI is recommended as a tool of choice for the assessment of pain in clinical pain trials 28 . It is easily administered and has low patient burden.

MULTIDIMENSIONAL FATIGUE INVENTORY

General Description

Purpose

The Multidimensional Fatigue Inventory (MFI-20) was introduced in 1995 29 as a measure of fatigue severity. Fatigue is perhaps the most common complaint heard by clinicians. Apart from the everyday use of the term to describe normal tiredness, it can be used to indicate the presence of disease 29 . Thus, the MFI-20 was developed to function as an index of disease, as a diagnostic criterion, or as an outcome variable when a treatment is being evaluated.

Content

The MFI-20 possesses 5 factor analytically confirmed subscales assessing general fatigue, physical fatigue, reduced activity, reduced motivation, and mental fatigue. The MFI differs from other multidimensional fatigue measures by purposely retaining a relatively short list of items, and by eliminating somatic items.

Number of items

The MFI-20 contains 20 items.

Response options/scales

The MFI-20 uses the same response set for each of the 20 items. The respondent is asked to mark an “x” in 1 of 5 boxes arranged linearly and anchored by “yes, that is true” at one pole and “no, that is not true” at the opposite pole. Scoring of the scales requires some items to be reversed so that a higher score on each scale indicates greater fatigue.

Recall period for items

The timeframe is somewhat non-specific as the questionnaire queries for symptoms occurring “lately”.

Endorsements / Examples of use

The MFI-20 has been used in numerous clinical populations including cancer 30 , Sjogren’s Syndrome 31 , craniopharyngioma 32 , myelodysplastic patients 33 , chronic fatigue syndrome 29 , FM 34 , and general chronic pain 35 . It has also been validated for use in non-clinical samples including psychology students, medical students, Army recruits, and junior physicians 29 .

Practical Application

How to obtain

The MFI-20 is available from the author at the following address: