
Table 2 Quality Appraisal Criteria (adapted from Pesudovs et al., 2007)

From: A rapid state-of the-art review of client-reported outcomes measures used to assess dogs’ clinical signs and quality of life during chemotherapy

Quality item and definition

Criteria (as applied in Table 4)

1. Purpose/intended population

Specification of the purpose before the study, and whether the intended population has been studied.

Clear statement of aims and target population, with the intended population studied in adequate depth

Only one of the above, or a generic sample

X Not reported

2. Actual content (face validity)

Extent to which the content meets the pre-study aims and population. Subjective/qualitative evaluation of whether the questionnaire appears to measure what it’s supposed to measure.

Content appears relevant to the intended population

Some relevant content areas missing

X Content area irrelevant to the intended population

3. Item identification

Items selected are relevant to the target population.

Evidence of consultation/involvement of clients, stakeholders, and experts (through focus groups/one-to-one interviews) and review of literature

Some evidence of consultation

X No consultation/involvement in item identification

4. Item selection

Determination of the final items to include in the instrument.

Rasch or factor analysis employed; missing items and floor/ceiling effects taken into consideration. Statistical justification for removal of items

Some evidence of the above analysis

X Not reported.

5. Uni-dimensionality

Demonstration that all items fit within an underlying construct.

Rasch analysis or factor loading for each construct. Factor loadings > 0.4 for all items

Cronbach’s alpha coefficient used to determine correlation with other items in the instrument. Value > 0.7 and < 0.9

X Not reported.

6. Response scale

Scale used to complete the measure.

Response scale noted, and adequate justification given

Response scale provided with no justification for selection

X Not reported.

7. Convergent validity

Assessment of the degree of correlation between an existing measure (of a similar construct) and the new measure. This may not always be possible if no similar measures are available.

Tested against appropriate measure, Pearson’s correlation coefficient between 0.3 and 0.9

Inappropriate measure, but coefficient between 0.3 and 0.9; or tested and correlates < 0.3 or > 0.9

X Not reported

8. Discriminant validity

Degree to which an instrument diverges from another instrument that it should not be similar to.

Tested against appropriate measure, Pearson’s correlation coefficient < 0.3

Inappropriate measure, but coefficient < 0.3

X Not reported or tested and correlates > 0.3.

9. Predictive validity

Ability of a measure to predict a future event.

Tested against appropriate measure and value > 0.3

Inappropriate measure, but coefficient > 0.3

X Not reported or correlates < 0.3.

10. Test-retest reliability

Statistical technique used to estimate components of measurement error by testing comparability between two applications of the same test at different time points.

Pearson’s r value or intraclass correlation coefficient (ICC) > 0.8

Measured, but Pearson’s r value or ICC < 0.8

X Not reported.

11. Responsiveness

Extent to which an instrument can detect clinically important differences over time.

Discussion of responsiveness and change over time. Score changes > minimally important difference (MID) over time

Some discussion but no measure of MID

X Not reported.
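
As an illustration only, the numeric thresholds for items 5, 7 and 10 could be applied as in the sketch below. It is not taken from the review or from Pesudovs et al. (2007). The function names and the tier labels "top" and "middle" are hypothetical, "between 0.3 and 0.9" is assumed to be inclusive, and a test-retest coefficient of exactly 0.8 is assumed to fall in the middle tier. Cronbach's alpha is computed with the standard formula from the item variances and the variance of the total score.

```python
from statistics import variance

# Hypothetical tier labels; the table itself only names the lowest grade ("X").
TOP, MIDDLE, LOWEST = "top", "middle", "X"


def cronbach_alpha(scores):
    """Cronbach's alpha; scores is a list of respondents, each a list of item scores."""
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_in_range(alpha):
    """Item 5 threshold: value > 0.7 and < 0.9."""
    return 0.7 < alpha < 0.9


def grade_convergent(r, appropriate_measure):
    """Item 7; r is None when no correlation was reported."""
    if r is None:
        return LOWEST
    # Assumption: 'between 0.3 and 0.9' is treated as inclusive.
    if appropriate_measure and 0.3 <= r <= 0.9:
        return TOP
    # Inappropriate comparator, or tested but outside the 0.3 to 0.9 range.
    return MIDDLE


def grade_test_retest(coefficient):
    """Item 10; coefficient is Pearson's r or ICC, None when not reported."""
    if coefficient is None:
        return LOWEST
    # Assumption: a value of exactly 0.8 is placed in the middle tier.
    return TOP if coefficient > 0.8 else MIDDLE


if __name__ == "__main__":
    # Made-up responses: five respondents, three items.
    demo = [[3, 4, 3], [2, 2, 3], [4, 5, 5], [1, 2, 1], [5, 4, 5]]
    alpha = cronbach_alpha(demo)
    print(round(alpha, 3), alpha_in_range(alpha))
    print(grade_convergent(0.55, appropriate_measure=True))
    print(grade_test_retest(0.72))
```

The demo data are invented for the example and carry no meaning beyond showing how the functions are called.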