Year: 2017 | Volume: 1 | Issue: 1 | Page: 7-13
Reliability and validity of anonymous web-based surveys on back pain-related disability
Grzegorz Miekisiak1, Dariusz Łątka2, Adam Sulewski3, Łukasz Kubaszewski3, Paweł Jarmużek4
1 Department of Neurosurgery, Specialist Medical Center, Polanica-Zdroj, Poland
2 Department of Neurosurgery, Regional Medical Center, Opole, Poland
3 Department of Orthopedics and Traumatology, University of Medical Sciences, Poznan, Poland
4 Department of Neurosurgery, Regional Neurosurgery and Neurotrauma Center, Zielona Gora, Poland
Date of Web Publication: 29-May-2017
Department of Neurosurgery, Specialist Medical Center, ul. Jana Pawla II 2, 57-320 Polanica-Zdroj
Source of Support: None, Conflict of Interest: None
Purpose: To evaluate the reliability and validity of anonymous web-based surveys (WBSs) on back pain-related disability by comparing the psychometric properties of identical web-based and paper-based patient-reported outcome measures (PROMs).
Methods: The tested instrument was the Core Outcome Measures Index (COMI), a PROM specific to low back pain. The WBS was open and anonymous; the paper-based survey (PBS) was administered in several hospitals. Besides the COMI, both surveys contained assorted questions that enabled testing of key psychometric properties.
Results: A total of 2318 respondents completed the WBS, of whom 2285 were included in the study; the response rate was 40.60%. A total of 169 subjects completed the PBS, and data from 164 were included. The properties evaluated with a single test administration (floor and ceiling effect, internal consistency, exploratory factor analysis, and convergent validity) were very similar in both groups. The test-retest reliability, which requires repeated test administration, was negatively affected by the low rate of returning respondents in the WBS group.
Conclusions: The comparison of the two methods of survey administration shows that open, anonymous web-based surveys are valid and reliable sources of data.
Keywords: Low back pain, patient-reported outcome measures, psychometric properties, web survey
How to cite this article:
Miekisiak G, Łątka D, Sulewski A, Kubaszewski Ł, Jarmużek P. Reliability and validity of anonymous web-based surveys on back pain-related disability. J Spinal Stud Surg 2017;1:7-13
How to cite this URL:
Miekisiak G, Łątka D, Sulewski A, Kubaszewski Ł, Jarmużek P. Reliability and validity of anonymous web-based surveys on back pain-related disability. J Spinal Stud Surg [serial online] 2017 [cited 2022 Dec 2];1:7-13. Available from: https://www.jsss-journal.com/text.asp?2017/1/1/7/207208
Introduction
The Internet plays an increasingly important role in the management of low back pain. It evolved from being the source of medical information for patients into a medium allowing for unidirectional communication. Online treatment programs have been launched with promising results. The unsurpassed access to large populations also allows for survey-based research.
Web-based surveys (WBSs) are a relatively new mode of data acquisition. Over time, their popularity has grown and new applications have been found. WBSs offer some advantages over conventional paper or face-to-face methods, such as easy access to large populations in a short time, immediate verification of form validity, and a significant reduction of the burden associated with data processing. They also offer unique functionalities such as dynamic changes of order/layout or precise analysis of the time spent on answering particular questions. Computer adaptive tests have been designed to improve the measurement accuracy. As WBSs have not been around for long, some concerns need to be addressed before their introduction into routine use. Almost by definition, the internet allows anonymity, which makes the control of respondents impossible, and this may result in multiple same-user entries and/or nonserious responses. There is also uncertainty about whether the internet sample is representative of the population, but the main concern is the validity of WBS results as compared to paper-and-pencil questionnaires.
Patient-reported outcome measures (PROMs) play an increasingly important role in today's clinical practice. They provide patients' perspective on their symptoms, functional status, and health-related quality of life. Initially developed for research purposes, over time, they have changed the way medicine is practiced and their use is now advocated by some regulatory bodies. As the number of PROMs available to measure different concepts is growing rapidly, objective assessment of their validity becomes paramount. The attributes being evaluated are the quality of content and measurement properties. The assessment of the latter is basically a clinical study, sometimes involving large numbers of subjects and designed to evaluate several parameters. The methodology of this process has been well described in the literature, including comprehensive guidelines established by task groups, such as the Scientific Advisory Committee of the Medical Outcomes Trust or COSMIN. In essence, competent instruments should be valid, reliable, and responsive. Validity can be defined as the ability to measure the desired concept, reliability is a measure of reproducibility and consistency, and responsiveness reflects the ability to detect important changes over time. Psychometric studies designed to evaluate these properties can be fairly complex and time-consuming as they require recruitment of a large number of subjects. The resulting administrative burden is a major obstacle in these efforts. This is especially true when a new measure is being evaluated, as even a small alteration in content with subsequent validation can be costly, both in terms of time and effort.
Web-based PROMs are a new development, conceived to take advantage of new opportunities posed by the web. The list of applications for these instruments is growing, including web surveys used in musculoskeletal and pain research. The objective of this study was to evaluate the validity and reliability of data coming from open, anonymous web-based studies concerning back pain-related disability. This was accomplished by testing the psychometric properties of the online version of the Core Outcome Measures Index (COMI) questionnaire, a tool used to assess low back pain-related disability, in comparison with the paper-based survey (PBS) in two independent groups of subjects.
Materials and Methods
The core outcome measures index
The COMI is a self-administered questionnaire, designed to measure the influence of back pain on activities of daily living. It comprises seven items evaluating five domains (pain, difficulties in everyday life, symptom-specific well-being, general quality of life, and the social and work disability).
The first two items assess the back and leg pain using a numeric rating scale ranging from 0 to 10. The remaining items are rated on 5-point Likert-type scales. The social and work disability questions refer to the last 4 weeks before evaluation, and the rest pertain to the last 7 days. The overall COMI score is calculated by averaging the values for each of the five domains after re-scoring them on a 0–10 scale. For the pain domain, the higher of the two values is used, and for disability, it is the arithmetic mean of social and work disability. The COMI is best known as the main PROM of the Spine Tango Surgical Registry.
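The scoring rule described above can be sketched in Python as follows. This is a minimal illustration and not the official scoring code: the function and parameter names are hypothetical, and the linear mapping of the 5-point items (1 to 5) onto 0–10 is an assumption, since the text only states that the domains are re-scored on a 0–10 scale.

```python
from statistics import mean


def rescale_likert(value):
    """Map a 5-point item (1..5) linearly onto 0..10 (assumed mapping)."""
    if not 1 <= value <= 5:
        raise ValueError("5-point items must be between 1 and 5")
    return (value - 1) * 2.5


def comi_score(back_pain, leg_pain, function, well_being,
               quality_of_life, social_disability, work_disability):
    """Overall COMI score: the mean of five domain scores, each on 0..10."""
    for pain in (back_pain, leg_pain):
        if not 0 <= pain <= 10:
            raise ValueError("pain items must be between 0 and 10")
    # Pain domain: the higher of the back and leg pain ratings.
    pain_domain = max(back_pain, leg_pain)
    # Disability domain: arithmetic mean of social and work disability.
    disability_domain = mean([rescale_likert(social_disability),
                              rescale_likert(work_disability)])
    domains = [
        pain_domain,
        rescale_likert(function),
        rescale_likert(well_being),
        rescale_likert(quality_of_life),
        disability_domain,
    ]
    return mean(domains)


# Hypothetical respondent, for illustration only.
print(comi_score(back_pain=6, leg_pain=8, function=3, well_being=4,
                 quality_of_life=2, social_disability=1, work_disability=3))
```

Under these assumptions the score ranges from 0 (best) to 10 (worst), which is also the range used when checking for floor and ceiling effects.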
The WBS used in this study comprised 19 questions divided into four screens [Figure 1] and [Figure 2]. Besides the COMI, it contained four items asking for basic demographic information and the Oswestry Disability Index (ODI) Questionnaire. A fifth screen contained a brief interpretation of the data, explaining the level of disability. It served as an incentive to complete the questionnaire and also contained information encouraging retesting at a later time. The estimated time to complete the questionnaire was 12 min.
Figure 2: A screenshot of the Core Outcome Measures Index part of the survey
The PBS questionnaire booklet contained the COMI and the ODI along with some basic questions regarding demographics (e.g., sex, age). In addition, a Roland–Morris Disability Questionnaire and three Likert-type questions regarding the pain and the use of pain medications were included. These additional items were disregarded in the present study. Inclusion criteria were chronic low back pain (with or without radiation to the leg), age of 18 years or more, and good comprehension of the Polish language. A COMI with any missing answer was rejected, and no more than one missing answer was allowed for the ODI. Patients from nine departments from across the country participated in the study. Immediately after the questionnaire was collected, patients were handed the same booklet for a retest, on the condition that there were no therapeutic interventions between the administrations. The PBS was carried out as a part of another project designed to produce a validated language version of the COMI for low back pain, and the results have already been published.
The study was approved by the Ethics Committee.
Floor and ceiling effect was evaluated by calculating the percentage of respondents who obtained the lowest and highest score possible. When too large, it can negatively affect the measure, as it is not possible to detect any improvement or deterioration at the extremes. Hence, the desired value for the floor/ceiling effect was <15%–20%. Values above 70% were considered to affect the results negatively. Internal consistency, which measures the degree to which particular items of the questionnaire are correlated, is an important quality of instruments measuring a single underlying concept, such as the COMI. In this study, internal consistency was evaluated with Cronbach's alpha, a reliable and arguably the most widely used indicator. High values of Cronbach's alpha signify good internal consistency; however, too high a value implies redundancy of some items. Nunnally and Bernstein proposed a criterion of 0.70–0.90 as the most desired range for internal consistency.
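As a rough illustration of the two single-administration checks described above, the following Python sketch computes the floor/ceiling percentages and Cronbach's alpha using the standard formula. The function names and the sample data are hypothetical; the text does not state which software was used or whether alpha was computed on the seven items or the five domains, so the input layout (one row of item scores per respondent) is an assumption.

```python
from statistics import variance


def floor_ceiling(scores, minimum=0.0, maximum=10.0):
    """Return (floor %, ceiling %): share of respondents at the extreme scores."""
    n = len(scores)
    floor_pct = 100.0 * sum(1 for s in scores if s == minimum) / n
    ceiling_pct = 100.0 * sum(1 for s in scores if s == maximum) / n
    return floor_pct, ceiling_pct


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    columns = list(zip(*rows))
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(col) for col in columns)
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical data, for illustration only.
print(floor_ceiling([0, 5, 10, 10]))
print(cronbach_alpha([[1, 2], [2, 1], [3, 3]]))
```

A value inside the 0.70–0.90 range quoted above would be read as good internal consistency without item redundancy.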
Exploratory factor analysis with principal components extraction was performed on all items to examine the latent dimensions of the scale. The optimum number of factors was determined by the number of Eigenvalues >1. Item loadings on each factor ≥0.4 were considered satisfactory for inclusion in that factor.
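The factor-retention rule above (eigenvalues >1) can be sketched in plain Python. This is only an illustration under stated assumptions: it takes an item correlation matrix as input, finds its eigenvalues with a simple Jacobi rotation routine written for this example (a numerical library would normally be used instead), and reports the number of retained factors and the share of variance explained by the first one. The function names and the example matrix are hypothetical, and item loadings are not computed here.

```python
import math


def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes a[p][q]; kept within +/- pi/4.
                if diff == 0.0:
                    theta = math.copysign(math.pi / 4, a[p][q])
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def kaiser_summary(correlation_matrix):
    """Return (number of eigenvalues > 1, % variance explained by the first)."""
    eigenvalues = jacobi_eigenvalues(correlation_matrix)
    n_factors = sum(1 for value in eigenvalues if value > 1.0)
    explained = 100.0 * eigenvalues[0] / len(eigenvalues)
    return n_factors, explained


# Hypothetical correlation matrix for three items, for illustration only.
example = [[1.0, 0.5, 0.5],
           [0.5, 1.0, 0.5],
           [0.5, 0.5, 1.0]]
print(kaiser_summary(example))
```

For a correlation matrix the eigenvalues sum to the number of items, which is why the first eigenvalue divided by the item count gives the proportion of variance attributed to the first component.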
Construct validity is the extent to which a measure assesses the theoretical construct it has been designed for. One subtype of construct validity is convergent validity. It can be defined as the relationship between constructs that are theoretically similar. In the present study, the association between COMI and ODI was evaluated by measuring the Spearman's rho (ρ) corrected for ties. The following thresholds for validity coefficients were accepted: r > 0.8 as excellent, 0.61–0.8 very good, 0.41–0.6 good, 0.21–0.4 fair, and 0–0.2 poor.
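A tie-corrected Spearman coefficient can be obtained by assigning average ranks to tied values and then computing the Pearson correlation of the ranks. The sketch below follows that approach and labels the result with the thresholds listed above. The function names and sample data are hypothetical, and the handling of values that fall between the listed bands (for example 0.605) is an assumption.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho with ties handled through average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def interpret_validity(rho):
    """Label a coefficient using the thresholds quoted in the text."""
    r = abs(rho)
    if r > 0.8:
        return "excellent"
    if r > 0.6:
        return "very good"
    if r > 0.4:
        return "good"
    if r > 0.2:
        return "fair"
    return "poor"


# Hypothetical paired scores, for illustration only.
print(spearman_rho([1, 2, 2, 3], [1, 2, 3, 4]))
print(interpret_validity(0.752))
```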
The response rate is an important attribute of a WBS. A low value carries the risk of significant nonresponse bias; however, this relationship is not straightforward. In the present study, the response rate was established as the proportion of users who completed the survey among all the visitors to the web page containing the questionnaire. This was obtained from the Google Analytics Service (Google Inc., Mountain View, CA, USA).
The test-retest reliability is a measure of the instrument's reproducibility, meaning the ability to yield the same result over time under the same evaluation conditions. In the present study, the interval was arbitrarily set as >24 h and <30 days. The intraclass correlation coefficient (ICC) was used to evaluate this form of reliability. The possible values for the ICC are within the range 0.00–1.00; values of 0.60–0.80 are considered to indicate good reliability, and values above 0.80 excellent reliability. Knowing the ICC, one can calculate the minimum detectable change (MDC 95%), the smallest change in score that exceeds measurement error and can therefore be considered a “real change.”
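The text does not give the formula used for the MDC 95%. A commonly used form, assumed here, derives it from the standard error of measurement: SEM = SD × sqrt(1 − ICC) and MDC 95% = 1.96 × sqrt(2) × SEM. The sketch below applies that assumed formula and the ICC bands quoted above; the function names are hypothetical, and the standard deviation in the example is an illustrative value, not a figure from the study.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC) (assumed, commonly used formula)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must be between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sd, icc):
    """Minimum detectable change at the 95% level (assumed formula)."""
    return 1.96 * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


def interpret_icc(icc):
    """Label an ICC using the bands quoted in the text."""
    if icc > 0.80:
        return "excellent"
    if icc >= 0.60:
        return "good"
    return "below the 'good' range"


# The SD of 2.0 is a hypothetical value, for illustration only.
print(round(mdc95(sd=2.0, icc=0.8862), 2))
print(interpret_icc(0.8862), interpret_icc(0.6126))
```

With a fixed standard deviation, a lower ICC inflates the SEM and therefore the MDC, which is why a less reliable retest yields a larger minimum detectable change.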
The significance of differences in means between the two independent groups (e.g., for age) was assessed with the t-test. The chi-square test was used to compare proportions (female to male ratio, floor/ceiling effect).
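For a comparison of proportions between two groups, the chi-square test reduces to a 2×2 table with one degree of freedom. The sketch below is a minimal pure-Python version under stated assumptions: no continuity correction is applied (the text does not say whether one was used), the p-value uses the exact one-degree-of-freedom relation with the complementary error function, and the counts are hypothetical. The t-test is not covered here.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 table (df = 1)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For one degree of freedom: P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts (rows: groups, columns: female/male), illustration only.
statistic, p = chi_square_2x2([[10, 20], [20, 10]])
print(round(statistic, 3), round(p, 4))
```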
Results
Demographics, floor and ceiling effect, data distribution, and response rate of the web-based survey
From September 2, 2013, to June 7, 2014, a total of 2318 respondents completed the WBS. Thirty-three participants were excluded as they were <18 years old, thus leaving the total of 2285 for analysis. One hundred and sixty-nine subjects filled in the PBS. After discarding questionnaires with too many missing values, the final number of subjects was 164. The mean age was significantly lower in the WBS group, and there were relatively fewer women. The demographics are presented in [Table 1].
Table 1: Demographic profile and floor and ceiling effect for tested groups
Floor and ceiling effect was similarly negligible in both groups [Table 1]. The mean value of COMI was higher in the PBS group. The Kolmogorov–Smirnov test demonstrated that the data were negatively skewed in both groups and were normally distributed only in the PBS group. During the survey period, 5710 sessions and 2318 complete submissions were recorded. Thus, the response rate for the WBS was 40.60%.
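The response rate quoted above follows directly from the two counts. A minimal Python check, with a hypothetical function name:

```python
def response_rate(completed, sessions):
    """Completed submissions as a percentage of recorded sessions."""
    if sessions <= 0:
        raise ValueError("sessions must be positive")
    return 100.0 * completed / sessions


# Counts reported for the WBS: 2318 complete submissions out of 5710 sessions.
print(f"{response_rate(2318, 5710):.2f}%")  # prints 40.60%
```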
Internal consistency, exploratory factor analysis, and convergent validity
The COMI had a good internal consistency when tested in both groups. The value of Cronbach's alpha was nearly identical: 0.8377 (lower 95% confidence interval [CI]: 0.8289) for the WBS and 0.8497 (lower 95% CI: 0.8179) for the PBS. The effect of dropping particular items on the alpha index was similar in both groups [Table 2].
The exploratory factor analysis yielded a single-factor structure in both groups. It accounted for 55.44% of variance in the WBS group and 57.32% in the PBS group. The COMI had very good convergent validity with the ODI in both groups [Figure 3]. Spearman's ρ was 0.752 (95% CI: 0.734–0.770) and 0.715 (95% CI: 0.631–0.782) for the WBS and PBS, respectively [Table 2].
Figure 3: Convergent validity of the Core Outcome Measures Index and the Oswestry Disability Index with different modes of application.
Ninety-three out of 169 eligible patients (55%) in the PBS group were retested within the desired period. In the WBS, this ratio was much lower; only 16 out of 2285 subjects (0.7%) submitted the questionnaire twice, as per the rules defined. This poor rate in the latter group had a negative impact on the quality of data. The ICC in the PBS group was 0.8862 (95% CI: 0.8333–0.923), while for the WBS group, it was 0.6126 (95% CI: 0.2071–0.8428) [Table 2]. The resulting MDC 95% was then 1.78 and 3.72, respectively. The mean COMI values for both applications are shown in [Figure 4].
Figure 4: The mean values of the Core Outcome Measures Index score on test and retest.
Discussion
The popularity of web surveys has grown with the internet. This method of data collection provides numerous advantages over other modes such as face-to-face or telephone interviews. It reduces the time and cost of data handling, as the respondents submit the results directly into the online database. Likewise, the internet makes distribution of surveys quick and easy. Recently, a new interest has developed in the use of web-based PROMs. The number of possible applications is increasing. They can be used to monitor outcomes in online therapeutic programs and to assess disability in web surveys. As most of them were originally designed for paper-based and/or face-to-face administration, the use of a new medium requires thorough research comparing the results of batteries of psychometric tests, as performed in the present study.
One of the key advantages of WBS is its easy access to desired populations that are difficult to reach through other media. In the present study, more than 2200 participants were recruited within a relatively short period (a little over 9 months). Using the internet as a medium, one can recruit subjects who do not seek any formal treatment and thus are “invisible” to the healthcare system. It is plausible to assume that the group actually seeing a physician would represent a more severely affected subset. This was observed in the present study, as the COMI was significantly higher in the PBS group. Another benefit of the WBS is much faster data processing, as the data are inserted directly into the database. Online form validation with client-side scripting (for example, using the jQuery library) helps improve the quality of gathered data by rejecting missing or inappropriate answers.
However, the risk of selection bias and questions of reliability and validity are the main disadvantages and limitations of WBSs. The present study tried to address these issues by comparing the results of psychometric tests of an established self-reported outcome measure using the WBS versus the PBS.
The response rate of 40.60% was higher than reported values for web-based questionnaires but a little lower than those for PBSs. Consistent with the literature, the respondents in the WBS group were significantly younger. The age-related recruitment bias in the extant studies could have a negative effect; however, this bias was not relevant to the present work, which evaluated psychometric properties. The difference in the proportion of males to females between the groups was not statistically significant. The floor/ceiling effect was equally negligible in both groups, having no effect on the results. The internal consistency, expressed by Cronbach's alpha, was nearly identical in both groups, with a narrower 95% CI in the WBS group owing to the larger sample. The benefit of the increased number of participants is also apparent in the assessment of convergent validity, as it allowed for greater precision in the evaluation of Spearman's coefficient of rank correlation (ρ). Interestingly, for extremely high COMI and ODI values, the correlation appeared random [Figure 3]. This was not observed in the PBS group, as only two respondents reached the maximum COMI score. Another statistical method, exploratory factor analysis, was used to assess variability among the observed items. It was aimed at detecting latent variables not observed in the PBS group. It revealed a one-dimensional structure, similar in both groups. The strong agreement between the two groups contradicts one of the preconceptions regarding web surveys, namely that their reliability is decreased by nonserious respondents. On the other hand, anonymity may improve the honesty of respondents by reducing social desirability bias, particularly in the case of highly sensitive questions.
An essential element of an instrument's evaluation is test-retest reliability. In this aspect, the WBS was far less reliable than the PBS, mainly because only a small number of participants filled in the questionnaire twice within the desired period, which did not allow for sufficient precision of measurement. This is a clear drawback of the anonymous nature of WBSs. An effective incentive encouraging retesting would help resolve this issue.
Conclusion
Psychometric testing of the web-based COMI shows that the open, anonymous WBSs on back pain disability produce valid and reliable data. The COMI, a primary instrument used in this study, is well suited for web-based administration.
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.
References
Macea DD, Gajos K, Daglia Calil YA, Fregni F. The efficacy of web-based cognitive behavioral interventions for chronic pain: A systematic review and meta-analysis. J Pain 2010;11:917-29.
Schulz PJ, Rubinelli S, Hartung U. An internet-based approach to enhance self-management of chronic low back pain in the Italian-speaking population of Switzerland: Results from a pilot study. Int J Public Health 2007;52:286-94.
Wilk V, Palmer HD, Stosic RG, McLachlan AJ. Evidence and practice in the self-management of low back pain: Findings from an Australian internet-based survey. Clin J Pain 2010;26:533-40.
Wyatt JC. When to use web-based surveys. J Am Med Inform Assoc 2000;7:426-30.
Kim JG, Choi B. Measurement precision for Oswestry back pain disability questionnaire versus a web-based computer adaptive testing for measuring back pain. J Back Musculoskelet Rehabil 2015;28:145-52. [Doi: 10.3233/BMR-140502].
Gosling SD, Vazire S, Srivastava S, John OP. Should we trust web-based studies? A comparative analysis of six preconceptions about internet questionnaires. Am Psychol 2004;59:93-104.
Black N. Patient reported outcome measures could help transform healthcare. BMJ 2013;346:f167.
McDonald R, Kristensen SR, Zaidi S, Sutton M, Todd S, Konteh F, et al. Evaluation of the Commissioning for Quality and Innovation Framework Final Report; 2013.
Aaronson N, Alonso J, Burnam A, Lohr KN, Patrick DL, Perrin E, et al. Assessing health status and quality-of-life instruments: Attributes and review criteria. Qual Life Res 2002;11:193-205.
Terwee CB, Mokkink LB, Knol DL, Ostelo RW, Bouter LM, de Vet HC. Rating the methodological quality in systematic reviews of studies on measurement properties: A scoring system for the COSMIN checklist. Qual Life Res 2012;21:651-7.
Beaton DE, Hogg-Johnson S, Bombardier C. Evaluating changes in health status: Reliability and responsiveness of five generic health status measures in workers with musculoskeletal disorders. J Clin Epidemiol 1997;50:79-93.
Mannion AF, Porchet F, Kleinstück FS, Lattig F, Jeszenszky D, Bartanusz V, et al. The quality of spine surgery from the patient's perspective: Part 2. Minimal clinically important difference for improvement and deterioration as measured with the Core Outcome Measures Index. Eur Spine J 2009;18 Suppl 3:374-9.
Miekisiak G, Banach M, Kiwic G, Kubaszewski L, Kaczmarczyk J, Sulewski A, et al. Reliability and validity of the Polish version of the Core Outcome Measures Index for the neck. Eur Spine J 2014;23:898-903.
Zweig T, Mannion AF, Grob D, Melloh M, Munting E, Tuschel A, et al. How to Tango: A manual for implementing Spine Tango. Eur Spine J 2009;18 Suppl 3:312-20.
Fairbank JC, Pynsent PB. The Oswestry Disability Index. Spine (Phila Pa 1976) 2000;25:2940-53.
Miekisiak G, Kollataj M, Dobrogowski J, Kloc W, Libionka W, Banach M, et al. Validation and cross-cultural adaptation of the Polish version of the Oswestry Disability Index. Spine (Phila Pa 1976) 2013;38:E237-43.
Eysenbach G. Improving the quality of web surveys: The Checklist for Reporting Results of Internet E-Surveys (CHERRIES). J Med Internet Res 2004;6:e34.
Miekisiak G, Kollataj M, Dobrogowski J, Kloc W, Libionka W, Banach M, et al. Cross-cultural adaptation and validation of the Polish version of the Core Outcome Measures Index for low back pain. Eur Spine J 2013;22:995-1001.
Damasceno LH, Rocha PA, Barbosa ES, Barros CA, Canto FT, Defino HL, et al. Cross-cultural adaptation and assessment of the reliability and validity of the Core Outcome Measures Index (COMI) for the Brazilian-Portuguese language. Eur Spine J 2012;21:1273-82.
Andresen EM. Criteria for assessing the tools of disability outcomes research. Arch Phys Med Rehabil 2000;81 12 Suppl 2:S15-20.
McHorney CA, Tarlov AR. Individual-patient monitoring in clinical practice: Are available health status surveys adequate? Qual Life Res 1995;4:293-307.
Terwee CB, Bot SD, de Boer MR, van der Windt DA, Knol DL, Dekker J, et al. Quality criteria were proposed for measurement properties of health status questionnaires. J Clin Epidemiol 2007;60:34-42.
Nunnally JC, Bernstein IH. Psychometric Theory. New York: McGraw-Hill; 1994.
Pickering PM, Osmotherly PG, Attia JR, McElduff P. An examination of outcome measures for pain and dysfunction in the cervical spine: A factor analysis. Spine (Phila Pa 1976) 2011;36:581-8.
Van Ommeren M. Validity issues in transcultural epidemiology. Br J Psychiatry 2003;182:376-8.
DeVon HA, Block ME, Moyle-Wright P, Ernst DM, Hayden SJ, Lazzara DJ, et al. A psychometric toolbox for testing validity and reliability. J Nurs Scholarsh 2007;39:155-64.
Feise RJ, Michael Menke J. Functional rating index: A new valid and reliable instrument to measure the magnitude of clinical change in spinal conditions. Spine (Phila Pa 1976) 2001;26:78-86.
Groves RM. Nonresponse rates and nonresponse bias in household surveys. Public Opin Q 2006;70:646-75.
Shrout PE, Fleiss JL. Intraclass correlations: Uses in assessing rater reliability. Psychol Bull 1979;86:420-8.
Braithwaite D, Emery J, De Lusignan S, Sutton S. Using the internet to conduct surveys of health professionals: A valid alternative? Fam Pract 2003;20:545-51.
Huang HM. Do print and web surveys provide the same results? Comput Human Behav 2006;22:334-50.
Gakhar H, McConnell B, Apostolopoulos AP, Lewis P. A pilot study investigating the use of at-home, web-based questionnaires compiling patient-reported outcome measures following total hip and knee replacement surgeries. J Long Term Eff Med Implants 2013;23:39-43.
Wilson J, Arshad F, Nnamoko N, Whiteman A, Ring J, Roy B. Patient-reported outcome measures: An on-line system empowering patient choice. J Am Med Inform Assoc 2014;21:725-9.
Carpenter KM, Stoner SA, Mundt JM, Stoelb B. An online self-help CBT intervention for chronic lower back pain. Clin J Pain 2012;28:14-22.
Yamada K, Matsudaira K, Takeshita K, Oka H, Hara N, Takagi Y. Prevalence of low back pain as the primary pain site and factors associated with low health-related quality of life in a large Japanese population: A pain-associated cross-sectional epidemiological survey. Mod Rheumatol 2014;24:343-8.
Burkey J, Kuechler WL. Web-based surveys for corporate information gathering: A bias-reducing design framework. Prof Commun IEEE Trans 2003;46:81-93.
Van Gelder MM, Bretveld RW, Roeleveld N. Web-based questionnaires: The future in epidemiology? Am J Epidemiol 2010;172:1292-8.
Shih TH, Fan X. Comparing response rates from web and mail surveys: A meta-analysis. Field Methods 2008;20:249-71.
Klovning A, Sandvik H, Hunskaar S. Web-based survey attracted age-biased sample with more severe illness than paper-based survey. J Clin Epidemiol 2009;62:1068-74.
Buchanan T. Potential of the Internet for personality research. In: Psychological Experiments on the Internet. San Diego: Academic Press; 2000. p. 121-40.
Tourangeau R, Couper MP, Steiger DM. Humanizing self-administered surveys: Experiments on social presence in web and IVR surveys. Comput Human Behav 2003;19:1-24.