CIN: Computers, Informatics, Nursing: November/December 2010 - Volume 28 - Issue 6
doi: 10.1097/NCN.0b013e3181f698fd
Continuing Education

Paradata: A New Data Source From Web-Administered Measures



Author Information

Author Affiliations: School of Nursing, Hashemite University, Zarqa, Jordan (Dr Sowan); and Institute for Educators in Nursing and Health Professions, University of Maryland School of Nursing, Baltimore (Dr Jenkins).

Disclaimer: The authors declare no conflict of interest.

Corresponding author: Louise S. Jenkins, PhD, RN, FAHA, University of Maryland School of Nursing, Suite 311N, 655 W Lombard St, Baltimore, MD 21201.



Web administration of measures offers numerous advantages as well as some drawbacks; the efficiency of collecting data in this way is dramatic. An important by-product of Web administration of measures is the option of creating paradata that offer information about how respondents access a measure (server-side paradata) and navigate within the online environment (client-side paradata) to complete the measure. Paradata can play a critical role in developing and piloting measures as well as refining the measurement process. Uses of paradata in Web-administered measures include (1) informing the choice of response formats, (2) examining the extent of changing response options, (3) examining the extent of following a prescribed sequence in completing a measure, (4) tracking the response process, (5) aiding in designing a Web-administered measure and its layout, and (6) assisting in determining the most appropriate log-in procedure. Because of the potential value of this new type of useful data to researchers in nursing and health, this article focuses on paradata within the context of Web-administered measures. More specifically, the article focuses on the definition, generation, and uses of paradata, as well as the ethical issues and other concerns in obtaining and using paradata. Uses of paradata to test the usability of information systems used in nursing and health practices are also included.

Web-administered measures were first used as an innovative and promising methodology for measuring attributes and collecting research data in the mid-1990s.1 The increase in computer literacy, the proliferation of e-mail as a dominant method of communication, and continuous improvement in computer hardware and software support the increased use of Web-administered measures. The value of introducing new measurement modes includes the possibility of utilizing new capabilities and research opportunities that were not available in previous approaches.2

An important by-product of Web administration of measures is the option of creating paradata, which are computer-generated, selected pieces of information about various aspects of how respondents accessed the measure, as well as information about what actions they took while responding to the measure. This information reflects the usability of the design features of Web-administered measures, which has a great potential in the piloting phase of the measure.

The use of quality measures is critical for advancing nursing science. Reliability (consistency across items, time, or raters) and validity (the instrument measures what it purports to measure) are key concepts in nursing measurement focusing on minimizing random and systematic errors. Unlike the conventional types of measures (eg, paper-and-pencil), the usability of the design features of Web-administered measures contributes fundamentally to the reliability and validity of the measure. Paradata can play a critical role in creating and piloting user-friendly measures by testing different design features of Web-administered measures. These features affect accessing and navigating the instrument by the respondent and include the operating system used to create the measure, the choice of response format, the log-in procedure selected, the use of multimedia features such as automatic skip patterns, following a prescribed sequence, instrument layout, the compatibility between the respondent computer and the program used to create the measure, and the use of images and graphs. Constructing a respondent-friendly design of Web-administered measures entails creating a measure with minimal errors in coverage, nonresponse, and measurement.3,4

Because of the potential value of paradata, this article focuses on (1) defining paradata, (2) generating and collecting paradata, (3) uses of paradata, and (4) the implications and concerns, including ethical concerns, about using paradata. A review of the advantages and disadvantages of Web-administered measures is also offered. In addition, a separate section is included that addresses the use of paradata in testing usability of clinical information systems and the role of such data in improving safety and providing quality care.

A measure is an instrument or a device designed to quantify a specific attribute.5 In this article, we refer to measures that are either being developed specifically for Web administration or modifications of existing (often in paper-and-pencil format) measures that are being pilot tested and refined for Web administration. If refining is the purpose, attention to obtaining appropriate permissions and abiding by copyright laws is imperative. Measures administered via the Web have the advantages of being cost-effective and efficient in terms of time and resources, as they allow instant distribution and timely return and facilitate reaching geographically distant populations.6,7 One study reported that the cost of the Web-administered measure used was 38% less than that of the corresponding mailed survey.8

The Web provides capabilities and distinct advantages that are not available in other modes of administration for measures. These advantages can be used to increase respondent-instrument interaction and aid in reducing random and systematic measurement errors. The use of multimedia features of Web-administered measures to generate easy-to-navigate designs (eg, automatic skip patterns) may decrease individual variations in responding to measures and thereby decrease random errors. Furthermore, these features may enhance data validity by different means, such as dynamic error-checking capability and prohibiting respondents from providing out-of-range values or accidentally skipping one or more items. In addition, deciphering handwriting is not a problem in data entry and analysis. Data are directly entered and stored into a database file for analysis, which eliminates subsequent data-entry and coding errors.

An additional and attractive advantage of Web-administered measures is the possibility of yielding paradata that can track respondent actions in responding to a measure during the data collection process. Because of their potential benefits and the type of information they provide, specifically in the piloting phase of a measure, paradata are the focus of this article.

The term paradata was coined by Couper9 as a means to evaluate the usability of the design of Web-administered measures. "Paradata are auxiliary data about the process of data collection and include keystroke files and time stamps."10(p18) Paradata are primarily used to provide information about the process of accessing a Web-administered measure as well as the behaviors of respondents in answering the items. There are two types of paradata: (1) server-side paradata that provide information about how respondents access the measure itself and (2) client-side paradata that include information about how respondents navigate the individual items on a measure once it is accessed.11

Server-side paradata are recorded as a log (or computer) file on the server used to store the measure during the process of data collection.11 These paradata are captured at the level of the entire measure rather than at the level of the individual items. Examples of server-side paradata captured include the number of visits to the measure, time spent in each visit, respondents' identifiers (if used) such as username/password or personal identification number (PIN), and the Internet protocol (IP) address used to access the measure. Server-side paradata could also record respondent browser name and operating system version. These last two elements of paradata could be used to estimate the speed of downloading the measure which, in turn, may affect the response time and the respondent decision to complete the measure. Therefore, these paradata may provide information about the response burden in terms of accessing and responding to the measure using different Web browsers and operating systems.
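
As an illustration of how server-side paradata of this kind might be summarized, the following is a minimal sketch in JavaScript. The record layout (pin, ip, userAgent, startMs, endMs) and the sample values are assumptions made for the example; an actual server log would need to be parsed into a comparable structure first.

```javascript
// Hypothetical server-side log entries: one record per visit to the measure.
// Field names and values are assumptions for illustration only.
const visitLog = [
  { pin: "A17", ip: "10.0.0.5", userAgent: "Firefox/3.6 (Windows NT 5.1)", startMs: 0, endMs: 240000 },
  { pin: "A17", ip: "10.0.0.5", userAgent: "Firefox/3.6 (Windows NT 5.1)", startMs: 900000, endMs: 1020000 },
  { pin: "B42", ip: "10.0.0.9", userAgent: "MSIE 8.0 (Windows NT 6.1)", startMs: 100000, endMs: 400000 },
];

// Summarize visits per respondent identifier: number of visits,
// total time spent across visits, and the browsers/operating systems seen.
function summarizeVisits(entries) {
  const summary = new Map();
  for (const entry of entries) {
    if (!summary.has(entry.pin)) {
      summary.set(entry.pin, { visits: 0, totalMs: 0, userAgents: new Set() });
    }
    const s = summary.get(entry.pin);
    s.visits += 1;
    s.totalMs += entry.endMs - entry.startMs;
    s.userAgents.add(entry.userAgent);
  }
  return summary;
}

const visitSummary = summarizeVisits(visitLog);
```

Grouping by identifier is only possible when a log-in procedure supplies one; without it, the IP address would be the closest available key, with the caveat that several respondents may share an address.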

Client-side paradata relate to collection of information at the item level.11 The behavioral patterns in answering the items are recorded in a log file, not visible to respondents, on the Web page containing the measure and accompany the related items. Client-side paradata are used to detect items with measurement problems, provide insights about the relationship between respondent characteristics and response-format preference, track the process of responding to items and connect it to the data quality, and/or to perform usability testing of certain design features of the measure.12 In some instances, a technical problem may prohibit a complete download of a measure when a respondent clicks the hyperlink to access the measure. Using client-side paradata, a download test can be programmed as a JavaScript function (Oracle, Redwood Shores, CA) to examine if a complete download of the measure was achieved before responding to the items to prevent the loss of the respondents' answers.13

Client-side paradata can also provide information such as how many times respondents changed their responses to a specific item, how long it took them to respond to each item and to the entire instrument, whether the intended sequence in answering the items was followed, and if respondents followed skip pattern directions.11,12 Such information may be particularly helpful for researchers to use in piloting or refining Web-administered measures.
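
Two of the item-level questions above, how often an answer was changed and whether the intended sequence was followed, can be sketched as simple computations over an event log. The log layout (item, value, tMs) and the function names are hypothetical and serve only to make the idea concrete.

```javascript
// Hypothetical client-side event log: one record per answer action, in time order.
const answerEvents = [
  { item: "q1", value: "2", tMs: 4000 },
  { item: "q2", value: "1", tMs: 9000 },
  { item: "q2", value: "3", tMs: 12000 }, // answer to q2 changed
  { item: "q4", value: "1", tMs: 20000 }, // q3 skipped for now
  { item: "q3", value: "2", tMs: 26000 }, // respondent went back to q3
];

// Count how many times each item's answer was changed after first being set.
function countAnswerChanges(events) {
  const lastValue = new Map();
  const changes = new Map();
  for (const e of events) {
    if (!changes.has(e.item)) changes.set(e.item, 0);
    if (lastValue.has(e.item) && lastValue.get(e.item) !== e.value) {
      changes.set(e.item, changes.get(e.item) + 1);
    }
    lastValue.set(e.item, e.value);
  }
  return changes;
}

// Check whether items were first answered in the prescribed order.
// Items never answered are ignored rather than counted as violations.
function followedSequence(events, prescribedOrder) {
  const firstAnswered = [];
  for (const e of events) {
    if (!firstAnswered.includes(e.item)) firstAnswered.push(e.item);
  }
  const expected = prescribedOrder.filter((item) => firstAnswered.includes(item));
  return expected.every((item, i) => item === firstAnswered[i]);
}
```

With the sample log, q2 registers one change and the prescribed order q1 through q4 is not followed, because q4 was answered before q3.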

Server-side paradata can be generated as an automatic by-product of administering a measure via the Web, whereas client-side paradata require programming in a scripting language, such as JavaScript, for collection.12 Client-side paradata can be recorded when the HTML Web page of an instrument contains JavaScript and when the instrument items are set up with JavaScript triggers to generate the paradata. Using JavaScript, only meaningful actions can be recorded. These actions may include clicking a radio button response format, selecting a response option from a drop-down menu, writing in a text area, accessing a hyperlink, or submitting the completed questionnaire.11

The first step in generating client-side paradata is to decide on the meaningful actions in the measure, which may include answering specific questions in the measure that are of interest to the researcher. A second step is to code the information that may result from these actions, such as the name of the question in the measure (eg, q2 for question 2) and the values of the response options of this question (eg, 1 and 2 for a question with two radio button options). After that, a researcher should identify and link HTML-JavaScript code to each response option or piece of information that will be recorded.11

Paradata are collected with the starting and ending time (in milliseconds) it took the respondent to perform each function and are stored in a string format. Paradata can be extracted to a processor file where data of interest can be analyzed.12 Information about using JavaScript for recording client-side paradata can be found in the Client-Side Paradata Project Web page at∼u0034437/public/csp.htm. This Web site provides two output files of paradata that can be obtained by filling out and submitting a Web survey by the visitor of this Web page in addition to a description of the latest version of client-side paradata.
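
The steps described in this section can be sketched as a small recorder that appends each meaningful action, with its elapsed time in milliseconds, to a string. The encoding used here ("question=value@elapsed", separated by semicolons) and the name createParadataRecorder are assumptions for illustration; they are not the format or code of the Client-Side Paradata Project.

```javascript
// Minimal recorder: each meaningful action is appended to a string as
// "<question>=<value>@<elapsed ms>", separated by semicolons.
// The clock is injectable so the behavior can be checked reproducibly.
function createParadataRecorder(now = () => Date.now()) {
  const startMs = now();
  let log = "";
  return {
    // Intended to be called from an event handler attached to a response option.
    record(question, value) {
      const elapsedMs = now() - startMs;
      log += (log ? ";" : "") + question + "=" + value + "@" + elapsedMs;
    },
    // The resulting string could be stored in a hidden form field
    // and submitted together with the answers.
    toString() {
      return log;
    },
  };
}

// Usage with a fake clock (hypothetical values).
let fakeTimeMs = 1000;
const recorder = createParadataRecorder(() => fakeTimeMs);
fakeTimeMs = 3500;
recorder.record("q2", 1);
fakeTimeMs = 5200;
recorder.record("q2", 2);
// recorder.toString() is now "q2=1@2500;q2=2@4200"
```

In a Web page, record would typically be called from click or change handlers registered on the radio buttons, drop-down menus, and text areas chosen as meaningful actions in the first step.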

One of the major uses of paradata is to examine the effect of the design of a Web-administered measure on measurement errors and data quality. Paradata can be used to inform the choice of the response formats, examine the extent of changing response options and the extent of following a prescribed sequence in completing the measure, track the response process, examine a respondent-friendly design of a Web-administered measure, design the instrument layout, and to help select the appropriate log-in procedure. These uses of paradata are particularly important for piloting Web-administered measures.

The Choice of Response Formats

Web-administered measures typically use three ways to indicate a response to items: (1) radio buttons, (2) drop-down boxes, and (3) text areas. Each of these has advantages and drawbacks; accordingly, the choice should be guided by the types of items used as well as the type of the respondents.14 Radio buttons allow visible response options and support designing a format that is similar to traditional mail instruments.6 Drop-down boxes are unique to Web-administered measures and used for items with long lists. Text areas, which are commonly used for open-ended items, are not unique to Web-administered measures. However, when using text areas in Web-administered measures, the size of the box should be large enough to accommodate the required information to avoid the need to scroll to read the input.6

In two studies, paradata were used to examine the effect of using drop-down boxes versus radio buttons on the time and rate of instrument completion. Results revealed that drop-down boxes were more difficult to use as compared with radio buttons and took significantly longer time to provide an answer because they involve two mouse clicks.11,14

Changing Item Responses

Paradata were also used to investigate how many times respondents changed their answers to items and how many times each answering option was changed.11 Results revealed that changing answers was more frequent for items with radio button options than for items with drop-down boxes. This was attributed to greater time being required to read and select an option from a drop-down box than clicking a radio button.

Changing a response option can be related to three main factors: (1) respondent characteristics (eg, initial misreading or misunderstanding of an item or dissatisfaction with the option first selected), (2) measurement characteristics (ie, a problem in the item or the response formats), and/or (3) contextual effects resulting in distractions while completing a measure. Since researchers have limited control over contextual effects, these factors should be considered a standard "margin of error" that should always be taken into account in interpreting the results obtained from analyzing paradata about respondents' actions in completing the measure.

Following a Prescribed Sequence in Completing the Measure

Not following the sequence of item presentation while completing the measure may reflect some problems in the design of a Web-administered measure or may be related to the type of response format(s) used. Using paradata to examine the relationship between the use of specific response formats and following the intended sequence in completing a questionnaire, it was found that the use of drop-down boxes produced significantly better adherence to the prescribed order than radio buttons.11 On the flip side, and as noted earlier, the use of drop-down boxes increases the amount of time for completing a measure.

Tracking the Response Process

In mailed questionnaires, it is hard to determine whether some subjects are true nonresponders or whether they ever received the measure.15 Researchers may use various follow-up contacts for this purpose and still may not be able to distinguish those who did not receive the measure from those who received it and decided not to respond. Using paradata offers a substantial advantage in tracking the response process to accurately classify the missing data and response behaviors.

Through the use of paradata, Bosnjak and Tuten15 identified seven distinct response patterns in research studies using Web-administered measures. Table 1 provides the types and definitions of these response patterns. These authors argue that this categorization of the response process in Web-administered measures is more accurate than the one used in conventional-type measures (eg, paper-and-pencil) that depends on classifying respondents as complete participants (no missing data), unit nonresponders (those who did not respond), and item nonresponders (submitted the questionnaire with missing responses to some items).

[Table 1: Types and definitions of response patterns identified by Bosnjak and Tuten]

From a research design perspective, unit nonresponders and lurking dropouts (Table 1) can be treated equally as cases with complete missing data. Similarly, answering dropouts and item nonresponders are cases with some missing data. However, from a measurement perspective, there is a difference between answering dropouts and item nonresponders. Answering dropouts could be related to respondent characteristics (eg, respondent motivation or interest in the topic) that may introduce a random source of error that affects the reliability of the measure and, in turn, threatens validity, whereas item nonresponders may be related to items with measurement problems that introduce a systematic source of error that does not alter reliability but does affect the validity of the measure. Similarly, unit nonresponders may reflect less motivated subjects or those facing technical difficulties that prohibited participation.
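
A classification of this kind can be sketched from two counts that paradata make available: how many items a respondent viewed and how many were answered. The definitions in Table 1 are not reproduced in this text, so the rules below are assumptions based on how these pattern names are commonly described; the labels "complete responder" and "lurker" and the function name classifyResponsePattern are likewise assumed for the example.

```javascript
// Classify a respondent from paradata counts (assumed decision rules):
//   itemsTotal    - number of items in the measure
//   itemsViewed   - number of items displayed to the respondent
//   itemsAnswered - number of items the respondent answered
function classifyResponsePattern({ itemsTotal, itemsViewed, itemsAnswered }) {
  if (itemsViewed === 0) return "unit nonresponder";
  const viewedAll = itemsViewed === itemsTotal;
  if (itemsAnswered === 0) {
    // Viewed items without answering any of them.
    return viewedAll ? "lurker" : "lurking dropout";
  }
  const answeredAllViewed = itemsAnswered === itemsViewed;
  if (viewedAll) {
    return answeredAllViewed ? "complete responder" : "item nonresponder";
  }
  // Left the measure before viewing every item.
  return answeredAllViewed ? "answering dropout" : "item nonresponder dropout";
}
```

For example, a respondent who viewed 12 of 20 items and answered all 12 would be labeled an answering dropout under these rules, whereas one who viewed all 20 and answered 17 would be labeled an item nonresponder.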

Clearly, there are different contextual factors that may affect the response pattern, specifically answering dropouts, item nonresponders, and item nonresponder dropouts. These may include respondent multitasking, experiencing interruptions while accessing or completing a measure, respondent emotional response to an item, and/or not understanding an item. Although paradata can be used to categorize the response patterns, since paradata are created from actions/behaviors of respondents as they access and navigate through completion of a measure, these data cannot explain the contextual factors affecting the response process. However, client-side paradata can be used to minimize the long time lags due to interruptions by designing and activating a pop-up screen to remind respondents to continue filling out the instrument after a certain time of inactive status.13
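
The inactivity reminder mentioned above rests on a simple check: has more than a set amount of time passed since the respondent's last action? A minimal sketch follows; the name createIdleTracker, the 60-second limit, and the injectable clock are assumptions for the example, and the reminder screen itself is left out.

```javascript
// Track the time of the last respondent action and report whether the
// respondent has been inactive for at least idleLimitMs milliseconds.
function createIdleTracker(idleLimitMs, now = () => Date.now()) {
  let lastActivityMs = now();
  return {
    // Call whenever the respondent clicks, types, or selects an option.
    activity() {
      lastActivityMs = now();
    },
    // True once the inactivity limit has been reached.
    isIdle() {
      return now() - lastActivityMs >= idleLimitMs;
    },
  };
}

// Usage with a fake clock (hypothetical values).
let idleClockMs = 0;
const idleTracker = createIdleTracker(60000, () => idleClockMs);
idleClockMs = 59999; // idleTracker.isIdle() is false
idleClockMs = 60000; // idleTracker.isIdle() is true
```

In a Web page, isIdle could be polled at a fixed interval, with the reminder shown when it first returns true and activity called when the respondent dismisses it.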

Examining a Respondent-Friendly Design

A respondent-friendly design of Web-administered measures has a great potential in decreasing errors of coverage, nonresponse, and measurement.3,4 A respondent-friendly design aims to construct a measure in a way that will increase the likelihood that respondents will receive the instrument and provide the answers in the way anticipated by its developer.16 Paradata can be of particular importance in designing a respondent-friendly measure.

Paradata can be used to examine the accessibility of a Web-administered measure through examining the compatibility between respondents' browsers and operating systems and the computer program used to design the measure. Also, this can be achieved through applying a download test that records whether a respondent received the complete form of the instrument and the time it took a respondent to download the instrument. This, in turn, may decrease the coverage and nonresponse errors.
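
One way to picture such a download test is as a comparison between the items the instrument should contain and the items actually present once the page has loaded, together with the elapsed download time. The sketch below is an assumption about how this could be done, not the implementation used in the cited work; the function name and the item names are hypothetical.

```javascript
// Compare expected item names with those found on the loaded page and
// report completeness, any missing items, and the download time.
function downloadTest(expectedNames, loadedNames, requestMs, loadedMs) {
  const loaded = new Set(loadedNames);
  const missing = expectedNames.filter((name) => !loaded.has(name));
  return {
    complete: missing.length === 0,
    missing,
    downloadMs: loadedMs - requestMs,
  };
}

// Usage (hypothetical values): q3 never arrived.
const downloadResult = downloadTest(["q1", "q2", "q3"], ["q1", "q2"], 1000, 4200);
// downloadResult.complete is false, downloadResult.missing is ["q3"],
// and downloadResult.downloadMs is 3200
```

In a Web page, the list of loaded names could be gathered from the form's input elements after the page finishes loading, and responding could be blocked until the test reports a complete download.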

The use of paradata is also important in conducting usability testing of different design features that are used to create a respondent-friendly, Web-administered measure. Unlike other types of measures, the use of multimedia and advanced design features facilitates applying different navigational aids, such as symbols or images, that are attractive to use and may thereby enhance completion and increase the response rate. On the other hand, if not carefully designed, these aids may confuse the respondent or create incompatibility problems that may increase response burden, introduce a source of systematic error, and increase the nonresponse and coverage errors.6,17 Differences among computers in terms of their connection speeds, browsers, and amount of memory may introduce differences in the visual layout of the measure. In a study comparing plain versus fancy versions of the same instrument, it was recommended to restrict the use of graphical features. Respondents of the fancy version had significantly more missing data and a lower response rate and required 37% more time (P < .05) for completing the instrument.18

The use of paradata cannot substitute for sound theoretical grounding of a measure and the careful attention to all aspects of instrument development. Likewise, although paradata can offer helpful information in the process of adopting/refining an existing measure, there are many other relevant considerations such as permissions and copyright and previous work on the measure.5

Designing the Instrument Layout

Web-administered measures can support three layout design options: (1) screen-by-screen (screen-based or dynamic/interactive), (2) scroll-based or flat-file instrument,19 and (3) multiple-items-per-page layout.20 A screen-based layout presents one item per page and is mainly used for the purpose of maintaining the order of items during the response process, whereas in a scroll-based layout, respondents can view all items at one time. In the presence of slow data transmission, advanced design features such as automated skip patterns are more difficult to program in a scroll-based layout than in a screen-based layout.19 The multiple-items-per-page layout groups related items on the same page. A careful selection of the instrument layout is critical to prevent "the loss of the context" that may result from separating related items into different screens.21

In a study that examined the difference in response rate between screen- and scroll-based layouts with a sample of social sciences faculty, scroll-based design resulted in a higher submission rate and lower mean time of completion, but higher item nonresponse. Although the screen-based layout made it easy for the respondents to cognitively process the tool and answer the items, the authors recommended using multiple-item-per-screen layout as the "middle solution" between the screen- and scroll-based layouts.19 Similarly, other studies recommended using a multiple-item-per-screen layout.8 This layout resulted in less item nonresponse and faster completion time versus the single-item-per-screen layout.20

The Effect of Using Different Log-in Procedures

Unlike the conventional-type measures, log-in procedures are used in Web-administered measures to limit access to intended respondents and to prevent multiple completions by the same individual. Log-in procedures can influence data quality in terms of the number of items answered, the amount of time spent to complete the measure, and the amount of information provided to sensitive items. There are three possible log-in procedures to access a Web-administered measure: (1) automatic (no access code to be keyed in), (2) semiautomatic (the use of one access code), and (3) manual (the use of two access codes). Access codes may include a PIN code or a username-password combination.22 There is a possibility of access error messages in semiautomatic and manual log-in procedures, specifically with access codes consisting of letters and digits. One study reported that some subjects faced difficulties in accessing the instrument because they mistakenly typed the digit zero "0" for the letter "O" and the digit one "1" for the letter "l."8

Crawford et al23 examined the effect of using manual versus automatic log-in procedures on response burden, nonresponse rate, and time spent to complete the measure. The automatic log-in resulted in less response burden in accessing the measure as evidenced by increasing the response rate. On the other hand, automatic access may result in a lower sense of confidentiality for the respondent and may decrease respondent motivation to proceed in the responding process.23 It may also result in more influence from social desirability if anonymity of responses is in question. In turn, this may affect the amount of cognitive effort respondents are willing to expend in providing accurate responses. Therefore, the authors recommended using a manual log-in procedure to enhance data quality.

The possibility of data security breaches is higher in automatic log-in as compared with semiautomatic and manual log-in. Heerwegh and Loosveldt22 reported nine attempts to access their measure using nonexisting PINs that were captured using server-side paradata. Therefore, it is important to weigh the advantages and disadvantages of various log-in procedures and select the one that is most suitable for the intended respondents.

Selected types of paradata may also be used to check for multiple completions of the measure by the same respondent, which affects the validity of the collected data and also violates the "independence of observation," a common assumption for statistical analyses. While measuring quality of life, Bell and Kahn24 used paradata to identify multiple completions by connecting IP addresses with the respondent's age and sex. Entries that matched in these variables were considered multiple completions and were excluded from the analysis.
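
The matching approach described above can be sketched as follows. The field names and sample records are hypothetical, and the choice to keep the first of several matching entries while flagging the later ones is an assumption of this example; the source does not state whether any of the matched entries were retained.

```javascript
// Flag submissions whose IP address, age, and sex match an earlier submission.
function flagMultipleCompletions(submissions) {
  const seen = new Set();
  return submissions.map((s) => {
    const key = [s.ip, s.age, s.sex].join("|");
    const duplicate = seen.has(key);
    seen.add(key);
    return { ...s, duplicate };
  });
}

// Usage (hypothetical records).
const flaggedSubmissions = flagMultipleCompletions([
  { id: 1, ip: "10.0.0.5", age: 34, sex: "F" },
  { id: 2, ip: "10.0.0.5", age: 61, sex: "M" }, // same address, different person
  { id: 3, ip: "10.0.0.5", age: 34, sex: "F" }, // matches entry 1
]);
const retained = flaggedSubmissions.filter((s) => !s.duplicate);
```

Combining the IP address with demographic variables reduces the chance of discarding distinct respondents who share a computer or network, although it cannot rule out two people of the same age and sex using the same address.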

Despite the numerous advantages of Web-administered measures, the adoption of this methodology for data collection is limited in nursing and healthcare research, and its use is rarely discussed in nursing literature.25 This may be related to the lack of sufficient skills and support to develop and collect data using this kind of measure and/or to insufficient knowledge about their various advantages.26 With paradata, acquiring the skills needed to generate and use log files, particularly for client-side information, may be a challenge.

Information systems and computerized applications are increasingly used in nursing and health to provide quality care that is evidence based, patient centered, and cost-effective. On the other hand, clinical information systems are risky investments. Poorly designed systems may compromise the quality of care by increasing errors and cost of care and decreasing user productivity.27 Clinicians' acceptance and adoption of information systems are based on implementing tools that are efficient, easy to use, and fit into the workflow.28 Therefore, usability assessment of such systems is critical for overall system success. Many of these systems are designed with built-in log files that automatically record data that are useful for understanding the navigation behaviors of the users. Paradata can serve this purpose and provide indications regarding competencies needed by users for effective use of the system.

Rozic-Hristovski and colleagues29 used log files to understand the information-seeking pattern of users of a medical library Web site. The analysis of paradata helped future development of the Web site and decreased the number of clicks to access important information. Similarly, Müller et al30 evaluated the frequency of using Health on the Net medical media search engine by medical professionals. Analysis of log files revealed that users expressed general terms and broad concepts for queries rather than precise terms, which produced poor search results.

Cimino and colleagues31 used log files to evaluate patients' access to their electronic medical records. Data provided indications about the way patients think of their health and the most frequently reviewed data and functions used by patients. In another study, Borgne-Uguen et al32 evaluated the use of shared patient records within a healthcare network by healthcare professionals and the extent of information exchange between different medical specialties. Analysis of log files demonstrated (1) the limited use of patients' records by a small group of healthcare professionals, and (2) a "looser hierarchy" between healthcare and social services, in that data entered by some professionals were not used/viewed by others.

Jalloh and Waitman33 used log files to understand search queries used by clinicians to select an order from a computerized provider order-entry system (CPOE). Keystroke logs of the selected orders accompanied by the timing of each query performed by users were recorded and analyzed. Based on this information, queries were optimized by placing most frequently selected orders at the top of the list, which resulted in 16.3% reduction of order selection time. Log files have also been used to evaluate the usability of two user interface designs for delivering decision support materials within a CPOE system.34 Results showed that highlighting the availability of context-sensitive educational materials through visible hyperlinks significantly increased the utilization rate of such materials.
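
The idea of placing the most frequently selected orders at the top of a list can be illustrated with a short sketch. This is not the method of the cited study, whose details are not given here; the function name, the order names, and the selection log are hypothetical.

```javascript
// Reorder a list of orders so that those selected most often in the log
// come first. Array.prototype.sort is stable, so orders with equal counts
// keep their original relative position.
function rankBySelectionFrequency(orderList, selectionLog) {
  const counts = new Map();
  for (const name of selectionLog) {
    counts.set(name, (counts.get(name) || 0) + 1);
  }
  return [...orderList].sort(
    (a, b) => (counts.get(b) || 0) - (counts.get(a) || 0)
  );
}

// Usage (hypothetical data).
const rankedOrders = rankBySelectionFrequency(
  ["aspirin", "heparin", "insulin"],
  ["insulin", "heparin", "insulin"]
);
// rankedOrders is ["insulin", "heparin", "aspirin"]
```

The original list is copied before sorting so that the stored order set is left unchanged.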

Most importantly, paradata have been used to improve the safety of delivering medications. Smart infusion pumps, a necessary component in the medication management system, contain modifiable built-in log files recorded in the pump memory. The logs generated record data that trigger dosage warning limits (ie, programming errors), duplicate drug therapy with timing of the action, and the user's response to the alerts (reprogram, stop infusion). Log files can be used to monitor compliance in using drug libraries, identify high-alert medications, and assess current practices to improve the safety of delivering intravenous medications. Fanikos et al35 reviewed 863 alerts generated from programming anticoagulant infusions using smart infusion pumps. The most common alerts were underdose and overdose. Alerts resulted in reprogramming of infusion pumps in 43% of the cases. Similarly, drug-duplication log files have been implemented into a CPOE system to improve the safety of the ordering process.36 Data generated by log reminders showed that, of 896 131 orders, 11 298 (1.26%) involved drug-duplication reminders.

Paradata can also provide important information regarding the completeness and quality of nursing documentation using electronic health records (EHRs). Deficiencies in nursing documentation are well documented.37 Incomplete documentation can reflect, in addition to workload, inappropriate design features in the documentation system such as hidden menus that require restructuring of pages or the need for education about the documentation process or EHR use. Alerts and reminders using log files can be utilized to enhance documentation completeness, especially for important fields such as patient allergies information. In addition, tracking the sequence of documenting different aspects of care using paradata may provide indications on the diagnostic reasoning process nurses use to provide care. Based on these data, the sequence of registry fields can be arranged to match the diagnostic reasoning process used by nurses. Studies can be conducted to assess the relationship between diagnostic reasoning manifested by sequence of documentation and quality of care. The path nurses use to document care can also be an indicator for documentation completeness, if certain paths may yield more complete documentation than others. Time of documentation using different interface designs can also be tested using paradata. Furthermore, paradata can be used to test the usability of providing educational materials or linkage to literature within EHRs or to reflect changes in nursing diagnoses and interventions based on abnormal vital signs or laboratory tests.

Paradata may also have specific implications in online nursing education. Learning management software (eg, Blackboard Learning System, Blackboard, Washington, DC) and distance learning systems (eg, Tegrity Campus, Tegrity, Santa Clara, CA) used in nursing education have the capability of automatically recording server-side paradata. These data include statistical information about the number of "hits" on, or accesses to, each learning module or submodule by each student and the number of times the learning modules are accessed by time of day, day of the week, and month. These paradata may help educators analyze the learning patterns of students and assist in making informed decisions about the effectiveness of the instructional design in an online learning environment. Educators may use these data to investigate the usability of specific educational materials. Furthermore, these data may be used to examine specific educational outcomes, such as student achievement and satisfaction, as well as provide early assessment of students who are facing technological difficulties in the learning process.
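The kind of server-side summary described above (hits per student and module, and accesses by hour and day of the week) might be computed as in the following minimal sketch. The log rows, student identifiers, and module names are hypothetical; real learning management systems export access data in their own formats, which are not assumed here.

```python
from collections import Counter
from datetime import datetime

# Hypothetical access-log rows: (student_id, module, ISO 8601 timestamp).
ACCESS_LOG = [
    ("s01", "module-1", "2010-03-01T09:15:00"),
    ("s01", "module-1", "2010-03-01T21:40:00"),
    ("s02", "module-1", "2010-03-02T09:05:00"),
    ("s02", "module-2", "2010-03-06T14:30:00"),
]


def summarize_access(log):
    """Count hits per (student, module), per hour of day, and per weekday."""
    hits = Counter()
    by_hour = Counter()
    by_weekday = Counter()  # keys follow datetime.weekday(): 0 = Monday
    for student, module, stamp in log:
        when = datetime.fromisoformat(stamp)
        hits[(student, module)] += 1
        by_hour[when.hour] += 1
        by_weekday[when.weekday()] += 1
    return hits, by_hour, by_weekday


hits, by_hour, by_weekday = summarize_access(ACCESS_LOG)
print(dict(hits))
print(dict(by_hour))
print(dict(by_weekday))
```

Students with very few hits early in a course could then be flagged for follow-up, although a low count alone does not show whether the cause is technological difficulty.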



A major concern in obtaining and using paradata centers on the ethical issues involved, particularly if paradata are collected without the knowledge of respondents. Server-side paradata are automatic by-products of educational software used in online education as well as of Web-administered measures. However, this feature is not widely publicized among faculty involved in online research and education. In addition, students who are using educational software, respondents who are completing a measure, and healthcare providers who are using computerized patient care systems are typically not informed of the collection of such data. Researchers collecting paradata should consult with their institutional review board for the protection of the rights of human subjects.

Since disclosure to respondents is likely to be a part of the informed consent process, it seems that the most expedient time to collect paradata might be in the pilot-testing phase of Web administration of a measure. This is particularly true if there is a possibility that participant knowledge about the collection of paradata might bias responses to the measure. In any case, researchers should use standard security features when collecting any type of data using Web-administered measures. These may include the use of Secure Sockets Layer encryption for the survey link and survey pages and multiple backups to prevent the loss of the data. In addition, in all cases of collecting paradata, no respondent identifiers should be linked to these data unless it is necessary to answer the research questions. All identifiers should be destroyed after analyzing the data, and unauthorized access to the data should be prohibited.



While some server-side paradata may be automatically generated within some patient care computerized systems or educational software or when using Web-administered measures, some expertise is needed to create, analyze, and appropriately use paradata, particularly when dealing with client-side paradata. Lack of interest in obtaining information about respondent behaviors while completing a measure and the extra expense and experience required in programming paradata are among the main reasons for minimal utilization of such data.



Using Web-administered measures introduces unique challenges that need to be addressed. These measures may not suit all kinds of respondents and may introduce challenges in obtaining acceptable response rates and in maintaining a representative sample of the target population. Variations in Internet coverage and computer literacy are the main respondent-related factors that may introduce measurement errors in using this mode of administration.18 Internet access exists predominantly in special settings, such as universities, special professional groups, and business organizations. Only major institutions provide Internet access, e-mail accounts, and technical support to their employees/students as part of their large computerized communication system.38

At the outset, whether online data collection is appropriate for a specified group of respondents is a major concern that must be addressed. If the answer is yes, then using paradata may provide indications about the quality of the design of a Web-administered measure and also about the appropriateness of the measure to the respondents. This can be achieved by gaining an understanding of the response process and response patterns and by testing different features of the measure. In turn, this may enhance the response rate but cannot ensure a representative sample of the target population. Thus, use of paradata during pilot testing should contribute to strengthening the measurement process by helping in the design of a respondent-friendly measure that may decrease the response burden.

Furthermore, the domain of interest, researcher preference and experience in developing Web-administered measures, subject confidentiality, and data integrity are among the major concerns in using measures administered via the Web.25 Moreover, researchers cannot control the environment for responding, as some participants may complete the instrument in a quiet atmosphere, and others may do so in a noisy computer laboratory. On the other hand, some of these issues in using Web-administered measures can be easily handled by developing a well-designed format and presentation and by considering respondents' characteristics and abilities. This may increase the appropriateness of the delivery of the instrument to the characteristics of the subjects and facilitate access and navigation.

Another issue in using Web-administered measures is the mode effect that Groves4 classified as a source of measurement error. Lozar-Manfreda and Vehovar17 identified two unique effects of Web-administered measures: channel capacity and context effect. Channel capacity refers to constructing a measure using features that are not applicable to other measurement methods because they are either too expensive or not possible. These features may include the use of automatic skip patterns, automatic error checking, drop-down menus, and the requirement for downloading. Context effect refers to (1) the visual presence of the computer, which produces a negative effect for subjects with computer literacy concerns and raises a concern about subject privacy; (2) the specific task of completing a Web-administered measure, which refers to the tendency toward a lack of concentration in filling out the measure because people tend to do more than one task at a time while using the Internet; and (3) the specific social interaction on the Internet, which refers to the reduction in social desirability.

Instruments having acceptable evidence of reliability and validity in one administration mode may not maintain the same psychometric properties if administered in a different mode.25 This is an issue of alternate forms (equivalence) reliability that changes with format, and the reliability decrements can, in turn, impact validity. Dillman and colleagues39 studied the effect of different administration modes on measurement differences and found that telephone participants were more likely than Web participants to select the extreme positive response categories. Another study showed that collecting data using Web-administered measures resulted in fewer measurement errors with less missing data.40 Krosnick and Chang41 compared data collected from Web versus telephone administration of a questionnaire and found that the data collected through the Internet contained fewer random errors than telephone-based data, as demonstrated by a higher internal consistency reliability coefficient. Two different studies compared the reliability and validity of Web-administered and paper-and-pencil format instruments. Results revealed the equivalence of data collected by both administration modes.42,43
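To make the comparison of internal consistency across modes concrete, the sketch below computes Cronbach's alpha, one widely used internal consistency coefficient, for two small sets of item responses. The choice of Cronbach's alpha is an assumption made for illustration, since the coefficient used in the cited comparison is not named here, and the response data are hypothetical. The coefficient is computed as k/(k - 1) multiplied by (1 minus the sum of the item variances divided by the variance of the total scores), where k is the number of items.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item scores.

    rows: one list of k item scores per respondent (k >= 2, no missing data).
    Sample variances are used for both the items and the total scores.
    """
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses to a three-item scale under two administration modes.
WEB_RESPONSES = [[4, 4, 5], [2, 2, 2], [5, 5, 4], [1, 2, 1]]
PHONE_RESPONSES = [[4, 2, 5], [2, 4, 1], [5, 3, 4], [1, 2, 3]]

alpha_web = cronbach_alpha(WEB_RESPONSES)
alpha_phone = cronbach_alpha(PHONE_RESPONSES)
print(f"Web alpha = {alpha_web:.2f}, telephone alpha = {alpha_phone:.2f}")
```

In this made-up example, the Web responses are more consistent across items and therefore yield the higher coefficient; with real data, sample size and missing responses would also need attention.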

Finally, Web-administered measures should be tested thoroughly to minimize measurement errors and to overcome technical and incompatibility problems.6,7 Schleyer and Forrest8 highly recommend that the testing process should not be limited to pilot testing but should also include "scrutinizing early returns." These authors tested their measure using different browsers, operating systems, Internet service providers, and types of Internet access. No technical problems were found in the pilot test. However, after receiving 130 completed questionnaires, the researchers noticed that all participants selected only two response options (of four options) for one of the items. After tracking the source of the problem, it was found that there was an error in the software that stored answers incorrectly.
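One way to "scrutinize early returns" is to check, for each item, whether any of the allowed response options has never been selected, which is the pattern that revealed the storage error described above. The following is a minimal sketch of such a check; the item names, option codes, and responses are hypothetical and are not the data of the cited study.

```python
from collections import Counter


def unused_options(responses, options):
    """Return, per item, the allowed options that no respondent has selected.

    responses: list of dicts mapping item name to the stored answer.
    options: dict mapping item name to the list of allowed answers.
    """
    report = {}
    for item, allowed in options.items():
        seen = Counter(r[item] for r in responses if item in r)
        missing = [opt for opt in allowed if seen[opt] == 0]
        if missing:
            report[item] = missing
    return report


# Hypothetical early returns for two four-option items.
OPTIONS = {"q1": [1, 2, 3, 4], "q2": [1, 2, 3, 4]}
EARLY_RETURNS = [
    {"q1": 1, "q2": 1},
    {"q1": 2, "q2": 2},
    {"q1": 3, "q2": 1},
    {"q1": 4, "q2": 2},
]

flagged = unused_options(EARLY_RETURNS, OPTIONS)
print(flagged)
```

An unused option is only a prompt for review: in a small sample it may reflect genuine response patterns rather than a software fault, so flagged items should be traced through the data-storage steps before any conclusion is drawn.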



It is worth noting that any computerized application used in nursing and health practices, including clinical information systems, can produce paradata to track users' navigation behaviors. In the case of Web-administered measures, principles and directions for developing user-friendly, Web-administered measures are available.7,18 Different software templates with diverse features exist to help design such measures. Promoting the pros and limiting the cons of Web-administered measures require expertise in constructing such measures and the utilization of their distinctive capabilities. More validation using research to refine the principles of designing these measures and to decrease measurement errors is needed.

In nursing and healthcare, Web-administered measures are in the early stages of development. Paradata are "vital partners" to other sorts of data obtained from administering measures via the Web.44 The use of paradata can be of major benefit in evaluating many aspects of a Web-administered measure. The primary goal of measurement is utilizing instruments with high estimates of reliability and validity. Paradata can contribute to this process. Ethical issues in collecting paradata should be addressed in accordance with standards for the protection of the rights of human subjects, as determined by an institutional review board. The effort to create and analyze paradata, particularly in the process of pilot-testing measures, has the potential to offer valuable information to researchers in nursing and healthcare.

Failures of clinical information systems are well documented and are mainly related to the inability of the systems to meet user expectations in terms of being easy to use and flexible. At the system design phase, paradata can help develop an efficient, user-friendly, and intuitive design that reflects the routine of care implemented by users. In addition, these data can be used to correct deficiencies in a developed system. Future uses of paradata, particularly if retrieval becomes easier, offer promise for additional applications with potential benefits within nursing and healthcare.



1. Kehoe C, Pitkow J. Surveying the territory: GVU's five WWW user surveys. World Wide Web J. 1996;1(3):77-84.

2. Castellan N. Computers and computing in psychology: twenty years of progress and still a bright future. Behav Res Meth Instrum Comput. 1991;23:106-108.

3. Couper M. Web surveys: a review of issues and approaches. Public Opin Q. 2000;64:464-494.

4. Groves R. Survey Errors and Survey Costs. New York: John Wiley & Sons; 1989.

5. Waltz C, Strickland O, Lenz E. Measurement in Nursing and Health Research. 3rd ed. New York: Springer Publishing Company; 2005.

6. Dillman D. Mail and Internet Surveys: The Tailored Design Method. 2nd ed. New York: John Wiley & Sons; 2000.

7. Lazar J, Preece J. Designing and implementing Web-based survey. J Comput Inform Syst. 1999;39(4):63-67.

8. Schleyer T, Forrest J. Methods for the design and administration of Web-based surveys. J Am Med Inform Assoc. 2002;7:416-425.

9. Couper M. Usability evaluation of computer assisted research instruments. Soc Sci Comput Rev. 2000;18:384-396.

10. Couper M. New technologies and survey data collection: challenges and opportunities. 2002. Paper presented at: International Conference on Improving Surveys; Copenhagen, Denmark. Accessed January 1, 2008.

11. Heerwegh D. Describing response behavior in Web-surveys using client side paradata. 2002. Paper presented at: International Workshop on Web-surveys October 17, 2002; Mannheim, Germany. Accessed February 3, 2008.

12. Heerwegh D. Uses of client side paradata in Web surveys. 2004. Paper presented at: International Symposium in Honour of Paul Lazarsfeld; Brussels, Belgium. Accessed March 4, 2008.

13. Heerwegh D. The CSP Project Web page. 2003.∼u0034437/public/csp.htm. Accessed March 2, 2008.

14. Heerwegh D, Loosveldt G. Radio buttons or select menus: an evaluation of the effect of response formats on data quality in Web surveys. Soc Sci Comput Rev. 2002;20:471-484.

15. Bosnjak M, Tuten T. Classifying response behaviors in Web-based surveys. J Comput Mediat Commun. 2001;6(3).

16. Dillman D, Bowker D. The Web questionnaire challenge to survey methodologists. 2002. Accessed January 10, 2008.

17. Lozar-Manfreda K, Vehovar V. Do mail and Web surveys provide same results. 2002. Paper presented at: Development in Social Science Methodology Conference. Accessed January 15, 2008.

18. Dillman D, Tortora R, Bowker D. Influence of plain versus fancy design on response rate for Web surveys. 1998. Proceedings of Survey Methods Sections Annual Meeting of the American Statistical Association; Dallas, TX. Accessed February 22, 2008.

19. Vehovar V, Lozar-Manfreda K. Design issues in Web survey. Proceedings of the Survey Research Methods Section. Alexandria, VA: American Statistical Association; 2000:983-988.

20. Couper M, Traugott M, Lamias M. Effective survey administration on the Web. Paper presented at: Midwest Association for Public Opinion Research Conference; 1999; Chicago, IL.

21. Dillman D, Christian L. Survey mode as a source of instability in responses across surveys. Field Meth. 2005;17:30-52.

22. Heerwegh D, Loosveldt G. Web surveys: the effect of controlling survey access using PIN numbers. Soc Sci Comput Rev. 2002;20:10-21.

23. Crawford S, Couper M, Lamias M. Web survey: perceptions of burden. Soc Sci Comput Rev. 2001;19:146-162.

24. Bell DS, Kahn CE. Health status assessment via the World Wide Web. Proceedings of the American Medical Informatics Association Annual Fall Symposium. Washington, DC: American Medical Informatics Association; 1996:338-342.

25. Strickland O, Moloney M, Dietrich A, et al. Measurement issues related to data collection on the World Wide Web. Adv Nurs Sci. 2003;26:246-256.

26. Duffy M. The Internet as a research and dissemination resource. Health Promot Int. 2000;15:349-353.

27. Murff H, Kannry J. Physician satisfaction with two order entry systems. J Am Med Inform Assoc. 2002;8(5):499-511.

28. Koppel R, Metlay J, Cohen A, et al. Role of computerized physician order entry systems in facilitating medication errors. JAMA. 2005;293(10):1179-1203.

29. Rozic-Hristovski A, Hristovski D, Todorovski L. Users' information-seeking behavior on a medical library Website. J Med Libr Assoc. 2002;90(2):210-217.

30. Müller H, Boyer C, Gaudinat A, et al. Analyzing Web log files of the Health on the Net HONmedia search engine to define typical image search tasks for image retrieval evaluation. Stud Health Technol Inform. 2007;129(pt 2):1319-1323.

31. Cimino J, Li J, Mendonca E, et al. An evaluation of patient access to their electronic medical records via the World Wide Web. Proc AMIA Annu Symp. 2000:151-155.

32. Borgne-Uguen F, Goffpronost M, Trellu H, et al. Evaluation of the uses of medical records within a health assistance network. 2007. Accessed July 30, 2008.

33. Jalloh O, Waitman L. Improving computerized provider order entry (CPOE) usability by data mining users' queries from access logs. Proc AMIA Annu Symp. 2006:379-383.

34. Rosenbloom S, Geissbuhler A, Dupont W, et al. Effect of CPOE user interface design on user-initiated access to educational and patient information during clinical care. J Am Med Inform Assoc. 2005;12:458-473.

35. Fanikos J, Fiumara K, Baroletti S, et al. Impact of smart infusion technology on administration of anticoagulants (unfractionated heparin, argatroban, lepirudin, and bivalirudin). Am J Cardiol. 2007;99(7):1002-1005.

36. Long AJ, Chang P, Li YC, et al. The use of a CPOE log for the analysis of physicians' behavior when responding to drug-duplication reminders. Int J Med Inform. 2008;77(8):499-506.

37. Florin J, Ehrenberg A, Ehnfors M. Quality of nursing diagnoses: evaluation of an educational intervention. Int J Nurs Terminol Classif. 2005;16(2):33-43.

38. Daley E, McDermott R, Brown K, et al. Conducting Web-based research: a lesson in internet design. Am J Health Behav. 2003;27:116-124.

39. Dillman D, Phelps G, Tortora R, et al. Response rate and measurement differences in mixed mode survey using mail, telephone, interactive voice response, and the Internet. Paper presented at: AAPOR Annual Conference; 2001; Montreal, Quebec, Canada.

40. Stanton J. An empirical assessment of data collection using the Internet. Pers Psychol. 1998;51:709-725.

41. Krosnick J, Chang L. A comparison of the random digit dialing telephone survey methodology with Internet survey methodology as implemented by Knowledge Networks and Harris Interactive. 2001. Accessed June 30, 2008.

42. Davis R. Web-based administration of a personality questionnaire: comparison with traditional methods. Behav Res Meth Instrum Comput. 1999;31:572-577.

43. Buchanan T, Smith J. Using the Internet for psychological research: personality testing on the World-Wide Web. Brit J Psychol. 1999;90:125-144.

44. Jeavons A. Paradata: concepts and applications. Proceedings of the ESOMAR Worldwide Internet Conference Net Effects 4. Barcelona, Spain. The Netherlands: ESOMAR Publications; 2001.


Log files; Paradata; Measurement errors; Web-administered measures

© 2010 Lippincott Williams & Wilkins, Inc.


