An exploration of the strengths and weaknesses of using text messaging as a tool for self-report data collection in psychological research
Self-report psychological research has stepped out of the laboratory and into people’s mailboxes, telephones, and computers. Consequent strides forward in terms of research scalability, reach, ecological validity, and practical relevance have been hampered by potholes of diminished participant engagement, low response rates, and problems with construct validity. These complications have received substantial scholarly attention, but few have considered the possible remedy of further expanding the psychological researcher’s methodological repertoire (Fahrenberg, Myrtek, Pawlik, & Perrez, 2007).
The technology available for psychological research has changed dramatically in the past decade (Dillman, Smyth, & Christian, 2014; Stern, Bilgen, & Dillman, 2014), yet reviews of self-report data collection tools, particularly the hardware available to support research, are rare and often dated (Ebner-Priemer & Kubiak, 2007). This may be because researchers tend to settle on a certain research approach or methodology, and are resistant to considering a change in approach (Campbell, Mutran, & Parker, 1986). Tellingly, Fahrenberg, Myrtek, Pawlik, and Perrez (2007) note that participant willingness to use new data collection modes outstrips the willingness of psychological researchers to utilise them in their research. This is not a problem if the well-established data collection modes are adequate. For example, pen and paper is one of the oldest, and still one of the most widely used, methods for self-report survey data collection (Bolger, Davis, & Rafaeli, 2003; Dillman et al., 2014). However, a lack of consideration for methodological innovation is an issue when the current tools for data collection have shortcomings.
Ecological Momentary Assessment is a group of research methods typically involving the collection of data in a naturalistic setting, often through a repeated measures design. The defining characteristic of Ecological Momentary Assessment is that it occurs outside of laboratory conditions (Ebner-Priemer, Kubiak, & Pawlik, 2009; Fahrenberg, Myrtek, Pawlik, & Perrez, 2007). Though it can draw on either objective measures (e.g. physiological measures such as blood pressure) or self-report measures, the discussion here centres on self-report research. An example of a self-report Ecological Momentary Assessment design would be to ask participants to report their mood as they go about their daily lives.
Although Ecological Momentary Assessment offers unique, meaningful insight into the situational and temporal aspects of psychological processes, it has a reputation for being difficult and expensive to carry out (Bolger et al., 2003; Ebner-Priemer, Kubiak, & Pawlik, 2009). Examination of data collection methodology is potentially the key to the continued growth of Ecological Momentary Assessment research (Shiffman, Stone, & Hufford, 2008). The challenge faced by psychological researchers is to establish whether new self-report data collection modes could open up opportunities for data collection outside of the laboratory, to assess whether those modes have methodological limitations of their own, and in particular to determine whether they can help to bring down the methodological barriers to Ecological Momentary Assessment research. If a research mode is to support Ecological Momentary Assessment, it needs to be portable, accessible, convenient, and inexpensive. Paper diaries are portable, but have problems including an inability to time-stamp responses, and thus a susceptibility to undetectable and undesirable retrospective responding (Bolger et al., 2003). Another popular option is digital devices, notably palm-top computers, pre-loaded with survey software (Shiffman et al., 2008). This method time-stamps responses, but has its own limitations: the devices are expensive, and it can be difficult to gain physical access to participants in order to provide them with devices and training on how they should be used.
Another clear candidate for self-report hardware is the mobile telephone. With 96% global penetration (ITU, 2015), mobile telephones constitute one of the most exciting and potentially transformational self-report data collection modes available to psychology researchers. Essentially, much of the world's population is incidentally carrying self-report data collection devices in their daily lives.
Mobile phones (also known as cell phones) offer three avenues of communication with participants: Short Message Service (SMS), voice calls, and applications (apps, typically available only on smartphones). All three can be, and have been, used to collect self-report information (e.g. Conner & Reid, 2012; Palen & Salzman, 2002; Tsai et al., 2007). My PhD dissertation focussed on SMS, for several reasons. Because the sending, receiving, and aggregation of SMS can be automated, it is less time consuming and more scalable than voice calls: sending an SMS to 1 or 100 people requires the same amount of time and effort, but conducting 100 voice interviews entails far more effort than a single interview. As SMS functionality is native to all mobile telephones, it does not require participants to download a study-specific app. This avoids compatibility issues across smartphones (e.g. an Apple app will not run on an Android phone), and reaches participants who own more basic non-smart phones. It further avoids the need for an internet connection, which in turn circumvents the need for participants to be near a wi-fi connection (or use their own data plans, which may raise issues of exceeding data quotas).
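The automation argument can be made concrete with a minimal sketch. The Python code below is illustrative only: `send_sms` is a stand-in for whatever gateway service a study actually uses (the function name, its signature, and the phone numbers are assumptions, not a real provider's interface). What it demonstrates is that dispatch effort is constant in sample size, and that every outgoing message can be time-stamped as a side effect of sending.

```python
from datetime import datetime, timezone

def send_sms(number, message):
    # Stand-in for a hypothetical SMS gateway call; a real study would
    # substitute a commercial provider's API here. Each dispatch is
    # automatically time-stamped in the returned delivery record.
    return {"to": number,
            "body": message,
            "sent_at": datetime.now(timezone.utc).isoformat()}

def broadcast(numbers, message):
    # The same loop serves 1 or 100 participants: the researcher's
    # effort does not grow with sample size.
    return [send_sms(n, message) for n in numbers]

prompt = "How is your mood right now, from 1 (low) to 7 (high)? Reply by SMS."
records = broadcast(["+61400000001", "+61400000002"], prompt)
```

In a real deployment only the body of the stub would change; the surrounding loop and time-stamping logic, which carry the scalability advantage, would remain the same.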
Thus far, we have established the need for methodologically focussed research, particularly to support Ecological Momentary Assessment. We have identified mobile telephones, and specifically SMS, as a viable candidate tool for self-report data collection. As SMS is a relative newcomer to the psychological researcher's repertoire, its methodological properties are largely unknown. We next need to specify a disciplinary framework that can anchor investigation of the strengths and weaknesses of SMS as a tool for self-report data collection.
Though uptake of new research modes is slow, the rise of mixed-methods designs which use pre-existing modes in concert (Dillman et al., 2014) shows that researchers are starting to think about surmounting methodological limitations by way of the modes they choose to use. Before a limitation of a mode can be addressed, it needs to be identified. If the data collection mode has been used for some time, information can be gained from observing how the mode has performed in the past, through meta-analyses of methodological properties such as response rates (e.g. Cook, Heath, & Thompson, 2000; Fox, Crask, & Kim, 1988; Shih, 2008). However, the sheer variety of self-report psychological research designs makes it difficult to synthesise a coherent, generalisable view of the factors impacting the efficacy of SMS as used in a research context. A particular response rate for a particular study may be influenced by many factors, including but not limited to response mode, the incentive offered, or the duration or topic of the study (e.g. Heberlein & Baumgartner, 1978; Lee & Renzetti, 1990). Previous literature is also a challenging source of information if a data collection mode is new, as there is a limited pool of completed studies to draw from.
Though reflection on previous usage of a particular data collection mode is a useful first step, it needs to be followed by rigorous, methodologically-focussed investigation of the strengths and weaknesses of that mode. A conceptually coherent scientific background is required to guide such an investigation. Though at times theory is borrowed from computer science, science communication, and sociology, I grounded my work primarily within the discipline of psychology. Aside from the allure of sticking with what I knew, psychology has a long, if intermittent, history of research into methodological considerations in self-report data collection. Paradigm shifts from face-to-face interviews to telephone interviews, and from paper surveys to online surveys, have been accompanied by peaks of interest in methodology (Dillman et al., 2014). Such research tends to focus on past performance via structured reviews or meta-analyses (e.g. Cook et al., 2000; Shih, 2008). There are some rare but heartening examples of methodologically-focussed investigations that evaluate particular properties of tools for data collection. For example, Lim, Sacks-Davis, Aitken, Hocking, and Hellard (2010) conducted a randomised controlled trial of the efficacy of three data collection tools (paper, online, and SMS) for collecting sexual health information from young people.
Alongside the comparatively small self-report research methods literature, a parallel psychology literature explores the impact of other methodological concerns, such as question wording (e.g. Fisher, 1993; Warnecke et al., 1997), question order effects (e.g. McFarland, 1981; Moore, 2002), and cross-language measurement invariance (e.g. Brislin, 1970), on how participants engage with and respond to research. The underlying logic, design, and analytical techniques of the broader research methods literature are applicable to the systematic investigation of specific properties of a research mode (Dillman et al., 2014; Schwarz, Strack, Hippler, & Bishop, 1991; Vandenberg & Lance, 2000). Following from this literature, a structure for investigating the properties of a data collection tool becomes clear: look back to see how the mode has been used, and then draw from the parallel psychological literature for theory, design, and analytical techniques to guide subsequent correlational and experimental investigation. This is the approach taken in this dissertation.
My dissertation focussed on three (and later, four; though it could have expanded into an untenable never-ending mass of enquiry) key research questions. Although the specific wording of the questions was decided upon at a relatively late stage, one study to address the essence of each question was planned in advance. Investigation began by exploring each question in the order presented here; however, these first studies soon fell out of sequence due to time differences in the ethical application and data collection processes. Further, they quickly revealed more knowledge gaps and additional research questions, which, coupled with some unexpected opportunities for working with specific groups (such as the Canberra Deaf community), led to a proliferation of studies that were later grouped back under the four research questions.
The final section was a typical example of this. The single paper initially planned to address how SMS compared to other data collection modes was SMS = Send My Survey: Short Message Service for Longitudinal Research. After the initial literature review, it became clear that standards of measurement invariance were far stronger in the cross-language than cross-mode literatures. Because there was no room to build this consideration into Send My Survey, Applying cross-language principles to cross-mode measurement invariances was developed. This required a lengthy process of translating an English-language instrument which had previously worked well when administered via SMS (the Ruminative Thought Styles Questionnaire) into a language that would provide a bilingual sample of reasonable size within the available undergraduate population (Chinese). By the time this process was complete, an unexpected opportunity to work with a collaborator with a subscription to self-report app software led to a third study being developed, run, and written up. While conducting data collection for this third study, there was also scope to address some of the other key research questions.
And so it could have ballooned onwards and upwards; I could have happily chased methodological questions until the metaphorical cows came home and old age came knocking, my hands still frantically typing. But I was firmly informed by my colleagues (and my long-suffering supervisor) that a PhD has to stop at some point. Digging deep rendered things down to just four questions, which formed the chapters of the thesis (and hence the skeleton of these sections). The questions were as follows.
1) How is SMS currently being used for research?
Understanding how SMS is being used for research will give context to the ensuing investigation, and highlight knowledge gaps. This question goes beyond psychology research because the way in which researchers use SMS in related fields (such as epidemiology, sociology, or medicine) could prove informative. It also does not specify self-report research. SMS is used for a highly diverse range of purposes in everyday life, and it stands to reason that it would also be useful for more than self-report data collection in research. Enumerating these alternative uses is informative for two reasons. First, it allows examination of how common it is for researchers to use SMS specifically to collect self-report data, compared with how common it is for them to employ SMS in some way. Second, use of SMS for other purposes in research could provide useful information when it comes to self-report data collection. For example, the literature surrounding the use of SMS as a reminder to take medication or attend an appointment may have insights applicable to using SMS as a reminder prompt to complete a self-report questionnaire.
2) Are people able, ready and willing to become research participants using SMS?
The use of SMS as a self-report research mode requires successful recruitment of participants, and their subsequent engagement with SMS. It may be the case that participants are technologically unable or simply unwilling to participate in self-report psychological research using SMS. If this is true, questions surrounding completeness and validity of their responses become moot, because attempts to collect data via SMS would be fruitless. Alternatively, there may be some technological barriers, and scope to convince hesitant participants to provide self-report data via SMS. If this is true, the focus of subsequent investigation should be on the techniques and research designs that may surmount these barriers and persuade participants to at least try providing self-report data via SMS, before moving on to examining the resultant data quality. Finally, there may be no technological barriers, and participants may be willing to respond to self-report research questions using SMS. If this is true, then subsequent investigation can jump straight to questions of data quality or validity. In this way, establishing ability, readiness, and willingness to participate in research via SMS sets the research agenda for subsequent exploration of the properties of SMS as a tool for self-report psychological research.
3) How should a researcher design an SMS self-report study?
This research question draws from the answers to questions (1) and (2), and begins to address the practicalities of conducting research via SMS. The question is first addressed using questions and methods that the previous literature suggests may be effective, with a focus on how to work with participants: examining how SMS performs (and how this performance may be improved) in terms of recruitment, response rates, and response behaviour. The focus then turns to the potential limits of SMS not previously explored in the literature, primarily how much information can be exchanged via SMS in a self-report research setting.
4) How does SMS compare with other tools for data collection?
This was initially planned as a subset of the third question, but results regarding participant perceptions of SMS emphasised the importance of situating SMS amongst other available research modes. It became clear that viewing the strengths and weaknesses of SMS in isolation provides only part of the picture. The modern researcher can choose from many modes for collecting self-report data (Dillman et al., 2014). Why should a researcher consider SMS when they are already comfortable using online surveys, or automated telephone interviews? Having investigated the strengths and weaknesses of SMS in isolation, the final important question is therefore the strengths and weaknesses of SMS as a tool for data collection, relative to other research modes.
And so I embarked on the first systematic, methodologically-focussed investigation of the strengths and weaknesses of using SMS as a tool for self-report psychological research.
My aim was to make the methodological properties of this relatively new research mode explicit, and situate it in the context of the other modes available to psychology researchers, in particular those seeking to conduct Ecological Momentary Assessment. Also, to end up legitimately updating my business card to have "Dr." on it.
Aim # 2 was achieved (call me Dr. Walsh, yes, yeeees). Aim #1 is a work in progress. May the methodological meanderings to come interest, entertain, or bemuse you.
Bolger, N., Davis, A., & Rafaeli, E. (2003). Diary methods: Capturing life as it is lived. Annual Review of Psychology, 54(1), 579–616. http://doi.org/10.1146/annurev.psych.54.101601.145030
Brislin, R. W. (1970). Back-Translation for Cross-Cultural Research. Journal of Cross-Cultural Psychology, 1(3), 185–216. http://doi.org/10.1177/135910457000100301
Campbell, R. T., Mutran, E., & Parker, R. N. (1986). Longitudinal Design and Longitudinal Analysis: A Comparison of Three Approaches. Research on Aging, 8(4), 480–502. http://doi.org/10.1177/0164027586008004003
Conner, T. S., & Reid, K. A. (2012). Effects of intensive mobile happiness reporting in daily life. Social Psychological and Personality Science, 3(3), 315–323.
Cook, C., Heath, F., & Thompson, R. L. (2000). A Meta-Analysis of Response Rates in Web- or Internet-Based Surveys. Educational and Psychological Measurement, 60(6), 821–836. http://doi.org/10.1177/00131640021970934
Dillman, D. A., Smyth, J. D., & Christian, L. M. (2014). Internet, Mail, and Mixed-Mode Surveys: The Tailored Design Method (4th ed.). Hoboken, NJ: John Wiley & Sons.
Ebner-Priemer, U. W., & Kubiak, T. (2007). Psychological and Psychophysiological Ambulatory Monitoring. European Journal of Psychological Assessment, 23(4), 214–226. http://doi.org/10.1027/1015-5759.23.4.214
Ebner-Priemer, U. W., Kubiak, T., & Pawlik, K. (2009). Ambulatory Assessment. European Psychologist, 14(2), 95–97. http://doi.org/10.1027/1016-9040.14.2.95
Fahrenberg, J., Myrtek, M., Pawlik, K., & Perrez, M. (2007). Ambulatory Assessment - Monitoring Behavior in Daily Life Settings. European Journal of Psychological Assessment, 23(4), 206–213. http://doi.org/10.1027/1015-5759.23.4.206
Fisher, R. J. (1993). Social desirability bias and the validity of indirect questioning. Journal of Consumer Research, 20(2), 303–315.
Fox, R., Crask, M., & Kim, J. (1988). Mail survey response rate: A meta-analysis of selected techniques for inducing response. Public Opinion Quarterly, 52(4), 467–491. Retrieved from http://poq.oxfordjournals.org/content/52/4/467.short
Heberlein, T. A., & Baumgartner, R. (1978). Factors affecting response rates to mailed questionnaires: A quantitative analysis of the published literature. American Sociological Review, 43(4), 447–462.
ITU. (2015). ICT Facts and Figures – The World in 2015. Retrieved June 2015, from http://www.itu.int/en/ITU-D/Statistics/Pages/facts/default.aspx
Lee, R. M., & Renzetti, C. M. (1990). The Problems of Researching Sensitive Topics: An Overview and Introduction. American Behavioral Scientist, 33(5), 510–528. http://doi.org/10.1177/0002764290033005002
Lim, M. S. C., Sacks-Davis, R., Aitken, C. K., Hocking, J. S., & Hellard, M. E. (2010). Randomised Controlled Trial of Paper, Online and SMS Diaries for Collecting Sexual Behavior Information from Young People. Journal of Epidemiology and Community Health, 64(10), 885–889. http://doi.org/10.1136/jech.2008.085316
McFarland, S. G. (1981). Effects of question order on survey responses. Public Opinion Quarterly, 45(2), 208-215.
Moore, D. W. (2002). Measuring new types of question-order effects: Additive and subtractive. Public Opinion Quarterly, 66(1), 80–91.
Palen, L., Salzman, M., & Youngs, E. (2000). Going wireless: Behavior & practice of new mobile phone users. In Proceedings of the 2000 ACM Conference on Computer Supported Cooperative Work (CSCW '00).
Schwarz, N., Strack, F., Hippler, H. J., & Bishop, G. (1991). The impact of administration mode on response effects in survey measurement. Applied Cognitive Psychology, 5(3), 193–212. http://doi.org/10.1002/acp.2350050304
Shiffman, S., Stone, A. A., & Hufford, M. R. (2008). Ecological momentary assessment. Annual Review of Clinical Psychology, 4, 1–32. http://doi.org/10.1146/annurev.clinpsy.3.022806.091415
Shih, T.H. (2008). Comparing Response Rates from Web and Mail Surveys: A Meta-Analysis. Field Methods, 20(3), 249–271. http://doi.org/10.1177/1525822X08317085
Stern, M. J., Bilgen, I., & Dillman, D. A. (2014). The state of survey methodology: Challenges, dilemmas, and new frontiers in the era of the tailored design. Field Methods, 26(3), 284–301.
Tsai, C. C., Lee, G., Raab, F., Norman, G. J., Sohn, T., Griswold, W. G., & Patrick, K. (2007). Usability and Feasibility of PmEB: A Mobile Phone Application for Monitoring Real Time Caloric Balance. Mobile Networks and Applications, 12(2–3), 173–184. http://doi.org/10.1007/s11036-007-0014-4
Vandenberg, R. J., & Lance, C. E. (2000). A Review and Synthesis of the Measurement Invariance Literature: Suggestions, Practices, and Recommendations for Organizational Research. Organizational Research Methods, 3(1), 4–70. http://doi.org/10.1177/109442810031002
Warnecke, R. B., Johnson, T. P., Chavez, N., Sudman, S., O'Rourke, D. P., Lacey, L., & Horm, J. (1997). Improving question wording in surveys of culturally diverse populations. Annals of Epidemiology, 7(5), 334–342.