The Internet and the World Wide Web (WWW) revolutionized information and data availability. The early WWW was one-directional; information spread from one website to a multitude of readers. Following this ‘static’ phase, Web 2.0 added user-generated content and social interaction. Social media both increase the scope of available data and provide new channels for sharing information. Facebook, Instagram, Twitter, and blogs are examples with impact on all aspects of modern personal life (e.g. Meetups and online dating) and business (e.g. Amazon) in both developed and developing countries. They create global awareness and interaction.
In medicine, the paternalistic patient–physician relationship has changed as patients inform themselves using information from many online sources including other patients. Health professionals themselves use the WWW to interact with their patients and with their peers. The WWW increasingly accumulates information from and about the public. Using electronic rather than print media, these methods very rapidly disseminate information from a single source to many persons. They are a platform for peer-to-peer (many-to-many) communication and a rich source of data for monitoring population health. Social media support research in two ways: first, through analysis of publicly available WWW information not primarily intended for research and, second, through direct interaction, surveys, and experiments between the public and the researchers.
ROLE OF SOCIAL MEDIA FOR WORKPLACE–LUNG DISEASE INTERACTIONS
Social media's impact upon research about workplace–lung disease interactions has been limited even though it may be particularly valuable in view of the broad scope of relevant information about jobs and environmental exposures (workplace or community) in addition to aspects such as symptoms and biomedical mechanisms. Because the area is new [1▪▪], research publications have been scarce compared with some other areas of allergy and pulmonology. The major areas of application are summarized in Table 1.
Information dissemination is efficient and flexible. Knowledge may be widely disseminated from a single source to a very large number of persons, or shared among professional or worker groups in a democratic many–many relationship. The literature review found few meaningful studies of process or outcome of disseminating occupational lung disease information. Websites often use social media campaigns to direct attention to their sites.
Social media are flexible and can substitute videos for traditional text (e.g. YouTube). Our search identified 18 YouTube videos dealing with occupational asthma. These range from traditional lecture format to more interactive approaches. Presenters include university occupational experts (e.g. one of the editors of this issue) as well as complementary medicine providers, such as one presenting an Ayurvedic medicine approach for occupational asthma.
EVALUATION OF DISSEMINATION EFFECTIVENESS
Evaluating the effectiveness of information dissemination is challenging as the target group is not well delineated. However, a study [4] of web-based information about maritime occupational health illustrates research methods that are also applicable to occupational lung disease. Accessibility of information from social media is one of the most important of the seven metrics described. Others include quantitative measures such as actionability (whether information directly leads to action) and readability. The ability to access the health information directly from the landing page and the presence of downloadable resources are also assessed.
Effective evaluation techniques for diffuse educational sharing methods are urgently needed, particularly as materials are often produced with a very ad hoc approach. A systematic approach to designing, implementing, and evaluating a social media intervention for occupational lung disorders was published by Pounds et al.[5▪▪]. Their study of ‘social marketing’ to promote respiratory protection among farmers included YouTube videos. They describe a stepwise approach: define the project focus carefully, understand the audience, list specific objectives, apply behavioral theory constructs to design behavior change strategies, and evaluate for both objectives and budget. They separated the respiratory protection device learning into four segments: getting the fit, choosing the right mask, caring for it, and understanding risks. Despite frequent views of the videos, the response rates to evaluation questionnaires were poor.
The NIOSH Science Blog was evaluated by Sublet et al.[6] of NIOSH. At the time of the study, there were 23 000 subscribers, of whom 75 participated in a survey. They were generally appreciative of the blog and provided anecdotes about its utility. The respondents were most frequently from the healthcare industry; none was from other high-risk industries such as mining or construction. This limited analysis describes the utility of a credible blog site, but also demonstrates the need for creative marketing to reach those at greatest need.
Two studies provide a framework for assessing credibility of the shared material. One applies to mariners' health and the other studies tweets about asthma [4,7].
Social media permit information sharing ranging from technical to personal feelings. A 2011 Pew Survey [8] found that 80% of Internet users (i.e. 59% of all adults) have looked online for health information and 34% (i.e. 25% of all adults) have read personal commentaries on social media sites. Social media create unique opportunities for peer-to-peer communication across geographic and professional barriers. Messages may be broadcast broadly, hoping that some interested persons will join, or they may be restricted to individuals who are invited to participate. These are democratic communication methods in which any participant can generate and receive communications. Participants share the insights and feelings of those directly impacted by a health or exposure condition rather than relying on technical or professional sources.
Different types of social media provide different options; for example, blogs are created for one-to-many communication. As early as 2010, 11% of Internet users over age 30 maintained a personal blog [9]. Listservs are an older approach in which distribution is usually limited to those who have joined the group, but all members can usually post messages. Forums also provide many-to-many communication and are commonly public. Listservs do not always convey the same sense of community as forums. Some type of membership is required; this may be as simple as a username, or detailed profile information may be required. All three provide mostly asynchronous communication in contrast to chat rooms, which are mostly synchronous. Many of these social media sites aim to provide social interaction and are sponsored through advertising from different sources. The emphasis of specific commercial sites may be on obtaining data for marketing use; for example, the section on reactive airways dysfunction syndrome shares information predominantly about medication use.
DATA ACQUISITION FOR SOCIAL AND/OR EPIDEMIOLOGIC RESEARCH
Research data may be collected by implementing social media-based questionnaires or by recruiting research participants for traditional studies. Researchers sometimes critique social media because of uncertainties about the source population and participation rate; they may ignore the potential benefits of wide availability and efficiency. Even traditional interview and written questionnaires depend upon gaining access by implicit or explicit relationships with the particular group studied. Advantages and disadvantages of social media are summarized in Table 2.
Recruitment and selection targets are more difficult to specify as data are typically passively acquired. However, targeted recruitment is possible using geographic limits or other filters. More active recruitment with systems such as Amazon Mechanical Turk (AMT) may employ specific criteria, for example, the presence of asthma and working outside the home for income in a study of work-related asthma [11▪]. Further, the scope and efficiency of collecting items permit acquiring a very large number of records and then only using those meeting appropriate criteria.
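The strategy of collecting many records and then retaining only those that meet the study criteria can be illustrated with a minimal Python sketch. The record structure, field names, and sample respondents are hypothetical and chosen only for illustration; they are not taken from the cited study.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    # Hypothetical screening fields, named for illustration only.
    respondent_id: str
    has_asthma: bool
    works_outside_home: bool


def select_eligible(records):
    """Keep only respondents who meet both screening criteria."""
    return [r for r in records if r.has_asthma and r.works_outside_home]


# Invented sample records.
records = [
    Respondent("r1", True, True),
    Respondent("r2", True, False),
    Respondent("r3", False, True),
]

eligible = select_eligible(records)
print([r.respondent_id for r in eligible])
```

Because screening is applied after collection, the criteria can be tightened or relaxed later without recontacting participants, provided the screening fields were recorded.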
Traditionally, an interviewer may seek clarification or encourage completeness at the time of data acquisition. However, without such direct interaction, social media data are more subject to inconsistent, ambiguous, or missing entries.
In many traditional methods, the same individual may be interviewed repetitively to assess temporal changes. With social media data, this is not directly possible as there is no direct interaction with the public. However, longitudinal data may sometimes be deduced from posts over time even if not initiated by the researcher [1▪▪].
Data format structure
Traditionally, data are collected in narrowly defined prespecified fields with well delineated input types and defined valid ranges (e.g. ‘Enter age in years’ defines exactly where in the form information will appear and defines a range of valid responses). Conversely, social media typically provide both structured and free text data. Many sites such as Twitter and Facebook provide access to these data through application programming interfaces (APIs). For example, the Twitter Search API provides tweets containing specific keywords and retrieves tweets from up to 7 days before. Limits are set for the free data that are made available.
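The contrast between a prespecified field and free text can be sketched in Python. The function name and the valid range of 0 to 120 years are assumptions made for illustration; a real instrument would define its own range.

```python
def parse_age(raw, minimum=0, maximum=120):
    """Parse an 'Enter age in years' field.

    Returns the age as an integer, or None when the entry is not a whole
    number or falls outside the (assumed) valid range.
    """
    try:
        age = int(raw.strip())
    except ValueError:
        return None
    return age if minimum <= age <= maximum else None


# A structured field either validates or is rejected ...
print(parse_age("34"))
print(parse_age("250"))

# ... whereas free text mentioning an age needs further processing.
print(parse_age("I am in my early forties"))
```

A structured field can be checked mechanically in this way at the time of entry, whereas free text must first be interpreted before any comparable check is possible.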
In addition to information directly included in the content or the associated metadata, systematic analysis of tweet content has been shown to be reasonably accurate in deducing personal characteristics such as occupation, age, and socioeconomic status [1▪▪]. The tweeters’ consistent use of a personal handle permits integrating many posts.
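Integrating posts by handle can be sketched as a simple grouping step. The dictionary keys (`handle`, `created_at`, `text`) and the sample posts are hypothetical; they do not reproduce the field names of any particular API.

```python
from collections import defaultdict
from datetime import datetime


def timeline_by_handle(posts):
    """Group posts by handle and order each group chronologically."""
    grouped = defaultdict(list)
    for post in posts:
        grouped[post["handle"]].append(post)
    for handle_posts in grouped.values():
        handle_posts.sort(key=lambda p: datetime.fromisoformat(p["created_at"]))
    return dict(grouped)


# Invented sample posts with ISO 8601 timestamps.
posts = [
    {"handle": "@worker_a", "created_at": "2016-03-02T09:00:00",
     "text": "wheezing again after my shift"},
    {"handle": "@worker_b", "created_at": "2016-03-01T12:00:00",
     "text": "new respirator fits well"},
    {"handle": "@worker_a", "created_at": "2016-02-20T18:30:00",
     "text": "started the new job at the bakery"},
]

timelines = timeline_by_handle(posts)
for post in timelines["@worker_a"]:
    print(post["created_at"], post["text"])
```

Ordering each person's posts in time is also what allows longitudinal patterns to be inferred from data that were not collected for that purpose.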
Free text analysis
Manual coding is expensive, requires considerable expertise, and does not scale up to the large data sets available from social media. Fortunately, natural language processing (NLP) techniques can extract and classify relevant information from the free text [12–14]. Many off-the-shelf components provide high-quality processing (e.g. Stanford CoreNLP [15], GATE [16], or OpenNLP [https://opennlp.apache.org/]). Combined with the available vocabularies in medicine (e.g. the Unified Medical Language System), these techniques make sophisticated analysis possible. For example, we used NLP and clustering to describe topics discussed in asthma-related tweets [18].
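The basic idea of mapping free text onto a controlled vocabulary can be illustrated with a deliberately simple Python sketch. It uses only the standard library and a tiny hypothetical lexicon; it is not the pipeline used in the cited studies, and production toolkits handle tokenization, negation, and synonymy far more thoroughly.

```python
import re

# Hypothetical mini-lexicon mapping surface terms to concept labels.
LEXICON = {
    "wheeze": "symptom",
    "wheezing": "symptom",
    "cough": "symptom",
    "inhaler": "drug treatment",
    "perfume": "exposure",
    "dust": "exposure",
}


def tag_concepts(text):
    """Return the sorted concept labels whose terms occur in the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sorted({LEXICON[token] for token in tokens if token in LEXICON})


print(tag_concepts("Wheezing at work again; the perfume in the office set it off."))
# prints ['exposure', 'symptom']
```

Exact-match lookup of this kind misses misspellings and paraphrases, which is one reason the established NLP components and medical vocabularies are preferred for real analyses.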
AMT was used in a study [11▪] of work-related asthma by recruiting persons who had asthma and had worked outside the home for income. AMT effectively obtained a national sample within 2 days. The survey included questions about four work–asthma relationships: asthma caused by work (occupational asthma), worsened by work (work exacerbated asthma), interference with work activity (handicap), and workplace changes that may improve functional status (accommodation). The results showed that all four work–asthma interactions were frequent: 54% reported accommodation interactions, 36% reported adverse impact on work ability, 73% reported exacerbations, and 5% reported causation. Allowing free text input, unbiased by investigators’ preconceptions, showed that perfumes and odors were quite significant. Thus, the broader crowdsourced approach yielded information not commonly found from traditional methods focusing on either causation or exacerbation using ‘standard’ questions.
Despite the potential, there has been minimal application of Twitter for occupational health or respiratory/allergy matters. The thorough recent review of Sinnenberg et al.[1▪▪] found that Twitter was used most extensively for public health, infectious disease, and behavioral medicine/psychiatry. Furthermore, few of the studies harness the capability to use data other than the literal content.
Participatory research and collecting exposome data
In participatory research, participants choose how and when to participate and may also more directly express their goals.
The exposome constitutes the sum total of all exposures (chemical, biologic, psychosocial, and social) of an individual [19]. Integrating such physiologic systems and continuous exposure monitors through social media may markedly enhance occupational asthma and chronic obstructive pulmonary disease (COPD) surveillance and research. Continuous sensors for exposures and physiologic measures on a population basis are becoming available. Physiologic data are acquirable via social media sources for tracking asthma and COPD. The BKSpiro application uses Android phones to measure lung function from audio data captured when the user blows directly on the phone [20]. Compared with clinical data, errors were below 10%.
Crowdsourced data collection has been widely used in many nonpulmonary areas. Smartphones have been used to measure occupational noise exposure [21]. Freifeld et al.[22] review crowdsourced applications for tracking in public health, such as an app to track asthma attacks (Asthmapolis). Crowdsourced Twitter pictures were used to identify overweight people, finding strong geographic correlation with US obesity rates [23]. Ferdous et al.[24] used a smartphone app to predict perceived stress in the workplace based upon phone usage patterns and a support vector machine approach. Rajeswari and Anantharaman [25] used a survey to measure stress levels among software developers.
Assessing public concerns
In addition to active data acquisition using social media specifically designed for research purposes, passive approaches are useful. Two studies illustrate how social media provide valuable insights into perceptions and concerns of the general public.
Web activity concerning silicosis was tracked over 11 years by Bragazzi et al.[26▪▪]. They used Google Trends, Wikipedia traffic volumes, Google News, Google Scholar, YouTube, and Twitter in addition to tracking scientific publications. The study demonstrated significant temporal correlation across these media. Both informal and formal (dendrogram) analyses showed clustering of interest; the various measures of scientific production and media coverage clustered together, and public searching behavior was in a distinct cluster.
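Temporal correlation between two media streams can be quantified in several ways. The following Python sketch computes a Pearson correlation coefficient between two yearly series. It is offered only as an illustration of the concept; the choice of Pearson correlation, the function name, and the numbers are assumptions, not the method or data of the cited study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / sqrt(var_x * var_y)


# Invented yearly counts for two hypothetical streams.
news_items = [12, 15, 11, 20, 26, 31]
search_index = [30, 34, 29, 41, 55, 60]

print(round(pearson(news_items, search_index), 3))
```

A matrix of such pairwise coefficients across all streams is the usual starting point for hierarchical clustering of the kind summarized in a dendrogram.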
Harber and Leroy [7,27] obtained a large sample of tweets associated with obstructive lung disease with the Twitter Search API (72 000 tweets with ‘asthma’ and 16 427 with ‘#asthma’). Two separate analyses show congruent results. Automated analyses of general characteristics and manual coding of content of random subsets were conducted for both subsets.
The majority of such tweets were not generated by individuals but by institutions, many of which were commercial in nature. For example, 59% with ‘#asthma’ contain URLs, suggesting they are institutional rather than personal. Retweets were extremely common. Personal and unique tweets, not retweeted and not containing a URL, constituted only 12% of tweets containing ‘#asthma’ and 37% containing ‘asthma’.
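The general characteristics described above lend themselves to automated screening. The sketch below flags tweets as 'personal and unique' when they are neither retweets nor contain a URL. The heuristics (an 'RT @' prefix to mark a retweet, a simple URL pattern) and the sample tweets are assumptions for illustration and are not the rules used in the cited analyses.

```python
import re

# Simple pattern for http/https links (an illustrative assumption).
URL_PATTERN = re.compile(r"https?://\S+")


def is_retweet(text):
    """Heuristic: treat text starting with 'RT @' as a retweet."""
    return text.startswith("RT @")


def has_url(text):
    return URL_PATTERN.search(text) is not None


def is_personal_unique(text):
    return not is_retweet(text) and not has_url(text)


def share_personal_unique(tweets):
    """Fraction of tweets that are neither retweets nor contain a URL."""
    if not tweets:
        return 0.0
    return sum(is_personal_unique(t) for t in tweets) / len(tweets)


# Invented sample tweets.
tweets = [
    "RT @clinic: Five tips for managing #asthma",
    "New #asthma guidance released http://example.org/guide",
    "My #asthma flares every time I clean the fryers at work",
    "RT @news: pollen alert http://example.org/pollen",
]

print(share_personal_unique(tweets))  # prints 0.25
```

Surface features of this kind are cheap to compute over an entire data set, which is why they complement manual coding of smaller random subsets.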
The first 43 805 tweets were assessed using a semiautomated lexical analysis approach to determine the domains of greatest interest. Symptoms, nondrug treatment, drug treatment, and children were commonly identified. Prevention was identified in 2087, environmental concerns in 1651, and occupation in only 226. Even though the American Thoracic Society estimates that 25% of asthmatic patients have work exacerbation and 15% have work causation, workplace factors were rarely considered [28,29].
The studies also grouped the likely sources into several categories: news articles, personal, nongovernmental organizations, government organizations, professional organizations, professional individuals, and commercial sources. Of these, sources with greater face credibility (governmental, nongovernmental, and professional) were considerably less frequent than the other types.
Social media support new technologies for knowledge generation in addition to just reducing dependence on human coders. For example, data mining may identify previously unsuspected associations. This approach is widely used in commercial ventures, but has not yet been used for occupational lung disease. Such knowledge may include information not visible to the user (metadata). For example, Twitter provides information describing the sender of a message, such as location, number of linked/liked persons, and so on. Depending on the social media and their business model, different amounts of information are available to researchers free of charge. For example, Facebook and LinkedIn put stringent limitations on the information, and only ‘public’ information can be gathered outside of one's own circle.
The enormous size and scope of information and the availability of big data algorithms bring many advantages. However, they are counterbalanced by several limitations. For example, blindly seeking pairwise associations will lead to a large number of false positive findings, and the variable quality requires caution. Some ethical issues remain unresolved.
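The false positive problem can be made concrete with simple arithmetic. The Python sketch below counts the pairwise tests among a set of variables, the number of false positives expected if no true associations exist, and a Bonferroni-adjusted significance threshold. The choice of 100 variables and a significance level of 0.05 is purely illustrative, and Bonferroni adjustment is only one of several possible corrections.

```python
from math import comb


def pairwise_tests(n_variables):
    """Number of distinct variable pairs, n * (n - 1) / 2."""
    return comb(n_variables, 2)


def expected_false_positives(n_variables, alpha=0.05):
    """Expected false positives if every null hypothesis is true."""
    return pairwise_tests(n_variables) * alpha


def bonferroni_threshold(n_variables, alpha=0.05):
    """Per-test threshold that keeps the family-wise error rate at alpha."""
    tests = pairwise_tests(n_variables)
    if tests == 0:
        raise ValueError("at least two variables are needed")
    return alpha / tests


n = 100  # illustrative number of variables
print(pairwise_tests(n))
print(expected_false_positives(n))
print(bonferroni_threshold(n))
```

With 100 variables there are 4950 pairs, so roughly 250 nominally significant associations would be expected by chance alone at the 0.05 level.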
Occupational lung disease research lags behind many other areas of medicine in its use of social media. Structured intervention trials and direct comparisons of social media versus traditional data collection approaches are needed as well as analytic tools for trend analysis, NLP, and occupational lung disease ontologies. Widespread use of smartphones and social media permits obtaining high-quality data and providing interventions close to the worker/patient as well as detailed information gathered directly from the public. Smartphones can provide precise location and support creative approaches to assess environmental factors. In addition, social media may be leveraged to facilitate translation of research to practice.
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.
REFERENCES AND RECOMMENDED READING
Papers of particular interest, published within the annual period of review, have been highlighted as:
▪ of special interest
▪▪ of outstanding interest
1▪▪. Sinnenberg L, Buttenheim A, Padrez K, et al. Twitter as a tool for health research: a systematic review. Am J Public Health 2016. e1–e8. e-pub.
This is a thorough review of applications of Twitter for health-related research. Both the explicit content and metadata are useful for research studies. Only one study involved pulmonary or allergy medicine.
2. Tarlo S. Discussing occupational asthma risks and treatments [YouTube]. MD Magazine TV, 2014 [updated 11/9/16; cited 2016]. Available from: https://www.youtube.com/watch?v=1cD9ImMgKJE. [Accessed 16 November 2016].
4. Guitton MJ. Online maritime health information: an overview of the situation. Int Marit Health 2015; 66:139–144.
5▪▪. Pounds L, Duysen E, Romberger D, et al. Social marketing campaign promoting the use of respiratory protection devices among farmers. J Agromedicine 2014; 19:316–324.
This report provides a systematic approach for developing and objectively evaluating social media-based dissemination for occupational lung disease prevention.
6. Sublet V, Spring C, Howard J, et al. Does social media improve communication? Evaluating the NIOSH science blog. Am J Ind Med 2011; 54:384–394.
7. Leroy G, Harber P, Revere D. Conference on Grey Literature. Public sharing of medical advice using social media: an analysis of Twitter. Amsterdam; 2015.
8. Fox S. The Social Life of Health Information. Washington, DC:Pew Research Center; 2011.
9. Lenhart A, Purcell K, Smith A, et al. Part 3: social media. Social Media and Young Adults Pew Internet and American Life Project [Internet]. Washington, DC:Pew Research Center; 2010.
11▪. Harber P, Leroy G. Assessing work-asthma interaction with Amazon Mechanical Turk. J Occup Environ Med 2015; 57:381–385.
This study employed AMT to quickly obtain a national sample for a study about four types of work–asthma interactions. These were causation, exacerbation, work interference, and accommodation.
13. Russ DE, Ho KY, Colt JS, et al. Computer-based coding of free-text job descriptions to efficiently identify occupations in epidemiological studies. Occup Environ Med 2016; 73:417–424.
14. Burstyn I, Slutsky A, Lee DG, et al. Beyond crosswalks: reliability of exposure assessment following automated coding of free-text job descriptions for occupational epidemiology. Ann Occup Hyg 2014; 58:482–492.
15. Manning CD, Surdeanu M, Bauer J, et al. The Stanford CoreNLP Natural Language Processing Toolkit. 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations. 2014; 55–60.
16. Cunningham H, Maynard D, Bontcheva K, et al. GATE: a framework and graphical development environment for robust NLP tools and applications. 40th Anniversary Meeting of the Association for Computational Linguistics (ACL’02); 2002 July; Philadelphia; pp. 168–175.
18. Leroy G, Koolippurackal J, Swami S, et al. Reviewing asthma-related grey literature and personal opinions on Twitter using LDA and CTM clustering. AMIA Fall Symposium. November 2016; Chicago; p 148.
19. Wild CP. The exposome: from concept to utility. Int J Epidemiol 2012; 41:24–32.
20. Tran HA, Ngo QT, Pham HH. An application for diagnosing lung diseases on Android phone. 6th International Symposium on Information and Communication Technology. 2015; Hue City, Viet Nam:328–334.
21. Kardous CA, Shaw PB. Evaluation of smartphone sound measurement applications. J Acoust Soc Am 2014; 135:EL186–EL192.
22. Freifeld CC, Chunara R, Mekaru SR, et al. Participatory epidemiology: use of mobile phones for community-based health reporting. PLoS Med 2010; 7:e1000376.
23. Weber O, Mejova Y. Crowdsourcing health labels: inferring body weight from profile pictures. 6th International Conference on Digital Health. 2016; Montreal, QC, Canada:ACM, 105–109.
24. Ferdous R, Osmani V, Mayora O. Smartphone app usage as a predictor of perceived stress levels at workplace. 9th International Conference on Pervasive Computing Technologies for Healthcare (PervasiveHealth). Istanbul, Turkey; 2015; 225–228.
25. Rajeswari KS, Anantharaman RN. Development of an instrument to measure stress among software professionals: factor analytic study. SIGMIS. 2013; Philadelphia, Pennsylvania:ACM, 34–43.
26▪▪. Bragazzi N, Dini N, Toletone A, et al. Leveraging big data for exploring occupational diseases-related interest at the level of scientific community, media coverage and novel data streams: the example of silicosis as a pilot study. PLoS One 2016; 11:
The investigators innovatively use multiple social media metrics to assess interest in silicosis. Interest of the public as well as scientific reports was assessed. Correlations among social media types and temporal trends were identified. This article is an excellent example of types of available information.
27. Harber P, Leroy G. Social media (Twitter) for assessing concerns about obstructive airway disease. Am J Respir Crit Care Med 2016; 193:A-2009.
28. Henneberger PK, Redlich CA, Callahan DB, et al. An official American Thoracic Society statement: work-exacerbated asthma. Am J Respir Crit Care Med 2011; 184:368–378.
29. Tarlo SM, Balmes J, Balkissoon R, et al. Diagnosis and management of work-related asthma: American College Of Chest Physicians Consensus Statement. Chest 2008; 134 (3 Suppl):1S–41S.