
      Solution to Detect, Classify, and Report Illicit Online Marketing and Sales of Controlled Substances via Twitter: Using Machine Learning and Web Forensics to Combat Digital Opioid Access




          On December 6 and 7, 2017, the US Department of Health and Human Services (HHS) hosted its first Code-a-Thon event aimed at leveraging technology and data-driven solutions to help combat the opioid epidemic. The authors—an interdisciplinary team from academia, the private sector, and the US Centers for Disease Control and Prevention—participated in the Code-a-Thon as part of the prevention track.


          The aim of this study was to develop and deploy a methodology using machine learning to accurately detect the marketing and sale of opioids by illicit online sellers via Twitter as part of participation at the HHS Opioid Code-a-Thon event.


          Tweets were collected from the Twitter public application programming interface stream filtered for common prescription opioid keywords in conjunction with participation in the Code-a-Thon from November 15, 2017 to December 5, 2017. An unsupervised machine learning–based approach was developed and used during the Code-a-Thon competition (24 hours) to obtain a summary of the content of the tweets to isolate those clusters associated with illegal online marketing and sale using a biterm topic model (BTM). After isolating relevant tweets, hyperlinks associated with these tweets were reviewed to assess the characteristics of illegal online sellers.
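          The keyword filter and the word-pair ("biterm") representation that a biterm topic model works on can be sketched as follows. This is a minimal illustration and not the authors' code: the keyword set is taken from the results reported in this abstract, while the tokenizer, the function names, and the sample tweets are assumptions made for the example, and the topic inference step itself is omitted.

```python
import re
from collections import Counter
from itertools import combinations

# Keywords as listed in the abstract's results.
OPIOID_KEYWORDS = {
    "codeine", "percocet", "vicodin", "oxycontin",
    "oxycodone", "fentanyl", "hydrocodone",
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def matches_keywords(text):
    """Return True if any token of the tweet is one of the opioid keywords."""
    return any(token in OPIOID_KEYWORDS for token in tokenize(text))


def biterms(text):
    """Return the unordered pairs of distinct words that co-occur in one tweet.

    Simplification: repeated words are collapsed before pairing.
    """
    tokens = sorted(set(tokenize(text)))
    return list(combinations(tokens, 2))


# Hypothetical sample tweets, for illustration only.
sample_tweets = [
    "Buy oxycodone online no prescription",
    "Great weather in San Diego today",
    "New report on fentanyl overdose deaths",
]

filtered = [t for t in sample_tweets if matches_keywords(t)]
biterm_counts = Counter(b for t in filtered for b in biterms(t))

print(len(filtered), "of", len(sample_tweets), "sample tweets matched a keyword")
print(biterm_counts.most_common(3))
```

          A topic model fitted on such corpus-wide biterm counts, rather than on per-document word counts, is what makes the approach suited to very short texts such as tweets.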


          We collected and analyzed 213,041 tweets over the course of the Code-a-Thon containing the keywords codeine, percocet, vicodin, oxycontin, oxycodone, fentanyl, and hydrocodone. Using BTM, 0.32% (692/213,041) of tweets were identified as being associated with illegal online marketing and sale of prescription opioids. After removing duplicates and dead links, we identified 34 unique “live” tweets, with 44% (15/34) directing consumers to illicit online pharmacies, 32% (11/34) linked to individual drug sellers, and 21% (7/34) used by marketing affiliates. In addition to offering the “no prescription” sale of opioids, many of these vendors also sold other controlled substances and illicit drugs.
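          The reported proportions can be checked with a few lines of arithmetic. The counts below are the ones stated in the results; the variable names are illustrative only.

```python
TOTAL_TWEETS = 213_041   # tweets collected and analyzed
FLAGGED_TWEETS = 692     # tweets in clusters tied to illegal marketing and sale
LIVE_TWEETS = 34         # unique "live" tweets after removing duplicates and dead links

# Category counts among the live tweets, as reported in the abstract.
categories = {
    "illicit online pharmacies": 15,
    "individual drug sellers": 11,
    "marketing affiliates": 7,
}

flagged_pct = round(100 * FLAGGED_TWEETS / TOTAL_TWEETS, 2)
category_pct = {name: round(100 * n / LIVE_TWEETS) for name, n in categories.items()}
categorized = sum(categories.values())

print(f"flagged: {flagged_pct}% of all tweets")
for name, pct in category_pct.items():
    print(f"{name}: {pct}% of live tweets")
print(f"categorized: {categorized} of {LIVE_TWEETS} live tweets")
```

          The three categories account for 33 of the 34 live tweets; the abstract does not say how the remaining tweet was classified.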


          The results of this study are in line with prior studies that have identified social media platforms, including Twitter, as a potential conduit for the supply and sale of illicit opioids. To translate these results into action, the authors also developed a prototype wireframe for detecting, classifying, and reporting tweets from illicit online pharmacies selling controlled substances illegally to the US Food and Drug Administration and the US Drug Enforcement Administration. Further development of solutions based on these methods has the potential to proactively alert regulators and law enforcement agencies to illegal opioid sales, while also making the online environment safer for the public.


                Author and article information

                J Med Internet Res
                Journal of Medical Internet Research
                JMIR Publications (Toronto, Canada)
                Published: 27 April 2018
                Volume: 20
                Issue: 4
                1 Division of Infectious Disease and Global Public Health, Department of Anesthesiology, School of Medicine, University of California San Diego, La Jolla, CA, United States
                2 Department of Electrical and Computer Engineering, Jacobs School of Engineering, University of California San Diego, La Jolla, CA, United States
                3 IBM Global Business Services, Washington, DC, United States
                4 Centers for Disease Control and Prevention, Atlanta, GA, United States
                Author notes
                Corresponding Author: Tim Mackey, tmackey@ucsd.edu
                ©Tim Mackey, Janani Kalyanam, Josh Klugman, Ella Kuzmenko, Rashmi Gupta. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 27.04.2018.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.

                Original Paper

