Project description
In 2016, the public began to hear about Twitter being used to influence public opinion. Although this influence was beginning to surface, the term “fake news” had not yet entered the social lexicon. In 2022, however, the term and its effects fully entered the public consciousness. Today, social media is a powerful influence on many aspects of modern life. Although media and technology have always been used to disseminate ideologies, the speed and reach with which they now spread on social media are unprecedented. One example of disinformation’s powerful impact on social media is its ability to sway public opinion, and election meddling significantly affects democratic processes. This new social condition is further complicated by online astroturfing efforts, in which computer programs (automated malicious social bots) that “take on inauthentic personas and are controlled by unknown entities” are specifically designed to deceive and create the appearance of a grassroots movement.
ECHO was developed in 2020 to track the spread of disinformation on Twitter. It live-tracked virality, visualized influence dynamics, and identified astroturfing efforts during the 2020 U.S. Presidential Election. We strategically combined the Twitter Premium search API and the Standard search API to keep data collection costs affordable while achieving high data fidelity. The tracking mechanism also identified malicious Twitter bot campaigns and visualized, in real time, the spread of disinformation and its associated top influencers. Ninety percent of the dataset retained the full original data fidelity of the Twitter landscape at the time.
Dataset:
The project concluded with a significant dataset covering 204 disinformation topics from April 28, 2020, to February 20, 2021: 2,623,792 Tweets, 1,030,066 users, and 2,365,660 instances of user interaction. We found 103,879 accounts displaying a high probability of being fully automated social bots carrying out astroturfing efforts designed to deceive and create the appearance of a grassroots movement. We captured these Tweets before Twitter's mass deletion of major disinformation spreaders in January 2021, after the Capitol insurrection. The record contains the now-deleted @realdonaldtrump account and 70,000 QAnon accounts, along with the Tweets they posted.
Visualization:
Each disinformation topic includes three sets of charts: a Seismograph that monitors live Twitter activity on the topic; a set of Influencer Bar Charts that show when top influencers appeared on the network and the likelihood that each influencer is a Twitter bot; and an animated Social Diagram that maps user interactions on the topic. Live data was downloaded from Twitter to the backend database every 15 minutes; analytics and visualizations were then refreshed on the frontend accordingly.
Process:
Step 1: Curation of Journalist-verified Disinformation
Each week, we manually curated a handful of election-related, journalist-verified disinformation claims from nonpartisan fact-checking websites such as PolitiFact, Snopes, FactCheck, Truth or Fiction, Hoax Slayer, and Urban Legends.
Step 2: Data Collection
Topics were then queried live on Twitter using keyword iteration methods and API-specific operators.
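As a rough illustration of how keyword variants for one topic might be combined into a single search string, here is a minimal Python sketch. It is not the project's actual query code: the function name `build_query`, the placeholder keywords, and the `max_len` check are all assumptions made for this example. It relies only on two widely documented Twitter search operators, `OR` between terms and double quotes for an exact phrase. The 500-character default mirrors the limit documented for the Standard search API's query parameter and may not apply to other tiers.

```python
def build_query(phrases, max_len=500):
    """Join keyword variants for one topic into a single OR query string.

    Hypothetical helper for illustration; not the ECHO implementation.
    """
    terms = []
    for phrase in phrases:
        phrase = phrase.strip()
        if not phrase:
            continue  # skip empty entries
        # Quote multi-word phrases so they are matched as an exact phrase.
        terms.append(f'"{phrase}"' if " " in phrase else phrase)
    if not terms:
        raise ValueError("at least one keyword is required")
    query = " OR ".join(terms)
    # Assumed length limit; adjust for the API tier in use.
    if len(query) > max_len:
        raise ValueError(f"query is {len(query)} characters; limit is {max_len}")
    return query


if __name__ == "__main__":
    # Placeholder keywords, not taken from the dataset.
    print(build_query(["example claim", "#exampletag", "exampleword"]))
```

Iterating on the keyword list and re-running the builder is one simple way to broaden or narrow a topic query while staying inside the length limit.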
Step 3: Data Analysis
We then performed data analyses to find the top influencers in each topic. A top influencer was defined as an account whose Influence Score was above the 95th percentile (that is, in the top five percent) of scores collected in a topic. The Influence Score was calculated as the sum of all retweets, replies, quotes, and faves a Twitter account received. We also ran “botness” calculations using the Botometer Pro API developed by Indiana University's Observatory on Social Media (OSoMe). For each topic, we indicated when these top influencers appeared and how likely each user was to be a human versus a Twitter bot.
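The Influence Score and the top-five-percent cutoff described above can be sketched in a few lines of Python. This is an illustrative reading of the definition, not the project's code: the record fields (`user`, `retweets`, `replies`, `quotes`, `faves`), the function names, and the nearest-rank percentile method are assumptions chosen for the example.

```python
from collections import defaultdict


def influence_scores(tweets):
    """Sum retweets, replies, quotes, and faves received per account."""
    scores = defaultdict(int)
    for t in tweets:
        scores[t["user"]] += t["retweets"] + t["replies"] + t["quotes"] + t["faves"]
    return dict(scores)


def top_influencers(scores, percentile=95):
    """Return accounts scoring strictly above the given percentile.

    Uses the nearest-rank percentile (an assumption; other percentile
    definitions give slightly different cutoffs on small samples).
    """
    if not scores:
        return []
    ranked = sorted(scores.values())
    # Nearest-rank index, computed with integer arithmetic (ceiling division).
    index = (percentile * len(ranked) + 99) // 100 - 1
    threshold = ranked[index]
    winners = [user for user, s in scores.items() if s > threshold]
    # Highest score first, ties broken by account name for stable output.
    return sorted(winners, key=lambda u: (-scores[u], u))


if __name__ == "__main__":
    # Synthetic records for illustration only.
    sample = [
        {"user": f"u{i}", "retweets": i, "replies": 0, "quotes": 0, "faves": 0}
        for i in range(1, 21)
    ]
    print(top_influencers(influence_scores(sample)))
```

Because the cutoff is strict ("greater than"), very small topics may yield no top influencers under this reading; a non-strict comparison would be an equally defensible choice.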
Step 4: Data Visualization
Our tracking page charted these analyses and updated them as new Twitter activity occurred. All activity was registered on the charts as soon as backend calculations were completed. A countdown bar at the top of the page indicated the time remaining before the next data query occurred.
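The countdown logic can be approximated as follows. This is a minimal Python sketch, assuming the 15-minute query interval mentioned in the Visualization section and assuming the page knows the timestamp of the last completed query; the function names and parameters are hypothetical and do not describe the actual frontend.

```python
from datetime import datetime, timedelta, timezone

QUERY_INTERVAL = timedelta(minutes=15)  # interval stated in the Visualization section


def seconds_until_next_query(last_query, now, interval=QUERY_INTERVAL):
    """Seconds left before the next scheduled query (never negative)."""
    remaining = (last_query + interval) - now
    return max(0.0, remaining.total_seconds())


def countdown_fraction(last_query, now, interval=QUERY_INTERVAL):
    """Fraction of the countdown bar still remaining, from 1.0 down to 0.0."""
    return seconds_until_next_query(last_query, now, interval) / interval.total_seconds()


if __name__ == "__main__":
    last = datetime.now(timezone.utc) - timedelta(minutes=5)
    now = datetime.now(timezone.utc)
    print(round(seconds_until_next_query(last, now)))
    print(round(countdown_fraction(last, now), 2))
```

Clamping the remaining time at zero keeps the bar from going negative when a query or its backend calculations run late.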
Keywords:
2020 US election, disinformation, fake news spread, Twitter, social media, astroturfing, tracking, visualization, truth-telling, historical artifact, public access, American history, Tweet database
Principal Investigator: Jiayi Young (Associate Professor, Department of Design, University of California, Davis)
Political Science: Amber Boydstun (Professor, Political Science, University of California, Davis)
Engineering: Scott Gong (M.S. student), Taha Bouchoucha (Ph.D. student), Zhi Ding (Professor, Electrical and Computer Engineering, University of California, Davis)
Graphic Design: Olivia Kotlarek, Gale Okumura (advisory)
Consultants: Luca Hammer, Shih-Wen Young