
How University Experts Use IT and Supercomputers to Fight COVID-19 Misinformation

(Image: zeroes and ones forming a vortex with light at the center. Credit: iStock/Mike_Kiev)

When it comes to information about COVID-19, you can have too much of a good thing, says Dr. Ian Brooks, director of the Center for Health Informatics at the University of Illinois. “[There is an] overwhelming amount of information that is accompanying the pandemic, some of it good, some of it bad, some of it misinformation,” he says.

To help fight the confusion caused by this “infodemic” and the onslaught of fake news, Brooks and his center have teamed up with the World Health Organization and the Pan American Health Organization to research misinformation through data and IT. This is one of many instances in which higher ed researchers are using big data, algorithms, and supercomputing to support the public health response to the pandemic.

Traversing The Twitterverse  

At the University of Illinois, Brooks leads a team of 31 volunteers working on seven projects. These projects are managed remotely through EduSourced, a software platform designed to provide career readiness through experiential and project-based learning. 

The team’s focus so far has been on social media communication. “We’re seeing what people, particularly in the Americas, are saying online, what they’re searching for through Google search trends,” says Brooks. His team has access to a Twitter resource that provides 100% of tweets.

To narrow down the data set of more than 100 million tweets, Brooks and his team use advanced techniques. “We’ve been using some built-in analytics, some automated sentiment analysis, and automated emotional analysis based on machine learning and AI techniques,” he says. 
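The article does not detail the center’s models, but a minimal sketch of automated sentiment scoring over tweet text, using the open-source NLTK VADER analyzer as a stand-in for the team’s own machine-learning pipeline and with invented sample tweets, might look like this:

```python
# Minimal sketch of automated tweet sentiment scoring.
# NLTK's VADER lexicon stands in for the team's own ML models;
# the sample tweets are invented for illustration.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download
sia = SentimentIntensityAnalyzer()

tweets = [
    "Masks in every store now, glad people are taking this seriously.",
    "Another lockdown rumor going around, this is exhausting.",
]

for text in tweets:
    scores = sia.polarity_scores(text)  # returns neg/neu/pos/compound scores
    label = "positive" if scores["compound"] > 0.05 else (
        "negative" if scores["compound"] < -0.05 else "neutral")
    print(f"{label:8s} {scores['compound']:+.2f}  {text}")
```

A lexicon-based scorer like this is only a rough proxy for the ML- and AI-based emotional analysis Brooks describes, but it illustrates how each tweet can be reduced to a sentiment label at scale.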

Specifically, they have analyzed topics such as the different terms people use to describe face coverings. “In English, it’s one or two terms; in Spanish it’s almost a dozen different terms that people are using,” Brooks says. The team has also analyzed the information being distributed by public health officials. “We looked at all the posts from the 48 ministries of health in the Americas, that’s all the countries and territories, and compared what they were saying to what the people in their countries were saying. We found quite a lot of difference, so that was communicated back to the ministries of health.”
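A term-frequency comparison like the face-covering example can be done with straightforward counting; the vocabulary lists and sample tweets below are illustrative stand-ins, not the team’s actual data:

```python
# Illustrative term-frequency count for face-covering vocabulary.
# The term lists and tweets are examples, not the study's actual data.
from collections import Counter

FACE_COVERING_TERMS = {
    "en": ["face mask", "face covering"],
    "es": ["mascarilla", "cubrebocas", "tapabocas", "barbijo", "nasobuco"],
}

def count_terms(tweets, terms):
    """Count how often each term appears across a list of tweet texts."""
    counts = Counter()
    for text in tweets:
        lowered = text.lower()
        for term in terms:
            if term in lowered:
                counts[term] += 1
    return counts

spanish_tweets = [
    "Usa tu cubrebocas en el transporte público.",
    "¿Dónde consigo una mascarilla N95?",
    "El tapabocas es obligatorio desde hoy.",
]

print(count_terms(spanish_tweets, FACE_COVERING_TERMS["es"]))
```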

The hope is that this type of information will help public health ministries develop more effective public health strategies in the future. 

David Comisford, Founder & CEO of EduSourced, says his company’s software is perfect for this type of work because it allows students to get hands-on experiential training. “Our system has some preset content and settings, so you can really start plugging in your projects with students and hit the go button,” Comisford says. “That’s part of the value of our system.”  

University Supercomputing  

Before researchers can effectively analyze tweets, they need access, and that’s where supercomputing comes in. The Texas Advanced Computing Center (TACC) at The University of Texas at Austin is able to analyze roughly 40 million tweets per day. Researchers at TACC have combined their efforts with groups at the University of Southern California and Georgia State University to create collections of pandemic-related tweets going back to January. That work has resulted in GitHub repositories from which researchers can access raw data and large-scale analyses.

“Our project is focusing on collecting tweets that might be relevant to COVID-19,” says Dr. Weijia Xu, who leads the center’s Data Mining & Statistics group. “We have used a list of keywords including ‘COVID,’ ‘coronavirus,’ ‘Chinese virus,’ ‘school closure,’ ‘school closed,’ ‘food scarcity,’ ‘water contamination,’ and ‘reopen business.’” 

“Our goal for this project is to accumulate a set of tweets to support research work, such as studying potential misinformation in social media and information flow,” he adds. “This research has the potential to improve the effectiveness and accuracy of passing public information.”
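The keyword matching Xu describes can be approximated with a simple filter over tweet text. The sketch below reuses the keyword list he quotes; the incoming tweets and the case-insensitive substring matching are assumptions for illustration:

```python
# Sketch of keyword-based filtering of a tweet stream, using the keyword
# list quoted by Dr. Xu. The tweet source and matching details are assumptions.
KEYWORDS = [
    "covid", "coronavirus", "chinese virus", "school closure",
    "school closed", "food scarcity", "water contamination", "reopen business",
]

def is_relevant(tweet_text: str) -> bool:
    """Return True if the tweet mentions any tracked COVID-19 keyword."""
    lowered = tweet_text.lower()
    return any(keyword in lowered for keyword in KEYWORDS)

incoming = [
    "City council votes to reopen business districts next week",
    "Great weather for a picnic today",
]

relevant = [t for t in incoming if is_relevant(t)]
print(relevant)  # keeps only tweets matching the keyword list
```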

TACC is home to the Frontera supercomputer, launched in 2019 and the eighth most powerful system in the world, as well as the most powerful on a university campus. In addition to building databases of tweets, TACC supercomputers are being used to model the effects of COVID-19 interventions, simulate the molecular behavior of the virus, screen new treatments for the disease, and visualize and interactively share data with decision-makers and collaborators to speed progress against the virus.

Staying On Top of Your Campus Response

Concerns about how misinformation spreads are not limited to health officials. With many college campuses attempting to reopen to students this fall, it’s important for administrators to make sure accurate and effective public health messages are distributed across campus.

Brooks says the same public messaging principles apply on campus as elsewhere. “You have to be open,” he says. “You have to be consistent. You have to be honest. There are lots of things that we still don’t know, and it’s okay to say that. It’s better to say that than it is to give a false impression that everything is total chaos or everything is under control.”