a global affairs media network

www.diplomaticourier.com

New Designs to Fix the Search Engine Bias Problem

Photo by Shahadat Rahman via Unsplash.

January 29, 2022

Tech companies looking to design next-gen search engines tout the potential of natural language processing (NLP), where GPT-3 leads the field. Yet research shows that, for all this promise, daunting bias challenges remain. DisinfoLab's directors discuss potential ways to mitigate the bias dilemma.

Last summer, researchers from Google began reimagining how future search engines might function, summarizing their findings in a paper titled Rethinking Search: Making Domain Experts out of Dilettantes. In that paper, Google researchers explored how innovations in AI––specifically natural language generation––can allow the search engine to play the role of an expert, aggregating information from a variety of sources into one cohesive response to a user’s question. The prospects of such tools are exciting, but there is an underappreciated danger that must be addressed – bias.

The Promise of Natural Language Processing

Companies have been intrigued by the possibilities of incorporating natural language processing (NLP) into searches. Apple’s Siri and Amazon’s Alexa exemplify existing attempts to have a search engine succinctly answer a user’s specific question by pulling from web sources. To date, these systems remain limited – for some queries, they will simply pull from Tweets or Wikipedia pages if the text contains the same syntax as the user’s question.

Google’s vision for future search engines goes a step further. Rather than pulling from a single source online, researchers imagine that this new system––model-based information retrieval––will use advanced NLP systems to synthesize various sources online to provide a custom answer to a question. The Rethinking Search paper illustrates this ambition. When a user asks the search engine about the health benefits and risks of red wine, for instance, the AI “would give you a coherent and authoritative answer laying out the evidence for both benefits and risks.”
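To make the idea concrete, here is a minimal sketch of the retrieve-then-synthesize flow such a system implies. It is an illustration only: search_index and generate_answer are hypothetical stand-ins for a retriever and a generative model, not Google's system or any real API.

```python
# Illustrative sketch of model-based information retrieval:
# retrieve relevant passages, then synthesize them into one answer.
from typing import List


def search_index(query: str) -> List[str]:
    """Hypothetical retriever: return text passages relevant to the query."""
    return [
        "Moderate red wine intake has been linked to heart health in some studies.",
        "Alcohol consumption raises the risk of several cancers.",
    ]


def generate_answer(query: str, passages: List[str]) -> str:
    """Hypothetical generator: an NLP model would synthesize the passages here.
    This stub simply concatenates them to show the data flow."""
    evidence = " ".join(passages)
    return f"Question: {query}\nSynthesized answer ({len(passages)} sources): {evidence}"


if __name__ == "__main__":
    query = "What are the health benefits and risks of red wine?"
    print(generate_answer(query, search_index(query)))
```

The important design point is the second step: instead of returning a ranked list of links, the system composes a single answer, which is exactly where any bias in the underlying model surfaces directly to the user.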

NLP’s Lingering Bias Problem

NLP technology for these next-gen search engines is still in development. OpenAI’s GPT-3 model leads the field – trained on terabytes of data from the internet and books, the system responds to user prompts with remarkably human-like text. GPT-3 entered a private beta in 2020 and was made publicly available in November 2021.
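For readers unfamiliar with how such a model is used, the snippet below shows roughly how a prompt was sent to GPT-3 through OpenAI's Python library as it existed around the time of writing; the placeholder API key, engine choice, and parameter values are illustrative, and newer versions of the library use a different interface.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder; a real key is required

# Ask the model to complete a prompt, much like a user typing a query.
response = openai.Completion.create(
    engine="davinci",  # GPT-3 base engine name used at the time
    prompt="What are the health benefits and risks of red wine?",
    max_tokens=100,
    temperature=0.7,
)

print(response["choices"][0]["text"])
```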

While the NLP technical foundations exist for the future search engines envisioned by the Rethinking Search paper, the researchers’ vision remains aspirational. They stress the need for a revamped data cataloging system that can mark which sources are authoritative and which shouldn’t be trusted. They also recommend that NLP systems eliminate bias so that they do not reflect pervasive societal prejudices.

Addressing bias in GPT-3 is paramount. According to a new report released by DisinfoLab, an undergraduate research lab at the College of William & Mary’s Global Research Institute, GPT-3 suffers from high rates of bias toward various identity groups. Not only can GPT-3’s use in a search engine promote biased claims, but it will also likely contribute to the spread of mis- and disinformation. Biased searches lead to biased sources.

DisinfoLab investigated four types of identity bias: gender; sexual orientation and sexuality; race, ethnicity, and nationality; and religion. Across 1,645 text predictions, GPT-3 produced generations that were negative with respect to the subject group 43.83% of the time. For comparison, Google Search generations were negative with respect to the subject group 30.15% of the time. 

While both negativity rates are high and should be mitigated, GPT-3’s negativity rate is 13.68 percentage points higher than Google’s, a gap that is meaningful, especially at scale. Google processes an estimated 3.5 billion search queries per day. A GPT-3 powered search engine receiving similar traffic (as would be the goal for next-gen search engines) would produce roughly 478,800,000 more biased generations every day than today’s iteration of Google Search. Before building the next generation of search tools off such models, AI and search engine companies alike must address the prejudices and inaccuracies enshrined in their algorithms.
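The scale estimate above is simple arithmetic; a quick back-of-the-envelope check, using only the figures already quoted in this article, is shown below.

```python
# Back-of-the-envelope check of the scale estimate above, using the figures
# quoted in the article (the query volume is an estimate, not a measured value).
daily_queries = 3_500_000_000       # estimated Google searches per day
gpt3_negative_rate = 0.4383         # DisinfoLab: share of negative GPT-3 predictions
google_negative_rate = 0.3015       # DisinfoLab: share of negative Google predictions

extra_negative_per_day = daily_queries * (gpt3_negative_rate - google_negative_rate)
print(f"{extra_negative_per_day:,.0f} additional negative generations per day")
# -> 478,800,000
```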

Mitigating Bias 

DisinfoLab has proposed several recommendations for mitigating bias in GPT-3, including active moderation of bias-laden phrases, reevaluating the model’s training data set for future development, and consulting individuals from various at-risk identity groups on how to mitigate such biases. Each of these measures would contribute to an NLP model better suited to building the next generation of search engines. Still, tech companies should recognize the limits of such models in the context of search. While it is possible to mitigate bias in results, eliminating it is currently beyond us.
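As an illustration of what the first recommendation could look like in practice, here is a minimal sketch of active moderation, assuming a hand-curated list of bias-laden phrases; a deployed system would need far richer detection and human review.

```python
# Minimal sketch of active moderation: screen a model's output against a
# curated list of bias-laden phrases before showing it to the user.
# The phrase list uses neutral placeholders rather than real examples.
FLAGGED_PHRASES = {"placeholder biased phrase", "another flagged phrase"}


def moderate(generation: str) -> str:
    """Return the generation, or withhold it if it contains a flagged phrase."""
    lowered = generation.lower()
    if any(phrase in lowered for phrase in FLAGGED_PHRASES):
        return "[Response withheld pending human review]"
    return generation


print(moderate("This group placeholder biased phrase."))
# -> [Response withheld pending human review]
```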

First, identity biases are sensitive and complex topics. No AI-powered search engine can serve as an “authoritative source” on such issues. Given existing bias in GPT-3, models that attempt to do so will likely offer results that are reductionist at best and outright biased at worst. Second, even beyond identity biases, NLP models may be unable to develop comprehensive and factual results. An MIT Technology Review article on Rethinking Search highlights obstacles rooted in NLP models’ training data. These data sets may contain little information on niche topics, leaving the model with a limited number of sources to pull from. Furthermore, the data sets tend to be predominantly in English, limiting the capability of such models on non-English queries. With these limitations in mind, developers in this space should carefully evaluate each query to determine when to use next-gen search models and when to fall back on existing ones.
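One way to operationalize that evaluation is a simple routing layer in front of the generative model. The sketch below is illustrative only; the topic list, language check, and source threshold are assumptions, not recommendations from the paper or the DisinfoLab report.

```python
# Illustrative routing layer: decide whether a query should receive a
# generated answer or fall back to a conventional results page.
SENSITIVE_TOPICS = {"gender", "sexuality", "race", "ethnicity", "nationality", "religion"}


def route_query(query: str, language: str, estimated_sources: int) -> str:
    words = set(query.lower().split())
    if words & SENSITIVE_TOPICS:
        return "conventional search"  # avoid "authoritative" generated answers on identity topics
    if language != "en":
        return "conventional search"  # training data is predominantly English
    if estimated_sources < 5:
        return "conventional search"  # niche topic: too few sources to synthesize from
    return "generative answer"


print(route_query("health benefits and risks of red wine", "en", estimated_sources=40))
# -> generative answer
```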

Advancements in AI may revolutionize the way we search the web. While the prospects are exciting, we must ensure that these changes are for the better, not for the worse. We need to address bias in NLP models like GPT-3 while recognizing that, for some topics, rethinking search may do more harm than good.

About Aaraj Vij:
Aaraj Vij, Co-Director of DisinfoLab, is a junior at the College of William & Mary studying computer science and international relations. Drawing on his education, he researches both policy and technical strategies to counteract online disinformation.
About Thomas Plant:
Thomas Plant is an analyst at Valens Global and supports the organization’s work on domestic extremism. He is also an incoming Fulbright research scholar to Estonia and the co-founder of William & Mary’s DisinfoLab, the nation’s first undergraduate disinformation research lab.
About Jeremy Swack:
Jeremy Swack, Technical Director of DisinfoLab, is a sophomore studying Computer Science and Data Science at the College of William & Mary. Using his background in machine learning and data visualization, he researches data-driven methods to analyze and predict the spread of disinformation.
About Megan Hogan:
Megan Hogan, Founding Co-Director of DisinfoLab, is a Research Analyst at the Peterson Institute for International Economics, where she works on economic sanctions and tax issues.
The views presented in this article are the authors’ own and do not necessarily represent the views of any other organization.