We all want to win our N.C.A.A. tournament bracket pools. Could Microsoft’s new Bing chatbot help?
After all, chatbots seem able to do anything these days. The Bing bot is frequently impressive (if occasionally creepy) and is based on the new GPT-4 artificial intelligence system. Unlike its cousin ChatGPT, Bing has access to information about this year’s teams, and it seems more willing to make predictions.
Our colleague Sarah Lyall engaged in a long exchange with Bing about college basketball, but it wouldn’t churn out an entire bracket’s worth of predictions in a single response. So, we asked for its picks round by round, region by region.
What about Duke vs. Oral Roberts?
Duke vs. Oral Roberts: I would pick Duke as they are a No. 5 seed and have been on a nine-game winning streak that included an ACC Tournament title. Oral Roberts is a No. 12 seed and the Summit League Tournament champion, but they have not faced many tough opponents this season. Duke also has an advantage in offensive rebounding and height, which could make it hard for Oral Roberts to keep up with them.
Then we filled out our own bracket, using Bing’s responses. Here’s what it recommended for the men’s bracket:
The bot won’t respond the same way every time, and the phrasing of a question matters. Simply asking which team is more likely to win each game might yield a bracket with no upsets. Instead, we asked Bing to try to win a bracket pool while accounting for potential upsets. We also kept reminding it to use information about this year’s teams, since it would often refer to past seasons or players despite its ability to incorporate more current details.
The resulting men’s bracket did contain upsets, including No. 3-seeded Baylor winning the championship. Here’s how the bot explained its pick of Baylor overcoming the tournament’s top seed, Alabama:
Let’s say that Alabama plays Baylor in the Elite Eight of the 2023 NCAA men’s basketball tournament. Alabama is a 1-seed, while Baylor is a 3-seed. Make a selection for this game, as if you were filling out a bracket in an office pool, attempting to win the pool while accounting for potential upsets. Be sure to use information about the teams and their seedings in the 2023 tournament, not previous years!
This is a tough one, but I would go with Baylor to upset Alabama and reach the Final Four. Baylor has a lot of experience and toughness, and they have a coach who has won a national title before. Alabama is a very good team that won the SEC tournament and has a high-powered offense, but they might be vulnerable to Baylor’s defense and rebounding.
(Bing also said that Baylor had won two of its previous three meetings against Alabama, but the reverse is true.)
Bing picked a Final Four without any No. 1 seeds; that has happened only three times since the tournament began seeding teams in 1979: in 1980, 2006 and 2011. Is the bot correct that we’re due for another? Is this a good strategy to win a pool this year? We can only wait and see.
Perhaps wisely, the bot picked a more traditional Final Four in the women’s bracket, where upsets have been less common. And it made the safest possible pick for the winner: undefeated South Carolina.
The Bing chatbot wasn’t exactly designed to compete with expert forecasts or mathematical tournament prediction models. Microsoft has said that its system has struggled to keep up with live sports information. The chatbot frequently cited outdated or incorrect details about teams, even if its overall impressions seemed valid.
So taking Bing’s advice — with its dash of unpredictability — is probably just as good as other amateur bracket strategies, like picking which teams’ mascots would beat the others.
Speaking of which, we did, in fact, ask Bing which teams’ mascots would beat the others.
South Carolina (Gamecocks) vs. Norfolk State (Spartans): I think the Spartans would win this matchup. They are trained warriors who can fight with swords and shields, while the Gamecocks are just roosters who might peck at their ankles.
Purdue (Boilermakers) vs. Texas Southern (Tigers): A boilermaker is a person who makes or operates steam boilers. A tiger is a large and powerful feline predator. In a fight, I think the tiger would win because it is faster, stronger and more agile than a human with a boiler.
Miami (Hurricanes) vs. Drake (Bulldogs): A hurricane wins. A hurricane is a violent storm with strong winds and rain that can cause massive destruction. A bulldog is a medium-sized dog with a wrinkled face and a short muzzle that was originally bred for bull-baiting. In a fight, I think the hurricane would win because it has more force and impact than a dog with strong jaws.
You can be the judge of these judgment calls. (You can also ask ChatGPT similar questions and get similarly humorous responses; the Bing chatbot is not yet widely available to the public.)
Methodology
All of our conversations were with the Bing chatbot on its “Balanced” conversation style setting. Through experimentation, we crafted queries that would keep the chatbot’s responses in a consistent format, force it to make selections for each matchup, allow it to make upset picks and encourage it to use information about teams’ current seasons (though it often mixed information from this season and previous seasons).
A typical query was formatted as follows:
Hey, Bing. I’ll list the first-round matchups in the South region of the 2023 NCAA men’s basketball tournament. I have included their seeds in the 2023 tournament. Make selections for each game, as if you were filling out a bracket in an office pool, attempting to win the pool while accounting for potential upsets. Be sure to use information about the teams and their seedings in the 2023 tournament, not previous years!
(1) Alabama vs. (16) Texas A&M-Corpus Christi
(8) Maryland vs. (9) West Virginia
(5) San Diego State vs. (12) Charleston
(4) Virginia vs. (13) Furman
(6) Creighton vs. (11) N.C. State
(3) Baylor vs. (14) U.C. Santa Barbara
(7) Missouri vs. (10) Utah State
(2) Arizona vs. (15) Princeton
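For readers who want to reproduce the format, the query above can be assembled from a list of matchups. What follows is a minimal sketch in Python, not a tool we used: the function name build_query, its parameters and the Team type are hypothetical, and the wording is copied from the typical query shown above.

```python
from typing import List, Tuple

# Hypothetical type for illustration: (seed, team name).
Team = Tuple[int, str]


def build_query(region: str, round_name: str, tournament: str,
                matchups: List[Tuple[Team, Team]]) -> str:
    """Assemble a prompt in the same format as the typical query."""
    header = (
        f"Hey, Bing. I'll list the {round_name} matchups in the {region} "
        f"region of the 2023 NCAA {tournament} basketball tournament. "
        "I have included their seeds in the 2023 tournament. "
        "Make selections for each game, as if you were filling out a bracket "
        "in an office pool, attempting to win the pool while accounting for "
        "potential upsets. Be sure to use information about the teams and "
        "their seedings in the 2023 tournament, not previous years!"
    )
    # One line per game, e.g. "(1) Alabama vs. (16) Texas A&M-Corpus Christi".
    games = [f"({seed_a}) {team_a} vs. ({seed_b}) {team_b}"
             for (seed_a, team_a), (seed_b, team_b) in matchups]
    return "\n".join([header, *games])


# Example: the first two games of the South region.
print(build_query("South", "first-round", "men's", [
    ((1, "Alabama"), (16, "Texas A&M-Corpus Christi")),
    ((8, "Maryland"), (9, "West Virginia")),
]))
```

Keeping the instructions in one fixed header and varying only the region, round and matchups is one way to hold the format consistent from query to query.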
We recorded the chatbot’s selections for winners in each matchup. Then we took those winners and asked it about the matchups that would occur in the next round of the tournament, given its selections. We repeated this round by round and region by region for both the men’s and women’s tournaments.
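The round-by-round procedure can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the process we ran: the names next_round, play_region and chalk_picker are hypothetical, and chalk_picker is a stand-in that always takes the better seed, where we instead sent each round’s matchups to the chatbot and recorded its answers. The sketch relies on the matchups being listed in bracket order, as in the query above, so that winners of adjacent games meet in the next round.

```python
from typing import Callable, List, Tuple

Team = Tuple[int, str]            # (seed, team name)
Matchup = Tuple[Team, Team]


def next_round(winners: List[Team]) -> List[Matchup]:
    """Pair the winners of adjacent games, preserving bracket order."""
    return [(winners[i], winners[i + 1]) for i in range(0, len(winners), 2)]


def play_region(first_round: List[Matchup],
                ask_round: Callable[[List[Matchup]], List[Team]]) -> Team:
    """Ask for one round of picks at a time until a region winner remains."""
    matchups = first_round
    while True:
        winners = ask_round(matchups)   # one query per round, per region
        if len(winners) == 1:
            return winners[0]
        matchups = next_round(winners)


def chalk_picker(matchups: List[Matchup]) -> List[Team]:
    """Stand-in for the chatbot: always pick the better (lower) seed."""
    return [min(m, key=lambda team: team[0]) for m in matchups]


SOUTH: List[Matchup] = [
    ((1, "Alabama"), (16, "Texas A&M-Corpus Christi")),
    ((8, "Maryland"), (9, "West Virginia")),
    ((5, "San Diego State"), (12, "Charleston")),
    ((4, "Virginia"), (13, "Furman")),
    ((6, "Creighton"), (11, "N.C. State")),
    ((3, "Baylor"), (14, "U.C. Santa Barbara")),
    ((7, "Missouri"), (10, "Utah State")),
    ((2, "Arizona"), (15, "Princeton")),
]

print(play_region(SOUTH, chalk_picker))
```

With the stand-in picker, the region produces no upsets, which is the kind of bracket we were trying to avoid when we asked Bing to account for them.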
We asked the chatbot for its picks in the “First Four” games of each tournament before they were played. We advanced its selected winners into the first round of the tournament.
There is no guarantee that the chatbot will make the same selections even when asked the same questions in the same format, and wording questions differently may produce different results.