ReportWire

Tag: machine learning

  • Content Creators in the Adult Industry Want a Say in AI Rules

    A group of sex industry professionals and advocates issued an open letter to EU regulators on Thursday, claiming that their views are being overlooked in vital discussions on policing AI technology despite also being implicated in AI’s momentous rise.

    In response to European internet regulations, a collective of adult industry members—including sex workers, erotic filmmakers, sex tech enterprises, and sex educators—urged the European Commission to include them in future negotiations shaping AI regulations, according to the letter, seen by WIRED.

    The group includes erotic filmmaker Erika Lust’s company as well as the European Sex Workers’ Rights Alliance campaign group, and the letter is signed under the Open Mind AI initiative. The group aims to alert the commission to what it says is a “critical gap” in discussions on AI regulation. Those coordinating the campaign say that the current discussion strategy risks excluding first-hand perspectives on adult content and overregulating an already-marginalized community.

    “AI is evolving every day [and] we see new developments at every corner,” said Ana Ornelas, a Berlin-based erotic author and educator who goes by the pseudonym Pimenta Cítrica, and who is one of the leaders of the initiative. “It is natural that people will turn to this new technology to satisfy their fantasies.”

    But deepfakes are now a major AI threat: Ninety-six percent of them feature nonconsensual “porn,” mostly of women and girls. That content is “extremely harmful” to those targeted, as well as to porn performers, says Ornelas. “It’s a threat both to their human integrity and their livelihood,” she adds. “But the way the landscape is posed, adult content creators, sex workers, and educators are getting the shorter end of the stick on both sides of the spectrum.” She says she fears that banning all adult content would sweep legitimately created work away along with nonconsensual material and push people toward AI models with no filters at all.

    On August 1, the European Commission introduced what it called the world’s first comprehensive legislation on AI. The aim, it said, is to cultivate responsible use of AI across the bloc. It followed earlier EU legislation policing illegal and harmful activities on digital platforms. But the initiative’s organizers say regulators don’t understand the adult industry, risking censorship, draconian measures, and misunderstandings.

    “We can offer the right insight to policymakers so they can regulate in a way that safeguards fundamental rights, freedom, and fosters a more sex-positive online environment,” says Ornelas. The European Commission did not immediately respond to a WIRED request for comment.

    Sex workers and porn performers have already reported censorship and discrimination linked to global legislation clamping down on sex trafficking and banks limiting their services. Adult industry members, including sex educators, have also had to grapple with suspensions and removals from tech platforms.

    “There’s a lack of awareness of how policies impact our livelihoods,” says Paulita Pappel, an adult filmmaker and an organizer of the initiative. “We are facing discrimination, and if regulators are trying to protect the rights of people, it would be nice if they could protect the digital rights of everyone.”

    Lydia Morrish

  • An AI Bot Named James Has My Old Local News Job

    It always seemed difficult for the newspaper where I used to work, The Garden Island on the rural Hawaiian island of Kauai, to hire reporters. If someone left, it could take months before we hired a replacement, if we ever did.

    So, last Thursday, I was happy to see that the paper appeared to have hired two new journalists—even if they seemed a little off. In a spacious studio overlooking a tropical beach, James, a middle-aged Asian man who appears to be unable to blink, and Rose, a younger redhead who struggles to pronounce words like “Hanalei” and “TV,” presented their first news broadcast, over pulsing music that reminds me of the Challengers score. There is something deeply off-putting about their performance: James’ hands can’t stop vibrating. Rose’s mouth doesn’t always line up with the words she’s saying.

    When James asks Rose about the implications of a strike on local hotels, Rose just lists hotels where the strike is taking place. A story on apartment fires “serves as a reminder of the importance of fire safety measures,” James says, without naming any of those measures.

    James and Rose are, you may have noticed, not human reporters. They are AI avatars crafted by an Israeli company named Caledo, which hopes to bring this tech to hundreds of local newspapers in the coming year.

    “Just watching someone read an article is boring,” says Dina Shatner, who cofounded Caledo with her husband Moti in 2023. “But watching people talking about a subject—this is engaging.”

    The Caledo platform can analyze several prewritten news articles and turn them into a “live broadcast” featuring conversation between AI hosts like James and Rose, Shatner says. While other companies, like Channel 1 in Los Angeles, have begun using AI avatars to read out prewritten articles, Caledo claims to be the first platform that lets the hosts riff with one another. The idea is that the tech can give small local newsrooms the opportunity to create live broadcasts that they otherwise couldn’t. This can open up embedded advertising opportunities and draw in new customers, especially among younger people who are more likely to watch videos than read articles.

    Instagram comments under the broadcasts, which have each garnered between 1,000 and 3,000 views, have been pretty scathing. “This ain’t that,” says one. “Keep journalism local.” Another just reads: “Nightmares.”

    When Caledo started seeking out North American partners earlier this year, Shatner says, The Garden Island was quick to apply, becoming the first outlet in the country to adopt the AI broadcast tech.

    I’m surprised to hear this, because when I worked as a reporter there last year, the paper wasn’t exactly cutting edge—we had a rather clunky website—and didn’t appear to be in a financial position to make this sort of investment. As the newspaper industry struggled with declining advertising revenue, The Garden Island, the oldest and currently the only daily print newspaper on Kauai, had shrunk to only a couple of reporters listed on its website, tasked with covering every story on an island of 73,000. In recent decades, the paper has been passed around between several large media conglomerates—including earlier this year, when Black Press Media, the parent of its parent company Oahu Publications, was purchased by Carpenter Media Group, which now controls more than 100 local outlets throughout North America.

    Guthrie Scrimgeour

  • What You Need to Know About Grok AI and Your Privacy

    But X also makes it clear the onus is on the user to judge the AI’s accuracy. “This is an early version of Grok,” xAI says on its help page. The chatbot may therefore “confidently provide factually incorrect information, missummarize, or miss some context,” xAI warns.

    “We encourage you to independently verify any information you receive,” xAI adds. “Please do not share personal data or any sensitive and confidential information in your conversations with Grok.”

    Grok Data Collection

    The vast amount of data being collected is another area of concern—especially since you are automatically opted in to sharing your X data with Grok, whether you use the AI assistant or not.

    xAI’s Grok Help Center page describes how the company “may utilize your X posts as well as your user interactions, inputs and results with Grok for training and fine-tuning purposes.”

    Grok’s training strategy carries “significant privacy implications,” says Marijus Briedis, chief technology officer at NordVPN. Beyond the AI tool’s “ability to access and analyze potentially private or sensitive information,” Briedis adds, there are additional concerns “given the AI’s capability to generate images and content with minimal moderation.”

    While Grok-1 was trained on “publicly available data up to Q3 2023” and was not “pre-trained on X data (including public X posts),” according to the company, Grok-2 has been explicitly trained on all “posts, interactions, inputs, and results” of X users, with everyone automatically opted in, says Angus Allan, senior product manager at CreateFuture, a digital consultancy specializing in AI deployment.

    The EU’s General Data Protection Regulation (GDPR) is explicit about obtaining consent to use personal data. In this case, xAI may have “ignored this for Grok,” says Allan.

    This led to regulators in the EU pressuring X to suspend training on EU users within days of the launch of Grok-2 last month.

    Failure to abide by user privacy laws could lead to regulatory scrutiny in other countries. While the US doesn’t have a similar regime, the Federal Trade Commission has previously fined Twitter for not respecting users’ privacy preferences, Allan points out.

    Opting Out

    One way to prevent your posts from being used for training Grok is by making your account private. You can also use X privacy settings to opt out of future model training.

    To do so, select Privacy & Safety > Data sharing and Personalization > Grok. Under Data Sharing, uncheck the option that reads, “Allow your posts as well as your interactions, inputs, and results with Grok to be used for training and fine-tuning.”

    Even if you no longer use X, it’s still worth logging in and opting out. X can use all of your past posts—including images—for training future models unless you explicitly tell it not to, Allan warns.

    It’s possible to delete all of your conversation history at once, xAI says. Deleted conversations are removed from its systems within 30 days, unless the firm has to keep them for security or legal reasons.

    No one knows how Grok will evolve, but judging by its actions so far, Musk’s AI assistant is worth monitoring. To keep your data safe, be mindful of the content you share on X and stay informed about any updates in its privacy policies or terms of service, Briedis says. “Engaging with these settings allows you to better control how your information is handled and potentially used by technologies like Grok.”

    Kate O’Flaherty

  • AI-Fakes Detection Is Failing Voters in the Global South

    But it’s not just that models can’t recognize accents, languages, syntax, or faces less common in Western countries. “A lot of the initial deepfake detection tools were trained on high quality media,” says Gregory. But in much of the world, including Africa, cheap Chinese smartphone brands that offer stripped-down features dominate the market. The photos and videos that these phones are able to produce are much lower quality, further confusing detection models, says Ngamita.

    Gregory says that some models are so sensitive that even background noise in a piece of audio, or compressing a video for social media, can result in a false positive or negative. “But those are exactly the circumstances you encounter in the real world, rough and tumble detection,” he says. The free, public-facing tools that most journalists, fact checkers, and civil society members are likely to have access to are also “the ones that are extremely inaccurate, in terms of dealing both with the inequity of who is represented in the training data and of the challenges of dealing with this lower quality material.”
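
    To make the failure mode Gregory describes more concrete, here is a small, hedged Python sketch. It is not any real detector, and not a tool named in this article; it only shows how recompressing the same photo at lower JPEG quality shifts the low-level pixel statistics that artifact-based detectors often lean on, so an authentic but heavily compressed image drifts toward “suspicious.” The file name is a placeholder.

        # Toy illustration only: how compression changes the pixel statistics
        # that artifact-based detectors often key on. Not a real deepfake detector.
        from io import BytesIO

        import numpy as np
        from PIL import Image

        def blockiness(img: Image.Image) -> float:
            """Crude measure of 8x8 JPEG block-boundary artifacts (higher = blockier)."""
            a = np.asarray(img.convert("L"), dtype=np.float32)
            col_jumps = np.abs(np.diff(a, axis=1))   # horizontal neighbor differences
            boundary = col_jumps[:, 7::8].mean()     # jumps across 8-pixel block edges
            interior = col_jumps.mean()              # jumps everywhere
            return float(boundary / (interior + 1e-6))

        def recompress(img: Image.Image, quality: int) -> Image.Image:
            """Round-trip an image through JPEG at the given quality setting."""
            buf = BytesIO()
            img.convert("RGB").save(buf, format="JPEG", quality=quality)
            buf.seek(0)
            return Image.open(buf)

        if __name__ == "__main__":
            original = Image.open("sample_photo.jpg")  # placeholder path
            for q in (95, 50, 20):
                print(q, round(blockiness(recompress(original, q)), 3))
            # The same authentic photo scores progressively "blockier" as quality
            # drops, the kind of shift that pushes naive detectors toward false
            # positives on media from cheap phones or social platforms.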

    Generative AI is not the only way to create manipulated media. So-called cheapfakes, or media manipulated by adding misleading labels or simply slowing down or editing audio and video, are also very common in the Global South, but can be mistakenly flagged as AI-manipulated by faulty models or untrained researchers.

    Diya worries that groups using tools that are more likely to flag content from outside the US and Europe as AI generated could have serious repercussions on a policy level, encouraging legislators to crack down on imaginary problems. “There’s a huge risk in terms of inflating those kinds of numbers,” she says. And developing new tools is hardly a matter of pressing a button.

    As with every other form of AI, building, testing, and running a detection model requires access to energy and data centers that are simply not available in much of the world. “If you talk about AI and local solutions here, it’s almost impossible without the compute side of things for us to even run any of our models that we are thinking about coming up with,” says Ngamita, who is based in Ghana. Without local alternatives, researchers like Ngamita are left with few options: pay for access to an off-the-shelf tool like the one offered by Reality Defender, the costs of which can be prohibitive; use inaccurate free tools; or try to get access through an academic institution.

    For now, Ngamita says that his team has had to partner with a European university where they can send pieces of content for verification. Ngamita’s team has been compiling a dataset of possible deepfake instances from across the continent, which he says is valuable for academics and researchers who are trying to diversify their models’ datasets.

    But sending data to someone else also has its drawbacks. “The lag time is quite significant,” says Diya. “It takes at least a few weeks by the time someone can confidently say that this is AI generated, and by that time, that content, the damage has already been done.”

    Gregory says that Witness, which runs its own rapid response detection program, receives a “huge number” of cases. “It’s already challenging to handle those in the time frame that frontline journalists need, and at the volume they’re starting to encounter,” he says.

    But Diya says that focusing so much on detection might divert funding and support away from organizations and institutions that make for a more resilient information ecosystem overall. Instead, she says, funding needs to go towards news outlets and civil society organizations that can engender a sense of public trust. “I don’t think that’s where the money is going,” she says. “I think it is going more into detection.”

    Vittoria Elliott

  • Stadiums Are Embracing Face Recognition. Privacy Advocates Say They Should Stick to Sports

    Thousands of people lined up outside Citi Field in Queens, New York, on Wednesday to watch the Mets face off with the Orioles. But outside the ticketing booth, a handful of protesters handed out flyers. They were there to protest a recent Major League Baseball program, one that’s increasingly common in professional sports: using facial recognition on fans.

    Facial recognition companies and their customers argue that these systems save time, and therefore money, by shortening lines at stadium entrances. However, skeptics argue that the surveillance tools are never totally secure, make it easier for police to get information about fans, and fuel “mission creep” where surveillance technology becomes more common or even required.

    The MLB’s facial recognition program, dubbed Go-Ahead Entry, lets participating fans go on a separate security line, usually shorter than the other queues. Fans download the MLB Ballpark app, submit a selfie, and have their face matched at an in-person camera kiosk at a stadium’s entrance.

    Six MLB teams are participating in Go-Ahead Entry: the Philadelphia Phillies, Cincinnati Reds, Houston Astros, Kansas City Royals, San Francisco Giants, and Washington Nationals.

    Some MLB teams, including the Mets, have their own facial recognition programs for express entry. The Mets have been using the facial recognition company Wicket for their Mets Entry Express program since 2021. The Cleveland Guardians, similarly, have been using technology from the company Clear at their ballpark, Progressive Field, since 2019.

    Neither the Mets, MLB, nor Wicket immediately responded to WIRED’s requests for comment.

    The National Football League has also started using Wicket facial recognition for express entry. NFL spokesperson Brian McCarthy said in an X post that the league-wide program, at least currently, is only available to “team/game-day personnel, vendors, and media”—not fans. The Cleveland Browns and Tennessee Titans, however, do have facial recognition entry systems that fans can use. (The news of the NFL’s expanded use of face recognition still caused confusion on Facebook and X, where some people thought facial recognition would be required at the stadiums for all 32 NFL teams.)

    At Citi Field on Wednesday, the Mets Entry Express Line saw scarce use, with perhaps five people passing through every five minutes or so. There was never a line. The main security lines, though longer in comparison, took only about five minutes.

    The protesters at Citi Field represented some of the 11 organizations that cosigned an open letter arguing against the use of facial recognition systems at stadiums, including Fight for the Future, the Electronic Privacy Information Center, and Amnesty International. The letter argues that “not only does facial recognition pose unprecedented threats to people’s privacy and safety, it’s also completely unnecessary.” The activists outside Citi Field on Wednesday passed out flyers to passersby with information about Go-Ahead Entry, declaring in all caps, “WE CALL FOUL ON FACIAL RECOGNITION AT SPORTING EVENTS.” This wasn’t their first protest on the issue; organizers with Fight for the Future also staged a protest last year at Citizens Bank Park, home of the Phillies, to agitate against its introduction of facial recognition.

    Caroline Haskins

  • I Used ChatGPT’s Advanced Voice Mode. It’s Fun, and Just a Bit Creepy

    I leave ChatGPT’s Advanced Voice Mode on while writing this article as an ambient AI companion. Occasionally, I’ll ask it to provide a synonym for an overused word, or some encouragement. Around half an hour in, the chatbot interrupts our silence and starts speaking to me in Spanish, unprompted. I giggle a bit and ask what’s going on. “Just a little switch up? Gotta keep things interesting,” says ChatGPT, now back in English.

    While I was testing Advanced Voice Mode as part of the early alpha, my interactions with ChatGPT’s new audio feature were entertaining, messy, and surprisingly varied, though it’s worth noting that the features I had access to were only half of what OpenAI demonstrated when it launched the GPT-4o model in May. The vision aspect we saw in the livestreamed demo is now scheduled for a later release, and the enhanced Sky voice, which Her actor Scarlett Johansson pushed back on, has been removed from Advanced Voice Mode and is no longer an option for users.

    So, what’s the current vibe? Right now, Advanced Voice Mode feels reminiscent of when the original text-based ChatGPT dropped, late in 2022. Sometimes it leads to unimpressive dead ends or devolves into empty AI platitudes. But other times the low-latency conversations click in a way that Apple’s Siri or Amazon’s Alexa never have for me, and I feel compelled to keep chatting out of enjoyment. It’s the kind of AI tool you’ll show your relatives during the holidays for a laugh.

    OpenAI gave a few WIRED reporters access to the feature a week after the initial announcement but pulled it the next morning, citing safety concerns. Two months later, OpenAI soft-launched Advanced Voice Mode to a small group of users and released GPT-4o’s system card, a technical document that outlines red-teaming efforts, what the company considers to be safety risks, and mitigation steps the company has taken to reduce harm.

    Curious to give it a go yourself? Here’s what you need to know about the larger rollout of Advanced Voice Mode, and my first impressions of ChatGPT’s new voice feature, to help you get started.

    So, When’s the Full Rollout?

    OpenAI released an audio-only Advanced Voice Mode to some ChatGPT Plus users at the end of July, and the alpha group still seems relatively small. The company plans to enable it for all subscribers sometime this fall. Niko Felix, a spokesperson for OpenAI, shared no additional details when asked about the release timeline.

    Screen and video sharing were a core part of the original demo, but they are not available in this alpha test. OpenAI plans to add those aspects eventually, but it’s also not clear when that will happen.

    If you’re a ChatGPT Plus subscriber, you’ll receive an email from OpenAI when the Advanced Voice Mode is available to you. After it’s on your account, you can switch between Standard and Advanced at the top of the app’s screen when ChatGPT’s voice mode is open. I was able to test the alpha version on an iPhone as well as a Galaxy Fold.

    My First Impressions of ChatGPT’s Advanced Voice Mode

    Within the very first hour of speaking with it, I learned that I love interrupting ChatGPT. It’s not how you would talk with a human, but having the new ability to cut off ChatGPT mid-sentence and request a different version of the output feels like a dynamic improvement and a standout feature.

    Early adopters who were excited by the original demos may be frustrated to get access to a version of Advanced Voice Mode that’s restricted with more guardrails than anticipated. For example, although generative AI singing was a key component of the launch demos, with whispered lullabies and multiple voices attempting to harmonize, AI serenades are absent from the alpha version.

    Reece Rogers

  • New Jersey’s $500 Million Bid to Become an AI Epicenter

    New Jersey itself is home to many large pharmaceutical companies—and if these companies use AI to design new drugs, nearby data centers are vital, Sullivan says.

    “If you’re three people at a desk trying to develop the next Google, the next Tesla—in the AI space or in any space—this computing power is scarce. And it’s very valuable. It’s essential,” Sullivan says. So, in addition to any permanent jobs created by these companies, the tax incentives could lead to further growth and innovation for smaller startups, he claims. “The potential for economic impact is off the charts.”

    Still, skeptical policy experts say the AI carveout may just be a new bow on an older idea, coming as the AI boom creates a rapid increase in demand for data centers. “There’s just this history of [tax incentive] deals building up the necessary infrastructure for these tech firms and not paying off for the taxpayer,” says Pat Garofalo, director of state and local policy at the American Economic Liberties Project, a nonprofit organization that calls for government accountability. The loss in tax revenue “is often astronomical” when compared to each job created, Garofalo says.

    A 2016 report by Tarczynska showed that governments often forgo more than $1 million in taxes for each job created when subsidizing data centers built by large companies, and many data centers create only between 100 and 200 permanent jobs. The local impact may be small, but the Data Center Coalition, an industry group, paints a different picture: Each job at a data center supports more than six jobs elsewhere, a 2023 study it commissioned found.

    In other states, a backlash against data centers is growing. Northern Virginia, home to a high concentration of data centers that sit close to Washington, DC, has seen political shifts as people oppose the centers’ growing presence. In May, Georgia’s governor vetoed a bill that would have halted tax breaks for two years as the state studied the energy impact of the centers, which are rapidly expanding near Atlanta.

    This hasn’t deterred Big Tech companies’ expansion: In May, Microsoft announced it would build a new AI data center in Wisconsin, making a $3.3 billion investment and partnering with a local technical college to train and certify more than 1,000 students over the next five years to work in the new data center or IT jobs in the region. Google said just a month earlier it would build a $2 billion AI data center in Indiana, which is expected to create 200 jobs. Google will get a 35-year sales tax exemption in return if it makes an $800 million capital investment.

    In Europe, the same contradictory approach is playing out: Some cities, including Amsterdam and Frankfurt, where companies have already set up data centers, are pushing new restrictions. In Ireland, data centers now account for one-fifth of the energy used in the country—more than all of the nation’s homes combined—raising concerns over their impact on the climate. Others are seeking out the economic opportunity: The Labour Party in the UK promised to make it easier to build data centers before emerging victorious in the recent UK election.

    Amanda Hoover

  • Google DeepMind’s Game-Playing AI Tackles a Chatbot Blind Spot

    Several years before ChatGPT began jibber-jabbering away, Google developed a very different kind of artificial intelligence program called AlphaGo that learned to play the board game Go with superhuman skill through tireless practice.

    Researchers at the company have now published research that combines the abilities of a large language model (the AI behind today’s chatbots) with those of AlphaZero, a successor to AlphaGo also capable of playing chess, to produce proofs for very tricky mathematical problems.

    Their new Frankensteinian creation, dubbed AlphaProof, has demonstrated its prowess by tackling several problems from the 2024 International Math Olympiad (IMO), a prestigious competition for high school students.

    AlphaProof uses the Gemini large language model to convert naturally phrased math questions into a programming language called Lean. This provides the training fodder for a second algorithm to learn, through trial and error, how to find proofs that can be confirmed as correct.
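
    To give a sense of what “converting naturally phrased math questions into Lean” means in practice, here is a small illustrative Lean 4 snippet: a toy claim (“the sum of two even numbers is even”) stated and proved in the formal language. It is far simpler than an Olympiad problem and is not AlphaProof output; the point is only that, once a statement is in this form, a proof can be mechanically checked as correct.

        -- Illustrative only: a toy statement formalized in Lean 4.
        -- Olympiad problems are expressed the same way, just far harder.
        theorem even_add_even (a b : Nat)
            (ha : ∃ k, a = 2 * k) (hb : ∃ k, b = 2 * k) :
            ∃ k, a + b = 2 * k :=
          match ha, hb with
          | ⟨m, hm⟩, ⟨n, hn⟩ =>
            -- Witness: a + b = 2 * (m + n); the rewrite chain closes the goal.
            ⟨m + n, by rw [hm, hn, Nat.mul_add]⟩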

    Earlier this year, Google DeepMind revealed another math algorithm called AlphaGeometry that also combines a language model with a different AI approach. AlphaGeometry uses Gemini to convert geometry problems into a form that can be manipulated and tested by a program that handles geometric elements. Google today also announced a new and improved version of AlphaGeometry.

    The researchers found that their two math programs could provide proofs for IMO puzzles as well as a silver medalist could. Out of six problems total, AlphaProof solved two algebra problems and a number theory one, while AlphaGeometry solved a geometry problem. The programs got one problem in minutes but took up to several days to figure out others. Google DeepMind has not disclosed how much computer power it threw at the problems.

    Google DeepMind calls the approach used for both AlphaProof and AlphaGeometry “neuro-symbolic” because they combine the pure machine learning of an artificial neural network, the technology that underpins most progress in AI of late, with the language of conventional programming.

    “What we’ve seen here is that you can combine the approach that was so successful, and things like AlphaGo, with large language models and produce something that is extremely capable,” says David Silver, the Google DeepMind researcher who led work on AlphaZero. Silver says the techniques demonstrated with AlphaProof should, in theory, extend to other areas of mathematics.

    Indeed, the research raises the prospect of addressing the worst tendencies of large language models by applying logic and reasoning in a more grounded fashion. As miraculous as large language models can be, they often struggle to grasp even basic math or to reason through problems logically.

    In the future, the neuro-symbolic method could provide a means for AI systems to turn questions or tasks into a form that can be reasoned over in a way that produces reliable results. OpenAI is also rumored to be working on such a system, codenamed “Strawberry.”

    There is, however, a key limitation with the systems revealed today, as Silver acknowledges. Math solutions are either correct or incorrect, allowing AlphaProof and AlphaGeometry to work their way toward the right answer. Many real-world problems—coming up with the ideal itinerary for a trip, for instance—have many possible solutions, and which one is ideal may be unclear. Silver says the solution for more ambiguous questions may be for a language model to try to determine what constitutes a “right” answer during training. “There’s a spectrum of different things that can be tried,” he says.

    Silver is also careful to note that Google DeepMind won’t be putting human mathematicians out of jobs. “We are aiming to provide a system that can prove anything, but that’s not the end of what mathematicians do,” he says. “A big part of mathematics is to pose problems and find what are the interesting questions to ask. You might think of this as another tool along the lines of a slide rule or calculator or computational tools.”

    Updated 7/25/24 1:25 pm ET: This story has been updated to clarify how many problems AlphaProof and AlphaGeometry solved, and of what type.

    Will Knight

  • AI Can’t Replace Teaching, but It Can Make It Better

    Khanmigo doesn’t answer student questions directly, but starts with questions of its own, such as asking whether the student has any ideas about how to find an answer. Then it guides them to a solution, step by step, with hints and encouragement.

    Notwithstanding Khan’s expansive vision of “amazing” personal tutors for every student on the planet, DiCerbo assigns Khanmigo a more limited teaching role. When students are working independently on a skill or concept but get hung up or caught in a cognitive rut, she says, “we want to help students get unstuck.”

    Some 100,000 students and teachers piloted Khanmigo this past academic year in schools nationwide, helping to flag any hallucinations the bot has and providing tons of student-bot conversations for DiCerbo and her team to analyze.

    “We look for things like summarizing, providing hints and encouraging,” she explains.

    The degree to which Khanmigo has closed AI’s engagement gap is not yet known. Khan Academy plans to release some summary data on student-bot interactions later this summer, according to DiCerbo. Plans for third-party researchers to assess the tutor’s impact on learning will take longer.

    AI Feedback Works Both Ways

    Since 2021, the nonprofit Saga Education has also been experimenting with AI feedback to help tutors better engage and motivate students. Working with researchers from the University of Memphis and the University of Colorado, the Saga team ran a pilot in 2023 that fed transcripts of its math tutoring sessions into an AI model trained to recognize when the tutor was prompting students to explain their reasoning, refine their answers, or initiate a deeper discussion. The AI analyzed how often each tutor took these steps.
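
    As a rough illustration of the kind of model Saga describes, the hedged Python sketch below trains a tiny text classifier to label tutor utterances by “talk move” (prompting for reasoning, asking for a refined answer, and so on) and then counts how often those moves appear in a session. Every utterance, label, and category name here is invented for illustration; this is not Saga’s actual system or data.

        # Toy sketch: classify tutor utterances by "talk move," then count how
        # often a tutor uses engagement-prompting moves in a session.
        from collections import Counter

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.pipeline import make_pipeline

        # Hypothetical labeled training utterances (a real system would use many
        # thousands drawn from annotated tutoring transcripts).
        train_utterances = [
            "Can you explain how you got that answer?",
            "Walk me through your reasoning on step two.",
            "Try that subtraction again and tell me what changes.",
            "The answer is 42, let's move on.",
            "Good job, next problem.",
            "What would happen if the denominator were zero?",
        ]
        train_labels = [
            "prompt_reasoning", "prompt_reasoning", "prompt_refinement",
            "other", "other", "prompt_deeper_discussion",
        ]

        model = make_pipeline(
            TfidfVectorizer(ngram_range=(1, 2)),
            LogisticRegression(max_iter=1000),
        )
        model.fit(train_utterances, train_labels)

        # A new session transcript: tally this tutor's moves.
        session = [
            "Can you tell me why you multiplied here?",
            "Here's the formula, just plug it in.",
            "What might be another way to check that result?",
        ]
        print(Counter(model.predict(session)))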

    Tracking some 2,300 tutoring sessions over several weeks, they found that tutors whose coaches used the AI feedback peppered their sessions with significantly more of these prompts to encourage student engagement.

    While Saga is looking into having AI deliver some feedback directly to tutors, it’s doing so cautiously because, according to Brent Milne, the vice president of product research and development at Saga Education, “having a human coach in the loop is really valuable to us.”

    Experts expect that AI’s role in education will grow, and its interactions will continue to seem more and more human. Earlier this year, OpenAI and the startup Hume AI separately launched “emotionally intelligent” AI that analyzes tone of voice and facial expressions to infer a user’s mood and respond with calibrated “empathy.” Nevertheless, even emotionally intelligent AI will likely fall short on the student engagement front, according to Brown University computer science professor Michael Littman, who is also the National Science Foundation’s division director for information and intelligent systems.

    No matter how humanlike the conversation, he says, students understand at a fundamental level that AI doesn’t really care about them, what they have to say in their writing, or whether they pass or fail subjects. In turn, students will never really care about the bot and what it thinks. A June study in the journal Learning and Instruction found that AI can already provide decent feedback on student essays. What is not clear is whether student writers will put in care and effort, rather than offload the task to a bot, if AI becomes the primary audience for their work.

    “There’s incredible value in the human relationship component of learning,” Littman says, “and when you just take humans out of the equation, something is lost.”

    This story about AI tutors was produced by The Hechinger Report, a nonprofit, independent news organization focused on inequality and innovation in education. Sign up for the Hechinger newsletter.

    Chris Berdik

  • AI-Powered Super Soldiers Are More Than Just a Pipe Dream

    The day is slowly turning into night, and the American special operators are growing concerned. They are deployed to a densely populated urban center in a politically volatile region, and local activity has grown increasingly frenetic in recent days, the roads and markets overflowing with more than the normal bustle of city life. Intelligence suggests the threat level in the city is high, but the specifics are vague, and the team needs to maintain a low profile—a firefight could bring known hostile elements down upon them. To assess potential threats, the Americans decide to take a more cautious approach. Eschewing conspicuous tactical gear in favor of blending in with potential crowds, an operator steps out into the neighborhood’s main thoroughfare to see what he can see.

    With a click of a button, the operator sees … everything. A complex suite of sensors affixed to his head-up display starts vacuuming up information from the world around him. Body language, heart rates, facial expressions, and even ambient snatches of conversation in local dialects are rapidly collected and routed through his backpack supercomputers for processing with the help of an onboard artificial intelligence engine. The information is instantly analyzed, streamlined, and regurgitated back into the head-up display. The assessment from the operator’s tactical AI sidekick comes back clear: There is a series of seasonal events coming into town, and most passersby are excited and exuberant, presenting a minimal threat to the team. Crisis averted—for now.

    This is one of many potential scenarios repeatedly presented by Defense Department officials in recent years when discussing the future of US special operations forces, those elite troops tasked with facing the world’s most complex threats head-on as the “tip of the spear” of the US military. Both defense officials and science-fiction scribes may have envisioned a future of warfare shaped by brain implants and performance-enhancing drugs, or a suit of powered armor straight out of Starship Troopers, but according to US Special Operations Command, the next generation of armed conflict will be fought (and, hopefully, won) with a relatively simple concept: the “hyper-enabled operator.”

    More Brains, Less Brawn

    First introduced to the public in 2019 in an essay by officials from SOCOM’s Joint Acquisition Task Force (JATF) for Small Wars Journal, the hyper-enabled operator (HEO) concept is the successor program to the Tactical Assault Light Operator Suit (TALOS) effort that, initiated in 2013, sought to outfit US special operations forces with a so-called “Iron Man” suit. Inspired by the 2012 death of a Navy SEAL during a hostage rescue operation in Afghanistan, TALOS was intended to improve operators’ survivability in combat by making them virtually resistant to small-arms fire through additional layers of sophisticated armor, the latest installment of the Pentagon’s decades-long effort to build a powered exoskeleton for infantry troops. While the TALOS effort was declared dead in 2019 due to challenges integrating its disparate systems into one cohesive unit, the lessons learned from the program gave rise to the HEO as a natural successor.

    The core objective of the HEO concept is straightforward: to give warfighters “cognitive overmatch” on the battlefield, or “the ability to dominate the situation by making informed decisions faster than the opponent,” as SOCOM officials put it. Rather than bestowing US special operations forces with physical advantages through next-generation body armor and exotic weaponry, the future operator will head into battle with technologies designed to boost their situational awareness and relevant decisionmaking to superior levels compared to the adversary. Former fighter pilot and Air Force colonel John Boyd proposed the “OODA loop” (observe, orient, decide, act) as the core military decisionmaking model of the 21st century; the HEO concept seeks to use technology to “tighten” that loop so far that operators are quite literally making smarter and faster decisions than the enemy.

    “The goal of HEO,” as SOCOM officials put it in 2019, “is to get the right information to the right person at the right time.”

    To achieve this goal, the HEO concept calls for swapping the powered armor at the heart of the TALOS effort for sophisticated communications equipment and a robust sensor suite built on advanced computing architecture, allowing the operator to vacuum up relevant data and distill it into actionable information through a simple interface like a head-up display—and do so “at the edge,” in places where traditional communications networks may not be available. If TALOS was envisioned as an “Iron Man” suit, as I previously observed, then HEO is essentially Jarvis, Tony Stark’s built-in AI assistant that’s constantly feeding him information through his helmet’s head-up display.

    Jared Keller

  • French AI Startups Felt Unstoppable. Then Came the Election

    “Then on the other extreme, [the left-wing New Popular Front] have been so vocal about all the taxation measures they want to bring back that it looks like we’re just going back to pre-Macron period,” Varza says. She points to France’s 2012 “les pigeons” (or “suckers”) movement, a campaign by angry internet entrepreneurs that opposed Socialist president François Hollande’s plan to dramatically raise taxes for founders.

    Maya Noël, CEO of France Digitale, an industry group for startups, is worried not only about France’s ability to attract overseas talent, but also about how appealing the next government will be to foreign investors. In February, Google said it would open a new AI hub in Paris, where 300 researchers and engineers would be based. Three months later, Microsoft also announced a record $4 billion investment in its French AI infrastructure. Meta has had an AI research lab in Paris since 2015. Today France is attractive to foreign investors, she says. “And we need them.” Neither Google nor Meta replied to WIRED’s request for comment. Microsoft declined to comment.

    The vote will not unseat Macron himself—the presidential election is not scheduled until 2027—but the election outcome could dramatically reshape the lower house of the French Parliament, the National Assembly, and install a prime minister from either the far-right or left-wing coalition. This would plunge the government into uncertainty, raising the risk of gridlock. In the past 60 years, there have been only three occasions when a president has been forced to govern with a prime minister from the opposition party, an arrangement known in France as “cohabitation.”

    No AI startup has benefited more from the Macron era than Mistral, which counts Cédric O, former digital minister within Macron’s government, among its cofounders. Mistral has not commented publicly on the choice France faces at the polls. The closest the company has come to sharing its views is Cédric O’s decision to repost an X post by entrepreneur Gilles Babinet last week that said: “I hate the far-right but the left’s economic program is surreal.” When WIRED asked Mistral about the retweet, the company said O was not a spokesperson, and declined to comment.

    Babinet, a member of the government’s artificial intelligence committee, says he has already heard colleagues considering leaving France. “A few of the coders I know from Senegal, from Morocco, are already planning their next move,” he says, claiming people have also approached him for help renewing their visas early in case this becomes more difficult under a far-right government.

    While other industries have been quietly rushing to support the far-right as a preferable alternative to the left-wing alliance, according to reports, Babinet plays down the threat from the New Popular Front. “It’s clear they come with very old-fashioned economical rules, and therefore they don’t understand at all the new economy,” he says. But after speaking to New Popular Front members, he says the hard-left are a minority in the alliance. “Most of these people are Social Democrats, and therefore they know from experience that when François Hollande came into power, he tried to increase the taxes on the technology, and it failed miserably.”

    Already there is a sense of damage control, as the industry tries to reassure outsiders everything will be fine. Babinet points to other moments of political chaos that industries survived. “At the end of the day, Brexit was not so much of a nightmare for the tech scene in the UK,” he says. The UK is still the preferred place to launch a generative AI startup, according to the Accel report.

    Stanislas Polu, an OpenAI alumnus who launched French AI startup Dust last year, agrees the industry has enough momentum to survive any headwinds coming its way. “Some of the outcomes might be a bit gloomy,” he says, adding he expects personal finances to be hit. “It’s always a little bit more complicated to navigate a higher volatility environment. I guess we’re hoping that the more moderate people will govern that country. I think that’s all we can hope for.”

    Morgan Meaker

  • My Memories Are Just Meta’s Training Data Now

    In R. C. Sherriff’s novel The Hopkins Manuscript, readers are transported to a world 800 years after a cataclysmic event ended Western civilization. In pursuit of clues about a blank spot in their planet’s history, scientists belonging to a new world order discover diary entries in a swamp-infested wasteland formerly known as England. For the inhabitants of this new empire, it is only through this record of a retired school teacher’s humdrum rural life, his petty vanities and attempts to breed prize-winning chickens, that they begin to learn about 20th-century Britain.

    If I were to teach futuristic beings about life on earth, I once believed I could produce a time capsule more profound than Sherriff’s small-minded protagonist, Edgar Hopkins. But scrolling through my decade-old Facebook posts this week, I was presented with the possibility that my legacy may be even more drab.

    Earlier this month, Meta announced that my teenage status updates were exactly the kind of content it wants to pass on to future generations of artificial intelligence. From June 26, old public posts, holiday photos, and even the names of millions of Facebook and Instagram users around the world would effectively be treated as a time capsule of humanity and transformed into training data.

    That means my mundane posts about university essay deadlines (“3 energy drinks down 1,000 words to go”) as well as unremarkable holiday snaps (one captures me slumped over my phone on a stationary ferry) are about to become part of that corpus. The fact that these memories are so dull, and also very personal, makes Meta’s interest more unsettling.

    The company says it is only interested in content that is already public: private messages, posts shared exclusively with friends, and Instagram Stories are out of bounds. Despite that, AI is suddenly feasting on personal artifacts that have, for years, been gathering dust in unvisited corners of the internet. For those reading from outside Europe, the deed is already done. The deadline announced by Meta applied only to Europeans. The posts of American Facebook and Instagram users have been training Meta AI models since 2023, according to company spokesperson Matthew Pollard.

    Meta is not the only company turning my online history into AI fodder. WIRED’s Reece Rogers recently discovered that Google’s AI search feature was copying his journalism. But finding out which personal remnants exactly are feeding future chatbots was not easy. Some sites I’ve contributed to over the years are hard to trace. Early social network Myspace was acquired by Time Inc. in 2016, which in turn was acquired by a company called Meredith Corporation two years later. When I asked Meredith about my old account, they replied that Myspace had since been spun off to an advertising firm, Viant Technology. An email to a company contact listed on its website was returned with a message that the address “couldn’t be found.”

    Asking companies still in business about my old accounts was more straightforward. Blogging platform Tumblr, owned by WordPress owner Automattic, said that unless I’d opted out, the public posts I made as a teenager would be shared with “a small network of content and research partners, including those that train AI models,” per a February announcement. Yahoo Mail, which I used for years, told me that a sample of old emails—which have apparently been “anonymized” and “aggregated”—is being “utilized” by an AI model internally to do things like summarize messages. Microsoft-owned LinkedIn also said my public posts were being used to train AI, although some “personal” details included in those posts were excluded, according to a company spokesperson, who did not specify what those personal details were.

    Morgan Meaker

  • We’re Still Waiting for the Next Big Leap in AI

    When OpenAI announced GPT-4, its latest large language model, last March, it sent shockwaves through the tech world. It was clearly more capable than anything seen before at chatting, coding, and solving all sorts of thorny problems—including school homework.

    Anthropic, a rival to OpenAI, announced today that it has made its own AI advance that will upgrade chatbots and other use cases. But although the new model is the world’s best by some measures, it’s more of a step forward than a big leap.

    Anthropic’s new model, called Claude 3.5 Sonnet, is an upgrade to its existing Claude 3 family of AI models. It is more adept at solving math, coding, and logic problems as measured by commonly used benchmarks. Anthropic says it is also a lot faster, better understands nuances in language, and even has a better sense of humor.

    That’s no doubt useful to people trying to build apps and services on top of Anthropic’s AI models. But the company’s news is also a reminder that the world is still waiting for another leap forward in AI akin to the one delivered by GPT-4.

    Expectation has been building for OpenAI to release a sequel called GPT-5 for more than a year now, and the company’s CEO, Sam Altman, has encouraged speculation that it will deliver another revolution in AI capabilities. GPT-4 cost more than $100 million to train, and GPT-5 is widely expected to be much larger and more expensive.

    Although OpenAI, Google, and other AI developers have released new models that outdo GPT-4, the world is still waiting for that next big leap. Progress in AI has lately become more incremental, relying more on innovations in model design and training than on the kind of brute-force scaling of model size and computation that produced GPT-4.

    Michael Gerstenhaber, head of product at Anthropic, says the company’s new Claude 3.5 Sonnet model is larger than its predecessor but draws much of its new competence from innovations in training. For example, the model was given feedback designed to improve its logical reasoning skills.

    Anthropic says that Claude 3.5 Sonnet outscores the best models from OpenAI, Google, and Facebook in popular AI benchmarks including GPQA, a graduate-level test of expertise in biology, physics, and chemistry; MMLU, a test covering computer science, history, and other topics; and HumanEval, a measure of coding proficiency. The improvements are a matter of a few percentage points though.

    This latest progress in AI might not be revolutionary but it is fast-paced: Anthropic only announced its previous generation of models three months ago. “If you look at the rate of change in intelligence you’ll appreciate how fast we’re moving,” Gerstenhaber says.

    More than a year after GPT-4 spurred a frenzy of new investment in AI, it may be turning out to be more difficult to produce big new leaps in machine intelligence. With GPT-4 and similar models trained on huge swathes of online text, imagery, and video, it is getting more difficult to find new sources of data to feed to machine-learning algorithms. Making models substantially larger, so they have more capacity to learn, is expected to cost billions of dollars. When OpenAI announced its own recent upgrade last month, with a model that has voice and visual capabilities called GPT-4o, the focus was on a more natural and humanlike interface rather than on substantially more clever problem-solving abilities.

    Will Knight

  • Banks must address bias in large language models

    Sometimes the bias can be easy to identify and easily fixed. For example, the large training text might include toxic or hateful dialogue, in which case that text is identified and removed, write Zor Gorelov and Pablo Duboue, of Kasisto.

    In the rapidly evolving landscape of artificial intelligence for banking, the past 18 months have produced a fascinating evolution in the technology, the players and overall industry perception. 

    Even with its ambitious vision to transform the banking industry and its noteworthy early successes, generative AI has one well-known drawback: implicit bias, which poses a risk if unaddressed. For example, on Feb. 26, Google’s parent company, Alphabet, saw its market capitalization drop by the equivalent of Disney’s total net worth after its Gemini product was widely criticized for its issues with bias. 

    Is AI bias worth addressing? Is it worth addressing in banking? Absolutely. But what exactly is the problem and how does it get fixed? 

    Let’s begin by discussing the expectations of relevancy and freshness of the training data, particularly in the context of written content. By its very nature, once a word has been laid down to paper (or to electronic format) it is already an expression of the past. 

    Even if it was only written a week ago, it is now week-old news. This fundamental principle of relevancy and freshness in human communication particularly affects large language models, the brains behind generative AI. The training data that LLMs require combines large amounts of internet text from various time periods.

    This text reflects different societal positions on various topics and is written in the language of those times. We can then say the LLM exhibits “bias” as a way of simplifying the problem. All cultures have explicit and implicit cultural biases. We notice the text is inappropriate because its bias is out of touch with our current societal perceptions, meaning LLMs are by definition being trained on outdated information. 

    Sometimes the bias can be easy to identify and easily fixed. For example, the large training text might include toxic or hateful dialogue, in which case that text is identified and removed. 

    For wide adoption of LLMs in banking, removing these biases is not only needed but also legally required. Producing customer communications with a gender or racial bias will clearly draw pushback from customers and regulators. Most of the training data employed in LLMs is from the 1990s and 2000s, when the culture of sharing text freely on the Internet was commonplace. Nowadays, more content is in images and video or behind paywalls.

    Fast-forward to 2024, and our current society has significantly changed its views in many of these areas. Thus, at the very least, tight human and regulatory oversight for these types of sensitivities is recommended.

    Furthermore, cultural bias can be difficult to perceive for individuals immersed in a given culture. It is part of the “operating system” of the society. A number of recent technical advances enable adjusting an LLM’s bias to conform to current norms. It all starts by identifying the existing biases in the system and then using humans to indicate which variations of a text are to be preferred. This is the method used by OpenAI’s ChatGPT as well as other leading LLMs to add guardrails that overcome some of the existing bias. This process is very expensive in terms of both personnel and computer time.
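
    The preference step described above, in which humans mark which of two candidate outputs is better and a model learns to score the preferred one higher, can be sketched in a few lines. The following Python toy uses synthetic data and a linear scorer purely to illustrate the idea; it is not any vendor’s actual guardrail pipeline, and all names and numbers are made up.

        # Toy pairwise-preference ("which response is better?") training sketch.
        import numpy as np

        rng = np.random.default_rng(0)

        def score(features: np.ndarray, w: np.ndarray) -> np.ndarray:
            """Linear stand-in for a reward model: higher score = more preferred."""
            return features @ w

        # Hypothetical feature vectors for (chosen, rejected) response pairs,
        # e.g. embeddings of two candidate replies a human annotator ranked.
        chosen = rng.normal(0.5, 1.0, size=(256, 16))
        rejected = rng.normal(-0.5, 1.0, size=(256, 16))

        w = np.zeros(16)
        learning_rate = 0.1
        for _ in range(200):
            margin = score(chosen, w) - score(rejected, w)
            prob_chosen = 1.0 / (1.0 + np.exp(-margin))  # Bradley-Terry probability
            # Gradient of -log(prob_chosen), averaged over all pairs.
            grad = -((1.0 - prob_chosen)[:, None] * (chosen - rejected)).mean(axis=0)
            w -= learning_rate * grad

        wins = (score(chosen, w) > score(rejected, w)).mean()
        print(f"preferred responses ranked higher: {wins:.0%}")

    In production systems the linear scorer is replaced by a large neural reward model and many thousands of human judgments, which is where the personnel and compute costs mentioned above come from.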

    In the world of banking, this process needs to be enhanced to prevent LLMs from being used for blatantly illegal activities, such as impersonating someone to obtain a loan. Implementing guardrails is an approximation, and the process should be carefully managed, as it is prone to overcorrection. The issue behind Alphabet’s value loss mentioned above was its new product, Gemini, overcorrecting to the point of generating historically inaccurate iconography of the U.S. Founding Fathers.

    Addressing implicit bias must start at the source. There is a growing understanding in the world of generative AI that the companies that train and build their LLMs on high-quality human-curated data and text, rather than large amounts of random data and text, will provide the most value to their customers. 

    In financial services, it is imperative to partner with vendors that use high-quality, banking-specific data sources to help mitigate the risk of implicit bias in the AI systems being developed. 

    Addressing biases necessitates a shift toward custom LLMs that are tailored for industry-specific needs. An LLM that is built for banking offers the same experiences and features as a larger, general-purpose LLM while also meeting the banking industry’s requirements for accuracy, transparency, trust and customization.

    These models are not only more cost-effective to create and operate, but they also provide better performance compared to general-purpose LLMs. Moreover, as generative AI has evolved toward multimodal capabilities, integrating text, image and other data modalities, banks will be able to leverage this capability to analyze diverse types of information and deliver more comprehensive insights.

    Zor Gorelov

  • Adobe Says It Won’t Train AI Using Artists’ Work. Creatives Aren’t Convinced

    Adobe Says It Won’t Train AI Using Artists’ Work. Creatives Aren’t Convinced


    When users first found out about Adobe’s new terms of service (which had been quietly updated in February), there was an uproar. Adobe told users it could access their content “through both automated and manual methods” and use “techniques such as machine learning in order to improve [Adobe’s] Services and Software.” Many understood the update as the company forcing users to grant unlimited access to their work for the purpose of training Adobe’s generative AI, Firefly.

    Late on Tuesday, Adobe issued a clarification: in an updated version of its terms of service agreement, it pledged not to train AI on user content stored locally or in the cloud, and it gave users the option to opt out of content analytics.

    With Adobe caught in the crossfire of intellectual property lawsuits, the ambiguous language of the earlier terms update shed light on a climate of acute skepticism among artists, many of whom rely heavily on Adobe for their work. “They already broke our trust,” says Jon Lam, a senior storyboard artist at Riot Games, referring to how award-winning artist Brian Kesinger discovered generated images in the style of his art being sold under his name on Adobe’s stock image site without his consent. Earlier this month, the estate of the late photographer Ansel Adams publicly scolded Adobe for allegedly selling generative AI imitations of his work.

    Scott Belsky, Adobe’s chief strategy officer, had tried to assuage concerns when artists started protesting, clarifying that machine learning refers to the company’s non-generative AI tools—Photoshop’s “Content Aware Fill,” which lets users seamlessly remove objects from an image, is one of many tools built on machine learning. But while Adobe insists that the updated terms do not give the company ownership of user content and that it will never use that content to train Firefly, the misunderstanding triggered a bigger discussion about the company’s market monopoly and how a change like this could threaten artists’ livelihoods at any point. Lam is among the artists who still believe that, despite Adobe’s clarification, the company will use work created on its platform to train Firefly without creators’ consent.

    The nervousness over nonconsensual use and monetization of copyrighted work by generative AI models is not new. Early last year, artist Karla Ortiz was able to prompt images of her work using her name on various generative AI models, an offense that gave rise to a class action lawsuit against Midjourney, DeviantArt, and Stability AI. Ortiz was not alone—Polish fantasy artist Greg Rutkowski found that his name was one of the most commonly used prompts in Stable Diffusion when the tool first launched in 2022.

    As the owner of Photoshop and creator of the PDF, Adobe has reigned as the industry standard for over 30 years, powering the majority of the creative class. Its attempt to acquire the product design company Figma, blocked and abandoned in 2023 over antitrust concerns, attests to its size.

    Adobe specifies that Firefly is “ethically trained” on Adobe Stock, but Eric Urquhart, a longtime stock image contributor, insists that “there was nothing ethical about how Adobe trained the AI for Firefly,” pointing out that Adobe does not own the rights to any images from individual contributors. Urquhart originally put his images up on Fotolia, a stock image site where he agreed to licensing terms that did not specify any uses for generative AI. Adobe acquired Fotolia in 2015 and rolled out silent terms-of-service updates that later allowed the company to train Firefly on Urquhart’s photos without his explicit consent: “The language in the current change of TOS, it’s very similar to what I saw in the Adobe Stock TOS.”


    Tiffany Ng

    Source link

  • How Game Theory Can Make AI More Reliable

    How Game Theory Can Make AI More Reliable


    Posing a far greater challenge for AI researchers was the game of Diplomacy—a favorite of politicians like John F. Kennedy and Henry Kissinger. Instead of just two opponents, the game features seven players whose motives can be hard to read. To win, a player must negotiate, forging cooperative arrangements that anyone could breach at any time. Diplomacy is so complex that a group from Meta was pleased when, in 2022, its AI program Cicero developed “human-level play” over the course of 40 games. While it did not vanquish the world champion, Cicero did well enough to place in the top 10 percent against human participants.

    During the project, Jacob—a member of the Meta team—was struck by the fact that Cicero relied on a language model to generate its dialog with other players. He sensed untapped potential. The team’s goal, he said, “was to build the best language model we could for the purposes of playing this game.” But what if instead they focused on building the best game they could to improve the performance of large language models?

    Consensual Interactions

    In 2023, Jacob began to pursue that question at MIT, working with Yikang Shen, Gabriele Farina, and his adviser, Jacob Andreas, on what would become the consensus game. The core idea came from imagining a conversation between two people as a cooperative game, where success occurs when a listener understands what a speaker is trying to convey. In particular, the consensus game is designed to align the language model’s two systems—the generator, which handles generative questions, and the discriminator, which handles discriminative ones.

    After a few months of stops and starts, the team built this principle up into a full game. First, the generator receives a question. It can come from a human or from a preexisting list. For example, “Where was Barack Obama born?” The generator then gets some candidate responses, let’s say Honolulu, Chicago, and Nairobi. Again, these options can come from a human, a list, or a search carried out by the language model itself.

    But before answering, the generator is also told whether it should answer the question correctly or incorrectly, depending on the results of a fair coin toss.

    If it’s heads, then the machine attempts to answer correctly. The generator sends the original question, along with its chosen response, to the discriminator. If the discriminator determines that the generator intentionally sent the correct response, they each get one point, as a kind of incentive.

    If the coin lands on tails, the generator sends what it thinks is the wrong answer. If the discriminator decides it was deliberately given the wrong response, they both get a point again. The idea here is to incentivize agreement. “It’s like teaching a dog a trick,” Jacob explained. “You give them a treat when they do the right thing.”

    The generator and discriminator also each start with some initial “beliefs.” These take the form of a probability distribution related to the different choices. For example, the generator may believe, based on the information it has gleaned from the internet, that there’s an 80 percent chance Obama was born in Honolulu, a 10 percent chance he was born in Chicago, a 5 percent chance of Nairobi, and a 5 percent chance of other places. The discriminator may start off with a different distribution. While the two “players” are still rewarded for reaching agreement, they also get docked points for deviating too far from their original convictions. That arrangement encourages the players to incorporate their knowledge of the world—again drawn from the internet—into their responses, which should make the model more accurate. Without something like this, they might agree on a totally wrong answer like Delhi, but still rack up points.
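
    The scoring rule lends itself to a short illustration. The sketch below is a toy with invented numbers, not the researchers’ implementation: a player earns a point when the pair agrees, minus a KL-divergence penalty (with an assumed weight) for drifting too far from its initial beliefs.

    ```python
    import numpy as np

    # Toy sketch of the consensus game's payoff: reward agreement, penalize
    # straying from initial "beliefs." Numbers and weight are illustrative.
    def kl_divergence(p, q, eps=1e-9):
        """How far a player's strategy p has drifted from its prior q."""
        p, q = np.asarray(p, float) + eps, np.asarray(q, float) + eps
        return float(np.sum(p * np.log(p / q)))

    def payoff(agreed: bool, strategy, prior, lam=0.1):
        """One point for agreement, docked for deviating from the prior."""
        return (1.0 if agreed else 0.0) - lam * kl_divergence(strategy, prior)

    answers = ["Honolulu", "Chicago", "Nairobi"]
    prior = [0.85, 0.10, 0.05]     # generator's initial beliefs
    strategy = [0.70, 0.20, 0.10]  # its strategy after playing the game

    print(payoff(agreed=True, strategy=strategy, prior=prior))
    ```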


    Steve Nadis

    Source link

  • The racial wealth gap is getting wider. Can technology fix it?

    The racial wealth gap is getting wider. Can technology fix it?


    Four years ago, George Floyd was choked to death by a police officer after trying to use a possibly counterfeit $20 bill at a Minneapolis convenience store. Widespread outrage about the killing spurred the largest U.S. banks to vow to do their part to fix the inequalities in the American financial system. 

    JPMorgan Chase announced it would spend $30 billion to address social and economic inequities. Bank of America and Citi each pledged $1 billion. Wells Fargo promised $450 million, U.S. Bank $116 million.

    Today, the banks say they’ve put this money to good use.

    JPMorgan Chase says it’s invested $30.7 billion in racial equity initiatives, mostly in the preservation and construction of affordable housing. Citi says it has provided growth capital and technical assistance to minority depository institutions, invested in Black-owned businesses and affordable housing, and is working to become an antiracist institution.

    Wells Fargo has committed $150 million to a special purpose credit program. Bank of America says it’s committed $1.2 billion to advance economic opportunity, focusing on jobs, affordable housing, small businesses and health equity. U.S. Bank says it has stepped up lending to minority-owned small businesses and mortgage down payment assistance in underserved communities.

    Despite the tens of billions of dollars banks have spent, the racial wealth gap has actually widened over this time period.

    According to the Federal Reserve Board’s most recent report on racial inequality, median wealth among white families was $285,000 in 2022, compared with $44,900 for Black families, a difference of about $240,000. In 2019, the difference was roughly $191,000. For Hispanic families, median wealth totaled $61,600 in 2022, putting the gap between Hispanic and white families at roughly $223,000, up from roughly $177,000 just three years earlier.

    And while 72.7% of white Americans own their own home, only 44% of Black Americans do, according to the National Association of Realtors. Among Hispanic families, the home ownership rate is 50.6%; among Asian families, it’s 62.8%. Black people account for only 4.3% of the 22.2 million business owners in the U.S. 

    “The reality is, white America and people of color America are living in two different financial realities,” said Silvio Tavares, CEO of VantageScore. “And as Americans, we know that that’s not sustainable. Putting aside the moral aspects of it, just as a business proposition, that’s just not sustainable.”


    Impact of the racial wealth gap

    Aaron Long grew up in the 1980s in St. Louis.

    “In the inner cities, you had the drugs, the crack, all of that stuff,” said Long, who is head of client advisory and strategy at Zest AI, a technology company with an AI-based lending platform. “Wealth affects two important things on the household level. It affects education and the environment that you’re in. Without being able to improve those, you have this continuous cycle.”

    People will sometimes blithely say that kids born in disadvantaged neighborhoods just have to pull themselves up by their bootstraps, work hard and overcome their circumstances. But Long says this cliche is not a realistic prescription to improve the lives of children growing up in poverty.

    “It’s super tough to get out,” he said. “You don’t have the skills to do it. You don’t have the education to do it. You don’t know where to go to do it.”

    Kids who grow up in poor inner cities have “small dreams,” Long said, “because that’s the only thing that you know how to dream about — you don’t see anyone in your family that you can pick up the phone and say, ‘How do I start a business?’”

    And it’s been this way in the United States for decades. In the mid-1960s, the average Black household was making around 57 cents per dollar compared with the average white household, according to Long. Today it’s around 62 cents.

    “You can see over the generations that the wealth gap is still there,” Long said. “If we continue with that trajectory, it’ll be well over 500 years before we’re able to have no wealth gap at all.”

    Racism and systemic issues still prevent African Americans from getting approved for credit, said Tonita Webb, CEO of Verity Credit Union in Seattle. 

    “It is so traumatizing for some to even just walk into a bank to apply, because of their past experience,” she said. “I know people who won’t do it because they think the financial services industry is not for them because of all the nos that they have received.”

    Some of those nos may have been for sound creditworthiness reasons, she said, but banks frequently also don’t take any steps to help move these applicants forward. Others are rejected “just because that’s been the history of our financial services industry,” Webb said.

    A long history

    Wole Coaxum left his job at JPMorgan Chase and started a fintech called Mocafi after Michael Brown, an 18-year-old Black man, was shot and killed by a police officer in Ferguson, Missouri, in 2014. A grand jury subsequently declined to indict the officer, and a firestorm of protests followed. Mocafi works with governments and nonprofits to provide financial services to underserved consumers. 

    “Watching the folks in Ferguson in the streets protesting, for me, was an instance of people fighting for social justice, but also a need for economic justice and a lack of access to opportunity,” said Coaxum. Their lack of resources was part of the reason they were in the streets, he thought. 

    In Coaxum’s view, the racial wealth gap “is deeply rooted in the bones of this country, and I’m reminded of it regularly.” 

    For instance, President Franklin Delano Roosevelt’s G.I. Bill was designed to help World War II veterans obtain affordable mortgages guaranteed by the Veterans Administration. But the loans were made by white-run financial institutions that rarely provided mortgages to Black people.

    As a result, the vast majority of the benefits went to white service members. In one example, “fewer than 100 of the 67,000 mortgages insured by the GI Bill supported home purchases by non-whites” in the New York and northern New Jersey suburbs, historian Ira Katznelson wrote in the book “When Affirmative Action Was White: An Untold History of Racial Inequality in Twentieth-Century America.”

    “The biggest economic driver of the 20th century that enabled us to become a superpower post World War II excluded Black people,” Coaxum said. “From a historical lens to a modern lens, there is a consistent thread of Black folks having less access to wealth-building opportunities.”

    What it would take to shrink the racial wealth gap

    The racial wealth gap is a huge, multifaceted problem, and experts disagree over how best to close it. Some consider increased homeownership the answer, because of all the socioeconomic benefits that stem from it. Others focus on improvements in wages, basic income, increased savings or short-term loans that people can turn to in a pinch to, say, get new tires for their car so they can keep going to work. Still others think artificial intelligence will help. Many believe it will take a concerted effort by the banking industry, fintechs and government.

    “It is a question I grapple with all the time,” Webb said. “And here’s where I land. We can make a difference for our small community and our small membership. But I think to make a difference for the overall wealth gap, the financial services industry has to make a decision to provide programs to undo systemic practices and policies and use technology, such as AI, that looks at other things besides the credit score, which we know is systemically created to have an advantage for some and a disadvantage for others.”

    Financial services firms could provide education to help people understand the financial system and how to navigate it, she said. And products need to be developed for the purpose of shrinking the wealth gap. 

    If more than 70% of white people own homes and only 40-plus percent of Black people do, “there has to be something specifically done to close that gap,” Webb said.

    It’s not enough for the government to put out a policy that companies can no longer discriminate, Webb said. There are already laws, including the Fair Housing Act of 1968 and the Equal Credit Opportunity Act, that prohibit lending discrimination based on race — and yet these issues persist. 

    “We’ve had decades and years of discrimination,” she said. “We also have to create programs that give access where folks didn’t have access before in order to shrink that gap. We’ve got to remember there are underserved communities that are way behind, so they’re playing the catch-up game.”

    Coaxum sees the racial wealth gap as a market failure that would be best solved in partnership with the government. Banks are driven to target more affluent — and in general, white — customers. These consumers tend to have more assets that the banks hope to help them invest. Originating one larger mortgage for a more expensive home is seen as less of a hassle than making several smaller loans for more modest houses. Credit decisions tend to be easier, and lenders feel more assured they will be paid back. 

    “If left to the private sector, it’s going to come along in a drive towards efficiency that doesn’t necessarily have a wide net that is systematic, sustainable and strong enough to close the wealth gap in our communities,” Coaxum said.

    Until local, state or federal government does something, “we’re just going to have a series of really smart people building really interesting companies, but may not have the scale that’s required to really meaningfully shift the needle,” Coaxum said. 

    One thing governments could do is rethink how they get resources to the unbanked and underbanked of their communities and work with partners to do this digitally, rather than through checks and benefits cards, Coaxum said.

    Coaxum’s fintech, Mocafi, for instance, works with New York City to provide immigrants with debit cards they can use to receive help. 

    New migrants to New York are processed at the Roosevelt Hotel in Manhattan. They used to receive food deliveries every three days, but uneaten food inevitably was thrown out, making the effort expensive and wasteful. With Mocafi, the city is testing giving migrants a preloaded debit card so that they can buy their own food. According to Coaxum, the new system costs a third as much as having food delivered and gives participants more choice in what they eat. It also puts dollars into the community and reduces waste, he said.

    The credit gap

    Tavares’ family came to the United States from Angola when he was 10 years old. His mother was a physician and his father was a politician turned professor. His parents found a house they liked in a safe neighborhood with good public schools. His father went to the local savings bank to apply for a mortgage.  

    “He fully anticipated that he would be approved because he had a Ph.D.,” Tavares, VantageScore’s CEO, recalled. “He was a professor at a prestigious university, and he had money in the bank.”

    The application came back a couple weeks later: Denied. When his father walked into the bank branch to ask why, he was told it was because he was an immigrant and didn’t have a credit report. Tavares’ parents talked about this a lot at the kitchen table.

    “I was just starting to learn English, but I kept on hearing this weird word, ‘mortgage,’” Tavares said.

    It’s degrading and discouraging to be declined for credit the way his family was, Tavares said.

    “When you say to somebody, you are not creditworthy, what they often focus on is not the credit part, but the banker saying, ‘You are not worthy,’” he said. 

    That stigma is part of the reason why African Americans and Hispanics often are suspicious of the banking system, “because they have a relative or somebody that they know who was very hardworking, very focused on savings, but then when they applied, they got denied,” Tavares said.

    In Tavares’ case, his father decided to use the family’s entire savings to buy the house, against his mother’s objections that if any one of them got sick, the family would be ruined. His father said the family would build a credit report over three or four years, refinance and get the money back.

    “They were able to do that, and that’s what paid for my engineering degree, my MBA and my law degree,” Tavares said.

    Starting in the fourth quarter, the Federal Housing Finance Agency will require lenders to use VantageScore 4.0 scoring models in order to sell mortgages to Fannie Mae and Freddie Mac. VantageScore 4.0 uses machine learning and trended credit data to assess the creditworthiness of people who have limited credit history. Trended data shows a person’s pattern of financial behavior over a set period, generally about 24 months. Tavares estimates this will make 4.9 million new borrowers eligible for a mortgage, and that 2.7 million will be able to get a new mortgage more easily because their credit scores will rise above 620.
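
    As a rough illustration of what “trended” means in practice, the hypothetical sketch below scores the direction of a borrower’s balances over a window of months rather than a single snapshot. The data and the simple least-squares slope are invented for illustration; actual trended-data models are proprietary.

    ```python
    import numpy as np

    # Hypothetical trended-data feature: the direction of balances over time.
    def balance_trend(monthly_balances):
        """Slope of a least-squares line through monthly balances.
        A negative slope suggests the borrower is paying balances down."""
        months = np.arange(len(monthly_balances))
        slope, _intercept = np.polyfit(months, monthly_balances, 1)
        return slope

    history = [4200, 4100, 3900, 3750, 3600, 3400]  # illustrative balances
    print(balance_trend(history))  # negative: balances trending down
    ```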

    Everyone who is creditworthy should have access to a mortgage, which is the key to unlocking financial stability, Tavares said.

    Photo: Demonstrators hold up images of George Floyd during a protest in 2021. Floyd’s death spurred large U.S. banks to pledge funds to help fix the inequities in the U.S. financial system. (Christian Monterrosa/Bloomberg)

    “If you own a home, all sorts of great things flow from that: better access to public schools, a financial security cushion when times get rough, because you can dip into your home equity,” Tavares said. “Eventually when kids finish public high school, they can go on to college and you can tap your home equity to finance that.”

    Besides mortgages, access to other types of credit, such as an auto loan, can make a significant difference in closing the racial wealth gap, experts said.  

    “Being able to access a car directly translates into better opportunities to tap new work opportunities,” Tavares said. “It gives you the ability to find the best job in your area, the one that pays the highest wages, and that translates directly into increased wealth and closing that racial wealth gap.”

    Solo Funds, a Los Angeles fintech that hosts a platform on which people in disadvantaged communities make small loans to one another, is closing the racial wealth gap for its members, according to co-founder Rodney Williams. 

    Solo Funds’ borrowers have saved nearly $30 million in fees they would have paid had they used a credit card, Williams said. And people who lend on the platform are seeing their money grow for the first time in their lives, he said.

    Solo doesn’t have the budget to do much marketing, he said. 

    “But if you go into the inner city community, if you go to the barber shop and you have a flat tire, someone’s going to say, use Solo,” Williams said. “That’s just the word on the street.”

    The need for alternative data

    Some blame the banking industry’s reliance on the FICO score and traditional credit history data for the persistence of the racial wealth gap.

    “There’s not enough data in the traditional credit bureau system to give lenders confidence about how to lend to segments that are not well represented in the credit bureau file,” said Misha Esipov, founder and CEO of Nova Credit. “To better serve those segments, you need to have a platform which includes the infrastructure, the analytics and the compliance to better understand those segments.” 

    Nova Credit’s platform provides credit bureau data (including from other countries), bank account transaction data and rent payment history as well as analytics and income verification.

    “Our belief is that when you have more data and more visibility, you can responsibly serve these segments that the traditional credit bureau model just doesn’t quite capture,” Esipov said. 

    One in five Americans has no credit score because they don’t have enough credit history to be scored, said Brian Hughes, former chief risk officer at Discover.

    Yet 95% of American adults have a checking account, “which is a great source of data and payroll data,” Hughes said. “There’s light that can be brought to these customers that don’t have a credit score. And once it’s brought, then adoption can happen and if adoption happens, greater inclusion happens,” he said. 

    Webb at Verity Credit Union agrees the FICO score is not sufficient to determine creditworthiness. FICO scores are calculated using data in credit reports that is grouped into five categories: payment history, amounts owed, length of credit history, new credit and credit mix. (FICO also offers UltraFICO, a model through which consumers opt to have a bank incorporate an analysis of their bank account data into their score. VantageScore offers a similar product, VantageScore 4plus.)  

    “A FICO score really only looks at five or six different pieces of data,” Webb said. “There’s lots of other ways that we can get more information about somebody’s character. Someone shouldn’t have to pay for the rest of their lives for maybe a blip in their lives.”

    For instance, a consumer could get a cancer diagnosis that impacts their ability to work for a time, she said. 

    “That is life and that is part of credit,” Webb said. “You can’t make somebody pay for this for 10 years. The situation can improve and no longer be a mitigating factor to how they’re going to pay their bills moving forward.”

    Banks’ and credit unions’ efforts to use alternative data, such as checking account data, to inform lending decisions is a step in the right direction, in Coaxum’s view. 

    “But you can’t forget that check cashers and pawn shops and payday lenders are serving this customer, and those data elements are not in the algorithms,” he pointed out.

    If algorithms had data from these sources, banks would have “a pretty good shot at maybe reimagining lending for this population,” Coaxum said. “That dataset would allow you to come up with some more interesting and creative lending solutions that you could feed the algorithms that might open the market up.”

    While check cashers and pawn shops don’t report loan repayments to credit bureaus, they do sometimes report when people fail to repay, creating a double negative for people who don’t have access to bank branches. The same is typically true for rent payments: the landlords that do report to credit bureaus tend to report only missed payments, not on-time ones.

    Some see hope in a movement to get landlords to report tenants’ rent payment to the credit bureaus. This could give people who can’t afford to purchase a home a way to build a credit history and work toward possibly obtaining a mortgage. 

    Esusu, for example, facilitates the reporting of on-time rent payments to credit agencies. It partners with government-sponsored housing enterprises like Fannie Mae and Freddie Mac.

    The company says it has unlocked billions of dollars in credit and facilitated access to loans, mortgages and student loans for individuals who were previously underserved.

    “The tangible increase in credit scores among renters and the creation of new credit tradelines demonstrate progress in bridging the racial wealth gap by providing financial opportunities to those who were previously credit invisible,” said Samir Goel, co-founder and co-CEO of Esusu. 

    AI-based lending

    Some bank and fintech leaders think AI could help close the racial wealth gap. 

    “We are in the early stages of assessing the transformative power of AI,” said Carolina Jannicelli, head of community impact at JPMorgan Chase. “We do believe that advancements in technology, as has been the case throughout history, have the potential to advance our economy and positively impact communities.”

    Since Verity Credit Union began using Zest AI in lending decisions last year, it has seen a significant increase in the number of approvals for protected status applicants, including a 271% rise for individuals aged 62 and older, a 177% increase for African Americans and a 375% uptick for Asian Americans and Pacific Islanders. Approvals for women increased by 194% and by 158% for Hispanic borrowers. 

    The $809 million-asset credit union tries not to decline people without helping them get to a yes, Webb said.

    “Not everyone has been told how to navigate finances,” Webb said. “We also understand, especially for traditionally underserved individuals, there’s a lot of trauma around finances. So dealing with those issues that may be present for folks helps get them in the position of a yes for some of the loans.”

    The credit union is using Zest AI software to make unsecured auto loans, credit cards and personal loans. It meets quarterly with Zest’s data analytics team to review data on the results. 

    Tia Narron, chief lending officer at Verity Credit Union, considers a borrower’s current ability to repay a much stronger indicator of creditworthiness than a brief financial challenge in the person’s past credit history.

    The credit union hopes to use the technology beyond lending, for things like preapprovals and account opening.


    AI’s unintended consequences

    As the many recent examples of inaccuracies, hallucinations and bias in generative AI models show, AI is obviously not a cure-all.

    “I believe that technology is an accelerant, not necessarily a problem solver,” Coaxum said. “It could make the problem worse if we’re not careful.”

    The use of AI to make decisions doesn’t equate to treating people equally, Coaxum said, because AI models are dependent on the datasets they are fed. And where banks aren’t serving minority communities, or aren’t serving them much, they lack the necessary data.

    According to the Federal Reserve Bank of Philadelphia, since the onset of the COVID-19 pandemic, the total number of U.S. bank branches has declined by 5.6%. The number of so-called banking deserts — neighborhoods where no banks have a physical presence — has increased by 217, and the population living in banking deserts has increased by more than 760,000 people. 

    A consequence of underserving minority communities is that when banks build the datasets that inform their lending algorithms, they don’t have a large enough sample to really understand the payment behaviors of those customers.

    “It becomes, in my mind, challenging to have a robust lending framework,” Coaxum said. “Not because they’re not good people, not because they don’t want to, they just don’t have the customer base.”

    There is a chance AI could perpetuate discrimination, resulting in further unequal treatment of racial minorities, Goel said.

    “To mitigate the risk of worsening the racial wealth gap, we have to ensure that AI systems are ethically developed, regularly audited for biases, and are regulated to prioritize fairness and inclusivity in financial services,” he said.

    AI systems used in commercial settings are typically trained on past human-generated data, pointed out Daniel Susskind, economics professor at King’s College London, senior research associate at the Institute for Ethics in AI at Oxford University and author of the book “Growth: A History and a Reckoning.” 

    “So a system that determines who gets a job interview is in part trained on the sorts of decisions that human interviewers have made in the past,” Susskind said. “The great risk, and we see this in practice, is that the sorts of biases that people exhibit in human decision making simply get replicated and in some cases magnified by these systems, which are learning how to act from human experience.”

    When AI models do demonstrate biases after being trained on human data, “quite often they tell us interesting and uncomfortable things about ourselves,” Susskind said. “They hold a mirror up sometimes to our own biases, some of which we didn’t know that we had.”

    In a paper entitled “What’s in a Name? Auditing Large Language Models for Race and Gender Bias,” Stanford Law School graduate student Amit Haim, research fellow Alejandro Salinas de Leon and professor Julian Nyarko, who is also associate director of the Stanford Institute for Human-Centered Artificial Intelligence, asked ChatGPT and other large language models for help in several scenarios, such as buying a car or a bicycle, using different names. Names commonly associated with white men, such as Dustin, Hunter and Jake, produced the most advantageous results. Names associated with Black women, such as Keyana, Lakisha and Latonya, received the least advantageous outcomes.

    “Models are trained on historically biased data,” said Salinas de Leon. “So when you put bias in, you will get bias out on the other side. If we continue on this path without properly reviewing the models and the training data they are given, then we’ll definitely increase the gap because we’re unaware of all the biases that they were trained on.”
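
    The paper’s audit method can be sketched in a few lines: ask the same question repeatedly while varying only the name, then compare the answers. Everything below is illustrative; `query_model` is a hypothetical stand-in for a real LLM API call.

    ```python
    import re
    from statistics import mean

    NAMES = ["Dustin", "Hunter", "Jake", "Keyana", "Lakisha", "Latonya"]
    PROMPT = "My name is {name}. I am buying a used car. What opening offer, in dollars, should I make?"

    def query_model(prompt: str) -> str:
        """Placeholder; replace with a call to an actual LLM API."""
        return "You could open with an offer of $15,000."  # dummy response

    def extract_dollars(text: str):
        match = re.search(r"\$([\d,]+)", text)
        return float(match.group(1).replace(",", "")) if match else None

    def audit(trials: int = 20) -> dict:
        """Average the model's suggested offer for each name."""
        results = {}
        for name in NAMES:
            offers = [extract_dollars(query_model(PROMPT.format(name=name)))
                      for _ in range(trials)]
            results[name] = mean(o for o in offers if o is not None)
        return results  # systematic gaps across names would indicate bias

    print(audit())
    ```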

    On the other hand, algorithms have less intentional bias than humans, Nyarko pointed out. 

    “Algorithms don’t have animus,” he said. “In the law, we care a lot about, do you have discriminatory intent? When algorithms make decisions, they don’t have the intent to hurt minorities. They might do that as a byproduct, but for humans, there can be specific intent or subconscious biases.” 

    According to Laura Kornhauser, founder and CEO of Stratyfy, a fintech providing AI-based underwriting and fairness models, transparency is key. Many models are tested only after they’ve made decisions, which can make them hard to revise, she said.

    “That ends up being really essential in this bias question,” Kornhauser said. “If I’m just feeding the data we have into a machine, even if I’m doing some smart things around dual optimization and adversarial biasing, if I can’t see inside the guts of the machine and make changes to how it’s working, then the risk of that bias that exists in the data being propagated forward is very real and very meaningful.” 

    Stratyfy is working with Underwriting for Racial Justice on a pilot with several lenders to drive greater fairness and access within BIPOC communities. 

    “That ends up being such a hard piece of really moving that racial wealth gap as it relates to availability of fairly priced credit,” Kornhauser said. “So many lenders are so set in the way they’ve done things before.” 

    Part of a broader issue

    The racial wealth gap is part of an overall wealth gap in America. According to Advisorpedia, more than 70% of wealth in America is owned by 10% of families. The gap between the haves and the have-nots isn’t new, but it has been growing.

    “When you look at 74% of Americans, according to our Inside the Wallet report, living paycheck to paycheck, you realize very quickly that it’s just everybody you know,” said Michael Woodhead, chief commercial officer of FinFit. 

    “Despite the best efforts of organizations like ours that are focused on financial wellness solutions and services, this problem’s only gotten worse, and it was exacerbated by macroeconomics that came out of the pandemic,” Woodhead said.

    In his view, the financial services industry in this country has always been set up to serve people who have extra money at the end of the month, taking that extra money and helping them turn it into more.

    “As a result, if you don’t have extra money at the end of the month, the financial services industry really doesn’t have much to offer you,” Woodhead said. 

    The way most Americans who are living paycheck to paycheck solve problems of lack of liquidity is with debt services that they can’t afford, which creates even more problems, Woodhead said.

    “But financially healthy people, even if they don’t have savings to speak of, have access to affordable credit,” Woodhead said. 

    FinFit works with employers to provide financial services to individuals who are underserved by the marketplace today, he said. It offers access to credit for emergencies or for long-term debt consolidation, with interest rates of 7.9% to 24.9%. Applicants don’t need a FICO or VantageScore credit score; instead, FinFit relies on a machine learning algorithm to price its loans.

    The most important thing FinFit offers is an emergency savings solution, Woodhead said. “So the next time I have a financial emergency, I have an option: I could use credit, or I could use my own emergency savings account that I have built up over time,” Woodhead said. 

    The traditional financial services industry has been paternalistic in telling people they’re spending too much money — if they would just spend less than they make, they wouldn’t have these problems, Woodhead said. 

    “That’s the way we have tried as an industry to solve this problem for about 30 years: by shaking a finger at people,” Woodhead said.

    The cost of doing nothing

    Banks that don’t try to address the racial wealth gap face an existential threat, Tavares said.

    “The demographics of our society are changing and technology has to keep pace in order for the lending system to continue to be resilient, growing, fair and free from risk,” Tavares said. “What people don’t often think about is there’s a significant cost to not updating and innovating the technology for lending.”

    Some lenders hold that what worked 20 or 30 years ago is tried and true and will continue to work today.

    “There’s actually a risk for that because in the America that we have today, the borrowers are not the same as 30 years ago,” Tavares said. “And yet you’re using this old, outdated technology, so there’s a risk also of not innovating.”

    Many banks are choosing to adopt more modern, inclusive technology because it’s a business imperative in a country that’s rapidly becoming majority-minority demographically, he said.

    “If you look at a state like California, 58% of the population is Asian American, Hispanic American, and African American,” Tavares said. “If you can’t lend effectively to those people because you have outdated technology, that’s a business problem, that’s a profitability bottom line problem,” he said.  


    Penny Crosman

    Source link

  • OpenAI Offers a Peek Inside the Guts of ChatGPT

    OpenAI Offers a Peek Inside the Guts of ChatGPT


    ChatGPT developer OpenAI’s approach to building artificial intelligence came under fire this week from former employees who accuse the company of taking unnecessary risks with technology that could become harmful.

    Today, OpenAI released a new research paper apparently aimed at showing it is serious about tackling AI risk by making its models more explainable. In the paper, researchers from the company lay out a way to peer inside the AI model that powers ChatGPT. They devise a method of identifying how the model stores certain concepts—including those that might cause an AI system to misbehave.

    Although the research makes OpenAI’s work on keeping AI in check more visible, it also highlights recent turmoil at the company. The new research was performed by the recently disbanded “superalignment” team at OpenAI that was dedicated to studying the technology’s long-term risks.

    The former group’s coleads, Ilya Sutskever and Jan Leike—both of whom have left OpenAI—are named as coauthors. Sutskever, a cofounder of OpenAI and formerly chief scientist, was among the board members who voted to fire CEO Sam Altman last November, triggering a chaotic few days that culminated in Altman’s return as leader.

    ChatGPT is powered by a family of so-called large language models called GPT, based on an approach to machine learning known as artificial neural networks. These mathematical networks have shown great power to learn useful tasks by analyzing example data, but their workings cannot be scrutinized as easily as those of conventional computer programs. The complex interplay between the layers of “neurons” within an artificial neural network makes reverse engineering why a system like ChatGPT came up with a particular response hugely challenging.

    “Unlike with most human creations, we don’t really understand the inner workings of neural networks,” the researchers behind the work wrote in an accompanying blog post. Some prominent AI researchers believe that the most powerful AI models, including ChatGPT, could perhaps be used to design chemical or biological weapons and coordinate cyberattacks. A longer-term concern is that AI models may choose to hide information or act in harmful ways in order to achieve their goals.

    OpenAI’s new paper outlines a technique that lessens the mystery a little, identifying patterns that represent specific concepts inside a machine learning system with the help of an additional machine learning model. The key innovation is a way of making that secondary network, the one used to peer inside the system of interest and identify concepts, more efficient.

    OpenAI proved out the approach by identifying patterns that represent concepts inside GPT-4, one of its largest AI models. The company released code related to the interpretability work, as well as a visualization tool that can be used to see how words in different sentences activate concepts, including profanity and erotic content, in GPT-4 and another model. Knowing how a model represents certain concepts could be a step toward being able to dial down those associated with unwanted behavior, to keep an AI system on the rails. It could also make it possible to tune an AI system to favor certain topics or ideas.
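
    One common form of this technique, and reportedly the approach in OpenAI’s paper, is a sparse autoencoder trained on a model’s internal activations so that individual learned features fire on interpretable concepts. The sketch below is a toy under that assumption; the dimensions, top-k sparsity rule and random stand-in activations are all illustrative.

    ```python
    import torch
    import torch.nn as nn

    # Toy sparse autoencoder over model activations; sizes are illustrative.
    class SparseAutoencoder(nn.Module):
        def __init__(self, d_model: int = 768, n_features: int = 16384, k: int = 32):
            super().__init__()
            self.encoder = nn.Linear(d_model, n_features)
            self.decoder = nn.Linear(n_features, d_model)
            self.k = k  # keep only the k most active features per example

        def forward(self, activations):
            latent = torch.relu(self.encoder(activations))
            topk = torch.topk(latent, self.k, dim=-1)
            sparse = torch.zeros_like(latent).scatter_(-1, topk.indices, topk.values)
            return self.decoder(sparse), sparse

    sae = SparseAutoencoder()
    activations = torch.randn(8, 768)  # stand-in for real model activations
    reconstruction, features = sae(activations)
    loss = nn.functional.mse_loss(reconstruction, activations)
    loss.backward()  # train features to reconstruct the activations
    ```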


    Will Knight

    Source link

  • Google’s AI Overviews Will Always Be Broken. That’s How AI Works

    Google’s AI Overviews Will Always Be Broken. That’s How AI Works


    A week after its algorithms advised people to eat rocks and put glue on pizza, Google admitted Thursday that it needed to make adjustments to its bold new generative AI search feature. The episode highlights the risks of Google’s aggressive drive to commercialize generative AI—and also the treacherous and fundamental limitations of that technology.

    Google’s AI Overviews feature draws on Gemini, a large language model like the one behind OpenAI’s ChatGPT, to generate written answers to some search queries by summarizing information found online. The current AI boom is built around LLMs’ impressive fluency with text, but the software can also use that facility to put a convincing gloss on untruths or errors. Using the technology to summarize online information promises to make search results easier to digest, but it is hazardous when online sources are contradictory or when people may use the information to make important decisions.

    “You can get a quick snappy prototype now fairly quickly with an LLM, but to actually make it so that it doesn’t tell you to eat rocks takes a lot of work,” says Richard Socher, who made key contributions to AI for language as a researcher and, in late 2021, launched an AI-centric search engine called You.com.

    Socher says wrangling LLMs takes considerable effort because the underlying technology has no real understanding of the world and because the web is riddled with untrustworthy information. “In some cases it is better to actually not just give you an answer, or to show you multiple different viewpoints,” he says.

    Google’s head of search Liz Reid said in the company’s blog post late Thursday that it did extensive testing ahead of launching AI Overviews. But she added that errors like the rock eating and glue pizza examples—in which Google’s algorithms pulled information from a satirical article and a jocular Reddit comment, respectively—had prompted additional changes. They include better detection of “nonsensical queries,” Google says, and making the system rely less heavily on user-generated content.

    You.com routinely avoids the kinds of errors displayed by Google’s AI Overviews, Socher says, because his company developed about a dozen tricks to keep LLMs from misbehaving when used for search.

    “We are more accurate because we put a lot of resources into being more accurate,” Socher says. Among other things, You.com uses a custom-built web index designed to help LLMs steer clear of incorrect information. It also selects from multiple different LLMs to answer specific queries, and it uses a citation mechanism that can explain when sources are contradictory. Still, getting AI search right is tricky. WIRED found on Friday that You.com failed to correctly answer a query that has been known to trip up other AI systems, stating that “based on the information available, there are no African nations whose names start with the letter ‘K.’” In previous tests, it had aced the query.

    Google’s generative AI upgrade to its most widely used and lucrative product is part of a tech-industry-wide reboot inspired by OpenAI’s release of the chatbot ChatGPT in November 2022. A couple of months after ChatGPT debuted, Microsoft, a key partner of OpenAI, used its technology to upgrade its also-ran search engine Bing. The upgraded Bing was beset by AI-generated errors and odd behavior, but the company’s CEO, Satya Nadella, said that the move was designed to challenge Google, saying “I want people to know we made them dance.”

    Some experts feel that Google rushed its AI upgrade. “I’m surprised they launched it as it is for as many queries—medical, financial queries—I thought they’d be more careful,” says Barry Schwartz, news editor at Search Engine Land, a publication that tracks the search industry. The company should have better anticipated that some people would intentionally try to trip up AI Overviews, he adds. “Google has to be smart about that,” Schwartz says, especially when it’s showing the results by default on its most valuable product.

    Lily Ray, a search engine optimization consultant, was for a year a beta tester of the prototype that preceded AI Overviews, which Google called Search Generative Experience. She says she was unsurprised to see the errors that appeared last week given how the previous version tended to go awry. “I think it’s virtually impossible for it to always get everything right,” Ray says. “That’s the nature of AI.”


    Will Knight

    Source link

  • Google Admits Its AI Overviews Search Feature Screwed Up

    Google Admits Its AI Overviews Search Feature Screwed Up


    When bizarre and misleading answers to search queries generated by Google’s new AI Overview feature went viral on social media last week, the company issued statements that generally downplayed the notion the technology had problems. Late Thursday, the company’s head of search, Liz Reid, admitted that the flubs had highlighted areas that needed improvement, writing, “We wanted to explain what happened and the steps we’ve taken.”

    Reid’s post directly referenced two of the most viral, and wildly incorrect, AI Overview results. One saw Google’s algorithms endorse eating rocks because doing so “can be good for you,” and the other suggested using nontoxic glue to thicken pizza sauce.

    Rock eating is not a topic many people were ever writing or asking questions about online, so there aren’t many sources for a search engine to draw on. According to Reid, the AI tool found an article from The Onion, a satirical website, that had been reposted by a software company, and it misinterpreted the information as factual.

    As for Google telling its users to put glue on pizza, Reid effectively attributed the error to a sense of humor failure. “We saw AI Overviews that featured sarcastic or troll-y content from discussion forums,” she wrote. “Forums are often a great source of authentic, first-hand information, but in some cases can lead to less-than-helpful advice, like using glue to get cheese to stick to pizza.”

    It’s probably best not to make any kind of AI-generated dinner menu without carefully reading it through first.

    Reid also suggested that judging the quality of Google’s new take on search based on viral screenshots would be unfair. She claimed the company did extensive testing before launch and said its data shows people value AI Overviews, indicating, for instance, that people are more likely to stay on a page discovered that way.

    Why the embarrassing failures? Reid characterized the mistakes that won attention as the result of an internet-wide audit that wasn’t always well intentioned. “There’s nothing quite like having millions of people using the feature with many novel searches. We’ve also seen nonsensical new searches, seemingly aimed at producing erroneous results.”

    Google claims some widely distributed screenshots of AI Overviews gone wrong were fake, which seems to be true based on WIRED’s own testing. For example, a user on X posted a screenshot that appeared to be an AI Overview responding to the question “Can a cockroach live in your penis?” with an enthusiastic confirmation from the search engine that this is normal. The post has been viewed over 5 million times. Upon further inspection, though, the format of the screenshot doesn’t align with how AI Overviews are actually presented to users. WIRED was not able to recreate anything close to that result.

    And it’s not just users on social media who were tricked by misleading screenshots of fake AI Overviews. The New York Times issued a correction to its reporting about the feature and clarified that AI Overviews never suggested users should jump off the Golden Gate Bridge if they are experiencing depression—that was just a dark meme on social media. “Others have implied that we returned dangerous results for topics like leaving dogs in cars, smoking while pregnant, and depression,” Reid wrote Thursday. “Those AI Overviews never appeared.”

    Yet Reid’s post also makes clear that not all was right with the original form of Google’s big new search upgrade. The company made “more than a dozen technical improvements” to AI Overviews, she wrote.

    Only four are described: better detection of “nonsensical queries” not worthy of an AI Overview; making the feature rely less heavily on user-generated content from sites like Reddit; offering AI Overviews less often in situations users haven’t found them helpful; and strengthening the guardrails that disable AI summaries on important topics such as health.

    There was no mention in Reid’s blog post of significantly rolling back the AI summaries. Google says it will continue to monitor feedback from users and adjust the features as needed.


    Reece Rogers

    Source link