Google is reportedly testing Gemini against Anthropic's Claude
Credit: Google


Google contractors have reportedly been asked to compare Gemini's responses with Claude's and score them on criteria including truthfulness and verbosity. A spokesperson for Google DeepMind confirmed the company was using Claude only for model output comparison, a standard industry practice.

by Ellie Ramirez-Camara

According to recent reports, Google contractors are being tasked with comparing Gemini's responses with Claude's and scoring them on criteria including truthfulness and verbosity. The reports are based on internal documentation and model outputs seen by TechCrunch. In internal correspondence, contractors note the increasing number of references to Anthropic's AI assistant Claude. Moreover, at least one of the model outputs obtained by TechCrunch explicitly states “I am Claude, created by Anthropic.”

Anthropic is known for prioritizing safety. The startup often positions itself as a more ethical and safety-focused alternative to competitors, including OpenAI and Google itself. Contractors note in the correspondence that in many cases Claude refuses to answer queries that other assistants do not treat as unsafe. The examples range from a request to pretend to be a different AI assistant to a case in which Claude declined to answer and Gemini's response had to be flagged as a severe safety violation for involving nudity and bondage.

TechCrunch could not confirm whether Google asked for Anthropic's authorization before performing these tests. Anthropic's commercial terms of service forbid using Claude to train rival products, services, and AI models. Google DeepMind spokesperson Shira McNamara emphasized that Claude was not being used in any capacity to train Gemini. Rather, the company uses Claude for evaluation purposes, such as comparing model outputs, a standard industry practice.

