Multilingual Semantic Search
This talk will discuss why multilingual semantic search is amazing, how the respective models are trained, and the new use cases this unlocks.
Connecting large language models with embeddings and semantic search on your own data has become very popular. But how does this work in other languages, and across languages? Join me for this talk on why multilingual semantic search is amazing, how the respective models are trained, and the new use cases this unlocks.
Nils Reimers
Nils Reimers did his Ph.D. and postdoc at TU Darmstadt, where he laid the foundation for using transformer networks for semantic search. After his postdoc, he joined Hugging Face to work on self-supervised domain adaptation for semantic search. Last year, Nils joined Cohere.com as director of machine learning to work on large language models for text understanding, including search, classification, and text aggregation.