Welcome to this week's edition of Data Phoenix Digest! This newsletter keeps you up to date on community news and summarizes the top research papers, articles, and events, helping you keep track of trends in the Data & AI world!
Be active in our community and join our Slack to discuss the latest community news, top research papers, articles, events, jobs, and more.
Click here for details.
Data Phoenix's upcoming webinar:
A Whirlwind Tour of ML Model Serving Strategies (Including LLMs)
There are many recipes for serving machine learning models to end users today, and even though new ones keep popping up as time passes, some questions remain: How do we pick the appropriate serving recipe from the menu available, and how can we execute it as fast and efficiently as possible? In this talk, we'll take a whirlwind tour of the different machine learning deployment strategies available today for both traditional ML systems and Large Language Models, and we'll also touch on a few do's and don'ts while we're at it. This session will be jargonless, but not buzzword- or meme-less.
- AI in 2023: A year in review.
- Meta is developing open source AGI, says Zuckerberg.
- Google's new Gemini-powered tool helps advertisers quickly create search campaigns using chat.
- Stability AI presented Stable LM 2 1.6B - a small language model trained on multilingual data in English, Spanish, German, Italian, French, Portuguese, and Dutch.
- MLOps platform Seldon has changed its license from Apache License Version 2.0 (“Apache 2.0”) to Business Source License v1.1 (“BSL”).
- Runway Gen-2 adds multiple motion controls to AI videos with Multi Motion Brush.
- Deci introduced two new models: DeciDiffusion 2.0 for image generation and DeciCoder-6B for code generation.
- Google DeepMind’s new AI system can solve complex geometry problems.
- Generative AI and Ukraine emerge as priorities at the 2024 World Economic Forum.
- Pinecone debuts Pinecone serverless, a revamped vector database.
- New investment rounds were raised by ElevenLabs ($80M), Qdrant ($28M), RagaAI ($4.7M), Sakana AI ($30M), SKY ENGINE AI ($7M), Phospho (€1.7M), Quora ($75M) and others.
Summary of the top articles and papers
Deploying Kedro Pipelines on Vertex AI: The MLOps journey of a Life Company
Kedro is an open-source Python framework for creating reproducible, sustainable, and modular data science code. In this article, you’ll learn how Kedro can be used to streamline the adoption and implementation of MLOps best practices and methods in a life insurance company.
A Comprehensive Overview of Gaussian Splatting
3D Gaussian Splatting is a recent volume rendering method useful to capture real-life data into a 3D space and render them in real-time. This article explores the topic, providing an overview of how a 3D world can be represented with a set of 3D points and going deeper into comparing the rendering speed of Gaussian Splatting vs. NeRF.
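The "render them in real-time" part comes down to alpha-blending the points (Gaussians) that fall along each camera ray. Here is a minimal, illustrative sketch of that blending rule in plain Python; the `Gaussian3D` and `composite` names are hypothetical, and a real splatting renderer additionally stores anisotropic covariance and spherical-harmonic color per point and sorts splats by depth:

```python
from dataclasses import dataclass

@dataclass
class Gaussian3D:
    # Minimal attributes of one splat; real implementations also store an
    # anisotropic covariance (scale + rotation) and view-dependent color.
    position: tuple   # (x, y, z) center
    color: tuple      # (r, g, b)
    opacity: float    # alpha in [0, 1]

def composite(splats_front_to_back):
    """Front-to-back alpha compositing of splats along one camera ray.

    This is the core volume-rendering blend: each splat contributes its
    color weighted by its opacity times the transmittance (how much light
    still gets through the splats in front of it).
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0
    for g in splats_front_to_back:
        weight = g.opacity * transmittance
        for c in range(3):
            color[c] += weight * g.color[c]
        transmittance *= 1.0 - g.opacity
    return tuple(color)

# Two splats on a ray: a half-transparent red one in front of an opaque blue one.
ray = [Gaussian3D((0, 0, 1), (1, 0, 0), 0.5),
       Gaussian3D((0, 0, 2), (0, 0, 1), 1.0)]
pixel = composite(ray)  # red contributes half, blue fills the rest
```

Because each splat is rasterized rather than ray-marched (as in NeRF), this blend runs fast enough for real-time rendering, which is exactly the speed comparison the article digs into.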
Fine-tune and Deploy Llama 2 Models Cost-Effectively in Amazon SageMaker JumpStart with AWS Inferentia and AWS Trainium
The cost-efficiency of model deployment, fine-tuning, and inference should always be on the agenda of any ML engineer: no matter how good your ML models are, if the cost is through the roof, businesses won’t use them in the real world. This article dives deep into these considerations, with a focus on the AWS ecosystem of solutions and services.
A Simple CI/CD Setup for ML Projects
Integration, deployment, and scalability are part of any ML project and can be quite challenging to tackle. And while MLOps professionals can (and should) help, knowing some standard, well-defined practices you can lean on when you kick off a project is also a must. This article provides some of them. Give it a read!
Deep Learning for Single-cell Sequencing: A Microscope to See the Diversity of Cells
Deep learning empowers scientists to explore previously undiscovered aspects of cellular behavior. This article explores how its application in single-cell sequencing functions as an advanced microscope that can reveal intricate insights within individual cells and provide a profound understanding of cellular heterogeneity and complexity in biological systems.
Papers & projects
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding
PhotoMaker is an efficient personalized text-to-image generation method, which encodes an arbitrary number of input ID images into a stacked ID embedding to preserve ID information. It satisfies the requirements of high efficiency, promising identity (ID) fidelity, and flexible text controllability, which sets it apart among text-to-image generation methods.
Neural Spline Fields for Burst Image Fusion and Layer Separation
Burst imaging pipelines are essential to high-quality cellphone photography. In this work, burst image stacks are used for layer separation. The authors demonstrate how their method can help remove occlusions, suppress reflections, and erase photographer-cast shadows, outperforming learned single-image and multi-view obstruction removal methods.
Fast Inference of Mixture-of-Experts Language Models with Offloading
Mixture-of-Experts (MoE), a type of model architecture where only a fraction of model layers are active for any given input, can be used to run Large Language Models more efficiently. In this work, the authors study the problem of running large MoE language models on consumer hardware with limited accelerator memory and share the results.
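The "only a fraction of model layers are active" idea can be sketched with a toy top-k router in plain Python (illustrative names, no real framework): a gating network scores every expert, but only the top-k selected experts actually run, so the rest can stay offloaded to CPU or disk, which is the premise the paper builds on:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_layer(token, gate_scores, experts, k=2):
    """Route a token to its top-k experts and mix their outputs.

    gate_scores: one score per expert for this token (from a gating network).
    experts: list of callables; only the k selected ones are invoked -- the
    unselected experts never run, so their weights need not be in GPU memory.
    """
    probs = softmax(gate_scores)
    top_k = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top_k)  # renormalize over the selected experts
    return sum(probs[i] / norm * experts[i](token) for i in top_k)

# Toy experts: each just scales its scalar input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
out = moe_layer(10.0, gate_scores=[0.1, 0.3, 2.0, 1.5], experts=experts, k=2)
```

Here only experts 2 and 3 fire, so the output lands between their individual outputs (30 and 40); scaling this sparsity up is what lets large MoE models run on consumer hardware when the inactive experts are offloaded.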
4DGen: Grounded 4D Content Generation with Spatial-temporal Consistency
4DGen is a novel framework for grounded 4D content generation that decomposes the 4D generation task into multiple stages. It yields competitive results in reconstructing input signals and inferring renderings from novel viewpoints and timesteps. It supports grounded generation, offering users enhanced control, which is hard to achieve with previous methods.
SiLK – Simple Learned Keypoints
In this work, the authors propose Simple Learned Keypoints (SiLK). Despite its simplicity, SiLK sets a new state of the art on the Detection Repeatability and Homography Estimation tasks on HPatches and the 3D Point-Cloud Registration task on ScanNet, and achieves performance competitive with the state of the art on camera pose estimation.
Self-Rewarding Language Models
Superhuman agents require superhuman feedback. In this work, the authors study Self-Rewarding Language Models, where the language model is used via LLM-as-a-Judge prompting to provide its own rewards during training. Their model outperforms many existing systems on the AlpacaEval 2.0 leaderboard. Check out the results!
3D-aware Blending with Generative NeRFs
BlendNeRF is a 3D-aware blending method using generative Neural Radiance Fields (NeRF), with two key components: 3D-aware alignment and 3D-aware blending. It automatically aligns and composes images with different camera poses and object shapes.
Adding Conditional Control to Text-to-Image Diffusion Models
ControlNet is a neural network structure designed to control pre-trained large diffusion models to support additional input conditions. The ControlNet learns task-specific conditions in an end-to-end way, and the learning is robust even when the training dataset is small.
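The trick that makes this training robust is the "zero convolution": the frozen base model is paired with a trainable copy whose contribution is merged back through layers initialized to zero, so at the start of training the combined network behaves exactly like the pretrained model. A minimal sketch of that idea in plain Python (hypothetical names; real ControlNet uses 1x1 convolutions over feature maps in a U-Net):

```python
def zero_conv(xs, weight=0.0, bias=0.0):
    # Stand-in for a 1x1 "zero convolution": weight and bias start at zero,
    # so the control branch contributes nothing before training begins.
    return [weight * v + bias for v in xs]

def controlnet_block(features, condition, base_block, control_block,
                     w_in=0.0, w_out=0.0):
    """Sketch of one ControlNet-style block (illustrative structure).

    base_block is frozen; control_block is its trainable copy, fed the
    extra condition and merged back through zero convolutions.
    """
    base_out = base_block(features)
    control_in = [f + c for f, c in zip(features, zero_conv(condition, w_in))]
    control_out = zero_conv(control_block(control_in), w_out)
    return [b + c for b, c in zip(base_out, control_out)]

base = lambda xs: [2 * v for v in xs]     # stands in for a frozen diffusion block
control = lambda xs: [v + 1 for v in xs]  # stands in for its trainable copy

features, condition = [1.0, 2.0], [5.0, 5.0]
# With zero-initialized weights, the condition has no effect yet and the
# block reproduces the base model's output -- the starting point that keeps
# ControlNet training stable even on small datasets.
out = controlnet_block(features, condition, base, control)
```

As `w_in` and `w_out` grow during training, the condition gradually steers the output while the pretrained behavior remains the anchor.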