Data Phoenix Digest - ISSUE 17.2023

Gen AI Data Chain at Scale, RAG vs Fine Tuning, Feature Store Design at Constructor, Multimodal Chain-of-Thought Reasoning in Language Models, Build and Deploy ML Inference Applications from Scratch Using SageMaker, ReBotNet, BiomedGPT, and more.

Dmitry Spodarets

Welcome to this week's edition of Data Phoenix Digest! This newsletter keeps you up-to-date on our community news and summarizes the top research papers, articles, and announcements to keep you on track with trends in the Data & AI world!

Be active in our community and join our Slack to discuss the latest community news, top research papers, articles, events, jobs, and more...

Want to promote your company, product, conference, job, or event to the Data Phoenix community of Data & AI researchers and engineers? Click here for details.

Data Phoenix's upcoming webinars:

Gen AI data chain at scale
Generative AI workflows heavily rely on data-centric tasks, such as filtering samples by annotation fields, vector distances, or scores produced by custom classifiers. At the same time, computer vision datasets are quickly approaching petabyte volumes, making data wrangling difficult. In addition, the iterative nature of data preparation necessitates robust dataset sharing and versioning mechanisms, both of which are hard to implement ad hoc. In this workshop, we will introduce DVCx, an upcoming product by Iterative that separates the storage and processing of samples from metadata and enables data-centric operations at scale for machine learning teams and individual researchers.

Speaker: Tibor Mach is a Machine Learning Solutions Engineer at Iterative. He has been working in ML and MLOps for the past 5 years. Tibor has a Ph.D. in mathematics from the University of Göttingen and published papers in the field of probability theory before refocusing on ML.

Summary of the top articles and papers


RAG vs Fine Tuning — Which Is the Best Tool to Boost Your LLM Application?
RAG and fine-tuning are often treated as two nearly interchangeable techniques for enhancing the performance of LLM-based applications. In reality, they address different aspects of the optimization process, and that distinction is crucial when it comes to choosing one over the other. This article dives deep into the differences between the two, along with their pros and cons.
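
To make the RAG side of that comparison concrete, here is a minimal sketch of the core retrieve-then-prompt loop: find the stored document closest to the query embedding and prepend it to the prompt. The documents and embedding vectors below are toy stand-ins, not outputs of a real embedding model.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical pre-computed embeddings for a tiny document store.
store = {
    "Our refund policy allows returns within 30 days.": [0.9, 0.1, 0.0],
    "The API rate limit is 100 requests per minute.": [0.1, 0.9, 0.1],
}

def build_prompt(query, query_embedding):
    # Retrieve the document whose embedding is closest to the query's,
    # then stuff it into the prompt as context for the LLM.
    best_doc = max(store, key=lambda d: cosine(store[d], query_embedding))
    return f"Context: {best_doc}\n\nQuestion: {query}"

prompt = build_prompt("How many requests can I make?", [0.2, 0.8, 0.0])
print(prompt)
```

Fine-tuning, by contrast, changes the model's weights rather than its input, which is why the two techniques optimize different things.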

Build and Deploy ML Inference Applications from Scratch Using Amazon SageMaker
AI/ML-powered inference applications are increasingly used to solve a range of complex business problems, and solving them often requires multiple ML models and steps. This article provides a step-by-step guide on how to build and host an ML application with custom containers on Amazon SageMaker.

PEFT: Parameter-Efficient Fine-Tuning of Billion-Scale Models on Low-Resource Hardware
As models get larger, full fine-tuning becomes infeasible on consumer hardware, and storing and deploying a separately fine-tuned model for each downstream task becomes expensive too. PEFT approaches are meant to address both problems!
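
A back-of-the-envelope calculation shows why this works. In LoRA-style PEFT, instead of updating a full d_out x d_in weight matrix, you train two low-rank factors B (d_out x r) and A (r x d_in) and keep the base weights frozen. The dimensions and rank below are illustrative, not taken from the article.

```python
def full_params(d_out, d_in):
    # Trainable parameters when fine-tuning the full weight matrix.
    return d_out * d_in

def lora_params(d_out, d_in, r):
    # Trainable parameters for the low-rank update B @ A.
    return d_out * r + r * d_in

d_out = d_in = 4096   # a typical transformer projection size
r = 8                 # low-rank adapter rank

full = full_params(d_out, d_in)
lora = lora_params(d_out, d_in, r)
print(f"full: {full:,} trainable params")
print(f"LoRA (r={r}): {lora:,} trainable params ({lora / full:.2%} of full)")
```

For this single layer, the adapter trains about 0.4% of the original parameter count, and since the base weights are shared, each downstream task only needs its small adapter stored and deployed.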

Optimize Your Machine Learning Deployments with Auto Scaling on Amazon SageMaker
This article offers a design pattern for deriving the right auto scaling configuration. It includes a list of steps to follow, so even if your app has unique behavior or different system characteristics, this systematic approach can be applied to determine the right scaling policies.

Feature Store Design at Constructor
In this article, the author discusses feature challenges in real-time ML, walks through the current design of the Feature Store at Constructor, goes over the key decision drivers behind it, and shows how the data science experimentation workflow benefits from the Feature Store API.

Papers & projects

Putting the Object Back into Video Object Segmentation
Cutie is a video object segmentation (VOS) network with object-level memory reading, which puts the object representation from memory back into the video object segmentation result. Cutie can cleanly separate the semantics of the foreground object from the background.

4K4D: Real-Time 4D View Synthesis at 4K Resolution
4K4D is a 4D point cloud representation that supports hardware rasterization and enables unprecedented rendering speed. It is built on a 4D feature grid so that the points are naturally regularized and can be robustly optimized. Learn more about it!

ReBotNet: Fast Real-time Video Enhancement
ReBotNet is a video architecture designed to enhance live streams and video conferences in real time. It outperforms existing approaches with lower compute, reduced memory requirements, and faster inference. Take a look!

Multimodal Chain-of-Thought Reasoning in Language Models
In this paper, the authors propose Multimodal-CoT, which incorporates language and vision modalities into a two-stage framework that separates rationale generation from answer inference, in order to generate better rationales. Learn how they do it!

BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks
Biomedical Generative Pre-trained Transformer (BiomedGPT) is a unified and generalist model that uses self-supervision on large and diverse datasets to accept multi-modal inputs and perform downstream tasks. It delivers expansive and inclusive representations of biomedical data, outperforming the majority of preceding state-of-the-art models.

Data Phoenix is free today. Do you enjoy our digests and webinars? Value our AI coverage? Your support as a paid subscriber helps us continue our mission of delivering top-notch AI insights. Join us as a paid subscriber in shaping the future of AI with the Data Phoenix community.
