Data Phoenix Digest - ISSUE 3.2024
Latest AI News, Become a Speaker at Data Phoenix Webinars, Getting Started with Diffusers for Text-to-Image, Merge Large Language Models, Building an LLMOPs Pipeline, Dealing with MRI and Deep Learning with Python, Lumiere, Diffutoon, SEELE, and more.
Welcome to this week's edition of Data Phoenix Digest! This newsletter keeps you up to date on the news in our community and summarizes the top research papers, articles, and news to help you keep track of trends in the Data & AI world!
Be active in our community and join our Slack to discuss the latest news, top research papers, articles, events, jobs, and more.
Click here for details.
Call For Speakers
We regularly host webinars for our global AI and Data community of Engineers, Executives, and Founders. Being a speaker at our events is a great opportunity to share your technical expertise and knowledge with the Data Phoenix community.
To be considered as a speaker at our future events, please fill out this form.
Latest news
- The EU AI Act is closer than ever after representatives vote to confirm the law's final draft.
- Figure AI, a startup building a humanoid robot, is in funding talks with Microsoft and OpenAI.
- The linear transformer-based Eagle-7B is the strongest multi-language open-source LLM to date.
- Meta released a new version of the Code Llama model family.
- The PyTorch libraries received an update following the PyTorch 2.2 release.
- Dynatrace provides enterprises with an end-to-end operational view of AI-powered applications.
- The Synthesia AI Video assistant helps users turn text into video.
- Version Lens has secured pre-seed funding for its project manager co-pilot.
- Databricks announces its acquisition of AI startup Einblick.
- Sentify raises $1.1M in pre-seed funding to help its users extract insights from LLM products.
- CARPL - Radiology AI Platform closed a successful $6M seed round.
- Google announced Imagen 2, a DeepMind-powered update to its image-generation technology.
Find details and more news in our weekly review, "This Week in AI".
Summary of the top articles and papers
Articles
Getting Started with Diffusers for Text-to-Image
Diffusers is the go-to library for state-of-the-art pretrained diffusion models for generating images, audio, and 3D structures of molecules. In this comprehensive article, you will learn how to generate images from text descriptions using this library.
Introducing ASPIRE for Selective Prediction in LLMs
ASPIRE is a novel framework designed to enhance the selective prediction capabilities of LLMs. It significantly outperforms state-of-the-art selective prediction methods on a variety of QA datasets, such as the CoQA benchmark. Learn more about it!
Merge Large Language Models
Model merging is a novel technique that combines multiple models into a single one. The merged model can retain the quality of its sources while gaining additional benefits. This article explores various merging algorithms, ways of implementing them, and more.
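To make the idea concrete, here is a minimal sketch of the simplest form of merging: linear interpolation of weights. It is an illustration only and not necessarily one of the algorithms the article covers. It assumes both models share the same architecture and parameter names; the function name `merge_linear` and the toy weight dictionaries are hypothetical.

```python
from typing import Dict, List

# Toy stand-in for a model checkpoint: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def merge_linear(a: Weights, b: Weights, alpha: float = 0.5) -> Weights:
    """Return alpha * a + (1 - alpha) * b, computed per parameter."""
    if a.keys() != b.keys():
        raise ValueError("models must have identical parameter names")
    merged: Weights = {}
    for name in a:
        if len(a[name]) != len(b[name]):
            raise ValueError(f"size mismatch for parameter {name!r}")
        merged[name] = [
            alpha * x + (1 - alpha) * y for x, y in zip(a[name], b[name])
        ]
    return merged


# Hypothetical two-parameter "models" used only for illustration.
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}

merged = merge_linear(model_a, model_b, alpha=0.5)
print(merged)
```

With `alpha=0.5` this is a plain average of the two models; moving `alpha` toward 1.0 weights the result toward the first model.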
Building an LLMOPs Pipeline
Productionizing LLMs is a consistent challenge for users. This article explains how to build a pipeline that fine-tunes, deploys, and evaluates a Llama 7B model, to help evaluate different models, datasets, and prompts. Check out the results!
Dealing with MRI and Deep Learning with Python
Getting started with CV tasks in DL usually means working with standard public image datasets such as ImageNet, which consist of 3-channel RGB natural images. Images in other domains, however, differ in both format and features. Using MRI as the example, this article explains how to deal with those differences in Python.
Papers & projects
LUMIERE: A Space-Time Diffusion Model for Video Generation
Lumiere is a text-to-video diffusion model designed for synthesizing videos that portray realistic, diverse, and coherent motion. Based on a Space-Time U-Net architecture, it generates the entire temporal duration of the video at once, through a single pass in the model. Learn more about its state-of-the-art text-to-video generation results!
Diffutoon
Diffutoon is a toon shading approach that aims to transform photorealistic videos into anime styles. It can handle exceptionally high resolutions and rapid motions, and it can also edit the content according to prompts. In the experiments, Diffutoon surpasses both open-source and closed-source baseline approaches.
SEELE: Repositioning The Subject Within Image
Current image manipulation centers on static manipulation, such as replacing regions within an image or altering its overall style. SEELE addresses subject repositioning, an innovative dynamic manipulation task, with an interactive pre-processing, manipulation, and post-processing pipeline.
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
Recaption, Plan and Generate (RPG) is a new training-free text-to-image generation/editing framework that harnesses the powerful chain-of-thought reasoning ability of multimodal LLMs to enhance the compositionality of text-to-image diffusion models. The RPG framework exhibits wide compatibility with various MLLM architectures and diffusion backbones.
Diffuse to Choose: Enriching Image Conditioned Inpainting in Latent Diffusion Models for Virtual Try-All
"Diffuse to Choose" is a novel diffusion-based image-conditioned inpainting model that efficiently balances fast inference with the retention of high-fidelity details in a given reference item while ensuring accurate semantic manipulations in the given scene content. It is superior to existing zero-shot diffusion inpainting methods.