Data Phoenix Digest

Data Phoenix Digest - ISSUE 12.2023

Best practices for building LLM-based applications, pandas for time series, use Stable Diffusion XL with Amazon SageMaker JumpStart, training Diffusion models with Reinforcement Learning, AI / ML / LLM / Transformer models timeline and list, GigaGAN, FABRIC, AnimateDiff, AutoRecon, and more.

by Dmitry Spodarets

Updated August 01, 2023

Hey folks,

Welcome to this week's edition of Data Phoenix Digest! This newsletter keeps you up-to-date on the news in our community and summarizes the top research papers, articles, and news to help you keep track of trends in the Data & AI world!

Be active in our community and join our Slack to discuss the latest news of our community, top research papers, articles, events, jobs, and more...

📣

Want to promote your company, conference, job, or event to the Data Phoenix community of Data & AI researchers and engineers? Click here for details.

Data Phoenix upcoming webinars:

Best practices for building LLM-based applications
Many businesses have started incorporating Large Language Models into their applications. There are, however, several challenges that may impact such systems, and it helps to be aware of them before you start. During the talk, we will review the existing tools and see how to move from development to production without a headache.

LlamaIndex: How to use Large Language Models to Interface with Multiple Data Sources / August 3
Leveraging Large Language Models for Enterprise Usage / August 17
Go Beyond Chatbot - Emerging Patterns in Generative AI Applications / August 24

Stand with Data Phoenix in the AI Revolution

From fresh startups to global enterprises - the world is abuzz with AI. Yet, critical information can easily be missed in the noise. By subscribing to Data Phoenix Premium, you can support our mission to provide comprehensive AI coverage.

Ignite your knowledge with Data Phoenix.

Subscribe now!

Summary of the top articles and papers

Articles

Using AI to Fight Climate Change
Climate change is an incredibly complex topic, and it only makes sense to utilize the data analytics power of AI to advance our understanding of climate change, optimize existing systems, and accelerate breakthrough science of climate and its effects.

Pandas for Time Series
A time series has many definitions, but it is generally understood as a set of data points collected over a period of time. This article explains how to apply Pandas to a time series dataset, with an example of generated blood sugar level records.

Leveraging Llama 2 Features in Real-world Applications: Building Scalable Chatbots with FastAPI, Celery, Redis, and Docker
This article provides a step-by-step guide on building a chat app using FastAPI, Celery, Redis, and Docker with Meta’s Llama 2. The application demonstrates the potential of open-source language models like Llama 2 in a commercial setting.

Use Stable Diffusion XL with Amazon SageMaker JumpStart in Amazon SageMaker Studio
SDXL 1.0 is the latest image generation model from Stability AI. With it now available for customers through Amazon SageMaker JumpStart, it is important to figure out how you can use it for a variety of tasks on AWS. Check out this explainer to learn more!

Training Diffusion Models with Reinforcement Learning
Researchers from UC Berkeley show how diffusion models can be trained on downstream objectives using RL. They fine-tune Stable Diffusion on a variety of objectives to improve the model’s performance on unusual prompts, demonstrating how AI models can improve each other without humans in the loop.

Papers & projects

AI / ML / LLM / Transformer Models Timeline and List
This collection highlights the most important works in the realm of Large Language Models and Transformer Models, with a particular focus on developments since mid-2022. While it's not intended to be exhaustive, it provides an overview of the progression in this field. The list is actively updated to keep pace with the latest research.

GigaGAN: Large-scale GAN for Text-to-Image Synthesis
In this paper, the authors present a 1B-parameter GigaGAN, achieving lower FID than Stable Diffusion v1.5, DALL·E 2, and Parti-750M. They also train a fast upsampler that can generate 4K images from the low-res outputs of text-to-image models. Learn more!

FABRIC: Personalizing Diffusion Models with Iterative Feedback
In this study, the authors investigate the integration of iterative human feedback into diffusion-based text-to-image models with FABRIC (Feedback via Attention-Based Reference Image Conditioning), a training-free approach that conditions the diffusion process on a set of feedback images and is applicable to popular diffusion models.

AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
The authors propose a framework to animate personalized text-to-image models. It appends a newly initialized motion modeling module to the frozen base text-to-image model, then trains that module on video clips to distill a reasonable motion prior. All personalized versions become text-driven models that produce personalized animated images.

AutoRecon: Automated 3D Object Discovery and Reconstruction
AutoRecon is a novel framework for the automated discovery and reconstruction of an object from multi-view images. Foreground objects can be robustly located and segmented from SfM point clouds by leveraging self-supervised 2D vision transformer features. Experiments demonstrate the effectiveness and robustness of AutoRecon.

🤗

If you enjoy our work, we would greatly appreciate your support by sharing our digest with your friends on Twitter, LinkedIn, or Facebook using the hashtag #dataphoenix. Your help in reaching a wider audience is invaluable to us!
