Data Phoenix Digest - ISSUE 8.2024
Webinar "Training an Expert Finance LLM", Customizing Large Language Models, Efficiently Finetune Llama 3 with PyTorch FSDP and Q-Lora, Improving Diffusion Models for Authentic Virtual Try-on in the Wild, StoryDiffusion, Invisible Stitch, and more.
Welcome to this week's edition of Data Phoenix Digest!
Be active in our community and join our Slack to discuss the latest news, community events, research papers, articles, jobs, and more.
What can good data do for you? - Twilio Segment
Great customer experiences require better data. With customer profiles that update in real time and best-in-class privacy features, Segment's Customer Data Platform lets you make good data available to every team.
Deliver personalized experiences at scale. Twilio Segment helps 25,000 companies power their most important objectives with data they can trust.
Data Phoenix's upcoming webinar:
The challenge with financial agents successfully completing complex workflows like tabular reasoning or sentiment analysis often comes down to the reliability of executing numerous chained tasks together. Establishing the necessary p99 reliability has to happen at the model level, yet most finance domain-specific LLMs rely only on pre-training (BloombergGPT) or on supervised fine-tuning (FinBERT).
This presentation reveals how we transformed an open-source model into Albatross, capable of performing at the top of the leaderboard on chat as well as domain-specific tasks. Our journey involved an intensive data pipeline and training regimen, incorporating a combination of continual pre-training, fine-tuning, and preference optimization, to customize the model for the intricacies of financial tasks. We'll share our insights on overcoming the execution hurdle, which is often the downfall of AI projects in specialized domains.
Key Highlights of the Webinar:
- Building Domain-Specific Models: Explore how to evolve an open-source model into a leading domain-specific model like Albatross - capable of excelling in both general and domain-specific tasks.
- Model Transformation Techniques: Learn about the intensive data pipeline and training regimen that included continual pre-training, fine-tuning, and preference optimization.
- Customization for Financial Tasks: Understand the specific strategies used to tailor Albatross for financial tasks, addressing the unique intricacies of this field.
- Importance of Performance Metrics: Gain insight into why establishing high-performance benchmarks (like p99s) at the model level is crucial for success in finance-specific applications, where current financial LLMs often focus only on pre-training or supervised fine-tuning.
Explore recordings of all our past webinars to deepen your AI knowledge and enhance your learning journey:
- Evaluating LLM Models for Production Systems: Methods and Practices
- Exploring Infrastructure Management for GenAI Beyond Kubernetes
- Democratizing AI Deployment
- GPT on a Leash: Evaluating LLM-based Apps & Mitigating Their Risks
- Building Customized CV Applications with FiftyOne Plugins
- Large Language Models for Program Synthesis
- Lessons Learned from Building a Managed RAG Solution
- and more
200+ AI Models. One API. 24/7 AI Solution
AI/ML API specializes in delivering a comprehensive suite of AI models, including predictive analytics, natural language processing, and image recognition, among others. Ideal for developers, tech startups, and innovation labs, this tool simplifies the integration of AI technologies into applications, enhancing functionalities and driving forward the boundaries of what's possible.
ARTICLES, TUTORIALS, and LECTURES
Customizing Large Language Models
In this step-by-step article, the author explains how to use the Modelfile in Ollama to change how an existing LLM (Llama2) behaves when interacting with it. He also shows how to save newly customized models to a personal namespace on the Ollama server.
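As a rough illustration of the approach the article walks through (the exact directives it uses may differ), a minimal Ollama Modelfile that customizes an existing model could look like this:

```
# Build on top of the existing llama2 model
FROM llama2

# Lower the temperature for more deterministic answers
PARAMETER temperature 0.3

# Give the model a persistent persona via a system prompt
SYSTEM You are a concise assistant that always answers in plain English.
```

Running `ollama create my-llama2 -f Modelfile` builds the customized model locally, and `ollama push` can then publish it to a personal namespace on the Ollama server.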
Stanford Seminar: Transformers United
In this Stanford seminar, the lecturers examine the details of how transformers work and dive deep into the different kinds of transformers and how they are applied in different fields. The seminar combines instructor lectures, guest lectures, and classroom discussions.
Efficiently Finetune Llama 3 with PyTorch FSDP and Q-Lora
Unlocking the potential of LLMs often involves fine-tuning them on custom data. Fine-tuning smaller LLMs can be done on a single GPU by using Q-Lora. But efficiently fine-tuning bigger models like Llama 3 70b or Mixtral is a challenge. See how it can be done!
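The article's actual setup relies on PyTorch FSDP plus Q-Lora on GPUs. As a dependency-free sketch of the core LoRA idea it builds on (all names below are illustrative, not the article's code), a low-rank adapter adds a small trainable update B·A on top of a frozen weight matrix, so only a tiny fraction of parameters are trained:

```python
import random

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha/r) * B @ A.

    Illustrative sketch only; real fine-tuning of Llama 3 70b would use
    quantized weights (Q-Lora) sharded across GPUs with FSDP.
    """
    def __init__(self, W, r=2, alpha=4):
        d_out, d_in = len(W), len(W[0])
        self.W = W                  # frozen base weights, never updated
        self.scale = alpha / r
        # Standard LoRA init: A is small random, B is zero, so the adapter
        # contributes nothing until training updates it.
        self.A = [[random.gauss(0, 0.02) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.W, x)
        delta = matvec(self.B, matvec(self.A, x))  # low-rank correction
        return [b + self.scale * d for b, d in zip(base, delta)]

W = [[1.0, 2.0], [3.0, 4.0]]
layer = LoRALinear(W)
print(layer.forward([1.0, 1.0]))  # equals W @ x at init, since B is zero
```

Because the base weights stay frozen, only A and B (rank r, here 2) receive gradients, which is what makes single-GPU fine-tuning of smaller models feasible.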
DragonCrawl: Generative AI for High-Quality Mobile Testing
DragonCrawl is a system that uses LLMs to execute mobile tests with the intuition of a human. It decides what actions to take based on the screen it sees and independently adapts to UI changes. Learn more about it in this article!
Building DoorDash’s Product Knowledge Graph with Large Language Models
Building an in-house attribute extraction/tagging model requires a significant amount of labeled training data. But LLMs can perform NLP with reasonable accuracy without requiring many labeled examples. See how this can be used to build a product knowledge graph!
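As a hedged sketch of the few-shot idea (not DoorDash's actual pipeline; the prompt, schema, and product names here are invented for illustration), a single worked example in the prompt can steer an LLM toward structured attribute extraction:

```python
import json

# One worked example steers the model toward the desired JSON schema;
# in practice the prompt would be sent to an LLM API and the reply parsed.
FEW_SHOT = """Extract brand and flavor from the product name as JSON.

Product: "Ben & Jerry's Cherry Garcia Ice Cream 16oz"
JSON: {"brand": "Ben & Jerry's", "flavor": "Cherry Garcia"}

Product: "%s"
JSON:"""

def build_prompt(product_name):
    return FEW_SHOT % product_name

prompt = build_prompt("Haagen-Dazs Vanilla Bean 14oz")

# A well-behaved model should reply with something parseable, e.g.:
reply = '{"brand": "Haagen-Dazs", "flavor": "Vanilla Bean"}'
attrs = json.loads(reply)
print(attrs["brand"])  # → Haagen-Dazs
```

Parsed attributes like these can then be used as nodes and edges in a product knowledge graph without hand-labeling thousands of training examples.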
PAPERS & PROJECTS
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation
StoryDiffusion is a novel framework that helps maintain consistent content across a series of generated images. It uses Consistent Self-Attention, a new way of computing self-attention, and Semantic Motion Predictor, a novel module for temporal motion prediction in semantic space, to tell a text-based story with consistent images or videos. Check it out!
Improving Diffusion Models for Authentic Virtual Try-on in the Wild
IDM-VTON is an image-based virtual try-on model that renders an image of a person wearing a chosen garment, given a pair of images depicting the person and the garment, respectively. IDM-VTON uses two different modules to encode the semantics of the garment image. Learn more about them and the real-world testing results!
Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting
In this paper, the authors make two fundamental contributions to the field of 3D scene generation: they show that lifting images to 3D with a monocular depth estimation model is suboptimal, and they introduce a novel depth completion model, trained via teacher distillation and self-training, to learn the 3D fusion process. Explore their method in more detail!