Data Phoenix Digest

Data Phoenix Digest - ISSUE 58

Webinar "dstack – a command-line utility to provision infrastructure for ML workflows", a unified benchmark for mathematical reasoning, fine-tuning language models via epistemic neural networks, how Uber optimizes the timing of push notifications using ML, news, and more.

by Dmitry Spodarets

Updated November 11, 2022

Data Phoenix Events

The Data Phoenix team invites you to our upcoming "AI Project Spotlight" charity webinar, which will take place on November 30, 2022 at 16:00 CET.

Topic: “dstack – a command-line utility to provision infrastructure for ML workflows”.
Speaker: Andrey Cheptsov, CEO and Founder of dstack.ai
Language: English
Participation: free (but you’ll be required to register)
Karma perk: donate to our charity initiative

ABOUT THE SPEAKER AND TOPIC

Andrey is the creator of dstack. He is passionate about open-source and developer tools for AI. Previously, Andrey worked at JetBrains with the PyCharm team.

"Unlike traditional development workflows, ML workflows are difficult to run on a local machine (due to the lack of memory, more CPUs/GPUs, etc).

Imagine, if you could run your ML workflows the very same way as you do it locally, but they would run in the cloud. And you wouldn’t need to worry about provisioning infrastructure, setting up the environment, etc.

We are delighted to introduce dstack, an open-source tool that is designed to do exactly that."

NEWS

PAPERS

DiffusionDB: A Large-scale Prompt Gallery Dataset for Text-to-Image Generative Models

DiffusionDB is the first large-scale text-to-image prompt dataset. It contains 2 million images generated by Stable Diffusion using prompts and hyperparameters specified by real users.

eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers

eDiff-I is an ensemble of diffusion models that helps improve text alignment while maintaining the same inference computation cost and preserving high visual quality. It outperforms previous large-scale text-to-image diffusion models on the standard benchmark.

Musika! Fast Infinite Waveform Music Generation

The new Musika music generation system can be trained on hundreds of hours of music using a single consumer GPU, and it generates arbitrary-length music much faster than real time on a consumer CPU.

Crosslingual Generalization through Multitask Finetuning
Multitask prompted finetuning (MTF) has been shown to help large language models generalize to new tasks in a zero-shot setting. In this paper, the authors go beyond English and apply MTF to the pretrained multilingual BLOOM and mT5 model families, producing finetuned variants called BLOOMZ and mT0 that generalize to new tasks across languages.

Estimating the Carbon Footprint of BLOOM, a 176B Parameter Language Model
Machine learning is increasingly resource-intensive, which comes with a cost to the environment. In this paper, the authors try to quantify the carbon footprint of BLOOM, a 176-billion parameter language model, across its life cycle.

Fine-Tuning Language Models via Epistemic Neural Networks
Large language models are extensively used in machine learning. In this paper, the authors show how you can augment these models with an epinet: a small additional network architecture that helps to estimate model uncertainty and form an epistemic neural network (ENN).

Lila: A Unified Benchmark for Mathematical Reasoning
Lila is a unified mathematical reasoning benchmark consisting of 23 diverse tasks along four dimensions: 1) mathematical abilities; 2) language format; 3) language diversity; 4) external knowledge. All of these skills are essential for general-purpose intelligent systems.

Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models
Spatially Sparse Inference (SSI) is a general-purpose technique that selectively performs computation only for edited regions and accelerates various generative models, including conditional GANs and diffusion models. Skipping the unedited regions can save a significant amount of computation.

ARTICLES

How Uber Optimizes the Timing of Push Notifications using ML and Linear Programming
Uber needed a comprehensive approach to handling push notifications. They introduced the Consumer Communication Gateway (CCG): a centralized intelligence layer to manage the quality, ranking, timing, and frequency of push notifications on a user level.

Run Text Generation with GPT and Bloom Models on Amazon SageMaker JumpStart
In this article, the AWS team provides a walkthrough on how to deploy pre-trained text generation models, via JumpStart’s UI in Amazon SageMaker Studio and programmatically through JumpStart APIs, to demonstrate how you can obtain the same result in different modes.

Meet Julia: The Future of Data Science
Julia is considered by many to be “the future programming language of data science,” one that could replace Python and R. Is that so? In this article, we look into what Julia is, where it is used, and whether it is worth learning for data science.

Temporal Fusion Transformer: Time Series Forecasting with Deep Learning — Complete Tutorial
According to the author, Temporal Fusion Transformer (TFT) outperforms other prominent DL models for time series forecasting. The article briefly explains what is new in TFT and builds an end-to-end project on energy demand forecasting.

Modeling Starbucks Waiting Time Using Markov Chains, with Python
In this article, the author takes a creative dive into a probabilistic problem: figuring out how long he will have to wait for his next cup of coffee. He explains how to build and use a time-dependent Markov chain to get accurate predictions.

We hope that you liked the digest. Kindly help us make Data Phoenix a better place for all readers. Please take part in our survey; it won’t take more than a few minutes!
