Welcome to this week's edition of Data Phoenix Digest! This newsletter keeps you up to date on news from our community and summarizes the top research papers, articles, and news to keep you on top of trends in the Data & AI world!
Be active in our community and join our Slack to discuss the latest community news, top research papers, articles, events, jobs, and more.
Data Phoenix community news
- Unlocking Data Value with Large Language Models / June 15
- Reducing NLP Inference Costs through Model Specialization / June 22
Video recordings of past events:
- Webinar "Why you should move to a Lakehouse"
- Webinar "Learning through machine learning: how we built a recommendation system from scratch"
Summary of the top articles, papers, and courses
Introducing an image-to-speech Generative AI application using Amazon SageMaker and Hugging Face
Describe for Me is a website that helps visually impaired users understand images through image captioning, facial recognition, and text-to-speech, a technology referred to as “Image to Speech.” This blog post walks you through the solution architecture behind “Describe For Me” and the design considerations of the solution.
How to Evaluate the Performance of Your ML/AI Models
This article explores how to evaluate your AI or ML model. Regardless of the type of model you have or your end application, you will learn how to assess and improve its performance, using the Wine dataset from sklearn, training a support vector classifier (SVC), and then testing the model’s metrics. Check it out!
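The evaluation workflow described above can be sketched in a few lines. This is a minimal illustration assuming scikit-learn is installed; the exact splits, kernel, and metrics used in the article may differ:

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report

# Load the Wine dataset (178 samples, 13 features, 3 classes)
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

# Scale features: SVC is sensitive to feature magnitudes
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Fit a support vector classifier and evaluate held-out metrics
clf = SVC(kernel="rbf", C=1.0).fit(X_train, y_train)
y_pred = clf.predict(X_test)
print(f"accuracy: {accuracy_score(y_test, y_pred):.3f}")
print(classification_report(y_test, y_pred))
```

The `classification_report` breaks accuracy down into per-class precision, recall, and F1, which is usually more informative than a single score.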
How to Build an End-To-End ML Pipeline
End-to-end machine learning pipelines help engineers save precious time and resources and let them focus more on deploying new models than on maintaining existing ones. In this article, you will learn how to quickly build and deploy an end-to-end ML pipeline with Kubeflow Pipelines on AWS.
Using Activation Functions in Deep Learning Models
In PyTorch, there are many activation functions available for use in your deep learning models. In this post, you will see how the choice of activation functions can impact the model. Take a deep dive into how activation functions work and how to use them.
Monitoring Your Time Series Model in Comet
This tutorial walks through how to use Comet to monitor a time-series forecasting model. The author explains how to carry out some EDA on the dataset and then log the visualizations to the Comet platform.
Papers & projects
Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold
This paper explores a new way of controlling GANs, DragGAN, which lets users "drag" any points of an image to target points in an interactive manner. It can deform an image with precise control over where pixels go, manipulating the pose, shape, expression, and layout of diverse categories.
Structural Pruning for Diffusion Models
Diff-Pruning is a compression method tailored for learning lightweight diffusion models from pre-existing ones, without extensive re-training. The essence of Diff-Pruning is encapsulated in a Taylor expansion over pruned timesteps, a process that disregards non-contributory diffusion steps and ensembles informative gradients to identify important weights.
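The general idea behind first-order Taylor pruning criteria, which the method above builds on, is to score a parameter by how much the loss would change if it were removed, approximately |w · ∂L/∂w|. This toy sketch is illustrative only and is not the authors' implementation:

```python
import torch

torch.manual_seed(0)

# Toy model: y = w0*x + w1*x^2, fit against a linear target
w = torch.tensor([1.5, 0.8], requires_grad=True)
x = torch.randn(100)
target = 2.0 * x
loss = ((w[0] * x + w[1] * x**2 - target) ** 2).mean()
loss.backward()

# First-order Taylor importance: removing w_i (setting it to zero)
# changes the loss by roughly |w_i * dL/dw_i|; low-scoring
# parameters are candidates for pruning
importance = (w * w.grad).abs()
print(importance)
```

Diff-Pruning's contribution is in *where* this expansion is taken: over pruned diffusion timesteps, so that non-contributory steps are ignored when aggregating gradients.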
BloombergGPT: A Large Language Model for Finance
BloombergGPT is a 50 billion parameter language model trained on a wide range of financial data. It is validated on standard LLM benchmarks, open financial benchmarks, and a suite of internal benchmarks that most accurately reflect the authors' intended usage.
HuaTuo: Tuning LLaMA Model with Chinese Medical Knowledge
HuaTuo is a LLaMA-based model that has been supervised fine-tuned on generated Q&A instances. The experimental results demonstrate that HuaTuo generates responses with more reliable medical knowledge.
Segment Anything Model (SAM): a new AI model from Meta AI that can "cut out" any object, in any image, with a single click
SAM is a promptable segmentation system with zero-shot generalization to unfamiliar objects and images, without the need for additional training. The model was trained on Meta AI’s SA-1B dataset for 3-5 days on 256 A100 GPUs. Make sure that you try it!
Reproducible Deep Learning: PhD Course in Data Science
Building a DL model is a complex task. The aim of this course is to start from a simple DL model implemented in a notebook, and port it to a ‘reproducible’ world by including code versioning, data versioning, experiment logging, hyper-parameter tuning, etc.
Responsible AI Course by Machine Learning University
This course is designed to introduce you to several dimensions of Responsible AI with a focus on fairness criteria and bias mitigation. In 30 short videos, you will learn about different fairness criteria, bias measurements, and bias mitigation techniques.