Data Phoenix Digest - ISSUE 54

Charity webinar "The promising role of synthetic data to enable responsible innovation", YOLOV7 object counter logic, AutoML for object detection, neural density-distance fields, retrieval-augmented diffusion models, MinVIS, CLIFF, OFA, news, courses, tools, and more.

Dmitry Spodarets

The Data Phoenix Events team invites you to our "The A-Z of Data" charity AI webinar on October 19. The topic is "The promising role of synthetic data to enable responsible innovation".

Good-quality FAIR data is fundamental for enhancing data reuse. When we discuss data quality in the FAIR context, we often focus on metadata-level quality attributes, like accessibility and reuse conditions, rather than semantic ones, like imbalances, outliers, and duplicates. In practice, ensuring data quality at both the metadata and semantic levels is crucial but also challenging. One solution to this challenge is synthetic data. MIT Technology Review names synthetic data as one of the ten tech breakthroughs of 2022, citing it as a solution for training AI models when the available data is low-quality, incomplete, or biased. Synthetic data improves data quality and helps accelerate AI projects, enabling responsible innovation. In this webinar, the co-founder of a synthetic data company will share how it works in practice, how to check data quality at scale using open-source libraries, and which metrics are needed to measure the quality of the resulting synthetic data.



YOLOV7 Object Counter Logic
YOLO (“You Only Look Once”) is an effective real-time object detection algorithm. YOLOv7 is the newest YOLO model and, at the time of writing, the fastest. Don’t hesitate to check it out!

Automated Model Deployment with BentoML and Kubeflow
In this article, you’ll find a proof-of-concept using BentoML and Yatai, explaining how to automate model deployment and model retraining in your own setup. Enjoy!

How to Create Synthetic Dataset for Computer Vision (Object Detection)
Training any object detection model requires a ready-to-use dataset with annotated images featuring the objects of interest. Preparing such a dataset is tiresome, but there is a workaround.

SKU110K-DenseDet: A Machine Learning Model That Can Detect Products in a Store
SKU110K-DenseDet is a machine learning model that can detect the bounding boxes of products on a supermarket shelf without categorization. Only the presence of a product is detected.

AutoML for Object Detection: How to Train a Model to Identify Potholes
Initial algorithm selection and hyperparameter optimization can be tricky. Thankfully, AutoML can help. Learn about using the new Azure AutoML feature for object detection.

Training Larger-Than-Memory PyTorch Models Using Gradient Checkpointing
In this article, you’ll learn about gradient checkpointing, a model training technique that works by recomputing the intermediate values of a deep neural net at backward time.
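
As a rough illustration of the idea (not taken from the linked article), the sketch below compares a backward pass that keeps every intermediate value with one that keeps only the input of each segment and recomputes the rest when the gradient is needed. The scalar tanh "layers" and the names `grad_full` and `grad_checkpointed` are hypothetical and chosen only to keep the example in plain Python.

```python
import math


def layer(w, x):
    """One toy layer: a scalar tanh unit."""
    return math.tanh(w * x)


def layer_grad(w, x):
    """Derivative of tanh(w * x) with respect to x."""
    return w * (1.0 - math.tanh(w * x) ** 2)


def grad_full(weights, x0):
    """Baseline: store the input of every layer for the backward pass."""
    acts = [x0]
    for w in weights:
        acts.append(layer(w, acts[-1]))
    grad = 1.0
    # Chain rule, walking the layers from last to first.
    for w, x in zip(reversed(weights), reversed(acts[:-1])):
        grad *= layer_grad(w, x)
    stored = len(acts) - 1  # one stored input per layer
    return acts[-1], grad, stored


def grad_checkpointed(weights, x0, segment=2):
    """Store only the input of each segment; recompute the rest on demand."""
    checkpoints = {}
    x = x0
    for i, w in enumerate(weights):
        if i % segment == 0:
            checkpoints[i] = x  # keep this value, drop the others
        x = layer(w, x)
    out = x

    grad = 1.0
    for start in sorted(checkpoints, reverse=True):
        seg = weights[start:start + segment]
        # Recompute the layer inputs inside this segment from its checkpoint.
        acts = [checkpoints[start]]
        for w in seg[:-1]:
            acts.append(layer(w, acts[-1]))
        for w, x in zip(reversed(seg), reversed(acts)):
            grad *= layer_grad(w, x)
    return out, grad, len(checkpoints)


if __name__ == "__main__":
    ws = [0.5, -1.2, 0.8, 1.5, -0.3, 0.9]
    print(grad_full(ws, 0.7))
    print(grad_checkpointed(ws, 0.7, segment=3))
```

Both functions return the same output and gradient. The checkpointed version stores fewer values from the forward pass and pays for that with extra forward computation during the backward pass.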

MLOps Foundation Roadmap for Enterprises with Amazon SageMaker
As enterprise businesses embrace machine learning, manual workflows become bottlenecks to innovation. This article explains how to automate and streamline the end-to-end ML lifecycle.


Retrieval-Augmented Diffusion Models
Generative image synthesis with diffusion models has achieved excellent visual quality in tasks such as text-based or class-conditional image synthesis. This paper presents an alternative.

OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
The authors offer a unified paradigm for multimodal pretraining. They propose OFA, a Task-Agnostic and Modality-Agnostic framework that supports Task Comprehensiveness.

Expanding Language-Image Pretrained Models for General Video Recognition
The authors propose a cross-frame attention mechanism that exchanges information across frames. The module is lightweight and can be plugged into pretrained language-image models seamlessly.

CLIFF: Carrying Location Information in Full Frames into Human Pose and Shape Estimation
Top-down methods dominate the field of 3D human pose and shape estimation. The authors propose to Carry Location Information in Full Frames (CLIFF) into this task to solve its challenges.

MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training
MinVIS is a minimal video instance segmentation (VIS) framework that achieves state-of-the-art VIS performance with neither video-based architectures nor training procedures. Check it out!

Multi-scale Multi-band DenseNets for Audio Source Separation
This paper presents a novel network architecture that extends the recently developed densely connected convolutional network (DenseNet) for image classification tasks.

An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
This paper offers a method for generating specific concepts (e.g. personal objects or artistic styles) by describing them using "words" in the embedding space of pre-trained text-to-image models.

GAUDI: A Neural Architect for Immersive 3D Scene Generation
GAUDI is a generative model capable of capturing the distribution of complex and realistic 3D scenes that can be rendered immersively from a moving camera. Learn more!

Neural Density-Distance Fields
This paper proposes the Neural Density-Distance Field (NeDDF), a novel 3D representation that reciprocally constrains the distance and density fields, addressing several challenges of NeRFs.


Reproducible Deep Learning [PhD Course in Data Science]
In this practical course, you’ll start by building a simple DL model in a notebook and continue to make it reproducible through code versioning, data versioning, experiment logging, etc.

Art From Code [Workshop]
R is a mainstream language for data science and analytics. This workshop provides a hands-on introduction to generative art in R and to the techniques that generative artists use.


Silero Models
Silero Models is a collection of pre-trained enterprise-grade STT / TTS models and benchmarks. The project describes their quality as comparable to Google's STT, and they are refreshingly simple to use.

Metaseq
Metaseq is a codebase for working with Open Pre-trained Transformers (OPT). The OPT 125M-30B models are now available in Hugging Face Transformers. Check them out!