Data Phoenix Digest - ISSUE 44

Kubeflow pipeline from scratch, financial text classification using FinBERT, a single model for many visual modalities, a gentle introduction to supervised learning, GAN-based facial editing of real videos, NN-SVG, GreaseLM, Data2vec, videos, and more ...

Dmitry Spodarets


Tutorial — Basic Kubeflow Pipeline From Scratch
In this step-by-step guide, we will go through everything needed to build a functioning Kubeflow pipeline. Get ready for a bit of practice!

Detect NLP Data Drift Using Custom Amazon SageMaker Model Monitor
In this article, you’ll learn about the types of data quality drift in NLP data and also explore a new approach to detecting data drift using Amazon SageMaker Model Monitor.

The First High-Performance Self-Supervised Algorithm that Works for Speech, Vision, and Text
Data2vec is Meta AI’s first high-performance self-supervised algorithm that works for multiple modalities, ranging from speech to images and text. Dig in for details!

Metaprogramming in Julia: A Full Overview
Metaprogramming means writing Julia code that processes and modifies other Julia code. Interested? Check out the full article for details!

Financial Text Classification With Deep Learning Using FinBERT
In this article, you’ll look at one way to apply the pre-trained FinBERT model to financial text classification tasks. A concise but interesting read!


Omnivore: A Single Model for Many Visual Modalities
Omnivore is a universal model that excels at classifying images, videos, and single-view 3D data using exactly the same model parameters.

GreaseLM: Graph REASoning Enhanced Language Models for Question Answering
GreaseLM is a new model that fuses encoded representations from pretrained LMs and graph neural networks over multiple layers of modality interaction operations.

Stitch it in Time: GAN-Based Facial Editing of Real Videos
In this paper, the authors propose a framework for semantic editing of faces in videos, demonstrating significant improvements over the current state-of-the-art models.


Machine Learning Simplified: A Gentle Introduction to Supervised Learning
Looking to develop a strong intuition for the inner workings of ML? Check out this amazing book by Andrew Wolf and learn the full scope of supervised ML.


Tackling the Generative Learning Trilemma with Denoising Diffusion GANs
Learn how the team came up with a way to model the denoising distribution using a complex multimodal distribution, to enable denoising with large steps.

Stitch it in Time: GAN-Based Facial Editing of Real Videos
In this project, the authors propose a method that can apply semantic manipulations to real facial videos without requiring any temporal components. Check out the examples to see how it works!


NN-SVG
NN-SVG is a tool for creating Neural Network architecture drawings parametrically rather than manually. It can also be used to export the drawings to Scalable Vector Graphics (SVG) files.


The Unreasonable Effectiveness of JPEG: A Signal Processing Approach
In this video, you’ll learn about the core parts of the JPEG algorithm, specifically color spaces, YCbCr, chroma subsampling, the discrete cosine transform, quantization, and lossless encoding.
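
To give a rough feel for the stages named above, here is a minimal Python sketch of the color conversion, chroma subsampling, discrete cosine transform, and quantization steps. It is not taken from the video: the function names are made up for illustration, the conversion uses the common full-range JFIF coefficients, a single uniform quantization step stands in for a real quantization table, and the level shift and the final lossless (entropy) encoding stage are left out.

```python
import math

def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF-style coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def subsample_420(plane):
    """4:2:0-style subsampling: average each 2x2 block of a chroma plane.

    Assumes `plane` is a list of equal-length rows with even dimensions.
    """
    return [
        [(plane[i][j] + plane[i][j + 1] + plane[i + 1][j] + plane[i + 1][j + 1]) / 4
         for j in range(0, len(plane[0]), 2)]
        for i in range(0, len(plane), 2)
    ]

def dct_1d(values):
    """Orthonormal DCT-II of a sequence of numbers."""
    n = len(values)
    out = []
    for k in range(n):
        s = sum(v * math.cos(math.pi * (i + 0.5) * k / n) for i, v in enumerate(values))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

def dct_2d(block):
    """2D DCT: apply the 1D transform to every row, then to every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def quantize(coeffs, step):
    """Uniform quantization (a simplification of JPEG's per-frequency table)."""
    return [[round(c / step) for c in row] for row in coeffs]
```

For example, a flat 8x8 block keeps only its top-left (DC) coefficient after `quantize(dct_2d(block), 16)`. Real image blocks behave similarly in that most high-frequency coefficients round to zero, which is where the lossy compression comes from.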