Data Phoenix Digest - ISSUE 47

Kubeflow MLOps, a comprehensive list of strategies for feature selection, introducing PyScript, open pre-trained transformer language models, DeepNorm, NeurMiPs, Stanford CS224N NLP with Deep Learning, LAION-5B dataset, and more.

Dmitry Spodarets


Testing Feature Logic, Transformations, and Feature Pipelines with PyTest
In this article, you will learn how to design, build, and run tests for features using PyTest: unit tests for feature logic and transformation functions, and end-to-end tests for feature pipelines.
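A minimal sketch of what such a feature test can look like; the transformation function `scale_to_unit_range` and its expected behavior are illustrative assumptions, not taken from the article:

```python
def scale_to_unit_range(values):
    """Min-max scale a list of numbers into [0, 1]."""
    lo, hi = min(values), max(values)
    if lo == hi:  # constant feature: avoid division by zero
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

# PyTest discovers plain test_* functions with bare asserts:
def test_scales_into_unit_range():
    result = scale_to_unit_range([10, 20, 30])
    assert min(result) == 0.0 and max(result) == 1.0

def test_constant_feature_is_handled():
    assert scale_to_unit_range([5, 5, 5]) == [0.0, 0.0, 0.0]
```

Running `pytest` on a file containing these functions executes both tests; edge cases such as the constant feature are exactly what unit tests for feature logic should pin down.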

Tackling Multiple Tasks with a Single Visual Language Model
The DeepMind team introduces Flamingo, a single visual language model that can tackle difficult problems with a handful of task-specific examples, without any additional training required.

Kubeflow MLOps: Automatic Pipeline Deployment with CI/CD/CT
Learn how to create an advanced Kubeflow pipeline, and automate its deployments and updates with continuous integration, deployment, and training.

Feature Selection: A Comprehensive List of Strategies
In this article, you will find a checklist of strategies that can be applied to help you decide which features to keep and which to cut. A dozen ways to tackle feature selection!
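One of the simplest strategies on such a checklist is a correlation filter: score each feature by its correlation with the target and keep the top k. The sketch below is a stdlib-only illustration with hypothetical toy data, not code from the article:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def select_top_k(features, target, k):
    """Keep the k feature names most correlated (in absolute value) with the target."""
    ranked = sorted(features,
                    key=lambda name: abs(pearson(features[name], target)),
                    reverse=True)
    return ranked[:k]

# Hypothetical toy data: 'a' tracks the target perfectly, 'noise' does not.
features = {"a": [1, 2, 3, 4], "noise": [7, 1, 5, 2]}
target = [2, 4, 6, 8]
print(select_top_k(features, target, 1))  # ['a']
```

In practice you would reach for `sklearn.feature_selection`, but the idea is the same: rank features by a relevance score and drop the tail.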

Lasso and Ridge Regression: An Intuitive Comparison
Linear regression is one of the simplest machine learning algorithms. In this post, you will explore Ridge and Lasso regression in more detail. Enjoy!
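The intuitive difference between the two penalties shows up in their per-weight update rules. This sketch (my illustration, not from the post) contrasts the L2 shrinkage used by Ridge with the L1 soft-thresholding used by Lasso:

```python
def ridge_shrink(w, lam):
    """L2 penalty shrinks every weight proportionally; weights never reach exactly zero."""
    return w / (1.0 + lam)

def lasso_soft_threshold(w, lam):
    """L1 penalty subtracts a constant; small weights are driven to exactly zero."""
    if abs(w) <= lam:
        return 0.0
    return w - lam if w > 0 else w + lam

weights = [3.0, 0.4, -0.1]
print([round(ridge_shrink(w, 0.5), 3) for w in weights])    # [2.0, 0.267, -0.067]
print([lasso_soft_threshold(w, 0.5) for w in weights])      # [2.5, 0.0, 0.0]
```

This is why Lasso performs implicit feature selection (sparse solutions) while Ridge merely dampens all coefficients.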

PyScript 101 [Introduction]
In this article, you’ll learn the basics of PyScript, a new framework from Anaconda that lets users run Python and build rich applications in the browser using HTML tags.


Panini-Net: GAN Prior Based Degradation-Aware Feature Interpolation for Face Restoration
Panini-Net is a novel degradation-aware feature interpolation network for face restoration that explicitly learns abstract representations to distinguish various degradations.

DeepNet: Scaling Transformers to 1,000 Layers
The authors introduce a method for stabilizing very deep transformers. A new normalization function, DeepNorm, modifies the residual connection in transformers and is paired with a theoretically derived initialization.
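The core idea can be sketched in a few lines: DeepNorm replaces the standard post-norm residual LN(x + f(x)) with LN(α·x + f(x)), up-weighting the residual branch. The stdlib-only sketch below assumes a decoder-only setting where α = (2N)^(1/4) for N layers (check the paper for the exact constants per architecture); the sublayer is a stand-in:

```python
from math import sqrt

def layer_norm(xs, eps=1e-5):
    """Plain layer normalization over a vector (no learned scale/shift)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return [(x - mean) / sqrt(var + eps) for x in xs]

def deepnorm_residual(x, sublayer, alpha):
    """DeepNorm residual: LN(alpha * x + f(x)) instead of LN(x + f(x))."""
    fx = sublayer(x)
    return layer_norm([alpha * xi + fi for xi, fi in zip(x, fx)])

num_layers = 1000
alpha = (2 * num_layers) ** 0.25  # grows with depth, taming gradient updates
# Toy sublayer: a simple elementwise map standing in for attention/FFN.
out = deepnorm_residual([1.0, -2.0, 0.5], lambda v: [0.1 * t for t in v], alpha)
```

Scaling α with depth is what lets the model updates stay bounded even at 1,000 layers.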

NeurMiPs: Neural Mixture of Planar Experts for View Synthesis
NeurMiPs is a novel planar-based scene representation for modeling geometry and appearance that uses a collection of local planar experts in 3D space as the scene representation.

Focal Sparse Convolutional Networks for 3D Object Detection
In this paper, you will find two new modules that enhance the capability of sparse CNNs, both based on making feature sparsity learnable with position-wise importance prediction.

OPT: Open Pre-trained Transformer Language Models
Open Pre-trained Transformers (OPT) is a suite of decoder-only pre-trained transformers ranging from 125M to 175B parameters. OPTs are easily shareable. Check them out!


Stanford CS224N NLP with Deep Learning | Winter 2021
This course from Stanford University covers the basics of NLP with Deep Learning, from word vectors and neural classifiers to BERT and machine translation. 20 lectures in total.


LAION-5B
LAION-5B is a dataset of 5.85 billion CLIP-filtered image-text pairs, 14x bigger than LAION-400M, previously the biggest openly accessible image-text dataset in the world. Check it out!