Data Phoenix Digest - ISSUE 40

Data Phoenix invites everyone to its Slack chat, the birth of Albumentations, a neural network from scratch, self-supervised learning from 100 million medical images, a general language assistant as a laboratory for alignment, NL-Augmenter, projects, jobs, and more ...

Dmitry Spodarets

The Data Phoenix team has great news to share! We want to stay as close to you as possible, so we created a Slack chat where we can talk 24/7. You can tell us what you expect to see on our social media and what you would like more or less of, and you can find friends who share your interests. Isn't it amazing? Please tap the button, and let's have some fun!



The Birth of Albumentations
Do you want to know how Albumentations, an open-source library for image augmentation, was born and evolved over time? Check out this comprehensive article, then!

Google Research: Themes from 2021 and Beyond
In this article, you’ll find major trends for AI/ML in 2021 and beyond. Every trend comes with related research and the directions and progress we’ll likely see in the next few years.

Neural Network From Scratch
Have you ever tried to explain how neural networks work in the simplest way possible? In this article, you’ll see how it can be done by building one from scratch.

Open-Sourcing a Monitoring GUI for Metaflow, Netflix’s ML Platform
Netflix has finally open-sourced a GUI for Metaflow that serves as a monitoring tool for data scientists. Learn more about how it can be used for practical tasks in this post.

Using Spellchecking to Improve Tesseract OCR Accuracy
In this tutorial, you’ll learn how to use textblob to improve OCR accuracy by automatically spellchecking OCR’d text. Check out a previous tutorial, too.

Why You Should Stop Predicting Prices If You Want to Stand a Chance of Predicting Prices
Does it make sense to predict stock prices? In this article, you’ll find a new approach to AI and to developing models that predict stock prices. It's an interesting read!


Self-Supervised Learning from 100 Million Medical Images
The researchers propose a method for self-supervised learning of rich image features based on contrastive learning and online feature clustering, leveraging 100,000,000 medical images.

Simulation Intelligence: Towards a New Generation of Scientific Methods
"Nine Motifs of Simulation Intelligence" is a roadmap for the development and integration of the essential algorithms necessary for a merger of scientific computing, scientific simulation, and AI.

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation
NL-Augmenter is a new participatory Python-based natural language augmentation framework that supports the creation of both transformations and filters for data.

A General Language Assistant as a Laboratory for Alignment
In this paper, Amanda Askell et al. explore methods of building a general-purpose, text-based assistant that is aligned with human values, meaning that it is helpful, honest, and harmless.

Omnizart: A General Toolbox for Automatic Music Transcription
Omnizart is a new Python library that provides a streamlined solution to automatic music transcription. It is the first transcription toolkit with models covering a wide class of instruments.


Fake It Till You Make It [Microsoft Face Analysis Project]
Learn how to combine a procedurally-generated parametric 3D face model with a comprehensive library of hand-crafted assets to render training images with unprecedented realism and diversity.

BANMo: Building Animatable 3D Neural Models from Many Casual Videos
BANMo reconstructs an animatable 3D model, including a canonical 3D shape, appearance, skinning weights, and time-varying articulations, without pre-defined shape templates.

EditGAN: High-Precision Semantic Image Editing
EditGAN is a novel method for high-quality, high-precision semantic image editing, allowing users to edit images by modifying their highly detailed part segmentation masks.


Looking to feature your open positions in the digest? Kindly reach out to us at [email protected] for details. We'll be proud to help your business thrive!