Data Phoenix Digest

Data Phoenix Digest - 29.07.2021

AI in motorsport, racing, and surfing; ML from research to production – challenges, best practices, and tools; underfitting and overfitting in deep learning, deep automatic natural image matting, courses, competitions, jobs, and more...

by Dmitry Spodarets

Updated July 29, 2021

NEWS

What's new this week?

AI in motorsport, racing, and surfing. AI that finds relationships between people, skills, and projects. Fraud detection with Machine Learning. A new look at AI ethics, and AI that fights aging.

AI is applied in every industry nowadays, even in sports. From motorsport and racing to surfing and curling, sports teams are working hard to glean actionable insights from data. This is the case with:

Envision Virgin Racing, which uses data science to hone performance on race day. AI analyzes data and finds trends that are impossible for the human brain to detect.
The Olympics' surfing teams: AI generates artificial waves and helps track sleep and other vitals so that coaches can fine-tune athletes' training and recovery.
Drone racing: New control algorithms developed at the University of Zurich calculate the optimal route at every point in the flight, rather than section by section, and have beaten experienced human pilots for the first time.

In the meantime, NASA employs AI to build a talent mapping database using Neo4j technology that shows the relationships between people, skills, and projects, to search for talent and job opportunities. The University of Surrey has built an AI model that identifies chemical compounds that promote healthy aging, paving the way towards pharmaceutical innovations that extend a person's lifespan.

Another large area of research is anomaly detection. American Express has been experimenting with AI-generated fake fraud patterns to sharpen its models' ability to detect rare or uncommon swindles, and thus to catch fraud more accurately.

And finally, AI ethics. To get it right, organizations should consider these six strategies: 1) remove the fear of not getting it right away, 2) tailor your message to your audience, 3) tie your efforts to your company purpose, 4) define what ethics means in an operational way, 5) lean on trusted and influential individuals, 6) never stop educating.

SPONSOR

Build your first ML model in three weeks with the online intensive "Machine Learning. Introduction to Regression Analysis".

You will learn to apply regression, the k-nearest neighbors method, and neural networks to train algorithms that make predictions from the data they receive.

Join the course starting 09.08. Details here.

Reach Data Phoenix Digest readers by sponsoring an issue.
Click here for details.

ARTICLES

Reducing the Computational Cost of Deep Reinforcement Learning Research
In this exploratory research article by Google AI, you'll learn about the team's efforts to reproduce the findings of the Rainbow paper and uncover new and interesting phenomena. The results of Google's experiments are quite interesting.

High Fidelity Image Generation Using Diffusion Models
Join Google AI's team as they push the performance of diffusion models to state-of-the-art on super-resolution and class-conditional ImageNet generation benchmarks. Learn the results of their tests and how they are testing the limits of diffusion models for generative modeling problems.

Speeding Up Reinforcement Learning with a New Physics Simulation Engine
In this article, the researchers at Google AI invite everyone to check the results of their work and learn how to perform a more qualitative measure of Brax's physics fidelity by training their own policies in the Brax Training Colab.

Accurate Prediction of Protein Structures and Interactions Using a Three-Track Neural Network
In this research article, the team explores network architectures with a three-track network. It produces accurate structure predictions, solves the challenges of X-ray crystallography and cryo-EM structure modeling, and provides insights into the functions of proteins of unknown structures.

TonY joins LF AI & Data Foundation
TonY has become part of the LF AI & Data Foundation that supports open source innovation in AI/ML/DL. TonY enables AI engineers to more easily train distributed DL models on Hadoop. A peer LF AI & Data project, Horovod, is now supported in TonY too.

How to Speed Up Python Data Pipelines Up to 91X?
In this tutorial, you'll learn about efficient ways of setting up and managing your Python data pipelines to handle big data faster and on a larger scale. The author claims you can accelerate your data pipelines by up to 91x. Read on to check for yourself whether that's even possible!

Underfitting and Overfitting in Deep Learning
In this article by Artem Oppermann, you'll explore the challenges of and the solutions to underfitting and overfitting in deep learning. Follow Artem step-by-step to learn how to ensure that your DL models are designed to hit the bull's eye from step one.
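
The trade-off is easy to see in a model far simpler than a deep network. The sketch below is not taken from Artem's article: it swaps in k-nearest-neighbors regression on synthetic data as an illustrative assumption. The function names, the sine-plus-noise data, and the even/odd split are all hypothetical choices. A very small k memorizes the training points (overfitting), while a k equal to the training set size predicts the same average everywhere (underfitting).

```python
import math


def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mse(train, data, k):
    """Mean squared error of k-NN predictions on `data`."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in data) / len(data)


# Synthetic data (an assumption for illustration): a sine curve plus a
# fixed, repeating noise pattern, so the run is fully deterministic.
noise = [0.3, -0.3, -0.3, 0.3]
points = [(i / 10, math.sin(i / 10 * math.pi) + noise[i % 4]) for i in range(40)]
train = points[0::2]       # even-indexed points are used for fitting
validation = points[1::2]  # odd-indexed points are held out

# k=1 reproduces every training target; k=len(train) ignores x entirely.
for k in (1, 3, len(train)):
    print(f"k={k:2d}  train MSE={mse(train, train, k):.3f}  "
          f"validation MSE={mse(train, validation, k):.3f}")
```

Comparing the training and validation columns is the point: a training error of zero alongside a larger validation error signals overfitting, while high error on both signals underfitting.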

In-depth Guide to ML Model Debugging and Tools You Need to Know
ML systems are trickier to test than traditional software. In this guide, you'll learn some debugging strategies for ML models and the tools to implement them. Model interpretability will also be discussed, showing how to trace the path of errors from the input to the output.

ML from Research to Production – Challenges, Best Practices and Tools
In this guide by Neptune AI, you'll learn how to get your ML models to production. You'll touch upon common challenges like monitoring and bias, explore step-by-step how to get ready to model productionization, and look into other details about your models in and out of production.

Experiment Tracking vs Machine Learning Model Management vs MLOps
In this article, you'll learn about experiment tracking, machine learning model management, and MLOps: three areas you need to master to safely take a machine learning model from idea to production.

PAPERS

DiSECt: A Differentiable Simulation Engine for Autonomous Robotic Cutting
In this paper by Eric Heiden et al., you'll learn about DiSECt, the first differentiable simulator for cutting soft materials. Through various experiments, the team evaluates the performance of the simulator, demonstrating the potential for optimization in robotic cutting of soft materials.

Towards Real-World Blind Face Restoration with Generative Facial Prior
Xintao Wang et al. present GFP-GAN, which leverages a pretrained face GAN as a prior for blind face restoration. The prior is incorporated into the face restoration process via novel channel-split spatial feature transform layers to achieve a good balance of realness and fidelity.

A Modular U-Net for Automated Segmentation of X-Ray Tomography Images in Composite Materials
In this paper, the authors propose a modular interpretation of U-Net (Modular U-Net) to segment 3D tomography images and to automate XCT data processing pipelines without human intervention.

Per-Pixel Classification Is Not All You Need for Semantic Segmentation
Mask classification is sufficiently general to solve both semantic- and instance-level segmentation tasks using the exact same model, loss, and training procedure. Your thoughts?

Deep Automatic Natural Image Matting
The paper investigates the challenges of dealing with natural images in AIM and proposes a novel end-to-end matting network to solve the problem. Have a look at their test set.

DATASETS

TextOCR
TextOCR is a dataset for benchmarking text recognition on arbitrarily shaped scene text. It features 1M high-quality word annotations on TextVQA images and allows data scientists to more easily apply end-to-end reasoning to downstream tasks, such as visual question answering or image captioning.

Project CodeNet
Project CodeNet provides the AI-for-Code research community with a large-scale, diverse, and high-quality curated dataset to drive innovation in AI techniques. Check out their GitHub page to find the dataset you need.

COURSES

PyTorch Fundamentals
In this beginner-friendly course by Microsoft, you'll learn the fundamentals of deep learning with PyTorch. The covered use cases include speech, vision, and natural language processing.

JOBS

Machine Learning Engineer, Shelf
Data Scientist, Shelf
Experienced ML Engineer, Lun
Machine Learning Architect, SoftServe
Lead MLOps Engineer, SoftServe
Product Data Scientist, Snap
Senior Machine Learning Engineer, Sigma Software
Senior Machine Learning Scientist, Upwork
Product Manager, Machine Learning, Grammarly
Deep Learning Research Engineer (RnD), Reface
Computer Vision Research Engineer, SQUAD

Looking to feature your open positions in the digest? Kindly reach out to us at [email protected] for details. We'll be proud to help your business thrive!
