Data Phoenix Digest

NEWS

What's new this week?

Project OpenBytes and the NextArch Foundation. Azure OpenAI Service. No more facial recognition for Facebook. Self-driving farm robots and AI for drinking water.

The Linux Foundation announces Project OpenBytes and the NextArch Foundation. OpenBytes is an “open data community” and a new data standard and format for AI apps, while NextArch will build software development architectures that support a range of environments.
Microsoft launches the Azure OpenAI Service, making OpenAI’s ML models available on Azure. GPT-3, OpenAI’s groundbreaking language model that can produce human-like text with just a few prompts, is now up for grabs for Microsoft developers.
Amid concerns about individual privacy and surveillance, Facebook will shut down its facial recognition system and delete the facial recognition templates. In 2019, the US Federal Trade Commission fined Facebook $5 billion over privacy violations, which included its handling of facial recognition.
A self-driving farm robot built by Carbon Robotics uses lasers to kill 100,000 weeds per hour, giving farmers hope of saving land from toxic herbicides. Herbicide is expensive, so there’s an incentive to find new ways of weed control in addition to the health and safety reasons.
Barati Farimani, an assistant professor of mechanical engineering at Carnegie Mellon University, and his team use AI to design an improved method of desalination. AI decides which atoms should be removed from nanopore graphene membranes.

Funding News

Anomalo, the complete data quality platform company, raises $33M in Series A funding led by Norwest Venture Partners.
OctoML, a startup that helps enterprises optimize and deploy their ML models, raises $85M in Series C funding led by Tiger Global Management.
Notable, a startup that develops intelligent automation solutions for healthcare, raises $100M in Series B funding led by ICONIQ.

ARTICLES

PyTorch: Transfer Learning and Image Classification
In this tutorial, Adrian Rosebrock will explain how you can perform transfer learning for image classification using the PyTorch deep learning library. Check out part 1 of the tutorial.

Neural Radiance Field (NeRF) Papers at ICCV 2021
In anticipation of ICCV (Intl. Conf. on Computer Vision), the author rounded up all papers that use Neural Radiance Fields (NeRFs) that will be presented in the main ICCV 2021 conference.

A Gentle Introduction to Vector Space Models
In this tutorial, you'll learn about vector space, the properties of cosine similarity and how it can help you compare two vectors, and how cosine similarity and L2 distance are different.
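As a quick illustration of that last point, here is a minimal sketch in plain Python. It is not taken from the tutorial; the function names and the sample vectors are our own assumptions. It shows that cosine similarity depends only on direction, while L2 distance also reacts to vector length.

```python
import math

def cosine_similarity(u, v):
    # Dot product divided by the product of the two vector lengths.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def l2_distance(u, v):
    # Straight-line (Euclidean) distance between the two points.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical sample vectors chosen for illustration.
a = [1.0, 2.0]
b = [10.0, 20.0]  # same direction as a, ten times longer
c = [2.0, 1.0]    # same length as a, different direction

print("a vs b:", cosine_similarity(a, b), l2_distance(a, b))
print("a vs c:", cosine_similarity(a, c), l2_distance(a, c))
```

By cosine similarity, b is the closer match to a, because the two point in the same direction. By L2 distance, c is closer, because it sits nearer to a in space.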

Predicting Spreadsheet Formulas from Semi-Structured Contexts
In this article, Google presents their new model that learns to automatically generate formulas based on the rich context around a target cell. Find the related paper inside.

Improve Your Data Science Workflow with a Multi-Branch Training MLOps Pipeline Using AWS
In this post, you'll learn how to create a multi-branch training MLOps CI/CD pipeline using AWS CodePipeline and AWS CodeCommit, in addition to Jenkins and GitHub.

Using Singular Value Decomposition to Build a Recommender System
In this tutorial, you'll explore the ins and outs of singular value decomposition and its relation to a matrix. You'll learn how to make use of SVD to analyze data to make recommendations.

Deciding Which Tasks Should Train Together in Multi-Task Neural Networks
Learn about Task Affinity Groupings (TAG), an efficient method to determine which tasks should train together in a single training run. TAG is competitive with the prior state-of-the-art.

PAPERS

Wav2CLIP: Learning Robust Audio Representations From CLIP
Wav2CLIP is a robust audio representation learning method that distills from CLIP. It can outperform publicly available pre-trained audio representation algorithms.

Non-Deep Networks
In this paper, Ankit Goyal et al. theorize and show that it is possible to build high-performing "non-deep" neural networks by using parallel subnetworks instead of stacking one layer after another.

ByteTrack: Multi-Object Tracking by Associating Every Detection Box
The authors present a simple, effective, and generic association method called BYTE, tracking BY associating Every detection box instead of only the high-score ones.

ADOP: Approximate Differentiable One-Pixel Point Rendering
ADOP is a point-based, differentiable neural rendering pipeline for scene refinement and novel view synthesis that can synthesize sharper and more consistent novel views than existing approaches.

Multi-Label Classification with Partial Annotations using Class-Aware Selective Loss
In this work, the authors analyze the partial labeling problem and propose a solution. Their approach is effective and achieves state-of-the-art results on the OpenImages dataset.

COURSES

Natural Language Processing (NLP) Course
During this course, students can gain a comprehensive understanding of NLP, from its principles and theories to various NLP technologies. 12 lectures in total.

AutoML Course [Part 1]
During this course, students will learn how to use the LightAutoML open-source framework to build, accelerate, and customize automated ML pipelines. 5 modules in total.

JOBS

Director of Data Operations - GitHub, Remote (US / Canada)
Engineering Manager, Machine Learning - Grammarly, Kyiv, Remote
Sr. Field Data Scientist - Domino Data Lab, Remote
Senior Data Scientist - Intercom, Remote (UK / Ireland)
Data Science Intern (Summer 2022) - Reddit, Remote (US)

Looking to feature your open positions in the digest? Kindly reach out to us at editor@dataphoenix.info for details. We'll be proud to help your business thrive!

Data Phoenix Digest - ISSUE 30
