Data Phoenix Digest - 17.09.2021

Computer vision against cyber threats, Grammarly for developers, understanding convolutions on graphs, simulating traffic flow in Python, deep reinforcement learning at the edge of the statistical precipice, datasets, jobs, and more...

Dmitry Spodarets

Hey guys! Sorry for the short delay with this week's issue of the digest. We're focused on webinars right now so that we can put out higher-quality content for you. Thank you for attending the webinars, by the way! But now it’s time to read and enjoy the digest, with love from the Data Phoenix team.

I'd like to remind you that we have several events coming up soon. Kindly register to listen to our awesome speakers; we'd love to see you there!

NOTE: If you missed any of our previous webinars, they're available on our YouTube channel. Please take a look and leave a comment.

NEWS

What's new this week?

Computer Vision against cyber threats. TensorFlow Similarity. Grammarly for developers. Godzilla leading Project Kaiju. And AI that predicts startup success.

Funding News:

  • PolyAI raises $14 million from Khosla Ventures to accelerate expansion in the US
  • Deep Vision closes a $35 million Series B financing round led by Tiger Global
  • Ambi Robotics announces $26 million in Series A funding from Tiger Global

ARTICLES

Annotate and Improve Computer Vision Datasets with CVAT and FiftyOne
This post covers two example workflows showing how to use the integration between FiftyOne and CVAT, helping you to build efficient annotation workflows and train better models.

Understanding Convolutions on Graphs
In this article, you'll learn about the building blocks and design choices of graph neural networks. Make sure to check out the Supplementary Material section for more goodies.

Understanding ROC Curves with Python
In this article by Lucas Soares, you'll build a basic intuition for the receiver operating characteristic (ROC) curve with Python.
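
For a rough sense of the idea, here is a minimal sketch that is not taken from the article: it sweeps a decision threshold over classifier scores and records the false positive rate and true positive rate at each step. The function names `roc_points` and `auc` and the toy data are assumptions made for illustration only.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points from sweeping a threshold over the scores."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    # Lower the threshold step by step, from the highest score to the lowest.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve, computed with the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data (hypothetical): 1 marks the positive class.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(labels, scores)
area = auc(points)
print(points)
print(area)
```

A curve that hugs the top-left corner (area close to 1) indicates that the scores separate the two classes well; an area near 0.5 is no better than random guessing.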

Simulating Traffic Flow in Python
Predicting traffic is a challenging task with multiple variables. In this article, you'll explore the methods of simulating traffic by implementing a microscopic traffic model.

A Lightweight Data Validation Ecosystem with R, GitHub, and Slack
Data quality monitoring is an essential part of any data analysis or business intelligence workflow. In this article, you'll learn how to build a data validation system with tools you already have at hand.

ML Metadata Store: What It Is, Why It Matters, and How to Implement It
Learn about ML metadata stores, how they are different from other tools used for building models, and how they can help you build and deploy models with more confidence.

Using Deep Learning to Detect Abusive Sequences of Member Activity
In this post, you'll look into the methods LinkedIn's Anti-Abuse AI Team uses to detect the creation of fake accounts, member profile scraping, automated spam, and account takeovers.

Streaming Real-Time Analytics with Redis, AWS Fargate, and Dash Framework
In this article by Uber’s Global Scaled Solutions team, you'll learn how they built a streaming real-time analytics solution using a variety of cloud services.

Build Machine Learning at the Edge Applications Using Amazon SageMaker Edge Manager and AWS IoT Greengrass V2
Running ML models at the edge can be a key enhancement for IoT solutions that must perform inference without a constant connection back to the cloud. Learn how it is done on AWS.

PAPERS

Deep Reinforcement Learning at the Edge of the Statistical Precipice
In the paper, the authors propose a new approach to reliable evaluation of deep RL models. They illustrate their findings using a case study on the Atari 100k benchmark.

GeneAnnotator: A Semi-automatic Annotation Tool for Visual Scene Graph
GeneAnnotator is a semi-automatic scene graph annotation tool for images that allows human annotators to describe the existing relationships in the visual scene in the form of directed graphs.

Parsing Table Structures in the Wild
This paper tackles the problem of table structure parsing from images in the wild. It establishes a practical table structure parsing system for scenarios where tabular input images are taken or scanned with severe deformation, bending or occlusions.

From Contexts to Locality: Ultra-high Resolution Image Segmentation via Locality-aware Contextual Correlation
The authors innovate on the widely used high-resolution image segmentation pipeline, in which an ultra-high resolution image is partitioned into regular patches for local segmentation and the local results are then merged into a high-resolution semantic mask.

DATASETS

IKEA ASM Dataset
The IKEA ASM dataset is a multi-modal and multi-view video dataset of 371 samples of assembly tasks to enable rich analysis and understanding of human activities.

The Natural Scenes Dataset
The Natural Scenes Dataset (NSD) is a large-scale fMRI dataset consisting of whole-brain, high-resolution fMRI measurements of 8 healthy adult subjects while they viewed thousands of color natural scenes over the course of 30–40 scan sessions.

BOOKS

Learn Data Science with R
This book for data science beginners covers statistics, R, graphing, and machine learning. It features many practical examples that will help you put theory into practice.

JOBS

Looking to feature your open positions in the digest? Kindly reach out to us at [email protected] for details. We'll be proud to help your business thrive!
