A roundup of news about Artificial Intelligence, Machine Learning and Data Science. This is an eclectic collection of interesting blog posts, software announcements and data applications from Microsoft and elsewhere that I've noted recently.
Open Source AI, ML & Data Science News
Pandas 1.0.0 is released, a milestone for the ubiquitous Python data frame package.
TensorFlow 2.1.0 is released, the last TensorFlow release to support Python 2.
Karate Club, a library of state-of-the-art methods for unsupervised learning on graph structured data, built on the NetworkX Python package.
OpenAI announces it is standardizing on PyTorch, a move likely to affect how the research institution publishes its future deep learning models.
sparklyr is now a Linux Foundation incubation project, and version 1.1 of the Spark-and-R interface is now available.
DiCE, an open-source Python library for Diverse Counterfactual Explanations for machine learning models, from Microsoft Research.
Industry News
GCP releases Auto Data Exploration and Feature Recommendation Tool, to speed up the process of preparing data for machine learning.
Google Dataset Search, a search engine for finding public datasets, is now generally available.
Metaflow, a "human-centric" framework for data science in Python, released as open source by Netflix and AWS.
RStudio reorganizes as a Public Benefit Corporation, with a charter to create free and open-source software for data science, scientific research, and technical communication.
Facebook open sources VizSeq, a Python toolkit that simplifies visual analysis on a wide variety of text generation tasks.
Open Data on AWS, a new service to share data and include it in the Registry of Open Data on AWS.
Microsoft News
Microsoft launches AI for Health, a $40M five-year program that provides grants, data science experts, technology, and other resources to tackle health issues with AI.
The Python extension for Visual Studio Code adds kernel selection for Notebooks, auto-activation of environments in the terminal, and other improvements.
Azure Machine Learning service now supports VMs with single root I/O virtualization (SR-IOV) and InfiniBand, to speed up the training of large deep learning models like BERT.
Microsoft Video Indexer now supports multi-language speech transcription, and extraction of high-resolution key frames for use with Custom Vision.
Microsoft Translator API now supports custom translations, to incorporate domain-specific vocabulary and idioms.
InterpretML, an open-source Python package that implements the Explainable Boosting Machine to train interpretable machine learning models and explain black-box systems, is now available as an alpha release.
fairlearn 0.4.1, the latest release of the Python package for assessing the fairness and mitigating the observed unfairness of AI systems, is now available.
Learning Resources
Computer Vision Recipes, a repository of examples and best practice guidelines for building computer vision systems. Includes Jupyter notebooks and utility functions based on PyTorch for several scenarios.
NLP Recipes, a repository of Natural Language Processing best practices and examples. Includes Jupyter notebooks implementing state-of-the-art methods and common scenarios that are popular among researchers and practitioners working on problems involving text and language.
The AI Now Institute 2019 report offers 12 recommendations on what policymakers, advocates and researchers can do to address the use of AI in ways that widen inequality.
Stanford's AI Index 2019 report tracks data, metrics and trends related to Artificial Intelligence research, applications, and impact.
Mathematics for Machine Learning, a book (by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong) to teach the mathematical concepts underlying AI implementations. (PDF available online.)
How to select algorithms for Azure Machine Learning, a guide to selecting machine learning methods by problem type and other requirements.
A tutorial on Automated Machine Learning for time series forecasting, with Azure Machine Learning.
People+AI Guidebook, resources for designing human-centered AI products from Google.
Data Project Checklist, a questionnaire to guide setting up a new data science project, from fast.ai's Jeremy Howard.
PyTorch: An Imperative Style, High-Performance Deep Learning Library, the first full research paper on the popular framework.
Applications
AI Dungeon 2, a free-form text adventure game for mobile and browsers. Based on GPT-2, the interactive story can go in just about any direction.
Generative music playing in the lobby of a NYC hotel (created by musician Björk in partnership with Microsoft) reacts to the weather and to migrating flocks of birds detected by a rooftop camera.
Facebook develops an AI system based on neural machine translation that can solve mathematical equations.
DialoGPT, a large-scale pretrained model from Microsoft Research that generates human-quality conversational responses.
VisualizeMNIST, a browser-based tool to interactively visualize the layers of a simple digit recognition model.
Find previous editions of the AI roundup here.