A roundup of news about Artificial Intelligence, Machine Learning and Data Science. This is an eclectic collection of interesting blog posts, software announcements and data applications from Microsoft and elsewhere that I've noted recently.
Open Source AI, ML & Data Science News
Python 3.8 is now available. From now on, new versions of Python will be released on a 12-month cycle, in October of each year.
Python takes the #2 spot in GitHub's annual ranking of programming language popularity, displacing Java and trailing only JavaScript.
PyTorch 1.3 is now available, with improved performance, deployment to mobile devices, "Captum" model interpretability tools, and Cloud TPU support.
The Gradient documents the growing dominance of PyTorch, particularly in research.
Keras Tuner, hyperparameter optimization for Keras, is now available on PyPI.
ONNX, the open exchange format for deep learning models, is now a Linux Foundation project.
AI Inclusive is a newly formed worldwide organization to promote diversity in the AI community.
Industry News
Databricks announces the MLflow Model Registry, to share and collaborate on machine learning models with MLflow.
Flyte, Lyft's cloud-native machine learning and data processing platform, has been released as open source.
RStudio introduces Package Manager, a commercial RStudio extension to help organizations manage binary R packages on Linux systems.
Exploratory, a new commercial tool for data science and data exploration, built on R.
GCP releases Explainable AI, a new tool to help humans understand how a machine learning model reaches its conclusions.
Google proposes Model Cards, a standardized way of sharing information about ML models, based on this paper.
GCP AutoML Translation is now generally available, and the GCP Translation API is now available in Basic and Advanced editions.
GCP Cloud AutoML is now integrated with the Kaggle data science competition platform.
Amazon Rekognition adds Custom Labels, allowing users to train the image classification service to recognize new objects with as few as 10 training images per label.
Amazon SageMaker can now use hundreds of free and paid machine learning models offered in AWS Marketplace.
The AWS Step Functions Data Science SDK, for building machine learning workflows in Python running on AWS infrastructure, is now available.
Microsoft News
Azure Machine Learning service has released several major updates, including:
- a new web-based "studio" user interface (shown in this video),
- a new drag-and-drop workflow designer (video),
- a simple UI to apply automated machine learning to a dataset (video),
- built-in Notebook support,
- creation of end-to-end machine learning pipelines for training and deployment (video),
- auto-scaling compute instances for training and model deployment (video),
- a new Model Interpretability toolkit (video),
- a new asynchronous batch inferencing capability (video),
- integration with Azure Open Datasets (video),
- integration with Visual Studio Code,
- expanded PyTorch support, and
- new R SDK support (video).
Visual Studio Code adds several improvements for Python developers, including support for interacting with and editing Jupyter notebooks.
ONNX Runtime 1.0 is now generally available, for embedded inference of machine learning models in the open ONNX format.
Many new capabilities have been added to Cognitive Services, including:
- Personalizer, a new service to optimize user interfaces with reinforcement learning;
- Form Recognizer, a new service to extract structured data from printed or handwritten forms;
- A new Custom Neural Voice capability, to build a unique voice from a few minutes of training audio;
- Microsoft Video Indexer adds multi-language speech identification and transcription;
- Several services, including Face API, Custom Vision, and Translator Text, are now available to Azure Free accounts.
Bot Framework SDK v4 is now available, and a new Bot Framework Composer has been released on GitHub for visual editing of conversation flows.
SandDance, Microsoft's interactive visual exploration tool, is now available as open source.
Learning Resources
An essay about the root causes of problems with diversity in NLP models: for example, "hers" not being recognized as a pronoun.
Videos from the Artificial Intelligence and Machine Learning Path, a series of six application-oriented talks presented at Microsoft Ignite.
A guide to getting started with PyTorch, using Google Colab's free GPU offer.
Public weather and climate datasets, provided by Google.
Applications
The Relightables: capture humans in a custom light stage, then drop the resulting video into a 3-D scene with realistic lighting.
How Tesla builds and deploys its driving automation models with PyTorch (presentation at PyTorch DevCon).
OpenAI has released the full GPT-2 language generation model.
Spleeter, a pre-trained TensorFlow model to separate a music track into vocal and instrument audio files.
Detectron2, a PyTorch reimplementation of Facebook's popular object-detection and image-segmentation library.
Find previous editions of the AI roundup here.