This is an in-browser representation of a neural network, built by one of our colleagues at Google. You can change the number of layers, neurons, learning rate, and so forth, and see how the neural network generalises over a data set. This thing was a huge educational success.
It's been used in universities across the world now, and in Google's own machine-learning course, and I definitely recommend checking it out. We asked ourselves: why was this such a success? Why do people care about this in-browser machine learning thing at all? We distilled it down to a few points. The obvious thing is that you just click a link and you get going. If you've done Python ML, it's a pain in the butt to get your drivers installed, your Python libraries, and that kind of thing.
There are a lot of awesome people in this community and we wanted to marry these two worlds. At the same time, we wanted folks experienced in machine learning to be able to port their work to the web.
These goals are sometimes in conflict. We will talk a little bit about how we resolved that. Okay, so one of the principles that we had was that we wanted the library to be super easy to use, and we kind of leaned towards that over performance. At the same time, we didn't want to sacrifice any functionality for simplicity. Jumping into what that means, we decided to go with an eager-only approach. I'm not going to go into the details, but it's a much simpler way of programming, and most of the machine learning world is moving towards this eager approach, compared to a graph-based approach.
We wanted it to be easy. We also provide a high-level layers API, which is a set of best practices in the machine learning community, so you don't have to think about all the details of your linear algebra when constructing a model. We also provide a whole repository of pre-trained models that require zero understanding of machine learning to get started.
I will show you a couple of those in a second. I wanted to highlight this. We worked on performance when and where it matters. We want it to be fast.
We took individual models, and we figured out how to make those individual models faster on a use-case basis. As I said, we don't want to make the library less functional.
We support gradients. This is fancy talk for the sensitivity of each of the weights, which means you can train through any operation you use in the TensorFlow library. We support TensorFlow ops, and for any of these models, you can dig down and get some of the machine learning constructs out of them if you want to. So, quickly jumping into what the technical stack looks like: at the very top of the abstraction layers is our models repo.
These are a couple of pre-trained models that require little understanding of machine learning. Below that is our layers API. This is where you can construct a model, train the model, and serialise the model for later, and we will show you some of that soon too. Below that we also have our core API, which is just linear algebra kernels: matrix multiplies, convolutions, and derivatives.
You can hook in at any of these abstraction layers. All of these sit on top of WebGL in the browser. We use fragment shaders to run our math in parallel. In Node we actually bind to TensorFlow C++ through N-API. If you use the same API for any of these things, you immediately get the hardware acceleration that TensorFlow has been working hard on for the CPU and GPU, and we will have TPU support eventually.
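To make the "sensitivity of each of the weights" idea concrete, here is a minimal sketch in plain JavaScript. It does not use TensorFlow.js and it is not how the library computes gradients (the library differentiates its ops for you); it estimates the gradient with a finite difference, purely for illustration. The one-weight model, the data, and the names `loss`, `gradient`, and `train` are all assumptions made up for this example.

```javascript
// Mean squared error of the one-weight model y = w * x over a tiny data set.
function loss(w, xs, ys) {
  let sum = 0;
  for (let i = 0; i < xs.length; i++) {
    const err = w * xs[i] - ys[i];
    sum += err * err;
  }
  return sum / xs.length;
}

// Central finite difference: how much the loss moves when w is nudged a little.
// This is the "sensitivity" of the loss to the weight.
function gradient(w, xs, ys, eps = 1e-4) {
  return (loss(w + eps, xs, ys) - loss(w - eps, xs, ys)) / (2 * eps);
}

// Plain gradient descent: repeatedly step the weight against its gradient.
function train(xs, ys, steps = 200, learningRate = 0.05) {
  let w = 0;
  for (let s = 0; s < steps; s++) {
    w -= learningRate * gradient(w, xs, ys);
  }
  return w;
}

const xs = [1, 2, 3, 4];
const ys = [2, 4, 6, 8]; // generated with w = 2
const fitted = train(xs, ys);
console.log(fitted.toFixed(3)); // close to 2, the weight used to generate ys
```

Training a real model is the same loop with many weights, where the library supplies the gradient for every operation instead of a finite difference.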
Just check this link out. I want to show you one of the demos because it's fun. This model is called PoseNet.
It is running completely in the browser. Nothing is being sent back to the server. The idea here is that we take RGB images from the webcam, pass them through the pose-detection model, which generates key points for my body parts, and then it returns an object that we can render on the screen. Obviously, it works for two people, so this is a lot of fun.
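As a rough sketch of the "object that we can render" step: the code below takes a pose-like object and keeps only the key points the model is reasonably confident about, rounded to pixel coordinates ready for drawing. The object here is hand-written sample data, and its field names (`keypoints`, `part`, `score`, `position`) as well as the helper `confidentKeypoints` and the 0.5 threshold are assumptions for illustration, not something stated in the talk.

```javascript
// Hypothetical pose result: one entry per body part, with a confidence score.
const pose = {
  score: 0.9,
  keypoints: [
    { part: 'nose', score: 0.98, position: { x: 301.4, y: 122.9 } },
    { part: 'leftEye', score: 0.95, position: { x: 289.0, y: 110.0 } },
    { part: 'leftWrist', score: 0.12, position: { x: 40.2, y: 400.7 } },
  ],
};

// Keep key points at or above minScore and round them to whole pixels.
function confidentKeypoints(pose, minScore = 0.5) {
  return pose.keypoints
    .filter((kp) => kp.score >= minScore)
    .map((kp) => ({
      part: kp.part,
      x: Math.round(kp.position.x),
      y: Math.round(kp.position.y),
    }));
}

const drawable = confidentKeypoints(pose);
console.log(drawable.map((kp) => kp.part)); // the low-confidence wrist is dropped
```

In a browser, each entry in `drawable` could then be drawn as a dot on a canvas laid over the webcam video.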
We will show you how to use one of these models in a minute. The second model is very similar to PoseNet, but it does person segmentation.
This background is a little funny. What it does is that it draws a mask of one where it thinks there's a human pose, and a zero where it thinks there's not. This one is a lot of fun. One of the effects that I really like is portrait mode.
You can see this thing blurring, so we have, you know, a software-based portrait mode that is running in the browser pretty fast. So let's go back to the slides. I'm going to show you how that works and what the code looks like.
That model is called BodyPix, a pre-trained segmentation model that is super fast. It's straightforward. You import TensorFlow.js and BodyPix. We have a regular image tag.
That's it. This image is Frank.
Frank is Nick's baby, and he is doing a yoga pose for us. We are going to try to do a pose for us. We host these weights on our bucket for you, so you don't have to pay for any of that. Then you call one line of code, call segmentation on the image, and you get a JSON object out, and inside that object is a binary mask of where it thinks the kid is. You don't have to understand the ML bits of this.
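Once you have that binary mask, using it is ordinary pixel work. The following is a minimal sketch in plain JavaScript, assuming the mask is a flat array with one 0 or 1 per pixel and the image is a flat array of RGBA bytes (the layout a canvas gives you). The helper name `applyMask` and the two-pixel sample image are hypothetical, and no model is loaded here.

```javascript
// Keep the pixels where the mask is 1 and paint the background colour where it is 0.
// rgba: flat array of RGBA bytes (4 per pixel); mask: one 0/1 entry per pixel.
function applyMask(rgba, mask, background = [0, 0, 0, 255]) {
  if (rgba.length !== mask.length * 4) {
    throw new Error('mask must have one entry per RGBA pixel');
  }
  const out = new Uint8ClampedArray(rgba.length);
  for (let p = 0; p < mask.length; p++) {
    for (let c = 0; c < 4; c++) {
      out[p * 4 + c] = mask[p] === 1 ? rgba[p * 4 + c] : background[c];
    }
  }
  return out;
}

// A hypothetical 2x1 image: a red pixel (person) next to a green pixel (background).
const pixels = new Uint8ClampedArray([255, 0, 0, 255, 0, 255, 0, 255]);
const mask = new Uint8Array([1, 0]);
const result = applyMask(pixels, mask);
console.log(Array.from(result)); // red pixel kept, green pixel replaced by black
```

A portrait-mode effect follows the same pattern: instead of a flat background colour, take the 0-pixels from a blurred copy of the frame.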
One of the other things this model gives you is parts as well, so it will tell you which pixels are face, arms, legs, and so forth. We provide some fun utilities for drawing masks on top of those. So, you can imagine this being used for a video game sprite.
It's a set of experiments around the question: can we make interacting with music and art more accessible? We're using that PoseNet model and we are able to play a synth with our face. The link is in the browser.
Try it after the talk. Then we have a project called Manifold, and this is a way to debug and understand machine learning models, and they use TensorFlow.js for linear algebra. Airbnb is using TensorFlow.js; they ship a model to the client. When you're about to upload a profile picture, if they see a licence or a government-issued passport in the photo, they yell at you, so they don't have to own that PII on the back-end.
On the desktop in Node, there's a project called Clinic Doctor, which monitors your Node application for CPU spikes. They use it to filter out garbage-collection spikes. One of my personal favourites is a programme called Magenta Studio, and they have an Electron app that plugs directly into Ableton Live and can generate MIDI notes on a track for you, or a drum beat, and this is a ton of fun and augments your existing workflow.
We launched these about a year ago. The library is great, because it's super fast, it uses that library like Nikhil mentioned, and it's great for deploying on to servers and doing local workflows on your desktop or workstation, but there are a few downsides that the library has.
One is that the GPU acceleration requires a library which is fast but large. We at TensorFlow don't support macOS, so there's no GPU acceleration on Mac. It's a native module: the Node ... CUDA can be around 250 megabytes or so, just on Linux.
It's a very large package to ship. We started to think there's something in between we can do on Node, so we worked hard with the Chrome team and launched earlier this year a new headless graphics stack for Node. We wanted to take that and accelerate our existing WebGL stack. This library runs on ANGLE, which is the driver we ship in Chrome today. On Windows, it's Direct3D; otherwise it's OpenGL on your native OS.
We think this is going to be great for some desktop apps like Electron, for mobile and the embedded space, and for IoT devices. Plus this is going to bring GPU acceleration to macOS.
We are working hard to finish this up. I want to show a demo of this actually running. We built a really quick Electron app, so, if I go and run my app, this app uses MobileNet which does basic image classification. It can see an image and tell you what it is. As I pull up in my app here, not the most exciting UI, but it shows the GL stack that is running.
You can see it's running ANGLE with an OpenGL 4.1 core. When I click run demo, it's fetching our model, it's loading, and it predicted that that is a Labrador retriever, and we are running 150 predictions on the image averaging 30 milliseconds, close to 30 frames a second in real time. We think this will be great on the Electron side. It doesn't lock your UI thread.
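The arithmetic behind those numbers is easy to reproduce. Below is a minimal timing sketch in plain JavaScript: it times repeated calls, averages the latency, and converts that to frames per second. The helpers `timeRuns` and `summarise` are hypothetical, and `fakePredict` is a stand-in busy loop rather than a real model call (a real prediction would normally be asynchronous, which this sketch ignores).

```javascript
// Time `runs` synchronous calls and return each call's latency in milliseconds.
function timeRuns(predict, runs, now = () => Date.now()) {
  const latencies = [];
  for (let i = 0; i < runs; i++) {
    const start = now();
    predict();
    latencies.push(now() - start);
  }
  return latencies;
}

// Average latency and the frame rate it corresponds to.
function summarise(latenciesMs) {
  const total = latenciesMs.reduce((sum, ms) => sum + ms, 0);
  const averageMs = total / latenciesMs.length;
  return { averageMs, framesPerSecond: 1000 / averageMs };
}

// Stand-in for a model call: some busy work so there is something to time.
const fakePredict = () => {
  let acc = 0;
  for (let i = 0; i < 1e5; i++) acc += Math.sqrt(i);
  return acc;
};

const latencies = timeRuns(fakePredict, 5);
console.log(summarise(latencies)); // machine-dependent

// The figures from the demo: 150 predictions averaging 30 ms each.
const stats = summarise(new Array(150).fill(30));
console.log(stats.averageMs, stats.framesPerSecond.toFixed(1)); // about 33.3 frames per second
```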
You're dispatching all of these ML calls through the Node process, all with a headless GL, and that package is like five to ten megabytes. It's very small. This is one of the latest IoT boards. This is a Jetson Nano with a big GPU stapled to the top of it.
Running that same model in Node, not the most exciting, but we are doing around 76 milliseconds of inference time with a very thin ARM64 build of our Node back-end. I want to talk about another library we've been working really hard on, and it's in-browser visualisation for our TensorFlow.js library. It's a package called tfjs-vis, and you can think about it as the Chrome DevTools for models. It's a canvas for painting a bunch of elements that the library provides.
We have a bunch of built-in charts for loss and accuracy in ML training. We also have what we call high-level visualisation methods. These basically allow you to look at those complicated ops like convolutions, which apply a bunch of filters to your image while you're training, and see what happens in between those convolutions. Evaluation utilities are another set of drawing libraries, and they show you where your model might be overly biased towards a particular class, so you can see how you might alter your data set to get a nicely trained model. We've shown you a lot of stuff, but we also want to show you a lot of things that Nikhil, myself, and the team have been thinking about for the project going forward.
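The kind of per-class bias those evaluation utilities are described as surfacing can be tallied by hand. This is a minimal sketch in plain JavaScript of a confusion matrix and per-class accuracy. It is not the visualisation library's API, only the counting that such a chart would be drawn from, and the labels, predictions, and function names are made-up illustration data.

```javascript
// matrix[actual][predicted] counts how often each true class got each prediction.
function confusionMatrix(labels, predictions, numClasses) {
  const matrix = Array.from({ length: numClasses }, () => new Array(numClasses).fill(0));
  labels.forEach((actual, i) => {
    matrix[actual][predictions[i]] += 1;
  });
  return matrix;
}

// Fraction of each true class that was predicted correctly (the diagonal over the row sum).
function perClassAccuracy(matrix) {
  return matrix.map((row, cls) => {
    const total = row.reduce((sum, n) => sum + n, 0);
    return total === 0 ? 0 : row[cls] / total;
  });
}

// Hypothetical three-class results in which the model leans towards class 0.
const labels = [0, 0, 0, 1, 1, 1, 2, 2, 2];
const predictions = [0, 0, 0, 1, 1, 0, 0, 0, 2];

const matrix = confusionMatrix(labels, predictions, 3);
const accuracy = perClassAccuracy(matrix);
console.log(matrix);
console.log(accuracy.map((a) => a.toFixed(2)));
```

Here class 2 is mostly mistaken for class 0, which is the sort of signal that suggests adding or rebalancing examples for that class in the data set.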
Another one we've been looking at is Wasm. We need to do ... effective accelerator for CPUs. We are hoping to have something so that, for devices where the GPU isn't all that great, we can fall back to Wasm. One of the great parts of the ML space is just the amount of research, and we are finding that about every year, our same models get faster from reductions in architecture, or new hardware acceleration stories.
So, every year the models that we keep showing continue to get faster, especially on edge and browser devices. Another great product we have at Google is AutoML. That solves the whole training part if you want to do an image classification problem.
You can give it a set of images, and it uploads them to the cloud, automatically finds the right architecture for your model, and then spits out a model that you can deploy on your device. We are looking at some integration with that team as well, to make it a seamless experience. The other thing that our team has been focusing on is optimising our existing back-ends. In our WebGL implementation, we worked on packing textures, which is a fancy term for using as little memory as possible in our acceleration library, and that has sped up a bunch of things, including on iOS, by up to ten times compared with what we were seeing before. Looking at the things we are going to launch this summer: visualisation has already launched. Another package is the data library, which is an easy-to-use package for getting stuff out of the browser: microphone data, webcam data.
You don't have to worry about tensors. You can stream these things into your model. On our platform side, we are expanding where we run TensorFlow.js: as mentioned, WeChat, the headless GL stuff, and diving into how we can provide a nice React Native experience.
Then on our out-of-the-box model front, we will continue focusing on audio and text models, as well as improving the accuracy and performance of our existing offerings. With that, I want to thank you for attending our talk. Everything we've shown is open source; we work purely in the open, and all of our stuff is found on js.tensorflow.org. This project would not be where it is without the large number of open-source contributors we have had, and we want to extend a big thank you to them for all the hard work they've done.
One last plug: we are actually hiring a developer advocate for our team. If anyone is interested, please follow that link, or come and see us at the booth here. That's all! [Applause].