Google launched TensorFlow Serving, which helps developers take their TensorFlow machine learning models (and can even be extended to serve other types of models) into production. TensorFlow Serving is an open source serving system (written in C++) now available on GitHub under the Apache 2.0 license.
What is the difference between TensorFlow and TensorFlow Serving? While TensorFlow makes it easier for developers to build machine learning algorithms and train them on certain types of input data, TensorFlow Serving specializes in making those models usable in production environments. The idea is that developers train their models using TensorFlow and then use TensorFlow Serving’s APIs to respond to requests from clients.
This allows developers to experiment at scale with different models that change over time based on real-world data, while keeping a stable architecture and API in place.
The typical pipeline is that training data is fed to the learner, which outputs a model; after being validated, the model is ready to be deployed to the TensorFlow Serving system. It is quite common to launch and iterate on a model over time, as new data becomes available or as you improve the model. In fact, the Google post mentions that at Google many pipelines run continuously, producing new model versions as new data becomes available.
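To make the iteration story concrete, here is a minimal sketch of what this looks like on disk: the serving system can watch a base directory and pick up new numbered version subdirectories as training pipelines export them. The paths and model name below are hypothetical, chosen only for illustration:

```shell
# Hypothetical base path that the serving system would watch for new versions.
BASE=/tmp/serving_demo/models/my_model

# Each training run exports its model into a fresh numeric version
# subdirectory; a newer (higher) version can then be served in place
# of the old one without changing the client-facing API.
mkdir -p "$BASE/0000001" "$BASE/0000002"

# List the exported versions the server would see.
ls "$BASE"
```

The key design point is that clients keep talking to the same stable endpoint while new model versions appear (and old ones retire) underneath it.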
To communicate with TensorFlow Serving, clients use a front-end implementation based on gRPC, a high-performance, open source RPC framework from Google.
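The request/response shape of such a client looks roughly like the following. This is a schematic stand-in written as a plain Python class, not the real generated gRPC stubs; the class name, method name, and tensor names are all invented for illustration:

```python
# Schematic sketch of the RPC pattern a TensorFlow Serving client follows.
# `PredictionClient` is a hypothetical stand-in for the stubs that gRPC
# would generate from the service's .proto definitions.

class PredictionClient:
    """Pretend client: sends named input tensors, receives named outputs."""

    def __init__(self, host, port):
        # A real client would open a gRPC channel to host:port here.
        self.target = f"{host}:{port}"

    def predict(self, inputs):
        # A real client would serialize `inputs` into a protobuf request,
        # invoke the remote prediction method over the channel, and
        # deserialize the response. Here we just echo one dummy score per
        # example to show the call shape.
        return {"scores": [0.0 for _ in inputs.get("examples", [])]}

client = PredictionClient("localhost", 9000)
response = client.predict({"examples": [[1.0, 2.0], [3.0, 4.0]]})
print(response)  # one placeholder score per input example
```

The point of the pattern is that the client only depends on this small request/response contract, so the model behind the server can be retrained and replaced without touching client code.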
If you are interested in learning more about TensorFlow Serving, I suggest beginning with the Serving architecture overview section, then setting up your environment and working through a basic tutorial. Good luck!