Getting Started With DLAI Course Labs

This post was originally written for the students of my DLAI course, although I think it may also interest other students. On this blog I will share the teaching material I prepare for the part of the DLAI course that covers the basic principles of deep learning from a computational perspective.

This post provides a fast-paced introduction to the basic technologies and knowledge required to follow the DLAI labs (Master course at UPC, Autumn 2017). In this part of the course we will review the latest advances in the computing platforms, system middleware and DL frameworks behind the current revolution in artificial intelligence for multimedia data analysis.

Python

Python is a widely used programming language, created by Guido van Rossum, that supports multiple programming paradigms. Its source code is available under the Python Software Foundation License, which is compatible with the GNU General Public License (GPL). Although it is an interpreted rather than a compiled language, and may therefore take more CPU time (an important detail in our Computer Architecture department), Python has a gentle learning curve. It is readable, writable and powerful, and its simplicity lets you become productive quickly. Python is the programming language of choice for our labs, and only Python basics are required to follow them. If you have no prior knowledge of Python, you can learn the required background by yourself by following this Python Quick Start.

Docker

Docker is the world's leading software container platform. Developers use Docker to eliminate "works on my machine" problems when collaborating on code with co-workers. Operators use Docker to run and manage apps side by side in isolated containers to get better compute density. Enterprises use Docker to build agile software delivery pipelines that ship new features faster, more securely and with confidence for both Linux and Windows Server apps.

A container image is a lightweight, standalone, executable package of a piece of software that includes everything needed to run it: code, runtime, system tools, system libraries and settings. Available for both Linux and Windows based apps, containerized software will always run the same, regardless of the environment. Containers isolate software from its surroundings (for example, differences between development and staging environments) and help reduce conflicts between teams running different software on the same infrastructure. In this course, we will use Docker to isolate all the frameworks and programs and avoid configuration problems.

MNIST dataset

The MNIST dataset is composed of black and white images of handwritten digits: 60,000 examples for training a model and 10,000 for testing it. The MNIST dataset can be found at the MNIST database. This dataset is ideal for people who are starting out with pattern recognition on real examples and do not want to spend time on data pre-processing or formatting, two very important but time-consuming steps when dealing with images.

The original black and white (bilevel) images were size-normalized to fit in a 20×20 pixel box while preserving their aspect ratio. The resulting images contain gray pixels as a result of the anti-aliasing used by the normalization algorithm when reducing the resolution of the images. The images were then centered in 28×28 pixel frames by computing the center of mass of the pixels and translating the image so that this point sits at the center of the frame. The images look like the ones shown here:

[Image: sample MNIST digits]
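The centering step described above can be illustrated with a small sketch. This is not the original MNIST preprocessing code: the function name `center_in_frame`, the integer rounding of the shift, and the synthetic 20×20 input are all assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def center_in_frame(digit, frame_size=28):
    """Place a small digit image in a larger frame, centered on its center of mass.

    Hypothetical helper for illustration; assumes `digit` has at least one
    non-zero pixel and fits inside the frame.
    """
    height, width = digit.shape
    total = digit.sum()

    # Center of mass: pixel coordinates weighted by pixel intensity.
    rows, cols = np.indices(digit.shape)
    com_row = (rows * digit).sum() / total
    com_col = (cols * digit).sum() / total

    # Offset of the top-left corner that moves the center of mass
    # as close as possible to the geometric center of the frame.
    frame_center = (frame_size - 1) / 2
    top = int(round(frame_center - com_row))
    left = int(round(frame_center - com_col))

    # Keep the digit fully inside the frame.
    top = min(max(top, 0), frame_size - height)
    left = min(max(left, 0), frame_size - width)

    frame = np.zeros((frame_size, frame_size), dtype=digit.dtype)
    frame[top:top + height, left:left + width] = digit
    return frame


# Synthetic 20x20 "digit": a vertical bar, standing in for a real sample.
digit = np.zeros((20, 20), dtype=np.float32)
digit[5:15, 8:12] = 1.0

framed = center_in_frame(digit)
print(framed.shape)
```

Because the shift is rounded to whole pixels, the center of mass ends up within half a pixel of the frame center rather than exactly on it.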

The images are represented as numerical matrices. For example, one of the images of the number 1 can be represented as:

[Image: matrix of pixel values for a handwritten 1]

Each position indicates the level of blackness of the corresponding pixel, between 0 and 1. This matrix can also be flattened into a vector, so that each image becomes a point in a vector space of 784 dimensions (28×28 = 784 numbers).
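As a minimal sketch of that transformation, the snippet below scales one image to the 0 to 1 range and flattens it into a 784-dimensional vector. Keras's `mnist.load_data()` returns pixels as 8-bit integers between 0 and 255, so the division by 255 is what produces values between 0 and 1. The random array is only a stand-in for a real image, so that the sketch runs without downloading the dataset; NumPy is assumed to be available.

```python
import numpy as np

# Stand-in for one MNIST image: a 28x28 array of 8-bit gray levels (0..255).
# Random values are used here only so the sketch runs without the dataset.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)

# Scale the gray levels to the [0, 1] range.
scaled = image.astype("float32") / 255.0

# Flatten the 28x28 matrix into a single vector of 784 numbers.
vector = scaled.reshape(-1)

print(vector.shape)
```

With the real data, the same two steps can be applied to the whole training set at once, for example `X_train.reshape(len(X_train), -1) / 255.0`.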

Lab Tasks

Below are the tasks of this lab session. If you don't finish all of them during the session, please read Task 7 before leaving the classroom.

Task 1: Install Docker for your platform

Task 2: Download and run the Docker image

Open a terminal (Mac/Linux), cmd or PowerShell (Windows 10 Pro), or the Docker CLI (other Windows versions).

Download the image from the repository:

docker pull jorditorresbcn/dlai-met:latest

macOS and Windows users need to have the Docker program open in order to run docker commands. This Docker image is based on Ubuntu 16.04 with the following software stack: Python 3.5, Keras, TensorFlow, PyTorch, nano, htop, IPython, Jupyter, matplotlib and git.

Task 3: Run the Docker image for the first time

Open a terminal on Linux/Mac, PowerShell on Windows 10, or the Docker CLI on other Windows versions:

docker run -it -p 8888:8888  jorditorresbcn/dlai-met:latest

Task 4: Run the Jupyter Notebook server

Inside the Docker container, run the following command:

jupyter notebook --ip=0.0.0.0 --allow-root

On your computer, open your browser and go to http://localhost:8888. The password is dlaimet.

If you are on Windows and experience connectivity issues, please check THIS.
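If the page does not load, one quick diagnostic is to check from the host machine whether anything is listening on the published port. The sketch below is only an illustration using Python's standard library; the helper name `is_port_open` is hypothetical, and it needs to be run on your computer, not inside the container.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


# 8888 is the port published by the "docker run -p 8888:8888" command above.
print(is_port_open("localhost", 8888))
```

A result of False suggests that the container is not running, the Jupyter server has not been started, or the port was not published when the container was created.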

Task 5: Resume the container (OPTIONAL)

If you need to resume the container, list all containers to find its ID and then start it again:

docker ps -a
docker start -i CONTAINER_ID

Task 6: Download and print the MNIST dataset

In your browser, create a new notebook.

Download the dataset using the following Python code:

from keras.datasets import mnist
# Load pre-shuffled MNIST data into train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

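Print the shape of the variables:

print(X_train.shape)  # (60000, 28, 28): 60,000 training images of 28×28 pixels
print(y_train.shape)  # (60000,): one label per training image
print(X_test.shape)   # (10000, 28, 28): 10,000 test images
print(y_test.shape)   # (10000,): one label per test image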

Plot one of the digits using the following code (the image will be in your shared folder):

%matplotlib inline
from matplotlib import pyplot as plt 
plt.imshow(X_train[0])
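The inline plot is rendered inside the notebook. If you also want an image file on disk, one option is matplotlib's `savefig`. The following is a minimal sketch under stated assumptions: the filename `mnist_digit.png` is hypothetical, and a random array stands in for `X_train[0]` so that the snippet runs without downloading the dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this also runs outside Jupyter; omit it in a notebook

import numpy as np
from matplotlib import pyplot as plt

# Stand-in for X_train[0]: a 28x28 array of 8-bit gray levels.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)

fig, ax = plt.subplots()
ax.imshow(image, cmap="gray")  # gray colormap: 0 is drawn black, 255 white
ax.set_title("Sample digit")

# Hypothetical output filename, written to the current working directory.
fig.savefig("mnist_digit.png")
plt.close(fig)
```

In the notebook, replace the stand-in array with `X_train[0]` and point the filename at the folder you want the image to end up in.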

Task 7: Lab Report

Before leaving the classroom, show your teacher how you solved the problem in Task 6 and the image stored in your folder once the plot function has worked properly.

If you don't have time to finish all the tasks during this lab session, please follow your teacher's instructions on how to create your lab report and how to submit it. Include in your report how you solved the problem in Task 3 and the image stored in your folder after running the plot function.

Now you are ready for the next lab!

My thanks to Francesc Sastre for helping me with the preparation of this lab.

2017-10-04T12:20:07+00:00 September 15th, 2017