Sunday, September 24, 2017

Simple regression model with TensorFlow

Overview

A neural network is composed of input, hidden, and output layers. The number of hidden layers is up to the designer, so the simplest such architecture has just one hidden layer.

In this article, I’ll build the simplest neural network for regression with TensorFlow.

From the official website:
TensorFlow™ is an open source software library for numerical computation using data flow graphs.

This makes it easier to build shallow and deep neural networks and other machine learning algorithms. Through this simple experiment, we can learn about TensorFlow and how a neural network works.

For more about TensorFlow itself, please check the article below.

Model architecture


Roughly speaking, in terms of position, a neural network has 3 types of layers: input, hidden, and output. The input and output layers are determined by the data and the form of prediction you want. The number and size of the hidden layers, on the other hand, are up to you.

So, to get “the simplest model”, I built a model with just one hidden layer. Concretely, the architecture I adopted is shown in the image below.

[Image: model architecture]

It has an input layer, one hidden layer, and an output layer. In the image, the colors have the following roles.

  • blue: input data
  • red: bias term
  • orange: output

The number of blue circles depends on the data, and the number of orange ones depends on the purpose of the model. In this case, the data has 3 features and the model is for regression, so there is a single output.
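To make the shapes concrete, here is a minimal NumPy sketch of a forward pass through such a network. It assumes 6 hidden units and ReLU activations, matching the TensorFlow code later in this article. The helper names `relu` and `forward` and the random weights are only for illustration.

```python
import numpy as np

def relu(z):
    # element-wise max(0, z)
    return np.maximum(z, 0.0)

def forward(x, W1, b1, W2, b2):
    """One hidden layer: x (n, 3) -> hidden (n, 6) -> output (n, 1)."""
    hidden = relu(x @ W1 + b1)
    return relu(hidden @ W2 + b2)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 6)), rng.normal(size=6)
W2, b2 = rng.normal(size=(6, 1)), rng.normal(size=1)

x = rng.random(size=(5, 3))  # 5 samples, 3 features
out = forward(x, W1, b1, W2, b2)
print(out.shape)  # (5, 1)
```

Each row of the input produces exactly one predicted value, which is what a single-output regression model needs.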

Data


I used the iris dataset. It has 4 numeric features and 1 class label. This time, though, I used 3 of the features to predict the value of the remaining one.

Concretely, the data looks like this.

from sklearn import datasets
iris = datasets.load_iris()
print(iris.data[:5])
[[ 5.1  3.5  1.4  0.2]
 [ 4.9  3.   1.4  0.2]
 [ 4.7  3.2  1.3  0.2]
 [ 4.6  3.1  1.5  0.2]
 [ 5.   3.6  1.4  0.2]]

This is part of the data. In this matrix, I used the first 3 columns as features and the last one as the target to predict.
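As a small illustration of that split, the sketch below slices the five rows printed above into features and target. The rows are typed in by hand so the snippet runs without scikit-learn; the names `rows`, `features`, and `target` are only for this example.

```python
import numpy as np

# First five rows of iris.data, copied from the output above.
rows = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [4.7, 3.2, 1.3, 0.2],
    [4.6, 3.1, 1.5, 0.2],
    [5.0, 3.6, 1.4, 0.2],
])

features = rows[:, :3]  # first three columns
target = rows[:, 3]     # last column

print(features.shape, target.shape)  # (5, 3) (5,)
```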

Make model


Let’s build the model.
First, we need to do the following.

  • import data
  • prepare data
  • normalization

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# set random seeds for reproducibility
seed = 2
tf.set_random_seed(seed)
np.random.seed(seed)

# data
iris = datasets.load_iris()
x = iris.data[:, 0:3]  # first 3 columns as features
y = iris.data[:, 3]    # last column as target

x_train, x_test, y_train, y_test = train_test_split(x, y, train_size=0.7)

# normalization
mms = MinMaxScaler()
x_train = mms.fit_transform(x_train)
x_test = mms.transform(x_test)

Although I don’t use many of them here, sklearn provides various useful functions for data pre-processing. train_test_split() splits the data into training and test sets. sklearn also has functions for standardization and normalization, such as the MinMaxScaler used above. Note that the scaler is fitted on the training data only and then applied to the test data.

In the next step, we write the model’s architecture.
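To show what min-max normalization does, here is a rough NumPy sketch of the same idea: learn each column's minimum and range from the training data, then reuse them on the test data. The helpers `fit_min_max` and `apply_min_max` are hypothetical names for this example, not part of sklearn, and the sketch assumes no column is constant (otherwise the range would be zero).

```python
import numpy as np

def fit_min_max(train):
    # per-column minimum and range, learned from the training data only
    col_min = train.min(axis=0)
    col_range = train.max(axis=0) - col_min
    return col_min, col_range

def apply_min_max(data, col_min, col_range):
    # rescale using statistics learned elsewhere (assumes col_range > 0)
    return (data - col_min) / col_range

train = np.array([[1.0, 10.0], [3.0, 20.0], [5.0, 30.0]])
test = np.array([[2.0, 35.0]])

col_min, col_range = fit_min_max(train)
train_scaled = apply_min_max(train, col_min, col_range)
test_scaled = apply_min_max(test, col_min, col_range)

print(train_scaled)  # every column spans 0 to 1
print(test_scaled)   # test values can fall outside [0, 1]
```

Because the test data is scaled with the training statistics, a test value larger than anything seen in training ends up above 1.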

[Image: model architecture]

The code below defines the architecture shown in the image above.

batch_size = 50

# placeholder
x_data = tf.placeholder(shape=[None, 3], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)

# parameters
W1 = tf.Variable(tf.random_normal(shape=[3, 6]))
b1 = tf.Variable(tf.random_normal(shape=[6]))
hidden = tf.nn.relu(tf.add(tf.matmul(x_data, W1), b1))

W2 = tf.Variable(tf.random_normal(shape=[6, 1]))
b2 = tf.Variable(tf.random_normal(shape=[1]))
output = tf.nn.relu(tf.add(tf.matmul(hidden, W2), b2))

# loss
loss = tf.reduce_mean(tf.square(y_target - output))

# optimize
optimizer = tf.train.GradientDescentOptimizer(0.005)
train_step = optimizer.minimize(loss)

TensorFlow lets us write the model’s architecture much as it appears in the diagram. The correspondence is as follows.

  • blue circles: x_data
  • orange circle: output (compared against y_target in the loss)
  • arrows from input layer to hidden layer: W1
  • arrows from hidden layer to output layer: W2
  • arrows from input layer’s red circle to hidden layer: b1
  • arrows from hidden layer’s red circle to output layer: b2
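As a quick sanity check on this correspondence, the sketch below counts the trainable parameters, one per arrow. The layer sizes are taken from the variable shapes in the code above; the variable names are only for this example.

```python
# Layer sizes taken from the code above: 3 inputs, 6 hidden units, 1 output.
n_in, n_hidden, n_out = 3, 6, 1

w1_count = n_in * n_hidden   # arrows from input layer to hidden layer (W1)
b1_count = n_hidden          # arrows from the input layer's bias node (b1)
w2_count = n_hidden * n_out  # arrows from hidden layer to output layer (W2)
b2_count = n_out             # arrow from the hidden layer's bias node (b2)

total = w1_count + b1_count + w2_count + b2_count
print(total)  # 31
```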

The code below trains the model.

with tf.Session() as sess:
    # initialize variables
    init = tf.global_variables_initializer()
    sess.run(init)

    train_loss = []
    test_loss = []
    for i in range(500):
        # index for training
        random_index = np.random.choice(len(x_train), size=batch_size)

        # prepare data
        random_x = x_train[random_index]
        random_y = np.transpose([y_train[random_index]])

        sess.run(train_step, feed_dict={x_data: random_x, y_target: random_y})

        # record train and test loss
        temp_train_loss = sess.run(loss, feed_dict={x_data: random_x, y_target: random_y})
        temp_test_loss = sess.run(loss, feed_dict={x_data: x_test, y_target: np.transpose([y_test])})

        # store the square root (RMSE); np.sqrt avoids adding new ops to the graph on every step
        train_loss.append(np.sqrt(temp_train_loss))
        test_loss.append(np.sqrt(temp_test_loss))

        if i % 50 == 0:
            print(str(i) + ':' + str([temp_train_loss, temp_test_loss]))

        if i == 500 - 1:
            pred = sess.run(output, feed_dict={x_data: x_test})
            pred_list = [x[0] for x in pred]
            print(np.transpose(np.array([y_test, pred_list])))

The training and test losses are stored in lists. We can check how training went by plotting them.
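Two details of the training loop are easy to miss, so here is a small NumPy sketch of them: how a mini-batch is drawn, and how a 1-D target array becomes a column vector. The array `y_train` below is stand-in data rather than the iris targets, and the all-zero `pred` is just a dummy prediction.

```python
import numpy as np

np.random.seed(2)

y_train = np.arange(10, dtype=float)  # stand-in targets, not the iris data
batch_size = 4

# np.random.choice samples with replacement by default,
# so the same index may appear more than once in a batch.
random_index = np.random.choice(len(y_train), size=batch_size)

# Wrapping in a list and transposing turns shape (4,) into (4, 1),
# which matches a placeholder of shape [None, 1].
random_y = np.transpose([y_train[random_index]])
print(random_y.shape)  # (4, 1)

# The stored loss values are square roots of the mean squared error (RMSE).
pred = np.zeros_like(random_y)  # dummy predictions for illustration
rmse = np.sqrt(np.mean(np.square(random_y - pred)))
print(rmse)
```

Taking the square root puts the stored loss in the same units as the target, which makes the plotted curves easier to read.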

plt.plot(train_loss, 'k-', label='train loss')
plt.plot(test_loss, 'r--', label='test loss')
plt.legend(loc='upper right')
plt.show()



Reference


The articles on my blog are related to the contents here.

The book TensorFlow Machine Learning Cookbook has basic information and many tips for using TensorFlow well.


All the code

from sklearn import datasets
iris = datasets.load_iris()
print(iris.data[:5])

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# set random seeds for reproducibility
seed = 2
tf.set_random_seed(seed)
np.random.seed(seed)

# data
iris = datasets.load_iris()
x = iris.data[:, 0:3]  # first 3 columns as features
y = iris.data[:, 3]    # last column as target

x_train, x_test, y_train, y_test = train_test_split(x, y, train_size=0.7)

# normalization
mms = MinMaxScaler()
x_train = mms.fit_transform(x_train)
x_test = mms.transform(x_test)  # reuse the scaling fitted on the training data

batch_size = 50

# placeholder
x_data = tf.placeholder(shape=[None, 3], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)

# parameters
W1 = tf.Variable(tf.random_normal(shape=[3, 6]))
b1 = tf.Variable(tf.random_normal(shape=[6]))
hidden = tf.nn.relu(tf.add(tf.matmul(x_data, W1), b1))

W2 = tf.Variable(tf.random_normal(shape=[6, 1]))
b2 = tf.Variable(tf.random_normal(shape=[1]))
output = tf.nn.relu(tf.add(tf.matmul(hidden, W2), b2))

# loss
loss = tf.reduce_mean(tf.square(y_target - output))

# optimize
optimizer = tf.train.GradientDescentOptimizer(0.005)
train_step = optimizer.minimize(loss)

with tf.Session() as sess:
    # initialize variables
    init = tf.global_variables_initializer()
    sess.run(init)

    train_loss = []
    test_loss = []
    for i in range(500):
        # index for training
        random_index = np.random.choice(len(x_train), size=batch_size)

        # prepare data
        random_x = x_train[random_index]
        random_y = np.transpose([y_train[random_index]])

        sess.run(train_step, feed_dict={x_data: random_x, y_target: random_y})

        # record train and test loss
        temp_train_loss = sess.run(loss, feed_dict={x_data: random_x, y_target: random_y})
        temp_test_loss = sess.run(loss, feed_dict={x_data: x_test, y_target: np.transpose([y_test])})

        # store the square root (RMSE); np.sqrt avoids adding new ops to the graph on every step
        train_loss.append(np.sqrt(temp_train_loss))
        test_loss.append(np.sqrt(temp_test_loss))

        if i % 50 == 0:
            print(str(i) + ':' + str([temp_train_loss, temp_test_loss]))

        if i == 500 - 1:
            pred = sess.run(output, feed_dict={x_data: x_test})
            pred_list = [x[0] for x in pred]
            print(np.transpose(np.array([y_test, pred_list])))

plt.plot(train_loss, 'k-', label='train loss')
plt.plot(test_loss, 'r--', label='test loss')
plt.legend(loc='upper right')
plt.show()