Supervised learning

Linear Regression routines

touvlo.supv.lin_rg.cost_func(X, y, theta)[source]

Computes the cost function J for Linear Regression.

Parameters:
  • X (numpy.array) – Features’ dataset plus bias column.
  • y (numpy.array) – Column vector of expected values.
  • theta (numpy.array) – Column vector of model’s parameters.
Returns:

Computed cost.

Return type:

float
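As an illustration of the quantity described above, the following is a minimal sketch that assumes the conventional squared-error cost J = 1/(2m) · Σ(Xθ − y)². The function name `squared_error_cost` is hypothetical and is not part of touvlo; the library's exact scaling may differ.

```python
import numpy as np


def squared_error_cost(X, y, theta):
    """Hypothetical stand-in: J = 1/(2m) * sum((X @ theta - y) ** 2)."""
    m = y.shape[0]                      # number of training examples
    residual = X @ theta - y            # column vector of prediction errors
    return float((residual.T @ residual).item() / (2 * m))


# First column of X is the bias column of ones.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([[1.0], [2.0], [3.0]])

cost_fit = squared_error_cost(X, y, np.array([[0.0], [1.0]]))  # exact fit
cost_zero = squared_error_cost(X, y, np.zeros((2, 1)))         # all-zero theta
```

With an exact fit the cost is zero; any other theta yields a positive value.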

touvlo.supv.lin_rg.grad(X, y, theta)[source]

Computes the gradient for Linear Regression.

Parameters:
  • X (numpy.array) – Features’ dataset plus bias column.
  • y (numpy.array) – Column vector of expected values.
  • theta (numpy.array) – Column vector of model’s parameters.
Returns:

Gradient column vector.

Return type:

numpy.array
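The gradient of the conventional squared-error cost can be sketched as (1/m) · Xᵀ(Xθ − y). The snippet below assumes that form; `squared_error_grad` is a hypothetical name used only for illustration, not the library's implementation.

```python
import numpy as np


def squared_error_grad(X, y, theta):
    """Hypothetical stand-in: gradient of 1/(2m) * sum((X @ theta - y) ** 2)."""
    m = y.shape[0]
    return X.T @ (X @ theta - y) / m    # column vector, one entry per parameter


X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # bias column first
y = np.array([[1.0], [2.0], [3.0]])

g_zero = squared_error_grad(X, y, np.zeros((2, 1)))
g_fit = squared_error_grad(X, y, np.array([[0.0], [1.0]]))
```

The result has the same shape as theta, and it vanishes at a theta that fits the data exactly.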

touvlo.supv.lin_rg.h(X, theta)[source]

Linear regression hypothesis.

Parameters:
  • X (numpy.array) – Features’ dataset plus bias column.
  • theta (numpy.array) – Column vector of model’s parameters.
Returns:

The projected value for each line of the dataset.

Return type:

numpy.array

touvlo.supv.lin_rg.normal_eqn(X, y)[source]

Produces optimal theta via normal equation.

Parameters:
  • X (numpy.array) – Features’ dataset plus bias column.
  • y (numpy.array) – Column vector of expected values.
Raises:

LinAlgError

Returns:

Optimized model parameters theta.

Return type:

numpy.array
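The closed-form solution can be sketched as θ = (XᵀX)⁻¹Xᵀy. The snippet below is an assumption about the general technique, not the library's code; `normal_equation` is a hypothetical name. It uses `numpy.linalg.inv`, which raises `numpy.linalg.LinAlgError` when XᵀX is singular, matching the exception listed above.

```python
import numpy as np


def normal_equation(X, y):
    """Hypothetical stand-in: theta = inv(X^T X) @ X^T @ y."""
    return np.linalg.inv(X.T @ X) @ X.T @ y


X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # bias column first
y = np.array([[1.0], [2.0], [3.0]])
theta_opt = normal_equation(X, y)                   # close to [[0], [1]]

# Two identical columns make X^T X singular, so inversion fails.
try:
    normal_equation(np.ones((2, 2)), np.ones((2, 1)))
    singular_detected = False
except np.linalg.LinAlgError:
    singular_detected = True
```

When singular inputs are expected, a pseudo-inverse (`numpy.linalg.pinv`) is a common alternative that avoids the exception.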

touvlo.supv.lin_rg.predict(X, theta)[source]

Computes prediction vector.

Parameters:
  • X (numpy.array) – Features’ dataset plus bias column.
  • theta (numpy.array) – Column vector of model’s parameters.
Returns:

Vector with predictions for each input line.

Return type:

numpy.array

touvlo.supv.lin_rg.reg_cost_func(X, y, theta, _lambda)[source]

Computes the regularized cost function J for Linear Regression.

Parameters:
  • X (numpy.array) – Features’ dataset plus bias column.
  • y (numpy.array) – Column vector of expected values.
  • theta (numpy.array) – Column vector of model’s parameters.
  • _lambda (float) – The regularization hyperparameter.
Returns:

Computed cost with regularization.

Return type:

float

touvlo.supv.lin_rg.reg_grad(X, y, theta, _lambda)[source]

Computes the regularized gradient for Linear Regression.

Parameters:
  • X (numpy.array) – Features’ dataset plus bias column.
  • y (numpy.array) – Column vector of expected values.
  • theta (numpy.array) – Column vector of model’s parameters.
  • _lambda (float) – The regularization hyperparameter.
Returns:

Regularized gradient column vector.

Return type:

numpy.array
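A minimal sketch of the two regularized routines above, assuming L2 (ridge) regularization in which the bias parameter (first entry of theta) is not penalized. That convention, the 1/(2m) scaling, and the names `ridge_cost` and `ridge_grad` are assumptions made for illustration; the library may treat the bias term differently.

```python
import numpy as np


def ridge_cost(X, y, theta, lam):
    """Hypothetical stand-in: squared-error cost plus lam/(2m) * sum(theta[1:] ** 2)."""
    m = y.shape[0]
    residual = X @ theta - y
    fit_term = (residual.T @ residual).item() / (2 * m)
    penalty = lam / (2 * m) * float(np.sum(theta[1:] ** 2))  # bias excluded
    return fit_term + penalty


def ridge_grad(X, y, theta, lam):
    """Hypothetical stand-in: gradient of ridge_cost with respect to theta."""
    m = y.shape[0]
    grad = X.T @ (X @ theta - y) / m
    penalty = (lam / m) * theta         # new array, theta itself is untouched
    penalty[0] = 0.0                    # assumption: bias is not regularized
    return grad + penalty


X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([[1.0], [2.0], [3.0]])
theta = np.array([[1.0], [1.0]])

reg_cost = ridge_cost(X, y, theta, lam=3.0)
reg_gradient = ridge_grad(X, y, theta, lam=3.0)
```

Setting `lam` to zero reduces both functions to their unregularized counterparts.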

Logistic Regression routines

touvlo.supv.lgx_rg.cost_func(X, y, theta)[source]

Computes the cost function J for Logistic Regression.

Parameters:
  • X (numpy.array) – Features’ dataset plus bias column.
  • y (numpy.array) – Column vector of expected values.
  • theta (numpy.array) – Column vector of model’s parameters.
Returns:

Computed cost.

Return type:

float
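For reference, the logistic cost is conventionally the mean cross-entropy between the labels and the sigmoid of Xθ. The sketch below assumes that form; `sigmoid` and `cross_entropy_cost` are hypothetical names and not the library's own functions.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))


def cross_entropy_cost(X, y, theta):
    """Hypothetical stand-in: -1/m * sum(y*log(p) + (1-y)*log(1-p))."""
    m = y.shape[0]
    prob = sigmoid(X @ theta)           # predicted probability of class 1
    total = (y.T @ np.log(prob) + (1 - y).T @ np.log(1 - prob)).item()
    return -total / m


X = np.array([[1.0, -2.0], [1.0, 0.5], [1.0, 3.0]])  # bias column first
y = np.array([[0.0], [1.0], [1.0]])

# With theta = 0 every probability is 0.5, so the cost equals ln(2).
cost_at_zero = cross_entropy_cost(X, y, np.zeros((2, 1)))
```

This sketch does not guard against `log(0)`; probabilities that saturate at exactly 0 or 1 would need clipping.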

touvlo.supv.lgx_rg.grad(X, y, theta)[source]

Computes the gradient for the parameters theta.

Parameters:
  • X (numpy.array) – Features’ dataset plus bias column.
  • y (numpy.array) – Column vector of expected values.
  • theta (numpy.array) – Column vector of model’s parameters.
Returns:

Gradient column vector.

Return type:

numpy.array

touvlo.supv.lgx_rg.h(X, theta)[source]

Logistic regression hypothesis.

Parameters:
  • X (numpy.array) – Features’ dataset plus bias column.
  • theta (numpy.array) – Column vector of model’s parameters.
Raises:

ValueError

Returns:

The probability that each entry belongs to class 1.

Return type:

numpy.array

touvlo.supv.lgx_rg.p(x, threshold=0.5)[source]

Predicts whether a probability falls into class 1.

Parameters:
  • x (obj) – Probability that example belongs to class 1.
  • threshold (float) – Point above which a probability is deemed to be of class 1.
Returns:

Binary value denoting class 1 or 0.

Return type:

int
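The thresholding step can be sketched as a single comparison. The name `to_class` is hypothetical, and treating a probability exactly equal to the threshold as class 1 is an assumption; the library's boundary handling may differ.

```python
def to_class(prob, threshold=0.5):
    """Hypothetical stand-in: map a probability to a 0/1 class label."""
    # Assumption: a probability equal to the threshold counts as class 1.
    return int(prob >= threshold)


labels = [to_class(p) for p in (0.1, 0.5, 0.9)]
strict_label = to_class(0.6, threshold=0.7)
```

Raising the threshold trades recall of class 1 for fewer false positives.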

touvlo.supv.lgx_rg.predict(X, theta)[source]

Classifies each entry as class 1 or class 0.

Parameters:
  • X (numpy.array) – Features’ dataset plus bias column.
  • theta (numpy.array) – Column vector of model’s parameters.
Returns:

Column vector with each entry classification.

Return type:

numpy.array

touvlo.supv.lgx_rg.predict_prob(X, theta)[source]

Produces the probability that the entries belong to class 1.

Parameters:
  • X (numpy.array) – Features’ dataset plus bias column.
  • theta (numpy.array) – Column vector of model’s parameters.
Raises:

ValueError

Returns:

The probability that each entry belongs to class 1.

Return type:

numpy.array

touvlo.supv.lgx_rg.reg_cost_func(X, y, theta, _lambda)[source]

Computes the regularized cost function J for Logistic Regression.

Parameters:
  • X (numpy.array) – Features’ dataset plus bias column.
  • y (numpy.array) – Column vector of expected values.
  • theta (numpy.array) – Column vector of model’s parameters.
  • _lambda (float) – The regularization hyperparameter.
Returns:

Computed cost with regularization.

Return type:

float

touvlo.supv.lgx_rg.reg_grad(X, y, theta, _lambda)[source]

Computes the regularized gradient for Logistic Regression.

Parameters:
  • X (numpy.array) – Features’ dataset plus bias column.
  • y (numpy.array) – Column vector of expected values.
  • theta (numpy.array) – Column vector of model’s parameters.
  • _lambda (float) – The regularization hyperparameter.
Returns:

Regularized gradient column vector.

Return type:

numpy.array

Classification Neural Network routines

touvlo.supv.nn_clsf.back_propagation(y, theta, a, z, num_labels, n_hidden_layers=1)[source]

Applies back propagation to minimize model’s loss.

Parameters:
  • y (numpy.array) – Column vector of expected values.
  • theta (numpy.array(numpy.array)) – Array of model’s weight matrices by layer.
  • a (numpy.array(numpy.array)) – Array of activation matrices by layer.
  • z (numpy.array(numpy.array)) – Array of pre-activation values (inputs to the sigmoid) by layer.
  • num_labels (int) – Number of classes in multiclass classification.
  • n_hidden_layers (int) – Number of hidden layers in network.
Returns:

Array of matrices of ‘error values’ by layer.

Return type:

numpy.array(numpy.array)

touvlo.supv.nn_clsf.cost_function(X, y, theta, _lambda, num_labels, n_hidden_layers=1)[source]

Computes the cost function J for Neural Network.

Parameters:
  • X (numpy.array) – Features’ dataset.
  • y (numpy.array) – Column vector of expected values.
  • theta (numpy.array) – Column vector of model’s parameters.
  • _lambda (float) – The regularization hyperparameter.
  • num_labels (int) – Number of classes in multiclass classification.
  • n_hidden_layers (int) – Number of hidden layers in network.
Returns:

Computed cost.

Return type:

float

touvlo.supv.nn_clsf.feed_forward(X, theta, n_hidden_layers=1)[source]

Applies forward propagation to calculate model’s hypothesis.

Parameters:
  • X (numpy.array) – Features’ dataset.
  • theta (numpy.array) – Column vector of model’s parameters.
  • n_hidden_layers (int) – Number of hidden layers in network.
Returns:

A 2-tuple consisting of an array of parameters prior to activation by layer and an array of activation matrices by layer.

Return type:

(numpy.array(numpy.array), numpy.array(numpy.array))
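Forward propagation through a sigmoid network can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `forward_pass` is a hypothetical name, the weights are passed as a Python list with `weights[l]` of shape (units_out, units_in + 1), and a bias column of ones is prepended to every layer's activations except the output.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))


def forward_pass(X, weights):
    """Hypothetical stand-in: return (z, a), the pre-activations and activations by layer."""
    m = X.shape[0]
    a = [np.hstack([np.ones((m, 1)), X])]   # input layer plus bias column
    z = []
    for layer, W in enumerate(weights):
        z_layer = a[-1] @ W.T               # values prior to activation
        z.append(z_layer)
        act = sigmoid(z_layer)
        if layer < len(weights) - 1:        # hidden layers get a bias column
            act = np.hstack([np.ones((m, 1)), act])
        a.append(act)
    return z, a


X = np.array([[0.0, 0.0], [1.0, 1.0]])              # 2 examples, 2 features
weights = [np.zeros((3, 3)), np.zeros((2, 4))]      # 2 -> 3 -> 2 network
z, a = forward_pass(X, weights)
```

With all-zero weights every pre-activation is 0, so each output probability is 0.5.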

touvlo.supv.nn_clsf.grad(X, y, nn_params, _lambda, input_layer_size, hidden_layer_size, num_labels, n_hidden_layers=1)[source]

Calculates gradient of neural network’s parameters.

Parameters:
  • X (numpy.array) – Features’ dataset.
  • y (numpy.array) – Column vector of expected values.
  • nn_params (numpy.array) – Column vector of model’s parameters.
  • _lambda (float) – The regularization hyperparameter.
  • input_layer_size (int) – Number of units in the input layer.
  • hidden_layer_size (int) – Number of units in a hidden layer.
  • num_labels (int) – Number of classes in multiclass classification.
  • n_hidden_layers (int) – Number of hidden layers in network.
Returns:

Array of gradient values by weight matrix.

Return type:

numpy.array(numpy.array)

touvlo.supv.nn_clsf.h(X, theta, n_hidden_layers=1)[source]

Classification Neural Network hypothesis.

Parameters:
  • X (numpy.array) – Features’ dataset.
  • theta (numpy.array) – Column vector of model’s parameters.
  • n_hidden_layers (int) – Number of hidden layers in network.
Returns:

The probability that each entry belongs to class 1.

Return type:

numpy.array

touvlo.supv.nn_clsf.init_nn_weights(input_layer_size, hidden_layer_size, num_labels, n_hidden_layers=1)[source]

Initialize the weight matrices of a network with random values.

Parameters:
  • hidden_layer_size (int) – Number of units in a hidden layer.
  • input_layer_size (int) – Number of units in the input layer.
  • num_labels (int) – Number of classes in multiclass classification.
  • n_hidden_layers (int) – Number of hidden layers in network.
Returns:

Array of weight matrices of random values.

Return type:

numpy.array(numpy.array)

touvlo.supv.nn_clsf.rand_init_weights(L_in, L_out)[source]

Initializes weight matrix with random values.

Parameters:
  • L_in (int) – Number of units in previous layer.
  • L_out (int) – Number of units in next layer.
Returns:

Random values’ matrix of conforming dimensions.

Return type:

numpy.array
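One common way to break symmetry is to draw each weight uniformly from a small interval around zero. The sketch below assumes that scheme, an output shape of (L_out, L_in + 1) to account for the bias unit, and an interval half-width of 0.12; `random_weights`, `epsilon`, and `rng` are hypothetical names, and the library's distribution and shape may differ.

```python
import numpy as np


def random_weights(l_in, l_out, epsilon=0.12, rng=None):
    """Hypothetical stand-in: uniform weights in [-epsilon, epsilon]."""
    rng = np.random.default_rng() if rng is None else rng
    # One extra column for the bias unit of the previous layer (assumption).
    return rng.uniform(-epsilon, epsilon, size=(l_out, l_in + 1))


W = random_weights(4, 3, rng=np.random.default_rng(0))
```

Passing a seeded generator makes the initialization reproducible across runs.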

touvlo.supv.nn_clsf.unravel_params(nn_params, input_layer_size, hidden_layer_size, num_labels, n_hidden_layers=1)[source]

Unravels flattened array into list of weight matrices.

Parameters:
  • nn_params (numpy.array) – Row vector of model’s parameters.
  • input_layer_size (int) – Number of units in the input layer.
  • hidden_layer_size (int) – Number of units in a hidden layer.
  • num_labels (int) – Number of classes in multiclass classification.
  • n_hidden_layers (int) – Number of hidden layers in network.
Returns:

Array with model’s weight matrices.

Return type:

numpy.array(numpy.array)
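Unravelling amounts to slicing the flat vector into consecutive chunks and reshaping each one. The sketch below assumes row-major ordering and per-layer shapes of (units_out, units_in + 1); both conventions, and the name `unravel`, are assumptions made for illustration rather than the library's documented layout.

```python
import numpy as np


def unravel(flat, input_size, hidden_size, num_labels, n_hidden_layers=1):
    """Hypothetical stand-in: split a flat parameter vector into weight matrices."""
    shapes = [(hidden_size, input_size + 1)]                     # input -> hidden
    shapes += [(hidden_size, hidden_size + 1)] * (n_hidden_layers - 1)
    shapes.append((num_labels, hidden_size + 1))                 # hidden -> output
    matrices, start = [], 0
    for rows, cols in shapes:
        end = start + rows * cols
        matrices.append(flat[start:end].reshape(rows, cols))     # row-major
        start = end
    return matrices


# 2 inputs, 3 hidden units, 2 labels: 3*3 + 2*4 = 17 parameters.
flat = np.arange(17, dtype=float)
mats = unravel(flat, input_size=2, hidden_size=3, num_labels=2)
roundtrip = np.concatenate([m.ravel() for m in mats])
```

Flattening the matrices in the same order recovers the original vector, which is what lets generic optimizers work on a single parameter array.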