symjax.nn.losses

vae(x, x_hat, q_mean, q_cov[, z_mean, …]) Maps N samples of dimension D to a K-dimensional latent space with Gaussian distributions.
vae_gmm(x, x_hat, z_mu, z_logvar, mu, …[, …]) Maps N samples of dimension D to a latent space of C clusters in K dimensions.
vae_comp_gmm(x, x_hat, z_mu, z_logvar, mu, …) Maps N samples of dimension D to a latent space of I pieces, each of C clusters, in K dimensions.
sparse_softmax_crossentropy_logits(p, q) Cross entropy loss given that \(p\) is sparse and \(q\) is the log-probability.
softmax_crossentropy_logits(p, q) Cross entropy loss for a dense \(p\); see sparse_softmax_crossentropy_logits.
sigmoid_crossentropy_logits
accuracy(targets, predictions) classification accuracy.
clustering_accuracy(labels, predictions, …) accuracy of a clustering based on intra-cluster majority labels.
huber(targets, predictions[, delta]) huber loss (regression).
explained_variance(y, ypred[, axis, epsilon]) Computes fraction of variance that ypred explains about y.
hinge_loss(predictions, targets[, delta]) (binary) hinge loss.
multiclass_hinge_loss(predictions, targets) multi-class hinge loss.
squared_differences(x, y) elementwise squared differences.

Detailed Descriptions

symjax.nn.losses.vae(x, x_hat, q_mean, q_cov, z_mean=None, z_cov=None, x_cov=None)[source]

Maps N samples of dimension D to a K-dimensional latent space with Gaussian distributions.

Parameters:
  • x (array) – should be of shape (N, D)
  • x_hat (array) – should be of shape (N, D)
  • q_mean (array) – should be of shape (N, K), inferred mean of the variational Gaussian
  • q_cov (array) – should be of shape (N, K), inferred log-variance of the variational Gaussian
  • z_mean (array) – should be of shape (K,), mean of z variable
  • z_cov (array) – should be of shape (K,), logstd of z variable
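
For intuition, here is a minimal NumPy sketch of a standard Gaussian-VAE objective with these shapes. It is an illustrative assumption, not the symjax implementation: it assumes q_cov holds log-variances, a unit-variance Gaussian likelihood for x, and a standard-normal prior (the case where z_mean and z_cov are omitted); the name vae_loss_sketch is hypothetical.

    import numpy as np

    def vae_loss_sketch(x, x_hat, q_mean, q_logvar):
        # Gaussian reconstruction with unit variance: 0.5 * ||x - x_hat||^2 per sample
        rec = 0.5 * np.sum((x - x_hat) ** 2, axis=1)
        # closed-form KL( N(q_mean, exp(q_logvar)) || N(0, I) ) per sample
        kl = 0.5 * np.sum(np.exp(q_logvar) + q_mean ** 2 - 1.0 - q_logvar, axis=1)
        return np.mean(rec + kl)

    x = np.random.randn(8, 4)                # N=8 samples of dimension D=4
    x_hat = x + 0.1 * np.random.randn(8, 4)  # reconstruction
    q_mean = np.random.randn(8, 2)           # K=2 latent dimensions
    q_logvar = np.zeros((8, 2))
    print(vae_loss_sketch(x, x_hat, q_mean, q_logvar))
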
symjax.nn.losses.vae_gmm(x, x_hat, z_mu, z_logvar, mu, logvar, logpi, logvar_x=0.0, eps=1e-08)[source]

Maps N samples of dimension D to a latent space of C clusters in K dimensions.

Parameters:
  • x (array) – should be of shape (N, D)
  • x_hat (array) – should be of shape (N, D)
  • z_mu (array) – should be of shape (N, K), inferred mean of the variational Gaussian
  • z_logvar (array) – should be of shape (N, K), inferred log-variance of the variational Gaussian
  • mu (array) – should be of shape (C, K), parameter (centroids)
  • logvar (array) – should be of shape (C, K), parameter (logvar of clusters)
  • logpi (array) – should be of shape (C,), parameter (prior of clusters)
  • logvar_x (float, default 0.0) –
  • eps (float, default 1e-08) –
symjax.nn.losses.vae_comp_gmm(x, x_hat, z_mu, z_logvar, mu, logvar, logpi, logvar_x=0.0, eps=1e-08)[source]

Maps N samples of dimension D to a latent space of I pieces, each of C clusters, in K dimensions.

Parameters:
  • x (array) – should be of shape (N, D)
  • x_hat (array) – should be of shape (N, D)
  • z_mu (array) – should be of shape (N, I, K), inferred mean of the variational Gaussian
  • z_logvar (array) – should be of shape (N, I, K), inferred log-variance of the variational Gaussian
  • mu (array) – should be of shape (I, C, K), parameter (centroids)
  • logvar (array) – should be of shape (I, C, K), parameter (logvar of clusters)
  • logpi (array) – should be of shape (I, C), parameter (prior of clusters)
  • logvar_x (float, default 0.0) –
  • eps (float, default 1e-08) –
symjax.nn.losses.sparse_softmax_crossentropy_logits(p, q)[source]

Cross entropy loss given that \(p\) is sparse and \(q\) is the log-probability.

The formal definition, given that \(p\) is now an index (of the Dirac) such that \(p\in \{1,\dots,D\}\) and \(q\) is an unnormalized log-probability vector, is (discrete variables, sparse \(p\))

\[\mathcal{L}(p,q)=-q_{p}+\log(\sum_{d=1}^D \exp(q_d))\]
\[\mathcal{L}(p,q)=-q_{p}+LogSumExp(q)\]
\[\mathcal{L}(p,q)=-q_{p}+LogSumExp(q-\max_{d}q_d)\]

or, for a non-sparse (dense) \(p\),

\[\mathcal{L}(p,q)=-\sum_{d=1}^Dp_{d}q_{d}+\log(\sum_{d=1}^D \exp(q_d))\]
\[\mathcal{L}(p,q)=-\sum_{d=1}^Dp_{d}q_{d}+LogSumExp(q)\]
\[\mathcal{L}(p,q)=-\sum_{d=1}^Dp_{d}q_{d}+LogSumExp(q-\max_{d}q_d)\]

with \(p\) the class index and \(q\) the predicted (unnormalized) output of the network. The non-sparse variant, softmax_crossentropy_logits, instead takes two dense vectors, in which case \(p\) should be nonnegative and sum to one.
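
The max-shifted LogSumExp form above is the numerically stable way to evaluate the loss. A small NumPy sketch of it (illustrative only, not the symjax source; the function name is hypothetical):

    import numpy as np

    def sparse_softmax_xent_sketch(p, q):
        # p: integer class indices of shape (N,); q: unnormalized logits of shape (N, D)
        q_shift = q - q.max(axis=1, keepdims=True)          # subtract max_d q_d for stability
        logsumexp = np.log(np.exp(q_shift).sum(axis=1))     # LogSumExp(q - max_d q_d)
        return -q_shift[np.arange(len(p)), p] + logsumexp   # -q_p + LogSumExp(q)

    q = np.array([[2.0, 1.0, 0.1], [0.2, 3.0, 0.3]])
    p = np.array([0, 1])
    print(sparse_softmax_xent_sketch(p, q))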

symjax.nn.losses.softmax_crossentropy_logits(p, q)[source]

See sparse_softmax_crossentropy_logits; this is the variant for a dense (non-sparse) \(p\).
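
The dense counterpart replaces the single index with a probability vector (again a hypothetical sketch under the same assumptions):

    import numpy as np

    def softmax_xent_sketch(p, q):
        # p: rows are nonnegative and sum to one; q: unnormalized logits; both of shape (N, D)
        q_shift = q - q.max(axis=1, keepdims=True)
        logsumexp = np.log(np.exp(q_shift).sum(axis=1))
        return -(p * q_shift).sum(axis=1) + logsumexp       # -sum_d p_d q_d + LogSumExp(q)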

symjax.nn.losses.accuracy(targets, predictions)[source]

classification accuracy.

It is computed by averaging the 0-1 loss as in

\[\frac{1}{N}\sum_{n=1}^N 1_{\{y_n = p_n\}}\]

where \(p\) denotes the predictions. The inputs must be vectors; in the special case where targets is a vector and predictions is a matrix, the argmax over the columns of predictions is used as the predicted class, as in

\[\frac{1}{N}\sum_{n=1}^N 1_{\{y_n = \arg\max p_{n,:}\}}\]

Wikipedia

Parameters:
  • targets (1D tensor-like) –
  • predictions (tensor-like) – it can be a \(2D\) matrix in which case the argmax is used to get the prediction
Returns:

the classification accuracy

Return type:

tensor-like
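
A minimal sketch of the behaviour described above (illustrative only; accuracy_sketch is a hypothetical name):

    import numpy as np

    def accuracy_sketch(targets, predictions):
        targets = np.asarray(targets)
        predictions = np.asarray(predictions)
        if predictions.ndim == 2:                       # matrix of scores or probabilities
            predictions = predictions.argmax(axis=1)    # argmax gives the predicted class
        return np.mean(targets == predictions)          # average of the 0-1 loss

    print(accuracy_sketch([0, 1, 2], [[0.9, 0.1, 0.0],
                                      [0.2, 0.7, 0.1],
                                      [0.3, 0.3, 0.4]]))  # 1.0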

symjax.nn.losses.clustering_accuracy(labels, predictions, n_clusters)[source]

accuracy of a clustering based on intra-cluster majority labels

This accuracy quantifies the ability of a clustering algorithm to solve the clustering task given the true labels of the data. The function finds, for each predicted cluster, the most frequent true label and uses it as the cluster label. The accuracy is then computed with respect to those cluster labels.

Parameters:
  • labels (1D integer Tensor) – the true labels of the data
  • predictions (1D integer Tensor) – the predicted data clusters
  • n_clusters (int) – the number of clusters
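
A sketch of the majority-label mapping described above (illustrative, not the symjax implementation; the name is hypothetical):

    import numpy as np

    def clustering_accuracy_sketch(labels, predictions, n_clusters):
        labels = np.asarray(labels)
        predictions = np.asarray(predictions)
        mapped = np.empty_like(labels)
        for c in range(n_clusters):
            members = predictions == c
            if members.any():
                # the most frequent true label inside cluster c becomes the cluster label
                mapped[members] = np.bincount(labels[members]).argmax()
        return np.mean(mapped == labels)

    labels = np.array([0, 0, 1, 1, 2, 2])
    predictions = np.array([2, 2, 0, 0, 1, 1])   # a relabelled but otherwise perfect clustering
    print(clustering_accuracy_sketch(labels, predictions, 3))  # 1.0
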
symjax.nn.losses.huber(targets, predictions, delta=1.0)[source]

huber loss (regression).

For each value x in error=targets-predictions, the following is calculated:

  • \(0.5 \times x^2\) if \(|x| \le Δ\)
  • \(0.5 \times Δ^2 + Δ \times (|x| - Δ)\) if \(|x| > Δ\)

leading to the piecewise curve shown below.

Figure: the Huber loss as a function of the error (../_images/losses-1.png).

Wikipedia

Parameters:
  • targets – The ground truth output tensor, same dimensions as ‘predictions’.
  • predictions – The predicted outputs.
  • delta (Δ) – float, the point where the Huber loss changes from quadratic to linear.
Returns:

loss float, this has the same shape as targets
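
A direct NumPy transcription of the piecewise definition above (sketch only; huber_sketch is a hypothetical name):

    import numpy as np

    def huber_sketch(targets, predictions, delta=1.0):
        error = targets - predictions
        abs_err = np.abs(error)
        quadratic = 0.5 * error ** 2                              # used when |x| <= delta
        linear = 0.5 * delta ** 2 + delta * (abs_err - delta)     # used when |x| >  delta
        return np.where(abs_err <= delta, quadratic, linear)

    print(huber_sketch(np.array([0.0, 0.0]), np.array([0.5, 3.0])))  # [0.125  2.5]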

symjax.nn.losses.explained_variance(y, ypred, axis=None, epsilon=1e-06)[source]

Computes fraction of variance that ypred explains about y. The formula is

\[1 - Var[y-ypred] / Var[y]\]

and in the special case of centered targets and predictions it becomes

\[1 - \|y-ypred\|^2_2 / \|y\|_2^2\]

hence it can be seen as an \(\ell_2\) loss rescaled by the energy in the targets.

Interpretation:

  • ev = 0 => might as well have predicted zero
  • ev = 1 => perfect prediction
  • ev < 0 => worse than just predicting zero

Parameters:
  • y (Tensor-like) – true target
  • ypred (Tensor-like) – prediction
  • axis (integer or None, default=None) – the axis along which to compute the variance; by default all axes are used
  • epsilon (ϵ) – float (default=1e-6), the constant added to the denominator:

\[1 - Var[y-ypred] / (Var[y]+ϵ)\]

Note that this is not a symmetric function of y and ypred.
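
A sketch matching the formula above (with epsilon in the denominator as shown; the function name is hypothetical):

    import numpy as np

    def explained_variance_sketch(y, ypred, axis=None, epsilon=1e-6):
        # 1 - Var[y - ypred] / (Var[y] + epsilon)
        return 1.0 - np.var(y - ypred, axis=axis) / (np.var(y, axis=axis) + epsilon)

    y = np.array([1.0, 2.0, 3.0, 4.0])
    print(explained_variance_sketch(y, y))            # ~1: perfect prediction
    print(explained_variance_sketch(y, np.zeros(4)))  # ~0: no better than predicting zero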

symjax.nn.losses.hinge_loss(predictions, targets, delta=1)[source]

(binary) hinge loss.

For an intended output \(t = ±1\) and a classifier score \(p\), the hinge loss is defined for each datum as

\[\max ( 0 , Δ − t p)\]

As soon as the loss is smaller than \(Δ\), the datum is well classified; the margin is further increased by pushing the loss towards \(0\), hence \(Δ\) is the user-defined preferred margin to reach. In a standard SVM, \(Δ=1\), leading to the curve shown below.

Figure: the hinge loss with \(Δ=1\) (../_images/losses-2.png).

Note that \(p\) should be the “raw” output of the classifier’s decision function, not the predicted class label. For instance, in linear SVMs, \(p = \langle w, x \rangle + b\), where \(w, b\) are the parameters of the hyperplane and \(x\) is the input variable(s).

Parameters:
  • predictions (1D tensor) – raw predictions of the classifier
  • targets (1D tensor with values \(t\in\{-1,1\}\)) –
Returns:

An expression for the item-wise hinge loss

Return type:

1D tensor

Notes

This is an alternative to the binary cross-entropy loss for binary classification problems
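
A sketch of the item-wise binary hinge loss (illustrative only; hinge_loss_sketch is a hypothetical name):

    import numpy as np

    def hinge_loss_sketch(predictions, targets, delta=1.0):
        # predictions: raw classifier scores; targets: labels in {-1, +1}
        return np.maximum(0.0, delta - targets * predictions)

    print(hinge_loss_sketch(np.array([2.0, 0.3, -1.0]), np.array([1, 1, 1])))  # [0.  0.7 2. ]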

symjax.nn.losses.multiclass_hinge_loss(predictions, targets, delta=1)[source]

multi-class hinge loss.

\[L_i = \max_{j ≠ t_i} (0, p_j - p_{t_i} + Δ)\]
Parameters:
  • predictions (2D tensor) – Predictions in (0, 1), such as softmax output of a neural network, with data points in rows and class probabilities in columns.
  • targets (2D tensor or 1D tensor) – Either a vector of ints giving the correct class index per data point, or a 2D one-hot encoding of the correct class in the same layout as predictions (non-binary targets in [0, 1] do not work!)
  • delta (scalar, default 1) – The hinge loss margin
Returns:

An expression for the item-wise multi-class hinge loss

Return type:

1D tensor

Notes

This is an alternative to the categorical cross-entropy loss for multi-class classification problems
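
A sketch of the item-wise multi-class hinge loss for integer targets (illustrative only; the name is hypothetical):

    import numpy as np

    def multiclass_hinge_sketch(predictions, targets, delta=1.0):
        # predictions: (N, C) class scores; targets: (N,) integer class indices
        n = np.arange(len(targets))
        correct = predictions[n, targets]            # score of the true class p_{t_i}
        rest = predictions.copy()
        rest[n, targets] = -np.inf                   # exclude j == t_i from the max
        return np.maximum(0.0, rest.max(axis=1) - correct + delta)

    scores = np.array([[2.0, 0.5, 0.1],
                       [0.2, 0.1, 1.5]])
    print(multiclass_hinge_sketch(scores, np.array([1, 2])))  # [2.5 0. ]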

symjax.nn.losses.squared_differences(x, y)[source]

elementwise squared differences.

Computes element-wise

\[(x-y)^2\]

Broadcasting applies as in any elementwise operation.

Wikipedia

Parameters:
  • x (tensor-like) –
  • y (tensor-like) –
Returns:

the elementwise squared differences

Return type:

tensor-like
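
For example, broadcasting lets a column of values be compared against a row of values (a trivial illustrative sketch):

    import numpy as np

    x = np.array([[1.0], [2.0], [3.0]])   # shape (3, 1)
    y = np.array([0.0, 1.0])              # shape (2,)
    print((x - y) ** 2)                   # shape (3, 2) by broadcasting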