How can you train neural nets in Keras and have them share their losses while training?

Since you are not interested in trainable weights (I call them coefficients to distinguish them from trainable weights), you can concatenate the outputs and pass them as a single output to a custom loss function. That way the coefficients are available when training starts.

As mentioned, you should provide a custom loss function. Keras expects a loss function such as categorical_crossentropy to take exactly two arguments (y_true and y_pred), but ours also needs access to the extra parameters you are interested in, like coeffs and num_class. The standard trick is a closure: instantiate a wrapper function with the arguments you want, and have it return the inner, actual loss function, which is what you pass to compile.
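
To see the closure trick in isolation before the full example, here is a minimal sketch (make_loss, scale, and the squared-error body are illustrative stand-ins, not part of the actual solution):

import tensorflow as tf

def make_loss(scale):              # the outer function captures the extra parameter
    def loss(y_true, y_pred):      # Keras only ever calls this with two arguments
        return scale * tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)
    return loss

# model.compile(loss=make_loss(0.5), optimizer='adam')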

from tensorflow.keras.layers import Dense, Dropout, Input, Concatenate
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Model

from tensorflow.keras import backend as K


def categorical_crossentropy_base(coeffs, num_class):

    def categorical_crossentropy(y_true, y_pred, from_logits=False, label_smoothing=0):
        """Computes the categorical crossentropy loss.
      Args:
        y_true: tensor of true targets.
        y_pred: tensor of predicted targets.
        from_logits: Whether `y_pred` is expected to be a logits tensor. By default,
          we assume that `y_pred` encodes a probability distribution.
        label_smoothing: Float in [0, 1]. If > `0` then smooth the labels.
      Returns:
        Categorical crossentropy loss value.
        https://github.com/tensorflow/tensorflow/blob/v1.15.0/tensorflow/python/keras/losses.py#L938-L966
      """
        y_pred1 = y_pred[:, :num_class]  # the 1st prediction
        y_pred2 = y_pred[:, num_class:2*num_class]  # the 2nd prediction
        y_pred3 = y_pred[:, 2*num_class:]  # the 3rd prediction

        # the ground truth passed to the model must likewise contain all 3 targets, concatenated in the same order
        y_true1 = y_true[:, :num_class]  # the 1st gt
        y_true2 = y_true[:, num_class:2*num_class]  # the 2nd gt
        y_true3 = y_true[:, 2*num_class:]  # the 3rd gt

        loss1 = K.categorical_crossentropy(y_true1, y_pred1, from_logits=from_logits)
        loss2 = K.categorical_crossentropy(y_true2, y_pred2, from_logits=from_logits)
        loss3 = K.categorical_crossentropy(y_true3, y_pred3, from_logits=from_logits)

        # combine the losses however you like; here the coefficients weight the
        # first loss and the pairwise differences between consecutive losses
        total_loss = coeffs[0]*loss1 + coeffs[1]*(loss1 - loss2) + coeffs[2]*(loss2 - loss3)
        return total_loss

    return categorical_crossentropy
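
A quick sanity check of the combined loss on dummy data (assuming TF 2.x eager execution; nc is just an illustrative class count):

import tensorflow as tf

nc = 4  # small illustrative class count
loss_fn = categorical_crossentropy_base(coeffs=[1.0, 1.0, 1.0], num_class=nc)

p = tf.nn.softmax(tf.random.normal((2, nc)), axis=-1)  # one per-head probability block
y_pred = tf.concat([p, p, p], axis=1)                  # (2, 3 * nc): three heads side by side
y_true = tf.concat([tf.one_hot([0, 1], depth=nc)] * 3, axis=1)
print(loss_fn(y_true, y_pred))                         # one loss value per sample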

num_class = 10   # placeholder: set to your actual number of classes
num_nodes = 128  # placeholder: set to your actual hidden-layer size

in1 = Input((6373,))
enc1 = Dense(num_nodes)(in1)
enc1 = Dropout(0.3)(enc1)
enc1 = Dense(num_nodes, activation='relu')(enc1)
enc1 = Dropout(0.3)(enc1)
enc1 = Dense(num_nodes, activation='relu')(enc1)
out1 = Dense(units=num_class, activation='softmax')(enc1)

in2 = Input((512,))
enc2 = Dense(num_nodes, activation='relu')(in2)
enc2 = Dense(num_nodes, activation='relu')(enc2)
out2 = Dense(units=num_class, activation='softmax')(enc2)

in3 = Input((768,))
enc3 = Dense(num_nodes, activation='relu')(in3)
enc3 = Dense(num_nodes, activation='relu')(enc3)
out3 = Dense(units=num_class, activation='softmax')(enc3)

adam = Adam(learning_rate=0.0001)

total_out = Concatenate(axis=1)([out1, out2, out3])
model = Model(inputs=[in1, in2, in3], outputs=[total_out])

coeffs = [1, 1, 1]
model.compile(loss=categorical_crossentropy_base(coeffs=coeffs, num_class=num_class), optimizer=adam, metrics=['accuracy'])
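
To train, the three label arrays have to be concatenated in the same order as the outputs. A minimal sketch, assuming x1, x2, x3 are input arrays matching the three input widths and y1, y2, y3 are one-hot label arrays of shape (n_samples, num_class):

import numpy as np

y_total = np.concatenate([y1, y2, y3], axis=1)  # shape (n_samples, 3 * num_class)
model.fit([x1, x2, x3], y_total, epochs=10, batch_size=32)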

I am not sure about the accuracy metric, though: as written, it is computed over the whole concatenated output rather than per head, so its value may not be very meaningful. Apart from that, I think it will work without other changes. I am also using K.categorical_crossentropy here, but you can freely swap in another implementation, of course.
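
If you want a meaningful per-head accuracy, one option is a custom metric that slices the concatenated tensors the same way the loss does. This is a sketch under the same layout assumptions; head_accuracy is a hypothetical helper, not a tested solution:

def head_accuracy(head, num_class):
    def acc(y_true, y_pred):
        s, e = head * num_class, (head + 1) * num_class
        matches = K.equal(K.argmax(y_true[:, s:e], axis=-1),
                          K.argmax(y_pred[:, s:e], axis=-1))
        return K.cast(matches, K.floatx())  # Keras averages this over the batch
    acc.__name__ = 'acc_head%d' % (head + 1)
    return acc

# e.g. metrics=[head_accuracy(i, num_class) for i in range(3)]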
