How do I mix trainable and non-trainable weights in the same layer?

As Dr. Snoopy pointed out, your first solution overwrites the previously defined weight by using the same variable name.
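
For reference, the pitfall usually looks something like the sketch below. This is a hypothetical reconstruction, not your exact code (the class name BrokenLayer and the attribute name self.w are mine): rebinding the same attribute means the forward pass only ever sees the last weight created.

import tensorflow as tf

class BrokenLayer(tf.keras.layers.Layer):
    def build(self, input_shape):
        # intended: one trainable and one frozen weight...
        self.w = self.add_weight(shape=(1,), trainable=True)
        # ...but this second assignment rebinds self.w, so call() below can
        # only ever use the non-trainable weight; the trainable one is orphaned
        self.w = self.add_weight(shape=(1,), trainable=False)

    def call(self, inputs):
        return inputs * self.w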

As to why your second solution does not work either: after calling tf.concat on your two tf.Variables w1 and w2, the gradient disappears. This is a known TensorFlow bug; you can find the issue on GitHub: Gradients do not exist for variables after tf.concat() #37726.

A minimal reproducible example

Let's run a small experiment using tf.GradientTape to compute the gradients:

import tensorflow as tf

w1 = tf.Variable([1.0])
w2 = tf.Variable([3.0])
# the concat happens outside the tape, so it is not recorded
w = tf.expand_dims(tf.concat([w1, w2], 0), -1)
X = tf.random.normal((1, 2))
y = tf.reduce_sum(X, 1)
with tf.GradientTape(persistent=True) as tape:
    r = tf.matmul(X, w)           # linear layer: X @ w, shape (1, 1)
    loss = tf.metrics.mse(y, r)
print(tape.gradient(loss, w1))    # gradient w.r.t. the trainable variable

This prints None: no gradient flows back to w1 through the concatenated tensor.

A possible fix

One solution is to keep the variables separate. For your layer, with units=1, there is a trivial element-wise replacement for tf.matmul:

w1 = tf.Variable([1.0])
w2 = tf.Variable([3.0], trainable=False)   # the frozen, non-trainable weight
X = tf.random.normal((1, 2))
y = tf.reduce_sum(X, 1)
with tf.GradientTape(persistent=True) as tape:
    # element-wise replacement of tf.matmul, keeping the two variables separate
    r = X[:, 0] * w1 + X[:, 1] * w2
    loss = tf.metrics.mse(y, r)
print(tape.gradient(loss, r))   # a gradient now exists (tape.gradient(loss, w1) is defined as well)

Outputs (the exact value depends on the random X): tf.Tensor([-3.1425157], shape=(1,), dtype=float32)
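
To put the fix into an actual layer, here is a minimal sketch of how the separated-variables approach might look inside a custom Keras layer. The class name MixedDense, the "ones" initializers, and the assumption of two input features with a single unit are mine, not part of your original code:

import tensorflow as tf

class MixedDense(tf.keras.layers.Layer):
    # single-unit layer mixing one trainable and one frozen weight,
    # kept as separate variables so the gradient can reach the trainable one
    def build(self, input_shape):
        self.w1 = self.add_weight(shape=(1,), initializer="ones", trainable=True)
        self.w2 = self.add_weight(shape=(1,), initializer="ones", trainable=False)

    def call(self, inputs):
        # element-wise product and sum, as in the snippet above,
        # instead of tf.matmul on a concatenated kernel
        return inputs[:, 0:1] * self.w1 + inputs[:, 1:2] * self.w2

With this, layer.trainable_weights contains only w1, so an optimizer step updates w1 and leaves w2 untouched.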
