As Dr. Snoopy pointed out, your first solution overwrites the previously defined weight by using the same variable name.
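To make that concrete, here is a purely hypothetical sketch of that kind of rebinding (the names are made up, not taken from your code): once the Python name points at the result of an op instead of the original tf.Variable, the Variable is no longer part of the computation and never receives a gradient.

import tensorflow as tf

w = tf.Variable([1.0, 2.0])                          # intended trainable weight
w = tf.concat([w[:1], tf.constant([5.0])], axis=0)   # rebinds `w` to a plain Tensor
print(isinstance(w, tf.Variable))                    # False: the trainable Variable is gone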
As for why your second solution does not work either: after calling tf.concat on your two tf.Variables w1 and w2, the gradient disappears. This is a known bug in TensorFlow; you can find the issue on GitHub here: Gradients do not exist for variables after tf.concat(). #37726
A minimal reproducible example
Let's run a small experiment using tf.GradientTape to compute the gradient:
import tensorflow as tf

w1 = tf.Variable([1.0])
w2 = tf.Variable([3.0])
# concatenate the two Variables into a single weight tensor
w = tf.expand_dims(tf.concat([w1, w2], 0), -1)
X = tf.random.normal((1, 2))
y = tf.reduce_sum(X, 1)
with tf.GradientTape(persistent=True) as tape:
    r = tf.matmul(X, w)
    loss = tf.metrics.mse(y, r)
# gradient with respect to the original Variable
print(tape.gradient(loss, w1))
This results in None: no gradient flows back to w1 through the concatenation.
A possible fix
One solution is to keep the Variables separate. For your layer, with units=1, there is a trivial element-wise replacement of tf.matmul:
w1 = tf.Variable([1.0])
w2 = tf.Variable([3.0], trainable=False)   # frozen weight
X = tf.random.normal((1, 2))
y = tf.reduce_sum(X, 1)
with tf.GradientTape(persistent=True) as tape:
    # element-wise replacement of tf.matmul, keeping the Variables separate
    r = X[:, 0] * w1 + X[:, 1] * w2
    loss = tf.metrics.mse(y, r)
print(tape.gradient(loss, w1))
This now outputs a real gradient, e.g. tf.Tensor([-3.1425157], shape=(1,), dtype=float32) (the exact value depends on the random X).
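If it helps, here is a minimal sketch of how that element-wise trick could look inside a custom Keras layer with units=1 (the class name and weight names are my own, not from your code):

import tensorflow as tf

class PartiallyTrainableDense(tf.keras.layers.Layer):
    # Hypothetical layer: the first input column gets a trainable weight,
    # the second a frozen one; the Variables are kept separate.
    def build(self, input_shape):
        self.w1 = self.add_weight(name="w1", shape=(1,), trainable=True)
        self.w2 = self.add_weight(name="w2", shape=(1,), trainable=False)

    def call(self, X):
        # element-wise replacement of tf.matmul
        return X[:, 0] * self.w1 + X[:, 1] * self.w2

layer = PartiallyTrainableDense()
X = tf.random.normal((4, 2))
with tf.GradientTape() as tape:
    loss = tf.reduce_mean(layer(X) ** 2)
# only w1 appears in trainable_variables and receives a gradient
print(tape.gradient(loss, layer.trainable_variables))

Because w2 is created with trainable=False, it is excluded from layer.trainable_variables, so an optimizer applying gradients to that list never touches it.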