Those operations are the same. You already found the (likely fastest) vectorized operation.
import numpy as np

T = 50
D = 10
K = 20
x = np.random.randn(T, D)
y = np.random.randn(T, K)

result = np.zeros((K, D))
for k in range(K):
    for t in range(T):
        result[k] += y[t, k] * x[t]

result2 = np.einsum("ij,ik->jk", y, x)
np.allclose(result, result2)
Out[]: True
The problem is most likely floating-point error in whatever method you used to decide the results were not "the same." np.allclose()
is the solution: it compares arrays within a small tolerance, absorbing the tiny rounding differences that arise when the same sum is computed in a different order with floats.
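To see why an exact equality check can fail here, a minimal sketch (the specific arrays are illustrative, not from the question): summing the same numbers in a different order can change the last few bits of the result, but np.allclose still reports them as equal.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(1000)

s1 = a.sum()    # NumPy's pairwise summation
s2 = sum(a)     # plain left-to-right summation

# Exact equality may fail because the additions happen in a different order
print(s1 == s2)

# allclose compares within rtol/atol, so tiny rounding noise is ignored
print(np.allclose(s1, s2))  # True
```

The same effect explains why the loop version and the einsum version of the reduction can differ in the last bits while still being "the same" result.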
As @QuangHoang notes in the comments, though, y.T @ x
is much more readable.