Why do different implementations of the same PyTorch code give different losses?

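You forgot to zero out (clear) the gradients in your implementation. PyTorch accumulates gradients across calls to backward(), so without clearing them each optimizer step uses the sum of the gradients from all previous batches, which is why the losses differ. That is, you are missing: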

optimizer.zero_grad()

In other words, simply do:

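for i in range(10):
    running_loss = 0
    for images, labels in trainloader:
        # Flatten each image into a 1-D vector
        images = images.view(images.shape[0], -1)

        output = model(images)  # call the module rather than model.forward()
        loss = criterion(output, labels)
        # missed this! Clear gradients left over from the previous batch
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        running_loss += loss.item()

    # Report the summed loss once the epoch's batches are done
    print(f"Training loss: {running_loss}")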

and you are good to go!
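
To see why the missing call changes the loss, here is a minimal sketch of gradient accumulation, separate from the training loop above. The single weight w, the input x, and the helper backward_once are made up for illustration; clearing w.grad by hand stands in for what optimizer.zero_grad() does for every parameter.

```python
import torch

# Hypothetical one-parameter setup: loss = (w * x)^2, so d(loss)/dw = 2 * w * x^2.
w = torch.tensor([1.0], requires_grad=True)
x = torch.tensor([3.0])


def backward_once():
    """Run one forward/backward pass and return a copy of w.grad."""
    loss = ((w * x) ** 2).sum()
    loss.backward()
    return w.grad.clone()


first = backward_once()    # gradient from a single pass
second = backward_once()   # not cleared: the new gradient is added to the old one

w.grad.zero_()             # clear the stored gradient by hand
third = backward_once()    # matches the single-pass gradient again

print(first.item(), second.item(), third.item())  # 18.0 36.0 18.0
```

The second value is twice the first because backward() adds to the stored gradient instead of overwriting it. In a training loop that means every optimizer.step() would use a running sum over all earlier batches.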
