Resources needed for prediction depend on the model size – not on the training device.
If the model has 200 bln variables – you will not be able to run it on workstation (because you have not enough memory).
But you can use model with 10 mln variables with no problems even if it was trained on GPU or TPU.
Every variable takes 4 to 8 bytes. If you have 8 GB of memory – you will probably be able to run a model with hundreds million variables.
Prediction is fast (assuming you have enough memory). Resources needed to train model quickly. It is efficient to train on GPU/TPU even if your model is small.
CLICK HERE to find out more related problems solutions.