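Before answering the two questions in your post, let's first clarify that `LearningRateScheduler` is not for picking the 'best' learning rate. It is a callback that sets the learning rate at the start of each epoch to whatever value your schedule function returns.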

I think what you really want to ask is “how to determine the best **initial** learning rate”. If I am correct, then you need to learn about hyperparameter tuning.
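To give a rough idea of what tuning the initial learning rate means, here is a minimal sketch in plain Python. It is not taken from your code: the toy objective `(w - 3)^2`, the helper `final_loss`, and the candidate list are all assumptions made up for illustration. It runs the same short gradient descent once per candidate and keeps the learning rate with the lowest final loss.

```python
def final_loss(lr, steps=50):
    """Run plain gradient descent on f(w) = (w - 3)^2 and return the final loss."""
    w = 0.0
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)  # derivative of (w - 3)^2
        w -= lr * grad
    return (w - 3.0) ** 2

# Hypothetical candidate initial learning rates on a log scale,
# plus one value that is too large and makes the iteration diverge.
candidates = [1e-4, 1e-3, 1e-2, 1e-1, 1.5]
results = {lr: final_loss(lr) for lr in candidates}
best_lr = min(results, key=results.get)

for lr, loss in results.items():
    print(f"lr={lr:g}  final loss={loss:.6g}")
print("best initial learning rate:", best_lr)
```

With a real model you would replace `final_loss` with a short training run and compare validation loss instead, but the search loop stays the same.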

**Answer to Q1:**

To see how `1e-8 * 10**(epoch / 20)` works, let's create a simple regression task:

```
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense

# Toy regression data: one feature column, target sin(x) + x^2.
x = np.linspace(0, 100, 1000).reshape(-1, 1)
y = np.sin(x) + x**2
x_train, x_val, y_train, y_val = train_test_split(x, y, test_size=0.3)

# Small fully connected network.
input_x = Input(shape=(1,))
hidden = Dense(10, activation='relu')(input_x)
output = Dense(1, activation='relu')(hidden)
model = Model(inputs=input_x, outputs=output)

# `learning_rate` replaces the deprecated `lr` argument.
adamopt = tf.keras.optimizers.Adam(learning_rate=0.01, beta_1=0.9, beta_2=0.999, epsilon=1e-8)

def schedule_func(epoch, lr):
    # Keras calls this at the start of every epoch with the epoch index
    # and the optimizer's current learning rate.
    new_lr = 1e-8 * 10**(epoch / 20)
    print()
    print('calling lr_scheduler on epoch %i' % epoch)
    print('current learning rate %.8f' % lr)
    print('returned value %.8f' % new_lr)
    return new_lr

lr_schedule = tf.keras.callbacks.LearningRateScheduler(schedule_func)
model.compile(loss='mse', optimizer=adamopt, metrics=['mae'])
history = model.fit(x_train, y_train,
                    batch_size=8,
                    epochs=10,
                    validation_data=(x_val, y_val),
                    verbose=1,
                    callbacks=[lr_schedule])
```

In the script above, instead of using a `lambda` function, I wrote a named function `schedule_func`. Running the script, you will see that `1e-8 * 10**(epoch / 20)` simply sets the learning rate for each `epoch`, overriding the `0.01` passed to the optimizer, and that the learning rate increases from one epoch to the next (by a factor of 10 every 20 epochs).
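If you just want to see the numbers without training anything, the expression can be evaluated on its own. This is a small sketch; the function name `schedule` and the chosen epochs are only for illustration.

```python
def schedule(epoch):
    # Same expression as in the question: starts at 1e-8 on epoch 0
    # and is multiplied by 10 every 20 epochs.
    return 1e-8 * 10 ** (epoch / 20)

for epoch in (0, 1, 10, 20, 40, 100):
    print(f"epoch {epoch:3d} -> lr = {schedule(epoch):.3e}")
```

Each single epoch multiplies the rate by `10**(1/20)`, roughly 1.12, so the growth is exponential in the epoch index.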

**Answer to Q2:**

There are a bunch of nice posts, for example
