Optimizer
- class lib.training.optimizer.Optimizer(model: Model, config: type[OptConfig], mixed_precision: bool = False, warmup_steps: int = 0)
Bases: object
Object for managing the selected Torch optimizer
- Parameters:
model (Model) – The model that is to be trained
config (type[OptConfig]) – The optimizer user configuration options
mixed_precision (bool) – True to train using mixed precision. Default: False
warmup_steps (int) – The number of steps to warm up the learning rate for. Default: 0
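The reference does not say what shape the warmup takes. As a minimal sketch, assuming a linear ramp from near zero up to the configured rate, the effect of warmup_steps could look like the following. The function name warmup_lr and the linear shape are assumptions for illustration, not part of this class.

```python
def warmup_lr(base_lr: float, step: int, warmup_steps: int) -> float:
    """Return the learning rate to use at a 1-indexed training step.

    Assumption: the ramp is linear. The real schedule may differ.
    """
    if warmup_steps <= 0 or step >= warmup_steps:
        # No warmup requested, or warmup already finished
        return base_lr
    # Scale the base rate by the fraction of warmup completed
    return base_lr * step / warmup_steps


# Learning rates for the first six steps with a four-step warmup
schedule = [warmup_lr(1e-3, step, warmup_steps=4) for step in range(1, 7)]
```

With warmup_steps left at its default of 0, the helper returns the base rate unchanged from the first step.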
Methods Summary
backward(loss) – Perform the optimizer's backward pass
find_learning_rate(trainer, steps, start_lr, ...) – Use the Learning Rate Finder to discover the optimal learning rate
load_state_dict(state_dict) – Load the serialized data from a state dict into this object
set_lr(lr) – Manually assign the optimizer's learning rate with the given value
state_dict() – Serialized data as a dict for relevant options contained in this class
step() – Perform the optimizer step if valid and zero the gradients.
to(device) – Place the optimizer onto the given device
Methods Documentation
- backward(loss: Tensor) → None
Perform the optimizer’s backward pass
- Parameters:
loss (Tensor) – The loss scalar from the forward pass
- Return type:
None
- find_learning_rate(trainer: Trainer, steps: int, start_lr: float, end_lr: float, strength: T.Literal['default', 'aggressive', 'extreme'], mode: T.Literal['set', 'graph_and_set', 'graph_and_exit']) → bool
Use the Learning Rate Finder to discover the optimal learning rate
- Parameters:
trainer (Trainer) – The training loop with the loaded training plugin
steps (int) – The number of iterations to run the learning rate finder for
start_lr (float) – The learning rate to start scanning from
end_lr (float) – The final learning rate to scan until
strength (T.Literal['default', 'aggressive', 'extreme']) – How aggressively to set the optimal learning rate
mode (T.Literal['set', 'graph_and_set', 'graph_and_exit']) – The mode to run the Learning Rate Finder in
- Return type:
bool
- Returns:
True if an optimal learning rate was discovered.
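The reference does not describe how the scan between start_lr and end_lr is spaced. A common choice for learning rate finders is a geometric (log-spaced) sweep, so here is a minimal sketch under that assumption. The helper lr_sweep is hypothetical and is not this method's implementation.

```python
def lr_sweep(start_lr: float, end_lr: float, steps: int) -> list[float]:
    """Build a geometric sweep of candidate learning rates.

    Assumption: rates are log-spaced. The real finder may space them differently.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if start_lr <= 0 or end_lr <= 0:
        raise ValueError("learning rates must be positive")
    # Constant multiplier that carries start_lr to end_lr in (steps - 1) hops
    ratio = (end_lr / start_lr) ** (1.0 / (steps - 1))
    return [start_lr * ratio ** i for i in range(steps)]


# Five candidate rates, one per decade from 1e-6 to 1e-2
rates = lr_sweep(1e-6, 1e-2, steps=5)
```

A finder would typically train for one iteration at each candidate rate, record the loss, and pick a rate from the region where the loss falls fastest. How the strength option shifts that pick is not documented here.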
- load_state_dict(state_dict: dict[str, Any]) → None
Load the serialized data from a state dict into this object
- Parameters:
state_dict (dict[str, Any]) – The serialized data to load
- Return type:
None
- set_lr(lr: float) → None
Manually assign the optimizer’s learning rate with the given value
- Parameters:
lr (float) – The learning rate to apply to the optimizer
- Return type:
None
- state_dict() → dict[str, Any]
Serialized data as a dict for relevant options contained in this class
- Return type:
dict[str, Any]
- Returns:
The serialized data for this object, for saving and loading
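The state_dict() and load_state_dict() pair follows the usual save and restore pattern. The sketch below illustrates that pattern with a hypothetical stand-in class, ToyOptimizerState. The keys it stores (lr, steps_taken) are assumptions chosen for illustration; the reference does not list the real keys.

```python
import json
from typing import Any


class ToyOptimizerState:
    """Hypothetical stand-in that mimics the state_dict round trip."""

    def __init__(self, lr: float = 1e-3, steps_taken: int = 0) -> None:
        self.lr = lr
        self.steps_taken = steps_taken

    def state_dict(self) -> dict[str, Any]:
        # Collect only the options needed to resume training
        return {"lr": self.lr, "steps_taken": self.steps_taken}

    def load_state_dict(self, state_dict: dict[str, Any]) -> None:
        # Restore the options saved by state_dict()
        self.lr = state_dict["lr"]
        self.steps_taken = state_dict["steps_taken"]


# Save from one object, pass through JSON, and restore into a fresh object
saved = ToyOptimizerState(lr=5e-4, steps_taken=120).state_dict()
restored = ToyOptimizerState()
restored.load_state_dict(json.loads(json.dumps(saved)))
```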
- step() → None
Perform the optimizer step if valid and zero the gradients.
Handles gradient accumulation, scaling for mixed precision and gradient clipping
- Return type:
None
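To make the three responsibilities of step() concrete, here is a dependency-free sketch of gradient accumulation, loss-scale removal, and clipping by global norm on plain Python floats. ToyStepper, its parameters, and the reading of "if valid" as "an accumulation boundary was reached and the gradients are finite" are all assumptions for illustration. The real class works on Torch tensors and may order or define these stages differently.

```python
import math


def clip_by_global_norm(grads: list[float], max_norm: float) -> list[float]:
    """Rescale grads so their combined L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    return [g * max_norm / norm for g in grads]


class ToyStepper:
    """Hypothetical stand-in: plain SGD over a list of float weights."""

    def __init__(self, weights: list[float], lr: float, accumulate: int = 1,
                 loss_scale: float = 1.0, max_norm: float = 1.0) -> None:
        self.weights = list(weights)
        self.lr = lr
        self.accumulate = accumulate    # backward passes per weight update
        self.loss_scale = loss_scale    # factor the loss was multiplied by
        self.max_norm = max_norm
        self._grads = [0.0] * len(weights)
        self._calls = 0

    def backward(self, scaled_grads: list[float]) -> None:
        # Sum gradients of the scaled loss across accumulation passes
        for i, grad in enumerate(scaled_grads):
            self._grads[i] += grad

    def step(self) -> bool:
        """Update the weights if this call is a valid step."""
        self._calls += 1
        if self._calls % self.accumulate != 0:
            return False  # still accumulating, keep the summed gradients
        # Undo the loss scaling and average over the accumulated passes
        divisor = self.loss_scale * self.accumulate
        grads = [g / divisor for g in self._grads]
        self._grads = [0.0] * len(self.weights)  # zero the gradients
        if not all(math.isfinite(g) for g in grads):
            return False  # overflow under scaling, skip this update
        grads = clip_by_global_norm(grads, self.max_norm)
        self.weights = [w - self.lr * g for w, g in zip(self.weights, grads)]
        return True


stepper = ToyStepper([1.0, 1.0], lr=0.1, accumulate=2, loss_scale=4.0)
stepper.backward([12.0, 16.0])
first = stepper.step()    # first pass only accumulates
stepper.backward([12.0, 16.0])
second = stepper.step()   # second pass unscales, clips, and updates
```

In this run the summed gradients [24, 32] are divided by 8 to give [3, 4], clipped from norm 5 to norm 1, and applied with a learning rate of 0.1.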
- to(device: torch.Device) → None
Place the optimizer onto the given device
- Parameters:
device (torch.Device) – The device to place the optimizer on
- Return type:
None