Optimizer

class lib.training.optimizer.Optimizer(model: Model, config: type[OptConfig], mixed_precision: bool = False, warmup_steps: int = 0)

Bases: object

Object for managing the selected Torch optimizer

Parameters:

model (Model) – The model that is to be trained
config (type[OptConfig]) – The optimizer user configuration options
mixed_precision (bool) – True to train using mixed precision. Default: False
warmup_steps (int) – The number of steps over which to warm up the learning rate. Default: 0
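
The reference does not say which warmup curve is used. As a minimal sketch, assuming a linear ramp from a small value up to the configured learning rate, the effect of `warmup_steps` could look like the following. The function name `warmup_lr` and the `base_lr` argument are hypothetical and are not part of this class.

```python
def warmup_lr(step: int, base_lr: float, warmup_steps: int) -> float:
    """Learning rate for a 0-indexed training step under linear warmup.

    Assumption: the ramp is linear. The real Optimizer may use another curve.
    """
    if warmup_steps <= 0 or step >= warmup_steps:
        # No warmup requested, or warmup already finished
        return base_lr
    # Ramp up so that the final warmup step reaches base_lr exactly
    return base_lr * (step + 1) / warmup_steps


# Learning rates for the first six steps with a four-step warmup
schedule = [warmup_lr(s, base_lr=1e-3, warmup_steps=4) for s in range(6)]
```

With the default of `warmup_steps=0` the sketch returns `base_lr` unchanged from the first step.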

Methods Summary

`backward`(loss)	Perform the optimizer's backward pass
`find_learning_rate`(trainer, steps, start_lr, ...)	Use the Learning Rate Finder to discover the optimal learning rate
`load_state_dict`(state_dict)	Load the serialized data from a state dict into this object
`set_lr`(lr)	Manually assign the optimizer's learning rate with the given value
`state_dict`()	Serialized data as a dict for relevant options contained in this class
`step`()	Perform the optimizer step if valid and zero the gradients.
`to`(device)	Place the optimizer onto the given device

Methods Documentation

backward(loss: Tensor) → None

Perform the optimizer’s backward pass

Parameters: loss (Tensor) – The loss scalar from the forward pass
Return type: None

find_learning_rate(trainer: Trainer, steps: int, start_lr: float, end_lr: float, strength: T.Literal['default', 'aggressive', 'extreme'], mode: T.Literal['set', 'graph_and_set', 'graph_and_exit']) → bool

Use the Learning Rate Finder to discover the optimal learning rate

Parameters:

trainer (Trainer) – The training loop with the loaded training plugin
steps (int) – The number of iterations to run the learning rate finder for
start_lr (float) – The learning rate to start scanning from
end_lr (float) – The final learning rate to scan until
strength (T.Literal['default', 'aggressive', 'extreme']) – How aggressively to set the optimal learning rate
mode (T.Literal['set', 'graph_and_set', 'graph_and_exit']) – The mode to run the Learning Rate Finder in

Returns: True if an optimal learning rate was discovered
Return type: bool
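
The general idea behind a learning rate finder can be sketched in plain Python: sweep the learning rate from `start_lr` to `end_lr` over `steps` iterations, record the loss at each value, and pick a rate near the lowest loss. The geometric spacing, the helper names `lr_sweep` and `pick_lr`, and the `divisor` argument (a hypothetical stand-in for how `strength` might back off from the minimum) are all assumptions for illustration, not the library's implementation.

```python
from __future__ import annotations

import math


def lr_sweep(start_lr: float, end_lr: float, steps: int) -> list[float]:
    """Geometrically spaced learning rates from start_lr to end_lr inclusive."""
    if steps < 2:
        return [start_lr]
    ratio = (end_lr / start_lr) ** (1 / (steps - 1))
    return [start_lr * ratio ** i for i in range(steps)]


def pick_lr(lrs: list[float], losses: list[float], divisor: float = 1.0) -> float | None:
    """Learning rate at the lowest finite loss, divided by `divisor`.

    Returns None when every recorded loss is NaN or infinite, which mirrors
    a finder reporting that no usable learning rate was discovered.
    """
    finite = [(loss, lr) for lr, loss in zip(lrs, losses) if math.isfinite(loss)]
    if not finite:
        return None
    _, best_lr = min(finite)
    return best_lr / divisor


# Toy loss curve with its minimum at lr = 1e-3 (stands in for real training)
lrs = lr_sweep(start_lr=1e-5, end_lr=1e-1, steps=5)
losses = [(math.log10(lr) + 3) ** 2 for lr in lrs]
chosen = pick_lr(lrs, losses)
conservative = pick_lr(lrs, losses, divisor=10.0)
```

In a real run the losses would come from short training iterations at each rate, and the chosen value would then be applied to the optimizer.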

load_state_dict(state_dict: dict[str, Any]) → None

Load the serialized data from a state dict into this object

Parameters: state_dict (dict[str, Any]) – The serialized data to load
Return type: None

set_lr(lr: float) → None

Manually assign the optimizer’s learning rate with the given value

Parameters: lr (float) – The learning rate to apply to the optimizer
Return type: None
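
Torch optimizers keep their hyperparameters in `param_groups`, a list of dicts that each carry an `"lr"` entry. A minimal sketch of overriding the learning rate, assuming that every group receives the same value, is shown below on plain dicts so that it runs without Torch. The free function `set_lr` here is illustrative and is not this class's method.

```python
from typing import Any


def set_lr(param_groups: list[dict[str, Any]], lr: float) -> None:
    """Assign `lr` to every parameter group in place."""
    for group in param_groups:
        group["lr"] = lr


# Plain dicts standing in for a Torch optimizer's param_groups
groups = [
    {"lr": 1e-3, "weight_decay": 0.0},
    {"lr": 1e-4, "weight_decay": 0.01},
]
set_lr(groups, 5e-5)
```

Only the `"lr"` entry changes; other per-group options are left as they were.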

state_dict() → dict[str, Any]

Serialized data as a dict for relevant options contained in this class

Returns: The serialized data for this object, for saving and loading
Return type: dict[str, Any]
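
`state_dict` and `load_state_dict` form a save and restore pair. The round trip can be sketched with a toy class, as below. `ToyOptimizerState` and its two fields are hypothetical; the keys stored by the real class are not listed in this reference, and a real state dict would usually hold tensors and be written with Torch's own serialization rather than JSON.

```python
import json
from typing import Any


class ToyOptimizerState:
    """Hypothetical stand-in used only to illustrate the round trip."""

    def __init__(self, lr: float = 1e-3, steps_done: int = 0) -> None:
        self.lr = lr
        self.steps_done = steps_done

    def state_dict(self) -> dict[str, Any]:
        """Collect the options worth persisting into a plain dict."""
        return {"lr": self.lr, "steps_done": self.steps_done}

    def load_state_dict(self, state_dict: dict[str, Any]) -> None:
        """Restore the options from a dict produced by state_dict()."""
        self.lr = state_dict["lr"]
        self.steps_done = state_dict["steps_done"]


# Save from one object, then restore into a freshly constructed one
payload = json.dumps(ToyOptimizerState(lr=5e-4, steps_done=1200).state_dict())
restored = ToyOptimizerState()
restored.load_state_dict(json.loads(payload))
```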

step() → None

Perform the optimizer step if valid and zero the gradients.

Handles gradient accumulation, scaling for mixed precision and gradient clipping

Return type: None
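
How gradient accumulation, mixed precision scaling and gradient clipping can combine in a single step is sketched below in pure Python. `ToyAccumulatingSGD`, its constructor arguments and the boolean return value are hypothetical (the real `step` returns None), and plain SGD is used in place of whichever Torch optimizer is configured. A Torch implementation would more likely rely on a gradient scaler and `torch.nn.utils.clip_grad_norm_`.

```python
import math


class ToyAccumulatingSGD:
    """Hypothetical stand-in: SGD with accumulation, loss scaling and clipping."""

    def __init__(self, params, lr, accumulate=1, max_norm=None, loss_scale=1.0):
        self.params = list(params)
        self.lr = lr
        self.accumulate = accumulate  # micro-batches per real update
        self.max_norm = max_norm      # global gradient norm limit, or None
        self.loss_scale = loss_scale  # factor applied to the loss before backward
        self.grads = [0.0] * len(self.params)
        self._calls = 0

    def backward(self, grads):
        """Accumulate gradients of the scaled loss (they scale with the loss)."""
        for i, grad in enumerate(grads):
            self.grads[i] += grad * self.loss_scale

    def step(self) -> bool:
        """Apply an update when one is due. Returns True if parameters changed."""
        self._calls += 1
        if self._calls % self.accumulate != 0:
            return False  # still accumulating, so keep the gradients
        # Undo the loss scale and average over the accumulated micro-batches
        grads = [g / (self.loss_scale * self.accumulate) for g in self.grads]
        self.grads = [0.0] * len(self.params)  # zero the gradients
        if not all(math.isfinite(g) for g in grads):
            return False  # overflow under mixed precision: skip this update
        if self.max_norm is not None:
            norm = math.sqrt(sum(g * g for g in grads))
            if norm > self.max_norm:
                grads = [g * self.max_norm / norm for g in grads]
        self.params = [p - self.lr * g for p, g in zip(self.params, grads)]
        return True


opt = ToyAccumulatingSGD([1.0, 2.0], lr=0.1, accumulate=2, max_norm=1.0, loss_scale=1024.0)
opt.backward([3.0, 4.0])
first = opt.step()   # first micro-batch: nothing applied yet
opt.backward([3.0, 4.0])
second = opt.step()  # second micro-batch: the update is applied
```

In this sketch the averaged gradient `[3.0, 4.0]` has norm 5, so it is rescaled to norm 1 before the update is applied.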

to(device: torch.Device) → None

Place the optimizer onto the given device

Parameters: device (torch.Device) – The device to place the optimizer on
Return type: None
