espnet2.schedulers.tristage_lr.TristageLR
class espnet2.schedulers.tristage_lr.TristageLR(optimizer: Optimizer, max_steps: int | float = 25000, warmup_ratio: float = 0.1, hold_ratio: float = 0.4, decay_ratio: float = 0.5, init_lr_scale: float = 0.01, final_lr_scale: float = 0.01, last_epoch: int = -1)
Bases: _LRScheduler, AbsBatchStepScheduler
Tri-stage learning rate scheduler with warmup, hold, and exponential decay.
This scheduler adjusts the learning rate in three phases:

1. Warmup: the learning rate increases linearly from init_lr_scale * base_lr to base_lr over the first warmup_ratio * max_steps steps.
2. Hold: the learning rate is held constant at base_lr for hold_ratio * max_steps steps.
3. Decay: the learning rate decays exponentially from base_lr to final_lr_scale * base_lr over decay_ratio * max_steps steps.
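The schedule can be re-derived as a standalone function. The sketch below is illustrative only, not the ESPnet implementation itself; the function name `tristage_lr` is hypothetical, and its defaults mirror the constructor arguments documented here. The exponential decay is parameterized so that the rate reaches exactly final_lr_scale * base_lr at the end of the decay phase.

```python
import math


def tristage_lr(step, base_lr, max_steps=25000, warmup_ratio=0.1,
                hold_ratio=0.4, decay_ratio=0.5,
                init_lr_scale=0.01, final_lr_scale=0.01):
    """Illustrative sketch of the tri-stage schedule (not the ESPnet code)."""
    warmup_steps = int(max_steps * warmup_ratio)
    hold_steps = int(max_steps * hold_ratio)
    decay_steps = int(max_steps * decay_ratio)

    if step < warmup_steps:
        # Phase 1: linear ramp from init_lr_scale * base_lr up to base_lr.
        init_lr = init_lr_scale * base_lr
        return init_lr + (base_lr - init_lr) * step / max(1, warmup_steps)

    if step < warmup_steps + hold_steps:
        # Phase 2: hold constant at base_lr.
        return base_lr

    # Phase 3: exponential decay toward final_lr_scale * base_lr.
    t = min(step - warmup_steps - hold_steps, decay_steps)
    decay_factor = -math.log(final_lr_scale) / max(1, decay_steps)
    return max(base_lr * math.exp(-decay_factor * t),
               final_lr_scale * base_lr)
```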
Reference: Adapted from the tri-stage LR scheduler in fairseq: https://github.com/facebookresearch/fairseq/blob/main/fairseq/optim/lr_scheduler/tri_stage_lr_scheduler.py
- Parameters:
- optimizer – Wrapped optimizer.
- max_steps – Total number of training steps over which the schedule runs.
- warmup_ratio – Fraction of steps for linear warmup.
- hold_ratio – Fraction of steps to hold constant.
- decay_ratio – Fraction of steps for exponential decay.
- init_lr_scale – Initial learning rate is init_lr_scale * base_lr.
- final_lr_scale – Final learning rate is final_lr_scale * base_lr.
- last_epoch – The index of the last step. Default is -1 (fresh start).
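Since the class derives from AbsBatchStepScheduler, the scheduler is stepped once per optimizer update rather than once per epoch. The usage sketch below assumes a standard PyTorch training loop; the model, data, and loss are placeholders.

```python
import torch
from espnet2.schedulers.tristage_lr import TristageLR

model = torch.nn.Linear(80, 4)
# The optimizer's lr becomes base_lr for the schedule.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = TristageLR(
    optimizer,
    max_steps=25000,
    warmup_ratio=0.1,
    hold_ratio=0.4,
    decay_ratio=0.5,
    init_lr_scale=0.01,
    final_lr_scale=0.01,
)

for step in range(100):
    optimizer.zero_grad()
    loss = model(torch.randn(8, 80)).sum()  # placeholder loss
    loss.backward()
    optimizer.step()
    scheduler.step()  # batch-step scheduler: called once per optimizer step
```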
get_lr()

Compute the learning rate for each parameter group at the current step, following the three-phase schedule described above.