espnet2.schedulers.tristage_lr.TristageLR
class espnet2.schedulers.tristage_lr.TristageLR(optimizer: Optimizer, max_steps: int | float = 25000, warmup_ratio: float = 0.1, hold_ratio: float = 0.4, decay_ratio: float = 0.5, init_lr_scale: float = 0.01, final_lr_scale: float = 0.01, last_epoch: int = -1)
Bases: _LRScheduler, AbsBatchStepScheduler
Tri-stage learning rate scheduler with warmup, hold, and exponential decay.
This scheduler adjusts the learning rate in three phases:

1. Warmup: the learning rate increases linearly from init_lr_scale * base_lr to base_lr over the first warmup_ratio * max_steps steps.
2. Hold: the learning rate is held constant at base_lr for hold_ratio * max_steps steps.
3. Decay: the learning rate decays exponentially from base_lr to final_lr_scale * base_lr over decay_ratio * max_steps steps.
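The schedule can be re-derived as a standalone function. The sketch below is illustrative only, not the ESPnet implementation itself; the function name `tristage_lr` is hypothetical, and its defaults mirror the constructor arguments documented here. The exponential decay is parameterized so that the rate reaches exactly final_lr_scale * base_lr at the end of the decay phase.

```python
import math


def tristage_lr(step, base_lr, max_steps=25000, warmup_ratio=0.1,
                hold_ratio=0.4, decay_ratio=0.5,
                init_lr_scale=0.01, final_lr_scale=0.01):
    """Illustrative sketch of the tri-stage schedule (not the ESPnet code)."""
    warmup_steps = int(max_steps * warmup_ratio)
    hold_steps = int(max_steps * hold_ratio)
    decay_steps = int(max_steps * decay_ratio)

    if step < warmup_steps:
        # Phase 1: linear ramp from init_lr_scale * base_lr up to base_lr.
        init_lr = init_lr_scale * base_lr
        return init_lr + (base_lr - init_lr) * step / max(1, warmup_steps)

    if step < warmup_steps + hold_steps:
        # Phase 2: hold constant at base_lr.
        return base_lr

    # Phase 3: exponential decay toward final_lr_scale * base_lr.
    t = min(step - warmup_steps - hold_steps, decay_steps)
    decay_factor = -math.log(final_lr_scale) / max(1, decay_steps)
    return max(base_lr * math.exp(-decay_factor * t),
               final_lr_scale * base_lr)
```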
Reference: Adapted from the tri-stage LR scheduler in fairseq: https://github.com/facebookresearch/fairseq/blob/main/fairseq/optim/lr_scheduler/tri_stage_lr_scheduler.py
- Parameters:
- optimizer – Wrapped optimizer.
- max_steps – Total number of training steps over which the schedule runs.
- warmup_ratio – Fraction of steps for linear warmup.
- hold_ratio – Fraction of steps to hold constant.
- decay_ratio – Fraction of steps for exponential decay.
- init_lr_scale – Initial learning rate is init_lr_scale * base_lr.
- final_lr_scale – Final learning rate is final_lr_scale * base_lr.
- last_epoch – The index of the last step. Default is -1 (fresh start).
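Since the class derives from AbsBatchStepScheduler, the scheduler is stepped once per optimizer update rather than once per epoch. The usage sketch below assumes a standard PyTorch training loop; the model, data, and loss are placeholders.

```python
import torch
from espnet2.schedulers.tristage_lr import TristageLR

model = torch.nn.Linear(80, 4)
# The optimizer's lr becomes base_lr for the schedule.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = TristageLR(
    optimizer,
    max_steps=25000,
    warmup_ratio=0.1,
    hold_ratio=0.4,
    decay_ratio=0.5,
    init_lr_scale=0.01,
    final_lr_scale=0.01,
)

for step in range(100):
    optimizer.zero_grad()
    loss = model(torch.randn(8, 80)).sum()  # placeholder loss
    loss.backward()
    optimizer.step()
    scheduler.step()  # batch-step scheduler: called once per optimizer step
```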
get_lr()

Compute the learning rate for each parameter group at the current step, following the three-phase schedule described above.