Train Options

train_options:
type: dict, optional
argument path: train_options

Options that define the training behavior of DeePTB.

num_epoch:
type: int
argument path: train_options/num_epoch

Total number of training epochs. Note that if the model is reloaded with the -r or --restart option, the epochs already trained before the checkpoint was saved are counted toward this total.

batch_size:
type: int, optional, default: 1
argument path: train_options/batch_size

The batch size used in training. Default: 1

ref_batch_size:
type: int, optional, default: 1
argument path: train_options/ref_batch_size

The batch size used for the reference data. Default: 1

val_batch_size:
type: int, optional, default: 1
argument path: train_options/val_batch_size

The batch size used for the validation data. Default: 1
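
For example, a minimal train_options block setting these counters could look like the following sketch (the values shown are illustrative, not recommendations):

```json
{
    "train_options": {
        "num_epoch": 500,
        "batch_size": 1,
        "ref_batch_size": 1,
        "val_batch_size": 1
    }
}
```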

optimizer:
type: dict, optional, default: {}
argument path: train_options/optimizer

Settings for the gradient-based optimizer used in model training. Supported optimizers include Adam and SGD.

For more information about these optimization algorithms, see the documentation of the corresponding PyTorch optimizers in torch.optim.

Depending on the value of type, different sub args are accepted.

type:
type: str (flag key), default: Adam
argument path: train_options/optimizer/type
possible choices: Adam, SGD

Select the type of optimizer. Supported types: Adam and SGD. Default: Adam

When type is set to Adam:

lr:
type: float, optional, default: 0.001
argument path: train_options/optimizer[Adam]/lr

Learning rate. Default: 1e-3

betas:
type: list, optional, default: [0.9, 0.999]
argument path: train_options/optimizer[Adam]/betas

Coefficients used for computing running averages of the gradient and its square. Default: (0.9, 0.999)

eps:
type: float, optional, default: 1e-08
argument path: train_options/optimizer[Adam]/eps

Term added to the denominator to improve numerical stability. Default: 1e-8

weight_decay:
type: float, optional, default: 0
argument path: train_options/optimizer[Adam]/weight_decay

Weight decay (L2 penalty). Default: 0

amsgrad:
type: bool, optional, default: False
argument path: train_options/optimizer[Adam]/amsgrad

Whether to use the AMSGrad variant of this algorithm, from the paper [On the Convergence of Adam and Beyond](https://openreview.net/forum?id=ryQu7f-RZ). Default: False
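
For instance, an optimizer block inside train_options selecting Adam could look like the following sketch (the hyperparameter values simply restate the defaults listed above):

```json
{
    "optimizer": {
        "type": "Adam",
        "lr": 0.001,
        "betas": [0.9, 0.999],
        "eps": 1e-8,
        "weight_decay": 0,
        "amsgrad": false
    }
}
```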

When type is set to SGD:

lr:
type: float, optional, default: 0.001
argument path: train_options/optimizer[SGD]/lr

Learning rate. Default: 1e-3

momentum:
type: float, optional, default: 0.0
argument path: train_options/optimizer[SGD]/momentum

Momentum factor. Default: 0

weight_decay:
type: float, optional, default: 0.0
argument path: train_options/optimizer[SGD]/weight_decay

Weight decay (L2 penalty). Default: 0

dampening:
type: float, optional, default: 0.0
argument path: train_options/optimizer[SGD]/dampening

Dampening for momentum. Default: 0

nesterov:
type: bool, optional, default: False
argument path: train_options/optimizer[SGD]/nesterov

Enables Nesterov momentum. Default: False
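
Similarly, an SGD optimizer with momentum enabled might be configured as follows (the momentum value is illustrative):

```json
{
    "optimizer": {
        "type": "SGD",
        "lr": 0.001,
        "momentum": 0.9,
        "weight_decay": 0,
        "dampening": 0,
        "nesterov": false
    }
}
```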

lr_scheduler:
type: dict, optional, default: {}
argument path: train_options/lr_scheduler

Settings for the learning rate scheduler, which scales down the learning rate during the training process. A proper setting can make training more stable and efficient. Supported schedulers include: exponential decay (exp), linear multiplication (linear), and reduce-on-plateau (rop).

Depending on the value of type, different sub args are accepted.

type:
type: str (flag key), default: exp
argument path: train_options/lr_scheduler/type
possible choices: exp, linear, rop

Select the type of lr_scheduler. Supported types: exp, linear, and rop.

When type is set to exp:

gamma:
type: float, optional, default: 0.999
argument path: train_options/lr_scheduler[exp]/gamma

Multiplicative factor of learning rate decay.
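Under exponential decay, the learning rate after t scheduler steps is the initial rate multiplied by gamma^t; with the default gamma of 0.999, roughly 1000 steps shrink the learning rate to about 37% of its initial value (0.999^1000 ≈ e^-1 ≈ 0.368). A minimal block could look like this sketch:

```json
{
    "lr_scheduler": {
        "type": "exp",
        "gamma": 0.999
    }
}
```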

When type is set to linear:

start_factor:
type: float, optional, default: 0.3333333
argument path: train_options/lr_scheduler[linear]/start_factor

The factor by which the learning rate is multiplied in the first epoch. The multiplication factor changes towards end_factor over the following epochs. Default: 1/3.

end_factor:
type: float, optional, default: 0.3333333
argument path: train_options/lr_scheduler[linear]/end_factor

The factor by which the learning rate is multiplied at the end of the linear schedule. The multiplication factor changes from start_factor towards this value over total_iters epochs. Default: 1/3.

total_iters:
type: int, optional, default: 5
argument path: train_options/lr_scheduler[linear]/total_iters

The number of iterations over which the multiplicative factor changes from start_factor to end_factor. Default: 5.
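
For instance, a linear schedule that ramps the learning rate from one tenth of its nominal value up to the full value over the first 100 epochs could be written as (values are illustrative):

```json
{
    "lr_scheduler": {
        "type": "linear",
        "start_factor": 0.1,
        "end_factor": 1.0,
        "total_iters": 100
    }
}
```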

When type is set to rop:

rop: reduce on plateau, i.e. the learning rate is reduced when a monitored metric has stopped improving.

mode:
type: str, optional, default: min
argument path: train_options/lr_scheduler[rop]/mode

One of min, max. In min mode, the lr will be reduced when the quantity monitored has stopped decreasing; in max mode, it will be reduced when the quantity monitored has stopped increasing. Default: min.

factor:
type: float, optional, default: 0.1
argument path: train_options/lr_scheduler[rop]/factor

Factor by which the learning rate will be reduced. new_lr = lr * factor. Default: 0.1.

patience:
type: int, optional, default: 10
argument path: train_options/lr_scheduler[rop]/patience

Number of epochs with no improvement after which learning rate will be reduced. For example, if patience = 2, then we will ignore the first 2 epochs with no improvement, and will only decrease the LR after the 3rd epoch if the loss still hasn’t improved then. Default: 10.

threshold:
type: float, optional, default: 0.0001
argument path: train_options/lr_scheduler[rop]/threshold

Threshold for measuring the new optimum, to only focus on significant changes. Default: 1e-4.

threshold_mode:
type: str, optional, default: rel
argument path: train_options/lr_scheduler[rop]/threshold_mode

One of rel, abs. In rel mode, dynamic_threshold = best * (1 + threshold) in max mode or best * (1 - threshold) in min mode. In abs mode, dynamic_threshold = best + threshold in max mode or best - threshold in min mode. Default: rel.

cooldown:
type: int, optional, default: 0
argument path: train_options/lr_scheduler[rop]/cooldown

Number of epochs to wait before resuming normal operation after lr has been reduced. Default: 0.

min_lr:
type: list | float, optional, default: 0
argument path: train_options/lr_scheduler[rop]/min_lr

A scalar or a list of scalars. A lower bound on the learning rate of all param groups or each group respectively. Default: 0.

eps:
type: float, optional, default: 1e-08
argument path: train_options/lr_scheduler[rop]/eps

Minimal decay applied to lr. If the difference between new and old lr is smaller than eps, the update is ignored. Default: 1e-8.
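
For example, a reduce-on-plateau block that halves the learning rate after 20 epochs without improvement, with a floor of 1e-6, could look like this sketch (values are illustrative):

```json
{
    "lr_scheduler": {
        "type": "rop",
        "mode": "min",
        "factor": 0.5,
        "patience": 20,
        "threshold": 1e-4,
        "threshold_mode": "rel",
        "cooldown": 0,
        "min_lr": 1e-6,
        "eps": 1e-8
    }
}
```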

save_freq:
type: int, optional, default: 10
argument path: train_options/save_freq

The frequency, in iterations, at which the current model is saved to a checkpoint. The checkpoint name is formatted as latest|best_dptb|nnsk_b<bond_cutoff>_c<sk_cutoff>_w<sk_decay_w>. Default: 10

validation_freq:
type: int, optional, default: 10
argument path: train_options/validation_freq

The frequency, in iterations, at which the model is validated on the validation dataset. Default: 10

display_freq:
type: int, optional, default: 1
argument path: train_options/display_freq

The frequency, in iterations, at which the training log is displayed on the screen. Default: 1

max_ckpt:
type: int, optional, default: 4
argument path: train_options/max_ckpt

The maximum number of saved checkpoints. Default: 4
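
Taken together, the checkpointing and logging frequencies might be set inside train_options as follows (the values restate the defaults above):

```json
{
    "save_freq": 10,
    "validation_freq": 10,
    "display_freq": 1,
    "max_ckpt": 4
}
```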

loss_options:
type: dict
argument path: train_options/loss_options

train:
type: dict
argument path: train_options/loss_options/train

Loss options for training.

Depending on the value of method, different sub args are accepted.

method:
type: str (flag key)
argument path: train_options/loss_options/train/method
possible choices: eigvals, skints, hamil_abs, hamil_blas

The loss function type, defined by a string like <fitting target>_<loss type>. Supported loss functions include:

  • eigvals: the MSE loss between the predicted and labeled eigenvalues, and between the eigenvalue differences across different k-points.

  • skints: the loss fitted to Slater-Koster integrals from a reference skfile or sk database (see skdata).

  • hamil_abs

  • hamil_blas

When method is set to eigvals:

diff_on:
type: bool, optional, default: False
argument path: train_options/loss_options/train[eigvals]/diff_on

Whether to use random differences in the loss function. Default: False

eout_weight:
type: float, optional, default: 0.01
argument path: train_options/loss_options/train[eigvals]/eout_weight

The weight applied to eigenvalues out of range. Default: 0.01

diff_weight:
type: float, optional, default: 0.01
argument path: train_options/loss_options/train[eigvals]/diff_weight

The weight applied to the eigenvalue differences. Default: 0.01

diff_valence:
type: dict | NoneType, optional, default: None
argument path: train_options/loss_options/train[eigvals]/diff_valence

Sets the difference in the number of valence electrons between DFT and TB, e.g. {"A": 6, "B": 7}. Default: None, which means no difference.

spin_deg:
type: int, optional, default: 2
argument path: train_options/loss_options/train[eigvals]/spin_deg

The spin degeneracy of the band structure. Default: 2
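
For example, a training loss block using the eigvals method could look like the following sketch; the diff_valence entry reuses the illustrative {"A": 6, "B": 7} mapping from above, and the same options apply to the validation and reference sections below:

```json
{
    "loss_options": {
        "train": {
            "method": "eigvals",
            "diff_on": false,
            "eout_weight": 0.01,
            "diff_weight": 0.01,
            "diff_valence": {"A": 6, "B": 7},
            "spin_deg": 2
        }
    }
}
```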

When method is set to skints:

skdata:
type: str
argument path: train_options/loss_options/train[skints]/skdata

The path to the skfile or sk database.

When method is set to hamil_abs:

onsite_shift:
type: bool, optional, default: False
argument path: train_options/loss_options/train[hamil_abs]/onsite_shift

Whether to use the onsite shift in the loss function. Default: False

When method is set to hamil_blas:

onsite_shift:
type: bool, optional, default: False
argument path: train_options/loss_options/train[hamil_blas]/onsite_shift

Whether to use the onsite shift in the loss function. Default: False
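
For example, a Hamiltonian-fitting training loss with the onsite shift enabled could be configured as the following sketch (the choice of hamil_abs and the flag value are illustrative):

```json
{
    "loss_options": {
        "train": {
            "method": "hamil_abs",
            "onsite_shift": true
        }
    }
}
```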

validation:
type: dict, optional
argument path: train_options/loss_options/validation

Loss options for validation.

Depending on the value of method, different sub args are accepted.

method:
type: str (flag key)
argument path: train_options/loss_options/validation/method
possible choices: eigvals, skints, hamil_abs, hamil_blas

The loss function type, defined by a string like <fitting target>_<loss type>. Supported loss functions include:

  • eigvals: the MSE loss between the predicted and labeled eigenvalues, and between the eigenvalue differences across different k-points.

  • skints: the loss fitted to Slater-Koster integrals from a reference skfile or sk database (see skdata).

  • hamil_abs

  • hamil_blas

When method is set to eigvals:

diff_on:
type: bool, optional, default: False
argument path: train_options/loss_options/validation[eigvals]/diff_on

Whether to use random differences in the loss function. Default: False

eout_weight:
type: float, optional, default: 0.01
argument path: train_options/loss_options/validation[eigvals]/eout_weight

The weight applied to eigenvalues out of range. Default: 0.01

diff_weight:
type: float, optional, default: 0.01
argument path: train_options/loss_options/validation[eigvals]/diff_weight

The weight applied to the eigenvalue differences. Default: 0.01

diff_valence:
type: dict | NoneType, optional, default: None
argument path: train_options/loss_options/validation[eigvals]/diff_valence

Sets the difference in the number of valence electrons between DFT and TB, e.g. {"A": 6, "B": 7}. Default: None, which means no difference.

spin_deg:
type: int, optional, default: 2
argument path: train_options/loss_options/validation[eigvals]/spin_deg

The spin degeneracy of the band structure. Default: 2

When method is set to skints:

skdata:
type: str
argument path: train_options/loss_options/validation[skints]/skdata

The path to the skfile or sk database.

When method is set to hamil_abs:

onsite_shift:
type: bool, optional, default: False
argument path: train_options/loss_options/validation[hamil_abs]/onsite_shift

Whether to use the onsite shift in the loss function. Default: False

When method is set to hamil_blas:

onsite_shift:
type: bool, optional, default: False
argument path: train_options/loss_options/validation[hamil_blas]/onsite_shift

Whether to use the onsite shift in the loss function. Default: False

reference:
type: dict, optional
argument path: train_options/loss_options/reference

Loss options for reference data in training.

Depending on the value of method, different sub args are accepted.

method:
type: str (flag key)
argument path: train_options/loss_options/reference/method
possible choices: eigvals, skints, hamil_abs, hamil_blas

The loss function type, defined by a string like <fitting target>_<loss type>. Supported loss functions include:

  • eigvals: the MSE loss between the predicted and labeled eigenvalues, and between the eigenvalue differences across different k-points.

  • skints: the loss fitted to Slater-Koster integrals from a reference skfile or sk database (see skdata).

  • hamil_abs

  • hamil_blas

When method is set to eigvals:

diff_on:
type: bool, optional, default: False
argument path: train_options/loss_options/reference[eigvals]/diff_on

Whether to use random differences in the loss function. Default: False

eout_weight:
type: float, optional, default: 0.01
argument path: train_options/loss_options/reference[eigvals]/eout_weight

The weight applied to eigenvalues out of range. Default: 0.01

diff_weight:
type: float, optional, default: 0.01
argument path: train_options/loss_options/reference[eigvals]/diff_weight

The weight applied to the eigenvalue differences. Default: 0.01

diff_valence:
type: dict | NoneType, optional, default: None
argument path: train_options/loss_options/reference[eigvals]/diff_valence

Sets the difference in the number of valence electrons between DFT and TB, e.g. {"A": 6, "B": 7}. Default: None, which means no difference.

spin_deg:
type: int, optional, default: 2
argument path: train_options/loss_options/reference[eigvals]/spin_deg

The spin degeneracy of the band structure. Default: 2

When method is set to skints:

skdata:
type: str
argument path: train_options/loss_options/reference[skints]/skdata

The path to the skfile or sk database.

When method is set to hamil_abs:

onsite_shift:
type: bool, optional, default: False
argument path: train_options/loss_options/reference[hamil_abs]/onsite_shift

Whether to use the onsite shift in the loss function. Default: False

When method is set to hamil_blas:

onsite_shift:
type: bool, optional, default: False
argument path: train_options/loss_options/reference[hamil_blas]/onsite_shift

Whether to use the onsite shift in the loss function. Default: False
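
Putting the pieces together, a complete train_options section could look like the following sketch (all values are illustrative, not recommendations):

```json
{
    "train_options": {
        "num_epoch": 500,
        "batch_size": 1,
        "optimizer": {
            "type": "Adam",
            "lr": 0.001
        },
        "lr_scheduler": {
            "type": "exp",
            "gamma": 0.999
        },
        "save_freq": 10,
        "validation_freq": 10,
        "display_freq": 1,
        "max_ckpt": 4,
        "loss_options": {
            "train": {"method": "eigvals"},
            "validation": {"method": "eigvals"}
        }
    }
}
```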