Train Options

- train_options:
  type: `dict`, optional
  argument path: `train_options`
  Options that define the training behaviour of DeePTB.
- num_epoch:
  type: `int`
  argument path: `train_options/num_epoch`
  Total number of training epochs. Note that if the model is reloaded with the `-r` or `--restart` option, the epochs already trained when the checkpoint was saved count toward this total.

- batch_size:
  type: `int`, optional, default: `1`
  argument path: `train_options/batch_size`
  The batch size used in training. Default: 1

- ref_batch_size:
  type: `int`, optional, default: `1`
  argument path: `train_options/ref_batch_size`
  The batch size used for reference data. Default: 1

- val_batch_size:
  type: `int`, optional, default: `1`
  argument path: `train_options/val_batch_size`
  The batch size used for validation data. Default: 1
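Taken together, these top-level fields are set in the `train_options` block of the input JSON. A minimal sketch (the values are illustrative, not recommendations):

```json
{
    "train_options": {
        "num_epoch": 500,
        "batch_size": 1,
        "val_batch_size": 1
    }
}
```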
- optimizer:
  type: `dict`, optional, default: `{}`
  argument path: `train_options/optimizer`
  The optimizer settings for selecting the gradient optimizer used in model training. Supported optimizers include Adam, SGD and LBFGS.
  For more information about these optimization algorithms, see:
  Adam: [Adam: A Method for Stochastic Optimization.](https://arxiv.org/abs/1412.6980)
  SGD: [Stochastic Gradient Descent.](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html)
  LBFGS: [On the limited memory BFGS method for large scale optimization.](http://users.iems.northwestern.edu/~nocedal/PDFfiles/limited-memory.pdf)
  Depending on the value of type, different sub-arguments are accepted.

- type:
  type: `str` (flag key), default: `Adam`
  argument path: `train_options/optimizer/type`
  Selects the type of optimizer; supported types are Adam, SGD and LBFGS. Default: Adam
When type is set to `Adam`:

- lr:
  type: `float`, optional, default: `0.001`
  argument path: `train_options/optimizer[Adam]/lr`
  Learning rate. Default: 1e-3

- betas:
  type: `list`, optional, default: `[0.9, 0.999]`
  argument path: `train_options/optimizer[Adam]/betas`
  Coefficients used for computing running averages of the gradient and its square. Default: (0.9, 0.999)

- eps:
  type: `float`, optional, default: `1e-08`
  argument path: `train_options/optimizer[Adam]/eps`
  Term added to the denominator to improve numerical stability. Default: 1e-8

- weight_decay:
  type: `float`, optional, default: `0`
  argument path: `train_options/optimizer[Adam]/weight_decay`
  Weight decay (L2 penalty). Default: 0

- amsgrad:
  type: `bool`, optional, default: `False`
  argument path: `train_options/optimizer[Adam]/amsgrad`
  Whether to use the AMSGrad variant of this algorithm from the paper [On the Convergence of Adam and Beyond](https://openreview.net/forum?id=ryQu7f-RZ). Default: False
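As a sketch, an Adam optimizer block spelling out the documented defaults would look like:

```json
{
    "optimizer": {
        "type": "Adam",
        "lr": 0.001,
        "betas": [0.9, 0.999],
        "eps": 1e-8,
        "weight_decay": 0,
        "amsgrad": false
    }
}
```

Since every key here is optional with exactly these defaults, `{"type": "Adam"}` is an equivalent minimal form.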
When type is set to `SGD`:

- lr:
  type: `float`, optional, default: `0.001`
  argument path: `train_options/optimizer[SGD]/lr`
  Learning rate. Default: 1e-3

- momentum:
  type: `float`, optional, default: `0.0`
  argument path: `train_options/optimizer[SGD]/momentum`
  Momentum factor. Default: 0

- weight_decay:
  type: `float`, optional, default: `0.0`
  argument path: `train_options/optimizer[SGD]/weight_decay`
  Weight decay (L2 penalty). Default: 0

- dampening:
  type: `float`, optional, default: `0.0`
  argument path: `train_options/optimizer[SGD]/dampening`
  Dampening for momentum. Default: 0

- nesterov:
  type: `bool`, optional, default: `False`
  argument path: `train_options/optimizer[SGD]/nesterov`
  Enables Nesterov momentum. Default: False
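A corresponding SGD block; the `momentum` and `nesterov` values below are illustrative, not the defaults (which are 0 and false):

```json
{
    "optimizer": {
        "type": "SGD",
        "lr": 0.001,
        "momentum": 0.9,
        "nesterov": true
    }
}
```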
- lr_scheduler:
  type: `dict`, optional, default: `{}`
  argument path: `train_options/lr_scheduler`
  The learning rate scheduler settings. The scheduler scales down the learning rate during training; a proper setting can make training more stable and efficient. Supported lr schedulers include: exponential decay (exp), linear multiplication (linear) and reduce on plateau (rop).
  Depending on the value of type, different sub-arguments are accepted.

- type:
  type: `str` (flag key), default: `exp`
  argument path: `train_options/lr_scheduler/type`
  Selects the type of lr_scheduler; supported types are exp, linear and rop.
When type is set to `exp`:

- gamma:
  type: `float`, optional, default: `0.999`
  argument path: `train_options/lr_scheduler[exp]/gamma`
  Multiplicative factor of learning rate decay.
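With `exp`, the learning rate is multiplied by `gamma` at every scheduler step, so after n steps it is lr x gamma^n; for instance, gamma = 0.999 decays the rate to roughly 37% of its initial value after about 1000 steps. A sketch:

```json
{
    "lr_scheduler": {
        "type": "exp",
        "gamma": 0.999
    }
}
```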
When type is set to `linear`:

- start_factor:
  type: `float`, optional, default: `0.3333333`
  argument path: `train_options/lr_scheduler[linear]/start_factor`
  The number the learning rate is multiplied by in the first epoch; the multiplication factor changes towards end_factor in the following epochs. Default: 1./3.

- end_factor:
  type: `float`, optional, default: `0.3333333`
  argument path: `train_options/lr_scheduler[linear]/end_factor`
  The number the learning rate is multiplied by at the end of the linear schedule. Default: 1./3.

- total_iters:
  type: `int`, optional, default: `5`
  argument path: `train_options/lr_scheduler[linear]/total_iters`
  The number of iterations over which the multiplicative factor changes from start_factor to end_factor. Default: 5.
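A linear scheduler can also express a warm-up. The sketch below (illustrative values, not the defaults) ramps the learning rate from lr/3 up to the full lr over the first 5 iterations:

```json
{
    "lr_scheduler": {
        "type": "linear",
        "start_factor": 0.3333333,
        "end_factor": 1.0,
        "total_iters": 5
    }
}
```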
When type is set to `rop` (reduce on plateau):

- mode:
  type: `str`, optional, default: `min`
  argument path: `train_options/lr_scheduler[rop]/mode`
  One of min, max. In min mode, the lr is reduced when the monitored quantity has stopped decreasing; in max mode, when it has stopped increasing. Default: 'min'.

- factor:
  type: `float`, optional, default: `0.1`
  argument path: `train_options/lr_scheduler[rop]/factor`
  Factor by which the learning rate will be reduced: new_lr = lr * factor. Default: 0.1.

- patience:
  type: `int`, optional, default: `10`
  argument path: `train_options/lr_scheduler[rop]/patience`
  Number of epochs with no improvement after which the learning rate will be reduced. For example, if patience = 2, the first 2 epochs with no improvement are ignored, and the lr is decreased only after the 3rd epoch if the loss still has not improved. Default: 10.

- threshold:
  type: `float`, optional, default: `0.0001`
  argument path: `train_options/lr_scheduler[rop]/threshold`
  Threshold for measuring the new optimum, to focus only on significant changes. Default: 1e-4.

- threshold_mode:
  type: `str`, optional, default: `rel`
  argument path: `train_options/lr_scheduler[rop]/threshold_mode`
  One of rel, abs. In rel mode, dynamic_threshold = best * (1 + threshold) in max mode or best * (1 - threshold) in min mode. In abs mode, dynamic_threshold = best + threshold in max mode or best - threshold in min mode. Default: 'rel'.

- cooldown:
  type: `int`, optional, default: `0`
  argument path: `train_options/lr_scheduler[rop]/cooldown`
  Number of epochs to wait before resuming normal operation after the lr has been reduced. Default: 0.

- min_lr:
  type: `list` | `float`, optional, default: `0`
  argument path: `train_options/lr_scheduler[rop]/min_lr`
  A scalar or a list of scalars: a lower bound on the learning rate of all param groups or of each group respectively. Default: 0.

- eps:
  type: `float`, optional, default: `1e-08`
  argument path: `train_options/lr_scheduler[rop]/eps`
  Minimal decay applied to the lr. If the difference between the new and old lr is smaller than eps, the update is ignored. Default: 1e-8.
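A reduce-on-plateau sketch that halves the learning rate once the monitored loss has not improved for 20 epochs, with a floor of 1e-6 (values illustrative):

```json
{
    "lr_scheduler": {
        "type": "rop",
        "mode": "min",
        "factor": 0.5,
        "patience": 20,
        "min_lr": 1e-6
    }
}
```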
- save_freq:
  type: `int`, optional, default: `10`
  argument path: `train_options/save_freq`
  The frequency, in iterations, at which the current model is saved to a checkpoint. The checkpoint name follows the pattern latest|best_dptb|nnsk_b<bond_cutoff>_c<sk_cutoff>_w<sk_decay_w>. Default: 10

- validation_freq:
  type: `int`, optional, default: `10`
  argument path: `train_options/validation_freq`
  The frequency, in iterations, at which the model is validated on the validation datasets. Default: 10

- display_freq:
  type: `int`, optional, default: `1`
  argument path: `train_options/display_freq`
  The frequency, in iterations, at which the training log is displayed on screen. Default: 1

- max_ckpt:
  type: `int`, optional, default: `4`
  argument path: `train_options/max_ckpt`
  The maximum number of saved checkpoints. Default: 4
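These bookkeeping options combine naturally, e.g. to checkpoint and validate less often than logging (illustrative values):

```json
{
    "train_options": {
        "save_freq": 50,
        "validation_freq": 50,
        "display_freq": 1,
        "max_ckpt": 4
    }
}
```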
- loss_options:
  type: `dict`
  argument path: `train_options/loss_options`

- train:
  type: `dict`
  argument path: `train_options/loss_options/train`
  Loss options for training.
  Depending on the value of method, different sub-arguments are accepted.

- method:
  type: `str` (flag key)
  argument path: `train_options/loss_options/train/method`
  The loss function type, defined by a string like `<fitting target>_<loss type>`. Default: eigs_l2dsf. Supported loss functions include:
  eigvals: the MSE loss between predicted and labeled eigenvalues, and between their differences (Delta eigenvalues) across k-points.
  hamil:
  hamil_abs:
  hamil_blas:

When method is set to `eigvals`:

- diff_on:
  type: `bool`, optional, default: `False`
  argument path: `train_options/loss_options/train[eigvals]/diff_on`
  Whether to use random differences in the loss function. Default: False

- eout_weight:
  type: `float`, optional, default: `0.01`
  argument path: `train_options/loss_options/train[eigvals]/eout_weight`
  The weight of eigenvalues out of range. Default: 0.01

- diff_weight:
  type: `float`, optional, default: `0.01`
  argument path: `train_options/loss_options/train[eigvals]/diff_weight`
  The weight of the eigenvalue difference. Default: 0.01

- diff_valence:
  type: `dict` | `NoneType`, optional, default: `None`
  argument path: `train_options/loss_options/train[eigvals]/diff_valence`
  Sets the difference in the number of valence electrons between DFT and TB, e.g. {'A': 6, 'B': 7}. Default: None, meaning no difference.

- spin_deg:
  type: `int`, optional, default: `2`
  argument path: `train_options/loss_options/train[eigvals]/spin_deg`
  The spin degeneracy of the band structure. Default: 2

When method is set to `skints`:

- skdata:
  type: `str`
  argument path: `train_options/loss_options/train[skints]/skdata`
  The path to the skfile or SK database.

When method is set to `hamil_abs`:

- onsite_shift:
  type: `bool`, optional, default: `False`
  argument path: `train_options/loss_options/train[hamil_abs]/onsite_shift`
  Whether to use the onsite shift in the loss function. Default: False

When method is set to `hamil_blas`:

- onsite_shift:
  type: `bool`, optional, default: `False`
  argument path: `train_options/loss_options/train[hamil_blas]/onsite_shift`
  Whether to use the onsite shift in the loss function. Default: False
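A sketch of a `loss_options/train` block using the `eigvals` method with its documented defaults spelled out:

```json
{
    "loss_options": {
        "train": {
            "method": "eigvals",
            "diff_on": false,
            "eout_weight": 0.01,
            "diff_weight": 0.01,
            "spin_deg": 2
        }
    }
}
```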
- validation:
  type: `dict`, optional
  argument path: `train_options/loss_options/validation`
  Loss options for validation.
  Depending on the value of method, different sub-arguments are accepted.

- method:
  type: `str` (flag key)
  argument path: `train_options/loss_options/validation/method`
  The loss function type, defined by a string like `<fitting target>_<loss type>`. Default: eigs_l2dsf. Supported loss functions include:
  eigvals: the MSE loss between predicted and labeled eigenvalues, and between their differences (Delta eigenvalues) across k-points.
  hamil:
  hamil_abs:
  hamil_blas:

When method is set to `eigvals`:

- diff_on:
  type: `bool`, optional, default: `False`
  argument path: `train_options/loss_options/validation[eigvals]/diff_on`
  Whether to use random differences in the loss function. Default: False

- eout_weight:
  type: `float`, optional, default: `0.01`
  argument path: `train_options/loss_options/validation[eigvals]/eout_weight`
  The weight of eigenvalues out of range. Default: 0.01

- diff_weight:
  type: `float`, optional, default: `0.01`
  argument path: `train_options/loss_options/validation[eigvals]/diff_weight`
  The weight of the eigenvalue difference. Default: 0.01

- diff_valence:
  type: `dict` | `NoneType`, optional, default: `None`
  argument path: `train_options/loss_options/validation[eigvals]/diff_valence`
  Sets the difference in the number of valence electrons between DFT and TB, e.g. {'A': 6, 'B': 7}. Default: None, meaning no difference.

- spin_deg:
  type: `int`, optional, default: `2`
  argument path: `train_options/loss_options/validation[eigvals]/spin_deg`
  The spin degeneracy of the band structure. Default: 2

When method is set to `skints`:

- skdata:
  type: `str`
  argument path: `train_options/loss_options/validation[skints]/skdata`
  The path to the skfile or SK database.

When method is set to `hamil_abs`:

- onsite_shift:
  type: `bool`, optional, default: `False`
  argument path: `train_options/loss_options/validation[hamil_abs]/onsite_shift`
  Whether to use the onsite shift in the loss function. Default: False

When method is set to `hamil_blas`:

- onsite_shift:
  type: `bool`, optional, default: `False`
  argument path: `train_options/loss_options/validation[hamil_blas]/onsite_shift`
  Whether to use the onsite shift in the loss function. Default: False
- reference:
  type: `dict`, optional
  argument path: `train_options/loss_options/reference`
  Loss options for reference data in training.
  Depending on the value of method, different sub-arguments are accepted.

- method:
  type: `str` (flag key)
  argument path: `train_options/loss_options/reference/method`
  The loss function type, defined by a string like `<fitting target>_<loss type>`. Default: eigs_l2dsf. Supported loss functions include:
  eigvals: the MSE loss between predicted and labeled eigenvalues, and between their differences (Delta eigenvalues) across k-points.
  hamil:
  hamil_abs:
  hamil_blas:

When method is set to `eigvals`:

- diff_on:
  type: `bool`, optional, default: `False`
  argument path: `train_options/loss_options/reference[eigvals]/diff_on`
  Whether to use random differences in the loss function. Default: False

- eout_weight:
  type: `float`, optional, default: `0.01`
  argument path: `train_options/loss_options/reference[eigvals]/eout_weight`
  The weight of eigenvalues out of range. Default: 0.01

- diff_weight:
  type: `float`, optional, default: `0.01`
  argument path: `train_options/loss_options/reference[eigvals]/diff_weight`
  The weight of the eigenvalue difference. Default: 0.01

- diff_valence:
  type: `dict` | `NoneType`, optional, default: `None`
  argument path: `train_options/loss_options/reference[eigvals]/diff_valence`
  Sets the difference in the number of valence electrons between DFT and TB, e.g. {'A': 6, 'B': 7}. Default: None, meaning no difference.

- spin_deg:
  type: `int`, optional, default: `2`
  argument path: `train_options/loss_options/reference[eigvals]/spin_deg`
  The spin degeneracy of the band structure. Default: 2

When method is set to `skints`:

- skdata:
  type: `str`
  argument path: `train_options/loss_options/reference[skints]/skdata`
  The path to the skfile or SK database.

When method is set to `hamil_abs`:

- onsite_shift:
  type: `bool`, optional, default: `False`
  argument path: `train_options/loss_options/reference[hamil_abs]/onsite_shift`
  Whether to use the onsite shift in the loss function. Default: False

When method is set to `hamil_blas`:

- onsite_shift:
  type: `bool`, optional, default: `False`
  argument path: `train_options/loss_options/reference[hamil_blas]/onsite_shift`
  Whether to use the onsite shift in the loss function. Default: False
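Putting the pieces together, a full `train_options` block might be sketched as follows (all values illustrative; any omitted optional key falls back to its default):

```json
{
    "train_options": {
        "num_epoch": 500,
        "batch_size": 1,
        "optimizer": {
            "type": "Adam",
            "lr": 0.001
        },
        "lr_scheduler": {
            "type": "exp",
            "gamma": 0.999
        },
        "save_freq": 10,
        "validation_freq": 10,
        "display_freq": 1,
        "loss_options": {
            "train": {
                "method": "eigvals"
            }
        }
    }
}
```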