# Train Options
- train_options:
  - type: dict, optional
  - argument path: train_options
  - Options that define the training behaviour of DeePTB.
- num_epoch:
  - type: int
  - argument path: train_options/num_epoch
  - Total number of training epochs. Note that if the model is reloaded with the -r or --restart option, the epochs already trained are counted from the time the checkpoint was saved.
- batch_size:
  - type: int, optional, default: 1
  - argument path: train_options/batch_size
  - The batch size used in training. Default: 1
- ref_batch_size:
  - type: int, optional, default: 1
  - argument path: train_options/ref_batch_size
  - The batch size used for reference data. Default: 1
- val_batch_size:
  - type: int, optional, default: 1
  - argument path: train_options/val_batch_size
  - The batch size used for validation data. Default: 1
- optimizer:
  - type: dict, optional, default: {}
  - argument path: train_options/optimizer
  - The optimizer settings that select the gradient optimizer for model training. Supported optimizers include Adam, SGD and LBFGS. For more information about these optimization algorithms, see:
    - Adam: [Adam: A Method for Stochastic Optimization.](https://arxiv.org/abs/1412.6980)
    - SGD: [Stochastic Gradient Descent.](https://pytorch.org/docs/stable/generated/torch.optim.SGD.html)
    - LBFGS: [On the limited memory BFGS method for large scale optimization.](http://users.iems.northwestern.edu/~nocedal/PDFfiles/limited-memory.pdf)

  Depending on the value of type, different sub-arguments are accepted.
- type:
  - type: str (flag key), default: Adam
  - argument path: train_options/optimizer/type
  - Selects the type of optimizer. Supported types: Adam, SGD and LBFGS. Default: Adam
When type is set to Adam:

- lr:
  - type: float, optional, default: 0.001
  - argument path: train_options/optimizer[Adam]/lr
  - Learning rate. Default: 1e-3
- betas:
  - type: list, optional, default: [0.9, 0.999]
  - argument path: train_options/optimizer[Adam]/betas
  - Coefficients used for computing running averages of the gradient and its square. Default: (0.9, 0.999)
- eps:
  - type: float, optional, default: 1e-08
  - argument path: train_options/optimizer[Adam]/eps
  - Term added to the denominator to improve numerical stability. Default: 1e-8
- weight_decay:
  - type: float, optional, default: 0
  - argument path: train_options/optimizer[Adam]/weight_decay
  - Weight decay (L2 penalty). Default: 0
- amsgrad:
  - type: bool, optional, default: False
  - argument path: train_options/optimizer[Adam]/amsgrad
  - Whether to use the AMSGrad variant of this algorithm from the paper [On the Convergence of Adam and Beyond](https://openreview.net/forum?id=ryQu7f-RZ). Default: False
When type is set to SGD:

- lr:
  - type: float, optional, default: 0.001
  - argument path: train_options/optimizer[SGD]/lr
  - Learning rate. Default: 1e-3
- momentum:
  - type: float, optional, default: 0.0
  - argument path: train_options/optimizer[SGD]/momentum
  - Momentum factor. Default: 0
- weight_decay:
  - type: float, optional, default: 0.0
  - argument path: train_options/optimizer[SGD]/weight_decay
  - Weight decay (L2 penalty). Default: 0
- dampening:
  - type: float, optional, default: 0.0
  - argument path: train_options/optimizer[SGD]/dampening
  - Dampening for momentum. Default: 0
- nesterov:
  - type: bool, optional, default: False
  - argument path: train_options/optimizer[SGD]/nesterov
  - Enables Nesterov momentum. Default: False
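As a sketch of how the flag-keyed optimizer options above can be resolved, the hypothetical helper below merges user-supplied options with the documented defaults for the chosen type. This is not DeePTB's actual option-handling code, and the LBFGS sub-arguments (not documented in this section) are omitted.

```python
# Hypothetical helper mirroring the documented optimizer defaults above.
# DeePTB's real option resolution (and its LBFGS sub-arguments) may differ.
ADAM_DEFAULTS = {"lr": 1e-3, "betas": [0.9, 0.999], "eps": 1e-8,
                 "weight_decay": 0, "amsgrad": False}
SGD_DEFAULTS = {"lr": 1e-3, "momentum": 0.0, "weight_decay": 0.0,
                "dampening": 0.0, "nesterov": False}

def resolve_optimizer_options(user_opts):
    """Merge user-supplied optimizer options with the documented defaults."""
    opts = dict(user_opts)
    otype = opts.pop("type", "Adam")  # flag key, default: Adam
    defaults = {"Adam": ADAM_DEFAULTS, "SGD": SGD_DEFAULTS}[otype]
    return {"type": otype, **defaults, **opts}
```

For example, `resolve_optimizer_options({"type": "SGD", "lr": 0.01})` keeps the user's learning rate while filling momentum, dampening and nesterov from the defaults.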
- lr_scheduler:
  - type: dict, optional, default: {}
  - argument path: train_options/lr_scheduler
  - The learning rate scheduler settings. The scheduler scales down the learning rate during training; a proper setting can make training more stable and efficient. Supported lr schedulers include: exponential decay (exp), linear multiplication (linear) and reduce on plateau (rop).

  Depending on the value of type, different sub-arguments are accepted.
- type:
  - type: str (flag key), default: exp
  - argument path: train_options/lr_scheduler/type
  - Selects the type of lr_scheduler. Supported types: exp, linear, rop
When type is set to exp:

- gamma:
  - type: float, optional, default: 0.999
  - argument path: train_options/lr_scheduler[exp]/gamma
  - Multiplicative factor of learning rate decay.

When type is set to linear:

- start_factor:
  - type: float, optional, default: 0.3333333
  - argument path: train_options/lr_scheduler[linear]/start_factor
  - The number the learning rate is multiplied by in the first epoch. The multiplication factor changes towards end_factor in the following epochs. Default: 1./3.
- end_factor:
  - type: float, optional, default: 0.3333333
  - argument path: train_options/lr_scheduler[linear]/end_factor
  - The number the learning rate is multiplied by at the end of the linear changing process. Default: 1./3.
- total_iters:
  - type: int, optional, default: 5
  - argument path: train_options/lr_scheduler[linear]/total_iters
  - The number of iterations after which the multiplicative factor reaches end_factor. Default: 5
When type is set to rop (reduce on plateau):

- mode:
  - type: str, optional, default: min
  - argument path: train_options/lr_scheduler[rop]/mode
  - One of min, max. In min mode, the lr will be reduced when the monitored quantity has stopped decreasing; in max mode, when it has stopped increasing. Default: min
- factor:
  - type: float, optional, default: 0.1
  - argument path: train_options/lr_scheduler[rop]/factor
  - Factor by which the learning rate will be reduced: new_lr = lr * factor. Default: 0.1
- patience:
  - type: int, optional, default: 10
  - argument path: train_options/lr_scheduler[rop]/patience
  - Number of epochs with no improvement after which the learning rate will be reduced. For example, if patience = 2, the first 2 epochs with no improvement are ignored, and the lr is reduced only after the 3rd epoch if the loss still has not improved. Default: 10
- threshold:
  - type: float, optional, default: 0.0001
  - argument path: train_options/lr_scheduler[rop]/threshold
  - Threshold for measuring the new optimum, to focus only on significant changes. Default: 1e-4
- threshold_mode:
  - type: str, optional, default: rel
  - argument path: train_options/lr_scheduler[rop]/threshold_mode
  - One of rel, abs. In rel mode, dynamic_threshold = best * (1 + threshold) in max mode or best * (1 - threshold) in min mode. In abs mode, dynamic_threshold = best + threshold in max mode or best - threshold in min mode. Default: rel
- cooldown:
  - type: int, optional, default: 0
  - argument path: train_options/lr_scheduler[rop]/cooldown
  - Number of epochs to wait before resuming normal operation after the lr has been reduced. Default: 0
- min_lr:
  - type: list|float, optional, default: 0
  - argument path: train_options/lr_scheduler[rop]/min_lr
  - A scalar or a list of scalars: a lower bound on the learning rate of all param groups or of each group respectively. Default: 0
- eps:
  - type: float, optional, default: 1e-08
  - argument path: train_options/lr_scheduler[rop]/eps
  - Minimal decay applied to the lr. If the difference between the new and old lr is smaller than eps, the update is ignored. Default: 1e-8
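To make the scheduler behaviour concrete, the sketch below computes the multiplicative factor each scheduler applies to the base learning rate at a given epoch. It follows the semantics of PyTorch's ExponentialLR and LinearLR, which are assumed here to match DeePTB's exp and linear types.

```python
def exp_factor(gamma, epoch):
    # exp: the lr is multiplied by gamma once per epoch.
    return gamma ** epoch

def linear_factor(start_factor, end_factor, total_iters, epoch):
    # linear: the factor moves linearly from start_factor towards
    # end_factor, reaching it at epoch == total_iters and staying there.
    if epoch >= total_iters:
        return end_factor
    return start_factor + (end_factor - start_factor) * epoch / total_iters
```

With start_factor = 1/3, end_factor = 1.0 and total_iters = 5, the learning rate warms up from lr/3 to the full lr over the first 5 epochs and then stays constant.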
- save_freq:
  - type: int, optional, default: 10
  - argument path: train_options/save_freq
  - Frequency (every how many iterations) at which the current model is saved to a checkpoint. The checkpoint name is formulated as latest|best_dptb|nnsk_b<bond_cutoff>_c<sk_cutoff>_w<sk_decay_w>. Default: 10
- validation_freq:
  - type: int, optional, default: 10
  - argument path: train_options/validation_freq
  - Frequency (every how many iterations) at which the model is validated on the validation dataset. Default: 10
- display_freq:
  - type: int, optional, default: 1
  - argument path: train_options/display_freq
  - Frequency (every how many iterations) at which the training log is displayed on screen. Default: 1
- max_ckpt:
  - type: int, optional, default: 4
  - argument path: train_options/max_ckpt
  - The maximum number of saved checkpoints. Default: 4
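Putting the options above together, a minimal train_options block might look like the sketch below. The key names follow this reference; the values are illustrative examples, not recommendations.

```python
import json

# Illustrative train_options fragment; key names follow this reference,
# values are examples only.
train_options = {
    "num_epoch": 500,
    "batch_size": 1,
    "optimizer": {"type": "Adam", "lr": 1e-3},
    "lr_scheduler": {"type": "exp", "gamma": 0.999},
    "save_freq": 10,
    "validation_freq": 10,
    "display_freq": 1,
    "max_ckpt": 4,
}
print(json.dumps(train_options, indent=2))
```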
- loss_options:
  - type: dict
  - argument path: train_options/loss_options
- train:
  - type: dict
  - argument path: train_options/loss_options/train
  - Loss options for training.

  Depending on the value of method, different sub-arguments are accepted.
- method:
  - type: str (flag key)
  - argument path: train_options/loss_options/train/method
  - The loss function type, defined by a string like <fitting target>_<loss type>. Default: eigs_l2dsf. Supported loss functions include:
    - eigvals: the MSE loss between predicted and labeled eigenvalues, and the delta of eigenvalues between different k points.
    - hamil
    - hamil_abs
    - hamil_blas
    - skints
When method is set to eigvals:

- diff_on:
  - type: bool, optional, default: False
  - argument path: train_options/loss_options/train[eigvals]/diff_on
  - Whether to use random differences in the loss function. Default: False
- eout_weight:
  - type: float, optional, default: 0.01
  - argument path: train_options/loss_options/train[eigvals]/eout_weight
  - The weight of eigenvalues out of range. Default: 0.01
- diff_weight:
  - type: float, optional, default: 0.01
  - argument path: train_options/loss_options/train[eigvals]/diff_weight
  - The weight of the eigenvalue difference. Default: 0.01
- diff_valence:
  - type: dict|NoneType, optional, default: None
  - argument path: train_options/loss_options/train[eigvals]/diff_valence
  - Sets the difference in the number of valence electrons between DFT and TB, e.g. {'A': 6, 'B': 7}. Default: None, which means no difference.
- spin_deg:
  - type: int, optional, default: 2
  - argument path: train_options/loss_options/train[eigvals]/spin_deg
  - The spin degeneracy of the band structure. Default: 2
When method is set to skints:

- skdata:
  - type: str
  - argument path: train_options/loss_options/train[skints]/skdata
  - The path to the skfile or sk database.

When method is set to hamil_abs:

- onsite_shift:
  - type: bool, optional, default: False
  - argument path: train_options/loss_options/train[hamil_abs]/onsite_shift
  - Whether to use onsite shift in the loss function. Default: False

When method is set to hamil_blas:

- onsite_shift:
  - type: bool, optional, default: False
  - argument path: train_options/loss_options/train[hamil_blas]/onsite_shift
  - Whether to use onsite shift in the loss function. Default: False
- validation:
  - type: dict, optional
  - argument path: train_options/loss_options/validation
  - Loss options for validation.

  Depending on the value of method, different sub-arguments are accepted.
- method:
  - type: str (flag key)
  - argument path: train_options/loss_options/validation/method
  - The loss function type, defined by a string like <fitting target>_<loss type>. Default: eigs_l2dsf. Supported loss functions include:
    - eigvals: the MSE loss between predicted and labeled eigenvalues, and the delta of eigenvalues between different k points.
    - hamil
    - hamil_abs
    - hamil_blas
    - skints
When method is set to eigvals:

- diff_on:
  - type: bool, optional, default: False
  - argument path: train_options/loss_options/validation[eigvals]/diff_on
  - Whether to use random differences in the loss function. Default: False
- eout_weight:
  - type: float, optional, default: 0.01
  - argument path: train_options/loss_options/validation[eigvals]/eout_weight
  - The weight of eigenvalues out of range. Default: 0.01
- diff_weight:
  - type: float, optional, default: 0.01
  - argument path: train_options/loss_options/validation[eigvals]/diff_weight
  - The weight of the eigenvalue difference. Default: 0.01
- diff_valence:
  - type: dict|NoneType, optional, default: None
  - argument path: train_options/loss_options/validation[eigvals]/diff_valence
  - Sets the difference in the number of valence electrons between DFT and TB, e.g. {'A': 6, 'B': 7}. Default: None, which means no difference.
- spin_deg:
  - type: int, optional, default: 2
  - argument path: train_options/loss_options/validation[eigvals]/spin_deg
  - The spin degeneracy of the band structure. Default: 2
When method is set to skints:

- skdata:
  - type: str
  - argument path: train_options/loss_options/validation[skints]/skdata
  - The path to the skfile or sk database.

When method is set to hamil_abs:

- onsite_shift:
  - type: bool, optional, default: False
  - argument path: train_options/loss_options/validation[hamil_abs]/onsite_shift
  - Whether to use onsite shift in the loss function. Default: False

When method is set to hamil_blas:

- onsite_shift:
  - type: bool, optional, default: False
  - argument path: train_options/loss_options/validation[hamil_blas]/onsite_shift
  - Whether to use onsite shift in the loss function. Default: False
- reference:
  - type: dict, optional
  - argument path: train_options/loss_options/reference
  - Loss options for reference data in training.

  Depending on the value of method, different sub-arguments are accepted.
- method:
  - type: str (flag key)
  - argument path: train_options/loss_options/reference/method
  - The loss function type, defined by a string like <fitting target>_<loss type>. Default: eigs_l2dsf. Supported loss functions include:
    - eigvals: the MSE loss between predicted and labeled eigenvalues, and the delta of eigenvalues between different k points.
    - hamil
    - hamil_abs
    - hamil_blas
    - skints
When method is set to eigvals:

- diff_on:
  - type: bool, optional, default: False
  - argument path: train_options/loss_options/reference[eigvals]/diff_on
  - Whether to use random differences in the loss function. Default: False
- eout_weight:
  - type: float, optional, default: 0.01
  - argument path: train_options/loss_options/reference[eigvals]/eout_weight
  - The weight of eigenvalues out of range. Default: 0.01
- diff_weight:
  - type: float, optional, default: 0.01
  - argument path: train_options/loss_options/reference[eigvals]/diff_weight
  - The weight of the eigenvalue difference. Default: 0.01
- diff_valence:
  - type: dict|NoneType, optional, default: None
  - argument path: train_options/loss_options/reference[eigvals]/diff_valence
  - Sets the difference in the number of valence electrons between DFT and TB, e.g. {'A': 6, 'B': 7}. Default: None, which means no difference.
- spin_deg:
  - type: int, optional, default: 2
  - argument path: train_options/loss_options/reference[eigvals]/spin_deg
  - The spin degeneracy of the band structure. Default: 2
When method is set to skints:

- skdata:
  - type: str
  - argument path: train_options/loss_options/reference[skints]/skdata
  - The path to the skfile or sk database.

When method is set to hamil_abs:

- onsite_shift:
  - type: bool, optional, default: False
  - argument path: train_options/loss_options/reference[hamil_abs]/onsite_shift
  - Whether to use onsite shift in the loss function. Default: False

When method is set to hamil_blas:

- onsite_shift:
  - type: bool, optional, default: False
  - argument path: train_options/loss_options/reference[hamil_blas]/onsite_shift
  - Whether to use onsite shift in the loss function. Default: False
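Since train, validation and reference accept the same method flag and sub-arguments, one loss_options block can cover all three. The sketch below uses the eigvals method; the sub-argument values shown are the defaults documented above, and the fragment is illustrative rather than a recommended configuration.

```python
# Illustrative loss_options fragment; the eigvals sub-arguments shown
# are the documented defaults.
loss_options = {
    "train": {
        "method": "eigvals",
        "diff_on": False,
        "eout_weight": 0.01,
        "diff_weight": 0.01,
        "spin_deg": 2,
    },
    "validation": {"method": "eigvals"},
    "reference": {"method": "eigvals"},
}
```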