AutoTrain Documentation

Text Classification and Regression Parameters

You are viewing a version of the documentation that requires installation from source. If you want a regular pip install, see the latest stable version (v0.8.8).

--batch-size BATCH_SIZE
                    Training batch size to use
--seed SEED           Random seed for reproducibility
--epochs EPOCHS       Number of training epochs
--gradient_accumulation GRADIENT_ACCUMULATION
                    Gradient accumulation steps
--disable_gradient_checkpointing
                    Disable gradient checkpointing
--lr LR               Learning rate
--log {none,wandb,tensorboard}
                    Experiment tracker to log training metrics to
--text-column TEXT_COLUMN
                    Specify the column name in the dataset that contains the text data. Useful for distinguishing between multiple text fields.
                    Default is 'text'.
--target-column TARGET_COLUMN
                    Specify the column name that holds the target or label data for training. Helps in distinguishing different potential
                    outputs. Default is 'target'.
--max-seq-length MAX_SEQ_LENGTH
                    Set the maximum sequence length (number of tokens) that the model should handle in a single input. Longer sequences are
                    truncated. Affects both memory usage and computational requirements. Default is 128 tokens.
--warmup-ratio WARMUP_RATIO
                    Define the proportion of training steps dedicated to a linear warmup, during which the learning rate gradually increases.
                    This can help stabilize training early on. Default ratio is 0.1.
--optimizer OPTIMIZER
                    Choose the optimizer algorithm for training the model. Different optimizers can affect the training speed and model
                    performance. 'adamw_torch' is used by default.
--scheduler SCHEDULER
                    Select the learning rate scheduler, which adjusts the learning rate as training progresses. 'linear' decreases the
                    learning rate linearly from the initial value set by --lr. Default is 'linear'. Try 'cosine' for a cosine annealing schedule.
--weight-decay WEIGHT_DECAY
                    Set the weight decay rate to apply for regularization. Helps in preventing the model from overfitting by penalizing large
                    weights. Default is 0.0, meaning no weight decay is applied.
--max-grad-norm MAX_GRAD_NORM
                    Specify the maximum norm of the gradients for gradient clipping. Gradient clipping is used to prevent the exploding gradient
                    problem in deep neural networks. Default is 1.0.
--logging-steps LOGGING_STEPS
                    Determine how often to log training progress. Set this to the number of steps between each log output. -1 determines logging
                    steps automatically. Default is -1.
--eval-strategy {steps,epoch,no}
                    Specify how often to evaluate model performance. Options are 'no', 'steps', and 'epoch'. 'epoch' evaluates at the end of
                    each training epoch. Default is 'epoch'.
--save-total-limit SAVE_TOTAL_LIMIT
                    Limit the total number of model checkpoints to save. Helps manage disk space by retaining only the most recent checkpoints.
                    Default is to save only the latest one.
--auto-find-batch-size
                    Enable automatic batch size determination based on your hardware capabilities. When set, it tries to find the largest batch
                    size that fits in memory.
--mixed-precision {fp16,bf16,None}
                    Choose the precision mode for training to optimize performance and memory usage. Options are 'fp16', 'bf16', or None for
                    default precision. Default is None.
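
Several of the options above interact: --batch-size, --gradient_accumulation, and --epochs determine how many optimizer updates a run performs, and --warmup-ratio and --scheduler determine the learning rate at each of those updates. The following is a minimal sketch of that arithmetic for the default 'linear' scheduler, not AutoTrain's actual implementation. The function names `optimizer_steps` and `linear_lr`, the single-device assumption, and the example dataset size are all hypothetical and chosen only for illustration.

```python
import math


def optimizer_steps(num_examples: int, batch_size: int,
                    gradient_accumulation: int, epochs: int) -> int:
    """Approximate number of optimizer updates in a run (single device assumed)."""
    batches_per_epoch = math.ceil(num_examples / batch_size)
    # One optimizer update happens every `gradient_accumulation` batches.
    updates_per_epoch = max(1, batches_per_epoch // gradient_accumulation)
    return updates_per_epoch * epochs


def linear_lr(step: int, total_steps: int, base_lr: float,
              warmup_ratio: float) -> float:
    """Learning rate at `step`: linear warmup from 0, then linear decay to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp up from 0 towards base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: ramp down from base_lr to 0 at the final step.
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Hypothetical run: 8,000 examples with --batch-size 8,
# --gradient_accumulation 4, --epochs 3, --lr 5e-5, --warmup-ratio 0.1
total = optimizer_steps(8000, 8, 4, 3)
for step in (0, total // 20, total // 10, total // 2, total):
    print(step, linear_lr(step, total, 5e-5, 0.1))
```

In this hypothetical run the effective batch size is 8 × 4 = 32 examples per update, giving 750 updates in total. The learning rate rises from 0 over roughly the first tenth of those updates, peaks at the value passed to --lr, and then falls linearly back to 0 by the last update.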
