Added hyperparameter tuning documentation
This commit is contained in:
parent
1c91ba42ab
commit
485b4e497a
224
docs/dev/hyper_parameters.md
Normal file
@@ -0,0 +1,224 @@

# Model Hyperparameter Settings

## 1. PyTorch Neural Network

### Rocket Launcher Configuration

1. Input layer -> hidden layer 1: Linear(input_size -> 32) + ReLU + BatchNorm
2. Hidden layer 1 -> hidden layer 2: Linear(32 -> 16) + ReLU + BatchNorm
3. Hidden layer 2 -> hidden layer 3: Linear(16 -> 8) + ReLU + BatchNorm
4. Hidden layer 3 -> output layer: Linear(8 -> 1)

```python
learning_rate = 0.0003
weight_decay = 0.001
optimizer = AdamW(
    betas=(0.8, 0.9),
    eps=1e-8
)
loss_function = SmoothL1Loss(beta=0.1)
# scheduler: cosine annealing with warmup
# gradient clipping: max_norm = 0.1
```
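
For reference, a minimal sketch of how this configuration might be assembled; the `input_size` value and the warmup length are assumptions, and since PyTorch ships no single warmup-plus-cosine scheduler, `SequentialLR` chaining `LinearLR` into `CosineAnnealingLR` stands in for it here:

```python
import torch
import torch.nn as nn

input_size = 12  # hypothetical feature count

# The four layers listed above.
model = nn.Sequential(
    nn.Linear(input_size, 32), nn.ReLU(), nn.BatchNorm1d(32),
    nn.Linear(32, 16), nn.ReLU(), nn.BatchNorm1d(16),
    nn.Linear(16, 8), nn.ReLU(), nn.BatchNorm1d(8),
    nn.Linear(8, 1),
)

optimizer = torch.optim.AdamW(
    model.parameters(), lr=0.0003, weight_decay=0.001,
    betas=(0.8, 0.9), eps=1e-8,
)
loss_function = nn.SmoothL1Loss(beta=0.1)

# Linear warmup for the first 10 epochs, cosine annealing afterwards
# (epoch counts are assumptions).
scheduler = torch.optim.lr_scheduler.SequentialLR(
    optimizer,
    schedulers=[
        torch.optim.lr_scheduler.LinearLR(optimizer, start_factor=0.1, total_iters=10),
        torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=90),
    ],
    milestones=[10],
)

# In the training loop, clip gradients before optimizer.step():
#   torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=0.1)
```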

### Loitering Munition Configuration

Producer feature network (2 layers):

1. Linear(5 -> 4) + ReLU + BatchNorm + Dropout(0.2)

Equipment feature network (4 layers):

1. Linear(input_size-5 -> 64) + LeakyReLU + BatchNorm + Dropout
2. Linear(64 -> 32) + LeakyReLU + BatchNorm + Dropout
3. Linear(32 -> 16) + LeakyReLU + BatchNorm + Dropout

Merged network (4 layers):

1. Linear(20 -> 32) + LeakyReLU + BatchNorm + Dropout
2. Linear(32 -> 16) + LeakyReLU + BatchNorm + Dropout
3. Linear(16 -> 8) + LeakyReLU + BatchNorm
4. Linear(8 -> 1)

```python
learning_rate = 0.001
weight_decay = 0.001
optimizer = Adam(betas=(0.9, 0.999))
loss_function = MSELoss()
# scheduler: cosine annealing
```
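
A minimal sketch of this two-branch model; it assumes the first 5 input columns are the producer features, and the shared dropout rate `p` and the `input_size` value are likewise assumptions:

```python
import torch
import torch.nn as nn

class LoiteringMunitionNet(nn.Module):
    """Producer and equipment branches concatenated (4 + 16 = 20) into a merged head."""

    def __init__(self, input_size: int, p: float = 0.2):  # p beyond the stated 0.2 is assumed
        super().__init__()
        self.producer = nn.Sequential(
            nn.Linear(5, 4), nn.ReLU(), nn.BatchNorm1d(4), nn.Dropout(0.2),
        )
        self.equipment = nn.Sequential(
            nn.Linear(input_size - 5, 64), nn.LeakyReLU(), nn.BatchNorm1d(64), nn.Dropout(p),
            nn.Linear(64, 32), nn.LeakyReLU(), nn.BatchNorm1d(32), nn.Dropout(p),
            nn.Linear(32, 16), nn.LeakyReLU(), nn.BatchNorm1d(16), nn.Dropout(p),
        )
        self.merged = nn.Sequential(
            nn.Linear(20, 32), nn.LeakyReLU(), nn.BatchNorm1d(32), nn.Dropout(p),
            nn.Linear(32, 16), nn.LeakyReLU(), nn.BatchNorm1d(16), nn.Dropout(p),
            nn.Linear(16, 8), nn.LeakyReLU(), nn.BatchNorm1d(8),
            nn.Linear(8, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        producer_out = self.producer(x[:, :5])   # assumed column split
        equipment_out = self.equipment(x[:, 5:])
        return self.merged(torch.cat([producer_out, equipment_out], dim=1))

model = LoiteringMunitionNet(input_size=25)  # hypothetical input_size
optimizer = torch.optim.Adam(model.parameters(), lr=0.001,
                             weight_decay=0.001, betas=(0.9, 0.999))
loss_function = nn.MSELoss()
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)
```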

## 2. XGBoost

```python
n_estimators = 50
learning_rate = 0.03
max_depth = 3
min_child_weight = 5
subsample = 0.6
colsample_bytree = 0.6
reg_alpha = 0.5
reg_lambda = 2.0
gamma = 1
random_state = 42
```
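
These values map one-to-one onto the scikit-learn style estimator; a sketch (the training-data names in the comment are placeholders):

```python
from xgboost import XGBRegressor

model = XGBRegressor(
    n_estimators=50,
    learning_rate=0.03,
    max_depth=3,
    min_child_weight=5,
    subsample=0.6,
    colsample_bytree=0.6,
    reg_alpha=0.5,
    reg_lambda=2.0,
    gamma=1,
    random_state=42,
)
# model.fit(X_train, y_train); predictions = model.predict(X_test)
```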

## 3. LightGBM

```python
n_estimators = 50
learning_rate = 0.03
max_depth = 3
num_leaves = 8
subsample = 0.6
colsample_bytree = 0.6
reg_alpha = 0.5
reg_lambda = 2.0
min_child_samples = 10
min_split_gain = 1.0
random_state = 42
```
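
The same pattern with LightGBM's estimator:

```python
from lightgbm import LGBMRegressor

model = LGBMRegressor(
    n_estimators=50,
    learning_rate=0.03,
    max_depth=3,
    num_leaves=8,  # 2**max_depth, so the depth limit is the binding constraint
    subsample=0.6,
    colsample_bytree=0.6,
    reg_alpha=0.5,
    reg_lambda=2.0,
    min_child_samples=10,
    min_split_gain=1.0,
    random_state=42,
)
```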

## 4. GBM (Gradient Boosting Machine)

```python
n_estimators = 50
learning_rate = 0.03
max_depth = 3
min_samples_split = 10
min_samples_leaf = 5
subsample = 0.6
min_impurity_decrease = 0.01
random_state = 42
```
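
A sketch using scikit-learn's `GradientBoostingRegressor`, which these parameter names suggest is the GBM implementation meant here:

```python
from sklearn.ensemble import GradientBoostingRegressor

model = GradientBoostingRegressor(
    n_estimators=50,
    learning_rate=0.03,
    max_depth=3,
    min_samples_split=10,
    min_samples_leaf=5,
    subsample=0.6,
    min_impurity_decrease=0.01,
    random_state=42,
)
```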

## 5. Random Forest

```python
n_estimators = 100
max_depth = 4
min_samples_split = 5
min_samples_leaf = 3
max_features = 'sqrt'
bootstrap = True
random_state = 42
```
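
The corresponding scikit-learn sketch:

```python
from sklearn.ensemble import RandomForestRegressor

model = RandomForestRegressor(
    n_estimators=100,
    max_depth=4,
    min_samples_split=5,
    min_samples_leaf=3,
    max_features='sqrt',
    bootstrap=True,
    random_state=42,
)
```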

## 6. PLS Regression

```python
n_components = min(3, n_features // 5)
scale = True
max_iter = 500
tol = 1e-6
```
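
A sketch with scikit-learn's `PLSRegression`; the concrete `n_features` value is a placeholder:

```python
from sklearn.cross_decomposition import PLSRegression

n_features = 20  # placeholder, e.g. X_train.shape[1]
model = PLSRegression(
    n_components=min(3, n_features // 5),
    scale=True,
    max_iter=500,
    tol=1e-6,
)
```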

## Hyperparameter Tuning Strategy

### 1. Adjustments for Larger Sample Sizes

#### PyTorch Neural Network

- Increase network depth and width
  - Add more hidden layers between the existing ones
  - Moderately increase the number of neurons per layer
- Adjust the learning rate and optimizer
  - A larger learning rate can be used (e.g. 0.001-0.005)
  - Reduce weight_decay (e.g. to 0.0005)
- Reduce regularization strength
  - Lower the dropout rate (e.g. to 0.1)
  - Some BatchNorm layers can be removed

#### Tree Models (XGBoost / LightGBM / GBM)

- Increase the number of trees (n_estimators: 100-500)
- Increase tree depth (max_depth: 4-6)
- Relax the regularization parameters
  - reg_alpha: 0.3
  - reg_lambda: 1.0
- Raise the subsampling ratio (subsample: 0.8-0.9)

#### Random Forest

- Increase the number of trees (n_estimators: 200-500)
- Increase tree depth (max_depth: 6-8)
- Lower the minimum split sample counts
  - min_samples_split: 3
  - min_samples_leaf: 2

#### PLS Regression

- Increase the number of components (n_components)
- A nonlinear kernel variant could be considered

### 2. Adjustments for Changing Feature Counts

#### When the feature count increases

- Strengthen feature selection and dimensionality reduction
- Increase regularization strength
- Consider feature filtering methods
- Automated feature selection algorithms can be used

#### When the feature count decreases

- Simplify the model structure
- Reduce regularization strength
- Increase the weight given to each feature

### 3. Automated Tuning Recommendations

1. Grid search
   - Suitable when the parameter space is small
   - Can search the parameter space exhaustively

2. Random search
   - Suitable when the parameter space is large
   - More efficient than grid search

3. Bayesian optimization
   - Suitable when computational resources are limited
   - Searches the parameter space more intelligently

4. Cross-validation strategy (combined with grid search in the sketch after this list)
   - Large sample size: K-fold cross-validation (K=5 or 10)
   - Small sample size: leave-one-out cross-validation
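
A minimal scikit-learn sketch combining grid search with the sample-size-dependent CV choice above; the estimator, parameter grid, and 50-sample threshold are illustrative assumptions:

```python
from sklearn.model_selection import GridSearchCV, KFold, LeaveOneOut
from xgboost import XGBRegressor

def tune(X, y, small_sample_threshold=50):  # threshold is an assumption
    # Leave-one-out for small samples, 5-fold CV otherwise.
    if len(X) < small_sample_threshold:
        cv = LeaveOneOut()
    else:
        cv = KFold(n_splits=5, shuffle=True, random_state=42)
    param_grid = {  # illustrative grid around the defaults above
        "n_estimators": [50, 100, 200],
        "max_depth": [3, 4, 5],
        "learning_rate": [0.01, 0.03, 0.1],
    }
    search = GridSearchCV(
        XGBRegressor(random_state=42),
        param_grid,
        cv=cv,
        scoring="neg_mean_squared_error",
    )
    search.fit(X, y)
    return search.best_estimator_, search.best_params_
```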

### 4. Performance Monitoring Metrics

Monitor the following during tuning:

1. Training- and validation-set loss curves
2. Model complexity vs. performance gain
3. Training time vs. performance gain
4. Overfitting risk

### 5. Tuning Cautions

1. Preserve interpretability
   - Make sure results stay interpretable as model complexity grows
   - Record the reason for and effect of each parameter change

2. Balance computational resources
   - Find the balance between performance gains and compute cost
   - Account for the constraints of the actual deployment environment

3. Stability requirements
   - Make sure the model stays stable across different data distributions
   - Regularly validate model performance on fresh data

## Parameter Notes

All models set `random_state=42` so that results are reproducible. The parameters above have been tuned for small sample sizes and are deliberately conservative:

- Small learning rates: avoid overfitting and improve model stability
- Shallow tree depths: keep the models from growing overly complex
- Strong regularization: improve generalization
- Moderate subsampling ratios: improve robustness

These settings mainly account for the following factors:

1. Small sample size
2. Moderate feature dimensionality
3. Strong generalization requirements
4. High prediction-stability requirements