Change: allow setting the random seed when splitting the dataset
parent 5297a81a81
commit 92bc054265
@@ -16,7 +16,7 @@ dataset:
 model:
   save_dir: "models"
   batch_size: 32
-  num_workers: 4
+  num_workers: 1
 
 # System monitoring configuration
 monitor:
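@@ -30,6 +30,16 @@
 Test set ground truth:
 Not yet available
 Note: each training set file corresponds to different field conditions; do not mix them.
+Test results:
+train_FD001.txt
+Training set:
+mae: 16.xx
+Validation set:
+mae: 30.xx
+Analysis:
+For data that carries temporal information, fitting with a random forest alone performs poorly.
+Although the data has 26 feature columns, some of them hold identical values throughout and carry no useful information.
+
 2. Energy industry: wind power forecasting
 (1) Predict wind power generation from weather and historical data
 (2) XGBoost, LSTM, and Prophet time-series models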
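The remark in the analysis about feature columns with identical values can be checked mechanically. Below is a minimal sketch, assuming the training file has already been loaded into a pandas DataFrame. The helper name `drop_constant_columns` and the toy column names and values are hypothetical and not part of this commit.

```python
import pandas as pd


def drop_constant_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df without columns that hold a single distinct value."""
    # dropna=False counts NaN as a value, so an all-NaN column also counts as constant.
    n_distinct = df.nunique(dropna=False)
    keep = n_distinct[n_distinct > 1].index
    return df.loc[:, keep].copy()


# Toy frame with made-up values: "setting_3" never changes, so it is dropped.
toy = pd.DataFrame(
    {
        "cycle": [1, 2, 3, 4],
        "setting_3": [100.0, 100.0, 100.0, 100.0],
        "sensor_2": [641.8, 642.1, 642.4, 642.3],
    }
)
reduced = drop_constant_columns(toy)
print(list(reduced.columns))
```

Running a filter like this before fitting would show how many of the 26 columns actually vary within a given training file.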
@@ -424,7 +424,8 @@ class DataManager:
         self,
         df: pd.DataFrame,
         test_size: float = 0,
-        val_size: float = 0
+        val_size: float = 0,
+        random_state: int = 42
     ) -> Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame]:
         """Split the dataset."""
         try:
@@ -433,7 +434,7 @@ class DataManager:
                 train_val_data, test_data = train_test_split(
                     df,
                     test_size=test_size,
-                    random_state=42
+                    random_state=random_state
                 )
             else:
                 train_val_data = df
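To illustrate what a caller-supplied `random_state` makes possible, here is a sketch of a standalone function with the same parameters as the signature changed in this commit, built on scikit-learn's `train_test_split`. The function name `split_data`, the way `val_size` is handled, and the toy data are assumptions; the real method body is only partly visible in this diff.

```python
from typing import Tuple

import pandas as pd
from sklearn.model_selection import train_test_split


def split_data(
    df: pd.DataFrame,
    test_size: float = 0,
    val_size: float = 0,
    random_state: int = 42,
) -> Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame]:
    """Split df into (train, val, test); sizes are fractions of the full frame."""
    if test_size > 0:
        train_val, test = train_test_split(
            df, test_size=test_size, random_state=random_state
        )
    else:
        train_val, test = df, df.iloc[0:0]  # empty test frame

    if val_size > 0:
        # Assumption: val_size is a fraction of the original frame, so rescale
        # it relative to what is left after the test split.
        rel_val = val_size / (1 - test_size)
        train, val = train_test_split(
            train_val, test_size=rel_val, random_state=random_state
        )
    else:
        train, val = train_val, df.iloc[0:0]  # empty validation frame
    return train, val, test


toy = pd.DataFrame({"x": range(20), "y": range(20, 40)})
a = split_data(toy, test_size=0.2, val_size=0.2, random_state=7)
b = split_data(toy, test_size=0.2, val_size=0.2, random_state=7)
# The same seed should reproduce the same three partitions.
print(all(p.equals(q) for p, q in zip(a, b)))
```

Keeping 42 as the default preserves the earlier behaviour for existing callers, while passing a different seed allows repeated experiments over different partitions.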