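I am following a YouTube tutorial called PyTorch for Deep Learning & Machine Learning.
I tried to build a very simple linear regression model based on the video. Below is the code for the model and the training loop, followed by its output.
For some reason the model does not train. I pass the parameters to the optimiser, create a loss function, backpropagate, and finally update the parameters with step(). As you can see from the output, the loss takes rather strange values. I don't understand why it isn't working.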
# Imports
import torch
from torch import nn
from torch import optim
# Create model class
class LinearRegressionModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear_layer = nn.Linear(in_features=1, out_features=1)

    # Forward method to define the computation in the model
    def forward(self, X: torch.Tensor) -> torch.Tensor:
        return self.linear_layer(X)
# Set manual seed
torch.manual_seed(42)
# Create data
weight, bias = 0.7, 0.3
start, end, step = 0, 1, 0.02
X = torch.arange(start, end, step).unsqueeze(dim=1)
y = weight*X + bias + 0.035*torch.randn_like(X)
# Create train/test split
train_split = int(0.8*len(X))
X_train, y_train = X[:train_split], y[:train_split]
X_test, y_test = X[train_split:], y[train_split:]
# Create model
model = LinearRegressionModel()
print(model, end="\n\n")
print(model.state_dict(), end="\n\n")
# Create loss function
loss_fn = nn.L1Loss()
# Create optimiser
optimiser = optim.SGD(params=model.parameters(), lr=1e2)
# Training loop
epochs = 200
for epoch in range(1, epochs+1):
    # Set model to training mode
    model.train()
    # Forward pass
    y_pred = model(X_train)
    # Calculate loss
    loss = loss_fn(y_pred, y_train)
    # Zero gradients in optimiser
    optimiser.zero_grad()
    # Backpropagate
    loss.backward()
    # Step model's parameters
    optimiser.step()
    ### Evaluate the current state
    model.eval()
    with torch.inference_mode():
        test_pred = model(X_test)
        test_loss = loss_fn(test_pred, y_test)
    # Print the current state
    if epoch == 1 or epoch % 10 == 0:
        print("Epoch: {:3} | Loss: {:.2f} | Test loss: {:.2f}".format(epoch, loss, test_loss))
print()
print(model.state_dict())
Output:
LinearRegressionModel(
  (linear_layer): Linear(in_features=1, out_features=1, bias=True)
)
OrderedDict([('linear_layer.weight', tensor([[0.8294]])), ('linear_layer.bias', tensor([-0.5927]))])
Epoch: 1 | Loss: 0.85 | Test loss: 133.93
Epoch: 10 | Loss: 114.36 | Test loss: 0.78
Epoch: 20 | Loss: 114.36 | Test loss: 0.78
Epoch: 30 | Loss: 114.36 | Test loss: 0.78
Epoch: 40 | Loss: 114.36 | Test loss: 0.78
Epoch: 50 | Loss: 114.36 | Test loss: 0.78
Epoch: 60 | Loss: 114.36 | Test loss: 0.78
Epoch: 70 | Loss: 114.36 | Test loss: 0.78
Epoch: 80 | Loss: 114.36 | Test loss: 0.78
Epoch: 90 | Loss: 114.36 | Test loss: 0.78
Epoch: 100 | Loss: 114.36 | Test loss: 0.78
Epoch: 110 | Loss: 114.36 | Test loss: 0.78
Epoch: 120 | Loss: 114.36 | Test loss: 0.78
Epoch: 130 | Loss: 114.36 | Test loss: 0.78
Epoch: 140 | Loss: 114.36 | Test loss: 0.78
Epoch: 150 | Loss: 114.36 | Test loss: 0.78
Epoch: 160 | Loss: 114.36 | Test loss: 0.78
Epoch: 170 | Loss: 114.36 | Test loss: 0.78
Epoch: 180 | Loss: 114.36 | Test loss: 0.78
Epoch: 190 | Loss: 114.36 | Test loss: 0.78
Epoch: 200 | Loss: 114.36 | Test loss: 0.78
OrderedDict([('linear_layer.weight', tensor([[0.8294]])), ('linear_layer.bias', tensor([-0.5927]))])
I think your approach is fine; you only need to make the following change:
lr=1e-2
After that change I get the following output:
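A learning rate that is too high usually overshoots the optimum on every step, which is why the earlier run did not converge.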
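To see why lr=1e2 leaves the parameters exactly where they started, here is a small pure-Python sketch (no PyTorch) of the same update rule. It is not the code from the question: mae_and_grads and train are hypothetical helpers, the data is a noise-free version of the training split (an assumption, so that the run is deterministic), and the starting weight and bias are copied from the printed state_dict. With L1 loss the gradient depends only on the sign of each residual. While every prediction sits on the same side of its target, the bias gradient is ±1 and the weight gradient is ± the mean of x (about 0.39), so one step with lr=1e2 moves the bias by about 100 and the weight by about 39, and the next step moves them straight back.

```python
def mae_and_grads(w, b, xs, ys):
    """Return the L1 (mean absolute error) loss and its subgradients w.r.t. w and b."""
    n = len(xs)
    loss = grad_w = grad_b = 0.0
    for x, y in zip(xs, ys):
        residual = (w * x + b) - y
        sign = (residual > 0) - (residual < 0)  # -1, 0 or +1
        loss += abs(residual) / n
        grad_w += sign * x / n
        grad_b += sign / n
    return loss, grad_w, grad_b


def train(lr, epochs=200, w=0.8294, b=-0.5927):
    """Plain gradient descent on the L1 loss; returns final w, b and per-epoch losses."""
    # Assumption: noise-free copy of the question's training split,
    # y = 0.7 * x + 0.3 on the first 40 points of arange(0, 1, 0.02).
    xs = [0.02 * i for i in range(40)]
    ys = [0.7 * x + 0.3 for x in xs]
    losses = []
    for _ in range(epochs):
        loss, grad_w, grad_b = mae_and_grads(w, b, xs, ys)
        losses.append(loss)  # loss measured before this epoch's update
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, losses


w_big, b_big, losses_big = train(lr=1e2)
w_small, b_small, losses_small = train(lr=1e-2)

print("lr=1e2 : w={:.4f} b={:.4f} last loss={:.2f}".format(w_big, b_big, losses_big[-1]))
print("lr=1e-2: w={:.4f} b={:.4f} last loss={:.2f}".format(w_small, b_small, losses_small[-1]))
```

Under these assumptions the lr=1e2 run ends back at its starting weight and bias after an even number of epochs, with the loss before the last update a little above 114, which lines up with the output in the question. The lr=1e-2 run takes steps small enough to bring the loss well below its starting value.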
I hope this helps. Thanks!