class MlpBarlowTwinsActor(nn.Module):
    """Actor that encodes a proprioceptive history into a Barlow-Twins-style
    latent plus an estimated 3-D velocity, then feeds both — together with the
    current observation frame — into an MLP policy head.

    Args:
        num_prop: size of a single proprioceptive observation frame.
        num_hist: number of history frames consumed by the encoder.
        mlp_encoder_dims: hidden-layer sizes of the history encoder.
        actor_dims: hidden-layer sizes of the policy MLP.
        latent_dim: dimensionality of the projected latent ``z``.
        num_actions: action-space dimensionality (policy output size).
        activation: activation spec forwarded to ``mlp_batchnorm_factory``.
    """

    def __init__(self, num_prop, num_hist, mlp_encoder_dims, actor_dims,
                 latent_dim, num_actions, activation):
        super().__init__()
        # Kept so forward() can slice the history window consistently with
        # the encoder's input size (num_prop * num_hist).
        self.num_hist = num_hist
        # History encoder over the flattened (num_hist * num_prop) window.
        self.mlp_encoder = nn.Sequential(
            *mlp_batchnorm_factory(
                input_dims=num_prop * num_hist,
                hidden_dims=mlp_encoder_dims,
                activation=activation,
            )
        )
        # Projection head producing the Barlow-Twins latent z.
        self.latent_layer = nn.Sequential(
            nn.Linear(mlp_encoder_dims[-1], 32),
            nn.BatchNorm1d(32),
            nn.ELU(),
            nn.Linear(32, latent_dim),
        )
        # Auxiliary head: 3-D velocity estimate from the encoder features.
        self.vel_layer = nn.Linear(mlp_encoder_dims[-1], 3)
        # Policy head consuming [vel | z | current obs] -> action mean.
        self.actor = nn.Sequential(
            *mlp_factory(
                input_dims=latent_dim + num_prop + 3,
                out_dims=num_actions,
                hidden_dims=actor_dims,
            )
        )

    def forward(self, actor_obs):
        """Compute the action mean.

        Args:
            actor_obs: tensor of shape ``(batch, time, num_prop)`` — assumed
                to be a time-ordered observation stack with the current frame
                last (TODO confirm against the caller).

        Returns:
            Action-mean tensor of shape ``(batch, num_actions)``.
        """
        # Current observation is the most recent frame.
        obs = actor_obs[:, -1, :]
        # Most recent num_hist frames feed the encoder. The original code
        # hard-coded [:, 1:6, :] (i.e. num_hist == 5 over a 6-frame stack);
        # -num_hist: is identical in that configuration and stays consistent
        # with the encoder's input size for any num_hist.
        obs_hist = actor_obs[:, -self.num_hist:, :]
        b = obs_hist.size(0)
        latent = self.mlp_encoder(obs_hist.reshape(b, -1))
        z = self.latent_layer(latent)
        vel = self.vel_layer(latent)
        actor_input = torch.cat([vel, z, obs], dim=-1)
        mean = self.actor(actor_input)
        return mean