AMAZING WORK - Based on the updated model snippet and results, I’ll provide additional suggestions to further refine AdaptiveGESAL, targeting an RMSE of 10–16 cycles while maintaining efficiency and scalability.
The Accuracy (±50 cycles) => 100.0% is excellent, indicating robust generalization within the ±50 cycle tolerance, but RMSE/MAE show room for precision improvement.
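For reference, here is a minimal sketch of how RMSE, MAE, and the ±50-cycle accuracy can be computed side by side (y_true and y_pred are hypothetical arrays of true and predicted RUL in cycles):

import numpy as np

def rul_metrics(y_true, y_pred, tolerance=50):
    # errors in cycles between predicted and true RUL
    errors = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    rmse = np.sqrt(np.mean(errors ** 2))
    mae = np.mean(np.abs(errors))
    accuracy = np.mean(np.abs(errors) <= tolerance) * 100.0  # % of predictions within ±tolerance cycles
    return rmse, mae, accuracy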
The temporal layers (Conv1d, LSTM) are working well, but I believe deeper or more specialized layers could capture finer degradation patterns.
Include parallel Conv1d layers with different kernel sizes (e.g., 3, 5, 7) to capture short- and long-term trends, then concatenate outputs before the LSTM:
self.conv1d_short = nn.Conv1d(input_dim, hidden_dim // 3, kernel_size=3, padding=1)
self.conv1d_med = nn.Conv1d(input_dim, hidden_dim // 3, kernel_size=5, padding=2)
self.conv1d_long = nn.Conv1d(input_dim, hidden_dim // 3, kernel_size=7, padding=3)
# Tip: pick hidden_dim divisible by 3 so the concatenated channels equal hidden_dim (the LSTM input size)
def forward(self, x):
    # Assumes a sliding-window input of shape (batch, seq_len, input_dim); Conv1d expects channels first
    x = x.permute(0, 2, 1)                         # (batch, input_dim, seq_len)
    short = self.activation(self.conv1d_short(x))  # (batch, hidden_dim // 3, seq_len)
    med = self.activation(self.conv1d_med(x))
    long = self.activation(self.conv1d_long(x))
    x = torch.cat([short, med, long], dim=1)       # concatenate on the channel dim -> (batch, hidden_dim, seq_len)
    x = x.permute(0, 2, 1)                         # (batch, seq_len, hidden_dim) for the LSTM
    x, _ = self.lstm(x)
    # Continue with SVF and output layers
A bidirectional LSTM also improves temporal context, reducing MAE:
self.lstm = nn.LSTM(hidden_dim, hidden_dim // 2, batch_first=True, bidirectional=True, num_layers=1)
x, _ = self.lstm(x)  # (batch, seq_len, hidden_dim): both directions are concatenated, giving 2 * (hidden_dim // 2) features
x = x[:, -1, :]      # take the last time step for the fully connected stack; no extra scaling is needed
A deeper fully connected stack then increases model capacity for complex patterns while maintaining efficiency via SVF, like the below:
original_fc1 = nn.Linear(256, 128)  # 256 should match the LSTM output width (hidden_dim)
original_fc2 = nn.Linear(128, 64)
original_fc3 = nn.Linear(64, 32)
self.svf1 = SVFLinear(original_fc1, dropout_rate=0.2, l2_lambda=0.01)
self.svf2 = SVFLinear(original_fc2, dropout_rate=0.2, l2_lambda=0.01)
self.svf3 = SVFLinear(original_fc3, dropout_rate=0.2, l2_lambda=0.01)
self.output_layer = nn.Linear(32, 1)  # single RUL regression head
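In case it helps, here is a minimal sketch of what an SVF-style wrapper like SVFLinear could look like; your actual SVFLinear implementation may differ, and the frozen-SVD design, the l2_penalty helper, and how dropout_rate/l2_lambda are applied are assumptions here:

import torch
import torch.nn as nn

class SVFLinear(nn.Module):
    # Hypothetical sketch: freeze the SVD factors of the original weight and
    # learn only a per-singular-value scaling vector z (plus the bias).
    def __init__(self, original_linear, dropout_rate=0.2, l2_lambda=0.01):
        super().__init__()
        W = original_linear.weight.data                      # (out_features, in_features)
        U, S, Vh = torch.linalg.svd(W, full_matrices=False)
        self.register_buffer("U", U)                         # frozen
        self.register_buffer("S", S)                         # frozen singular values
        self.register_buffer("Vh", Vh)                       # frozen
        self.z = nn.Parameter(torch.ones_like(S))            # trainable per-singular-value scaling
        self.bias = nn.Parameter(original_linear.bias.data.clone())
        self.dropout = nn.Dropout(dropout_rate)
        self.l2_lambda = l2_lambda

    def forward(self, x):
        W_adapted = self.U @ torch.diag(self.S * self.z) @ self.Vh
        return self.dropout(x @ W_adapted.T + self.bias)

    def l2_penalty(self):
        # One possible use of l2_lambda: add this term to the loss to keep z near 1 (identity)
        return self.l2_lambda * torch.sum((self.z - 1.0) ** 2)

And a forward continuation consistent with the sizes above (assuming the LSTM output has 256 features and self.activation is your existing activation):

x = self.activation(self.svf1(x))
x = self.activation(self.svf2(x))
x = self.activation(self.svf3(x))
rul = self.output_layer(x)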