This page presents the full results for Experiment 2, which assesses the impact of mode-seeking regularization and the efficacy of the fine-tuning phase. We evaluate a grid of mode-seeking hyperparameters (λ, ε), training 3 independent models for each combination on each dataset. We report the results obtained after the adversarial phase and after the fine-tuning phase, and we compare fine-tuning with and without pre-training the State Prediction Network (SPN). λ = 0 means that mode seeking is disabled.
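The evaluation grid above can be sketched as follows; the variable names and the enumeration are illustrative, not the authors' training code:

```python
from itertools import product

# Hyperparameter grid from the text; only the (λ, ε, seed) enumeration is shown,
# the actual training call is omitted.
lambdas = [0, 0.01, 0.1, 1]    # mode-seeking weight; λ = 0 disables mode seeking
epsilons = [0.001, 0.01, 0.1]  # mode-seeking ε
n_seeds = 3                    # 3 independent models per (λ, ε) combination

runs = [(lam, eps, seed)
        for lam, eps in product(lambdas, epsilons)
        for seed in range(n_seeds)]

print(len(runs))  # 36 training runs per dataset
```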

Cells at the same position in the adversarial-phase and fine-tuning-phase tables refer to the same model, and each cell lists the three independent models trained for that (λ, ε) combination. For example, the first model with λ = 0 achieved a minimum modulation loss of 0.198 in the adversarial phase; without SPN pre-training, the same model achieved 0.178 after fine-tuning; with SPN pre-training, it achieved 1.359.

For the modulation loss, we report the value obtained during each phase at the best checkpoint, i.e., the checkpoint minimizing the modulation loss.

For the spectral loss, we report the value obtained during each phase at the checkpoint minimizing the spectral loss, subject to the modulation loss being below 0.9. If no checkpoint of the phase satisfies this condition, the entry is marked NaN.
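The checkpoint-selection rule above can be sketched as follows. This is a minimal illustration: the record layout and field names are assumptions; only the selection rule itself comes from the text.

```python
import math

# One record per training checkpoint: modulation loss and spectral (MR-STFT) loss.
# These field names are hypothetical; only the selection rule is from the text.
checkpoints = [
    {"modulation": 1.10, "spectral": 0.95},
    {"modulation": 0.85, "spectral": 0.80},
    {"modulation": 0.40, "spectral": 0.70},
    {"modulation": 0.95, "spectral": 0.60},
]

MOD_THRESHOLD = 0.9  # condition: modulation loss must be below 0.9

def reported_modulation(ckpts):
    """Modulation loss at the best checkpoint (minimum modulation loss)."""
    return min(c["modulation"] for c in ckpts)

def reported_spectral(ckpts, threshold=MOD_THRESHOLD):
    """Spectral loss at the checkpoint minimizing it, among checkpoints whose
    modulation loss is below the threshold; NaN when no checkpoint qualifies."""
    eligible = [c["spectral"] for c in ckpts if c["modulation"] < threshold]
    return min(eligible) if eligible else math.nan

print(reported_modulation(checkpoints))  # 0.4
print(reported_spectral(checkpoints))    # 0.7 (the 0.60 checkpoint is excluded: its modulation loss is 0.95)
```

Note that the two reported values may come from different checkpoints of the same run, which is why a run can have a finite modulation loss but a NaN spectral loss only when its modulation loss never dropped below 0.9.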

Slow-LFO Dataset

Modulation loss

Adversarial Phase

| λ | ε = 0.001 | ε = 0.01 | ε = 0.1 |
|------|-----------------------|-----------------------|-----------------------|
| 0 | 0.198 / 0.269 / 0.209 | 0.923 / 0.206 / 0.297 | 0.200 / 0.212 / 0.443 |
| 0.01 | 0.171 / 0.253 / 0.151 | 1.069 / 0.181 / 0.111 | 0.192 / 0.054 / 0.581 |
| 0.1 | 0.169 / 0.215 / 0.128 | 0.222 / 0.063 / 0.084 | 0.117 / 0.180 / 0.217 |
| 1 | 0.223 / 0.155 / 0.098 | 0.183 / 0.187 / 0.201 | 0.166 / 0.242 / 0.157 |

Fine-tuning Phase (no SPN pre-training)

| λ | ε = 0.001 | ε = 0.01 | ε = 0.1 |
|------|-----------------------|-----------------------|-----------------------|
| 0 | 0.178 / 0.358 / 0.115 | 1.328 / 0.086 / 0.411 | 0.752 / 0.197 / 0.086 |
| 0.01 | 0.125 / 0.653 / 0.149 | 0.277 / 0.441 / 0.134 | 0.238 / 0.144 / 0.052 |
| 0.1 | 0.161 / 0.407 / 0.403 | 0.177 / 0.113 / 0.113 | 0.208 / 0.067 / 0.383 |
| 1 | 0.743 / 0.485 / 0.469 | 0.270 / 0.161 / 0.159 | 0.108 / 0.295 / 0.406 |

Fine-tuning Phase (with SPN pre-training)

| λ | ε = 0.001 | ε = 0.01 | ε = 0.1 |
|------|-----------------------|-----------------------|-----------------------|
| 0 | 1.359 / 0.594 / 0.027 | 1.302 / 0.304 / 0.123 | 0.190 / 0.197 / 0.042 |
| 0.01 | 0.348 / 0.521 / 0.365 | 0.069 / 0.175 / 0.089 | 0.213 / 0.053 / 0.359 |
| 0.1 | 0.178 / 0.197 / 0.275 | 0.158 / 0.040 / 0.374 | 0.309 / 0.193 / 0.342 |
| 1 | 0.663 / 0.173 / 0.160 | 0.194 / 0.758 / 0.155 | 0.057 / 0.545 / 0.215 |

Spectral (MR-STFT) loss

Adversarial Phase

| λ | ε = 0.001 | ε = 0.01 | ε = 0.1 |
|------|-----------------------|-----------------------|-----------------------|
| 0 | 0.892 / 0.914 / 0.870 | NaN / 0.901 / 1.000 | 0.943 / 0.957 / 0.940 |
| 0.01 | 0.949 / 0.919 / 0.928 | NaN / 0.936 / 0.933 | 0.919 / 0.920 / 1.045 |
| 0.1 | 0.906 / 0.924 / 1.116 | 0.963 / 0.897 / 1.006 | 1.090 / 1.090 / 0.968 |
| 1 | 1.452 / 1.110 / 1.069 | 1.326 / 1.017 / 1.106 | 1.158 / 1.189 / 1.005 |

Fine-tuning Phase (no SPN pre-training)

| λ | ε = 0.001 | ε = 0.01 | ε = 0.1 |
|------|-----------------------|-----------------------|-----------------------|
| 0 | 0.765 / 0.725 / 0.644 | NaN / 0.708 / 0.724 | 0.676 / 0.755 / 0.643 |
| 0.01 | 0.804 / 0.680 / 0.748 | 0.664 / 0.708 / 0.806 | 0.711 / 0.770 / 0.750 |
| 0.1 | 0.770 / 0.665 / 0.732 | 0.782 / 0.752 / 0.790 | 0.766 / 0.730 / 0.682 |
| 1 | 0.707 / 0.757 / 0.701 | 0.746 / 0.775 / 0.721 | 0.774 / 0.766 / 0.764 |

Fine-tuning Phase (with SPN pre-training)

| λ | ε = 0.001 | ε = 0.01 | ε = 0.1 |
|------|-----------------------|-----------------------|-----------------------|
| 0 | NaN / 0.719 / 0.620 | NaN / 0.717 / 0.733 | 0.693 / 0.737 / 0.673 |
| 0.01 | 0.803 / 0.700 / 0.766 | 0.640 / 0.701 / 0.804 | 0.687 / 0.780 / 0.781 |
| 0.1 | 0.740 / 0.680 / 0.732 | 0.747 / 0.715 / 0.793 | 0.750 / 0.705 / 0.680 |
| 1 | 0.715 / 0.757 / 0.739 | 0.726 / 0.777 / 0.772 | 0.806 / 0.777 / 0.781 |

Fast-LFO Dataset

Modulation loss

Adversarial Phase

| λ | ε = 0.001 | ε = 0.01 | ε = 0.1 |
|------|-----------------------|-----------------------|-----------------------|
| 0 | 0.025 / 0.055 / 0.035 | 0.041 / 0.069 / 0.030 | 0.034 / 0.040 / 0.518 |
| 0.01 | 0.036 / 0.024 / 0.030 | 0.230 / 0.030 / 0.043 | 0.055 / 0.036 / 0.025 |
| 0.1 | 0.175 / 0.200 / 0.039 | 0.089 / 0.036 / 0.033 | 0.033 / 0.224 / 0.027 |
| 1 | 0.401 / 0.046 / 0.141 | 0.080 / 0.058 / 0.365 | 0.289 / 0.030 / 0.033 |

Fine-tuning Phase (no SPN pre-training)

| λ | ε = 0.001 | ε = 0.01 | ε = 0.1 |
|------|-----------------------|-----------------------|-----------------------|
| 0 | 1.038 / 0.593 / 0.388 | 0.137 / 0.318 / 0.304 | 0.802 / 0.360 / 1.045 |
| 0.01 | 1.038 / 1.032 / 0.515 | 0.980 / 0.820 / 1.044 | 0.233 / 0.309 / 0.436 |
| 0.1 | 1.029 / 0.549 / 0.392 | 0.626 / 0.694 / 0.692 | 1.007 / 0.666 / 1.016 |
| 1 | 0.340 / 0.958 / 0.765 | 0.816 / 0.655 / 0.488 | 0.568 / 0.686 / 0.372 |

Fine-tuning Phase (with SPN pre-training)

| λ | ε = 0.001 | ε = 0.01 | ε = 0.1 |
|------|-----------------------|-----------------------|-----------------------|
| 0 | 0.432 / 0.185 / 0.643 | 0.114 / 0.561 / 0.230 | 0.693 / 0.489 / 0.654 |
| 0.01 | 0.704 / 0.554 / 0.694 | 1.036 / 0.612 / 1.030 | 0.568 / 0.225 / 0.529 |
| 0.1 | 0.581 / 0.565 / 0.917 | 0.512 / 0.459 / 0.670 | 0.138 / 0.274 / 0.225 |
| 1 | 0.358 / 0.678 / 0.956 | 0.740 / 0.651 / 0.293 | 0.609 / 0.659 / 0.423 |

Spectral (MR-STFT) loss

Adversarial Phase

| λ | ε = 0.001 | ε = 0.01 | ε = 0.1 |
|------|-----------------------|-----------------------|-----------------------|
| 0 | 0.884 / 0.880 / 0.861 | 0.860 / 0.904 / 0.902 | 0.895 / 0.910 / 0.856 |
| 0.01 | 0.864 / 0.889 / 0.932 | 0.861 / 0.962 / 0.854 | 0.921 / 0.838 / 0.877 |
| 0.1 | 1.087 / 0.938 / 0.993 | 0.984 / 0.919 / 0.960 | 1.012 / 1.011 / 1.036 |
| 1 | 1.044 / 1.315 / 0.974 | 1.627 / 1.017 / 1.614 | 0.952 / 1.016 / 1.127 |

Fine-tuning Phase (no SPN pre-training)

| λ | ε = 0.001 | ε = 0.01 | ε = 0.1 |
|------|-----------------------|-----------------------|-----------------------|
| 0 | NaN / 0.576 / 0.604 | 0.479 / 0.643 / 0.601 | 0.658 / 0.598 / NaN |
| 0.01 | NaN / NaN / 0.594 | NaN / 0.620 / NaN | 0.573 / 0.639 / 0.627 |
| 0.1 | NaN / 0.617 / 0.636 | 0.664 / 0.695 / 0.641 | NaN / 0.702 / NaN |
| 1 | 0.558 / NaN / 0.620 | 0.761 / 0.705 / 0.614 | 0.762 / 0.786 / 0.727 |

Fine-tuning Phase (with SPN pre-training)

| λ | ε = 0.001 | ε = 0.01 | ε = 0.1 |
|------|-----------------------|-----------------------|-----------------------|
| 0 | 0.592 / 0.534 / 0.612 | 0.481 / 0.646 / 0.606 | 0.662 / 0.595 / 0.544 |
| 0.01 | 0.660 / 0.647 / 0.614 | NaN / 0.626 / NaN | 0.577 / 0.647 / 0.632 |
| 0.1 | 0.639 / 0.623 / NaN | 0.653 / 0.635 / 0.636 | 0.642 / 0.614 / 0.622 |
| 1 | 0.572 / 0.676 / NaN | 0.711 / 0.708 / 0.587 | 0.768 / 0.732 / 0.740 |