Penfjell S1.1 Monte Carlo Tests: What XAUUSD Path Stress Shows

The first Penfjell S1.1 research note explained how the strategy moved from a simple MACD transition idea into a multi-symbol MT5 model with a layered risk framework. This follow-up looks at a narrower question: what happens when one of the strongest contributors, XAUUSD, is tested against altered price paths?

As ever, this is a blog post rather than a white paper, so please read it as a practical research note for retail and independent traders. It is not a performance claim, and it is not pretending to be a full institutional quant study. The aim is to explain what we tested, what looked encouraging, and where the risk still needs respect.

We already know Penfjell S1.1 tends to work best in reasonably volatile trending regimes, or in other volatile regimes where the model can find enough movement to justify the risk. The point of this test is different. It asks whether the strategy can survive crashes, range-bound trading, and extremely volatile bull and bear markets. We are not trying to prove that every environment is profitable. We are trying to find out whether the risk framework holds up and whether capital is retained in extreme market environments.

Monte Carlo-style path-stress testing is useful because a good backtest can hide how dependent a strategy is on the exact order of events. The question is not only whether the model made money in one historical sequence. The harder question is whether the account still survives when similar ingredients arrive in a more awkward order.
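To make the ordering point concrete, here is a minimal sketch of the simplest Monte Carlo variant: reshuffling the order of a fixed set of trade results and measuring the worst drawdown each time. It is not the method used for the stitched reports in this note, which alter the price path and re-run the strategy. The trade values, the function names, and the 1,000-run count are all illustrative assumptions, not Penfjell output.

```python
import random

def max_drawdown_pct(pnl_sequence, start_balance):
    """Largest peak-to-trough fall in balance, as a percentage of the peak."""
    balance = peak = start_balance
    worst = 0.0
    for pnl in pnl_sequence:
        balance += pnl
        peak = max(peak, balance)
        worst = max(worst, (peak - balance) / peak * 100)
    return worst

def reshuffled_drawdowns(pnl_sequence, start_balance, runs=1000, seed=42):
    """Max drawdown for each of `runs` random reorderings of the same trades."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        shuffled = pnl_sequence[:]
        rng.shuffle(shuffled)
        results.append(max_drawdown_pct(shuffled, start_balance))
    return sorted(results)

# Hypothetical trade results in GBP (illustrative only, not Penfjell output).
trades = [120, -80, 95, -150, 60, 210, -90, 45, -200, 130, 75, -60] * 5
drawdowns = reshuffled_drawdowns(trades, start_balance=10_000)

print(f"Net profit in every ordering: GBP {sum(trades)}")
print(f"Median max drawdown: {drawdowns[len(drawdowns) // 2]:.2f}%")
print(f"95th percentile max drawdown: {drawdowns[int(len(drawdowns) * 0.95)]:.2f}%")
```

Net profit is identical in every run because the same trades are used each time. Only the drawdown changes, which is the part a single backtest cannot show.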

What Was Tested

The test compared one baseline XAUUSD report against three stitched XAUUSD variants. Each scenario used an initial account size of GBP 10,000.

Volatile bull trend with late pullback (Baseline): the original XAUUSD path.
Spike then range compression (Stitched 08): a sharp upward burst followed by flatter, more range-bound behaviour.
Extended volatile bull trend (Stitched 11): a persistent upward path with strong momentum and rising price pressure.
Bull-to-bear reversal / crash (Stitched 14): an earlier bullish phase followed by a severe downside break and weak recovery.

Each report covered the MT5 H4 test window from 8 January 2024 to 7 June 2026. The reports used the same broad Penfjell S1.1 model families described in the journey post, with MPTS, MDTS, and MRMR allocations each set to 0.33, and the same risk framework settings including account drawdown controls, daily drawdown monitoring, and VaR-based risk inputs.

The stitched histories should be read as path-stress tests, not forecasts. They are designed to ask how sensitive the strategy is to different price sequencing in XAUUSD.
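For readers wondering what "stitched" can mean in practice, the sketch below shows one common way to join price segments: chain their log returns so there is no artificial gap at the seam. This note does not document how the Penfjell stitched histories were actually built, so treat the approach, the function names, and the sample prices as assumptions for illustration only.

```python
import math

def to_log_returns(prices):
    """Bar-to-bar log returns for a list of closing prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def stitch_paths(segments, start_price):
    """Join price segments end to end by chaining their log returns.

    Each segment contributes its shape (its returns), not its price level,
    so the stitched path has no jump where one segment meets the next.
    """
    path = [start_price]
    for segment in segments:
        for r in to_log_returns(segment):
            path.append(path[-1] * math.exp(r))
    return path

# Hypothetical closes (illustrative only, not real XAUUSD data).
spike = [2000.0, 2040.0, 2095.0, 2150.0]               # sharp upward burst
flat_range = [2300.0, 2290.0, 2305.0, 2295.0, 2300.0]  # range-bound behaviour

stitched = stitch_paths([spike, flat_range], start_price=2000.0)
print([round(p, 2) for p in stitched])
```

The range segment was recorded around 2300, but once stitched it oscillates around the level where the spike ended. That is the design choice: the test changes the sequence of moves, not the starting price.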

Headline Result

The baseline test is materially stronger than all three stressed variants. That matters. The baseline produced higher net profit, higher profit factor, higher Sharpe, and a better recovery profile.

The encouraging part is that all three stitched variants remained profitable. None of them collapsed into a loss-making report. That is a useful survivability signal.

The warning is that every stitched variant showed meaningful degradation. Profit factor compressed. Sharpe fell. Recovery quality weakened. Equity drawdown rose in all three, from 8.91% in the baseline to between 9.25% and 11.38%, which is a reminder that XAUUSD concentration needs to be controlled.

Among the stressed paths, Stitched 14 looks like the best practical compromise by net profit and equity drawdown. It is not the best by profit factor or Sharpe. Stitched 11 has the weakest equity drawdown profile. Stitched 08 remains profitable, but has the lowest net profit and weakest recovery quality.

Figure: Penfjell S1.1 XAUUSD Monte Carlo path-stress comparison of net profit and equity drawdown. Baseline generated materially higher net profit. The stitched variants stayed profitable, but with lower return quality and drawdown profiles that reinforce the need for exposure controls.

Results Summary

Scenario	Account size	Net profit	Profit factor	Sharpe	Max equity DD	Max balance DD	Trades	Win rate	Largest loss	Max consecutive losses (count / total loss)
Volatile bull trend with late pullback (Baseline)	GBP 10,000	GBP 5,237.99	1.546	2.785	8.91%	7.33%	190	69.47%	GBP -502.68	5 / GBP -440.28
Spike then range compression (Stitched 08)	GBP 10,000	GBP 1,598.68	1.261	1.661	9.89%	9.19%	136	67.65%	GBP -296.17	7 / GBP -926.76
Extended volatile bull trend (Stitched 11)	GBP 10,000	GBP 2,066.38	1.292	1.685	11.38%	10.73%	172	68.60%	GBP -299.72	7 / GBP -970.60
Bull-to-bear reversal / crash (Stitched 14)	GBP 10,000	GBP 2,500.72	1.276	1.555	9.25%	8.07%	169	68.64%	GBP -427.64	4 / GBP -580.98

Figure: Penfjell S1.1 XAUUSD Monte Carlo path-stress comparison of profit factor and Sharpe ratio. Profit factor and Sharpe both compressed under stitched path stress. The system remained profitable, but the quality of the edge became less comfortable.

Interpretation

The baseline is the cleanest report. Net profit was GBP 5,237.99, profit factor was 1.546, Sharpe was 2.785, and maximum equity drawdown was 8.91%. That is a strong result for this test set, but it is also the least stressed version of the path.

The stitched variants are more useful for risk thinking. They show what happens when the same broad model is exposed to less favourable sequencing. Net profit falls to between GBP 1,598.68 and GBP 2,500.72. Profit factor compresses to the 1.26 to 1.29 range. Sharpe falls to the 1.55 to 1.69 range.
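A quick way to read the compression is to express each stitched result as a percentage of the baseline. The sketch below does that with the figures from the results table above. The function name and the "retained" framing are our own shorthand for this illustration, not an MT5 metric.

```python
# Figures copied from the results table in this note.
results = {
    "Baseline":    {"net_profit": 5237.99, "profit_factor": 1.546, "sharpe": 2.785},
    "Stitched 08": {"net_profit": 1598.68, "profit_factor": 1.261, "sharpe": 1.661},
    "Stitched 11": {"net_profit": 2066.38, "profit_factor": 1.292, "sharpe": 1.685},
    "Stitched 14": {"net_profit": 2500.72, "profit_factor": 1.276, "sharpe": 1.555},
}

def retained_vs_baseline(results, baseline="Baseline"):
    """Each metric as a percentage of the baseline value, per stressed scenario."""
    base = results[baseline]
    return {
        name: {metric: value / base[metric] * 100 for metric, value in row.items()}
        for name, row in results.items()
        if name != baseline
    }

retained = retained_vs_baseline(results)
for name, row in retained.items():
    print(f"{name}: " + ", ".join(f"{metric} {pct:.1f}%" for metric, pct in row.items()))
```

On these figures the stitched variants keep roughly 31% to 48% of baseline net profit. The profit factor percentage should be read loosely, since a profit factor of 1.0 is break-even rather than zero.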

That does not invalidate the model. It does mean the edge is sensitive to XAUUSD path structure. A trader should care about that because gold can dominate portfolio behaviour when it is working well, then become a source of concentration risk when the path changes.

The bull-to-bear reversal / crash scenario, Stitched 14, is the most balanced stressed report in this group. It produced the highest stressed net profit at GBP 2,500.72 and kept equity drawdown to 9.25%, close to the baseline. The trade-off is that it had the weakest Sharpe of the four reports, at 1.555, and a lower profit factor than Stitched 11 (1.276 against 1.292), although it still edged out Stitched 08 on profit factor.

The extended volatile bull trend scenario, Stitched 11, is the clearest drawdown warning. It produced GBP 2,066.38 net profit, but maximum equity drawdown rose to 11.38% and the worst consecutive loss sequence reached GBP -970.60. That is exactly the type of behaviour the risk framework needs to notice early.

The spike then range compression scenario, Stitched 08, is still profitable, but it is the weakest recovery case. It produced the lowest net profit at GBP 1,598.68 and had a recovery factor of around 1.34. That is not failure, but it is not comfortable enough to ignore.

What This Means For Penfjell S1.1

The practical conclusion is simple: Penfjell S1.1 should not rely on gold performance alone.

XAUUSD can be a valuable contributor, but this test supports the need for portfolio-level controls. Position sizing, drawdown bands, symbol exposure limits, concentration monitoring, and symbol-level quarantine rules are not cosmetic features. They are the difference between treating a strong contributor as useful and letting it become the whole strategy.
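As an illustration of what concentration monitoring can look like, here is a minimal sketch that measures each symbol's share of total net profit and flags anything above a cap. The 40% cap, the symbol list other than XAUUSD, and the profit figures are all hypothetical. This is not how the Penfjell risk framework is implemented, only one simple way to frame the check.

```python
def concentration_report(profit_by_symbol, cap=0.40):
    """Share of total net profit per symbol, flagged when above `cap`.

    `cap` is a hypothetical threshold chosen for this example.
    """
    total = sum(profit_by_symbol.values())
    if total <= 0:
        raise ValueError("Profit shares are only meaningful when total profit is positive.")
    report = {}
    for symbol, profit in profit_by_symbol.items():
        share = profit / total
        report[symbol] = {"share": share, "over_cap": share > cap}
    return report

# Hypothetical per-symbol net profit in GBP (illustrative only).
profits = {"XAUUSD": 5200.0, "EURUSD": 1800.0, "GBPUSD": 1500.0, "USDJPY": 1500.0}

report = concentration_report(profits)
for symbol, row in report.items():
    flag = "OVER CAP" if row["over_cap"] else "ok"
    print(f"{symbol}: {row['share']:.0%} of net profit ({flag})")
```

A share of profit is a backward-looking measure. A fuller check would likely look at share of risk or margin as well, but the principle is the same: one symbol should not quietly become the whole result.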

For retail and independent traders, this is where Monte Carlo-style path-stress testing earns its place. It does not tell you what will happen next. It helps you ask better questions before committing capital:

What happens if the edge arrives in a worse order?
What happens if the strongest symbol stops behaving like the baseline?
How much drawdown should be expected before the model is considered abnormal?
Which symbols deserve more exposure, and which ones should be capped?

The answer from this test is balanced. Penfjell S1.1 showed encouraging survivability across the stitched XAUUSD paths, but it also showed clear sensitivity to gold path structure. That is not a reason to dismiss the model. It is a reason to keep the risk framework central.

Caveat

These are XAUUSD path-stress tests over one test window and four price paths. They are not proof that Penfjell S1.1 is robust under every future regime, every broker condition, or every symbol mix. Historical and simulated results are not forecasts, guarantees, or investment advice. Any software, analysis, or strategy output should be reviewed and executed at the client’s own discretion.