But the underlying question is: if each event is independent and results from a single particle travelling through one slit or the other, why does the distribution have more than two peaks?

If the events are independent of each other, we should recover whatever 'inherent bias' there may be.

As in a Monte Carlo integration, if you like (the analogy may not be perfect).

Or, if one throws a single six-sided die randomly very many times, one will notice that only about 1 in 6 trials shows the face with the number (say) 2. That does not mean that each event knew about the others so that they add up to 1/6 for face 2. It means that a random generator recovers the probability associated with an event after many trials.
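To make the die example concrete, here is a minimal sketch in Python. The function name `face_frequency` and the fixed seed are my own choices for illustration. Each roll is drawn independently, yet the fraction of rolls showing face 2 drifts toward 1/6 as the number of rolls grows.

```python
import random

def face_frequency(n_rolls, face=2, seed=0):
    """Fraction of n_rolls independent die rolls that show `face`."""
    rng = random.Random(seed)  # fixed seed only so the run is repeatable
    hits = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == face)
    return hits / n_rolls

# No roll knows about any other roll; the frequency still approaches 1/6.
for n in (60, 6_000, 600_000):
    print(n, face_frequency(n))
```

The small samples can be noticeably off from 1/6; only the large ensemble pins the probability down.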

For a photon hitting the screen after going through the double slit, the 'inherent bias' consists of a complex interference pattern, with regions where photons are less likely to end up and regions where they are more likely to end up.
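The same idea can be sketched for the photons. The code below assumes the standard far-field double-slit intensity (a cos² fringe term times a single-slit sinc² envelope), with position measured in units of the fringe spacing. The `envelope_width` value, the bin count and all function names are illustrative assumptions, not anything taken from a real experiment. Each photon position is drawn independently by rejection sampling, and the fringes only appear in the histogram of many hits.

```python
import math
import random

def intensity(x, envelope_width=4.0):
    """Relative intensity at screen position x (in fringe spacings), max 1 at x = 0."""
    u = math.pi * x / envelope_width
    envelope = 1.0 if u == 0 else (math.sin(u) / u) ** 2  # single-slit envelope
    return math.cos(math.pi * x) ** 2 * envelope          # two-slit fringes

def sample_hits(n_photons, x_max=4.0, seed=0):
    """Draw independent hit positions with probability density proportional to intensity."""
    rng = random.Random(seed)
    hits = []
    while len(hits) < n_photons:
        x = rng.uniform(-x_max, x_max)
        if rng.random() < intensity(x):  # rejection step; intensity never exceeds 1
            hits.append(x)
    return hits

def histogram(hits, x_max=4.0, n_bins=32):
    """Count hits in equal-width bins across [-x_max, x_max]."""
    counts = [0] * n_bins
    width = 2 * x_max / n_bins
    for x in hits:
        counts[min(int((x + x_max) / width), n_bins - 1)] += 1
    return counts

counts = histogram(sample_hits(20_000))
for c in counts:
    print("#" * (c // 50))  # crude text plot: one row per bin
```

Every entry in `hits` is a single localized spot, and no draw depends on any other. The bright and dark bands are a property of the distribution being sampled, which is exactly the 'inherent bias' in question.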

Why does the 'inherent bias' (revealed only after measuring an ensemble of photons) resemble an interference pattern rather than two spots? I guess that is a clear signature of wave-like character manifested at the double slit. However, at any point before or after the slits we can identify the photon as a particle, because if we put a screen there to detect it we get only a single localized spot on the screen.