The auditory segregation of simultaneous sound sources has been addressed in speech research but has received less attention in musical acoustics. In studies of the perception of concurrent speech, or of speech in noise, time-frequency masking has often been used as a research tool. In this work, an extension of time-frequency masking, leading to the removal of spectro-temporal overlap between sound sources, was applied to musical instruments playing together. The perception of the original mixture was compared with the perception of the same mixture with all spectral overlap electronically removed. Experiments differed in the method of listening (headphones or a loudspeaker), the sets of instruments mixed, and the populations of participants. The main findings were: (i) in one of the experimental conditions the removal of spectro-temporal overlap was imperceptible, (ii) the effect was more perceptible when spectro-temporal overlap was removed in larger time-frequency regions rather than in small ones, and (iii) the effect was less perceptible in loudspeaker listening. The results support both the multiple looks hypothesis and the “glimpsing” hypothesis known from speech perception.