For clarity, I tried to illustrate each idea independently. I'm actually experimenting with slightly loose cycle timing for my music player, though of course I still keep track of basic instruction cycle counts and frame timing.
It can't be an optimization issue, because the CPU core accounts for only 4% of the total time spent playing the music! I synthesize the sound using a band-limited technique that has no aliasing and doesn't require any low-pass post-filtering. As a result, noise channel wave generation takes up, at its worst, a whopping 80% of the total time spent playing music like Blaster Master's themes (real-time playback takes 4% to 25% of the total CPU time on my old 120 MHz PowerPC Mac). This is because at high frequencies the band-limited impulse rendering is virtually equivalent to generalized resampling/low-pass filtering. And yes, I intend to document this method next so that all the emulators out there can generate virtually exact emulations of NES sound with very little CPU. I intend to make my 6502 core and music player code freely available once I am satisfied with it.
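As a rough illustration of the general band-limited idea (not the player's actual code, which is not shown here), the Python sketch below renders a square wave by adding a windowed-sinc kernel at each amplitude transition and then integrating. Everything in it is an assumption made for the example: the names `add_step` and `render_square`, the Hann window, the kernel half-width of 8 samples, and the 0.9 cutoff. It also shows why cost grows with transition rate: each transition touches a fixed number of output samples, so a fast-toggling channel such as noise does far more work per second than a low-pitched square.

```python
import math

HALF_WIDTH = 8  # assumed kernel half-width, in output samples


def add_step(diff, time, delta, cutoff=0.9):
    """Add a band-limited step of height `delta` at fractional sample `time`.

    `diff` holds sample-to-sample differences; a running sum over it
    produces the final waveform. `cutoff` is a fraction of Nyquist.
    """
    center = math.floor(time)
    taps = []
    for i in range(center - HALF_WIDTH + 1, center + HALF_WIDTH + 1):
        x = i - time  # distance from the transition, in samples
        arg = math.pi * cutoff * x
        sinc = 1.0 if x == 0 else math.sin(arg) / arg
        window = 0.5 + 0.5 * math.cos(math.pi * x / HALF_WIDTH)  # Hann
        taps.append((i, sinc * window))
    # Normalize so every step integrates to exactly `delta`,
    # whatever the fractional position of the transition.
    total = sum(value for _, value in taps)
    for i, value in taps:
        if 0 <= i < len(diff):
            diff[i] += delta * value / total


def render_square(period, n_samples, amplitude=1.0):
    """Render a square wave toggling between 0 and `amplitude`."""
    diff = [0.0] * n_samples
    t = float(HALF_WIDTH)  # first edge placed so its kernel fits in the buffer
    sign = 1
    while t < n_samples:
        add_step(diff, t, sign * amplitude)
        sign = -sign
        t += period / 2.0
    # Integrate the differences to obtain the output samples.
    out = []
    level = 0.0
    for d in diff:
        level += d
        out.append(level)
    return out


samples = render_square(period=64.6, n_samples=200)
```

A faster version would presumably precompute the kernel for a fixed set of fractional phases instead of calling `sin` and `cos` per tap; this sketch favors readability over speed.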