I think there are different ways to push it that haven't been tried yet. :)
In the realm of software, I think much can be done with the (limited) hblank time available.
There are some nice possibilities if we can get an IRQ for every hblank. I've written code to change part of the palette on every scanline, and that's easy as long as you don't mind trashing the BG graphics.
With a table of 'proper' values to write to $2006 on every line, it should be possible to avoid trashing the BG graphics. But hblank time is very short.
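Here's a rough sketch of how such a table could be generated offline, in Python rather than 6502 since it's just a build-time tool. It assumes the usual layout of the PPU's internal address during rendering (fine Y, nametable, coarse Y, coarse X) and no horizontal scroll; the function name `ppu_addr_bytes` and the `exact` flag are made up for illustration.

```python
def ppu_addr_bytes(scanline, nametable=0, coarse_x=0):
    """Return (hi, lo, exact) for the two $2006 writes that point the PPU
    at the start of the given scanline (hypothetical helper)."""
    fine_y = scanline & 7        # row within the 8-pixel tile
    coarse_y = scanline >> 3     # tile row in the nametable
    v = (fine_y << 12) | (nametable << 10) | (coarse_y << 5) | coarse_x
    hi = (v >> 8) & 0x3F         # first $2006 write carries only 6 bits
    lo = v & 0xFF                # second $2006 write carries the low byte
    exact = fine_y < 4           # the top fine-Y bit does not fit in hi
    return hi, lo, exact

# One entry per visible scanline, ready to be dumped as a data table.
TABLE = [ppu_addr_bytes(y) for y in range(240)]
```

As far as I understand the address latch, the first $2006 write clears the top fine-Y bit, so entries with `exact` set to False would need extra help (e.g. from $2005) to land on the right pixel row.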
I was thinking the other day, wouldn't it be possible to increase the number of colors + color resolution on the screen by alternating between changing the palette on one scanline, then changing the attribute table on the next? Attribute table resolution would then be 16x2 pixels, and if we're lucky we could also change 2 colors in the palette every couple of scanlines.
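To put numbers on that, here's a small Python sketch. It assumes a 256x240 visible screen and that even scanlines get the palette change while odd ones get the attribute change (which way round is arbitrary); the function names are just for illustration.

```python
SCREEN_W, SCREEN_H = 256, 240

def attribute_cells(cell_w, cell_h):
    """Number of independently colorable attribute cells on screen."""
    return (SCREEN_W // cell_w) * (SCREEN_H // cell_h)

def scanline_schedule(lines=SCREEN_H):
    """Which kind of hblank update each scanline gets (assumed ordering)."""
    return ["palette" if y % 2 == 0 else "attribute" for y in range(lines)]

normal = attribute_cells(16, 16)   # stock 16x16-pixel attribute cells
tricked = attribute_cells(16, 2)   # 16x2 cells with the alternating scheme
```

With those assumptions `normal` comes out to 240 cells and `tricked` to 1920, so eight times the attribute resolution on paper.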
I'd point the IRQ vector to RAM, so every LDA instruction could load an immediate value (saving a cycle per load). After the PPU stuff is done, the routine would then load from a table to patch itself for the next hblank (yay, self-modifying code).
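A rough model of that idea in Python, just to show the byte layout and the cycle cost. The routine shape (PHA, three LDA #imm / STA pairs, PLA, RTI) is my assumption of what a minimal version might look like, and `build_routine` / `patch` are hypothetical helpers; the opcodes and cycle counts are the standard 6502 ones.

```python
# Opcodes and cycle counts for the few 6502 instructions used here.
LDA_IMM, STA_ABS, PHA, PLA, RTI = 0xA9, 0x8D, 0x48, 0x68, 0x40
CYCLES = {LDA_IMM: 2, STA_ABS: 4, PHA: 3, PLA: 4, RTI: 6}

def build_routine():
    """Return (code bytes, operand offsets) for a hypothetical hblank IRQ
    routine that would live in RAM."""
    code = bytearray([PHA])
    slots = {}
    for name, reg in (("addr_hi", 0x2006), ("addr_lo", 0x2006), ("color", 0x2007)):
        code.append(LDA_IMM)
        slots[name] = len(code)          # offset of the immediate operand
        code.append(0x00)                # placeholder, patched per scanline
        code += bytes([STA_ABS, reg & 0xFF, reg >> 8])
    code += bytes([PLA, RTI])
    return code, slots

def patch(code, slots, addr_hi, addr_lo, color):
    """Overwrite the immediate operands, as the 6502 would do after the PPU
    writes to prepare the next hblank."""
    code[slots["addr_hi"]] = addr_hi
    code[slots["addr_lo"]] = addr_lo
    code[slots["color"]] = color

def ppu_write_cycles():
    """Cycles from routine entry through the last PPU write."""
    return CYCLES[PHA] + 3 * (CYCLES[LDA_IMM] + CYCLES[STA_ABS])

code, slots = build_routine()
patch(code, slots, 0x3F, 0x00, 0x16)     # e.g. write color $16 to $3F00
```

In this model the time-critical part costs 21 cycles before counting the 7-cycle interrupt entry, and that's without the two extra writes needed to restore $2006. On NTSC, hblank is only about 28 CPU cycles (roughly 85 PPU dots at 3 dots per cycle), so every cycle saved by the immediate loads matters.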
That would be pretty cool for still pictures. (Maybe a game background, too?) I don't know how to make a program to convert a picture to a format that works with this, though. I'd pretty much have to do it manually, which would kinda suck.