To set up a sprite DMA you reserve a page of RAM (every game I've looked at uses $0200-$02FF). Once per frame, write zero to $2003 (you only need to write once because SPR-RAM is only 256 bytes, so the address fits in one byte), then write the page number of the memory to transfer to $4014. So if you're using $0200, you'd write $00 to $2003 and then $02 to $4014.
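As a rough sketch, assuming the sprite page is $0200 and that this runs during vblank (the label name is only for illustration), the whole thing is four instructions:

```
sprite_dma:          ; hypothetical label
  lda #$00
  sta $2003          ; set the SPR-RAM address to 0
  lda #$02           ; high byte of $0200, i.e. the page number
  sta $4014          ; writing the page number starts the DMA
```

If your sprite page lives somewhere else, only the value loaded before the store to $4014 changes.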
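The transfer takes 513 CPU cycles (514 if it starts on an odd cycle): one cycle to read each byte and one to write it, plus a cycle or two of startup. That's very fast compared to copying the bytes yourself, but it's still close to a quarter of the roughly 2270 CPU cycles you get in vblank on an NTSC NES. I usually put this in my NMI routine, but if you do that be sure you do the DMA _after_ any writes to VRAM, because the DMA will eat up much of your vblank time. That caused some problems for me a while back.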
Also, is your 'subtimer' routine intended to sync the loop? Using a delay loop isn't a very accurate way to do that. What you'd want to do is INC a RAM location in your NMI code. Then, if $FF is the location that's INC'd in the NMI, your main loop would have this:
wait: lda $FF ; read the sync byte
beq wait ; if $FF = 0, NMI hasn't happened yet
lda #$00
sta $FF ; reset sync byte
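On the NMI side, a minimal sketch might look like the following. The label, the register saving, and the placeholder for the VRAM updates are assumptions for illustration; the only parts taken from the description above are the INC of $FF and doing the sprite DMA after the VRAM writes.

```
nmi:                 ; hypothetical label, reached through the NMI vector at $FFFA
  pha                ; save A, X and Y so the main loop isn't disturbed
  txa
  pha
  tya
  pha
  ; ... VRAM writes through $2006/$2007 go here, first ...
  lda #$00
  sta $2003
  lda #$02
  sta $4014          ; sprite DMA from $0200-$02FF, after the VRAM writes
  inc $FF            ; tell the main loop a frame has passed
  pla                ; restore Y, X and A in reverse order
  tay
  pla
  tax
  pla
  rti
```

Since the main loop resets $FF to zero after each pass, it doesn't matter that the byte would eventually wrap around.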
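Whatever you do, don't use an LDA $2002 / BPL loop to sync your loop. This is a common mistake. Reading $2002 clears the vblank flag, and on a real NES a read that lands just as the flag is being set can miss it entirely, so the loop will skip a frame every now and then. (The _2_ LDA $2002 / BPL loops at the start of nearly every NES game are a different matter: they just wait for the PPU to warm up after reset, where a missed frame doesn't hurt.)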