Can you actually get data out of the NES with that schematic? I'm pretty sure the data lines on the NES controller port are input only. Sepi's parallel interface used the clock line to increment a counter, and that's the only way I know of to get parallel data out of the controller port. That's where the complexity starts to creep in.
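To make the counter idea concrete, here is a rough simulation of it. It is only a sketch under my own assumptions, not Sepi's actual circuit: I'm assuming the strobe line resets an external 8-bit counter and each clock pulse increments it, and the names `ExternalCounter` and `send_byte` are made up for illustration.

```python
class ExternalCounter:
    """Hypothetical 8-bit counter wired to the controller port's output lines."""

    def __init__(self):
        self.value = 0

    def strobe(self):
        # Assumption: a pulse on the strobe line resets the counter.
        self.value = 0

    def clock(self):
        # Each pulse on the clock line increments the count, wrapping at 8 bits.
        self.value = (self.value + 1) & 0xFF


def send_byte(counter, byte):
    """Model the console side: reset the counter, then pulse the clock `byte` times."""
    counter.strobe()
    for _ in range(byte):
        counter.clock()
    return byte  # number of clock pulses spent on this byte


counter = ExternalCounter()
pulses = send_byte(counter, 0xA5)
print(hex(counter.value), pulses)
```

The counter's outputs end up holding the byte in parallel, but in this model a single byte can cost up to 255 clock pulses, and the receiving side still needs some way to know when the count is finished. That is the kind of extra complexity I mean.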