Fast GPIO access

This thread is about transferring samples, but I wonder whether anything in it points to a potential solution:

For example, there may be some protocol-level buffering that could be shrunk to reduce latency.