Reading the docs, I was puzzled by the choice of nextToken / startToken for pagination. It forces serial access when iterating over a very large set of files: I'm pushing up several thousand (roughly 35-50k) files, and to iterate over all of those records I have to ask for 200 at a time, sequentially. Was this done intentionally, as a form of rate limiting? If not, why not support the usual "offset" / "limit" parameters instead? Each request would still be capped at a maximum number of results, but the calls could be issued in parallel.
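To make the difference concrete, here is a toy sketch using an in-memory list in place of the real API (function names, page size, and the token encoding are all illustrative): token pagination serializes the requests because each one depends on the previous response, while offset/limit pages are independent and can be fetched concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

# Fake dataset standing in for the uploaded file records.
RECORDS = [f"file-{i}" for i in range(1000)]
PAGE_SIZE = 200

def fetch_page_by_token(start_token):
    """Cursor pagination: each response carries the token for the next page,
    so request N+1 cannot start until request N has returned."""
    start = int(start_token or 0)
    page = RECORDS[start:start + PAGE_SIZE]
    next_token = str(start + PAGE_SIZE) if start + PAGE_SIZE < len(RECORDS) else None
    return page, next_token

def fetch_page_by_offset(offset, limit=PAGE_SIZE):
    """Hypothetical offset/limit endpoint: every page is addressable up
    front, so pages can be requested in parallel."""
    return RECORDS[offset:offset + limit]

# Serial: must walk the token chain one response at a time.
serial, token = [], None
while True:
    page, token = fetch_page_by_token(token)
    serial.extend(page)
    if token is None:
        break

# Parallel: all offsets are known before any request is sent.
offsets = range(0, len(RECORDS), PAGE_SIZE)
with ThreadPoolExecutor(max_workers=8) as pool:
    parallel = [r for page in pool.map(fetch_page_by_offset, offsets) for r in page]

assert serial == parallel == RECORDS
```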
You might want to use the changes API instead. It will give you all nodes at once if you specify no checkpoint, or just the updates since that checkpoint if you do. The only downside is that the response format is a bit unusual.
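A minimal simulation of how that checkpoint pattern typically works (the function name, field names, and response shape here are made up for illustration and won't match the real endpoint): no checkpoint returns the full dump, and replaying a stored checkpoint returns only what changed since.

```python
# Toy in-memory "change log" standing in for the server's node history.
CHANGE_LOG = [{"id": i, "name": f"file-{i}"} for i in range(10)]

def get_changes(checkpoint=None):
    """No checkpoint -> every node; with checkpoint -> only entries added
    since then. Returns the nodes plus a new checkpoint to store."""
    start = 0 if checkpoint is None else checkpoint
    return {"nodes": CHANGE_LOG[start:], "checkpoint": len(CHANGE_LOG)}

full = get_changes()            # initial sync: all nodes at once
cp = full["checkpoint"]         # persist this for the next call

CHANGE_LOG.append({"id": 10, "name": "file-10"})  # something changes upstream
delta = get_changes(cp)         # incremental sync: just the update

assert len(full["nodes"]) == 10 and len(delta["nodes"]) == 1
```

The practical win is the same one raised above: one bulk request replaces thousands of sequential page fetches, and subsequent syncs are proportional to the number of changes rather than the total file count.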
As you have noticed, pagination organized like this (a next-page token) prevents any parallelization on the client side, so I'd guess it's one way of preventing service abuse, in addition to rate limiting requests with 429 error responses.
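Either way, a client iterating this much data should handle those 429 responses gracefully. A common pattern is exponential backoff with jitter; here is a sketch (the 0.1s base delay is shortened for demonstration — real clients would back off on the order of seconds, ideally honoring a Retry-After header if the server sends one):

```python
import random
import time

def request_with_backoff(send, max_retries=5, base=0.1):
    """Retry on HTTP 429 with exponential backoff plus random jitter.
    `send` is any callable returning a (status_code, body) pair."""
    for attempt in range(max_retries):
        status, body = send()
        if status != 429:
            return body
        # Wait base * 2^attempt plus jitter before retrying.
        time.sleep(base * (2 ** attempt) + random.random() * base)
    raise RuntimeError(f"still rate limited after {max_retries} attempts")

# Simulated server: rate-limits the first two calls, then succeeds.
responses = iter([(429, None), (429, None), (200, "ok")])
result = request_with_backoff(lambda: next(responses))
assert result == "ok"
```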