At Polars I developed a new query engine which uses a hybrid of push and pull. I gave a short (and not very technical) talk about the engine at our first meetup recently, which can be viewed here: https://www.youtube.com/watch?v=Ndil-eLynh4.
Each operator is a (set of) async function(s) connected to its input(s) and output(s) through capacity-1 SPSC async channels. An operator pulls input by reading from a channel, and pushes output by writing into a channel. For an oversimplified example, consider a simple select operator:
    while let Ok(morsel) = input.recv().await {
        let result = select(morsel);
        if output.send(result).await.is_err() {
            break;
        }
    }
Note how it has two await points: on receive and send.
The nice part about this is that Rust automatically transforms these asynchronous functions into state machines that can pause execution when either a send or a receive blocks, returning control to our scheduler. In the above example the operator can pause either to wait for more data or to wait until the receiver is able to consume more data. This also makes it trivial to, for example, pause execution in the middle of a hash join probe to send some partial results onward in the computational graph.

I'm not seeing how this is pull in any sense. Calling recv on the channel doesn't cause any result to be computed. The push of the previous operators will cause the computation to continue.
EDIT: OK, I guess because the channels are bounded to capacity 1, the spsc channel lets the pushing computation continue only after the "puller" has read the result, but that is more like pushing with back-pressure.
It is pull in the sense that an operator can call `recv().await` (the equivalent of `input.next()` in the article) at any point, which can then block the execution of the operator until more data is available.
It is push in the sense that an operator can call `send(x).await` (the equivalent of `out(x)` in the article) at any point, which can then block the execution of the operator until the data is consumed.
So it is a hybrid of both pull and push. You can, at any point, block on either pulling data or pushing data.
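To make the two blocking points concrete, here is a minimal sketch. It is not the Polars engine: it assumes OS threads and the standard library's blocking `sync_channel(1)` as a stand-in for async tasks and capacity-1 async channels, and `Morsel`, `select`, `select_operator`, and `run_pipeline` are hypothetical names made up for the illustration. The operator loop has the same shape as the one above, with the two `.await` points replaced by blocking calls.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::thread;

// Hypothetical stand-in for a morsel: just a small batch of integers.
type Morsel = Vec<i64>;

// Hypothetical "select" body: doubles every value in the batch.
fn select(morsel: Morsel) -> Morsel {
    morsel.into_iter().map(|x| x * 2).collect()
}

// The operator blocks in two places:
// - on recv(), until upstream has produced a morsel (the "pull" side);
// - on send(), until downstream has room in the channel (the "push" side).
fn select_operator(input: Receiver<Morsel>, output: SyncSender<Morsel>) {
    while let Ok(morsel) = input.recv() {
        if output.send(select(morsel)).is_err() {
            break; // downstream hung up, stop early
        }
    }
}

// Wires source -> select -> sink with capacity-1 channels and runs it.
fn run_pipeline(morsels: Vec<Morsel>) -> Vec<Morsel> {
    let (src_tx, src_rx) = sync_channel::<Morsel>(1);
    let (out_tx, out_rx) = sync_channel::<Morsel>(1);

    // Source: its send() blocks whenever the single slot is still occupied.
    let source = thread::spawn(move || {
        for m in morsels {
            if src_tx.send(m).is_err() {
                break;
            }
        }
    });
    let op = thread::spawn(move || select_operator(src_rx, out_tx));

    // Sink: drains until the operator drops its sender.
    let results: Vec<Morsel> = out_rx.iter().collect();
    source.join().unwrap();
    op.join().unwrap();
    results
}

fn main() {
    let out = run_pipeline(vec![vec![1, 2], vec![3], vec![4, 5, 6]]);
    println!("{:?}", out);
}
```

In this threaded version a blocked `recv` or `send` parks the whole thread; in the async design described above, the same two points instead suspend the operator's state machine and hand control back to the scheduler.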