These dynamic tables don’t necessarily exist anywhere — it’s simply an abstraction that may, or may not, be materialized, depending on the needs of the query being performed. For example, a query that is doing a simple projection
SELECT a, b FROM events
can be executed by simply streaming each record through a stateless Flink pipeline.
Also, Flink doesn’t operate on mini-batches — it processes each event one at a time. So there’s no physical “table”, or partial table, anywhere.
But some queries do require some state, perhaps very little, such as
SELECT count(*) FROM events
which needs nothing more than a single counter, while something like
SELECT key, count(*) FROM events GROUP BY key
will use Flink’s key-partitioned state (a sharded key-value store) to persist the current counter for each key. Different nodes in the cluster will be responsible for handling events for different keys.
Just as “normal” SQL takes one or more tables as input, and produces a table as output, stream SQL takes one or streams as input, and produces a stream as output. For example, the
SELECT count(*) FROM events will produce the stream
1 2 3 4 5 ... as its result.
There are some good introductions to Flink SQL on YouTube: https://www.google.com/search?q=flink+sql+hueske+walther, and there are training materials on github with slides and exercises: https://github.com/ververica/sql-training.
CLICK HERE to find out more related problems solutions.