SQL
In Zeppelin, there are two Flink SQL interpreters you can use:
%flink.ssql
Streaming SQL interpreter, which launches Flink streaming jobs via StreamTableEnvironment
%flink.bsql
Batch SQL interpreter, which launches Flink batch jobs via BatchTableEnvironment
You can write any supported Flink SQL statement in Zeppelin. Typing help displays all the supported SQL syntax.
The Flink SQL interpreter in Zeppelin is equivalent to sql-client plus many other enhanced and useful features.
In sql-client, you can run either streaming SQL or batch SQL in one session, but not both together. In Zeppelin you can: %flink.ssql is used for running streaming SQL, while %flink.bsql is used for running batch SQL, and the batch and streaming Flink jobs run in the same Flink session cluster.
You can write multiple SQL statements in one paragraph; statements are separated by semicolons.
Two kinds of SQL comments are supported in Zeppelin:
Single-line comments, which start with --
Multi-line comments, which are enclosed in /* */
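As a minimal sketch of the two points above, the paragraph below combines two semicolon-separated statements with both comment styles. The table name log is hypothetical and is assumed to be registered already.

```sql
%flink.bsql

-- Single-line comment: list the tables registered in the current catalog
SHOW TABLES;

/*
  Multi-line comment: count the rows of a hypothetical,
  already registered table named log.
*/
SELECT count(1) AS total FROM log;
```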
You can set the SQL parallelism via the paragraph local property parallelism.
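For illustration, a paragraph local property is written in parentheses after the interpreter name. The sketch below assumes a hypothetical registered table log with a url column, and the value 2 is arbitrary.

```sql
%flink.bsql(parallelism=2)

-- Runs with parallelism 2; log and url are hypothetical names
SELECT url, count(1) AS pv FROM log GROUP BY url
```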
Sometimes you have multiple insert statements that read the same source but write to different sinks. By default, each insert statement launches a separate Flink job, but you can set the paragraph local property runAsOne to true to run them in a single Flink job.
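A minimal sketch of the multiple-insert case described above follows. The tables log, sink_a, and sink_b and the url column are hypothetical and assumed to be registered already.

```sql
%flink.ssql(runAsOne=true)

-- Both statements read the same hypothetical source table log.
-- With runAsOne=true they are submitted together as a single Flink job.
INSERT INTO sink_a SELECT url, count(1) AS pv FROM log GROUP BY url;

INSERT INTO sink_b SELECT * FROM log WHERE url IS NOT NULL;
```

Without runAsOne, the same paragraph would launch one Flink job per insert statement.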
You can set the Flink job name for an insert statement via the paragraph local property jobName. Note that a job name can only be set for an insert statement; select statements are not supported yet. The setting also works only for a single insert statement, not for the multiple-insert case described above.
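As a sketch, the job name is passed like any other paragraph local property. The quoting of the value and the names my job, log, and sink_a are assumptions made for illustration.

```sql
%flink.ssql(jobName="my job")

-- A single insert statement; the resulting Flink job is named "my job"
INSERT INTO sink_a SELECT url, count(1) AS pv FROM log GROUP BY url
```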
Zeppelin can visualize the result of a select statement that runs as a Flink streaming job. It supports three modes:
Single mode
Update mode
Append mode
Single mode is for the case where the result of the SQL statement is always one row, for example a global aggregate without group by. The output format is HTML, and you can specify the paragraph local property template as the template for the final output content. In the template, use {i} as the placeholder for the i-th column of the result.
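A sketch of a single-mode paragraph is shown below. It assumes that the mode is selected with the paragraph local property type, that refreshInterval (in milliseconds) controls how often the output is refreshed, and that placeholders are zero-based, so {0} is the first column. The table log and its rowtime column are hypothetical.

```sql
%flink.ssql(type=single, parallelism=1, refreshInterval=3000, template=<h1>{1}</h1> until <h2>{0}</h2>)

-- Global aggregate: the result is always exactly one row.
-- {0} is filled with max(rowtime), {1} with count(1).
SELECT max(rowtime), count(1) FROM log
```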
Update mode is suitable for the case where the output has more than one row and is continuously updated, for example a query that uses group by.
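A sketch of an update-mode paragraph, under the same assumptions (the type and refreshInterval properties, and a hypothetical table log with a url column):

```sql
%flink.ssql(type=update, parallelism=1, refreshInterval=2000)

-- One row per url; each row's count keeps changing as new records arrive
SELECT url, count(1) AS pv FROM log GROUP BY url
```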
Append mode is suitable for the scenario where output data is only ever appended, for example a query that uses a tumble window.
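A sketch of an append-mode paragraph, again assuming the type and refreshInterval properties and a hypothetical table log whose rowtime column is a time attribute:

```sql
%flink.ssql(type=append, parallelism=1, refreshInterval=2000)

-- Each closed 5-second tumbling window emits new rows that are never updated later
SELECT TUMBLE_START(rowtime, INTERVAL '5' SECOND) AS start_time,
       url,
       count(1) AS pv
FROM log
GROUP BY TUMBLE(rowtime, INTERVAL '5' SECOND), url
```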