diff --git a/docs/modules/demos/images/nifi-kafka-druid-earthquake-data/superset_4.png b/docs/modules/demos/images/nifi-kafka-druid-earthquake-data/superset_4.png
index c59a5d3f..660ed6ac 100644
Binary files a/docs/modules/demos/images/nifi-kafka-druid-earthquake-data/superset_4.png and b/docs/modules/demos/images/nifi-kafka-druid-earthquake-data/superset_4.png differ
diff --git a/docs/modules/demos/images/nifi-kafka-druid-earthquake-data/superset_7.png b/docs/modules/demos/images/nifi-kafka-druid-earthquake-data/superset_7.png
new file mode 100644
index 00000000..49bfff0b
Binary files /dev/null and b/docs/modules/demos/images/nifi-kafka-druid-earthquake-data/superset_7.png differ
diff --git a/docs/modules/demos/images/nifi-kafka-druid-earthquake-data/superset_7a.png b/docs/modules/demos/images/nifi-kafka-druid-earthquake-data/superset_7a.png
deleted file mode 100644
index 6a68ce3f..00000000
Binary files a/docs/modules/demos/images/nifi-kafka-druid-earthquake-data/superset_7a.png and /dev/null differ
diff --git a/docs/modules/demos/images/nifi-kafka-druid-earthquake-data/superset_7b.png b/docs/modules/demos/images/nifi-kafka-druid-earthquake-data/superset_7b.png
deleted file mode 100644
index ce5fc2bb..00000000
Binary files a/docs/modules/demos/images/nifi-kafka-druid-earthquake-data/superset_7b.png and /dev/null differ
diff --git a/docs/modules/demos/images/nifi-kafka-druid-earthquake-data/superset_8.png b/docs/modules/demos/images/nifi-kafka-druid-earthquake-data/superset_8.png
deleted file mode 100644
index e3e892ff..00000000
Binary files a/docs/modules/demos/images/nifi-kafka-druid-earthquake-data/superset_8.png and /dev/null differ
diff --git a/docs/modules/demos/images/nifi-kafka-druid-earthquake-data/superset_9.png b/docs/modules/demos/images/nifi-kafka-druid-earthquake-data/superset_9.png
deleted file mode 100644
index b32fdc6b..00000000
Binary files a/docs/modules/demos/images/nifi-kafka-druid-earthquake-data/superset_9.png and /dev/null differ
diff --git a/docs/modules/demos/images/nifi-kafka-druid-water-level-data/superset_3.png b/docs/modules/demos/images/nifi-kafka-druid-water-level-data/superset_3.png
index 4099d4ee..0fedd1a5 100644
Binary files a/docs/modules/demos/images/nifi-kafka-druid-water-level-data/superset_3.png and b/docs/modules/demos/images/nifi-kafka-druid-water-level-data/superset_3.png differ
diff --git a/docs/modules/demos/images/nifi-kafka-druid-water-level-data/superset_4.png b/docs/modules/demos/images/nifi-kafka-druid-water-level-data/superset_4.png
new file mode 100644
index 00000000..d2581072
Binary files /dev/null and b/docs/modules/demos/images/nifi-kafka-druid-water-level-data/superset_4.png differ
diff --git a/docs/modules/demos/images/nifi-kafka-druid-water-level-data/superset_4a.png b/docs/modules/demos/images/nifi-kafka-druid-water-level-data/superset_4a.png
deleted file mode 100644
index b553f349..00000000
Binary files a/docs/modules/demos/images/nifi-kafka-druid-water-level-data/superset_4a.png and /dev/null differ
diff --git a/docs/modules/demos/images/nifi-kafka-druid-water-level-data/superset_4b.png b/docs/modules/demos/images/nifi-kafka-druid-water-level-data/superset_4b.png
deleted file mode 100644
index 72f31947..00000000
Binary files a/docs/modules/demos/images/nifi-kafka-druid-water-level-data/superset_4b.png and /dev/null differ
diff --git a/docs/modules/demos/pages/nifi-kafka-druid-earthquake-data.adoc b/docs/modules/demos/pages/nifi-kafka-druid-earthquake-data.adoc
index feb95649..73ae3bb8 100644
--- a/docs/modules/demos/pages/nifi-kafka-druid-earthquake-data.adoc
+++ b/docs/modules/demos/pages/nifi-kafka-druid-earthquake-data.adoc
@@ -72,7 +72,6 @@ $ stackablectl stacklet list
 │ kafka ┆ kafka ┆ default ┆ broker-default-0-listener-broker-kafka-tls 172.19.0.4:32321 ┆ Available, Reconciling, Running │
 │ ┆ ┆ ┆ broker-default-0-listener-broker-metrics 172.19.0.4:30556 ┆ │
 │ ┆ ┆ ┆ broker-default-bootstrap-kafka-tls 172.19.0.4:31352 ┆ │
-│ ┆ ┆ ┆ broker-default-bootstrap-metrics 172.19.0.4:30241 ┆ │
 ├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
 │ nifi ┆ nifi ┆ default ┆ node-https https://172.19.0.2:32348 ┆ Available, Reconciling, Running │
 ├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
@@ -213,8 +212,8 @@ image::nifi-kafka-druid-earthquake-data/nifi_2.png[]
 
 You can see the started ProcessGroup consisting of three processors.
 The first one - `InvokeHTTP`, fetches the CSV file from the Internet and puts it into the queue of the next processor.
-The second processor - `SplitRecords`, takes the single FlowFile (NiFi Record) which contains all CSV records and splits it into chunks of 2000 records, which are then separately put into the queue of the next processor.
-The third one - `PublishKafkaRecord`, parses the CSV chunk, converts it to JSON records and writes them out into Kafka.
+The second processor - `SplitRecord`, takes the single FlowFile (NiFi Record) which contains all CSV records and splits it into chunks of 2000 records, which are then separately put into the queue of the next processor.
+The third one - `PublishKafka`, parses the CSV chunk, converts it to JSON records and writes them out into Kafka.
 
 Double-click on the `InvokeHTTP` processor to show the processor details.
 
@@ -265,8 +264,8 @@ You can see the available data sources by clicking on `Datasources` at the top.
 
 image::nifi-kafka-druid-earthquake-data/druid_4.png[]
 
-You can see the data source's segments by clicking on `segments` under `Availability` for the `earthquake` data source.
-In this case, the `earthquake` data source is partitioned by the year of the earthquakes, resulting in 73 segments.
+You can see the data source's segments by clicking on `segments` under `Availability` for the `earthquakes` data source.
+In this case, the `earthquakes` data source is partitioned by the year of the earthquakes, resulting in 73 segments.
 
 image::nifi-kafka-druid-earthquake-data/druid_5.png[]
 
@@ -338,28 +337,11 @@ image::nifi-kafka-druid-earthquake-data/superset_6.png[]
 To look at the geographical distribution of the earthquakes you have to click on the tab `Charts` at the top again.
 Afterwards click on the chart `Earthquake distribution`.
 
-image::nifi-kafka-druid-earthquake-data/superset_7a.png[]
+image::nifi-kafka-druid-earthquake-data/superset_7.png[]
 
 The distribution of the earthquakes matches the continental plate margins.
 This is the expected distribution from the {wikipedia}[Wikipedia article on Earthquakes].
 
-NOTE: The earthquakes are rendered without a background map, as this is dependent upon a mapbox API key, which cannot be hosted in a public repository. The figure below shows how this would look if the user has their own key:
-
-image::nifi-kafka-druid-earthquake-data/superset_7b.png[]
-
-// N.B. the next 2 screenshots and their explanation do not make sense until a mapbox key is activated,
-// hence commented out.
-
-// You can move and zoom the map with your mouse to interactively explore the map.
-// You can e.g. have a detailed look at the detected earthquakes in Germany.
-
-// image::nifi-kafka-druid-earthquake-data/superset_8.png[]
-
-// You can also click on the magnitudes in the legend on the top right side to enable/disable printing the earthquakes of that magnitude.
-// By only enabling magnitudes greater or equal to 8 you can plot only the most severe earthquakes.
-
-// image::nifi-kafka-druid-earthquake-data/superset_9.png[]
-
 === Execute arbitrary SQL statements
 
 Within Superset you can not only create dashboards but also run arbitrary SQL statements.
diff --git a/docs/modules/demos/pages/nifi-kafka-druid-water-level-data.adoc b/docs/modules/demos/pages/nifi-kafka-druid-water-level-data.adoc
index 9b2053d9..b3fc27c3 100644
--- a/docs/modules/demos/pages/nifi-kafka-druid-water-level-data.adoc
+++ b/docs/modules/demos/pages/nifi-kafka-druid-water-level-data.adoc
@@ -77,7 +77,6 @@ $ stackablectl stacklet list
 │ kafka ┆ kafka ┆ default ┆ broker-default-0-listener-broker-kafka-tls 172.19.0.3:31041 ┆ Available, Reconciling, Running │
 │ ┆ ┆ ┆ broker-default-0-listener-broker-metrics 172.19.0.3:31503 ┆ │
 │ ┆ ┆ ┆ broker-default-bootstrap-kafka-tls 172.19.0.3:30793 ┆ │
-│ ┆ ┆ ┆ broker-default-bootstrap-metrics 172.19.0.3:32540 ┆ │
 ├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
 │ nifi ┆ nifi ┆ default ┆ node-https https://172.19.0.6:32038 ┆ Available, Reconciling, Running │
 ├╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
@@ -111,7 +110,7 @@ You can execute a command on the Kafka broker to list the available topics as fo
 
 [source,console]
 ----
-$ kubectl k exec kafka-broker-default-0 -c kafka -- \
+$ kubectl exec kafka-broker-default-0 -c kafka -- \
 /stackable/kafka/bin/kafka-topics.sh \
 --describe \
 --bootstrap-server kafka-broker-default-headless.default.svc.cluster.local:9093 \
@@ -259,9 +258,8 @@ network traffic and storage usage. The solution is only to send a station's know
 process is called data normalization. The downside is that when analyzing the data, you need to combine the records from
 multiple tables in Druid (`stations` and `measurements`).
 
-If you are interested in how many records have been produced to the Kafka topic so far, use the following command. It
-will print the last record produced to the topic partition, formatted with the pattern specified in the `-f` parameter.
-The given pattern will print some metadata of the record.
+If you are interested in how many records have been produced to the Kafka topic so far, use the following command.
+It prints the current offset for each partition, which tells you how many records have been produced to it.
 
 [source,console]
 ----
@@ -281,7 +279,7 @@ measurements:6:1344362
 measurements:7:1369651
 ----
 
-Multiplying `1,324,098` records by `8` partitions, we end up with ~ 10,592,784 records.
+Summing all partition offsets gives the total number of records produced. In this example, that is roughly 10.9 million measurement records.
 
 To inspect the last produced records, use the following command. Here, we consume the last three records from partition
 `0` of the `measurements` topic.
@@ -296,7 +294,7 @@ $ kubectl exec kafka-broker-default-0 -c kafka -- \
 --offset latest \
 --partition 0 \
 --max-messages 3
--...
+...
 {"timestamp":"2025-10-21T11:00:00+02:00","value":369.54,"station_uuid":"5cdc6555-87d7-4fcd-834d-cbbe24c9d08b"}
 {"timestamp":"2025-10-21T11:15:00+02:00","value":369.54,"station_uuid":"5cdc6555-87d7-4fcd-834d-cbbe24c9d08b"}
 {"timestamp":"2025-10-21T11:00:00+02:00","value":8.0,"station_uuid":"7deedc21-2878-40cc-ab47-f6da0d9002f1"}
@@ -489,11 +487,7 @@ image::nifi-kafka-druid-water-level-data/superset_3.png[]
 Click on the dashboard called `Water level data`. It might take some time until the dashboard renders all the included
 charts.
 
-image::nifi-kafka-druid-water-level-data/superset_4a.png[]
-
-NOTE: The charts on the right (`Current water level deviation` and `Stations distribution`) are rendered without a background map, as this is dependent upon a mapbox API key, which cannot be hosted in a public repository. The figure below shows how this would look if the user has their own key:
-
-image::nifi-kafka-druid-water-level-data/superset_4b.png[]
+image::nifi-kafka-druid-water-level-data/superset_4.png[]
 
 === View the charts
 
@@ -506,19 +500,18 @@ effect.
 
 image::nifi-kafka-druid-water-level-data/superset_6.png[]
 
-// Comment out the next section as long as the mapbox api key is not active.
-// === View the Station Distribution on the World Map
+=== View the Station Distribution on the World Map
 
-// To look at the stations' geographical distribution, you have to click on the tab `Charts` at the top again. Afterwards,
-// click on the chart `Stations distribution`.
+To look at the stations' geographical distribution, you have to click on the tab `Charts` at the top again. Afterwards,
+click on the chart `Stations distribution`.
 
-// image::nifi-kafka-druid-water-level-data/superset_7.png[]
+image::nifi-kafka-druid-water-level-data/superset_7.png[]
 
-// The stations are, of course, placed alongside waterways. They are coloured by the waters they measure, so all stations
-// alongside a body of water have the same colour. You can move and zoom the map with your mouse to interactively explore
-// the map. You can, e.g. have a detailed look at the water https://en.wikipedia.org/wiki/Rhine[Rhein].
+The stations are, of course, placed alongside waterways. They are coloured by the waters they measure, so all stations
+alongside a body of water have the same colour. You can move and zoom the map with your mouse to interactively explore
+the map. You can, e.g. have a detailed look at the water https://en.wikipedia.org/wiki/Rhine[Rhein].
 
-// image::nifi-kafka-druid-water-level-data/superset_8.png[]
+image::nifi-kafka-druid-water-level-data/superset_8.png[]
 
 === Execute arbitrary SQL statements
 