I am planning to use the Flink high-level API to consume a Kafka topic, apply a tumbling window, and then reduce and map the windowed result. For the map step, I use a custom class that extends RichMapFunction. My confusion is about the open() and close() methods of that map class.
When are those methods called: once before each window ends, or once when each Flink task starts?
For example, if the window is 5 minutes, are they called every 5 minutes before each window fires, or only once when the Flink task spins up?
This is the link to the class definition: https://nightlies.apache.org/flink/flink-docs-release-1.2/api/java/org/apache/flink/api/common/functions/RichFunction.html
One statement in the docs confused me: 'this method will be invoked at the beginning of each iteration superstep'. What does that actually mean?
Also, the docs say that open() is suitable for one-time setup work, but the description of close() does not say the equivalent (that it is suitable for one-time cleanup work).
My goal is to set up a database connection in the Flink job. Where should I establish the connection: globally, as part of constructing the map function class, or in open()? And where should I close it?
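To make the question concrete, here is a minimal, dependency-free sketch of the pattern I have in mind. Everything in it is hypothetical: `RichMapSketch` is only a stand-in for Flink's RichMapFunction (so the snippet compiles without Flink on the classpath), `FakeConnection` stands in for a real database client, and `runTask` encodes my assumption that open() and close() run once per task rather than once per window, which is exactly the part I would like confirmed.

```java
// Hypothetical stand-in for Flink's RichMapFunction; not the real API.
abstract class RichMapSketch<I, O> {
    void open() {}              // one-time setup hook
    abstract O map(I value);    // called once per record
    void close() {}             // teardown hook
}

// Hypothetical stand-in for a database connection; counts opens and closes.
class FakeConnection {
    static int opened = 0;
    static int closed = 0;

    FakeConnection() { opened++; }

    String lookup(String key) { return key + "-enriched"; }

    void close() { closed++; }
}

class EnrichingMapper extends RichMapSketch<String, String> {
    // Deliberately not created in the constructor. As far as I understand,
    // the function object is serialized and shipped to the workers, so a
    // live connection could not travel with it.
    private transient FakeConnection connection;

    @Override
    void open() { connection = new FakeConnection(); }

    @Override
    String map(String value) { return connection.lookup(value); }

    @Override
    void close() {
        if (connection != null) {
            connection.close();
            connection = null;
        }
    }
}

class LifecycleSketch {
    // Mimics the lifecycle I am ASSUMING: open once, map per record, close once.
    static java.util.List<String> runTask(RichMapSketch<String, String> fn,
                                          java.util.List<String> records) {
        java.util.List<String> out = new java.util.ArrayList<>();
        fn.open();
        try {
            for (String record : records) {
                out.add(fn.map(record));
            }
        } finally {
            fn.close();   // runs even if map() throws
        }
        return out;
    }

    public static void main(String[] args) {
        java.util.List<String> out =
            runTask(new EnrichingMapper(), java.util.Arrays.asList("a", "b", "c"));
        System.out.println(out);
        System.out.println("opened=" + FakeConnection.opened
            + ", closed=" + FakeConnection.closed);
    }
}
```

If open() were instead invoked once per window, the connection would be re-created every 5 minutes in my setup, which is what I want to avoid.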
Thanks in advance!