1

I have planning to use flink high-level API to stream on a Kafka topic and perform a tumbling window and then reduce and map subsequently. For the Map function, I have used a custom class that extends RichMapFunction. The confusion is related to the open() and close() function inside the map class.

When those functions will be called, once before each window end or once per each flink task starting.

ie : If the window is 5 min, do those functions called once every 5 mins before the window iteration or once per flink task spin up ?

This is the link to the class definition : https://nightlies.apache.org/flink/flink-docs-release-1.2/api/java/org/apache/flink/api/common/functions/RichFunction.html

The statement inside the doc confused me 'this method will be invoked at the beginning of each iteration superstep' , what is really mean by this.

also, it is written that the open function is suitable for one-time setup work, but that is not written against the close function explanation(as, suitable for one-time cleanup work).

The purpose is to set up a database connection in the Flink job. Where should I establish the connection? globally as part of the construction of the map function class or in the open() function? where can I close the connection?

Thanks in advance!

1 Answer 1

1

This doc will help you understand when open() is called: https://nightlies.apache.org/flink/flink-docs-stable/docs/internals/task_lifecycle/

The database connection should be established in open() and closed in close()

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.