Can we use a for loop to build an Apache Beam Dataflow pipeline dynamically? My fear is how the for loop will behave in a distributed environment when I use it with the Dataflow runner. I am sure this works fine with the direct runner.
For example, can I create the pipeline branches dynamically like this?
    with beam.Pipeline(options=pipeline_options) as pipeline:
        for p in cdata['tablelist']:
            i_file_path = p['sourcefile']
            schemauri = p['schemauri']
            schema = getschema(schemauri)
            dest_table_id = p['targettable']
            (
                pipeline
                | "Read From Input Datafile " + dest_table_id >> beam.io.ReadFromText(i_file_path)
                | "Convert to Dict " + dest_table_id >> beam.Map(lambda r: data_ingestion.parse_method(r))
                | "Write to BigQuery Table " + dest_table_id >> beam.io.WriteToBigQuery(
                    '{0}:{1}'.format(project_name, dest_table_id),
                    schema=schema,
                    write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND)
            )
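Worth noting for context (this is plain Python behavior, not Beam-specific): the for loop runs only once, at graph-construction time on the launching machine, and the runner just executes the expanded graph with one branch per table. The one classic pitfall in loop-built code like this is Python's late binding of loop variables inside lambdas, which a minimal sketch with plain strings can illustrate:

```python
# Each lambda looks up `p` when it is *called*, after the loop has
# finished, so every lambda sees the final value of the loop variable.
late = [f() for f in [lambda: p for p in ["t1", "t2", "t3"]]]
print(late)   # ['t3', 't3', 't3']

# Binding the loop variable as a default argument captures its value
# at each iteration instead.
bound = [f() for f in [lambda p=p: p for p in ["t1", "t2", "t3"]]]
print(bound)  # ['t1', 't2', 't3']
```

In the snippet above this only matters for callables that reference a loop variable; values such as `i_file_path`, `schema`, and `dest_table_id` are passed into the transforms at construction time, so each branch keeps its own copy.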