
I'm solving a problem using step function workflows.

The problem goes like this: I have a workflow of 10 AWS Batch jobs.

The first 3 jobs run in sequence, and jobs 4-7 are dynamic steps, i.e., they need to run multiple times with different parameters as specified.

And for each 4-5-6-7 execution, there are multiple executions of jobs 8-9-10, based on the number of parameters.

A Map state looks like the best fit here, but if any job fails inside the 4-5-6-7 Map state, the entire step fails. I don't want one execution to affect the others.

Approach: I have designed 3 step functions. The first step function runs jobs 1-3, and its last step calls a Lambda function that submits multiple executions of jobs 4-5-6-7. Then, for each 4-5-6-7 execution, another Lambda is triggered to submit multiple executions of jobs 8-9-10.

I'm connecting the step functions together manually through Lambda functions.

Is this the correct approach or are there better ways of doing it?

1 Answer


I'd suggest a couple more elements to make your solution more production-ready.

First, I would suggest that you eliminate the Lambda function calls and use the "Run a Job" (.sync:2) service integration for Nested workflows. I just did a Twitch episode on this yesterday.
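Concretely, the `.sync:2` integration is a Task state that starts a child state machine execution and waits for it to complete, returning the child's output as parsed JSON. A minimal sketch of such a Task state (the state machine ARN and input paths here are placeholders, not values from your setup):

```json
{
  "InvokeChildWorkflow": {
    "Type": "Task",
    "Resource": "arn:aws:states:::states:startExecution.sync:2",
    "Parameters": {
      "StateMachineArn": "arn:aws:states:us-east-1:123456789012:stateMachine:ChildWorkflow",
      "Input": {
        "parameters.$": "$.parameters"
      }
    },
    "End": true
  }
}
```

This removes the hand-written Lambda glue: the parent blocks until the child finishes, and a child failure surfaces as a normal error you can retry or catch.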

Second, if you want to continue after a failed execution inside your Map State, make sure that you are implementing Catchers (and optionally Retriers). I did a Twitch episode on this last Tuesday, and there's some discussion of error handling in the first video linked above.
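As a sketch, here is a Retrier and Catcher attached to a Batch task inside a Map iterator. Because the Catcher routes to a Pass state inside the iterator, a failed iteration is recorded without failing its sibling iterations (job names, JSON paths, and retry settings below are illustrative assumptions):

```json
{
  "ProcessParameterSets": {
    "Type": "Map",
    "ItemsPath": "$.parameterSets",
    "Iterator": {
      "StartAt": "RunBatchJob",
      "States": {
        "RunBatchJob": {
          "Type": "Task",
          "Resource": "arn:aws:states:::batch:submitJob.sync",
          "Parameters": {
            "JobName": "job-4",
            "JobQueue.$": "$.jobQueue",
            "JobDefinition.$": "$.jobDefinition"
          },
          "Retry": [
            {
              "ErrorEquals": ["States.TaskFailed"],
              "IntervalSeconds": 30,
              "MaxAttempts": 2,
              "BackoffRate": 2.0
            }
          ],
          "Catch": [
            {
              "ErrorEquals": ["States.ALL"],
              "ResultPath": "$.error",
              "Next": "RecordFailure"
            }
          ],
          "End": true
        },
        "RecordFailure": {
          "Type": "Pass",
          "End": true
        }
      }
    },
    "End": true
  }
}
```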

So for your specific case, I suggest you:

  1. start by making the 8-9-10 steps into an independent workflow (Child A)
  2. invoke Child A from steps 4-5-6-7 via the "Run a Job" service integration inside a Map State
  3. migrate steps 4-5-6-7 into an independent workflow (Child B)
  4. invoke Child B from the parent workflow (steps 1-2-3), again via the "Run a Job" service integration
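Putting steps 1-4 together, the parent workflow might look like the sketch below: jobs 1-3 chained as Batch `.sync` tasks (only job 1 is shown; jobs 2 and 3 chain the same way), followed by a Map that starts one Child B execution per parameter set. All ARNs, job names, and JSON paths are placeholders:

```json
{
  "Comment": "Parent: jobs 1-3 in sequence, then Child B per parameter set",
  "StartAt": "Job1",
  "States": {
    "Job1": {
      "Type": "Task",
      "Resource": "arn:aws:states:::batch:submitJob.sync",
      "Parameters": {
        "JobName": "job-1",
        "JobQueue.$": "$.jobQueue",
        "JobDefinition.$": "$.job1Definition"
      },
      "Next": "RunChildBPerParameterSet"
    },
    "RunChildBPerParameterSet": {
      "Type": "Map",
      "ItemsPath": "$.parameterSets",
      "Iterator": {
        "StartAt": "StartChildB",
        "States": {
          "StartChildB": {
            "Type": "Task",
            "Resource": "arn:aws:states:::states:startExecution.sync:2",
            "Parameters": {
              "StateMachineArn": "arn:aws:states:us-east-1:123456789012:stateMachine:ChildB",
              "Input": {
                "parameters.$": "$"
              }
            },
            "End": true
          }
        }
      },
      "End": true
    }
  }
}
```

Child B would follow the same pattern internally, running jobs 4-7 and then mapping over its own parameter sets to start Child A (jobs 8-9-10).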

For more information on parallelism in Step Functions and Lambda functions, see this Twitch episode.

Code samples for all of the above are available in this repo on GitHub.


I am contributing this on behalf of my employer, Amazon. My contribution is licensed under the MIT license. See here for a more detailed explanation.


5 Comments

That's a good suggestion, but in case of a failure in one of the Map states, how do we rerun only the failed states of the step function? Or, in general, how do we rerun failed states without re-executing the states that already succeeded?
This link shows a way to restart a step function from any state: aws.amazon.com/blogs/compute/…. If this is the only option, do we need to rerun the entire Map state?
You add the retrier/catcher inside the Map State Iterator. That way it is only run for iterations that do not succeed.
Suppose there is a bug in the 6th Batch job of the 4-5-6-7 workflow. That job fails even after retries, and presumably we would catch the error. But once we fix the issue, how can we make sure only the failed jobs run, and not all of them?
You could use a catcher to send failed jobs to a dead letter queue.
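As a sketch of that idea, assuming a pre-created SQS queue, a Catcher on the Batch task can route the failing iteration's input to the queue via the `sqs:sendMessage` service integration; after the fix, you resubmit only those queued inputs rather than rerunning the whole Map state (the queue URL, job name, and paths are placeholders):

```json
{
  "RunBatchJob": {
    "Type": "Task",
    "Resource": "arn:aws:states:::batch:submitJob.sync",
    "Parameters": {
      "JobName": "job-6",
      "JobQueue.$": "$.jobQueue",
      "JobDefinition.$": "$.jobDefinition"
    },
    "Catch": [
      {
        "ErrorEquals": ["States.ALL"],
        "ResultPath": "$.error",
        "Next": "SendToDLQ"
      }
    ],
    "End": true
  },
  "SendToDLQ": {
    "Type": "Task",
    "Resource": "arn:aws:states:::sqs:sendMessage",
    "Parameters": {
      "QueueUrl": "https://sqs.us-east-1.amazonaws.com/123456789012/failed-jobs-dlq",
      "MessageBody.$": "$"
    },
    "End": true
  }
}
```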
