Workflow Runtimes Increase Exponentially As Iterations Increase #244
Part of this is a fundamental limitation in how workflows scale. Growth isn't linear, because each yield requires replaying all previous steps. For N yields, the total number of replays is N(N + 1)/2. So while 20 yields only require 210 replays, increasing to 200 yields (just 10× more) results in 20,100 replays, roughly a 100× increase. Temporal manages to speed this up using a cache: instead of replaying from persistent storage every time, it holds prior results in memory. But that's also why they require RoadRunner and a dedicated Temporal server; their system can make smarter trade-offs than simple Laravel queues. Still, even in Temporal, this limit eventually shows up. Their solution is Continue-As-New, which closes out a long history and restarts the workflow as a fresh execution. Laravel Workflow doesn't currently have an equivalent.
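To make the arithmetic concrete, here is the growth in plain PHP (illustrative only):

```php
// Resuming at yield i re-runs steps 1 through i, so the total work
// across the workflow is 1 + 2 + ... + N = N(N + 1) / 2.
function totalReplays(int $yields): int
{
    return intdiv($yields * ($yields + 1), 2);
}

echo totalReplays(20) . "\n";  // 210
echo totalReplays(200) . "\n"; // 20100, about 96x the work for 10x the yields
```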
---
There is a significant performance penalty in workflows that iterate over a large number of dispatched activities. I'd like to discuss some ideas for reducing it.
Background
First, allow me to explain with an example. Here's a minimal workflow and activity of the kind I mean (a sketch following the package README; class names are illustrative):
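```php
use Workflow\ActivityStub;
use Workflow\Workflow;

// Illustrative workflow: each yield dispatches one activity and suspends;
// on resume, the framework replays all prior steps.
class LoopWorkflow extends Workflow
{
    public function execute(int $iterations = 20)
    {
        for ($i = 0; $i < $iterations; $i++) {
            yield ActivityStub::make(LoopActivity::class, $i);
        }
    }
}
```

```php
use Workflow\Activity;

// Illustrative activity: near-zero real work, so the measured time
// is almost entirely framework overhead.
class LoopActivity extends Activity
{
    public function execute(int $i)
    {
        return $i;
    }
}
```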
Now, I set a baseline by processing 20 iterations.
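In sketch form, using the package's `WorkflowStub`:

```php
use Workflow\WorkflowStub;

$workflow = WorkflowStub::make(LoopWorkflow::class);
$workflow->start(20);
```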
The output showed that each iteration/activity took between 0.01 and 0.02 seconds, and that the total runtime was 0.3 seconds.
I ran this again with 200 iterations (10 times the baseline).
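Same sketch, larger count:

```php
$workflow = WorkflowStub::make(LoopWorkflow::class);
$workflow->start(200);
```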
I would hope the output would show each iteration/activity still taking between 0.01 and 0.02 seconds, and that the total runtime for 10x as many iterations would be about 10x as long as the baseline.
Instead, the total runtime for 10x as many iterations was actually 400 times as long as the baseline. The shortest iteration was still 0.01 seconds, as before, but the longest was 15x as long, at 0.15 seconds.
Discussion
I suspected the penalty was introduced because `Workflow\Workflow::handle()` queries for each workflow log one at a time (see workflow.php lines 126-128 and 161-163), and the total number of queries grows quadratically with the iteration count: for 2 iterations it would query ~3 times, for 20 iterations ~210 times, and for 200 iterations ~20,100 times.

I tested my theory but didn't see much improvement. I modified the Workflow class to query for all logs once during each iteration, but the performance improvement was negligible:
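The change was along these lines (paraphrased from memory, so treat `logs()`, the `index` column, and the property name as approximations):

```php
// Before (around lines 126-128): one query per recorded yield.
$log = $this->storedWorkflow->logs()
    ->whereIndex($index)
    ->first();

// After: load every log once, key the collection by index,
// and do subsequent lookups in memory.
$this->logCache ??= $this->storedWorkflow->logs()->get()->keyBy('index');
$log = $this->logCache->get($index);
```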
And, similarly, change lines 161-163:
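In sketch form, the same substitution:

```php
// Around lines 161-163: reuse the preloaded, index-keyed collection
// instead of issuing another per-index query.
$log = $this->logCache->get($index);
```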
I understand that each iteration will have a performance cost, but I want the cost to be linear. In other words, a workflow with 200 iterations should take 10 times as long as a workflow with 20 iterations.
In my real-life use of Laravel Workflow, each iteration has 3-10 dispatched activities, and there may be as many as 500 iterations. That quickly adds up to a very long-running workflow, and the same logic takes much longer when dispatched as a workflow than it does as a single process.
What other things can I explore to reduce the penalty I'm finding?
Thanks!!