Symfony version(s) affected
5.4.*
Description + Reproduction
Given the following system:
- 1: An SQS queue that is configured as a dead-letter queue (DLQ).
- 2: An SQS queue that a Symfony consumer should consume from, configured with queue 1 as its DLQ.
- 3: A Symfony Messenger transport consumer configured to read from queue 2. Its AWS permissions are the minimal permissions of a queue consumer:
{
    "Effect": "Allow",
    "Action": [
        "sqs:ReceiveMessage",
        "sqs:ChangeMessageVisibility",
        "sqs:GetQueueUrl",
        "sqs:DeleteMessage",
        "sqs:GetQueueAttributes"
    ],
    "Resource": "arn:aws:sqs:eu-central-1:1111111111:foobarqueue"
}
If a message causes the consumer to throw an exception and cannot be processed, then according to the AWS docs the consumer should not call DeleteMessage, but simply leave the message in the queue. After the configured VisibilityTimeout (https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-visibility-timeout.html) the message becomes visible again and the consumer tries to process it again. This repeats until the queue's configured maximum receive count (maxReceiveCount) is reached, after which SQS moves the message to the DLQ.
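To make the expected flow concrete, here is a minimal in-memory sketch of the SQS-side retry model described above. It is not the AWS API: the `Queue` and `Message` classes, `consume_native`, and the handler are hypothetical names invented for illustration, and the visibility timeout is reduced to "the message stays at the head of the queue".

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(eq=False)
class Message:
    body: str
    receive_count: int = 0  # tracked by the queue, like SQS's receive count


@dataclass
class Queue:
    """Toy stand-in for an SQS queue with a redrive policy (assumption, not the real API)."""
    max_receive_count: int
    messages: List[Message] = field(default_factory=list)
    dead_letters: List[Message] = field(default_factory=list)

    def send(self, body: str) -> None:
        self.messages.append(Message(body))

    def receive(self) -> Optional[Message]:
        while self.messages:
            msg = self.messages[0]
            # Once the message has been received max_receive_count times,
            # the queue itself moves it to the DLQ instead of delivering it.
            if msg.receive_count >= self.max_receive_count:
                self.dead_letters.append(self.messages.pop(0))
                continue
            msg.receive_count += 1
            return msg
        return None

    def delete(self, msg: Message) -> None:
        self.messages.remove(msg)


def consume_native(queue: Queue, handler: Callable[[str], None]) -> int:
    """Process messages; on failure, do nothing and let the queue redeliver."""
    attempts = 0
    while (msg := queue.receive()) is not None:
        attempts += 1
        try:
            handler(msg.body)
        except Exception:
            continue  # no delete: the message becomes visible again later
        queue.delete(msg)  # only successfully handled messages are deleted
    return attempts
```

With `max_receive_count=3` and a handler that always throws, the message is attempted three times and then lands in `dead_letters`, without the consumer ever needing send permissions.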
Symfony's consumer breaks this assumption by orchestrating the retry logic on the consumer side, which is problematic in several ways:
- A: The consumer also needs permission to send messages to the queue, which the minimal consumer policy above does not grant.
- B: It defeats SQS DLQs, because the message's receive count in SQS never increases.
- C: If the consumer cannot re-send the message for any reason (permissions, errors, networking issues, process crashes, etc.), the message might be lost.
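Point B can be illustrated with a small in-memory sketch. This is a simplified model of the behaviour described in this issue (on failure, re-send a new copy and delete the original), not Symfony's actual code and not the AWS API. All names here (`Queue`, `Message`, `consume_with_consumer_side_retry`, `retry_count`) are hypothetical, and what happens after retries are exhausted is not modelled.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(eq=False)
class Message:
    body: str
    retry_count: int = 0    # consumer-side bookkeeping carried with the message
    receive_count: int = 0  # queue-side counter, reset for every new copy


@dataclass
class Queue:
    """Toy stand-in for an SQS queue with a redrive policy (assumption)."""
    max_receive_count: int
    messages: List[Message] = field(default_factory=list)
    dead_letters: List[Message] = field(default_factory=list)

    def send(self, body: str, retry_count: int = 0) -> None:
        self.messages.append(Message(body, retry_count))

    def receive(self) -> Optional[Message]:
        while self.messages:
            msg = self.messages[0]
            if msg.receive_count >= self.max_receive_count:
                self.dead_letters.append(self.messages.pop(0))
                continue
            msg.receive_count += 1
            return msg
        return None

    def delete(self, msg: Message) -> None:
        self.messages.remove(msg)


def consume_with_consumer_side_retry(
    queue: Queue, handler: Callable[[str], None], max_retries: int
) -> List[int]:
    """Return the queue-side receive count observed on each delivery."""
    observed = []
    while (msg := queue.receive()) is not None:
        observed.append(msg.receive_count)
        try:
            handler(msg.body)
        except Exception:
            if msg.retry_count < max_retries:
                # Requires send permission (point A); a failure here,
                # followed by the delete below, would lose the message (point C).
                queue.send(msg.body, msg.retry_count + 1)
        queue.delete(msg)  # the original is removed whether or not handling succeeded
    return observed
```

In this model, a handler that always throws with `max_retries=3` sees four deliveries, each with a queue-side receive count of 1, so the redrive threshold is never reached and `dead_letters` stays empty.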
Can the transport be configured so that it is compatible with the expected usage of SQS again, so that DLQs work properly? If not, how would the implementation need to change to make the Symfony consumer's behaviour compatible with SQS DLQs?
Could there be an exception type that causes the message to be neither deleted nor re-sent to the queue, so that we fall back to SQS's own retry model?
Possible Solution
No response
Additional Context
No response