2023-01-01 10:00:00+00:00

When dealing with high-frequency external data ingestion, such as scraping product prices and stock availability across dozens of suppliers, coupling your client-facing search APIs directly to crawling logic is a major design risk. If a crawler slows down or goes offline, every search request that waits on it slows down or fails with it. An Event-Driven Architecture (EDA) isolates these workflows by placing message brokers between them.

Using AWS SNS topics and SQS queues, you can build a pipeline that processes updates asynchronously, buffers traffic spikes, and protects the datastore behind your primary search index from lockups.


1. Decoupling with SQS and SNS

In our architecture, the quoting API (gos-price-quoter) does not crawl data synchronously. Instead, it writes a message to an SQS queue (cpunto-parts) containing the target part number, and immediately returns the cached search results from Elasticsearch. This keeps API response times under 50ms.
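
As a rough sketch of this request path (not the actual gos-price-quoter code), the handler below enqueues a crawl request and returns whatever is already cached. The names `handle_quote_request`, `build_crawl_request`, the `search_cache` callable, and the queue URL are assumptions made for illustration. In a real deployment `sqs_client` would be `boto3.client("sqs")`, whose `send_message` call takes `QueueUrl` and `MessageBody`.

```python
import json
from datetime import datetime, timezone

# Hypothetical queue URL; the article names the queue (cpunto-parts) but not its URL.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/cpunto-parts"


def build_crawl_request(part_number):
    """Serialize a crawl request for one part number."""
    return json.dumps({
        "part_number": part_number,
        "requested_at": datetime.now(timezone.utc).isoformat(),
    })


def handle_quote_request(part_number, sqs_client, search_cache):
    """Enqueue a refresh for part_number, then return the cached results.

    sqs_client: any object with a boto3-style
                send_message(QueueUrl=..., MessageBody=...) method.
    search_cache: callable returning cached results for a part number
                  (a stand-in for the Elasticsearch lookup).
    """
    sqs_client.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=build_crawl_request(part_number),
    )
    # No waiting on the crawler: respond with whatever is already indexed.
    return search_cache(part_number)
```

Passing the client in as an argument keeps the handler testable without AWS credentials.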

The queue acts as the event source for a Lambda crawler function, which runs independently of the API request: Lambda polls the queue and invokes the crawler with batches of messages. Once the crawler fetches the latest prices, it publishes the structured result to an SNS topic (PartDescriptions). This topic fans out the message to multiple subscribers, including the database persister and audit loggers.
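
The crawler side might look like the following sketch. It assumes the standard SQS-to-Lambda event shape (a `Records` list whose entries carry `messageId` and `body`) and that the event source mapping has `ReportBatchItemFailures` enabled, so only failed messages return to the queue. `make_handler`, `fetch_prices`, and `publish` are hypothetical names, not part of the original system.

```python
import json


def make_handler(fetch_prices, publish):
    """Build a Lambda-style handler for SQS batches.

    fetch_prices: callable(part_number) -> price data (hypothetical crawler).
    publish: callable(part_number, price_data), e.g. an SNS publish wrapper.
    """
    def handler(event, context):
        failures = []
        for record in event.get("Records", []):
            try:
                part_number = json.loads(record["body"])["part_number"]
                publish(part_number, fetch_prices(part_number))
            except Exception:
                # Report only this message as failed so SQS redelivers it;
                # the rest of the batch is deleted from the queue.
                failures.append({"itemIdentifier": record["messageId"]})
        return {"batchItemFailures": failures}

    return handler
```

A message that keeps failing is retried until the queue's redrive policy, if one is configured, moves it to a dead-letter queue.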

2. The Fan-Out Pattern

The SNS fan-out pattern allows multiple downstream microservices to consume the same event independently without modifying the crawler code:

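# Publishing crawl results to SNS in Python
import json
from datetime import datetime, timezone

import boto3

# Create the client once at module load so warm invocations reuse it.
sns_client = boto3.client('sns', region_name='us-east-1')

def publish_part_update(part_number, price_data):
    """Publish one part's crawl result to the PartDescriptions topic."""
    message = {
        "part_number": part_number,
        "prices": price_data,
        # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated since Python 3.12)
        "timestamp": datetime.now(timezone.utc).isoformat()
    }
    response = sns_client.publish(
        TopicArn='arn:aws:sns:us-east-1:123456789012:PartDescriptions',
        # SNS message bodies are strings, so serialize the payload to JSON
        Message=json.dumps(message)
    )
    # SNS returns the ID it assigned to the published message
    return response['MessageId']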

Because the crawler only publishes to the topic and SNS delivers to each subscriber independently, a failure in downstream persistence or analysis does not slow the upstream ingestion rate.
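
On the consuming side, a subscriber such as the database persister would typically read from its own SQS queue subscribed to the topic. Unless raw message delivery is enabled on the subscription, SNS wraps the payload in a JSON envelope whose `Message` field holds the original string. The sketch below unwraps that envelope. `extract_part_update` is a hypothetical helper, and the queue-per-subscriber setup is an assumption rather than something the architecture above specifies.

```python
import json


def extract_part_update(sqs_body):
    """Unwrap an SNS notification delivered to SQS and return the payload dict.

    Assumes raw message delivery is disabled, so the SQS body is the SNS
    envelope and the crawler's JSON sits in its "Message" field as a string.
    """
    envelope = json.loads(sqs_body)
    if envelope.get("Type") != "Notification":
        raise ValueError("not an SNS notification envelope")
    return json.loads(envelope["Message"])
```

Each subscriber can apply the same unwrapping step and then do its own work, whether persisting or audit logging, without any change to the crawler.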