2023-08-07 02:00:00+00:00

In payment systems, network flakiness is a fact of life. An API request to capture a payment might time out, leaving your server in the dark: did the gateway process the payment, or did the request fail before the customer was charged? Retrying without precautions can lead to double charges, which in turn lead to chargebacks and angry customers.

To build a resilient payment system, we need two things: an idempotency key on every payment request and a strict payment state machine.


1. The Payment State Machine

A payment must follow strict state transitions:

Created -> Authorizing -> Authorized -> Capturing -> Captured (or Refunded)

Every transition must lock the payment's row in the database, so that concurrent requests cannot capture or refund the same transaction simultaneously.
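
The transition rule can be made explicit as a lookup table. The following is a minimal in-memory sketch, not a definitive implementation: the `State`, `Payment`, `CanTransition`, and `Transition` names are illustrative, and it assumes that Refunded is reachable only from Captured. In a real service the check and the update would run inside one database transaction that holds the row lock (for example via `SELECT ... FOR UPDATE`).

```go
package main

import "fmt"

// State is one step in the payment lifecycle.
type State string

const (
	Created     State = "Created"
	Authorizing State = "Authorizing"
	Authorized  State = "Authorized"
	Capturing   State = "Capturing"
	Captured    State = "Captured"
	Refunded    State = "Refunded"
)

// allowed maps each state to the states it may move to.
// Assumption: a refund is only possible after a successful capture.
var allowed = map[State][]State{
	Created:     {Authorizing},
	Authorizing: {Authorized},
	Authorized:  {Capturing},
	Capturing:   {Captured},
	Captured:    {Refunded},
}

// CanTransition reports whether moving from one state to another is legal.
func CanTransition(from, to State) bool {
	for _, next := range allowed[from] {
		if next == to {
			return true
		}
	}
	return false
}

// Payment is a hypothetical in-memory stand-in for the database row.
type Payment struct {
	ID    string
	State State
}

// Transition moves the payment to the new state, or returns an error and
// leaves the state unchanged if the move is not in the table.
func (p *Payment) Transition(to State) error {
	if !CanTransition(p.State, to) {
		return fmt.Errorf("payment %s: illegal transition %s -> %s", p.ID, p.State, to)
	}
	p.State = to
	return nil
}
```

Calling `Transition(Captured)` on a payment that is still in Authorizing returns an error and leaves the state untouched, which is what makes a duplicate or out-of-order capture request harmless.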

2. Implementing Idempotency Middleware

Every payment request must include a unique Idempotency-Key header. The payment service checks Redis or the database for this key before executing the gateway call. If a request with the same key is still in progress, the new request waits for it; if that request has already completed, the service returns the cached response:

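import (
    "context"
    "fmt"
    "time"
)

func (s *PaymentService) ProcessIdempotentPayment(ctx context.Context, key string, execute func() (*Response, error)) (*Response, error) {
    redisKey := "idemp:" + key

    // Atomically claim the key; only the first request for a given key succeeds.
    acquired, err := s.redis.SetNX(ctx, redisKey, "IN_PROGRESS", 5*time.Minute).Result()
    if err != nil {
        // Fail closed: without the lock, a duplicate charge cannot be ruled out.
        return nil, fmt.Errorf("acquire idempotency lock: %w", err)
    }
    if !acquired {
        // Another request holds the key: wait for it, or fetch its cached result.
        return s.waitForCachedResponse(ctx, key)
    }

    resp, err := execute()
    if err != nil {
        // Releasing the key is only safe if err proves the gateway did not charge;
        // after an ambiguous error such as a timeout, keep it and reconcile first.
        s.redis.Del(ctx, redisKey)
        return nil, err
    }

    // Cache the response JSON for 24 hours so that retries receive the same result.
    if err := s.redis.Set(ctx, redisKey, serialize(resp), 24*time.Hour).Err(); err != nil {
        // The charge went through; return it along with the caching error.
        return resp, fmt.Errorf("cache idempotent response: %w", err)
    }
    return resp, nil
}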

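With this in place, retries that reuse the same key are answered from the cache once the first attempt has succeeded, so the customer is charged only once no matter how often the client retries after a network failure. The guarantee holds as long as the gateway call finishes within the 5-minute lock window and the retries arrive within the 24-hour cache TTL.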
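
To see the retry behavior without a Redis instance, the same claim-then-cache flow can be sketched in memory. This is an illustration only: `MemoryIdempotency`, `entry`, and `Do` are hypothetical names, `Response` is a stand-in type with an assumed `ChargeID` field, and a mutex-guarded map replaces Redis, so there are no TTLs and nothing survives a restart.

```go
package main

import "sync"

// Response is a stand-in for the gateway response (assumed shape).
type Response struct {
	ChargeID string
}

// entry tracks the outcome of the first request seen for one key.
type entry struct {
	done chan struct{} // closed once the first request has finished
	resp *Response
	err  error
}

// MemoryIdempotency is a hypothetical in-memory replacement for the Redis lock.
type MemoryIdempotency struct {
	mu      sync.Mutex
	entries map[string]*entry
}

// NewMemoryIdempotency returns an empty store.
func NewMemoryIdempotency() *MemoryIdempotency {
	return &MemoryIdempotency{entries: make(map[string]*entry)}
}

// Do runs execute for the first request with a given key. Duplicates that
// arrive while it is running block until it finishes and share its result;
// duplicates that arrive after a success get the cached result.
func (m *MemoryIdempotency) Do(key string, execute func() (*Response, error)) (*Response, error) {
	m.mu.Lock()
	if e, ok := m.entries[key]; ok {
		m.mu.Unlock()
		<-e.done // wait for the in-flight request, or pass through if done
		return e.resp, e.err
	}
	e := &entry{done: make(chan struct{})}
	m.entries[key] = e // claim the key, like SetNX
	m.mu.Unlock()

	e.resp, e.err = execute()
	if e.err != nil {
		// Mirror the Del branch: drop the key so a later retry can run again.
		m.mu.Lock()
		delete(m.entries, key)
		m.mu.Unlock()
	}
	close(e.done)
	return e.resp, e.err
}
```

Calling `Do` twice with the same key and a function that charges the customer invokes that function once; the second call returns the first call's `Response`. A different key runs the function again, which is why clients must reuse the key when retrying and generate a fresh one for each new payment.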