2023-05-11 06:34:17+00:00

When pulling data from major distributors like DigiKey, Arrow, and Avnet, static API keys are rarely sufficient. High-volume enterprise endpoints typically require OAuth2: either the Client Credentials grant or an authorization flow whose short-lived access tokens must be refreshed.

Sharing these tokens across multiple async workers requires a concurrency-safe cache and a proactive refresh loop, so that no request fails on an expired token in the middle of ingestion.
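
A minimal sketch of such a shared cache for asyncio workers follows. It is an illustration under stated assumptions, not our production code: `SharedTokenCache`, `refresh_forever`, and the `fetch` callable (which returns a token and its lifetime in seconds) are hypothetical names. The lock ensures that only one worker refreshes at a time, and the others reuse the result.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional, Tuple

# Hypothetical fetcher type: returns (access_token, lifetime_in_seconds).
TokenFetcher = Callable[[], Awaitable[Tuple[str, float]]]


class SharedTokenCache:
    """Token cache shared by async workers; at most one refresh runs at a time."""

    def __init__(self, fetch: TokenFetcher, refresh_margin: float = 300.0):
        self._fetch = fetch
        self._margin = refresh_margin  # refresh this many seconds before expiry
        self._lock = asyncio.Lock()
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def _is_fresh(self) -> bool:
        return (
            self._token is not None
            and time.monotonic() < self._expires_at - self._margin
        )

    async def get(self) -> str:
        if self._is_fresh():
            return self._token
        async with self._lock:
            # Re-check: another worker may have refreshed while we waited.
            if not self._is_fresh():
                token, lifetime = await self._fetch()
                self._token = token
                self._expires_at = time.monotonic() + lifetime
            return self._token

    async def refresh_forever(self, poll_interval: float = 30.0) -> None:
        """Proactive loop: renews the token before workers hit the margin."""
        while True:
            await self.get()
            await asyncio.sleep(poll_interval)
```

Workers call `await cache.get()` before each request, and `refresh_forever()` can run as a background task so that the refresh usually happens off the request path. Create the cache inside the running event loop so the lock binds to that loop.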


1. Managing the DigiKey OAuth2 Flow

DigiKey requires an authorization flow where a Refresh Token is exchanged for a temporary Access Token (valid for 24 hours). The crawler service must intercept requests, check if the cached access token is expired, and request a refresh if needed:

# Thread-safe OAuth2 Token Refresh in Python
import threading
import time

import requests

class DigiKeyTokenManager:
    def __init__(self, client_id, client_secret, refresh_token):
        self.client_id = client_id
        self.client_secret = client_secret
        self.refresh_token = refresh_token
        self.access_token = None
        self.expires_at = 0
        self._lock = threading.Lock()  # Serializes refreshes across worker threads

    def get_valid_token(self):
        with self._lock:
            if self.access_token and time.time() < self.expires_at - 300:  # 5-min buffer
                return self.access_token

            # Token is missing or near expiry; exchange refresh token for new access token
            url = "https://api.digikey.com/v1/oauth2/token"
            payload = {
                "client_id": self.client_id,
                "client_secret": self.client_secret,
                "refresh_token": self.refresh_token,
                "grant_type": "refresh_token"
            }
            res = requests.post(url, data=payload, timeout=10)
            res.raise_for_status()  # Fail loudly instead of parsing an error body
            data = res.json()

            self.access_token = data["access_token"]
            # Save updated refresh token; keep the current one if none is returned
            self.refresh_token = data.get("refresh_token", self.refresh_token)
            self.expires_at = time.time() + int(data["expires_in"])
            return self.access_token
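
To show where the token manager plugs into outgoing requests, here is a small hedged sketch. `build_auth_headers` is a hypothetical helper, and the `X-DIGIKEY-Client-Id` header name is an assumption that should be checked against DigiKey's current API documentation. The token is fetched at call time, so every request picks up a refreshed token automatically.

```python
from typing import Callable, Dict


def build_auth_headers(get_token: Callable[[], str], client_id: str) -> Dict[str, str]:
    """Build headers for one outgoing request, fetching the token at call time."""
    return {
        "Authorization": f"Bearer {get_token()}",
        "X-DIGIKEY-Client-Id": client_id,  # assumed header name; verify in the docs
        "Accept": "application/json",
    }


# Usage with the manager above (not executed here):
#   headers = build_auth_headers(manager.get_valid_token, manager.client_id)
#   requests.get(product_url, headers=headers, timeout=10)
```

Passing the bound method `manager.get_valid_token` rather than a token string keeps the expiry check on the request path, which is the interception step described above.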

2. Secure Key Storage

We store these credentials in a cloud secret store such as AWS Systems Manager (SSM) Parameter Store or AWS Secrets Manager rather than committing them to configuration files. This keeps the keys out of version control and makes rotation straightforward.
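
As a rough sketch of the Parameter Store variant, the helpers below read credentials through an SSM client and write back a rotated refresh token. The parameter paths and function names are hypothetical; the client is passed in (for example `boto3.client("ssm")`) so the helpers stay easy to test.

```python
from typing import Any, Dict

# Hypothetical parameter paths; adapt to your own naming scheme.
PARAM_NAMES = {
    "client_id": "/crawler/digikey/client_id",
    "client_secret": "/crawler/digikey/client_secret",
    "refresh_token": "/crawler/digikey/refresh_token",
}


def load_credentials(ssm_client: Any, names: Dict[str, str] = PARAM_NAMES) -> Dict[str, str]:
    """Fetch each credential as a decrypted SecureString value."""
    creds = {}
    for key, name in names.items():
        resp = ssm_client.get_parameter(Name=name, WithDecryption=True)
        creds[key] = resp["Parameter"]["Value"]
    return creds


def save_refresh_token(ssm_client: Any, name: str, token: str) -> None:
    """Persist a rotated refresh token so a restarted crawler can still authenticate."""
    ssm_client.put_parameter(
        Name=name, Value=token, Type="SecureString", Overwrite=True
    )


# Usage (requires boto3 and AWS credentials; not executed here):
#   import boto3
#   ssm = boto3.client("ssm")
#   creds = load_credentials(ssm)
```

Writing the refresh token back matters because the token endpoint can return a new refresh token on each exchange; if only the in-memory copy is updated, a restart would fall back to a stale one.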