2026-04-28 10:00:00+00:00

Integrating LLM-powered chatbots into Microsoft Teams is an effective way to automate routine workplace tasks. However, building a conversational assistant for Teams presents a significant technical challenge: reconstructing chat thread histories.

Unlike platforms such as Slack or Discord, which let you fetch a threaded conversation from a single endpoint, Microsoft Teams uses a strictly normalized relational model.

Top-level channel posts and nested thread replies are stored as separate resources. To feed a complete conversation history into an LLM context, you must execute multiple asynchronous HTTP requests, coordinate nested pagination, merge disparate node types, and handle strict API rate limits.

This article details how to build an Asynchronous Thread History Reconstructor in Node.js using the Microsoft Graph API.


🏗️ The Relational Data Model of Teams Conversations

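In Microsoft Teams, channel messages are organized as a two-level hierarchy: a top-level (root) post in the channel, and the replies nested under that root. Each level is exposed as its own resource.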

When your bot receives an event webhook indicating a user has replied inside an active thread, the webhook payload only contains the individual reply message. It does not contain the conversational history before it.
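Before any history can be fetched, the bot needs the ID of the thread's root message. The following is a minimal sketch, assuming the event arrives as a Bot Framework activity whose conversation ID carries a `;messageid=` suffix for channel threads. The helper name `parseThreadRef` and the `ThreadRef` shape are hypothetical; if your events come from a different source (for example, Graph change notifications), the parsing will differ.

```typescript
// Assumption: channel-thread conversation IDs look like
// "19:abc@thread.tacv2;messageid=1700000000000", where the suffix
// identifies the root message of the thread.
interface ThreadRef {
  channelId: string;
  rootMessageId: string | null; // null when no ";messageid=" suffix is present
}

function parseThreadRef(conversationId: string): ThreadRef {
  const parts = conversationId.split(';');
  const channelId = parts[0] ?? conversationId;
  let rootMessageId: string | null = null;

  // Scan the ";key=value" segments that follow the channel ID
  for (const param of parts.slice(1)) {
    const eq = param.indexOf('=');
    if (eq > 0 && param.slice(0, eq).toLowerCase() === 'messageid') {
      rootMessageId = param.slice(eq + 1) || null;
    }
  }
  return { channelId, rootMessageId };
}
```

When `rootMessageId` is null, the incoming message is most likely a top-level post with no earlier thread history to reconstruct.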

To build a chronological chat history for your LLM prompt, you must execute the following coordination workflow:

Teams Webhook: triggers the reply event

  1. Get Root Message: /messages/{root-id}
  2. Get Replies: /messages/{root-id}/replies
  3. Chronological Merge: sorts root and replies by date
  4. LLM Prompt Context: formatted chat history
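The merge step of this workflow can be illustrated in isolation. The sketch below uses made-up sample data and a hypothetical `mergeChronologically` helper; it only assumes that each message carries an ISO 8601 `createdDateTime` field, as Graph chat messages do.

```typescript
interface TimedMessage {
  id: string;
  createdDateTime: string; // ISO 8601 timestamp
}

// Combine the root message with its replies and sort oldest-first.
function mergeChronologically<T extends TimedMessage>(root: T, replies: T[]): T[] {
  return [root, ...replies].sort(
    (a, b) => Date.parse(a.createdDateTime) - Date.parse(b.createdDateTime)
  );
}

// Sample data: replies deliberately supplied newest-first
const ordered = mergeChronologically(
  { id: 'root', createdDateTime: '2026-04-28T10:00:00Z' },
  [
    { id: 'reply-2', createdDateTime: '2026-04-28T10:05:00Z' },
    { id: 'reply-1', createdDateTime: '2026-04-28T10:01:00Z' },
  ]
);

console.log(ordered.map((m) => m.id)); // order: root, reply-1, reply-2
```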

⏳ Handling Graph API Throttling (HTTP 429)

The Microsoft Graph API enforces strict throttling. Channel messaging endpoints are bound to Tier 2 service limits (typically allowing only a few requests per second per app per tenant).

When a thread is active and users reply in rapid succession, each reply triggers a fresh reconstruction, and those requests can quickly run into HTTP 429 Too Many Requests.

To make your reconstruction layer resilient, wrap all network calls in an asynchronous retry wrapper that honors the API's Retry-After header and falls back to exponential backoff when the header is missing.
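The delay calculation behind such a wrapper can be sketched as a small pure function. This is an illustration under assumptions, not the Graph SDK's own logic: the name `retryDelayMs`, the 1-second base, and the 30-second cap are all hypothetical choices. It accepts a Retry-After value given either as a number of seconds or as an HTTP date, and doubles the wait on each attempt when no usable header is present.

```typescript
// Returns how long to wait (in milliseconds) before the next retry.
// attempt is 0 for the first retry, 1 for the second, and so on.
function retryDelayMs(
  retryAfter: string | null | undefined,
  attempt: number,
  baseMs = 1000,   // hypothetical base delay
  maxMs = 30000,   // hypothetical cap for the backoff fallback
  now: number = Date.now()
): number {
  const value = retryAfter?.trim();
  if (value) {
    // Form 1: delay in seconds, e.g. "5"
    const seconds = Number(value);
    if (Number.isFinite(seconds) && seconds >= 0) {
      return seconds * 1000;
    }
    // Form 2: HTTP date, e.g. "Tue, 28 Apr 2026 10:00:10 GMT"
    const dateMs = Date.parse(value);
    if (!Number.isNaN(dateMs)) {
      return Math.max(0, dateMs - now);
    }
  }
  // Fallback: exponential backoff (base, 2x, 4x, ...) up to the cap
  return Math.min(maxMs, baseMs * 2 ** attempt);
}
```

Keeping the calculation separate from the network call makes it easy to unit-test. Adding random jitter to the fallback delay is a common refinement when many bot instances retry at once.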


🛠️ The Complete Node.js Implementation

Below is a TypeScript class for Node.js that fetches the root message and its replies concurrently, follows reply pagination, retries on Graph API rate limiting, and returns a clean chronological array ready for consumption by Gemini or Claude:

// teams-history-reconstructor.ts (Asynchronous MS Graph Thread Compiler)
import { Client } from '@microsoft/microsoft-graph-client';
import { logger } from './logger.js';

interface ChatMessageContext {
  role: 'user' | 'assistant';
  authorName: string;
  content: string;
  timestamp: string;
}

export class TeamsThreadReconstructor {
  private graphClient: Client;

  constructor(accessToken: string) {
    // Initialize standard Microsoft Graph SDK Client
    this.graphClient = Client.init({
      authProvider: (done) => done(null, accessToken)
    });
  }

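  /**
   * Execution wrapper with HTTP 429 rate-limit handling.
   * Honors Retry-After when present; otherwise backs off exponentially.
   */
  private async executeWithRetry<T>(fn: () => Promise<T>, retries = 3, attempt = 0): Promise<T> {
    try {
      return await fn();
    } catch (err: any) {
      if (err.statusCode === 429 && retries > 0) {
        // err.headers may be a Fetch Headers object or a plain object, depending on the SDK version
        const headerValue = typeof err.headers?.get === 'function'
          ? err.headers.get('retry-after')
          : err.headers?.['retry-after'];
        const parsed = parseInt(headerValue ?? '', 10);
        // Fall back to exponential backoff (1s, 2s, 4s) if the header is missing or not numeric
        const retryAfterSeconds = Number.isFinite(parsed) && parsed >= 0 ? parsed : 2 ** attempt;
        logger.warn(`Microsoft Graph rate-limit hit. Sleeping for ${retryAfterSeconds}s...`);

        await new Promise((resolve) => setTimeout(resolve, retryAfterSeconds * 1000));
        return this.executeWithRetry(fn, retries - 1, attempt + 1);
      }
      throw err;
    }
  }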

  /**
   * Reconstructs thread messages in exact chronological order.
   */
  public async getThreadHistory(
    teamId: string,
    channelId: string,
    messageId: string
  ): Promise<ChatMessageContext[]> {
    const basePath = `/teams/${teamId}/channels/${channelId}/messages/${messageId}`;
    
    logger.info({ messageId }, 'Starting asynchronous thread history reconstruction...');

    try {
      // Step 1: Fetch root message and replies concurrently to optimize latency
      const [rootRes, repliesRes] = await Promise.all([
        this.executeWithRetry(() => this.graphClient.api(basePath).get()),
        this.executeWithRetry(() => this.graphClient.api(`${basePath}/replies`).get())
      ]);

      const allReplies: any[] = [...repliesRes.value];
      let nextLink = repliesRes['@odata.nextLink'];

      // Step 2: Traverse replies pagination pages if odata.nextLink is present
      while (nextLink) {
        logger.debug('Fetching next page of thread replies...');
        const pageRes = await this.executeWithRetry(() => 
          this.graphClient.api(nextLink).get()
        );
        allReplies.push(...pageRes.value);
        nextLink = pageRes['@odata.nextLink'];
      }

      // Step 3: Combine and map disparate structures into unified LLM contexts
      const mappedMessages: ChatMessageContext[] = [];

      // Helper to map Graph Message object to LLM Schema
      const mapMessage = (msg: any): ChatMessageContext => ({
        role: msg.from?.application ? 'assistant' : 'user',
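        // Bot-authored messages carry their name under from.application rather than from.user
        authorName: msg.from?.user?.displayName || msg.from?.application?.displayName || 'System User',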
        // Strip HTML tags from standard Teams Rich-Text format
        content: (msg.body?.content || '').replace(/<[^>]*>/g, '').trim(),
        timestamp: msg.createdDateTime
      });

      // Insert Root Message first
      mappedMessages.push(mapMessage(rootRes));

      // Append all replies
      allReplies.forEach(reply => {
        mappedMessages.push(mapMessage(reply));
      });

      // Step 4: Chronological sorting (replies pagination might fetch out of order)
      mappedMessages.sort((a, b) => 
        new Date(a.timestamp).getTime() - new Date(b.timestamp).getTime()
      );

      logger.info({ count: mappedMessages.length }, 'Thread history successfully compiled.');
      return mappedMessages;
    } catch (err: any) {
      logger.error({ err, messageId }, 'Failed thread reconstruction.');
      throw err;
    }
  }
}
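Once the history array exists, it still has to be turned into prompt text. The sketch below shows one possible way to do that; the `formatTranscript` helper, the line format, and the 50-message cap are assumptions for illustration and are not part of the class above. The interface is repeated so the snippet stands alone.

```typescript
interface ChatMessageContext {
  role: 'user' | 'assistant';
  authorName: string;
  content: string;
  timestamp: string;
}

// Render the most recent messages as one line each, oldest first.
// Messages whose content is empty after HTML stripping are skipped.
function formatTranscript(history: ChatMessageContext[], maxMessages = 50): string {
  const nonEmpty = history.filter((m) => m.content.length > 0);
  const start = Math.max(0, nonEmpty.length - maxMessages);
  return nonEmpty
    .slice(start)
    .map((m) => `[${m.timestamp}] ${m.authorName} (${m.role}): ${m.content}`)
    .join('\n');
}
```

Capping the number of messages keeps very long threads from overflowing the context window; a token-based budget would be more precise but depends on the tokenizer of the target model.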

📈 Summary of Benefits

Implementing this asynchronous thread reconstruction layer offers three main advantages:

  1. Contextual Coherence: The LLM receives the full history of a thread rather than an isolated reply, which supports multi-turn problem-solving inside Teams.
  2. Resilience to Throttling: The retry loop honors Retry-After and retries a bounded number of times, so brief bursts of HTTP 429 responses during busy channel conversations are absorbed instead of failing the request.
  3. Clean String Content: Regex stripping removes Teams' verbose HTML markup, reducing token consumption (by up to 60%, depending on how much markup the messages carry) and keeping context windows clean.
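A plain tag-stripping regex leaves HTML entities such as `&nbsp;` and `&amp;` in the text and glues paragraphs together. The following sketch shows one way to handle both; the `teamsHtmlToText` name and the small entity table are assumptions for illustration, and a full HTML parser would be the safer choice for untrusted or deeply nested markup.

```typescript
// Small, deliberately incomplete entity table (illustrative only)
const ENTITIES: Record<string, string> = {
  '&nbsp;': ' ',
  '&amp;': '&',
  '&lt;': '<',
  '&gt;': '>',
  '&quot;': '"',
  '&#39;': "'",
};

function teamsHtmlToText(html: string): string {
  return html
    // Turn line-breaking tags into newlines before stripping
    .replace(/<br\s*\/?>|<\/(p|div|li)>/gi, '\n')
    // Remove all remaining tags, keeping their inner text
    .replace(/<[^>]*>/g, '')
    // Decode known entities in a single pass (avoids double decoding)
    .replace(/&(nbsp|amp|lt|gt|quot|#39);/g, (m) => ENTITIES[m] ?? m)
    // Collapse runs of spaces/tabs and blank lines
    .replace(/[ \t]+/g, ' ')
    .replace(/\n\s*\n+/g, '\n')
    .trim();
}
```

Because tags are stripped but their inner text is kept, the visible text of inline elements such as mentions survives in the output.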

By coordinating root fetches, handling paginated replies, and honoring throttling boundaries in Node.js, you can build highly reliable and deeply conversational enterprise AI assistants for Microsoft Teams.