BeepGPT: Fine-Tuning GPT Models to Predict When You Actually Need That Slack Notification
Hook
What if your Slack notifications could learn when you actually care about a conversation versus when it's just noise? DataStax built exactly that by fine-tuning GPT on your message history, then archived the project.
Context
Slack notification fatigue is a real productivity killer. You're either drowning in constant pings that break your flow, or you've muted everything and miss critical mentions. Native per-channel settings are coarse: get notified about every message, only @mentions, or nothing at all. Keywords help, but they're brittle: they catch "deploy failed" but miss "the deployment isn't looking good."
BeepGPT emerged from a fascinating question: could you train an AI to understand your notification preferences by analyzing how you've historically engaged with messages? If you typically respond within minutes to security incidents but ignore most product discussions until end-of-day, could a model learn that pattern? DataStax's experiment combined Kaskada, their temporal streaming engine, with OpenAI's fine-tuning APIs to build exactly this. The project sits in their archive now, but it's a masterclass in applied ML engineering—showing how to merge streaming data processing with LLMs to solve a real problem.
Technical Insight
BeepGPT's architecture is a two-stage pipeline: batch training on historical data, then real-time inference on live messages. The key innovation is using Kaskada for temporal feature engineering before feeding data to GPT models.
Kaskada processes exported Slack JSON files to extract temporal patterns—things like "user typically responds to security-tagged messages within 5 minutes" or "ignores messages from bot accounts after 6pm." These temporal features become part of the training data. The repository shows multiple experimental approaches, but the core fine-tuning workflow looks like this:
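# From their training pipeline - simplified
# (pre-1.0 openai Python SDK; slack_export, user_id, kaskada_features and the helpers are placeholders)
import json
import openai

# Generate training examples from Slack history.
# gpt-3.5-turbo fine-tuning expects chat-format records:
# {"messages": [{"role": "user", "content": "<message context>"},
#               {"role": "assistant", "content": "notify" or "skip"}]}
training_data = []
for conversation in slack_export:
    for msg in conversation['messages']:
        # Check if user engaged (replied, reacted, etc.)
        user_engaged = check_historical_engagement(msg, user_id)
        prompt = format_message_context(
            message=msg['text'],
            channel=conversation['channel'],
            sender=msg['user'],
            thread_context=get_thread_messages(msg),
            temporal_features=kaskada_features  # Time-based patterns
        )
        label = "notify" if user_engaged else "skip"
        training_data.append({"messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": label},
        ]})

# Write one JSON object per line, as the fine-tuning endpoint expects
with open("training.jsonl", "w") as f:
    for example in training_data:
        f.write(json.dumps(example) + "\n")

# Fine-tune GPT-3.5
file = openai.File.create(file=open("training.jsonl", "rb"), purpose='fine-tune')
fine_tune = openai.FineTuningJob.create(training_file=file.id, model="gpt-3.5-turbo")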
The temporal aspect is crucial. Unlike simple keyword matching, Kaskada lets you ask questions like "how many messages has this sender posted in the last hour?" or "what's the user's typical response latency to this channel on Tuesday afternoons?" These features help the model understand context beyond just message content.
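To make the idea of a temporal feature concrete, here is a minimal pure-Python sketch of one such question: how many messages a sender posted in the hour before each message. It does not use Kaskada's API. The function name `sender_counts_last_hour` and the `(timestamp, sender)` tuple layout are assumptions made for illustration only.

```python
from collections import defaultdict, deque


def sender_counts_last_hour(messages, window_seconds=3600):
    """For each (ts, sender) pair, count that sender's earlier messages
    that fall inside the trailing window. Input must be sorted by ts."""
    recent = defaultdict(deque)  # sender -> timestamps still inside the window
    counts = []
    for ts, sender in messages:
        window = recent[sender]
        # Drop timestamps that have aged out of the window
        while window and ts - window[0] > window_seconds:
            window.popleft()
        counts.append(len(window))
        window.append(ts)
    return counts


# Hypothetical sample data: timestamps in seconds
messages = [(0, "alice"), (600, "alice"), (900, "bob"), (4000, "alice")]
print(sender_counts_last_hour(messages))
```

Because each feature value only looks backwards from the message's own timestamp, the same function could be applied to historical exports and to live messages without leaking future information into training data.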
For real-time inference, the beep-gpt.py script connects to Slack's WebSocket API and runs each new message through the fine-tuned model. The repository includes a particularly clever piece of conversation boundary detection using few-shot learning:
# Detecting when a Slack thread has "ended" - from their experiments
few_shot_prompt = """
Determine if this Slack conversation has naturally concluded.
Example 1:
Message 1: "Can someone review the API changes?"
Message 2: "LGTM, approved"
Message 3: "Thanks!"
Status: CONCLUDED
Example 2:
Message 1: "The production deploy failed"
Message 2: "Looking into it now"
Status: ONGOING
Now analyze:
{thread_messages}
Status:
"""
This conversation boundary detection solves a subtle problem: when should you stop tracking a thread? Time-based heuristics ("no messages in 30 minutes") fail because some discussions naturally pause and then resume. Asking GPT to judge semantic closure can handle those cases, at the cost of an extra model call per check.
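One way the boundary check might be wired up is sketched below. It is not the repository's code: `build_prompt`, `thread_concluded`, and the shortened `PROMPT_TEMPLATE` are hypothetical names, and the model call is passed in as a plain callable (`complete`) so the sketch does not depend on any particular SDK version.

```python
from typing import Callable, List

# Shortened stand-in for the few-shot prompt; the examples are omitted here
PROMPT_TEMPLATE = (
    "Determine if this Slack conversation has naturally concluded.\n"
    "Answer CONCLUDED or ONGOING.\n\n"
    "{thread_messages}\n"
    "Status:"
)


def build_prompt(thread: List[str]) -> str:
    """Number the messages the same way the few-shot examples do."""
    lines = [f'Message {i}: "{text}"' for i, text in enumerate(thread, start=1)]
    return PROMPT_TEMPLATE.format(thread_messages="\n".join(lines))


def thread_concluded(thread: List[str], complete: Callable[[str], str]) -> bool:
    """Return True only when the model's reply clearly says CONCLUDED.

    Any other reply is treated as ONGOING, so a garbled answer keeps the
    thread tracked instead of silently dropping it.
    """
    reply = complete(build_prompt(thread)).strip().upper()
    return reply.startswith("CONCLUDED")


def fake_complete(prompt: str) -> str:
    """Stand-in for a real model call, used only to show the flow."""
    return " concluded\n"


print(thread_concluded(["Can someone review?", "LGTM", "Thanks!"], fake_complete))
```

Defaulting to ONGOING on unexpected output is a design choice: a thread that is tracked slightly too long is usually cheaper than one that is closed while still active.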
The repository also explores vector embeddings as an alternative approach. Instead of fine-tuning, they embed messages with text-embedding-ada-002 and use semantic similarity to match against messages the user historically engaged with. This is cheaper than fine-tuning but requires maintaining a vector database of your entire Slack history.
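The matching step of the embedding approach can be sketched without any API calls. The example below assumes the vectors have already been produced by an embedding model. The function names and the 0.8 threshold are illustrative assumptions, not values taken from the repository.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def should_notify(new_vec: Sequence[float],
                  engaged_vecs: List[Sequence[float]],
                  threshold: float = 0.8) -> bool:
    """Notify when the new message is close to any message the user
    engaged with in the past. The threshold is an arbitrary placeholder."""
    return any(cosine(new_vec, v) >= threshold for v in engaged_vecs)


# Toy 2-D vectors standing in for real embeddings
engaged = [[1.0, 0.0], [0.6, 0.8]]
print(should_notify([0.9, 0.1], engaged))
print(should_notify([-1.0, 0.0], engaged))
```

A linear scan like this is fine for a small history. For a full workspace export, the comparison would normally be delegated to a vector database, which is the maintenance burden noted above.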
One particularly pragmatic touch: the create_slack_workspace.ipynb notebook generates synthetic Slack conversations using GPT to supplement real training data. For privacy-sensitive workspaces where you can't export real messages, or to balance training data across different conversation types, synthetic data generation is clever ML engineering:
# Generate synthetic Slack conversations
response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{
        "role": "system",
        "content": "Generate realistic Slack messages for a software engineering team discussing a production incident."
    }]
)
# Parse and structure as Slack JSON export format
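The parsing step is left as a comment above, so here is one hedged sketch of what it could look like. It assumes the model was asked to return one `speaker: text` line per message, which is an assumption about the prompt rather than something the notebook is known to do. The function name `to_slack_messages` and the start timestamp are hypothetical. Real Slack exports identify users by ID rather than display name, so the `user` values here are a simplification.

```python
import json


def to_slack_messages(generated_text, start_ts=1_700_000_000.0, gap_seconds=30.0):
    """Turn 'speaker: text' lines into Slack-export-style message dicts."""
    messages = []
    for line in generated_text.splitlines():
        if ":" not in line:
            continue  # skip blank lines and anything without a speaker
        user, text = line.split(":", 1)
        # Space messages evenly so each one gets a distinct timestamp
        ts = start_ts + gap_seconds * len(messages)
        messages.append({
            "type": "message",
            "user": user.strip(),
            "text": text.strip(),
            "ts": f"{ts:.6f}",  # Slack stores timestamps as strings
        })
    return messages


# Hypothetical model output
sample = "alice: Prod deploy failed\n\nbob: Looking into it now"
print(json.dumps(to_slack_messages(sample), indent=2))
```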
The full ML lifecycle is here: data collection (Slack export), feature engineering (Kaskada), multiple modeling approaches (fine-tuning, embeddings, few-shot), and deployment (WebSocket bot). It's a complete reference implementation for applied LLMs.
Gotcha
The biggest issue is right in the repository name: datastax-archive. This project is no longer maintained. Kaskada itself was discontinued, so even if you wanted to replicate the temporal feature engineering, you'd need to rebuild that component with a different streaming engine like Flink or ksqlDB.
Cost is the second major limitation. Fine-tuning GPT-3.5 isn't cheap, and running inference on every Slack message in an active workspace adds up fast. A 50-person team posting 1,000 messages per day in total means roughly 1,000 classification calls daily for a single user's model, and up to 50,000 if every member runs a personal model. The repository's README explicitly warns: "This is not production-ready code." Error handling is minimal, there's no monitoring, and the bot will crash on malformed Slack events.
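The cost concern can be made concrete with back-of-the-envelope arithmetic. The sketch below is only a calculator: the token count per call and the price per 1,000 tokens are placeholder inputs chosen for illustration, not OpenAI's actual pricing, and should be replaced with current figures.

```python
def daily_inference_cost(messages_per_day, users_with_models,
                         tokens_per_call, usd_per_1k_tokens):
    """Return (calls per day, estimated USD per day).

    Assumes every message is scored once for every user who has a
    personal model, which is the worst case for call volume.
    """
    calls = messages_per_day * users_with_models
    tokens = calls * tokens_per_call
    return calls, tokens / 1000 * usd_per_1k_tokens


# Placeholder inputs: 500 tokens per call and $0.01 per 1K tokens are
# illustrative values, not real prices.
calls, cost = daily_inference_cost(1000, 50, 500, 0.01)
print(calls, round(cost, 2))
```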
BeepGPT only works with public channels, and that limitation is particularly painful. Much notification urgency lives in DMs and private channels: your manager asking for an update, a customer escalation, a security incident. BeepGPT can't access those without broader OAuth scopes, and training on private data raises serious privacy concerns. You'd need careful data governance and potentially an on-premises LLM deployment to handle sensitive conversations, which defeats the simplicity of using OpenAI's API.
Verdict
Use if: you're researching how to combine temporal data processing with LLMs, building a proof-of-concept for intelligent notification routing, or learning applied ML system design from a real-world example. The repository is strong educational material: the notebooks show experimental methodology, multiple modeling approaches, and pragmatic engineering decisions like synthetic data generation. It's a useful study in how the pieces of an LLM-powered feature fit together, even though the code itself is not production-ready.
Skip if: you need a working notification solution for your team, want something production-ready, or can't justify ongoing OpenAI API costs. The archived status and Kaskada dependency mean you'd essentially be forking and rebuilding core components. For actual use, start with Slack's native notification settings, or explore self-hosted alternatives using local LLMs (Llama, Mistral) with vector databases to avoid per-message API fees. BeepGPT's real value is architectural inspiration, not deployed software.