Open-source bot monitoring and mitigation
November 17, 2025 · 5 min read
Bot traffic is not a single problem. It is a spectrum. On one end, search engine crawlers and uptime monitors perform useful work. On the other, credential stuffing campaigns, scraping operations, and automated account creation exploit your application at scale. In between sits a growing volume of AI agents, automated integrations, and scripts that may or may not be operating within the boundaries you intended.
The conventional approach to this problem is a proprietary SaaS bot mitigation platform. These products sit inline in your request path, analyze every request before it reaches your application, and charge based on traffic volume. They work. They are also expensive, opaque, and require you to route your entire user activity stream through a vendor's infrastructure.
There is a question worth asking before signing that contract: how much of what these platforms do actually requires a proprietary vendor, and how much can you accomplish with open-source tooling running on your own systems?
The cost problem with proprietary bot detection
Proprietary bot mitigation platforms price on request volume. Every request from every user, human or automated, passes through the vendor's infrastructure and counts toward the bill. As your traffic grows, the cost grows with it, regardless of how much of that traffic is actually automated. A product with a million monthly visitors pays to analyze a million sessions even if only a fraction of a percent is bot traffic.
The pricing reflects the vendor's infrastructure costs, but the economics penalize growth. A successful product that doubles its user base doubles its bot detection bill. Seasonal traffic spikes, viral growth periods, and marketing campaigns all increase the cost of a service that is primarily needed for the small percentage of traffic that is actually malicious.
Open-source bot monitoring eliminates the per-request pricing model entirely. The cost is the infrastructure you run it on, which you already control and provision. Traffic growth does not trigger a larger bill from a vendor. The economics scale with your infrastructure, not with a third party's pricing tiers.
Why your activity data should stay on your systems
The cost argument is reason enough to evaluate alternatives. But there is a second problem with the proprietary model that is harder to reverse once you have committed to it: data flow.
Inline bot detection platforms receive every request your users make. Every page load, every API call, every login attempt passes through their infrastructure before reaching yours. For products with significant traffic, this is a continuous stream of behavioral data flowing to a third party: IP addresses, user agent strings, session identifiers, request URLs, and timing patterns, all tied to your authenticated users.
Under data protection frameworks, this data typically qualifies as personal data. Routing it through an external platform creates a processing relationship that requires a legal basis, documentation, and ongoing compliance management. The security purpose does not exempt the transfer from these requirements.
Server-side behavioral monitoring inverts this model. Your application sends events to a monitoring instance running on your own infrastructure. IP intelligence flows inward, fetched through an enrichment API, rather than your traffic flowing out. All behavioral data, trust scores, and detection records stay in your database. There is no third party in the request path and no vendor receiving your users' activity stream.
What open source changes about bot detection
Proprietary bot detection is a black box by design. The vendor's detection logic is their competitive advantage, so you cannot inspect, modify, or audit the rules that decide which traffic is blocked and which is allowed. When a legitimate user is blocked, you file a support ticket. When a new bot pattern targets your product specifically, you wait for the vendor to update their models. When an auditor or regulator asks how your bot detection works, you point to the vendor's documentation and hope it is sufficient.
Open-source detection logic changes each of these situations. The rules that classify traffic as automated are readable. When a legitimate automated integration is incorrectly scored, you can identify the contributing rules and adjust the weights yourself. When a new attack pattern emerges, you can write a rule that addresses it on the same day. When someone asks how the detection works, the answer is in the codebase.
This transparency also matters for the grey area between good and bad bots. A scraper that collects publicly available product information might be acceptable at low volume but damaging at high volume. An AI agent accessing your API within its authorized scope is legitimate, but the same agent making requests outside its intended parameters is a problem. These distinctions require judgment specific to your product. Configurable, inspectable rules let you encode that judgment directly. A proprietary vendor's one-size-fits-all model cannot.
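As a concrete illustration of encoding that judgment, the sketch below scores a session by request volume: tolerated at low rates, penalized progressively above a threshold. The names (`Session`, `score_session`) and the thresholds are hypothetical, not tirreno's API; they stand in for the kind of configurable, inspectable rule the text describes.

```python
from dataclasses import dataclass

# Hypothetical sketch of a volume-sensitive scraping rule.
# Thresholds and weights are illustrative, not tirreno defaults.
@dataclass
class Session:
    requests_per_hour: int
    user_agent: str

def score_session(s: Session, low: int = 100, high: int = 1000) -> int:
    """Return a risk-score contribution: 0 below `low` requests/hour,
    a maximum penalty of 30 at or above `high`, linear in between."""
    if s.requests_per_hour <= low:
        return 0
    if s.requests_per_hour >= high:
        return 30
    # linear ramp between the two thresholds
    return round(30 * (s.requests_per_hour - low) / (high - low))
```

Because the rule is plain code with explicit parameters, the "acceptable at low volume, damaging at high volume" boundary is something you set and can audit, not something a vendor model decides for you.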
Open source also means the detection cannot be bypassed by identifying the monitoring vendor's infrastructure. There is no external domain to block and no third-party script to strip. The monitoring runs on your backend and is indistinguishable from your application's normal request processing.
How tirreno handles bot monitoring
tirreno approaches bot detection as a behavioral monitoring problem rather than an inline filtering problem. Your application sends events to tirreno from the backend. tirreno evaluates the behavioral signals against its rule engine, updates user trust scores, and provides enforcement through the blacklist API. Your application queries the blacklist before granting access, completing the loop.
This server-side model means there is no client-side script for bots to detect or bypass, and no browser signal to spoof. The monitoring is invisible because it operates on data your application already has: IP addresses, user agent strings, request URLs, timestamps, and user identifiers.
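A backend event forward might look like the sketch below. The `/sensor/` endpoint path, the `Api-Key` header, and the exact field names are assumptions for illustration; consult the tirreno developer guide for the authoritative event schema.

```python
import json
import urllib.request

# Assumed instance URL and credentials -- replace with your own.
TIRRENO_URL = "https://tirreno.internal.example/sensor/"
API_KEY = "your-api-key"

def build_event(user: str, ip: str, user_agent: str,
                url: str, event_type: str, ts: str) -> dict:
    """Assemble an event from data the application already has.
    Field names are assumed; check the developer guide's schema."""
    return {
        "userName": user,
        "ipAddress": ip,
        "userAgent": user_agent,
        "url": url,
        "eventType": event_type,
        "eventTime": ts,
    }

def send_event(event: dict) -> int:
    """POST the event to the tirreno instance from the backend."""
    req = urllib.request.Request(
        TIRRENO_URL,
        data=json.dumps(event).encode(),
        headers={"Api-Key": API_KEY, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=2) as resp:
        return resp.status
```

Because the event is built entirely from server-side request data, nothing new is exposed to the client and nothing leaves your infrastructure.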
The bot_detection preset weights the rules most relevant to identifying automated traffic. Device rules flag requests identified as bots, unknown device types, and suspicious user agent strings. Session rules score single-event sessions, HEAD requests, empty referer headers, and unauthorized requests. IP rules evaluate datacenter ranges, TOR exit nodes, commercial VPN providers, and addresses appearing in abuse lists.
Trust scores accumulate as signals combine. A single request from a datacenter IP is a weak signal. That same IP generating hundreds of single-event sessions across dozens of accounts in an hour, with bot-identified user agents and no referer headers, accumulates a score that crosses the auto-blacklisting threshold. The blacklist API then lets your application deny access before the next request reaches application logic.
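The enforcement side of that loop can be sketched as a periodic blacklist fetch plus a cheap membership check at the edge of request handling. The endpoint path and response shape below are assumptions; see tirreno's blacklist API reference for the real interface.

```python
import json
import urllib.request

# Assumed blacklist endpoint -- consult the blacklist API reference.
BLACKLIST_URL = "https://tirreno.internal.example/api/blacklist"

def fetch_blacklist(opener=urllib.request.urlopen) -> set:
    """Pull the current blacklist from the tirreno instance.
    Response shape ({"ips": [...]}) is an assumption."""
    with opener(BLACKLIST_URL, timeout=2) as resp:
        return set(json.loads(resp.read()).get("ips", []))

def is_blocked(ip: str, blacklist: set) -> bool:
    """Deny access before the request reaches application logic."""
    return ip in blacklist
```

Caching the blacklist locally (and refreshing it on a short interval) keeps the per-request cost to a set lookup rather than a network round trip.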
For legitimate bots that you want to allow, the approach is straightforward. Known good bots typically identify themselves through consistent user agent strings and originate from documented IP ranges. Your application can recognize these before sending events to tirreno, or you can adjust rule weights so that identified crawler traffic does not accumulate negative scores.
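One way to recognize good bots before sending events is to verify both the user agent string and the source IP against the operator's published ranges, so a spoofed user agent alone is not enough. The sketch below uses a single example range; real deployments should load the ranges each crawler operator publishes.

```python
import ipaddress

# Example only: 66.249.64.0/19 is one of Google's published Googlebot
# ranges. Fetch the full, current lists from each operator's docs.
KNOWN_BOT_RANGES = {
    "Googlebot": [ipaddress.ip_network("66.249.64.0/19")],
}

def is_known_good_bot(user_agent: str, ip: str) -> bool:
    """True only if the UA claims a known bot AND the IP is in that
    bot's documented ranges; a spoofed UA from elsewhere fails."""
    for name, ranges in KNOWN_BOT_RANGES.items():
        if name in user_agent:
            addr = ipaddress.ip_address(ip)
            return any(addr in net for net in ranges)
    return False
```

Traffic that passes this check can be skipped before event submission, or tagged so the corresponding rule weights do not accumulate negative scores.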
Getting started
You can evaluate tirreno's bot detection in an afternoon.
Install. Deploy a tirreno instance. The administration guide covers setup and configuration.
Send events. Point your application at the tirreno event API, starting with login attempts and your highest-traffic endpoints. tirreno expects a username with each event, so for authenticated users you send their actual identifier. For non-logged-in visitors, you can use the visitor's IP address with .* replacing the last octet (e.g. 192.168.1.*) as the username. This lets you monitor anonymous traffic patterns without needing a real identity. Each event also needs a timestamp, IP, user agent, and event type. The developer guide has the API schema and the blacklist API reference.
Apply the bot_detection preset. Open the rules page, activate the preset, and browse the activity page to see how your traffic is being scored. Look for clusters of single-event sessions, datacenter IP concentrations, and bot-identified user agents.
Tune. Adjust rule weights to fit your product's traffic patterns. Whitelist your known good bots. Set blacklist thresholds based on what you see in the data.
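The IP-derived username for non-logged-in visitors described in the steps above can be built with a one-line transform, replacing the last octet with `*`:

```python
def anonymous_username(ip: str) -> str:
    """Derive a tirreno username for an anonymous visitor by masking
    the last octet of an IPv4 address, e.g. 192.168.1.57 -> 192.168.1.*"""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address")
    return ".".join(octets[:3] + ["*"])
```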
That is enough to compare what an open-source security framework catches against what you are currently paying a vendor to filter. Download at tirreno.com/download.