Our Technology Stack
Learn about the cutting-edge technologies that power our products. This educational guide explains each technology, why we chose it, and how it works.
eBPF (Extended Berkeley Packet Filter)
Used in: AncientReport
What is eBPF?
eBPF is a revolutionary technology that allows you to run sandboxed programs directly in the Linux kernel without modifying the kernel source code or loading kernel modules. Think of it as a "JavaScript for the kernel" – safe, portable, and incredibly powerful.
Why We Use It
- ✓ Low Overhead: Unlike traditional monitoring that polls /proc, eBPF programs run only when the hooked kernel event fires, so the performance impact is typically very small.
- ✓ Deep Visibility: See every system call, network packet, and file operation without instrumenting your applications.
- ✓ Security: The kernel verifier checks every eBPF program before it is loaded and rejects programs that could crash the kernel, loop indefinitely, or access memory out of bounds.
How It Works
eBPF programs attach to kernel hooks (like system calls or network events). When triggered, they execute in a secure sandbox and can collect data, modify behavior, or make decisions in real time.
// Simplified eBPF workflow
1. Write eBPF program (C or Rust)
2. Compile to eBPF bytecode
3. Load into kernel via bpf() syscall
4. Kernel verifies and JIT-compiles the program at load time
5. Attach to hook point (e.g., tracepoint:syscalls:sys_enter_*)
6. Program runs on every event, sends data to userspace (via maps or ring buffers)
Rust
Used in: MetalHive Agent, MithrilLog Ingester, AncientReport Agent
What is Rust?
Rust is a systems programming language focused on safety, speed, and concurrency. It achieves memory safety without garbage collection, making it ideal for high-performance system software like our monitoring agents.
Why We Use It
- ✓ Memory Safety: Ownership and the borrow checker rule out use-after-free, dangling references, and data races in safe code at compile time. Safe Rust has no null references (it uses Option instead), and out-of-bounds accesses are stopped by bounds checks instead of corrupting memory.
- ✓ C-Level Performance: Rust compiles to native code with performance comparable to C/C++, without the manual memory management headaches.
- ✓ Async/Await: First-class support for asynchronous programming, a good fit for our network-heavy agents built on the Tokio runtime.
- ✓ Small Binaries: Our agents compile to single, small binaries with no runtime dependencies.
Example: Agent Memory Usage
// Our Rust agent vs. typical Java/Python agent
MetalHive Agent (Rust): ~15 MB RAM
Typical Java Agent: ~200+ MB RAM
Typical Python Agent: ~80+ MB RAM
// CPU overhead during metric collection
Rust: 0.1-0.3% CPU
Python: 2-5% CPU
Go (Golang)
Used in: MetalHive Controller, MithrilLog Admin
What is Go?
Go is a statically typed, compiled language designed at Google for building simple, reliable, and efficient software. It's particularly well-suited for networking, cloud infrastructure, and concurrent systems.
Why We Use It
- ✓ Goroutines: Lightweight threads that make concurrent programming simple. Handle thousands of connections with minimal resources.
- ✓ Fast Compilation: Compiles in seconds, not minutes. Great for rapid development cycles.
- ✓ Standard Library: Excellent built-in support for HTTP, JSON, networking, and cryptography.
- ✓ Docker Ecosystem: Docker, Kubernetes, and many other cloud-native tools are written in Go.
Perfect For Controllers
// Why Go is ideal for our Controller:
- HTTP API server: Built-in net/http is production-ready
- WebSocket handling: gorilla/websocket for SSH terminals
- Docker client: Official Docker SDK for Go
- Concurrent node management: One goroutine per node
- JSON/YAML parsing: JSON via encoding/json in the standard library; YAML via a third-party package such as gopkg.in/yaml.v3
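The "one goroutine per node" pattern listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Controller's actual code: `NodeStatus`, `checkNode`, `checkAll`, and the node names are hypothetical, and the health check is a stand-in for real per-node work.

```go
package main

import (
	"fmt"
	"sync"
)

// NodeStatus is a hypothetical result type for this sketch.
type NodeStatus struct {
	Node    string
	Healthy bool
}

// checkNode stands in for real per-node work (for example a health probe).
// Assumption: a node counts as healthy here if its name is non-empty.
func checkNode(node string) NodeStatus {
	return NodeStatus{Node: node, Healthy: node != ""}
}

// checkAll starts one goroutine per node and waits for all of them.
// Results are returned in the same order as the input slice.
func checkAll(nodes []string) []NodeStatus {
	results := make([]NodeStatus, len(nodes))
	var wg sync.WaitGroup
	for i, node := range nodes {
		wg.Add(1)
		go func(i int, node string) {
			defer wg.Done()
			// Each goroutine writes only its own slot, so no mutex is needed.
			results[i] = checkNode(node)
		}(i, node)
	}
	wg.Wait()
	return results
}

func main() {
	for _, s := range checkAll([]string{"node-a", "node-b", "node-c"}) {
		fmt.Printf("%s healthy=%v\n", s.Node, s.Healthy)
	}
}
```

Writing into a pre-sized slice by index keeps the sketch free of locks; a long-running controller would more likely keep each goroutine alive and communicate over channels.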
ClickHouse
Used in: All Products (Metrics & Log Storage)
What is ClickHouse?
ClickHouse is an open-source column-oriented database designed for Online Analytical Processing (OLAP). It's built from the ground up for blazing-fast analytics on massive datasets, making it perfect for time-series data like metrics and logs.
Why We Use It
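- ✓ Blazing Fast: Vectorized, columnar execution can scan up to billions of rows per second on suitable hardware, so aggregating a month of metrics can take milliseconds.
- ✓ Compression: Columnar storage typically compresses metrics and logs at around 10:1 (the exact ratio depends on the data), so we store more data at lower cost.
- ✓ SQL Interface: A SQL dialect close to standard SQL makes querying intuitive.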
- ✓ Time-Series Optimized: Built-in functions for time bucketing, moving averages, and aggregations.
Performance Example
-- Query: Average CPU usage per hour for 30 days
-- Data: 100 million rows
ClickHouse: 0.02 seconds
PostgreSQL: 45+ seconds
Elasticsearch: 8+ seconds
-- Storage comparison for 1TB raw data
ClickHouse: ~100 GB (10:1 compression)
PostgreSQL: ~800 GB
MongoDB: ~1.2 TB
NATS & JetStream
Used in: MetalHive (Agent ↔ Controller), AncientReport (Metrics Streaming)
What is NATS?
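NATS is a high-performance messaging system designed for cloud-native applications. JetStream, its built-in persistence layer, adds durable streams, at-least-once delivery (with exactly-once semantics available through message deduplication and double acknowledgment), and message replay.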
Why We Use It
- ✓ Ultra-Low Latency: Sub-millisecond message delivery. Essential for real-time monitoring.
- ✓ Pub/Sub + Queue: Supports both publish-subscribe (broadcasts) and work queues (load balancing).
- ✓ JetStream Persistence: Messages are durably stored, surviving restarts and enabling replay.
- ✓ Lightweight: Single binary, ~15MB, runs anywhere with minimal resources.
Our Architecture
// MetalHive messaging pattern:
Agent → NATS → "metrics.{hostname}" → Controller
Agent → NATS → "heartbeat.{hostname}" → Controller
Controller → NATS → "commands.{hostname}" → Agent
// JetStream benefits:
- If controller is down, metrics queue up
- No data loss during restarts
- Historical replay for debugging
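The subject layout above relies on NATS subjects being dot-separated tokens, where a subscription can use `*` to match exactly one token and `>` to match one or more trailing tokens. The sketch below reimplements that matching rule locally for illustration only; real code would subscribe through a NATS client library such as nats.go. `subjectFor` and `subjectMatches` are hypothetical helpers, and replacing dots in hostnames with underscores is an assumption made here so a fully qualified hostname stays a single token.

```go
package main

import (
	"fmt"
	"strings"
)

// subjectFor builds a subject such as "metrics.web01".
// Assumption: dots in the hostname are replaced with "_" so the hostname
// occupies exactly one subject token.
func subjectFor(kind, hostname string) string {
	return kind + "." + strings.ReplaceAll(hostname, ".", "_")
}

// subjectMatches reports whether subject matches a subscription pattern,
// following NATS wildcard rules: "*" matches one token, ">" matches one or
// more remaining tokens and is only valid as the last pattern token.
func subjectMatches(pattern, subject string) bool {
	p := strings.Split(pattern, ".")
	s := strings.Split(subject, ".")
	for i, tok := range p {
		if tok == ">" {
			return i == len(p)-1 && len(s) > i
		}
		if i >= len(s) {
			return false // pattern has more tokens than the subject
		}
		if tok != "*" && tok != s[i] {
			return false
		}
	}
	return len(p) == len(s)
}

func main() {
	subj := subjectFor("metrics", "web01.example.com")
	fmt.Println(subj)
	fmt.Println(subjectMatches("metrics.*", subj))   // Controller: all agents' metrics
	fmt.Println(subjectMatches("heartbeat.*", subj)) // a different subject family
}
```

With this layout a Controller can receive every agent's metrics through a single `metrics.*` subscription, while each agent listens only on its own `commands.{hostname}` subject.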
Next.js & React
Used in: All Product Dashboards
What is Next.js?
Next.js is a React framework that enables server-side rendering, static generation, and API routes. It offers a productive developer experience and the features needed for production-ready applications.
Why We Use It
- ✓ Server Components: Reduced JavaScript bundle size, faster initial page loads.
- ✓ API Routes: Backend API endpoints alongside frontend code.
- ✓ TypeScript Support: Full type safety for better code quality.
- ✓ Streaming via SSE: Server-Sent Events deliver real-time log tailing and metric updates.
Docker & Containerization
Used in: All Products (Deployment & Orchestration)
What is Docker?
Docker is a platform for developing, shipping, and running applications in containers. Containers package code and dependencies together, ensuring consistent behavior across environments.
How We Use It
- ✓ Multi-Stage Builds: Compile in one container, run in a minimal runtime container. Smaller, more secure images.
- ✓ Docker Compose: Define entire stacks (DB, NATS, services) in one file.
- ✓ Health Checks: Automated container health monitoring and restart.
- ✓ Volume Mounts: Persistent data storage for ClickHouse and configurations.
Additional Technologies
Python
AI/ML processing, trend analysis, and LLM integrations. Uses FastAPI for high-performance APIs.
DragonflyDB
Redis-compatible in-memory store; its developers report up to 25x higher throughput than Redis. Used for caching and real-time state in MetalHive.
TLS/mTLS
All agent-to-controller communication is encrypted with TLS. Mutual TLS (mTLS) authenticates both sides of each connection.
Prometheus
Auto-discovery of Prometheus exporters. Scrape existing metrics and store in ClickHouse.
Telegram Bot API
Instant alerts delivered to your team. Rich formatting with error context and quick actions.
WebSocket
Real-time SSH terminal access through the browser. Bi-directional communication for live updates.
Want to Learn More?
Explore our products to see these technologies in action.