Node Discovery

Introduction

Node discovery enables BanyanDB nodes to locate and communicate with each other in a distributed cluster. The service discovery mechanism is primarily applied in two scenarios:

  1. Request Routing: When liaison nodes need to send requests to data nodes for query/write execution.
  2. Data Migration: When the lifecycle service queries the node list to perform shard migration and rebalancing.

BanyanDB supports three discovery mechanisms to accommodate different deployment environments:

  • Etcd-based Discovery: Traditional distributed consensus approach suitable for VM deployments and multi-cloud scenarios.
  • DNS-based Discovery: Cloud-native solution leveraging Kubernetes service discovery infrastructure.
  • File-based Discovery: Static configuration file approach for simple deployments and testing environments.

This document provides guidance on configuring and operating all discovery modes.

Etcd-Based Discovery

Overview

Etcd-based discovery uses a distributed key-value store to maintain cluster membership. Each node registers itself in etcd with a time-limited lease and continuously renews that lease through a keep-alive heartbeat.

Configuration Flags

The following flags control etcd-based node discovery:

# Basic configuration
--node-discovery-mode=etcd                    # Enable etcd mode (default)
--namespace=banyandb                          # Namespace in etcd (default: banyandb)
--etcd-endpoints=http://localhost:2379        # Comma-separated etcd endpoints

# Authentication
--etcd-username=myuser                        # Etcd username (optional)
--etcd-password=mypass                        # Etcd password (optional)

# TLS configuration
--etcd-tls-ca-file=/path/to/ca.crt           # Trusted CA certificate
--etcd-tls-cert-file=/path/to/client.crt     # Client certificate
--etcd-tls-key-file=/path/to/client.key      # Client private key

# Timeouts and intervals
--node-registry-timeout=2m                    # Node registration timeout (default: 2m)
--etcd-full-sync-interval=30m                 # Full state sync interval (default: 30m)

Configuration Examples

Data Node with TLS:

banyand data \
  --node-discovery-mode=etcd \
  --etcd-endpoints=https://etcd-1:2379,https://etcd-2:2379,https://etcd-3:2379 \
  --etcd-tls-ca-file=/etc/banyandb/certs/etcd-ca.crt \
  --etcd-tls-cert-file=/etc/banyandb/certs/etcd-client.crt \
  --etcd-tls-key-file=/etc/banyandb/certs/etcd-client.key \
  --namespace=production-banyandb

Data Node with Basic Authentication:

banyand data \
  --node-discovery-mode=etcd \
  --etcd-endpoints=http://etcd-1:2379,http://etcd-2:2379 \
  --etcd-username=banyandb-client \
  --etcd-password=${ETCD_PASSWORD} \
  --namespace=production-banyandb

Etcd Key Structure

BanyanDB stores node registrations using the following pattern:

/{namespace}/nodes/{node-id}

Example: /banyandb/nodes/data-node-1:17912

Key Components:

  • Namespace: Configurable via --namespace flag (default: banyandb). Allows multiple BanyanDB clusters to coexist in the same etcd instance
  • Node ID: Format {hostname}:{port} or {ip}:{port}, uniquely identifying each node

The value stored is a Protocol Buffer serialized databasev1.Node message containing node metadata, addresses, roles, and labels.
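
For illustration, the following minimal sketch (not part of BanyanDB itself) lists the registrations under this prefix with the etcd v3 Go client and decodes each value. The databasev1 import path is an assumption based on BanyanDB's API module layout; adjust it to the actual generated package.

package main

import (
	"context"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
	"google.golang.org/protobuf/proto"

	databasev1 "github.com/apache/skywalking-banyandb/api/proto/banyandb/database/v1"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"http://localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// Keys follow /{namespace}/nodes/{node-id}; list everything under the prefix.
	resp, err := cli.Get(context.Background(), "/banyandb/nodes/", clientv3.WithPrefix())
	if err != nil {
		log.Fatal(err)
	}
	for _, kv := range resp.Kvs {
		var node databasev1.Node
		if err := proto.Unmarshal(kv.Value, &node); err != nil {
			log.Printf("skip %s: %v", kv.Key, err)
			continue
		}
		fmt.Printf("%s -> %s\n", kv.Key, node.String())
	}
}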

Node Lifecycle

Registration

When a node starts, it performs the following sequence:

  1. Lease Creation: Creates a 5-second TTL lease with etcd, automatically renewed through keep-alive heartbeats.
  2. Key-Value Registration: Uses an etcd transaction to ensure the key doesn’t already exist, preventing registration conflicts.
  3. Keep-Alive Loop: A background goroutine continuously sends heartbeats, with automatic reconnection and exponential backoff on network failures.
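
The same sequence can be sketched with the etcd v3 client. This is a minimal illustration rather than BanyanDB's actual code; it reuses the cli client from the previous example and assumes a pre-serialized nodeValue payload.

func registerNode(ctx context.Context, cli *clientv3.Client, key string, nodeValue []byte) error {
	// 1. Create a 5-second TTL lease.
	lease, err := cli.Grant(ctx, 5)
	if err != nil {
		return err
	}
	// 2. Put the key only if it does not already exist (CreateRevision == 0),
	//    binding it to the lease so it disappears when the lease expires.
	txn, err := cli.Txn(ctx).
		If(clientv3.Compare(clientv3.CreateRevision(key), "=", 0)).
		Then(clientv3.OpPut(key, string(nodeValue), clientv3.WithLease(lease.ID))).
		Commit()
	if err != nil {
		return err
	}
	if !txn.Succeeded {
		return fmt.Errorf("node %s is already registered", key)
	}
	// 3. Keep-alive loop: draining the channel keeps the lease renewed; real code
	//    also reconnects with exponential backoff when the stream breaks.
	ch, err := cli.KeepAlive(ctx, lease.ID)
	if err != nil {
		return err
	}
	go func() {
		for range ch {
			// heartbeat acknowledged
		}
	}()
	return nil
}

// Example: registerNode(ctx, cli, "/banyandb/nodes/data-node-1:17912", serializedNode)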

Health Monitoring

Node health is implicitly monitored through the lease mechanism:

  • Nodes with expired leases (>5 seconds without keep-alive) are automatically removed by etcd.
  • Presence in etcd indicates the node is alive.
  • Dead nodes disappear within 5 seconds of failure.

Event Watching

BanyanDB implements real-time cluster membership tracking:

  • Watches the node registration prefix (/{namespace}/nodes/) for PUT (add/update) and DELETE events.
  • Revision-based streaming resumes from the last known revision after reconnection.
  • Periodic full sync every 30 minutes (randomized) to detect missed events.
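
A minimal sketch of this watch loop with the etcd v3 client, reusing the client from the earlier examples (illustrative only; BanyanDB's handler wiring and reconnection logic are more involved):

func watchNodes(ctx context.Context, cli *clientv3.Client, prefix string, fromRev int64) {
	wch := cli.Watch(ctx, prefix,
		clientv3.WithPrefix(),
		clientv3.WithRev(fromRev+1)) // resume just after the last seen revision
	for wresp := range wch {
		if err := wresp.Err(); err != nil {
			log.Printf("watch error: %v", err) // caller would reconnect and resume
			return
		}
		for _, ev := range wresp.Events {
			switch ev.Type {
			case clientv3.EventTypePut: // node added or updated
				log.Printf("PUT %s", ev.Kv.Key)
			case clientv3.EventTypeDelete: // node removed (lease expiry or revocation)
				log.Printf("DELETE %s", ev.Kv.Key)
			}
			fromRev = ev.Kv.ModRevision
		}
	}
}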

Deregistration

Graceful Shutdown:

  1. Close signal triggers lease revocation.
  2. Etcd automatically deletes the node’s key.
  3. Watch streams notify all other nodes immediately.
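
In client terms, the graceful path reduces to revoking the lease; a small sketch reusing the client and lease from the registration example above:

// Revoking the lease atomically deletes every key bound to it,
// which triggers DELETE events on all watchers.
if _, err := cli.Revoke(context.TODO(), lease.ID); err != nil {
	log.Printf("lease revoke failed: %v", err)
}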

Crash Recovery:

  • Lease expires after 5 seconds without renewal.
  • Etcd automatically removes the key.
  • No manual intervention required for cleanup.

DNS-Based Discovery

Overview

DNS-based discovery provides a cloud-native alternative leveraging Kubernetes' built-in service discovery infrastructure. This eliminates the operational complexity of managing a separate etcd cluster.

How it Works with Kubernetes

  1. Create a Kubernetes headless service.
  2. Kubernetes automatically creates DNS SRV records: _<port-name>._<proto>.<service-name>.<namespace>.svc.cluster.local.
  3. BanyanDB queries these SRV records to discover pod IPs and ports.
  4. Connects to each endpoint via gRPC to fetch node metadata.

Note: No external dependencies beyond DNS infrastructure.

Configuration Flags

# Mode selection
--node-discovery-mode=dns                    # Enable DNS mode

# DNS SRV addresses (comma-separated)
--node-discovery-dns-srv-addresses=_grpc._tcp.banyandb-data.default.svc.cluster.local

# Query intervals
--node-discovery-dns-fetch-init-interval=5s  # Query interval during init (default: 5s)
--node-discovery-dns-fetch-init-duration=5m  # Initialization phase duration (default: 5m)
--node-discovery-dns-fetch-interval=15s      # Query interval after init (default: 15s)

# gRPC settings
--node-discovery-grpc-timeout=5s             # Timeout for metadata fetch (default: 5s)

# TLS configuration
--node-discovery-dns-tls=true                # Enable TLS for DNS discovery (default: false)
--node-discovery-dns-ca-certs=/path/to/ca.crt,/path/to/another.crt # Ordered CA bundle matching SRV addresses

node-discovery-dns-fetch-init-interval and node-discovery-dns-fetch-init-duration define the aggressive polling strategy during bootstrap before falling back to the steady-state node-discovery-dns-fetch-interval. All metadata fetches share the same node-discovery-grpc-timeout, which is also reused by the file discovery mode. When TLS is enabled, the CA cert list must match the SRV address order, ensuring each role (e.g., data, liaison) can be verified with its own trust bundle.

Configuration Examples

Single Zone Discovery:

banyand data \
  --node-discovery-mode=dns \
  --node-discovery-dns-srv-addresses=_grpc._tcp.banyandb-data.default.svc.cluster.local \
  --node-discovery-dns-fetch-interval=10s \
  --node-discovery-grpc-timeout=5s

Multi-Zone Discovery:

banyand data \
  --node-discovery-mode=dns \
  --node-discovery-dns-srv-addresses=_grpc._tcp.data.zone1.svc.local,_grpc._tcp.data.zone2.svc.local \
  --node-discovery-dns-fetch-interval=10s

Discovery Process

The DNS discovery operates continuously in the background:

  1. Query DNS SRV Records (every 5s during init, then every 15s)

    • Queries each configured SRV address using net.DefaultResolver.LookupSRV()
    • Deduplicates addresses across multiple SRV queries
  2. Fallback on DNS Failure

    • Uses cache from previous successful query
    • Allows cluster to survive temporary DNS outages
  3. Update Node Cache

    • For each new address: connects via gRPC, fetches node metadata, notifies handlers.
    • For removed addresses: deletes from cache, notifies handlers.
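
A minimal sketch of steps 1 and 2 using the standard library resolver mentioned above (plain Go stdlib: context, net, strconv, strings); the function name and fallback policy are illustrative:

func resolveSRV(ctx context.Context, srvNames []string, previous map[string]struct{}) map[string]struct{} {
	addrs := make(map[string]struct{})
	failed := false
	for _, name := range srvNames {
		// Full SRV names such as _grpc._tcp.banyandb-data.default.svc.cluster.local
		// can be passed directly by leaving service and proto empty.
		_, records, err := net.DefaultResolver.LookupSRV(ctx, "", "", name)
		if err != nil {
			failed = true
			continue
		}
		for _, r := range records {
			target := strings.TrimSuffix(r.Target, ".")
			// Deduplicate host:port pairs across all configured SRV names.
			addrs[net.JoinHostPort(target, strconv.Itoa(int(r.Port)))] = struct{}{}
		}
	}
	if failed && len(addrs) == 0 {
		return previous // survive a temporary DNS outage using the cached result
	}
	return addrs
}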

Kubernetes DNS TTL

Kubernetes DNS implements Time-To-Live (TTL) caching, which affects discovery responsiveness:

  • Default TTL: 30 seconds (CoreDNS default since Kubernetes 1.15)
  • Implications: Pod IP changes may not be visible immediately. BanyanDB polls every 15 seconds, potentially using cached results for 2-3 cycles

Customizing TTL:

To reduce discovery latency, edit the CoreDNS ConfigMap:

kubectl edit -n kube-system configmap/coredns

# In the Corefile, lower the TTL of the cache plugin, for example:
# cache 5

TLS Configuration

DNS-based discovery supports per-SRV-address TLS configuration for different roles.

Configuration Rules:

  1. The number of CA certificate paths must match the number of SRV addresses.
  2. caCertPaths[i] is used for connections to addresses resolved from srvAddresses[i].
  3. Multiple SRV addresses can share the same certificate file.
  4. If multiple SRV addresses resolve to the same IP:port, the certificate from the first SRV address is used.

TLS Examples:

banyand data \
  --node-discovery-mode=dns \
  --node-discovery-dns-tls=true \
  --node-discovery-dns-srv-addresses=_grpc._tcp.data.namespace.svc.local,_grpc._tcp.liaison.namespace.svc.local \
  --node-discovery-dns-ca-certs=/etc/banyandb/certs/data-ca.crt,/etc/banyandb/certs/liaison-ca.crt

Certificate Auto-Reload:

BanyanDB monitors certificate files using fsnotify and automatically reloads them when changed:

  • Hot reload without process restart
  • 500ms debouncing for rapid file modifications
  • SHA-256 validation to detect actual content changes

This enables zero-downtime certificate rotation.
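
The general pattern can be sketched as follows. This is an illustration of the behavior described above (fsnotify watch, 500ms debounce, SHA-256 comparison), not BanyanDB's actual reloader:

import (
	"crypto/sha256"
	"os"
	"time"

	"github.com/fsnotify/fsnotify"
)

// watchCert invokes onChange only when the file content actually changes.
func watchCert(path string, onChange func([]byte)) error {
	watcher, err := fsnotify.NewWatcher()
	if err != nil {
		return err
	}
	if err := watcher.Add(path); err != nil {
		watcher.Close()
		return err
	}
	go func() {
		defer watcher.Close()
		var lastSum [sha256.Size]byte
		if data, err := os.ReadFile(path); err == nil {
			lastSum = sha256.Sum256(data)
		}
		debounce := time.NewTimer(time.Hour)
		debounce.Stop()
		for {
			select {
			case ev, ok := <-watcher.Events:
				if !ok {
					return
				}
				if ev.Op&(fsnotify.Write|fsnotify.Create) != 0 {
					// Debounce rapid successive writes for 500ms.
					debounce.Reset(500 * time.Millisecond)
				}
			case <-debounce.C:
				data, err := os.ReadFile(path)
				if err != nil {
					continue
				}
				if sum := sha256.Sum256(data); sum != lastSum {
					lastSum = sum
					onChange(data) // e.g. rebuild gRPC TLS credentials
				}
			case _, ok := <-watcher.Errors:
				if !ok {
					return
				}
			}
		}
	}()
	return nil
}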

Kubernetes Deployment

Headless Service

Create a headless service for automatic DNS SRV record generation:

apiVersion: v1
kind: Service
metadata:
  name: banyandb-data
  namespace: default
  labels:
    app: banyandb
    component: data
spec:
  clusterIP: None  # Headless service
  selector:
    app: banyandb
    component: data
  ports:
  - name: grpc
    port: 17912
    protocol: TCP
    targetPort: grpc

This creates the DNS SRV record: _grpc._tcp.banyandb-data.default.svc.cluster.local

File-Based Discovery

Overview

File-based discovery provides a simple static configuration approach where nodes are defined in a YAML file. This mode is ideal for testing environments, small deployments, or scenarios where dynamic service discovery infrastructure is unavailable.

The service periodically reloads the configuration file and automatically updates the node registry when changes are detected.

How it Works

  1. Read node configurations from a YAML file on startup
  2. Attempt to connect to each node via gRPC to fetch full metadata
  3. Successfully connected nodes are added to the cache
  4. Nodes that fail to connect are skipped and will be attempted again on the next periodic file reload
  5. Reload the file at the node-discovery-file-fetch-interval cadence as a backup to fsnotify-based reloads, reprocessing every entry (including nodes that previously failed)
  6. Notify registered handlers when nodes are added or removed

Configuration Flags

# Mode selection
--node-discovery-mode=file                    # Enable file mode

# File path (required for file mode)
--node-discovery-file-path=/path/to/nodes.yaml

# gRPC settings
--node-discovery-grpc-timeout=5s             # Timeout for metadata fetch (default: 5s)

# Interval settings
--node-discovery-file-fetch-interval=5m               # Polling interval to reprocess the discovery file (default: 5m)
--node-discovery-file-retry-initial-interval=1s       # Initial retry delay for failed node metadata fetches (default: 1s)
--node-discovery-file-retry-max-interval=2m           # Upper bound for retry delay backoff (default: 2m)
--node-discovery-file-retry-multiplier=2.0            # Multiplicative factor applied between retries (default: 2.0)

node-discovery-file-fetch-interval controls the periodic full reload that acts as a safety net even if filesystem events are missed.
node-discovery-file-retry-* flags configure the exponential backoff used when a node cannot be reached over gRPC. Failed nodes are retried starting from the initial interval, multiplied by the configured factor until the max interval is reached.
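
As a worked example of how these flags interact, the schedule can be sketched as follows (illustrative, not the actual scheduler):

func retrySchedule(initial, max time.Duration, multiplier float64, attempts int) []time.Duration {
	delays := make([]time.Duration, 0, attempts)
	d := initial
	for i := 0; i < attempts; i++ {
		delays = append(delays, d)
		// Next delay grows by the multiplier, capped at the maximum interval.
		d = time.Duration(float64(d) * multiplier)
		if d > max {
			d = max
		}
	}
	return delays
}

// With the defaults (1s initial, 2.0 multiplier, 2m cap):
// retrySchedule(time.Second, 2*time.Minute, 2.0, 9)
// -> 1s, 2s, 4s, 8s, 16s, 32s, 1m4s, 2m0s, 2m0s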

YAML Configuration Format

nodes:
  - name: liaison-0
    grpc_address: 192.168.1.10:18912
  - name: data-hot-0
    grpc_address: 192.168.1.20:17912
    tls_enabled: true
    ca_cert_path: /etc/banyandb/certs/ca.crt
  - name: data-cold-0
    grpc_address: 192.168.1.30:17912

Configuration Fields:

  • name (required): Identifier for the node
  • grpc_address (required): gRPC endpoint in host:port format
  • tls_enabled (optional): Enable TLS for gRPC connection (default: false)
  • ca_cert_path (optional): Path to CA certificate file (required when TLS is enabled)
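
A minimal sketch of loading and validating this format with gopkg.in/yaml.v3; the Go struct and field names are illustrative, and only the YAML keys mirror the documented format:

package main

import (
	"fmt"
	"log"
	"os"

	"gopkg.in/yaml.v3"
)

type nodeEntry struct {
	Name        string `yaml:"name"`
	GRPCAddress string `yaml:"grpc_address"`
	TLSEnabled  bool   `yaml:"tls_enabled"`
	CACertPath  string `yaml:"ca_cert_path"`
}

type nodeFile struct {
	Nodes []nodeEntry `yaml:"nodes"`
}

func main() {
	data, err := os.ReadFile("/etc/banyandb/nodes.yaml")
	if err != nil {
		log.Fatal(err)
	}
	var f nodeFile
	if err := yaml.Unmarshal(data, &f); err != nil {
		log.Fatal(err)
	}
	for _, n := range f.Nodes {
		if n.GRPCAddress == "" {
			log.Fatalf("node %q: grpc_address is required", n.Name)
		}
		if n.TLSEnabled && n.CACertPath == "" {
			log.Fatalf("node %q: ca_cert_path is required when tls_enabled is true", n.Name)
		}
		fmt.Printf("would dial %s (tls=%v)\n", n.GRPCAddress, n.TLSEnabled)
	}
}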

Configuration Examples

Basic Configuration:

banyand data \
  --node-discovery-mode=file \
  --node-discovery-file-path=/etc/banyandb/nodes.yaml

With Custom Polling and Retry Settings:

banyand data \
  --node-discovery-mode=file \
  --node-discovery-file-path=/etc/banyandb/nodes.yaml \
  --node-discovery-file-fetch-interval=20m \
  --node-discovery-file-retry-initial-interval=5s \
  --node-discovery-file-retry-max-interval=1m \
  --node-discovery-file-retry-multiplier=1.5

Node Lifecycle

Initial and Periodic Load

When the service starts, and again on each periodic file reload, it:

  1. Reads the YAML configuration file
  2. Validates required fields (grpc_address)
  3. Attempts gRPC connection to each node
  4. Successfully connected nodes → added to cache

Error Handling

Startup Errors:

  • Missing or invalid file path → service fails to start
  • Invalid YAML format → service fails to start
  • Missing required fields → service fails to start

Runtime Errors:

  • gRPC connection failure → node skipped; retried on next file reload
  • File read error → keep existing cache, log error
  • File deleted → keep existing cache, log error

Choosing a Discovery Mode

Etcd Mode - Best For

  • Traditional VM/bare-metal deployments
  • Multi-cloud deployments requiring global service discovery
  • Environments where etcd is already deployed
  • Need for strong consistency guarantees

DNS Mode - Best For

  • Kubernetes-native deployments
  • Simplified operations (no external etcd management)
  • Cloud-native architecture alignment
  • StatefulSets with stable network identities
  • Rapid deployment without external dependencies

File Mode - Best For

  • Development and testing environments
  • Small static clusters (< 10 nodes)
  • Air-gapped deployments without service discovery infrastructure
  • Proof-of-concept and demo setups
  • Environments where node membership is manually managed
  • Scenarios requiring predictable and auditable node configuration