
Density-Modulated Boids and Stigmergic Coverage: Phase Transitions in 500k-Agent Exploration Swarms

We characterize the order–disorder transition in density-modulated Boids and task-field ACO swarms at scale, quantifying critical exponents, susceptibility, and correlation length on a half-million-agent benchmark.

Chris Adams, Brian Nguyen, Vivek Bakshi

Arboria Labs, Alpharetta, GA United States

Corresponding Author email(s): cadams@arborialabs.com, [private], [private]

ORCID - Christopher Adams


Abstract

We characterize the order–disorder transition in large-scale space-exploration swarms by integrating density-modulated Boids (DMB) and task-field Ant Colony Optimization (TF-ACO) in Gossamer (v0.2.0) and evaluating at up to $5\times10^5$ agents in the Leviathan Engine (py-0.2.0, velocity-Verlet, OpenMP-parallel). We instrument the transition with the full toolkit a phase-transition paper requires: polar order parameter ψ, susceptibility χ_ψ, correlation length ξ, fourth-order Binder cumulant U_4, and finite-size scaling fits. Critical exponents recovered from the (density × noise) sweep place DMB in the 3D Vicsek universality class within statistical error (β ≈ 0.46, γ/ν ≈ 1.18, ν ≈ 0.74; bootstrap 95% CIs in §5.1). DMB introduces adaptive rule weights based on local density and obstacle potential fields; TF-ACO uses virtual stigmergy stored as OR-Set CRDT counters rather than a hand-waved DHT-backed map, sharing the consistency abstraction with the ICCD intent CRDT (see the cross-paper appendix on Eventually Consistent State Abstractions). Across asteroid-belt survey scenarios, DMB improved ψ by 19% and reduced collision rate by 46% relative to fixed-weight Boids, while TF-ACO increased unique coverage χ by 24% at 0.7× the messages of a greedy assignment baseline. We add a learned-weight MAPPO Boids baseline trained against the same scenarios; DMB matches its ψ within 0.02 absolute with no training cost (zero gradient steps, versus ~60 GPU-hr for MAPPO), arguing that the structural prior is doing real work. The supercritical density threshold $\rho_{\text{crit}}$ at which naive policies oscillate is delayed by 1.6× under DMB+TF-ACO. We re-derive $\rho_{\text{crit}}$ using the correct 3D mean-free-path form (the previous draft used the 2D form). The noise model now includes sensing and actuator noise alongside velocity noise. Maneuver.Map analysis notebooks (phase_diagram.py, criticality.py) reproduce all figures from a single committed exp_id.


Keywords

Emergent Behavior, Boids, Ant Colony Optimization, Coverage, Multi-Agent Systems, Space Exploration


1. Introduction

1.1. Background and Motivation. Space exploration demands robust, scalable sensing over vast, obstacle-rich domains (asteroid belts, ring systems). Swarm emergent behaviors offer adaptable coordination from local rules but degrade at high densities and under noise. The gap is a principled mapping from local interaction parameters to global performance and failure regimes at scale, guiding safe operating envelopes.

1.2. Problem Statement and Research Questions/Hypotheses. We quantify and improve emergent alignment, coverage, and safety at scale under density, noise, and obstacle fields. Our hypotheses are: (H1) Density-modulated Boids (DMB) increases alignment and reduces collisions versus fixed weights; (H2) TF-ACO improves coverage with lower message overhead than greedy assignment.

1.3. Proposed Approach and Contributions. We implement DMB and TF-ACO in Gossamer and evaluate in Leviathan across asteroid-belt analogs, with Maneuver.Map orchestrating sweeps. Contributions include density-modulated Boids (adaptive rule weights by local density and obstacle potential), task-field ACO (stigmergic coverage gradients with revisit decay and noise robustness), phase-diagram characterization of alignment, coverage, and collision regimes up to $5\times10^5$ agents, and comparative analysis versus fixed-weight Boids and greedy tasking.

1.4. Paper Outline. Section 2 reviews background. Section 3 details DMB and TF-ACO. Section 4 describes the setup. Section 5 presents results. Section 6 discusses implications. Section 7 covers limitations and future work. Section 8 concludes.


2. Background

Emergence in swarms arises from local rules (Reynolds’ Boids) and stigmergy (ACO). At scale, density and noise induce phase changes affecting global order.

2.1. Swarm Intelligence Fundamentals. Boids rely on separation, alignment, and cohesion; ACO uses pheromone deposition and evaporation; PSO couples velocities for optimization. Fixed parameters struggle across densities.

2.2. Distributed Systems Principles. Local communication, gossip, and bounded-degree graphs mitigate broadcast storms; DTN patterns help when contacts are sparse.

2.3. Swarm Robotics in Space. Prior work demonstrates small-team coordination; few quantify phase transitions at >1e5 agents in sparse, obstacle-filled environments.

2.4. Existing Coordination Techniques. Fixed-weight Boids and greedy tasking are simple but brittle. Modern MARL with parameter sharing and centralized-training/decentralized-execution (CTDE) does scale — and we include a MAPPO baseline (§4.4) precisely to test the older claim that “MARL needs strong priors at this scale”. Our finding is more nuanced: MAPPO matches DMB’s ψ once trained but at substantial training cost, and shows higher variance under (density, noise) regimes outside its training distribution. The structural prior in DMB (sigmoid weight modulation) buys robustness that the learned policy doesn’t recover within an 8M-step training budget.

2.5. Positioning of Current Work. We provide scalable, parameter-robust emergent policies with empirical phase maps guiding safe density/noise envelopes.


3. Methodology / Proposed Framework / System Design

  • We detail DMB and TF-ACO and their integration in Gossamer.
  • 3.1. Conceptual Overview:
    • Agents run DMB steering and TF-ACO coverage fields; Leviathan advances physics and obstacles; Maneuver.Map orchestrates sweeps and logging.
    • Terms: order parameter ψ, coverage ratio χ, collision rate ρ_c.
  • 3.2. Density-Modulated Boids (DMB):
    • Rule weights w_sep, w_align, w_coh adapt with local density d via sigmoid schedules, reducing oscillations at high d.
      $$w_{\text{align}}(d)=\frac{w_a^0}{1+e^{\alpha(d-d_0)}},\quad w_{\text{sep}}(d)=w_s^0\left(1+e^{\beta(d-d_1)}\right)$$
    • Obstacle potential field U guides repulsion; the steering command v’ is normalized, then clamped by max acceleration and max turn-rate $\omega_{\max}$.
    • Implemented as gossamer.algorithms.flocking.dmb_step.

  • 3.3. Task-Field ACO (TF-ACO):

    • Agents deposit virtual pheromone on under-sampled cells; evaporation rate λ enforces revisit cadence; selection probability ∝ pheromone × heuristic (distance/uncertainty).
    • Pheromone storage as OR-Set CRDT counters. Each cell maintains a grow-only deposit counter (G-Counter) and an evaporation timestamp; merge is componentwise max of deposit counts plus min of evaporation timestamps. Replicas reconcile via the same composite-CRDT machinery used for ICCD intent (Theorem 1, Appendix A of the ICCD paper). This replaces the previous hand-waved “DHT-backed sparse voxel map” with a primitive that has a convergence proof and matches the inter-replica consistency story used elsewhere in the Arboria stack. See the shared appendix “Eventually Consistent State Abstractions” on the research index.
    • Replication topology. Each cell has a primary replica owned by the nearest active agent; secondary replicas are held by neighbors within comm_range. Conflicts are resolved by CRDT merge, not by leader election.
    • Messages limited to local neighbors; parameters tuned via Maneuver.Map’s Optuna-backed sweep harness.
  • 3.4. Mathematical Modeling:

    Order parameter $\psi=\frac{1}{N}\left\|\sum_k \hat v_k\right\|$; coverage $\chi=\frac{|\text{unique cells visited}|}{|\text{total cells}|}$.

    Collision rate $\rho_c=\frac{\text{collisions}}{N\,T}$.

  • 3.5. Theoretical Analysis:

    • DMB reduces eigenvalues of local linearized dynamics, dampening oscillations; TF-ACO balances exploration/exploitation via λ.
    • We define the supercritical density $\rho_{\text{crit}}$ as the point at which the 3D mean free path $\lambda$ drops below the braking distance $d_{\text{brake}}$. Using the 3D form for hard-sphere encounters with collision radius $r$: $$\lambda(\rho) = \frac{1}{\sqrt{2}\,\pi r^2 \rho}, \qquad d_{\text{brake}} = \frac{v^2}{2 a_{\max}}, \qquad \rho_{\text{crit}}:\ \lambda(\rho_{\text{crit}}) = d_{\text{brake}}.$$ Correction note. An earlier draft used $\lambda \approx 1/(\pi r^2 \rho)$, which is the 2D form. The 1,000×1,000×50 km belt analog is geometrically 3D (aspect ratio 20:1 is borderline quasi-2D, but agent motion is fully 3D). All numerical thresholds in §5 use the corrected 3D form. We provide both interpretations in Appendix F so the quasi-2D regime can be revisited if obstacle geometry confines motion to a thin shell.
    • Critical exponents. Near $\rho_{\text{crit}}$ we expect $\psi \sim |\rho - \rho_{\text{crit}}|^{\beta}$ on the ordered side, with susceptibility $\chi_\psi = N(\langle \psi^2 \rangle - \langle \psi \rangle^2) \sim |\rho - \rho_{\text{crit}}|^{-\gamma}$ and correlation length $\xi \sim |\rho - \rho_{\text{crit}}|^{-\nu}$. We measure $\beta, \gamma, \nu$ via finite-size scaling on the (density × noise) grid in §5.1 and compare against Vicsek (3D) reference values.
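
The pheromone merge rule in §3.3 (componentwise max of deposit counts, min of evaporation timestamps) can be sketched as follows. This is a minimal illustration, not Gossamer's implementation: the `PheromoneCell` class, its field names, and the per-replica dictionary layout are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PheromoneCell:
    """One cell's replicated state (hypothetical layout, not Gossamer's)."""
    # G-Counter: per-replica deposit totals; each slot only ever grows.
    deposits: dict = field(default_factory=dict)
    # Evaporation timestamp; merged with min, as stated in Section 3.3.
    evap_ts: float = float("inf")

    def deposit(self, replica_id: str, amount: int = 1) -> "PheromoneCell":
        """Return a new state with `amount` added to this replica's slot."""
        if amount < 0:
            raise ValueError("G-Counter deposits must be non-negative")
        updated = dict(self.deposits)
        updated[replica_id] = updated.get(replica_id, 0) + amount
        return PheromoneCell(updated, self.evap_ts)

    def total(self) -> int:
        """Total deposited pheromone: sum over all replica slots."""
        return sum(self.deposits.values())


def merge(a: PheromoneCell, b: PheromoneCell) -> PheromoneCell:
    """Join two replicas: componentwise max of counts, min of timestamps."""
    keys = a.deposits.keys() | b.deposits.keys()
    merged = {k: max(a.deposits.get(k, 0), b.deposits.get(k, 0)) for k in keys}
    return PheromoneCell(merged, min(a.evap_ts, b.evap_ts))


# Two replicas that diverged and then reconcile.
x = PheromoneCell().deposit("agent-1", 2)
y = PheromoneCell(evap_ts=40.0).deposit("agent-2", 3)
assert merge(x, y) == merge(y, x)            # commutative
assert merge(x, merge(x, y)) == merge(x, y)  # idempotent
```

Because max and min are each commutative, associative, and idempotent, replicas can exchange state in any order and any number of times and still agree, which is why no leader election is needed.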

4. Experimental Setup / Simulation Environment

  • All experiments are reproducible via Leviathan configs and Maneuver.Map runs.
  • 4.1. Simulation Platform:
    • Leviathan Engine py-0.2.0 with velocity-Verlet integration and OpenMP-parallel physics; Gossamer v0.2.0 policies via gossamer.algorithms.coordination.{dmb,tfaco}; Maneuver.Map orchestrating the (density × noise) grid through the Sobol sweep design (design="sobol") plus a fine-resolution Cartesian sweep around $\rho_{\text{crit}}$ for the susceptibility peak. Critical-phenomena instrumentation comes from gossamer.metrics.criticality (Binder cumulant, susceptibility, correlation length) and gossamer.metrics.info (transfer entropy across the order transition).
  • 4.2. Scenario Design:
    • Asteroid-belt analog: 1,000×1,000×50 km, $10^4$–$5\times10^5$ agents, obstacles as inverse-square repulsive fields.
    • Agent speed 5–15 m/s; neighbor radius 100–500 m.
    • Noise model (expanded). Three independent stochastic channels per step: velocity noise $\sigma_v \in [0, 0.2]$ (m/s, isotropic Gaussian on commanded velocity), sensing noise $\sigma_{\text{sens}} \in [0, 50]$ m on observed neighbor positions, and actuator noise $\sigma_a \in [0, 0.05]$ m/s² on commanded acceleration (clipped to $a_{\max}$). The Phase-1 RNG seed tree threads independent generators through each channel so noise contributions are individually ablatable.
    • Reported densities use agents/km³ to reflect kilometer-scale separation in open space.
    • Kinematic limits: max turn-rate $\omega_{\max}=0.08$ rad/s and max acceleration 0.5 m/s² to model reaction-wheel constraints.
  • 4.3. Input Data:
    • Synthetic obstacle maps and initial seeds at /nas/experiments/emergence/inputs/*.
  • 4.4. Baseline Methods / Comparative Analysis:
    • Fixed-weight Boids with Optuna-tuned weights (best of 200 TPE trials on the same scenarios) so the comparison isn’t against a strawman.
    • Greedy nearest-task assignment for the coverage axis.
    • Levy Flight biological search baseline (now reported in Table 1; previously the row was missing).
    • MAPPO learned-weight Boids (new). A 2-layer GraphMLP policy that outputs $(w_{\text{align}}, w_{\text{coh}}, w_{\text{sep}})$ as a function of local density and neighbor variance, trained with gossamer.learning.mappo against +ψ − λ·collision_rate. 8M env steps on a single L4; ~60 GPU-hr to convergence. Same simulator, same observations as DMB; the only difference is whether the weight-modulation function is hand-coded sigmoids or learned MLPs.
  • 4.5. Performance Metrics:
    • Alignment $\psi$, coverage $\chi$, collision rate $\rho_c$, message overhead (KB/agent·h), runtime/step.
  • 4.6. Experimental Procedure:
    • 15 seeds per condition; sweeps over density, neighbor radius, and noise. MLflow tracked runs at /nas/experiments/emergence.
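
The three-channel noise model of §4.2 can be illustrated with a small standard-library sketch. The seed-derivation scheme (hashing a root seed with a channel name) is a stand-in for the Phase-1 RNG seed tree, whose construction is not described here, and clipping the acceleration by its Euclidean norm is an assumption; all function names are hypothetical.

```python
import hashlib
import math
import random

A_MAX = 0.5  # m/s^2, max acceleration from Section 4.2


def child_rng(root_seed: int, channel: str) -> random.Random:
    """Derive one generator per noise channel from a single root seed.

    Hashing (root_seed, channel) is an illustrative stand-in for the
    seed tree mentioned in the text.
    """
    digest = hashlib.sha256(f"{root_seed}:{channel}".encode()).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def gaussian3(rng: random.Random, sigma: float) -> tuple:
    """Isotropic 3D Gaussian perturbation with standard deviation sigma."""
    return tuple(rng.gauss(0.0, sigma) for _ in range(3))


def add(u: tuple, v: tuple) -> tuple:
    """Componentwise sum of two 3-vectors."""
    return tuple(a + b for a, b in zip(u, v))


def clip_norm(vec: tuple, limit: float) -> tuple:
    """Scale vec down so its Euclidean norm does not exceed limit."""
    norm = math.sqrt(sum(c * c for c in vec))
    if norm <= limit or norm == 0.0:
        return vec
    return tuple(c * limit / norm for c in vec)


def noisy_step(v_cmd, a_cmd, neighbor_pos, rngs, sigma_v, sigma_sens, sigma_a):
    """Apply velocity, sensing, and actuator noise for one agent and step."""
    v_out = add(v_cmd, gaussian3(rngs["velocity"], sigma_v))
    p_obs = add(neighbor_pos, gaussian3(rngs["sensing"], sigma_sens))
    a_out = clip_norm(add(a_cmd, gaussian3(rngs["actuator"], sigma_a)), A_MAX)
    return v_out, p_obs, a_out


# Ablation example: sensing noise only (the other two sigmas set to zero).
rngs = {c: child_rng(2026, c) for c in ("velocity", "sensing", "actuator")}
v, p, a = noisy_step((10.0, 0.0, 0.0), (0.2, 0.0, 0.0), (300.0, 0.0, 0.0),
                     rngs, sigma_v=0.0, sigma_sens=25.0, sigma_a=0.0)
```

Keeping one generator per channel means that zeroing one sigma leaves the random draws of the other channels unchanged, so ablation runs stay comparable seed for seed.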

5. Results

All §5 numbers derive from exp_dmb_main_2026q2; see /research/reproducibility/dmb_tfaco.

  • 5.1. Phase transition: critical exponents and universality class.
    • Susceptibility peak. Across 4 system sizes ($N \in \{6.25\times10^4, 1.25\times10^5, 2.5\times10^5, 5\times10^5\}$), $\chi_\psi(\rho)$ peaks at a density $\rho^*(N)$ that converges to $\rho_{\text{crit}} = 0.314 \pm 0.008$ agents/km³ as $N \to \infty$.
    • Recovered exponents (DMB): $\beta = 0.46 \pm 0.04$, $\gamma/\nu = 1.18 \pm 0.06$, $\nu = 0.74 \pm 0.05$ (bootstrap 95% CIs over 5 seeds × 4 system sizes). These are statistically consistent with 3D Vicsek values ($\beta \approx 0.45$, $\gamma/\nu \approx 1.2$, $\nu \approx 0.75$).
    • Binder cumulant crossing. The $U_4 = 1 - \langle \psi^4 \rangle / (3 \langle \psi^2 \rangle^2)$ curves for the four system sizes intersect within statistical noise at $\rho^* = 0.31$ agents/km³, confirming the location of the transition.
    • Comparison to fixed Boids. Fixed Boids in the same scenarios produces a sharper transition at $\rho_{\text{crit}}^{\text{fixed}} = 0.196$ agents/km³; DMB delays $\rho_{\text{crit}}$ by 1.60× in density (0.314/0.196), consistent with the abstract claim.
  • 5.2. Alignment and Safety.
    • DMB increased ψ by 19% (relative) and reduced $\rho_c$ by 46% vs fixed Boids at density 0.3 agents/km³; $p < 0.01$ (Welch’s t-test, 15 seeds per condition).
  • 5.3. Scalability.
    • Runtime/step scales linearly with $N$ on the OpenMP path; message overhead remains $O(1)$ per agent via local neighborhoods.
  • 5.4. Robustness under the expanded noise model.
    • DMB+TF-ACO sustains χ under combined noise $(\sigma_v, \sigma_{\text{sens}}, \sigma_a)$ up to $(0.15, 25, 0.025)$ with a $<8\%$ drop; fixed Boids degrades by 25% at the same operating point. Sensing noise $\sigma_{\text{sens}}$ was previously uninstrumented and turns out to dominate the robustness gap at high density, consistent with the intuition that DMB’s density-dependent cohesion damps observation jitter that fixed weights amplify.
  • 5.5. Comparative Analysis.
    • TF-ACO achieved χ = 0.82 vs greedy χ = 0.66 at equal steps; messages 0.7× greedy.
    • DMB matches MAPPO on ψ within 0.02 absolute (DMB 0.69, MAPPO 0.71) at zero training cost. Outside the training distribution (noise $\sigma_v = 0.20$ or density 0.6 agents/km³), DMB is more stable: MAPPO’s ψ variance grows by 3.4× while DMB’s grows by 1.4×.
  • (Figures and Tables):
    • Figure 1: Phase diagram heatmap of $\psi$ over noise $\sigma_v$ and density $\rho$.
    • Figure 2: Sigmoid modulation curves for $w_{\text{sep}}$, $w_{\text{align}}$, $w_{\text{coh}}$ vs local density $d$.
    • Figure 3: Coverage efficiency vs message overhead (Greedy vs Random Walk vs TF-ACO).
    • Figure 4: Collision rate vs density with noise bands.
    • Figure 5: Runtime per step vs agent count (log-log).
    • Table 1: $\psi$ and $\rho_c$ vs policy at density 0.3 agents/km³.
    • Table 2: Parameter sweep ranges and step sizes.
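
The observables behind §5.1 follow directly from the definitions of ψ, χ_ψ, and U_4 given in the text. The sketch below is a plain-Python illustration of those three formulas, not the gossamer.metrics.criticality code; the function names and the convention that stationary agents contribute a zero heading are assumptions.

```python
import math


def polar_order(velocities):
    """psi = |sum_k v_hat_k| / N for a sequence of 3D velocity vectors.

    Assumption: an agent with zero speed has no heading and contributes
    a zero vector, but still counts toward N.
    """
    sx = sy = sz = 0.0
    for vx, vy, vz in velocities:
        speed = math.sqrt(vx * vx + vy * vy + vz * vz)
        if speed > 0.0:
            sx += vx / speed
            sy += vy / speed
            sz += vz / speed
    return math.sqrt(sx * sx + sy * sy + sz * sz) / len(velocities)


def susceptibility(psi_samples, n_agents):
    """chi_psi = N * (<psi^2> - <psi>^2) over time or seed samples."""
    m1 = sum(psi_samples) / len(psi_samples)
    m2 = sum(p * p for p in psi_samples) / len(psi_samples)
    return n_agents * (m2 - m1 * m1)


def binder_cumulant(psi_samples):
    """U_4 = 1 - <psi^4> / (3 <psi^2>^2)."""
    m2 = sum(p ** 2 for p in psi_samples) / len(psi_samples)
    m4 = sum(p ** 4 for p in psi_samples) / len(psi_samples)
    return 1.0 - m4 / (3.0 * m2 * m2)


# Perfectly aligned agents give psi = 1; opposed headings cancel to 0.
aligned = polar_order([(5.0, 0.0, 0.0), (12.0, 0.0, 0.0)])
opposed = polar_order([(5.0, 0.0, 0.0), (-5.0, 0.0, 0.0)])
```

For a sample with no fluctuations in ψ the Binder cumulant evaluates to 2/3, which is the ordered-phase limit; the crossing of U_4 curves for different N is what locates the transition.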

Table 1. Headline metrics at density 0.3 agents/km³, $N=5\times10^5$, 15 seeds, mean ± s.d. All from exp_dmb_main_2026q2.

| Policy | ψ (↑) | ρ_c (×10⁻⁴, ↓) | χ (↑) | Msgs (KB/agent·h, relative to greedy) | Train cost |
|---|---|---|---|---|---|
| Fixed Boids (Optuna-tuned) | 0.58 ± 0.02 | 7.2 ± 0.6 | 0.66 ± 0.03 | | 0 |
| Levy Flight | 0.41 ± 0.03 | 6.9 ± 0.5 | 0.71 ± 0.02 | | 0 |
| Greedy assignment | 0.55 ± 0.02 | 6.4 ± 0.4 | 0.66 ± 0.02 | 1.0× | 0 |
| MAPPO learned-weight Boids | 0.71 ± 0.04 | 4.1 ± 0.3 | 0.78 ± 0.03 | | 8M steps |
| DMB (ours) | 0.69 ± 0.02 | 3.9 ± 0.3 | 0.74 ± 0.02 | | 0 |
| DMB + TF-ACO (ours) | 0.71 ± 0.02 | 3.8 ± 0.3 | 0.82 ± 0.02 | 0.7× | 0 |
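
As a quick arithmetic check, the headline percentages quoted in the abstract and §5 can be recomputed from the Table 1 means and the two critical densities in §5.1. The helper name `rel_change` is introduced here only for illustration.

```python
def rel_change(new: float, old: float) -> float:
    """Signed relative change of `new` with respect to `old`."""
    return (new - old) / old


# Means copied from Table 1 and Section 5.1.
psi_gain = rel_change(0.69, 0.58)       # DMB vs fixed Boids, alignment
collision_cut = -rel_change(3.9, 7.2)   # DMB vs fixed Boids, collision rate
coverage_gain = rel_change(0.82, 0.66)  # TF-ACO vs greedy, coverage
rho_crit_delay = 0.314 / 0.196          # DMB vs fixed Boids, critical density

print(f"psi gain:       {psi_gain:.1%}")          # 19.0%
print(f"collision cut:  {collision_cut:.1%}")     # 45.8%
print(f"coverage gain:  {coverage_gain:.1%}")     # 24.2%
print(f"rho_crit delay: {rho_crit_delay:.2f}x")   # 1.60x
```

These round to the 19%, 46%, 24%, and 1.6× figures reported in the text.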

6. Discussion

  • DMB and TF-ACO tune emergent behavior for safer, more coherent exploration without centralized planning.
  • 6.1. Interpretation of Key Findings: H1 and H2 are supported; adaptive weights and stigmergy provide robustness at scale with modest overhead.
  • 6.2. Comparison with Related Work: Confirms Boids’ sensitivity to density; extends ACO to 3D coverage with revisit decay.
  • 6.3. Implications of the Work: Provides phase maps to set safe densities and sensing radii for belts/rings; guides parameter choices.
  • 6.4. Impact of Framework/Tools: Leviathan scaled to $5\times10^5$ agents; Gossamer enabled rapid policy variants; Maneuver.Map revealed regime shifts.

7. Limitations and Future Work

Limitations include simplified sensing and obstacle models; communication latency is not modeled here.

  • 7.1. Limitations:
    • The virtual pheromone map assumes timely local neighborhood synchronization; packet loss and CRDT conflict resolution for the map are not modeled.
    • Energy costs include kinematics but exclude radio power for TF-ACO gossip and global map maintenance.
    • Results are limited to asteroid-belt analogs and do not include solar radiation pressure or actuator faults beyond turn-rate limits.
  • 7.2. Future Work:
    • Model packet loss and CRDT-based conflict resolution for virtual stigmergy.
    • Introduce torque/propulsion faults and heterogeneity in actuation limits.
    • Evaluate Levy Flight and DMB/TF-ACO hybrids under variable gravity wells and dust-plume disturbances.
    • Expand phase maps to include communication latency and bandwidth constraints.

8. Conclusion

We showed adaptive local rules (DMB) and stigmergic coverage (TF-ACO) yield robust emergent patterns at scale, improving alignment, safety, and coverage with modest overhead. These insights provide actionable phase maps and parameter choices for space-swarm deployments in belts and rings.


Acknowledgements

We thank the Arboria Visualization Team for Maneuver.Map support.


Data and Code Availability

Input maps and configs at /nas/experiments/emergence/inputs and /nas/experiments/emergence/configs; outputs at /nas/experiments/emergence/outputs. Policies in Gossamer are proprietary; analysis scripts available upon request.


References

[1] Reynolds, C., “Flocks, Herds, and Schools: A Distributed Behavioral Model,” 1987.
[2] Dorigo, M., et al., “Ant Colony Optimization,” 1999.
[3] Bonabeau, E., Dorigo, M., Theraulaz, G., “Swarm Intelligence,” 1999.
[4] Couzin, I., et al., “Collective Memory and Spatial Sorting in Animal Groups,” 2002.
[5] Gerkey, B., Mataric, M., “A Formal Analysis of Multi-Robot Task Allocation,” 2004.
[6] Jadbabaie, A., Lin, J., Morse, A. S., “Coordination of Groups of Mobile Autonomous Agents,” 2003.


Appendix / Supplementary Material

Appendix A: DMB Weight Modulation Functions

$$w_{\text{sep}}(d) = w_{s}^{\min} + \frac{w_{s}^{\max} - w_{s}^{\min}}{1 + e^{-k_s(d - d_{\text{crit}})}}$$

$$w_{\text{coh}}(d) = \frac{w_{c}^{\max}}{1 + e^{k_c(d - d_{\text{crit}})}}$$

Here $d$ is the local neighbor density, $d_{\text{crit}}$ is the critical density inflection point, and $k_s, k_c$ control transition steepness.
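
The two modulation functions above translate into a few lines of Python. This is a sketch of the formulas only, not the gossamer dmb_step code. The value d_crit = 12.0 is taken from the Appendix C config; the weight bounds and steepness values in the example are placeholders chosen for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def w_sep(d, w_min, w_max, k_s, d_crit):
    """Separation weight: rises from w_min to w_max as d passes d_crit."""
    return w_min + (w_max - w_min) * sigmoid(k_s * (d - d_crit))


def w_coh(d, w_max, k_c, d_crit):
    """Cohesion weight: falls from w_max toward 0 as d passes d_crit."""
    return w_max * sigmoid(-k_c * (d - d_crit))


# Placeholder parameters (only d_crit comes from the Appendix C config).
D_CRIT = 12.0
for d in (4.0, 12.0, 20.0):
    sep = w_sep(d, w_min=0.5, w_max=2.0, k_s=0.5, d_crit=D_CRIT)
    coh = w_coh(d, w_max=1.0, k_c=0.5, d_crit=D_CRIT)
    print(f"d={d:5.1f}  w_sep={sep:.3f}  w_coh={coh:.3f}")
```

At d = d_crit both sigmoids sit at their midpoint, so separation is halfway between its bounds and cohesion is at half its maximum; the branch on the sign of x avoids overflow in `math.exp` for extreme densities.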

Appendix B: Task-Field ACO with Revisit Decay

$$P_{ij}^k(t) = \frac{[\tau_j(t)]^\alpha \cdot [\eta_{ij}]^\beta}{\sum_{l \in N_i} [\tau_l(t)]^\alpha \cdot [\eta_{il}]^\beta}$$

Virtual pheromone update:

$$\tau_j(t+1) = (1-\rho_{\text{evap}})\,\tau_j(t) + \Delta\tau_j(t),\quad \Delta\tau_j(t) = -\frac{Q}{T_{\text{freshness}}}$$
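
A minimal sketch of the selection rule and pheromone update above, under stated assumptions: the negative increment is applied only on steps where the cell was just visited, the pheromone level is floored at zero so the power in the selection rule stays real, and an all-zero score vector falls back to a uniform choice. None of these three conventions is specified in the text, and the function names are hypothetical.

```python
def selection_probs(tau, eta, alpha=1.0, beta=2.0):
    """P_j proportional to tau_j^alpha * eta_j^beta over candidate cells."""
    scores = [(t ** alpha) * (e ** beta) for t, e in zip(tau, eta)]
    total = sum(scores)
    if total == 0.0:
        # Assumption: with no pheromone signal anywhere, choose uniformly.
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


def update_pheromone(tau_j, rho_evap, q, t_freshness, visited, floor=0.0):
    """tau_j(t+1) = (1 - rho_evap) * tau_j(t) + delta_tau_j(t).

    delta_tau_j = -Q / T_freshness when the cell was just visited
    (assumption), else 0. The result is clamped at `floor` (assumption).
    """
    delta = -q / t_freshness if visited else 0.0
    return max(floor, (1.0 - rho_evap) * tau_j + delta)


# Two candidate cells with equal pheromone; the second has twice the
# heuristic value, so with beta = 2 it is four times as likely.
probs = selection_probs(tau=[1.0, 1.0], eta=[1.0, 2.0])

# Evaporation rate 0.01 matches pheromone_evap_rate in the Appendix C config;
# Q and T_freshness are placeholder values.
tau_next = update_pheromone(1.0, rho_evap=0.01, q=0.5, t_freshness=10.0,
                            visited=True)
```

With the negative increment, a freshly visited cell becomes less attractive than its unvisited neighbors, which is one way to read the revisit-decay behavior named in the appendix title.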

Appendix C: Leviathan Simulation Config (Snippet)

    {
      "scenario": "asteroid_belt_alpha",
      "dimensions": [1e5, 1e5, 5e4],
      "agent_count": 500000,
      "physics": {
        "integrator": "verlet",
        "dt": 0.1,
        "kinematics": {
          "max_velocity": 15.0,
          "max_turn_rate_rad": 0.08,
          "max_linear_accel": 0.5
        }
      },
      "policy": {
        "type": "DMB_TF_ACO",
        "params": {
          "d_crit": 12.0,
          "alpha_sigmoid": 0.5,
          "pheromone_evap_rate": 0.01,
          "neighbor_query_limit": 32
        }
      }
    }

Appendix D: Sweep Ranges (Summary)

We sweep density (0.05–0.6 agents/km³), neighbor radius (100–500 m), and noise $\sigma_v$ (0–0.2). Table 2 enumerates the discrete grid used for the phase maps.

Appendix E: Additional Ablations

We isolate (i) DMB without obstacle potentials, (ii) TF-ACO without revisit decay, and (iii) fixed-weight Boids with Optuna-tuned weights to separate tuning effects from adaptivity. The fixed-weight ψ peak under tuning is 0.62 ± 0.02 — still 0.07 below DMB — confirming the gain is not purely a calibration artifact.

Appendix F: 3D vs Quasi-2D Mean-Free-Path

The asteroid-belt analog domain (1,000 × 1,000 × 50 km, aspect ratio 20:1) sits between the 2D and 3D regimes. We use the 3D form $\lambda(\rho) = 1/(\sqrt{2}\,\pi r^2 \rho)$ throughout the main results because agent motion is fully 3D within the slab; the 2D form $\lambda(\rho) \approx 1/(\pi r^2 \rho)$ is appropriate only when vertical motion is suppressed by the obstacle field. We have re-measured $\rho_{\text{crit}}$ under both interpretations: the 3D form gives $\rho_{\text{crit}} = 0.314$ agents/km³ and matches the empirical Binder-cumulant crossing; the 2D form gives 0.42 agents/km³ and overshoots the observed transition by ~30%, confirming that the slab geometry does not behave quasi-2D for this collision radius.
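
The condition λ(ρ_crit) = d_brake from §3.5 can be solved in closed form, ρ_crit = 1 / (√2 π r² d_brake), and evaluated numerically. The sketch below uses the kinematic limits from §4.2 (15 m/s, 0.5 m/s²); the collision radius is not stated in the text, so the 1 km value is purely a placeholder and the resulting density is not expected to reproduce the measured 0.314 agents/km³.

```python
import math


def braking_distance_km(v_ms: float, a_max_ms2: float) -> float:
    """d_brake = v^2 / (2 a_max), converted from metres to kilometres."""
    return v_ms ** 2 / (2.0 * a_max_ms2) / 1000.0


def mean_free_path_km(rho_per_km3: float, r_km: float) -> float:
    """3D form used in the main text: 1 / (sqrt(2) * pi * r^2 * rho)."""
    return 1.0 / (math.sqrt(2.0) * math.pi * r_km ** 2 * rho_per_km3)


def rho_crit_per_km3(v_ms: float, a_max_ms2: float, r_km: float) -> float:
    """Density at which the mean free path equals the braking distance."""
    d_brake = braking_distance_km(v_ms, a_max_ms2)
    return 1.0 / (math.sqrt(2.0) * math.pi * r_km ** 2 * d_brake)


# Kinematic limits from Section 4.2; r_km is a placeholder, not a paper value.
d_brake = braking_distance_km(15.0, 0.5)          # 0.225 km
rho_c = rho_crit_per_km3(15.0, 0.5, r_km=1.0)
assert abs(mean_free_path_km(rho_c, 1.0) - d_brake) < 1e-12
```

Since d_brake grows with v², the threshold density is sensitive to cruise speed: halving the speed quadruples ρ_crit under this model.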

Appendix G: Reproducibility

All §5 numbers and figures regenerate from exp_dmb_main_2026q2 via notebooks/phase_diagram.py and notebooks/criticality.py. Provenance, seed tree, wheel SHAs at /research/reproducibility/dmb_tfaco. Symbol conventions (ψ, χ, ξ, β, γ, ν) follow the unified table in the ICCD paper, Appendix F.

