Input-Envelope-Output: Auditable Generative Music Rewards in Sensory-Sensitive Contexts

Cong Ye1,†,*, Songlin Shang2,†, Xiaoxu Ma3, Xiangbo Zhang4
1Wenzhou-Kean University, 2University of Minnesota, 3Georgia Institute of Technology (ECE), 4Georgia Institute of Technology (Math)
†Co-first author, *Corresponding author

MusiBubbles, a web-based prototype, implements the I-E-O framework to provide auditable generative music rewards in sensory-sensitive contexts.

Abstract

Generative feedback in sensory-sensitive contexts poses a core design challenge: large individual differences in sensory tolerance make it difficult to sustain engagement without compromising safety. This tension is exemplified in autism spectrum disorder (ASD), where auditory sensitivities are common yet highly heterogeneous. Existing interactive music systems typically encode safety implicitly within direct input-output (I-O) mappings, which can preserve novelty but make system behavior hard to predict or audit. We instead propose a constraint-first Input-Envelope-Output (I-E-O) framework that makes safety explicit and verifiable while preserving action-output causality. I-E-O introduces a low-risk envelope layer between user input and audio output to specify safe bounds, enforce them deterministically, and log interventions for audit. From this architecture, we derive four verifiable design principles and instantiate them in MusiBubbles, a web-based prototype. Contributions include the I-E-O architecture, MusiBubbles as an exemplar implementation, and a reproducibility package to support adoption in ASD and other sensory-sensitive domains.

Framework

I-E-O Framework Diagram

We propose a constraint-first generative reward framework that interposes an explicit low-risk envelope between user input and audio output. We term this the Input-Envelope-Output (I-E-O) paradigm, in contrast to conventional I-O mappings. The envelope declares conservative bounds on engine parameters (tempo, gain, accent ratio), enforces them via deterministic clamping, and logs all interventions for post-hoc audit.
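The declare, enforce, and log steps described above can be sketched in a few lines. This is a minimal illustration, not the MusiBubbles implementation: the parameter names (`tempo_bpm`, `gain`, `accent_ratio`), the numeric bounds, and the log record layout are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bound:
    """Closed interval [lo, hi] declared for one engine parameter."""
    lo: float
    hi: float


# Hypothetical bounds: illustrative values only, not those used in the paper.
ENVELOPE = {
    "tempo_bpm": Bound(60.0, 100.0),
    "gain": Bound(0.0, 0.6),
    "accent_ratio": Bound(0.0, 0.3),
}


def enforce(requested, envelope, log):
    """Clamp each requested parameter into its bound and log every change."""
    enforced = {}
    for name, bound in envelope.items():
        value = requested[name]
        # Deterministic clamping: the same input always yields the same output.
        clamped = min(max(value, bound.lo), bound.hi)
        if clamped != value:
            # Record only actual interventions so the log can be audited later.
            log.append({"param": name, "requested": value, "enforced": clamped})
        enforced[name] = clamped
    return enforced


audit_log = []
out = enforce(
    {"tempo_bpm": 132.0, "gain": 0.45, "accent_ratio": 0.5},
    ENVELOPE,
    audit_log,
)
print(out)        # tempo and accent ratio are clamped; gain passes through
print(audit_log)  # two intervention records
```

Because clamping is a pure function of the request and the declared bounds, the audit log alone is enough to reconstruct which requests were altered and by how much.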

MusiBubbles Prototype

MusiBubbles Interface

MusiBubbles is a reference implementation of the I-E-O framework. The game presents a web-based bubble-popping interface inspired by tablet motor training paradigms. Its five vertical lanes map to the notes of the C major pentatonic scale (C-D-E-G-A) to reduce dissonance risk. The system also includes an expert panel for real-time monitoring of session parameters and enforcement status.
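The lane-to-pitch mapping can be illustrated as follows. This is a sketch under stated assumptions: the source gives only the pitch classes C, D, E, G, A, so the octave (C4 to A4), the left-to-right lane order, and the function names here are hypothetical. The frequency formula is the standard equal-temperament conversion with A4 at 440 Hz.

```python
# Assumed register: C4..A4 as MIDI note numbers (the source does not state the octave).
LANE_TO_MIDI = [60, 62, 64, 67, 69]
NOTE_NAMES = ["C", "D", "E", "G", "A"]


def midi_to_hz(note: int) -> float:
    """Equal-temperament frequency with A4 (MIDI 69) tuned to 440 Hz."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def lane_pitch(lane: int) -> tuple[str, float]:
    """Return the note name and frequency for a lane index in 0..4."""
    if not 0 <= lane < len(LANE_TO_MIDI):
        raise ValueError(f"lane must be in 0..{len(LANE_TO_MIDI) - 1}, got {lane}")
    return NOTE_NAMES[lane], midi_to_hz(LANE_TO_MIDI[lane])


for lane in range(5):
    name, hz = lane_pitch(lane)
    print(f"lane {lane}: {name} at {hz:.2f} Hz")
```

The pentatonic scale contains no semitone steps, so notes from adjacent lanes are always at least a whole tone apart, which is why any combination of lanes avoids the harshest intervals.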

Evaluation

Evaluation Results

We evaluate envelope enforcement on N = 660 synthetic actionTrace samples in a paired comparison design. Post-enforcement values remain within their declared bounds, and the distributions of signal metrics shift monotonically with constraint tightness, indicating that the envelope is both effective and tunable.
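The two properties reported above (bound compliance and monotonic shift with tightness) can be checked with a small paired test. The sketch below does not use the authors' data or metrics: it draws uniform random gains as a hypothetical stand-in for actionTrace samples, picks illustrative upper bounds, and uses mean gain and intervention count as example metrics.

```python
import random


def clamp(x, lo, hi):
    """Deterministically restrict x to the closed interval [lo, hi]."""
    return min(max(x, lo), hi)


rng = random.Random(0)  # fixed seed so the check is reproducible
N = 660
# Hypothetical stand-in for actionTrace samples: raw requested gains in [0, 1).
raw_gain = [rng.random() for _ in range(N)]

# Hypothetical upper bounds on gain, ordered from loosest to tightest.
upper_bounds = [0.9, 0.7, 0.5, 0.3]

mean_gains = []
interventions = []
for hi in upper_bounds:
    # Paired design: every envelope is applied to the same raw inputs.
    post = [clamp(g, 0.0, hi) for g in raw_gain]
    # Property 1: every post-enforcement value lies within the declared bound.
    assert all(0.0 <= g <= hi for g in post)
    mean_gains.append(sum(post) / N)
    interventions.append(sum(p != g for p, g in zip(post, raw_gain)))

# Property 2: metrics shift monotonically as the envelope tightens.
assert all(a >= b for a, b in zip(mean_gains, mean_gains[1:]))
assert all(a <= b for a, b in zip(interventions, interventions[1:]))
print("all envelope checks passed")
```

Both monotonicity checks hold for any input under this setup, because clamping to a smaller upper bound can never raise a value or undo an intervention. Metrics computed on rendered audio would need an empirical check instead.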

BibTeX

@inproceedings{ye2026input,
  title = {Input-Envelope-Output: Auditable Generative Music Rewards in Sensory-Sensitive Contexts},
  author = {Ye, Cong and Shang, Songlin and Ma, Xiaoxu and Zhang, Xiangbo},
  booktitle = {Extended Abstracts of the 2026 CHI Conference on Human Factors in Computing Systems},
  year = {2026},
  series = {CHI EA '26},
  publisher = {ACM},
  address = {Barcelona, Spain},
  note = {Poster Track},
  doi = {10.48550/arXiv.2602.22813},
  url = {https://arxiv.org/abs/2602.22813}
}