Generative feedback in sensory-sensitive contexts poses a core design challenge: large individual differences in sensory tolerance make it difficult to sustain engagement without compromising safety. This tension is exemplified in autism spectrum disorder (ASD), where auditory sensitivities are common yet highly heterogeneous. Existing interactive music systems typically encode safety implicitly within direct input-output (I-O) mappings, which can preserve novelty but make system behavior hard to predict or audit. We instead propose a constraint-first Input-Envelope-Output (I-E-O) framework that makes safety explicit and verifiable while preserving action-output causality. I-E-O introduces a low-risk envelope layer between user input and audio output to specify safe bounds, enforce them deterministically, and log interventions for audit. From this architecture, we derive four verifiable design principles and instantiate them in MusiBubbles, a web-based prototype. Contributions include the I-E-O architecture, MusiBubbles as an exemplar implementation, and a reproducibility package to support adoption in ASD and other sensory-sensitive domains.
We propose a constraint-first generative reward framework that interposes an explicit low-risk envelope between user input and audio output. We term this the Input–Envelope–Output (I–E–O) paradigm, in contrast to conventional I–O mappings. The envelope declares conservative bounds on engine parameters (tempo, gain, accent ratio), enforces them via deterministic clamping, and logs all interventions for post-hoc audit.
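The declare, clamp, and log cycle described above can be illustrated with a minimal sketch. The parameter names follow the text (tempo, gain, accent ratio), but the numeric bounds, units, and the `Bound`, `Intervention`, and `enforce` names are assumptions made for illustration; they are not taken from the MusiBubbles code.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Bound:
    lo: float
    hi: float


# Hypothetical bounds; the source does not state numeric limits or units.
ENVELOPE: Dict[str, Bound] = {
    "tempo": Bound(60.0, 100.0),       # assumed to be beats per minute
    "gain": Bound(0.0, 0.6),           # assumed to be linear gain
    "accent_ratio": Bound(0.0, 0.3),   # assumed to be a fraction of accented notes
}


@dataclass(frozen=True)
class Intervention:
    """One audit-log record: the envelope changed a requested value."""
    param: str
    requested: float
    enforced: float


def enforce(
    requested: Dict[str, float],
    envelope: Dict[str, Bound] = ENVELOPE,
) -> Tuple[Dict[str, float], List[Intervention]]:
    """Clamp each requested parameter into its bound and log every change."""
    enforced: Dict[str, float] = {}
    log: List[Intervention] = []
    for name, value in requested.items():
        bound = envelope[name]
        clamped = min(max(value, bound.lo), bound.hi)
        if clamped != value:
            log.append(Intervention(name, value, clamped))
        enforced[name] = clamped
    return enforced, log
```

Because clamping is a pure function of the request and the declared bounds, the same input always yields the same output and the same log entries, which is what makes the behavior auditable after the fact.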
MusiBubbles is a reference implementation of the I–E–O framework. The game presents a web-based bubble-popping interface inspired by tablet-based motor training paradigms. Its five vertical lanes map to the notes of the C major pentatonic scale (C, D, E, G, A) to reduce dissonance risk. The system also includes an expert panel for real-time monitoring of session parameters and enforcement status.
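As a rough sketch of the lane-to-pitch mapping, the five lanes can be indexed into the C major pentatonic scale. The octave (C4 to A4), the zero-based lane indexing, and the function names are assumptions for illustration; the source specifies only the five pitch classes.

```python
# Five lanes mapped to C major pentatonic. The octave (MIDI 60-69, i.e. C4-A4)
# is an assumption; the source gives only the pitch classes C, D, E, G, A.
NOTE_NAMES = ("C", "D", "E", "G", "A")
PENTATONIC_MIDI = (60, 62, 64, 67, 69)


def lane_to_midi(lane: int) -> int:
    """Return the MIDI note number for a zero-based lane index (0..4)."""
    if not 0 <= lane < len(PENTATONIC_MIDI):
        raise ValueError(f"lane must be in 0..{len(PENTATONIC_MIDI) - 1}, got {lane}")
    return PENTATONIC_MIDI[lane]


def midi_to_hz(midi: int) -> float:
    """Convert a MIDI note number to frequency in equal temperament (A4 = 440 Hz)."""
    return 440.0 * 2.0 ** ((midi - 69) / 12.0)
```

Any two notes of a major pentatonic scale are at least a whole tone apart, so no pair of lanes can produce a semitone clash, which is consistent with the stated goal of reducing dissonance risk.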
We evaluate envelope enforcement using N=660 synthetic actionTrace samples under a paired comparison design. The results show that post-enforcement values remain within their declared bounds, and the distributions of signal metrics shift monotonically with constraint tightness, demonstrating the effectiveness and tunability of the I-E-O framework.
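The bound check at the heart of such an evaluation can be sketched as follows. This is not the authors' evaluation code: the sample generator, the bounds, the value ranges, and all function names are hypothetical, and only the sample count (660) and the paired raw-versus-enforced structure come from the text.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical bounds as (lo, hi); the source does not state numeric limits.
BOUNDS: Dict[str, Tuple[float, float]] = {
    "tempo": (60.0, 100.0),
    "gain": (0.0, 0.6),
    "accent_ratio": (0.0, 0.3),
}


def clamp(value: float, lo: float, hi: float) -> float:
    return min(max(value, lo), hi)


def synthetic_requests(n: int, seed: int = 0) -> List[Dict[str, float]]:
    """Generate n synthetic parameter requests; ranges are illustrative only."""
    rng = random.Random(seed)
    return [
        {
            "tempo": rng.uniform(40.0, 160.0),
            "gain": rng.uniform(0.0, 1.0),
            "accent_ratio": rng.uniform(0.0, 1.0),
        }
        for _ in range(n)
    ]


def violations(samples: List[Dict[str, float]],
               bounds: Dict[str, Tuple[float, float]]) -> List[Tuple[int, str, float]]:
    """List every (sample index, parameter, value) that falls outside its bound."""
    return [
        (i, name, value)
        for i, sample in enumerate(samples)
        for name, value in sample.items()
        if not bounds[name][0] <= value <= bounds[name][1]
    ]


# Paired design: raw[i] and enforced[i] describe the same synthetic sample.
raw = synthetic_requests(660)
enforced = [{k: clamp(v, *BOUNDS[k]) for k, v in s.items()} for s in raw]
```

Comparing `violations(raw, BOUNDS)` with `violations(enforced, BOUNDS)` gives the paired before-and-after view; tightening `BOUNDS` and rerunning would show how the enforced distributions shift with constraint tightness.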
@inproceedings{ye2026input,
title = {Input-Envelope-Output: Auditable Generative Music Rewards in Sensory-Sensitive Contexts},
author = {Ye, Cong and Shang, Songlin and Ma, Xiaoxu and Zhang, Xiangbo},
booktitle = {Extended Abstracts of the 2026 CHI Conference on Human Factors in Computing Systems},
year = {2026},
series = {CHI EA '26},
publisher = {ACM},
address = {Barcelona, Spain},
note = {Poster Track},
doi = {10.48550/arXiv.2602.22813},
url = {https://arxiv.org/abs/2602.22813}
}