
arXiv:2512.15743v1 Announce Type: new
Abstract: We present a framework for generating physically realizable assembly instructions from natural language descriptions. Unlike unconstrained text-to-3D approaches, our method operates within a discrete parts vocabulary, enforcing geometric validity, connection constraints, and buildability ordering. Using LDraw as a text-rich intermediate representation, we demonstrate that large language models can be guided with tools to produce valid step-by-step construction sequences and assembly instructions for brick-based prototypes of more than 3000 assembly parts. We introduce a Python library for programmatic model generation and evaluate buildable outputs in complex satellite, aircraft, and architectural domains. The approach aims for demonstrable scalability, modularity, and fidelity, bridging the gap between semantic design intent and manufacturable output so that physical prototyping follows directly from natural language specifications. The work proposes a novel elemental lingua franca as a key piece missing from previous pixel-based diffusion methods and computer-aided design (CAD) models, which fail to support complex assembly instructions or component exchange. Across four original designs, this "bag of bricks" method functions as a physical API: a constrained vocabulary connecting precisely oriented brick locations to a "bag of words" through which arbitrary functional requirements compile into material reality. Such a consistent and repeatable AI representation opens new design options while guiding natural language implementations in manufacturing and engineering prototyping.
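
The abstract leans on two technical ingredients: LDraw as a text-based intermediate representation and a Python library for programmatic model generation. The paper's library is not shown here; as a minimal illustrative sketch under those assumptions, the following shows how brick placements and build steps can be expressed as LDraw text, where each type-1 line places a part file with a color code, a position, and a 3x3 rotation matrix, and the "0 STEP" meta-command separates building steps. The `BrickPlacement` and `write_ldr` names are hypothetical.

```python
# Minimal sketch of emitting an LDraw (.ldr) model programmatically.
# Illustrative only; BrickPlacement and write_ldr are assumptions for this
# sketch, not the paper's published library.

from dataclasses import dataclass
from typing import List


@dataclass
class BrickPlacement:
    part: str    # LDraw part file, e.g. "3001.dat" (2x4 brick)
    color: int   # LDraw color code, e.g. 4 = red, 14 = yellow
    x: float
    y: float
    z: float

    def to_ldraw(self) -> str:
        # LDraw line type 1: "1 <color> x y z a b c d e f g h i <part>"
        # The identity rotation matrix keeps the brick in its default orientation.
        return (f"1 {self.color} {self.x} {self.y} {self.z} "
                f"1 0 0 0 1 0 0 0 1 {self.part}")


def write_ldr(path: str, steps: List[List[BrickPlacement]], name: str = "model") -> None:
    """Serialize a list of build steps into a single LDraw file."""
    lines = [f"0 {name}", f"0 Name: {name}"]
    for step in steps:
        lines.extend(p.to_ldraw() for p in step)
        lines.append("0 STEP")  # meta-command marking the end of a build step
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")


if __name__ == "__main__":
    # Two 2x4 bricks stacked: LDraw units are 20 per stud and 24 per brick
    # height, with the y-axis pointing down (negative y is up).
    base = [BrickPlacement("3001.dat", 4, 0, 0, 0)]
    top = [BrickPlacement("3001.dat", 14, 0, -24, 0)]
    write_ldr("tower.ldr", [base, top])
```

Because the whole model is plain text with a fixed line grammar, a language model or a validation tool only needs to emit or edit such lines, which is what makes the representation amenable to LLM-guided generation and to checking geometric and connection constraints.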