Smart buildings represent a significant trend in the future of the construction industry. From the user's perspective, the quality of human-computer interaction plays a vital role in realizing this vision. However, existing human-computer interaction algorithms are often limited to simple commands and fail to meet the complex and diverse needs of users. To address this issue, this paper introduces large language models (LLMs) and AI agents into smart buildings, proposing a general AI agent framework based on the ReAct strategy. The LLM serves as the system's brain, responsible for reasoning and action planning, while a tool-calling mechanism translates the LLM's plans into concrete actions. Through this framework, developers can rely on prompt engineering alone to enable the LLM to interpret user intent accurately, perform appropriate actions, and manage conversation history effectively, without any pre-training or fine-tuning. To evaluate this framework, an experiment was conducted in a virtual building, in which the proposed agent successfully completed 91% of simulated tasks. Additionally, the agent was deployed on a single-board computer to control devices in a model building, demonstrating its effectiveness in the real world. The successful operation of the agent in this environment highlights the framework's potential to build on existing IoT systems, offering a new perspective on upgrading human-computer interaction in smart buildings in the near future.
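The reason-act loop described above can be sketched in a few lines. The following is a minimal, self-contained illustration only: the stand-in LLM function, the `set_light` tool, and the Thought/Action/Observation prompt format are assumptions for demonstration, not the paper's actual implementation or prompts.

```python
# Minimal sketch of a ReAct-style agent loop for smart-building control.
# The LLM stub, tool names, and prompt format are illustrative assumptions.

def fake_llm(prompt):
    # Stand-in for a real LLM call; emits canned Thought/Action steps.
    if "Observation: lights are now on" in prompt:
        return "Thought: the task is done.\nFinal Answer: The lights are on."
    return ("Thought: I should turn on the lights.\n"
            "Action: set_light\nAction Input: on")

TOOLS = {
    # Hypothetical IoT device tool; a real system would call an IoT API here.
    "set_light": lambda state: f"lights are now {state}",
}

def react_agent(task, llm=fake_llm, max_steps=5):
    """Reason-act loop: the LLM plans, tools execute, observations feed back."""
    prompt = f"Task: {task}\n"
    for _ in range(max_steps):
        reply = llm(prompt)
        prompt += reply + "\n"          # conversation history grows each step
        if "Final Answer:" in reply:
            return reply.split("Final Answer:")[1].strip()
        # Parse the requested action and dispatch it to the matching tool.
        action = reply.split("Action:")[1].split("\n")[0].strip()
        arg = reply.split("Action Input:")[1].split("\n")[0].strip()
        observation = TOOLS[action](arg)
        prompt += f"Observation: {observation}\n"
    return "Stopped: step limit reached."

print(react_agent("Turn on the lights"))  # → The lights are on.
```

In a deployment, `fake_llm` would be replaced by a call to an actual LLM API and the tool table would wrap real IoT device endpoints; the loop structure itself is what the ReAct strategy prescribes.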
smart building; human-computer interaction; large language model; AI agent; reasoning and acting; Internet of Things