Researchers create 3D-GPT, an AI that crafts virtual worlds using ChatGPT-like prompts

Preview of a scene generated by 3D-GPT AI

Researchers have designed an artificial intelligence model capable of generating 3D worlds from simple text prompts, much like the ones people type into ChatGPT.

Teams from the Australian National University, the University of Oxford, and the Beijing Academy of Artificial Intelligence have created 3D-GPT, a system that models 3D visuals based on textual descriptions provided by the user.

According to the research paper published on arXiv, this AI produces 3D assets more efficiently and intuitively than traditional manual modeling workflows.

By leveraging multiple artificial intelligence agents, 3D-GPT distributes modeling tasks, assigning each agent a distinct function.

Diagram showing the functioning of AI agents used by 3D-GPT

As explained in the paper, a first agent, the task dispatch agent, analyzes the text instructions and works out which modeling functions are required.

A second agent, the conceptualization agent, enriches the initial description by filling in details that are missing or vague.

Finally, a modeling agent generates Python code that drives the 3D software Blender so that the resulting models match the description.
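To make the division of labor concrete, here is a minimal sketch of the three-stage flow described above. It is only an illustration: the real agents are driven by a large language model, whereas this version uses plain keyword matching, and every name in it (`FUNCTION_CATALOGUE`, `Plan`, `add_flowers`, and so on) is hypothetical rather than taken from the 3D-GPT code.

```python
from dataclasses import dataclass, field

# Hypothetical catalogue mapping a keyword to the name of a modeling function.
# These names are invented for illustration; they are not 3D-GPT's functions.
FUNCTION_CATALOGUE = {
    "flower": "add_flowers",
    "tree": "add_trees",
    "meadow": "add_terrain",
    "mist": "add_fog",
}


@dataclass
class Plan:
    prompt: str
    functions: list = field(default_factory=list)
    details: dict = field(default_factory=dict)


def dispatch_agent(prompt: str) -> Plan:
    """Stage 1: decide which modeling functions the prompt calls for."""
    text = prompt.lower()
    functions = [name for keyword, name in FUNCTION_CATALOGUE.items() if keyword in text]
    return Plan(prompt=prompt, functions=functions)


def conceptualization_agent(plan: Plan) -> Plan:
    """Stage 2: attach a more detailed description to each selected function.

    A real agent would ask an LLM to enrich the text; this stub just records
    which part of the pipeline the detail is meant for.
    """
    for name in plan.functions:
        plan.details[name] = f"appearance details for {name}"
    return plan


def modeling_agent(plan: Plan) -> str:
    """Stage 3: emit Python source that calls each selected function."""
    lines = [f"{name}(description={plan.details[name]!r})" for name in plan.functions]
    return "\n".join(lines)


prompt = (
    "a misty spring morning with dew-covered flowers "
    "in a lush meadow bordered by budding trees"
)
plan = conceptualization_agent(dispatch_agent(prompt))
generated_code = modeling_agent(plan)
print(generated_code)
```

Keeping each stage behind its own function mirrors the modular design the article describes: any one agent could be swapped for a better one without touching the other two.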

The outcome? With a prompt like “a misty spring morning with dew-covered flowers in a lush meadow bordered by budding trees,” 3D-GPT manages to craft a complete 3D scene that closely reflects the provided text.

A promising potential for the 3D world

While the visuals aren’t photorealistic, the outcomes are nothing short of impressive, showcasing how such an approach could simplify 3D content creation. Additionally, the modular architecture of 3D-GPT allows individual improvement of each AI agent, paving the way for ongoing enhancements.

Another advantage of 3D-GPT is its ability to generate code to control existing 3D software, like Blender, instead of building models from scratch.
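As a rough idea of what "code that controls Blender" can look like, the sketch below builds the text of a small Blender Python script. The helper `build_blender_script` and the scene layout are assumptions made for illustration; the script relies only on Blender's standard `bpy.ops.mesh.primitive_plane_add` and `bpy.ops.mesh.primitive_cone_add` operators, which are not necessarily the functions 3D-GPT itself calls.

```python
def build_blender_script(tree_count: int, spacing: float = 3.0) -> str:
    """Return the text of a Blender Python script (hypothetical helper).

    The script places a ground plane and a row of cones standing in for trees.
    It is meant to be run inside Blender, where the `bpy` module is available.
    """
    lines = [
        "import bpy",
        "",
        "# Ground plane standing in for the meadow.",
        "bpy.ops.mesh.primitive_plane_add(size=20.0, location=(0.0, 0.0, 0.0))",
    ]
    for i in range(tree_count):
        # Centre the row of trees around the origin along the X axis.
        x = (i - (tree_count - 1) / 2) * spacing
        lines.append("# Cone standing in for a tree.")
        lines.append(
            "bpy.ops.mesh.primitive_cone_add("
            f"radius1=0.5, depth=2.0, location=({x}, 0.0, 1.0))"
        )
    return "\n".join(lines) + "\n"


script = build_blender_script(tree_count=3)
print(script)
```

The printed script can be pasted into Blender's scripting workspace. Generating text for an existing tool, rather than raw geometry, means the result stays editable with Blender's usual features.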

Such a project could revolutionize the 3D modeling industry by making the creation process more accessible. Metaverse platforms, which depend heavily on 3D content and are still struggling to gain traction, could find tools like 3D-GPT invaluable.

It’s evident that such a system could also find its place in the video game industry or virtual reality sector.

Despite its promise, 3D-GPT is still under development and has its limitations. Its creators present it as a framework that highlights the potential of large language models (LLMs) in 3D modeling.

The 3D-GPT code will be shared on GitHub by its authors after the research paper’s acceptance.
