VLA gen UI - Generative UI Global Hackathon: Agentic Interfaces
AI Tinkerers - Prague
Hackathon Showcase

VLA gen UI

A team of CTU AI graduate and undergraduate researchers specializing in computer vision, C++ mixed-reality algorithms, and open-source generative AI prototyping.

2 members

VLA models are robot policies that execute plain-language tasks like “pick up a cube”. Why should we be limited to text descriptions of tasks? What if the model itself provided the UI for the actions it can perform? That’s what we are building.

This combines pi0.5 as the VLA model, Gemini Embodied Reasoning 1.6 to analyse the scene and identify the available actions, and a regular Gemini model to draw the UI that controls these actions.
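The three stages above could be wired together roughly as follows. This is only a minimal sketch under stated assumptions, not the team's implementation: the `Action` schema and the functions `identify_actions`, `render_ui`, and `dispatch` are hypothetical names, and the model calls are replaced by plain Python stand-ins so the data flow (scene to actions, actions to UI, UI event back to a text prompt for the policy) can be followed without any API access.

```python
from dataclasses import dataclass
from html import escape
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Action:
    """One task the scene-analysis step judged feasible (hypothetical schema)."""
    action_id: str
    label: str   # short text shown on the UI control
    prompt: str  # plain-language instruction handed to the VLA policy


def identify_actions(scene_objects: List[str]) -> List[Action]:
    """Stand-in for the embodied-reasoning model: one pick-up action per object."""
    return [
        Action(f"pick_{i}", f"Pick up {name}", f"pick up the {name}")
        for i, name in enumerate(scene_objects)
    ]


def render_ui(actions: List[Action]) -> str:
    """Stand-in for the UI-drawing model: one HTML button per action."""
    buttons = [
        f'<button data-action="{escape(a.action_id)}">{escape(a.label)}</button>'
        for a in actions
    ]
    return "<div>" + "".join(buttons) + "</div>"


def dispatch(action_id: str, actions: List[Action],
             run_policy: Callable[[str], str]) -> str:
    """Map a UI event back to its prompt and pass it to the policy callable."""
    by_id: Dict[str, Action] = {a.action_id: a for a in actions}
    if action_id not in by_id:
        raise KeyError(f"unknown action: {action_id}")
    return run_policy(by_id[action_id].prompt)


if __name__ == "__main__":
    actions = identify_actions(["red cube", "mug"])
    print(render_ui(actions))
    # The lambda is a placeholder for the real VLA policy call.
    print(dispatch("pick_0", actions, lambda prompt: f"executing: {prompt}"))
```

In this sketch the policy still receives an ordinary text prompt; the generated UI only changes how the user selects it, which keeps the UI layer independent of the VLA model.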

What you see is a working app running on a server GPU. I am unaware of anything similar in research or in practice, and it wouldn't have been possible a few weeks ago, before the new Gemini ER model.

We started from scratch, with only our prior experience to build on.

AI Tinkerers Google DeepMind Google ER model MuJoCo Tensor Ventures