Train a robotics AI model
How to train a robotics AI model with a dataset?
To train an AI model for your robot, you first need a robotics dataset. If you haven't recorded one yet, do that first.
Train GR00T-N1-2B in one click from the phosphobot dashboard
You can fine-tune Nvidia GR00T-N1-2B on your dataset right from the phosphobot dashboard. This is the easiest way to train an AI robotics model.
- Launch the phosphobot server and go to the phosphobot dashboard in your browser: http://localhost
- Create a phospho account or log in by clicking on the Sign in button in the top right corner.
- (If not already done) Add your Hugging Face token with Write authorization in the Admin Settings tab. This syncs your datasets to Hugging Face. Then record a dataset using teleoperation. Read the full guide here.
Garbage in, garbage out. In our tests, training works well with about 30 episodes. Keep the task specific, use good lighting, and keep the setup consistent across episodes.
- In the AI Training and Control section, enter the name of your dataset on Hugging Face (example: PLB/simple-lego-pickup-mono-2).
- Hit the Train AI Model button. Your model starts training; this can take up to 3 hours. Follow progress with the View trained models button.
Your trained model is uploaded to Hugging Face on the phospho-app account. Its name looks like phospho-app/YOUR_DATASET_NAME-A_RANDOM_ID.
Next up, you can start controlling your robot with the trained model.
How to train the ACT (Action Chunking Transformer) model with LeRobot?
The ACT model is a transformer-based policy that predicts short sequences ("chunks") of actions rather than one action at a time. It is trained on recorded demonstration episodes of observations and their corresponding actions.
LeRobot is a research-oriented library by Hugging Face that provides a simple interface to train AI models. It is still a work in progress, but it is already very powerful.
Follow our guide to train the ACT model on your dataset.
Train your ACT model with LeRobot
How to train the ACT model on Replicate (cloud-based training)?
Training ACT on your own machine can be difficult: video codecs, GPU acceleration, training time, and other factors often get in the way.
To help you, we provide a training script that you can run on the Replicate platform, a cloud service that provides GPU instances and scripts to train your AI models and run inference.
You'll need to provide the ID of the Hugging Face dataset to train the policy on, along with a Hugging Face token.
Train your ACT model on Replicate
How to train the Pi0 (Pi-Zero) model with the SO-100 robot arm?
Pi0 is a powerful VLA (Vision Language Action) model by Physical Intelligence. They released an open-weight model that you can fine-tune on your own dataset.
To train the Pi0 model, you need to use the openpi repository with a few tweaks for the SO-100 robot arm compatibility. We added support for the SO-100 arm in this fork of the openpi repository.
- Clone our custom openpi repository and install it:
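A sketch of the clone step; the exact fork URL is not shown here, so the repository name below is an assumption (substitute the fork linked above):

```shell
# Clone the SO-100 fork of openpi (repository URL is a placeholder;
# use the fork linked in this guide) and enter the repo
git clone https://github.com/phospho-app/openpi.git
cd openpi
```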
- Install UV if you don’t have it already:
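UV can be installed with Astral's official installer script:

```shell
# Install the uv package manager (official installer from Astral)
curl -LsSf https://astral.sh/uv/install.sh | sh
```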
- Setup the environment using uv:
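A minimal sketch of the environment setup. openpi manages its dependencies with uv, so a sync from the repository root creates the virtual environment and installs everything:

```shell
# From the openpi repository root: create the venv and install dependencies.
# GIT_LFS_SKIP_SMUDGE avoids pulling large LFS artifacts during the sync.
GIT_LFS_SKIP_SMUDGE=1 uv sync
```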
- (Optional) If you want to use Weights & Biases for tracking training metrics, log in with:
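The login command prompts for your Weights & Biases API key:

```shell
# Authenticate with Weights & Biases (prompts for your API key)
wandb login
```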
- Edit the config file src/openpi/training/config.py to change the dataset to your own. By default, the training config uses the PLB/Orange-brick-in-black-box SO-100 dataset, whose task prompt is “Put the orange brick in the black pot”.
- Start the training:
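A hedged sketch of the training launch via uv; the config name and experiment name below are placeholders (use the config you edited above):

```shell
# Launch training with uv. The config name (pi0_so100) and experiment
# name are placeholders -- replace them with your own values.
uv run scripts/train.py pi0_so100 --exp-name=my_experiment --overwrite
```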
You need a GPU with at least 70 GB of memory to train the Pi0 model. We recommend an A100 (80 GB) or an H100.
- Once training is done, push your model to Hugging Face:
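One way to upload a trained checkpoint, assuming the huggingface_hub CLI is installed; the repository name and local checkpoint path are placeholders:

```shell
# Upload the checkpoint directory to your Hugging Face account.
# Repo ID and local path below are placeholders -- adjust to your setup.
huggingface-cli upload YOUR_USERNAME/pi0-so100-finetuned ./checkpoints/my_experiment
```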
Next steps
Test the model you just trained on your robot. See the Use AI models page for more information.
Use a robotics AI model
Let a trained AI model control your robot