RL is supposed to be slow and inefficient, right? Turns out that carefully implemented model-free RL can learn to walk from scratch in the real world in under 20 minutes! We took our robot for a "walk in the park" to test out our implementation. sites.google.com/berkeley.edu/w… thread->
Careful implementation of actor-critic methods can train very fast if we set up the task properly. We trained the robot entirely in the real world in both indoor and outdoor locations, each time learning policies in ~20 min.
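For readers unfamiliar with the actor-critic family: the critic in methods like SAC (the family the released code builds on) is trained toward a soft Bellman target. This is a minimal sketch of that target computation, not the authors' actual implementation; the function name and toy numbers are illustrative only.

```python
import numpy as np

def soft_q_target(reward, done, q1_next, q2_next, log_prob_next,
                  gamma=0.99, alpha=0.2):
    """Soft Bellman target for a SAC-style critic:
    y = r + gamma * (1 - done) * (min(Q1', Q2') - alpha * log pi(a'|s')).
    Taking the min of two target critics curbs overestimation;
    the -alpha*log_prob term adds the entropy bonus."""
    next_v = np.minimum(q1_next, q2_next) - alpha * log_prob_next
    return reward + gamma * (1.0 - done) * next_v

# Toy check: a terminal transition (done=1) bootstraps nothing,
# so the target is just the reward.
print(soft_q_target(reward=1.0, done=1.0, q1_next=5.0,
                    q2_next=4.0, log_prob_next=-1.0))  # 1.0
```

In practice the speedups come less from the update rule itself and more from implementation details around it (update-to-data ratio, normalization, action smoothing), which the paper discusses.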
Here are some examples of training (more videos on the website). Note that the policy on each terrain is different, dealing with dense mulch, soft surfaces, etc. With these training speeds, the robot adapts in real time.
By Laura Smith & @ikostrikov. arXiv: arxiv.org/abs/2208.07860 Website: sites.google.com/berkeley.edu/w… From here, exploring lifelong learning, continual adaptation, and other cool real-world RL applications will be really exciting!
BTW, much as I want to say we had some brilliant idea that made this possible, truth is that the key is really just good implementation, so the takeaway is "RL done right works pretty well". Though I am *very* impressed how well @ikostrikov & Laura made this work.
@svlevine @ikostrikov Can you comment on examples of what kinds of things were implemented better than in traditional codebases (e.g. stable_baselines3)?
@dav_ell @ikostrikov We'll get the code released shortly so you can see for yourself :) but there is a bit of discussion in the arxiv paper
@svlevine @dav_ell We've partially released our code here: github.com/ikostrikov/wal… The code for interfacing a real robot is coming soon.