Commit 66b9ceb

added results
1 parent 0ad3335 commit 66b9ceb

3 files changed: +3853 -1 lines changed

index.html

Lines changed: 33 additions & 1 deletion
@@ -230,7 +230,39 @@ <h2 class="title is-3">Architecture</h2>
 
 <!-- TODO: show use cases -->
 
-<!-- TODO: show results -->
+<!-- TODO: explain results -->
+
+<!-- Results -->
+<section class="section hero is-small">
+<div class="container is-max-desktop">
+<div class="columns is-centered has-text-centered">
+<div class="column">
+<div class="hero-body">
+<h2 class="title is-3">Results</h2>
+<img src="static/images/bar_plots_combined.svg" alt="Success rate bar plots."
+width="100%">
+<div class="content has-text-justified">
+Fig. 2: Success rate plots of different VLA comparisons.
+<i>Left:</i>
+The Pi Zero model fine-tuned on four datasets from different setups.
+Each fine-tuning dataset contains fewer than 150 episodes, and each model is evaluated on 50 rollouts.
+<i>Center:</i>
+Different models fine-tuned on 143 episodes from our FR3 setup (real), down-sampled to 5 Hz, and evaluated on the real-world setup (30 rollouts) and the replicated simulated scene (100 rollouts).
+<i>Bottom:</i>
+Different mixes of synthetic and real data, evaluated on the real-world setup (30 rollouts) and the simulated scene (100 rollouts).
+Each number denotes how many episodes from the respective domain are used in the training mix.
+</div>
+<img src="static/images/success_rate_sim_real.svg" alt="Success rate plot over training checkpoints."
+width="100%">
+<div class="content has-text-justified">
+Fig. 3: Evaluation success rates for each checkpoint throughout training, in the real and the replicated simulated domain. Each checkpoint is evaluated on 20 real and 100 simulated rollouts. <i>Left:</i> Trained on 143 episodes from our FR3 dataset. <i>Right:</i> Trained on a mix of 143 episodes from our FR3 dataset and 500 episodes from the scripted dataset of the replicated simulated domain.
+</div>
+</div>
+</div>
+</div>
+</div>
+</section>
+<!-- End Results -->
 
 
 