Reproduce the reconstruction result in Fig. 12
Hi, I am trying to reproduce the LIBERO result in Fig. 12. I first normalized the actions to [-1, 1] using the 1st and 99th percentiles, then fed them to the tokenizer in chunks of size 10. The reconstruction error I get is about 8e-3. How can I reproduce the ~1e-3 error reported in Fig. 12 for the LIBERO dataset? Thanks!
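Concretely, my normalization looks roughly like this (a simplified sketch; action_chunks is a placeholder for my stacked chunks of shape (num_chunks, 10, action_dim)):

import numpy as np
# Per-dimension 1st/99th percentiles over all chunks and timesteps,
# then rescale into [-1, 1].
q01 = np.percentile(action_chunks, 1, axis=(0, 1))
q99 = np.percentile(action_chunks, 99, axis=(0, 1))
action_chunks_norm = 2 * (action_chunks - q01) / (q99 - q01) - 1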
Hi @kiddyna - I'm not able to reproduce your error with the following code applied to the raw LIBERO dataset:
import os
from transformers import AutoProcessor
import numpy as np
import h5py
fast_processor = AutoProcessor.from_pretrained("physical-intelligence/fast", trust_remote_code=True)
chunk_size = 10
actions_extracted = []
LIBERO_DIR = "/data/libero"
# Extract action chunks
for base_dir in ["libero_90", "libero_10", "libero_goal", "libero_object", "libero_spatial"]:
    base_dir = os.path.join(LIBERO_DIR, base_dir)
    for filename in os.listdir(base_dir):
        with h5py.File(os.path.join(base_dir, filename), "r") as datafile:
            for demo in datafile['data']:
                demo = datafile['data'][demo]
                actions = demo['actions']
                for t in range(actions.shape[0] - chunk_size):
                    actions_extracted.append(actions[t:t+chunk_size])
actions_extracted = np.array(actions_extracted)
# Normalization
p01 = np.percentile(actions_extracted, 1, axis=(0, 1))
p99 = np.percentile(actions_extracted, 99, axis=(0, 1))
actions_extracted_normalized = 2 * (actions_extracted - p01) / (p99 - p01) - 1
# Only test on a subset of the data, this is awfully slow otherwise
test_set = actions_extracted_normalized[::100]
tokens_encoded = fast_processor(test_set)
chunk = fast_processor.decode(tokens_encoded)
print("Reconstruction MSE:", np.square((chunk - test_set)).mean())
I get a reconstruction MSE of 0.00041, much lower than what you're seeing. Would you mind sending me the code you used to extract the chunks, or a minimal reproduction of the issue?
Thanks for your reply! It turns out I was computing the reconstruction error with an L1 loss instead of an L2 loss. With the L2 (MSE) loss, my reconstruction error matches the result reported in Fig. 12.
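For anyone hitting the same discrepancy, the two metrics differ roughly like this (a minimal sketch reusing chunk and test_set from the script above):

import numpy as np
# L1 (mean absolute error) vs. L2 (mean squared error) on the same reconstruction.
# For per-element errors well below 1, squaring makes the MSE much smaller than the MAE,
# which can account for an ~8e-3 L1 error alongside an ~1e-3 or lower MSE.
l1_error = np.abs(chunk - test_set).mean()
l2_error = np.square(chunk - test_set).mean()
print(f"L1 (MAE): {l1_error:.1e}  |  L2 (MSE): {l2_error:.1e}")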