Reproduce the reconstruction result in Fig. 12
Hi, I am trying to reproduce the LIBERO result in Fig. 12. I first normalized the actions to [-1, 1] using the 1st and 99th percentiles, then fed them to the tokenizer in chunks of size 10. The reconstruction error I get is about 8e-3. How can I reproduce the ~1e-3 error reported in Fig. 12 for the LIBERO dataset? Thanks!
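Concretely, my normalization looks roughly like this (a simplified sketch; action_chunks is a placeholder for my stacked chunks of shape (num_chunks, 10, action_dim)):

import numpy as np
# Per-dimension 1st/99th percentiles over all chunks and timesteps,
# then rescale into [-1, 1].
q01 = np.percentile(action_chunks, 1, axis=(0, 1))
q99 = np.percentile(action_chunks, 99, axis=(0, 1))
action_chunks_norm = 2 * (action_chunks - q01) / (q99 - q01) - 1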
Hi @kiddyna - I'm not able to reproduce your error with the following code applied to the raw LIBERO dataset:
import os
from transformers import AutoProcessor
import numpy as np
import h5py
fast_processor = AutoProcessor.from_pretrained("physical-intelligence/fast", trust_remote_code=True)
chunk_size = 10
actions_extracted = []
LIBERO_DIR = "/data/libero"
# Extract action chunks
for base_dir in ["libero_90", "libero_10", "libero_goal", "libero_object", "libero_spatial"]:
    base_dir = os.path.join(LIBERO_DIR, base_dir)
    for filename in os.listdir(base_dir):
        with h5py.File(os.path.join(base_dir, filename), "r") as datafile:
            for demo in datafile['data']:
                demo = datafile['data'][demo]
                actions = demo['actions']
                for t in range(actions.shape[0] - chunk_size):
                    actions_extracted.append(actions[t:t+chunk_size])
actions_extracted = np.array(actions_extracted)
# Normalization
p01 = np.percentile(actions_extracted, 1, axis=(0, 1))
p99 = np.percentile(actions_extracted, 99, axis=(0, 1))
actions_extracted_normalized = 2 * (actions_extracted - p01) / (p99 - p01) - 1
# Only test on a subset of the data, this is awfully slow otherwise
test_set = actions_extracted_normalized[::100]
tokens_encoded = fast_processor(test_set)
chunk = fast_processor.decode(tokens_encoded)
print("Reconstruction MSE:", np.square((chunk - test_set)).mean())
I get a reconstruction MSE of 0.00041, much lower than what you're seeing. Would you mind sending me the code you used to extract the chunks, or a minimal reproduction of the issue?
Thanks for your reply! It turns out I was computing the reconstruction error with an L1 loss instead of an L2 loss. With the L2 (MSE) loss, my reconstruction error matches the result reported in Fig. 12.
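For anyone hitting the same discrepancy, the two metrics differ roughly like this (a minimal sketch reusing chunk and test_set from the script above):

import numpy as np
# L1 (mean absolute error) vs. L2 (mean squared error) on the same reconstruction.
# For per-element errors well below 1, squaring makes the MSE much smaller than the MAE,
# which can account for an ~8e-3 L1 error alongside an ~1e-3 or lower MSE.
l1_error = np.abs(chunk - test_set).mean()
l2_error = np.square(chunk - test_set).mean()
print(f"L1 (MAE): {l1_error:.1e}  |  L2 (MSE): {l2_error:.1e}")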