HackerRank-ASTRA: Evaluating Correctness & Consistency of Large Language Models on cross-domain multi-file project problems Paper • 2502.00226 • Published 12 days ago