https://huggingface.co/spaces/gaia-benchmark/leaderboard is down
Greetings! I noticed an error on the leaderboard:
runtime error
Exit code: 1. Reason: 6.0k [00:00<00:00, 39.0kB/s]
Generating test split: 0%| | 0/4022 [00:00<?, ? examples/s]
Generating test split: 100%|██████████| 4022/4022 [00:00<00:00, 917967.61 examples/s]
Generating validation split: 0%| | 0/236 [00:00<?, ? examples/s]
Generating validation split: 100%|██████████| 236/236 [00:00<00:00, 125409.32 examples/s]
Map: 0%| | 0/3242 [00:00<?, ? examples/s]
Map: 100%|██████████| 3242/3242 [00:00<00:00, 15401.86 examples/s]
trust_remote_code is not supported anymore.
Please check that the Hugging Face dataset 'gaia-benchmark/GAIA_internal' isn't based on a loading script and remove trust_remote_code.
If the dataset is based on a loading script, please ask the dataset author to remove it and convert it to a standard format like Parquet.
GAIA_internal.py: 0%| | 0.00/4.27k [00:00<?, ?B/s]
GAIA_internal.py: 100%|██████████| 4.27k/4.27k [00:00<00:00, 20.6MB/s]
Traceback (most recent call last):
  File "/app/app.py", line 67, in <module>
    gold_dataset = load_dataset(INTERNAL_DATA_DATASET, f"{YEAR_VERSION}_all", token=TOKEN, trust_remote_code=True)
  File "/usr/local/lib/python3.13/site-packages/datasets/load.py", line 1688, in load_dataset
    builder_instance = load_dataset_builder(
        path=path,
        ...<10 lines>...
        **config_kwargs,
    )
  File "/usr/local/lib/python3.13/site-packages/datasets/load.py", line 1315, in load_dataset_builder
    dataset_module = dataset_module_factory(
        path,
        ...<5 lines>...
        cache_dir=cache_dir,
    )
  File "/usr/local/lib/python3.13/site-packages/datasets/load.py", line 1207, in dataset_module_factory
    raise e1 from None
  File "/usr/local/lib/python3.13/site-packages/datasets/load.py", line 1167, in dataset_module_factory
    raise RuntimeError(f"Dataset scripts are no longer supported, but found {filename}")
RuntimeError: Dataset scripts are no longer supported, but found GAIA_internal.py
Is anyone else seeing this issue as well?
Thanks for flagging. Two PRs up that should fix this together:
- Space: https://huggingface.co/spaces/gaia-benchmark/leaderboard/discussions/98 (drops `trust_remote_code=True` from all `load_dataset` calls)
- Dataset: https://huggingface.co/datasets/gaia-benchmark/GAIA_internal/discussions/1 (converts `gaia-benchmark/GAIA_internal` from a loading script to Parquet, mirroring the layout of the public `gaia-benchmark/GAIA`)
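For anyone hitting the same error in their own Space, here is a rough sketch of what the Space-side change amounts to. It is not the code from the PR: `without_trust_remote_code` and `load_gold_dataset` are hypothetical helper names, and the keyword arguments are only shaped like the call in the traceback above.

```python
def without_trust_remote_code(load_kwargs):
    """Return a copy of the keyword arguments with trust_remote_code removed."""
    return {k: v for k, v in load_kwargs.items() if k != "trust_remote_code"}


def load_gold_dataset(repo_id, config_name, **load_kwargs):
    """Load a dataset config without passing trust_remote_code (hypothetical helper)."""
    # Imported lazily so the sketch can be read and tested without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo_id, config_name, **without_trust_remote_code(load_kwargs))


# Hypothetical keyword arguments, shaped like the failing call in app.py.
old_kwargs = {"token": None, "trust_remote_code": True}
new_kwargs = without_trust_remote_code(old_kwargs)
print(new_kwargs)  # {'token': None}
```

Note that removing the flag alone does not clear the `RuntimeError` in the log: that one is raised because the dataset repo still contains `GAIA_internal.py`, which is why the dataset-side conversion is needed too.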
Both need to land for the Space to boot under datasets 4.x. Verified locally that row counts match the leaderboard's reference (test=301, validation=165, plus the expected per-level breakdown). Will ping back here once merged.
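If you want to repeat the row-count check after a conversion like this, something along these lines should do. It is a minimal sketch, not the exact script used here: `split_size_mismatches` is a hypothetical helper, and the expected counts are the totals quoted above (the per-level breakdown is not covered).

```python
# Totals quoted in this thread; the per-level breakdown is not checked here.
EXPECTED_ROWS = {"test": 301, "validation": 165}


def split_size_mismatches(actual, expected):
    """Return {split: (actual_rows, expected_rows)} for every split that differs."""
    mismatches = {}
    for split, want in expected.items():
        got = actual.get(split)  # None if the split is missing entirely
        if got != want:
            mismatches[split] = (got, want)
    return mismatches


# With a loaded DatasetDict `ds`, the actual counts could be collected as
# {name: split.num_rows for name, split in ds.items()}.
actual_rows = {"test": 301, "validation": 165}  # stand-in values for illustration
print(split_size_mismatches(actual_rows, EXPECTED_ROWS))  # {}
```

An empty dict means every expected split is present with the expected number of rows.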