2026-05-25 · data · code

CityRep: A Unified Benchmark for Urban Representations Across Cities, Tasks, and Modalities

Junyuan Liu, Xinglei Wang, Zichao Zeng, Jiazhuang Feng, Quan Qin, Ilya Ilyankou, Guangsheng Dong, Tao Cheng


Key claim

CityRep enables fair evaluation of urban representation models.

CityRep is a new benchmark for evaluating urban representation learning. It mitigates spatial leakage (test locations that lie close to training locations) and supports fair comparisons across cities and tasks. Its key finding is that measured performance depends heavily on which evaluation split is used, which underscores the need for rigorous benchmarking in this field.
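To see why the choice of split can matter, here is a toy illustration. It is not CityRep's code or data: the 20 x 20 grid, the smooth synthetic target, the 1-nearest-neighbour predictor on coordinates, the held-out block list, and every name below are assumptions chosen for illustration. An interleaved split stands in for a random one so that the run is deterministic.

```python
"""Toy illustration of spatial leakage; synthetic data, not the CityRep pipeline."""
from math import hypot

GRID = 20   # hypothetical 20 x 20 grid of locations
BLOCK = 5   # hypothetical block side length, in cells

points = [(i, j) for i in range(GRID) for j in range(GRID)]


def target(p):
    # Spatially smooth synthetic target: nearby cells have similar values.
    return p[0] + p[1]


def nn_mae(train, test):
    """Mean absolute error of a 1-nearest-neighbour predictor on coordinates."""
    total = 0.0
    for q in test:
        nearest = min(train, key=lambda p: hypot(p[0] - q[0], p[1] - q[1]))
        total += abs(target(nearest) - target(q))
    return total / len(test)


def split(is_test):
    """Partition the grid into (train, test) using a predicate on cells."""
    train = [p for p in points if not is_test(p)]
    test = [p for p in points if is_test(p)]
    return train, test


# Non-spatial split: test cells are scattered, so every test cell has
# training cells directly next to it.
interleaved = split(lambda p: (p[0] + 2 * p[1]) % 5 == 0)

# Spatial block split: whole 5 x 5 blocks are held out together
# (the block list is an arbitrary, hand-picked assumption).
HELD_OUT_BLOCKS = {(0, 0), (1, 2), (3, 1)}
blocked = split(lambda p: (p[0] // BLOCK, p[1] // BLOCK) in HELD_OUT_BLOCKS)

interleaved_mae = nn_mae(*interleaved)
blocked_mae = nn_mae(*blocked)
print(f"interleaved split MAE: {interleaved_mae:.2f}")
print(f"block split MAE:       {blocked_mae:.2f}")
```

Under these assumptions the block split reports the higher error (about 1.73 against 1.00), because held-out cells no longer have training cells immediately beside them. The same predictor looks better or worse depending only on how the data were split.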

In plain English

Novelty

8.0/10

CityRep introduces a comprehensive evaluation framework for urban representation learning, addressing limitations in current benchmarks.

Reliability

8.0/10

The study evaluates multiple models across various cities and tasks, providing solid evidence for its claims.

Deep reliability assessment

The methodology supports a more robust evaluation of urban representations by using spatially structured splits to mitigate spatial leakage, but it may overclaim generalizability across all urban contexts without considering local variations. The benchmark's extensibility and reproducibility are strong points, yet the results may not fully capture the complexities of urban environments.
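A spatially structured split of the kind described above can be sketched as follows. This is a minimal, generic grid-block split and not CityRep's implementation: the function name `spatial_block_split`, its parameters, and the square-block scheme are assumptions made for illustration.

```python
"""Minimal spatial block split; a sketch, not CityRep's implementation."""
import random
from collections import defaultdict


def spatial_block_split(coords, block_size, test_fraction=0.2, seed=0):
    """Assign whole grid blocks to train or test.

    coords: list of (x, y) pairs in any projected unit (for example metres).
    block_size: block side length in the same unit (an assumed parameter).
    Returns (train_indices, test_indices).
    """
    # Group sample indices by the square block that contains them.
    blocks = defaultdict(list)
    for idx, (x, y) in enumerate(coords):
        blocks[(int(x // block_size), int(y // block_size))].append(idx)

    # Shuffle block ids (not samples) with a fixed seed for reproducibility.
    block_ids = sorted(blocks)
    random.Random(seed).shuffle(block_ids)
    n_test = max(1, round(test_fraction * len(block_ids)))
    test_blocks = set(block_ids[:n_test])

    # Every sample follows its block, so no block straddles the two sets.
    train, test = [], []
    for block_id, members in blocks.items():
        (test if block_id in test_blocks else train).extend(members)
    return sorted(train), sorted(test)


# Usage on a synthetic 10 x 10 lattice with 100-unit spacing.
demo_coords = [(x * 100.0, y * 100.0) for x in range(10) for y in range(10)]
demo_train, demo_test = spatial_block_split(demo_coords, block_size=250.0)
print(len(demo_train), len(demo_test))
```

Because blocks rather than individual samples are shuffled, the realized test share only approximates `test_fraction` when blocks hold different numbers of samples. Samples on either side of a block boundary can still be neighbours, so a buffer between train and test blocks is a common refinement.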

Reproducibility

Yes, the paper provides open source code and datasets for the CityRep benchmark.

Discussion questions

1. How might the findings change if evaluated in cities with significantly different urban structures or data availability?
2. What are the implications of using this benchmark for urban planning and policy-making in diverse geographical contexts?
3. What would be the impact on the results if a different spatial split methodology were employed?

Key figure

Figure 1 illustrates the framework of the CityRep Benchmark, showing how it standardizes the evaluation of heterogeneous urban representations across multiple cities and tasks using spatial block splits.

Benchmark results

CityRep · F1: 0.346 · vs TESSERA: N/A · SOTA

CityRep · R2: 0.631 · vs TESSERA: N/A · SOTA

CityRep · R2: 0.695 · vs TESSERA: N/A · SOTA

GitHub: 1 repo

inwind0212/CityRep (Official)