2026-05-25 data code

CITYREP: A Unified Benchmark for Urban Representations Across Cities, Tasks, and Modalities

Junyuan Liu, Xinglei Wang, Zichao Zeng, Jiazhuang Feng, Quan Qin, Ilya Ilyankou, Guangsheng Dong, Tao Cheng

Read on arXiv →

Key claim

CityRep enables fair evaluation of urban representation models.

CityRep is a new benchmark for evaluating urban representation learning that mitigates spatial leakage and supports fair comparisons across different cities and tasks. The key finding is that performance varies significantly based on the evaluation split used, highlighting the importance of rigorous benchmarking in this field.

Novelty
8.0/10

CityRep introduces a comprehensive evaluation framework for urban representation learning, addressing limitations in current benchmarks.

Reliability
8.0/10

The study evaluates multiple models across various cities and tasks, providing solid evidence for its claims.

Deep reliability assessment

The methodology supports a more robust evaluation of urban representations by using spatially structured splits to mitigate spatial leakage, but it may overclaim generalizability across all urban contexts without considering local variations. The benchmark's extensibility and reproducibility are strong points, yet the results may not fully capture the complexities of urban environments.
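
To make the idea of a spatially structured split concrete, here is a minimal sketch of one common approach: group samples into square grid blocks and hold out whole blocks for testing, so that near-identical neighbours do not end up on both sides of the split. This is an illustration only, not CityRep's actual procedure; the function name `spatial_block_split`, the block size, and the hold-out fraction are all assumptions made for the example.

```python
import random
from collections import defaultdict


def spatial_block_split(coords, block_size, test_fraction=0.2, seed=0):
    """Hypothetical helper: assign whole grid blocks to train or test.

    coords: list of (x, y) positions in projected units (e.g. metres).
    block_size: side length of a square block, in the same units.
    Returns (train_idx, test_idx), lists of indices into coords.
    """
    # Group sample indices by the grid block they fall into.
    blocks = defaultdict(list)
    for i, (x, y) in enumerate(coords):
        key = (int(x // block_size), int(y // block_size))
        blocks[key].append(i)

    # Shuffle the block ids reproducibly, then hold out whole blocks.
    keys = sorted(blocks)
    random.Random(seed).shuffle(keys)
    n_test = max(1, round(len(keys) * test_fraction))
    test_keys = set(keys[:n_test])

    train_idx, test_idx = [], []
    for key, members in blocks.items():
        (test_idx if key in test_keys else train_idx).extend(members)
    return sorted(train_idx), sorted(test_idx)


# Example: a 10 x 10 lattice of points spaced 100 units apart.
coords = [(x * 100.0, y * 100.0) for x in range(10) for y in range(10)]
train_idx, test_idx = spatial_block_split(coords, block_size=250.0)
print(len(train_idx), len(test_idx))
```

A plain grid split like this still leaves train and test samples adjacent along block boundaries; whether a benchmark adds a buffer zone or uses a different blocking scheme is a design choice this sketch does not address.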

Reproducibility

Yes. The paper provides open-source code and datasets for the CityRep benchmark.

Discussion questions

  1. How might the findings change if evaluated in cities with significantly different urban structures or data availability?
  2. What are the implications of using this benchmark for urban planning and policy-making in diverse geographical contexts?
  3. What would be the impact on the results if a different spatial split methodology was employed?

Key figure

Figure 1 illustrates the framework of the CityRep Benchmark, showing how it standardizes the evaluation of heterogeneous urban representations across multiple cities and tasks using spatial block splits.

Benchmark results

CityRep F1: 0.346 vs TESSERA N/A SOTA
CityRep R2: 0.631 vs TESSERA N/A SOTA
CityRep R2: 0.695 vs TESSERA N/A SOTA
GitHub: 1 repo
inwind0212/CityRepOfficial