Xinjian Zhao (赵鑫鉴)

The Chinese University of Hong Kong, Shenzhen

I am a third-year PhD student in Computer Science at The Chinese University of Hong Kong, Shenzhen, and I am fortunate to be advised by Prof. Tianshu Yu. During my PhD, I also spent time as a visiting student at the Institute of Automation, Chinese Academy of Sciences, hosted by Prof. Shu Wu, and as a research intern at Shanghai Artificial Intelligence Laboratory. Before my PhD, I completed an M.S. in Data Science at City University of Hong Kong under the supervision of Dr. Ruocheng Guo and a B.S. in Computer Science at Shandong University under the supervision of Prof. Xuemeng Song. I am interested in graph learning, graph-augmented intelligence, and impactful applications in science and industry.


Selected Papers

* denotes equal contribution.

Molecular Embeddings

MLLMs as context-aware molecular embedding backbones

Molecular ML
MLLM
Embedding


MolEmb: Multimodal Large Language Models Can Be Strong Molecular Embedding Models

Xinjian Zhao, Xiangru Jian, Yaoyao Xu, Xiaozhuang Song, Wei Pang, Lei Bai, Tianshu Yu

FM4LS@ICML 2026


MolEmb asks whether multimodal large language models can become the embedding layer for molecular intelligence, rather than only serving as generative scientific assistants. By conditioning molecular representations on images, SMILES, and natural-language scientific intent, it moves molecular embedding from fixed-vector encoding toward context-aware retrieval and decision support.

Key Insight

MLLMs can be repositioned as context-aware molecular embedding backbones, not only as generative scientific assistants.

Manufacturing MLLMs

Fine-grained multimodal evaluation in industrial scenarios

Manufacturing
MLLM
Benchmark


FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios

Xiangru Jian, Hao Xu, Wei Pang, Xinjian Zhao, Chengyu Tao, Qixin Zhang, Xikun Zhang, Chao Zhang, Guanzhi Deng, Alex Xue, Juan Du, Tianshu Yu, Garth Tarr, Linqi Song, Qiuzhuang Sun, Dacheng Tao

Preprint, 2026


FORGE treats manufacturing as a hard testbed for deploying multimodal intelligence in the physical world, where small visual differences can imply costly process failures. Across workpiece verification, surface inspection, and assembly verification, it shows that current MLLMs often recognize objects but fail to bind visual evidence to manufacturing rules, tolerances, and domain constraints.

Key Insight

Manufacturing MLLMs should be evaluated by process-level judgment, not object recognition alone.

Synthesis Layer

Unifying trends across vision and graph learning

Survey
Vision
Graph ML


When Vision Meets Graphs: A Survey on Graph Reasoning and Learning

Xinjian Zhao, Wei Pang, Zhixuan Yu, Xiangru Jian, Xiaozhuang Song, Yaoyao Xu, Zhongkai Xue, Dingshuo Chen, Shu Wu, Philip Torr, Tianshu Yu

The 35th International Joint Conference on Artificial Intelligence (IJCAI), 2026


This survey frames Vision Meets Graphs as a new interface between visual intelligence and structural learning, where graph depictions become direct computational inputs rather than auxiliary figures. It reorganizes scattered progress in graph reasoning, graph learning, and scientific graph understanding into a taxonomy that clarifies how visual perception, symbolic topology, and cross-modal reasoning can be fused.

Key Insight

This is a first systematic formulation of the Vision Meets Graphs paradigm, recasting graph depictions as first-class inputs for structural reasoning.

Structure via Pixels

How far vision models can reason about graph structure

Vision Models
Structural Reasoning
NeurIPS


The Underappreciated Power of Vision Models for Graph Structural Understanding

Xinjian Zhao*, Wei Pang*, Zhongkai Xue*, Xiangru Jian*, Lei Zhang, Yaoyao Xu, Xiaozhuang Song, Shu Wu, Tianshu Yu

Conference on Neural Information Processing Systems (NeurIPS), 2025


This work challenges the assumption that graph structure must be understood through graph-native message passing by testing whether structural intelligence can emerge from pixels. We show that pure vision models can compete with GNNs on graph-level benchmarks, and our GraphAbstract benchmark further reveals their advantage in global structural perception and scale-invariant reasoning.

Key Insight

Several GNN expressiveness bottlenecks are visually simple, suggesting that global visual perception can complement local message passing.

Benchmark Taxonomy

Graph-theoretic capabilities of LLM systems

LLM
Graph Reasoning
Benchmark


GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks

Hao Xu*, Xiangru Jian*, Xinjian Zhao*, Wei Pang*, Chao Zhang, Suyucheng Wang, Qixin Zhang, Zhengyuan Dong, Joao Monteiro, Bang Liu, Qiuzhuang Sun, Tianshu Yu

International Conference on Learning Representations (ICLR), 2026


GraphOmni turns graph theory into a stress test for general reasoning agents: tasks are structurally complex, scalable in difficulty, and exactly verifiable. By decomposing LLM graph reasoning across task type, graph structure, input representation, and prompting strategy, it exposes failure modes that disappear under a single aggregate score.

Key Insight

Graph-theoretic tasks offer a scalable and exactly verifiable lens for diagnosing long-range reasoning in LLMs.

Distributional Geometry

Edge layout distributions as structural priors

Graph Representation
Robustness
KDD


Graph Learning with Distributional Edge Layouts

Xinjian Zhao, Chaolong Ying, Yaoyao Xu, Tianshu Yu

ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2025


Distributional Edge Layouts (DEL) starts from a stronger view of graph learning: a graph should not be represented only as a fixed adjacency object, but as a system whose structural identity is revealed through its responses. By placing graphs in a physical world and sampling steady-state edge layouts under energy constraints, DEL produces distributional signatures that separate hard graph pairs and strengthen GNN and Graph Transformer backbones.

Key Insight

If GNNs act as graph representation compressors, DEL enriches the signal by modeling a graph's physical-world response distribution.

Spectral Views

Augmentation choices in graph contrastive learning

Graph SSL
Spectral Methods
TMLR


Rethinking Spectral Augmentation for Contrast-based Graph Self-Supervised Learning

Xiangru Jian*, Xinjian Zhao*, Wei Pang*, Chaolong Ying, Yimu Wang, Yaoyao Xu, Tianshu Yu

Transactions on Machine Learning Research (TMLR), 2025


This work pushes graph SSL away from augmentation folklore by asking whether spectral augmentation helps because of spectral information, or simply because it imposes useful invariance pressure. For widely used shallow GNN encoders, we show that InfoNCE behavior is often dominated by perturbation strength rather than spectral characteristics, explaining why simple edge perturbations can match or outperform heavier spectral operations.

Key Insight

For shallow GNN encoders, perturbation strength can dominate spectral properties, shifting graph SSL design from spectral engineering to invariance design.

AI4Science Transfer

Negative alignment for protein modeling

AI4Science
Protein LM
ECML-PKDD


Boosting Protein Language Models with Negative Sample Mining

Yaoyao Xu*, Xinjian Zhao*, Xiaozhuang Song, Benyou Wang, Tianshu Yu

European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD), 2024


Protein language models can over-rely on co-evolution knowledge, causing representations to preserve misleading evolutionary correlations when transferred to downstream protein tasks. We mine negative protein pairs from disparate categories and reduce their residue-level alignment in cross-attention space, steering PLMs away from co-evolution shortcuts and toward task-relevant discrimination.

Key Insight

Negative mining acts as an alignment-level intervention that reduces spurious co-evolution coupling in protein language models.

Topological Priors

Persistent homology for graph pooling

Topology
Graph Pooling
NeurIPS


Boosting Graph Pooling with Persistent Homology

Chaolong Ying, Xinjian Zhao, Tianshu Yu

Conference on Neural Information Processing Systems (NeurIPS), 2024


This work treats graph pooling as a structural abstraction problem rather than a node-pruning heuristic: coarsening should decide which topology deserves to survive. By injecting persistent homology into the pooling process itself, it aligns PH filtration with cutoff-style pooling so global topological evidence influences what structures are preserved.

Key Insight

Persistent homology is most useful when it guides pooling decisions before global topology is compressed away.

Curriculum Robustness

Adversarial scheduling for graph contrastive learning

Contrastive Learning
Robustness
Preprint


Adversarial Curriculum Graph Contrastive Learning with Pair-wise Augmentation

Xinjian Zhao, Liang Zhang, Yang Liu, Ruocheng Guo, Xiangyu Zhao

Preprint, 2024


Adversarial Curriculum Graph Contrastive Learning (ACGCL) treats graph contrastive learning as controlled optimization over sample difficulty, rather than a recipe of random augmentations and fixed positives and negatives. It uses pair-wise augmentation to generate similarity-controlled mirror subgraphs, then applies adversarial curriculum training so that optimization gradually moves toward harder and more informative samples.

Key Insight

Graph contrastive learning becomes more reliable when view similarity and sample difficulty are explicitly controlled.

Other Works

HT-GNN: Hyper-Temporal Graph Neural Network for Customer Lifetime Value Prediction in Baidu Ads

Xiaohui Zhao, Xinjian Zhao, Jiahui Zhang, Guoyu Liu, Houzhi Wang, Shu Wu

Technical Report, 2026

Embedding in Recommender Systems: A Survey

Maolin Wang*, Xinjian Zhao*, Wanyu Wang*, Sheng Zhang, Jiansheng Li, Bowen Yu, Binhao Wang, Shucheng Zhou, Dawei Yin, Qing Li, Ruocheng Guo, Xiangyu Zhao

ACM Transactions on Information Systems (TOIS), 2026

AOT*: Efficient Synthesis Planning via LLM-Empowered AND-OR Tree Search

Xiaozhuang Song, Xuanhao Pan, Xinjian Zhao, Hangting Ye, Shufei Zhang, Jian Tang, Tianshu Yu

Findings of the Association for Computational Linguistics (ACL Findings), 2026

Trajectory

Core Research Trunk: Structure-Aware Intelligence

Core question: how can models learn and understand graph structure, and how can graph structure in turn improve general-purpose models and intelligence?

Towards Graph Foundation Models (2023-2026)

Capability diagnostics and model-side evidence for graph foundation models.

Adversarial Curriculum GCL (Preprint 2024) · Spectral Augmentation Revisited (TMLR 2025) · Distributional Edge Layouts (KDD 2025) · Vision Models for Graph Structure (NeurIPS 2025)

Graph-Augmented Intelligence (2024-2026)

Using graph structure to diagnose and improve general intelligence.

GraphOmni (ICLR 2026) · When Vision Meets Graphs (Survey 2026)

Structure Priors in Scientific and Real-World Applications (2024-2026)

Injecting structure priors into scientific and real-world applications, including AI4Science and recommender systems.

MolEmb (FM4LS@ICML 2026) · Persistent Homology Pooling (NeurIPS 2024) · Negative Mining for Protein LMs (ECML-PKDD 2024) · AOT* (ACL Findings 2026) · FORGE (Preprint 2026)

Mentorship

Co-supervision with Prof. Tianshu Yu.

2 students mentored