Fields of Interest

Natural Language Processing, Generative AI, Machine Learning


Research Content

I am currently working on detecting useless (low-credibility) reviews using a multimodal approach. The main contributions of this work are as follows:

  1. Three dedicated credibility detectors: We design the Text-Temporal Credibility (TTC), Multi-Consistency Visual (MCV), and Reviewer Credibility (RC) detectors to capture complementary credibility signals from three sources: textual-temporal behaviors, visual authenticity (including deepfake detection), and reviewer profiles. The detectors leverage techniques such as large language models (LLMs) and advanced image forensics.
  2. Multi-level feature fusion: We propose a Multi-Level Fusion Module that integrates features at fine-grained, mid-level, and global levels via co-attention, CrossNet, and score gating, producing holistic and robust credibility predictions.
  3. LLM-augmented multimodal dataset: We construct and annotate a large-scale multimodal dataset based on Yelp reviews and augment it with LLM-generated text and images. The dataset comprises over 33,000 reviews and 50,000 images and supports supervised model training.
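
To make the fusion idea in contribution 2 more concrete, the following is a minimal sketch of two of the named ingredients, not the actual implementation. It assumes that "CrossNet" refers to the cross layer of the Deep & Cross Network, x_{l+1} = x_0 * (w . x_l) + b + x_l, and that "score gating" means a softmax-weighted blend of the per-detector scores. Co-attention is omitted. The function names, the example scores, and the fixed gate logits are all hypothetical; in a trained model the weights and gate logits would be learned.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def cross_layer(x0, xl, w, b):
    """One cross layer (assumed DCN form): x0 * (w . xl) + b + xl."""
    dot = sum(wi * xi for wi, xi in zip(w, xl))
    return [x0i * dot + bi + xli for x0i, bi, xli in zip(x0, b, xl)]


def score_gate(scores, gate_logits):
    """Blend per-detector scores using softmax gate weights."""
    weights = softmax(gate_logits)
    return sum(wt * s for wt, s in zip(weights, scores))


# Hypothetical credibility scores in [0, 1] from the TTC, MCV, and RC detectors.
scores = [0.9, 0.4, 0.7]
# Equal logits give uniform gate weights (illustrative values only).
gate_logits = [0.0, 0.0, 0.0]
fused = score_gate(scores, gate_logits)

# Explicit feature crossing on a toy 2-dimensional feature vector.
crossed = cross_layer([1.0, 2.0], [1.0, 2.0], [0.5, 0.5], [0.0, 0.0])
```

With equal gate logits the fused score is simply the mean of the three detector scores; unequal logits would let the model trust one detector more than the others for a given review.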

Extensive experiments on this dataset demonstrate that the proposed model, MDCFN, significantly outperforms state-of-the-art baselines across multiple evaluation metrics, highlighting both its effectiveness and its potential to generalize.