Constructing Topical Concept Hierarchical Taxonomy of Tourist Attraction

Abstract

  • A hierarchical co-clustering module that uses non-negative matrix tri-factorization to allocate attractions and things of interest (ToI) to topics when splitting a coarse topic into fine-grained ones.
  • A concept extraction module that extracts, for every topic, concepts that maintain strong discriminative power at different levels of the taxonomy.

Theoretical Mathematical Formulation

Non-negative Matrix Tri-Factorization

The model approximates the input attraction-ToI matrix with three factor matrices that simultaneously assign cluster labels to tourist attractions and Things of Interest (ToI) by solving the following optimization problem:

$$\min_{U, H, V \ge 0} \; \lVert X - U H V^{\top} \rVert_F^2 \quad \text{s.t.} \quad U^{\top} U = I, \; V^{\top} V = I$$

where $X ∈ R^{m×n}_{+}$ is the input attraction-ToI content matrix, and $U ∈ R^{m×c}_{+}$ and $V ∈ R^{n×c}_{+}$ are orthogonal nonnegative matrices giving low-dimensional representations of attractions and things of interest, respectively. Because a nonnegative matrix with orthonormal columns can have at most one nonzero entry per row, the orthogonality and nonnegativity conditions on $U$ and $V$ force the model to produce a hard assignment of cluster labels for attractions and things of interest. $H ∈ R^{c×c}_{+}$ provides a condensed view of $X$.
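The factorization above can be sketched in a few lines of NumPy. This is only an illustrative sketch: it assumes the multiplicative update rules commonly used for orthogonality-constrained tri-factorization, which the post itself does not specify, and the names `orthogonal_nmtf`, `hard_labels`, `n_iter`, `eps`, and the toy matrix `X_toy` are hypothetical. The orthogonality constraints are only approximately satisfied by these updates, so cluster labels are read off as the largest entry in each row.

```python
import numpy as np


def orthogonal_nmtf(X, c, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative X (m x n) as U @ H @ V.T.

    Assumption: multiplicative updates for orthogonal tri-factorization;
    the original post does not state which solver it uses.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    # Strictly positive random initialization keeps the updates well defined.
    U = rng.random((m, c)) + eps
    V = rng.random((n, c)) + eps
    H = rng.random((c, c)) + eps
    for _ in range(n_iter):
        # Update attraction factors U (m x c).
        XVHt = X @ V @ H.T
        U *= np.sqrt(XVHt / (U @ (U.T @ XVHt) + eps))
        # Update ToI factors V (n x c).
        XtUH = X.T @ U @ H
        V *= np.sqrt(XtUH / (V @ (V.T @ XtUH) + eps))
        # Update the condensed c x c matrix H.
        UtXV = U.T @ X @ V
        H *= np.sqrt(UtXV / (U.T @ U @ H @ V.T @ V + eps))
    return U, H, V


def hard_labels(F):
    """Read a hard cluster label per row as the index of its largest entry."""
    return F.argmax(axis=1)


# Hypothetical toy attraction-ToI matrix: 6 attractions x 8 things of interest.
X_toy = np.array([
    [5, 4, 6, 5, 0, 0, 0, 1],
    [4, 5, 5, 6, 0, 1, 0, 0],
    [6, 5, 4, 5, 1, 0, 0, 0],
    [0, 0, 1, 0, 5, 6, 4, 5],
    [0, 1, 0, 0, 6, 5, 5, 4],
    [1, 0, 0, 0, 4, 5, 6, 5],
], dtype=float)

U, H, V = orthogonal_nmtf(X_toy, c=2)
attraction_labels = hard_labels(U)  # one topic index per attraction
toi_labels = hard_labels(V)         # one topic index per thing of interest
```

Splitting a coarse topic into finer ones would then amount to running the same routine again on the sub-matrix of attractions and ToIs assigned to that topic.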


Convolutional Neural Networks

Attention

  • This article is intended for readers who already understand backpropagation (BP) neural networks and want to go on to study convolutional neural networks (CNNs). Readers who find it difficult may want to study BP first and then return to this article.