Description: |
Chinese food names are an important language resource for analyzing food reviews. Because the naming of Chinese foods is quite flexible and food reviews are typically written in a colloquial, spoken style, it is not easy to compile a general list of such names. In this paper, we propose an approach to extracting Chinese food names from a large unlabeled Chinese corpus. First, we construct character-level clues for foods; we then select candidate words that could form part of a food name and annotate them manually. Based on the annotation results, we use heuristic rules to extract food names from food reviews. Experiments are performed to evaluate our approach. |