This invention proposes a new system and method for automatically extracting a hierarchical travel sequence from a focused travel blog.
System and Method for Extracting Hierarchical Travel Sequence from Blog
1. Background: What is the problem solved by your invention? Describe known solutions to this problem (if any). What are the drawbacks of such known solutions, or why is an additional solution required? Cite any relevant technical documents or references.
With the fast development of Web 2.0, more and more users are sharing their travel notes on blogs. Such notes are usually valuable to others who want to visit the same places of interest, as well as to travel agencies that are designing new trips.
However, travel information on blogs is usually plain text written for people to read. If a user is interested in a specific place, he or she can use a blog search engine to find all the related blogs and browse the results one by one. Obviously, such a process is time-consuming. Furthermore, the user cannot get a good summary of the queried place.
Ideally, a computer would summarize all the travel notes. However, such a method is restricted by the lack of structured travel information. Mapping plain text to structured information is therefore the key to letting a computer understand large-scale travel notes.
A simple way to extract the travel sequence is to recognize the locations in the blog text in the order in which they appear. However, such an approach usually cannot identify all the places of interest. Besides, some noisy places are usually mixed with the required places. Furthermore, this approach cannot organize the places into hierarchies, which makes the sequence hard to understand.
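The simple approach described above can be sketched as follows. This is only an illustration of the baseline, not the method of this invention; the gazetteer contents, the function name naive_travel_sequence, and the sample note are all hypothetical and chosen for the example.

```python
import re

# Hypothetical gazetteer of known place names (an assumption for illustration).
GAZETTEER = ["Beijing", "Great Wall", "Forbidden City", "Shanghai"]


def naive_travel_sequence(text, gazetteer=GAZETTEER):
    """Return gazetteer names found in the text, ordered by first appearance."""
    hits = []
    for name in gazetteer:
        # Whole-word, case-insensitive match of the place name.
        match = re.search(r"\b" + re.escape(name) + r"\b", text, flags=re.IGNORECASE)
        if match:
            hits.append((match.start(), name))
    # Sort by character offset so the result follows the order of the text.
    return [name for _, name in sorted(hits)]


# Hypothetical travel note used only to exercise the sketch.
note = (
    "We flew from Shanghai to Beijing. On day two we climbed the Great Wall, "
    "then walked to a small temple nearby."
)
sequence = naive_travel_sequence(note)
print(sequence)
```

In this sample the result is a flat list: the departure city appears alongside the destinations, the unnamed temple is not found because it is not in the gazetteer, and nothing records that the Great Wall lies within Beijing. These correspond to the three drawbacks listed above.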
This invention proposes a system and method for automatically extracting hierarchical travel sequences from blogs. Hierarch...