System of Using 3 Tag Types to Categorize Discussion Threads Based on Topics Discussed
IP.com Disclosure Number: IPCOM000142520D
Publication Date: 31-Oct-2006
Publishing Venue
The IP.com Prior Art Database
Abstract
Language
English (United States)
Document File
1 page / 25.1 KB
Disclosed is a system and apparatus for managing discussion threads by adding arbitrary tags to each post in a thread. Tags are of three types, according to who defines or ratifies them: user-defined tags, moderator-promoted tags, and system-defined tags. An innovative aspect of this disclosure is that the user, the system, and the administrator of the discussion forum all contribute to generating the set of tags. Furthermore, although discussion posts tend to be short, quick comments, the system still performs reliable data mining by taking into account the sub-thread structure formed by a post and its replies.
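The three tag types can be pictured with a small data model. The following is only an illustrative sketch: the names `TagType`, `Tag`, and `promote` are hypothetical and do not appear in the disclosure, and the assumption that a moderator promotes an existing tag by re-labelling its type is one possible reading of "moderator-promoted".

```python
from dataclasses import dataclass
from enum import Enum


class TagType(Enum):
    """The three tag types, keyed by who defines or ratifies the tag."""
    USER_DEFINED = "user-defined"
    MODERATOR_PROMOTED = "moderator-promoted"
    SYSTEM_DEFINED = "system-defined"


@dataclass(frozen=True)
class Tag:
    label: str
    tag_type: TagType


def promote(tag: Tag) -> Tag:
    """Return a moderator-promoted copy of a tag.

    Assumption: promotion keeps the label and changes only the type.
    The original tag is left untouched because Tag is immutable.
    """
    return Tag(tag.label, TagType.MODERATOR_PROMOTED)
```

Keeping the type on the tag itself, rather than in three separate collections, would let a forum filter or rank tags by origin without changing how tags are attached to posts.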
The system mines the system tags for each post by examining the most commonly occurring piece of sub-text in all the posts of a sub-thread (excluding common stop words such as "the") and extracting the top n words (we recommend n = 3). The whole sub-thread is used because many posts are short, quick comments, whereas mining algorithms that determine topic words need a larger data source. Since replies to a post can be treated as commentary on the original post, the whole sub-thread serves as input to the mining algorithm. The innovative aspect here is therefore the use of a sub-thread, rather than each individual post, in determining system tags; and the idea being disclosed is not limited by a specific...
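A minimal sketch of sub-thread-based tag mining, under stated assumptions, might look as follows. The `Post` class, the `STOP_WORDS` set, the tokenizing regular expression, and the function names are all hypothetical. Plain word-frequency counting stands in here for whatever mining algorithm an implementation would actually use, and "sub-text" is simplified to single words.

```python
import re
from collections import Counter
from dataclasses import dataclass, field
from typing import Iterator, List

# Assumed, deliberately tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
              "on", "for", "with", "this", "that"}


@dataclass
class Post:
    text: str
    replies: List["Post"] = field(default_factory=list)


def subthread_texts(post: Post) -> Iterator[str]:
    """Yield the text of a post and, recursively, of all its replies."""
    yield post.text
    for reply in post.replies:
        yield from subthread_texts(reply)


def system_tags(post: Post, n: int = 3) -> List[str]:
    """Return the n most frequent non-stop words in the post's sub-thread."""
    counts: Counter = Counter()
    for text in subthread_texts(post):
        words = re.findall(r"[a-z0-9']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(n)]


# Example: the root post alone is short, but its replies reinforce the topic words.
root = Post("The printer driver crashes", [
    Post("Which printer?"),
    Post("Driver reinstall fixed the printer for me",
         [Post("driver update also works")]),
])
tags = system_tags(root)  # three tags, drawn from the whole sub-thread
```

Because the counter is fed by the whole sub-thread, a reply's own system tags are computed the same way from that reply and everything beneath it.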