System of Using 3 Tag Types to Categorize Discussion Threads Based on Topics Discussed

IP.com Prior Art Database Disclosure
IP.com Disclosure Number: IPCOM000142520D
Publication Date: 31-Oct-2006
Publishing Venue

The IP.com Prior Art Database

Abstract

Discussion threads tend to become voluminous as users post short, spontaneous comments that may have little value. After a while, finding an individual post becomes difficult. Furthermore, new visitors to a thread, or any visitors to a thread that is no longer active, find it difficult to follow the discussion out of context. In this publication, we propose a system that flexibly tags discussion posts using a mixture of user input, data mining, and moderator actions that correct or improve the categorization results.

Language

English (United States)

Document File

1 page / 25.1 KB

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 61% of the total text.

Disclosed is a system and apparatus for managing discussion threads by adding arbitrary tags to each post in a thread. The tags are of three types, based on who defines or ratifies them: user-defined tags, moderator-promoted tags, and system-defined tags. An innovative aspect of this disclosure is that the user, the system, and the administrator of the discussion forum all contribute to generating the set of tags. Furthermore, the system overcomes the brevity of typical discussion posts (short, quick comments) and still performs reliable data mining by incorporating the sub-thread structure of a post and its replies.
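As an illustration only, the three tag types could be modeled along the following lines. This is a minimal sketch, not the disclosed implementation: the names TagType, TaggedPost, add_tag, promote, and tags_of are hypothetical, and the rule that promotion re-labels an existing tag is an assumption about what "moderator-promoted" means.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class TagType(Enum):
    """The three tag types, named after who defines or ratifies the tag."""
    USER = "user-defined"
    MODERATOR = "moderator-promoted"
    SYSTEM = "system-defined"


@dataclass
class TaggedPost:
    """Hypothetical post record that maps each tag label to its type."""
    post_id: int
    text: str
    tags: Dict[str, TagType] = field(default_factory=dict)

    def add_tag(self, label: str, tag_type: TagType) -> None:
        # Normalize labels so "Battery" and "battery" count as one tag.
        # An existing label keeps its current type.
        self.tags.setdefault(label.strip().lower(), tag_type)

    def promote(self, label: str) -> None:
        # Assumption: a moderator ratifies a tag that already exists
        # on the post, whatever its original type was.
        key = label.strip().lower()
        if key not in self.tags:
            raise KeyError(f"no such tag on post {self.post_id}: {label!r}")
        self.tags[key] = TagType.MODERATOR

    def tags_of(self, tag_type: TagType) -> List[str]:
        # Return the labels of one type in alphabetical order.
        return sorted(l for l, t in self.tags.items() if t is tag_type)
```

In this sketch a post holds one type per label, so a promoted tag stops being listed as user-defined or system-defined; a real forum might instead keep the full history of who contributed each tag.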

The system mines the system tags for each post by examining the most frequently occurring pieces of sub-text across all the posts of a sub-thread (excluding common stop words such as "the") and extracting the top n words (we recommend n = 3). We use the whole sub-thread for the following reason: many posts are quick, short comments, but mining algorithms that determine topic words need a larger data source. Since replies to a post can be treated as commentary on the original post, we use the whole sub-thread as input to the mining algorithm. The innovative aspect here is therefore the use of a sub-thread, as opposed to each individual post, in determining system tags; and the idea being disclosed is not limited by a specific...
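The sub-thread mining step described above can be sketched as follows. This is one possible reading under stated assumptions: "sub-text" is approximated by single lowercase words, frequency is a plain word count, and the Post class, the STOP_WORDS list, and the function names are hypothetical. The disclosure itself does not fix a particular mining algorithm.

```python
import re
from collections import Counter
from dataclasses import dataclass, field
from typing import Iterator, List

# Illustrative stop-word list (an assumption); a real deployment
# would use a much fuller list.
STOP_WORDS = frozenset({
    "the", "a", "an", "and", "or", "is", "are", "it",
    "to", "of", "in", "on", "for", "this", "that",
})


@dataclass
class Post:
    """Hypothetical post with its direct replies."""
    text: str
    replies: List["Post"] = field(default_factory=list)


def subthread_texts(post: Post) -> Iterator[str]:
    """Yield the text of a post and of every reply beneath it."""
    stack = [post]
    while stack:
        current = stack.pop()
        yield current.text
        stack.extend(current.replies)


def system_tags(post: Post, n: int = 3) -> List[str]:
    """Return the top n non-stop words in the sub-thread rooted at post."""
    counts: Counter = Counter()
    for text in subthread_texts(post):
        words = re.findall(r"[a-z0-9']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(n)]


thread = Post("Is the battery good?", [
    Post("battery drains fast", [Post("battery and charger both bad")]),
    Post("charger is fine"),
])
print(system_tags(thread, n=2))
```

Because the count runs over the whole sub-thread, a short post with replies draws its tags from the replies as well, while a post with no replies falls back to its own words.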
