Method and system for data management in a video decoder (02-Mar-2010)

US Patent Publication (Source: USPTO)
Publication No. US 7672372 B1 published on 02-Mar-2010
Application No. US 10/374777 filed on 24-Feb-2003
Abstract (English)
A method and system for minimizing bus traffic in a video decoder is disclosed. A method and system for processing a portion of a reference picture includes designating the reference picture, selecting a display picture within the reference picture, transmitting a display picture size, and sending a display picture offset. A method and system for compressing IDCT coefficients corresponding to a macroblock, the macroblock having a plurality of blocks, includes locating each non-zero IDCT coefficient corresponding to one of the plurality of blocks, assigning an index to the non-zero IDCT coefficient, the index designating a location within the one of the plurality of blocks, packing the non-zero IDCT coefficient in little endian format, and specifying a terminator bit corresponding to the non-zero coefficient, the terminator bit indicating the end of all non-zero IDCT coefficients for the one of the plurality of blocks. A method and system for selectively controlling each hardware device within a video decoder includes obtaining a video stream, performing VLC decoding, encoding a plurality of instructions to control each hardware device within the video decoder, decoding each one of the plurality of instructions, and optionally performing an IDCT in response to each one of the plurality of instructions.
Inventors/Applicants
Nguyen, Hungviet
Fremont, CA, US
Hu, Xiaoping
San Jose, CA, US
Tu, Kuei-Chung
San Jose, CA, US
Liu, Yan
San Jose, CA, US
Assignees
Intel Corporation
Santa Clara, CA, US
Classifications
International (2006.01): H04B 1/66
National: 375/240.12
Field of Search: 375/240.12; 375/240.16; 375/240.17; 375/240.23; 375/240.13; 375/240.14
Patent References
US 4121283 A Interface device for encoding a digital image for a CRT display Oct-1978 364/200
US 4346377 A Method and apparatus for encoding and generating characters in a display Aug-1982 340/731
US 4382254 A Video display control circuitry May-1983 340/744
US 4399435 A Memory control unit in a display apparatus having a buffer memory Aug-1983 340/750
US 4418344 A Video display terminal Nov-1983 340/726
US 4471465 A Video display system with multicolor graphics selection Sep-1984 364/900
US 4488254 A Method and apparatus for efficient data storage Dec-1984 364/900
US 4531160 A Display processor system and method Jul-1985 358/240
US 4569019 A Video sound and system control circuit Feb-1986 364/410
US 4644495 A Video memory system Feb-1987 364/900
US 4700182 A Method for storing graphic information in memory Oct-1987 340/750
US 4737772 A Video display controller Apr-1988 340/703
US 4751502 A Display controller for displaying a cursor on either of a CRT display device or a liquid crystal display device Jun-1988 340/709
US 4760387 A Display controller Jul-1988 340/716
US 4763118 A Graphic display system for personal computer Aug-1988 340/735
US 4779083 A Display control system Oct-1988 340/767
US 4821226 A Dual port video memory system having a bit-serial address input port Apr-1989 364/900
US 5028917 A Image display device Jul-1991 340/703
US 5030946 A Apparatus for the control of an access to a video memory Jul-1991 340/750
US 5065346 A Method and apparatus for employing a buffer memory to allow low resolution video data to be simultaneously displayed in window fashion with high resolution video data Nov-1991 395/128
US 5122792 A Electronic time vernier circuit Jun-1992 340/793
US 5138305 A Display controller Aug-1992 340/717
US 5274794 A Method and apparatus for transferring coordinate data between a host computer and display device Dec-1993 395/500
US 5355465 A Data storing device having a plurality of registers allotted for one address Oct-1994 395/425
US 5594467 A Computer based display system allowing mixing and windowing of graphics and video Jan-1997 345/115
US 5654759 A Methods and apparatus for reducing blockiness in decoded video Aug-1997 348/405
US 5675387 A Method and apparatus for efficient addressing of DRAM in a video decompression processor Oct-1997 348/416.1
US 5754243 A Letter-box transformation device May-1998 348/445
US 5781788 A Full duplex single clip video codec Jul-1998 712/1
US 5815646 A Decompression processor for video applications Sep-1998 395/163
US 5905839 A Digital video signal recording/reproducing apparatus for storing a vertical resolution signal May-1999 386/26
US 5905840 A Method and apparatus for recording and playing back digital video signal May-1999 386/44
US 5907372 A Decoding/displaying device for decoding/displaying coded picture data generated by high efficiency coding for interlace scanning picture format May-1999 348/716
US 5969770 A Animated "on-screen" display provisions for an MPEG video signal processing system Oct-1999 348/569
US 5970173 A Image compression and affine transformation for image motion compensation Oct-1999 382/236
US 6058463 A Paged memory data processing system with overlaid memory control registers May-2000 711/202
US 6061400 A Methods and apparatus for detecting scene conditions likely to cause prediction errors in reduced resolution video decoders and for using the detected information May-2000 375/240
US 6104434 A Video coding apparatus and decoding apparatus Aug-2000 348/403
US 6121998 A Apparatus and method for videocommunicating having programmable architecture permitting data revisions Sep-2000 348/14.13
US 6405273 B1 Data processing device with memory coupling unit Jun-2002 710/131
US 6823016 B1 Method and system for data management in a video decoder Nov-2004 375/240.25
Related Documents
Division of application No. US 09/027,014, filed on 20-Feb-1998, now Pat. No. US 6823016 A.
Examiners
Primary: Vo, Tung
Attorney, Agent or Firm
Trop, Pruner & Hu, P.C.

Supplemental Information (Source: DOCDB)
Inventors
NGUYEN HUNGVIET
US
HU XIAOPING
US
TU KUEI-CHUNG
US
LIU YAN
US
Assignees/Applicants
INTEL CORP
US
Priority
US 374777 A  24-Feb-2003
US 27014 A  20-Feb-1998
Classifications
International (2010.01): H04B 1/66
International (2006.01): H04B 1/66; H04N 7/26; H04N 7/50
European: H04N 7/26L2D4; H04N 7/26A4Z; H04N 7/26A6U; H04N 7/26A8B; H04N 7/26P; H04N 7/50
CROSS REFERENCE TO RELATED APPLICATION
This application is a divisional application based on U.S. patent application Ser. No. 09/027,014, filed on Feb. 20, 1998 now U.S. Pat. No. 6,823,016.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to a command queue manager. More particularly, the present invention relates to a method and system for minimizing bus traffic in a video decoder.
2. The Background Art
With the merging of personal computer systems and entertainment systems, digital component video and audio were developed. Typically, this audio and video data are encoded into a compressed program stream for transmission. A stream demultiplexer parses the incoming program stream into audio and video bitstreams. When video frames are ultimately displayed, there must be a decompression of these compressed video bitstreams. A video decoder is used for this decompression, or decoding, process.
According to the MPEG-2 video compression standard, the compression ratio can be as high as 50 to 1. Similarly, decompression expands data up to 50 times. This high data rate, as well as the high video window resolution of MPEG-2 decoding, puts heavy demands on the video system. Moreover, at the present time, a video decoder must accommodate a frame rate of approximately 30 frames per second.
Traditionally, software sends video data to a video decoder implemented entirely in hardware. If the entire decoder is built in hardware, then only a compressed data stream is needed. As a result, a decoder built entirely in hardware is extremely fast. However, the hardwired decoder is inflexible as well as complex, which makes the debugging process extremely difficult. In addition, the hardwired decoder requires numerous gates, resulting in a costly system.
Software can be used to provide greater versatility. However, software is computation intensive, and results in a substantial increase in bus traffic. Accordingly, a need exists for a video decoder which provides greater flexibility than the hardwired decoder while minimizing bus traffic and reducing hardware costs.
BRIEF DESCRIPTION OF THE INVENTION
According to a first aspect of the present invention, a method and system for selectively controlling each hardware device within a video decoder includes obtaining a video stream, performing Variable Length Coding (VLC) decoding on the video stream, encoding a plurality of instructions to control each hardware device within the video decoder, decoding each one of the plurality of instructions, and controlling each hardware device in response to the plurality of instructions. Since the decoder of the present invention comprises hardware and software, greater versatility than traditional hardwired decoders is achieved while manufacturing costs are substantially reduced. Thus, the decoder has the flexibility to control the hardware devices through the use of an instruction set. Since the software portion of the video decoder can instruct the hardware to perform operations that the data stream requires, various instructions can be used to control the hardware to compensate for various problems with a data stream, or substitute software functions in place of non-functional hardware devices. Moreover, since the CPU in a desktop or laptop computer environment can be used to process a portion of the decoding steps at the beginning of the process, it is beneficial to take advantage of this added processing power.
According to a second aspect of the present invention, a method and system for compressing Inverse Discrete Cosine Transform (IDCT) coefficients corresponding to a macroblock, the macroblock having a plurality of blocks, includes locating each non-zero IDCT coefficient corresponding to one of the plurality of blocks, assigning an index to each non-zero IDCT coefficient, the index designating a location within the one of the plurality of blocks, packing each non-zero IDCT coefficient in little endian format, and specifying a terminator bit corresponding to each non-zero coefficient, the terminator bit indicating the end of all non-zero IDCT coefficients for the one of the plurality of blocks. Since the IDCT coefficients are packed in this manner, bus traffic is decreased and efficiency of the decoder is increased.
According to a third aspect of the present invention, a method and system for processing a portion of a reference picture includes designating the reference picture, selecting a display picture within the reference picture, transmitting a display picture size, and sending a display picture offset. This method allows panning and shifting of a display window selected by a user within a reference picture. Therefore, the present invention provides greater flexibility than systems limiting the display picture size to that of the reference picture. Furthermore, only the display picture data rather than the reference picture data must be processed, resulting in a more efficient decoder.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 illustrates a data flow diagram of an MPEG-2 decoder according to a presently preferred embodiment of the present invention.
FIG. 2 illustrates a block diagram of an MPEG-2 decoder according to a presently preferred embodiment of the present invention.
FIG. 3 illustrates a macroblock numbering system according to a presently preferred embodiment of the present invention.
FIG. 4 illustrates a series of non-zero IDCT coefficients corresponding to one macroblock.
FIG. 5 illustrates a method for storing each non-zero IDCT coefficient across a 32 bit memory location according to a presently preferred embodiment of the present invention.
FIG. 6 illustrates terminator bit positions according to a presently preferred embodiment of the present invention.
FIG. 7 illustrates possible index values according to a presently preferred embodiment of the present invention.
FIG. 8 illustrates a method for processing a display window within a reference picture according to a presently preferred embodiment of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
In the following description, a preferred embodiment of the invention is described with regard to preferred process steps and data structures. However, those skilled in the art would recognize, after perusal of this application, that embodiments of the invention may be implemented using a set of general purpose computers operating under program control, and that modification of a set of general purpose computers to implement the process steps and data structures described herein would not require undue invention.
The present invention provides a method and apparatus for distributing commands in a video decoder. According to a first aspect of the present invention, the MPEG-2 decoder comprises hardware and software to provide greater versatility than hardwired decoders. Referring first to FIG. 1, a data flow diagram of an MPEG-2 decoder according to a presently preferred embodiment of the present invention is illustrated. The MPEG-2 decoder is partitioned into software 10 and hardware 12. The front end of the decoding process comprises software, while the back end comprises hardware. A system stream 14 comprises an audio stream 16 and a video stream 18. Once the system stream 14 is split, Variable Length Coding (VLC) decoding 20 is performed on the video stream 18. Next, an instructions assembler 22 receives information comprising data and instructions from a host. This information is then compressed into a packed data format and stored in a command queue within a frame buffer 22. A command queue manager 24 then unpacks the data in the frame buffer, decodes the commands, and sends appropriate signals to corresponding hardware devices 26 capable of performing dequantization, IDCT, motion compensation, display format conversion, color space conversion, scaling and interpolation, and video overlay to complete the decoding process. The instructions are executed by the command queue manager 24 in the order the instructions are stored within the frame buffer 22. Therefore, an instruction set may be modified to provide flexibility and allow the command queue manager to control each hardware block. Moreover, the instruction set is provided to transmit only necessary information across the bus. For example, IDCT coefficients are transmitted in a compact form to maximize efficiency. Thus, bus traffic resulting from the added software is minimized without compromising the quality of the decoder.
Referring now to FIG. 2, a block diagram of an MPEG-2 decoder according to a presently preferred embodiment of the present invention is shown. As illustrated, a 32 bit PCI bus 28 interfaces with a CPU and a 560 64-bit SDRAM memory interface sequencer for writing to and reading data from a frame buffer. The command queue manager 24 fetches commands and data from the frame buffer through a frame buffer interface 30. It then decodes the commands and dispatches data to one of three major video blocks: dequantization and IDCT 32, Motion Compensation 34, and Reformatter 36. For example, the video command queue manager 24 sends IDCT coefficients and a dequantization table to a Dequantization and IDCT block 32. Similarly, the command queue manager 24 sends commands and motion vectors to a Motion Compensation block 34, and commands and parameters to an Output Reformatter 36. The Output Reformatter block 36 is adapted for converting a 4:2:0 macroblock format to a 4:2:2 scan line format. IDCT and Motion Compensation are known in the art of video decoding. According to a presently preferred embodiment of the present invention, implementations of various IDCT algorithms may be provided in software. In this manner, the IDCT commands may be selectively bypassed during testing. Moreover, software may be used as a substitute for the IDCT block, or other hardware block, when the hardware block is not functioning properly.
The command queue 22, shown in FIG. 1, is implemented in frame buffer memory. According to a presently preferred embodiment, four address pointers are used to manage the data stored in the command queue. Top and bottom address pointers define the area in memory allocated for the video command queue. In addition, head and tail address pointers define the data stored within the video command queue, and are updated accordingly. The video command queue manager tracks all the address pointers and determines where to fetch the commands and data. The software updates the tail pointer as it stores data in the video command queue, and the hardware will update the head pointer as it removes data from the video command queue. According to a presently preferred embodiment of the present invention, if the number of valid data words in the command queue is less than a specified number, the command queue manager will interrupt the CPU.
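The four-pointer scheme described above can be illustrated with a minimal sketch, given here for illustration only and not as the disclosed implementation. The class name CommandQueue, the low_water_mark parameter, the wrap-around behavior at the bottom pointer, and the use of a dictionary as a stand-in for frame buffer memory are all assumptions that do not appear in the description.

```python
class CommandQueue:
    """Circular queue over an assumed memory region [top, bottom)."""

    def __init__(self, top, bottom, low_water_mark):
        self.top = top              # first address of the allocated region
        self.bottom = bottom        # assumed: one past the last address
        self.head = top             # next word the hardware will fetch
        self.tail = top             # next free word the software will fill
        self.count = 0              # number of valid words currently queued
        self.low_water_mark = low_water_mark  # hypothetical threshold
        self.memory = {}            # stand-in for frame buffer memory
        self.interrupts = 0         # counts simulated CPU interrupts

    def _advance(self, pointer):
        # Assumed behavior: pointers wrap from bottom back to top.
        pointer += 1
        return self.top if pointer == self.bottom else pointer

    def push(self, word):
        """Software side: store a word and update the tail pointer."""
        if self.count == self.bottom - self.top:
            raise OverflowError("command queue full")
        self.memory[self.tail] = word
        self.tail = self._advance(self.tail)
        self.count += 1

    def pop(self):
        """Hardware side: fetch a word and update the head pointer."""
        if self.count == 0:
            raise IndexError("command queue empty")
        word = self.memory.pop(self.head)
        self.head = self._advance(self.head)
        self.count -= 1
        if self.count < self.low_water_mark:
            self.interrupts += 1    # models interrupting the CPU
        return word
```

In this sketch the interrupt is raised only on the hardware side, when a fetch leaves fewer valid words than the threshold, which is one plausible reading of the condition stated above.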
Commands and data are packed in the frame buffer and an instruction set is set forth to allow the command queue manager to identify and interpret these commands. The commands are then sent to the appropriate hardware block. According to a presently preferred embodiment of the present invention, the command queue is 64 bits wide. Similarly, each instruction is a multiple of 32-bit words. Therefore, each word in the command queue can store up to 2 instructions.
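As a small illustration of the word sizes stated above, a 64-bit command queue word can be split into two 32-bit instruction words. The function name split_queue_word is hypothetical, and the ordering (lower half first) is an assumption, since the description does not state which half holds the first instruction.

```python
def split_queue_word(word64):
    """Split one 64-bit queue word into two 32-bit instruction words."""
    low = word64 & 0xFFFFFFFF           # assumed to be the first instruction
    high = (word64 >> 32) & 0xFFFFFFFF  # assumed to be the second instruction
    return low, high
```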
According to a second aspect of the present invention, a method for packing IDCT coefficients is presented. Referring now to FIG. 3, a macroblock numbering system according to a presently preferred embodiment of the present invention is presented. Each macroblock 38 is processed individually. Each macroblock comprises 6 blocks 40-50, numbered 0, 1, 2, 3, 4, and 5, corresponding to Y, Cb and Cr color space components, respectively. According to a presently preferred embodiment of the present invention, only non-zero IDCT coefficients are packed and transferred to a dequantization block. Therefore, a maximum of 64 IDCT coefficients may be transferred for each block within the macroblock. As shown in FIG. 4, a series of non-zero IDCT coefficients corresponding to one macroblock 52 are presented. Non-zero IDCT coefficients corresponding to each block are stored sequentially by block 54. IDCT coefficients for each block 0, 1, 2, 3, 4, and 5 are sequentially stored.
Referring now to FIG. 5, a method for storing each non-zero IDCT coefficient across a 32 bit memory location according to a presently preferred embodiment of the present invention is presented. One of ordinary skill in the art, however, will readily recognize that a different number of bits may be used. For each non-zero IDCT coefficient, the following method is performed. Each macroblock is processed individually. First, the next block within the macroblock is obtained at step 56. Next, at step 58, a non-zero IDCT coefficient is obtained. Next, at step 60, an index is assigned to the non-zero IDCT coefficient. Next, at step 62, the index is packed in a memory location. The index serves as an address, based on the horizontal scan direction. According to a presently preferred embodiment of the present invention, an inverse zig zag scan is performed to convert the MPEG-2 standard zig zag scan to the horizontal scanning convention. Those of ordinary skill in the art will readily recognize that such scanning methods are known in the art of video encoding and decoding. According to a presently preferred embodiment of the present invention, the first non-zero coefficient in each 8×8 block is 0.
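The conversion from zig zag order to a horizontal scan index can be sketched as follows. This is an illustrative sketch only: it assumes the conventional 8x8 zig zag order, and the function names zigzag_to_raster and nonzero_with_indices are hypothetical. The alternate scan that MPEG-2 also permits is not covered.

```python
def zigzag_to_raster(n=8):
    """Return a list mapping zig zag position -> horizontal scan index."""
    order = []
    for s in range(2 * n - 1):                  # s = row + column (diagonal)
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)               # even diagonals run upward
        for r in rows:
            order.append(r * n + (s - r))       # row-major (horizontal) index
    return order


def nonzero_with_indices(zigzag_coeffs):
    """Pair each non-zero coefficient with its horizontal scan index.

    The pairs are returned in zig zag order; whether the disclosed method
    reorders them is not stated, so no sorting is applied here.
    """
    order = zigzag_to_raster()
    return [(order[k], c) for k, c in enumerate(zigzag_coeffs) if c != 0]
```

For example, a coefficient at zig zag position 2 lies in row 1, column 0 of the block and therefore receives index 8.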
Next, at step 64, the non-zero IDCT coefficient is packed in little endian format. The non-zero IDCT coefficient is packed in an available least significant position in memory. Therefore, the first coefficient is stored in the least significant memory location, or right most position. According to a presently preferred embodiment of the present invention, the coefficient data comprises 12 bits.
According to a presently preferred embodiment of the present invention, each 32 bit instruction comprises index and coefficient data, with the two most significant bits comprising terminator bits. Therefore, each coefficient and index are packed across multiple 32-bit words. Each terminator bit corresponds to one coefficient. A terminator bit may comprise a 0 or a 1. According to a presently preferred embodiment of the present invention, a 0 indicates that more coefficients follow within the current 8×8 block, while a 1 indicates that no more coefficients follow after the current one of this 8×8 block. According to the presently preferred embodiment of the present invention, the least significant terminator bit in the first 32 bit instruction is not used.
If it is determined at step 66 that more IDCT coefficients exist for the current block, a terminator bit for the current IDCT coefficient is set to 0 at step 68. Next, at step 70, the terminator bit corresponding to the non-zero IDCT coefficient is packed in one of two most significant bits of the memory location. The next non-zero IDCT coefficient for the current block is then obtained at step 58.
If it is determined at step 66 that no more coefficients exist for the current block, the terminator bit for the current IDCT coefficient is set to 1 at step 72. Next, at step 74, the terminator bit corresponding to the non-zero IDCT coefficient is packed in one of two most significant bits of the memory location. The IDCT coefficients for the current block are then stored in a location designated for the current macroblock at step 76. However, if the IDCT coefficients for the current block are originally stored in a location designated for the current macroblock, this step may be ignored. If at step 78, it is determined that there are no more blocks in the current macroblock, the process is completed at step 80. However, if there are more blocks in the current macroblock, the next block is obtained at step 56 and the process is repeated. Those of ordinary skill in the art will readily recognize that the above steps are presented for illustrative purposes only. Moreover, those of ordinary skill in the art will similarly recognize that the steps may be performed in an alternate order to achieve the same result.
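The packing steps above can be sketched for a single block as follows. The sketch is illustrative and rests on several assumptions not confirmed by the description, since FIG. 5 and FIG. 6 are not reproduced here: each entry is taken to be 18 bits with the 6-bit index in the low bits and the coefficient as a 12-bit two's complement value above it; entries are concatenated starting at the least significant payload bit; and coefficient k uses terminator slot (k // 5) * 6 + (k % 5) + 1, where slots are numbered from bit 30 of the first word. The function name pack_block is hypothetical.

```python
INDEX_BITS = 6
COEFF_BITS = 12
ENTRY_BITS = INDEX_BITS + COEFF_BITS   # 18 bits per non-zero coefficient
PAYLOAD_BITS = 30                      # 32-bit word minus two terminator bits


def pack_block(pairs):
    """Pack (index, coefficient) pairs of one 8x8 block into 32-bit words."""
    if not pairs:
        raise ValueError("this sketch needs at least one coefficient")
    stream = 0            # all entries concatenated, first entry lowest
    terminators = {}      # word number -> two terminator bits for that word
    for k, (index, coeff) in enumerate(pairs):
        if not 0 <= index < 64:
            raise ValueError("index must fit in 6 bits")
        if not -2048 <= coeff <= 2047:
            raise ValueError("coefficient must fit in 12 bits")
        entry = ((coeff & 0xFFF) << INDEX_BITS) | index
        stream |= entry << (k * ENTRY_BITS)
        last = 1 if k == len(pairs) - 1 else 0    # 1 only on the final entry
        slot = (k // 5) * 6 + (k % 5) + 1         # assumed slot assignment
        word_no, bit = divmod(slot, 2)
        terminators[word_no] = terminators.get(word_no, 0) | (last << bit)
    total_bits = len(pairs) * ENTRY_BITS
    n_words = -(-total_bits // PAYLOAD_BITS)      # ceiling division
    words = []
    for w in range(n_words):
        payload = (stream >> (w * PAYLOAD_BITS)) & ((1 << PAYLOAD_BITS) - 1)
        words.append((terminators.get(w, 0) << PAYLOAD_BITS) | payload)
    return words
```

Under the assumed slot assignment, the terminator bit for a coefficient always falls in the word that holds the final bits of that coefficient, and the first slot of every three-word group is left unused.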
Referring now to FIG. 6, terminator bit positions 82-84 are presented for blocks having one 86, two 88, three 90, four 92, five 94, six 96, and seven 98 coefficients. As shown, each index 100 and coefficient 102 are stored across 32 bit words 104. The terminator bit used for each instruction word repeats every 3 instruction words as shown. An n coefficient case, where n is greater than seven, is similar to an n−5 coefficient case. As a result, the IDCT coefficients are packed in a manner to minimize bus traffic and reduce the command queue size.
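The three-word repetition and the n−5 relationship are consistent with simple arithmetic on the field widths given in this description: a 12-bit coefficient, a 6-bit index (described with regard to FIG. 7), and two terminator bits per 32-bit word. The short calculation below is offered as one way to read those figures, not as a statement of how FIG. 6 was derived; the variable names are illustrative.

```python
from math import gcd

ENTRY_BITS = 6 + 12       # index plus coefficient
PAYLOAD_BITS = 32 - 2     # word width minus two terminator bits

# Smallest number of bits that is a whole number of entries and of payloads.
period_bits = ENTRY_BITS * PAYLOAD_BITS // gcd(ENTRY_BITS, PAYLOAD_BITS)

coeffs_per_period = period_bits // ENTRY_BITS     # entries per period
words_per_period = period_bits // PAYLOAD_BITS    # 32-bit words per period
terminator_slots = 2 * words_per_period           # terminator bits available
unused_slots = terminator_slots - coeffs_per_period
```

With these widths the layout realigns after 90 payload bits, that is, five coefficients in three words, leaving one of the six terminator bits in each period without a coefficient to mark.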
Referring now to FIG. 7, possible index values within an 8×8 block according to a presently preferred embodiment of the present invention are presented. According to the presently preferred embodiment of the present invention, the index comprises 6 bits, since the index comprises a binary number between 0 and 63 indicating a pixel position 106 within an 8×8 block. However, index 0 may not correspond to location 0. For example, index 0 108 may correspond to coefficient 8, and index 1 110 may correspond to coefficient 18, as shown.
According to a third aspect of the present invention, a method for allowing panning and shifting of a display window within a reference picture is provided. In this manner, a portion of a reference picture may be processed. Referring now to FIG. 8, a method for processing a display window within a reference picture is presented. First, a reference picture is designated at step 112. According to a presently preferred embodiment of the present invention, a reference picture size defining the reference picture is transmitted. According to a presently preferred embodiment of the present invention, the reference picture size includes a horizontal reference picture size in macroblocks and a vertical reference picture size in macroblocks. The reference picture size is then used by the motion compensation block. Second, a display picture is selected at step 114. According to a presently preferred embodiment of the present invention, a user may specify a display picture through the use of a mouse or other equivalent means for selecting a display picture. Third, a display picture size defining the display picture is transmitted at step 116. Similarly, the display picture size includes a horizontal and vertical display picture size, both designated in macroblocks. The horizontal and vertical display picture size may then be used by the motion compensation and the output reformatter blocks. Fourth, at step 118, a display picture offset defining a location of the display picture within the reference picture is transmitted to the motion compensation block. According to a presently preferred embodiment of the present invention, the display picture offset comprises delta x and delta y. Therefore, the display picture size may be less than the reference picture size. Moreover, the display picture offset provides a means for panning, or shifting, the display window within the reference picture. 
All non-displayable macroblocks may then be stripped from the video stream prior to writing the instructions to the command queue. Thus, this provides greater flexibility than systems limiting the display picture size to that of the reference picture. Furthermore, only the display picture data rather than the reference picture data must be processed at step 120, resulting in a more efficient decoder. Moreover, this is particularly important in systems with limited memory. Those of ordinary skill in the art will readily recognize that the above steps are presented for illustrative purposes only. Moreover, those of ordinary skill in the art will similarly recognize that the steps may be interchanged to achieve the same result. According to a preferred embodiment, the above described methods may be implemented in software or firmware, as well as in programmable gate array devices, ASIC and other hardware.
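The selection of displayable macroblocks from the reference picture size, display picture size, and display picture offset can be sketched as follows. The sketch assumes that delta x and delta y are expressed in macroblocks and measured from the top left corner of the reference picture, neither of which is stated in the description; the function names are hypothetical.

```python
def displayable_macroblocks(ref_w, ref_h, disp_w, disp_h, dx, dy):
    """List (x, y) macroblock coordinates inside the display window.

    All sizes and offsets are in macroblocks (the offset unit is assumed).
    """
    if dx < 0 or dy < 0 or dx + disp_w > ref_w or dy + disp_h > ref_h:
        raise ValueError("display window must lie within reference picture")
    return [(x, y)
            for y in range(dy, dy + disp_h)
            for x in range(dx, dx + disp_w)]


def strip_non_displayable(macroblocks, window):
    """Keep only macroblocks whose (x, y) key lies in the display window."""
    keep = set(window)
    return {pos: data for pos, data in macroblocks.items() if pos in keep}
```

For a reference picture of 6 by 4 macroblocks and a display picture of 3 by 2 macroblocks at offset (2, 1), only 6 of the 24 macroblocks remain to be written to the command queue.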
While embodiments and applications of this invention have been shown and described, it would be apparent to those skilled in the art that many more modifications than mentioned above are possible without departing from the inventive concepts herein. The invention, therefore, is not to be restricted except in the spirit of the appended claims.
What is claimed is:
1. A method for selectively controlling at least one hardware device within a video decoder, the method comprising: performing machine-executable instructions for VLC decoding of a video stream; performing machine-executable instructions for encoding a plurality of instructions to control at least one hardware device within the video decoder, the plurality of instructions being encoded within the video stream; decoding in a hardware command queue manager each one of the plurality of instructions; and controlling at least one hardware device within the video decoder in response to the plurality of instructions.
2. The method according to claim 1, further including storing the plurality of instructions in a command queue.
3. The method according to claim 1, wherein the controlling includes optionally performing an inverse discrete cosine transform in response to the plurality of instructions.