ICICES2014 - S.A. Engineering College, Chennai, Tamil Nadu, India

Multimedia Questions and Answering using Web Data Mining
Bavarva Bhaskar D.
Computer Engineering Department,
Parul Institute of Engineering and Technology, Limda,
Waghodia, Vadodara, India.
bavarva.bhaskar@gmail.com

Asst. Prof. Dheeraj Kumar Singh
Information Technology Department,
Parul Institute of Engineering and Technology, Limda,
Waghodia, Vadodara, India.
dhirajsingh66@gmail.com

Abstract---Web search engines such as Google and Yahoo return a
large amount of information, and users become overloaded when
trying to find the correct information on the web. The Question
Answering (QA) approach addresses this problem. Instead of
returning a list of documents, as current search engines do, a QA
system provides information from a comprehensive set of
well-answered questions with appropriate media data. QA aims to
leverage in-depth linguistic and media content analysis as well as
domain knowledge to return precise answers to natural language
questions. This paper briefly surveys the progress of multimedia
question answering (MMQA) research and details its future
directions.
Keywords---question answering

I. INTRODUCTION
The amount of information on the web has increased year after
year, with content covering almost any topic. As a result, users
looking for information become overloaded when trying to find the
correct information with current search engines. Users usually
have to painstakingly browse through a long list of results to
look for a precise answer [1]. Question answering systems address
this problem: they spare users from browsing the vast quantity of
information returned by search engines to find the correct
answers.
QA has so far focused only on textual data. It is therefore time
to extend the concept from text to multimedia data. Multimedia
data help users quickly understand the content of information.
Multimedia question answering (MMQA) provides textual answers
along with media (image and video) appropriate to the question.
Multimedia answers are more helpful for some questions, such as
"What are the steps to download a Firefox browser?" For this type
of question, users understand the answer quickly if videos are
available. Textual answers alone convey less information, and
users must make a greater effort to understand them. MMQA provides
the best answers, which combine text and other media.
So far, few works have addressed MMQA. Hui Yang and colleagues
presented VideoQA, a QA system for news video [2]. Its
architecture is similar to that of text QA, with video content
analysis added to support personalized news video retrieval.
Several video QA systems have been proposed, and most of them rely
on text transcripts derived from video OCR (Optical Character
Recognition) and ASR (Automatic Speech Recognition) outputs
[2]-[6].
Tom Yeh, John J. Lee and Trevor Darrell were the first to present
image-based QA [7]. They describe photo-based question answering
as a useful way of finding information about physical objects, and
they developed a three-layer system architecture for it. The
first, a template-based QA layer, matches a query photo to online
images and extracts structured data from multimedia databases to
answer questions about the photo. To simplify image matching, it
exploits the question text to filter images based on categories
and keywords. The second, an information retrieval QA layer,
searches an internal repository of resolved photo-based questions
to retrieve relevant answers. The third, a human computation QA
layer, leverages community experts to handle the most difficult
cases.
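The layered design described above can be pictured as a fallback chain. The following is only a minimal sketch: it assumes the layers are tried in order, it uses text questions in place of photos, and every function name and lookup table in it is hypothetical rather than taken from the system in [7].

```python
from typing import Callable, Optional, Sequence

# A layer takes a question and returns an answer, or None when it cannot help.
Layer = Callable[[str], Optional[str]]


def answer_with_fallback(question: str, layers: Sequence[Layer]) -> Optional[str]:
    """Try each QA layer in order and return the first answer found."""
    for layer in layers:
        answer = layer(question)
        if answer is not None:
            return answer
    return None


# Hypothetical stand-ins for the three layers (placeholders, not the real system).
def template_layer(question: str) -> Optional[str]:
    known = {"what building is this?": "structured data from a matched image"}
    return known.get(question.lower())


def retrieval_layer(question: str) -> Optional[str]:
    resolved = {"who painted this?": "answer reused from a resolved question"}
    return resolved.get(question.lower())


def human_layer(question: str) -> Optional[str]:
    # The last layer always responds, standing in for community experts.
    return "forwarded to community experts"


LAYERS = [template_layer, retrieval_layer, human_layer]
```

Calling `answer_with_fallback(question, LAYERS)` only reaches the human layer when the two automatic layers return nothing, which mirrors the idea of reserving experts for the hardest cases.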
This paper briefly surveys the progress of multimedia
questions answering (MMQA) research and details its future
directions.
The remainder of the paper is organized as follows. Section II
briefly reviews the related work. Section III introduces the
aspects of MMQA. Section IV presents the conclusion.

ISBN No. 978-1-4799-3834-6/14/$31.00 ©2014 IEEE

II. RELATED WORK

A. From Text to Multimedia


The early investigation of QA systems started in the 1960s and
mainly focused on expert systems in specific domains. Text-based
QA has gained research popularity since the establishment of the
QA track in TREC in the late 1990s [8]. Based on the type of
questions and expected answers, QA can be roughly categorized into
factoid QA, open-domain QA [9], restricted-domain QA [10],
definitional QA [11], list QA [12] and, more recently, how-to,
why, opinion and analysis QA. Factoid QA returns factual tidbits
of information, such as names, locations and quantities, as
answers. For a list QA question such as "What is the most popular
city in India?", the system is expected to return one or more
precise city names. For a definitional QA question such as "Who is
Gandhi?", the system should return a set of answer sentences that
best describe the question topic [11].
Compared with traditional QA, MMQA offers many improvements. The
traditional QA framework consists of three main components:
document retrieval, question analysis and answer extraction. In
the document retrieval component, the documents related to the
question are fetched, and the most relevant segments are then
selected. After the question has been analyzed, the answer is
extracted from the related documents and displayed.
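To make the three components concrete, here is a deliberately small sketch. It assumes plain keyword overlap as the relevance measure, which is an illustrative simplification and not a method taken from any cited system; the function names and the stopword list are likewise hypothetical.

```python
import re

# Small illustrative stopword list (an assumption for this sketch).
STOPWORDS = {"what", "is", "the", "a", "an", "of", "who", "in", "to", "are"}


def words(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def analyze_question(question):
    """Question analysis: reduce the question to a set of content keywords."""
    return {w for w in words(question) if w not in STOPWORDS}


def retrieve_documents(keywords, documents, top_k=2):
    """Document retrieval: keep the top_k documents that share keywords."""
    def score(doc):
        return len(keywords & set(words(doc)))
    ranked = sorted(documents, key=score, reverse=True)
    return [doc for doc in ranked[:top_k] if score(doc) > 0]


def extract_answer(keywords, documents):
    """Answer extraction: return the sentence with the highest keyword overlap."""
    best, best_score = None, 0
    for doc in documents:
        for sentence in re.split(r"(?<=[.!?])\s+", doc):
            overlap = len(keywords & set(words(sentence)))
            if overlap > best_score:
                best, best_score = sentence, overlap
    return best


def answer(question, documents):
    """Run the three components in sequence."""
    keywords = analyze_question(question)
    return extract_answer(keywords, retrieve_documents(keywords, documents))
```

A real system would replace each function with far richer linguistic analysis; the point of the sketch is only the division of labour between the stages.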
Given that a vast amount of web content is non-textual media, it
is natural to extend text-based QA research to MMQA [13]. The
advantage of MMQA is that many questions are better explained with
the help of non-textual media. MMQA extends community question
answering (cQA) with media content (image and video). Liqiang Nie
and his colleagues proposed a method that enriches text answers
with image and video information [14]. Given a question, a textual
answer is obtained from community members, and media content is
retrieved from search engines and used to enrich the textual
answer. Image, video and audio QA aim to give precise images,
video clips or audio fragments as answers to users' questions.
Hang Cui, Min-Yen Kan and Tat-Seng Chua designed an early system
to address multimedia factoid QA that follows an architecture
similar to that of text-based QA, with video content analysis
performed at various stages of the QA pipeline to obtain precise
video answers [2]. Their work also includes a simple video
summarization process to provide the contextual aspects of the
answers.
Li et al. [1] proposed an approach that leverages YouTube video
collections as a source to automatically find videos that describe
cooking techniques. However, these approaches usually work only in
a specific domain. The multimedia question answering (MMQA) system
does not answer questions directly; instead, it enriches
community-contributed answers with multimedia content, combining
community question answering with media content. That is why MMQA
works in any domain.
Kacmarcik et al. [15] explored a non-text input mode for QA that
relies on specially annotated virtual photographs. To improve the
performance of text-based search, several machine learning
techniques that aim to automatically annotate media entities have
been proposed in the multimedia community [16]-[20].
III. ASPECTS OF MMQA

A. Determining User Intent
When users search for particular information, they might not have
a clear idea of what they actually want from their questions.
Ritendra Datta and his colleagues broadly characterized users as
browsers, surfers and searchers based on the clarity of their
intent [21]. A system therefore needs to help users express what
they actually want.
Alexander Kotov and ChengXiang Zhai proposed a framework for
question-guided search [22]. The system suggests questions to
users, so users see how well-formed questions are phrased. For
complex queries, it engages users and guides them toward their
actual goal. In this way, the system can also guide users toward
useful answers, because the answers to the suggested questions are
already known to exist. This in turn improves system performance.
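The idea of suggesting only questions that already have answers can be sketched as follows. This is an illustrative toy that assumes simple word overlap for matching; the question bank, its placeholder answers and the function names are hypothetical and do not come from the framework in [22].

```python
import re

# Hypothetical bank of questions whose answers are already known.
ANSWERED = {
    "How do I download the Firefox browser?": "placeholder answer A",
    "How do I update the Firefox browser?": "placeholder answer B",
    "How do I bake bread?": "placeholder answer C",
}


def tokens(text):
    """Lowercase a string and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def suggest_questions(query, answered=ANSWERED, limit=3):
    """Return answered questions that share words with the query, best first."""
    query_tokens = tokens(query)
    scored = []
    for question in answered:
        overlap = len(query_tokens & tokens(question))
        if overlap:
            scored.append((overlap, question))
    # Sort by overlap (descending), then alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [question for _, question in scored[:limit]]
```

Because every suggestion is a key of `ANSWERED`, any question the user picks is guaranteed to lead to a stored answer.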
B. Choosing the Proper Medium
Determining the best source and medium (image or video) to answer
a question is an important task. The scheme of Nie and his
colleagues enriches the best answers from cQA with multimedia
content [14]. It can be divided into three main components. The
first is answer medium selection: given a QA pair, it predicts
whether the textual answer should be enriched with media
information and which kind of media data should be added. Each QA
pair is categorized into one of four predefined classes: text
only, text and image, text and video, or text, image and video.
The scheme then automatically collects images, videos or a
combination of both to enrich the original textual answer. The
second is query generation for multimedia search: to collect
multimedia data, informative queries must be generated. This
component extracts three queries, from the question, the answer
and the QA pair, respectively. The third is multimedia data
selection and presentation: based on the generated queries, images
and videos are collected from search engines. Reranking and
duplicate removal are then performed to obtain a set of accurate
and representative images or videos to enrich the textual answers.
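The three components can be illustrated with a small sketch. It is a loose approximation under stated assumptions: the keyword cues stand in for the trained medium classifier of [14], query generation simply reuses the raw strings, and duplicates are detected by normalized title rather than by visual content. All names are hypothetical.

```python
import re


def select_medium(question):
    """Toy answer medium selection from keyword cues (not the cited classifier)."""
    q = question.lower()
    wants_video = any(cue in q for cue in ("how to", "how do", "steps"))
    wants_image = any(cue in q for cue in ("look like", "appearance", "symbol"))
    if wants_video and wants_image:
        return "text+image+video"
    if wants_video:
        return "text+video"
    if wants_image:
        return "text+image"
    return "text"


def generate_queries(question, answer):
    """Build three candidate queries: question, answer, and the QA pair."""
    return [question, answer, f"{question} {answer}"]


def remove_duplicates(titles):
    """Drop results whose normalized title was already seen, keeping rank order."""
    seen, unique = set(), []
    for title in titles:
        key = re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()
        if key not in seen:
            seen.add(key)
            unique.append(title)
    return unique
```

Keeping the first occurrence in `remove_duplicates` preserves whatever ranking the earlier reranking step produced.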

Samuel Huston and W. Bruce Croft examined query processing
techniques that can be applied to verbose queries prior to
submission to a search engine in order to improve the search
engine's results [23]. Giridhar Kumaran and Vitor R. Carvalho
presented techniques to shorten long queries by removing
extraneous terms [24].
C. Presenting Answers
Traditional systems present results as a list sorted by descending
relevance. In such a system, documents related to the question are
retrieved from the web, the top related documents are selected,
the question is analyzed, and an answer is presented accordingly.
Multimedia question answering (MMQA), in contrast, can use
semantic summarization to present an answer by summarizing the
potential answers retrieved from various sources (text, image,
audio, video, or a hybrid) at the semantic level.

IV. CONCLUSION
Multimedia QA is an emerging topic, so research and achievements
in this area are still preliminary. Many existing systems may fail
to generate an answer if the query is complex. The generated media
data also lack diversity. In the future, better algorithms are
needed for image selection and duplicate removal. Multimedia
diversification methods are required to make the enriched media
data more diverse.

V. ACKNOWLEDGEMENTS
I would like to express my deepest appreciation to Mr. Dheeraj
Kumar Singh, who has continuously guided me through my research
work.

REFERENCES
[1] G. Li, H. Li, Z. Ming, R. Hong, S. Tang, and T.-S. Chua, "Question
answering over community contributed web videos," IEEE Multimedia,
vol. 17, no. 4, pp. 46-57, 2010.
[2] H. Yang et al., "Structured Use of External Knowledge for Event-based
Open-Domain Question-Answering," Proc. 26th Ann. Int'l ACM SIGIR
Conf. Research and Development in Information Retrieval, ACM Press,
2003, pp. 33-40.
[3] J. Cao, F. Jay, and J. Nunamaker, "Question answering on lecture videos:
A multifaceted approach," in Proc. Int. Joint Conf. Digital Libraries,
2004.
[4] Y.-C. Wu, C.-H. Chang, and Y.-S. Lee, "Cross-Language Video
Questions/Answering System," in Proc. IEEE Int. Symp. Multimedia
Software Engineering, 2004, pp. 294-301.
[5] Y.-S. Lee, Y.-C. Wu, and J.-C. Yang, "Bvideoqa: Online English/Chinese
bilingual video question answering," Amer. Soc. Inf. Sci. Technol., vol.
60, no. 3, pp. 509-525, 2009.
[6] Y.-C. Wu and J.-C. Yang, "A robust passage retrieval algorithm for video
question answering," IEEE Trans. Circuits Syst. Video Technol., vol. 18,
no. 10, pp. 1411-1421, 2008.
[7] T. Yeh, J. J. Lee, and T. Darrell, "Photo-Based Question Answering,"
Proc. 16th ACM Int'l Conf. Multimedia, ACM Press, 2008, pp. 389-398.
[8] TREC: The Text Retrieval Conf. [Online]. Available: http://trec.nist.gov/.
[9] S. A. Quarteroni and S. Manandhar, "Designing an interactive open
domain question answering system," J. Natural Lang. Eng., vol. 15, no. 1,
pp. 73-95, 2008.
[10] D. Molla and J. L. Vicedo, "Question answering in restricted domains: An
overview," Computat. Linguist., vol. 13, no. 1, pp. 41-61, 2007.
[11] H. Cui, M.-Y. Kan, and T.-S. Chua, "Soft pattern matching models for
definitional question answering," ACM Trans. Inf. Syst., vol. 25, no. 2,
pp. 30-30, 2007.
[12] R. C. Wang, N. Schlaefer, W. W. Cohen, and E. Nyberg, "Automatic set
expansion for list question answering," in Proc. Int. Conf. Empirical
Methods in Natural Language Processing, 2008.
[13] R. Hong, M. Wang, G. Li, L. Nie, Z.-J. Zha, and T.-S. Chua, "Multimedia
Question Answering," IEEE, 2012.
[14] L. Nie et al., "Multimedia Answering: Enriching Text QA with Media
Information," Proc. 34th Int'l ACM SIGIR Conf. Research and
Development in Information Retrieval, ACM Press, 2011, pp. 695-704.
[15] G. Kacmarcik, "Multi-Modal Question-Answering: Questions Without
Keyboards," Asia Federation of Natural Language Processing, 2005.
[16] M. Wang, X. S. Hua, R. Hong, J. Tang, G. J. Qi, Y. Song, and L. R. Dai,
"Unified video annotation via multi-graph learning," IEEE Trans. Circuits
Syst. Video Technol., vol. 19, no. 5, pp. 733-749, 2009.
[17] M. Wang, X. S. Hua, T. Mei, R. Hong, G. J. Qi, Y. Song, and L. R. Dai,
"Semi-supervised kernel density estimation for video annotation,"
Comput. Vision Image Understand., vol. 113, no. 3, pp. 384-396, 2009.
[18] J. Tang, R. Hong, S. Yan, T. S. Chua, G. J. Qi, and R. Jain, "Image
annotation by KNN-sparse graph-based label propagation over
noisy-tagged web images," ACM Trans. Intell. Syst. Technol., vol. 2,
no. 2, pp. 1-15, 2011.
[19] J. Tang, X. S. Hua, M. Wang, Z. Gu, G. J. Qi, and X. Wu, "Correlative
linear neighborhood propagation for video annotation," IEEE Trans.
Syst., Man, Cybern. B, vol. 39, no. 2, pp. 409-416, 2009.
[20] Z.-J. Zha, X.-S. Hua, T. Mei, J. Wang, G. J. Qi, and Z. Wang, "Joint
multi-label multi-instance learning for image classification," in Proc.
IEEE Conf. Computer Vision and Pattern Recognition, 2008, pp. 1-8.
[21] R. Datta et al., "Image Retrieval: Ideas, Influence, and Trends of the
New Age," ACM Computing Surveys, vol. 40, no. 2, 2008, article no. 5.
[22] A. Kotov and C. Zhai, "Towards Natural Question Guided Search,"
Proc. 19th Int'l Conf. World Wide Web (WWW), ACM Press, 2010, pp.
541-550.
[23] S. Huston and W. B. Croft, "Evaluating Verbose Query Processing
Techniques," Proc. 33rd Int'l ACM SIGIR Conf. Research and
Development in Information Retrieval, ACM Press, 2010, pp. 291-298.
[24] G. Kumaran and V. R. Carvalho, "Reducing Long Queries Using Query
Quality Predictors," Proc. 32nd Int'l ACM SIGIR Conf. Research and
Development in Information Retrieval, ACM Press, 2009, pp. 541-571.
