NTCIR-8 Patent Mining Task CFP

	NTCIR-8 Patent Mining Task CALL FOR PARTICIPATION

Task Overview

The purpose of the Patent Mining Task is to create technical trend maps from a set of research papers and patents. Figure 1 shows an example of a technical trend map that is created from research papers and patents. In this map, research papers and patents are classified in terms of elemental technologies and their effects.

	Effect 1	Effect 2	Effect 3
Technology 1	[AAA 1993] [US Pat. XX/XXXX]		[BBB 2002]
Technology 2	[CCC 2000]
Technology 3		[US Pat. YY/YYYY]	[US Pat. ZZ/ZZZZ] [USP WW/WWWW]

Figure 1 An example of a technical trend map created from a set of research papers and patents
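
As a minimal sketch of the structure shown in Figure 1, the map can be modeled as a grid keyed by (technology, effect) pairs. The function names and the (document, technology, effect) tuple layout below are assumptions made for illustration, not part of the task definition.

```python
from collections import defaultdict


def build_trend_map(records):
    """Group document labels by (technology, effect) cell.

    `records` is an iterable of (document, technology, effect) tuples.
    This layout is a hypothetical input format chosen for illustration.
    """
    cells = defaultdict(list)
    for document, technology, effect in records:
        cells[(technology, effect)].append(document)
    return dict(cells)


def render_trend_map(cells):
    """Render the grid as tab-separated text, one row per technology."""
    technologies = sorted({tech for tech, _ in cells})
    effects = sorted({eff for _, eff in cells})
    lines = ["\t" + "\t".join(effects)]
    for tech in technologies:
        row = [
            " ".join(f"[{doc}]" for doc in cells.get((tech, eff), []))
            for eff in effects
        ]
        lines.append(tech + "\t" + "\t".join(row))
    return "\n".join(lines)


# The placeholder entries of Figure 1, expressed as input records.
FIGURE_1_RECORDS = [
    ("AAA 1993", "Technology 1", "Effect 1"),
    ("US Pat. XX/XXXX", "Technology 1", "Effect 1"),
    ("BBB 2002", "Technology 1", "Effect 3"),
    ("CCC 2000", "Technology 2", "Effect 1"),
    ("US Pat. YY/YYYY", "Technology 3", "Effect 2"),
    ("US Pat. ZZ/ZZZZ", "Technology 3", "Effect 3"),
    ("USP WW/WWWW", "Technology 3", "Effect 3"),
]

if __name__ == "__main__":
    print(render_trend_map(build_trend_map(FIGURE_1_RECORDS)))
```

Keying the grid by pairs keeps empty cells implicit, so sparse maps such as the one in Figure 1 need no special handling.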

To create technical trend maps, the following two steps are required.

(Step 1) For a given field, research papers and patents written in various languages are collected.
(Step 2) Elemental technologies and their effects are extracted from the documents collected in Step 1.

For each of these steps, we will conduct the following two subtasks.

Research Papers Classification
Technical Trend Map Creation (a pilot subtask)

In the following, we describe the details of these subtasks.

Subtask of Research Papers Classification

In this subtask, each system is requested to classify research papers written in either Japanese or English in terms of the International Patent Classification (IPC) system. The IPC is a global standard hierarchical patent classification system, containing more than 50,000 classes at the most detailed level. The goal of this subtask is to assign one or more of these classes to a given topic, which is expressed as the title and/or abstract of a research paper. This subtask is almost the same as the Patent Mining Task in NTCIR-7 [Nanba, 2008a][Nanba, 2008b]. In NTCIR-7, the following subtasks were conducted.

Japanese subtask: classification of Japanese research papers using patent data written in Japanese.
English subtask: classification of English research papers using patent data written in English.
Cross-lingual subtask (J2E): classification of Japanese research papers using patent data written in English.
Cross-lingual subtask (E2J): classification of English research papers using patent data written in Japanese.

In the same way, the Japanese, English, and Cross-lingual subtasks will be conducted in NTCIR-8.
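
To illustrate what "hierarchical" means for the class labels in this subtask, the following sketch splits a full IPC group symbol into its levels (section, class, subclass, main group, subgroup). The function name, the regular expression, and the example symbol are assumptions chosen for illustration. The task's actual label format is not specified here.

```python
import re

# Assumed surface form: section letter, two-digit class, subclass letter,
# then "main group/subgroup", e.g. "H01L 21/027".
IPC_RE = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def ipc_levels(symbol):
    """Return the hierarchy levels encoded in one full IPC group symbol."""
    match = IPC_RE.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a full IPC group symbol: {symbol!r}")
    section, cls, subclass, main, sub = match.groups()
    subclass_code = section + cls + subclass
    return {
        "section": section,
        "class": section + cls,
        "subclass": subclass_code,
        # By IPC convention a main group is written with "/00".
        "main_group": f"{subclass_code} {main}/00",
        "subgroup": f"{subclass_code} {main}/{sub}",
    }


if __name__ == "__main__":
    for level, code in ipc_levels("H01L 21/027").items():
        print(level, code)
```

A classifier can use these levels to score a prediction at a coarser granularity, for example by counting a match at the subclass level when the subgroup is wrong.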

Subtask of Technical Trend Map Creation

In this subtask, each system is requested to extract expressions of elemental technologies and their effects from research papers and patents. Figure 2 shows an example of the data for this subtask. In this example, an elemental technology and its effect in the text are annotated with "Technology" and "Effect" tags. Inside the "Effect" tag, "Value" and "Attribution" tags are also annotated. Because this is a new subtask whose design still needs discussion, we will conduct it as a pilot subtask. The following subtasks will be conducted.

Japanese subtask: extraction of technologies and their effects from Japanese documents.
English subtask: extraction of technologies and their effects from English documents.

[Japanese]
PM磁束制御用コイルを設けて<Technology>閉ループフィードバック制御</Technology>を施すため、<Effect><Attribution>電力損失</Attribution>を<Value>最小化</Value></Effect>できる。
[English]
Through <Technology>closed-loop feedback control</Technology>, the system could <Effect><Value>minimize</Value> the <Attribution>power loss</Attribution></Effect>.

Figure 2 An example of the data for the Subtask of Technical Trend Map Creation

Figure 3 is another example, in which "Value" is a numerical expression.

[English]
<Technology>CRF-based approach</Technology> obtained a <Effect><Attribute>precision</Attribute> of <Value>0.935</Value></Effect>.

Figure 3 Another example for the Subtask of Technical Trend Map Creation
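
Annotations like those in Figures 2 and 3 can be read back into structured records with a few regular expressions. The sketch below assumes well-formed, non-nested "Technology" and "Effect" tags and accepts both the "Attribution" and "Attribute" spellings that appear in the figures. The function name and the output layout are hypothetical.

```python
import re

TECHNOLOGY_RE = re.compile(r"<Technology>(.*?)</Technology>", re.S)
EFFECT_RE = re.compile(r"<Effect>(.*?)</Effect>", re.S)
# The backreference \1 requires the closing tag to match the opening tag.
INNER_RE = re.compile(r"<(Value|Attribution|Attribute)>(.*?)</\1>", re.S)


def extract_annotations(text):
    """Return (technologies, effects) found in one annotated sentence.

    Each effect is a dict with optional "attribute" and "value" keys.
    """
    technologies = [t.strip() for t in TECHNOLOGY_RE.findall(text)]
    effects = []
    for body in EFFECT_RE.findall(text):
        fields = {}
        for tag, content in INNER_RE.findall(body):
            key = "value" if tag == "Value" else "attribute"
            fields[key] = content.strip()
        effects.append(fields)
    return technologies, effects


if __name__ == "__main__":
    sentence = (
        "<Technology>CRF-based approach</Technology> obtained a "
        "<Effect><Attribute>precision</Attribute> of "
        "<Value>0.935</Value></Effect>."
    )
    print(extract_annotations(sentence))
```

Regular expressions are enough for flat annotations of this kind. Nested or overlapping tags would call for a real parser.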

If precision scores are extracted from a set of technical documents in a specific field, such as morphological analysis or syntactic analysis, and plotted on a graph, we can see how these scores change over time. Figure 4 shows an example of output generated from a set of research papers in the field of morphological analysis. In this graph, the x-axis and y-axis indicate the publication year of each paper and the precision scores of morphological analysis systems, respectively. This can be regarded as a kind of trend-information summarization, which was conducted as the MuST Task [Kato et al., 2008] in NTCIR-7.

Figure 4 An example of output for the Subtask of Technical Trend Map Creation (changes of precision scores in Japanese morphological analysis)
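
A graph like Figure 4 needs one data series of scores ordered by publication year. The sketch below assumes that each extracted value has already been paired with the publication year of its source paper. It keeps the best score per year, which is only one possible choice of series. The function name is hypothetical and the sample numbers are invented placeholders, not results from any paper.

```python
from collections import defaultdict


def best_score_per_year(observations):
    """Reduce (year, score) pairs to the highest score for each year.

    Returns a list of (year, score) pairs sorted by year, ready to be
    passed to a plotting library as x and y values.
    """
    by_year = defaultdict(list)
    for year, score in observations:
        by_year[year].append(score)
    return [(year, max(scores)) for year, scores in sorted(by_year.items())]


if __name__ == "__main__":
    # Invented placeholder values, for illustration only.
    sample = [(2001, 0.90), (2000, 0.85), (2001, 0.93)]
    for year, score in best_score_per_year(sample):
        print(year, score)
```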

Background and objectives

For a researcher in a field with high industrial relevance, retrieving research papers and patents has become an important aspect of assessing the scope of the field. Examples of these fields are bioscience, medical science, computer science, and materials science. In fact, the development of an information retrieval system of research papers and patents for academic researchers is central to the Intellectual Property Strategic Programs for 2006, 2007, and 2008 of the Intellectual Property Strategy Headquarters in the Cabinet Office, Japan.

In addition, research paper searches and patent searches are required by examiners in government Patent Offices, and by the intellectual property divisions of private companies. An example is the execution of an invalidity search among existing patents or research papers, which could invalidate a rival company's patents or patents under application in a Patent Office.

However, to widen the scope of their claims, patents often use terms that are more abstract or creative than those used in research papers, which makes it difficult to search across the two types of documents. Therefore, the Patent Mining Task aims to develop fundamental techniques for retrieving, classifying, and analyzing both research papers and patents.

In previous NTCIR Workshops, Patent Classification Subtasks were conducted [Iwayama, 2005][Iwayama, 2007]. In these subtasks, participants were asked to classify Japanese patent applications in terms of the File Forming Term (F-term) system, which is a classification system for Japanese patent documents. To cover the classification of research papers in addition to patents, we conducted the Patent Mining Task in NTCIR-7 [Nanba, 2008a][Nanba, 2008b]. The aim of the Patent Mining Task in NTCIR-7 was the classification of research papers written in either Japanese or English in terms of the International Patent Classification (IPC) system. In NTCIR-8, we will continue this task. In addition, we will start a new subtask of technical trend map creation.

Data used

Unexamined Japanese patent applications published in 1993-2002
USPTO patent data published in 1993-2002
Patent Abstracts of Japan (English translations for Japio patent abstracts) in 1993-2002
NTCIR-1, 2 CLIR task Test Collection (Abstracts of research papers)

Application Procedure

The application form is available at this page.

(to appear)

Schedule

Subtask of Research Papers Classification

Preparation	Call for participation	2009.5.15
	Data release	As needed.
Dry run	Topic release	2009.10.16 (changed from 2009.10.15)
	Submission deadline	2009.11.16 (changed from 2009.11.15)
	Evaluation release	2009.11.22
Formal run	Topic release	2009.12.22
	Submission deadline	2010.01.22
	Evaluation release	2010.01.29
Preparation for meeting	Paper for the proceedings due	2010.04.01
	Camera-ready paper for the proceedings due	2010.05.15
Workshop meeting at Tokyo		2010.6.15-18

Subtask of Technical Trend Map Creation

Preparation	Call for participation	2009.5.15
	Data release (research papers)	2009.10.01 ⁽¹⁾
	Data release (patents)	2009.10.15/31 ⁽²⁾
Dry run	Topic release	2009.11.15
	Submission deadline	2009.12.15
	Evaluation release	2009.12.22
Formal run	Topic release	2010.01.22
	Submission deadline	2010.02.22
	Evaluation release	2010.03.01
Preparation for meeting	Paper for the proceedings due	2010.04.01
	Camera-ready paper for the proceedings due	2010.05.15
Workshop meeting at Tokyo		2010.6.15-18

(1) Training data: 250 research papers with manually annotated tags (English/Japanese).
(2) Training data: 250 patents with manually annotated tags (English/Japanese). 100 patents will be released on Oct. 15, and the remaining 150 on Oct. 31.

Changes in schedule will be announced via this page and the mailing list for participants.

References

[Iwayama, 2005] Iwayama, M., Fujii, A., and Kando, N. 2005. Overview of Classification Subtask at NTCIR-5 Patent Retrieval Task. Proceedings of the Fifth NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, Question Answering and Cross-Lingual Information Access.
[Iwayama, 2007] Iwayama, M., Fujii, A., and Kando, N. 2007. Overview of Classification Subtask at NTCIR-6 Patent Retrieval Task. Proceedings of the 6th NTCIR Workshop Meeting.
[Kando, 1999] Kando, N., Kuriyama, K., Nozue, T., Eguchi, K., Kato, H., and Hidaka, S. 1999. Overview of IR Tasks at the First NTCIR Workshop. Proceedings of the 1st NTCIR Workshop on Research in Japanese Text Retrieval and Term Recognition, pp.11-44.
[Kando, 2001] Kando, N., Kuriyama, K., and Yoshioka, M. 2001. Overview of Japanese and English Information Retrieval Tasks (JEIR) at the Second NTCIR Workshop. Proceedings of the 2nd NTCIR Workshop Meeting on Evaluation of Chinese & Japanese Text Retrieval and Text Summarization, pp.4-37 - 4-60.
[Nanba, 2008a] Nanba, H., Fujii, A., Iwayama, M., and Hashimoto, T. 2008. Overview of the Patent Mining Task at the NTCIR-7 Workshop. Proceedings of the 7th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, Question Answering and Cross-lingual Information Access, pp.325-332.
[Nanba, 2008b] Nanba, H., Fujii, A., Iwayama, M., and Hashimoto, T. 2008. The Patent Mining Task in the Seventh NTCIR Workshop. Proceedings of the 1st International CIKM Workshop on Patent Information Retrieval (PaIR'08), pp.25-31.
[Nanba, 2008c] Nanba, H., Anzen, N., and Okumura, M. 2008. Automatic Extraction of Citation Information in Japanese Patent Applications. International Journal on Digital Libraries, Vol.9, No.2, pp.151-161.

Organizers

Hidetsugu Nanba (Hiroshima City University)
Taiichi Hashimoto (Tokyo Institute of Technology)
Atsushi Fujii (University of Tsukuba)
Makoto Iwayama (Tokyo Institute of Technology / Hitachi Ltd.)

Contact

Last modified on May 15, 2009
