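Course No. | Course Title | Class | Time and Location |
 | Information Retrieval | Master's Program, Department of Computer Science and Information Engineering | Monday, period 5 |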
Mining the Web: Discovering Knowledge from Hypertext Data
Soumen Chakrabarti
ISBN: 1-55860-754-4
林典鍵、蘇辰豫
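Information Retrieval
Information retrieval (IR) techniques, which collect and process documents, are the foundation for adding value to information and extracting knowledge.
Related topics:
Fetching documents (web spiders)
Parsing documents (text parsing)
Representing document information (representation models)
Indexing documents
Search engines
Automatic document classification
Information extraction and text mining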
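As a rough illustration of how parsing, representation, indexing, and search fit together, here is a minimal sketch of a vector space model with TF-IDF weights and cosine ranking. It is not taken from the course materials: the whitespace tokenizer, the unsmoothed `log(N/df)` weighting, the function names, and the three sample documents are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real text parsing)."""
    return text.lower().split()


def build_index(docs):
    """Return (tfidf_vectors, idf) for a list of document strings."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    # Assumed weighting: idf = log(N / df), with no smoothing.
    idf = {term: math.log(n_docs / count) for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, vectors, idf):
    """Return indices of matching documents, best match first."""
    counts = Counter(tokenize(query))
    # Query terms that never occur in the collection are ignored.
    q = {term: tf * idf[term] for term, tf in counts.items() if term in idf}
    scored = [(cosine(q, vec), i) for i, vec in enumerate(vectors)]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in ranked if score > 0.0]


# Hypothetical three-document collection.
docs = [
    "web spider crawls pages",
    "index documents for search",
    "classify documents automatically",
]
vectors, idf = build_index(docs)
print(search("search index", vectors, idf))
```

A real system would replace the tokenizer with proper parsing (and word segmentation for Chinese text) and store the vectors in an inverted index rather than scanning every document per query.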
Week | Date | Notes | Topics | Presenter |
1 | | vacation | | |
2 | 3/5 | | Introduction, VSM, TF-IDF | |
3 | 3/12 | | VSM | |
4 | 3/19 | | NTCIR | 蘇辰豫 |
5 | 3/26 | | Crawler | 1. 周仁亮, 2. 王峻國 |
6 | 4/2 | holiday | | |
7 | 4/9 | | Recall, Precision | 3. 洪大弘, 4. 謝旻樺 |
8 | 4/16 | | Relevance Feedback | 6. 劉勇成, 9. 林典鍵 |
9 | 4/23 | midterm | Clustering, LSI | 7. 黃雅紫, 8. 張昊, 5. 呂啟銓 |
10 | 4/30 | | midterm | |
11 | 5/7 | | SOM, EM algorithm | |
12 | 5/14 | | Supervised learning | 1. 洪大弘, 2. 王峻國 |
13 | 5/21 | | NTCIR, QA | 3. 周仁亮, 4. 謝旻樺, 劉勇成 |
14 | 5/28 | | Supervised learning | 5. 呂啟銓 |
15 | 6/4 | | NLP, Chinese language processing | 7. 黃雅紫, 8. 張昊, 李郁德 |
16 | 6/11 | | Social network, PageRank | Term project demo |
17 | 6/18 | | | Term project demo |
18 | 6/25 | | | Term project demo |
Homework #1: Do exercises No. 1-3 (minimum requirement); No. 4-5 are optional. (Due 4/9)
Homework #2: Present a plan for determining how many web pages are available under *.cyut.edu.tw. (Due 4/30)
Homework #3: Run the 50 TREC-6 queries on TREC Disk 4 and show your results. (Due 6/4)
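Results for Homework #3 can be summarized with the recall and precision measures covered in Week 7. The sketch below is only an illustration and not the official TREC evaluation procedure: the document IDs are made up, and it assumes each query's ranked result list contains no duplicate IDs and that its set of relevant documents is available in memory.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for one query."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant document appears.

    Relevant documents that are never retrieved contribute zero,
    because the sum is divided by the total number of relevant documents.
    """
    relevant_set = set(relevant)
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant_set:
            hits += 1
            total += hits / rank
    return total / len(relevant_set) if relevant_set else 0.0


# Hypothetical ranked output and relevance judgments for a single query.
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2", "d5"}

p, r = precision_recall(ranked, relevant)
ap = average_precision(ranked, relevant)
print(f"precision={p:.3f} recall={r:.3f} average_precision={ap:.3f}")
```

Averaging `average_precision` over all 50 queries gives a single mean average precision figure for comparing different runs.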
First presentation: the latest reference on a topic in IR.
Second presentation: the reference most closely related to your term project.
Term project: an IR system that can be demoed!
Last update: 2007/07/09
shwu@cyut.edu.tw