Babes-Bolyai University of Cluj-Napoca
Faculty of Mathematics and Computer Science
Study Cycle: Master

SUBJECT

Code: MII1005
Subject: Knowledge Based Systems and Language Technology

Section: Intelligent Systems - in English
Semester: 2
Hours (C+S+L): 2+1+1
Category: speciality
Type: compulsory
Teaching Staff in Charge
Prof. TATAR Doina, Ph.D., dtatar@cs.ubbcluj.ro
Aims

Language technology (LT) is a subfield of artificial intelligence that studies the quantitative aspects of natural human languages. The main tools used in LT applications are Knowledge Based Systems (KBS), such as ontologies, thesauri, corpora and electronic dictionaries. The course aims to present the basic principles, technologies and applications of LT and KBS. Web search optimization, natural language interfaces and recent developments in text mining are only some of the motivations for studying LT and KBS. A further goal of the course is a deep understanding of the current state of the art in LT, as a basis for carrying out original research in the field.
Content
1. Language Technology: Stages, Domains, Chapters.
2. The corpus as a KBS and as a tool for various applications in LT. The most important tags for an annotated corpus. Using the corpus in a supervised method.
3. WordNet as a KBS. Relations and synsets. Examples of applications.
4. Word Sense Disambiguation (WSD): machine learning approaches (supervised and unsupervised), dictionary-based approach. Where WSD is needed.
5. Statistical natural language processing and LT: Markov chains, Hidden Markov Model. Evaluation, Estimation and Training with HMM. Applications.
6. Probabilistic Context-free Grammars. Syntactic analysis: active charts.
7. Feature structures (FS): FS as graphs, AVM and descriptors. Unification of feature structures.
8. Unification grammars. Parsing with Unification grammars.


References
1. J. ALLEN: Natural Language Understanding, Benjamin/Cummings Publ., 2nd ed., 1995.
2. E. CHARNIAK: Statistical Language Learning, MIT Press, 1996.
3. B. CARPENTER: ALE: The Attribute Logic Engine. User's Guide, Carnegie Mellon University, 1994.
4. H. HELBIG: Knowledge Representation and the Semantics of Natural Language, Springer, 2006.
5. D. JURAFSKY, J. MARTIN: Speech and Language Processing, Prentice Hall, 2000.
6. C. MANNING, H. SCHÜTZE: Foundations of Statistical Natural Language Processing, MIT Press, 1999.
7. R. MITKOV (ed.): The Oxford Handbook of Computational Linguistics, Oxford University Press, 2003.
8. S. J. RUSSELL, P. NORVIG: Artificial Intelligence: A Modern Approach, Prentice-Hall International, 1995.
9. D. TATAR: Inteligenta artificiala: demonstrare automata de teoreme, prelucrarea limbajului natural, Editura Albastra, Microinformatica, 2001.
ra Academiei, 2000, pg 289-300.
10. D. TATAR: Inteligenta artificiala. Aplicatii in prelucrarea limbajului natural, Editura Albastra, Microinformatica, 2003, ISBN 973-650-100-0.
Assessment
The final grade will have the following components:

(1) Project (a tool implementing an LT or KBS technique): 30%

(2) Research based on recent papers (at least 2): 30%

(3) Final examination: 40%