MI282 | Natural Language Processing |
Teaching Staff in Charge |
Lect. LUPSA Dana, Ph.D., danacs.ubbcluj.ro |
Aims |
Natural language processing is now recognized as one of the most studied and active fields of Computer Science. The notion of the feature structure as a linguistic object underlies most of the recent approaches surveyed in this course. Optimizing Web search, natural language interfaces, and text mining are only some of the motivations for studying natural language processing. |
Content |
1. Introduction to Natural Language Processing: Stages, Domains, Chapters. Corpora.
2. Word Sense Disambiguation. Machine learning approach: supervised (NBC and k-NN) and unsupervised (by clustering). Dictionary-based approach (Lesk, Yarowsky, bilingual dictionaries).
3. Statistics in NLP: Markov chains, Hidden Markov Models. Evaluation, Estimation and Training with HMMs. The probability of input sequences, the most likely path. Applications to POS tagging.
4. Probabilistic Context-free Grammars. Syntactic analysis: active charts. Earley's algorithm.
5. Unification grammars. Feature structures as objects of linguistic knowledge representation. Feature structures as graphs, AVM and descriptors. Parsing with unification grammars. |
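As an illustration of topic 3 ("the most likely path" and its application to POS tagging), the following is a minimal sketch of the Viterbi algorithm in Python. It is not course material: the function name `viterbi`, the three-tag tagset and all probabilities are made-up assumptions chosen only to keep the example small.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely hidden state sequence."""
    # best[t][s] = (probability of the best path ending in state s at time t,
    #               previous state on that path)
    first = observations[0]
    best = [{s: (start_p[s] * emit_p[s].get(first, 0.0), None) for s in states}]

    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the predecessor that maximizes the path probability into s.
            prob, prev = max((best[-1][p][0] * trans_p[p][s], p) for p in states)
            column[s] = (prob * emit_p[s].get(obs, 0.0), prev)
        best.append(column)

    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return prob, path


# Toy model (all numbers are illustrative assumptions, not estimated from a corpus).
STATES = ["DET", "NOUN", "VERB"]
START_P = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
TRANS_P = {
    "DET": {"DET": 0.0, "NOUN": 0.9, "VERB": 0.1},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
EMIT_P = {
    "DET": {"the": 1.0},
    "NOUN": {"dog": 0.6, "barks": 0.4},
    "VERB": {"dog": 0.3, "barks": 0.7},
}

prob, path = viterbi(["the", "dog", "barks"], STATES, START_P, TRANS_P, EMIT_P)
print(path)  # ['DET', 'NOUN', 'VERB']
```

In a real tagger the transition and emission probabilities would be estimated from a tagged corpus, and log-probabilities would normally replace the raw products to avoid underflow on long sentences.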
References |
1. J. ALLEN: Natural Language Understanding, Benjamin/Cummings Publ., 2nd ed., 1995.
2. E. CHARNIAK: Statistical Language Learning, MIT Press, 1996.
3. B. CARPENTER: ALE: The Attribute Logic Engine. User's Guide, Carnegie Mellon University, 1994.
4. D. JURAFSKY, J. MARTIN: Speech and Language Processing, Prentice Hall, 2000.
5. C. MANNING, H. SCHUTZE: Foundations of Statistical Natural Language Processing, MIT Press, 1999.
6. S. J. RUSSELL, P. NORVIG: Artificial Intelligence: A Modern Approach, Prentice-Hall International, 1995.
7. D. TATAR: Inteligenta artificiala: demonstrare automata de teoreme, prelucrarea limbajului natural, Editura Albastra, Microinformatica, 2001.
8. D. TATAR: Unification Grammars in Natural Language Processing, in "Recent Topics in Mathematical and Computational Linguistics", ed. C. Martin-Vide, G. Paun, Editura Academiei, 2000, pp. 289-300.
9. D. TATAR: Inteligenta artificiala. Aplicatii in prelucrarea limbajului natural, Editura Albastra, Microinformatica, 2003, ISBN 973-650-100-0.
10. R. MITKOV (ed.): The Oxford Handbook of Computational Linguistics, Oxford University Press, 2003. |
Assessment |
Assessment consists of a written exam covering all the course material (60%), plus an evaluation of the student's understanding and presentation of recent papers in the field and the implementation of some applications (40%). |