This site is last-updated on 2024-03-22
Important course information will be posted on this web page and announced in class. You are responsible for all material that appears here and should check this page for updates frequently.
Computational Linguistics (CL) is now a very active sub-discipline in applied linguistics. Its main focus is on the computational text analytics, which is essentially about leveraging computational tools, techniques, and algorithms to process and understand natural language data (in spoken or textual formats). Therefore, this course aims to introduce useful strategies and common workflows that have been widely adopted by data scientists to extract useful insights from natural language data. In this course, we will focus on textual data processing.
A selective collection of potential topics may include:
This course is extremely hands-on and will guide the students through classic examples of many task-oriented implementations via in-class theme-based tutorial sessions. The main coding language used in this course is Python . We will make extensive use of the language. It is assumed that you know or will quickly learn how to code in Python. In fact, this course assumes that every enrolled student has working knowledge of Python. (If you are not sure if you fulfill the prerequisite, please contact the instructor first.)
A test on Python Basics will be conducted on the first week of the class to ensure that every enrolled student fulfills the prerequisite. (To be more specific, you are assumed to have already had working knowledge of all the concepts included in the book, Lean Python: Learn Just Enough Python to Build Useful Tools). Those who fail on the Python basics test are NOT advised to take this course.
Please note that this course is designed specifically for linguistics majors in humanities. For computer science majors, please note that this course will not feature a thorough description of the mathematical operations behind the algorithms. We focus more on the practical implementation.
(The schedule is tentative and subject to change. Please pay attention to the announcements made during the class.)
Week | Date | Topic |
---|---|---|
Week 1 | 2023-02-23 | Course Orientation and Computational Linguistics Overview |
Week 2 | 2023-03-01 | NLP Pipeline |
Week 3 | 2023-03-08 | Machine Learning Basics: Regression and Classification |
Week 4 | 2023-03-15 | Naïve Bayes, Logistic Regression |
Week 5 | 2023-03-22 | Feature Engineering and Text Vectorization |
Week 6 | 2023-03-29 | Common NLP Tasks (Guest Speaker: Robin Lin from Droidtown Linguistic Tech. Co. Ltd. ) |
Week 7 | 2023-04-05 | Holiday |
Week 8 | 2023-04-12 | Neural Network: A Primer |
Week 9 | 2023-04-19 | Midterm Exam |
Week 10 | 2023-04-26 | Deep Learning NLP and Word/Doc Embeddings |
Week 11 | 2023-05-03 | Sequence Model I: RNN and Neural Language Model |
Week 12 | 2023-05-10 | Sequence Model II: LSTM and GRU |
Week 13 | 2023-05-17 | Sequence Model III: Sequence-to-Sequence Model & Attention |
Week 14 | 2023-05-24 | Transformer, BERT, Transfer Learning, and Explainable AI |
Week 15 | 2023-05-31 | LLM, RAG, and Multimodal Processing |
Week 16 | 2023-06-07 | Final Exam |
All the course materials are available on the course website. Please consult the instructor for the direct link to the course materials. They will be provided as a series of online packets (i.e., handouts, script source codes etc.) on the course website.
If you have any further questions related to the course, please consult FAQ on our course website or write me at any time at alvinchen@ntnu.edu.tw.
While I have made every attempt to ensure that the information contained on the Website is correct, I am not responsible for any errors or omissions, or for the results obtained from the use of this information. All information on the Website is provided “as is”, with no guarantee of completeness, accuracy, timeliness or of the results obtained from the use of this information, and without warranty of any kind, express or implied.
You may print a copy of any part of this website for your personal or non-commercial use. Without the author’s prior written consent, you cannot disclose confidential information of the website (e.g., log-in username and password) to any third party.