Text Analysis with Computation and Large Language Models 

Event Language: English

Format: In person

Dates: 14th-17th May 2024

This workshop will introduce students to the possibilities of analyzing text with computational methods. It covers classic computational text analysis techniques such as topic modelling, sentiment analysis, and word embeddings, all of which are available to everyone through open-source Python libraries. What sets this class apart is that it will also delve into an emerging area: analysis using large language models (LLMs). Much is being said about LLMs at present and, for better or for worse, they are becoming a part of daily life for answering questions and attempting to automate certain tasks. Of particular interest to researchers is the possibility of using an LLM for Retrieval Augmented Generation (RAG). With RAG, an LLM is pre-seeded with a corpus of documents that it refers to when generating responses. This makes it possible to pose natural language queries against the corpus in order to generate insights. The approach has limitations, as LLMs are notorious for drifting from the truth, but the capabilities of such systems are worth exploring.
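As a rough illustration of the retrieval step in RAG described above, the following is a minimal sketch in Python using only the standard library. It is not the workshop's own material: the function names (`retrieve`, `build_prompt`), the toy corpus, and the prompt wording are all hypothetical, and simple word-count cosine similarity stands in for the embedding models and vector stores that real RAG systems typically use. No LLM is called; the sketch stops at assembling the prompt that would be sent to one.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, corpus, k=2):
    """Return up to k documents most similar to the query (hypothetical helper)."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(doc))), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents that share no words with the query.
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, passages):
    """Assemble the text that would be handed to an LLM (assumed wording)."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Toy corpus, invented for illustration only.
corpus = [
    "Topic modelling groups documents by recurring themes.",
    "Sentiment analysis scores text as positive or negative.",
    "Word embeddings map words to vectors.",
]

query = "How does sentiment analysis score text?"
passages = retrieve(query, corpus, k=1)
prompt = build_prompt(query, passages)
print(prompt)
```

Grounding the prompt in retrieved passages is what lets the model refer to the corpus rather than only to what it learned in training, although it does not by itself prevent the drift from the truth noted above.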

This class will give learners an opportunity to walk through a complete analysis of a dataset using all of these computational methods, showing the full gamut of what is possible. As a concluding activity, participants will be encouraged to bring their own dataset into the framework developed during the class to see what research insights they can produce.

Instructors: Tim Ribaric and John Fink

Trent Lane
Guelph, Ontario, Canada