Text Data: Basics of Text Processing and Regular Expressions

Join workshop in Webex: https://gsumeetings.webex.com/meet/jwalker184

RDS@GSU Data Certification: https://lib.gsu.edu/data-certified

The volume of textual data available to academic researchers is immense. Consequently, the skills to process, transform, and analyze text data using computational tools are increasingly necessary for certain types of research.

This workshop will introduce the fundamentals of working with and manipulating text data using scripting languages (e.g., Python, R). This includes loading, processing, and preparing text data for use with quantitative models. Although advanced natural language processing (NLP) models are not covered in this workshop, some applications may be demonstrated if time permits.

No special background knowledge or skills are required to attend. All are welcome.

Workshop Topics

-- Common text and string operations

-- Tokenization, transformations, and processing

-- Regular Expressions

-- N-grams and term frequencies
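
To give a sense of what these topics look like in practice, here is a minimal Python sketch. It is not taken from the workshop materials: the sample sentence, the variable names, and the particular regular expression are all assumptions chosen for illustration. It uses only the standard library (`re` and `collections.Counter`).

```python
import re
from collections import Counter

# Hypothetical sample text, not taken from the workshop materials.
text = "The cat sat on the mat. The cat slept."

# Common string operation: normalize case before counting.
lowered = text.lower()

# Regular expression tokenization: keep runs of letters, digits,
# or apostrophes and drop punctuation and whitespace.
tokens = re.findall(r"[a-z0-9']+", lowered)

# Term frequencies: count how often each token occurs.
term_freq = Counter(tokens)

# N-grams (bigrams here): pair each token with the one that follows it.
bigrams = list(zip(tokens, tokens[1:]))
bigram_freq = Counter(bigrams)

print(term_freq.most_common(2))    # [('the', 3), ('cat', 2)]
print(bigram_freq.most_common(1))  # [(('the', 'cat'), 2)]
```

This simple pattern splits on anything that is not a lowercase letter, digit, or apostrophe, so it would need adjusting for accented characters or hyphenated words.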

Prerequisites: Basic familiarity with Python, R, or any scripting language preferred.

Software Requirements:

-- Participants will need a Google / Gmail account in order to access Google Colab

-- No software installation is required.

NOTE: Please read our Workshops ~ Etiquette & Policies page for information pertinent to your workshop attendance.

Presenter: Jeremy Walker, Data Services Librarian and member of the Library's Research Data Services Team

Related LibGuide: Research Data Services @ Georgia State University Library by Mandy Swygart-Hobaugh

Tuesday, October 26, 2021
11:00am - 1:30pm
Time Zone:
Eastern Time - US & Canada
All Campuses
Data Services Workshops, Online workshops
This is an online event. Event URL: https://gsumeetings.webex.com/meet/jwalker184

Event Organizer

Jeremy Walker