Week 14: November 28 - December 2

Computing With Text

Project Gutenberg provides an online database of more than 49,000 free ebooks. We use Python’s file handling functions to load and display some classic texts.
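As a rough illustration of the file handling involved, the sketch below writes a short stand-in text to a temporary file and reads it back with open, read and close. The file name sample_book.txt and the helper load_text are hypothetical; a text downloaded from Project Gutenberg would be loaded the same way.

```python
import os
import tempfile


def load_text(path):
    """Return the full contents of a text file as one string."""
    f = open(path, encoding="utf-8")
    try:
        return f.read()
    finally:
        f.close()  # always release the file handle


# Stand-in for a downloaded ebook (hypothetical file name and contents).
sample_path = os.path.join(tempfile.gettempdir(), "sample_book.txt")
f = open(sample_path, "w", encoding="utf-8")
f.write("It was the best of times,\nit was the worst of times.\n")
f.close()

text = load_text(sample_path)
print(text[:25])           # display the first few characters
print(len(text), "characters loaded")
```

In everyday code a with statement (with open(path) as f:) closes the file automatically; the explicit close call is shown here because it is one of the functions listed for this week.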

Statistical Language Models

A statistical language model assigns probabilities to sequences of words. Such models are used in many natural language processing applications, including speech recognition, machine translation, spelling correction and handwriting recognition. We use Python to build language models of increasing complexity based on the distribution of words in the texts downloaded from Project Gutenberg.
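One possible progression, sketched here under assumptions rather than taken from the course materials, starts with a unigram model (each word's relative frequency) and then moves to a bigram model (the probability of a word given the word before it). The function names and the tiny word list are made up for illustration.

```python
from collections import Counter


def unigram_model(words):
    """Estimate P(word) as the word's relative frequency."""
    counts = Counter(words)
    total = len(words)
    return {w: c / total for w, c in counts.items()}


def bigram_model(words):
    """Estimate P(second | first) from counts of adjacent word pairs."""
    pair_counts = Counter(zip(words, words[1:]))
    first_counts = Counter(words[:-1])  # every word that has a successor
    return {(a, b): c / first_counts[a] for (a, b), c in pair_counts.items()}


# Toy corpus standing in for a full Project Gutenberg text.
words = "the cat sat on the mat the cat ran".split()

unigram = unigram_model(words)
bigram = bigram_model(words)

print(unigram["the"])          # share of all words that are "the"
print(bigram[("the", "cat")])  # how often "cat" follows "the"
```

These are plain maximum-likelihood estimates, so any word pair that never occurs in the text gets no entry at all; handling unseen pairs is a separate step not shown here.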

Python

  • Reading and writing files using open, close and read.
  • String processing - lower, replace, split, strip and join.
  • Dictionaries
  • Sorting sequences using sorted.
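The items in this list fit together naturally in a word count. The following is a minimal sketch using a made-up sentence; the variable names are illustrative only.

```python
line = "  The quick brown fox -- the lazy dog; THE end.  "

# strip removes surrounding whitespace, lower normalises case.
cleaned = line.strip().lower()

# replace swaps punctuation for spaces so split sees clean words.
for mark in ["--", ";", "."]:
    cleaned = cleaned.replace(mark, " ")

words = cleaned.split()

# A dictionary maps each word to the number of times it occurs.
counts = {}
for w in words:
    counts[w] = counts.get(w, 0) + 1

# sorted orders the (word, count) pairs from most to least frequent.
ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)

# join builds a single display string from the top entries.
summary = ", ".join(word + ":" + str(n) for word, n in ranked[:3])
print(summary)
```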

Matplotlib

  • pie
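A pie chart is one way to show how a few frequent words share the total count. The sketch below assumes Matplotlib is installed and uses invented counts; the output file name word_share.png is hypothetical.

```python
import os
import tempfile

import matplotlib.pyplot as plt

# Invented word counts, standing in for counts taken from a real text.
labels = ["the", "and", "of", "to"]
sizes = [50, 25, 15, 10]

fig, ax = plt.subplots()
ax.pie(sizes, labels=labels, autopct="%1.0f%%")  # one wedge per word
ax.set_title("Share of the most frequent words")

# Save the figure to a temporary location instead of opening a window.
out_path = os.path.join(tempfile.gettempdir(), "word_share.png")
fig.savefig(out_path)
```

Calling plt.show() instead of savefig would display the chart in an interactive session.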