Unrestricted Use
CC BY
This lesson shows how to use Python to automatically transliterate a list of words from a language with a non-Latin alphabet into a standardized format using American Standard Code for Information Interchange (ASCII) characters. It builds on readers’ understanding of Python from the lessons “Viewing HTML Files,” “Working with Web Pages,” “From HTML to List of Words (part 1)” and “Intro to Beautiful Soup.” At the end of the lesson, we will use a transliteration dictionary to convert the names in a database of the Russian organization Memorial from Cyrillic into Latin characters. Although the example uses Cyrillic characters, the technique can be applied to any other alphabet represented in Unicode.
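As a rough illustration of the dictionary-based approach described above, the sketch below maps individual Cyrillic characters to Latin equivalents and applies the mapping to a short list of names. It is not taken from the lesson itself: the table `CYRILLIC_TO_LATIN`, the function `transliterate`, and the sample names are hypothetical, and the table covers only a handful of letters.

```python
# A minimal sketch, assuming a hand-built mapping from Cyrillic to Latin.
# The table is deliberately partial and hypothetical; a real one would
# cover the full alphabet and follow a chosen transliteration standard.
CYRILLIC_TO_LATIN = {
    "И": "I", "и": "i",
    "В": "V", "в": "v",
    "А": "A", "а": "a",
    "Н": "N", "н": "n",
    "Ж": "Zh", "ж": "zh",  # one Cyrillic letter may map to several Latin ones
}


def transliterate(word, table=CYRILLIC_TO_LATIN):
    """Replace each character found in the table; leave all others unchanged."""
    return "".join(table.get(ch, ch) for ch in word)


# Hypothetical sample data standing in for names pulled from a web page.
names = ["Иван", "Нина"]
latin_names = [transliterate(name) for name in names]
print(latin_names)
```

Looking up each character with `table.get(ch, ch)` passes unmapped characters (punctuation, digits, letters missing from the table) through untouched, which makes gaps in the table easy to spot in the output.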
- Subject: Applied Science, Computer Science
- Material Type: Diagram/Illustration
- Provider: Center for History and New Media
- Author: Seth Bernstein
- Date Added: 06/16/2015