IPA ↔ SAMPA Converter

Short description

Create a library that can convert sequences of phonetic symbols from Unicode-based IPA (International Phonetic Alphabet) into SAMPA (an ASCII-based phonetic alphabet). The focus is on mapping symbols rather than handling full tokenization, so the input may already arrive as separate characters.

  • IPA (International Phonetic Alphabet) – a system for representing speech sounds using special characters.
  • SAMPA (Speech Assessment Methods Phonetic Alphabet) – an ASCII-based alternative to IPA.
  • Unicode – a standard for encoding text characters, including phonetic symbols.
  • Character mapping – defining correspondences between two symbol sets.
  • Phonetic transcription – writing down speech sounds using consistent notation.
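To make the Unicode and character-mapping terms above more concrete, here is a small sketch that inspects IPA characters with Python's standard unicodedata module. The helper name describe is hypothetical and exists only for illustration. The sketch also shows why normalization can matter: the same visible symbol may be stored as one precomposed code point or as a base letter plus a combining mark, and a mapping table will only match the form it was written in.

```python
import unicodedata


def describe(text):
    """Return (character, code point, Unicode name) for each code point in text.

    Hypothetical helper, used here only to look inside IPA strings.
    """
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
    ]


for row in describe("ʃŋə"):
    print(row)

# The same nasalised vowel in two encodings:
composed = "\u00e3"      # one precomposed code point
decomposed = "a\u0303"   # letter "a" followed by COMBINING TILDE

print(composed == decomposed)                                 # False
print(unicodedata.normalize("NFD", composed) == decomposed)   # True
```

Normalizing all input to one form (NFC or NFD) before looking symbols up is one possible design choice. Which form suits the project depends on how the mapping table itself is written.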

High-level technical guidelines

  • Represent the conversion as a dictionary mapping IPA characters to SAMPA equivalents.
  • Work with Unicode strings in Python to handle IPA correctly.
  • Start with a small subset of IPA characters and expand gradually.
  • Handle unknown or unmapped characters gracefully (e.g., mark them as “?”).
  • Optionally, keep the mapping in a separate file so that users can update or extend it for custom use.
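The guidelines above can be sketched as follows. This is a minimal illustration, not a reference implementation: the names IPA_TO_SAMPA, UNKNOWN, and ipa_to_sampa are hypothetical, the table is a deliberately small starter subset that should be checked against a SAMPA reference chart, and passing plain lowercase ASCII letters through unchanged is an assumption of this sketch (those letters generally keep their IPA value in SAMPA).

```python
# Small starter subset (assumption: verify every pair against a SAMPA chart).
IPA_TO_SAMPA = {
    "ʃ": "S",
    "ʒ": "Z",
    "θ": "T",
    "ð": "D",
    "ŋ": "N",
    "ə": "@",
    "ɪ": "I",
    "ʊ": "U",
    "ɛ": "E",
    "æ": "{",
    "ː": ":",   # length mark
    "ˈ": '"',   # primary stress
}

UNKNOWN = "?"  # marker for symbols that have no entry in the mapping


def ipa_to_sampa(symbols, mapping=IPA_TO_SAMPA, unknown=UNKNOWN):
    """Convert an iterable of IPA symbols (a str or a list of str) to SAMPA.

    Symbols found in `mapping` are replaced. Plain lowercase ASCII letters
    are passed through unchanged (an assumption of this sketch). Anything
    else becomes `unknown`.
    """
    out = []
    for symbol in symbols:
        if symbol in mapping:
            out.append(mapping[symbol])
        elif symbol.isascii() and symbol.isalpha() and symbol.islower():
            out.append(symbol)
        else:
            out.append(unknown)
    return "".join(out)


print(ipa_to_sampa("ʃɪp"))             # SIp
print(ipa_to_sampa(["θ", "ɪ", "ŋ"]))   # TIN
print(ipa_to_sampa("ɔ"))               # ? (not in the starter subset)

# Extending the mapping without touching the default table:
custom = {**IPA_TO_SAMPA, "ɔ": "O"}
print(ipa_to_sampa("ɔ", mapping=custom))  # O
```

Because the input is iterated symbol by symbol, the function accepts either a string or a list of already separated characters. One design point to consider: SAMPA itself uses "?" for the glottal stop, so a different unknown marker may be less ambiguous once that symbol is added to the table. The extra entries in custom could just as well be loaded from a user-editable file.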

Using GPT with an expert persona

Using GPT with an expert persona lets you simulate conversations with specialists. It can help you understand new concepts quickly, explore different perspectives, and discover useful features or requirements you might not have thought of on your own. Because this is a learning environment, you can also safely test the boundaries of what GPT knows and where it fails, without any risk to your job or reputation. That makes it a low-stakes, high-value tool for practicing how to ask good questions, evaluate answers critically, and deepen your understanding.

...but the responsibility for correctness and implementation is still yours!

When using the expert persona prompt, treat GPT as a helpful consultant, not an unquestionable authority. Its answers can give you inspiration, explanations, or practical examples, but you should always double-check information against reliable sources and test ideas in your own code. Think of it as brainstorming with an expert partner: the guidance is useful, but verifying it and making it work remains your job.

Sample expert persona prompt

You are a friendly but professional consultant helping early-year software engineering students create a converter between IPA and SAMPA. Take on the perspectives of a linguist specializing in phonetics, a computational linguist experienced in text encoding and symbol mapping, a software engineer working with Unicode and string handling, a tool designer concerned with usability and extensibility, and a language learner who benefits from clear phonetic transcriptions. Be constructive, but let the students guide the discussion. If they drift from a professional tone, gently remind them. Always explain domain-specific terminology in simple words, and encourage students to ask questions if something is unclear. Ask as many clarification questions as possible to make sure you and the student are fully aligned before giving detailed answers.

Roles

  • Linguist (phonetics) – ensures accurate IPA–SAMPA correspondences and clarifies symbol nuances.
  • Computational linguist (encoding/mapping) – defines robust character mappings, normalization, and token handling.
  • Software engineer (Unicode/string handling) – implements safe Unicode processing, edge-case handling, and tests.
  • Tool designer (usability/extensibility) – specifies clear APIs/CLIs, graceful unknown-symbol behavior, and pluggable maps.
  • Language learner (end-user) – validates that outputs are readable, consistent, and helpful for pronunciation.
  • Documentation writer – provides concise tables, examples, and guidance for extending the mapping.