Burmese Language SEO: Zawgyi vs. Unicode and What It Means for Search

Understand the Zawgyi vs Unicode divide and what it means for Burmese SEO. Learn how Google handles each encoding, how to check your site, convert content, and rank Burmese pages.

If you run a Myanmar website and publish content in Burmese, there is one technical question that largely determines whether Google can match your content to Burmese searches: are you using Zawgyi or Unicode?

This is not a minor formatting preference. It is the difference between Burmese content that appears in Google search results and Burmese content that is effectively invisible to search engines. Understanding the Zawgyi-Unicode divide is the single most important technical concept for anyone doing Burmese language SEO.

This article explains the history, the technical reality, how to check which encoding your site uses, how to convert existing content, and what best practices look like for Burmese content that actually ranks.


The Two Encoding Systems: A Brief History

Zawgyi: The Legacy System

Before Unicode support for the Myanmar script was robust enough for practical use, developers in Myanmar created their own font-based encoding called Zawgyi. Zawgyi reuses the code points of the Unicode Myanmar block but gives many of them different meanings, and it stores characters in the order they are drawn on screen rather than in Unicode's logical order. With the Zawgyi font installed the text looks correct, but the underlying sequence of code points is not what the Unicode standard defines for those words.

Zawgyi spread widely because it worked — for years, it was the practical way to type and display Burmese text on most devices. Millions of Myanmar internet users installed Zawgyi fonts as a matter of course. Websites, applications, and even some government systems adopted it as their standard.

The problem: Zawgyi is not a standard. Different Zawgyi implementations handle certain character combinations differently, and the same word can be typed as several different code point sequences. Crucially, software that expects standard Unicode, including search indexing systems, will render or interpret Zawgyi text incorrectly unless it detects and converts it first.

Unicode: The International Standard

Unicode is the international standard for encoding text across all writing systems. The Myanmar script has its own dedicated block in Unicode (code points U+1000 through U+109F for core characters, with additional extended blocks). Devices and browsers have supported Myanmar Unicode natively since around 2012 on modern systems.
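To make the code point range concrete, here is a minimal Python sketch that measures how much of a string falls inside the core Myanmar block (U+1000 through U+109F). The function name myanmar_share is an illustration, not part of any library. Note that Zawgyi-encoded text also occupies these code points, so this check tells you the script, not the encoding.

```python
def myanmar_share(text: str) -> float:
    """Return the fraction of non-whitespace characters that fall in
    the core Myanmar block (U+1000 through U+109F)."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    hits = sum(1 for c in chars if 0x1000 <= ord(c) <= 0x109F)
    return hits / len(chars)
```

A page whose main body text scores near zero is probably not carrying Burmese script as text at all (for example, the text may be embedded in images).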

In 2019, Myanmar officially mandated Unicode as the national standard for digital text. Major technology platforms — Android, iOS, Windows, Mac, Google — use and index Unicode. Any Burmese text stored and transmitted in Unicode is readable by any modern system without requiring a special font installation.


Why This Matters for Google

How Google Reads Zawgyi Text

When Google's crawler encounters Zawgyi-encoded Burmese text on a webpage, it reads the raw code points, not the shapes a font would draw. Because Zawgyi reuses Myanmar-block code points with different meanings and a different ordering, the crawler sees Myanmar characters in sequences that do not spell the intended words under the Unicode standard.

The result: unless the text is detected as Zawgyi and converted, Google cannot reliably recover the words you wrote. Keyword extraction and language classification both suffer. The page may be indexed, but it is unlikely to rank for Burmese search queries because the stored text does not match those queries.

In some cases, Zawgyi text renders in search result snippets as garbled strings of characters — a visible symptom that tells you the encoding is wrong.

How Google Reads Unicode Text

Unicode Burmese is a different story. Google's systems are fully equipped to process Unicode. The crawler reads the Myanmar script characters, correctly identifies the language as Burmese (ISO code: my), extracts keywords and topics, and indexes the page for relevant Burmese search queries.

Unicode is also what Google uses to process user search queries. When a Myanmar user types a search in Burmese on a modern smartphone (which uses Unicode by default), Google matches that query against Unicode-encoded content. If your page is in Zawgyi, the encoding simply does not match — even if the words are conceptually identical.
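The mismatch is easy to see at the code point level. The sketch below compares the word for "Myanmar" in standard Unicode order with the sequence commonly produced by Zawgyi typing. The Zawgyi sequence is given as an illustration; verify it against your own content. The substring check only stands in for the idea of matching, since real search engines do far more.

```python
# "Myanmar" in standard Unicode order: ma, medial ra, na, asat, ma, aa
UNICODE_MYANMAR = "\u1019\u103c\u1014\u103a\u1019\u102c"
# The same word as commonly typed in Zawgyi (illustrative sequence):
# the medial comes first and U+1039 is used as the asat
ZAWGYI_MYANMAR = "\u103b\u1019\u1014\u1039\u1019\u102c"


def codepoints(text: str) -> str:
    """Format a string as a list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in text)


def naive_match(query: str, page_text: str) -> bool:
    """Stand-in for matching: does the query occur in the page text?"""
    return query in page_text


if __name__ == "__main__":
    print(codepoints(UNICODE_MYANMAR))
    print(codepoints(ZAWGYI_MYANMAR))
    print(naive_match(UNICODE_MYANMAR, ZAWGYI_MYANMAR))
```

Both strings look like the same word when each is shown in its own font, yet they share neither length-for-length order nor the same characters, so a Unicode query does not find the Zawgyi text.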

The practical conclusion: Any Burmese content on your website that uses Zawgyi encoding is not contributing to your SEO. It may as well not exist from Google's perspective.


How to Check Your Website's Encoding

Method 1: View Page Source

  1. Open your website in a browser
  2. Right-click anywhere on the page and select "View Page Source"
  3. Look at the top of the source code for the <head> section

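Check for font references: Search (Ctrl+F) for "Zawgyi" or "ZawgyiOne" (also written "Zawgyi-One") in the source. If you find these referenced as font-family declarations without any Unicode fallback, your site is likely serving Zawgyi. Myanmar3, which is sometimes lumped in with Zawgyi, is a Unicode-compliant font, so finding it is not a sign of Zawgyi.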

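Check the charset: Look for a line like <meta charset="UTF-8">. UTF-8 is the standard way to serialise Unicode on the web. If you see a different charset declaration, or none at all, that is a concern. Keep in mind that Zawgyi pages are usually served as UTF-8 too, so a correct charset is necessary but does not prove that the text itself follows Unicode.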

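Check the lang attribute: The <html> tag should include lang="my" for Burmese content. This declares the language to browsers and assistive technology. Google states that it determines a page's language from the visible text rather than from lang attributes, so treat this as good hygiene rather than a ranking lever.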
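The three source checks above can be scripted for a quick audit across many pages. This is a minimal sketch using only Python's standard library; the names EncodingAudit and audit_html are hypothetical, and the Zawgyi check is the same plain text search you would do with Ctrl+F.

```python
from html.parser import HTMLParser


class EncodingAudit(HTMLParser):
    """Collect the declared charset and the <html lang> value."""

    def __init__(self):
        super().__init__()
        self.charset = None
        self.lang = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "html":
            self.lang = attr_map.get("lang")
        elif tag == "meta" and "charset" in attr_map:
            self.charset = attr_map["charset"]


def audit_html(html: str) -> dict:
    """Run the three manual checks on a page's HTML source."""
    parser = EncodingAudit()
    parser.feed(html)
    return {
        "charset": parser.charset,
        "lang": parser.lang,
        # Same idea as searching the source for "Zawgyi"
        "mentions_zawgyi": "zawgyi" in html.lower(),
    }
```

A result with mentions_zawgyi set to True is a prompt to look closer, not proof: the word may appear in a font declaration or simply in visible article text.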

Method 2: Use a Browser Extension

The "Rabbit Myanmar" browser extension for Chrome can detect whether text on a page is Zawgyi or Unicode. Install it and visit your pages to get an immediate indication of which encoding is in use.

Method 3: Google Search Console Performance Report

Once your site is connected to Google Search Console, check the "Search results" performance report and filter by "Page." If pages with Burmese content show impressions for Burmese queries, the content is being indexed as Burmese Unicode. If those pages show no Burmese query impressions despite having Burmese content, encoding may be the issue.

Method 4: The Copy-Paste Test

Copy a section of Burmese text from your website and paste it into a plain text editor (Notepad on Windows, TextEdit in plain text mode on Mac) on a device that does not have a Zawgyi font installed. If the pasted text reads as correct Burmese, it is Unicode. If it shows Burmese characters that are visibly scrambled, for example vowel signs sitting before the wrong consonant, dotted circles, or broken stacked letters, it is almost certainly Zawgyi.
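The same visual clues can be approximated in code. The sketch below is a rough heuristic, not a definitive detector: it flags sequences that standard Burmese Unicode does not normally produce, such as a vowel sign E (U+1031) that does not follow a consonant. The rules and the name looks_like_zawgyi are assumptions made for illustration. The heuristic misses some Zawgyi text and can misfire on Shan, Mon, or Karen text, which legitimately uses U+1060 through U+1097. For production use, a trained detector such as Google's open-source myanmar-tools is likely to be more reliable.

```python
import re

# Myanmar consonants ka..a, plus great sa
_CONS = "\u1000-\u1021\u103f"

_ZAWGYI_HINTS = re.compile(
    # Vowel sign E must follow a consonant or a medial in Unicode order
    f"(?<![{_CONS}\u103b-\u103e])\u1031"
    # U+103B must follow a consonant or asat in Unicode order
    f"|(?<![{_CONS}\u103a])\u103b"
    # U+1039 (virama) must be followed by a consonant in Unicode
    f"|\u1039(?![{_CONS}])"
    # Zawgyi uses this range for stacked and variant glyphs
    "|[\u1060-\u1097]"
)


def looks_like_zawgyi(text: str) -> bool:
    """Heuristic: True if the text contains a sequence that standard
    Burmese Unicode would not normally produce."""
    return _ZAWGYI_HINTS.search(text) is not None
```

Treat a True result as a reason to review the page by eye, and a False result as "no obvious sign" rather than a guarantee.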


Converting Zawgyi Content to Unicode

The Rabbit Converter

The most widely used tool for Zawgyi-to-Unicode conversion in Myanmar is Rabbit Converter (rabbit-converter.org). It provides:

  • A web interface for converting small blocks of text by pasting them in
  • An API for automated conversion of larger content
  • Bidirectional conversion (Zawgyi → Unicode and Unicode → Zawgyi)

For converting a few pages manually, the web interface is sufficient. For sites with hundreds of pages and a database full of Zawgyi content, the API approach or a script-based conversion is necessary.

Bulk Database Conversion

If your website runs on a CMS (WordPress, custom PHP, etc.) and stores content in a database, you need to convert the database content rather than individual pages.

General process:

  1. Export your database (SQL dump)
  2. Run the Zawgyi-to-Unicode conversion on the exported content using a script that calls the Rabbit Converter API or uses the open-source rabbit-c library
  3. Test the converted content to verify accuracy
  4. Import the converted database to a staging site first and review the output
  5. Replace the production database once verified

Important: Always back up your database before making bulk conversions. Test on a staging environment. Some Zawgyi content — particularly older posts or user-generated content — may have encoding inconsistencies that produce incorrect conversion output and require manual review.
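The process above can be sketched as a small script with a dry-run mode, so you can list the affected rows before anything is written. This is an outline under stated assumptions: the table name posts and its columns id and body are hypothetical (WordPress, for example, uses different names), and is_zawgyi and convert are placeholders for whichever detector and converter you choose. SQLite is used only to keep the example self-contained.

```python
import sqlite3
from typing import Callable, List


def migrate_posts(
    conn: sqlite3.Connection,
    is_zawgyi: Callable[[str], bool],
    convert: Callable[[str], str],
    dry_run: bool = True,
) -> List[int]:
    """Convert Zawgyi rows to Unicode and return the ids that changed.

    With dry_run=True nothing is written; the returned ids show what
    would be converted.
    """
    changed = []
    rows = conn.execute("SELECT id, body FROM posts ORDER BY id").fetchall()
    for post_id, body in rows:
        # Skip empty rows and rows that already look like Unicode
        if not body or not is_zawgyi(body):
            continue
        new_body = convert(body)
        if new_body == body:
            continue
        changed.append(post_id)
        if not dry_run:
            conn.execute(
                "UPDATE posts SET body = ? WHERE id = ?", (new_body, post_id)
            )
    if not dry_run:
        conn.commit()
    return changed
```

Running the dry run first gives you a list of ids to spot-check on staging. Skipping rows that do not look like Zawgyi matters because running a Zawgyi-to-Unicode converter over text that is already Unicode can corrupt it.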

CMS-Level Considerations

WordPress: If your WordPress site serves Zawgyi, the issue may be in the theme's font declarations, the stored content, or both. Fix the content first (database conversion), then update the theme to use system Unicode fonts or a Unicode Myanmar web font such as Padauk (available on Google Fonts) or Pyidaungsu.

Custom-built sites: The font CSS declaration controls what the user sees. The character encoding in the HTML and database controls what search engines see. Both must use Unicode.

Facebook-generated content: Many Myanmar businesses copy-paste content directly from Facebook into their CMS. Text written on current devices is usually Unicode, but older posts may still be Zawgyi, so check anything you reuse. If Unicode content is rendering incorrectly on your site, the issue is likely in the CMS's font stack rather than the encoding itself. Check whether your body or content font-family is forcing Zawgyi.


Burmese Keyword Research: Unique Challenges

Burmese language has characteristics that make keyword research different from working with European languages.

No Word Boundaries

Burmese text does not put spaces between words. Where spaces appear, they usually separate phrases, and their placement is up to the writer. The sentence "I love Yangon" in Burmese runs all syllables together without breaks. This means:

  • Automated keyword tools that rely on word-splitting algorithms produce poor results for Burmese
  • Google's understanding of Burmese search intent is more nuanced than simple keyword matching
  • You cannot simply translate an English keyword directly and expect the same search volume pattern

What to do: Use Google's own search bar as your primary Burmese keyword research tool. Type Burmese search terms and observe autocomplete suggestions — these are real queries real users have submitted. Use Google Trends filtered to Myanmar for relative popularity comparisons.
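To see why word-splitting tools struggle, compare splitting on whitespace with a simple syllable break. The sketch below assumes Unicode input and uses a simplified rule: a syllable starts at a consonant that is not stacked under the previous one and is not itself closed by asat or virama. Syllables are not words, so this illustrates the problem rather than solving it, and the rule does not cover every Burmese spelling.

```python
import re

# A consonant starts a new syllable unless it is stacked (preceded by
# virama U+1039) or is a syllable-final letter (followed by asat U+103A
# or virama U+1039).
_SYLLABLE_START = re.compile("(?<!\u1039)([\u1000-\u1021])(?![\u103a\u1039])")


def split_syllables(text: str) -> list:
    """Insert a break before each syllable-initial consonant, then split."""
    return _SYLLABLE_START.sub(r" \1", text).split()


# "Yangon city" written in Unicode, with no spaces
YANGON_CITY = "\u101b\u1014\u103a\u1000\u102f\u1014\u103a\u1019\u103c\u102d\u102f\u1037"

if __name__ == "__main__":
    print(len(YANGON_CITY.split()))      # whitespace splitting
    print(split_syllables(YANGON_CITY))  # syllable splitting
```

Whitespace splitting returns the whole phrase as a single token, while the syllable rule yields three pieces. Deciding which syllables group into words still needs a dictionary or a trained segmenter.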

Spelling and Script Variation

Some Burmese words can be spelled or entered in slightly different ways depending on the writer's region, age, or transliteration habit. When targeting Burmese keywords, include natural variations in your content rather than obsessing over a single exact spelling.

Mixed-Language Searches

Many Myanmar internet users combine Burmese and English in a single search query — especially for business, technology, or professional topics. A query like "[Burmese word for service] + Yangon" or "SEO [Burmese explanation of what it is]" is common. Plan content that naturally addresses these mixed-language intents.

Font-Rendering Differences in Search Results

Until relatively recently, some users in Myanmar were still on devices where the Zawgyi font rendered Burmese text. Google's search results page renders Myanmar script in Unicode. A Zawgyi-encoded snippet in search results would appear as garbled text on Unicode devices, and vice versa. This is another reason why enforcing Unicode sitewide — not mixing both — is essential.


Best Practices for Publishing Burmese Content That Ranks

1. Confirm Your Entire Site Uses Unicode

Do not have a mixed site where some pages are Unicode and others are Zawgyi. Mixed encoding creates confusing signals for Google and a bad experience for users. Audit everything and enforce Unicode universally.

2. Set the Correct HTML Language Attribute

Your pages with Burmese content should declare:

<html lang="my">

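For a Burmese page that also contains English passages, keep lang="my" on the <html> tag and mark the English parts individually:

<p lang="en">English text here</p>

The regional form lang="my-MM" means Burmese as used in Myanmar. It is valid, but it does not signal a bilingual page.

This explicit declaration helps browsers choose fonts and helps screen readers choose pronunciation. Google states that it detects a page's language from the visible text rather than from lang attributes, so the attribute complements Unicode content and does not replace it.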

3. Use Unicode-Compatible Myanmar Fonts

Choose fonts that are designed for Unicode Myanmar rendering:

  • Padauk — widely used, open source, good rendering on most devices
  • Pyidaungsu — the Myanmar government's official Unicode font, clean and professional
  • Myanmar Sangam MN — available on macOS/iOS, good for mixed Burmese-English text

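Avoid loading Zawgyi-One or any other Zawgyi-encoded font on your site, even as a fallback, because a Zawgyi font applied to Unicode text renders it incorrectly. Myanmar3 is sometimes grouped with Zawgyi, but it is a Unicode font.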

4. Write Natural, Readable Burmese Content

Google's quality signals apply to Burmese content the same as English. Burmese content that reads naturally, answers genuine questions, and is substantive in length will outperform thin or keyword-stuffed content. Write for humans first.

5. Target Burmese and English Separately

Create separate pages or posts for Burmese-language and English-language versions of your content. Avoid placing full Burmese and English versions side by side on the same page; occasional English terms inside Burmese text are fine. Separate pages allow each to rank independently for their respective language queries.

6. Monitor with Google Search Console

After converting to Unicode and publishing Burmese content, watch your Search Console performance report for:

  • Increasing impressions for Burmese search queries
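  • Burmese pages listed as indexed in the Page indexing report (formerly called Coverage)
  • No hreflang mistakes on bilingual pages (Google retired the International Targeting report that used to list these in 2022, so check the tags directly or with a third-party validator)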

If Burmese pages are indexed but not generating Burmese query impressions, the content quality or relevance may need attention. If they are not being indexed at all, check for crawl errors or technical issues.
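If you export the performance report as CSV, a short script can track how much of your visibility comes from Burmese-script queries. This sketch assumes the export has columns named "Top queries" and "Impressions"; check your own file and pass different names if they differ. The function names are illustrative.

```python
import csv
import io


def has_myanmar(text: str) -> bool:
    """True if the text contains at least one Myanmar-block character."""
    return any(0x1000 <= ord(c) <= 0x109F for c in text)


def burmese_impression_share(
    csv_text: str,
    query_col: str = "Top queries",
    impressions_col: str = "Impressions",
) -> float:
    """Share of total impressions that came from Burmese-script queries."""
    total = 0
    burmese = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Tolerate thousands separators such as "1,200"
        count = int(row[impressions_col].replace(",", ""))
        total += count
        if has_myanmar(row[query_col]):
            burmese += count
    return burmese / total if total else 0.0
```

Comparing this share before and after a Unicode migration gives you a simple trend line to watch alongside the raw impression counts.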

7. Use hreflang for Bilingual Sites

If you publish the same content in both Burmese and English, use hreflang tags to tell Google which version to show to which audience:

<link rel="alternate" hreflang="my" href="https://yourdomain.com/my/article" />
<link rel="alternate" hreflang="en" href="https://yourdomain.com/en/article" />

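This helps Google serve the right language version to each searcher instead of treating the two pages as competing copies. Every language version should carry the full set of tags, including one that points to itself, and the references must be reciprocal.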
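Because the same set of tags belongs on every language version, it can help to generate them from one mapping rather than typing them by hand. A minimal sketch follows; hreflang_tags is a hypothetical helper, and the optional x-default entry is an addition that Google supports for a fallback page.

```python
from html import escape
from typing import Dict, List, Optional


def hreflang_tags(alternates: Dict[str, str], x_default: Optional[str] = None) -> List[str]:
    """Build one <link rel="alternate"> tag per language version."""
    tags = [
        f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(url)}" />'
        for lang, url in alternates.items()
    ]
    if x_default:
        tags.append(
            f'<link rel="alternate" hreflang="x-default" href="{escape(x_default)}" />'
        )
    return tags


if __name__ == "__main__":
    versions = {
        "my": "https://yourdomain.com/my/article",
        "en": "https://yourdomain.com/en/article",
    }
    print("\n".join(hreflang_tags(versions)))
```

Place the full output in the <head> of both the Burmese and the English page.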


The Transition Timeline: What Myanmar Businesses Should Do Now

The market is mid-transition. Zawgyi still exists in older devices and legacy systems, but modern smartphones — which account for the vast majority of Myanmar internet usage — default to Unicode. Facebook, Google, and all major platforms operate on Unicode.

The direction is clear and irreversible: Unicode is the present and future of Burmese digital text.

For SEO purposes, this means:

  • New content should be in Unicode from day one
  • Existing Zawgyi content should be migrated as soon as practically possible
  • Social media content (Facebook, Viber) written on current devices is usually Unicode already and can be repurposed for your website, but check older posts for Zawgyi before reusing them
  • Email marketing and SMS should use Unicode for consistent rendering across devices

Frequently Asked Questions

Q: If my website still uses Zawgyi, will it rank at all on Google? Your English-language content will rank normally. Your Zawgyi-encoded Burmese content, however, is unlikely to rank for Burmese search queries, because its code point sequences do not match the Unicode text that users type and Google indexes. Read as Unicode, Zawgyi text is Myanmar script with characters in the wrong order or standing for the wrong letters. Converting to Unicode is the dependable fix.

Q: Can I use both Zawgyi and Unicode on the same website? Technically possible but strongly inadvisable. Mixed encoding causes inconsistent search engine indexing, confusing rendering for users on different devices, and complex maintenance challenges. Commit to Unicode sitewide and migrate all Zawgyi content in one organised process.

Q: Does Zawgyi content on Facebook affect my SEO? Not directly. Facebook posts live on Facebook, not on your domain, and Facebook has offered automatic Zawgyi-to-Unicode conversion since 2019, so most current content reaches readers as Unicode. Older posts may still hold Zawgyi text, so check anything you copy onto your own site. For SEO purposes, it is your website's HTML content that needs to be audited.

Q: Are there automatic tools that can convert my entire WordPress database from Zawgyi to Unicode? Yes. The open-source rabbit-c library and its PHP wrapper can be integrated into a WordPress migration script. Some WordPress developers in Myanmar have created plugins that automate this conversion. Search for "Myanmar Zawgyi to Unicode WordPress converter" to find current options. Always test on a staging site before running any bulk database change on production.

Q: After converting from Zawgyi to Unicode, how long does it take for Google to re-index and reflect the change in rankings? After converting your content, submit an updated sitemap through Google Search Console and use the URL Inspection tool to request re-indexing of key pages. Google often re-crawls and re-indexes updated pages within one to four weeks, although it gives no guaranteed timeline. You may see improved Burmese keyword rankings within a few weeks of successful re-indexing.
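If your CMS does not produce a sitemap for you, the updated sitemap mentioned above can be generated with a few lines of Python. This is a minimal sketch using the standard sitemap protocol namespace; build_sitemap is a hypothetical helper and the URLs are placeholders. It sets lastmod to the conversion date so crawlers can see that the pages changed.

```python
import xml.etree.ElementTree as ET
from datetime import date
from typing import Iterable

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls: Iterable[str], lastmod: date) -> str:
    """Return sitemap XML listing each URL with the same lastmod date."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for address in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = address
        ET.SubElement(entry, "lastmod").text = lastmod.isoformat()
    # No XML declaration is emitted; add one when writing the file
    return ET.tostring(urlset, encoding="unicode")
```

Save the output as a UTF-8 file, upload it to your site, and submit its URL in the Sitemaps section of Search Console.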