How to check if a word is English or not in Python

Here I introduce several ways to check whether a word consists only of letters of the English alphabet.

1. Using the isalpha method

In Python, a string object has a method called isalpha, which returns True when the string is non-empty and every character in it is a letter.

word = "Hello"
if word.isalpha():
    print("It is an alphabet")  # printed: every character is a letter

word = "123"
if word.isalpha():
    print("It is an alphabet")
else:
    print("It is not an alphabet")  # printed: digits are not letters

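However, this approach has a minor problem: isalpha is Unicode-aware, so it returns True for letters of any script. For example, if you pass a Korean word, it is still considered alphabetic. (Of course, for non-Korean speakers, it wouldn’t be a problem 😅 )

To avoid this behavior, call the encode method before calling isalpha. encode returns a bytes object, and the isalpha method of bytes accepts only the ASCII letters a-z and A-Z.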

word = "한글"
if word.encode().isalpha():
    print("It is an alphabet")
else:
    print("It is not an alphabet")
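
The encode trick can be wrapped in a small reusable function. The following is only a minimal sketch: the name is_english_word is hypothetical, and it assumes the default UTF-8 encoding used by encode().

```python
def is_english_word(word: str) -> bool:
    """Return True if word is non-empty and made only of ASCII letters."""
    # encode() produces UTF-8 bytes; bytes.isalpha() accepts only a-z and A-Z,
    # so any non-ASCII letter (encoded as bytes >= 0x80) is rejected.
    return word.encode().isalpha()


for sample in ["Hello", "한글", "123", "Hello한글", ""]:
    print(repr(sample), is_english_word(sample))
```

Only "Hello" passes here; the mixed word and the empty string are both rejected.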

2. Using a regular expression

I think this is the most universal approach, since regular expressions work much the same way regardless of programming language.

import re

# fullmatch succeeds only if the whole word consists of English letters;
# match with a single [a-zA-Z] would only test the first character.
reg = re.compile(r'[a-zA-Z]+')

word = "hello"
if reg.fullmatch(word):
    print("It is an alphabet")
else:
    print("It is not an alphabet")

word = "123"
if reg.fullmatch(word):
    print("It is an alphabet")
else:
    print("It is not an alphabet")
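
One detail worth spelling out: re.match only anchors at the start of the string, so a letter pattern succeeds as soon as the word begins with a letter, even if digits follow. A minimal sketch of the difference, using the hypothetical names ENGLISH_WORD and is_english_regex:

```python
import re

ENGLISH_WORD = re.compile(r"[a-zA-Z]+")  # one or more English letters


def is_english_regex(word: str) -> bool:
    """Return True only if the whole word consists of English letters."""
    return ENGLISH_WORD.fullmatch(word) is not None


# match() only needs a run of letters at the start, so "hello123" slips through
print(ENGLISH_WORD.match("hello123") is not None)  # True
print(is_english_regex("hello123"))                # False
print(is_english_regex("hello"))                   # True
```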

3. Using comparison operators

It depends on the precondition; here we will assume the goal is to check whether every character is a letter of the English alphabet.

Therefore, we can apply the comparison operators to each character.

word = "hello"

# Check each character on its own against both letter ranges
if word and all('a' <= ch <= 'z' or 'A' <= ch <= 'Z' for ch in word):
    print("It is an alphabet")
else:
    print("It is not an alphabet")

Note that we have to consider both upper and lower cases. Also, we shouldn’t compare the entire word at once: strings are compared lexicographically, so a word such as "zebra" sorts after "z" and would be rejected.

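We can also simplify this code using the lower or upper method of the string.

word = "hello"

# Lowercase each character first so that a single range check is enough
if word and all('a' <= ch.lower() <= 'z' for ch in word):
    print("It is an alphabet")
else:
    print("It is not an alphabet")

Be aware that a few non-ASCII characters, such as the Kelvin sign (U+212A), lowercase to an ASCII letter and would pass this simplified check.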
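
To see why the comparison has to be done per character rather than on the whole word, here is a short sketch. Python compares strings lexicographically, so a whole-word comparison is not a range check on each letter. The helper name is_english_by_range is hypothetical.

```python
# Whole-string comparison is lexicographic, not a per-letter range check
print('a' <= "hello" <= 'z')   # True: "hello" sorts between "a" and "z"
print('a' <= "zebra" <= 'z')   # False: "zebra" sorts after "z"
print('a' <= "h3llo" <= 'z')   # True, although "3" is not a letter


def is_english_by_range(word: str) -> bool:
    """Return True if word is non-empty and every character is in a-z or A-Z."""
    return bool(word) and all(
        'a' <= ch <= 'z' or 'A' <= ch <= 'Z' for ch in word
    )


print(is_english_by_range("zebra"))  # True
print(is_english_by_range("h3llo"))  # False
```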

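4. Using lower and upper methods

This is my favorite approach. Since English letters have lower and upper cases, unlike digits or Korean characters, we can leverage this characteristic to identify the word. Keep in mind that other scripts with letter case, such as Greek or Cyrillic, also pass this check, so it only works when the input cannot contain them.

word = "hello"
# A character is a cased letter if its upper and lower forms differ
if word and all(ch.upper() != ch.lower() for ch in word):
    print("It is an alphabet")
else:
    print("It is not an alphabet")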
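
Comparing upper() and lower() only tells us that a string contains cased characters, which is not unique to English. The sketch below shows where the check is too permissive when applied to the whole word, followed by a stricter variant. The function names has_case and is_english_cased are hypothetical, and str.isascii requires Python 3.7 or later.

```python
def has_case(word: str) -> bool:
    """The upper/lower comparison applied to the whole word."""
    return word.upper() != word.lower()


print(has_case("hello"))     # True
print(has_case("123"))       # False: digits have no case
print(has_case("한글"))       # False: Hangul has no case
print(has_case("привет"))    # True: Cyrillic letters are cased too
print(has_case("hello123"))  # True: one cased letter is enough


def is_english_cased(word: str) -> bool:
    """Stricter variant: non-empty, ASCII only, and every character is cased."""
    return (
        bool(word)
        and word.isascii()
        and all(ch.upper() != ch.lower() for ch in word)
    )


print(is_english_cased("hello"))     # True
print(is_english_cased("привет"))    # False
print(is_english_cased("hello123"))  # False
```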

Happy coding!
