
How to run word search function for each letter of the alphabet

I have defined a function that searches a string for all 5-letter words starting with the letter I've specified. A text file is then created with the processed list.

I currently must run the program manually for each letter. How can I change my code so that the function will search for all letters of the alphabet one after the other and make a separate text file for each result?

import re
from pathlib import Path

# define starting letter of word search
ltr = 'm' 

# define RegEx pattern to find all 5-letter words starting with a
# certain letter
pattern = r'\b'+ltr+r'\w{4}\b'

# define Function WordSearch: search wordlist.txt with defined RegEx
# Pattern
def WordSearch():
    # set Path to Main Text file
    file = r'C:\Users\User1\OneDrive\Documents\wordlist.txt'

    # open Main Text file and set variable 'MainText' equal to contents
    fileToOpen = Path(file)
    f = open(fileToOpen)
    MainText = f.read()

    # search all words in Main Text and make list of those that fit the 
    # RegEx pattern defined above
    slist = re.findall(pattern,MainText)

    # make set from list to eliminate duplicate entries
    slist_nd = list(set(slist))

    # turn Set into a string
    sliststr = ' '.join(slist_nd)

    # open new txt file, write string into contents
    slist_file = open(r"C:\Users\User1\OneDrive\Documents\fivelist"+ltr+".txt", "w")
    slist_file.write(sliststr)

    # close files
    f.close()
    slist_file.close()
WordSearch()
Answer

Here you go.

import string
letters = string.ascii_lowercase

Then you can do

for ltr in letters:
    pattern = r'\b'+ltr+r'\w{4}\b'

Note that everything that should run once per letter, including the call to WordSearch(), has to be indented so that it sits inside the for loop.
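
To make the loop concrete, here is a minimal sketch of the whole thing. The function names, the sample text, and the temporary output folder are my own stand-ins (not from the question), so swap in the contents of wordlist.txt and your Documents path. I also sort the words so the output is the same on every run, which the original code does not do.

```python
import re
import string
import tempfile
from pathlib import Path


def find_five_letter_words(text, letter):
    """Return the sorted, de-duplicated 5-letter words in text that start with letter."""
    pattern = r'\b' + re.escape(letter) + r'\w{4}\b'
    return sorted(set(re.findall(pattern, text)))


def write_lists_for_all_letters(text, out_dir):
    """Write one fivelist<letter>.txt file per lowercase letter into out_dir."""
    out_dir = Path(out_dir)
    for ltr in string.ascii_lowercase:
        # everything in this indented block runs once per letter
        words = find_five_letter_words(text, ltr)
        (out_dir / f"fivelist{ltr}.txt").write_text(' '.join(words))


# Demo with stand-in text and a temporary directory (both are assumptions;
# replace them with the contents of wordlist.txt and your own folder).
sample_text = "mango melon apple maple mango amber mice"
demo_dir = tempfile.mkdtemp()
write_lists_for_all_letters(sample_text, demo_dir)
print((Path(demo_dir) / "fivelistm.txt").read_text())  # mango maple melon
```

Letters with no matching words still get a file; it is just empty.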

Other helpful answers

Use a list and then loop through the list. Give your function parameters, and call the function inside the loop.

import string
import re
from pathlib import Path


def WordSearch(pattern, ltr):
    # set Path to Main Text file
    file = r'C:\Users\User1\OneDrive\Documents\wordlist.txt'

    # read the whole Main Text file; the with-block closes it automatically
    with open(Path(file)) as f:
        MainText = f.read()

    # search all words in Main Text and make list of those that fit the
    # RegEx pattern passed in as an argument
    slist = re.findall(pattern, MainText)

    # make set from list to eliminate duplicate entries
    slist_nd = list(set(slist))

    # join the de-duplicated list into one space-separated string
    sliststr = ' '.join(slist_nd)

    # open new txt file and write the string into it; the with-block
    # closes the file even if the write fails
    with open(r"C:\Users\User1\OneDrive\Documents\fivelist"+ltr+".txt", "w") as slist_file:
        slist_file.write(sliststr)


alphabet = list(string.ascii_lowercase)

for x in alphabet:
    pattern = r'\b'+x+r'\w{4}\b'
    WordSearch(pattern, x)
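
The loop above rereads wordlist.txt once per letter. If that ever gets slow, one alternative (my own variation, not part of the answer above) is to scan the text a single time and group the matches by their first letter. The function name and the sample text are hypothetical.

```python
import re
from collections import defaultdict


def group_five_letter_words(text):
    """Map each lowercase first letter to the sorted unique 5-letter words found."""
    groups = defaultdict(set)
    # [a-z] restricts the first character to a lowercase ASCII letter,
    # the same letters that string.ascii_lowercase contains.
    for word in re.findall(r'\b[a-z]\w{4}\b', text):
        groups[word[0]].add(word)
    return {letter: sorted(words) for letter, words in groups.items()}


# stand-in text; in practice this would be the contents of wordlist.txt
groups = group_five_letter_words("mango melon apple maple mango amber mice")
print(groups['m'])  # ['mango', 'maple', 'melon']
```

You would then write each entry of the returned dict to its own file. Letters with no matches do not appear in the dict, so use groups.get(letter, []) if you want an empty file for them.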

You could look into threading as well, and start each function call in its own thread. That way you don't have to wait for the first one to finish before the next one starts.

from threading import Thread

Then create one thread per letter, start them all, and wait for them to finish:

threads = []
alphabet = list(string.ascii_lowercase)

for x in alphabet:
    pattern = r'\b'+x+r'\w{4}\b'
    t = Thread(target=WordSearch, args=(pattern, x))
    threads.append(t)
    
for x in threads:
    x.start()
for x in threads:
    x.join()
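
If you do go the threaded route, a pool is a bit easier to manage than 26 hand-made threads. Below is a rough sketch using concurrent.futures from the standard library. The helper name, the sample text, and the temporary folder are assumptions for illustration. Keep in mind that in standard CPython the regex matching itself does not run in parallel across threads because of the global interpreter lock, so any speedup would come mainly from overlapping the file writes.

```python
import re
import string
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def search_and_write(text, letter, out_dir):
    """Find unique 5-letter words starting with letter and write them to a file."""
    words = sorted(set(re.findall(r'\b' + letter + r'\w{4}\b', text)))
    target = Path(out_dir) / f"fivelist{letter}.txt"
    target.write_text(' '.join(words))
    return target


sample_text = "mango melon apple maple mango amber mice"  # stand-in for wordlist.txt
out_dir = tempfile.mkdtemp()  # stand-in for the Documents folder

# The with-block waits for all submitted tasks to finish before exiting.
with ThreadPoolExecutor(max_workers=8) as pool:
    futures = [pool.submit(search_and_write, sample_text, ltr, out_dir)
               for ltr in string.ascii_lowercase]
    written = [f.result() for f in futures]  # re-raises any worker exception

print(len(written))  # 26
```

Calling result() on each future also surfaces errors from the workers, which bare Thread objects would not pass back to the main thread.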
