I have defined a function that searches a string for all 5-letter words starting with the letter I've specified. A text file is then created with the processed list.
I currently must run the program manually for each letter. How can I change my code so that the function will search for all letters of the alphabet one after the other and make a separate text file for each result?
import re
from pathlib import Path

# define starting letter of word search
ltr = 'm'
# define RegEx pattern to find all 5-letter words starting with a certain letter
pattern = r'\b'+ltr+r'\w{4}\b'

# define Function WordSearch: search wordlist.txt with defined RegEx Pattern
def WordSearch():
    # set Path to Main Text file
    file = r'C:\Users\User1\OneDrive\Documents\wordlist.txt'
    # open Main Text file and set variable 'MainText' equal to contents
    fileToOpen = Path(file)
    f = open(fileToOpen)
    MainText = f.read()
    # search all words in Main Text and make list of those that fit the
    # RegEx pattern defined above
    slist = re.findall(pattern,MainText)
    # make set from list to eliminate duplicate entries
    slist_nd = list(set(slist))
    # turn Set into a string
    sliststr = ' '.join(slist_nd)
    # open new txt file, write string into contents
    slist_file = open(r"C:\Users\User1\OneDrive\Documents\fivelist"+ltr+".txt", "w")
    slist_file.write(sliststr)
    # close files
    f.close()
    slist_file.close()

WordSearch()
Here you go.
import string
letters = string.ascii_lowercase
Then you can do
for ltr in letters:
    pattern = r'\b'+ltr+r'\w{4}\b'
Note that the code that follows, in particular the WordSearch() call, must be indented so that it runs inside the for loop.
Use a list and then loop through it. Give your function parameters instead of relying on global variables, and call the function inside the loop.
import string
import re
from pathlib import Path

def WordSearch(pattern, ltr):
    # set Path to Main Text file
    file = r'C:\Users\User1\OneDrive\Documents\wordlist.txt'
    # open Main Text file and set variable 'MainText' equal to contents
    fileToOpen = Path(file)
    f = open(fileToOpen)
    MainText = f.read()
    # search all words in Main Text and make list of those that fit the
    # RegEx pattern defined above
    slist = re.findall(pattern,MainText)
    # make set from list to eliminate duplicate entries
    slist_nd = list(set(slist))
    # turn Set into a string
    sliststr = ' '.join(slist_nd)
    # open new txt file, write string into contents
    slist_file = open(r"C:\Users\User1\OneDrive\Documents\fivelist"+ltr+".txt", "w")
    slist_file.write(sliststr)
    # close files
    f.close()
    slist_file.close()

alphabet = list(string.ascii_lowercase)
for x in alphabet:
    pattern = r'\b'+x+r'\w{4}\b'
    WordSearch(pattern, x)
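The loop above re-reads wordlist.txt once per letter. As a possible variation (a minimal sketch, not the original poster's method), the text could be read once and the matches grouped by their first letter. The function names group_five_letter_words and write_groups and the out_dir parameter are hypothetical names chosen for this illustration. Unlike the set-based code above, this sketch sorts each list so the output order is predictable.

```python
import re
import string
from collections import defaultdict
from pathlib import Path


def group_five_letter_words(text):
    """Return {letter: sorted unique 5-character words starting with that letter}."""
    groups = defaultdict(set)
    # One lowercase letter followed by four word characters, which covers
    # the same matches as the 26 per-letter patterns used above.
    for word in re.findall(r'\b[a-z]\w{4}\b', text):
        groups[word[0]].add(word)
    return {ltr: sorted(words) for ltr, words in groups.items()}


def write_groups(groups, out_dir):
    """Write one fivelist<letter>.txt per lowercase letter into out_dir (assumed to exist)."""
    out_dir = Path(out_dir)
    for ltr in string.ascii_lowercase:
        # Letters without matches still get an (empty) file, as in the loop above.
        words = groups.get(ltr, [])
        (out_dir / f'fivelist{ltr}.txt').write_text(' '.join(words))
```

It could then be used roughly like this: read the word list with Path(...).read_text(), pass the text to group_five_letter_words, and hand the result to write_groups together with the output folder.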
You could look into threading as well and start each function call in its own thread. That way, one search doesn't have to finish before the next one starts.
from threading import Thread
Then go like this.
threads = []
alphabet = list(string.ascii_lowercase)
for x in alphabet:
    pattern = r'\b'+x+r'\w{4}\b'
    t = Thread(target=WordSearch, args=(pattern, x))
    threads.append(t)

for x in threads:
    x.start()

for x in threads:
    x.join()
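A similar effect can be had with the standard library's concurrent.futures.ThreadPoolExecutor, which starts and joins the threads for you. The following is only a sketch under some assumptions: find_words and search_all_letters are hypothetical names, the text is assumed to be read from the file once beforehand, and writing each result to its fivelist file is left out. In standard CPython the regex search itself holds the global interpreter lock, so threads mainly help overlap file reads and writes. It is worth timing this against the plain loop before relying on it.

```python
import re
import string
from concurrent.futures import ThreadPoolExecutor


def find_words(text, ltr):
    """Return the sorted unique 5-character words in text that start with ltr."""
    pattern = r'\b' + ltr + r'\w{4}\b'
    return sorted(set(re.findall(pattern, text)))


def search_all_letters(text, max_workers=4):
    """Run find_words for every lowercase letter on a small thread pool."""
    letters = string.ascii_lowercase
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as `letters`
        results = pool.map(lambda ltr: find_words(text, ltr), letters)
        return dict(zip(letters, results))
```

The returned dictionary has one entry per letter, so each list can afterwards be joined and written to its own file in the same way WordSearch does.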