Ügyességi játékok programozása
Pythonnal és Pygame-mel

Lab 15: Spell Check

This lab shows how to create a spell checker. To prepare for the lab, go to:
ProgramArcadeGames.com/index.php?chapter=examples list
...and download the files listed below. The files are also in the “Searching and Sorting Examples” section.

If you are working off the BitBucket repository template for this class, then the files are already there and you don't need to download them.

15.1 Requirements

Write a single program in Python that checks the spelling of the first chapter of “Alice In Wonderland.” First use a linear search, then use a binary search. Print the line number along with the word that does not exist in the dictionary.

Follow the steps below carefully. If you don't know how to accomplish one step, ask before moving on to the next step.

15.2 Steps to complete:

  1. If you work off the BitBucket template, skip ahead to step 6.
  2. Find or create a directory for your project.
  3. Download the dictionary to the directory.
  4. Download first 200 lines of Alice In Wonderland to your directory.
  5. Start a Python file for your project.
  6. It is necessary to split apart the words in the story so that they may be checked individually. It is also necessary to remove extra punctuation and white-space. Unfortunately, there is not any good way of doing this with what the book has covered so far. The code to do this is short, but a full explanation is beyond the scope of this class. Include the following function in your program. Remember, function definitions should go at the top of your program just after the imports. We'll call this function in a later step.
    import re
    
    # This function takes in a line of text and returns
    # a list of words in the line.
    def split_line(line):
        return re.findall('[A-Za-z]+(?:\'[A-Za-z]+)?',line)
    
    This code uses a regular expression to split the text apart. Regular expressions are very powerful and relatively easy to learn. To learn more about regular expressions, see:
    http://regexone.com/
  7. Read the file dictionary.txt into an array. Go back to the chapter on Searching, or see the searching_example.py for example code on how to do this. This does not have anything to do with the import command, libraries, or modules. Don't call the dictionary word_list or something generic because that will be confusing. Call it dictionary_list or something similar.
  8. Close the file.
  9. Print --- Linear Search ---
  10. Open the file AliceInWonderLand200.txt
  11. We are not going to read the story into a list. Do not create a new list here like you did with the dictionary.
  12. Start a for loop to iterate through each line.
  13. Call the split_line function to split apart the line of text in the story that was just read in. Store the list that the function returns in a new variable named words. Remember, just calling the function won't do anything useful. You need to assign a variable equal (words) to the result. If you've forgotten now to capture the return value from a function, flip back to the functions chapter to find it.
  14. Start a nested for loop to iterate through each word in the words list. This should be inside the for loop that runs through each line in the file. (One loop for each line, another loop for each word in the line.)
  15. Using a linear search, check the current word against the words in the dictionary. Check the chapter on searching or the searching_example.py for example code on how to do this. The linear search is just three lines long. When comparing to the word to the other words in the dictionary, convert the word to uppercase. In your while loop just use word.upper() instead of word for the key. This linear search will exist inside the for loop created in the prior step. We are looping through each word in the dictionary, looking for the current word in the line that we just read in.
  16. If the word was not found, print the word. Don't print anything if you do find the word, that would just be annoying.
  17. Close the file.
  18. Make sure the program runs successfully before moving on to the next step.
  19. Create a new variable that will track the line number that you are on. Print this line number along with the misspelled from the prior step.
  20. Make sure the program runs successfully before moving on to the next step.
  21. Print --- Binary Search ---
  22. The linear search takes quite a while to run. To temporarily disable it, it may be commented out by using three quotes before and after that block of code. Ask if you are unsure how to do this.
  23. Repeat the same pattern of code as before, but this time use a binary search. Much of the code from the linear search may be copied, and it is only necessary to replace the lines of code that represent the linear search with the binary search.
  24. Note the speed difference between the two searches.
  25. Make sure the linear search is re-enabled, if it was disabled while working on the binary search.
  26. Upload the final program or check in the final program.

15.3 Example Run

--- Linear Search ---
Line 3  possible misspelled word: Lewis
Line 3  possible misspelled word: Carroll
Line 46  possible misspelled word: labelled
Line 46  possible misspelled word: MARMALADE
Line 58  possible misspelled word: centre
Line 59  possible misspelled word: learnt
Line 69  possible misspelled word: Antipathies
Line 73  possible misspelled word: curtsey
Line 73  possible misspelled word: CURTSEYING
Line 79  possible misspelled word: Dinah'll
Line 80  possible misspelled word: Dinah
Line 81  possible misspelled word: Dinah
Line 89  possible misspelled word: Dinah
Line 89  possible misspelled word: Dinah
Line 149  possible misspelled word: flavour
Line 150  possible misspelled word: toffee
Line 186  possible misspelled word: croquet
--- Binary Search ---
Line 3  possible misspelled word: Lewis
Line 3  possible misspelled word: Carroll
Line 46  possible misspelled word: labelled
Line 46  possible misspelled word: MARMALADE
Line 58  possible misspelled word: centre
Line 59  possible misspelled word: learnt
Line 69  possible misspelled word: Antipathies
Line 73  possible misspelled word: curtsey
Line 73  possible misspelled word: CURTSEYING
Line 79  possible misspelled word: Dinah'll
Line 80  possible misspelled word: Dinah
Line 81  possible misspelled word: Dinah
Line 89  possible misspelled word: Dinah
Line 89  possible misspelled word: Dinah
Line 149  possible misspelled word: flavour
Line 150  possible misspelled word: toffee
Line 186  possible misspelled word: croquet

You are not logged in. Log in here and track your progress.