In this section, we will discuss how to read files in Python. We'll also demo a common pattern in Python programming: we first write code as a "top level" script, and then move it into functions once we feel confident it is working.

To start, open the file called "word_analyzer.py" in your favorite text editor. Once you've got it open, add the following code at the top of the file:

f = open("horse_ebooks.txt", "r")
print(f)

In Python, open is a reporter that takes two arguments. The first is the name of the file you want to open, and the second is a "mode". Here, we use "r" for "read". Don't worry too much about the details in this lab; you'll cover them in a later course. The open reporter reports a file object, which we then store in the variable f using the = operator. What is a file object? You can think of it as a friendly robot that knows how to read one specific file from our computer's hard drive. When we create a file object using open, we're delegating the responsibility for reading that one file to this object. We'll rely on it to get the data we need from the file.

Run this file and you should see mysterious output that looks something like:

$ python word_analyzer.py
<_io.TextIOWrapper name='horse_ebooks.txt' mode='r' encoding='UTF-8'>

Tip: Instead of typing out the entire filename "word_analyzer.py", just type "python wo", then press tab, and your command line will autocomplete the filename.

Consider the mysterious output that Python printed: <_io.TextIOWrapper name='horse_ebooks.txt' mode='r' encoding='UTF-8'>. This is Python's way of printing out the file object. As a programmer, you don't really need to care about (or understand) what this means. It's like asking one of your friends for her DNA sequence instead of asking her what color her hair was before she dyed it. We want the file object to do its job and get the data from the file on our computer; we don't necessarily need to understand how it does all that. To tell it to get the data from the file, modify your word_analyzer.py so that it reads as shown below:

f = open("horse_ebooks.txt", "r")
text = f.read()
print(text)

Try running word_analyzer.py, and you should get a printout of the contents of the file "horse_ebooks.txt".

$ python word_analyzer.py
Fruits and Vegetables and Vegetables on a Budget and Vegetables at a Store and Vegetables to Clean Fruit and Vegetables

Looking at the Python code, f.read() is the important part. read is a function that is built into every file object (just like .append is built into every list and .join is built into every string). Here, we're telling the file object to give us the contents of the file for which it is responsible (horse_ebooks.txt). Note that this was not possible in Snap!, given the restrictions that our web browser places on the Snap! interpreter.
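To make the comparison with lists and strings concrete, here is a small sketch that calls a built-in function on a list, a string, and a file object. So that it runs on its own, it first writes a tiny sample file. The filename sample.txt and its contents are made up for this illustration and are not part of the lab.

```python
# Hypothetical file name and contents, invented for this sketch.
sample_name = "sample.txt"
out = open(sample_name, "w")   # "w" is the mode for writing
out.write("hello from a file")
out.close()

words = ["fruits"]
words.append("vegetables")     # append is built into every list
joined = " and ".join(words)   # join is built into every string

f = open(sample_name, "r")     # "r" is the mode for reading
text = f.read()                # read is built into every file object
f.close()                      # tell the file object we are done with it

print(joined)
print(text)
```

In each case the pattern is the same: we write the object, then a dot, then the name of the function we want that object to carry out.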

Putting Our Code into a Nice Function

In Snap!, we saw that it makes sense to move code that does a specific task into a block. We're going to want to read files throughout the rest of this lab, so we will move our code into a function called read_file. That way, we can call a single function instead of rewriting our two lines of file-reading code every time. Modify your word_analyzer.py so that it reads as below.

def read_file(filename):
    """Returns the text contained in file with given filename."""
    f = open(filename, "r")
    text = f.read()
    return text

You'll notice that we've done something strange: we've added an English description of what the function is supposed to do inside triple quotes. These triple-quoted descriptions, called docstrings, are very (!!) common in real-world Python code, as they give other programmers an understanding of what your function is supposed to do.
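The triple-quoted description is more than a note to human readers: Python keeps it attached to the function. As a small sketch, the function below (a made-up example named shout, not part of the lab) has a description that we can print back out through its __doc__ attribute.

```python
def shout(word):
    """Returns the given word in upper case with an exclamation mark."""
    return word.upper() + "!"

# Python stores the triple-quoted text alongside the function itself.
description = shout.__doc__
print(description)

# The description does not change what the function does when called.
print(shout("horse"))
```

Typing help(shout) in the Python interpreter displays the same description, which is one reason programmers bother to write it.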

Recall that abstraction gives us two glorious advantages: detail hiding and generalization. We've now got a general function that can read any file, and when we read files we don't have to think about file objects or modes or any of that mysterious business. We first saw these principles in action in Snap!, but they'll be important to you as long as you're writing programs, and even beyond.
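To see the generalization at work, here is a sketch that uses one function to read two different files. It differs from the lab's read_file in one respect: it uses Python's "with" statement, which closes the file automatically once we are done reading. That is a swapped-in variation, not the version the lab asks you to write. The two sample filenames and their contents are invented so that the sketch runs on its own.

```python
def read_file(filename):
    """Returns the text contained in file with given filename."""
    # Variation on the lab's version: "with" closes the file for us.
    with open(filename, "r") as f:
        return f.read()

# Made-up sample files, created here only so the sketch is self-contained.
samples = {
    "first_sample.txt": "Fruits and Vegetables",
    "second_sample.txt": "Vegetables on a Budget",
}
for name, contents in samples.items():
    with open(name, "w") as out:
        out.write(contents)

# The same function reads either file; no modes or file objects in sight.
first = read_file("first_sample.txt")
second = read_file("second_sample.txt")
print(first)
print(second)
```

Notice that the code calling read_file never mentions open, "r", or a file object. Those details are hidden inside the function, which is exactly the detail hiding described above.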