Why is the syntax for open() and .read() different? - python

This is a newbie question, but I looked around and I'm having trouble finding anything specific to this question (perhaps because it's too simple/obvious to others).
So, I am working through Zed Shaw's "Learn Python the Hard Way" and I am on exercise 15. This isn't my first exposure to python, but this time I'm really trying to understand it at a more fundamental level so I can really do something with a programming language for once. I should also warn that I don't have a good background in object oriented programming or fully internalized what objects, classes, etc. etc. are.
Anyway, here is the exercise. The idea is to understand basic file opening and reading:
from sys import argv
script, filename = argv
txt = open(filename)
print "Here's your file %r:" % filename
print txt.read()
print "I'll also ask you to type it again:"
file_again = raw_input("> ")
txt_again = open(file_again)
print txt_again.read()
txt.close()
txt_again.close()
My question is, why are the open and read functions used differently?
For example, to read the example file, why don't/can't I type print read(txt) on line 8?
Why do I put a period in front of the variable and the function after it?
Alternatively, why isn't line 5 written txt = filename.open()?
This is so confusing to me. Is it simply that some functions have one syntax and others another syntax? Or am I not understanding something with respect to how one passes variables to functions.

Syntax
Specifically to the syntactical differences: open() is a function, read() is an object method.
When you call the open() function, it returns an object (first txt, then txt_again).
txt is an object of class file. Objects of class file are defined with the method read(). So, in your code above:
txt = open(filename)
Calls the open() function and assigns an object of class file into txt.
Afterwards, the code:
txt.read()
calls the method read() that is associated with the object txt.
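A minimal sketch of the same distinction, written for Python 3 (the question uses Python 2, where the class is called file). The temporary file and its content are assumptions, added only so the example runs on its own:

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained (hypothetical content).
fd, filename = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as tmp:
    tmp.write("hello from the file\n")

txt = open(filename)              # open() is a function: called by name, returns an object
has_read = hasattr(txt, "read")   # the returned object carries a read() method
contents = txt.read()             # read() is a method: called through the object, with a dot
txt.close()
os.remove(filename)

print(type(txt).__name__)         # name of the class of the object that open() returned
print(contents)
```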
Objects
In this scenario, it's important to understand that objects are defined not only as data entities, but also with built-in actions against those entities.
e.g. A hypothetical object of class car might be defined with methods like start_engine(), stop_engine(), open_doors(), etc.
So as a parallel to your file example above, code for creating and using a car might be:
my_car = create_car(type_of_car)
my_car.start_engine()
(Wikipedia entry on OOP.)
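To make the car parallel runnable, here is a small sketch. The Car class, its attributes, and create_car() are all hypothetical names invented for illustration; the point is only that create_car() plays the role of open() and start_engine() plays the role of read():

```python
class Car:
    """Hypothetical class: data (engine state) plus actions on that data."""

    def __init__(self, type_of_car):
        self.type_of_car = type_of_car
        self.engine_running = False

    def start_engine(self):
        self.engine_running = True

    def stop_engine(self):
        self.engine_running = False


def create_car(type_of_car):
    """Plain function that returns an object, much as open() does."""
    return Car(type_of_car)


my_car = create_car("sedan")   # function call: name(arguments)
my_car.start_engine()          # method call: object.method()
print(my_car.engine_running)
```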

To answer this you should have some understanding of object oriented programming.
open() is a normal function, and the first parameter is a string, with the path to the file. The return value of this function is an object.
The further work is done by using this object. An object also has functions, but they are called methods. These methods are called in the context of this object, and the dot connects the object with the method. So txt.read() means that you are calling the read method of the txt object.
But if you really want to understand this, you should have a look at OOP.

You're coming up against methods vs functions.
open is a global function, and it takes as its parameters simply the things that go between the brackets.
read is a method of file objects. The expression txt.read() calls the read method of the txt object. Under the hood, the txt object is passed as the first parameter of its read method. The read method will be defined something like this:
class File(object):
    def read(self):
        # do whatever here
        # self is whatever object appears to the left of the dot in foo.read
        pass
It follows from the above definition that you can only use a method like read on an object which has a read method defined for it.
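The "under the hood" part can be checked directly. This is a sketch with a stand-in File class (hypothetical, not Python's real file type): calling the method through the object and calling it through the class with the object passed explicitly give the same result, and an object whose class defines no read method raises AttributeError.

```python
class File(object):
    """Stand-in for a real file class; hypothetical, for illustration only."""

    def __init__(self, text):
        self.text = text

    def read(self):
        # self is the object that appeared to the left of the dot
        return self.text


txt = File("some contents")
via_dot = txt.read()          # the usual spelling
via_class = File.read(txt)    # the same call with the object passed explicitly
print(via_dot == via_class)

try:
    "just a string".read()    # str defines no read method
except AttributeError as exc:
    error = str(exc)
    print("AttributeError:", error)
```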

Related

In a class, how do I name the output file after the future instance of the class?

I am a beginner at writing classes. In my class, I have defined a function that outputs a CSV file. However, I would like to name the CSV file after the future instance.
Here is some dummy code that hopefully explains what I want to do:
class output:
    def function(self):
        df.to_csv('output.csv')
a = output()
a.function()
The result I am looking for: the output CSV file is named a.csv
How do I achieve this?
An appropriate way to code your intent is to pass the name of the csv file (or some portion of it) to the object's constructor:
#UNTESTED
class output:
    def __init__(self, myname):
        self.myname = myname
    def function(self):
        # df is assumed to be a DataFrame defined elsewhere, as in the question
        df.to_csv(self.myname + '.csv')
a = output('a')
a.function()
There's no useful way to do precisely what you asked for. The name(s) of the variable(s) that are bound to your object are not known by the object itself.
But even if the variable(s)'s name(s) were available, consider this:
a=output()
b=a
a.function()
b.function()
The two .function() calls are happening on precisely the same object. The object wouldn't know whether to create a.csv or b.csv.
Even worse, what about:
lst[4] = dct["hello"] = output()
Assuming .function() is invoked, what would the filename be?
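A short sketch of the aliasing point, using a hypothetical Output class that only computes the file name (no DataFrame involved, so it runs on its own). Both names refer to one object, which is why the name has to be handed to the object explicitly:

```python
class Output:
    """Hypothetical class that receives its file name explicitly."""

    def __init__(self, name):
        self.name = name

    def filename(self):
        return self.name + ".csv"


a = Output("a")
b = a                               # a second name bound to the same object
print(a is b)                       # one object, two names
print(a.filename(), b.filename())   # both calls see the name given to the constructor
```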
I think what you describe is possible due to Python's extremely dynamic abilities. However, it will be clumsy and most probably not the right way to achieve your goal. Consider simply doing
class output:
    def function(self, name):
        df.to_csv(name + '.csv')
a = output()
a.function('a')
Generally speaking, your choice of variable names should never affect the final outcome of your code.

Basic use of functions in python

I'm trying to learn Python 3. This is an example I am trying to learn from. So here I define a function to read text. Open a file, read the contents, print it, then close.
So this code runs well. The thing I don't understand, however, is why we write:
print(contents_of_file), but not read(quotes). How come it's quotes.read()? As far as I can understand, both print() and read() are functions, and I expected both to be used the same way. What am I missing here? Please help.
Is there a rule when to put stuff inside brackets and when not to?
def read_text():
    quotes = open("/Users/me/text.txt", encoding="utf-8")
    contents_of_file = quotes.read()
    print(contents_of_file)
    quotes.close()
read_text()
print() is a function. read() is a method of the object bound to quotes. As such, read must be referred to by accessing quotes. Only then can we add parens to invoke it.
You've stumbled across the often argued definitions of functions and methods.
read() is a method that belongs to quotes (which is an instance of a class whose name I don't actually know). Technically, methods belong to objects, while functions are normally defined in a style that isn't strictly object-oriented, or in global scope (like all C functions).
It might be worth reading up on the OOP aspects of Python; this will likely help you understand it better.
quotes is a file object. I understand you don't yet know what an object is. But try printing the type of quotes.
print(type(quotes))
This object has a function read() whose purpose is to read contents from the file.
To call a function of an object, you have to write:
object.funcName()
As this is exactly what we want, we are just calling that function. So we are writing:
quotes.read()
print doesn't belong to any object of this type, so we can call it without any object reference.
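The two steps (look the method up on the object, then add parentheses to call it) can be seen separately. This sketch uses io.StringIO as an in-memory stand-in for the opened file, which is an assumption made so that no file on disk is needed:

```python
import io

quotes = io.StringIO("To be, or not to be")  # in-memory stand-in for an opened file

reader = quotes.read          # no parentheses: only looks the method up on the object
contents_of_file = reader()   # parentheses: now the method is actually called
print(contents_of_file)

try:
    read(quotes)              # there is no global function called read
except NameError as exc:
    print("NameError:", exc)
```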

Correct way to write to files?

I was wondering if there was any difference between doing:
var1 = open(filename, 'w').write("Hello world!")
and doing:
var1 = open(filename, 'w')
var1.write("Hello world!")
var1.close()
I find that there is no need to close the file after using the first method (all in one line); in fact, trying to run close() afterwards raises an AttributeError.
I was wondering if one way was actually any different/'better' than the other, and secondly, what is Python actually doing here? I understand that open() returns a file object, but how come running all of the code in one line automatically closes the file too?
Using the with statement is the preferred way:
with open(filename, 'w') as f:
    f.write("Hello world!")
It ensures the file object is closed once the with block is exited.
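A quick way to convince yourself is to inspect the file object's closed attribute. The temporary directory and file name below are assumptions, used only to keep the sketch self-contained:

```python
import os
import tempfile

# Hypothetical scratch location for the example.
path = os.path.join(tempfile.mkdtemp(), "hello.txt")

with open(path, "w") as f:
    f.write("Hello world!")
    open_inside = not f.closed   # still open while inside the block

closed_after = f.closed          # closed as soon as the block is left

with open(path) as f:
    text = f.read()

print(open_inside, closed_after, text)
```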
Let me explain why your first example won't work if you call the close() method. This will be useful for your future ventures into learning object-oriented programming in Python.
Example 1
When you run open(filename, 'w'), it initialises and returns a file handle object.
When you call open(filename, 'w').write('helloworld'), you are calling the write method on the file object you just created. Since the write method does not return a file object (in Python 2 it returns nothing at all), var1 in your code above will be None, of type NoneType, which has no close() method.
Example 2
In your second example, you are storing the file object itself as var1.
var1 therefore has the write method as well as the close method, and hence it works.
This is in contrast to what you have done in your first example.
falsetru has provided a good example of how you can read and write files using the with statement.
Reading and Writing file using the with statement
to write
with open(filename, 'w') as f:
    f.write("helloworld")
to read
with open(filename) as f:
    for line in f:
        # do your stuff here
        pass
Using nested with statements to read/write multiple files at once
Here's an update in response to your question in the comments. I'm not sure this is the most Pythonic way, but if you would like to read and write multiple files at the same time using the with statement, you can nest with statements inside one another.
For instance:
with open('a.txt', 'r') as a:
    with open('b.txt', 'w') as b:
        for line in a:
            b.write(line)
How and Why
The file object itself is an iterator, so you can iterate through the file with a for loop. The file object has a next() method (__next__() in Python 3), which is called on each iteration until the end of the file is reached.
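The iterator behaviour can be observed directly. This sketch is written for Python 3 and uses io.StringIO as an in-memory stand-in for a file opened in text mode (an assumption, so nothing has to exist on disk):

```python
import io

f = io.StringIO("first\nsecond\n")   # in-memory stand-in for an opened text file

is_own_iterator = iter(f) is f       # the object is its own iterator
first = next(f)                      # what a for loop does on each pass
rest = [line for line in f]          # the loop continues from where next() stopped

try:
    next(f)                          # nothing left: this is how the loop knows to stop
    exhausted = False
except StopIteration:
    exhausted = True

print(is_own_iterator, repr(first), rest, exhausted)
```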
The with statement was introduced in Python 2.5. Prior to Python 2.5, to achieve the same effect, one had to write:
f = open("hello.txt")
try:
    for line in f:
        print line,
finally:
    f.close()
The with statement now does that for you automatically. The try and finally statements ensure that the file is closed even if an exception is raised in the for loop.
source : Python Built-in Documentation
Official documentations
Using the with statement, f.close() will be called automatically when it finishes. https://docs.python.org/2/tutorial/inputoutput.html
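The claim that the file gets closed even when the loop body fails can be tested with a deliberately raised error. The scratch file and the ValueError below are assumptions made for the demonstration:

```python
import os
import tempfile

# Hypothetical scratch file with two lines.
path = os.path.join(tempfile.mkdtemp(), "hello.txt")
with open(path, "w") as out:
    out.write("a\nb\n")

try:
    with open(path) as f:
        for line in f:
            raise ValueError("simulated failure while processing a line")
except ValueError:
    pass   # the error still propagates out of the with block; we catch it here

closed_despite_error = f.closed
print(closed_despite_error)
```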
Happy venture into python
cheers,
biobirdman
@falsetru's answer is correct, in terms of telling you how you're "supposed" to open files. But you also asked what the difference was between the two approaches you tried, and why they do what they do.
The answer to those parts of your question is that the first approach doesn't do what you probably think it does. The following code
var1 = open(filename, 'w').write("Hello world!")
is roughly equivalent to
tmp = open(filename, 'w')
var1 = tmp.write("Hello world!")
del tmp
Notice that the open() function returns a file object, and that file object has a write() method. But write() doesn't have any return value, so var1 winds up being None. From the official documentation for file.write(str):
Write a string to the file. There is no return value. Due to buffering, the string may not actually show up in the file until the flush() or close() method is called.
Now, the reason you don't need to close() is that the main implementation of Python (the one found at python.org, also called CPython) happens to garbage-collect objects that no longer have references to them, and in your one-line version, you don't have any reference to the file object once the statement completes. You'll find that your multiline version also doesn't strictly need the close(), since all references will be cleaned up when the interpreter exits. But see answers to this question for a more detailed explanation about close() and why it's still a good idea to use it, unless you're using with instead.
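The expansion above can be run as-is to see what var1 actually holds. One caveat for readers on Python 3: there, write() returns the number of characters written rather than None, but the conclusion is the same, since var1 is still not a file object and has no close() method. The scratch path is an assumption for the example:

```python
import os
import tempfile

# Hypothetical scratch location for the example.
path = os.path.join(tempfile.mkdtemp(), "out.txt")

tmp = open(path, "w")
var1 = tmp.write("Hello world!")   # var1 is write()'s return value, not the file
tmp.close()                        # closed explicitly instead of relying on garbage collection

print(type(var1).__name__)
print(hasattr(var1, "close"))      # whatever write() returned, it cannot be closed
```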

CSV Parsing, trying to understand some code

Here's the code
import csv
def csv_dict_reader(file_obj):
    """
    read a CSV file using csv.DictReader
    """
    reader = csv.DictReader(file_obj, delimiter=',')
    for line in reader:
        print(line['first_name']),
        print(line['last_name']),
if __name__ == "__main__":
    with open("dummy.csv") as f_obj:
        csv_dict_reader(f_obj)
I wanted to try and do a quick breakdown, to see if I understand how exactly this works. Here we go:
1) import csv brings in the csv method
2) We define a function, which takes 'file_obj' as its argument
3) the reader variable makes a call to a function within csv called "DictReader", which subsequently takes arguments from 'file_obj' and specifies a 'delimiter'
4) I get confused with this for loop, why is that we don't have to define line beforehand? Is it that line is already defined as part of 'reader'?
5) I'm really confused when it comes to 'name' and 'main', are these somehow related to how we specify a 'file_obj'? I'm equally confused with how we end up specifying the 'file_obj' in the end; I've been assuming 'f_obj' somehow manages to fill this role.
--edit--
Awesome, this is starting to make a whole lot more sense to me. So, when I make a 'class' call to DictReader(), I'm creating an instance of it in the variable 'reader'?
Maybe I'm going too far off the beaten path, but what in the DictReader() class allows it to determine the structure of fields like 'last_name' or 'first_name'? I'm assuming it has something to do with how CSV files are structured, but I'm not entirely certain.
1) import csv brings in the csv method
Well, not quite; it brings in the csv module.*
* … which includes the csv.DictReader class, which has a csv.DictReader.__next__ method that you call implicitly, but that's not important here.
2) We define a function, which takes 'file_obj' as its argument
Exactly.*
* Technically, there's a distinction between arguments and parameters, or between actual vs. formal arguments/parameters. You probably don't want to learn that yet. But if you do, formal parameters go in function definitions; actual arguments go in function calls.
3) the reader variable makes a call to a function within csv called "DictReader", which subsequently takes arguments from 'file_obj' and specifies a 'delimiter'
Again, not quite; it makes a call to the class DictReader. Calling a class constructs an instance of that class. Arguments are passed the same way as in a function call.* You can see the parameters that DictReader takes by looking it up in the help.
* In fact, constructing a class actually calls the class's __new__ method, and then (usually) its __init__ method. But that's only important when you're writing new classes; when you're just using classes, you don't care about __new__ or __init__. That's why the documentation shows, e.g., class csv.DictReader(csvfile, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds).
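To make point 3 concrete, and to touch on the follow-up about where 'first_name' and 'last_name' come from: when fieldnames is left at its default, DictReader takes the keys from the first row of the file. The sketch below uses io.StringIO with made-up sample data in place of dummy.csv, so it runs without a file on disk:

```python
import csv
import io

# Made-up sample data standing in for dummy.csv.
file_obj = io.StringIO("first_name,last_name\nJane,Doe\nJohn,Smith\n")

reader = csv.DictReader(file_obj, delimiter=",")   # calling the class creates an instance
names = [(line["first_name"], line["last_name"]) for line in reader]

print(isinstance(reader, csv.DictReader))
print(reader.fieldnames)   # keys taken from the header row
print(names)
```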
4) I get confused with this for loop, why is that we don't have to define line beforehand? Is it that line is already defined as part of 'reader'?
No, that's exactly what for statements do: each time through the loop, line gets assigned to the next value in reader. The tutorial explains in more detail.
A simpler example may help:
for a in [1, 2, 3]:
print(a)
This assigns 1 to a, prints out that 1, then assigns 2 to a, prints out that 2, then assigns 3 to a, prints out that 3, then it's done.
Also, you may be confused by other languages, which need variables to be declared before they can be used. Python doesn't do that; you can assign to any name you want anywhere you want, and if there wasn't a variable with that name, there is now.
5) I'm really confused when it comes to 'name' and 'main'
This is a special case where you have to learn something reasonably advanced a little early.
The same source code file can be used as a script, to run on the command line, and also as a module, to be imported by other code. The way you distinguish between the two is by checking __name__. If you're being run as a script, it will be '__main__'. If you're being used as a module by some other script, it will be whatever the name of your module is.
So, idiomatically, you define all your public classes and functions and constants that might be useful to someone else, then you do if __name__ == '__main__': and put all the "top-level script" code there that you want to execute if someone runs you as a script.
Again, the tutorial explains in more detail.
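A minimal sketch of that layout, with hypothetical function names: greet() is the reusable part that another file could import, and main() holds the script-only code, which runs only when the file is executed directly.

```python
def greet(name):
    """Reusable piece that other code could import."""
    return "Hello, " + name


def main():
    """Top-level script behaviour."""
    print(greet("world"))


if __name__ == "__main__":
    # True when this file is run as a script; False when it is imported as a module.
    main()
```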

Trouble with the scope of variables in a class?

So I'm in the process of making a class in Python that creates a network (with pybrain) using solely the numeric input it's given {just a little process to get my feet wet in Pybrain's API}.
My problem is, I'm rather unfamiliar with how scopes work in classes, and while I basically have the program set up properly, it keeps raising a KeyError.
All the variables that need to be acted upon are created in the init function; the method I'm working on for the class is trying to access one of the variables declared in the init function, using the vars()[] method in Python. (You can see a portion of the code's rough draft in my earlier question, "Matching Binary operators in Tuples to Dictionary Items".)
Anyways, the method is:
def constructnetwork(self):
    """constructs network using gathered data in __init__"""
    if self.recurrent:
        self.network = pybrain.RecurrentNetwork
        #add modules/connections here
    elif self.forward:
        self.network = pybrain.FeedForwardNetwork
    else:
        self.network = pybrain.network
    print vars()[self.CONNECT+str(1)]
    print vars()[self.CONNECT+str(2)]
    print self.network
(pardon the bad spacing, it didn't copy and paste over well.) The part that's raising the KeyError is "print vars()[self.CONNECT+str(1)]", which should retrieve the value of the variable "Connection1" (self.CONNECT = 'Connection'), but raises a KeyError instead.
How do I get the variables to transfer over? If you need more information to help, just ask; I'm trying to keep the question as short as possible.
vars() returns a reference to the dictionary of local variables. If you used vars() in your __init__ (as the code in the post you linked to suggests), then you just created a local variable in that method, which isn't accessible from anywhere outside that method.
What is it that you think vars() does, and what are you trying to do? I have a hunch that what you want is getattr and setattr, or just a dictionary, and not vars.
Edit: Based on your comment, it sounds like, indeed, you shouldn't use vars. You would be better off, in __init__, doing something like:
self.connections = {}
self.connections[1] = "This is connection 1"
then in your method, do:
self.connections[1]
This is just a vague guess based on your code, though. I can't really tell what you are intending for this "connection". Do you want it to be some data associated with your object?
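Here is a sketch of the difference, with a hypothetical Network class that does not use pybrain at all. A plain local created in __init__ is gone by the time another method runs, so vars() inside that method sees only the method's own locals; anything stored on self (in a dictionary, or via setattr) is still reachable:

```python
class Network:
    """Hypothetical class, unrelated to pybrain, illustrating the scoping issue."""

    def __init__(self):
        Connection1 = "local to __init__"   # plain local: discarded when __init__ returns
        self.connections = {1: "This is connection 1"}            # stored on the instance
        setattr(self, "Connection2", "stored on the instance")    # hypothetical attribute name

    def construct(self):
        local_names = sorted(vars())        # locals of this method only
        return local_names, self.connections[1], getattr(self, "Connection2")


net = Network()
local_names, first, second = net.construct()
print(local_names)            # "Connection1" is not among them
print(first)
print(second)
print(sorted(vars(net)))      # vars(obj) lists the instance's attributes instead
```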
