CSV Parsing, trying to understand some code - python

Here's the code
import csv
def csv_dict_reader(file_obj):
    """
    read a CSV file using csv.DictReader
    """
    reader = csv.DictReader(file_obj, delimiter=',')
    for line in reader:
        print(line['first_name']),
        print(line['last_name']),
if __name__ == "__main__":
    with open("dummy.csv") as f_obj:
        csv_dict_reader(f_obj)
I wanted to try and do a quick breakdown, to see if I understand how exactly this works. Here we go:
1) import csv brings in the csv method
2) We define a function, which takes 'file_obj' as its argument
3) the reader variable makes a call to a function within csv called "DictReader", which subsequently takes arguments from 'file_obj' and specifies a 'delimiter'
4) I get confused with this for loop, why is that we don't have to define line beforehand? Is it that line is already defined as part of 'reader'?
5) I'm really confused when it comes to 'name' and 'main', are these somehow related to how we specify a 'file_obj'? I'm equally confused with how we end up specifying the 'file_obj' in the end; I've been assuming 'f_obj' somehow manages to fill this role.
--edit--
Awesome, this is starting to make a whole lot more sense to me. So, when I make a 'class' call to DictReader(), I'm creating an instance of it in the variable 'reader'?
Maybe I'm going too far off the beaten path, but what in the DictReader() class allows it to determine the structure of fields like 'last_name' or 'first_name'? I'm assuming it has something to do with how CSV files are structured, but I'm not entirely certain.

1) import csv brings in the csv method
Well, not quite; it brings in the csv module.*
* … which includes the csv.DictReader class, which has a csv.DictReader.__next__ method that you call implicitly, but that's not important here.
2) We define a function, which takes 'file_obj' as its argument
Exactly.*
* Technically, there's a distinction between arguments and parameters, or between actual vs. formal arguments/parameters. You probably don't want to learn that yet. But if you do, formal parameters go in function definitions; actual arguments go in function calls.
3) the reader variable makes a call to a function within csv called "DictReader", which subsequently takes arguments from 'file_obj' and specifies a 'delimiter'
Again, not quite; it makes a call to the class DictReader. Calling a class constructs an instance of that class. Arguments are passed the same way as in a function call.* You can see the parameters that DictReader takes by looking it up in the help.
* In fact, constructing a class actually calls the class's __new__ method, and then (usually) its __init__ method. But that's only important when you're writing new classes; when you're just using classes, you don't care about __new__ or __init__. That's why the documentation shows, e.g., class csv.DictReader(csvfile, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds).
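The fieldnames=None default in that signature also bears on the follow-up question about 'first_name' and 'last_name': when no fieldnames are passed, DictReader takes its keys from the first row of the file. A minimal sketch, assuming made-up sample data held in an io.StringIO object in place of dummy.csv:

```python
import csv
import io

# Hypothetical sample data standing in for dummy.csv; the first row is the header.
sample = io.StringIO("first_name,last_name\nAda,Lovelace\nAlan,Turing\n")

reader = csv.DictReader(sample)  # no fieldnames given, so the header row is used
rows = list(reader)              # each row is a dict keyed by the header names

print(reader.fieldnames)
for row in rows:
    print(row["first_name"], row["last_name"])
```

If the file has no header row, the column names can be supplied through the fieldnames parameter instead.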
4) I get confused with this for loop, why is that we don't have to define line beforehand? Is it that line is already defined as part of 'reader'?
No, that's exactly what for statements do: each time through the loop, line gets assigned to the next value in reader. The tutorial explains in more detail.
A simpler example may help:
for a in [1, 2, 3]:
    print(a)
This assigns 1 to a, prints out that 1, then assigns 2 to a, prints out that 2, then assigns 3 to a, prints out that 3, then it's done.
Also, you may be confused by other languages, which need variables to be declared before they can be used. Python doesn't do that; you can assign to any name you want anywhere you want, and if there wasn't a variable with that name, there is now.
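To make the "assigned to the next value" part concrete, here is roughly what a for statement does behind the scenes, written out by hand. This is only an illustrative sketch (the names values, iterator and seen are made up for the example), not how you would normally write a loop:

```python
values = [1, 2, 3]
seen = []  # only here so the result can be inspected afterwards

iterator = iter(values)       # ask the object for an iterator
while True:
    try:
        a = next(iterator)    # the name a is created (or rebound) right here
    except StopIteration:     # raised when there are no values left
        break
    seen.append(a)
    print(a)
```

The same thing happens with reader in the question: each pass through the loop asks reader for its next row and binds it to line.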
5) I'm really confused when it comes to 'name' and 'main'
This is a special case where you have to learn something reasonably advanced a little early.
The same source code file can be used as a script, to run on the command line, and also as a module, to be imported by other code. The way you distinguish between the two is by checking __name__. If you're being run as a script, it will be '__main__'. If you're being used as a module by some other script, it will be whatever the name of your module is.
So, idiomatically, you define all your public classes and functions and constants that might be useful to someone else, then you do if __name__ == '__main__': and put all the "top-level script" code there that you want to execute if someone runs you as a script.
Again, the tutorial explains in more detail.
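A small sketch of that idiom. The helper name describe_run_mode is invented for this example; it just reports which of the two cases a given name corresponds to:

```python
def describe_run_mode(module_name):
    """Return 'script' if module_name is '__main__', otherwise 'module'."""
    if module_name == "__main__":
        return "script"
    return "module"


# Reusable definitions go above; top-level script code goes under the guard.
if __name__ == "__main__":
    # This branch runs only when the file is executed directly,
    # not when it is imported from another file.
    print("running as a", describe_run_mode(__name__))
```

Importing this file from elsewhere defines describe_run_mode but prints nothing, because __name__ is then the module's name rather than '__main__'.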

Related

Running function code only when NOT assigning output to variable?

I am looking for a way in Python to skip certain parts of the code inside a function, but only when the output of the function is assigned to a variable. If the function is run without any assignment, then it should run everything inside it.
Something like this:
def function():
    print('a')
    return ('a')
function()
A=function()
The first time that I call function() it should display a on the screen, while the second time nothing should print and only store value returned into A.
I have not tried anything since I am kind of new to Python, but I was imagining it would be something like the if __name__=='__main__': way of checking if a script is being used as a module or run directly.
I don't think such behavior can be achieved in Python, because within the scope of the function call there is no indication of what you will do with the returned value.
You will have to give the function an argument that tells it to skip those instructions, with a default value to keep the normal call simple.
def call_and_skip(skip_instructions=False):
    if not skip_instructions:
        call_stuff_or_not()
    call_everytime()
call_and_skip()
# will not skip inside instruction
a_variable = call_and_skip(skip_instructions=True)
# will skip inside instructions
As already mentioned in the comments, what you're asking for is not technically possible: a function has no knowledge (and cannot have any) of what the calling code will do with the return value.
For a simple case like your example snippet, the obvious solution is to just remove the print call from within the function and leave it out to the caller, ie:
def fun():
    return 'a'
print(fun())
Now I assume your real code is a bit more complex than this, so such a simple solution would not work. If that's the case, the solution is to split the original function into several distinct ones and let the caller choose which parts it wants to call. If you have complex state (local variables) that needs to be shared between the different parts, you can wrap the whole thing in a class, turning the sub-functions into methods and storing those variables as instance attributes.
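A minimal sketch of that split, using the toy function from the question. The names compute and compute_and_show are made up for illustration:

```python
def compute():
    """Do the actual work and return the result, without printing."""
    return 'a'


def compute_and_show():
    """Same work, plus the side effect of printing the result."""
    result = compute()
    print(result)
    return result


compute_and_show()   # displays a on the screen
A = compute()        # prints nothing, only stores the value
```

The caller decides which behavior it wants by picking the function, so nothing has to guess what happens to the return value.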

Modules and variable scopes

I'm not an expert at python, so bear with me while I try to understand the nuances of variable scopes.
As a simple example that describes the problem I'm facing, say I have the following three files.
The first file is outside_code.py. Due to certain restrictions I cannot modify this file. It must be taken as is. It contains some code that runs an eval at some point (yes, I know that eval is the spawn of satan but that's a discussion for a later day). For example, let's say that it contains the following lines of code:
def eval_string(x):
    return eval(x)
The second file is a set of user defined functions. Let's call it functions.py. It contains some unknown number of function definitions, for example, let's say that functions.py contains one function, defined below:
def foo(x):
    print("Your number is {}!".format(x))
Now I write a third file, let's call it main.py. Which contains the following code:
import outside_code
from functions import *
outside_code.eval_string("foo(4)")
I import all of the function definitions from functions.py with a *, so they should be accessible by main.py without needing to do something like functions.foo(). I also import outside_code.py so I can access its core functionality, the code that contains an eval. Finally I call the function in outside_code.py, passing a string that is related to a function defined in functions.py.
In the simplified example, I want the code to print out "Your number is 4!". However, I get an error stating that 'foo' is not defined. This obviously means that the code in outside_code.py cannot access the same foo function that exists in main.py. So somehow I need to make foo accessible to it. Could anyone tell me exactly what the scope of foo currently is, and how I could extend it to cover the space that I actually want to use it in? What is the best way to solve my problem?
You'd have to add those names to the scope of outside_code. If outside_code is a regular Python module, you can do so directly:
import outside_code
import functions
for name in getattr(functions, '__all__', (n for n in vars(functions) if not n[0] == '_')):
    setattr(outside_code, name, getattr(functions, name))
This takes all names functions exports (which you'd import with from functions import *) and adds a reference to the corresponding object to outside_code so that eval() inside outside_code.eval_string() can find them.
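To try this out without creating three files, the two modules can be simulated in one script. The sketch below is only an illustration: it builds stand-in modules with types.ModuleType and exec, which is an assumption made for the demo and not how the real outside_code.py and functions.py would be loaded, and foo returns its message instead of printing it so the result can be checked.

```python
import types

# Stand-in for outside_code.py (simulated; the real file would simply be imported).
outside_code = types.ModuleType("outside_code")
exec("def eval_string(x):\n    return eval(x)\n", outside_code.__dict__)

# Stand-in for functions.py.
functions = types.ModuleType("functions")
exec("def foo(x):\n    return 'Your number is {}!'.format(x)\n", functions.__dict__)

# Before copying the names, eval() inside outside_code cannot see foo.
try:
    outside_code.eval_string("foo(4)")
    failed_before = False
except NameError:
    failed_before = True

# Copy every public name from functions into outside_code's namespace.
public = getattr(functions, "__all__",
                 [n for n in vars(functions) if not n.startswith("_")])
for name in public:
    setattr(outside_code, name, getattr(functions, name))

result = outside_code.eval_string("foo(4)")
print(failed_before, result)
```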
You could use the ast.parse() function to produce a parse tree from the expression before passing it to eval_string(), then extract all global names from the expression and add only those names to outside_code to limit the damage, so to speak, but you'd still be clobbering the other module's namespace to make this work.
Mind you, this is almost as evil as using eval() in the first place, but it's your only choice if you can't tell eval() in that other module to take a namespace parameter. That's because by default, eval() uses the global namespace of the module it is run in as the namespace.
If, however, your eval_string() function actually accepts more parameters, look for a namespace or globals option. If that exists, the function probably looks more like this:
def eval_string(x, namespace=None):
    return eval(x, namespace)
after which you could just do:
outside_code.eval_string('foo(4)', vars(functions))
where vars(functions) gives you the namespace of the functions module.
foo has been imported into main.py; its scope is restricted to that file (and to the file where it was originally defined, of course). It does not exist within outside_code.py.
The real eval function accepts locals and globals dicts to allow you to add elements to the namespace of the evaluated code. But you can't do anything if your eval_string doesn't already pass those on.
The relevant documentation: https://docs.python.org/3.5/library/functions.html#eval
eval takes an optional dictionary mapping global names to values
eval('foo(4)', {'foo': foo})
Will do what you need. It is mapping the string 'foo' to the function object foo.
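A short sketch of the difference the globals dictionary makes. Here foo returns its message instead of printing it, purely so the result is easy to check:

```python
def foo(x):
    return "Your number is {}!".format(x)


# With an explicit globals dict, the evaluated code can see foo.
result = eval("foo(4)", {"foo": foo})

# With a globals dict that lacks foo, the lookup fails.
try:
    eval("foo(4)", {})
    error = None
except NameError as exc:
    error = str(exc)

print(result)
print(error)
```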
EDIT
Rereading your question, it looks like this won't work for you. My only other thought is to try
outside_code.eval_string('eval("foo(4)", {"foo": lambda x: print("Your number is {}!".format(x))})')
But that's a very hackish solution and doesn't scale well to functions that don't fit in lambdas.

Basic use of functions in python

I'm trying to learn Python 3. This is an example I am trying to learn from. So here I define a function to read text. Open a file, read the contents, print it, then close.
So this code runs well. The thing I don't understand, however, is why we write:
print(contents_of_file), but not read(quotes). How come it's quotes.read()? As far as I can understand, both print() and read() are functions, and I expected both to be used the same way. What am I missing here? Please help.
Is there a rule when to put stuff inside brackets and when not to?
def read_text():
    quotes = open("/Users/me/text.txt", encoding = "utf-8")
    contents_of_file = quotes.read()
    print(contents_of_file)
    quotes.close()
read_text()
print() is a function. read() is a method of the object bound to quotes. As such, read must be referred to by accessing quotes. Only then can we add parens to invoke it.
You've stumbled across the often argued definitions of functions and methods.
read() is a method that belongs to quotes (which is an instance of a class whose name I don't actually know). Technically, methods belong to objects, whereas functions are normally defined in a style that isn't strictly object-oriented, or in global scope (like all C functions).
It might be worth reading up on the OOP aspects of Python, this will likely help you understand it more.
quotes is a file object. I understand you don't yet know what an object is, but try printing the type of quotes:
print(type(quotes))
This object has a method read() whose purpose is to read the contents of the file.
To call a method of an object, you write:
object.funcName()
That is exactly what we want here, so we write:
quotes.read()
print doesn't belong to any object of this kind, so we can call it without an object reference.
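You can poke at this yourself without touching a real file. The sketch below uses io.StringIO, an in-memory text object that offers the same read() method, as a stand-in for the object open() returns; the text inside it is made up:

```python
import io

# In-memory stand-in for the file object returned by open(...).
quotes = io.StringIO("some text")

print(type(quotes))              # shows which class the object belongs to
print(hasattr(quotes, "read"))   # read is looked up on the object itself

contents_of_file = quotes.read() # so it is called through the object
print(contents_of_file)          # print is a plain built-in function
```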

Can I use same argument names when passing arguments in Python

Can you please help me? I believe this is a pretty easy question, but I don't want to mess up my assignment. I'm going to have a class in my module, and this class will have a few functions.
I just want to be sure it works all right and that this is not an ugly coding practice.
That is, my first function test_info accepts one parameter, test_code, and returns something, and the second function check_class accepts two parameters, one of which is also called test_code.
Can I use the same argument name, test_code, in both? Is it normal coding practice?
def test_info (self, test_code):
    my_test_code = test_code
    #here we'll be using my_test_code to get info from txt file and return other info
def check_class (self, test_code, other_arg):
    my_test_code = test_code
    #here some code goes
Also is it fine to use my_test_code in both functions to get argument value or is it better to use different ones like my_test_code_g etc.
Many thanks
Yes you may.
The two variables test_code are defined only in the scope of their respective functions and therefore will not interfere with one another since the other functions lie outside their scope.
Same goes for my_test_code
Read online about variable scopes. Here is a good start
There is no technical reason to resolve this one way or another. But if the variables don't serve exactly the same purpose in both functions, it's confusing for a human reader if they have the same name.

Why the syntax for open() and .read() is different?

This is a newbie question, but I looked around and I'm having trouble finding anything specific to this question (perhaps because it's too simple/obvious to others).
So, I am working through Zed Shaw's "Learn Python the Hard Way" and I am on exercise 15. This isn't my first exposure to python, but this time I'm really trying to understand it at a more fundamental level so I can really do something with a programming language for once. I should also warn that I don't have a good background in object oriented programming or fully internalized what objects, classes, etc. etc. are.
Anyway, here is the exercise. The idea is to understand basic file opening and reading:
from sys import argv

script, filename = argv

txt = open(filename)

print "Here's your file %r:" % filename
print txt.read()

print "I'll also ask you to type it again:"
file_again = raw_input("> ")

txt_again = open(file_again)

print txt_again.read()
txt.close()
txt_again.close()
My question is, why are the open and read functions used differently?
For example, to read the example file, why don't/can't I type print read(txt) on line 8?
Why do I put a period in front of the variable and the function after it?
Alternatively, why isn't line 5 written txt = filename.open()?
This is so confusing to me. Is it simply that some functions have one syntax and others another syntax? Or am I not understanding something with respect to how one passes variables to functions.
Syntax
Specifically to the syntactical differences: open() is a function, read() is an object method.
When you call the open() function, it returns an object (first txt, then txt_again).
txt is an object of class file. Objects of class file are defined with the method read(). So, in your code above:
txt = open(filename)
Calls the open() function and assigns an object of class file into txt.
Afterwards, the code:
txt.read()
calls the method read() that is associated with the object txt.
Objects
In this scenario, it's important to understand that objects are defined not only as data entities, but also with built-in actions against those entities.
e.g. A hypothetical object of class car might be defined with methods like start_engine(), stop_engine(), open_doors(), etc.
So as a parallel to your file example above, code for creating and using a car might be:
my_car = create_car(type_of_car)
my_car.start_engine()
(Wikipedia entry on OOP.)
To answer this you should have some understanding of object oriented programming.
open() is a normal function, and the first parameter is a string, with the path to the file. The return value of this function is an object.
Further work is done using this object. An object also has functions, but they are called methods. These methods are called in the context of the object, and the dot connects the object with the method. So txt.read() means that you are calling the read method of the txt object.
But if you really want to understand this, you should have a look at OOP.
You're coming up against methods vs functions.
open is a global function, and it takes as its parameters simply the things that go between the brackets.
read is a method of file objects. The expression txt.read() calls the read method of the txt object. Under the hood, the txt object is passed as the first parameter of its read method. The read method will be defined something like this:
class File(object):
    def read(self):
        # do whatever here
        # self is whatever object appears to the left of the dot in foo.read
        pass
It follows from the above definition that you can only use a method like read on an object which has a read method defined for it.
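To see the "passed as the first parameter" part in action, here is a toy version of such a class. It is purely illustrative: this File class holds a string in memory and is not the real file type.

```python
class File(object):
    def __init__(self, text):
        self.text = text      # toy stand-in for the file's contents

    def read(self):
        # self is the object that appeared to the left of the dot
        return self.text


txt = File("hello")

via_dot = txt.read()          # the usual spelling
via_class = File.read(txt)    # the same call with self passed explicitly

print(via_dot == via_class)
```

Writing read(txt) on its own fails with a NameError, because no global function called read exists; the method lives on the class.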
