Name not defined error python when reading file line by line - python

So, I am very new to python and I am not sure if my code is the most effective, but would still be very appreciative if someone could explain to me why my script is returning the "name not defined" error when I run it. I have a list of 300 gene names in a separate file, one name per line, that I want to read, and store each line as a string variable.
Within the script I have a list of 600 variables. 300 variables labeled name_bitscore and 300 labeled name_length for each of the 300 names.
I want to filter through the list based on a condition. My script looks like this:
#!/usr/bin/python
with open("seqnames-test1-iso-legal-temp.txt") as f:
    for line in f:
        exec("b="+line+"_bitscore")
        exec("l="+line+"_length")
        if 0.5*b <= 2*1.05*l and 0.5*b >= 2*0.95*l:
            print line
ham_pb_length=2973
ham_pb_bitscore=2165
g2225_ph_length=3303
cg2225_ph_bitscore=2278
etc. for the length and bitscore variables.
Essentially, what I am trying to do here is read line 1 of the file "seqnames-test1-iso-legal-temp.txt", which is ham_pb. Then I wanted to use the exec function to create the variables b=ham_pb_bitscore and l=ham_pb_length, so that I could test whether half the value of the gene's bitscore is within the range of double its length, with a 5% margin of error. Then I repeat this for every gene, i.e. every line of the file "seqnames-test1-iso-legal-temp.txt".
When I execute the script, I get the error message:
Traceback (most recent call last):
File "duplicatebittest.py", line 4, in <module>
exec("b="+line+"_bitscore")
File "<string>", line 1, in <module>
NameError: name 'ham_pb' is not defined
I made another short script to make sure I was using the exec function correctly that looks like this:
#!/usr/bin/python
name="string"
string_value=4
exec("b="+name+"_value")
print(name)
print(b)
And this returns:
string
4
So, I know that I can use exec to include a string variable in a variable declaration because b returns 4 as expected. So, I am not sure why I get an error in my first script.
I tested to make sure the variable line was a string by entering
#!/usr/bin/python
with open("seqnames-test1-iso-legal-temp.txt") as f:
    for line in f:
        print type(line)
And it returned the line
<type 'str'>
300 times, so I know each variable line is a string, which is why I don't understand why my test script worked, but this one did not.
Any help would be super appreciated!

line is yielded by the text file iterator, which keeps the trailing newline on each line it reads.
So your expression:
exec("b="+line+"_bitscore")
is passed to exec as:
b=ham_pb
_bitscore
Strip the trailing newline from the line and it will work:
exec("b="+line.rstrip()+"_bitscore")
provided that you move the following lines before the loop so variables are declared:
ham_pb_length=2973
ham_pb_bitscore=2165
g2225_ph_length=3303
cg2225_ph_bitscore=2278
Better: quit using exec and use dictionaries to avoid defining variables dynamically.
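A minimal sketch of both points, written for Python 3. It uses io.StringIO as a stand-in for the real names file, and the scores dictionary (its name and layout) is an assumption for illustration; the values are the ones given in the question.

```python
import io

# Stand-in for the names file; every line read from it keeps its trailing "\n".
names_file = io.StringIO("ham_pb\n")

# Assumed layout: one dictionary whose keys are the names the question used as variables.
scores = {"ham_pb_length": 2973, "ham_pb_bitscore": 2165}

for line in names_file:
    raw = line             # 'ham_pb\n', which is why exec saw two lines of code
    name = line.rstrip()   # 'ham_pb'
    bitscore = scores[name + "_bitscore"]
    length = scores[name + "_length"]
    print(repr(raw), name, bitscore, length)
```

Printing repr(line) rather than line makes the hidden newline visible, which is a quick way to spot this kind of problem.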

Put #!/usr/bin/env python as the first line. See this question for more explanation.
As Jean pointed out, exec is not the right tool for this job. You should be using dictionaries: they are less dangerous (search for "code injection") and easier to read. Here's an example of how to use dictionaries, taken from the Python documentation:
>>> tel = {'jack': 4098, 'sape': 4139}
>>> tel['guido'] = 4127
>>> tel
{'sape': 4139, 'guido': 4127, 'jack': 4098}
>>> tel['jack']
4098
>>> del tel['sape']
>>> tel['irv'] = 4127
>>> tel
{'guido': 4127, 'irv': 4127, 'jack': 4098}
>>> list(tel.keys())
['irv', 'guido', 'jack']
>>> sorted(tel.keys())
['guido', 'irv', 'jack']
>>> 'guido' in tel
True
>>> 'jack' not in tel
False
Here's a way I can think of to accomplish your goal:
with open("seqnames-test1-iso-legal-temp.txt") as f:
    gene_data = {'ham_pb_length':2973, 'ham_pb_bitscore':2165,
                 'g2225_ph_length':3303, 'cg2225_ph_bitscore':2278}
    '''maybe you have more of these gene data things. If so,
    just append them to the end of the above dictionary literal'''
    for line in f:
        if not line.isspace():
            bitscore = gene_data[line.rstrip()+'_bitscore']
            length = gene_data[line.rstrip()+'_length']
            # dividing by 4.0 keeps true division under Python 2 as well
            if (0.95*length <= bitscore/4.0 <= 1.05*length):
                print line
I take advantage of a few useful Python features here. In Python 3, 5/7 evaluates to 0.7142857142857143, not 0 as in Python 2 and many other programming languages. If you want integer division in Python 3, use 5//7; if you are on Python 2, divide by a float (or use from __future__ import division) to get the fractional result. Additionally, in Python 1<2<3 evaluates to True and 1<3<2 evaluates to False, whereas in many programming languages 1<2<3 is evaluated as True<3, which might give an error or evaluate to True depending on the language.
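As a hedged variation on the same idea for Python 3, the data can be keyed by gene name directly, so no string concatenation is needed to build keys. The names gene_data, within_margin and matching_genes are hypothetical, demo_gene is a made-up entry added only so that one gene passes the test, and io.StringIO stands in for the real file.

```python
import io

# Assumed layout: gene name -> (length, bitscore).
# ham_pb comes from the question; demo_gene is invented for illustration.
gene_data = {
    "ham_pb": (2973, 2165),
    "demo_gene": (100, 400),
}

def within_margin(length, bitscore, margin=0.05):
    """True if bitscore/4 lies within +/- margin of length."""
    return (1 - margin) * length <= bitscore / 4 <= (1 + margin) * length

def matching_genes(lines, data):
    """Return the names from lines whose scores satisfy within_margin."""
    matches = []
    for line in lines:
        name = line.strip()
        if not name:          # skip blank lines
            continue
        length, bitscore = data[name]
        if within_margin(length, bitscore):
            matches.append(name)
    return matches

names_file = io.StringIO("ham_pb\n\ndemo_gene\n")
result = matching_genes(names_file, gene_data)
print(result)
```

The chained comparison in within_margin is the same test as the question's two-part condition, since 0.5*b <= 2*1.05*l is equivalent to b/4 <= 1.05*l.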

Related

How to deal with long list of return values and not face IndentationError

I have a function in Python that returns 5 items (I didn't develop the function so I cannot change it). If I want to give meaningful names to the return values, it will be a long line and it will exceed the 80 characters per line recommendation. So I wrote it like this:
encoded_en,
    forward_h_en, forward_c_en,
    backward_h_en, backward_c_en = encoder(embedding)
But then, I faced the indentation error:
File "<ipython-input-28-5947292b462a>", line 20
forward_h_en, forward_c_en
^
IndentationError: unexpected indent
What is the proper way to deal with such cases?
Use parentheses:
(encoded_en,
 forward_h_en, forward_c_en,
 backward_h_en, backward_c_en) = range(5)
print(encoded_en) # 0

ValueError in Python 3 code

I have this code that will allow me to count the number of missing rows of numbers within the csv for a script in Python 3.6. However, these are the following errors in the program:
Error:
Traceback (most recent call last):
File "C:\Users\GapReport.py", line 14, in <module>
EndDoc_Padded, EndDoc_Padded = (int(s.strip()[2:]) for s in line)
File "C:\Users\GapReport.py", line 14, in <genexpr>
EndDoc_Padded, EndDoc_Padded = (int(s.strip()[2:]) for s in line)
ValueError: invalid literal for int() with base 10: 'AC-SEC 000000001'
Code:
import csv
def out(*args):
    print('{},{}'.format(*(str(i).rjust(4, "0") for i in args)))
prev = 0
data = csv.reader(open('Padded Numbers_export.csv'))
print(*next(data), sep=', ') # header
for line in data:
    EndDoc_Padded, EndDoc_Padded = (int(s.strip()[2:]) for s in line)
    if start != prev+1:
        out(prev+1, start-1)
    prev = end
    out(start, end)
I'm stumped on how to fix these issues. Also, I think the CSV has many lines in it, so if there's a section that limits it to a few numbers, please feel free to let me know.
CSV Snippet (Sorry if I wasn't clear before!):
The values you have in your CSV file are not numeric.
For example, FMAC-SEC 000000001 is not a number. s.strip()[2:] only drops the first two characters, leaving 'AC-SEC 000000001', which int() is not able to convert.
Some more comments on the code:
What is the utility of doing EndDoc_Padded, EndDoc_Padded = (...)? Currently you are assigning values to two variables with the same name. Either name one of them something else, or just have one variable there.
Are you trying to get the two different values from each column? csv.reader already splits each row on commas, so line is a list of column strings and needs no further split(). If your file uses a different separator, pass it to csv.reader through the delimiter argument.
You are running this inside a loop, so each time the values of the two variables would get updated to the values from the last line. If you're trying to obtain 2 lists of all the values, then this won't work.
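A minimal sketch of one way to handle such values, assuming each row has two cells that both end in a run of digits. The sample rows and the column names BegDoc and EndDoc are hypothetical, because the real CSV was not shown, and trailing_number is a helper invented for this example.

```python
import csv
import io
import re

# Hypothetical file contents; the real CSV was not shown in the question.
sample = io.StringIO(
    "BegDoc,EndDoc\n"
    "FMAC-SEC 000000001,FMAC-SEC 000000003\n"
    "FMAC-SEC 000000006,FMAC-SEC 000000007\n"
)

def trailing_number(text):
    """Return the run of digits at the end of text as an int."""
    match = re.search(r"(\d+)\s*$", text)
    if match is None:
        raise ValueError("no trailing number in %r" % text)
    return int(match.group(1))

reader = csv.reader(sample)
header = next(reader)          # skip the header row
gaps = []
prev = 0
for row in reader:
    # Two distinct names, one per column.
    start, end = (trailing_number(cell) for cell in row)
    if start != prev + 1:
        gaps.append((prev + 1, start - 1))   # numbers missing between rows
    prev = end
print(gaps)
```

Collecting the gaps in a list, rather than overwriting two variables on every pass, keeps the results from all rows available after the loop.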

Getting error List object is not callable on this Caesar cipher

plaintext =[
    "this is a test",
    "caesar’s wife must be above suspicion",
    "as shatner would say: you, should, also, be, able, to, handle, punctuation.",
    "to mimic chris walken: 3, 2, 1, why must you, pause, in strange places?",]
shift = 3
def caesar(plaintext):
    alphabet=["a","b","c","d","e","f","g","h","i","j","k","l",
              "m","n","o","p","q","r","s","t","u","v","w","x","y","z"]
    dic={}
    for i in range(0,len(alphabet)):
        dic[alphabet[i]]=alphabet[(i+shift)%len(alphabet)]
    ciphertext=""
    for l in plaintext():
        if l in dic:
            l=dic[l]
        ciphertext+=l
    return ciphertext
print [caesar(plaintext)]
I'm not sure why it is giving me that error. I need some assistance. I tried putting brackets around it and replacing the parentheses, but it is still giving that error.
Traceback (most recent call last):
File "C:/Users/iii/Desktop/y.py", line 33, in <module>
print (caesar(plaintext))
File "C:/Users/iii/Desktop/y.py", line 24, in caesar
for l in plaintext():
TypeError: 'list' object is not callable
First of all, print(caesar(plaintext)) is how to print in Python 3.x.
The TypeError comes from plaintext(): the parentheses try to call the list as if it were a function. You can iterate over a list or a string directly: for l in plaintext:
Note also that plaintext is a list of strings, so looping over it yields whole sentences, not characters. Call caesar on one sentence at a time so that the inner loop sees individual letters. ciphertext+=l is fine as it is, because += on a string concatenates.
This modified code runs without error and shifts each lowercase letter by 3, leaving other characters unchanged:
plaintext =[
    "this is a test",
    "caesar’s wife must be above suspicion",
    "as shatner would say: you, should, also, be, able, to, handle, punctuation.",
    "to mimic chris walken: 3, 2, 1, why must you, pause, in strange places?",]
shift = 3
def caesar(text):
    alphabet=["a","b","c","d","e","f","g","h","i","j","k","l",
              "m","n","o","p","q","r","s","t","u","v","w","x","y","z"]
    dic={}
    for i in range(0,len(alphabet)):
        dic[alphabet[i]]=alphabet[(i+shift)%len(alphabet)]
    ciphertext=""
    for l in text:
        if l in dic:
            l=dic[l]
        ciphertext+=l
    return ciphertext
for sentence in plaintext:
    print(caesar(sentence))
EDIT
If you want to make a Caesar cipher, this web site does a great job of explaining how to make it in Python.
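For comparison, a minimal Python 3 sketch of a Caesar cipher that relies on str.maketrans and str.translate instead of a hand-built dictionary. It assumes only lowercase ASCII letters should be shifted, and the shortened plaintext list is for illustration.

```python
import string

def caesar(text, shift=3):
    """Shift lowercase ASCII letters by shift places; leave everything else alone."""
    letters = string.ascii_lowercase
    offset = shift % len(letters)
    rotated = letters[offset:] + letters[:offset]
    return text.translate(str.maketrans(letters, rotated))

plaintext = ["this is a test", "you, should, also, be, able"]
# Encrypt one string at a time; iterating a string yields its characters.
ciphertext = [caesar(sentence) for sentence in plaintext]
print(ciphertext)
```

Calling caesar with a shift of 26 minus the original shift reverses the operation under the same assumption.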

Substring in Python, what is wrong here?

I'm trying to simulate a substring in Python but I'm getting an error:
length_message = len(update)
if length_message > 140:
    length_url = len(short['url'])
    count_message = 140 - length_url
    update = update["msg"][0:count_message] # Substring update variable
    print update
return 0
The error is the following:
Traceback (most recent call last):
File "C:\Users\anlopes\workspace\redes_sociais\src\twitterC.py", line 54, in <module>
x.updateTwitterStatus({"url": "http://xxx.com/?cat=49s", "msg": "Searching for some ....... tips?fffffffffffffffffffffffffffffdddddddddddddddddddddddddddddssssssssssssssssssssssssssssssssssssssssssssssssssseeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeedddddddddddddddddddddddddddddddddddddddddddddddfffffffffffffffffffffffffffffffffffffffffffff "})
File "C:\Users\anlopes\workspace\redes_sociais\src\twitterC.py", line 35, in updateTwitterStatus
update = update["msg"][0:count_message]
TypeError: string indices must be integers
Can't I do this?
update = update["msg"][0:count_message]
The variable count_message holds 120.
Give me a clue.
Best Regards,
UPDATE
I make this call, update["msg"] comes from here
x = TwitterC()
x.updateTwitterStatus({"url": "http://xxxx.com/?cat=49", "msg": "Searching for some ...... ....?fffffffffffffffffffffffffffffdddddddddddddddddddddddddddddssssssssssssssssssssssssssssssssssssssssssssssssssseeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeedddddddddddddddddddddddddddddddddddddddddddddddfffffffffffffffffffffffffffffffffffffffffffffddddddddddddddddd"})
Are you looping through this code more than once?
If so, perhaps the first time through update is a dict, and update["msg"] returns a string. Fine.
But you set update equal to the result:
update = update["msg"][0:int(count_message)]
which is (presumably) a string.
If you are looping, the next time through the loop you will have an error because now update is a string, not a dict (and therefore update["msg"] no longer makes sense).
You can debug this by putting in a print statement before the error:
print(type(update))
or, if it is not too large,
print(repr(update))
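One way to avoid the problem altogether is to store the truncated text under a new name, so that update remains a dict however often the code runs. This is a sketch under assumptions: shorten_message is a hypothetical helper, it measures the length of update["msg"] rather than of update itself, and the 140-character limit is taken from the question.

```python
def shorten_message(update, limit=140):
    """Return update["msg"] cut so that message plus URL fit within limit characters."""
    message = update["msg"]
    if len(message) > limit:
        room = limit - len(update["url"])
        message = message[:room]   # new name, so update stays a dict
    return message

update = {"url": "http://xxx.com/?cat=49s", "msg": "x" * 200}
first = shorten_message(update)
second = shorten_message(update)   # safe to call again: update was not overwritten
print(len(first), type(update).__name__)
```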

Unable to have a command line parameter in Python

I run
import sys
print "x \tx^3\tx^3+x^3\t(x+1)^3\tcube+cube=cube+1"
for i in range(sys.argv[2]): # mistake here
    cube=i*i*i
    cube2=cube+cube
    cube3=(i+1)*(i+1)*(i+1)
    truth=(cube2==cube3)
    print i, "\t", cube, "\t", cube + cube, "\t", cube3, "\t", truth
I get
Traceback (most recent call last):
File "cube.py", line 5, in <module>
for i in range(sys.argv[2]):
IndexError: list index out of range
How can you use command line parameter as follows in the code?
Example of the use
python cube.py 100
It should give
x x^3 x^3+x^3 (x+1)^3 cube+cube=cube+1
0 0 0 1 False
1 1 2 8 False
2 8 16 27 False
--- cut ---
97 912673 1825346 941192 False
98 941192 1882384 970299 False
99 970299 1940598 1000000 False
Use:
sys.argv[1]
Also note that arguments are always strings, and range expects an integer.
So the correct code would be:
for i in range(int(sys.argv[1])):
You want int(sys.argv[1]) not 2.
Ideally you would check the length of sys.argv first and print a useful error message if the user doesn't provide the proper arguments.
Edit: See http://www.faqs.org/docs/diveintopython/kgp_commandline.html
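A minimal sketch of such a check, in Python 3. The function name parse_limit and the wording of the messages are assumptions, and the argument list is simulated so the example runs on its own.

```python
def parse_limit(argv):
    """Return the integer given as the first command line argument."""
    if len(argv) != 2:
        raise SystemExit("usage: python cube.py N")
    try:
        return int(argv[1])   # argv entries are strings, so convert explicitly
    except ValueError:
        raise SystemExit("N must be an integer, got %r" % argv[1])

# Simulated argument list; a real script would pass sys.argv instead.
limit = parse_limit(["cube.py", "100"])
print(limit)
```

Raising SystemExit with a string prints that message to standard error and exits with a non-zero status, which is usually what a command line script wants.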
Here are some tips on how you can often solve this type of problem yourself:
Read what the error message is telling you: "list index out of range".
What list? Two choices (1) the list returned by range (2) sys.argv
In this case, it can't be (1); it's impossible to get that error out of
for i in range(some_integer) ... but you may not know that, so in general, if there are multiple choices within a line for the source of an error, and you can't see which is the cause, split the line into two or more statements:
num_things = sys.argv[2]
for i in range(num_things):
and run the code again.
By now we know that sys.argv is the list. What index? Must be 2. How come that's out of range? Knowledge-based answer: Because Python counts list indexes from 0. Experiment-based answer: Insert this line before the failing line:
print list(enumerate(sys.argv))
So you need to change the [2] to [1]. Then you will get another error, because in range(n) the n must be an integer, not a string ... and you can work through this new problem in a similar fashion -- extra tip: look up range() in the docs.
I'd like to suggest having a look at Python's argparse module, which is a giant improvement in parsing commandline parameters - it can also do the conversion to int for you including type-checking and error-reporting / generation of help messages.
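A short argparse sketch in Python 3, assuming a single positional argument; the name count and the help texts are made up for this example, and the argument list is passed explicitly so the snippet runs without a real command line.

```python
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description="Print a table of cubes.")
    # type=int makes argparse convert the value and report non-integer input.
    parser.add_argument("count", type=int, help="number of rows to print")
    return parser

# Parsing an explicit list here; parse_args() with no arguments reads sys.argv[1:].
args = build_parser().parse_args(["100"])
for i in range(min(args.count, 3)):   # first three rows only, to keep the demo short
    print(i, i ** 3, 2 * i ** 3, (i + 1) ** 3)
```

With this parser, running the script without an argument, or with a non-integer one, prints a usage message and exits instead of raising IndexError or TypeError.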
It's sys.argv[1] instead of 2. You also want to make sure that you convert that to an integer if you're doing math with it.
so instead of
for i in range(sys.argv[2]):
you want
for i in range(int(sys.argv[1])):
