Help with Python loop weirdness? - python

I'm learning Python as my second programming language (my first real one if you don't count HTML/CSS/JavaScript). I'm trying to build something useful as my first real application - an IRC bot that alerts people via SMS when certain things happen in the channel. Per a request by someone, I'm (trying) to build in scheduling preferences where people can choose not to get alerts between hours X and Y of the day.
Anyways, here's the code I'm having trouble with:
db = open("db.csv")
for line in db:
    row = line.split(",") # storing stuff in a CSV, reading out of it
    recipient = row[0] # who the SMS is going to
    s = row[1] # gets the first hour of the "no alert" time range
    f = row[2] # gets last hour of above
    nrt = [] # empty array that will store hours
    curtime = time.strftime("%H") # current hour
    if s == "no":
        print "They always want alerts, sending email" # start time will = "no" if they always want alerts
        # send mail code goes here
    else:
        for hour in range(int(s), int(f)): #takes start, end hours, loops through to get hours in between, stores them in the above list
            nrt.append(hour)
        if curtime in nrt: # best way I could find of doing this, probably a better way, like I said I'm new
            print "They don't want an alert during the current hour, not sending" # <== what it says
        else:
            # they do want an alert during the current hour, send an email
            # send mail code here
The only problem I'm having is somehow the script only ends up looping through one of the lines (or something like that) because I only get one result every time, even if I have more than one entry in the CSV file.

If this is a regular CSV file you should not try to parse it yourself. Use the standard library csv module.
Here is a short example from the docs:
import csv
reader = csv.reader(open("some.csv", "rb"))
for row in reader:
    print row

There are at least two bugs in your program:
curtime = time.strftime("%H")
...
for hour in range(int(s), int(f)):
    nrt.append(hour)
# this is an inefficient synonym for
# nrt = range(int(s), int(f))
if curtime in nrt:
    ...
First, curtime is a string, whereas nrt is a list of integers. Python is strongly typed, so the two are not interchangeable, and won't compare equal:
'4' == 4 # False
'4' in [3, 4, 5] # False
This revised code addresses that issue, and is also more efficient than generating a list and searching for the current hour in it:
cur_hour = time.localtime().tm_hour
if int(s) <= cur_hour < int(f):
    # You can "chain" comparison operators in Python
    # so that a op1 b op2 c is equivalent to a op1 b and b op2 c
    ...
A second issue that the above does not address is that your program will not behave properly if the hours wrap around midnight (e.g. s = 22 and f = 8).
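One way to handle the wrap-around case is sketched below. It assumes the quiet range is half-open (the start hour is included, the end hour is not), like the range() call in the question. The function name in_quiet_hours is made up for illustration, and the code is written for Python 3:

```python
import time


def in_quiet_hours(hour, start, end):
    """Return True if hour lies in [start, end), allowing a wrap past midnight."""
    if start <= end:
        # Ordinary range within one day, e.g. 9 to 17
        return start <= hour < end
    # Wrapped range, e.g. 22 to 8: late evening OR early morning
    return hour >= start or hour < end


current_hour = time.localtime().tm_hour
print(in_quiet_hours(current_hour, 22, 8))
```

With s = 22 and f = 8, the hours 22, 23 and 0 through 7 count as quiet, while 8 through 21 do not.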
Neither of these problems is necessarily related to "the script only ends up looping through one of the lines", but you haven't given us enough information to figure out why that might be. A more useful way to ask questions is to post a brief but complete code snippet that shows the behavior you are observing, along with sample input and the resulting error messages, if any (along with traceback).

Have you tried something simpler, just to see how Python actually reads your file?
db = open("db.csv")
for line in db:
    print line
There may be a problem with the format of your CSV file. That happens, for instance, when you open a Unix file in a Windows environment: Windows and Unix use different line separators, so the whole file can look like a single string. I don't know the exact cause of your problem, but I suggest looking in that direction.
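A rough way to check which line separators a file actually contains is to read it as bytes and count them. This is only a sketch in Python 3; the helper name count_line_endings and the sample bytes are assumptions made for the example:

```python
def count_line_endings(data):
    """Count each line-ending style in a bytes object."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,                     # Windows style
        "LF": data.count(b"\n") - crlf,   # Unix style, not part of a CRLF pair
        "CR": data.count(b"\r") - crlf,   # bare carriage returns
    }


# Made-up sample mixing all three styles
sample = b"alice,22,8\r\nbob,no,no\ncarol,1,5\r"
print(count_line_endings(sample))
```

To inspect the real file, pass in the result of open("db.csv", "rb").read() instead of the sample bytes.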
Update:
You have multiple paths through the body of your loop:
when s is "no": "They always want alerts, sending email" will be printed.
when s is not "no" and curtime in nrt: "They don't want an alert during the current hour, not sending" will be printed.
when s is not "no" and curtime in nrt is false (the last else): nothing will be printed and no other action undertaken.
Shouldn't you place a print statement in the last else branch?
Also, what is the exact output of your snippet? Is it "They always want alerts, sending email"?

I would check the logic in your conditionals. Your looping construct should work.

You could go through an existing, well-written IRC bot in Python. Download

Be explicit with what's in a row. Using 0, 1, 2...n is actually your bug, and it makes code very hard to read in the future for yourself or others. So let's use the handy tuple to show what we're expecting from a row. This sort of works like code as documentation.
db = open("db.csv")
for line in db.readlines():
    recipient, start_hour, end_hour = line.split(",")
    nrt = []
    etc...
This shows the reader of your code what you're expecting a line to contain, and it would have shown your bug to you the first time you ran it :)
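A minimal sketch of how unpacking surfaces a malformed line, written for Python 3. The helper name parse_row and the two sample lines are assumptions, not taken from the question's actual db.csv:

```python
def parse_row(line):
    """Split one CSV line into exactly three named fields."""
    # rstrip removes the trailing newline so end_hour is clean
    recipient, start_hour, end_hour = line.rstrip("\r\n").split(",")
    return recipient, start_hour, end_hour


for sample in ["alice,22,8\n", "bob,no\n"]:
    try:
        print(parse_row(sample))
    except ValueError as error:
        # Raised when the line does not have exactly three fields
        print("bad line {!r}: {}".format(sample, error))
```

The second sample has only two fields, so the unpacking raises ValueError instead of silently carrying on with a short row.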

Related

How to dynamically choose a line from a csv file in python

For my final project in a Python class, I need to make a "make your own adventure" type game, or more of a game engine, using Python... The csv file is the story:
Should I add a database to my app?,Use MS Access,No way - databases suck,2,3
You are buried in relational diagrams,Get through them and get to the fun stuff,Let's add in a few more relationships,4,5
Shall we stick with some fun Javascript?,Yes,No,7,5
Onto the coding!,Which language?,Let's use several!,3,6
I miss programming Python - Game Over!,,,,
Brain burn out - Game Over!,,,,
Good choice - take a break - Game Over!,,,,
and I need to print the first cell from a line, use the following x cells as prompts, and then the same amount of x cells are the lines that that answer will bring me to. (For example, if I say Use MS Access on the first question, I jump to line 2, and if I say No way - databases suck, I jump to line 3)
My question is, how do I make a line of code that will read out the prompt, show the unspecified amount of options, and then take the answer from that option and jump to the corresponding line?
This is what I have right now:
print(story[0][0])
print("1 -", story[0][1])
print("2 -", story[0][2])
print("3 - Save game")
And that shows up like:
Should I add a database to my app?
1 - Use MS Access
2 - No way - databases suck
3 - Save game
Which is what I want, but it's not dynamic, which is a requirement, and I was never taught how to do that.
To gather input from the user, use the builtin input(prompt) function (NB: in Python 2.x use raw_input(prompt) instead). Beware: this function always returns a string, so you'll have to turn it into an integer.
Then you have to get the current row's value matching the user's input: in your example above, if the user types "1" ("Use MS Access") you want to retrieve the associated value "2", and if the user types "2" ("No way, ....") you want to retrieve the value "3".
Once you have the value matching the user's input (and have turned it into an int, if you didn't already do so when loading the CSV file into story), you just have to retrieve the matching row from story, i.e. if the user typed "2" ("No way, ...."), you want to read the story's third row (just beware, list indexes are zero-based, so the third line is actually at index 2).
Then put all this in a loop (starting with 0 as current story index) and you're done.
NB : It would actually have been almost faster to answer with a full code example - this is actually simpler to code than to explain xD - but you wouldn't learn anything then ;-)
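For readers who have made their own attempt and want something to compare it against, here is one possible sketch of the loop described above. It assumes every row is laid out as prompt, then n options, then n target line numbers (1-based), and that ending rows have no options. The names load_story, split_row and play are invented for this example, and input validation and the "Save game" option are left out:

```python
import csv
import io

STORY_CSV = """Should I add a database to my app?,Use MS Access,No way - databases suck,2,3
You are buried in relational diagrams,Get through them and get to the fun stuff,Let's add in a few more relationships,4,5
Shall we stick with some fun Javascript?,Yes,No,7,5
Onto the coding!,Which language?,Let's use several!,3,6
I miss programming Python - Game Over!,,,,
Brain burn out - Game Over!,,,,
Good choice - take a break - Game Over!,,,,
"""


def load_story(text):
    """Parse the CSV text into rows, dropping the empty padding cells."""
    return [[cell for cell in row if cell] for row in csv.reader(io.StringIO(text))]


def split_row(row):
    """Return (prompt, options, targets) for a row with any number of options."""
    count = (len(row) - 1) // 2
    options = row[1:1 + count]
    targets = [int(cell) for cell in row[1 + count:1 + 2 * count]]
    return row[0], options, targets


def play(story, ask=input):
    """Walk the story from the first row; return the row indexes visited."""
    index = 0
    visited = []
    while True:
        prompt, options, targets = split_row(story[index])
        visited.append(index)
        print(prompt)
        if not options:  # an ending row: nothing left to choose
            return visited
        for number, option in enumerate(options, start=1):
            print("{} - {}".format(number, option))
        choice = int(ask("> "))           # input() returns a string
        index = targets[choice - 1] - 1   # file lines are 1-based, lists 0-based


# Scripted answers stand in for a real player here
answers = iter(["1", "1", "2"])
path_taken = play(load_story(STORY_CSV), ask=lambda prompt: next(answers))
print(path_taken)
```

Calling play(load_story(STORY_CSV)) without the ask argument reads the answers from the keyboard instead.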

Why is a part of my Python code interpreted differently when I add a seemingly unrelated part?

Some background: I'm implementing a GUI to interact with equipment via GPIB. The issue arises in this method:
from tkinter import *
from tkinter import ttk
import visa #PyVisa Package. pyvisa.readthedocs.io
from time import sleep
import numpy as np #NumPy Package. Scipy.org
def oneDSweep():
    Voltage =[]
    Current =[]
    Source = []
    try:
        #Gate = parseGate(Gate1Input.get()) #Not implemented yet.
        Min = float(Gate1MinInput.get()) #Add a check for valid input
        #if Min < .001:
            #Throw exception
        Max = float(Gate1MaxInput.get()) #Add a check for valid input
        VoltageInterval = .02 #Prompt user for interval?
        rm = visa.ResourceManager()
        SIM900 = rm.open_resource("GPIB0::1::INSTR") #Add a check that session is open.
        x = 0
        Volt = Min
        while Volt <= Max:
            SIM900.write("SNDT 1, 'VOLT " + str(Volt) + "'") #Set voltage.
            SIM900.write("SNDT 7, 'VOLT? 1'") #Ask a port for voltage.
            Vnow = SIM900.query("GETN? 7, 50") #Retrieve data from previous port.
            Vnow = Vnow[6:15]
            Vnow = float(Vnow) ############Error location
            Voltage = np.append(Voltage, Vnow)
            SIM900.write("SNDT 1, 'VOLT?'") #Ask a different port for voltage.
            Snow = SIM900.query("GETN? 1, 50") #Retrieve data.
            print(Snow) #Debugging method. Probably not problematic.
            Snow = Snow[4:]
            Snow = float(Snow)
            sleep(1) #Add a delay for science reasons.
            #The code below helps the while loop act like a for loop.
            x = x+1
            Volt = Min + VoltageInterval*x
            Volt = float(truncate(Volt, 7))
    finally:
        print(Voltage)
        print(Source)
        Voltage.tofile("output.txt.",sep=",")
        SIM900.write("FLSH")#Flush the ports' memories to ensure no bad data stays there.
I get a simple ValueError at the marked location during the first pass of the while loop; Python says it cannot convert the string to a float (more on this later). However, simply remove these five lines of code:
SIM900.write("SNDT 1, 'VOLT?'")
Snow = SIM900.query("GETN? 1, 50")
print(Snow)
Snow = Snow[4:]
Snow = float(Snow)
and the program runs perfectly. I understand the source of the error. With those lines added, when I send these two lines to my instrument:
SIM900.write("SNDT 7, 'VOLT? 1'")
Vnow = SIM900.query("GETN? 7, 50")
I get essentially a null error. #3000 is returned, which is a blank message the machine sends when it is asked to output data and it has none to output. However, these same two lines produce something like #3006 00.003 when the five lines I mentioned are excluded from the program. In other words, simply adding those five lines to my program has changed the message sent to the instrument at the beginning of the while loop, despite adding them near the end.
I am convinced that Python's interpreter is at fault here. Earlier, I was cleaning up my code and discovered that one particular set of quotes, when changed from ' to ", produced this same error, despite no other quote pair exhibiting this behavior, even within the same line. My question is, why does the execution of my code change depending on unrelated alterations to the code (I would also appreciate a fix)? I understand this problem is difficult to replicate given my somewhat specific application, so if there is more information that would be helpful that I can provide, please let me know.
EDIT: Functionality has improved after moving from the command prompt to IDLE. I'm still baffled by what happened, but due to my meager command prompt skills, I can't provide any proof. Please close this question.
Python is telling you exactly what is wrong with your code -- a ValueError. It even gives you the exact line number and the value that is causing the problem.
'#3006 00.003'
That is the value of Snow that is being printed out. Then you do this
Snow = Snow[4:]
Now Snow is
'6 00.003'
You then try to call float() on this string. '6 00.003' can't be converted to a float because the space in the middle makes it an invalid number.
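A more defensive way to pull the number out of a reply might look like the sketch below. It assumes replies look like the two samples quoted in the question: a five-character header such as #3006 followed by the reading, or a bare #3000 when nothing is available. The helper name parse_reply and the fixed header length are assumptions, not part of PyVISA or of the instrument's documented protocol:

```python
def parse_reply(reply, header_length=5):
    """Return the reading as a float, or None if the reply carries no data."""
    # Skip the assumed fixed-width header, then drop surrounding whitespace
    payload = reply[header_length:].strip()
    if not payload:
        return None  # e.g. the blank '#3000' message
    return float(payload)


print(parse_reply("#3006 00.003"))
print(parse_reply("#3000"))
```

Returning None for an empty reply lets the calling loop retry or skip the sample instead of crashing inside float().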
I am convinced that Python's interpreter is at fault here. Earlier, I was cleaning up my code and discovered that one particular set of quotes, when changed from ' to ", produced this same error, despite no other quote pair exhibiting this behavior, even within the same line.
Python generates exactly the same bytecode for single and double quoted strings (unless embedded quotes are involved, of course). So either the environment you're running your script in is seriously broken (I'm counting the python interpreter as part of the "environment"), or your diagnosis is incorrect. I'd put my money on the second.
Here's an alternative explanation. For whatever reason, the hardware you hooked up is returning inconsistent results. So one time you get what you expect, the next time you get an error-- you think your changes to the code account for the differences, but there's no relationship between cause and effect and you end up pulling your hair out. When you run the same code several times in a row, do you get consistent results? I.e. do you consistently get the odd behavior? Even if you do, the problem must be with the hardware or the hookup, not with Python.

How to save a changed item to an external file? (Python 3)

I'm fairly new to python, but I'm making a script and I want one of the functions to update a variable from another file. It works, but when I exit the script and reload it, the changes aren't there anymore. For example (this isn't my script):
#File: changeFile.txt
number = 0
#File: changerFile.py
def changeNumber():
    number += 1
If I retrieve number during that session, it will return 1, but if I exit out and go back in again and retrieve number without calling changeNumber, it returns 0.
How can I get the script to actually save the number edited in changeNumber to changeFile.txt? As I said, I'm fairly new to python, but I've looked just about everywhere on the Internet and couldn't really find an answer that worked.
EDIT: Sorry, I forgot to include that in the actual script, there are other values.
So I want to change number and have it save without deleting the other 10 values stored in that file.
Assuming, as you show, that changeFile.txt has no other content whatever, then just change the function to:
def changeNumber():
    global number # will not possibly work w/o this, the way you posted!
    number += 1
    with open('changeFile.txt', 'w') as f:
        f.write('number = {}\n'.format(number))
ADDED: the OP edited the Q to mention (originally omitted!-) the crucial fact that changeFile.txt has other lines that need to be preserved as well as the one that needs to be changed.
That, of course, changes everything -- but, Python can cope!-)
Just add import fileinput at the start of this module, and change the last two lines of the above snippet (starting with with) to:
for line in fileinput.input(['changeFile.txt'], inplace=True):
    if line.startswith('number '):
        line = 'number = {}\n'.format(number)
    print line,
This is the Python 2 solution (the OP didn't bother to tell us if using Py2 or Py3, a crucial bit of info -- hey, who cares about making it easy rather than very hard for willing volunteers to help you, right?!-). If Python 3, change the last statement from print line, to
print(line, end='')
to get exactly the same desired effect.
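Put together for Python 3, the in-place rewrite could look roughly like this. It is a sketch, not the OP's actual script: the function name save_number and the extra lines in the demo file (name, level) are made up, and the demo writes to a temporary folder rather than to a real changeFile.txt:

```python
import fileinput
import os
import tempfile


def save_number(path, number):
    """Rewrite only the 'number = ...' line of path, keeping every other line."""
    with fileinput.input(files=[path], inplace=True) as lines:
        for line in lines:
            if line.startswith("number "):
                line = "number = {}\n".format(number)
            # With inplace=True, print() writes into the file, not the console
            print(line, end="")


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "changeFile.txt")
    with open(path, "w") as f:
        f.write("name = 'demo'\nnumber = 0\nlevel = 3\n")
    save_number(path, 1)
    with open(path) as f:
        result = f.read()

print(result)
```

Only the line starting with "number " is replaced; the other lines pass through unchanged.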

How to assign a single variable to a specific line in python?

I was not clear enough in my last question, and so I'll explain my question more this time.
I am creating 2 separate programs, where the first one will create a text file with 2 generated numbers, one on line 1 and the second on line 2.
Basically I saved it like this:
In this example I'm not generating numbers, just assigning them quickly.
a = 15
b = 16
saving = open('filename.txt', "w")
saving.write(a+"\n")
saving.write(b+"\n")
saving.close()
Then I opened it on the next one:
opening = open('filename.txt', "w")
a = opening.read()
opening.close()
print(a) #This will print the whole document, but I need each line to be different
Now I got the whole file loaded into 'a', but I need it split up, which is something that I have not got a clue how to do. I don't believe creating a list will help, as I need each number (variables a and b from program 1) to be different variables in program 2. The reason I need them as 2 separate variables is because I need to divide each by a different number. If I do need to use a list, please say. I tried finding an answer for about an hour in total, though I couldn't find anything.
The reason I can't post the whole program is because I haven't got access to it from here, and no, this is not cheating as we are free to research and ask questions outside the classroom, if someone wonders about that after looking at my previous question.
If you need more info please put it in a comment and I'll respond ASAP.
opening = open('filename.txt') # no "w" here: that mode truncates the file; the default mode opens it read-only
a = [b.strip() for b in opening.readlines()] # create a list of lines, stripping the newline "\n" character from each
opening.close()
print(a[0]) # print first line
print(a[1]) # print second line
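If the two values are needed as separate numeric variables, one possible round trip is sketched below. The function names save_numbers and load_numbers are invented for the example, the file is assumed to hold exactly two integer lines, and a temporary folder stands in for the real location of filename.txt:

```python
import os
import tempfile


def save_numbers(path, a, b):
    """Program 1: write each number on its own line."""
    with open(path, "w") as f:
        # format() turns the ints into text; a + "\n" would raise TypeError
        f.write("{}\n{}\n".format(a, b))


def load_numbers(path):
    """Program 2: read the two lines back as two separate ints."""
    with open(path) as f:
        first, second = (int(line) for line in f)  # int() ignores the newline
    return first, second


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "filename.txt")
    save_numbers(path, 15, 16)
    a, b = load_numbers(path)

print(a / 3, b / 4)  # each value can now be divided by a different number
```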

Best way to parse large files with regex in Python

I have to parse a large log file (2GB) using regex in Python. A regular expression matches the lines I am interested in. The log file can also contain unwanted data.
Here is a sample from the file:
"#DEBUG:: BFM [L4] 5.4401e+08ps MSG DIR:TX SCB_CB TYPE:DATA_REQ CPortID:'h8 SIZE:'d20 NumSeg:'h0001 Msg_Id:'h00000000"
My regular expression is ".DEBUG.*MSG."
First I split each matching line on whitespace, then I insert the "field:value" patterns into the sqlite3 database; but for large files it takes around 10 to 15 minutes to parse the file.
Please suggest the best way to do the above task in minimal time.
As others have said, profile your code to see why it is slow. The cProfile module in conjunction with the gprof2dot tool can produce nice readable information.
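As a starting point, a profiling run with cProfile and pstats might be sketched like this. The functions parse_line and parse_all are hypothetical stand-ins for the real parsing code, and profile_call is a helper invented for the example:

```python
import cProfile
import io
import pstats


def parse_line(line):
    # Stand-in for the real per-line work (hypothetical)
    return line.partition("MSG")[2].split()


def parse_all(lines):
    return [parse_line(line) for line in lines]


def profile_call(func, *args):
    """Run func(*args) under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args)
    finally:
        profiler.disable()
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    stats.sort_stats("cumulative").print_stats(10)  # top ten by cumulative time
    return result, report.getvalue()


lines = ["#DEBUG:: BFM MSG DIR:TX TYPE:DATA_REQ"] * 1000
parsed, report = profile_call(parse_all, lines)
print(report)
```

For a whole script, running python -m cProfile -o out.prof script.py saves the same kind of data to a file for later inspection.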
Without seeing your slow code, I can guess a few things that might help:
First, you can probably get away with using the builtin string methods instead of a regex; this might be marginally quicker. If you need to use regexes, it's worthwhile precompiling them outside the main loop using re.compile.
Second, do not run one insert query per line. Instead, do the insertions in batches, e.g. add the parsed info to a list, then when it reaches a certain size, perform one INSERT query with the executemany method.
Some incomplete code, as an example of the above:
import fileinput
parsed_info = []
for linenum, line in enumerate(fileinput.input()):
    if not line.startswith("#DEBUG"):
        continue # Skip line
    msg = line.partition("MSG")[2] # Get everything after MSG
    words = msg.split() # Split on words
    info = {}
    for w in words:
        k, _, v = w.partition(":") # Split each word on first :
        info[k] = v
    parsed_info.append(info)
    if linenum % 10000 == 0: # Or maybe if len(parsed_info) > 500:
        # Insert everything in parsed_info to database
        ...
        parsed_info = [] # Clear
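The batching idea can be filled in along the following lines. This is only a sketch under assumptions: the table name fields, its three columns, the batch size of 500 and the function names are all invented here, and an in-memory database stands in for the real one:

```python
import sqlite3

SAMPLE = ("#DEBUG:: BFM [L4] 5.4401e+08ps MSG DIR:TX SCB_CB TYPE:DATA_REQ "
          "CPortID:'h8 SIZE:'d20 NumSeg:'h0001 Msg_Id:'h00000000")


def parse_line(line):
    """Return the (field, value) pairs after 'MSG', or None to skip the line."""
    if not line.startswith("#DEBUG"):
        return None
    pairs = []
    for word in line.partition("MSG")[2].split():
        key, separator, value = word.partition(":")
        if separator:  # ignore words without a colon, such as SCB_CB
            pairs.append((key, value))
    return pairs


def load_log(lines, conn, batch_size=500):
    """Insert parsed fields in batches over one connection, one commit at the end."""
    conn.execute("CREATE TABLE IF NOT EXISTS fields (line INTEGER, name TEXT, value TEXT)")
    batch = []
    for linenum, line in enumerate(lines, start=1):
        pairs = parse_line(line)
        if pairs is None:
            continue
        batch.extend((linenum, key, value) for key, value in pairs)
        if len(batch) >= batch_size:
            conn.executemany("INSERT INTO fields VALUES (?, ?, ?)", batch)
            batch = []
    if batch:  # flush whatever is left after the last full batch
        conn.executemany("INSERT INTO fields VALUES (?, ?, ?)", batch)
    conn.commit()


conn = sqlite3.connect(":memory:")
load_log([SAMPLE, "unwanted data", SAMPLE], conn)
row_count = conn.execute("SELECT COUNT(*) FROM fields").fetchone()[0]
print(row_count)
```

Keeping a single connection open and committing once avoids paying the transaction overhead for every line.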
Paul's answer makes sense, you need to understand where you "lose" time first.
Easiest way if you don't have a profiler is to post a timestamp in milliseconds before and after each "step" of your algorithm (opening the file, reading it line by line (and inside, time taken for the split / regexp to recognise the debug lines), inserting it in the DB, etc...).
Without further knowledge of your code, there are possible "traps" that would be very time consuming :
- opening the log file several times
- opening the DB every time you need to insert data inside instead of opening one connection and then write as you go
"The best way to do the above task in minimal time" is to first figure out where the time is going. Look into how to profile your Python script to find what parts are slow. You may have an inefficient regex. Writing to sqlite may be the problem. But there are no magic bullets - in general, processing 2GB of text line by line, with a regex, in Python, is probably going to run in minutes, not seconds.
Here is a test script that will show how long it takes to read a file, line by line, and do nothing else:
from datetime import datetime
start = datetime.now()
for line in open("big_honkin_file.dat"):
    pass
end = datetime.now()
print(end - start)
