Try-Except in the For loop - python

I wrote the Try-Except code below inside a for loop to insert data from Python into a PostgreSQL database.
I have a list in a .csv file to load into the PostgreSQL database.
One of the columns is a primary key.
Hence, if a row in the list is duplicated, Python throws an error.
My expectation was that, for every row in the list (loop), the code would run "try" first and jump to the "except" line only if "try" raised an error.
But when I execute my code, once the program reaches the "except" line, it never goes back to the "try" line.
Example:
If all the data in my list is fresh, the program runs and inserts every row into the database correctly.
If the duplicated data is in the first line of the list, the program goes to the "except" line and the fresh data further down is not inserted into the database.
As long as there is duplicated data at the top, the program just keeps hitting the "except" line without going back to the "try" line.
Any idea how to improve my code here? Thank you.
My intention is that all fresh data is captured in the database, while duplicated rows are skipped and processing continues with the next line.
for data in df.iterrows():
    vreceivingdate = str(data[1][0])
    vreceivingtime = str(data[1][1])
    vscanresult = str(data[1][2])
    vinvoicenbr = str(vscanresult[0:6])
    vlotnbr = str(vscanresult[6:9])
    vcasenbr = str(vscanresult[9:12])
    try:
        rec.insertRec(vreceivingdate, vreceivingtime, vscanresult, vinvoicenbr, vlotnbr, vcasenbr)
    except:
        dupDataCounter += 1

I finally found the solution to my question.
I should not use Try-Except in this case.
What I want is done with the PostgreSQL clause:
"ON CONFLICT DO NOTHING".

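As a rough illustration of the ON CONFLICT DO NOTHING approach, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in so that it runs without a database server (SQLite 3.24 or newer accepts the same clause); the table and column names are hypothetical, and with a PostgreSQL driver the placeholder style would differ. One likely reason the original loop kept landing in "except" (an assumption, since insertRec is not shown) is that PostgreSQL aborts the current transaction after an error and rejects every later statement until a rollback.

```python
import sqlite3

# Hypothetical table; scanresult plays the role of the primary key column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE receiving (scanresult TEXT PRIMARY KEY, receivingdate TEXT)"
)

rows = [
    ("123456001001", "2020-01-01"),
    ("123456001001", "2020-01-02"),  # duplicate key, silently skipped
    ("123456001002", "2020-01-02"),
]

for scanresult, receivingdate in rows:
    # No try/except needed: a conflicting row is simply not inserted.
    conn.execute(
        "INSERT INTO receiving (scanresult, receivingdate) VALUES (?, ?) "
        "ON CONFLICT DO NOTHING",
        (scanresult, receivingdate),
    )
conn.commit()

stored = conn.execute("SELECT COUNT(*) FROM receiving").fetchone()[0]
skipped = len(rows) - stored  # plays the role of dupDataCounter
print(stored, skipped)
```

Because the database skips the duplicate itself, no error is raised and the remaining rows are still processed.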
Related

issue with str.split() when trying to run the same function - Column not being 'erased'

I have a simple function that works fine, but when I try to run it again I get an error. It's probably caused by str.split(), which I apply to one column of a dataframe (followed by dataframe.explode). When I run the same code again, the dataframe is loaded correctly except for this specific column, which still holds the values from the previous run. This is a simplified example of my code:
import pandas as pd

file_1 = pd.read_excel('example.xls')

def Validation_text():
    text = pd.DataFrame(file_1)
    text['Action_Type'] = text['Action_Type'].str.split(',')
    text = text.explode('Action_Type')
    print(text)
    ...
In the original file, Action_Type can hold one or more actions, and I create a new line for each. Example:
Action Type in file:
New
Move
Move,Text
Delete
Move,Text,Update
When I run the code for the first time, the printed output is correct and I can see that the 'Action_Type' column is correctly split:
[New]
[Move]
[Move, Text]
[Delete]
[Move, Text, Update]
But when I do that again, the 'Action_Type' column is empty, containing only NaN values, and so the program crashes (there are further steps in the program after this one). Everything else is fine; all other data is loaded into the dataframe again. I guess the split Action_Type values are stored somewhere in memory instead of being reloaded into the dataframe each time. I tried deleting the whole dataframe with del, and even clearing it with text = '', but nothing works.
Can anyone help?
Thank you

Error swallowed

Problem
I am connecting with JayDeBeApi to SQL Server 2017 and running a script like:
SELECT ... INTO #a-temp-table
DELETE FROM a-table
INSERT INTO a-table SELECT FROM #a-temp-table
DELETE #a-temp-table
During step 3 I get the following error:
Cannot insert duplicate key row in object 'dbo.a-table' with unique index 'UQ_a-table'. The duplicate key value is (11, 0001, 3751191, T70206CAT, 0000).
Instead of ~360k records, only ~180k get inserted. So step 3 aborts.
The temp table however gets deleted. So step 4 completes.
I am able to fix the error. But with JayDeBeApi, I am not seeing the error.
It seems like everything went fine from the Python point of view.
My goal is to capture those errors to handle them appropriately.
Any idea how to achieve that?
What I've tried
My Python code looks like this:
try:
    localCursor = dbConnection.cursor()
    x = localCursor.execute(query)
    logInfo("Run script %s... done" % (scriptNameAndPath), "run script", diagnosticLog)
except Exception as e:
    logError("Error running sql statement " + scriptNameAndPath + ". Skipping rest of row.",
             "run script", e, diagnosticLog)
    myrow = skipRowAndLogRecord(startRowTime, cursor, recordLog)
    continue
x = localCursor.execute(query) completes successfully, so no exception is thrown. x is None, and when inspecting localCursor I see no sign of any error message or code.
Step 3 should be all-or-none so the a-table should be empty following the duplicate key error unless your actual code has a WHERE clause.
Regarding the undetected exception, add SET NOCOUNT ON as the first statement in the script. That will suppress DONE_IN_PROC messages that will interfere with script execution unless your code handles multiple result sets.
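A minimal sketch of the SET NOCOUNT ON suggestion on the Python side: the helper name with_nocount and the sample script text are hypothetical, and the result would be passed to localCursor.execute(query) exactly as in the question.

```python
def with_nocount(script):
    """Return the T-SQL script with SET NOCOUNT ON as its first statement."""
    # Leave the script alone if it already starts with the statement.
    if script.lstrip().upper().startswith("SET NOCOUNT ON"):
        return script
    return "SET NOCOUNT ON;\n" + script


# Hypothetical script text standing in for the file read from scriptNameAndPath.
query = with_nocount(
    "DELETE FROM a_table;\n"
    "INSERT INTO a_table SELECT * FROM #a_temp_table;"
)
print(query)
```

Whether the duplicate key error then surfaces as a Python exception depends on the driver, so this is worth verifying against the failing script.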
https://learn.microsoft.com/en-us/sql/t-sql/language-elements/try-catch-transact-sql?view=sql-server-2017
-- Create procedure to retrieve error information.
CREATE PROCEDURE usp_GetErrorInfo
AS
SELECT
ERROR_NUMBER() AS ErrorNumber
,ERROR_SEVERITY() AS ErrorSeverity
,ERROR_STATE() AS ErrorState
,ERROR_PROCEDURE() AS ErrorProcedure
,ERROR_LINE() AS ErrorLine
,ERROR_MESSAGE() AS ErrorMessage;
GO
BEGIN TRY
-- Generate divide-by-zero error.
SELECT 1/0;
END TRY
BEGIN CATCH
-- Execute error retrieval routine.
EXECUTE usp_GetErrorInfo;
END CATCH;

Loop through values of one variable to populate another variable - SPSS

I currently have the below syntax -
BEGIN PROGRAM.
import spss,spssdata
varlist = [element[0] for element in spssdata.spssdata('CARD_2_Q2_1_a').fetchall()]
varstring = " ".join(str(int(i)) for i in varlist)
spss.submit("if (Q4_2 = 2 AND CARD_2_Q2_1_a = %(varstring)s) Q4_2_FULL = %(varstring)s." %locals())
END PROGRAM.
I thought this would just loop through the values in my variable CARD_2_Q2_1_a and populate Q4_2_FULL where appropriate. It worked when written out longhand without Python, but the code above doesn't change the input file at all. Is there any reason why this might not be working, or an alternative way of doing this?
varstring will be a string of integers joined by blanks. Therefore, your test condition in the IF will never be satisfied. Hence Q4_2_FULL will never be populated. You can print out the command you are submitting to see this.
I'm not sure exactly what your desired result is, but remember that the IF command you are submitting will execute over the entire dataset.
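To see the problem without running SPSS, the command string can be built and printed on its own. This sketch replaces the fetchall() call with a hypothetical list of three case values; everything else follows the code in the question.

```python
# Hypothetical values standing in for the rows returned by fetchall().
fetched = [(1.0,), (2.0,), (3.0,)]

varlist = [element[0] for element in fetched]
varstring = " ".join(str(int(i)) for i in varlist)

# Same template as in the question, filled from an explicit dict
# instead of locals().
command = (
    "if (Q4_2 = 2 AND CARD_2_Q2_1_a = %(varstring)s) Q4_2_FULL = %(varstring)s."
    % {"varstring": varstring}
)
print(command)
# prints: if (Q4_2 = 2 AND CARD_2_Q2_1_a = 1 2 3) Q4_2_FULL = 1 2 3.
```

The single submitted command compares the variable against the whole list "1 2 3" rather than against one value per case.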

Populate wxChoice at Start of Program Run | Python

As soon as my program is run, I want my wxChoice to be populated with items from a list I designate. I am using wxFormBuilder to handle the GUI elements of my program.
My code:
def onDropDownSelection(self, parent):
    #Open designated file
    lines = tuple(open("/Users/it/Desktop/Classbook/masterClassList.txt", 'r'))
    #Strips the first line of the file, splits the elements, assigns to "one"
    lines[1].rstrip()
    one = lines[1].split("|")
    #My attempt to populate the wxChoice with my list "one"
    self.firstChoice.SetItems(one)
This event is activated when the user clicks on the drop-down (wxChoice) menu, and re-populates every time it is clicked on.
Is there a way I can populate my wxChoice, only once, upon the initial opening/running of the program?
I have placed this code where the wxChoice is being created. However, I am now getting an "unindent does not match any outer indentation level" error on line 44. How do I fix this?
Check your indentation. Sometimes copying and pasting can mess things up.
Just retype the line or replace it with another statement. See here:
IndentationError: unindent does not match any outer indentation level
The problem arises if you indent with tabs and then copy-paste code from an example page that is indented with spaces. Then you have mixed indentation. I've run into this many times.
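One way to find the offending lines is to classify the leading whitespace of each line. The sketch below is only an illustration (the function name and the sample source are made up); the standard library's tabnanny module, run as python -m tabnanny yourfile.py, performs a related check.

```python
def indentation_kinds(source):
    """Map 1-based line numbers to 'tabs', 'spaces' or 'mixed' for indented lines."""
    kinds = {}
    for lineno, line in enumerate(source.splitlines(), start=1):
        # Leading run of spaces and tabs on this line.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if not indent or not line.strip():
            continue  # skip unindented and blank lines
        if "\t" in indent and " " in indent:
            kinds[lineno] = "mixed"
        elif "\t" in indent:
            kinds[lineno] = "tabs"
        else:
            kinds[lineno] = "spaces"
    return kinds


# Hypothetical snippet: line 2 is indented with a tab, line 3 with spaces.
sample = "def f():\n\tx = 1\n    return x\n"
report = indentation_kinds(sample)
uses_both = len(set(report.values())) > 1
print(report, uses_both)
```

If uses_both is true, converting every indent to four spaces in the editor usually clears the error.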

Help with Python loop weirdness?

I'm learning Python as my second programming language (my first real one if you don't count HTML/CSS/Javascript). I'm trying to build something useful as my first real application - an IRC bot that alerts people via SMS when certain things happen in the channel. Per a request by someone, I'm (trying) to build in scheduling preferences where people can choose not to get alerts from between hours X and Y of the day.
Anyways, here's the code I'm having trouble with:
import time

db = open("db.csv")
for line in db:
    row = line.split(",") # storing stuff in a CSV, reading out of it
    recipient = row[0] # who the SMS is going to
    s = row[1] # gets the first hour of the "no alert" time range
    f = row[2] # gets last hour of above
    nrt = [] # empty array that will store hours
    curtime = time.strftime("%H") # current hour
    if s == "no":
        print "They always want alerts, sending email" # start time will = "no" if they always want alerts
        # send mail code goes here
    else:
        for hour in range(int(s), int(f)): #takes start, end hours, loops through to get hours in between, stores them in the above list
            nrt.append(hour)
        if curtime in nrt: # best way I could find of doing this, probably a better way, like I said I'm new
            print "They don't want an alert during the current hour, not sending" # <== what it says
        else:
            # they do want an alert during the current hour, send an email
            # send mail code here
            pass
The only problem I'm having is somehow the script only ends up looping through one of the lines (or something like that) because I only get one result every time, even if I have more than one entry in the CSV file.
If this is a regular CSV file you should not try to parse it yourself. Use the standard library csv module.
Here is a short example from the docs:
import csv
reader = csv.reader(open("some.csv", "rb"))
for row in reader:
    print row
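The docs example above is Python 2. As a sketch of the same idea in Python 3, applied to the question's three-column layout (the sample contents are invented, and io.StringIO stands in for the real file so the snippet runs on its own):

```python
import csv
import io

# Hypothetical contents of db.csv: recipient, start hour, end hour.
sample = "alice,22,8\nbob,no,\n"

rows = []
for recipient, start, end in csv.reader(io.StringIO(sample)):
    # csv.reader already strips the line ending and handles quoting.
    rows.append((recipient, start, end))

print(rows)
```

For a real file, open("db.csv", newline="") would replace the StringIO object.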
There are at least two bugs in your program:
curtime = time.strftime("%H")
...
for hour in range(int(s), int(f)):
    nrt.append(hour)
    # this is an inefficient synonym for
    # nrt = range(int(s), int(f))
if curtime in nrt:
    ...
First, curtime is a string, whereas nrt is a list of integers. Python is strongly typed, so the two are not interchangeable, and won't compare equal:
'4' == 4 # False
'4' in [3, 4, 5] # False
This revised code addresses that issue, and is also more efficient than generating a list and searching for the current hour in it:
cur_hour = time.localtime().tm_hour
if int(s) <= cur_hour < int(f):
    # You can "chain" comparison operators in Python
    # so that a op1 b op2 c is equivalent to a op1 b and b op2 c
    ...
A second issue that the above does not address is that your program will not behave properly if the hours wrap around midnight (e.g. s = 22 and f = 8).
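One possible way to handle the wrap-around case, sketched in Python 3 with a hypothetical helper name: when the start hour is greater than the end hour, the quiet period is treated as spanning midnight.

```python
def in_quiet_hours(cur_hour, start, end):
    """True if cur_hour is in [start, end), where the range may wrap past midnight."""
    if start <= end:
        # Ordinary range within one day, e.g. 3 to 5.
        return start <= cur_hour < end
    # Wrapping range, e.g. 22 to 8: late evening or early morning.
    return cur_hour >= start or cur_hour < end


print(in_quiet_hours(23, 22, 8))  # True
print(in_quiet_hours(12, 22, 8))  # False
```

This sketch treats equal start and end hours as an empty range; whether that should instead mean "all day" is a design decision for the bot.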
Neither of these problems is necessarily related to "the script only ends up looping through one of the lines", but you haven't given us enough information to figure out why that might be. A more useful way to ask questions is to post a brief but complete code snippet that shows the behavior you are observing, along with sample input and the resulting error messages, if any (including the traceback).
Have you tried something simpler, just to see how your file is actually read by Python?
db = open("db.csv")
for line in db:
    print line
There may be a problem with the format of your CSV file. That happens, for instance, when you open a Unix file in a Windows environment. In that case the whole file can look like a single string, because Windows and Unix use different line separators. I don't know the exact cause of your problem, but I suggest looking in that direction.
Update:
You have multiple paths through the body of your loop:
when s is "no": "They always want alerts, sending email" will be printed.
when s is not "no" and curtime in nrt: "They don't want an alert during the current hour, not sending" will be printed.
when s is not "no" and curtime in nrt is false (the last else): nothing will be printed and no other action undertaken.
Shouldn't you place a print statement in the last else branch?
Also, what is the exact output of your snippet? Is it "They always want alerts, sending email"?
I would check the logic in your conditionals. Your looping construct should work.
You could read through an existing, well-written IRC bot in Python.
Be explicit about what's in a row. Using 0, 1, 2...n is actually your bug, and it makes the code very hard to read in the future for yourself or others. So let's use the handy tuple to show what we're expecting from a row. This works as a kind of code-as-documentation:
db = open("db.csv")
for line in db.readlines():
    recipient, start_hour, end_hour = line.split(",")
    nrt = []
    etc...
This shows the reader of your code what you're expecting a line to contain, and it would have shown your bug to you the first time you ran it :)
