I built a relational data model in Oracle and am now creating a GUI using Python. I need to execute a SQL statement from my IDE, but I get a cx_Oracle.DatabaseError: ORA-00936: missing expression error message. The statement is a CTE that runs fine in TOAD, and when I replace the CTE with a simple SQL statement it executes fine.
I can build a view in my DB and do a select * from but I don't want to go that way.
I'm new to Python so I'm sure there is a better way to do this.
import cx_Oracle
con = cx_Oracle.connect('Example', 'Example', "Example")
cur = con.cursor()
statement = ("with r1 as (" +
" select " +
" r.PARENT_ITEM_id, " +
" D.SC_ID, " +
" --F.TONS" +
" SUM(ROUND(F.TONS*Pic_Distro*2000*s.stk_lvl_mult)) as Stocking_Lvl" +
" from PIC_DISTRO_TBL D" +
" Left Join Part_Velocity_TBL P on (P.item_ID = D.Item_ID and D.SC_ID = P.SC_ID)" +
" Left Join Forecast_TBL F on (D.Bucket_ID = F.Bucket_ID and D.SC_ID =F.SC_ID)" +
" left join Stock_lvl_tbl S on (S.Velocity_id = P.VELOCITY_ID)" +
" left join item_tbl I on (i.item_ID = D.ITEM_ID)" +
" left join parent_item_tbl R on (r.PARENT_ITEM_id = i.PARENT_ITEM_id)" +
" Where F.MTH = '4'" +
" and F.YEAR = '2017'" +
" and P.Velocity_id in ('A','B','C')" +
" and D.SC_ID in ('01','02')" +
" -- and SUM(ROUND(F.TONS*Pic_Distro*2000*s.stk_lvl_mult)) > 0" +
" Group by " +
" r.PARENT_ITEM_id, D.SC_ID " +
" Order by " +
" D.SC_ID DESC, Stocking_lvl DESC" +
")," +
"R2 as (" +
"select r.Parent_Item_ID, o.SC_ID, " +
"coalesce(sum(avail_wt), 0) as Avail_Wt" +
" from" +
" open_inv_tbl O" +
" left join item_tbl I on (i.item_ID = o.ITEM_ID)" +
" left join parent_item_tbl R on (r.PARENT_ITEM_id = i.PARENT_ITEM_id)" +
" Where r.Parent_item_ID is not null" +
" Group by r.Parent_Item_ID,o.SC_ID)" +
"select " +
" r1.PARENT_ITEM_id, " +
" R1.SC_ID, R1.Stocking_Lvl , " +
" coalesce(R2.Avail_wt, 0 ) as Avail_Wt, " +
" coalesce(R2.Avail_wt/R1.Stocking_Lvl, 0) as Precantage" +
" From R1" +
" left join R2 on (R1.parent_item_id = R2.parent_item_id and R1.Sc_ID = R2.Sc_ID) " +
" Where R1.Stocking_lvl > '0' " +
" Order by SC_id Desc, Stocking_Lvl Desc)" )
cur.arraysize = 2000
cur.execute(statement)
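Python supports multi-line strings when the text is wrapped in triple quotes. Your concatenated pieces form one long line with no newlines, so the --F.TONS comment comments out the rest of the statement, which is a likely cause of ORA-00936.
Try executing the query as a single multi-line string and check that it is correct (the final closing parenthesis after the last Order by also has no matching opening one).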
statement = """
with r1 as (
select
r.PARENT_ITEM_id,
D.SC_ID,
--F.TONS
...
"""
cur.execute(statement)
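To see why the comment matters, here is a minimal sketch that needs no database. The helper strip_line_comments and the tiny query are made up for illustration; the helper naively mimics how a SQL parser discards everything from -- to the end of the line.

```python
def strip_line_comments(sql):
    """Drop everything from '--' to the end of each line.

    Naive on purpose: it does not handle '--' inside string literals.
    """
    return "\n".join(line.split("--", 1)[0].rstrip() for line in sql.splitlines())


# Built by concatenation: the pieces form ONE line, so the comment never ends.
concatenated = ("select " +
                " a, " +
                " --b" +
                " c" +
                " from t")

# Triple-quoted: each piece keeps its own line, so the comment ends at the newline.
multiline = """
select
  a,
  --b
  c
from t
"""

print(strip_line_comments(concatenated))
print(strip_line_comments(multiline))
```

The first print shows only the text up to the comment, which is no longer a complete statement. The second still contains the from clause.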
I am comparing two parcel datasets and want to use fuzzywuzzy to identify owner names that are similar. For example, City of Pittsburgh versus City of Pitt. I am currently using an UpdateCursor to do other things with the data and am not sure how I can add the fuzzy logic to this.
What I would like to do is have the script select all parcels that border the update cursor parcel, compare two fields for a fuzzy ratio, and then pass the variable for the one with the highest ratio.
Here is the original script that I'd like to modify with the fuzzy logic.
count = 0
with arcpy.da.UpdateCursor(children,c_field_list) as update1:
    for u1row in update1:
        c_owner = u1row[0]
        c_id = u1row[1]
        c_parent = u1row[2]
        c_geo = u1row[3]
        if c_parent == " ":
            parents_select = arcpy.management.SelectLayerByLocation(parents, 'BOUNDARY_TOUCHES', c_geo, '', 'NEW_SELECTION')
            whereClause = ' "ACCOUNTPAR" ' + " != '" + str(c_owner) + " ' "
            with arcpy.da.SearchCursor(parents_select, p_field_list,whereClause) as search1:
                for parent in search1:
                    p_owner = parent[0]
                    p_id = parent[1]
                    p_add = parent[2]
                    print("Parent Account Party: " + p_add + " " + p_id + " Child SOW Owner: " + c_owner + " " + c_id)
                    u1row[2] = p_id
                    u1row[4] = 6
                    update1.updateRow(u1row)
                    count += 1
                    print("Updating " + c_owner + " with " + u1row[2])
        else:
            pass
print("Aggregation Analysis Complete")
print(str(count) + " parcels updated")
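The "pick the bordering parcel with the highest ratio" step might look something like the sketch below. It uses difflib from the standard library as a stand-in for fuzzywuzzy so that it runs without extra installs. The function names, the parcel IDs and the neighbour list are all made up; in the real script the candidates would come from the SearchCursor rows.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0-100 similarity score, ignoring case."""
    return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())


def best_match(child_owner, candidates):
    """Pick the candidate whose owner name is closest to child_owner.

    candidates: iterable of (owner, parcel_id) tuples for bordering parcels.
    Returns (score, owner, parcel_id), or None if there are no candidates.
    """
    scored = [(similarity(child_owner, owner), owner, pid)
              for owner, pid in candidates]
    return max(scored, key=lambda item: item[0], default=None)


# Hypothetical neighbours of one child parcel
neighbours = [("City of Pitt", "P-1"), ("Allegheny County", "P-2")]
print(best_match("City of Pittsburgh", neighbours))
```

Inside the update loop you would collect the (owner, id) pairs from the search cursor first, call best_match once, and only then write the winning id to the row.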
I have a csv file with a time series of two economic variables (housing starts and unemployment). I have a list of calculations and a text summary written from their output (basically a paragraph describing the trends in the data). I would like feedback on how to get a for loop to go through each variable in the csv file so that the final output is a summary for each variable.
I tried applying the basic logic of a for loop, but I'm not sure what I have wrong. I looked at a number of examples on Stack Overflow but nothing seems to fit. I'm sure I'm missing something simple, but I haven't been using Python that long.
raw_data = pd.read_csv('C:/Users/J042666/Desktop/2019.03 HOUST and GDP.csv')
df = pd.DataFrame(raw_data)
for i in df:
    freq = "monthly "
    units = " million "
    pos = 1
    colname = df.columns[pos]
    alltime = df.mean()
    low = df.min()
    maximum = df.max()
    today = df.iloc[720]
    one_year = df.iloc[709:721].mean()
    two_year = df.iloc[697:721].mean()
    five_year = df.iloc[661:721].mean()
    one_year_vol = df.iloc[709:721].std()
    two_year_vol = df.iloc[697:721].std()
    five_year_vol = df.iloc[661:721].std()
    today_vs_1 = ((today/one_year) -1)*100
    today_vs_2 = ((today/two_year) -1)*100
    today_vs_5 = ((today/five_year) -1)*100
    rolling_1 = df.rolling(window=3).mean()
    rolling_2 = df.rolling(window=6).mean()
    rolling_3 = df.rolling(window=9).mean()
    today_vs_1_rolling = ((today/rolling_1.iloc[720]) -1)*100
    today_vs_2_rolling = ((today/rolling_2.iloc[720]) -1)*100
    today_vs_3_rolling = ((today/rolling_3.iloc[720]) -1)*100
    summary = ("The " + str(freq) + str(colname) + " currently stands at " + str(today) + str(units) + " which compares to the 1,2 and 5 year averages of " + str(one_year) + str(units) + "," + str(two_year) + str(units) + "," + " and " + str(five_year) + str(units) + " respectively. " + " Based on the current " + str(colname) + " levels, that reflects a change of" + str(today_vs_1) + ", " + str(today_vs_2) + " and " + str(today_vs_5) + " respectively." " Since the metric began being tracked, the minimum, maximum and long run average total " + str(low) + str(units) + ", " + str(maximum) + str(units) + " and " + str(alltime) + str(units) + " respectively. " "The 1, 2 and 5 year standard deviation for " + str(colname) + " totals " + str(one_year_vol) + str(units) + " ," + str(two_year_vol) + str(units) + " and" + str(five_year_vol) + str(units) + " respectively." + " Based on the current " + str(colname) + " levels compared to the 3, 6 and 9 month rolling averages, the current level reflects a change of " + str(today_vs_1_rolling) + ", " + str(today_vs_2_rolling) + " and " + str(today_vs_3_rolling) + " respectively.")
    print(summary)
As described above, I am hoping for code that produces a paragraph summary of the financial metrics I calculate in the for loop, one for each variable.
The problem is that you are operating on the entire dataframe rather than on each column alone; hence, every calculation was done for both columns at once. I also extracted just the values needed from each operation rather than keeping the full text that pandas prints for a whole result.
This should work:
import pandas as pd
df = pd.read_csv('2019.03 HOUST and GDP.csv')
df = df.loc[:, ['Housing Starts', 'Unemployment Rate']]
for idx, col in enumerate(df.columns):
    freq = "monthly "
    units = " million "
    colname = col
    selectedCol = df[col]  # one column as a Series, so each statistic below is a plain number
    alltime = selectedCol.mean()
    low = selectedCol.min()
    maximum = selectedCol.max()
    today = selectedCol.iloc[720]
    one_year = selectedCol.iloc[709:721].mean()
    two_year = selectedCol.iloc[697:721].mean()
    five_year = selectedCol.iloc[661:721].mean()
    one_year_vol = selectedCol.iloc[709:721].std()
    two_year_vol = selectedCol.iloc[697:721].std()
    five_year_vol = selectedCol.iloc[661:721].std()
    today_vs_1 = ((today/one_year) -1)*100
    today_vs_2 = ((today/two_year) -1)*100
    today_vs_5 = ((today/five_year) -1)*100
    rolling_1 = selectedCol.rolling(window=3).mean()
    rolling_2 = selectedCol.rolling(window=6).mean()
    rolling_3 = selectedCol.rolling(window=9).mean()
    today_vs_1_rolling = ((today/rolling_1.iloc[720]) -1)*100
    today_vs_2_rolling = ((today/rolling_2.iloc[720]) -1)*100
    today_vs_3_rolling = ((today/rolling_3.iloc[720]) -1)*100
    summary = ("The " + str(freq) + str(colname) + " currently stands at " + str(today) + str(units) + " which compares to the 1,2 and 5 year averages of " + str(one_year) + str(units) + "," + str(two_year) + str(units) + "," + " and " + str(five_year) + str(units) + " respectively. " + " Based on the current " + str(colname) + " levels, that reflects a change of" + str(today_vs_1) + ", " + str(today_vs_2) + " and " + str(today_vs_5) + " respectively." " Since the metric began being tracked, the minimum, maximum and long run average total " + str(low) + str(units) + ", " + str(maximum) + str(units) + " and " + str(alltime) + str(units) + " respectively. " "The 1, 2 and 5 year standard deviation for " + str(colname) + " totals " + str(one_year_vol) + str(units) + " ," + str(two_year_vol) + str(units) + " and" + str(five_year_vol) + str(units) + " respectively." + " Based on the current " + str(colname) + " levels compared to the 3, 6 and 9 month rolling averages, the current level reflects a change of " + str(today_vs_1_rolling) + ", " + str(today_vs_2_rolling) + " and " + str(today_vs_3_rolling) + " respectively.")
    print(summary)
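If it helps, the per-column idea can also be packaged as a small function. This is only a sketch under a few assumptions: the summarize function is hypothetical, the data is made up in place of the CSV, and it uses positions relative to the end of the series (iloc[-1], iloc[-12:]) instead of the hard-coded row 720.

```python
import pandas as pd


def summarize(series, freq="monthly"):
    """Build a one-sentence summary for a single numeric column (a Series)."""
    today = series.iloc[-1]                # most recent observation
    one_year = series.iloc[-12:].mean()    # average of the last 12 observations
    change = (today / one_year - 1) * 100  # percent difference from that average
    return (f"The {freq} {series.name} currently stands at {today:.2f}, "
            f"versus a 1-year average of {one_year:.2f} "
            f"(a change of {change:.1f}%).")


# Made-up data standing in for the CSV file
df = pd.DataFrame({
    "Housing Starts": [1.0 + 0.01 * i for i in range(24)],
    "Unemployment Rate": [5.0 - 0.05 * i for i in range(24)],
})

# One summary per column: df[col] hands the function a single Series
summaries = {col: summarize(df[col]) for col in df.columns}
for text in summaries.values():
    print(text)
```

Because the function only ever sees one Series, every statistic is a plain number and can go straight into the text.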
I would like to collect different types of data into a file. Here is part of the code.
val = str(float(data[-1]))
val_dB = float(val)
val_dB = math.log(val_dB, 10) * 10
myfile = open('../../../MLI_values/mli_value.txt', 'a')
myfile.write(date_ID + " " + val + val_dB + "\n")
myfile.close()
But it gives back an error:
myfile.write(date_ID + " " + val + val_dB + "\n")
TypeError: cannot concatenate 'str' and 'float' objects
How can I fix this so that the values are written together (in columns) to the file?
Change:
myfile.write(date_ID + " " + val + val_dB + "\n")
to:
myfile.write(date_ID + " " + val + " " + str(val_dB) + "\n")
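As an alternative to calling str() on each piece, string formatting converts the numbers for you. A minimal sketch, with a hypothetical format_row helper, a made-up date ID, and a temporary file in place of the real path:

```python
import math
import os
import tempfile


def format_row(date_id, value):
    """Return one space-separated line: id, raw value, value in dB."""
    value_db = 10 * math.log10(value)
    # format() converts the floats to text, so no str() calls are needed
    return "{} {} {:.2f}\n".format(date_id, value, value_db)


# Temporary file used only for this demo
path = os.path.join(tempfile.mkdtemp(), "mli_value.txt")
with open(path, "a") as myfile:  # the with block closes the file automatically
    myfile.write(format_row("20170401", 100.0))

with open(path) as myfile:
    print(myfile.read())
```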
I'm trying to make a code that shows all the details of specific students. The code takes an input of gender and their class and then shows all their data. For example, if I input "Male" and "10A", I want the code to give me all the data from the students who are male and in class 10A. All their information is stored in a CSV file (called details). My code so far is:
file = open("details.csv","rt")
gender_input = input("Input gender of students")
class_input = input("Input class of students")
for line in file:
    details_of_gender = line.split(",")
    details_of_class = line.split(",")
    gender = str(details_of_gender[7])
    class1 = str(details_of_class[8])
    if gender == "Male":
        if class1 == "10A":
            print(details[0] + " " + details[1] + " " + details[2] + " " + details[3] + " " + details[4] + " " + details[5])
        if class1 == "10B":
            print(details[0] + " " + details[1] + " " + details[2] + " " + details[3] + " " + details[4] + " " + details[5])
    if gender == "Female":
        if class1 == "10A":
            print(details[0] + " " + details[1] + " " + details[2] + " " + details[3] + " " + details[4] + " " + details[5])
        if class1 == "10B":
            print(details[0] + " " + details[1] + " " + details[2] + " " + details[3] + " " + details[4] + " " + details[5])
Assuming that your program isn't working just because you're using a variable details that you haven't defined before, here's a better way of doing this:
import csv
with open("details.csv", newline="") as infile:
    reader = csv.reader(infile) # use the csv module to read a CSV file!
    for row in reader:
        gender = row[7]
        class_ = row[8] # "class" is a reserved word in Python, so it cannot be a variable name
        if gender in ("Male", "Female") and class_ in ("10A", "10B"):
            print(" ".join(row[:6])) # join elements 0-5 with spaces
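To filter on the values the user actually typed, compare each row against them instead of against fixed lists. The sketch below assumes the same column positions as the question (gender in column 7, class in column 8). The matching_students function and the sample rows are made up, and io.StringIO stands in for the open file.

```python
import csv
import io


def matching_students(lines, gender, class_name):
    """Yield the first six fields of each row whose gender (column 7)
    and class (column 8) equal the requested values."""
    for row in csv.reader(lines):
        # Skip short or blank rows instead of raising IndexError
        if len(row) > 8 and row[7] == gender and row[8] == class_name:
            yield " ".join(row[:6])


# Made-up sample data in place of details.csv
sample = io.StringIO(
    "1,Ann,Lee,2004,London,Art,x,Female,10A\n"
    "2,Bob,Ray,2004,Leeds,Math,x,Male,10A\n"
    "3,Cal,Poe,2004,York,Music,x,Male,10B\n"
)

for line in matching_students(sample, "Male", "10A"):
    print(line)
```

In the real program you would pass the opened file and the two input() results in place of the sample data and the literal strings.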
I would like to insert the timestamp into an output filename of a python script, for example: 20011231_230159_md5_filelist.csv
I am having trouble inserting the code.
This is the end of the script whose output filename needs to have a timestamp:
try:
    my_last_data = get_md5(file_full_path) + ", " + get_last_write_time(file_full_path) + ", " + get_size(
        file_full_path) + ", " + file_full_path + "\n"
    with open("md5_filelist.csv", "a") as my_save_file:
        my_save_file.write(my_last_data)
    print(str(file_full_path) + " ||| Done")
except:
    print("Error On " + str(file_full_path))
This is the timestamping code I am battling with (although not sure if it is the best line for the purpose):
timestr = time.strftime("%Y%m%d_%H%M%S")
I tried inserting it in various ways, but it does not work. Any hints?
Thanks to #pm-2ring (see comments), here is the solution:
timestr = time.strftime("%Y%m%d_%H%M%S")
open(timestr + "_md5_filelist.csv", "a")
In the script:
try:
    timestr = time.strftime("%Y%m%d_%H%M%S")
    my_last_data = get_md5(file_full_path) + ", " + get_last_write_time(file_full_path) + ", " + get_size(
        file_full_path) + ", " + file_full_path + "\n"
    with open(timestr + "_md5_filelist.csv", "a") as my_save_file:
        my_save_file.write(my_last_data)
    print(str(file_full_path) + " ||| Done")
except:
    print("Error On " + str(file_full_path))
This is all you need.
try:
    my_last_data = get_md5(file_full_path) + ", " + get_last_write_time(file_full_path) + ", " + get_size(
        file_full_path) + ", " + file_full_path + "\n"
    with open("{}_md5_filelist.csv".format(timestr), "a") as my_save_file:
        my_save_file.write(my_last_data)
    print(str(file_full_path) + " ||| Done")
except:
    print("Error On " + str(file_full_path))
You were not using timestr anywhere in your code.
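One thing to watch for: if this block runs once per file inside a loop, recomputing the timestamp each time can spread the rows over several output files whenever the clock ticks to the next second. A minimal sketch that computes it once up front; the rows and the temporary directory are made up for the demo:

```python
import os
import tempfile
import time

# Compute the timestamp once, before looping over files, so that every
# row written during this run lands in the same output file.
timestr = time.strftime("%Y%m%d_%H%M%S")
out_dir = tempfile.mkdtemp()  # throwaway directory for this demo
out_path = os.path.join(out_dir, timestr + "_md5_filelist.csv")

# Made-up rows standing in for the md5 / write time / size / path values
rows = [
    "abc123, 2001-12-31, 10, a.txt\n",
    "def456, 2001-12-31, 20, b.txt\n",
]
for row in rows:
    with open(out_path, "a") as my_save_file:
        my_save_file.write(row)

print(os.path.basename(out_path))
```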