Exporting SQL query from Python to txt file - python

I'm trying to export data that I queried from a database to a txt file. I can do so with the .to_csv method, but it exports with spaces. I've tried setting sep to an empty string, but it forces me to use at least one character as a separator. Is there any way to export the data to a txt file without any spaces between the values?
dataframe
Code I've been using to export to .txt
dataframe.to_csv('Sales_Drivers_ITCSignup.txt',index=False,header=True)
Want it to export like this:

Try
np.savetxt(filename, df.values, fmt)
Feel free to ask if you run into any problems.

Took a bit of tinkering but this was the code I was able to come up with. The thought process was to import the text file, edit it as a list, then re-export it, overwriting the previous file. Thanks for all the suggestions!
RemoveCommas = []
RemoveSpaces = []
AddSymbol = []
Removeextra = []
#Import List and execute replacements
SalesDriversTransactions = []
with open('Sales_Drivers_Transactions.txt', 'r') as reader:
    for line in reader.readlines():
        SalesDriversTransactions.append(line)
for comma in SalesDriversTransactions:
    WhatWeNeed = comma.replace(",","")
    RemoveCommas.append(WhatWeNeed)
for comma in RemoveCommas:
    WhatWeNeed = comma.replace(" ","")
    RemoveSpaces.append(WhatWeNeed)
for comma in RemoveSpaces:
    WhatWeNeed = comma.replace("þ","þ")
    AddSymbol.append(WhatWeNeed)
for comma in AddSymbol:
    WhatWeNeed = comma.replace("\n","")
    Removeextra.append(WhatWeNeed)
with open('Sales_Drivers_Transactions.txt', 'w') as f:
    for i in Removeextra:
        f.write(i)
        f.write('\n')
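The round trip through the file can probably be avoided by joining each row's values yourself before writing. Below is a minimal sketch, not the poster's method: the helper name write_rows_without_separator and the sample rows are made up for illustration, and it assumes the query result is available as plain tuples (for a DataFrame, df.itertuples(index=False) yields rows of the same shape).

```python
import io


def write_rows_without_separator(rows, out, header=None):
    """Write each row as one line, with its values concatenated and no separator."""
    if header is not None:
        out.write(''.join(str(name) for name in header) + '\n')
    for row in rows:
        out.write(''.join(str(value) for value in row) + '\n')


# Hypothetical sample data standing in for the query result
rows = [(101, 'North', 'þ'), (102, 'South', 'þ')]

# An in-memory buffer keeps the sketch self-contained; a real file
# opened with open('Sales_Drivers_ITCSignup.txt', 'w') works the same way.
buffer = io.StringIO()
write_rows_without_separator(rows, buffer, header=('id', 'region', 'flag'))
result = buffer.getvalue()
```

Because nothing is written between the values, the output only stays readable if the consumer knows the width of each field, so this suits fixed-width style targets best.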

Related

Labelling and Grouping Postcodes using Python

I'm fairly new to Python and I am attempting to group various postcodes together under predefined labels. For example "SA31" would be labelled a "HywelDDAPostcode"
I have some code where I read lots of postcodes from a single-column file into a list and compare them with postcodes that are in predefined lists. However, when I output my postcode labels, only the label "UKPostcodes" is output for every postcode in my original file. It would appear that the first two conditions in my code always evaluate to false no matter what. Am I doing the right thing using "in"? Or perhaps it's a file reading issue? I'm not sure.
The input file is simply a file which contains a list of postcodes (in reality it has thousands of rows).
The CSV file
Here is my code:
import csv
with open('postcodes.csv', newline='') as f:
    reader = csv.reader(f)
    your_list = list(reader)
my_list =[]
HywelDDAPostcodes=["SA46","SY23","SY24","SA18","SA16","SA43","SA31","SA65","SA61","SA62","SA17","SA48","SA40","SA19","SA20","SA44","SA15","SA14","SA73","SA32","SA67","SA45",
"SA38","SA42","SA41","SA72","SA71","SA69","SA68","SA33","SA70","SY25","SA34","LL40","LL42","LL36","SY18","SY17","SY20","SY16","LD6"]
NationalPostcodes=["LL58","LL59","LL60","LL61","LL62","LL63","LL64","LL65","LL66","LL67","LL68","LL69","LL70","LL71","LL72","LL73","LL74","LL75","LL76","LL77","LL78",
"NP1","NP2","NP23","NP3","CF31","CF32","CF33","CF34","CF35","CF36","CF3","CF46","CF81","CF82","CF83","SA35","SA39","SA4","SA47","LL16","LL18","LL21","LL22","LL24","LL25","LL26","LL27","LL28","LL29","LL30","LL31","LL32","LL33","LL34","LL57","CH7","LL11","LL15","LL16","LL17","LL18","LL19","LL20","LL21","LL22","CH1","CH4","CH5","CH6","CH7","LL12","CF1","CF32","CF35","CF5","CF61","CF62","CF63","CF64","CF71","LL23","LL37","LL38","LL39","LL41","LL43","LL44","LL45","LL46","LL47","LL48","LL49","LL51","LL52","LL53","LL54","LL55","LL56","LL57","CF46","CF47","CF48","NP4","NP5","NP6","NP7","SA10","SA11","SA12","SA13","SA8","CF3","NP10","NP19","NP20","NP9","SA36","SA37","SA63","SA64","SA66","CF44","CF48","HR3","HR5","LD1","LD2","LD3","LD4","LD5","LD7","LD8","NP8","SY10","SY15","SY19","SY21","SY22","SY5","CF37","CF38","CF39","CF4","CF40","CF41","CF42","CF43","CF45","CF72","SA1","SA2","SA3","SA4","SA5","SA6","SA7","SA1","NP4","NP44","NP6","LL13","LL14","SY13","SY14"]
NationalPostcodes2= list(dict.fromkeys(NationalPostcodes))
labels=["HywelDDA","NationalPostcodes","UKPostcodes"]
for postcode in your_list:
    #print(postcode)
    if postcode in HywelDDAPostcodes:
        my_list.append(labels[0])
    if postcode in NationalPostcodes2:
        my_list.append(labels[1])
    else:
        my_list.append(labels[2])
with open('DiscretisedPostcodes.csv','w') as result_file:
    wr = csv.writer(result_file, dialect='excel')
    for item in my_list:
        wr.writerow([item,])
If anyone has any advice as to what could be causing the issue or just any advice surrounding Python, in general, I would very much appreciate it. Thank you!
The reason why your comparison block isn't working is that when you use csv reader to read your file, each line is being added to your_list as a list. So you are making a list of lists and when you compare those things it doesn't match.
['LL58'] == 'LL58' # fails
So, inspect your_list and see what I mean. You should create an empty your_list before you read the file and append the value from each row to it (the string itself, not the one-element list). Then inspect that to make sure it looks good. It would also behoove you to use the strip() method to strip off whitespace from each item. I can't recall if csv reader does that automatically.
Also... a better structure for testing for membership is to use sets instead of lists. in will work for lists, but it is MUCH faster for sets, so I would put your comparison items into sets.
Lastly, it isn't clear what you are trying to do with NationalPostcodes2. Just use your NationalPostcodes, but put them in a set with {}.
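To make the points above concrete, here is a small sketch, not the original poster's code. The postcode sets are shortened for illustration, label_postcode is a made-up helper name, and an in-memory string stands in for postcodes.csv.

```python
import csv
import io

# Shortened, illustrative sets; the real lists from the question would go here
HYWEL_DDA = {"SA31", "SA46", "SY23"}
NATIONAL = {"LL58", "CF31", "NP23"}


def label_postcode(code):
    """Return the label for a single postcode string."""
    if code in HYWEL_DDA:
        return "HywelDDA"
    if code in NATIONAL:
        return "NationalPostcodes"
    return "UKPostcodes"


# Stand-in for open('postcodes.csv', newline='')
sample = io.StringIO("SA31\n LL58 \nEH1\n\n")

# csv.reader yields each row as a list, so take the first cell and strip it;
# blank lines come back as empty lists and are skipped.
codes = [row[0].strip() for row in csv.reader(sample) if row]
labels = [label_postcode(code) for code in codes]
```

Returning from the function as soon as a set matches also guarantees exactly one label per postcode.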
@Jeff H's answer is correct, but for what it's worth here's how I might write this code (untested):
# Note: Since, as you wrote, these are only single-column files I did not use the csv
# module, as it will just add additional unnecessary overhead.
# Read the known data from files--this will always be more flexible and maintainable than
# hard-coding them in your code. This is just one possible scheme for doing this; e.g.
# you could also put all of them into a single JSON file
standard_postcode_files = {
    'HywelDDA': 'hyweldda.csv',
    'NationalPostcodes': 'nationalpostcodes.csv',
    'UKPostcodes': 'ukpostcodes.csv'
}
def read_postcode_file(filename):
    with open(filename) as f:
        # exclude blank lines and strip additional whitespace
        return [line.strip() for line in f if line.strip()]
standard_postcodes = {}
for key, filename in standard_postcode_files.items():
    standard_postcodes[key] = set(read_postcode_file(filename))
# Assuming all post codes are unique to a set, map postcodes to the set they belong to
postcodes_reversed = {v: k for k, s in standard_postcodes.items() for v in s}
your_postcodes = read_postcode_file('postcodes.csv')
labels = [postcodes_reversed[code] for code in your_postcodes]
with open('DiscretisedPostCodes.csv', 'w') as f:
    for label in labels:
        f.write(label + '\n')
I would probably do other things like not make the input filename hard-coded. If you need to work with multiple columns using the csv module would also be fine with minimal additional changes, but since you're just writing one item per line I figured it was unnecessary.

Skipping lines and selecting wrong cells while reading CSV file

So I have a csv file that has tons of games and info about them, and I'm trying to save each game's publisher and ESRB rating. But for some reason when I print them out it'll randomly skip games and choose the wrong cells.
My code:
def simpleLoop(file_name):
    output = []
    input_file = open(file_name, "r")
    for line in input_file:
        cells = line.split(",")
        output.append((cells[7], cells[13]))
    i = 0
    while (i <= 10):
        print(output[i]) # testing what values i get
        i += 1
Screenshot of csv
Output
Expected Output
Any help is appreciated thanks!
Edit: Solved with the help of SimoN
For anyone else facing a similar issue make sure you specify exactly where you want to split. In my case I split at commas but there were commas inside some of the cells. So to fix this I changed:
cells = line.split(",")
To
cells = line.split('","')
Which makes python split after each cell because cells end with a double quote then a comma and the next cell starts with a double quote
There are commas inside some of the cells and you are splitting on these. When you opened the CSV in Excel (or whatever you used) it knew not to split on these as they are surrounded by quotes. I'd suggest using the Python csv module so you can do the same.
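A short sketch of the csv module approach follows. The sample data and the column positions (1 and 2) are made up for illustration; the real file in the question would use indexes 7 and 13.

```python
import csv
import io

# Hypothetical sample with commas inside quoted cells
sample = io.StringIO(
    '"Name","Publisher","Rating"\n'
    '"Halo, Reach","Microsoft","M"\n'
    '"Tetris","Nintendo, Inc.","E"\n'
)

reader = csv.reader(sample)
header = next(reader)  # skip the header row

# Quoted commas stay inside their cell, so the indexes line up on every row
pairs = [(row[1], row[2]) for row in reader]
```

With a real file, open it with open(file_name, newline='') and pass the file object to csv.reader in the same way.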

Cross referencing two csv files in python

As I'm out of ideas, I've turned to the geniuses on this site.
What I want to be able to do is to have two separate csv files. One of which has a bunch of store names on it, and the other to have blacklisted stores.
I'd like to be able to run a Python script that reads the 'black listed' sheet, then checks if those specific names are within the other sheet, and if they are, deletes them from the main sheet.
I've tried for about two days straight and cannot for the life of me get it to work. So I'm coming to you guys to help me out.
Thanks so much in advance.
P.S. If you can comment the hell out of the script so I know what's going on, it would be greatly appreciated.
EDIT: I deleted the code I originally had but hopefully this will give you an idea of what I was trying to do. (I also realise it's completely incorrect)
import csv
with open('Black List.csv', 'r') as bl:
    reader = csv.reader(bl)
    with open('Destinations.csv', 'r') as dest:
        readern = csv.reader(dest)
        for line in reader:
            if line in readern:
                with open('Destinations.csv', 'w'):
                    del(line)
The first thing you need to be aware of is that you can't update the file you are reading. Text files (which include .csv files) don't work like that. So you have to read the whole of Destinations.csv into memory, and then write it out again, under a new name, but skipping the rows you don't want. (You can overwrite your input file, but you will very quickly discover that is a bad idea.)
import csv
blacklist_rows = []
with open('Black List.csv', 'r') as bl:
    reader = csv.reader(bl)
    for line in reader:
        blacklist_rows.append(line)
destination_rows = []
with open('Destinations.csv', 'r') as dest:
    readern = csv.reader(dest)
    for line in readern:
        destination_rows.append(line)
Now at this point you need to loop through destination_rows and drop any that match something in blacklist_rows, and write out the rest. I can't suggest what the matching test should look like, because you haven't shown us your input data, so I don't actually know what blacklist_rows and destination_rows contain.
with open('FilteredDestinations.csv', 'w') as output:
    writer = csv.writer(output)
    for r in destination_rows:
        if not r: # trap for blank rows in the input
            continue
        if r *matches something in blacklist_rows*: # you have to code this
            continue
        writer.writerow(r)
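Since the input data was never shown, the matching test can only be guessed at. Here is one possible sketch under loud assumptions: the store name sits in the first column of both files, and names should match regardless of case and surrounding whitespace. The helper name filter_rows and the sample data are made up for illustration.

```python
import csv
import io


def filter_rows(destination_rows, blacklist_rows, column=0):
    """Return the destination rows whose name is not on the blacklist.

    Assumes the store name is in the same column of both inputs and that
    matching should ignore case and surrounding whitespace.
    """
    blacklisted = {row[column].strip().lower() for row in blacklist_rows if row}
    return [
        row for row in destination_rows
        if row and row[column].strip().lower() not in blacklisted
    ]


# Hypothetical stand-ins for 'Black List.csv' and 'Destinations.csv'
blacklist_rows = list(csv.reader(io.StringIO("Bad Store\nShady Mart\n")))
destination_rows = list(csv.reader(io.StringIO(
    "Good Store,London\nbad store,Leeds\n\nCorner Shop,York\n"
)))

kept = filter_rows(destination_rows, blacklist_rows)
```

The rows in kept can then be written out with writer.writerows(kept). Putting the blacklist in a set keeps each lookup fast even for long lists.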
You could try Pandas
import pandas as pd
df1 = pd.read_csv("Destinations.csv")
df2 = pd.read_csv("Black List.csv")
blacklist = df2["column_name_in_blacklist_file"].tolist()
df3 = df1[~df1['destination_column_name'].isin(blacklist)]
df3.to_csv("results.csv")
print(df3)

Getting the content of .CSV cell

I'm having trouble reading a .CSV file even though I have tried to read the online Python docs.
The thing is, I have been using the xlrd module in Python to read through xls files and it went superbly.
Now I want to try with .CSV, but I find things much more complicated.
With xlrd, when I wanted Python to return the content of a cell (i,j), I used sheet.cell(i,j).value and it worked. End.
It's a ";"-delimited CSV.
Something like:
Ref;A;B;C;D;E;f
P;x1;x2;x3;x4...
L;y1;y2;y3
M;z1...
N:w1 ...
I want to display a list box containing A, B, C, D ...
And bind this list to a Cur_Selection function that will do some calculations with the x, y, z, w values of the selected ref A, B, C, D ...
That was very easy in xlrd. I don't get it here.
Can someone help?
Are you asking how to access the data in the csv? I typically parse csvs with a simple function built on string manipulation methods. It works for me with the rather small csv files that I generate in Excel.
def parse_csv(content, delimiter=';'):
    csv_data = []
    for line in content.split('\n'):
        # strip() also removes spaces around each value
        csv_data.append([x.strip() for x in line.split(delimiter)])
    return csv_data
# uri is the path to your csv file
with open(uri, 'r') as f:
    content = f.read()
list_data = parse_csv(content)
print(list_data[2][1])
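The standard csv module can do the same job and also handles quoted cells. Below is a minimal sketch, using a made-up sample in the shape of the file from the question and a hypothetical helper column_for to fetch the values under one ref.

```python
import csv
import io

# Hypothetical sample shaped like the ";"-delimited file in the question
sample = io.StringIO("Ref;A;B;C\nP;x1;x2;x3\nL;y1;y2;y3\n")

rows = list(csv.reader(sample, delimiter=';'))

refs = rows[0][1:]   # header cells to show in the list box
cell = rows[2][1]    # row 2, column 1, like sheet.cell(2, 1).value in xlrd


def column_for(ref):
    """Return the values listed under the given ref (a header name)."""
    index = rows[0].index(ref)
    return [row[index] for row in rows[1:]]


values_b = column_for('B')
```

Once the rows are in a list of lists, rows[i][j] plays the role that sheet.cell(i, j).value played in xlrd.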

Why can't I repeat the 'for' loop for csv.Reader?

I am a beginner in Python. I am trying to figure out why the second 'for' loop doesn't work in the following script. I only get the output of the first 'for' loop, and nothing from the second one. I have pasted my script and the CSV data below.
It would be helpful if you could tell me why it behaves this way and how to make the second 'for' loop work as well.
My SCRIPT:
import csv
file = "data.csv"
fh = open(file, 'rb')
read = csv.DictReader(fh)
for e in read:
    print(e['a'])
for e in read:
    print(e['b'])
"data.csv":
a,b,c
tree,bough,trunk
animal,leg,trunk
fish,fin,body
The csv reader is an iterator over the file. Once you go through it once, you read to the end of the file, so there is no more to read. If you need to go through it again, you can seek to the beginning of the file:
fh.seek(0)
This will reset the file to the beginning so you can read it again. Depending on the code, it may also be necessary to skip the field name header:
next(fh)
This is necessary for your code, since the DictReader consumed that line the first time around to determine the field names, and it's not going to do that again. It may not be necessary for other uses of csv.
If the file isn't too big and you need to do several things with the data, you could also just read the whole thing into a list:
data = list(read)
Then you can do what you want with data.
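Both options can be sketched as follows. This is an illustration rather than the asker's script: an in-memory io.StringIO stands in for data.csv so the example is self-contained, and the results are collected into lists instead of printed.

```python
import csv
import io

# Stand-in for open('data.csv', newline='')
fh = io.StringIO("a,b,c\ntree,bough,trunk\nanimal,leg,trunk\nfish,fin,body\n")
reader = csv.DictReader(fh)

first_pass = [row['a'] for row in reader]

# Option 1: rewind the file and skip the header line, because the
# DictReader already knows the field names and would treat it as data.
fh.seek(0)
next(fh)
second_pass = [row['b'] for row in reader]

# Option 2: read everything into a list once and loop over it freely.
fh.seek(0)
data = list(csv.DictReader(fh))
column_a = [row['a'] for row in data]
column_b = [row['b'] for row in data]
```

Option 2 is usually easier to reason about, as long as the file fits comfortably in memory.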
I have written a small function that takes the path of a CSV file, reads it, and returns a list of dicts in one go, so you can then loop over the list as often as you like:
import csv
def read_csv_data(path):
    """
    Read a CSV file from the given path and return a list of dicts
    mapping column names to values.
    """
    with open(path, newline='') as f:
        data = csv.reader(f)
        # Read the column names from the first line of the file
        fields = next(data)
        data_lines = []
        for row in data:
            items = dict(zip(fields, row))
            data_lines.append(items)
    return data_lines
Regards
