Result of my generated CSV file contains commas and brackets - Python

I am generating a CSV which contains the results I expect; all the numbers are right.
However, presentation-wise it contains parentheses around everything, commas, etc.
Is there a way I can remove these?
I tried adding a comma as a delimiter but that didn't solve it.
Example output:
Sure: ('Egg',)
results = []
results1 = []
results2 = []
results3 = []
results4 = []
results5 = []
results6 = []
cur.execute(dbQuery)
results.extend(cur.fetchall())
cur.execute(dbQuery1)
results1.extend(cur.fetchall())
cur.execute(dbQuery2)
results2.extend(cur.fetchall())
cur.execute(dbQuery3)
results3.extend(cur.fetchall())
cur.execute(dbQuery4)
results4.extend(cur.fetchall())
cur.execute(dbQuery5)
results5.extend(cur.fetchall())
cur.execute(dbQuery6)
results6.extend(cur.fetchall())
with open("Stats.csv", "wb") as csv_file:
csv_writer = csv.writer(csv_file)
csv_writer.writerow(['Query1', 'Query2', 'Query3', 'Query4', 'Query5', 'Query6', 'Query7'])
csv_writer.writerows(zip(*[results, results1, results2, results3, results4, results5, results6]))

The zip function is returning a list of tuples [(x, y), (t, s)...]
The writerows method expects a list of lists, so I think you should format the zip return before calling writerows. Something like this should work:
result = zip(results, results1, results2, results3, results4, results5, results6)
csv_writer.writerows([list(row) for row in result])
EDIT:
I think I understood the problem you are having here (so ignore my previous answer above).
The fetchall function is returning a list of tuples like [(x,), (y,)]
So your resultsX variables will have this format. Then you are applying zip between these lists (see the documentation for what zip does).
If for example we have
results = [(x,), (y,)]
results1 = [(t,), (z,)]
When you run the zip(results, results1), it will return:
[((x,), (t,)), ((y,), (z,))]
So that's the format of the list you are passing to writerows, which means the first row will be ((x,), (t,)), where the first element is (x,) and the second one is (t,).
So I'm not sure what you are expecting to write to the CSV with the zip function, but the result you are getting is because the elements you write to the CSV are tuples instead of plain values.
I don't know the queries you are doing here, but if you are expecting just one field per result, then you probably need to strip out the tuple in each resultsX variable. You can take a look at how to do it in this thread: https://stackoverflow.com/a/12867429/1130381
I hope it helps, but that's the best I can do with the info you provided.
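If each query really returns a single column, one hedged way to flatten the one-element tuples before zipping (this is a sketch reusing the cur and dbQuery* names from the question, and keeping the Python 2 style "wb" mode) is:

import csv

# Sketch (not tested against the original database): keep only the first column
# of every fetched row, so the CSV cells are plain values, not tuples.
queries = [dbQuery, dbQuery1, dbQuery2, dbQuery3, dbQuery4, dbQuery5, dbQuery6]
columns = []
for q in queries:
    cur.execute(q)
    columns.append([row[0] for row in cur.fetchall()])

with open("Stats.csv", "wb") as csv_file:
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(['Query1', 'Query2', 'Query3', 'Query4', 'Query5', 'Query6', 'Query7'])
    csv_writer.writerows(zip(*columns))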

Related

Why is my list coming up blank when trying to import data from a CSV file?

Python is completely new to me and I'm still trying to figure out the basics... we were given a project to analyse and determine particular things in a CSV file that was given to us.
There are many columns, but the first is most important, as one of the variables in the function we need to create is for the first column. It's labelled 'adultids', where a combination of letters and numbers is given, and one 'adultid' takes up 15 rows of different information - there are many different 'adultids' within the file.
To start it off, I am trying to make a list from that CSV file that contains only the information for the 'adultIDs' given (which, as a variable in the function, is a list of two 'adultids' from the CSV file), basically trying to single out that information from the rest of the data in the CSV file. When I run it, it comes up with '[]', and I can't figure out why... can someone tell me what's wrong?
I'm not sure if any of that makes sense, it's very hard to describe, so I apologise in advance, but here is the code I tried :)
def file_to_read(csvfile, adultIDs):
    with open(csvfile, 'r') as asymfile:
        lst = asymfile.read().split("\n")
    new_lst = []
    if adultIDs == True:
        for row in lst:
            adultid, point, x, y, z = row.split(',')
            if adultid == adultIDs:
                new_lst.append([adultid, point, x, y, z])
    return new_lst
Try this.
This is because if you pass adultIDs as something other than True, you get the output [], since new_lst is assigned [] and never filled:
def file_to_read(csvfile, adultIDs):
    with open(csvfile, 'r') as asymfile:
        lst = asymfile.read().split("\n")
    new_lst = []
    if adultIDs == True:
        for row in lst:
            adultid, point, x, y, z = row.split(',')
            if adultid == adultIDs:
                new_lst.append([adultid, point, x, y, z])
        return new_lst
    return lst
As far as I understand, you pass a list of ids like ['R10', 'R20', 'R30'] as the second argument of your function. Those ids are also contained in the csv-file you are trying to parse. In this case you should probably rewrite your function so that it checks whether the adultid from a row of your csv-file is contained in the list adultIDs that you pass into your function. I'd rather do it like this:
def file_to_read(csvfile, adult_ids):  # [1]
    lst = []
    with open(csvfile, 'r') as asymfile:
        for row in asymfile:  # [2]
            r = row[:-1].split(',')  # [3]
            if r[0] in adult_ids:  # [4]
                lst.append(r)
    return lst
Description of the numbered comments in brackets:
1. Python programmers usually prefer snake_case names for variables and arguments. You can learn more about this in PEP 8. Although it's not connected to your question, it may be helpful for your future projects when other programmers review your code.
2. You don't need to read the whole file into a variable. You can iterate over it row by row, which saves memory. This can be helpful if you work with huge files or are short on memory.
3. You need to take the whole string except the last character, which is \n.
4. in checks whether the adult_id from the row of the csv-file is contained in the argument that you pass. For that reason I would recommend using the set datatype for adult_ids rather than a list. You can read about sets in the documentation.
I hope I got your task right, and that helps you. Have a nice day!
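A hypothetical call (the file name and ids here are made up for illustration) would then look like:

# Hypothetical usage: pass the ids as a set for fast membership tests.
wanted_ids = {'R10', 'R20'}
rows = file_to_read('adults.csv', wanted_ids)
print(rows)  # e.g. [['R10', 'p1', '0.1', '0.2', '0.3'], ['R20', 'p1', '0.4', '0.5', '0.6']]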

How to remove the square brackets that csv.writer.writerow creates

I'm using csv.writer.writerow to write multiple parameters (chromosome.weight_height, chromosome.weight_holes, chromosome.weight_bumpiness, chromosome.weight_line_clear)
for each row inside a CSV file. The problem is that it writes the correct values, but with square brackets, and I don't want this.
Is there a way to remove the brackets?
def write_generation_to_file(generation_number, population):
    with open(configuration.WEIGHTS_FILES_FOLDER + "generation_" + str(generation_number) + "_weights.csv", 'w',
              newline='') as file:
        for chromosome in population:
            writer = csv.writer(file)
            writer.writerow([chromosome.weight_height, chromosome.weight_holes, chromosome.weight_bumpiness,
                             chromosome.weight_line_clear])
EDIT 1:
These are some examples of the parameters that I give to the function.
As far as I know, numpy.random.uniform returns only one element.
If I try to do chromosome.weight_height[0], for example, it throws an exception.
weight_height = numpy.random.uniform(-2,2)
weight_holes = numpy.random.uniform(-2,2)
weight_bumpiness = numpy.random.uniform(-2,2)
weight_line_clear = numpy.random.uniform(-2,2)
It's likely that weight_height, weight_holes, etc. are actually lists with a single element in them. Try giving the first element of each list to writerow and see if that fixes it. E.g.:
writer.writerow([chromosome.weight_height[0], chromosome.weight_holes[0], chromosome.weight_bumpiness[0], chromosome.weight_line_clear[0]])
This may or may not solve your problem as my answer is based on the assumption that these values are single element lists.
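One quick way to test that assumption (this is just a sketch, not the asker's code): numpy.random.uniform returns a single number when called without a size argument, but an array when size is given, and it is the array's string form that brings the brackets into the CSV:

import numpy

scalar = numpy.random.uniform(-2, 2)           # a single number
wrapped = numpy.random.uniform(-2, 2, size=1)  # an array holding one number

print(numpy.ndim(scalar))    # 0 -> csv.writer writes it without brackets
print(numpy.ndim(wrapped))   # 1 -> str(wrapped) looks like "[0.42...]", hence the brackets
print(float(wrapped[0]))     # unwrapping the single element gives a plain number again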

Formatting SQL rows to a list in Python

I have a bunch of SQL rows that I have converted to lists:
cursor = conn.cursor()
result = cursor.execute(SQLqueries.sql_qry1)
for row in result:
    row_to_list = list(row)
    print(row_to_list)
The output of this is lists like:
['FreqBand,Frequency,0, 5, 10\r\n1,0.006,16.56,25.15,30.96\r\n']
['FreqBand,Frequency,0, 5, 10\r\n1,0.006,12.56,15.27,31.90\r\n']
['FreqBand,Frequency,0, 5, 10\r\n1,0.006,16.36,25.15,34.46\r\n']
I would like to edit these lists to exclude the first two words and replace the "\r\n" characters with commas. I've tried this to get rid of the 'FreqBand,Frequency':
for row in result:
    row_to_list = list(row)
    i = 0
    for each in row_to_list:
        row_to_list[i].replace('FreqBand', '')
        i += 1
    print(row_to_list)
but the output of this seems to get rid of half the first list and doesn't edit any of the others. Any help on this would be appreciated.
You need to assign the result of replace() back to the list element. And use range() to get a sequence of numbers instead of incrementing i yourself.
for row in result:
    row_to_list = list(row)
    for i in range(len(row_to_list)):
        row_to_list[i] = row_to_list[i].replace('FreqBand,Frequency,', '').replace('\r\n', ',')
    print(row_to_list)
First of all, the list below has a size of 1, rather than the 8-10 I assumed when I first saw the question, so please check whether that is something you are aware of.
['FreqBand,Frequency,0, 5, 10\r\n1,0.006,16.56,25.15,30.96\r\n']
Because of this, when you iterate over this list with for each in row_to_list:, all you get is the single string, which is no different from what you would get with row_to_list[0].
Secondly, you might want to double-check what you are trying to accomplish with the counter i. If all you want is to manipulate the first element of each list you name row_to_list, you just need to access it by index and reassign it.
for row in result:
    row_to_list = list(row)
    row_to_list[0] = row_to_list[0].replace('FreqBand,Frequency,', '')
    row_to_list[-1] = row_to_list[-1].replace('\r\n', ',')
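If the end goal is a flat list of just the numeric values, a possible follow-up (assuming every row really is a single string shaped like the examples above) is to split the cleaned string as well:

# Hypothetical continuation: turn each row's single string into a list of values.
for row in result:
    raw = list(row)[0]
    cleaned = raw.replace('FreqBand,Frequency,', '').replace('\r\n', ',')
    values = [v.strip() for v in cleaned.split(',') if v.strip()]
    print(values)  # e.g. ['0', '5', '10', '1', '0.006', '16.56', '25.15', '30.96']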

python csv TypeError: unhashable type: 'list'

Hi, I'm trying to compare two CSV files and get the difference. However, I get the above-mentioned error. Could someone kindly give a helping hand? Thanks.
import csv

f = open('ted.csv', 'r')
psv_f = csv.reader(f)
attendees1 = []
for row in psv_f:
    attendees1.append(row)
f.close()

f = open('ted2.csv', 'r')
psv_f = csv.reader(f)
attendees2 = []
for row in psv_f:
    attendees2.append(row)
f.close()

attendees11 = set(attendees1)
attendees12 = set(attendees2)
print(attendees12.difference(attendees11))
When you iterate a csv reader you get lists, so when you do
for row in psv_f:
    attendees2.append(row)
row is actually a list instance, so attendees1 / attendees2 is a list of lists.
When you convert that to set() it needs to make sure no item appears more than once, and set() relies on the hash function of the items in the list. So you are getting the error because when you convert to set() it tries to hash a list, but a list is not hashable.
You will get the same exception if you do something like this:
set([1, 2, [1,2] ])
More on sets: https://docs.python.org/2/library/sets.html
It happened on the line
attendees11 = set(attendees1)
didn't it? You are trying to make a set from a list of lists, but that is impossible because a set may only contain hashable types, which a list is not. You can convert the lists to tuples instead:
attendees1.append(tuple(row))
The cause is that you created a list of lists:
attendees1.append(row)
Likewise:
attendees2.append(row)
Then when you do:
attendees11 = set(attendees1)
the error will be thrown.
What you should do is:
attendees2.append(tuple(row))
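Putting those suggestions together, a minimal sketch of the comparison (assuming whole rows should be compared, and using the file names from the question) might be:

import csv

# Read each file into a set of tuples; tuples are hashable, lists are not.
with open('ted.csv', 'r') as f:
    attendees1 = {tuple(row) for row in csv.reader(f)}

with open('ted2.csv', 'r') as f:
    attendees2 = {tuple(row) for row in csv.reader(f)}

# Rows that appear in ted2.csv but not in ted.csv.
print(attendees2 - attendees1)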

Generating a .CSV with Several Columns - Use a Dictionary?

I am writing a script that looks through my inventory, compares it with a master list of all possible inventory items, and tells me what items I am missing. My goal is a .csv file where the first column contains a unique key integer and then the remaining several columns would have data related to that key. For example, a three row snippet of my end-goal .csv file might look like this:
100001,apple,fruit,medium,12,red
100002,carrot,vegetable,medium,10,orange
100005,radish,vegetable,small,10,red
The data for this is being drawn from a couple of sources. First, a query to an API server gives me a list of keys for items that are in inventory. Second, I read a .csv file into a dict that matches keys with item names for all possible keys. A snippet of the first 5 rows of this .csv file might look like this:
100001,apple
100002,carrot
100003,pear
100004,banana
100005,radish
Note how any key in my list of inventory will be found in this two-column .csv file that gives all keys and their corresponding item names, and this list minus my inventory on hand yields what I'm looking for (the inventory I need to get).
So far I can get a .csv file that contains just the keys and item names for the items that I don't have in inventory. Given a list of inventory on hand like this:
100003,100004
A snippet of my resulting .csv file looks like this:
100001,apple
100002,carrot
100005,radish
This means that I have pear and banana in inventory (so they are not in this .csv file.)
To get this I have a function to get an item name when given an item id that looks like this:
def getNames(id_to_name, ids):
    return [id_to_name[id] for id in ids]
Then there's a function which returns a list of keys as integers from my inventory server API call; I've run this function like this:
invlist = ServerApiCallFunction(AppropriateInfo)
A third function takes this invlist as its input and returns a dict of keys (the item ids) and names for the items I don't have. It also writes the information in this dict to a .csv file. I am using the set1 - set2 method to do this. It looks like this:
def InventoryNumbers(inventory):
    with open(csvfile, 'w') as c:
        c.write('InvName' + ',InvID' + '\n')
    missinginvnames = []
    with open("KeyAndItemNameTwoColumns.csv", "rb") as fp:
        reader = csv.reader(fp, skipinitialspace=True)
        fp.readline()  # skip header
        invidsandnames = {int(id): str.upper(name) for id, name in reader}
    invids = set(invidsandnames.keys())
    invnames = set(invidsandnames.values())
    invonhandset = set(inventory)
    missinginvidsset = invids - invonhandset
    missinginvids = list(missinginvidsset)
    missinginvnames = getNames(invidsandnames, missinginvids)
    missinginvnameswithids = dict(zip(missinginvnames, missinginvids))
    print missinginvnameswithids
    with open(csvfile, 'a') as c:
        for invname, invid in missinginvnameswithids.iteritems():
            c.write(invname + ',' + str(invid) + '\n')
    return missinginvnameswithids
Which I then call like this:
InventoryNumbers(invlist)
With that explanation, now on to my question here. I want to expand the data in this output .csv file by adding in additional columns. The data for this would be drawn from another .csv file, a snippet of which would look like this:
100001,fruit,medium,12,red
100002,vegetable,medium,10,orange
100003,fruit,medium,14,green
100004,fruit,medium,12,yellow
100005,vegetable,small,10,red
Note how this does not contain the item name (so I have to pull that from a different .csv file that just has the two columns of key and item name) but it does use the same keys. I am looking for a way to bring in this extra information so that my final .csv file will not just tell me the keys (which are item ids) and item names for the items I don't have in stock but it will also have columns for type, size, number, and color.
One option I've looked at is the defaultdict piece from collections, but I'm not sure if this is the best way to go about what I want to do. If I did use this method I'm not sure exactly how I'd call it to achieve my desired result. If some other method would be easier I'm certainly willing to try that, too.
How can I take my dict of keys and corresponding item names for items that I don't have in inventory and add to it this extra information in such a way that I could output it all to a .csv file?
EDIT: As I typed this up it occurred to me that I might make things easier on myself by creating a new single .csv file that would have data in the form key,item name,type,size,number,color (basically just copying the column for item name into the .csv that already has the other information for each key). This way I would only need to draw from one .csv file rather than from two. Even if I did this, though, how would I go about making my desired .csv file based on only those keys for items not in inventory?
ANSWER: I posted another question here about how to implement the solution I accepted (because it was giving me a value error, since my dict values were strings rather than sets to start with) and I ended up deciding that I wanted a list rather than a set (to preserve the order). I also ended up adding the column with item names to the .csv file that had all the other data, so that I only had to draw from one .csv file. That said, here is what this section of code now looks like:
MyDict = {}
infile = open('FileWithAllTheData.csv', 'r')
for line in infile.readlines():
    spl_line = line.split(',')
    if int(spl_line[0]) in missinginvids:  # note that this is the list I was using as the keys for my dict, which I was zipping together with a corresponding list of item names to make my dict before
        MyDict.setdefault(int(spl_line[0]), list()).append(spl_line[1:])
print MyDict
It sounds like what you need is a dict mapping ints to sets, i.e.:
MyDict = {100001: set(['apple']), 100002: set(['carrot'])}
You can add to it with update:
MyDict[100001].update(['fruit'])
which would give you: {100001: set(['apple', 'fruit']), 100002: set(['carrot'])}
Also, if you had a list of attributes of carrot... ['vegetable', 'orange']
you could say MyDict[100002].update(['vegetable', 'orange'])
and get: {100001: set(['apple', 'fruit']), 100002: set(['carrot', 'vegetable', 'orange'])}
Does this answer your question?
EDIT:
To read in the CSV:
infile = open('MyFile.csv', 'r')
for line in infile.readlines():
    spl_line = line.split(',')
    if int(spl_line[0]) in MyDict.keys():
        MyDict[int(spl_line[0])].update(spl_line[1:])  # the key must be an int to match the dict keys
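To then get from the merged MyDict back to one CSV row per key, a hedged sketch (the output file name is made up, and it assumes each value is an iterable of attribute strings as built above; note that sets do not preserve column order) could be:

import csv

# Hypothetical sketch: one row per key, with the key first and its attributes after it.
with open('MissingInventoryDetails.csv', 'w') as out:
    writer = csv.writer(out)
    for key in sorted(MyDict):
        writer.writerow([key] + list(MyDict[key]))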
This isn't an answer to the question, but here is a possible way of simplifying your current code.
This:
invids = set(invidsandnames.keys())
invnames = set(invidsandnames.values())
invonhandset = set(inventory)
missinginvidsset = invids - invonhandset
missinginvids = list(missinginvidsset)
missinginvnames = getNames(invidsandnames, missinginvids)
missinginvnameswithids = dict(zip(missinginvnames, missinginvids))
Can be replaced with:
invonhandset = set(inventory)
missinginvnameswithids = {k: v for k, v in invidsandnames.iteritems() if k not in invonhandset}
Or:
invonhandset = set(inventory)
for key in invidsandnames.keys():
    if key in invonhandset:
        del invidsandnames[key]
missinginvnameswithids = invidsandnames
Have you considered making a temporary RDB (Python has SQLite support baked in)? For reasonable numbers of items I don't think you would have performance issues.
I would turn each CSV file and the result from the web API into a table (one table per data source). You can then do everything you want with some SQL queries plus joins. Once you have the data you want, you can dump it back to CSV.
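A minimal sketch of that idea (the table layout and column names are invented here for illustration) using the built-in sqlite3 module might look like:

import sqlite3

# Hypothetical schema: one table per data source, joined on the shared item id.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)')
conn.execute('CREATE TABLE details (id INTEGER PRIMARY KEY, type TEXT, size TEXT, number INTEGER, color TEXT)')
conn.execute('CREATE TABLE on_hand (id INTEGER PRIMARY KEY)')

# ...load each CSV / API result into its table with executemany()...

missing = conn.execute('''
    SELECT items.id, items.name, details.type, details.size, details.number, details.color
    FROM items JOIN details ON details.id = items.id
    WHERE items.id NOT IN (SELECT id FROM on_hand)
''').fetchall()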
