How to remove the square brackets that csv.writer.writerow creates - Python

I'm using csv.writer.writerow to write multiple parameters: chromosome.weight_height, chromosome.weight_holes, chromosome.weight_bumpiness, chromosome.weight_line_clear
for each row inside a CSV file. The problem is that it writes the correct values, but with square brackets around them, and I don't want that.
Is there a way to remove the brackets?
def write_generation_to_file(generation_number, population):
    with open(configuration.WEIGHTS_FILES_FOLDER + "generation_" + str(generation_number) + "_weights.csv", 'w',
              newline='') as file:
        writer = csv.writer(file)
        for chromosome in population:
            writer.writerow([chromosome.weight_height, chromosome.weight_holes, chromosome.weight_bumpiness,
                             chromosome.weight_line_clear])
EDIT 1:
Here are some examples of the parameters that I pass to the function.
As far as I know, numpy.random.uniform returns only one element.
If I try chromosome.weight_height[0], for example, it throws an exception:
weight_height = numpy.random.uniform(-2,2)
weight_holes = numpy.random.uniform(-2,2)
weight_bumpiness = numpy.random.uniform(-2,2)
weight_line_clear = numpy.random.uniform(-2,2)

It's likely that weight_height, weight_holes, etc. are actually lists with a single element in them. Try passing the first element of each list to writerow and see if that fixes it, e.g.:
writer.writerow([chromosome.weight_height[0], chromosome.weight_holes[0], chromosome.weight_bumpiness[0], chromosome.weight_line_clear[0]])
This may or may not solve your problem, as my answer is based on the assumption that these values are single-element lists.
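If the values really are single-element containers (the assumption above), the brackets come from the csv module calling str() on each cell. A minimal sketch of the difference, using an in-memory buffer instead of a real file and made-up numbers:

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)

# Bare numbers write cleanly, with no brackets
writer.writerow([-1.5, 0.75])

# A single-element list used as a cell keeps its brackets, because the
# csv module writes str([-1.5]), which is '[-1.5]'
cell_a, cell_b = [-1.5], [0.75]
writer.writerow([cell_a, cell_b])

# Indexing with [0] recovers the bare value
writer.writerow([cell_a[0], cell_b[0]])

lines = buf.getvalue().strip().splitlines()
print(lines)
```

The same reasoning applies if the weights are single-element NumPy arrays (e.g. from uniform(-2, 2, size=1)): indexing with [0] or calling .item() recovers the plain scalar.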

Related

Is there a way to "transform" a CSV table into a simple nested if... else block in python?

I'm fairly new to Python and I'm looking to achieve the following:
I have a table with several conditions, as in the image below (maximum 5 conditions), along with various attributes. Each condition comes from a specific set of values; for example, Condition 1 has 2 possible values, Condition 2 has 4 possible values, Condition 3 has 2 possible values, etc.
What I would like to do: from the example table above, I would like to generate simple Python code so that when I execute my function and import a CSV file containing the table above, I get the following output saved as a *.py file:
def myFunction(Attribute, Condition):
    if Attribute1 & Condition1:
        myValue = val_11
    if Attribute1 & Condition2:
        myValue = val_12
    ...
    if Attribute5 & Condition4:
        myValue = val_54
NOTE: Each CSV file will contain only one sheet and the titles for the columns do not change.
UPDATE, NOTE#2: Both "Attribute" and "Condition" are string values, so simple string comparisons would suffice.
Is there a simple way to do this? I dove into NLP and concluded that it is not possible (at least from what I found in the literature). I'm open to all forms of suggestions/answers.
You can't really use ifs and elses, since, if I understand your question correctly, you want to be able to read the conditions, attributes and values from a CSV file. With hard-coded ifs and elses, you would only be able to check a fixed range of conditions and attributes defined in your code. What I would do instead is write a parser: a piece of code which reads the contents of your CSV file and saves them in another, more usable form.
In this case, the parser is the parseCSVFile() function. Instead of ifs and elses comparing attributes and conditions, you now use the attribute and condition to access a specific element in a dictionary (similar to an array or list, except that you can use, for example, string keys instead of numerical indexes). I used a dictionary containing a dictionary at each position to split the CSV contents into rows and columns. Since dictionaries are used, you can look up values directly by the Attribute and Condition strings instead of doing lots of comparisons.
# Output dictionary
ParsedDict = dict()
# This is either ';' or ',' depending on your system; you can open the CSV
# file with Notepad, for example, to check which character is used
CSVSeparator = ';'

def parseCSVFile(filePath):
    global ParsedDict
    f = open(filePath)
    fileLines = f.readlines()
    f.close()
    # Extract the conditions (strip() removes the trailing newline so the
    # last condition key is clean)
    ConditionsArray = fileLines[0].strip().split(CSVSeparator)[1:]
    for x in range(len(fileLines) - 1):
        # Remove unwanted characters such as newline characters
        line = fileLines[1 + x].strip()
        # Split by the CSV separation character
        LineContents = line.split(CSVSeparator)
        ConditionsDict = dict()
        for y in range(len(ConditionsArray)):
            ConditionsDict.update({ConditionsArray[y]: LineContents[1 + y]})
        ParsedDict.update({LineContents[0]: ConditionsDict})

def myFunction(Attribute, Condition):
    return ParsedDict[Attribute][Condition]
The "[1:]" ignores the contents of the first column (the empty field at the top left and the "Attribute x" fields) when reading either the conditions or the values.
Use the parseCSVFile() function to extract the information from the CSV file, and myFunction() to get the value you want.
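As a rough end-to-end sketch of the idea (the separator, file contents and labels here are made-up examples, not your actual table):

```python
import os
import tempfile

# A made-up table in the shape the answer assumes: conditions across
# the top row, attributes down the first column
csv_text = (";Condition1;Condition2\n"
            "Attribute1;val_11;val_12\n"
            "Attribute2;val_21;val_22\n")

path = os.path.join(tempfile.mkdtemp(), "table.csv")
with open(path, "w") as f:
    f.write(csv_text)

# The same parsing idea as parseCSVFile(), condensed
parsed = {}
with open(path) as f:
    lines = f.readlines()
conditions = lines[0].strip().split(";")[1:]
for line in lines[1:]:
    cells = line.strip().split(";")
    parsed[cells[0]] = dict(zip(conditions, cells[1:]))

print(parsed["Attribute1"]["Condition2"])  # val_12
```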

Why is my list coming up blank when trying to import data from a CSV file?

Python is completely new to me and I'm still trying to figure out the basics... we were given a project to analyse and determine particular things in a CSV file that was given to us.
There are many columns, but the first is the most important, as one of the variables in the function we need to create refers to the first column. It's labelled 'adultids', where a combination of letters and numbers is given, and one 'adultid' takes up 15 rows of different information; there are many different 'adultids' within the file.
To start off, I am trying to make a list from that CSV file that contains only the information for the 'adultids' given (which, as a variable in the function, is a list of two 'adultids' from the CSV file), basically trying to single out that information from the rest of the data in the file. When I run it, it comes up with '[]', and I can't figure out why... can someone tell me what's wrong?
I'm not sure if any of that makes sense, it's very hard to describe, so I apologise in advance, but here is the code I tried :)
def file_to_read(csvfile, adultIDs):
    with open(csvfile, 'r') as asymfile:
        lst = asymfile.read().split("\n")
    new_lst = []
    if adultIDs == True:
        for row in lst:
            adultid, point, x, y, z = row.split(',')
            if adultid == adultIDs:
                new_lst.append([adultid, point, x, y, z])
    return new_lst
Try this.
You get the output [] when you pass adultIDs as False, because new_lst is initialised to [] and the loop that fills it never runs:
def file_to_read(csvfile, adultIDs):
    with open(csvfile, 'r') as asymfile:
        lst = asymfile.read().split("\n")
    new_lst = []
    if adultIDs == True:
        for row in lst:
            adultid, point, x, y, z = row.split(',')
            if adultid == adultIDs:
                new_lst.append([adultid, point, x, y, z])
        return new_lst
    return lst
As far as I understand, you pass a list of ids like ['R10', 'R20', 'R30'] as the second argument of your function. Those ids are also contained in the CSV file you are trying to parse. In this case you should probably rewrite your function so that it checks whether the adultid from a row of your CSV file is contained in the list adultIDs that you pass in. I'd rather do it like this:
def file_to_read(csvfile, adult_ids):  # [1]
    lst = []
    with open(csvfile, 'r') as asymfile:
        for row in asymfile:  # [2]
            r = row[:-1].split(',')  # [3]
            if r[0] in adult_ids:  # [4]
                lst.append(r)
    return lst
Description for the commented digits in brackets:
1. Python programmers usually prefer snake_case names for variables and arguments. You can learn more about it in PEP 8. Although it's not connected to your question, it may be helpful for your future projects, when other programmers review your code.
2. You don't need to read the whole file into a variable. You can iterate it row by row, thus saving memory. This may be helpful if you work with huge files or are short of memory.
3. You need to take the whole string except the last character, which is \n.
4. in checks whether the adult_id from the row of the CSV file is contained in the argument that you pass. For this reason I would recommend using the set datatype for adult_ids rather than list. You can read about sets in the documentation.
I hope I got your task right, and that helps you. Have a nice day!
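A short sketch of that last point, with made-up ids and an in-memory file standing in for the CSV; set membership tests are O(1) on average, so a set scales better than a list when there are many ids:

```python
import csv
import io

# Made-up sample rows in the adultid,point,x,y,z shape from the question
data = "R10,p1,1,2,3\nR20,p2,4,5,6\nR30,p3,7,8,9\n"

adult_ids = {"R10", "R30"}  # a set instead of a list

rows = []
for r in csv.reader(io.StringIO(data)):
    if r and r[0] in adult_ids:
        rows.append(r)

print(rows)  # only the R10 and R30 rows
```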

Is there a way to fill in the gaps of missing/empty data for each of the printed rows to ensure all data is structured similarly?

Please see the below code that I am using to scrape content from a dynamically generated page and then place into a CSV. The problem I am running into now is that each "row" could be missing certain elements, and I would like to insert "blank" or some other placeholder so that all of the rows line up under the correct column headers when I view the file in Excel.
with open(filename, 'w', newline='') as csvfile:
    csvwriter = csv.writer(csvfile)
    for heading in All_Heading:
        driver.execute_script("return arguments[0].scrollIntoView(true);", heading)
        # print("------------- " + heading.text + " -------------")
        ChildElement = heading.find_elements_by_xpath("./../div/div")
        for child in ChildElement:
            driver.execute_script("return arguments[0].scrollIntoView(true);", child)
            # print(heading.text)
            # print(child.text)
            row = [heading.text, *child.text.split("\n")]  # You can also use a tuple here
            print(row)
            csvwriter.writerow(row)
Is it possible to place the sort of logic I am after in the writer statement, or will I need to check each element specifically to know whether it is empty or not? In the end I want each row to contain the same number of elements, even if it means some of them are blank or filler text, as this will keep the overall structure of the data intact.
Examples of the output below:
As you can see, some elements were empty, and as such the following elements end up out of line/in the wrong column (rows 11, 12, 13, 26).
On the same note, is it possible to know extra information about the element I am dealing with in the loop that prints the row? If I knew the class, I could then know whether it's the title, price, weight, brand or so on.
Thank you!
It's not possible to identify what's missing from the text itself, at least not reliably, since you'd have to search for certain substrings to identify the fields. However, if you do have all these values in separate elements under child, you can associate each element with its field in two ways, assuming there is a class uniquely identifying each field:
You can go through all the children using WebElement.find_elements(By.XPATH, ".//*"), record which fields are present using someClass in WebElement.get_attribute("class"), and later fill in blanks for the missing ones.
You can use WebElement.find_elements() (which searches through the WebElement's children) and filter by the known class name, i.e. find_elements(by=By.CLASS_NAME, value=className):
# ...
fieldClasses = ["Title", "Price", ...]  # These are just example classes
for fieldClass in fieldClasses:
    # find_elements returns a (possibly empty) list, so take the first match
    elements = child.find_elements(by=By.CLASS_NAME, value=fieldClass)
    row.append(elements[0].text if elements else "Blank")
# ...
You can simply replace empty cells with some text before saving the rows.
Ex:
row = [heading.text, *child.text.split("\n")]  # You can also use a tuple here
print(row)
for i in range(len(row)):
    if len(row[i]) < 1:
        row[i] = "Filler"
csvwriter.writerow(row)
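Another option is to pad every row to a fixed width, so each row ends up with the same number of cells regardless of what was missing; EXPECTED_COLUMNS here is an assumed value, not something taken from your page:

```python
EXPECTED_COLUMNS = 6  # assumed total number of columns

def pad_row(row, width=EXPECTED_COLUMNS, filler="Filler"):
    """Extend a short row with filler cells so all rows share one width."""
    return row + [filler] * (width - len(row))

print(pad_row(["Heading", "Title", "Price"]))  # three filler cells appended
```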

Removing extra formatting from a python list element ( [''] )

Re-learning Python after not using it for a few years, so go easy on me.
The basis is that I am reading in data from a .csv file; the information I am reading in is as follows:
E1435
E46371
E1696
E27454
However, when using print(list[0]), for example, it produces
['E1435']
I am trying to use these pieces of data to interpolate into an API request string, and the "[' ']" around them is breaking the requests; basically, I need the elements in the list to not have the square brackets and quotes when I use them as variables.
My interpolation is as follows, in case the way I'm interpolating is the problem:
req = requests.get('Linkgoeshere/%s' % list[i])
Edit:
A sample of the data I'm using is listed above with "E1435, E46371" etc.; each item in the csv is a new row in the same column.
As per a request, I have produced a minimal reproduction of my experience.
import csv

# List to store data from the csv
geoCode = []

# Read in locations from a designated file
with open('Locations.csv', 'rt') as f:
    data = csv.reader(f)
    for row in data:
        geoCode.append(row)

i = 0
for item in geoCode:
    # Print the items in the list
    print(geoCode[i])
    i += 1
It appears that list[i] is itself a nested list, so you need another subscript to get to the element inside it:
print(list[i][0])
NB: Avoid naming variables list as it overrides the built-in list type. Try using a plural word like codes or ids instead.
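A minimal sketch of what is going on: csv.reader yields one list per row, so every geoCode entry is a list even when the file has a single column. Taking row[0] (or flattening up front) gives plain strings that interpolate cleanly; the data here is the sample from the question:

```python
import csv
import io

# In-memory stand-in for Locations.csv, one code per row
data = "E1435\nE46371\nE1696\nE27454\n"

geo_code = [row for row in csv.reader(io.StringIO(data))]
print(geo_code[0])   # ['E1435'] -- a one-element list, hence the brackets

# Keep just the first cell of each row to get plain strings
codes = [row[0] for row in geo_code]
print(codes[0])      # E1435
print('Linkgoeshere/%s' % codes[0])
```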

Result of my generated CSV file contains commas, brackets

I am generating a CSV which contains the results I expect; all the numbers are right.
However, presentation-wise it contains parentheses around everything, commas, etc.
Is there a way I can remove these?
I tried adding a comma as a delimiter, but that didn't solve it.
Example output:
Sure: ('Egg',)
results = []
results1 = []
results2 = []
results3 = []
results4 = []
results5 = []
results6 = []
cur.execute(dbQuery)
results.extend(cur.fetchall())
cur.execute(dbQuery1)
results1.extend(cur.fetchall())
cur.execute(dbQuery2)
results2.extend(cur.fetchall())
cur.execute(dbQuery3)
results3.extend(cur.fetchall())
cur.execute(dbQuery4)
results4.extend(cur.fetchall())
cur.execute(dbQuery5)
results5.extend(cur.fetchall())
cur.execute(dbQuery6)
results6.extend(cur.fetchall())
with open("Stats.csv", "wb") as csv_file:
    csv_writer = csv.writer(csv_file)
    csv_writer.writerow(['Query1', 'Query2', 'Query3', 'Query4', 'Query5', 'Query6', 'Query7'])
    csv_writer.writerows(zip(*[results, results1, results2, results3, results4, results5, results6]))
The zip function returns a list of tuples [(x, y), (t, s)...].
The writerows method expects a list of lists, so I think you should format the zip result before calling writerows. Something like this should work:
result = zip(results, results1, results2, results3, results4, results5, results6)
csv_writer.writerows([list(row) for row in result])
EDIT:
I think I understood the problem you are having here (so ignore my previous answer above).
The fetchall function returns a list of tuples like [(x,), (y,)].
So your resultsX variables have this format. Then you apply zip between these lists (see here what zip does).
If, for example, we have
results = [(x,), (y,)]
results1 = [(t,), (z,)]
then zip(results, results1) returns:
[((x,), (t,)), ((y,), (z,))]
That's the format of the list you are passing to writerows, which means the first row will be ((x,), (t,)), where the first element is (x,) and the second one is (t,).
So I'm not sure what you are expecting to write to the CSV with the zip function, but the result you are getting occurs because the elements you write to the csv are tuples instead of plain values.
I don't know the queries you are running here, but if you are expecting just one field per result, then you need to strip the tuple out of each resultsX variable. You can see how to do it in this thread: https://stackoverflow.com/a/12867429/1130381
I hope it helps, but that's the best I can do with the info you provided.
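A small sketch of that last step, with made-up values standing in for the cursor results; each fetchall() row is a one-element tuple, and taking t[0] flattens it before zipping:

```python
# Stand-ins for cur.fetchall() results, assuming one column per query
results = [("Egg",), ("Milk",)]
results1 = [(4,), (7,)]

flat = [t[0] for t in results]    # ['Egg', 'Milk']
flat1 = [t[0] for t in results1]  # [4, 7]

rows = [list(r) for r in zip(flat, flat1)]
print(rows)  # [['Egg', 4], ['Milk', 7]]
```

Writing rows like these with csv.writer produces plain values with no parentheses or stray commas.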