I'm trying to create variables within these loops, but Python gives me a syntax error. Both tme and empty give me this problem. I'm trying to get Python to read an Excel sheet and put the values in a list. The variables I'm trying to create are supposed to help me find the empty cells in the sheet and remove the corresponding entry from another list I created, so that the values I obtain correspond to the correct inputs and the inputs that have no value in Excel are removed from that list. It gives an index error too; if there's a fix for that I'd appreciate it, but I can figure that out once the variables work correctly.
span = list()
print(span)
price = list() #creates a list with all the values before giving index error
for q in range(1,chart.nrows):
    for i in range (1,chart.ncol):
        L = (chart.cell_value(q,i))
        if L != '': #all values that exist will be appended to the list
            price.append(L)
        else: #find x and y parts of space, have that popped from span list
            moth = chart.cell_value(q,0)
            m = str(moth)
            y = year = str(chart.cell_value(0,i)
            tme = m + ' ' + y
            empty = span.index(tme)
            span.pop(empty)
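Just from a syntax point of view, you were missing a closing parenthesis on the line that assigns y. Because the parenthesis is never closed, the parser only notices the problem on the following line, which is why tme gets flagged. The corrected line is: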
y = year = str(chart.cell_value(0,i))
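To see that the unclosed parenthesis alone is enough to break the lines after it, here is a minimal sketch that only parses the two versions with the built-in compile(). The helper name parses and the two source strings are made up for illustration; chart, m and i are never evaluated because nothing is executed.

```python
# Hypothetical stand-in source strings; compile() only parses them,
# so the undefined names (chart, m, i) are never looked up.
broken = "y = str(chart.cell_value(0, i)\ntme = m + ' ' + y\n"
fixed = "y = str(chart.cell_value(0, i))\ntme = m + ' ' + y\n"


def parses(source):
    """Return True if the source compiles, False if it raises SyntaxError."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


print(parses(broken))  # False
print(parses(fixed))   # True
```

Inside an open parenthesis a newline does not end the statement, so the tme line is read as a continuation of the previous one.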
My data cutting loop seems to run OK inside the loop, but when I print the result outside the loop, the contents are unchanged. I presume it's buggy because I'm trying to assign to what the for loop is iterating over, but I don't know.
For reference, it's a small web review scraper project I'm working on. To get it formatted to CSV with pandas I think all the data needs to end at the same point (length), so I'm cutting any lists that are longer than the shortest. The values "cust_stars_result, rev_result, cust_res" are all lists with basic strings stored inside, in this case with lengths 16, 12, and 15. I try to slice everything down to 12 in the end, but the results are overwritten. What is the right/best way to go about this?
star_len = len(cust_stars_result)
rev_len = len(rev_result)
custname_len = len(cust_res)
print('customer name length: ' + str(custname_len) + ' -- review length: ' + str(rev_len) + ' -- star length: ' + str(star_len))
datalen = [star_len, rev_len, custname_len]
print(min(datalen))
datapack = [cust_stars_result, rev_result, cust_res]
# LOOPER FOR CULLING
for data in datapack:
    if len(data) != min(datalen):
        print("operating culler to make data even length")
        print(len(data))
        data = data[: min(datalen)]
        print(len(data)) #this comes out OK
    else:
        print("equal length, skipping culler")
        pass
print(datapack) # prints the original values
Inside your loop you update the data variable, but that only rebinds the name data to a new list; the element stored in datapack is untouched. You want to do something like
for i, data in enumerate(datapack):
    ...
    datapack[i] = data[: min(datalen)]
This will update the datapack element.
While "trying to assign to what the for loop is running through" is a real issue, in this case the problem is rather that your code is not assigning anything to datapack when you change data. Instead, what it does is assign each item in datapack to data, so when you rebind data, datapack remains unchanged.
Instead, try either adding each item to a new list, and then assigning the new list to datapack:
temp = []
for data in datapack:
    ...
    temp.append(data[:min(datalen)])
datapack = temp
Or try using a range or enumerate loop:
for i, data in enumerate(datapack):
    ...
    datapack[i] = data[:min(datalen)]
There are also more compact ways (though less readable and harder to debug) to accomplish what you're doing here (slicing off the end of each list), such as the following, which uses a list comprehension and map:
mindatalen = min(map(len, datapack))
datapack = [data[:mindatalen] for data in datapack]
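All of the variants above build new lists and store them in datapack, so the original names (cust_stars_result and so on) still refer to the untrimmed lists. If you want those to shrink as well, one option is to delete the tail of each list in place. A minimal sketch, using made-up list contents with the lengths from the question:

```python
# Hypothetical stand-ins for the scraped lists (lengths 16, 12 and 15).
cust_stars_result = list(range(16))
rev_result = list(range(12))
cust_res = list(range(15))
datapack = [cust_stars_result, rev_result, cust_res]

shortest = min(len(data) for data in datapack)

# Rebinding the loop variable: the lists inside datapack are not affected.
for data in datapack:
    data = data[:shortest]
before = [len(data) for data in datapack]
print(before)  # [16, 12, 15]

# Deleting the tail in place: the list objects themselves shrink.
for data in datapack:
    del data[shortest:]
after = [len(data) for data in datapack]
print(after)                   # [12, 12, 12]
print(len(cust_stars_result))  # 12
```

Whether mutating in place is preferable depends on whether other code still needs the full-length lists.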
I am trying to make my code look better by creating functions that do all the work when I run just one line, but it is not working as intended. I am currently pulling data from a table in a PDF into a pandas dataframe. From there I have 4 functions, all calling each other and finally returning the updated dataframe. I can see that it is fully updated when I print it in the last method. However, I am unable to access and use that updated dataframe, even after I return it.
My code is as follows
def data_cleaner(dataFrame):
    #removing random rows
    removed = dataFrame.drop(columns=['Unnamed: 1','Unnamed: 2','Unnamed: 4','Unnamed: 5','Unnamed: 7','Unnamed: 9','Unnamed: 11','Unnamed: 13','Unnamed: 15','Unnamed: 17','Unnamed: 19'])
    #call next method
    col_combiner(removed)

def col_combiner(dataFrame):
    #Grabbing first and second row of table to combine
    first_row = dataFrame.iloc[0]
    second_row = dataFrame.iloc[1]
    #List to combine columns
    newColNames = []
    #Run through each row and combine them into one name
    for i,j in zip(first_row,second_row):
        #Check to see if they are not strings, if they are not convert it
        if not isinstance(i,str):
            i = str(i)
        if not isinstance(j,str):
            j = str(j)
        newString = ''
        #Check for double NAN case and change it to Expenses
        if i == 'nan' and j == 'nan':
            i = 'Expenses'
            newString = newString + i
        #Check for leading NAN and remove it
        elif i == 'nan':
            newString = newString + j
        else:
            newString = newString + i + ' ' + j
        newColNames.append(newString)
    #Now update the dataframes column names
    dataFrame.columns = newColNames
    #Remove the name rows since they are now the column names
    dataFrame = dataFrame.iloc[2:,:]
    #Going to clean the values in the DF
    clean_numbers(dataFrame)

def clean_numbers(dataFrame):
    #Fill NAN values with 0
    noNan = dataFrame.fillna(0)
    #Pull each column, clean the values, then put it back
    for i in range(noNan.shape[1]):
        colList = noNan.iloc[:,i].tolist()
        #calling to clean the column so that it is all ints
        col_checker(colList)
        noNan.iloc[:,i] = colList
    return noNan

def col_checker(col):
    #Going through, checking and cleaning
    for i in range(len(col)):
        #print(type(colList[i]))
        if isinstance(col[i],str):
            col[i] = col[i].replace(',','')
            if col[i].isdigit():
                #print('not here')
                col[i] = int(col[i])
            #If it is not a number then make it 0
            else:
                col[i] = 0
Then when I run this:
doesThisWork = data_cleaner(cleaner)
type(doesThisWork)
I get NoneType. I might be doing this the long way as I am new to this, so any advice is much appreciated!
The reason you are getting NoneType is that data_cleaner does not have a return statement, so when it finishes executing it automatically returns None. The same applies to col_combiner: only clean_numbers returns anything, and its result is discarded by the caller. It is the return value of a function that is assigned to the variable var in a statement like this:
var = fun(x)
A different question entirely is whether your dataframe cleaner will be changed by the function data_cleaner, which can happen because dataframes are mutable objects in Python.
In other words, a function can read your dataframe and change it, so that after the function call cleaner is different than before. Separately, a function can return a value (which yours doesn't), and this value will be assigned to doesThisWork.
Usually you should prefer that a function does only one thing, so expecting a function to both change its argument and return a value is usually bad practice.
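To illustrate how the result has to be passed back up through every function in the chain, here is a minimal sketch using plain lists instead of a dataframe so it runs without pandas. All function names and the sample data are hypothetical; the point is only that each caller returns what its callee returned.

```python
# Hypothetical three-step pipeline mirroring the call chain in the question.
def strip_blanks(rows):
    cleaned = [row for row in rows if row]  # builds a new list, input untouched
    return to_ints(cleaned)                 # pass the callee's result back up


def to_ints(rows):
    converted = [[int(cell.replace(",", "")) for cell in row] for row in rows]
    return add_totals(converted)


def add_totals(rows):
    return [row + [sum(row)] for row in rows]


def strip_blanks_no_return(rows):
    cleaned = [row for row in rows if row]
    to_ints(cleaned)  # the result is computed, then discarded


raw = [["1,000", "2"], [], ["3", "4"]]
result = strip_blanks(raw)
lost = strip_blanks_no_return(raw)
print(result)  # [[1000, 2, 1002], [3, 4, 7]]
print(lost)    # None
```

In the question's code the equivalent change would be to put return in front of the col_combiner(removed) and clean_numbers(dataFrame) calls.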
I am working on a text file that contains several kinds of information. I converted it into a list in Python, and right now I'm trying to separate the different data into different lists. The data is presented as follows:
CODE/ DESCRIPTION/ Unity/ Value1/ Value2/ Value3/ Value4 and then repeat, an example would be:
P03133 Auxiliar helper un 203.02 417.54 437.22 675.80
My approach so far has been:
Creating lists to store each piece of information:
codes = []
description = []
unity = []
cost = []
Through loops, finding a code based on the code's structure, and using the code's index as a base to find the remaining values.
Finding a code is easy: it's a distinct type of information amongst the other data.
For the remaining values I made a loop to find the next numeric value after a code. That way I can delimit the rest of the indexes:
The unity would be at the code's index + index until isnumeric - 1, since it's the last item before the first numeric value in each line.
The cost would be at the code's index + index until isnumeric + 2; the third value is the only one I need to store.
The description is a little harder, since the number of elements that compose it varies across the list. So I used slicing starting at the code's index + 1 and ending at index until isnumeric - 2.
for i, carc in enumerate(txtl):
    if carc[0] == "P" and carc[1].isnumeric():
        codes.append(carc)
        j = 0
        while not txtl[i+j].isnumeric():
            j = j + 1
        description.append(" ".join(txtl[i+1:i+j-2]))
        unity.append(txtl[i+j-1])
        cost.append(txtl[i+j])
I'm facing some problems with this approach: although there will always be more elements in the list after a code, I'm getting the error:
while not txtl[i+j].isnumeric():
txtl[i+j] list index out of range.
I'd accept any solution that debugs my code, or even a new solution to the problem.
OBS: I'm also going to have to do this for a really similar data source, but there the code would be just a sequence of 7 numbers, and thus harder to find amongst the other data. Any solution that covers this case is also appreciated!
A slight addition to your code should resolve this:
while i+j < len(txtl) and not txtl[i+j].isnumeric():
    j += 1
The first condition fails when i+j is out of bounds, so the second one doesn't get checked (the and operator short-circuits).
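As a rough sketch of that guard in isolation, the loop can be wrapped in a small helper that returns None instead of running off the end of the list. The helper name first_index and its predicate parameter are made up for illustration; the second call uses the sample line from the question.

```python
def first_index(tokens, start, predicate):
    """Return the first index >= start whose token satisfies predicate, or None."""
    j = start
    # The bounds check comes first, so tokens[j] is never evaluated past the end.
    while j < len(tokens) and not predicate(tokens[j]):
        j += 1
    return j if j < len(tokens) else None


int_tokens = "P03133 Auxiliar helper un 203 417 437 675".split()  # made-up integer values
float_tokens = "P03133 Auxiliar helper un 203.02 417.54 437.22 675.80".split()

print(first_index(int_tokens, 0, str.isnumeric))    # 4
print(first_index(float_tokens, 0, str.isnumeric))  # None
```

The second result hints at why the original loop ran past the end: none of the decimal values count as numeric for str.isnumeric, so the scan never stops on its own.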
Also, please use a list of dict items instead of 4 different lists, for example:
thelist = []
thelist.append({'codes': 69, 'description': 'random text', 'unity': 'whatever', 'cost': 'your life'})
In this way you always have the correct values together in the list, and you don't need to keep track of where you are with indexes or other black magic...
EDIT after comment interactions:
Ok, so in this case you split the line you are processing on the space character, and then process the words in the line.
from pprint import pprint # just for pretty printing

textl = 'P03133 Auxiliar helper un 203.02 417.54 437.22 675.80'
the_list = []

def handle_line(textl: str):
    description = ''
    values = []
    for word in textl.split()[1:]:
        # it splits on space characters by default
        # you can ignore the first item in the list, as this will always be the code
        # str.isnumeric() doesn't work with floats, only integers. See https://stackoverflow.com/a/23639915/9267296
        if not word.replace(',', '').replace('.', '').isnumeric():
            if len(description) == 0:
                description = word
            else:
                description = f'{description} {word}' # I like f-strings
        else:
            values.append(word)
    # the last non-numeric word is the unity; everything before it is the description
    description, _, unity = description.rpartition(' ')
    return {'code': textl.split()[0], 'description': description, 'unity': unity, 'values': values}

the_list.append(handle_line(textl))
pprint(the_list)
str.isnumeric() doesn't work with floats, only integers. See https://stackoverflow.com/a/23639915/9267296
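If stripping separators before calling isnumeric feels fragile, another common approach is to let float() decide. A minimal sketch; the helper name is_number is made up, and note that float() also accepts strings such as 'nan', 'inf' and '1e3', which may or may not be what you want for this data.

```python
def is_number(text):
    """Return True if text parses as a float, e.g. '203.02' or '675'."""
    try:
        float(text)
        return True
    except ValueError:
        return False


print("203".isnumeric())     # True
print("203.02".isnumeric())  # False
print(is_number("203.02"))   # True
print(is_number("un"))       # False
```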
Losing my mind over this. I've been through many different articles, forums and questions but still can't figure it out.
I'm a programming noob, and I am currently using IronPython to create a Windows Form application. Visual Studio 2015 is my IDE.
I am wanting to create 100+ label elements on the main form from an array which has been loaded in from a CSV. The array works fine, the code for that is here:
with open('sites.csv', 'rb') as f:
    reader = csv.reader(f)
    for row in reader:
        sites.append(row[0])
The array looks like this:
Sitename 1,Sitename 2,Sitename 3,Sitename 4 etc.
Inside my "MyForm" class which is where I create my child controls before showing the form I have a loop to go through this array and create a label for each sitename entry in the list.
#Create Label for each Site
for s in sites:
    sitename = str(s) #Convert Name into a String
    elementname = sitename.replace(" ","") + "Label"
    elementname = Label()
    elementname.Name = str(elementname)
    elementname.Parent = self
    elementname.Location = Point(lastx,lasty)
    counter = counter + counter
    lasty = lasty + 10
The variable sitename will convert the current site name entry in the list to a string value with any spaces (e.g. Northern Site).
The variable elementname takes the sitename variable and removes the spaces and adds the word Label to the end of the name.
I then try to create a label object on the form with the name held in the elementname variable.
Although this doesn't cause an error or exception when I run it, the result only outputs one label, with the name of the first entry in the array/list.
I might be coming at this from the wrong angle. I can see why it isn't working by stepping through the code: it creates every single label under the variable elementname, not the sitenamelabel I was intending.
I've attempted to generate variables dynamically using a dictionary, but this didn't seem to work. I also attempted to create an array of labels on the form and then populate these with the loop, but this didn't seem to work either.
You will have to actually add every label you have created to the form. A good way to do this is to create a list of labels and add them at the end using the AddRange method.
labels = [] # create a blank list to hold all labels
for s in sites:
    sitename = str(s) #Convert Name into a String
    elementname = sitename.replace(" ","") + "Label"
    label = Label() # separate variable, so the name string above is not overwritten
    label.Name = elementname
    label.Parent = self
    label.Location = Point(lastx,lasty)
    labels.append(label) # add each element to the list
    counter = counter + counter
    lasty = lasty + 10
# Add all labels to the form (self is the MyForm instance)
self.Controls.AddRange(System.Array[Label](labels))
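On the question of dynamic variable names: rather than trying to create a variable per site, one option is to keep the labels in a dictionary keyed by the generated name. The sketch below uses a hypothetical FakeLabel class as a stand-in for the Windows Forms Label so it runs in plain Python without .NET; the site names other than "Northern Site" and the spacing value are made up.

```python
class FakeLabel:
    """Hypothetical stand-in for System.Windows.Forms.Label (not the real class)."""

    def __init__(self):
        self.Name = ""
        self.Text = ""
        self.Location = (0, 0)


sites = ["Northern Site", "Southern Site", "Eastern Site"]  # made-up site list

labels_by_name = {}
x, y = 10, 10
for sitename in sites:
    key = sitename.replace(" ", "") + "Label"
    label = FakeLabel()
    label.Name = key
    label.Text = sitename
    label.Location = (x, y)
    labels_by_name[key] = label  # look the label up later by its name
    y += 25  # hypothetical row spacing so labels do not overlap

print(sorted(labels_by_name))
print(labels_by_name["NorthernSiteLabel"].Location)  # (10, 10)
```

In the real form the same pattern should apply with Label and Point in place of the stand-ins, and the dictionary's values can be passed to AddRange.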
So basically I'm trying to check if a bunch of strings in a list called list9000 contain an "#" sign. What I want is to empty the list once it has 6 elements, but before clearing it to do a for loop checking the elements for any "#" signs. I've tried using del and other emptying techniques, but it just doesn't seem to work. Here's my work so far:
if( len(list9000) == 6):
    # print(list9000)
    i = list9000.count("#")
    if(i>1):
        amount9000 = amount9000 - i + 1
        numWrong = numWrong - i + 1
    list9000[:] = []
list9000.append(line)
This is just a snippet of my code. There are about 300 lines of other code. I am reading a text file, in which I add the lines of text in the file to my list. If I could solve this problem, I would be basically done with my project!
Edit: I've tried using del list9000[:], but it doesn't work.
Update: I have printed out the length of the list, and most of the time it isn't 6; instead it increases by 6 every time.
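You can just reassign list9000 to a new empty list object. But be careful: assignment in Python only rebinds the name, so any other reference to the old list still sees the old contents. A simple list9000[:] = [] clears the list in place and should do the job. Also note that list9000.count("#") only counts elements that are exactly equal to "#"; to count the elements that contain a "#", loop over the items:
if( len(list9000) == 6):
    # print(list9000)
    i = 0
    for item in list9000:
        if("#" in item):
            i += 1
    if(i>1):
        amount9000 = amount9000 - i + 1
        numWrong = numWrong - i + 1
    list9000[:] = []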
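Here is a small self-contained sketch of the same idea, in case it helps to test the counting and clearing separately from the other 300 lines. The function name flush_if_full and the input lines are made up. It checks the length with >= rather than ==, on the assumption that the list can grow past 6 if a check is ever skipped, and it appends before checking, which differs slightly from the order in the question.

```python
def flush_if_full(buffer, limit=6):
    """Count items containing '#' and clear buffer in place once it holds `limit` items.

    Returns the count, or None when the buffer is not yet full.
    """
    if len(buffer) < limit:
        return None
    hits = sum(1 for item in buffer if "#" in item)
    buffer.clear()  # same effect as buffer[:] = []
    return hits


list9000 = []
alias = list9000  # a second reference to the same list object
results = []
for line in ["a#", "b", "c#", "d", "e", "f", "g"]:  # made-up input lines
    list9000.append(line)
    hits = flush_if_full(list9000)
    if hits is not None:
        results.append(hits)

print(results)            # [2]
print(list9000)           # ['g']
print(alias is list9000)  # True
```

Because the list is cleared in place, alias and list9000 still refer to the same object afterwards, which is the difference from rebinding list9000 to a new list.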