While loop for excel parsing - python

I'm quite new to Python and I need to create a nested loop for Excel parsing. I have a spreadsheet with five columns (ID, Model, Part Number, Part Description, Year), and I need a parser that goes through each line and returns output in this format:
Part Number, Toyota > Model > Year | Toyota > Model > Year etc...
so that each part number is returned only once, listing all of the fitting models and years.
I got part of the way with the code below, but it does not move on to the second part number:
import pandas as pd
import xlrd
workbook = pd.read_excel('Query1.xls')
workbook.head()
i = 0
l = int(len(workbook))
a = workbook['Part Number'].iloc[i]
while i < l:
    b = 0
    c = workbook['Part Number'].iloc[b]
    print(a)
    while c == a:
        #print(c)
        print(b, 'TOYOTA >', workbook['Model'].iloc[b], ' > ', workbook['Year'].iloc[b], ' | ', end = ' ')
        b = b + 1
    print()
    i = i + b

Your code gets stuck in an infinite loop because you never update the value of c as you iterate through the rows, so the condition c == a stays true forever. Here's how you could implement this better (df is the DataFrame you called workbook):
part_number_group = None
for i in range(len(df)): # or `for i, row in df.iterrows():`
    part_number = df.loc[i, "Part Number"]
    if part_number != part_number_group:
        if part_number_group is not None:
            print()
        print(part_number)
        part_number_group = part_number
    print(i, 'TOYOTA >', df.loc[i, 'Model'], ' > ', df.loc[i, 'Year'], ' | ', end = ' ')
But instead, you should use groupby, which removes the need to iterate through the rows at all:
df["Model-Year"] = df.index.astype(str) + " TOYOTA > " + df["Model"] + " > " + df["Year"].astype(str)
for part_number, group in df.groupby("Part Number"):
    print(part_number)
    print(*group["Model-Year"], sep=" | ")
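The grouping idea can also be sketched without pandas, which may make it easier to see what groupby does. This is only an illustration: the sample rows, the function name fitments_by_part, and the part numbers are made up, and the output follows the "Part Number, Toyota > Model > Year | ..." format from the question.

```python
from collections import defaultdict

# Hypothetical sample rows: (part number, model, year).
rows = [
    ("PN-001", "Corolla", 2010),
    ("PN-001", "Camry", 2012),
    ("PN-002", "Yaris", 2015),
]

def fitments_by_part(rows, make="Toyota"):
    """Return one 'part, make > model > year | ...' line per part number."""
    grouped = defaultdict(list)
    for part_number, model, year in rows:
        # Collect every fitment under its part number, in input order.
        grouped[part_number].append(f"{make} > {model} > {year}")
    return [f"{part}, {' | '.join(fits)}" for part, fits in grouped.items()]

lines = fitments_by_part(rows)
for line in lines:
    print(line)
```

Because each part number is a dictionary key, it appears exactly once in the output no matter how the rows are ordered in the sheet.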

Trying to reuse some of your code, you can go over all unique part numbers using a for loop. For loops make it easier not to get stuck in an infinite loop, because you specify the start and stop conditions up front. Then you can query all entries with that same part number and print them with your suggested print call:
import pandas as pd
import xlrd
workbook = pd.read_excel('Query1.xls')
for num in pd.unique(workbook["Part Number"]):
    print('\n', num)
    # `@num` refers to the Python variable num inside the query string
    part_df = workbook.query("`Part Number` == @num")
    for i in range(len(part_df)):
        print(i, 'TOYOTA >', part_df['Model'].iloc[i], ' > ', part_df['Year'].iloc[i], ' | ', end=' ')

Related

Best pythonic way to merge consecutive upper case characters in a string python

I need help finding the most Pythonic way to merge consecutive upper-case characters in a string.
Example:
Input: You can pay N O W or Pay me Back MY Money later
Output: You can pay NOW or Pay me Back MY Money later
I am going with a very quick & dirty approach temporarily
s='lets P A Y N O W'
new_s = s
replace_maps = []
replace_str = ''
prev_cap = False
for i, c in enumerate(s):
    if c == ' ':
        continue
    if c.isupper():
        if prev_cap:
            replace_str += c
        else:
            start = i
            replace_str = c
            prev_cap = True
    else:
        end = i
        if prev_cap:
            replace_maps.append([start, end, replace_str])
            prev_cap = False
            replace_str = ''
else:
    end = i
    if prev_cap:
        replace_maps.append([start, end, replace_str])
        prev_cap = False
        replace_str = ''
new_s = s[:replace_maps[0][0]] + replace_maps[0][2] + s[replace_maps[0][1]:]
new_s
Output: lets PAYNOWW

The best idea is to use a lookahead (?=...) and a lookbehind (?<=...) that check for upper-case letters on both sides of the space.
This regex should do the job:
import re
data = "I could not find a C O V I D patient in the hospital."
re.sub(r"(?<=[A-Z])\s(?=[A-Z])", r'', data)
'I could not find a COVID patient in the hospital.'
EDIT
Regarding your new input after the question was modified (the lookahead now also requires a space after the next capital, so "MY Money" is left alone):
data = "You can pay N O W or Pay me Back MY Money later"
re.sub(r"(?<=[A-Z])\s(?=[A-Z] )", r'', data)
Output:
'You can pay NOW or Pay me Back MY Money later'
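One possible variation, offered as a sketch rather than as the answer's own method: using word boundaries (\b) instead of a literal trailing space also handles a spaced-out word at the very end of the string, such as 'lets P A Y N O W'. The names SPACED_CAPS and merge_spaced_caps are made up for this example. Note that two genuine one-letter upper-case words next to each other would be merged as well.

```python
import re

# Assumed pattern: a single space with a standalone capital letter on each side.
# \b[A-Z] in the lookbehind means the capital starts a word;
# [A-Z]\b in the lookahead means the next capital ends a word.
SPACED_CAPS = re.compile(r"(?<=\b[A-Z]) (?=[A-Z]\b)")

def merge_spaced_caps(text):
    """Join runs of single upper-case letters separated by one space."""
    return SPACED_CAPS.sub("", text)

print(merge_spaced_caps("You can pay N O W or Pay me Back MY Money later"))
print(merge_spaced_caps("lets P A Y N O W"))
```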

Without regex:
mystring = mystring.split(" ")
res = mystring[0]
for i in range(1, len(mystring)):
    if not (mystring[i-1].isupper() and mystring[i].isupper()):
        res += " "
    res += mystring[i]

I don't know what the most Pythonic way would be. I can only tell you what I came up with.
import re
def merge_cons_up(string):
    pattern = re.compile(" [A-Z](?![a-zA-Z0-9_.-])")
    sub_text = re.findall(pattern=pattern, string=string)
    repl = "".join(sub_text).replace(" ", "")
    sub = re.sub(pattern=pattern, string=string, repl=" " + repl, count=1)
    final_string = re.sub(pattern=pattern, string=sub, repl="")
    return final_string
print(merge_cons_up("I could not find a C O V I D patient in the hospital."))
Output:
I could not find a COVID patient in the hospital.

Sum of Digits in a specific way in Python

I'm looking for code that works like this, e.g.:
int(input) = 2565
Printed output should be like:
2 + 5 + 6 + 5 = 18 = 1 + 8 = 9
I wrote code that gives the final answer "9", but I couldn't manage to print every digit separated by a "+" sign. I assume I need a while loop, but how can I write the code so that it produces the output above?

You can use something like this:
def sum_of_digits(s):
    if s < 10:
        return s
    return sum_of_digits(sum(int(c) for c in str(s)))
> sum_of_digits(2565)
9
It checks whether the numerical value is less than 10. If it is, it returns this value. If not, it adds the digits, then recursively calls itself on the result.
Edit
To print out the steps as it goes along, you could do something like this:
def sum_of_digits(s):
    if s < 10:
        print(s)
        return s
    print(' + '.join(c for c in str(s)) + ' = ')
    return sum_of_digits(sum(int(c) for c in str(s)))

First, initialize an empty string output_str.
Then use a while loop that continues as long as our integer is > 9:
[s for s in str(x)] creates a list of the digits (as strings) of our integer. It's called a list comprehension; it is very useful, and my advice is to read a bit about it.
With " + ".join() we create a string with " + " between the digits. Add this string to the end of output_str.
Add " = " to the end of output_str.
Calculate the sum of the digits. We cannot use sum(lst_of_digits) because it is a list of strings; sum([int(s) for s in lst_of_digits]) converts the string list into an integer list, which can be summed using sum(). Store the sum in x.
Add the new x + " = " to output_str.
After the loop, the string has a redundant trailing " = " (the one appended in the last step was not needed), so we just remove the last 3 characters (" = ") from it.
x = 2565
output_str = ""
while x > 9:
    lst_of_digits = [s for s in str(x)]
    output_str += " + ".join(lst_of_digits)
    output_str += " = "
    x = sum([int(s) for s in lst_of_digits])
    output_str += f"{x} = "
output_str = output_str[:-3]
Outputs:
output_str = '2 + 5 + 6 + 5 = 18 = 1 + 8 = 9'
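The steps above can also be wrapped in a function. The following is a minimal sketch, assuming a non-negative integer input; the name digit_sum_trace is made up for this example. It collects the pieces in a list and joins them with " = " at the end, which avoids having to trim a trailing " = ".

```python
def digit_sum_trace(x):
    """Return the repeated digit-sum steps of x as one string."""
    parts = []
    while x > 9:
        digits = str(x)
        parts.append(" + ".join(digits))   # e.g. "2 + 5 + 6 + 5"
        x = sum(int(d) for d in digits)
        parts.append(str(x))               # e.g. "18"
    # A single-digit input never enters the loop, so return it unchanged.
    return " = ".join(parts) if parts else str(x)

print(digit_sum_trace(2565))
```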

You can play around with the end keyword argument of the print function: it is the string that print writes after all of its arguments have been printed. By default it is "\n", but it can be changed to whatever you want.
And the .join method of strings, which puts the given string between the items of a list/iterable of strings, gets you the desired result:
>>> " + ".join("123")
'1 + 2 + 3'
>>>
Mixing it all together:
def sum_digit(n):
    s = sum(map(int,str(n)))
    print(" + ".join(str(n)),"=",s, end="")
    if s<10:
        print()
        return s
    else:
        print(" = ",end="")
        return sum_digit(s)
Here we first compute the sum of the digits in s and print it in the desired form. With end="", print does not move to the next line, which is necessary for the recursive step. Then we check whether we are done: if so, we print a newline; if not, we print an additional " = " to tie into the next recursive step.
>>> sum_digit(2565)
2 + 5 + 6 + 5 = 18 = 1 + 8 = 9
9
>>>
This can easily be modified to just return the accumulated string by adding an extra argument, or to be iterative, but I leave those as exercises for the reader :)

I am a noob but this should do what you want.
Cheers,
Guglielmo
import math
import sys
def sumdigits(number):
    digits = []
    for i in range( int(math.log10(number)) + 1):
        digits.append(int(number%10))
        number = number/10
    digits.reverse()
    string = ''
    thesum = 0
    for i,x in enumerate(digits):
        string += str(x)
        thesum += x
        if i != len(digits)-1: string += ' + '
        else: string += ' = '
    if thesum > 10:
        return string,thesum,int(math.log10(number))+1
    else:
        return string,thesum,0
def main():
    number = float(sys.argv[1])
    finalstring = ''
    string,thesum,order = sumdigits(number)
    finalstring += string
    finalstring += str(thesum)
    while order > 0:
        finalstring += ' = '
        string,thesum,order = sumdigits(thesum)
        finalstring += string
        finalstring += str(thesum)
    print 'myinput = ',int(number)
    print 'Output = ',finalstring
if __name__ == "__main__":
    main()

Python multiplication table, separated with strings

I want to create a function that takes 2 parameters and prints the multiplication table in a nice format where rows are separated by lines. This is the target:
[Image: target design]
I have tried, but have no idea where to integrate the "--------" string. Any ideas?
def multi_table(x,y):
    for row in range(1, x+1):
        for col in range(1, y+1):
            num = row * col
            if num < 10: blank = ' '
            else:
                if num < 100: blank = ' '
            print(blank, num, end = '')
        print()
multi_table(4,5)

You need to add the print statement between the row loop and the column loop. Note that print() already ends the line, so the explicit \n in the string adds a blank line after the separator; drop it if you don't want that. Refer below.
def multi_table(x,y):
    for row in range(1, x+1):
        print("---------------------\n")
        for col in range(1, y+1):
            num = row * col
            if num < 10: blank = ' '
            else:
                if num < 100: blank = ' '
            print(blank, num, end = '')
        print()
multi_table(4,5)

The print() is used to go to the next line, and that's where you want to add the "---------------". So change the print() to print('\n------------------------\n'). \n indicates to go to the next line.

To make the separator width follow y, you can use the following. You can also simplify the formatting with the str.format method:
def multi_table(x,y):
    for row in range(1, x+1):
        print('----' * y)
        for col in range(1, y+1):
            num = row * col
            print('{:4}'.format(num), end = '')
        print()
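As a sketch of the same idea in a form that is easier to check, the table can be built as a list of lines instead of being printed cell by cell. The function name multi_table_lines and the width parameter are assumptions made for this example; the separator is sized from the column count and cell width, as in the answer above.

```python
def multi_table_lines(rows, cols, width=4):
    """Return the table as a list of text lines, one separator before each row."""
    separator = "-" * (width * cols)
    lines = []
    for row in range(1, rows + 1):
        lines.append(separator)
        # Each product is right-aligned in a field `width` characters wide.
        lines.append("".join(f"{row * col:{width}}" for col in range(1, cols + 1)))
    return lines

print("\n".join(multi_table_lines(4, 5)))
```

Returning lines keeps the formatting logic separate from the printing, so the layout can be tested or written to a file without changes.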

Python join - how to join a data in loop?

How can I join the data below?
# Convert Spark DataFrame to Pandas
pandas_df = df.toPandas()
print pandas_df
   age     name
0  NaN  Michael
1   30     Andy
2   19   Justin
My current attempt:
persons = ""
for index, row in pandas_df.iterrows():
    persons += str(row['name']) + ", " + str(row['age']) + "/ "
    print row['name'], row['age']
print persons
Result:
Michael, nan/ Andy, 30.0/ Justin, 19.0/
But I am after this (no slash at the end):
Michael, nan/ Andy, 30.0/ Justin, 19.0

If you want to keep your method of looping through each row, you can simply remove the last "/ " by calling rstrip() on the string to strip it from the right side. Example -
persons = ""
for index, row in pandas_df.iterrows():
    persons += str(row['name']) + ", " + str(row['age']) + "/ "
    print row['name'], row['age']
persons = persons.rstrip("/ ")
print persons
Example/Demo -
>>> persons = "Michael, nan/ Andy, 30.0/ Justin, 19.0/ "
>>> persons = persons.rstrip('/ ')
>>> persons
'Michael, nan/ Andy, 30.0/ Justin, 19.0'
But if you do not need the print row['name'], row['age'] inside the loop, you can turn this into a generator expression and let str.join() do the work. Example -
persons = "/ ".join(", ".join([str(row['name']), str(row['age'])]) for _, row in pandas_df.iterrows())
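To see why str.join() avoids the trailing separator, here is a small sketch that does not need pandas. The list people is a hypothetical stand-in for the DataFrame rows, and join_people is a made-up helper name. join() only places the separator between items, so there is nothing to strip afterwards.

```python
# Hypothetical stand-in for the DataFrame rows: (name, age) pairs.
people = [("Michael", float("nan")), ("Andy", 30.0), ("Justin", 19.0)]

def join_people(pairs, sep="/ "):
    """Format each pair as 'name, age' and join the pieces with sep."""
    return sep.join(f"{name}, {age}" for name, age in pairs)

result = join_people(people)
print(result)
```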

I think this will do:
persons = []
str_pearsons=""
for index, row in pandas_df.iterrows():
    persons.append( str(row['name']) + ", " + str(row['age']))
str_pearsons="/ ".join(persons)

You can achieve this easily in a one liner that will be vectorised:
In [10]:
'/ '.join(df['name'] + ', ' + df['age'].astype(str))
Out[10]:
'Michael, nan/ Andy, 30.0/ Justin, 19.0'

Python Convert String Literal to Float

I am working through the book "Introduction to Computation and Programming Using Python" by Dr. Guttag. I am working on the finger exercises for Chapter 3, and I am stuck. It is section 3.2, page 25. The exercise is: Let s be a string that contains a sequence of decimal numbers separated by commas, e.g., s = '1.23,2.4,3.123'. Write a program that prints the sum of the numbers in s.
The previous example was:
total = 0
for c in '123456789':
    total += int(c)
print total
I've tried and tried but keep getting various errors. Here's my latest attempt.
total = 0
s = '1.23,2.4,3.123'
print s
float(s)
for c in s:
    total += c
    print c
print total
print 'The total should be ', 1.23+2.4+3.123
I get ValueError: invalid literal for float(): 1.23,2.4,3.123.

Floating point values cannot contain a comma. You are passing 1.23,2.4,3.123 as is to the float function, which is not valid. First split the string on the commas,
s = "1.23,2.4,3.123"
print s.split(",") # ['1.23', '2.4', '3.123']
Then convert each element of that list to float and add them together to get the result. To feel the power of Python, this particular problem can be solved in the following ways.
You can find the total like this:
s = "1.23,2.4,3.123"
total = sum(map(float, s.split(",")))
If the number of elements is going to be very large, you can use a generator expression, like this:
total = sum(float(item) for item in s.split(","))
All these versions produce the same result as
total, s = 0, "1.23,2.4,3.123"
for current_number in s.split(","):
    total += float(current_number)
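The split-then-convert approach can be packaged as a small function. This is a sketch only: the name sum_decimal_string and the choice to skip empty fields (for example after a trailing comma) are assumptions, not part of the exercise.

```python
def sum_decimal_string(s, sep=","):
    """Sum the numbers in a separator-delimited string, skipping empty fields."""
    # float() tolerates surrounding whitespace, so "1.5, 2.5" also works.
    return sum(float(field) for field in s.split(sep) if field.strip())

total = sum_decimal_string("1.23,2.4,3.123")
print(round(total, 3))
```

Rounding is only used for display here, because adding binary floating point values can leave a tiny representation error in the last digits.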

Since you are starting with Python, you could try this simple approach:
Use the split(c) function, where c is a delimiter. This gives you a list (numbers in the code below). Then you can iterate over each element of that list, casting each one to a float (because the elements of numbers are strings), and sum them:
numbers = s.split(',')
sum = 0
for e in numbers:
    sum += float(e)
print sum
Output:
6.753

From the book Introduction to Computation and Programming Using Python, page 25:
"Let s be a string that contains a sequence of decimal numbers separated by commas, e.g., s = '1.23,2.4,3.123'. Write a program that prints the sum of the numbers in s."
If we use only what has been taught so far, then this code is one approach:
tmp = ''
num = 0
print('Enter a string of decimal numbers separated by comma:')
s = input('Enter the string: ')
for ch in s:
    if ch != ',':
        tmp = tmp + ch
    elif ch == ',':
        num = num + float(tmp)
        tmp = ''
# Also include last float number in sum and show result
print('The sum of all numbers is:', num + float(tmp))

total = 0
s = '1.23,2.4,3.123'
for c in s.split(','):
    total = total + float(c)
print(total)

Works like a charm.
I only used what I have learned so far:
s = raw_input('Enter a string that contains a sequence of decimal ' +
              'numbers separated by commas, e.g. 1.23,2.4,3.123: ')
s = "," + s+ ","
total =0
for i in range(0,len(s)):
    if s[i] == ",":
        for j in range(1,(len(s)-i)):
            if s[i+j] == ",":
                total = total + float(s[(i+1):(i+j)])
                break
print total

This is what I came up with:
s = raw_input('Enter a sequence of decimal numbers separated by commas: ')
aux = ''
total = 0
for c in s:
    aux = aux + c
    if c == ',':
        total = total + float(aux[0:len(aux)-1])
        aux = ''
total = total + float(aux) ##Uses last value stored in aux
print 'The sum of the numbers entered is ', total

I think they've revised this textbook since this question was asked (and since some of the others answered). I have the second edition of the text, and the split example is not on page 25. There's nothing prior to this lesson that shows you how to use split.
I wound up finding a different way of doing it using regular expressions. Here's my code:
# Intro to Python
# Chapter 3.2
# Finger Exercises
# Write a program that totals a sequence of decimal numbers
import re
total = 0 # initialize the running total
for s in re.findall(r'\d+\.\d+','1.23, 2.2, 5.4, 11.32, 18.1,22.1,19.0'):
    total = total + float(s)
print(total)
I've never considered myself dense when it comes to learning new things, but I'm having a hard time with (most of) the finger exercises in this book so far.

s = input('Enter a sequence of decimal numbers separated by commas: ')
x = ''
sum = 0.0
for c in s:
    if c != ',':
        x = x + c
    else:
        sum = sum + float(x)
        x = ''
sum = sum + float(x)
print(sum)
This uses just the ideas already covered in the book at this point. It goes through each character in the original string, s, using string concatenation to build a new string, x, until it encounters a comma. At that point it converts x to a float and adds it to the sum variable, which started at zero. It then resets x to an empty string and repeats until all the characters in s have been covered. The final number has no comma after it, so it is added once more after the loop.

Here's a solution without using split:
s='1.23,2.4,3.123,5.45343'
pos=[0]
total=0
for i in range(0,len(s)):
    if s[i]==',':
        pos.append(len(s[0:i]))
pos.append(len(s))
for j in range(len(pos)-1):
    if j==0:
        num=float(s[pos[j]:pos[j+1]])
        total=total+num
    else:
        num=float(s[pos[j]+1:pos[j+1]])
        total=total+num
print total

My way works:
s = '1.23, 211.3'
total = 0
for x in s:
    for i in x:
        if i != ',' and i != ' ' and i != '.':
            total = total + int(i)
print total

My answer is here:
s = '1.23,2.4,3.123'
total = 0  # running total of the parsed numbers
sum = 0    # digits of the current number, read as an integer
is_int_part = True
n = 0      # count of digits after the decimal point
for c in s:
    if c == '.':
        is_int_part = False
    elif c == ',':
        if is_int_part == True:
            total += sum
        else:
            total += sum/10.0**n
        sum = 0
        is_int_part = True
        n = 0
    else:
        sum *= 10
        sum += int(c)
        if is_int_part == False:
            n += 1
if is_int_part == True:
    total += sum
else:
    total += sum/10.0**n
print total

I managed to answer the question using only what is covered up to Section 3.2 (the for loop):
s = '1.0, 1.1, 1.2'
print 'List of decimal number'
print s
total = 0.0
for c in s:
    if c == ',':
        total += float(s[0:(s.index(','))])
        d = int(s.index(','))+1
        s = s[(d+1) : len(s)]
s = float(s)
total += s
print '1.0 + 1.1 + 1.2 = ', total
This is my answer to the question. I feel that the split function is not a good fit for beginners like you and me.

Considering that you might not yet have been exposed to more complex functions, simply try this:
total = 0
for c in "1.23", "2.4", "3.123":
    total += float(c)
print total

My answer:
s = '2.1,2.0'
countI = 0
countF = 0
totalS = 0
for num in s:
    if num == ',' or (countF + 1 == len(s)):
        totalS += float(s[countI:countF])
        if countF < len(s):
            countI = countF + 1
    countF += 1
print(totalS) # 4.1
This only works if the numbers are floats

Here is my answer. It is similar to the one by user5716300 above, but since I am also a beginner I explicitly created a separate variable s1 for the split string:
s = "1.23,2.4,3.123"
s1 = s.split(",") #this creates a list of strings
count = 0.0
for i in s1:
    count = count + float(i)
print(count)

If we are just sticking with the content for that chapter, I came up with this: (though using that sum method mentioned by theFourthEye is also pretty slick):
s = '1.23,3.4,4.5'
result = s.split(',')
result = list(map(float, result))
n = 0
add = 0
for a in result:
    add = add + result[n]
    n = n + 1
print(add)

I just want to post my answer because I am reading this book now.
s = '1.23,2.4,3.123'
ans = 0.0
i = 0
j = 0
for c in s:
    if c == ',':
        ans += float(s[i:j])
        i = j + 1
    j += 1
ans += float(s[i:j])
print(str(ans))

Using knowledge from the book:
s = '4.58,2.399,3.1456,7.655,9.343'
total = 0
index = 0
for string in s:
    index += 1
    if string == ',':
        temp = float(s[:index-1])
        s = s[index:]
        index = 0
        total += temp
        temp = 0
total += float(s)  # the last number has no comma after it
print(total)
Here I used string slicing: the string is sliced every time the 'string' variable is equal to ','. The index variable keeps track of the length of the number that comes before the comma. After each slice, the number that was stored in temp is removed from s together with the comma after it, so s becomes a shorter string without that number.
Because of this, the index variable needs to be reset every time this happens. Whatever remains in s after the loop is the last number, which is added at the end.

Here's mine using the exact string in the question and only what has been taught so far.
total = 0
temp_num = ''
for char in '1.23,2.4,3.123':
    if char == ',':
        total += float(temp_num)
        temp_num = ''
    else:
        temp_num += char
total += float(temp_num) #to catch the last number that has no comma after it
print(total)

I know this isn't covered in the book up to this point but I happened to learn the use of the eval() function on my own prior to getting to this question and used it to solve.
total = 0
s = "1.23,2.4,3.123"
x = eval(s)
y = sum(x)
print(y)
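As a side note to the eval() approach, and not something this answer itself uses: eval() runs any expression it is given, which matters when the string comes from user input. A possible alternative is ast.literal_eval from the standard library, which only accepts Python literals. The function name sum_literal_numbers below is made up for this sketch.

```python
import ast

def sum_literal_numbers(s):
    """Parse a comma-separated numeric string without running arbitrary code."""
    value = ast.literal_eval(s)   # "1.23,2.4,3.123" parses to a tuple of floats
    # A single number such as "4.5" parses to a scalar rather than a tuple.
    if isinstance(value, (int, float)):
        return value
    return sum(value)

print(round(sum_literal_numbers("1.23,2.4,3.123"), 3))
```

Input that is not a plain literal, such as a function call, makes literal_eval raise an exception instead of executing it.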

I think this is the easiest way to answer the question. It uses the split command, which has not been introduced in the book at this point but is a very useful command.
s = input('Insert string of decimals, e.g. 1.4,5.55,12.651:')
sList = s.split(',') #create a list of these values
print(sList) #to check if list is correctly created
total = 0 #for creating the variable
for each in sList:
    total = total + float(each)
print(total)

total =0
s = {1.23,2.4,3.123}
for c in s:
    total = total+float(c)
print(total)
