python add data to existing excel cell Win32com - python

Assume A1 is the only cell in a workbook, and it's blank.
I want my code to add "1", "2" and "3" to it so it says "1 2 3".
As of now I have:
NUMBERS = [1, 2, 3, 4, 5]
ThisSheet.Cells(1,1).Value = NUMBERS
This just writes the first value to the cell. I tried
ThisSheet.Cells(1,1).Value = Numbers[0-2]
but that just puts the LAST value in there. Is there a way for me to just add all of the data in there? This information will always be in String format, and I need to use Win32Com.
Update:
I did
stringVar = ', '.join(str(v) for v in LIST)
Update: this .join works perfectly for the NUMBERS list. Now I tried applying it to another list that looks like this:
LIST = ['Description Good\nBad', 'Description Valid\nInvalid']
If I print LIST[0], the outcome is
Description Good
Bad
Which is what I want. But if I use .join on this one, it prints
('Description Good\nBad, Description Valid\nInvalid')
so for this one I need it to print as though I had printed LIST[0] and LIST[1] separately.

If you want to put each number in a different cell, you could do something like:
it = 1
for num in NUMBERS:
    ThisSheet.Cells(1, it).Value = num
    it += 1
Or if you want the first 3 numbers in the same cell:
ThisSheet.Cells(1, 1).Value = ' '.join([str(num) for num in NUMBERS[:3]])
Or all of the elements in NUMBERS:
ThisSheet.Cells(1, 1).Value = ' '.join([str(num) for num in NUMBERS])
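The join step can be checked without Excel. A minimal sketch, assuming a hypothetical helper named join_for_cell (it is not part of win32com) that builds the string you would then assign to .Value:

```python
def join_for_cell(values, sep=" "):
    """Convert every item to str and join them into one cell-ready string."""
    return sep.join(str(v) for v in values)


NUMBERS = [1, 2, 3, 4, 5]
print(join_for_cell(NUMBERS[:3]))        # first three values
print(join_for_cell(NUMBERS, sep=", "))  # all values, comma separated
```

Keeping the string building separate from the COM call makes it easier to see whether a problem comes from the data or from Excel.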
EDIT
Based on your question edit: for strings containing \n, and assuming that every newline character means you want to jump to the next row:
# Split LIST[0] on the \n character
split_lst0 = LIST[0].split('\n')
# Write each line into its own row of column 1 (Cells takes the row first, then the column)
row = 1
for line in split_lst0:
    ThisSheet.Cells(row, 1).Value = line
    row += 1
If you want to do this for the whole LIST and not only for LIST[0], first merge it with the join method, using '\n' as the separator so that adjacent items do not run together, and split it right after:
joined_list = '\n'.join(LIST).split('\n')
And then iterate through it the same way as before.
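The same idea can be sketched without a workbook. This assumes LIST holds strings and uses a hypothetical helper, lines_for_rows, to flatten them into one line per row; the Cells call is left as a comment because it needs a live Excel session:

```python
def lines_for_rows(items):
    """Split every item on newlines and return one flat list of lines."""
    lines = []
    for item in items:
        lines.extend(str(item).split("\n"))
    return lines


LIST = ["Description Good\nBad", "Description Valid\nInvalid"]
for row, line in enumerate(lines_for_rows(LIST), start=1):
    # With a real worksheet this would be: ThisSheet.Cells(row, 1).Value = line
    print(row, line)
```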

Related

Turn each word into a variable with python

I have a text from which I need to delete the first two words and store the numbers in variables.
I am trying to split the words and then create a loop to store each word in a variable.
My text is: "ABA BLLO 70000000 12-2022"
So I am trying to store the numbers, which can vary depending on the data set, and create a variable for each of them.
text = "ABA BLLO 70000000 12-2022"
a = text.strip().strip("")
for a in text:
    print(a)
So I would have three variables:
number = 70000000
month = 12
year = 2022
You can use the split function to split the string on whitespace, which gives you a list of the pieces. Then you can slice the list to remove the first two elements and unpack the remaining elements into variables.
text = "ABA BLLO 70000000 12-2022"
x,y = text.split()[2:]
print(x,y)
NOTE: This works only if the input string has a fixed format.
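To get the three variables from the question (number, month, year) as integers, the split-and-unpack approach can be extended as below. This is a sketch that assumes the input always has exactly two words, one number, and one MM-YYYY date; parse_record is a hypothetical name:

```python
def parse_record(text):
    """Drop the first two words, then return (number, month, year) as ints."""
    number_part, date_part = text.split()[2:]
    month, year = date_part.split("-")
    return int(number_part), int(month), int(year)


number, month, year = parse_record("ABA BLLO 70000000 12-2022")
print(number, month, year)
```

If the input has more or fewer pieces, the unpacking raises ValueError, which is a reasonably loud way to find out that the format assumption does not hold.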
If I understand you correctly, try this code:
text = "ABA BLLO 70000000 12-2022"
counter = 0
word = []
number = []
for a in text.split(" "):
    # The first two pieces are words; everything after that is a number
    if counter <= 1:
        word.append(a)
    else:
        number.append(a)
    counter += 1
print(word)
print(number)
The output will be
['ABA', 'BLLO']
['70000000', '12-2022']
I'm not sure I follow exactly, but here's my answer:
text = "ABA BLLO 70000000 12-2022"
new_text = text.split(" ")
for i in range(2, len(new_text)):
    if i == 2:
        number = new_text[i]
    else:
        month = new_text[i].split("-")[0]
        year = new_text[i].split("-")[1]
print(f"Number: {number}\nMonth: {month}\nYear: {year}")
After using something like
tempSplit = text.split()
you get a list.
result = [s for s in tempSplit if s.isdigit()]
This keeps only the all-digit strings, which you can then convert with int(). The last element, "12-2022", is a date rather than a plain number, so it needs separate handling.
As @roganjosh suggested in the comments, it is worth working through a tutorial on the string functions. In this case, try the split function first and then learn how to pick only the numbers out of a list.
To get the date parts:
month = tempSplit[3].split("-")[0]
year = tempSplit[3].split("-")[1]

How to print a specific line in string Python?

I got a string output from a CSS selector while web crawling. The string has 7 lines; 6 of them are useless and I only want the 4th line.
The string is as follows:
کارکرد:
۵۰,۰۰۰
رنگ:
سفید
وضعیت بدنه:
بدون رنگ
قیمت صفر : ۳۱۵,۰۰۰,۰۰۰ تومان
Is there a way to print only the 4th line?
The crawling code:
color = driver.find_elements_by_css_selector("div[class='col']")
for c in color:
    print(c.text)
Yes, of course! See the Python documentation on list items.
color = driver.find_elements_by_css_selector("div[class='col']")
print(color[3].text)
List items are indexed: the first item has index 0, the second item has index 1, and so on.
I'm not sure if I understand the problem correctly, but assuming you have a string with multiple lines, a solution could be:
string = '''this string
exists
on multiple
lines
so lets pick
a line
'''
def select_line(string, line_index):
    return string.splitlines()[line_index]
result = select_line(string,3)
print(result)
This function selects the line you want (index 0 being the first line).
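If the scraped text sometimes has fewer lines than expected, indexing raises IndexError. A hedged variant of the function above, with a default value added as an assumption of this sketch; the sample string is an invented English stand-in for the scraped text:

```python
def select_line(text, line_index, default=None):
    """Return the line at line_index (0-based), or default if it does not exist."""
    lines = text.splitlines()
    if -len(lines) <= line_index < len(lines):
        return lines[line_index]
    return default


sample = "mileage:\n50,000\ncolor:\nwhite\nbody:\nno paint"
print(select_line(sample, 3))   # the 4th line
print(select_line(sample, 10))  # out of range, so the default is returned
```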
If you want items one to four, try this:
for idx in range(4):
    print(color[idx].text)
And if you want only the 4th, try this (list indexes in Python start from zero):
print(color[3].text)

How do I properly write a CSV file within a for loop in python?

I am using the following code to scrape content from a webpage, with the end goal of writing it to a CSV. On the first iteration I had this portion working, but now that my data is formatted differently, it is written in a way that gets mangled when I view it in Excel.
If I use the code below, the "heading.text" data is correctly put into one cell when viewed in Excel, whereas the contents of "child.text" are packed into one cell rather than being split on the commas. You will see I have attempted to clean up the content of "child.text" to see if that was my issue.
If I remove "heading.text" from "z" and try again, it writes in a way that has Excel showing one letter per cell. In the end I would like each comma-separated value to display in its own cell when viewed in Excel. I believe I am doing something (many things?) incorrectly in structuring "z" and/or when I write the row.
Any guidance would be greatly appreciated. Thank you.
csvwriter = csv.writer(csvfile)
for heading in All_Heading:
    driver.execute_script("return arguments[0].scrollIntoView(true);", heading)
    print("------------- " + heading.text + " -------------")
    ChildElement = heading.find_elements_by_xpath("./../div/div")
    for child in ChildElement:
        driver.execute_script("return arguments[0].scrollIntoView(true);", child)
        #print(heading.text)
        #print(child.text)
        z = (heading.text, child.text)
        print (z)
        csvwriter.writerow(z)
When I print "z" I get the following:
('Flower', 'Afghani 3.5g Pre-Pack Details\nGREEN GOLD ORGANICS\nAfghani 3.5g Pre-Pack\nIndica\nTHC: 16.2%\n1/8 oz - \n$45.00')
When I print "z" with the older code that split the string on "\n" I get the following:
('Flower', "Cherry Limeade 3.5g Flower - BeWell Details', 'BE WELL', 'Cherry Limeade 3.5g Flower - BeWell', 'Hybrid', 'THC: 18.7 mg', '1/8 oz - ', '$56.67")
csvwriter.writerow() takes an iterable; each element is written as a separate field, separated by the writer's delimiter, so each element ends up in its own cell.
First, let's look at what has been happening so far:
(heading.text, child.text) has two elements, so it produces two cells: heading.text and child.text.
(child.text) is simply child.text (it would be a tuple only if written as (child.text,)), and iterating over a string yields its individual characters. Hence each letter got its own cell.
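The difference between a bare string and a one-element tuple is easy to see with an in-memory buffer. A small sketch using only the standard library; the lineterminator argument is set just to keep the output readable:

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")

writer.writerow("abc")     # a bare string: each character becomes a cell
writer.writerow(("abc",))  # a one-element tuple: the whole string is one cell

print(buffer.getvalue())
```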
To get different cells in a row we need separate elements in our iterable, so we want an iterable like [heading.text, child.text line 1, child.text line 2, ...]. You were right to split the text into lines, but the lines weren't being added to the row correctly.
Since tuples are immutable, I'll use a list instead.
We know heading.text should take a single cell, so we can start with the following:
row = [heading.text] # this is what your z is
We want each line to be a separate element so we split child.text:
lines = child.text.split("\n")
# The text doesn’t start or end with a newline so this should suffice
Now we want each element to be added to the row separately, we can make use of the extend() method on lists:
row.extend(lines)
# e.g. a = [1, 2]; a.extend([3, 4, 5]) leaves a as [1, 2, 3, 4, 5]
Putting it together:
row = [heading.text]
lines = child.text.split("\n")
row.extend(lines)
or unpacking it in a single line:
row = [heading.text, *child.text.split("\n")] # You can also use a tuple here
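For a version that can be run without a browser, here is a sketch that wraps the row building in a hypothetical helper, build_row, and writes to an in-memory buffer. The sample text is abbreviated from the printed output in the question, and dropping empty lines and surrounding spaces is an extra assumption of this sketch:

```python
import csv
import io


def build_row(heading_text, child_text):
    """Return one CSV row: the heading, then each non-empty line of the child text."""
    lines = [line.strip() for line in child_text.split("\n") if line.strip()]
    return [heading_text, *lines]


child = "Afghani 3.5g Pre-Pack Details\nGREEN GOLD ORGANICS\nIndica\nTHC: 16.2%\n1/8 oz - \n$45.00"
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerow(build_row("Flower", child))
print(buffer.getvalue())
```

When writing to a real file in Python 3, open it with newline="" so that the csv module controls the line endings itself.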

How to find the title of a file that sits in between title tags

I have some files that contain "TITLE..." with "JOURNAL..." following directly afterward. The specific lines vary from file to file. I am trying to pull all of the information that exists between "...TITLE..." and "...JOURNAL...". So far, I am only able to pull the line that contains "TITLE", but in some files the title spills onto the next line.
I deduced that I must use a=line.find("TITLE") and b=line.find("JOURNAL"),
then set up a loop, for i in range(a,b):, which prints all of the index values from 698 to 768, but it only displays the numbers instead of the characters at those positions. How do I display the string? And how do I then clean that up so it does not display "TITLE", "JOURNAL", and the whitespace between those two and the text I need? Thanks!
This is the version that displays the single line that "TITLE" is on:
def extract_title():
    f=open("GenBank1.gb","r")
    line=f.readline()
    while line:
        line=f.readline()
        if "TITLE" in line:
            line.strip("TITLE ")
            print(line)
    f.close()
extract_title()
This is the current block that displays all of those numbers in increasing order on separate lines:
def extract_title():
    f=open("GenBank1.gb","r")
    line=f.read()
    a=line.find("TITLE")
    b=line.find("JOURNAL")
    line.strip()
    f.close()
    if "TITLE" in line and "JOURNAL" in line:
        for i in range(a,b):
            print(i)
extract_title()
Currently, I have from 698-768 displayed like:
698
699
700
etc...
I want to first get them like, 698 699 700,
then convert them to their string value
then I want to understand how to strip the white spaces and the "TITLE" and "JOURNAL" values. Thanks!
I am not sure I get what you want to achieve here, but if I understood correctly, you have a string similar to "TITLE 659 JOURNAL" and want the value in the middle? If so, you could use slice notation:
line = f.read()
a = line.find("TITLE") + 5 # find gives the index where "TITLE" starts, so add its length (5)
b = line.find("JOURNAL")
value = line[a:b]
value = value.strip() # Strip whitespace
If we now return value or print it, we get:
'659'
Similarly, if you want the value after JOURNAL, you can slice again:
idx = line.find("JOURNAL") + 7 # 7 is len("JOURNAL")
value = line[idx:] # Start after JOURNAL and go to the end of the string
You don't need the loop. Just use slicing:
line = 'fooTITLEspamJOURNAL'
start = line.find('TITLE') + 5 # 5 is len('TITLE')
end = line.find('JOURNAL')
print(line[start:end])
Output:
spam
Another option is to split:
print(line.split('TITLE')[1].split('JOURNAL')[0])
str.split() returns a list; we use indexes to get the element we want.
In slow motion:
part2 = line.split('TITLE')[1]
title = part2.split('JOURNAL')[0]
print(title)
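Since the question mentions titles that spill onto a second line, here is a sketch that combines the slicing approach with whitespace cleanup. The function name, the handling of missing tags, and the sample record are all assumptions made for illustration, not taken from a real GenBank file:

```python
def extract_between(text, start_tag="TITLE", end_tag="JOURNAL"):
    """Return the text between the first start_tag and the next end_tag,
    with runs of whitespace (including line breaks) collapsed to one space.
    Returns None if either tag is missing."""
    start = text.find(start_tag)
    if start == -1:
        return None
    start += len(start_tag)
    end = text.find(end_tag, start)
    if end == -1:
        return None
    return " ".join(text[start:end].split())


record = (
    "  TITLE     A study of something that\n"
    "            spills onto the next line\n"
    "  JOURNAL   Unpublished\n"
)
print(extract_between(record))
```

Calling split() with no arguments and then joining with a single space removes the line break and the indentation in one step. Note that str.find returns -1 when the tag is absent, which is why the sketch checks for it before slicing.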

How do I avoid errors when parsing a .csv file in python?

I'm trying to parse a .csv file that contains two columns: Ticker (the company ticker name) and Earnings (the corresponding company's earnings). When I read the file using the following code:
f = open('earnings.csv', 'r')
earnings = f.read()
The result when I run print earnings looks like this (it's a single string):
Ticker;Earnings
AAPL;52131400000
TSLA;-911214000
AMZN;583841600
I use the following code to split the string on the newline character (\n), and then split each resulting line on the semicolon character:
earnings_list = earnings.split('\n')
string_earnings = []
for string in earnings_list:
    colon_list = string.split(';')
    string_earnings.append(colon_list)
The result is a list of lists where each list contains the company's ticker at index [0] and its earnings at index [1], like so:
[['Ticker', 'Earnings\r\r'], ['AAPL', '52131400000\r\r'], ['TSLA', '-911214000\r\r'], ['AMZN', '583841600\r\r']]
Now I want to convert the earnings at index [1] of each list, which are currently strings, into integers. So I first remove the first list containing the column names:
headless_earnings = string_earnings[1:]
Afterwards I try to loop over the resulting list to convert the values at index [1] of each list into integers with the following:
numerical = []
for i in headless_earnings:
    num = int(i[1])
    numerical.append(num)
I get the following error:
num = int(i[1])
IndexError: list index out of range
How is that index out of range?
You are almost certainly mishandling the line endings.
If I try your code with this string: "Ticker;Earnings\r\r\nAAPL;52131400000\r\r\nTSLA;-911214000\r\r\nAMZN;583841600" it works.
But with this one: "Ticker;Earnings\r\r\nAAPL;52131400000\r\r\nTSLA;-911214000\r\r\nAMZN;583841600\r\r\n" it doesn't.
Explanation: when the string ends with '\n', split('\n') produces an empty string as its last item, and splitting that on ';' gives ['']. So at the end, Python tries to access [''][1], hence the error.
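The trailing empty item is easy to reproduce on a short string. A small illustration with made-up data, which also shows that str.splitlines() does not produce it:

```python
text = "AAPL;1\nTSLA;2\n"

print(text.split("\n"))   # the final "\n" leaves an empty string at the end
print(text.splitlines())  # splitlines() does not produce that trailing item

rows = [line.split(";") for line in text.split("\n")]
print(rows[-1])           # [''] has no index 1, so rows[-1][1] raises IndexError
```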
So a very simple workaround is to remove the last '\n' (if you're sure it is a '\n'; otherwise you might get surprises).
You could write this:
earnings_list = earnings[:-1].split('\n')
This will fix your error.
If you want to remove the last character only when it really is a '\n', you can write:
earnings_list = earnings[:-1].split('\n') if earnings[-1] == '\n' else earnings.split('\n')
EDIT: test code:
#!/usr/bin/env python2
earnings = "Ticker;Earnings\r\r\nAAPL;52131400000\r\r\nTSLA;-911214000\r\r\nAMZN;583841600\r\r\n"
earnings_list = earnings[:-1].split('\n') if earnings[-1] == '\n' else earnings.split('\n')
string_earnings = []
for string in earnings_list:
    colon_list = string.split(';')
    string_earnings.append(colon_list)
headless_earnings = string_earnings[1:]
#print(headless_earnings)
numerical = []
for i in headless_earnings:
    num = int(i[1])
    numerical.append(num)
print numerical
Output:
nico@ometeotl:~/temp$ ./test_script2.py
[52131400000, -911214000, 583841600]
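As an alternative to splitting by hand, the standard csv module can read the semicolon-delimited data directly. The following is a Python 3 sketch, not the code from the answer above; read_earnings is a hypothetical name, and skipping rows with fewer than two fields is an assumption about how blank lines should be treated:

```python
import csv
import io


def read_earnings(stream):
    """Return {ticker: earnings} from a ';'-delimited stream with a header row."""
    reader = csv.reader(stream, delimiter=";")
    next(reader, None)  # skip the header row
    earnings = {}
    for row in reader:
        if len(row) < 2:  # skip blank or malformed lines
            continue
        earnings[row[0]] = int(row[1].strip())
    return earnings


sample = io.StringIO("Ticker;Earnings\nAAPL;52131400000\nTSLA;-911214000\nAMZN;583841600\n\n")
print(read_earnings(sample))
```

With a real file, pass an open file object instead of the StringIO, for example open('earnings.csv', newline=''). The strip() call before int() removes any stray whitespace or carriage returns left on the value.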
