I'm new to Selenium and Python, and here is my problem:
I have a simple site with a few news items.
I am trying to write a script that iterates over all the news items, opens each one, does something, and goes back to the list.
All the news links share the same XPath and differ only in the last index. I tried to put that index in a variable and loop over all the news items, incrementing the variable after each visit:
x = len(driver.find_elements_by_class_name('cards-news-event'))
print(x)
for i in range(x):
    driver.find_element_by_xpath('/html/body/div[1]/div[1]/div/div/div/div[2]/div/div[3]/div/div[1]/div/a["'+i+'"]').click()
    # do something
    i = i+1
Python returns the error: "Expected type 'str', got 'int' instead". I have been googling it for a couple of hours but really can't figure it out.
I would appreciate any help.
You are trying to add a string and an int, which is why you get the exception. Use str(i) instead of i:
xpath_string = '/html/body/div[1]/div[1]/div/div/div/div[2]/div/div[3]/div/div[1]/div/a[{0}]'.format(str(i))
driver.find_element_by_xpath(xpath_string).click()
In the above, {0} is replaced with str(i). You can use .format to substitute multiple variables into a string by passing them as positional arguments; it is more elegant and easier to use than concatenating strings with +.
refer: http://thepythonguru.com/python-string-formatting/
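One more thing that may be worth checking, offered as a sketch rather than a tested fix: XPath positions are 1-based, so a[0] matches nothing and a loop over range(x) would stop one short of the last link. The helper name news_xpaths below is hypothetical; the path is the one from the question, with a placeholder for the index.

```python
# Path from the question; {} marks where the 1-based position goes.
BASE_XPATH = ('/html/body/div[1]/div[1]/div/div/div/div[2]'
              '/div/div[3]/div/div[1]/div/a[{}]')


def news_xpaths(count):
    """Return one XPath per news link, numbered 1..count (hypothetical helper)."""
    return [BASE_XPATH.format(position) for position in range(1, count + 1)]


for xpath in news_xpaths(3):
    print(xpath)
```

Each string can then be passed to driver.find_element_by_xpath(...) in place of the concatenated one. There is also no need to increment i by hand, since the for loop already does that.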
Related
I can't seem to get this formatting working in Python. I am trying to define a function that takes an argument of the form "[Some].[Name]".
Can anyone tell me how I can get this working? I think I have tried all combinations of ' and ", but both the [.] and the ["] in the argument seem not to work.
In the code below I am trying to pass "VWS.co" as the argument:
import yfinance as yf

def get_stock_data(Company):
    # This function defines the data to be collected.
    # Send a GET request to query Company's end-of-day stock prices in the period.
    global VWS_data
    Stock_data = yf.Ticker(Company)
    Stock_data = Stock_data.history(period="5y")
    # Print the dataframe and its summary statistics
    print(Stock_data)
    print(Stock_data.describe(include='all'))
get_stock_data("VWS.co")
Edit:
Using escape characters, get_stock_data("\"VWS.co\""), got the definition working. However, something is still wrong. When I run the script it still only works with "VWS.co" hard-coded in the function. See the code below: VWS_data works, VWS_data_with_arg does not. Am I missing something really obvious here?
def get_stock_data(Company):
    Stock_data = yf.Ticker("VWS.co")
    Stock_data_with_arg = yf.Ticker(Company)
    VWS_data = Stock_data.history(period="5y")
    VWS_data_with_arg = Stock_data_with_arg.history(period="5y")
    print(VWS_data) # This returns the expected values
    print(VWS_data_with_arg) # This returns an empty dataset
get_stock_data("\"VWS.co\"")
You should use escape characters.
get_stock_data("\"VWS.co\"")
You can use raw strings
get_stock_data(r'"VWS.co"')
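For what it's worth, here is a quick check of what these two spellings actually pass to the function. Both produce the same 8-character string whose first and last characters are literal double quotes, which is a different value from the 6-character "VWS.co". That may be why the lookup in the edit above comes back empty, although that is an assumption about how the ticker symbol is handled downstream.

```python
plain = "VWS.co"          # the symbol on its own
escaped = "\"VWS.co\""    # backslash-escaped quotes inside the string
raw = r'"VWS.co"'         # raw string delimited by single quotes

print(escaped == raw)             # True: two spellings of the same value
print(len(plain), len(escaped))   # 6 8
print(escaped == plain)           # False: the quotes are part of the data
print(escaped.strip('"') == plain)  # True once the quotes are removed
```

If the quotes are not meant to be part of the symbol, passing plain "VWS.co" is the simpler call.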
I am trying to take the value from the input and put it into the browser.find_elements_by_xpath("//div[@class='v1Nh3 kIKUG _bz0w']") call. However, the string formatting doesn't work because the call returns a list, so it throws an AttributeError.
Does anyone know any alternatives to use with lists (possibly without iterating over each element)?
xpath_to_links = input('Enter the xpath to links: ')
posts = browser.find_elements_by_xpath("//div[@class='{}']").format(devops)
AttributeError: 'list' object has no attribute 'format'
It looks like the error occurs because the format call is in the wrong place: instead of operating on the string "//div[@class='{}']", it is called on the list returned by find_elements_by_xpath. Could you try replacing your code with one of the following lines?
posts = browser.find_elements_by_xpath("//div[@class='{}']".format(devops))
posts = browser.find_elements_by_xpath(f"//div[@class='{devops}']")
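The same idea can be checked without a browser. This is only a sketch: class_xpath is a hypothetical helper, and the class value is the one quoted in the question. It builds the XPath string first, written with @class (the XPath attribute syntax), and then shows that calling .format on a list raises the error from the question.

```python
def class_xpath(class_name):
    """Format the string first; the result is what the driver should receive."""
    return "//div[@class='{}']".format(class_name)


xpath = class_xpath("v1Nh3 kIKUG _bz0w")
print(xpath)

# Calling .format on a list (what find_elements_by_xpath returns)
# raises the AttributeError reported in the question.
try:
    [].format("anything")
except AttributeError as exc:
    print(exc)
```

With the string built this way, the call becomes browser.find_elements_by_xpath(xpath).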
I'm trying to use the "ls" Python command in Maya to list objects whose names match a given string combined with a wildcard.
Simple sample code like this:
from maya.cmds import *
list = ls('mesh*')
This code works and will return a list of objects with the matching string in the name, however, I would like to use a variable instead of hard coding in the string. More like this:
from maya.cmds import *
name = 'mesh'
list = ls('name*')
OR like this:
from maya.cmds import *
name = 'mesh'
list = ls('name' + '*')
However, both examples return an empty list, unlike the first. I'm not sure why, because the string concatenation should come out to 'mesh*' as in the first example. I couldn't find an answer on this website, so I chose to ask a question.
Thank you.
JD
PS. If there is a better way to query for objects in maya, let me know what it's called and I'll do some research into what that is. At the moment, this is the only way I know of how to search for objects in maya.
As soon as you add quotes around your variable name like this 'name', you are actually just creating a new string instead of referring to the variable.
There are many different ways to concatenate a string in Python to achieve what you want:
Using %:
'%s*' % name
Using the string's format method:
'{}*'.format(name)
Simply using +:
name + '*'
All of these will yield the same output, 'mesh*', and will work with cmds.ls
Personally I stick with format, and this page demonstrates a lot of reasons why.
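A small check that runs outside Maya, as a sketch: it only builds the pattern strings and compares them, so no maya.cmds call is involved. The f-string line is an addition that assumes Python 3.6 or newer, which older Maya versions do not ship with.

```python
name = 'mesh'

patterns = [
    '%s*' % name,        # % operator
    '{}*'.format(name),  # str.format
    name + '*',          # plain concatenation
    f'{name}*',          # f-string (Python 3.6+ only)
]
print(patterns)

# Quoting the variable name produces a different, literal string.
print('name*' == name + '*')  # False
```

Any of the first three strings can be passed to cmds.ls in place of the hard-coded 'mesh*'.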
I want to crawl a webpage for some information. What I've done so far works, but I need to make a request to another URL on the same website. I'm trying to format that URL but it's not working. This is what I have so far:
import requests
from lxml import html
name = input("> ")
page = requests.get("http://www.mobafire.com/league-of-legends/champions")
tree = html.fromstring(page.content)
# champ_list (the list of champion names) is defined elsewhere in the script
for index, champ in enumerate(champ_list):
    if name == champ:
        y = tree.xpath(".//*[@id='browse-build']/a[{}]/@href".format(index + 1))
        print(y)
        guide = requests.get("http://www.mobafire.com{}".format(y))
        builds = html.fromstring(guide.content)
        print(builds)
        for title in builds.xpath(".//table[@class='browse-table']/tr[2]/td[2]/div[1]/a/text()"):
            print(title)
The user enters a name; if the name matches one from a list (champ_list), the script prints a URL, formats it into the request for the guide variable, and fetches more information from there. But I'm getting errors such as "invalid IPv6".
This is the output URL (one of them, but they're all similar): ['/league-of-legends/champion/ivern-133']
I tried slicing but it doesn't do anything; probably I'm using it wrong or it doesn't work in this case. I tried replace as well, but it doesn't work on lists. I tried using it as:
y = [y.replace("'", "") for y in y] to see if it at least removed the quotes, but that didn't work either. What other approach would format this properly?
I take it y is the list you want to insert into the string?
Try this:
"http://www.mobafire.com{}".format('/'.join(y))
Edit: Just for clarification I am using python, and would like to do this within python.
I am in the middle of collecting data for a research project at our university. Basically I need to scrape a lot of information from a website that monitors the European Parliament. Here is an example of what the URL of one page looks like:
http://www.europarl.europa.eu/sides/getDoc.do?type=REPORT&mode=XML&reference=A7-2010-0190&language=EN
The parts after "reference=" in the address refer to:
A7 = Parliament in session (previous parliaments are A6 etc.),
2010 = year,
0190 = number of the file.
What I want to do is to create a variable that has all the urls for different parliaments, so I can loop over this variable and scrape the information from the websites.
P.S: I have tried this:
number = range(1,190,1)
for i in number:
    search_url = "http://www.europarl.europa.eu/sides/getDoc.do?type=REPORT&mode=XML&reference=A7-2010-" + str(number[i]) +"&language=EN"
    results = search_url
    print results
but this gives me the following error:
Traceback (most recent call last):
File "", line 7, in
IndexError: list index out of range
If I understand correctly, you just want to be able to loop over the parliaments?
i.e. you want A7, A6, A5...?
If that's what you want, a simple loop could handle it:
for p in xrange(7,0, -1):
    parliment = "A%d" % p
    print parliment
For the other values, similar loops would work just as well:
for year in xrange(2010, 2000, -1):
    print year
for filenum in xrange(100,200):
    fnum = "%.4d" % filenum
    print fnum
You could easily nest your loops in the proper order to generate the combination(s) you need. HTH!
Edit:
String formatting is super useful, and here's how you can do it with your example:
# Just create a string with the format specifier in it: %.4d - a [d]ecimal with a
# precision/width of 4 - so instead of 3 you'll get 0003
search_url = "http://www.europarl.europa.eu/sides/getDoc.do?type=REPORT&mode=XML&reference=A7-2010-%.4d&language=EN"
# xrange produces its values lazily instead of building a full list,
# and you can iterate over it just like a collection.
# 1 is the default step, so no need for it in this case
for number in xrange(1,190):
    print search_url % number
String formatting takes a string with a variety of specifiers - you'll recognize them because they have % in them - followed by % and a tuple containing the arguments to the format string.
If you want to add the year and parliament, change the string to this:
search_url = "http://www.europarl.europa.eu/sides/getDoc.do?type=REPORT&mode=XML&reference=A%d-%d-%.4d&language=EN"
where the important changes are here:
reference=A%d-%d-%.4d&language=EN
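That means you'll need to pass three integers. Each %d expects a number, so pass the parliament number itself (p from the loop above, e.g. 7), not the "A7" string:
print search_url % (p, year, number)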
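Putting the pieces together, here is a sketch in Python 3 syntax (range and print() instead of xrange and the print statement). The name build_urls is hypothetical, and the ranges passed in are only examples: which parliament, year, and file-number combinations actually exist on the site is not something this code knows.

```python
URL_TEMPLATE = (
    "http://www.europarl.europa.eu/sides/getDoc.do"
    "?type=REPORT&mode=XML&reference=A%d-%d-%.4d&language=EN"
)


def build_urls(parliaments, years, numbers):
    """Return one URL per (parliament, year, file number) combination."""
    return [
        URL_TEMPLATE % (parliament, year, number)
        for parliament in parliaments
        for year in years
        for number in numbers
    ]


# Example: parliament 7, year 2010, file numbers 1 through 189.
urls = build_urls([7], [2010], range(1, 190))
print(len(urls))
print(urls[0])
```

The resulting list can be looped over by whatever does the scraping; combinations that do not exist on the site would still need to be handled when the request is made.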
Can you use python and wget ? Loop through the sessions that exist, and create a string to give to wget? Or is that overkill?
Sorry I can't give this as a comment, but I don't have a high enough score yet.
Looking at the code you quoted in the comment above, your problem is you are trying to add a string and an integer. While some languages will do on the fly conversion (useful when it works but confusing when it doesn't), you have to explicitly convert it with str().
It should be something like:
"http://firstpartofurl" + str(number[i]) + "restofurl"
Or you can use string formatting (using % etc., as in Wayne's answer).
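As a side note on the IndexError itself, here is a sketch in Python 3 syntax. The loop in the question uses each value of the range as an index into the same range, so it eventually asks for position 189 in a sequence whose last valid index is 188. That reading is an assumption based on the traceback shown. Using the loop value directly avoids the lookup altogether.

```python
numbers = list(range(1, 190))  # 189 values, valid indexes 0..188

# Indexing with the last value (189) is one past the end of the list.
try:
    numbers[189]
except IndexError as exc:
    print("IndexError:", exc)

# Iterate over the values themselves instead of indexing with them.
references = ["A7-2010-%.4d" % n for n in numbers]
print(references[0], references[-1])
```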
Use Selenium. Since it controls a real browser, it can handle sites that use complex JavaScript. Many language bindings are available, including Python.