Python Selenium: extract numeric string, convert to int and find median

I'm working on extracting the value (string) of an object from the site, converting it to an int, and finding the median.
prara = WebDriverWait(driver, 20).until(EC.presence_of_all_elements_located((By.CLASS_NAME, 'price-value')))
y = []
for j in prara:
    print(j)
    o = j.text
    s = o.replace(",", "")
    y += s
prval = statistics.median(y)
print(prval)
>>>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="c39b2f69-4bc9-4ebe-87c4-afa3f1dcaeee")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="f5c2372b-6592-418f-8d31-fe266332cc6d")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="ad22afa5-f29d-444d-a5a0-6c8efe558e31")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="89196683-0c7d-4f0e-8c6c-30218b5e2f07")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="f4153c2a-3a23-4baa-9f13-4bc3e21e1073")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="9b7c1d8a-4420-44d8-977b-b6bf00a4b8d3")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="8d8ab946-832e-4865-9421-3ace5facbb99")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="d5403964-b12f-4506-ab51-6c1f2d2ea2e8")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="89ebdb12-7436-4407-a930-54f124d42d06")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="c314426c-d94e-461d-b526-597532812a1b")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="fcb717fe-a6f8-4aa1-9640-4651fd0efdf6")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="9880ed2a-d09c-48fc-9ffd-aac9792324a2")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="e4252dee-ae4d-42a8-b289-42dac1fc6af3")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="cda7c9a2-5957-4861-aead-19f9d643b53b")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="ae9d1f9c-65e0-44a0-9df2-73e2a4171ea0")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="3fda6be7-7338-4f65-858e-a9a266e6219e")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="a84f64cd-8397-424b-9f9c-f8987cfbc26b")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="c0097638-060e-479c-b8a9-8694c498459b")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="e375ffc4-ef6f-4c8f-aa09-398fa1f2efa5")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="481bf4e6-54f9-4999-8727-1f75a9dac033")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="1d65f8d3-0448-4d67-aff3-2f55ee83071b")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="7c5bc5f3-be42-49ce-83ad-a3acfd3e5437")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="5f2e1114-4907-4a0e-b9f6-ba748211f4a7")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="927b6a96-2d55-4f20-9a0c-602aede912b3")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="f1028ec1-419c-4c20-829b-ee976bad5e6a")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="4b57a9f4-b0e3-487d-bb56-9901ef9901c8")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="37fa14b1-1014-4b61-82f8-10113630028d")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="287b4979-bc59-42c2-a549-be328a2116fd")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="5d3eb54a-fbe7-4d59-b71b-ee43c2ec3500")>
<selenium.webdriver.remote.webelement.WebElement (session="a0ddfb9d3848df719d38dac2458be774", element="206ae32d-2b68-4915-8839-32a978e3e040")>
2
I tried to extract the value, but I got the wrong number 2. Is there a way to extract a number and find the median?

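The following code works. The problem was y += s: using += with a string extends the list with each individual character, so statistics.median was working on single digit characters rather than on the prices. Appending float(s) stores one number per element instead:
prara = WebDriverWait(driver, 20).until(EC.presence_of_all_elements_located((By.CLASS_NAME, 'price-value')))
y = []
for j in prara:
    print(j.text)
    o = j.text
    s = o.replace(",", "")
    y.append(float(s))
prval = statistics.median(y)
print(prval)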
It's returning the median as 109000.0
Let me know if something is unclear.
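To see the difference without a browser, here is a minimal sketch that runs both variants on plain strings. The values in price_texts are made up for illustration and stand in for the element.text values; they are not taken from the site.

```python
import statistics

# Hypothetical price texts, standing in for the element.text values.
price_texts = ["98,500", "109,000", "125,000"]

# What the question's loop does: += on a list extends it with each character.
chars = []
for text in price_texts:
    chars += text.replace(",", "")
print(chars[:5])  # ['9', '8', '5', '0', '0']

# What the answer does: append one number per element.
prices = []
for text in price_texts:
    prices.append(float(text.replace(",", "")))
median_price = statistics.median(prices)
print(median_price)  # 109000.0
```

If the prices are always whole numbers, int(s) can be used in place of float(s).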

Related

Python problem with downloading links from www

I have a problem with my code and I admit I don't understand what is happening. The page I am downloading from has 13 links. Everything is fine up to self.img = driver.find_element(By.XPATH, self.link_photo), but after 5 results are displayed it stops working. Something strange seems to happen at self.imgURL = self.img.get_attribute('srcset'). Can someone help me fix this code? Thank you very much for your help.
# Download a link of each photo
self.table_of_mini_photo = []
for num, i in enumerate(range(1, int(self.total)+20)):
    self.link_photo = f'//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[{i}]/a/div/div/div[1]/div[1]/div/img'
    # //*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[2]/a/div/div/div[1]/div[1]/div/img
    # //*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[3]/a/div/div/div[1]/div[1]/div/img
    # //*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[4]/a/div/div/div[1]/div[1]/div/img
    # //*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[6]/a/div/div/div[1]/div[1]/div/img
    # //*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[7]/a/div/div/div[1]/div[1]/div/img
    # //*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[8]/a/div/div/div[1]/div[1]/div/img
    # //*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[10]/a/div/div/div[1]/div[1]/div/img
    # //*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[12]/a/div/div/div[1]/div[1]/div/img
    # //*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[17]/a/div/div/div[1]/div[1]/div/img
    # //*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[15]/a/div/div/div[1]/div[1]/div/img
    # //*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[16]/a/div/div/div[1]/div[1]/div/img
    # //*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[15]/a/div/div/div[1]/div[1]/div/img
    try:
        print(self.link_photo)
        self.img = driver.find_element(By.XPATH, self.link_photo)
        print(self.img)
        self.imgURL = self.img.get_attribute('srcset')
        print(self.imgURL)
        self.table_of_mini_photo.append(self.imgURL)
    except:
        pass
The console output confirms that I get 13 WebElement objects (that is, links), but in practice only 5 srcset values are printed. The XPaths I am trying to fetch are listed in the comments in the code.
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[1]/a/div/div/div[1]/div[1]/div/img
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[2]/a/div/div/div[1]/div[1]/div/img
<selenium.webdriver.remote.webelement.WebElement (session="8cb74bec440fc1d3621f9a5a995c27d6", element="671dace6-a8c2-4204-a164-1266b9385073")>
https://ireland.apollo.olxcdn.com:443/v1/files/9li55l5fl99q3-PL/image;s=100x0;q=50 100w,
https://ireland.apollo.olxcdn.com:443/v1/files/9li55l5fl99q3-PL/image;s=200x0;q=50 200w,
https://ireland.apollo.olxcdn.com:443/v1/files/9li55l5fl99q3-PL/image;s=300x0;q=50 300w,
https://ireland.apollo.olxcdn.com:443/v1/files/9li55l5fl99q3-PL/image;s=400x0;q=50 400w,
https://ireland.apollo.olxcdn.com:443/v1/files/9li55l5fl99q3-PL/image;s=600x0;q=50 600w
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[3]/a/div/div/div[1]/div[1]/div/img
<selenium.webdriver.remote.webelement.WebElement (session="8cb74bec440fc1d3621f9a5a995c27d6", element="431518f9-0ee5-4317-a200-39bd7840939e")>
https://ireland.apollo.olxcdn.com:443/v1/files/mbbgmgrg6z343-PL/image;s=100x0;q=50 100w,
https://ireland.apollo.olxcdn.com:443/v1/files/mbbgmgrg6z343-PL/image;s=200x0;q=50 200w,
https://ireland.apollo.olxcdn.com:443/v1/files/mbbgmgrg6z343-PL/image;s=300x0;q=50 300w,
https://ireland.apollo.olxcdn.com:443/v1/files/mbbgmgrg6z343-PL/image;s=400x0;q=50 400w,
https://ireland.apollo.olxcdn.com:443/v1/files/mbbgmgrg6z343-PL/image;s=600x0;q=50 600w
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[4]/a/div/div/div[1]/div[1]/div/img
<selenium.webdriver.remote.webelement.WebElement (session="8cb74bec440fc1d3621f9a5a995c27d6", element="07505827-b5fb-4af1-92f5-5391fdbf12d2")>
https://ireland.apollo.olxcdn.com:443/v1/files/w53tb2o5ili71-PL/image;s=100x0;q=50 100w,
https://ireland.apollo.olxcdn.com:443/v1/files/w53tb2o5ili71-PL/image;s=200x0;q=50 200w,
https://ireland.apollo.olxcdn.com:443/v1/files/w53tb2o5ili71-PL/image;s=300x0;q=50 300w,
https://ireland.apollo.olxcdn.com:443/v1/files/w53tb2o5ili71-PL/image;s=400x0;q=50 400w,
https://ireland.apollo.olxcdn.com:443/v1/files/w53tb2o5ili71-PL/image;s=600x0;q=50 600w
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[5]/a/div/div/div[1]/div[1]/div/img
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[6]/a/div/div/div[1]/div[1]/div/img
<selenium.webdriver.remote.webelement.WebElement (session="8cb74bec440fc1d3621f9a5a995c27d6", element="0d7a591e-e318-41c6-9e2f-2519bbaeb212")>
https://ireland.apollo.olxcdn.com:443/v1/files/yjsjx3lv5hnx-PL/image;s=100x0;q=50 100w,
https://ireland.apollo.olxcdn.com:443/v1/files/yjsjx3lv5hnx-PL/image;s=200x0;q=50 200w,
https://ireland.apollo.olxcdn.com:443/v1/files/yjsjx3lv5hnx-PL/image;s=300x0;q=50 300w,
https://ireland.apollo.olxcdn.com:443/v1/files/yjsjx3lv5hnx-PL/image;s=400x0;q=50 400w,
https://ireland.apollo.olxcdn.com:443/v1/files/yjsjx3lv5hnx-PL/image;s=600x0;q=50 600w
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[7]/a/div/div/div[1]/div[1]/div/img
<selenium.webdriver.remote.webelement.WebElement (session="8cb74bec440fc1d3621f9a5a995c27d6", element="79ea57a4-9bd5-4213-93c8-acceb60094a0")>
https://ireland.apollo.olxcdn.com:443/v1/files/5gs5bqemn8wr-PL/image;s=100x0;q=50 100w,
https://ireland.apollo.olxcdn.com:443/v1/files/5gs5bqemn8wr-PL/image;s=200x0;q=50 200w,
https://ireland.apollo.olxcdn.com:443/v1/files/5gs5bqemn8wr-PL/image;s=300x0;q=50 300w,
https://ireland.apollo.olxcdn.com:443/v1/files/5gs5bqemn8wr-PL/image;s=400x0;q=50 400w,
https://ireland.apollo.olxcdn.com:443/v1/files/5gs5bqemn8wr-PL/image;s=600x0;q=50 600w
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[8]/a/div/div/div[1]/div[1]/div/img
<selenium.webdriver.remote.webelement.WebElement (session="8cb74bec440fc1d3621f9a5a995c27d6", element="d4c89d12-0b89-42bd-8984-8ca1be8f106e")>
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[10]/a/div/div/div[1]/div[1]/div/img
<selenium.webdriver.remote.webelement.WebElement (session="8cb74bec440fc1d3621f9a5a995c27d6", element="5953e136-712e-43d4-a36f-55ddcc460594")>
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[11]/a/div/div/div[1]/div[1]/div/img
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[12]/a/div/div/div[1]/div[1]/div/img
<selenium.webdriver.remote.webelement.WebElement (session="8cb74bec440fc1d3621f9a5a995c27d6", element="80e8fd03-4fe2-4797-9b4b-1228f8f6a706")>
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[13]/a/div/div/div[1]/div[1]/div/img
<selenium.webdriver.remote.webelement.WebElement (session="8cb74bec440fc1d3621f9a5a995c27d6", element="2b0d5ded-6480-4c43-87c2-ef9eece4f6e4")>
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[14]/a/div/div/div[1]/div[1]/div/img
<selenium.webdriver.remote.webelement.WebElement (session="8cb74bec440fc1d3621f9a5a995c27d6", element="f480c79e-2b10-4379-8b60-26377d36e936")>
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[15]/a/div/div/div[1]/div[1]/div/img
<selenium.webdriver.remote.webelement.WebElement (session="8cb74bec440fc1d3621f9a5a995c27d6", element="a04b82e4-7021-47d5-8746-f08765fe7b17")>
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[16]/a/div/div/div[1]/div[1]/div/img
<selenium.webdriver.remote.webelement.WebElement (session="8cb74bec440fc1d3621f9a5a995c27d6", element="3cc93e75-aea9-4366-a1eb-c240fee343ca")>
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[17]/a/div/div/div[1]/div[1]/div/img
<selenium.webdriver.remote.webelement.WebElement (session="8cb74bec440fc1d3621f9a5a995c27d6", element="57e77c14-1ef4-4732-bc2f-b938b1841820")>
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[18]/a/div/div/div[1]/div[1]/div/img
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[19]/a/div/div/div[1]/div[1]/div/img
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[20]/a/div/div/div[1]/div[1]/div/img
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[21]/a/div/div/div[1]/div[1]/div/img
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[22]/a/div/div/div[1]/div[1]/div/img
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[23]/a/div/div/div[1]/div[1]/div/img
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[24]/a/div/div/div[1]/div[1]/div/img
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[25]/a/div/div/div[1]/div[1]/div/img
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[26]/a/div/div/div[1]/div[1]/div/img
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[27]/a/div/div/div[1]/div[1]/div/img
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[28]/a/div/div/div[1]/div[1]/div/img
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[29]/a/div/div/div[1]/div[1]/div/img
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[30]/a/div/div/div[1]/div[1]/div/img
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[31]/a/div/div/div[1]/div[1]/div/img
//*[@id="root"]/div[1]/div[2]/form/div[5]/div/div[2]/div[32]/a/div/div/div[1]/div[1]/div/img
I prefer the 911 Turbo. :)
I found a simple CSS selector that retrieves all images.
div[type='list'] > div > img
Also, you are pulling srcset which is a bunch of different sizes of the same image. If all you want is an image, you can just pull src and get a single image so you don't have to parse the string, etc.
self.imgURL = self.img.get_attribute('src')
The updated code loops through only the retrieved images, grabs and prints the 'src', and then appends it to the list.
# Download a link of each photo
self.table_of_mini_photo = []
for img in driver.find_elements(By.CSS_SELECTOR, "div[type='list'] > div > img"):
    imgURL = img.get_attribute('src')
    print(imgURL)
    self.table_of_mini_photo.append(imgURL)
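If the srcset value is needed after all (for example, to pick the largest size), a minimal sketch of the parsing might look like the following. parse_srcset is a hypothetical helper, not part of Selenium. It assumes entries are comma-separated and that the URLs themselves contain no commas, which holds for the values printed in the question but not for every srcset.

```python
def parse_srcset(srcset):
    """Split a srcset string into (url, width) pairs.

    Assumes entries are comma-separated and URLs contain no commas.
    """
    candidates = []
    for entry in srcset.split(","):
        parts = entry.split()
        if not parts:
            continue  # skip empty pieces, e.g. after a trailing comma
        url = parts[0]
        width = None
        # A width descriptor looks like "300w"; anything else is ignored here.
        if len(parts) > 1 and parts[1].endswith("w") and parts[1][:-1].isdigit():
            width = int(parts[1][:-1])
        candidates.append((url, width))
    return candidates


# Sample value copied from the console output in the question (first three sizes).
sample = (
    "https://ireland.apollo.olxcdn.com:443/v1/files/9li55l5fl99q3-PL/image;s=100x0;q=50 100w, "
    "https://ireland.apollo.olxcdn.com:443/v1/files/9li55l5fl99q3-PL/image;s=200x0;q=50 200w, "
    "https://ireland.apollo.olxcdn.com:443/v1/files/9li55l5fl99q3-PL/image;s=300x0;q=50 300w"
)

candidates = parse_srcset(sample)
# Pick the entry with the largest declared width (entries without a width count as 0).
largest_url, largest_width = max(candidates, key=lambda c: c[1] or 0)
print(largest_url, largest_width)
```

In the loop above, the string returned by img.get_attribute('srcset') would take the place of sample.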

Get the src value from a list of webelements with Selenium

Hello, I am trying to adapt a solution from this video:
# scroll 100 times to reveal as many images as possible (how can we be sure we have reached the end?)
n_scrolls = 100
for i in range(1, n_scrolls):
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(5)
# get all the img tags, then their src attributes (I don't fully understand this way of assigning the elements)
images = driver.find_elements_by_tag_name('img')
print(images)
images = [images.get_attribute('src') for img in images]
But the output is:
[<selenium.webdriver.remote.webelement.WebElement (session="50f35cc8388f3b59df4042bcc09c7079", element="4a4b2838-67d6-4787-a168-9e25e948a21a")>, <selenium.webdriver.remote.webelement.WebElement (session="50f35cc8388f3b59df4042bcc09c7079", element="7e3ae8f5-160a-4da6-b3a7-2be10bad8f3f")>, <selenium.webdriver.remote.webelement.WebElement (session="50f35cc8388f3b59df4042bcc09c7079", element="8c45c421-3f25-4498-85a2-565506835984")>, <selenium.webdriver.remote.webelement.WebElement (session="50f35cc8388f3b59df4042bcc09c7079", element="5ef69be7-13d7-41f1-9ec8-d8ac6e843fdd")>, <selenium.webdriver.remote.webelement.WebElement (session="50f35cc8388f3b59df4042bcc09c7079", element="626ae474-bccc-40c3-ac60-5b5f021b7bf0")>, <selenium.webdriver.remote.webelement.WebElement (session="50f35cc8388f3b59df4042bcc09c7079", element="ea0b589d-8e90-470a-ad59-a661f8d52b31")>
Instead of the img tag element, is it possible to get the src attributes?
I can see the mistake you made.
Instead of this:
images = [images.get_attribute('src') for img in images]
it should be:
images = [img.get_attribute('src') for img in images]
because img is the loop variable that holds each element, whereas images is the whole list and has no get_attribute method.
Now print the list and you will get all the src values.
print(images)
Because images is a list of <img> elements, you need to extract the src attribute from each individual img while iterating over the WebElements in the for loop. You may also want to avoid overwriting the same list and build a separate list instead. So effectively, your code will be:
images = driver.find_elements_by_tag_name('img')
print(images)
image_src_list = [img.get_attribute('src') for img in images]
print(image_src_list)
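The comment in the question also asks how to know when the end of the page has been reached. One common approach, sketched here under assumptions, is to stop as soon as document.body.scrollHeight no longer grows after a scroll. scroll_to_bottom and its parameters are hypothetical names; only execute_script comes from Selenium, and pages that load content slowly may need a longer pause.

```python
import time


def scroll_to_bottom(driver, pause=2.0, max_scrolls=100):
    """Scroll until the page height stops growing or max_scrolls is reached.

    Returns the number of scrolls that were performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for n in range(max_scrolls):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give lazily loaded content time to appear
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            return n + 1  # nothing new was loaded, so assume this is the end
        last_height = new_height
    return max_scrolls
```

Calling scroll_to_bottom(driver) before collecting the img elements would replace the fixed loop of 100 scrolls.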

I am trying to click on each link using selenium and python

Below is my code:
from selenium import webdriver
import time
from selenium.webdriver.support.ui import WebDriverWait as wait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
xpath = '//ol[@class="ranking"]'
# raw string so that the backslashes in the Windows path are not treated as escapes
driver = webdriver.Ie(r"C:\Users\test\Downloads\IEDriverServer_Win32_3.0.0\IEDriverServer.exe")
driver.get('https://www.havocscope.com/country-profile/')
state = [states.find_elements_by_css_selector('li a') for states in driver.find_elements_by_xpath(xpath)]
print(state)
The output of the above code is below.
[[<selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="3cc4a03b-9786-4fc6-9698-012f0665bba3")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="5d323e16-451a-4e10-a514-312770ff84bc")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="da69e2ea-4a48-468f-951f-85d6df7d6888")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="9a37d096-dea9-4f76-9680-e3166abebcae")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="8d666290-67ed-4290-9fde-9d4e166a5400")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="7b3ba305-1f4e-4723-b898-ef8df1b21e1c")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="8604ff20-3bcf-4df3-a66c-57a22c263b13")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="4f20ea21-942a-43cb-8b72-a20fbd0792a9")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="4d54c7aa-fe98-4b5c-8637-01fe48f196f2")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="88a684ed-a49b-44b9-8f34-100c8356a335")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="a2eb4247-8898-4d44-9a83-d260ee4d3385")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="af760bd6-f413-4086-ad83-bcbe03cb4956")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="d47dcb8d-8228-4908-831e-c64aad15f0c2")>, <selenium.webdriver.remote.webelement.WebElement 
(session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="9f2a7747-99ad-4c6c-83bc-6dbd6dad1690")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="86b40883-f289-4a51-af8e-36eddfdac163")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="726f6fec-f93e-4da7-8f18-d23c1c2bc710")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="fb460c1b-a7fb-4dcd-9bd1-ead6bea63a22")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="dfa5b06a-18a6-4317-ba37-dded6e333b83")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="50fd97ad-802c-4956-a535-971b29cc564d")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="ac2f8180-0c6b-4f75-b141-ec899556bd47")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="940e5f2b-2367-45b7-ab57-36c684e6ab7c")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="0eb98392-9955-401e-8b8c-afca5d19b630")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="8cf3e4a8-c497-4e44-b1ec-0208f2a28cd8")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="7f9dbd8d-a4cc-40a5-adbb-459ec52d60df")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="da237e2e-ab69-4f49-b16e-ae32e2b9f413")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="0a6b7174-904c-4429-8774-5234860d2537")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", 
element="29dae3bb-bc9f-4df4-8af3-5cb743535777")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="405f14b6-8893-4ed6-8190-656d0e0b4a1d")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="79e30290-2d14-4cde-95df-735d34a79d92")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="91684d86-0f40-485c-a898-ab7d9e30ee79")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="42d22a9e-5243-44ef-b22f-572a69855de6")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="f7e3319c-3f47-4071-9c14-a55a8a449018")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="8b61ca05-0d11-4441-bd06-fa890b05c5e1")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="e0f9fb4c-9eec-4f5b-9e42-ef1088d01530")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="ac9b04d3-89d1-416a-bcb6-1a93887660e9")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="4140a5b4-36b3-452b-b982-c32b4a11a93e")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="0baa6ce4-cbf9-4e0d-9e46-f4822c0887e9")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="d0c9ef01-5000-402e-9d21-5229d73b5d6e")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="42702842-e3ab-4003-ad27-375aa41cd217")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="013c29bd-cfa2-47e6-9f00-5e5f94d80a4f")>, 
<selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="50e0b8f6-d08d-411c-be4e-6e5520537e29")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="47f44233-90f5-40d7-841b-af854a1d15c5")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="d401447c-ff98-46c8-bf90-6cc4a2025c43")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="c2968ca7-0a7c-48f9-9a15-d0205d82d3ff")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="344f4af6-3a00-4d5d-8beb-94b9738ff460")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="8cdfe780-d6a6-47f3-86ca-4781e24bbfae")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="c028291b-defc-4468-862b-95f1dbb5424c")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="6a7d969b-8470-46cb-8492-d5e1636b607d")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="a9d40a2b-a0cc-44a8-844d-bb748fd0ed4f")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="505685b3-2aeb-4868-866e-5eaf80646a71")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="7427c0c4-d429-4fcc-830e-2f6fec1be49c")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="6d5ac707-9ff4-4d2c-817a-c0f6621cc183")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="47b8ef86-3684-46e9-89a1-fe1459020569")>, <selenium.webdriver.remote.webelement.WebElement 
(session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="436b0a8e-fb5b-426f-9681-7d97c66550f0")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="aaf68e53-78e1-4729-8ccc-e0c5f377f3db")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="11b1eec1-6452-4659-8b55-1cd89f613c95")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="5082275b-acd7-47e8-aef0-d6e0c621a636")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="c4a40605-fa6b-4b29-a6e8-cc54dc300d94")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="5310c7dd-ca10-4225-b2d3-63398b36d54f")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="acbb0b10-5de8-49a0-a486-dbbdd3d5653c")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="ed203cf1-1614-4374-b753-1895252dbc8e")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="3899983b-676c-4ba6-9871-8b279019ce67")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="92040605-22c1-4650-a0c6-e6c1710e0205")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="d0e36a46-a49e-4e4a-98dc-3e4584fb0a6a")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="5cdc85a0-c698-4af4-90c3-0866ec7b757a")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="6742d9e2-cf95-4c99-b90c-ddeb04ffe4e6")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", 
element="4b769a03-87b5-4c82-90e6-65d6a698953e")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="ed41eeef-5ba6-4964-83ad-4e09657aabb9")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="97c07b5a-77df-4342-9293-4f30cda0274a")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="ed9898fc-2baa-4995-8d10-6e72af6fd7ac")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="ab8fe5a8-2867-48d1-ad8b-a51b103129e3")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="f1b10c0b-fe49-4bc1-9025-9e3e7c5bd603")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="8a65e6f5-1cb0-4457-849b-60eaabc31968")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="19a08c63-cbcc-4fed-b628-18375d332dd1")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="b7942bf7-141a-4875-85d2-253c99534b4b")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="ffb5f44d-b2ea-41a9-8c2f-b43c793e821a")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="8889aeb1-21bd-45d0-adea-bbcef8544e9a")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="673f5ff1-8d3d-4feb-98f0-0390abdbb715")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="86306a74-c512-4d2e-ac60-11996e3a0e42")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="1b2df781-70e0-4123-81c8-0b86ec8f2c12")>, 
<selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="8099ce05-0b81-47d3-9f8e-48f18cc300c0")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="b192a83a-ec5d-40cf-9027-95cb11a15f4c")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="829221a9-2504-4ee2-876e-758b0b7e87e2")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="bd24c85d-f3b8-461e-9bd4-228231ffe3bd")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="8b6b5ee3-fa9b-4075-95a6-acc622f6cfc5")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="7b869364-ca2d-40f4-aa8d-5d116c04e005")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="81f8ee44-2009-4a46-93d4-02a23cd92986")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="99d26d1a-7793-4a6b-89fb-c486aeac16ec")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="b5d98ac3-d171-4071-b4aa-6731b1c5c7f7")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="ba3872e9-cd7d-4119-9a6f-07d61b38b8a7")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="f724357e-5182-4c13-9cf6-4f95e8f1a747")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="5356695e-614c-44d8-859d-98f4d2ee4639")>, <selenium.webdriver.remote.webelement.WebElement (session="8745dbe7-bd85-4a9f-a5b4-140dec901b65", element="1f74ff86-f212-4331-bbb5-7adc4692541a")>]]
Now I am confused about how to iterate through this: clicking each link, going to the next page, and navigating back.
I tried the for loop below, but it does not work. Also, if I include print len(state) in the code, it shows 1. Could that be because state is a list within a list (notice the two square brackets in the output of print(state))? If so, how would I convert it to a single list?
for i in range(0, len(state)):
    state[i].click()
Below is the HTML:
<ol class="ranking">
<li>United States<span>$625.63 Billion</span></li>
<li>China<span>$261 Billion</span></li>
<li>Mexico<span>$126.08 Billion</span></li>
<li>Spain<span>$124.06</span></li>
<li>Italy<span>$111.05 Billion</span></li>
<li>Japan<span>$108.3 Billion</span></li>
<li>Canada<span>$77.83</span></li>
<li>India<span>$68.59 Billion</span></li>
<li>United Kingdom<span>$61.96</span></li>
<li>Russia<span>$49.04 Billion</span></li>
<li>Germany<span>$39.67 Billion</span></li>
<li>South Korea<span>$26.2 Billion</span></li>
<li>Indonesia<span>$23.05 Billion</span></li>
<li>Philippines<span>$17.27 Billion</span></li>
<li>Turkey<span>$17.16 Billion</span></li>
<li>Brazil<span>$17 Billion</span></li>
<li>Australia<span>$14.62 Billion</span></li>
<li>Colombia<span>$14.50 Billion</span></li>
<li>Venezuela<span>$14.19 Billion</span></li>
<li>Thailand<span>$13.95 Billion</span></li>
<li>Paraguay<span>$13 Billion</span></li>
<li>Morocco<span>$12.7 Billion</span></li>
<li>Iran<span>$10.64 Billion</span></li>
<li>Guatemala<span>$10.11 Billion</span></li>
<li>Saudi Arabia<span>$10.1 Billion</span></li>
<li>France<span>$9.85 Billion</span></li>
<li>Nigeria<span>$8.4 Billion</span></li>
<li>Afghanistan<span>$7.3 Billion</span></li>
<li>Israel<span>$7.05 Billion</span></li>
<li>Peru<span>$6.7 Billion</span></li>
<li>Pakistan<span>$6.53 Billion</span></li>
<li>Iraq<span>$5.17 Billion</span></li>
<li>Bulgaria<span>$4.74 Billion</span></li>
<li>Hungary<span>$4.6 Billion</span></li>
<li>Switzerland<span>$4.5 Billion</span></li>
<li>Ukraine<span>$4.31 Billion</span></li>
<li>South Africa<span>$3.93 Billion</span></li>
<li>Greece<span>$3.85 Billion</span></li>
<li>Egypt<span>$3.79 Billion</span></li>
<li>Malaysia<span>$2.99 Billion</span></li>
<li>Ireland<span>$2.98 Billion</span></li>
<li>Taiwan<span>$2.60 Billion</span></li>
</ol>
state is already a list because you used states.find_elements_by_css_selector (plural) and not states.find_element_by_css_selector,
so change it to something that looks more like this:
ranking = driver.find_element_by_xpath(xpath)  # notice I use element, not elements
num_countries = len(ranking.find_elements_by_css_selector('li a'))
for i in range(num_countries):
    # you need to refetch countries every time so the elements aren't stale
    ranking = driver.find_element_by_xpath(xpath)
    countries = ranking.find_elements_by_css_selector('li a')
    country = countries[i]
    name = country.text  # read the name before clicking; the element is stale after navigation
    # do what you need with the country now
    country.click()
    # make sure the page is for the correct country
    assert name == driver.find_element_by_css_selector('#content_box li h2').text
    # go back
    driver.back()
NOTE: I renamed the var to countries because they're countries.
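On the side question about the list inside a list: a minimal sketch of flattening one level of nesting with itertools.chain.from_iterable. The strings are stand-ins for WebElement objects, and the names nested and flat are illustrative, not taken from the original code.

```python
from itertools import chain

# Stand-in for the asker's `state`: one outer list wrapping the real list.
# Plain strings are used here in place of WebElement objects.
nested = [["United States", "China", "Mexico"]]

print(len(nested))  # the outer list holds a single item

# Flatten exactly one level of nesting.
flat = list(chain.from_iterable(nested))

print(len(flat))
for item in flat:
    print(item)
```

If the outer list always holds exactly one inner list, indexing it with nested[0] gives the same result.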

Unable to fetch full data inside <div>

HTML:
<div>
Está en: <b>
Inicio /
Valle Del Cauca /
Cali /
Zona Sur /
Zona Sur /
<a>Los Naranjos Conjunto Campestre</a></b>
</div>
I am unable to fetch all the <a> tags inside the <div> tag.
My code:
import requests
from bs4 import BeautifulSoup
page = requests.get('https://www.fincaraiz.com.co/oceana-52/barranquilla/proyecto-nuevo-det-1041165.aspx')
soup = BeautifulSoup(page.content, 'html.parser')
first = soup.find('div' , 'breadcrumb left')
link = first.find('div')
a_link = link.findAll('a')
print (a_link)
The code above prints only the first <a> tag:
[Inicio]
The following output is required from the HTML above:
Valle Del Cauca
Cali
Zona Sur
Zona Sur
I'm not sure why nothing after the '/' inside the <b> tag is printed.
You can use the lxml parser instead. html.parser and lxml repair malformed markup differently, so with html.parser BS4 may not see every <a> tag that is in the actual source.
soup = BeautifulSoup(page.content, 'lxml')
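Once the parser keeps the whole <b> element, the breadcrumb can be split on the slashes. This is a minimal sketch, assuming the text has already been pulled out with something like get_text(). The sample string is copied from the HTML in the question, and the helper name split_breadcrumb is hypothetical.

```python
def split_breadcrumb(text):
    """Split breadcrumb text on '/' and drop whitespace and empty parts."""
    parts = (part.strip() for part in text.split("/"))
    return [part for part in parts if part]


# Text of the <b> element from the question, as get_text() might return it (assumed).
b_text = """
Inicio /
Valle Del Cauca /
Cali /
Zona Sur /
Zona Sur /
Los Naranjos Conjunto Campestre
"""

crumbs = split_breadcrumb(b_text)
for crumb in crumbs:
    print(crumb)
```

Slicing the result, for example crumbs[1:5], leaves only the entries between "Inicio" and the final link.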

python selenium data-style-name

So there's a bit of html that looks like this
<a class="" data-style-name="Black" data-style-id="16360" "true" data-description="null"<img width="32" height="32"
and I was wondering if I could get the text "Black" out of it and then click it, but there's no class name to loop through, and the XPath doesn't return anything.
data-style-name is an attribute of your a element, and "Black" is its value.
Here is a way to access an attribute's value with Selenium and Python:
elements = driver.find_elements_by_xpath("//a[@data-style-name]")
for element in elements:
    print(element.get_attribute("data-style-name"))
If you want to select only the elements whose data-style-name attribute has the value "Black":
driver.find_elements_by_xpath("//a[@data-style-name='Black']")
More about xpath: https://www.w3.org/TR/xpath/#section-Introduction
Have you tried find_element_by_xpath()?
a_check = browser.find_element_by_xpath("/html/body/a[@data-style-name='Black']")
Which returns:
<selenium.webdriver.remote.webelement.WebElement (session="6c94ac24e0ec3a3320ec21b24055f4fa", element="0.1043557711542944-1")>
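The attribute predicates used in these answers can be tried without a browser. The sketch below runs them through the standard library's xml.etree.ElementTree, which supports only a subset of XPath, so it illustrates the [@attr] syntax and is not a replacement for Selenium. The markup is hypothetical apart from the "Black" / 16360 values taken from the question. In Selenium 4 the same expressions (written with a leading //) would be passed to driver.find_elements(By.XPATH, ...).

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup; only the "Black"/16360 link comes from the question.
markup = """
<div>
  <a data-style-name="Black" data-style-id="16360">first</a>
  <a data-style-name="White" data-style-id="16361">second</a>
  <a href="#">no style attribute</a>
</div>
"""
root = ET.fromstring(markup)

# [@attr] keeps only the elements that carry the attribute at all.
styled = root.findall(".//a[@data-style-name]")
names = [a.get("data-style-name") for a in styled]
print(names)

# [@attr='value'] additionally requires a specific value (note the quotes).
black = root.findall(".//a[@data-style-name='Black']")
print(len(black), black[0].get("data-style-id"))
```

Element.get here plays the role that get_attribute plays on a Selenium WebElement.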
