I am searching for elements that contain a certain string with find_elements. For each of these elements, I then need to ensure that it contains a span with a certain text. Is there a way to write a for loop over the list of elements I got?
today = self.dataBrowser.find_elements(By.XPATH, f'//tr[child::td[child::div[child::strong[text()="{self.date}"]]]]')
I believe you should be able to search within each element in a loop, something like this:
today = self.dataBrowser.find_elements(By.XPATH, f'//tr[child::td[child::div[child::strong[text()="{self.date}"]]]]')
for element in today:
    span = element.find_element(By.TAG_NAME, "span")
    if span.text == "The text you want to check":
        ...  # do something
Let me know if that works.
Sure, you can.
You can do something like the following:
today = self.dataBrowser.find_elements(By.XPATH, f'//tr[child::td[child::div[child::strong[text()="{self.date}"]]]]')
for element in today:
    # The leading dot restricts the search to the current element
    span = element.find_elements(By.XPATH, './/span[contains(text(),"the_text_in_span")]')
    if not span:
        print("current element doesn't contain the desired span with text")
To make sure your code doesn't throw an exception when some element has no span with the desired text, use the find_elements method. It returns a list of WebElements, and when nothing matches it simply returns an empty list. In Python, an empty list is interpreted as Boolean false, while a non-empty list is interpreted as Boolean true.
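The truthiness rule can be checked without a browser. This is a minimal sketch, assuming a hypothetical dictionary matches_per_row that stands in for the lists find_elements would return for each row; the row names and span strings are made up for illustration.

```python
# Hypothetical stand-in for per-row find_elements results:
# each value is the list of matching spans found inside that row.
matches_per_row = {
    "row-1": ["<span>the_text_in_span</span>"],
    "row-2": [],  # nothing matched in this row
}

# An empty list is falsy, so `not matches` is True exactly when nothing matched.
rows_missing_span = [row for row, matches in matches_per_row.items() if not matches]
print(rows_missing_span)
```

Only the row whose match list is empty ends up in rows_missing_span.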
I am parsing a website with BeautifulSoup in Python. After finding all elements, I want to extract the digits from each result and add them to a list:
## find all prices on page
prices = soup.find_all("div", class_="card-footer")
#print(prices)
## extract digits
stripped = [] # declare empty list
for p in prices:
    print(p.get_text(strip=True))
    stripped.append(re.findall(r'\d+', p.get_text(strip=True)))
print(stripped)
Result:
[['555'], ['590'], ['599'], ['1000'], ['5000'], ['5000'], ['9999'], ['10000'], ['12000']]
How can I end up with a one-dimensional list instead?
Since I only need the "stripped" list, is there perhaps an easier way to extract the digits than using re.findall, for example directly in the line prices = soup.find_all("div", class_="card-footer")?
Thanks!
re.findall returns a list. Therefore, if you're only interested in the first match for each element (there is probably only one in your case), take index 0:
stripped.append(re.findall(r'\d+', p.get_text(strip=True))[0])
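If an element might contain more than one number, or none at all, indexing with [0] either drops matches or raises IndexError. A possible alternative is to flatten the matches with a nested comprehension. This is a sketch only; the texts list is a made-up stand-in for the strings that p.get_text(strip=True) would return.

```python
import re

# Hypothetical stand-in for the text of each "card-footer" div.
texts = ["Price: 555", "Price: 590", "from 1000 to 5000", "sold out"]

# The inner loop walks over every match, so the result is one flat list
# no matter how many numbers (zero, one or several) each text contains.
stripped = [digits for text in texts for digits in re.findall(r"\d+", text)]
print(stripped)
```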
I know that find() finds only the first occurrence and that find_all() finds all of them. Is there a way to find a specific number of occurrences?
If I want to find only the first two occurrences, is there a method for that, or does it need to be handled in a loop?
You can use CSS selectors knowing the child position you need to extract. Let's assume the HTML you have is like this:
<div id="id1">
<span>val1</span>
<span>val2</span>
<span>val2</span>
</div>
Then you can select the first element by the following:
child = div.select('span:nth-child(1)')
Replace 1 with the position you want (positions are counted from 1).
If you want to select multiple occurrences, you can concatenate the children like this:
child = div.select('span:nth-child(1)') + div.select('span:nth-child(2)')
This gets the first two children.
The nth-child selector can also select the odd-numbered occurrences:
child = div.select('span:nth-child(2n+1)')
where n starts from 0:
n: 0 => 2n+1: 1
n: 1 => 2n+1: 3
..
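The 2n+1 arithmetic can be reproduced on a plain Python list. This is only an illustration of which positions the formula selects, not BeautifulSoup code; the children list is a hypothetical stand-in for the sibling elements.

```python
# Hypothetical sibling list standing in for the children of the div.
children = ["val1", "val2", "val3", "val4", "val5"]

# 2n+1 for n = 0, 1, 2, ... gives the 1-based positions 1, 3, 5, ...
positions = [2 * n + 1 for n in range(len(children)) if 2 * n + 1 <= len(children)]

# CSS positions are 1-based while Python indexes are 0-based, hence the "- 1".
odd_children = [children[p - 1] for p in positions]
print(positions, odd_children)
```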
Edited after addressing the comment, thanks!
If you are looking for first n elements:
As pointed out in the comments, you can use find_all to find all elements and then take as many as you need with a list slice.
soup.find_all(...)[:n] # get first n elements
Or, more efficiently, you can use the limit parameter of find_all to cap the number of elements returned.
soup.find_all(..., limit=n)
This is more efficient because it doesn't iterate through the whole page: the search stops as soon as the limit is reached.
Refer to the documentation for more.
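The difference between slicing a full result and stopping early can be illustrated with a plain generator. This is an analogy using itertools.islice, not BeautifulSoup itself; the scan function, the stats counter and the page list are all hypothetical names made up for this sketch.

```python
from itertools import islice

stats = {"visited": 0}  # counts how many items the scan has looked at

def scan(items):
    """Yield items one at a time, recording each visit."""
    for item in items:
        stats["visited"] += 1
        yield item

page = [f"span-{i}" for i in range(1000)]  # stand-in for the elements of a page

# Collect everything, then slice: every item is visited.
stats["visited"] = 0
first_two_sliced = list(scan(page))[:2]
visited_when_slicing = stats["visited"]

# Stop after two items: the scan never goes further.
stats["visited"] = 0
first_two_limited = list(islice(scan(page), 2))
visited_when_limited = stats["visited"]

print(visited_when_slicing, visited_when_limited)
```

Both approaches return the same two items, but the second one visits only as many items as it returns.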
If you are looking for the nth element:
In this case you can use the :nth-child pseudo-class of CSS selectors, replacing n with the 1-based position you want:
soup.select_one('span:nth-child(n)')
I have a list which contains a string shown below. I have defined mylist in the global space as a string using "".
mylist = ""
mylist = ["1.22.43.45"]
I get an execution error stating that the split operation is not possible as it is being performed on a list rather than the string.
mylist.rsplit(".",1)[-1]
I tried to resolve it by using the following code:
str(mylist.rsplit(".",1)[-1])
Is this the best way to do it? The output I want is 45. I am splitting the string and accessing the last element. Any help is appreciated.
mylist=["1.22.43.45"]
newstring = mylist[0].rsplit(".",1)[-1]
First select the element in your list, then split it, then take the last element of the split.
Just because you assigned mylist = "" first, doesn't mean it'll cast the list to a string. You've just reassigned the variable to point at a list instead of an empty string.
You can accomplish what you want using:
mylist = ["1.22.43.45"]
mylist[-1].rsplit('.', 1)[-1]
This gets the last item from the list and performs an rsplit on it. Of course, this won't work if the list is empty or if the last item in the list is not a string. You may want to wrap this in a try/except block to catch IndexError, for example.
EDIT: Added the [-1] index to the end to grab the last list item from the split, since rsplit() returns a list, not a string. See DrBwts' answer
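The try/except suggestion could look like the sketch below. The helper name last_segment and the choice to return None on failure are assumptions made for illustration; raising a custom error would be equally reasonable.

```python
def last_segment(items, sep="."):
    """Return the text after the last `sep` in the last list item, or None."""
    try:
        return items[-1].rsplit(sep, 1)[-1]
    except IndexError:
        return None  # the list is empty, so there is no last item
    except AttributeError:
        return None  # the last item is not a string and has no rsplit

print(last_segment(["1.22.43.45"]))
print(last_segment([]))
```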
You can access the first element (the string, in your case) by the index operator []
mylist[0].rsplit(".", 1)[-1]
I have a list of elements which I retrieve through find_elements_by_xpath
results = driver.find_elements_by_xpath("//*[contains(@class, 'result')]")
Now I want to iterate through all the elements returned and find specific child elements
for element in results:
    field1 = element.find_elements_by_xpath("//*[contains(@class, 'field1')]")
My problem is that the context for the XPath selection gets ignored in the iteration, so field1 always just returns the first element with the field1 class on the page, regardless of the current element.
As @Andersson posted, the fix is quite simple. All that was needed was the dot at the beginning of the expression, which makes the search relative to the current element:
for element in results:
    field1 = element.find_elements_by_xpath(".//*[contains(@class, 'field1')]")
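The effect of the leading dot can be tried without a browser using the standard library's xml.etree.ElementTree. This is not Selenium, and ElementTree supports only a small XPath subset (for example, no contains()), so the sketch uses exact attribute matches; the markup is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical page with two result blocks, each holding one field1 element.
page = ET.fromstring(
    "<body>"
    "<div class='result'><span class='field1'>A</span></div>"
    "<div class='result'><span class='field1'>B</span></div>"
    "</body>"
)

per_result = []
for result in page.findall(".//div[@class='result']"):
    # The leading dot scopes the search to the current <div> only.
    fields = result.findall(".//span[@class='field1']")
    per_result.append([field.text for field in fields])

print(per_result)
```

Each result block reports only its own field1 element, which is the behaviour the relative expression is meant to give.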
It's easier to use CSS selectors (less typing) and find all the elements at once:
for element in driver.find_elements_by_css_selector(".result .field1"):
    field1 = element
As the topic states:
list = ["a", "b"]
element = "ac"
Can I use a check like:
if element in list:
that is true when element equals one of the list items with "c" appended?
Pseudocode to what I want to achieve:
if element in (list+c)
What is the best way to get this behavior in python?
Edit: I know there are many ways to get around this, but can it be done in one line, as in the code above?
More efficient would be:
if any(x+'c' == element for x in your_list):
as it avoids scanning through the list twice (once to make the "+c" versions, once to check if element is in the resulting list). It'll also "short-circuit" (that is, quickly move on) if it finds the element before going through the entire list.
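The short-circuit behaviour can be observed by recording which items are actually examined. A small sketch, where with_suffix and checked are hypothetical helpers added only to make the evaluation order visible:

```python
your_list = ["a", "b"]
element = "ac"

checked = []  # records which items any() actually looked at

def with_suffix(item):
    checked.append(item)
    return item + "c"

# any() stops at the first True, so "b" is never examined here.
found = any(with_suffix(x) == element for x in your_list)
print(found, checked)
```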
P.S. - it's best not to name variables list, since that's already the name for the actual list type.
if element in [elem + 'c' for elem in my_list]:
    # ...
It is never good practice to name a variable list (or int, float, map, tuple, etc.), because doing so shadows those built-in names.
if element[0] in list:
You don't want to add "c" to every item in the list and check to see whether "ac" is in the result; you want to check to see if the first letter of "ac" is in the list. It's the same thing except a lot easier.
if element[:-1] in list:
It is better to strip the 'c' from element once than to add it to every item in the list, so you perform just one calculation.
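Note that slicing alone does not check that the removed character really is "c": "ab"[:-1] is "a", which is in the list. A possible guard is sketched below; the function name matches_with_suffix and its parameters are assumptions for illustration.

```python
def matches_with_suffix(element, items, suffix="c"):
    """True if element equals some item in items with suffix appended.

    Assumes suffix is a non-empty string.
    """
    # Check the suffix first, then look up the remainder once.
    return element.endswith(suffix) and element[: -len(suffix)] in items

items = ["a", "b"]
print(matches_with_suffix("ac", items))
print(matches_with_suffix("ab", items))
```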