Loop over lists in Xpath - python

How can I loop over list items in an XPath expression? I thought I could use .format the same way I would when building a string in a loop, but this does not work. I want to get the href from each list item, for example:
for i in range(0, 15):
    test_page = response.xpath('.//li[{i}]/a[@class="pagination__item"]/@href').format(i)
Taken from the link: https://www.ebay.com/b/Collectible-Card-Games-Accessories/2536/bn_1852210?LH_BIN=1&LH_PrefLoc=2&mag=1&rt=nc&_pgn=4&_sop=16
Expected output:
A way to index i or some type of boolean logic i.e. to select between 1 and 30.
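One way to do this, sketched under assumptions: build the XPath with an f-string so that i is substituted before the expression is evaluated. XPath positions start at 1, so the loop below starts at 1 rather than 0. The sample markup and the helper name hrefs_by_position are made up for illustration, and the standard library's xml.etree.ElementTree stands in for the Scrapy response so the snippet runs on its own; with Scrapy, the same f-string (with /@href appended) would be passed to response.xpath.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the pagination markup on the real page.
SAMPLE = """
<ol>
  <li><a class="pagination__item" href="/page/1">1</a></li>
  <li><a class="pagination__item" href="/page/2">2</a></li>
  <li><a class="pagination__item" href="/page/3">3</a></li>
</ol>
""".strip()


def hrefs_by_position(root, first, last):
    """Collect href values from the li elements at positions first..last."""
    hrefs = []
    for i in range(first, last + 1):  # XPath positions are 1-based
        # The f-string inserts i into the expression before it is evaluated.
        xpath = f".//li[{i}]/a[@class='pagination__item']"
        for link in root.findall(xpath):
            hrefs.append(link.get("href"))
    return hrefs


root = ET.fromstring(SAMPLE)
print(hrefs_by_position(root, 1, 2))
```

In full XPath 1.0, as used by Scrapy, a range can also be selected in a single expression with a predicate such as li[position() >= 1 and position() <= 30]. ElementTree supports only a subset of XPath and does not accept that form.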

Related

Check if elements list are in column DataFrame

Objective: I have a list of 200 elements (URLs) and I would like to check whether each one is in a specific column of the DataFrame. If it is, I would like to remove the element from the list.
Problem: I tried a similar approach, adding the elements that are not in the column to a new list, but it adds all of them.
pruned = []
for element in list1:
    if element not in transfer_history['Link']:
        pruned.append(element)
I have also tried the solution I asked for without success. I think it's a simple thing but I can't find the key.
for element in list1:
    if element in transfer_history['Link']:
        list1.remove(element)
When you use in with a pandas Series, you are searching the index labels, not the values. To get around this, convert the column to a list using transfer_history['Link'].tolist(), or better, convert it to a set.
links = set(transfer_history["Link"])
A good way to filter the list is like this:
pruned = [element for element in list1 if element not in links]
Don't remove elements from the list while iterating over it, which may have unexpected results.
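The snippet below is a small illustration, with made-up URLs, of why removing items during iteration goes wrong: each removal shifts the remaining items left, so the loop skips the element that follows the removed one.

```python
links = {"a.com", "b.com", "c.com"}           # values already in the column
list1 = ["a.com", "b.com", "x.com", "c.com"]  # hypothetical input list

# Removing while iterating: "b.com" is skipped after "a.com" is removed.
broken = list(list1)
for element in broken:
    if element in links:
        broken.remove(element)
print(broken)

# Building a new list visits every element exactly once.
pruned = [element for element in list1 if element not in links]
print(pruned)
```

Here broken still contains "b.com" even though it is in links, while pruned contains only "x.com".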
Remember that transfer_history['Link'] refers to the entire column, not to its individual values. To compare against the values, you can index into the column item by item (transfer_history['Link'][x]) inside a for loop.
Or, much more simply, check whether the item is in a list built from the entire column:
pruned = []
column_links = list(transfer_history['Link'])
for element in list1:
    if element not in column_links:
        pruned.append(element)
If the order of the urls doesn't matter, this can be simplified a lot using sets:
list1 = list(set(list1) - set(transfer_history['Link']))

How to search for an element inside another one with selenium

I am using find_elements to search for elements that contain a certain string. For each of these elements, I then need to check that it contains a span with a certain text. Is there a way to write a for loop over the list of elements I got?
today = self.dataBrowser.find_elements(By.XPATH, f'//tr[child::td[child::div[child::strong[text()="{self.date}"]]]]')
I believe you should be able to search within each element of a loop, something like this:
today = self.dataBrowser.find_elements(By.XPATH, f'//tr[child::td[child::div[child::strong[text()="{self.date}"]]]]')
for element in today:
    span = element.find_element(By.TAG_NAME, "span")
    if span.text == "The text you want to check":
        ...  # do something
Let me know if that works.
Sure, you can.
You can do something like the following:
today = self.dataBrowser.find_elements(By.XPATH, f'//tr[child::td[child::div[child::strong[text()="{self.date}"]]]]')
for element in today:
    span = element.find_elements(By.XPATH,'.//span[contains(text(),"the_text_in_span")]')
    if not span:
        print("current element doesn't contain the desired span with text")
To make sure your code doesn't throw an exception when some element has no span with the desired text, you can use the find_elements method. It returns a list of WebElements, and when nothing matches it simply returns an empty list. In Python an empty list is interpreted as Boolean false, while a non-empty list is interpreted as Boolean true.
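The same "search inside each found element, then test the resulting list" pattern can be tried without a browser. This is only a sketch: the standard library's xml.etree.ElementTree stands in for Selenium, and the table markup, the date and the helper name rows_missing_span are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical table; in Selenium these rows would come from find_elements.
TABLE = """
<table>
  <tr><td><strong>2024-01-01</strong></td><td><span>done</span></td></tr>
  <tr><td><strong>2024-01-01</strong></td><td><span>pending</span></td></tr>
  <tr><td><strong>2024-01-01</strong></td><td>no span here</td></tr>
</table>
""".strip()


def rows_missing_span(rows, wanted_text):
    """Return the indexes of rows with no span containing wanted_text."""
    missing = []
    for index, row in enumerate(rows):
        # Search only inside the current row, like element.find_elements.
        spans = [s for s in row.iter("span") if wanted_text in (s.text or "")]
        if not spans:  # an empty list is falsy
            missing.append(index)
    return missing


rows = ET.fromstring(TABLE).findall("tr")
print(rows_missing_span(rows, "done"))
```

The if not spans test is the same check as in the answer above: no match yields an empty list, which counts as false.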

Find specific number of things with Beautiful soup

I know that the find() command finds only the first occurrence and that find_all() finds all of them. Is there a way to find a specific number?
If I want to find only the first two occurrences, is there a method for that, or does it need to be handled in a loop?
You can use CSS selectors knowing the child position you need to extract. Let's assume the HTML you have is like this:
<div id="id1">
    <span>val1</span>
    <span>val2</span>
    <span>val2</span>
</div>
Then you can select the first element by the following:
child = div.select('span:nth-child(1)')
Replace 1 with the position you want.
If you want to select multiple occurrences, you can concatenate the children like this:
child = div.select('span:nth-child(1)') + div.select('span:nth-child(2)')
This gets the first two children.
The nth-child selector can also select the odd-numbered occurrences:
child = div.select('span:nth-child(2n+1)')
where n starts from 0:
n: 0 => 2n+1: 1
n: 1 => 2n+1: 3
..
Edited after addressing the comment, thanks!
If you are looking for first n elements:
As pointed out in comments, you can use find_all to find all elements and then select necessary amount of it with list slices.
soup.find_all(...)[:n] # get first n elements
Or more efficiently, you can use limit parameter of find_all to limit the number of elements you want.
soup.find_all(..., limit = n)
This is more efficient because it doesn't iterate through the whole page; the search stops as soon as the limit is reached.
Refer to the documentation for more.
If you are looking for the nth element:
In this case you can use the :nth-child pseudo-class of CSS selectors, replacing n with the position you want:
soup.select_one('span:nth-child(n)')

How to nest loop number into an xpath in python?

I have the xpath to follow a user on a website in selenium. Here is what I thought of doing so far:
followloop = [1,2,3,4,5,6,7,8,9,10]
for x in followloop:
    driver.find_element_by_xpath("/html/body/div[7]/div/div/div[2]/div[>>>here is where i want the increment to increase<<<]/div[2]/div[1]/div/button").click()
The place I marked in the code is where I want the number to increase on each iteration. Also, as you can see, the for loop lists 1,2,3,4,5... by hand. Can I write it more simply, like 1-100? I don't just want 1-10; I will want to go from 1 to some higher number.
I tried just putting x where I want it, but realised that Python doesn't treat it as a variable in that position and considers it part of the XPath string. So how do I insert the increasing number there on each loop?
You need to convert the index from the for loop into a string and use it in your xpath:
follow_loop = range(1, 11)
for x in follow_loop:
    xpath = "/html/body/div[7]/div/div/div[2]/div["
    xpath += str(x)
    xpath += "]/div[2]/div[1]/div/button"
    driver.find_element_by_xpath(xpath).click()
Also, there is generally a neater way of selecting an element than using the absolute XPath /html/body/div[7]/div/div/div[2]. Try to select an element by class or by id, e.g.:
//div[@class="a-classname"]
//div[@id="an-id-name"]
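I would use the '%s' placeholder for your task, together with range() as indicated in previous comments. Note that the % operator must be applied to the string, not to the result of click(), and that XPath positions start at 1:
for x in range(1, 101):
    driver.find_element_by_xpath("/html/body/div[7]/div/div/div[2]/div[%s]/div[2]/div[1]/div/button" % x).click()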
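Use a format string.
And use range() instead of writing the list out by hand (on Python 2, xrange() for larger numbers). It does exactly what you want. Start it at 1, because XPath positions are 1-based and div[0] never matches:
for x in range(1, 11):
    driver.find_element_by_xpath("/html/body/div[7]/div/div/div[2]/div[%d]/div[2]/div[1]/div/button" % (x,)).click()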
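As a sketch of the same idea with str.format, the helper below (follow_button_xpaths is a hypothetical name) yields one XPath per position, counting from 1 because XPath indexes are 1-based. It only builds strings, so it runs without a browser; each string would then be passed to the driver's XPath lookup.

```python
# Template taken from the question; {index} marks the div that changes.
TEMPLATE = "/html/body/div[7]/div/div/div[2]/div[{index}]/div[2]/div[1]/div/button"


def follow_button_xpaths(count):
    """Yield the XPath for positions 1..count (XPath indexes start at 1)."""
    for index in range(1, count + 1):
        yield TEMPLATE.format(index=index)


for xpath in follow_button_xpaths(3):
    print(xpath)
```

Keeping the template in one place means only the count has to change to go from 10 buttons to 100.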

Introductory Python task from the edX MIT class

I have recently started learning Python in the MIT class on edX.
However, I have been having some trouble with certain exercises. Here is one of them:
"Write a procedure called oddTuples, which takes a tuple as input, and returns a new tuple as output, where every other element of the input tuple is copied, starting with the first one. So if test is the tuple ('I', 'am', 'a', 'test', 'tuple'), then evaluating oddTuples on this input would return the tuple ('I', 'a', 'tuple'). "
The correct code, according to the lecture, is the following:
def oddTuples(aTup):
    '''
    aTup: a tuple
    returns: tuple, every other element of aTup.
    '''
    # a placeholder to gather our response
    rTup = ()
    index = 0
    # Idea: Iterate over the elements in aTup, counting by 2
    # (every other element) and adding that element to
    # the result
    while index < len(aTup):
        rTup += (aTup[index],)
        index += 2
    return rTup
However, I have tried to solve it myself in a different way with the following code:
def oddTuples(aTup):
    '''
    aTup: a tuple
    returns: tuple, every other element of aTup.
    '''
    # Your Code Here
    bTup=()
    i=0
    for i in (0,len(aTup)-1):
        if i%2==0:
            bTup=bTup+(aTup[i],)
        print(bTup)
        print(i)
        i+=1
    return bTup
However, my solution does not work and I am unable to understand why (I think it should do essentially the same thing as the code the tutors provide).
I'd just like to add that the Pythonic solution for this problem uses a slice with a step:
newTuple = oldTuple[::2]
oldTuple[::2] means: take a copy of oldTuple from the start (value omitted) to the end (omitted) with a step of 2.
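A quick check of the slice on the tuple from the exercise (odd_tuples is just an illustrative wrapper name):

```python
def odd_tuples(a_tup):
    """Return every other element of a_tup, starting with the first."""
    return a_tup[::2]  # start and stop omitted, step of 2


test = ('I', 'am', 'a', 'test', 'tuple')
print(odd_tuples(test))  # ('I', 'a', 'tuple')
```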
I think I get the problem here.
In your for loop you specify two fixed values for i:
0
len(aTup)-1
What you really want is the range of values from 0 to len(aTup)-1:
0
1
2
...
len(aTup)-1
In order to turn start and end values into all the values in between, you need to use Python's range function. Its stop value is exclusive, so pass len(aTup) rather than len(aTup)-1:
for i in range(0,len(aTup)):
(Actually, if you take a look at range's documentation, you will find there is a third parameter called step. If you use it, the i%2 check in your function becomes unnecessary :))
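For illustration, range's third argument (named step in the Python documentation) lets the loop visit only the even indexes, which removes the need for the i % 2 test. The function name odd_tuples_with_step is made up for this sketch.

```python
def odd_tuples_with_step(a_tup):
    """Copy every other element of a_tup using range(start, stop, step)."""
    result = ()
    for i in range(0, len(a_tup), 2):  # 0, 2, 4, ... below len(a_tup)
        result += (a_tup[i],)
    return result


print(odd_tuples_with_step(('I', 'am', 'a', 'test', 'tuple')))
```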
Your code should read:
for i in range(0,len(aTup)):
    # i=0, 1, 2 ..., len(aTup)-1.
rather than
for i in (0,len(aTup)-1):
    # i=0 or i=len(aTup)-1.
The lines for i in (0,len(aTup)-1): and i+=1 aren't quite doing what you want. As in other answers, you probably want for i in range(len(aTup)): (note the added range, and that its stop value is exclusive), but you also want to remove i+=1, since the for-in construct sets i to each of the items in the iterable in turn.
When I run your code with the test tuple, the output is the following:
('I', 'tuple')
This is because of the way the for loop is written: it iterates over the two-element tuple (0, len(aTup)-1), so i only takes the values 0 and 4.
Instead of using:
for i in (0,len(aTup)-1):
You should change that to the following and your code will work:
for i in range(len(aTup)):
The range function produces the integers from 0 up to the length of your tuple minus 1.
So after editing, your code should look like:
def oddTuples(aTup):
    bTup=()
    for i in range(len(aTup)):
        if i%2==0:
            bTup=bTup+(aTup[i],)
    return bTup
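To see the difference concretely, compare what each loop header iterates over for the five-element test tuple:

```python
test = ('I', 'am', 'a', 'test', 'tuple')

# A two-element tuple: the loop body runs only for 0 and 4.
as_tuple = list((0, len(test) - 1))
print(as_tuple)   # [0, 4]

# A range: the loop body runs for every index.
as_range = list(range(len(test)))
print(as_range)   # [0, 1, 2, 3, 4]
```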
