How can I iterate through the full web table with beautifulsoup?

I want to scrape a web table with Selenium and BeautifulSoup. The table contains 10 'resultMainRow' rows, each with 4 'resultMainCell' cells.
Inside the 4th resultMainCell there are 8 spans, each holding an img with a src.
The following HTML represents one of the table rows; I could only print out the relevant part of the table source. How can I iterate through the full table, including each img src?
<div class="resultMainTable">
<div class="resultMainRow">
<div class="resultMainCell_1 tableResult2">
<a href="javascript:genResultDetails(2);"
title="Best of the date">20/006 </a></div>
<div class="resultMainCell_2 tableResult2">21/01/2020</div>
<div class="resultMainCell_3 tableResult2"></div>
<div class="resultMainCell_4 tableResult2">
<span class="resultMainCellInner">
<img height="25" src="/info/images/icon/no_3abc"> </span>
<span class="resultMainCellInner">
<img height="25" src = "/info/images/icon/no_14 " ></span>
<span class="resultMainCellInner">
<img height="25" src "/info/images/icon/no_21 " ></span>
<span class="resultMainCellInner">
<img height="25" src="/info/images/icon/no_28 " ></span>
<span class="resultMainCellInner">
<img height="25" src=" /info/images/icon/no_37 "></span>
<span class="resultMainCellInner">
<img height="25" src="/info/images/icon/no_44 "></span>
<span class="resultMainCellInner">
<img height="6" src="/info/images/icon_happy " ></span>
<span class="resultMainCellInner"
<img height="25" src="/info/images/icon/smile "></span>
</div>
</div>
My code is as follows:
soup = BeautifulSoup(driver.page_source, 'lxml')
sixsix = soup.findAll("div", {"class": "resultMainTable"})
print (sixsix)
for row in sixsix:
    images = soup.findAll('img')
    for image in images:
        if len(images) == 8:
            aaa = images[1].find('src')
            bbb = images[2].find('src')
            ccc = images[3].find('src')
            ddd = images[4].find('src')
            eee = images[5].find('src')
            fff = images[6].find('src')
            ggg = images[7].find('src')
            hhh = images[8].find('src')
        print ((row.text), (image('src')))

You can try this script, which iterates over all rows of the table and extracts the text of the first three cells plus the 8 URLs from the src attributes in the fourth:
from bs4 import BeautifulSoup
txt = '''
<div class="resultMainTable">
<div class="resultMainRow">
<div class="resultMainCell">text1</div>
<div class="resultMainCell">text2</div>
<div class="resultMainCell">text3</div>
<div class="resultMainCell">
<div>
<div>
<span>
<img src="1" />
<img src="2" />
<img src="3" />
<img src="4" />
<img src="5" />
<img src="6" />
<img src="7" />
<img src="8" />
</span>
</div>
</div>
</div>
</div>
<div class="resultMainRow">
<div class="resultMainCell">text3</div>
<div class="resultMainCell">text4</div>
<div class="resultMainCell">text5</div>
<div class="resultMainCell">
<div>
<div>
<span>
<img src="9" />
<img src="10" />
<img src="11" />
<img src="12" />
<img src="13" />
<img src="14" />
<img src="15" />
<img src="16" />
</span>
</div>
</div>
</div>
</div>
</div>'''
soup = BeautifulSoup(txt, 'html.parser')
for row in soup.select('div.resultMainTable .resultMainRow'):
    v1, v2, v3, v4 = row.select('div.resultMainCell')
    imgs = [img['src'] for img in v4.select('img')]
    print(v1.text, v2.text, v3.text, *imgs)
Prints:
text1 text2 text3 1 2 3 4 5 6 7 8
text3 text4 text5 9 10 11 12 13 14 15 16
EDIT (With real HTML code from edited question):
from bs4 import BeautifulSoup
txt = '''<div class="resultMainTable">
<div class="resultMainRow">
<div class="resultMainCell_1 tableResult2">
<a href="javascript:genResultDetails(2);"
title="Best of the date">20/006 </a></div>
<div class="resultMainCell_2 tableResult2">21/01/2020</div>
<div class="resultMainCell_3 tableResult2"></div>
<div class="resultMainCell_4 tableResult2">
<span class="resultMainCellInner">
<img height="25" src="/info/images/icon/no_3abc"> </span>
<span class="resultMainCellInner">
<img height="25" src = "/info/images/icon/no_14 " ></span>
<span class="resultMainCellInner">
<img height="25" src "/info/images/icon/no_21 " ></span>
<span class="resultMainCellInner">
<img height="25" src="/info/images/icon/no_28 " ></span>
<span class="resultMainCellInner">
<img height="25" src=" /info/images/icon/no_37 "></span>
<span class="resultMainCellInner">
<img height="25" src="/info/images/icon/no_44 "></span>
<span class="resultMainCellInner">
<img height="6" src="/info/images/icon_happy " ></span>
<span class="resultMainCellInner"
<img height="25" src="/info/images/icon/smile "></span>
</div>
</div>'''
soup = BeautifulSoup(txt, 'html.parser')
for row in soup.select('div.resultMainTable .resultMainRow'):
    v1, v2, v3, v4 = row.select('div[class^="resultMainCell"]')
    imgs = [img['src'] for img in v4.select('img')]
    print(v1.text, v2.text, v3.text, *imgs)
Prints:
20/006 21/01/2020 /info/images/icon/no_3abc /info/images/icon/no_14 /info/images/icon/no_28 /info/images/icon/no_37 /info/images/icon/no_44 /info/images/icon_happy
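The four-way tuple unpacking above raises an error as soon as a row has a different number of cells, and `img['src']` fails for an image without that attribute. A more defensive variant might look like the sketch below. The `parse_rows` helper and the trimmed sample markup are illustrative assumptions, not part of the original page.

```python
from bs4 import BeautifulSoup

# Trimmed sample modelled on the question's markup (an assumption for illustration).
html = '''
<div class="resultMainTable">
  <div class="resultMainRow">
    <div class="resultMainCell_1 tableResult2"><a href="#">20/006 </a></div>
    <div class="resultMainCell_2 tableResult2">21/01/2020</div>
    <div class="resultMainCell_3 tableResult2"></div>
    <div class="resultMainCell_4 tableResult2">
      <span class="resultMainCellInner"><img height="25" src="/info/images/icon/no_14 "></span>
      <span class="resultMainCellInner"><img height="25"></span>
    </div>
  </div>
</div>
'''


def parse_rows(markup):
    """Return one dict per table row: cell texts plus image URLs (hypothetical helper)."""
    soup = BeautifulSoup(markup, 'html.parser')
    rows = []
    for row in soup.select('div.resultMainTable div.resultMainRow'):
        cells = row.select('div[class^="resultMainCell"]')
        if len(cells) < 4:
            continue  # skip rows that do not have the expected four cells
        texts = [cell.get_text(strip=True) for cell in cells[:3]]
        # img[src] only matches images that actually carry a src attribute
        srcs = [img['src'].strip() for img in cells[3].select('img[src]')]
        rows.append({'cells': texts, 'images': srcs})
    return rows


rows = parse_rows(html)
for row in rows:
    print(row['cells'], row['images'])
```

With Selenium, `driver.page_source` would be passed to `parse_rows` in place of the sample string.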

Related

Unable to scrape h1 class with python/beautiful soup

I am trying to scrape a title from an h1 class, but I keep getting "None"
page = requests.get(URL, headers=headers)
soup = BeautifulSoup(page.content, 'html.parser')
title = soup.find('h1', {'class': 'prod-name'})
print(title)
I've also tried using this way:
name_div = soup.find_all('div', {'class': 'col-md-12 col-sm-12 col-xs-12'})[0]
name = name_div.find('h1').text
print(name)
in which case I get: "IndexError: list index out of range"
Can anybody help me out?
This is the source code:
<div class="row attachDetails __web-inspector-hidebefore-shortcut__">
<div class="row">
<div class="col-md-12 col-sm-12 col-xs-12">
<div class="brand-desc">POLO RALPH LAUREN</div>
<h1 class="prod-name">ARAN CREWNECK SWEATER</h1>
<div class="panel-group" id="accordion">
<div class="borders-overview">
<div class="panel-heading">
<h4 class="panel-title">
<label class="overview-label collapsed" data-angle="overview-label" data-toggle="collapse" data-parent="#accordion" href="#collapse1">
<a class="fa fa-angle-up pull-right"></a>
<a class="over-view">OVERVIEW</a>
<span class="color-disp over-view">COLOR: FAWN GREY HEATHER</span>
<span class="style-num over-view">MATERIAL# : 710766783002
</span></label>
</h4>
</div>
<div id="collapse1" class="panel-collapse collapse">
<div class="short-desc-section"></div>
</div>
</div>
<div class="border-details">
<div class="panel-heading">
<h4 class="panel-title">
<label class="prod-details collapsed" data-angle="prod-details" data-toggle="collapse" data-parent="#accordion" href="#collapse2">
<a class="detail-link">Details</a>
<a class="fa fa-angle-up pull-right"></a>
</label>
</h4>
</div>
<div id="collapse2" class="long-desc panel-collapse collapse">
<div><ol><li>STANDARD FIT</li><li>COTTON</li></ol></div>
<ol>
<div><li><b>Board:</b> S196SC23</li></div>
<!--***********************************************************************************************************-->
</ol>
</div>
</div>
</div>
</div>
</div>
</div>

Parsing all elements which have tag before

I have the following HTML code:
<div class="1">
<fieldset>
<legend>AAA</legend>
<div class="row">aaa</div>
<div class="row">aaa</div>
<div class="row">aaa</div>
...
</fieldset>
</div>
<div class="1">
<fieldset>
<legend>BBB</legend>
<div class="row">bbb</div>
<div class="row">bbb</div>
<div class="row">bbb</div>
...
</fieldset>
</div>
I'm trying to display only the text of the rows that sit under the legend BBB (in this example: bbb, bbb, bbb).
So far I have written the code below, but it isn't pretty, and I don't know how to find all the rows:
bs = BeautifulSoup(request.txt, 'html.parser')
if(bs.find('legend', text='BBB')):
    value = parser.find('legend').next_element.next_element.next_element.get_text().strip()
    print(value)
Is there a simple way to do this? The div class name is always the same; only the legend varies.

I added a <legend>CCC</legend> block so you can see that this scales.
html = """<div class="1">
<fieldset>
<legend>AAA</legend>
<div class="row">aaa</div>
<div class="row">aaa</div>
<div class="row">aaa</div>
...
</fieldset>
</div>
<div class="1">
<fieldset>
<legend>BBB</legend>
<div class="row">bbb</div>
<div class="row">bbb</div>
<div class="row">bbb</div>
...
</fieldset>
</div>
<div class="1">
<fieldset>
<legend>CCC</legend>
<div class="row">ccc</div>
<div class="row">ccc</div>
<div class="row">ccc</div>
...
</fieldset>
</div>"""
from bs4 import BeautifulSoup
bs = BeautifulSoup(html, "html.parser")
after_tag = bs.find("legend", string="BBB").parent  # The <fieldset> that contains the BBB legend.
divs = after_tag.find_all("div", {"class": "row"})  # All row divs inside that fieldset.
for div in divs:
    print(div.text)
Prints:
bbb
bbb
bbb

from bs4 import BeautifulSoup
html = """
<div class="1">
<fieldset>
<legend>AAA</legend>
<div class="row">aaa</div>
<div class="row">aaa</div>
<div class="row">aaa</div>
...
</fieldset>
</div>
<div class="1">
<fieldset>
<legend>BBB</legend>
<div class="row">bbb</div>
<div class="row">bbb</div>
<div class="row">bbb</div>
...
</fieldset>
</div>
"""
soup = BeautifulSoup(html, features='html.parser')
elements = soup.select('div > fieldset')[1]  # the second fieldset, i.e. the BBB one
tuple_obj = ()
for row in elements.select('div.row'):
    tuple_obj = tuple_obj + (row.text,)
print(tuple_obj)
The tuple prints as:
('bbb', 'bbb', 'bbb')
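Both answers pick out a single fieldset, either by legend text or by position. When the legend is the only thing that varies, it may be more convenient to collect every fieldset once and look rows up by legend afterwards. The sketch below assumes each fieldset has at most one legend; the name `rows_by_legend` is made up for illustration.

```python
from bs4 import BeautifulSoup

html = """
<div class="1"><fieldset><legend>AAA</legend>
<div class="row">aaa</div><div class="row">aaa</div></fieldset></div>
<div class="1"><fieldset><legend>BBB</legend>
<div class="row">bbb</div><div class="row">bbb</div><div class="row">bbb</div></fieldset></div>
"""

soup = BeautifulSoup(html, 'html.parser')

# Map each legend text to the list of row texts in the same fieldset.
rows_by_legend = {}
for fieldset in soup.find_all('fieldset'):
    legend = fieldset.find('legend')
    if legend is None:
        continue  # a fieldset without a legend cannot be looked up by name
    rows_by_legend[legend.get_text(strip=True)] = [
        div.get_text(strip=True) for div in fieldset.find_all('div', class_='row')
    ]

print(rows_by_legend.get('BBB', []))
```

Using `.get('BBB', [])` returns an empty list instead of raising when the legend is missing from the page.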

Python - How to use soup with random class characters

I have been trying to scrape a buy/sell site. I found everything I need in the HTML, but the class names contain random-looking suffixes, such as:
<div aria-label="Adidas NMD x Bape" class="styled__Wrapper-sc-1kpvi4z-0 eDiSuB" to="/annons/skane/adidas_nmd_x_bape/87267675">
<article class="styled__Article-sc-1kpvi4z-1 hbWRzz">
<div class="styled__ImageWrapper-sc-1kpvi4z-4 kxhCJn">
<div class="ListImage__Wrapper-sc-1rp77jc-0 cvipJS"><img alt="Adidas NMD x Bape" class="ListImage__StyledImg-sc-1rp77jc-1 iwClwW" sizes="
(min-width: 768px) 180px,
120px
" src="https://cdn.blocket.com/pictures/1692451915.jpg?type=gallery_big" srcset="
https://cdn.blocket.com/pictures/1692451915.jpg?type=thumb 120w,
https://cdn.blocket.com/pictures/1692451915.jpg?type=gallery_big 180w,
https://cdn.blocket.com/pictures/1692451915.jpg?type=mob_iphone_vi_normal 240w,
https://cdn.blocket.com/pictures/1692451915.jpg?type=store_presentation 360w,
https://cdn.blocket.com/pictures/1692451915.jpg?type=mob_iphone_vi_normal_retina 540w,
" /></div>
</div>
<div class="styled__Content-sc-1kpvi4z-2 dwtNsH">
<div class="styled__LocationTimeWrapper-sc-1kpvi4z-17 dvvNDw">
<div class="styled__SubjectSymbol-sc-1kpvi4z-11 cbBbUz"></div>
<p class="styled__TopInfoWrapper-sc-1kpvi4z-22 kEcJNb"><a class="Link-sc-139ww1j-0 TopInfoLink__StyledLink-lzfj8j-0 bjnLor" href="/annonser/hela_sverige/personligt/klader_skor?cg=4080&q=bape&st=s">Kläder & skor</a> · <a class="Link-sc-139ww1j-0 TopInfoLink__StyledLink-lzfj8j-0 bjnLor" href="/annonser/skane/personligt/klader_skor?cg=4080&q=bape&r=23&st=s">Skåne</a></p>
<p class="styled__Time-sc-1kpvi4z-18 bGSnhf">Idag 14:06</p>
</div>
<div class="styled__SubjectWrapper-sc-1kpvi4z-10 kZyTSM">
<h2 class="TextSubHeading__TextSubHeadingWrapper-sc-1ilszdp-0 jIvScq styled__StyledTitle-sc-1kpvi4z-6 bSElwy"><a class="Link-sc-139ww1j-0 styled__StyledTitleLink-sc-1kpvi4z-7 edlhAW" href="/annons/skane/adidas_nmd_x_bape/87267675">Adidas NMD x Bape</a></h2></div>
<div class="styled__ParamsWrapper-sc-1kpvi4z-13 cRZIFG"></div>
<div class="styled__SalesInfo-sc-1kpvi4z-20 bbHjGJ">
<div class="TextSubHeading__TextSubHeadingWrapper-sc-1ilszdp-0 jIvScq Price__Wrapper-sc-1v2maoc-0 heunWX"><span>3 000 kr<div class="TextCallout2__TextCallout2Wrapper-sc-19qvftl-0 eERYUj Price__StyledVatPrice-sc-1v2maoc-1 hMWxAJ"></div></span></div>
</div>
</div>
</article>
</div>
I do see all the tags I am looking for such as:
Adidas NMD x Bape
3 000 kr
Skåne
/annons/skane/adidas_nmd_x_bape/87267675
https://cdn.blocket.com/pictures/1692451915.jpg
I know the basics of BeautifulSoup and simple scraping, but this is beyond me. What tips can you give me for scraping the values listed above?
Update:
test = eachPart.select_one('h2[class^="TextSubHeading__TextSubHeadingWrapper"] >a').text
print(test)
print(eachPart.select_one('[aria-label="{}"] img[alt="{}"]'.format(test, test))['src'])
print(eachPart.select_one('h2[class^="TextSubHeading__TextSubHeadingWrapper"] >a')['href'])
print(eachPart.select_one('div[class^="TextSubHeading__TextSubHeadingWrapper"] >span').text)
for test in eachPart.select('p[class^="styled__TopInfoWrapper"] a')[1:]:
    print(test.text)

Identify the parent tag first, then find the child tags inside it.
CSS selectors are the most convenient way to do this.
from bs4 import BeautifulSoup
html='''<div aria-label="Adidas NMD x Bape" class="styled__Wrapper-sc-1kpvi4z-0 eDiSuB" to="/annons/skane/adidas_nmd_x_bape/87267675">
<article class="styled__Article-sc-1kpvi4z-1 hbWRzz">
<div class="styled__ImageWrapper-sc-1kpvi4z-4 kxhCJn">
<div class="ListImage__Wrapper-sc-1rp77jc-0 cvipJS"><img alt="Adidas NMD x Bape" class="ListImage__StyledImg-sc-1rp77jc-1 iwClwW" sizes="
(min-width: 768px) 180px,
120px
" src="https://cdn.blocket.com/pictures/1692451915.jpg?type=gallery_big" srcset="
https://cdn.blocket.com/pictures/1692451915.jpg?type=thumb 120w,
https://cdn.blocket.com/pictures/1692451915.jpg?type=gallery_big 180w,
https://cdn.blocket.com/pictures/1692451915.jpg?type=mob_iphone_vi_normal 240w,
https://cdn.blocket.com/pictures/1692451915.jpg?type=store_presentation 360w,
https://cdn.blocket.com/pictures/1692451915.jpg?type=mob_iphone_vi_normal_retina 540w,
" /></div>
</div>
<div class="styled__Content-sc-1kpvi4z-2 dwtNsH">
<div class="styled__LocationTimeWrapper-sc-1kpvi4z-17 dvvNDw">
<div class="styled__SubjectSymbol-sc-1kpvi4z-11 cbBbUz"></div>
<p class="styled__TopInfoWrapper-sc-1kpvi4z-22 kEcJNb"><a class="Link-sc-139ww1j-0 TopInfoLink__StyledLink-lzfj8j-0 bjnLor" href="/annonser/hela_sverige/personligt/klader_skor?cg=4080&q=bape&st=s">Kläder & skor</a> · <a class="Link-sc-139ww1j-0 TopInfoLink__StyledLink-lzfj8j-0 bjnLor" href="/annonser/skane/personligt/klader_skor?cg=4080&q=bape&r=23&st=s">Skåne</a></p>
<p class="styled__Time-sc-1kpvi4z-18 bGSnhf">Idag 14:06</p>
</div>
<div class="styled__SubjectWrapper-sc-1kpvi4z-10 kZyTSM">
<h2 class="TextSubHeading__TextSubHeadingWrapper-sc-1ilszdp-0 jIvScq styled__StyledTitle-sc-1kpvi4z-6 bSElwy"><a class="Link-sc-139ww1j-0 styled__StyledTitleLink-sc-1kpvi4z-7 edlhAW" href="/annons/skane/adidas_nmd_x_bape/87267675">Adidas NMD x Bape</a></h2></div>
<div class="styled__ParamsWrapper-sc-1kpvi4z-13 cRZIFG"></div>
<div class="styled__SalesInfo-sc-1kpvi4z-20 bbHjGJ">
<div class="TextSubHeading__TextSubHeadingWrapper-sc-1ilszdp-0 jIvScq Price__Wrapper-sc-1v2maoc-0 heunWX"><span>3 000 kr<div class="TextCallout2__TextCallout2Wrapper-sc-19qvftl-0 eERYUj Price__StyledVatPrice-sc-1v2maoc-1 hMWxAJ"></div></span></div>
</div>
</div>
</article>
</div>'''
soup=BeautifulSoup(html,"html.parser")
print(soup.select_one('[aria-label="Adidas NMD x Bape"] img[alt="Adidas NMD x Bape"]')['src'])
print(soup.select_one('[aria-label="Adidas NMD x Bape"] h2[class^="TextSubHeading__TextSubHeadingWrapper"] >a').text)
print(soup.select_one('[aria-label="Adidas NMD x Bape"] h2[class^="TextSubHeading__TextSubHeadingWrapper"] >a')['href'])
print(soup.select_one('[aria-label="Adidas NMD x Bape"] div[class^="TextSubHeading__TextSubHeadingWrapper"] >span').text)
Output:
https://cdn.blocket.com/pictures/1692451915.jpg?type=gallery_big
Adidas NMD x Bape
/annons/skane/adidas_nmd_x_bape/87267675
3 000 kr
EDIT
from bs4 import BeautifulSoup
html='''<div aria-label="Adidas NMD x Bape" class="styled__Wrapper-sc-1kpvi4z-0 eDiSuB" to="/annons/skane/adidas_nmd_x_bape/87267675">
<article class="styled__Article-sc-1kpvi4z-1 hbWRzz">
<div class="styled__ImageWrapper-sc-1kpvi4z-4 kxhCJn">
<div class="ListImage__Wrapper-sc-1rp77jc-0 cvipJS"><img alt="Adidas NMD x Bape" class="ListImage__StyledImg-sc-1rp77jc-1 iwClwW" sizes="
(min-width: 768px) 180px,
120px
" src="https://cdn.blocket.com/pictures/1692451915.jpg?type=gallery_big" srcset="
https://cdn.blocket.com/pictures/1692451915.jpg?type=thumb 120w,
https://cdn.blocket.com/pictures/1692451915.jpg?type=gallery_big 180w,
https://cdn.blocket.com/pictures/1692451915.jpg?type=mob_iphone_vi_normal 240w,
https://cdn.blocket.com/pictures/1692451915.jpg?type=store_presentation 360w,
https://cdn.blocket.com/pictures/1692451915.jpg?type=mob_iphone_vi_normal_retina 540w,
" /></div>
</div>
<div class="styled__Content-sc-1kpvi4z-2 dwtNsH">
<div class="styled__LocationTimeWrapper-sc-1kpvi4z-17 dvvNDw">
<div class="styled__SubjectSymbol-sc-1kpvi4z-11 cbBbUz"></div>
<p class="styled__TopInfoWrapper-sc-1kpvi4z-22 kEcJNb"><a class="Link-sc-139ww1j-0 TopInfoLink__StyledLink-lzfj8j-0 bjnLor" href="/annonser/hela_sverige/personligt/klader_skor?cg=4080&q=bape&st=s">Kläder & skor</a> · <a class="Link-sc-139ww1j-0 TopInfoLink__StyledLink-lzfj8j-0 bjnLor" href="/annonser/skane/personligt/klader_skor?cg=4080&q=bape&r=23&st=s">Skåne</a></p>
<p class="styled__Time-sc-1kpvi4z-18 bGSnhf">Idag 14:06</p>
</div>
<div class="styled__SubjectWrapper-sc-1kpvi4z-10 kZyTSM">
<h2 class="TextSubHeading__TextSubHeadingWrapper-sc-1ilszdp-0 jIvScq styled__StyledTitle-sc-1kpvi4z-6 bSElwy"><a class="Link-sc-139ww1j-0 styled__StyledTitleLink-sc-1kpvi4z-7 edlhAW" href="/annons/skane/adidas_nmd_x_bape/87267675">Adidas NMD x Bape</a></h2></div>
<div class="styled__ParamsWrapper-sc-1kpvi4z-13 cRZIFG"></div>
<div class="styled__SalesInfo-sc-1kpvi4z-20 bbHjGJ">
<div class="TextSubHeading__TextSubHeadingWrapper-sc-1ilszdp-0 jIvScq Price__Wrapper-sc-1v2maoc-0 heunWX"><span>3 000 kr<div class="TextCallout2__TextCallout2Wrapper-sc-19qvftl-0 eERYUj Price__StyledVatPrice-sc-1v2maoc-1 hMWxAJ"></div></span></div>
</div>
</div>
</article>
</div>'''
soup=BeautifulSoup(html,"html.parser")
print(soup.select_one('[class^="styled__Wrapper-sc-"] img[class^="ListImage__StyledImg-sc-"]')['src'])
print(soup.select_one('[class^="styled__Wrapper-sc-"] h2[class^="TextSubHeading__TextSubHeadingWrapper"] >a').text)
print(soup.select_one('[class^="styled__Wrapper-sc-"] h2[class^="TextSubHeading__TextSubHeadingWrapper"] >a')['href'])
print(soup.select_one('[class^="styled__Wrapper-sc-"] div[class^="TextSubHeading__TextSubHeadingWrapper"] >span').text)
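The `[class^="..."]` selectors match on the stable prefix of a class attribute and ignore the generated suffix. On a results page with many listings, the same idea can be applied per card rather than to the whole document. The following is a rough sketch under that assumption: the markup is a trimmed copy of the listing above, and `listings` and `text_or_none` are hypothetical names.

```python
from bs4 import BeautifulSoup

# Trimmed copy of one listing card from the question (for illustration only).
html = '''
<div aria-label="Adidas NMD x Bape" class="styled__Wrapper-sc-1kpvi4z-0 eDiSuB">
  <article class="styled__Article-sc-1kpvi4z-1 hbWRzz">
    <img alt="Adidas NMD x Bape" class="ListImage__StyledImg-sc-1rp77jc-1 iwClwW"
         src="https://cdn.blocket.com/pictures/1692451915.jpg?type=gallery_big" />
    <h2 class="TextSubHeading__TextSubHeadingWrapper-sc-1ilszdp-0 jIvScq styled__StyledTitle-sc-1kpvi4z-6 bSElwy"><a href="/annons/skane/adidas_nmd_x_bape/87267675">Adidas NMD x Bape</a></h2>
    <div class="TextSubHeading__TextSubHeadingWrapper-sc-1ilszdp-0 jIvScq Price__Wrapper-sc-1v2maoc-0 heunWX"><span>3 000 kr</span></div>
  </article>
</div>
'''


def text_or_none(tag):
    """Return stripped text, or None when the selector matched nothing."""
    return tag.get_text(strip=True) if tag else None


soup = BeautifulSoup(html, 'html.parser')
listings = []
# ^= matches the start of the class attribute, *= matches anywhere inside it.
for card in soup.select('div[class^="styled__Wrapper-sc-"]'):
    link = card.select_one('h2[class^="TextSubHeading__"] > a')
    price = card.select_one('div[class*="Price__Wrapper"] > span')
    image = card.select_one('img[class^="ListImage__StyledImg"]')
    listings.append({
        'title': text_or_none(link),
        'href': link.get('href') if link else None,
        'price': text_or_none(price),
        'image': image.get('src') if image else None,
    })

for item in listings:
    print(item)
```

Searching from `card` instead of `soup` keeps each title, price and image tied to the listing it belongs to.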

How to extract image/href url from div class using scrapy

I am having a hard time extracting the href URL from the following website code:
<div class="expando expando-uninitialized" style="display: none" data-cachedhtml=" <div class="media-preview" id="media-preview-66hch1" style="max-width: 534px"> <div class="media-preview-content"> <a href="https://i.redd.it/nctvpvsnbpsy.jpg" class="may-blank"> <img class="preview" src="https://i.redditmedia.com/UELqh-mbh5mwnXr67PoBbi23nwZuNl2v3flNbkmewQE.jpg?w=534&amp;s=1426be7f811e5d5043760f8882674070" width="534" height="768"> </a> </div> </div> " data-pin-condition="function() {return this.style.display != 'none';}"><span class="error">loading...</span></div>

You can probably use a regular expression for this. Here is an example:
s = """<div class="expando expando-uninitialized" style="display: none" data-cachedhtml=" <div class="media-preview" id="media-preview-66hch1" style="max-width: 534px"> <div class="media-preview-content"> <a href="https://i.redd.it/nctvpvsnbpsy.jpg" class="may-blank"> <img class="preview" src="https://i.redditmedia.com/UELqh-mbh5mwnXr67PoBbi23nwZuNl2v3flNbkmewQE.jpg?w=534&amp;s=1426be7f811e5d5043760f8882674070" width="534" height="768"> </a> </div> </div> " data-pin-condition="function() {return this.style.display != 'none';}"><span class="error">loading...</span></div>"""
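import re
# Accept either a literal quote or the &quot; entity around the URL, since the attribute can appear in either form.
match = re.search(r'href=(?:"|&quot;)(.*?\.jpg)(?:"|&quot;)', s)
print(match.group(1) if match else None)
# https://i.redd.it/nctvpvsnbpsy.jpg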
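As an alternative to the regular expression, the nested markup can be parsed twice: once to read the `data-cachedhtml` attribute, and once more to parse that attribute's value as HTML. This is a different technique from the answer above, and it uses BeautifulSoup rather than Scrapy selectors. The sketch assumes the attribute is entity-encoded in the real page source, which is why the shortened sample below is written with `&quot;` and `&lt;`.

```python
from bs4 import BeautifulSoup

# Shortened sample; the entity-encoded attribute value is an assumption about the page source.
outer_html = (
    '<div class="expando expando-uninitialized" style="display: none" '
    'data-cachedhtml="&lt;div class=&quot;media-preview-content&quot;&gt; '
    '&lt;a href=&quot;https://i.redd.it/nctvpvsnbpsy.jpg&quot; class=&quot;may-blank&quot;&gt; '
    '&lt;/a&gt; &lt;/div&gt;">'
    '<span class="error">loading...</span></div>'
)

# First pass: find the outer div; the parser decodes entities in attribute values.
outer = BeautifulSoup(outer_html, 'html.parser').select_one('div.expando')

# Second pass: treat the decoded attribute value as an HTML fragment of its own.
inner = BeautifulSoup(outer['data-cachedhtml'], 'html.parser')
link = inner.select_one('a.may-blank')
href = link['href'] if link else None
print(href)
```

In a Scrapy callback, the attribute value could be read with the response's own selectors and then handed to the second parsing step in the same way.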

BeautifulSoup: scraping for a span gives me a result, for another span it gives "None"

I am coding a scraper for Etsy and when I scrape the span for reviews I get the right output. However when I scrape for the span with the price it gives me only None values and I don't understand why. If someone could help, it would be great!
# HTML parsing
page_soup = soup(page_html, "html.parser")
# Grab each listing card
divs = page_soup.find_all("div", {"class": "v2-listing-card__shop"})
for i in divs:
    shop = i.p.text
    reviews = i.find("span", {"class" : "text-body-smaller text-gray-lighter display-inline-block vertical-align-middle icon-b-1"})
    prices = i.find("span", {"class" : "currency-value"})
    print(shop)
    print(reviews.text)
    print(prices)
Here are the two span elements as on the website:
<div class="v2-listing-card__info">
<p class="text-gray text-truncate mb-xs-0 text-body">
Blush Watercolor Flowers & Leaves with Different Shades Clipart Separate Elements Hand Painted Commercial Use | S15 Fairy Tale
</p>
<div class="v2-listing-card__shop">
<p class="text-gray-lighter text-body-smaller display-inline-block mr-xs-1">PatishopArt</p>
<div class="v2-listing-card__rating icon-t-2">
<div class="stars-svg stars-smaller ">
<input name="initial-rating" type="hidden" value="5"/>
<input name="rating" type="hidden" value="5"/>
<span class="screen-reader-only">5 out of 5 stars</span>
<div aria-hidden="true" class="rating lit rating-first icon-b-2" data-rating="1">
<span class="etsy-icon stars-svg-star" title="Disappointed"><svg aria-hidden="true" focusable="false" viewbox="3 3 18 18" xmlns="http://www.w3.org/2000/svg"><path d="M19.985,10.36a0.5,0.5,0,0,0-.477-0.352H14.157L12.488,4.366a0.5,0.5,0,0,0-.962,0l-1.67,5.642H4.5a0.5,0.5,0,0,0-.279.911L8.53,13.991l-1.5,5.328a0.5,0.5,0,0,0,.741.6l4.231-2.935,4.215,2.935a0.5,0.5,0,0,0,.743-0.6l-1.484-5.328,4.306-3.074A0.5,0.5,0,0,0,19.985,10.36Z"></path></svg></span>
<div class="rating lit" data-rating="2">
<span class="etsy-icon stars-svg-star" title="Not a fan"><svg aria-hidden="true" focusable="false" viewbox="3 3 18 18" xmlns="http://www.w3.org/2000/svg"><path d="M19.985,10.36a0.5,0.5,0,0,0-.477-0.352H14.157L12.488,4.366a0.5,0.5,0,0,0-.962,0l-1.67,5.642H4.5a0.5,0.5,0,0,0-.279.911L8.53,13.991l-1.5,5.328a0.5,0.5,0,0,0,.741.6l4.231-2.935,4.215,2.935a0.5,0.5,0,0,0,.743-0.6l-1.484-5.328,4.306-3.074A0.5,0.5,0,0,0,19.985,10.36Z"></path></svg></span>
<div class="rating lit" data-rating="3">
<span class="etsy-icon stars-svg-star" title="It's okay"><svg aria-hidden="true" focusable="false" viewbox="3 3 18 18" xmlns="http://www.w3.org/2000/svg"><path d="M19.985,10.36a0.5,0.5,0,0,0-.477-0.352H14.157L12.488,4.366a0.5,0.5,0,0,0-.962,0l-1.67,5.642H4.5a0.5,0.5,0,0,0-.279.911L8.53,13.991l-1.5,5.328a0.5,0.5,0,0,0,.741.6l4.231-2.935,4.215,2.935a0.5,0.5,0,0,0,.743-0.6l-1.484-5.328,4.306-3.074A0.5,0.5,0,0,0,19.985,10.36Z"></path></svg></span>
<div class="rating lit" data-rating="4">
<span class="etsy-icon stars-svg-star" title="Like it"><svg aria-hidden="true" focusable="false" viewbox="3 3 18 18" xmlns="http://www.w3.org/2000/svg"><path d="M19.985,10.36a0.5,0.5,0,0,0-.477-0.352H14.157L12.488,4.366a0.5,0.5,0,0,0-.962,0l-1.67,5.642H4.5a0.5,0.5,0,0,0-.279.911L8.53,13.991l-1.5,5.328a0.5,0.5,0,0,0,.741.6l4.231-2.935,4.215,2.935a0.5,0.5,0,0,0,.743-0.6l-1.484-5.328,4.306-3.074A0.5,0.5,0,0,0,19.985,10.36Z"></path></svg></span>
<div class="rating lit" data-rating="5">
<span class="etsy-icon stars-svg-star" title="Love it"><svg aria-hidden="true" focusable="false" viewbox="3 3 18 18" xmlns="http://www.w3.org/2000/svg"><path d="M19.985,10.36a0.5,0.5,0,0,0-.477-0.352H14.157L12.488,4.366a0.5,0.5,0,0,0-.962,0l-1.67,5.642H4.5a0.5,0.5,0,0,0-.279.911L8.53,13.991l-1.5,5.328a0.5,0.5,0,0,0,.741.6l4.231-2.935,4.215,2.935a0.5,0.5,0,0,0,.743-0.6l-1.484-5.328,4.306-3.074A0.5,0.5,0,0,0,19.985,10.36Z"></path></svg></span>
</div>
</div>
</div>
</div>
</div>
</div>
<span class="text-body-smaller text-gray-lighter display-inline-block vertical-align-middle icon-b-1">(110)</span>
</div>
</div>
<p class="n-listing-card__price text-gray strong mt-xs-0">
<span class="currency-symbol">$</span><span class="currency-value">6.60</span>
</p>
<!-- This shows Free shipping on its own line , we only show it if it wasn't shown above -->
</div>

You are only searching inside divs with the class v2-listing-card__shop, but the price span appears to sit outside those divs, so find() returns None for it.
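One way to act on this, sketched below, is to loop over the enclosing `v2-listing-card__info` div, which contains both the shop block and the price paragraph. The sample markup is a shortened version of the snippet in the question, and `results` is a made-up name.

```python
from bs4 import BeautifulSoup

# Shortened version of the markup from the question (rating stars omitted).
html = '''
<div class="v2-listing-card__info">
  <div class="v2-listing-card__shop">
    <p class="text-gray-lighter text-body-smaller display-inline-block mr-xs-1">PatishopArt</p>
    <span class="text-body-smaller text-gray-lighter display-inline-block vertical-align-middle icon-b-1">(110)</span>
  </div>
  <p class="n-listing-card__price text-gray strong mt-xs-0">
    <span class="currency-symbol">$</span><span class="currency-value">6.60</span>
  </p>
</div>
'''

page_soup = BeautifulSoup(html, 'html.parser')
results = []
# Iterate over the outer card so that both the shop div and the price are in scope.
for card in page_soup.find_all('div', class_='v2-listing-card__info'):
    shop_div = card.find('div', class_='v2-listing-card__shop')
    shop = shop_div.p.get_text(strip=True) if shop_div and shop_div.p else None
    reviews = shop_div.find('span', class_='icon-b-1') if shop_div else None
    price = card.find('span', class_='currency-value')
    results.append((
        shop,
        reviews.get_text(strip=True) if reviews else None,
        price.get_text(strip=True) if price else None,
    ))

for shop, reviews, price in results:
    print(shop, reviews, price)
```

Passing a single class name to `class_` matches any tag that has that class among several, so the long class string from the question is not needed.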
