Selenium Python cannot get elements

Here is the HTML code:
<i>
<div id="Calendar">
<div class="Title">booking</div>
<div class="calendarHolder">
<div class="month">
<div class="left"></div>
<span class="monthName">APRIL 2018</span>
<div class="right"></div>
</div>
<div class="dayHolder">
<div class="day holiday"><div class="dCell">SUN</div></div>
<div class="day"><div class="dCell">MON</div></div>
<div class="day"><div class="dCell">TUE</div></div>
<div class="day"><div class="dCell">WED</div></div>
<div class="day"><div class="dCell">THU</div></div>
<div class="day"><div class="dCell">FRI</div></div>
<div class="day last"><div class="dCell">SAT</div></div>
<div class="clearboth"></div>
</div>
<div class="dateHolder">
<div class="date"><div class="dCell">1</div></div>
<div class="date"><div class="dCell">2</div></div>
<div class="date"><div class="dCell">3</div></div>
<div class="date"><div class="dCell">4</div></div>
<div class="date"><div class="dCell">5</div></div>
<div class="date"><div class="dCell">6</div></div>
<div class="date last"><div class="dCell">7</div></div>
<div class="date"><div class="dCell">8</div></div>
<div class="date"><div class="dCell">9</div></div>
<div class="date"><div class="dCell">10</div></div>
<div class="date"><div class="dCell">11</div></div>
<div class="date"><div class="dCell">12</div></div>
<div class="date"><div class="dCell">13</div></div>
<div class="date last"><div class="dCell">14</div></div>
<div class="date time_day stin"><div class="dCell">15</div>
<input type="hidden" name="pid" value="5000">
</div>
<div class="date"><div class="dCell">16</div></div>
</div>
</i>
The page has a calendar, and I need to click an available date to continue.
Specifically, I need to click the element whose class is "date time_day stin".
I tried the exact XPath and also the following call, but it raises an error:
No such element: Unable to locate element:
{"method":"class name","selector":"date.time_night.stin"}
driver.find_element_by_class_name('date.time_day.stin').click()
The following returns the same error:
Dates = driver.find_element_by_css_selector("div#date.time_day.stin>div").text
print(Dates)
Then I tried fetching the text of each div to find out what the problem is.
Across all my find_element(s) attempts, the only text I could get was "SUN" to "SAT" from the elements with class name "dCell":
lst = []
calendar = driver.find_elements_by_css_selector("div#startBookingBlock div.dCell")
for i in calendar:
    lst.append(i.text)
print(lst)
But it did not return the dates, even though they use the same class name.
So I changed it as follows, and it returns "[]":
lst = []
calendar = driver.find_elements_by_css_selector("div#dateHolder div.dCell")
for i in calendar:
    lst.append(i.text)
print(lst)
Then I tried a more specific XPath:
calendar = driver.find_elements_by_xpath("//div[@id='startBookingBlock' and @class='dateHolder' and @class='date' and @class='dCell']")
print(calendar)
However, it still prints "[]".
I can't seem to get anything under the "dateHolder" class, and I cannot figure out why. Can anyone suggest something? Thanks!

I think your HTML code is broken -- it's missing one </input> after the input tag and two </div>s at the end.
Try running your code on this one (I just added those three):
<div id="Calendar">
<div class="Title">booking</div>
<div class="calendarHolder">
<div class="month">
<div class="left"></div>
<span class="monthName">APRIL 2018</span>
<div class="right"></div>
</div>
<div class="dayHolder">
<div class="day holiday"><div class="dCell">SUN</div></div>
<div class="day"><div class="dCell">MON</div></div>
<div class="day"><div class="dCell">TUE</div></div>
<div class="day"><div class="dCell">WED</div></div>
<div class="day"><div class="dCell">THU</div></div>
<div class="day"><div class="dCell">FRI</div></div>
<div class="day last"><div class="dCell">SAT</div></div>
<div class="clearboth"></div>
</div>
<div class="dateHolder">
<div class="date"><div class="dCell">1</div></div>
<div class="date"><div class="dCell">2</div></div>
<div class="date"><div class="dCell">3</div></div>
<div class="date"><div class="dCell">4</div></div>
<div class="date"><div class="dCell">5</div></div>
<div class="date"><div class="dCell">6</div></div>
<div class="date last"><div class="dCell">7</div></div>
<div class="date"><div class="dCell">8</div></div>
<div class="date"><div class="dCell">9</div></div>
<div class="date"><div class="dCell">10</div></div>
<div class="date"><div class="dCell">11</div></div>
<div class="date"><div class="dCell">12</div></div>
<div class="date"><div class="dCell">13</div></div>
<div class="date last"><div class="dCell">14</div></div>
<div class="date time_day stin"><div class="dCell">15</div>
<input type="hidden" name="pid" value="5000"></input>
</div>
<div class="date"><div class="dCell">16</div></div>
</div></div></div>
The XPath that works for me (in an online XPath tester) is as follows:
div[@id="Calendar"]/div[@class="calendarHolder"]/div[@class="dateHolder"]/div[@class="date time_day stin"]
Does this give the element you want?
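One way to sanity-check that XPath without a browser is to run it against the repaired markup with Python's standard-library xml.etree.ElementTree. This is only a sketch: the markup below is a trimmed copy of the repaired HTML, and ElementTree supports just a subset of XPath, so the path is written relative to the root element. Note also that in CSS, "#name" matches an id and ".name" matches a class, so selectors such as div#date.time_day.stin or div#dateHolder look for ids that the posted HTML does not contain; a class-based CSS selector would be div.date.time_day.stin.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the repaired calendar markup (assumption:
# only a few dates are kept to keep the example short).
CALENDAR = """
<div id="Calendar">
  <div class="calendarHolder">
    <div class="dateHolder">
      <div class="date"><div class="dCell">14</div></div>
      <div class="date time_day stin"><div class="dCell">15</div>
        <input type="hidden" name="pid" value="5000"></input>
      </div>
      <div class="date"><div class="dCell">16</div></div>
    </div>
  </div>
</div>
"""

root = ET.fromstring(CALENDAR)

# ElementTree paths are relative to the element they are called on, so the
# leading div[@id="Calendar"] step of the XPath is the root element itself.
available = root.findall(
    "./div[@class='calendarHolder']/div[@class='dateHolder']"
    "/div[@class='date time_day stin']"
)

# Read the day number from the dCell child of each matching date.
days = [cell.find("div[@class='dCell']").text for cell in available]
print(days)
```

In Selenium the same path would be passed as an XPath locator, prefixed with // so that it matches anywhere in the page.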

Related

Unable to scrape h1 class with python/beautiful soup

I am trying to scrape a title from an h1 class, but I keep getting "None"
page = requests.get(URL, headers=headers)
soup = BeautifulSoup(page.content, 'html.parser')
title = soup.find('h1', {'class': 'prod-name'})
print(title)
I've also tried using this way:
name_div = soup.find_all('div', {'class': 'col-md-12 col-sm-12 col-xs-12'})[0]
name = name_div.find('h1').text
print(name)
in which case I get: "IndexError: list index out of range"
Can anybody help me out?
This is the source code:
<div class="row attachDetails __web-inspector-hidebefore-shortcut__">
<div class="row">
<div class="col-md-12 col-sm-12 col-xs-12">
<div class="brand-desc">POLO RALPH LAUREN</div>
<h1 class="prod-name">ARAN CREWNECK SWEATER</h1>
<div class="panel-group" id="accordion">
<div class="borders-overview">
<div class="panel-heading">
<h4 class="panel-title">
<label class="overview-label collapsed" data-angle="overview-label" data-toggle="collapse" data-parent="#accordion" href="#collapse1">
<a class="fa fa-angle-up pull-right"></a>
<a class="over-view">OVERVIEW</a>
<span class="color-disp over-view">COLOR: FAWN GREY HEATHER</span>
<span class="style-num over-view">MATERIAL# : 710766783002
</span></label>
</h4>
</div>
<div id="collapse1" class="panel-collapse collapse">
<div class="short-desc-section"></div>
</div>
</div>
<div class="border-details">
<div class="panel-heading">
<h4 class="panel-title">
<label class="prod-details collapsed" data-angle="prod-details" data-toggle="collapse" data-parent="#accordion" href="#collapse2">
<a class="detail-link">Details</a>
<a class="fa fa-angle-up pull-right"></a>
</label>
</h4>
</div>
<div id="collapse2" class="long-desc panel-collapse collapse">
<div><ol><li>STANDARD FIT</li><li>COTTON</li></ol></div>
<ol>
<div><li><b>Board:</b> S196SC23</li></div>
<!--***********************************************************************************************************-->
</ol>
</div>
</div>
</div>
</div>
</div>
</div>

Beautifulsoup: Get a range of divs

I just found out about how to process webpages in python using BeautifulSoup.
There's a list of div from which I want to get those in a specific range. The range is defined by two div that have a h2 child.
How would I do that? Thank you for your support!
EDIT: I added an actual representation of my html code below instead of a previous "simplified" version that was missing tags.
The new code shows a root div with class foo-bar-details.
Nested are 9 div tags. Two of which have a nested h2 tag. All of those 9 div tags contain img elements deeply nested within. What I need is each img element of those divs that are between the ones containing the h2 element.
An expected outcome if applied to the html code below would be:
<img src="../../images/123456_thumb.jpg" alt="Image 123456" title="Image 123456">
<img src="../../images/67890_thumb.JPG" alt="Image 67890 " title="Image 67890">
This is the html code:
<div class="foo-bar-details">
<div class="padding-y-10 padding-x-40 gray-sand-bg" id="sec-feat-3-1">
<div class="row">
<div class="col-sm-6 info-panel">
<div class="row">
<div class="col-sm-6 margin-bottom-10">
<p class="margin-0">
<strong>fsuhfsdf </strong>
</p>
</div>
<div class="col-sm-6 margin-bottom-10">
<p class="margin-0">
<strong>Feat</strong><span class="icon-help"></span>
</p>
</div>
</div>
</div>
<div class="col-sm-6 foo-images">
<div class="row">
<img src="../../images/39826_thumb.JPG" alt="Image 39826" title="Image 39826 ">
<div class="img-description">
</div>
</div>
</div>
</div>
</div>
<div class="padding-y-10 padding-x-40 gray-sand-bg" id="sec-feat-3-1">
<div class="row">
<div class="col-sm-6 info-panel">
<div class="row">
<div class="col-sm-6 margin-bottom-10">
<p class="margin-0">
<strong>JHFDFD </strong>
</p>
</div>
<div class="col-sm-6 margin-bottom-10">
<p class="margin-0">
<strong>Feat</strong><span class="icon-help"></span>
</p>
</div>
</div>
</div>
<div class="col-sm-6 foo-images">
<div class="row">
<img src="../../images/223234_thumb.JPG" alt="Image 223234" title="Image 223234 ">
<div class="img-description">
</div>
</div>
</div>
</div>
</div>
<div class="padding-y-10 padding-x-40 gray-sand-bg" id="sec-feat-3-1">
<div class="row">
<div class="col-sm-6 info-panel">
<div class="row">
<div class="col-sm-6 margin-bottom-10">
<p class="margin-0">
<strong>sdfsdf </strong>
</p>
</div>
<div class="col-sm-6 margin-bottom-10">
<p class="margin-0">
<strong>Feat</strong><span class="icon-help"></span>
</p>
</div>
</div>
</div>
<div class="col-sm-6 foo-images">
<div class="row">
<img src="../../images/223823_thumb.JPG" alt="Image 223823" title="Image 223823 ">
<div class="img-description">
</div>
</div>
</div>
</div>
</div>
<div class="element-header mystic-bg padding-y-10 padding-x-20" id="elem-4">
<h2 class="h3 margin-bottom-5">
Foo
</h2>
<ul class="list-inline margin-0">
<li> Foo feature </li>
...
</ul>
</div>
<div id="info-panel-header" class="padding-y-10 padding-x-40">
<div class="row">
<div class="col-se-6 element-info">
<div class="col-se-12">
<div class="row">
</div>
</div>
</div>
<div class="col-sm-6 foo-images">
<div class="row">
<img src="../../images/123456_thumb.jpg" alt="Image 123456" title="Image 123456">
<div class="img-description">
</div>
</div>
</div>
</div>
</div>
<div class="padding-y-10 padding-x-40 gray-wild-sand-bg" id="sec-feat-4-1">
<div class="row">
<div class="col-sm-6 info-panel">
<div class="row">
<div class="col-sm-6 margin-bottom-10">
<p class="margin-0">
<strong>Foo strin: </strong>
</p>
</div>
<div class="col-sm-6 margin-bottom-10">
<p class="margin-0">
<strong>Barbar</strong><span class="icon-help"></span>
</p>
</div>
</div>
<div class="row">
<div class="col-sm-6 margin-bottom-10">
<p class="margin-0">
<strong>Mine: </strong>
</p>
</div>
<div class="col-sm-6 margin-bottom-10">
<p class="margin-0">
TEST<span class="icon-help"></span>
</p>
</div>
</div>
</div>
<div class="col-sm-6 foo-images">
<div class="row">
<img src="../../images/67890_thumb.JPG" alt="Image 67890 " title="Image 67890">
<div class="img-description">
</div>
</div>
</div>
</div>
</div>
<div class="element-header mystic-bg padding-y-10 padding-x-20" id="elem-5">
<h2 class="h3 margin-bottom-5">
Bar
</h2>
<ul class="list-inline margin-0">
<li> Bar feature </li>
...
</ul>
</div>
<div class="padding-y-10 padding-x-40 gray-sand-bg" id="sec-feat-3-1">
<div class="row">
<div class="col-sm-6 info-panel">
<div class="row">
<div class="col-sm-6 margin-bottom-10">
<p class="margin-0">
<strong>fsuhfsdf </strong>
</p>
</div>
<div class="col-sm-6 margin-bottom-10">
<p class="margin-0">
<strong>Feat</strong><span class="icon-help"></span>
</p>
</div>
</div>
</div>
<div class="col-sm-6 foo-images">
<div class="row">
<img src="../../images/39826_thumb.JPG" alt="Image 39826" title="Image 39826 ">
<div class="img-description">
</div>
</div>
</div>
</div>
</div>
<div class="padding-y-10 padding-x-40 gray-sand-bg" id="sec-feat-3-1">
<div class="row">
<div class="col-sm-6 info-panel">
<div class="row">
<div class="col-sm-6 margin-bottom-10">
<p class="margin-0">
<strong>fsuhfsdf </strong>
</p>
</div>
<div class="col-sm-6 margin-bottom-10">
<p class="margin-0">
<strong>Feat</strong><span class="icon-help"></span>
</p>
</div>
</div>
</div>
<div class="col-sm-6 foo-images">
<div class="row">
<img src="../../images/209876_thumb.JPG" alt="Image 209876" title="Image 209876 ">
<div class="img-description">
</div>
</div>
</div>
</div>
</div>
</div>

Here is a solution involving lxml.html:
We extract all divs between the first and last divs which contain an h2 tag:
import lxml.html
# HTML file saved as "file.html"
file_name = "file.html"
with open(file_name, 'r') as f:
    tree = lxml.html.fromstring(f.read())
# all_div = tree.findall('div')
all_div = tree.find_class('foo-bar-details')[0].findall('div')
start, stop = None, None
for k, div in enumerate(all_div):
    if div.findall('h2') and start is None:
        print("Range starts at %d" % k)
        start = k
        continue
    if div.findall('h2') and start is not None:
        print("Range stops at %d" % k)
        stop = k + 1  # add one because the slice end is exclusive
        continue
# div_list = all_div[start:stop]
img_list = [_.xpath('.//img') for _ in all_div[start:stop]]
print(img_list)
# [[], [<Element img at 0x20b58d73f40>], [<Element img at 0x20b58d73f90>], []]
# Or
img_list = [_.xpath('.//img/@src') for _ in all_div[start:stop]]
print(img_list)
# [[], ['../../images/123456_thumb.jpg'], ['../../images/67890_thumb.JPG'], []]
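Note that the slice above includes the two header divs themselves, which is why the output starts and ends with an empty list. A variation that keeps only the divs strictly between the headers and flattens the result might look like the sketch below. It uses the standard-library xml.etree.ElementTree instead of lxml, and the HTML string is a simplified, well-formed stand-in for the page (the real markup has unclosed img tags, which an XML parser would reject), so treat it as an illustration of the slicing logic rather than a drop-in replacement.

```python
import xml.etree.ElementTree as ET

# Simplified, well-formed stand-in for the page (assumption for illustration).
HTML = """
<div class="foo-bar-details">
  <div><img src="../../images/39826_thumb.JPG"/></div>
  <div id="elem-4"><h2>Foo</h2></div>
  <div id="info-panel-header"><div class="row"><img src="../../images/123456_thumb.jpg"/></div></div>
  <div id="sec-feat-4-1"><div class="row"><img src="../../images/67890_thumb.JPG"/></div></div>
  <div id="elem-5"><h2>Bar</h2></div>
  <div><img src="../../images/209876_thumb.JPG"/></div>
</div>
"""

root = ET.fromstring(HTML)
children = root.findall("div")

# Positions of the direct children that carry an <h2> heading.
headers = [i for i, div in enumerate(children) if div.find("h2") is not None]
start, stop = headers[0], headers[1]

# Keep only the divs strictly between the two header divs, then collect the
# src of every <img> nested at any depth inside them.
between = children[start + 1:stop]
sources = [img.get("src") for div in between for img in div.iter("img")]
print(sources)
```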

Another solution involving SimplifiedDoc:
from simplified_scrapy.simplified_doc import SimplifiedDoc
html ='''
<div class="foo-bar-details">
<div class="element-header mystic-bg padding-y-10 padding-x-20" id="elem-4">
<h2 class="h3 margin-bottom-5">
Foo
</h2>
<ul class="list-inline margin-0">
<li> Foo feature </li>
...
</ul>
</div>
<div id="info-panel-header" class="padding-y-10 padding-x-40">Test 1</div>
<div class="padding-y-10 padding-x-40 gray-wild-sand-bg" id="foo-feat-4-1">Test 2</div>
<div class="padding-y-10 padding-x-40 " id="foo-feat-4-2">Test 3</div>
<div class="padding-y-10 padding-x-40 gray-wild-sand-bg" id="foo-feat-4-3">Test 4</div>
<div class="element-header mystic-bg padding-y-10 padding-x-20" id="elem-5">
<h2 class="h3 margin-bottom-5">
Bar
</h2>
<ul class="list-inline margin-0">
<li> Bar feature </li>
...
</ul>
</div>
</div>
'''
doc = SimplifiedDoc(html)
divs = doc.select('div.foo-bar-details').divs.contains('<h2')
print ([div.id for div in divs])
divs = doc.select('div.foo-bar-details').divs.notContains('<h2')
print ([div.id for div in divs])
Result:
['elem-4', 'elem-5']
['info-panel-header', 'foo-feat-4-1', 'foo-feat-4-2', 'foo-feat-4-3']
The SimplifiedDoc library does not rely on third-party libraries, so it is lightweight and fast, and well suited to beginners.
More examples are available here.

If I understand you correctly, you want to find the <img> tags together with the <h2> that each image belongs to.
This example (txt variable contains the HTML snippet from your question):
from bs4 import BeautifulSoup
soup = BeautifulSoup(txt, 'html.parser')
out = {}
for img in soup.select('div:has(h2) ~ div img'):
    out.setdefault(img.find_previous('h2').get_text(strip=True), []).append(img['src'])
from pprint import pprint
pprint(out)
Prints:
{'Bar': ['../../images/39826_thumb.JPG', '../../images/209876_thumb.JPG'],
'Foo': ['../../images/123456_thumb.jpg', '../../images/67890_thumb.JPG']}

Iterate div tags using lxml and retrieve text for a dictionary in python

I just discovered lxml in Python and need some help, as I have no experience with XPath.
I want to get text data from a webpage into a dictionary.
I'm referring to the html snippet I posted below. Within the original html page there's a div element of the class general-info that I retrieve using the following line:
general_info = document_tree.xpath("//div[contains(concat(' ', normalize-space(@class), ' '), 'general-info')]")
From here on I want to iterate over the nested divs and get the 2 <p> tags as key and value. The text inside the <strong> being the key.
There can also be empty div tags and there can be a special case where the key and the value for the dictionary can be within the same div (see the last element).
EDIT:
The number of elements can change, so it would be best to use the <strong> tags as starting point and then search for the next <p> tag.
This is code that I was able to write using BeautifulSoup:
generalinfo = documentSoup.findAll("div", {"class": "general-info"})
if generalinfo:
    strongs = generalinfo[0].find_all('strong')
    for descr in strongs:
        p = descr.find_next_sibling("p")
        if p:
            key = descr.text.strip().rstrip(':')
            details_dict[key] = p.text.strip()
        else:
            nextdiv = descr.parent.parent.find_next_sibling("div")
            if nextdiv:
                child = nextdiv.findChild()
                if child:
                    key = descr.text.strip()[:-1]
                    details_dict[key] = child.text.strip()
I am aiming for the following output:
{'Title:': 'This is a title',
 'Owner:': 'This is an owner',
 'Category:': 'This is a category',
 'Type:': 'This is a type',
 'Special case:': 'This is a special case'}
I would appreciate any help!
html code:
<body>
<main>
<div>
...
<div class="general-info margin-bottom-20 margin-top-20">
<div class="row padding-x-20">
<div class="col-sm-4">
<p class="margin-0">
<strong>Title:</strong>
</p>
</div>
<div class="col-sm-8">
<p class="margin-0">This is a title</p>
</div>
</div>
<div class="row padding-x-20">
<div class="col-sm-4">
<p class="margin-0">
<strong>Owner:</strong>
</p>
</div>
<div class="col-sm-8">
<p class="margin-0">This is an owner</p>
</div>
</div>
<h2 class="h3 margin-top-10 margin-bottom-10 padding-x-20">Validity</h2>
<div class="row padding-x-20">
<div class="col-sm-4">
<p class="margin-0">
<strong>Category:</strong>
</p>
</div>
<div class="col-sm-8">
<p class="margin-0">This is a category</p>
</div>
</div>
<div class="row padding-x-40"></div>
<div class="row padding-x-20">
<div class="col-sm-4">
<p class="margin-0">
<strong>Type:</strong>
</p>
</div>
<div class="col-sm-8">
<p class="margin-0">This is a type</p>
</div>
</div>
<div class="row padding-x-40">
<div>
<strong>Special case:</strong>
<p>This is a special case</p>
</div>
</div>
</div>
...

I believe this is about as generalized as I can get given the html provided:
general_info = doc.xpath("//div[contains(concat(' ', normalize-space(@class), ' '), 'general-info')]//p[@class='margin-0']")
for i in general_info:
    # a <p> holding a <strong> gives the key for the <p> that follows it
    if len(i.xpath('./strong/text()')) > 0:
        topic = i.xpath('./strong/text()')[0]
    # a <p> with text of its own gives the value
    if len((i.text or '').strip()) > 0:
        print(topic + ' ' + i.text.replace('\n', '').strip())
# the special case keeps <strong> and <p> side by side in one div
special = general_info[0].xpath('./ancestor::div[@class="general-info margin-bottom-20 margin-top-20"]//div/div/strong')[0]
print(special.text + " ", special.xpath('./following-sibling::p/text()')[0])
Output:
('Title: This is a title',
'Owner: This is an owner',
'Category: This is a category',
'Type: This is a type',
'Special case: This is a special case')
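If a dictionary is needed, as in the question, the same idea can be sketched by walking the rows and pairing each strong text with the first p in that row that has text of its own. The sketch below uses the standard-library xml.etree.ElementTree on a shortened, well-formed copy of the snippet; both are assumptions made for illustration, and pages that are not well-formed would need lxml.html or BeautifulSoup instead.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed copy of the general-info block (assumption).
HTML = """
<div class="general-info margin-bottom-20 margin-top-20">
  <div class="row padding-x-20">
    <div class="col-sm-4"><p class="margin-0"><strong>Title:</strong></p></div>
    <div class="col-sm-8"><p class="margin-0">This is a title</p></div>
  </div>
  <div class="row padding-x-20">
    <div class="col-sm-4"><p class="margin-0"><strong>Owner:</strong></p></div>
    <div class="col-sm-8"><p class="margin-0">This is an owner</p></div>
  </div>
  <h2>Validity</h2>
  <div class="row padding-x-40"></div>
  <div class="row padding-x-40">
    <div>
      <strong>Special case:</strong>
      <p>This is a special case</p>
    </div>
  </div>
</div>
"""

root = ET.fromstring(HTML)
details = {}
for row in root.findall("div"):
    strong = row.find(".//strong")
    if strong is None:
        # Empty rows carry no key, so there is nothing to record.
        continue
    key = strong.text.strip()
    # The value is the first <p> in the row that has text of its own;
    # the <p> that only wraps the <strong> has none and is skipped.
    for p in row.iter("p"):
        text = (p.text or "").strip()
        if text:
            details[key] = text
            break

print(details)
```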

I recommend another solution, which is well suited to extracting data from this kind of markup.
from simplified_scrapy.spider import SimplifiedDoc
html='''
<body>
<main>
<div class="general-info margin-bottom-20 margin-top-20">
<div class="row padding-x-20">
<div class="col-sm-4">
<p class="margin-0">
<strong>Title:</strong>
</p>
</div>
<div class="col-sm-8">
<p class="margin-0">This is a title</p>
</div>
</div>
<div class="row padding-x-20">
<div class="col-sm-4">
<p class="margin-0">
<strong>Owner:</strong>
</p>
</div>
<div class="col-sm-8">
<p class="margin-0">This is an owner</p>
</div>
</div>
<h2 class="h3 margin-top-10 margin-bottom-10 padding-x-20">Validity</h2>
<div class="row padding-x-20">
<div class="col-sm-4">
<p class="margin-0">
<strong>Category:</strong>
</p>
</div>
<div class="col-sm-8">
<p class="margin-0">This is a category</p>
</div>
</div>
<div class="row padding-x-40"></div>
<div class="row padding-x-20">
<div class="col-sm-4">
<p class="margin-0">
<strong>Type:</strong>
</p>
</div>
<div class="col-sm-8">
<p class="margin-0">This is a type</p>
</div>
</div>
<div class="row padding-x-40">
<div>
<strong>Special case:</strong>
<p>This is a special case</p>
</div>
</div>
</div>
'''
data = {}
doc = SimplifiedDoc(html)  # create doc
divs = doc.selects('div.general-info')
# First way
for div in divs:
    strongs = div.strongs
    for strong in strongs:
        p = strong.next
        if not p:
            p = strong.parent.next
        data[strong.text] = p.text
print(data)
data = {}
# Second way
for div in divs:
    ds = div.selects('strong|p>text()')
    for i in range(0, len(ds), 2):
        data[ds[i]] = ds[i+1]
print(data)
Result:
{'Title:': 'This is a title', 'Owner:': 'This is an owner', 'Category:': 'This is a category', 'Type:': 'This is a type', 'Special case:': 'This is a special case'}
{'Title:': 'This is a title', 'Owner:': 'This is an owner', 'Category:': 'This is a category', 'Type:': 'This is a type', 'Special case:': 'This is a special case'}
Here are more examples: https://github.com/yiyedata/simplified-scrapy-demo/blob/master/doc_examples/

Beautifulsoup access component inside?

This is the structure I get after using find() one time.
<component-thread-list :assignee-id="'44756'" :assignee-type="'professional'" :assignee-username="'bs-dangquanghuy'" :change-comment="false" :change-link="''" :email="'drdangquanghuy#gmail.com'" :is-linked-with-place="false" :is-staff="false" :tag-name="''" :tag-slug="''" :thread-create-share-button-showing="false" :verified="'True'" :view-name="'professional-detail'" assignee-name="Đặng Quang Huy" occupation="Bác sĩ" professional-name="">
<div class="loading-screen">
<div class="timeline-item no-margins">
<div class="animated-background facebook">
<div class="background-masker header-top"></div>
<div class="background-masker header-left"></div>
<div class="background-masker header-right"></div>
<div class="background-masker header-bottom"></div>
<div class="background-masker subheader-left"></div>
<div class="background-masker subheader-right"></div>
<div class="background-masker subheader-bottom"></div>
<div class="background-masker content-top"></div>
<div class="background-masker content-first-end"></div>
<div class="background-masker content-second-line"></div>
<div class="background-masker content-second-end"></div>
<div class="background-masker content-third-line"></div>
<div class="background-masker content-third-end"></div>
</div>
</div>
</div>
</component-thread-list>
How can I access the email address with Beautiful soup?

You can use this:
soup = BeautifulSoup(xml, 'lxml') # xml is the XML you've provided
email = soup.find('component-thread-list')[':email']
print(email)
# 'drdangquanghuy#gmail.com'
print(email[1:-1])
# drdangquanghuy#gmail.com
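For reference, the same attribute can also be read with only the standard library. The sketch below uses html.parser.HTMLParser on a shortened copy of the tag; the AttrGrabber class is a hypothetical helper written for this example, not part of any library. Here str.strip("'") removes the surrounding single quotes from the attribute value.

```python
from html.parser import HTMLParser

# Shortened copy of the tag from the question (assumption: most attributes
# and the nested loading-screen divs are left out).
SNIPPET = (
    "<component-thread-list :email=\"'drdangquanghuy#gmail.com'\" "
    ":is-staff=\"false\"></component-thread-list>"
)


class AttrGrabber(HTMLParser):
    """Hypothetical helper: keep the attributes of the first matching tag."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.found = None

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        if tag == self.tag and self.found is None:
            self.found = dict(attrs)


grabber = AttrGrabber("component-thread-list")
grabber.feed(SNIPPET)

# The value still carries its inner single quotes, so strip them.
email = grabber.found[":email"].strip("'")
print(email)
```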

How to find desired data within multiple div in beautifulsoup

This is the HTML code.
I am trying to select data within multiple div tags:
<div class="details-wrapper apps-secondary-color">
<div class="details-section metadata">
<div class="details-section-heading">
<div class="details-section-contents">
<div class="meta-info">
<div class="title">Updated</div>
<div class="content" itemprop="datePublished">March 7, 2016</div>
</div>
<div class="meta-info">
<div class="meta-info">
<div class="meta-info">
<div class="meta-info">
<div class="meta-info contains-text-link">
<div class="meta-info">
<div class="meta-info">
<div class="meta-info">
<div class="meta-info contains-text-link">
<div class="meta-info">
<div class="meta-info meta-info-wide">
<div class="details-sharing-section">
</div>
<div class="details-section-divider"></div>
</div>
</div>
</div>
I want to select "March 7, 2016".
How can I select this in BeautifulSoup?

You can use soup.find('div', {'itemprop': 'datePublished'}) to select the div element with itemprop datePublished.
Demo
from bs4 import BeautifulSoup
content = '''<div class="details-wrapper apps-secondary-color">
<div class="details-section metadata">
<div class="details-section-heading">
<div class="details-section-contents">
<div class="meta-info">
<div class="title">Updated</div>
<div class="content" itemprop="datePublished">March 7, 2016</div>
</div>
<div class="meta-info">
<div class="meta-info">
<div class="meta-info">
<div class="meta-info">
<div class="meta-info contains-text-link">
<div class="meta-info">
<div class="meta-info">
<div class="meta-info">
<div class="meta-info contains-text-link">
<div class="meta-info">
<div class="meta-info meta-info-wide">
<div class="details-sharing-section">
</div>
<div class="details-section-divider"></div>
</div>
</div>
</div>'''
soup = BeautifulSoup(content, 'html.parser')  # name the parser explicitly to avoid the "no parser specified" warning
date = soup.find('div', {'itemprop':'datePublished'})
print(date.text)
Output
March 7, 2016
