input suggestions webscraping selenium python - python

Input Element:
<input type="text" auto_complete_item_format="<b xmlns="none">[_full_name]</b>" auto_complete_display_field="_full_name" auto_complete_id_field="suburb_id" auto_complete_search_fields="_full_name,suburb" auto_complete_minimum_characters="1" auto_complete_data_source_web_method="" auto_complete_data_source_object="" auto_complete_data_source_function="get_suburb_suggestions" auto_complete_width="300px" onblur="smart_update(this);" onkeypress="smart_keypress(this, event); " onchange="smart_date_onchange(this); " selected_id="-1" original_value="" value="3000" smart_field_type="auto_complete" class="smart_auto_complete" id="suburb" name="suburb" field="jims.addresses.smart_table.suburb" record_id="-1" bind="True" auto_complete_cursor="2" autocomplete="off" fdprocessedid="5bithf" auto_complete_last_search_string="300" selected_row_index="-1" style="width: 300px;">
Suggestions List:
<div id="suburb_list" name="suburb_list" style="display: none; position: absolute; border: 1px solid rgb(0, 0, 0); z-index: 100; left: 533px; height: 400px; width: 300px; background-color: white; opacity: 0.93; overflow: auto;"><div id="suburb_list_item_0" list_index="0" row_index="0" style="border-width: 1px; border-color: rgb(170, 170, 170); border-style: solid; background-color: rgb(221, 221, 221); padding: 3px;"><b xmlns="none">150 Lonsdale Street, Melbourne 3000 VIC</b></div><div id="suburb_list_item_1" list_index="1" row_index="1" style="border-width: 1px; border-color: rgb(170, 170, 170); border-style: solid; background-color: rgb(255, 255, 255); padding: 3px;"><b xmlns="none">1 Elizabeth Street, Melbourne 3000 VIC</b></div><div id="suburb_list_item_2" list_index="2" row_index="4" style="border-width: 1px; border-color: rgb(170, 170, 170); border-style: solid; background-color: lightblue; padding: 3px;"><b xmlns="none">Carlton 3000 VIC</b></div><div id="suburb_list_item_3" list_index="3" row_index="5" style="border-width: 1px; border-color: rgb(170, 170, 170); border-style: solid; background-color: rgb(255, 255, 255); padding: 3px;"><b xmlns="none">Docklands 3000 VIC</b></div><div id="suburb_list_item_4" list_index="4" row_index="10" style="border-width: 1px; border-color: rgb(170, 170, 170); border-style: solid; background-color: rgb(221, 221, 221); padding: 3px;"><b xmlns="none">East Melbourne 3000 VIC</b></div><div id="suburb_list_item_5" list_index="5" row_index="12" style="border-width: 1px; border-color: rgb(170, 170, 170); border-style: solid; background-color: rgb(255, 255, 255); padding: 3px;"><b xmlns="none">Footscray 3000 VIC</b></div><div id="suburb_list_item_6" list_index="6" row_index="15" style="border-width: 1px; border-color: rgb(170, 170, 170); border-style: solid; background-color: rgb(221, 221, 221); padding: 3px;"><b xmlns="none">Melbourne 3000 VIC</b></div><div id="suburb_list_item_7" list_index="7" row_index="19" style="border-width: 
1px; border-color: rgb(170, 170, 170); border-style: solid; background-color: rgb(255, 255, 255); padding: 3px;"><b xmlns="none">Roxburgh Park 3000 VIC</b></div><div id="suburb_list_item_8" list_index="8" row_index="28" style="border-width: 1px; border-color: rgb(170, 170, 170); border-style: solid; background-color: rgb(221, 221, 221); padding: 3px;"><b xmlns="none">West Melbourne 3000 VIC</b></div></div>
My Code:
for num in range(30):
    try:
        Element = browser.find_element(By.XPATH, f'//*[@id="suburb_list_item_{num}"]')
        print(Element)
    except:
        break
The two HTML snippets above are the input box and its suggestion list, which I am trying to scrape from the website. The Python code is what I used, and it should print every suggestion element. It prints nothing, because the suggestion list only appears once the text box is clicked. How can I click the text box? This did not work:
Element = browser.find_element(By.XPATH, '//*[@id="suburb"]')
The element could not be found.
I am not allowed to share the website link.

You can try:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# <insert other code here for setting up the browser>
# Create the wait object first; the original order used it before it was defined.
wait = WebDriverWait(browser, 10)
input_element = wait.until(EC.visibility_of_element_located((By.ID, "suburb")))
input_element.click()
element = wait.until(EC.visibility_of_element_located((By.ID, "suburb_list")))
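Once the list is visible, the individual suggestions can be read from its child div elements. The following is only a sketch: the helper name read_suggestions is hypothetical, and it assumes the suggestions are direct div children of the suburb_list container, as in the HTML from the question. The locator strategy is passed as the plain string "css selector", which is the value of Selenium's By.CSS_SELECTOR, so the helper needs no imports of its own.

```python
def read_suggestions(driver, list_id="suburb_list"):
    """Return the visible text of each suggestion under the list container.

    `driver` is assumed to be a Selenium WebDriver (or anything exposing a
    compatible find_elements method); `list_id` defaults to the id seen in
    the question's HTML.
    """
    # "css selector" is the string value of selenium's By.CSS_SELECTOR.
    items = driver.find_elements("css selector", f"#{list_id} > div")
    return [item.text.strip() for item in items]
```

Call it as read_suggestions(browser) after the wait on suburb_list has succeeded. Selenium's .text is empty for hidden elements, so if the list is still set to display: none, item.get_attribute("textContent") may be a better fit.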

Related

How to remove the container background color from QScrollBar?

Is there a way to remove the container background (seen in the attached picture as white around the rounded top and bottom of the scroll bar)?
Here's my stylesheet:
QScrollBar:horizontal, QScrollBar:vertical
{border: none; border-radius: 0.5em; background-color: #bfbfbf; margin: 0px; padding: 0px;}
QScrollBar:horizontal
{height: 1em;}
QScrollBar:vertical
{width: 1em;}
QScrollBar::handle:horizontal, QScrollBar::handle:vertical
{background-color: #404040; border-radius: 0.5em;}
QScrollBar::handle:horizontal, QScrollBar::handle:vertical
{min-width: 1em; min-height: 1em;}
QScrollBar::add-line:horizontal, QScrollBar::sub-line:horizontal, QScrollBar::add-page:horizontal,
QScrollBar::sub-page:horizontal
{width: 0px;}
QScrollBar::sub-line:vertical, QScrollBar::add-line:vertical, QScrollBar::add-page:vertical,
QScrollBar::sub-page:vertical
{height: 0px;}

Adding scroll to the final output view using ipywidgets

I am trying to create an app based on ipywidgets, but the scrolling does not behave as I expect. A scroll bar appears on every output element, whereas I want a single scroll bar that appears only when the content of any output element exceeds the available height. It should also not cut off the content of any output.
%%html
<style>
.out1{
width: 700px;
height: 400px;
display: flex;
justify-content: center;
padding: 10px 10px 10px 10px;
border: 1px solid black;
}
.out2{
width: 350px;
height: 400px;
display: flex;
justify-content: center;
padding: 20px 5px 10px 10px;
border: 1px solid black;
}
.out3{
width: 350px;
height: 400px;
display: flex;
justify-content: center;
padding: 20px 5px 10px 10px;
border: 1px solid black;
}
.out4{
width: 700px;
height: 400px;
display: flex;
justify-content: center;
padding: 20px 10px 0px 10px;
border: 1px solid black;
}
.out_box{
border: 1px solid black;
height: 800px;
width: 700px;
overflow: auto;
}
</style>
import ipywidgets as widgets
out1=widgets.Output()
out1.add_class('out1')
out2=widgets.Output()
out2.add_class('out2')
out3=widgets.Output()
out3.add_class('out3')
out4=widgets.Output()
out4.add_class('out4')
box1=widgets.Box([out1], scroll=False)
box2=widgets.HBox([out2,out3], scroll=False)
box3=widgets.Box([out4], scroll=False)
out_box=widgets.VBox([box1,box2,box3])
out_box.add_class('out_box')
Attached is a snapshot of what I am getting: multiple scroll bars. I want a single scroll bar at the right edge.

How to extract text and remove HTML tags in another column

I have a free-text column in a pandas DataFrame that contains HTML tags.
ID Free text field
1 <p><span style="background-color: rgb(255, 255, 255); color: rgb(37, 36, 35); font-family:
Arial; font-size: 10.5pt;">TExt1:</span></p><p><span style="background-color: rgb(255, 255,
255); color: rgb(37, 36, 35); font-family: Arial; font-size: 10.5pt;">Score: 5</span></p><p>
<span style="background-color: rgb(255, 255, 255); color: rgb(37, 36, 35); font-family: Arial;
font-size: 10.5pt;">B - </span><span style="background-color: rgb(255, 255, 255); color:
rgb(36, 36, 36); font-family: Arial; font-size: 10.5pt;">TExt2</span></p><p><span
style="background-color: rgb(255, 255, 255); color: rgb(37, 36, 35); font-family: Arial;
font-size: 10.5pt;">Text6</span></p><p><span style="background-color: rgb(255, 255, 255);
color: rgb(37, 36, 35); font-family: Arial; font-size: 10.5pt;">Text3</span></p><p><span
style="background-color: rgb(255, 255, 255); color: rgb(37, 36, 35); font-family: Arial;
font-size: 10.5pt;">Text4</span></p>
2 <p>Text10</p>
3 <p>Sky is blue</p>
4 <p>Text3</p><p><br></p><p>Text19</p>
5 <p> Complaint1</p><p><br></p><p>Text1</p><p>hospo 2</p><p>Tes45</p><p><br></p><p>test</p>
6 <p>Test44</p>
7 <p>Test54</p>
I tried this:
from bs4 import BeautifulSoup
df['free text'].apply(
    lambda x: list(BeautifulSoup(x, "html.parser").stripped_strings)
)
but I get this error:
TypeError: object of type 'NoneType' has no len()
What am I doing incorrectly?
Any help would be appreciated.
Thanks
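Remove all missing data with 'df.dropna()' before applying the lambda function. BeautifulSoup cannot parse a missing value: a None raises 'TypeError: object of type 'NoneType' has no len()', and a NaN (which is a float) raises 'TypeError: object of type 'float' has no len()'. Note that 'df.dropna()' returns a new data frame by default, so assign the result back.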
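If dropping rows is not an option, another way is to guard against missing values inside the function that is applied. The sketch below swaps BeautifulSoup for the standard library's html.parser so that it runs without third-party packages; the names _TextCollector and strip_tags are made up for this example.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect non-empty text fragments and ignore all tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.parts.append(text)


def strip_tags(value):
    """Return the text fragments of an HTML string, or [] for missing values."""
    # None and float NaN (how pandas usually stores missing text) are not str.
    if not isinstance(value, str):
        return []
    collector = _TextCollector()
    collector.feed(value)
    collector.close()
    return collector.parts
```

Assuming the column is named 'free text' as in the question, this would be used as df['free text'].apply(strip_tags), and rows with missing data simply yield an empty list.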

Remove everything after tag in BeautifulSoup

I am extracting plaintext from HTML emails using BeautifulSoup. I've got everything working nicely except for one issue. My emails often have replies included below the message at the top. So I have threaded emails, and I end up capturing the same text repeatedly. In most cases, I want to just get rid of everything after the first <div> tag I find. If I print, soup.contents, it outputs the following:
p
None
p
None
p
None
p
None
p
None
p
None
p
None
p
None
p
None
p
None
p
None
div
None
meta
None
style
None
div
None
p
I am looking to return a BeautifulSoup object with everything from the first div tag onward removed.
HTMLwise, here's the before and after I'm going for:
Before:
<p> Hi Joe </p>
<p> I will be at the meeting tonight</p>
<p> Allison </p>
<div style='border-width: 1pt medium medium; border-style: solid none none; border-color: rgb(181, 196, 223) currentColor currentColor; font-family: "Arial","sans-serif";'>
<p style="margin: 2px 0px; padding: 0px; color: rgb(34, 34, 34); font-family: Arial; font-size: 10pt; background-color: rgb(255, 255, 255);">
<b>From: </b>John Doe <jdoe@example.com></p>
<p style="margin: 2px 0px; padding: 0px; color: rgb(34, 34, 34); font-family: Arial; font-size: 10pt; background-color: rgb(255, 255, 255);">
<b>Sent: </b>Wednesday, May 30, 2018 6:48 AM</p>
<p style="margin: 2px 0px; padding: 0px; color: rgb(34, 34, 34); font-family: Arial; font-size: 10pt; background-color: rgb(255, 255, 255);">
<b>To: </b>Allison <allison@example.com></p>
<p style="margin: 2px 0px; padding: 0px; color: rgb(34, 34, 34); font-family: Arial; font-size: 10pt; background-color: rgb(255, 255, 255);">
<b>Subject: </b>RE: meeting tonight</p>
<p style="margin: 2px 0px; padding: 0px; color: rgb(34, 34, 34); font-family: Arial; font-size: 10pt; background-color: rgb(255, 255, 255);">
</p>
</div>
<p>Will you be at the meeting tonight?</p>
After:
<p> Hi Joe </p>
<p> I will be at the meeting tonight</p>
<p> Allison </p>
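In BeautifulSoup4, you can use the find_all_next method to reach every element after the tag and empty it with clear(). Note that clear() removes an element's contents, not the element itself, so the div and the emptied tags after it stay in the tree. This only works when the content that follows is wrapped in its own elements; bare text that belongs directly to the body element is not returned by find_all_next().
target = soup.find('div')
for e in target.find_all_next():
    e.clear()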
The easiest way in this case is to run a regular expression that removes everything from the first <div> tag onward:
s = """<p> Hi Joe </p>
<p> I will be at the meeting tonight</p>
<p> Allison </p>
<div style='border-width: 1pt medium medium; border-style: solid none none; border-color: rgb(181, 196, 223) currentColor currentColor; font-family: "Arial","sans-serif";'>
<p style="margin: 2px 0px; padding: 0px; color: rgb(34, 34, 34); font-family: Arial; font-size: 10pt; background-color: rgb(255, 255, 255);">
<b>From: </b>John Doe <jdoe@example.com></p>
<p style="margin: 2px 0px; padding: 0px; color: rgb(34, 34, 34); font-family: Arial; font-size: 10pt; background-color: rgb(255, 255, 255);">
<b>Sent: </b>Wednesday, May 30, 2018 6:48 AM</p>
<p style="margin: 2px 0px; padding: 0px; color: rgb(34, 34, 34); font-family: Arial; font-size: 10pt; background-color: rgb(255, 255, 255);">
<b>To: </b>Allison <allison@example.com></p>
<p style="margin: 2px 0px; padding: 0px; color: rgb(34, 34, 34); font-family: Arial; font-size: 10pt; background-color: rgb(255, 255, 255);">
<b>Subject: </b>RE: meeting tonight</p>
<p style="margin: 2px 0px; padding: 0px; color: rgb(34, 34, 34); font-family: Arial; font-size: 10pt; background-color: rgb(255, 255, 255);">
</p>
</div>
<p>Will you be at the meeting tonight?</p>"""
import re
new_s = re.sub(r'<div.*', '', s, flags=re.DOTALL).strip()
print(new_s)
Prints:
<p> Hi Joe </p>
<p> I will be at the meeting tonight</p>
<p> Allison </p>
Then you can feed this new string to BeautifulSoup:
from bs4 import BeautifulSoup
soup = BeautifulSoup(new_s, 'lxml')
print(soup.prettify())
Outputs:
<html>
<body>
<p>
Hi Joe
</p>
<p>
I will be at the meeting tonight
</p>
<p>
Allison
</p>
</body>
</html>
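The regular expression above matches a lowercase <div only, and it would also match a tag whose name merely starts with "div". A slightly more defensive variant of the same idea might look like this; the function name before_first_div is made up for the example, and it still assumes the quoted reply always starts at the first div.

```python
import re

# \b keeps the pattern from matching longer tag names that start with "div";
# IGNORECASE also catches <DIV>, which some mail clients emit.
_DIV_OPEN = re.compile(r"<div\b", flags=re.IGNORECASE)


def before_first_div(html):
    """Return the markup that precedes the first <div> tag, stripped."""
    match = _DIV_OPEN.search(html)
    if match is None:
        # No quoted reply found: keep the whole message.
        return html.strip()
    return html[:match.start()].strip()
```

The returned string can then be passed to BeautifulSoup in the same way as new_s above.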

Vertical scrollbar doesn't go fully down

I have a QScrollArea that I scroll down automatically, but it does not work completely: it only scrolls about halfway. What might be wrong with my code? I use the vertical scroll bar. Maybe there is a problem with setSizePolicy?
code:
import sys
from PySide import QtGui, QtCore
class Window(QtGui.QMainWindow):
    def __init__(self):
        super(Window, self).__init__()
        self.setStyleSheet("QMainWindow {background: #010206;}")
        self.inicar()
    def inicar(self):
        groupBox = QtGui.QGroupBox('', self)
        groupBox.setGeometry(100, 100, 500, 340)
        groupBox.setStyleSheet('QGroupBox {border: 2px solid #101010; border-radius: 3px;}')
        label = QtGui.QLabel(self)
        label.setText('prueba asda asd asdad asdasdadadasd prueba asda asd asdad asdasdadadasd')
        label.setWordWrap(True)
        label.setStyleSheet('QLabel {color: #6E6E6E;float: left;border: 2px solid #071918; border-radius: 3px; padding-top: 2px; padding-bottom: 2px; padding-left: 3px; padding-right: 2px;}')
        label2 = QtGui.QLabel(self)
        label2.setText('hola que tal asdasdasd asd asdasdad asdas')
        label2.setWordWrap(True)
        label2.setStyleSheet('QLabel {color: #2F6120;float: right; border: 2px solid #071918; border-radius: 3px; padding-top: 2px; padding-bottom: 2px; padding-left: 3px; padding-right: 2px;}')
        label3 = QtGui.QLabel(self)
        label3.setText('prueba')
        label3.setWordWrap(True)
        label3.setStyleSheet('QLabel {color: #6E6E6E;float: left;border: 2px solid #071918; border-radius: 3px; padding-top: 2px; padding-bottom: 2px; padding-left: 3px; padding-right: 2px;}')
        label4 = QtGui.QLabel(self)
        label4.setText('hola que tal asdasdasd asd asdasdad asdas')
        label4.setWordWrap(True)
        label4.setStyleSheet('QLabel {color: #2F6120;float: right; border: 2px solid #071918; border-radius: 3px; padding-top: 2px; padding-bottom: 2px; padding-left: 3px; padding-right: 2px;}')
        label5 = QtGui.QLabel(self)
        label5.setText('hola que tal asdasdasd asd asdasdad asdas')
        label5.setWordWrap(True)
        label5.setStyleSheet('QLabel {color: #2F6120;float: right; border: 2px solid #071918; border-radius: 3px; padding-top: 2px; padding-bottom: 2px; padding-left: 3px; padding-right: 2px;}')
        label6 = QtGui.QLabel(self)
        label6.setText('hola que tal asdasdasd asd asdasdad asdas')
        label6.setWordWrap(True)
        label6.setStyleSheet('QLabel {color: #2F6120;float: right; border: 2px solid #071918; border-radius: 3px; padding-top: 2px; padding-bottom: 2px; padding-left: 3px; padding-right: 2px;}')
        label7 = QtGui.QLabel(self)
        label7.setText('prueba asda asd asdad asdasdadadasd')
        label7.setWordWrap(True)
        label7.setStyleSheet('QLabel {color: #6E6E6E;float: left;border: 2px solid #071918; border-radius: 3px; padding-top: 2px; padding-bottom: 2px; padding-left: 3px; padding-right: 2px;}')
        label8 = QtGui.QLabel(self)
        label8.setText('prueba asda asd asdad asdasdadadasd prueba asda asd asdad asdasdadadasd')
        label8.setWordWrap(True)
        label8.setStyleSheet('QLabel {color: #6E6E6E;float: left;border: 2px solid #071918; border-radius: 3px; padding-top: 2px; padding-bottom: 2px; padding-left: 3px; padding-right: 2px;}')
        label9 = QtGui.QLabel(self)
        label9.setText('hola que tal asdasdasd asd asdasdad asdas')
        label9.setWordWrap(True)
        label9.setStyleSheet('QLabel {color: #2F6120;float: right; border: 2px solid #071918; border-radius: 3px; padding-top: 2px; padding-bottom: 2px; padding-left: 3px; padding-right: 2px;}')
        label10 = QtGui.QLabel(self)
        label10.setText('prueba asda asd asdad asdasdadadasd')
        label10.setWordWrap(True)
        label10.setStyleSheet('QLabel {color: #6E6E6E;border: 2px solid #071918; border-radius: 3px; padding-top: 2px; padding-bottom: 2px; padding-left: 3px; padding-right: 2px;}')
        label11 = QtGui.QLabel(self)
        label11.setText('hola que tal asdasdasd asd asdasdad asdas')
        label11.setWordWrap(True)
        label11.setStyleSheet('QLabel {color: #2F6120; border: 2px solid #071918; border-radius: 3px; padding-top: 2px; padding-bottom: 2px; padding-left: 3px; padding-right: 2px;}')
        label12 = QtGui.QLabel(self)
        label12.setText('hola que tal asdasdasd asd asdasdad asdas')
        label12.setWordWrap(True)
        label12.setStyleSheet('QLabel {color: #2F6120;float: right; border: 2px solid #071918; border-radius: 3px; padding-top: 2px; padding-bottom: 2px; padding-left: 3px; padding-right: 2px;}')
        label13 = QtGui.QLabel(self)
        label13.setText('hola que tal asdasdasd asd asdasdad asdas')
        label13.setWordWrap(True)
        label13.setStyleSheet('QLabel {color: #2F6120;float: right; border: 2px solid #071918; border-radius: 3px; padding-top: 2px; padding-bottom: 2px; padding-left: 3px; padding-right: 2px;}')
        label14 = QtGui.QLabel(self)
        label14.setText('prueba asda asd asdad asdasdadadasd')
        label14.setWordWrap(True)
        label14.setStyleSheet('QLabel {color: #6E6E6E;float: left;border: 2px solid #071918; border-radius: 3px; padding-top: 2px; padding-bottom: 2px; padding-left: 3px; padding-right: 2px;}')
        vbox = QtGui.QVBoxLayout()
        vbox.addWidget(label)
        vbox.addWidget(label2)
        vbox.addWidget(label3)
        vbox.addWidget(label4)
        vbox.addWidget(label5)
        vbox.addWidget(label6)
        vbox.addWidget(label7)
        vbox.addWidget(label8)
        vbox.addWidget(label9)
        vbox.addWidget(label10)
        vbox.addWidget(label11)
        vbox.addWidget(label12)
        vbox.addWidget(label13)
        vbox.addWidget(label14)
        vbox.addStretch(1)
        groupBox.setLayout(vbox)
        scroll = QtGui.QScrollArea(self)
        scroll.setVerticalScrollBarPolicy(QtCore.Qt.ScrollBarAlwaysOn)
        scroll.setHorizontalScrollBarPolicy(QtCore.Qt.ScrollBarAlwaysOff)
        scroll.setWidgetResizable(True)
        scroll.setGeometry(100, 80, 500, 340)
        scroll.setStyleSheet('QScrollArea {background: #010206; border: 2px solid #101010; border-radius: 3px;}')
        scroll.setWidget(groupBox)
        label.setSizePolicy(QtGui.QSizePolicy.Fixed, QtGui.QSizePolicy.Fixed)
        label2.setSizePolicy(QtGui.QSizePolicy.Fixed, QtGui.QSizePolicy.Fixed)
        label3.setSizePolicy(QtGui.QSizePolicy.Fixed, QtGui.QSizePolicy.Fixed)
        label4.setSizePolicy(QtGui.QSizePolicy.Fixed, QtGui.QSizePolicy.Fixed)
        label5.setSizePolicy(QtGui.QSizePolicy.Fixed, QtGui.QSizePolicy.Fixed)
        label6.setSizePolicy(QtGui.QSizePolicy.Fixed, QtGui.QSizePolicy.Fixed)
        label7.setSizePolicy(QtGui.QSizePolicy.Fixed, QtGui.QSizePolicy.Fixed)
        label8.setSizePolicy(QtGui.QSizePolicy.Fixed, QtGui.QSizePolicy.Fixed)
        label9.setSizePolicy(QtGui.QSizePolicy.Fixed, QtGui.QSizePolicy.Fixed)
        label10.setSizePolicy(QtGui.QSizePolicy.Fixed, QtGui.QSizePolicy.Fixed)
        label11.setSizePolicy(QtGui.QSizePolicy.Fixed, QtGui.QSizePolicy.Fixed)
        label12.setSizePolicy(QtGui.QSizePolicy.Fixed, QtGui.QSizePolicy.Fixed)
        label13.setSizePolicy(QtGui.QSizePolicy.Fixed, QtGui.QSizePolicy.Fixed)
        label14.setSizePolicy(QtGui.QSizePolicy.Fixed, QtGui.QSizePolicy.Fixed)
        maximumY = scroll.verticalScrollBar().maximum()
        scroll.verticalScrollBar().setSliderPosition(maximumY)
        self.setGeometry(350, 145, 730, 500)
        self.setFixedSize(730, 500)
        self.setWindowTitle('Mensajes')
        self.show()
if __name__ == '__main__':
    app = QtGui.QApplication(sys.argv)
    w = Window()
    sys.exit(app.exec_())
The geometry of the widgets has not been initialized at that point, so the scroll bar's maximum is not final yet. Wait until the window has been shown, then set the slider position. In your code, that means after the call to self.show():
        .
        .
        label14.setSizePolicy(QtGui.QSizePolicy.Fixed, QtGui.QSizePolicy.Fixed)
        self.setGeometry(350, 145, 730, 500)
        self.setFixedSize(730, 500)
        self.setWindowTitle('Mensajes')
        self.show()
        maximumY = scroll.verticalScrollBar().maximum()
        scroll.verticalScrollBar().setSliderPosition(maximumY)
Alternatively, you can create a custom QtGui.QScrollArea that scrolls to the bottom automatically whenever its size changes, by overriding the method QAbstractScrollArea.resizeEvent(self, QResizeEvent):
class QScrollEndArea(QtGui.QScrollArea):
    def resizeEvent(self, eventQResizeEvent):
        QtGui.QScrollArea.resizeEvent(self, eventQResizeEvent)
        self.verticalScrollBar().setSliderPosition(self.verticalScrollBar().maximum())
Then replace QtGui.QScrollArea with the custom QScrollEndArea in your code. The benefit is that your main widget no longer has to set the scroll position; the scroll area handles it itself:
scroll = QScrollEndArea(self)
