Python: keep open browser in pyppeteer and create CDPSession - python

I've got two issues that I can't solve at the moment.
1. I would like to keep the browser running so that I can reconnect later with pyppeteer.launcher.connect(), but the browser seems to close immediately even though I never call pyppeteer.browser.Browser.close().
test01.py:
import asyncio
from pyppeteer import launch, connect
async def fetch():
    browser = await launch(
        headless=False,
        args=['--no-sandbox']
    )
    print(f'Endpoint: {browser.wsEndpoint}')
    await browser.disconnect()
loop = asyncio.get_event_loop()
loop.run_until_complete(fetch())
$ python test01.py
Endpoint: ws://127.0.0.1:51757/devtools/browser/00e917a9-c031-499a-a8ee-ca4090ebd3fe
$ curl -i -N -H "Connection: Upgrade" -H "Upgrade: websocket" http://127.0.0.1:51757
curl: (7) Failed to connect to 127.0.0.1 port 51757: Connection refused
2. How do I create a CDP session? This code should open another browser window, but it doesn't work as expected:
test02.py:
import asyncio
import time
from pyppeteer import launch, connect
async def fetch():
    browser = await launch(
        headless=False,
        args=['--no-sandbox']
    )
    page = await browser.newPage()
    cdp = await page.target.createCDPSession()
    await cdp.send('Target.createBrowserContext')
    time.sleep(5)
    await browser.disconnect()
loop = asyncio.get_event_loop()
loop.run_until_complete(fetch())
$ python test02.py
Future exception was never retrieved
future: <Future finished exception=NetworkError('Protocol error Target.sendMessageToTarget: Target closed.',)>
pyppeteer.errors.NetworkError: Protocol error Target.sendMessageToTarget: Target closed.

How to keep the browser running
You just need to pass the autoClose flag to launch(). From the docs:
autoClose (bool): Automatically close browser process when script
completed. Defaults to True.
In this case your test01.py would look as follows:
import asyncio
from pyppeteer import launch, connect
async def fetch():
    browser = await launch(
        headless=False,
        args=['--no-sandbox'],
        autoClose=False
    )
    print(f'Endpoint: {browser.wsEndpoint}')
    await browser.disconnect()
loop = asyncio.get_event_loop()
loop.run_until_complete(fetch())
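To reconnect later, the printed endpoint can be passed back to connect(). The following is a minimal sketch, assuming the browser launched with autoClose=False is still running; the helper name count_open_pages and its ws_endpoint parameter are made up for illustration and are not part of pyppeteer.

```python
import asyncio


async def count_open_pages(ws_endpoint: str) -> int:
    """Attach to an already running browser and report how many pages it has."""
    # Imported inside the function so the helper can be defined even on a
    # machine where pyppeteer is not installed yet.
    from pyppeteer import connect

    browser = await connect(browserWSEndpoint=ws_endpoint)
    try:
        pages = await browser.pages()
        return len(pages)
    finally:
        # disconnect() only drops the WebSocket connection;
        # the browser process itself keeps running.
        await browser.disconnect()


# Usage, with the endpoint printed by test01.py (placeholder shown here):
# asyncio.run(count_open_pages('ws://127.0.0.1:<port>/devtools/browser/<id>'))
```

Because the endpoint changes on every launch, it has to be stored somewhere (a file, an environment variable) between the two scripts.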
CDP session
Here it is:
import asyncio
import time
from pprint import pprint
from pyppeteer import launch, connect
from pyppeteer.browser import BrowserContext
async def fetch():
    browser = await launch(
        headless=False,
        args=['--no-sandbox'],
        autoClose=False
    )
    page = await browser.newPage()
    cdp = await page.target.createCDPSession()
    raw_context = await cdp.send('Target.createBrowserContext')
    pprint(raw_context)
    context = BrowserContext(browser, raw_context['browserContextId'])
    new_page = await context.newPage()
    await cdp.detach()
    await browser.disconnect()
loop = asyncio.get_event_loop()
loop.run_until_complete(fetch())
Inspired by Browser.createIncognitoBrowserContext from pyppeteer itself.
Note that creating additional browser contexts through raw CDP calls is probably not a great idea, because browser._contexts won't be updated and will become inconsistent. Browser.createIncognitoBrowserContext will likely fit your needs without resorting to CDP at all.
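For comparison, here is a rough sketch of the same idea using Browser.createIncognitoBrowserContext instead of a raw CDP call. The helper name open_isolated_page and its url parameter are illustrative assumptions; the browser is launched headless here only to keep the example short.

```python
import asyncio


async def open_isolated_page(url: str) -> str:
    """Open url in a fresh incognito context and return the page title."""
    # Imported lazily so the helper can be defined without pyppeteer installed.
    from pyppeteer import launch

    browser = await launch(headless=True, args=['--no-sandbox'])
    try:
        # pyppeteer creates the context and tracks it on the browser object,
        # so no manual bookkeeping is needed.
        context = await browser.createIncognitoBrowserContext()
        page = await context.newPage()
        await page.goto(url)
        title = await page.title()
        await context.close()  # closes every page that belongs to the context
        return title
    finally:
        await browser.close()


# Usage:
# print(asyncio.run(open_isolated_page('https://www.wikipedia.org/')))
```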

Related

What does an Ubuntu server need to run Python Playwright?

My scraping script works well on Windows 10, so I am trying to use a VPS (from Contabo) to keep file.py running 24/7. The server runs Ubuntu 22.04.1 without a GUI (CLI only). Both machines have Python 3.10.6 and Playwright 1.27.1.
Tried: xvfb-run with --auto-servernum --server-num=1 --server-args='-screen 0, 1920x1080x24' and without it; also tried import os; os.environ['DISPLAY'] = ':1'.
Minimal reproducible example:
from playwright.async_api import async_playwright
import asyncio
async def run(browser):
    context = await browser.new_context(viewport={'width':1500, 'height':1000})
    page = await context.new_page()
    await page.goto("https://www.pixbetrei.com/", timeout=60000)
    await page.get_by_role("button", name="Entrar").click(timeout=60000)
    await context.close()
async def start():
    async with async_playwright() as playwright:
        browser = await playwright.firefox.launch(headless = True)
        tasks = [run(browser)]
        await asyncio.gather(*tasks)
        await browser.close()
asyncio.run(start())
Traceback:
playwright._impl._api_types.TimeoutError: Timeout 60000ms exceeded.
=========================== logs ===========================
waiting for selector "role=button[name="Entrar"i]"
============================================================

Pyppeteer connection closed after a minute

Good day everyone. I ran this code and it works well. The main purpose is to capture WebSocket traffic. The problem is that it stops after a minute or thereabouts. How can I fix this? I want it to stay alive indefinitely.
import asyncio
from pyppeteer import launch
async def main():
    browser = await launch(
        headless=True,
        args=['--no-sandbox'],
        autoClose=False
    )
    page = await browser.newPage()
    await page.goto('https://www.tradingview.com/symbols/BTCUSD/')
    cdp = await page.target.createCDPSession()
    await cdp.send('Network.enable')
    await cdp.send('Page.enable')
    def printResponse(response):
        print(response)
    cdp.on('Network.webSocketFrameReceived', printResponse) # Calls printResponse when a websocket is received
    cdp.on('Network.webSocketFrameSent', printResponse) # Calls printResponse when a websocket is sent
    await asyncio.sleep(100)
asyncio.get_event_loop().run_until_complete(main())
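One likely cause, assuming nothing else closes the page, is the fixed asyncio.sleep(100): main() returns after 100 seconds, the event loop stops, and the handlers are never called again. A possible way around it is to wait on an asyncio.Event instead of sleeping for a fixed time. The sketch below uses only the standard library; listen_until is a hypothetical name, and the place where the cdp.on(...) handlers would be registered is marked with a comment.

```python
import asyncio


async def listen_until(stop: asyncio.Event) -> str:
    """Keep the coroutine (and its event handlers) alive until stop is set."""
    # In the real script, the cdp.on(...) handlers would be registered here.
    await stop.wait()  # blocks without a deadline, unlike asyncio.sleep(100)
    return "stopped"


async def demo() -> str:
    stop = asyncio.Event()
    # For the demo only: request shutdown after 50 ms. A real script could
    # set the event from a signal handler, or never set it at all.
    asyncio.get_running_loop().call_later(0.05, stop.set)
    return await listen_until(stop)


result = asyncio.run(demo())
print(result)
```

If the event is never set, the script runs until it is interrupted, for example with Ctrl+C.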

How to use NewMessage event of telethon on google colab

I have this code and I want to run it on Google Colab. It works well on my PC, but on Colab I always get errors like these:
SyntaxError: 'async with' outside async function
or
RuntimeError: You must use "async with" if the event loop is running (i.e. you are inside an "async def")
Sometimes it does not wait for new messages and finishes after a single run.
import json
import time
import telethon as tlt
import asyncio
from telethon import events,TelegramClient
chat_name = "sample"
telegram_session="sample_1"
api_id = "0000000"
api_hash = ""
client = TelegramClient(None , api_id, api_hash)
#client.on(events.NewMessage(chats=chat_name))
async def handler(event):
    get_message = event.message.to_dict()
    get_message['date'] = get_message['date'].strftime("%Y-%m-%d %H:%M:%S")
    message_json = json.dumps(get_message)
    print(message_json)
async with client:
    client.run_until_disconnected()
You need to put async with inside of async def:
...
async def main():
    async with client:
        await client.run_until_disconnected()
client.loop.run_until_complete(main())
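The pattern itself can be tried without Telegram credentials. The sketch below is a standard-library illustration only: DemoClient is a hypothetical stand-in that records what happens to it, not a Telethon class.

```python
import asyncio


class DemoClient:
    """Hypothetical stand-in for an async client; records lifecycle events."""

    def __init__(self):
        self.events = []

    async def __aenter__(self):
        self.events.append("connected")
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self.events.append("disconnected")
        return False  # do not swallow exceptions

    async def run_until_disconnected(self):
        self.events.append("running")


async def main(client):
    # "async with" is only valid inside a coroutine function.
    async with client:
        await client.run_until_disconnected()
    return client.events


events = asyncio.run(main(DemoClient()))
print(events)
```

In a notebook such as Colab an event loop is already running, so asyncio.run(...) and loop.run_until_complete(...) raise an error there; the usual alternative is to write await main(client) directly in a cell.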

How to check if Pyppeteer browser has closed?

I can't seem to find any information for the Python port of Puppeteer on how to check whether my browser has closed properly after browser.close().
I have limited knowledge of JavaScript, so I can't properly follow the answer to "puppeteer : how check if browser is still open and working".
Printing browser.on('disconnected') seems to return a function object which, when called, requires an argument called f.
What is the proper way to check if the browser has closed properly?
import asyncio
from pyppeteer import launch
async def get_browser():
    return await launch({"headless": False})
async def get_page():
    browser = await get_browser()
    url = 'https://www.wikipedia.org/'
    page = await browser.newPage()
    await page.goto(url)
    content = await page.content()
    await browser.close()
    print(browser.on('disconnected'))
    #assert browser is None
    #assert print(html)
loop = asyncio.get_event_loop()
result = loop.run_until_complete(get_page())
print(result)
.on methods register a callback to be fired on a particular event. For example:
import asyncio
from pyppeteer import launch
async def get_page():
    browser = await launch({"headless": True})
    browser.on("disconnected", lambda: print("disconnected"))
    url = "https://www.wikipedia.org/"
    page, = await browser.pages()
    await page.goto(url)
    content = await page.content()
    print("disconnecting...")
    await browser.disconnect()
    await browser.close()
    return content
loop = asyncio.get_event_loop()
result = loop.run_until_complete(get_page())
Output:
disconnecting...
disconnected
From the callback, you could flip a flag to indicate closure or (better yet) take whatever other action you want to take directly.
There's also browser.process.returncode (browser.process is a Popen instance). It is None while the browser process is still running, including after disconnect, and in the run shown here it becomes 1 once the browser has been closed.
Here's an example of the above:
import asyncio
from pyppeteer import launch
async def get_page():
    browser = await launch({"headless": True})
    connected = True
    async def handle_disconnected():
        nonlocal connected
        connected = False
    browser.on(
        "disconnected",
        lambda: asyncio.ensure_future(handle_disconnected())
    )
    print("connected?", connected)
    print("return code?", browser.process.returncode)
    print("disconnecting...")
    await browser.disconnect()
    print("connected?", connected)
    print("return code?", browser.process.returncode)
    print("closing...")
    await browser.close()
    print("return code?", browser.process.returncode)
asyncio.get_event_loop().run_until_complete(get_page())
Output:
connected? True
return code? None
disconnecting...
connected? False
return code? None
closing...
return code? 1
You can use browser.on('disconnected') to listen for when the browser is closed or has crashed, or when browser.disconnect() was called. You can then automatically relaunch the browser and continue with your program.
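Instead of a boolean flag, the callback can set an asyncio.Event, which other coroutines can then await. The sketch below illustrates the idea with the standard library only; FakeBrowser is a hypothetical stand-in that mimics just the on() and disconnect() calls used above, not the real pyppeteer class.

```python
import asyncio


class FakeBrowser:
    """Hypothetical stand-in exposing only on() and disconnect()."""

    def __init__(self):
        self._handlers = {}

    def on(self, event, handler):
        self._handlers.setdefault(event, []).append(handler)

    async def disconnect(self):
        # Fire every callback registered for "disconnected".
        for handler in self._handlers.get("disconnected", []):
            handler()


async def demo():
    browser = FakeBrowser()
    closed = asyncio.Event()
    browser.on("disconnected", closed.set)  # the callback just sets the event

    before = closed.is_set()
    await browser.disconnect()
    # Any coroutine can now wait for the disconnect instead of polling a flag.
    await asyncio.wait_for(closed.wait(), timeout=1.0)
    return before, closed.is_set()


before, after = asyncio.run(demo())
print(before, after)
```

A relaunch loop could await closed.wait(), start a new browser, clear the event, and repeat.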

Python websockets client keep connection open

In Python, I'm using the "websockets" library as a WebSocket client.
import asyncio
import websockets
async def init_sma_ws():
    uri = "wss://echo.websocket.org/"
    async with websockets.connect(uri) as websocket:
        name = input("What's your name? ")
        await websocket.send('name')
        greeting = await websocket.recv()
The problem is the client websocket connection is disconnected once a response is received. I want the connection to remain open so that I can send and receive messages later.
What changes do I need to make to keep the websocket open and be able to send and receive messages later?
I think your websocket is disconnected because execution leaves the context manager after recv().
Code like this works:
import asyncio
import websockets
async def init_sma_ws():
    uri = "wss://echo.websocket.org/"
    async with websockets.connect(uri) as websocket:
        while True:
            name = input("What's your name? ")
            if name == 'exit':
                break
            await websocket.send(name)
            print('Response:', await websocket.recv())
asyncio.run(init_sma_ws())
In your approach you used an asynchronous context manager, which closes the connection once the code in the block has finished executing. The example below instead uses an infinite asynchronous iterator, which keeps a connection available by reconnecting whenever it is closed.
import asyncio
import websockets
async def main():
    async for websocket in websockets.connect(...):
        try:
            ...
        except websockets.ConnectionClosed:
            continue
asyncio.run(main())
More info in the library's docs.
