Cannot open PhantomJS webpages in desktop mode (always in mobile mode) - python

I have been trying to fix this issue with the help of Stack Overflow posts, but I cannot find any topics relevant to it.
I am writing an automated Python script that logs in to my Facebook account and uses some of the features that Facebook offers.
When I use Selenium, I usually have the program run in the Chrome browser, using the following code:
driver = webdriver.Chrome()
From there I program the rest of what I want to do, since it's easy to see what the program is doing. However, when I switch to the PhantomJS browser, Facebook is served in its mobile version (like the Android/iOS version of the site).
Could anyone help me understand how to switch this to desktop mode? The mobile version of Facebook is coded differently from the desktop version, and I don't want to rewrite my code to handle the difference. I need this to run on PhantomJS because it will be running on a low-powered Raspberry Pi that can barely open Google Chrome.
I have also tried the following to see if it worked, and it didn't help.
headers = {
    'Accept': '*/*',
    'Accept-Encoding': 'gzip, deflate, sdch',
    'Accept-Language': 'en-US,en;q=0.8',
    'Cache-Control': 'max-age=0',
    'User-Agent': ('Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 '
                   '(KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36'),
}
driver = webdriver.PhantomJS(desired_capabilities = headers)
driver.set_window_size(1366, 768)
Any help would be greatly appreciated!!

I had the same problem with PhantomJS, Selenium and Python, and the following code resolved it.
from selenium import webdriver
from selenium.webdriver import DesiredCapabilities
desired_capabilities = DesiredCapabilities.PHANTOMJS.copy()
desired_capabilities['phantomjs.page.customHeaders.User-Agent'] = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) ' \
'AppleWebKit/537.36 (KHTML, like Gecko) ' \
'Chrome/39.0.2171.95 Safari/537.36'
driver = webdriver.PhantomJS('./phantom/bin/phantomjs.exe', desired_capabilities=desired_capabilities)
driver.get('http://facebook.com')
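A likely reason the headers dictionary in the question had no effect is that it was passed as capabilities, while the PhantomJS driver looks for keys under prefixes such as phantomjs.page.customHeaders. and phantomjs.page.settings. rather than bare header names. The following is a minimal sketch of a helper that translates a headers dictionary into such keys. The name headers_to_phantomjs_caps is hypothetical (it is not part of Selenium), and the commented-out last line assumes a Selenium version that still ships webdriver.PhantomJS.

```python
CUSTOM_HEADER_PREFIX = "phantomjs.page.customHeaders."
USER_AGENT_SETTING = "phantomjs.page.settings.userAgent"


def headers_to_phantomjs_caps(headers, base_caps=None):
    """Return a capabilities dict carrying `headers` as PhantomJS custom headers.

    Hypothetical helper for illustration; not part of Selenium.
    """
    caps = dict(base_caps or {})
    for name, value in headers.items():
        caps[CUSTOM_HEADER_PREFIX + name] = value
        if name.lower() == "user-agent":
            # Set the page-level user agent too, so the header and the
            # page setting agree with each other.
            caps[USER_AGENT_SETTING] = value
    return caps


headers = {
    "Accept-Language": "en-US,en;q=0.8",
    "User-Agent": ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                   "(KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36"),
}
caps = headers_to_phantomjs_caps(headers)
for key in sorted(caps):
    print(key, "=", caps[key])

# With Selenium installed, `caps` would be passed on like this:
# driver = webdriver.PhantomJS(desired_capabilities=caps)
```

In practice one would probably start from DesiredCapabilities.PHANTOMJS.copy() as base_caps, as the answer above does.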

Related

Is there a way to find a browser's user-agent with cmd or python?

I am writing code that has to use a headless browser, but to access a specific website I need to send a user agent as well. I am currently doing that with the following snippet (Python/Selenium/ChromeDriver).
from selenium.webdriver.chrome.options import Options
opts = Options()
opts.add_argument("--headless")
opts.add_argument("--no-sandbox")
opts.add_argument("user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36")
But I want the user agent to be genuine rather than the same for every browser/device where the code runs, so I need to know the user agent of the browser on the user's device.
Is there any way to find a browser's user agent using Python/Selenium code or the command prompt?
httpagentparser extracts the OS, browser and similar information from an HTTP user agent string,
so try this:
import httpagentparser as agent
# Pass the bare user agent string, without the "user-agent=" argument prefix.
s = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36"
print(agent.detect(s))
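If the goal is the user agent of the browser that Selenium actually launches, one option is to ask the browser itself through JavaScript. The sketch below assumes an already started WebDriver instance. FakeDriver is a hypothetical stand-in so the example runs without a browser, and the HeadlessChrome replacement reflects how headless Chrome has commonly labelled itself, which may differ between versions.

```python
def browser_user_agent(driver):
    """Ask the running browser for its own User-Agent string."""
    return driver.execute_script("return navigator.userAgent;")


def desktop_like_user_agent(driver):
    """Same as above, with the 'Headless' marker removed if it is present."""
    return browser_user_agent(driver).replace("HeadlessChrome/", "Chrome/")


class FakeDriver(object):
    """Hypothetical stand-in for a Selenium driver so the sketch runs anywhere."""

    def execute_script(self, script):
        return ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                "(KHTML, like Gecko) HeadlessChrome/75.0.3770.100 Safari/537.36")


driver = FakeDriver()
print(browser_user_agent(driver))
print(desktop_like_user_agent(driver))
```

With a real driver, the returned string could then be passed to a second browser session through opts.add_argument("user-agent=...") as in the question.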

Python web scraping, requests object hangs

I am trying to scrape the website https://www.nseindia.com/ in Python.
However, when I try to load the website using Requests, the call simply hangs. Below is the code I am using.
import requests
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}
r = requests.get('https://www.nseindia.com/',headers=headers)
The requests.get call simply hangs, and I am not sure what I am doing wrong here. The same URL works perfectly in Chrome or any other browser.
I'd appreciate any help.

(fake_useragent) UserAgent() will not connect

Essentially, I have code that had been working for a few months. I tried to run the program today and, as the title says, the connection for UserAgent() is timing out. I've tried upgrading the package with "pip install --upgrade fake_useragent" and I'm told it is up to date. I've also tried to delete the package (in order to reinstall it) but for some reason I am unable to. Does anyone have any ideas as to how else I can approach this issue?
from fake_useragent import UserAgent
...
ua = UserAgent()  # program cannot progress past this point
You should pass a fallback user agent when creating the ua object. That way, if the server is down, the fallback user agent kicks in: better a working but outdated user agent than a complete program crash.
from fake_useragent import UserAgent
ua = UserAgent(fallback='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36')
headers = {'User-Agent':ua.chrome}
I learned this from this question:
Scrapy FakeUserAgentError: Error occurred during getting browser
The fake_useragent package connects to http://useragentstring.com/ to get the list of up-to-date user agent strings. It looks like http://useragentstring.com/ is down, hopefully only temporarily.
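The fallback idea can also be expressed independently of any particular library. The following is a minimal sketch of that general pattern, not of how fake_useragent implements it internally. The names user_agent_or_fallback and unreachable_source are hypothetical and exist only for illustration.

```python
FALLBACK_UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
               "(KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36")


def user_agent_or_fallback(fetch, fallback=FALLBACK_UA):
    """Call fetch(); return `fallback` if it raises or returns nothing."""
    try:
        ua = fetch()
    except Exception:
        # Any failure (timeout, DNS error, bad data) ends up here.
        return fallback
    return ua or fallback


def unreachable_source():
    """Hypothetical source that simulates the remote list being down."""
    raise IOError("user agent list could not be downloaded")


print(user_agent_or_fallback(unreachable_source))
print(user_agent_or_fallback(lambda: "ExampleBrowser/1.0"))
```

Catching every exception is a deliberate simplification here. A real script might prefer to catch only network-related errors and log them.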

Why does PhantomJS on Ubuntu register as a touch device by Google Maps?

PhantomJS inconsistency between Ubuntu & Mac, recognized as touch device on Ubuntu by Google Maps
I recently stumbled upon what looks like an inconsistency in PhantomJS between operating systems.
I am using the Python 2.7 Selenium module (2.42.1) and PhantomJS (1.9.7) to test website applications. While testing a webpage using Google Maps JS API 3 I noticed that Google Maps seems to recognize PhantomJS as a touch device on Ubuntu, but strangely enough not on Mac.
I've put together a simple Google Maps JavaScript API v3 Example.
What happens is that the zoom control buttons look different on touch devices such as iOS or Android devices: they are bigger and move to the bottom left corner.
Running the following python script
# -*- coding: utf-8 -*-
from selenium import webdriver
import os, time
browser = webdriver.PhantomJS(service_log_path=os.path.devnull)
browser.set_window_size(1280, 800)
browser.get("https://notendur.hi.is/~sfg6/google_maps_example/")
time.sleep(5)
browser.save_screenshot('test_google_maps_api_screenshot.png')
gives me the regular desktop controls on Mac but the larger touch-style controls on Ubuntu.
Can I in any way prevent PhantomJS from being registered as a touch device?
Answer:
As Jeff Sisson suggested in his answer below the problem was the user agent string.
PhantomJS used the following user agent string on Ubuntu:
Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) PhantomJS/1.9.7 Safari/534.34
and this one on Mac:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36
After trying a few things, I came to the conclusion that the problem was the platform token. After changing Unknown to X11, as in the example below, Google Maps stopped treating PhantomJS as a mobile device.
# -*- coding: utf-8 -*-
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
import os, time
dcap = dict(DesiredCapabilities.PHANTOMJS)
dcap["phantomjs.page.settings.userAgent"] = (
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.34 "
"(KHTML, like Gecko) PhantomJS/1.9.7 Safari/534.34"
)
browser = webdriver.PhantomJS(desired_capabilities=dcap,service_log_path=os.path.devnull)
browser.set_window_size(1280, 800)
browser.get("https://notendur.hi.is/~sfg6/google_maps_example/")
time.sleep(5)
browser.save_screenshot('test_google_maps_api_screenshot_x11.png')
Running the above Python script on Ubuntu gave the same desktop-style result as on Mac.
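The platform-token change described above can be written as a small string transformation, which avoids hard-coding the whole user agent when the PhantomJS version changes. This is only a sketch: normalize_platform_token is a hypothetical helper, and it assumes the default string starts its first comment with "(Unknown;" as in the Ubuntu example.

```python
def normalize_platform_token(user_agent, platform="X11"):
    """Replace a leading 'Unknown' platform token with `platform`."""
    return user_agent.replace("(Unknown;", "(" + platform + ";", 1)


ubuntu_default = ("Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/534.34 "
                  "(KHTML, like Gecko) PhantomJS/1.9.7 Safari/534.34")
fixed = normalize_platform_token(ubuntu_default)
print(fixed)
```

The result could then be assigned to dcap["phantomjs.page.settings.userAgent"] as in the script above. Strings without the "(Unknown;" token are returned unchanged.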
Have you tried manually setting PhantomJS's user agent? Anecdotally (using Safari on a Mac), your test page loads the mobile UI when I set the user agent to iPhone, so it could be a simple case of incorrect browser sniffing.
Here's an example of how you can set the user agent with page.settings: https://github.com/ariya/phantomjs/blob/master/examples/useragent.js
This example also logs what the default user agent is, and mailing list evidence seems to imply that the user agent does vary between operating systems.
This is the latest for me using Chrome.
var page = require('webpage').create();
page.settings.userAgent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.90 Safari/537.36';
page.viewportSize = { width: 600, height: 200};
page.open('http://optime.dev.puppetsproutatwork.com/email/weekly/7780', function() {
    page.render('github.png', {format: 'png', quality: '100'});
    phantom.exit();
});

Get user browser info in Python Bottle

I'm trying to find out which browsers my users are using, and I'm running into a problem.
If I read the "User-Agent" header, it usually gives me a lot of text that tells me nothing.
For example, if I visit the site with Chrome, the "User-Agent" header contains:
User-Agent: "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36".
As you can see, this tells me nothing, since it mentions Mozilla, Safari, Chrome etc., even though I visited with Chrome.
The framework I've been using is Bottle (Python).
Any help would be appreciated, thanks.
User-Agent: "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36".
As you can see, this tells me nothing since there is mention of Mozilla, Safari, Chrome etc., even though I visited with Chrome.
Your conclusion above is wrong. The UA tells you many things including the type and version of the web browser.
The post below explains why Mozilla and Safari exist in Chrome's UA.
History of the browser user-agent string
You can try to analyze it manually on user-agent-string-db.
There's a Python API for it.
from uasparser2 import UASparser
uas_parser = UASparser()
# Instead of fetching data over the network every time, you can cache the db locally
# uas_parser = UASparser('/path/to/your/cache/folder', mem_cache_size=1000)
# Updating data is simple: uas_parser.updateData()
result = uas_parser.parse('Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36')
# result
{'os_company': u'',
'os_company_url': u'',
'os_family': u'Linux',
'os_icon': u'linux.png',
'os_name': u'Linux',
'os_url': u'http://en.wikipedia.org/wiki/Linux',
'typ': u'Browser',
'ua_company': u'Google Inc.',
'ua_company_url': u'http://www.google.com/',
'ua_family': u'Chrome',
'ua_icon': u'chrome.png',
'ua_info_url': u'http://user-agent-string.info/list-of-ua/browser-detail?browser=Chrome',
'ua_name': u'Chrome 31.0.1650.57',
'ua_url': u'http://www.google.com/chrome'}
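To see why the string mentions several browsers and how a parser can still settle on one, it may help to split the user agent into its product tokens by hand. The sketch below uses only the standard library. The priority order in guess_browser is an assumption chosen for illustration and covers just a few browsers, so it is not a substitute for a maintained parser.

```python
import re

# Product tokens look like "Name/version"; parenthesised comments are skipped.
TOKEN_RE = re.compile(r"([A-Za-z][\w.-]*)/([\w.]+)")


def product_tokens(ua):
    """Return (name, version) pairs in the order they appear in the UA string."""
    without_comments = re.sub(r"\([^)]*\)", " ", ua)
    return TOKEN_RE.findall(without_comments)


def guess_browser(ua):
    """Pick the most specific known token; the order below is an assumption."""
    tokens = dict(product_tokens(ua))
    for name in ("PhantomJS", "Chrome", "Firefox", "Safari"):
        if name in tokens:
            return name, tokens[name]
    return None


ua = ("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36")
print(product_tokens(ua))
print(guess_browser(ua))
```

Checking Chrome before Safari matters because Chrome's string also carries a Safari token for compatibility, as the linked history explains.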
Thank you everyone for your answers. I found something really simple that works.
Download the httpagentparser module from:
https://pypi.python.org/pypi/httpagentparser
After that, just import it in your Python program:
import httpagentparser
Then you can write a function like this that returns browser, works like a charm:
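def detectBrowser(request):
    # The header may be missing entirely, so default to an empty string.
    agent = request.environ.get('HTTP_USER_AGENT', '')
    detected = httpagentparser.detect(agent)
    if 'browser' in detected:
        return detected['browser']['name']
    # Nothing recognised: fall back to the first product token, e.g. "Mozilla".
    return agent.split('/')[0]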
That's it
As you can see, this tells me nothing since there is mention of Mozilla, Safari, Chrome etc., even though I visited with Chrome.
It's not that the User Agent string tells you "nothing;" it's that it's telling you too much.
If you want a report that breaks down your users' browsers, your best bet is to analyze your logs. Several programs are available to help. (One caveat, if you're using Bottle's "raw" web server, is that it won't log in Common Log Format out of the box. You have options.)
If you need to know in real time, you'll need to spend time learning user agent strings (useragentstring.com might help here) or use an API like this one.
