Add HTML comments to a Tableau dashboard when exporting to PDF - python

I have a Tableau dashboard that contains a web object. The web object shows comments made on that particular dashboard; it is basically just HTML hosted on a server. When I open the dashboard in Tableau Server, it works perfectly and displays correctly. When I export it to PDF, however, the web object is not displayed at all. Per the Tableau docs, the web object will not be rendered on export. Is there another way of getting the web object, or its HTML, to display in the PDF?
I can use Python and tabcmd to achieve this, and I'm open to suggestions of any kind. This requirement is completely necessary for what I'm doing, and I'm tearing my hair out at this point.
Thanks!

According to the Tableau Community Forums, it's not possible to export the contents of a web object inside a dashboard because of how the web object works:
Thank you for contacting Tableau Technical Support. I understand the exported PDF on Tableau Server is not including the contents of web pages. Please let me know if I misunderstood your request.
It is the expected behavior that exporting views to PDF will not include web page objects. The dashboard containing a web page object has no knowledge of what is in the web browser object, so it is left blank.
One way I can think of to do this is to build a web data connector to pull the comments in as text data, do some string manipulation to build a view formatted the way you want, and add that view to your dashboard.
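Since you mentioned Python, here is a rough alternative sketch of the same idea: pull the comments page down, flatten it to a CSV, and use that CSV as an ordinary Tableau data source backing a text worksheet, which does export to PDF. The URL and the "div.comment" selector below are hypothetical placeholders for your actual comments page:

```python
# Sketch: flatten the comments HTML into a CSV that Tableau can use as a
# plain data source. COMMENTS_URL and the selector are placeholders.
import csv

import requests
from bs4 import BeautifulSoup

COMMENTS_URL = "https://example.com/dashboard-comments.html"  # hypothetical

response = requests.get(COMMENTS_URL, timeout=30)
response.raise_for_status()
soup = BeautifulSoup(response.text, "html.parser")

# Write each comment as one CSV row; "div.comment" is a placeholder for
# however your comments page marks up an individual comment.
with open("comments.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["comment"])
    for node in soup.select("div.comment"):
        writer.writerow([node.get_text(strip=True)])
```

You could schedule this script (or drive it alongside tabcmd) so the CSV is refreshed before each export.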

Related

How to scrape a webpage built with the Flutter CanvasKit renderer

I need to extract data from a website, but I found that it was rendered with the Flutter CanvasKit renderer. It seems everything I want is drawn onto the canvas. I have to go through each row, trigger a click on the row, then trigger the info button at the top right, which shows the file's attributes, and get one of the attributes from there. [refer images]
Is this possible? If so, how? I want to do it in Python.
In my case the problem was a CORS issue. There were two ways around it:
Use a web proxy like https://cors-anywhere.herokuapp.com/$urlTarget
Scrape the webpage in the back end, then send the data via an API (see the sketch below).
I chose method 2 because it is easy to fix when the webpage changes.
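A minimal sketch of method 2, using Flask and Beautiful Soup as one possible stack (TARGET_URL and the row-parsing are placeholders; for a canvas-rendered Flutter page you would point this at the app's underlying data endpoint rather than at the drawn page itself):

```python
# Sketch of method 2: a tiny back end that scrapes the target and exposes
# the result as a JSON API. TARGET_URL and the parsing are placeholders.
import requests
from bs4 import BeautifulSoup
from flask import Flask, jsonify

app = Flask(__name__)
TARGET_URL = "https://example.com/target-page"  # hypothetical

@app.route("/scraped-data")
def scraped_data():
    # Fetch and parse the target page on each request.
    html = requests.get(TARGET_URL, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    # Placeholder parsing: grab the text of every table row.
    rows = [tr.get_text(" ", strip=True) for tr in soup.find_all("tr")]
    return jsonify(rows=rows)

if __name__ == "__main__":
    app.run(port=5000)
```

When the page layout changes, only the parsing inside this one endpoint needs fixing, which is why this approach is easier to maintain than a front-end proxy.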

Web scraping for dummies (or not)

GOAL
Extract data from a web page... automatically.
The data are on this page... Be careful, it's in French...
MY HARD WAY, manually
I choose the data I want by clicking the desired fields on the left side ('CHOISIR DES INDICATEURS' = choose indicators).
Then I select ('Tableau' = Table) to get a data table.
Then I click ('Action') on the right side, then ('Exporter' = Export).
I choose the format I want (e.g. CSV) and hit ('Executer' = Execute) to download the file.
WHAT I TRIED
I tried to automate this process, but it feels like an impossible task for me. I inspected the page's network exchanges to see if there is an underlying server I could send simple JSON requests to.
I mainly work with Python and frameworks like BS4 or Scrapy.
I have little data to extract, so I can easily do it manually. Hence this question: it is purely for my own knowledge, to see if it is possible to scrape a page like that.
I would appreciate it if you could share your skills!
Thank you,
It is possible. This tutorial walks through scraping a website with a worked example:
https://realpython.com/beautiful-soup-web-scraper-python/#scraping-the-monster-job-site
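The core pattern from that tutorial boils down to a few lines of requests plus Beautiful Soup. A minimal sketch (the URL and CSS selector here are placeholders to adapt to the target page; note it only works if the data is in the served HTML rather than rendered by JavaScript):

```python
# Minimal requests + Beautiful Soup pattern. URL and selector are
# placeholders -- swap in the real page and the elements you need.
import requests
from bs4 import BeautifulSoup

URL = "https://example.com/data-page"  # hypothetical target

page = requests.get(URL, timeout=30)
page.raise_for_status()

soup = BeautifulSoup(page.content, "html.parser")
for cell in soup.select("table td"):  # placeholder selector
    print(cell.get_text(strip=True))
```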

Web scraping of my Kibana server

I am running the ELK stack for log analysis, in which Kibana is used for data visualization. Now I want to extract some fields from the Kibana webpage.
I want to extract the CU and count fields; I have attached a screenshot of the webpage and the corresponding HTML source code.
I have tried to scrape the webpage using Python and the Beautiful Soup library, but the HTML I get back is different from what I see in the browser.
Please help. Also, can you suggest some other method by which I can extract the required fields?
It's better to make a direct request to your Elasticsearch instance for the data you need.
You can see the query executed by a visualization if you go to the Dashboard, click the arrow in the bottom left corner, and select the Request tab.
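Once you have copied the query from the Request tab, you can replay it against Elasticsearch directly. A sketch, assuming a local cluster, a logstash-* index, and a CU.keyword field, all of which you should adapt to your setup:

```python
# Sketch: query Elasticsearch directly instead of scraping Kibana.
import requests

ES_URL = "http://localhost:9200/logstash-*/_search"  # hypothetical host/index

# One terms aggregation bucketed by CU; "CU.keyword" is a placeholder
# field name -- paste the real query from Kibana's Request tab instead.
query = {
    "size": 0,
    "aggs": {"by_cu": {"terms": {"field": "CU.keyword"}}},
}

resp = requests.post(ES_URL, json=query, timeout=30)
resp.raise_for_status()
for bucket in resp.json()["aggregations"]["by_cu"]["buckets"]:
    print(bucket["key"], bucket["doc_count"])  # CU value and its count
```

Each bucket's doc_count is exactly the "count" Kibana displays, so this gets you both fields without touching the rendered page.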

How to read an HTML page that takes some time to load? [duplicate]

I am trying to scrape a website using Python and Beautiful Soup. I have found that on some sites, image links that are visible in the browser cannot be seen in the source code. However, using Chrome Inspect or Fiddler, I can see the corresponding code.
What I see in the source code is:
<div id="cntnt"></div>
But in Chrome Inspect, I can see a whole bunch of HTML/CSS generated within this div. Is there a way to load the generated content within Python as well? I am using the regular urllib in Python, and I am able to get the source, but without the generated part.
I am not a web developer, hence I am not able to express the behaviour in better terms. Please feel free to ask if my question seems vague!
You need a JavaScript engine to parse and run the JavaScript code inside the page.
There are a bunch of headless browsers that can help you:
http://code.google.com/p/spynner/
http://phantomjs.org/
http://zombie.labnotes.org/
http://github.com/ryanpetrello/python-zombie
http://jeanphix.me/Ghost.py/
http://webscraping.com/blog/Scraping-JavaScript-webpages-with-webkit/
The content of the website may be generated after load via JavaScript. In order to obtain the generated content via Python, refer to this answer.
A regular scraper gets just the HTML document. To get any content generated by JavaScript logic, you instead need a headless browser that also builds the DOM and loads and runs the scripts like a regular browser would. The Wikipedia article and some other pages on the net have lists of those and their capabilities.
Keep in mind when choosing that some previously major products among them are now abandoned.
TRY THIS FIRST!
Perhaps the data technically could be embedded in the JavaScript itself, in which case all this JavaScript-engine business is needed. (Some GREAT links here!)
But from experience, my first guess is that the JS is pulling the data in via an AJAX request. If you can get your program to simulate that, you'll probably get everything you need handed right to you without any tedious parsing/executing/scraping involved!
It will take a little detective work, though. I suggest turning on your network traffic logger (such as the "Web Developer Toolbar" in Firefox) and then visiting the site. Focus your attention on any and all XmlHTTPRequests. The data you need should be found somewhere in one of those responses, probably in the middle of some JSON text.
Now, see if you can re-create that request and get the data directly. (NOTE: you may have to set the User-Agent of your request so the server thinks you're a "real" web browser.)
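In Python, re-creating such a request is usually just a few lines with requests. A sketch, where the URL, parameters, and headers are placeholders for whatever your network log shows:

```python
# Sketch: replay an XHR found in the browser's network log.
import requests

headers = {
    # Some servers only answer "real" browsers, so mimic one.
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "X-Requested-With": "XMLHttpRequest",  # some endpoints check for this
}

resp = requests.get(
    "https://example.com/api/data",  # hypothetical: the XHR URL you observed
    params={"page": 1},              # copy query parameters from the log
    headers=headers,
    timeout=30,
)
resp.raise_for_status()
data = resp.json()  # the JSON payload the page's JS would have consumed
print(data)
```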

web scraping a problem site

I'm trying to scrape some information from a website, but am having trouble reading the relevant pages. The pages seem to first send a basic skeleton, then fill in the detailed info; my download attempts only capture the basic skeleton. I've tried urllib and mechanize so far.
Firefox and Chrome have no trouble displaying the pages, although I can't see the parts I want when I view page source.
A sample url is https://personal.vanguard.com/us/funds/snapshot?FundId=0542&FundIntExt=INT
I'd like, for example, average maturity and average duration from the lower right of the page. The problem isn't extracting that info from the page, it's downloading the page so that I can extract the info.
The page uses JavaScript to load the data. Firefox and Chrome only work because you have JavaScript enabled - try disabling it and you'll get a mostly empty page.
Python isn't going to be able to do this by itself - your best compromise would be to control a real browser (Internet Explorer is easiest, if you're on Windows) from Python using something like Pamie.
The website loads the data via ajax. Firebug shows the ajax calls. For the given page, the data is loaded from https://personal.vanguard.com/us/JSP/Funds/VGITab/VGIFundOverviewTabContent.jsf?FundIntExt=INT&FundId=0542
See the corresponding javascript code on the original page:
<script>
populator = new Populator({
    parentId: "profileForm:vanguardFundTabBox:tab0",
    execOnLoad: true,
    populatorUrl: "/us/JSP/Funds/VGITab/VGIFundOverviewTabContent.jsf?FundIntExt=INT&FundId=0542",
    inline: false,
    type: "once"
});
</script>
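That means you can skip the outer page entirely and fetch the tab-content URL yourself. A rough sketch (the warm-up request for cookies is an assumption; the site may have other protections):

```python
# Sketch: fetch the AJAX-loaded tab content directly. The path comes from
# the populatorUrl above; a Session is used in case the site expects
# cookies set by the main page first.
import requests
from bs4 import BeautifulSoup

BASE = "https://personal.vanguard.com"
session = requests.Session()
# Visit the main page first so any required cookies get set (assumption).
session.get(BASE + "/us/funds/snapshot?FundId=0542&FundIntExt=INT", timeout=30)

resp = session.get(
    BASE + "/us/JSP/Funds/VGITab/VGIFundOverviewTabContent.jsf",
    params={"FundIntExt": "INT", "FundId": "0542"},
    timeout=30,
)
resp.raise_for_status()
soup = BeautifulSoup(resp.text, "html.parser")
print(soup.get_text(" ", strip=True)[:500])  # peek at the returned fragment
```

The returned fragment is plain HTML, so average maturity and average duration can be pulled out of it with ordinary Beautiful Soup parsing.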
The reason is that the page performs AJAX calls after it loads. You will need to track down those URLs and scrape their content as well.
As RichieHindle mentioned, your best bet on Windows is to use the WebBrowser class to create an instance of an IE rendering engine and then use that to browse the site.
The class gives you full access to the DOM tree, so you can do whatever you want with it.
http://msdn.microsoft.com/en-us/library/system.windows.forms.webbrowser(loband).aspx
