Genshi auto load css/js need exclude specific file

I'm making a Bootstrap theme for a Trac installation. This is my first time using Genshi, so please be patient :)
I have the following:
<head py:match="head" py:attrs="select('@*')">
${select('*|comment()|text()')}
<link rel="stylesheet" type="text/css" href="${chrome.htdocs_location}css/bootstrap.min.css" />
<link rel="stylesheet" type="text/css" href="${chrome.htdocs_location}css/style.css" />
</head>
This loads my custom CSS, but also the JS/CSS that Trac itself uses.
So the result is this:
<link rel="help" href="/pixelperfect/wiki/TracGuide" />
<link rel="start" href="/pixelperfect/wiki" />
<link rel="stylesheet" href="/pixelperfect/chrome/common/css/trac.css" type="text/css" />
<link rel="stylesheet" href="/pixelperfect/chrome/common/css/wiki.css" type="text/css" />
<link rel="stylesheet" type="text/css" href="/pixelperfect/chrome/common/css/bootstrap.min.css" />
<link rel="stylesheet" type="text/css" href="/pixelperfect/chrome/common/css/style.css" />
All is good, except that I would like to exclude trac.css from the output completely.
So my question is twofold:
1. How does Genshi know what to load? Where is the manifest of all the CSS/JS files that it outputs?
2. Is it Genshi or Python doing this?
Any help and relevant reading appreciated! :)
Thanks!

On 1:
The information on CSS files is accumulated in the 'links' dictionary of the request's chrome property (req.chrome['links']); for JS files it is the 'scripts' entry. See the add_link and add_script functions in trac.web.chrome, respectively.
The default style sheet is added to the Chrome object directly. See the add_stylesheet call in the trac.web.chrome.Chrome.prepare_request() method.
On 2:
It's part of the Request object, which is then processed by Genshi. The preparation is done in Python either way, but it happens in Trac's own Python code rather than in Genshi's.
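
To illustrate the idea of filtering the accumulated links, here is a minimal sketch in plain Python. It assumes that the 'links' data maps each rel value to a list of dictionaries with an 'href' key; the drop_stylesheet helper and the sample data are hypothetical and are not part of Trac. Where such filtering would hook into a real Trac installation is not covered here.

```python
def drop_stylesheet(links, filename):
    """Return a copy of `links` without stylesheet entries for `filename`.

    Assumption: `links` maps a rel value (e.g. "stylesheet") to a list of
    dicts that each carry an "href" key.
    """
    kept = {}
    for rel, entries in links.items():
        if rel == "stylesheet":
            # Keep only stylesheets whose path does not end in /<filename>.
            entries = [
                entry for entry in entries
                if not entry.get("href", "").endswith("/" + filename)
            ]
        kept[rel] = list(entries)
    return kept


# Sample data shaped like the rendered <link> elements from the question.
links = {
    "help": [{"href": "/pixelperfect/wiki/TracGuide"}],
    "stylesheet": [
        {"href": "/pixelperfect/chrome/common/css/trac.css", "type": "text/css"},
        {"href": "/pixelperfect/chrome/common/css/wiki.css", "type": "text/css"},
    ],
}

filtered = drop_stylesheet(links, "trac.css")
print([entry["href"] for entry in filtered["stylesheet"]])
```

The helper returns a new dictionary instead of mutating its input, so the original data stays available for comparison.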

Related

Can't get stylesheet to link properly with Django Framework (using localhost)

I'm trying to link it in the header of html file in the following path:
main/home/templates/home/index.html
And the style.css lives in
main/main/stylesheet/style.css
And this is my link in the index.html:
<link rel="stylesheet" type="text/css" href="/main/stylesheet/style.css">
Is something wrong?

I guess you should go for:
<link rel="stylesheet" type="text/css" href="main/stylesheet/style.css">
so you need a relative path.

Replace the node of Beautiful Soup with string in python

I have to download and save the web page at a given URL. I have downloaded the page as well as the required JS and CSS files. The problem is changing the src and href values of those tags in the HTML source so that the saved page works.
My HTML source is:
<link REL="shortcut icon" href="/commd/favicon.ico">
<script src="/commd/jquery.min.js"></script>
<script src="/commd/jquery-ui.min.js"></script>
<script src="/commd/slimScroll.min.js"></script>
<script src="/commd/ajaxstuff.js"></script>
<script src="/commd/jquery.nivo.slider.pack.js"></script>FCT0505
<script src="/commd/jquery.nivo.slider.pack.js"></script>
<link rel="stylesheet" type="text/css" href="/fonts/stylesheet.cssFCT0505"/>
<link rel="stylesheet" type="text/css" href="/commd/stylesheet.css"/>
<!--[if gte IE 6]>
<link rel="stylesheet" type="text/css" href="/commd/stylesheetIE.css" />
<![endif]-->
<link rel="stylesheet" type="text/css" href="/commd/accordion.css"/>
<link rel="stylesheet" href="/commd/nivo.css" type="text/css" media="screen" />
<link rel="stylesheet" href="/commd/nivo-slider.css" type="text/css" media="screen" />
I have found all the links to the CSS and JS files and downloaded them using:
scriptsurl = soup3.find_all("script")
os.chdir(foldername)
for l in scriptsurl:
    if l.get("src") is not None:
        print(l.get("src"))
        script="http://www.iitkgp.ac.in"+l.get("src")
        print(script)
        file=l.get("src").split("/")[-1]
        l.get("src").replaceWith('./foldername/'+file)
        print(file)
        urllib.request.urlretrieve(script,file)
linksurl=soup3.find_all("link")
for l in linksurl:
    if l.get("href") is not None:
        print(l.get("href"))
        css="http://www.iitkgp.ac.in"+l.get("href")
        file=l.get("href").split("/")[-1]
        print(css)
        print(file)
        if(os.path.exists(file)):
            urllib.request.urlretrieve(css,file.split(".")[0]+"(1)."+file.split(".")[-1])
        else:
            urllib.request.urlretrieve(css,file)
os.chdir("..")
Can anyone suggest a way to change the src/href values to local machine paths inside these loops? That would be a great help.
This is my first crawling task.

Reading from the documentation:
You can add, remove, and modify a tag’s attributes. Again, this is done by treating the tag as a dictionary:
So I believe writing something like:
l["src"] = os.path.join(os.getcwd(),foldername, file)
instead of
l.get("src").replaceWith('./foldername/'+file)
will do the trick.

Certain javascript libs not loading over https

After setting up HTTPS on a site, some of the JavaScript libraries are not loading while others are. In this case, the select2 library is not loading. Why would this be?
Head extract:
<head>
<link rel="stylesheet" href="https://yui.yahooapis.com/pure/0.6.0/pure-min.css">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.3/jquery.min.js"></script>
<script src="https://code.jquery.com/ui/1.11.4/jquery-ui.js"></script>
<link rel="stylesheet" href="https://code.jquery.com/ui/1.11.4/themes/cupertino/jquery-ui.css">
<link href="https://cdnjs.cloudflare.com/ajax/libs/select2/4.0.0/css/select2.min.css" rel="stylesheet" />
<script src="https://cdnjs.cloudflare.com/ajax/libs/select2/4.0.0/js/select2.min.js"></script>
<link rel="stylesheet" type="text/css" href="https://d1r6do663ilw4i.cloudfront.net/static/sweetalerts/sweetalert.css">
<script src="https://d1r6do663ilw4i.cloudfront.net/static/sweetalerts/sweetalert.min.js"></script>

First: Make sure the file actually exists at that URL. (Try different browsers, command line tools, ...)
Second: Make sure your ad-blocker/browser-plugins aren't blocking the request.

jinja2 static files mime-type always text/plain

I have a simple flask application. I have all my css inside static/css directory. I have created a master template in which I want to include css from that directory. Following is what I have tried so far.
<link rel="stylesheet" href="/static/css/justified-nav.css" type="text/css" media="screen">
<link rel="stylesheet" type="text/css" href="{{ url_for('static', filename='css/justified-nav.css') }}" media="screen">
<link rel="stylesheet" type="text/css" href="http://127.0.0.1/static/css/justified-nav.css" media="screen">
In all the cases the file gets loaded but with mime-type text/plain. I have tried putting the css file directly into the static folder as well but no results.
What am I doing wrong? How can I include a css file in a template?

Flask uses the mimetypes module to determine the MIME type of a file, based on its extension. If you get text/plain for a CSS file it means this module returns the wrong MIME type.
On Windows it uses data from the registry, so if the "Content Type" value in HKCR/.css is not set to the proper MIME type it can cause your problem.
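
As a quick check of what the answer describes, the sketch below prints what the standard mimetypes module guesses for a CSS file. It then registers the correct type explicitly, which is a possible workaround that the answer itself does not mention. The file name is taken from the question; the assumption is that this code runs in the same process before any static files are served.

```python
import mimetypes

# What the standard library currently guesses for a .css file. On a
# machine with a wrong registry entry this may come back as "text/plain".
guessed, _encoding = mimetypes.guess_type("justified-nav.css")
print("before:", guessed)

# Register the correct mapping explicitly. This overrides whatever was
# read from the system configuration, for the current process only.
mimetypes.add_type("text/css", ".css")

fixed, _encoding = mimetypes.guess_type("justified-nav.css")
print("after:", fixed)
```

If the first line already prints text/css, the mimetypes data on that machine is probably not the cause of the problem.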

Try this and it should work.
<link rel="stylesheet" href="{{ url_for('static', filename='css/bootstrap.css') }}"/>

How can I find a file name in a block of text using python?

I have fetched the HTML of a webpage using Python, and I now want to find all of the .css files that are linked in the header and save each as its own variable (I know how to do that part). I tried partitioning, as shown below, but I got the error "IndexError: string index out of range" when running it.
sytle = src.partition(".css")
style = style[0].partition('<link href=')
print style[2]
c =1
I do not think this is the right approach, so I would love some advice. Many thanks in advance. Here is a section of the kind of text I need to extract the .css file(s) from.
<meta name="viewport" content="width=device-width, initial-scale=1.0, minimum-scale=1.0" />
<!--[if gte IE 7]><!-->
<link href="/stylesheets/master.css?1342791430" media="screen, projection" rel="stylesheet" type="text/css" />
<link href="/stylesheets/adapt.css?1342791413" media="screen, projection" rel="stylesheet" type="text/css" />
<!-- <![endif]-->
<link href="/stylesheets/print.css?1342791421" media="print" rel="stylesheet" type="text/css" />
<link href="/apple-touch-icon-precomposed.png" rel="apple-touch-icon-precomposed" />
<link href="http://dribbble.com/shots/popular.rss" rel="alternate" title="RSS" type="application/rss+xml" />

You should use a regular expression for this. Try the following:
/href="([^"]*\.css[^"]*)"/g
EDIT
import re
# html holds the page source as a string
matches = re.findall(r'href="([^"]*\.css[^"]*)"', html)
print(matches)

My answer is along the same lines as Jon Clements' answer, but I tested mine and added a drop of explanation.
You should not use a regex. You can't parse HTML with a regex. The regex answer might work, but writing a robust solution is very easy with lxml. This approach is guaranteed to return the full href attribute of all <link rel="stylesheet"> tags and no others.
from lxml import html

def extract_stylesheets(page_content):
    doc = html.fromstring(page_content)  # Parse
    return doc.xpath('//head/link[@rel="stylesheet"]/@href')  # Search
There is no need to check the filenames, since the results of the xpath search are already known to be stylesheet links, and there's no guarantee that the filenames will have a .css extension anyway. The simple regex will catch only a very specific form, but the general html parser solution will also do the right thing in cases such as this, where the regex would fail miserably:
<link REL="stylesheet" hREf =
'/stylesheets/print?1342791421'
media="print"
><!-- link href="/css/stylesheet.css" -->
It could also be easily extended to select only stylesheets for a particular media.
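
For readers who cannot install lxml, roughly the same idea can be sketched with the standard library's html.parser module. This is a swapped-in technique, not the answer's method: the StylesheetCollector class is a hypothetical name, and unlike the XPath query it does not restrict matches to links inside head. The input reuses the tricky markup shown above plus one non-stylesheet link from the question.

```python
from html.parser import HTMLParser


class StylesheetCollector(HTMLParser):
    """Collect the href of every <link> whose rel includes "stylesheet"."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # The parser lower-cases tag and attribute names before calling us.
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "stylesheet" in rel_tokens and attributes.get("href"):
            self.hrefs.append(attributes["href"])

    # Self-closing tags (<link ... />) arrive via handle_startendtag, which
    # calls handle_starttag by default, so they are covered as well.


def extract_stylesheets(page_content):
    collector = StylesheetCollector()
    collector.feed(page_content)
    collector.close()
    return collector.hrefs


tricky = """<link REL="stylesheet" hREf =
'/stylesheets/print?1342791421'
media="print"
><!-- link href="/css/stylesheet.css" -->
<link href="/apple-touch-icon-precomposed.png" rel="apple-touch-icon-precomposed" />"""

print(extract_stylesheets(tricky))
```

Because comments are passed to a separate handler, the commented-out link is ignored without any extra code.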

For what it's worth, here is a version using lxml.html as the parsing library (untested):
import lxml.html
from urllib.parse import urlparse  # Python 3 location of urlparse
sample_html = """<meta name="viewport" content="width=device-width, initial-scale=1.0, minimum-scale=1.0" />
<!--[if gte IE 7]><!-->
<link href="/stylesheets/master.css?1342791430" media="screen, projection" rel="stylesheet" type="text/css" />
<link href="/stylesheets/adapt.css?1342791413" media="screen, projection" rel="stylesheet" type="text/css" />
<!-- <![endif]-->
<link href="/stylesheets/print.css?1342791421" media="print" rel="stylesheet" type="text/css" />
<link href="/apple-touch-icon-precomposed.png" rel="apple-touch-icon-precomposed" />
<link href="http://dribbble.com/shots/popular.rss" rel="alternate" title="RSS" type="application/rss+xml" />
"""
page = lxml.html.fromstring(sample_html)
# Keep only the path part of each href, dropping the query string
link_hrefs = (p.path for p in map(urlparse, page.xpath('//head/link/@href')))
for href in link_hrefs:
    if href.rsplit('.', 1)[-1].lower() == 'css':  # implement smarter error handling here
        pass  # do whatever
