The following code block will not accept a multi line comment for the whole block using """ - I suspect this is because three double quotes have been used for a string to span multiple lines as part of this code block.
"""
tabby_cat = "\tI'm tabbed in."
persian_cat = "I'm split\non a line."
backslash_cat = "I'm \\ a \\ cat."
fat_cat = """
I'll do a list
\t* Cat food
\t* Fishies
\t* Catnip\n\t* Grass
"""
print(tabby_cat)
print(persian_cat)
print(backslash_cat)
print(fat_cat)
"""
Are there any alternative methods to ensure this code block is commented out?
I show the problem by an arrow in your code:
"""
tabby_cat = "\tI'm tabbed in."
persian_cat = "I'm split\non a line."
backslash_cat = "I'm \\ a \\ cat."
fat_cat = """
I'll do a list
\t* Cat food
\t* Fishies
\t* Catnip\n\t* Grass
""" <-----------------------------------------DELETE THIS ONE OR ADD ANOTHER ONE
print(tabby_cat)
print(persian_cat)
print(backslash_cat)
print(fat_cat)
"""
Related
I try to display n progress bar in an Alert (instead of using the vuetify progress element I know).
I created a little script that does a nice output as a str :
def update_progress(progress, msg='Progress', bar_length=30):
plain_char = '█'
empty_char = ' '
progress = float(progress)
block = int(round(bar_length * progress))
text = f'{msg}: |{plain_char * block + empty_char * (bar_length - block)}| {progress *100:.1f}%'
return text
when I print the output, I get the following :
and when I insert it as a child in an Alert component :
First problem, the spaces have been trimed, second problem it's changing the block ascii character to a wider one.
Is it normal and how can I avoid it?
As referenced in the github repository of the ipyvuetify lib (#103), it's a 'feature' of HTML: https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model/Whitespace
Using the pre tag will preserve whitespace:
v.Html(
tag='pre',
children=['a mess age'],
class_='info--text',
style_='font-family:roboto'
)
The class and style traitlets are here to keep the formating of the Alert (need to be adapted to your own type_)
I am trying to use a triple-quoted strings in Python3(.7) to build some formated strings.
I have a list of inner strings, which all need to be tabbed in:
This is some text
across multiple
lines.
And a string which should contain the inner string
data{
// string goes here
}
I cannot tab the inner string when I create it. So, my thought was to use dedent with Python3 triple-quoted fstrings:
import textwrap
inner_str = textwrap.dedent(
'''\
This is some text
across multiple
lines.'''
)
full_str = textwrap.dedent(
f'''\
data{{
// This should all be tabbed
{inner_str}
}}'''
)
print(full_str)
However, the indentation is not maintained:
data{
// This should all be tabbed
This is some text
across multiple
lines.
}
The desired result:
data{
// This should all be tabbed
This is some text
across multiple
lines.
}
How can I preserve the indentation of the fstring without pre-tabbing the inner string?
This seems to provide what you want.
import textwrap
inner_str = textwrap.dedent(
'''\
This is some text
across multiple
lines.'''
)
full_str = textwrap.dedent(
f'''
data{{
{textwrap.indent(inner_str, " ")}
}}'''
)
A better solution:
idt = str.maketrans({'\n': "\n "})
print(textwrap.dedent(
f'''
data{{
{inner_str.translate(idt)}
}}'''
))
Another solution with customized tab width:
def indent_inner(inner_str, indent):
return inner_str.replace('\n', '\n' + indent) # os.linesep could be used if the function is needed across different OSs
print(textwrap.dedent(
f'''
data{{
{indent_inner(inner_str, " ")}
}}'''
))
None of the answers here seemed to do what I want, so I'm answering my own question with a solution which gets as close to what I was looking for as I can make. It is not required to pre-tab the data to define its indentation level, nor is it required to not indent the line (which breaks readability). Instead, the indentation level of the current line in the fstring is passed at usage time.
Not perfect but it works.
import textwrap
def code_indent(text, tab_sz, tab_chr=' '):
def indented_lines():
for i, line in enumerate(text.splitlines(True)):
yield (
tab_chr * tab_sz + line if line.strip() else line
) if i else line
return ''.join(indented_lines())
inner_str = textwrap.dedent(
'''\
This is some text
across multiple
lines.'''
)
full_str = textwrap.dedent(
f'''
data{{
{code_indent(inner_str, 8)}
}}'''
)
print(full_str)
Edited to avoid tabbing inner_str.
import textwrap
line_tab = '\n\t'
inner_str = f'''\
This is some text
across multiple
lines.
'''
full_str = textwrap.dedent(f'''\
data{{
// This should all be tabbed
{line_tab.join(inner_str.splitlines())}
}}''')
)
print(full_str)
Output:
data{
// This should all be tabbed
This is some text
across multiple
lines.
}
I am running Python 2.7.5 and using the built-in html parser for what I am about to describe.
The task I am trying to accomplish is to take a chunk of html that is essentially a recipe. Here is an example.
html_chunk = "<h1>Miniature Potato Knishes</h1><p>Posted by bettyboop50 at recipegoldmine.com May 10, 2001</p><p>Makes about 42 miniature knishes</p><p>These are just yummy for your tummy!</p><p>3 cups mashed potatoes (about<br> 2 very large potatoes)<br>2 eggs, slightly beaten<br>1 large onion, diced<br>2 tablespoons margarine<br>1 teaspoon salt (or to taste)<br>1/8 teaspoon black pepper<br>3/8 cup Matzoh meal<br>1 egg yolk, beaten with 1 tablespoon water</p><p>Preheat oven to 400 degrees F.</p><p>Sauté diced onion in a small amount of butter or margarine until golden brown.</p><p>In medium bowl, combine mashed potatoes, sautéed onion, eggs, margarine, salt, pepper, and Matzoh meal.</p><p>Form mixture into small balls about the size of a walnut. Brush with egg yolk mixture and place on a well-greased baking sheet and bake for 20 minutes or until well browned.</p>"
The goal is to separate out the header, junk, ingredients, instructions, serving, and number of ingredients.
Here is my code that accomplishes that
from bs4 import BeautifulSoup
def list_to_string(list):
joined = ""
for item in list:
joined += str(item)
return joined
def get_ingredients(soup):
for p in soup.find_all('p'):
if p.find('br'):
return p
def get_instructions(p_list, ingredient_index):
instructions = []
instructions += p_list[ingredient_index+1:]
return instructions
def get_junk(p_list, ingredient_index):
junk = []
junk += p_list[:ingredient_index]
return junk
def get_serving(p_list):
for item in p_list:
item_str = str(item).lower()
if ("yield" or "make" or "serve" or "serving") in item_str:
yield_index = p_list.index(item)
del p_list[yield_index]
return item
def ingredients_count(ingredients):
ingredients_list = ingredients.find_all(text=True)
return len(ingredients_list)
def get_header(soup):
return soup.find('h1')
def html_chunk_splitter(soup):
ingredients = get_ingredients(soup)
if ingredients == None:
error = 1
header = ""
junk_string = ""
instructions_string = ""
serving = ""
count = ""
else:
p_list = soup.find_all('p')
serving = get_serving(p_list)
ingredient_index = p_list.index(ingredients)
junk_list = get_junk(p_list, ingredient_index)
instructions_list = get_instructions(p_list, ingredient_index)
junk_string = list_to_string(junk_list)
instructions_string = list_to_string(instructions_list)
header = get_header(soup)
error = ""
count = ingredients_count(ingredients)
return (header, junk_string, ingredients, instructions_string,
serving, count, error)
It works well except in situations where I have chunks that contain strings like "Sauté" because soup = BeautifulSoup(html_chunk) causes Sauté to turn into Sauté and this is a problem because I have a huge csv file of recipes like the html_chunk and I'm trying to structure all of them nicely and then get the output back into a database. I tried checking it Sauté comes out right using this html previewer and it still comes out as Sauté. I don't know what to do about this.
What's stranger is that when I do what BeautifulSoup's documentation shows
BeautifulSoup("Sacré bleu!")
# <html><head></head><body>Sacré bleu!</body></html>
I get
# Sacré bleu!
But my colleague tried that on his Mac, running from terminal, and he got exactly what the documentation shows.
I really appreciate all your help. Thank you.
This is not a parsing problem; it is about encoding, rather.
Whenever working with text which might contain non-ASCII characters (or in Python programs which contain such characters, e.g. in comments or docstrings), you should put a coding cookie in the first or - after the shebang line - second line:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
... and make sure this matches your file encoding (with vim: :set fenc=utf-8).
BeautifulSoup tries to guess the encoding, sometimes it makes a mistake, however you can specify the encoding by adding the from_encoding parameter:
for example
soup = BeautifulSoup(html_text, from_encoding="UTF-8")
The encoding is usually available in the header of the webpage
What is the regular expression for single line java comments:
I am trying the following grammar :
def single_comment(t):
r'\/\/.~(\n)'
#r'//.*$'
pass
but, I am unable to ignore the single line comments how can I do it?
Python regular expression for matching single line comments (only matches comments that start with //, not /* */). Unfortunately, this regular expression is pretty ugly as it has to account for escaped characters and // within strings. You should find a more easily understandable solution if you ever need this in real code.
import re
pattern = re.compile(r'^(?:[^"/\\]|\"(?:[^\"\\]|\\.)*\"|/(?:[^/"\\]|\\.)|/\"(?:[^\"\\]|\\.)*\"|\\.)*//(.*)$')
This is a little script that runs a bunch of test strings against the pattern.
import re
pattern = re.compile(r'^(?:[^"/\\]|\"(?:[^\"\\]|\\.)*\"|/(?:[^/"\\]|\\.)|/\"(?:[^\"\\]|\\.)*\"|\\.)*//(.*)$')
tests = [
(r'// hello world', True),
(r' // hello world', True),
(r'hello world', False),
(r'System.out.println("Hello, World!\n"); // prints hello world', True),
(r'String url = "http://www.example.com"', False),
(r'// hello world', True),
(r'//\\', True),
(r'// "some comment"', True),
(r'new URI("http://www.google.com")', False),
(r'System.out.println("Escaped quote\""); // Comment', True)
]
tests_passed = 0
for test in tests:
match = pattern.match(test[0])
has_comment = match != None
if has_comment == test[1]:
tests_passed += 1
print "Passed {0}/{1} tests".format(tests_passed, len(tests))
I think this works (using pyparsing):
data = """
class HelloWorld {
// method main(): ALWAYS the APPLICATION entry point
public static void main (String[] args) {
System.out.println("Hello World!"); // Nested //Print 'Hello World!'
System.out.println("http://www.example.com"); // Another nested // Print a URL
System.out.println("\"http://www.example.com"); // A nested escaped quote // Print another URL
}
}"""
from pyparsing import *
from pprint import pprint
dbls = QuotedString('"', '\\', '"')
sgls = QuotedString("'", '\\', "'")
strings = dbls | sgls
pprint(dblSlashComment.ignore(strings).searchString(data).asList())
[['// method main(): ALWAYS the APPLICATION entry point'],
["// Nested //Print 'Hello World!'"],
['// Another nested // Print a URL'],
['// A nested escaped quote // Print another URL']]
Should you have /* ... */ style comments, that happen to have single line comments in them, and don't actually want those, then you can use:
pprint(dblSlashComment.ignore(strings | cStyleComment).searchString(data).asList())
(as discussed in https://chat.stackoverflow.com/rooms/26267/discussion-between-nhahtdh-and-martega)
I'm trying to color some specific part of the text, i have tried to say:
if word.strip().startswith(":"):
self.setAttributesForRange(NSColor.greenColor(), None, highlightOffset, len(word))
When someone types the sign : it gets colored green. That is good, but it keeps coloring the word after it like this:
:Hello Hello :Hello <---- this all gets colored green, but I want something like:
:Hello Hello :Hello <---- where everything get colored except the middle "hello" because it doesn't start with the sign : , please help me out
from Foundation import *
from AppKit import *
import objc
class PyObjC_HighlightAppDelegate(NSObject):
# The connection to our NSTextView in the UI
highlightedText = objc.IBOutlet()
# Default font size to use when highlighting
fontSize = 12
def applicationDidFinishLaunching_(self, sender):
NSLog("Application did finish launching.")
def textDidChange_(self, notification):
"""
Delegate method called by the NSTextView whenever the contents of the
text view have changed. This is called after the text has changed and
been committed to the view. See the Cocoa reference documents:
http://developer.apple.com/documentation/Cocoa/Reference/ApplicationKit/Classes/NSText_Class/Reference/Reference.html
http://developer.apple.com/documentation/Cocoa/Reference/ApplicationKit/Classes/NSTextView_Class/Reference/Reference.html
Specifically the sections on Delegate Methods for information on additional
delegate methods relating to text control is NSTextView objects.
"""
# Retrieve the current contents of the document and start highlighting
content = self.highlightedText.string()
self.highlightText(content)
def setAttributesForRange(self, color, font, rangeStart, rangeLength):
"""
Set the visual attributes for a range of characters in the NSTextView. If
values for the color and font are None, defaults will be used.
The rangeStart is an index into the contents of the NSTextView, and
rangeLength is used in combination with this index to create an NSRange
structure, which is passed to the NSTextView methods for setting
text attributes. If either of these values are None, defaults will
be provided.
The "font" parameter is used as an key for the "fontMap", which contains
the associated NSFont objects for each font style.
"""
fontMap = {
"normal" : NSFont.systemFontOfSize_(self.fontSize),
"bold" : NSFont.boldSystemFontOfSize_(self.fontSize)
}
# Setup sane defaults for the color, font and range if no values
# are provided
if color is None:
color = NSColor.blackColor()
if font is None:
font = "normal"
if font not in fontMap:
font = "normal"
displayFont = fontMap[font]
if rangeStart is None:
rangeStart = 0
if rangeLength is None:
rangeLength = len(self.highlightedText.string()) - rangeStart
# Set the attributes for the specified character range
range = NSRange(rangeStart, rangeLength)
self.highlightedText.setTextColor_range_(color, range)
self.highlightedText.setFont_range_(displayFont, range)
def highlightText(self, content):
"""
Apply our customized highlighting to the provided content. It is assumed that
this content was extracted from the NSTextView.
"""
# Calling the setAttributesForRange with no values creates
# a default that "resets" the formatting on all of the content
self.setAttributesForRange(None, None, None, None)
# We'll highlight the content by breaking it down into lines, and
# processing each line one by one. By storing how many characters
# have been processed we can maintain an "offset" into the overall
# content that we use to specify the range of text that is currently
# being highlighted.
contentLines = content.split("\n")
highlightOffset = 0
for line in contentLines:
if line.strip().startswith("#"):
# Comment - we want to highlight the whole comment line
self.setAttributesForRange(NSColor.greenColor(), None, highlightOffset, len(line))
elif line.find(":") > -1:
# Tag - we only want to highlight the tag, not the colon or the remainder of the line
startOfLine = line[0: line.find(":")]
yamlTag = startOfLine.strip("\t ")
yamlTagStart = line.find(yamlTag)
self.setAttributesForRange(NSColor.blueColor(), "bold", highlightOffset + yamlTagStart, len(yamlTag))
elif line.strip().startswith("-"):
# List item - we only want to highlight the dash
listIndex = line.find("-")
self.setAttributesForRange(NSColor.redColor(), None, highlightOffset + listIndex, 1)
# Add the processed line to our offset, as well as the newline that terminated the line
highlightOffset += len(line) + 1
It all depends on what word is.
In [6]: word = ':Hello Hello :Hello'
In [7]: word.strip().startswith(':')
Out[7]: True
In [8]: len(word)
Out[8]: 19
Compare:
In [1]: line = ':Hello Hello :Hello'.split()
In [2]: line
Out[2]: [':Hello', 'Hello', ':Hello']
In [3]: for word in line:
print word.strip().startswith(':')
print len(word)
...:
True
6
False
5
True
6
Notice the difference in len(word), which I suspect is causing your problem.