I want to get the following list:
['value1', 'value2', 'value3']
from the following string:
some_option=value1,value2,value3
Is it ugly to use the following code to get this?
some_string = 'some_option=value1,value2,value3'
print(some_string.split('=')[1].split(','))
I don't agree with the "ugly" statement, as that seems perfectly Pythonic to me, and efficient enough for the job at least. IMHO the question of how the code appears doesn't matter as much as whether it's easily understandable to others what the code is doing, and more importantly whether it actually gets the task accomplished - which it certainly does.
A slight change for optimization I would propose is to pass the number of splits to str.split, and then you can safely access the last result rather than the 2nd element:
some_string = 'some_option=value1,value2,value3'
print(some_string.split('=', 1)[-1].split(','))
"Ugly" is a matter of opinion, and cannot be answered on Stack Overflow.
The real issue with your code, however, is that it has no error checking: if the input string does not contain an = character, indexing [1] refers to a non-existent list member and raises an IndexError.
If this code is meant for a production application and not some personal tool, you should do something like this:
some_string = 'some_option=value1,value2,value3'
try:
    print(some_string.split('=')[1].split(','))
except IndexError:
    print('Invalid input')
Or, if you just want to skip the bad strings you can put pass in the except clause.
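As a minimal sketch of the same error check without indexing, str.partition can report whether the separator was found at all. The function name parse_option and the choice to raise ValueError are assumptions made for illustration, not part of the question:

```python
def parse_option(line):
    """Split 'name=v1,v2,...' into (name, [v1, v2, ...]).

    Hypothetical helper: raises ValueError when the line has no '='.
    """
    # partition() always returns three parts; sep is '' if '=' is missing.
    name, sep, raw_values = line.partition('=')
    if not sep:
        raise ValueError('expected "name=value,...", got %r' % line)
    return name, raw_values.split(',')


print(parse_option('some_option=value1,value2,value3'))
# ('some_option', ['value1', 'value2', 'value3'])
```

The caller can then decide whether to report the ValueError or skip the line.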
Without knowing more context, it looks fine to me. If you feel bad about the ugliness, one way to improve might be to put that inside a function, such as
def scrape_values(row):
    return row.split('=')[1].split(',')
some_string = 'some_option=value1,value2,value3'
print(scrape_values(some_string))
If you give the function a meaningful name, then we can "abstract away" the ugliness.
Am I correct in thinking that Python doesn't have a direct equivalent for Perl's __END__?
print "Perl...\n";
__END__
End of code. I can put anything I want here.
One thought that occurred to me was to use a triple-quoted string. Is there a better way to achieve this in Python?
print "Python..."
"""
End of code. I can put anything I want here.
"""
The __END__ block in Perl dates from a time when programmers had to work with data from the outside world and liked to keep examples of it in the program itself.
Hard to imagine, I know.
It was useful, for example, if you had a moving target like a hardware log file whose messages mutate with firmware updates, and you wanted to compare old and new versions of a line; or to keep notes not strictly related to the program's operation ("Code seems slow on day x of month every month"); or, as mentioned above, to keep a reference set of data to run the program against. Telcos are an example of an industry where this was a frequent requirement.
Lastly, Python's cult-like restrictiveness seems to have a real and tiresome effect on the mindset of its advocates. If your only response to a question is "Why would you want to do that when you could do X?" when X is not as useful, please keep quiet++.
The triple-quote form you suggested will still create a Python string, whereas Perl's parser simply ignores anything after __END__. You can't write:
"""
I can put anything in here...
Anything!
"""
import os
os.system("rm -rf /")
Comments are more suitable in my opinion.
#__END__
#Whatever I write here will be ignored
#Woohoo !
What you're asking for does not exist.
Proof: http://www.mail-archive.com/python-list@python.org/msg156396.html
A simple solution is to escape any " as \" and do a normal multi line string -- see official docs: http://docs.python.org/tutorial/introduction.html#strings
(Also, atexit doesn't work: http://www.mail-archive.com/python-list@python.org/msg156364.html)
Hm, what about sys.exit(0) ? (assuming you do import sys above it, of course)
As to why it would be useful, sometimes I sit down to do a substantial rewrite of something and want to mark my "good up to this point" place.
By using sys.exit(0) in a temporary manner, I know nothing below that point will get executed, therefore if there's a problem (e.g., server error) I know it had to be above that point.
I like it slightly better than commenting out the rest of the file, just because there are more chances to make a mistake and uncomment something (stray key press at beginning of line), and also because it seems better to insert 1 line (which will later be removed), than to modify X-many lines which will then have to be un-modified later.
But yeah, this is splitting hairs; commenting works great too... assuming your editor supports easily commenting out a region, of course; if not, sys.exit(0) all the way!
I use __END__ all the time for several of the reasons given. I've been doing it for so long now that I put it in by force of habit (usually preceded by an exit('0');), along with BEGIN {} / END {} routines. It is a shame that Python doesn't have an equivalent, but I just comment out the lines at the bottom: extraneous, but that's about what you get with one-way-to-rule-them-all languages.
Python does not have a direct equivalent to this.
Why do you want it? It doesn't sound like a really great thing to have when there are more consistent ways, like putting the text at the end as comments. (That's how we include arbitrary text in Python source files; triple-quoted strings are for making multi-line strings, not for non-code-related text.)
Your editor should be able to make using many lines of comments easy for you.
Ok so basically this is what I know, and it does work, using Python3:
color="Red1 and Blue2!"
color[2]=="d"
True
What I need is this: when I index any position in the string (color[ ]), which yields a single character, I want to test whether that character is a lowercase or uppercase letter, excluding all digits and other characters (.*&^%$##!).
In other words, something to the effect of the below:
color="Red1 and Blue2!"
if color[5]==[a-zA-z]:
    doSomething
else:
    doSomethingElse
Of course, what I just listed above does not work. Perhaps my syntax is wrong; perhaps it just can't be done. If I use only a single letter on the right side of the equals sign, all is well, but as I said, I need whatever single character is pulled in on the left side to match any letter on the right.
First off, I want to make sure that what I'm trying to accomplish is possible.
Second, if it is indeed possible, I'd like to accomplish it without importing anything other than sys.
If the only way to accomplish this is by importing something else, then I will take a look at that suggestion; however, I prefer not to import anything if at all possible.
I've searched my books and a whole lot of other questions on this site, and I can't seem to find anything that matches. Thanks.
For the case of looking for letters, a simple .isalpha() check:
if color[5].isalpha():
will work.
For the general case where a specific check function doesn't exist, you can use in checks:
if color[5] in '13579': # Checks for membership in some arbitrary character set
If the character set is large enough, you may want to preconvert it to a frozenset for checking (frozenset membership tests are roughly O(1), vs. O(n) for str, but str tests are optimized enough that you'd need quite a long str before the frozenset makes sense; possibly longer than the one in the example):
CHARSET = frozenset('13579adgjlqetuozcbm')
if color[5] in CHARSET:
Alternatively, you can use regular expressions to get the character classes you were trying to get:
import re
# Do this once up front to avoid recompiling, then use repeatedly
islet = re.compile('^[a-zA-Z]$').match
...
if islet(color[5]):
This is where isalpha() is helpful.
color="Red1 and Blue2!"
if color[5].isalpha():
    doSomething
else:
    doSomethingElse
There's also isnumeric(), if you need numbers.
Not really sure why you'd require not importing anything from the standard libraries though.
import string
color="Red1 and Blue2!"
if color[5] in string.ascii_letters:
    print("do something")
else:
    print("do something else")
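One difference between the answers above is worth seeing side by side: in Python 3, str.isalpha() accepts any Unicode letter, while a membership test against string.ascii_letters accepts only a-z and A-Z, which is what the [a-zA-Z] class in the question describes. A small sketch, where the helper name is_ascii_letter is made up for illustration:

```python
import string

# Build the set once; membership tests on a frozenset are hash lookups.
ASCII_LETTERS = frozenset(string.ascii_letters)


def is_ascii_letter(ch):
    """Hypothetical helper: True only for a single character in a-z or A-Z."""
    return ch in ASCII_LETTERS


color = "Red1 and Blue2!"
print(color[5].isalpha(), is_ascii_letter(color[5]))  # True True   ('a')
print(color[3].isalpha(), is_ascii_letter(color[3]))  # False False ('1')
print("é".isalpha(), is_ascii_letter("é"))            # True False
```

Which of the two is right depends on whether accented and non-Latin letters should count as letters for your input.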
I am trying to match a string with a regular expression but it is not working.
What I am trying to do is simple: it is the typical situation where a user enters a range of pages, or single pages. I am reading the string and checking whether it is correct or not.
Expressions I am expecting, for a range of pages are like: 1-3, 5-6, 12-67
Expressions I am expecting, for single pages are like: 1,5,6,9,10,12
This is what I have done so far:
pagesOption1 = re.compile(r'\b\d\-\d{1,10}\b')
pagesOption2 = re.compile(r'\b\d\,{1,10}\b')
Seems like the first expression works, but not the second.
Also, would it be possible to merge both of them into one single regular expression, so that if the user enters either something like 1-2, 7-10 or something like 3,5,6,7, the input is recognized as valid?
Simpler is better
Matching the entire input with one regex isn't simple, as the proposed solutions show; at least, it is not as simple as it could and should be. Such a regex becomes read-only very quickly, and anyone who isn't regex-savvy will probably scrap it for a simpler, more explicit solution when they need to modify it.
Simplest
First, .split(",") the entire string into individual entries (stripping any surrounding whitespace from each); you need these anyway to parse out the usable numbers.
Then the test for each entry becomes very simple:
^(\d+)(?:-(\d+))?$
It says that the string must start with one or more digits, optionally followed by a single - and one or more digits, and then the string must end.
This makes your logic as simple and maintainable as possible. You also get the benefit of knowing exactly what part of the input is wrong and why so you can report it back to the user.
The capturing groups are there because you will need the numbers parsed out to actually use the input: when an entry matches, you get the numbers directly, without adding more code to parse them again.
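The split-then-validate approach described above could be sketched as follows. The function name parse_pages, the (first, last) tuple representation, and raising ValueError on a bad entry are assumptions made for illustration:

```python
import re

# One entry: a page number, optionally followed by "-" and a second number.
ENTRY = re.compile(r'^(\d+)(?:-(\d+))?$')


def parse_pages(text):
    """Return a list of (first, last) page tuples.

    Hypothetical helper: raises ValueError naming the first bad entry.
    """
    pages = []
    for entry in text.split(','):
        match = ENTRY.match(entry.strip())
        if match is None:
            raise ValueError('bad page entry: %r' % entry)
        first = int(match.group(1))
        # A single page is treated as a range of length one.
        last = int(match.group(2)) if match.group(2) else first
        pages.append((first, last))
    return pages


print(parse_pages('1-3, 5, 12-67'))  # [(1, 3), (5, 5), (12, 67)]
```

Because each entry is checked on its own, the error message can point at exactly the part of the input that is wrong.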
This regex should work -
^(?:(\d+\-\d+)|(\d+))(?:\,[ ]*(?:(\d+\-\d+)|(\d+)))*$
Testing this -
>>> import re
>>> test_vals = [
...     '1-3, 5-6, 12-67',
...     '1,5,6,9,10,12',
...     '1-3,1,2,4.5',
...     'abcd',
... ]
>>> regex = re.compile(r'^(?:(\d+\-\d+)|(\d+))(?:\,[ ]*(?:(\d+\-\d+)|(\d+)))*$')
>>> for val in test_vals:
...     print(val)
...     if regex.match(val) is None:
...         print("Fail")
...     else:
...         print("Pass")
...
1-3, 5-6, 12-67
Pass
1,5,6,9,10,12
Pass
1-3,1,2,4.5
Fail
abcd
Fail
Sorry in advance if this is something really easy, I'm very new to Python. This seems like it should be something very simple, but I am having a frustratingly hard time finding an answer online.
I'm writing a Python script in a preliminary, pseudocode sort of fashion at the moment, and don't have all variables defined yet. I want to be able to put a comment in the middle of a line to mark where a variable will go, without commenting out the entire rest of the line to the right.
To visualize what I mean, this is how what I want is done in C/C++:
int some_variable = /*some_variable_that_doesnt_exist_yet*/ + an_existing_variable;
Basically I need to be able to comment the middle of a line without commenting the left or right sides of said comment. Is there any way to do this?
I know there's a way to do it like this (or something similar to this):
some_variable = #some_variable_that_doesnt_exist_yet
\+ an_existing_variable
...but I'd rather not do it that way if possible, just for ease of reading.
Unfortunately no. But you can always break things into multiple lines and comment in between. Parentheses come in handy for that.
my_var = (#some_variable +
some_other_var)
As with any language switch, you will need to learn new habits that fit the features of the language. Python does not have the feature you desire. You could use some horrendous hack to force something that looks a bit similar, but I suggest you don't.
Some options are: document the TODO on a neighbouring line, perhaps using docstrings; don't sweat it and figure you'll add it later when your tests start requiring it; or use the fact that variables are lightweight and just create them with dummy values that leave the final calculation unchanged.
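The last option, a dummy value that leaves the final calculation unchanged, might look like the sketch below. The variable names are hypothetical, and 0 is chosen only because it is the neutral element for addition; a product would need 1 instead:

```python
an_existing_variable = 10

# Hypothetical placeholder: adding 0 does not change the sum, so the
# expression already runs and the TODO marks what is still missing.
not_yet_defined = 0  # TODO: replace with the real value

some_variable = not_yet_defined + an_existing_variable
print(some_variable)  # 10
```

Searching the file for TODO later finds every placeholder that still needs a real value.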
Inline comments do not exist in Python.
The closest that I know of is the use of strings:
some_variable = "some_variable_that_doesnt_exist_yet +" and an_existing_variable
But that is terrible, and you should never do that.
You can't: According to the documentation, comments in Python start with the hash character (#) and extend to the end of the physical line. See An Informal Introduction to Python.
Why not use something like:
name = "Use this for something later"
:
:
name = 27
Python does not have inline or block comments like this. You can add a string (or any other expression), as suggested by others, but you will have to make sure to (consistently) replace all of those placeholders, which is extremely error-prone.
If it's only the value of the variable that is missing or unclear, and not the variable itself, how about this:
variable_to_be_defined = None # TODO define me!
some_other_variable = variable_to_be_defined + an_existing_variable
I recently wrote a rather ugly-looking one-liner, and was wondering whether it is better Python style to break it up into multiple lines or to leave it as a commented one-liner. I looked in PEP 8, but it did not mention anything about this.
This is the code I wrote:
def getlink(url):
    return(urllib.urlopen(url).readlines()[425].split('"')[7])
    # Fetch the page at "url", read the 426th line, split it along
    # quotes, and return the 8th quote delimited section
But would something like this be better style?:
def getlink(url):
    url_file = urllib.urlopen(url)
    url_data = url_file.readlines()
    line = url_data[425]
    line = line.split('"')
    return line[7]
Or perhaps something in between?
My vote would be based on readability. I find your one-liner quicker to digest than the multi-line example.
One-liners are great as long as it fits in one eye-ful, and collectively they perform one distinct task.
Personally, I would write that as:
def getlink(url):
    content = urllib.urlopen(url).readlines()
    return content[425].split('"')[7]
(Now, venturing into downvote realm...)
Your block of comments is great for someone unfamiliar with Python, but arguably, they reduce readability by increasing the information to digest. A pythonista reading the code would quickly understand your one-liner, and yet may then proceed to read the comments just in case there are caveats or edge cases to be warned of.
I'm not saying comments are evil, just that verbose comments can have a negative effect on readability. E.g. the classic : x+=1 # increment x by 1
Naturally, this is down to the purpose and audience of the code.
I also find the expression urllib.urlopen(url).readlines()[425].split('"')[7] rather comprehensible.
However, I would prefer:
def getlink(url):
    line425 = urllib.urlopen(url).readlines()[425]
    return line425.split('"')[7]
To me, the multi-line version is much better. With multi-line code you break up the logic and use variables to store intermediate output. The variable names then allow me to read the logic and see what my output depends on. You also don't have to write elaborate comments in this case. After some months, I find it easier to read the multi-line version than the single-line version in such cases. The example you posted is not complex, but just to stay consistent I would have written it in multiple lines.
The multi-line version conveys semantics, which the one-liner makes harder to grasp.
This is how I read it:
def getlink(url):
    url_file = ...
    url_data = ...
    line = url_data[425]
    ... = ... .split('"')
    return line[7]
Which means I can get to the important parts faster and more easily, without scrambling through a long expression that mixes:
- general calls to urlopen() and readlines() (obvious for a function called getlink(url)),
- and more specific parts (url_data[425] and line[7]).
However, Shawn Chin's version is even easier to read.
Your one-liner is not that obscene (at least to my eyes), plus it's a good thing you've added the comments.
When you write software, think of yourself in 8 months or so, looking at this piece of code again. It should be as readable then as you perceive it to be today.
The multiline version is better Python style. It is easier to read, easier to understand, and easier to modify.
This is Python -- easy is good! :)
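A middle ground between the versions discussed above is to name the two magic numbers and separate the parsing from the fetching, so the parsing can be tried on plain lists of lines. This is only a sketch: it uses Python 3's urllib.request.urlopen rather than the Python 2 urllib.urlopen from the question, and the constant names, the extract_link helper, and the UTF-8 decoding are all assumptions:

```python
from urllib.request import urlopen

LINK_LINE = 425  # zero-based index of the line that holds the link
LINK_FIELD = 7   # zero-based index of the quote-delimited field on that line


def extract_link(lines, line_index=LINK_LINE, field_index=LINK_FIELD):
    """Return one quote-delimited field from one line of the page."""
    return lines[line_index].split('"')[field_index]


def getlink(url):
    """Fetch the page at url and return the link found at the fixed position."""
    with urlopen(url) as response:
        # Assumes the page is UTF-8; undecodable bytes are replaced.
        text = response.read().decode('utf-8', errors='replace')
    return extract_link(text.splitlines())
```

With the position named in constants, the explanatory comment from the question becomes largely unnecessary.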