compare two python strings that contain numbers [duplicate] - python

This question already has answers here:
How do I compare version numbers in Python?
UPDATE: I should have specified this sooner, but not all of the names are simply floats. Some of them are prefixed with "YT", for example "YT1.1", so the same problem applies: YT1.9 < YT1.11 should be true. I'm really surprised that the string comparison fails.
Hello,
This should be a pretty simple question, but I can't seem to find the answer. I'd like to sort a bunch of Excel worksheets by name. Each name is a number, but numbered the way textbook "sections" are: section 4.11 comes after 4.10, and both come after 4.9 and 4.1. I thought simply comparing these numbers as strings would do, but I get the following:
>>> s1 = '4.11'
>>> s2 = '4.2'
>>> s1> s2
False
>>> n1 = 4.11
>>> n2 = 4.2
>>> n1 > n2
False
How can I compare these two values such that 4.11 is greater than 4.2?

Convert the names to tuples of integers and compare the tuples:
def splittedname(s):
    return tuple(int(x) for x in s.split('.'))

splittedname(s1) > splittedname(s2)
Update: Since your names apparently can contain other characters than digits, you'll need to check for ValueError and leave any values that can't be converted to ints unchanged:
import re

def tryint(x):
    try:
        return int(x)
    except ValueError:
        return x

def splittedname(s):
    return tuple(tryint(x) for x in re.split('([0-9]+)', s))
To sort a list of names, use splittedname as a key function to sorted:
>>> names = ['YT4.11', '4.3', 'YT4.2', '4.10', 'PT2.19', 'PT2.9']
>>> sorted(names, key=splittedname)
['4.3', '4.10', 'PT2.9', 'PT2.19', 'YT4.2', 'YT4.11']
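For readers on Python 3, here is a minimal sketch of the same idea. The name natural_key and the sample worksheet names are made up for illustration. It relies on re.split with a capturing group returning text and digit runs in alternation, so the chunks at odd positions are always the digit runs and values of the same type line up when tuples are compared.

```python
import re

def natural_key(name):
    """Split a name into alternating text and integer chunks."""
    # With a capturing group, re.split keeps the digit runs in the result:
    # 'YT4.11' -> ['YT', '4', '.', '11', '']. Odd positions are the digit runs.
    parts = re.split(r'([0-9]+)', name)
    return tuple(int(p) if i % 2 else p for i, p in enumerate(parts))

# Hypothetical worksheet names, mixing plain and prefixed ones.
names = ['YT1.11', 'YT1.9', '4.11', '4.2', '4.10']
ordered = sorted(names, key=natural_key)
print(ordered)  # ['4.2', '4.10', '4.11', 'YT1.9', 'YT1.11']
```

Names without a prefix sort first here because their leading text chunk is the empty string.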

This is not a built-in method, but it ought to work:
>>> def lt(num1, num2):
...     for a, b in zip(num1.split('.'), num2.split('.')):
...         if int(a) < int(b):
...             return True
...         if int(a) > int(b):
...             return False
...     return False
...
>>> lt('4.2', '4.11')
True
That can be cleaned up, but it gives you the gist.

What you're looking for is called "natural sorting", as opposed to "lexicographical sorting". There are several recipes out there that do this, since the exact output you want is implementation-specific. A quick Google search yields this (note: this is not my code, nor have I tested it):
import re

def tryint(s):
    try:
        return int(s)
    except ValueError:
        return s

def alphanum_key(s):
    """ Turn a string into a list of string and number chunks.
        "z23a" -> ["z", 23, "a"]
    """
    return [ tryint(c) for c in re.split('([0-9]+)', s) ]

def sort_nicely(l):
    """ Sort the given list in the way that humans expect.
    """
    l.sort(key=alphanum_key)
http://nedbatchelder.com/blog/200712.html#e20071211T054956

Use s1.split(".") to split each name into the parts before and after the dot, convert the parts to integers (as strings, "10" would sort before "9"), then sort the list of lists. Example:
import random
# every section number from 1.0 to 4.98 as a [major, minor] pair of ints
sheets = [[x, y] for x in range(1, 5) for y in range(0, 99)]
print(sheets)  # sheets in order
random.shuffle(sheets)
print(sheets)  # sheets out of order
sheets.sort()
print(sheets)  # sheets back in order
So, your implementation might be:
# assume input_sheets is a list of purely numeric worksheet names such as "4.11"
sheets = [[int(part) for part in x.split(".")] for x in input_sheets]
sheets.sort()

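If you know they are real numbers [*], you can compare them as floats, but note that this gives decimal ordering, not section ordering:
>>> float(s1) > float(s2)
False
As decimals, 4.11 is less than 4.2, so this only helps when the names really are decimal values.
[*] Otherwise, be ready to handle a raised ValueError.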
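A quick check of how float comparison and section-style comparison differ, as a sketch only. The helper section_key is a hypothetical name, and it assumes the names contain nothing but digits and dots.

```python
s1, s2 = '4.11', '4.2'

def section_key(s):
    # Hypothetical helper: '4.11' -> (4, 11). Assumes digits and dots only.
    return tuple(int(part) for part in s.split('.'))

float_says_greater = float(s1) > float(s2)              # decimal ordering
tuple_says_greater = section_key(s1) > section_key(s2)  # section ordering
collapsed = float('4.10') == float('4.1')               # two sections, one float

print(float_says_greater)  # False
print(tuple_says_greater)  # True
print(collapsed)           # True
```

The last line shows a second problem with floats: sections 4.1 and 4.10 become indistinguishable.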

Related

How to write a function to compare two strings alphabetically?

If a="formula" and b="formulab" are two strings, how can the function compare which string comes after the other and return a true or false value?
We know string b will come after, but how to determine that using the function?
def alphabet_order(a,b):
    if len(a) > len(b):
        return True
    else:
        return False
I am getting the length of the strings, but how can I sort the strings lexicographically and compare them?
Python compares strings lexicographically, character by character, using each character's numeric code (its Unicode code point, which matches the ASCII table for plain English letters).
Here is a link to the numeric order: https://www.asciitable.com/
So you now have two options:
compare by length (using len()).
compare by value.
Here is an example of the value comparison:
a = 'hello'
b = 'world'
if a > b:
    print('a > b')
else:
    print('a < b')
which returns this:
a < b
because "hello" comes before "world" in that ordering.
You can wrap the above into a function.
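Wrapped into a function, the value comparison might look like the sketch below. The name comes_after is made up for illustration, and the casefold() call is one possible way to ignore case, not something the question asked for.

```python
def comes_after(a, b):
    """Return True if string a sorts after string b."""
    return a > b

r1 = comes_after("formulab", "formula")  # a longer string sorts after its own prefix
r2 = comes_after("hello", "world")
r3 = comes_after("Zebra", "apple")       # uppercase letters have lower codes than lowercase
r4 = comes_after("Zebra".casefold(), "apple".casefold())  # case-insensitive variant

print(r1, r2, r3, r4)  # True False False True
```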

incorrect output while calling function

I'm a newbie learning to code, and I stumbled upon an incorrect output while practicing some code in Python. Please help me with this. I tried my best to find the problem in the code, but I could not find it.
Code:
def compare(x,y):
    if x>y:
        return 1
    elif x==y:
        return 0
    else:
        return -1
i=raw_input("enter x\n")
j=raw_input("enter y\n")
print compare(i,j)
Output:
-> python python.py
enter x
10
enter y
5
-1
The output I expected is 1, but the output I receive is -1. Please help me find the error in my code.
Thank you.
raw_input always returns a string, so you have to convert the input values into numbers.
i=raw_input("enter x\n")
j=raw_input("enter y\n")
print compare(i,j)
should be
i=int(raw_input("enter x\n"))
j=int(raw_input("enter y\n"))
print compare(i,j)
Your issue is that raw_input() returns a string, not an integer.
Therefore, what your function is actually doing is checking "10" > "5". Strings are compared character by character, and "1" sorts before "5", so the result is False; the equality test fails too, and the function falls through to the else clause.
To fix this, you'll need to convert your input strings to integers by wrapping the values in int(), i.e.
i = int(raw_input("enter x\n"))
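The difference can be reproduced without typing anything in. The sketch below is written for Python 3, where input() plays the role of raw_input(); the literal strings stand in for what the user would type.

```python
def compare(x, y):
    if x > y:
        return 1
    elif x == y:
        return 0
    else:
        return -1

# Stand-ins for what raw_input()/input() would return: always strings.
i, j = "10", "5"

as_strings = compare(i, j)            # compares '1' with '5' first
as_ints = compare(int(i), int(j))     # compares the numbers 10 and 5

print(as_strings)  # -1
print(as_ints)     # 1
```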
Use the built-in cmp function (available in Python 2; it was removed in Python 3).
>>> help(cmp)
Help on built-in function cmp in module __builtin__:
cmp(...)
    cmp(x, y) -> integer
    Return negative if x<y, zero if x==y, positive if x>y.
So your function will look like this.
>>> def compare(x,y):
...     return cmp(x,y)
...
>>>
Then read the two values with raw_input(), which returns a string. If you type two numbers separated by a space, split() breaks the string into a list of two strings, and map applies int to each element of that list; the results are unpacked into x and y.
>>> x,y = map(int, raw_input().split())
3 2
Now compare x and y. Here x = 3 and y = 2, and per its documentation cmp() returns negative if x<y, zero if x==y, and positive if x>y.
>>> compare(x,y)
1
>>> compare(y,x)
-1
>>> compare(x-1,y)
0
>>>
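Since cmp no longer exists in Python 3, a common substitute is to subtract the two comparison results. The sketch below defines its own cmp that way; the string "3 2" stands in for the typed input.

```python
def cmp(x, y):
    # True and False behave as 1 and 0, so this yields 1, 0 or -1.
    return (x > y) - (x < y)

# Stand-in for raw_input()/input(): two numbers separated by a space.
x, y = map(int, "3 2".split())

results = [cmp(x, y), cmp(y, x), cmp(x - 1, y)]
print(results)  # [1, -1, 0]
```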

Python: general iterator or pure function for testing any condition across list

I would like to have a function AllTrue that takes three arguments:
List: a list of values
Function: a function to apply to all values
Condition: something to test against the function's output
and return a boolean of whether or not all values in the list match the criteria.
I can get this to work for basic conditions as follows:
def AllTrue(List, Function = "Boolean", Condition = True):
    flag = True
    condition = Condition
    if Function == "Boolean":
        for element in List:
            if element != condition:
                flag = False
                break
    else:
        Map = map(Function, List)
        for m in Map:
            if m != condition:
                flag = False
                break
    return flag
Since Python doesn't have a function meant for explicitly returning whether something is True, I just make the default "Boolean". One could clean this up by defining TrueQ to return True if an element is True and then just mapping TrueQ on the List.
The else handles queries like:
l = [[0,1], [2,3,4,5], [6,7], [8,9],[10]]
AllTrue(l, len, 2)
#False
testing if all elements in the list are of length 2. However, it can't handle more complex conditions like >/< or compound conditions like len > 2 and element[0] == 15
How can one do this?
Cleaned up version
def TrueQ(item):
    return item == True

def AllTrue(List, Function = TrueQ, Condition = True):
    flag = True
    condition = Condition
    Map = map(Function, List)
    for m in Map:
        if m != condition:
            flag = False
            break
    return flag
and then just call AllTrue(List,TrueQ)
Python already has the machinery you are trying to build built in. For example, to check whether all numbers in a list are even, the code could be:
if all(x%2==0 for x in L):
    ...
If you want to check that all values are "truthy", the code is even simpler:
if all(L):
    ...
Note that in the first version the code is also "short-circuited"; in other words, the evaluation stops as soon as the result is known. In:
if all(price(x) > 100 for x in stocks):
    ...
the function price will be called until the first stock is found with a price lower than or equal to 100. At that point the search will stop because the result is known to be False.
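The short-circuit behaviour is easy to observe by recording which stocks get priced. In this sketch the prices table, the stock names, and the calls list are all made up for illustration.

```python
# Hypothetical price table and stock list.
prices = {'A': 150, 'B': 90, 'C': 200}
stocks = ['A', 'B', 'C']

calls = []  # records every stock that price() was asked about

def price(stock):
    calls.append(stock)
    return prices[stock]

result = all(price(s) > 100 for s in stocks)

print(result)  # False
print(calls)   # ['A', 'B']
```

The third stock is never priced, because the second one already makes the result False.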
To check that all lengths are 2 in the list L the code is simply:
if all(len(x) == 2 for x in L):
    ...
i.e. more or less a literal translation of the request. No need to write a function for that.
If this kind of test is a "filter" that you want to pass as a parameter to another function then a lambda may turn out useful:
def search_DB(test):
    for record in database:
        if test(record):
            result.append(record)
    ...
search_DB(lambda rec: all(len(x) == 2 for x in rec.strings))
I want a function that takes a list, a function, and a condition, and tells me if every element in the list matches the condition. i.e. foo(List, Len, >2)
In Python >2 is written lambda x : x>2.
There is (unfortunately) no metaprogramming facility in Python that would allow you to write just >2 or things like ·>2, except evaluating a string literal with eval, and you don't want to do that. Even the standard Python library tried going down that path (see the namedtuple implementation in collections), but it's really ugly.
I'm not saying that writing >2 would be a good idea, but that it would be nice to have a way to do that in case it was a good idea. Unfortunately to have decent metaprogramming abilities you need a homoiconic language representing code as data and therefore you would be programming in Lisp or another meta-language, not Python (programming in Lisp would indeed be a good idea, but for reasons unknown to me that approach is still unpopular).
Given that, the function foo to be called like
foo(L, len, lambda x : x > 2)
is just
def foo(L, f=lambda x : x, condition=lambda x: x):
    return all(condition(f(x)) for x in L)
but no Python programmer would write such a function, because the original call to foo is actually more code and less clear than inlining it with:
all(len(x) > 2 for x in L)
and requires you to also learn about this thing foo (that does what all and a generator expression would do, just slower, with more code and more obfuscated).
You are reinventing the wheel. Just use something like this:
>>> l = [[0,1], [2,3,4,5], [6,7], [8,9],[10]]
>>> def all_true(iterable, f, condition):
...     return all(condition(f(e)) for e in iterable)
...
>>> def cond(x): return x == 2
...
>>> all_true(l, len, cond)
False
You can define a different function to check a different condition:
>>> def cond(x): return x >= 1
...
>>> all_true(l, len, cond)
True
>>>
And really, having your own function that does this seems like overkill. For example, to deal with your "complex condition" you could simply do something like:
>>> l = [[0,2],[0,1,2],[0,1,3,4]]
>>> all(len(sub) > 2 and sub[0] == 5 for sub in l)
False
>>> all(len(sub) > 1 and sub[0] == 0 for sub in l)
True
>>>
I think the ideal solution in this case may be:
def AllTrue(List, Test = lambda x:x):
    return all(Test(x) for x in List)
This thereby allows complex queries like:
l = [[0, 1], [1, 2, 3], [2, 5]]
AllTrue(l, lambda x: len(x) > 2 and x[0] == 1)
To adhere to Juanpa's suggestion, here it is with Python naming conventions, as an extension of what I posted in the question, now with the ability to handle simple conditions like x > value.
from operator import gt
def all_true(a_list, a_function, an_operator, a_value):
    a_map = map(a_function, a_list)
    return all(an_operator(m, a_value) for m in a_map)
l = [[0,2],[0,1,2],[0,1,3,4]]
all_true(l, len, gt, 2)
#False, because the first sublist has length 2, which is not greater than 2
Note: this works for single conditions, but not for complex conditions like
len > 2 and element[0] == 5

Python newbie clarification about tuples and strings

I just learned that I can check if a substring is inside a string using:
substring in string
It looks to me that a string is just a special kind of tuple where its elements are chars. So I wonder if there's a straightforward way to search a slice of a tuple inside a tuple. The elements in the tuple can be of any type.
tupleslice in tuple
Now my related second question:
>>> tu = 12 ,23, 34,56
>>> tu[:2] in tu
False
I gather that I get False because (12, 23) is not an element of tu. But then, why substring in string works?. Is there syntactic sugar hidden behind scenes?.
A string is not a type of tuple. In fact, the two belong to different classes. How the in statement is evaluated depends on the __contains__() magic method defined in each class.
Read "How do you set up the contains method in python"; you may find it useful. To learn about magic methods in Python, read "A Guide to Python's Magic Methods".
A string is not just a special kind of tuple. They have many similar properties (in particular, both are iterable sequences), but they are distinct types, and each defines the behavior of the in operator differently. See the docs on this here: https://docs.python.org/3/reference/expressions.html#in
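The difference between the two types shows up in a few lines. This is only an illustration; the tuple nested is an extra example that does not appear in the question.

```python
tu = (12, 23, 34, 56)
nested = ((12, 23), 34, 56)  # here (12, 23) really is one element

substring_found = "bc" in "abcd"        # str: substring test
slice_found = tu[:2] in tu              # tuple: (12, 23) is compared to each element
element_found = 23 in tu                # tuple: element test
slice_as_element = (12, 23) in nested   # matches the first element

# The in operator is dispatched to each type's __contains__ method.
same_as_in = str.__contains__("abcd", "bc") and tuple.__contains__(tu, 23)

print(substring_found, slice_found, element_found, slice_as_element, same_as_in)
# True False True True True
```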
To solve your problem of finding whether one tuple is a sub-sequence of another tuple, writing an algorithm like in your answer would be the way to go. Try something like this:
def contains(inner, outer):
    inner_len = len(inner)
    for i, _ in enumerate(outer):
        outer_substring = outer[i:i+inner_len]
        if outer_substring == inner:
            return True
    return False
This is how I accomplished to do my first request, however, it's not straightforward nor pythonic. I had to iterate the Java way. I wasn't able to make it using "for" loops.
def tupleInside(tupleSlice):
    i, j = 0, 0
    while j < len(tu):
        t = tu[j]
        ts = tupleSlice[i]
        print(t, ts, i, j)
        if ts == t:
            i += 1
            if i == len(tupleSlice):
                return True
        else:
            j -= i
            i = 0
        j += 1
    return False
tu = tuple('abcdefghaabedc')
print(tupleInside(tuple(input('Tuple slice: '))))
Try just playing around with tuples and slices. In this case it's pretty easy to see what happens, because your slice is being compared against individual elements.
>>> tu = 12 ,23, 34,56
>>> tu # a tuple of ints
(12, 23, 34, 56)
>>> tu[:1] # a tuple with an int in it
(12,)
>>> tu[:1] in tu # checks a tuple against ints: no match
False
>>> tu[0] in tu # checks an int against ints: match
True
>>> # you can see this as we iterate through the values...
>>> for i in tu:
...     print(str(tu[:1]) + " == " + str(i))
...
(12,) == 12
(12,) == 23
(12,) == 34
(12,) == 56
Slicing a tuple returns another tuple, not an element, so you need to index into it to compare by value rather than by container. A slice of a string is itself a string, and the in operator on strings tests for substrings; a slice of a tuple is a tuple, which in compares against each element of the outer tuple.
Just adding to Cameron Lee's answer so that it accepts inner containing a single integer.
def contains(inner, outer):
    try:
        inner_len = len(inner)
        for i, _ in enumerate(outer):
            outer_substring = outer[i:i+inner_len]
            if outer_substring == inner:
                return True
        return False
    except TypeError:
        return inner in outer
contains(4, (3,1,2,4,5)) # returns True
contains((4), (3,1,2,4,5)) # returns True; note that (4) is just the int 4, not a tuple

in Python how to convert number to float in a mixed list [duplicate]

This question already has answers here:
Python - How to convert only numbers in a mixed list into float?
I have a list of strings in the form like
a = ['str','5','','4.1']
I want to convert all numbers in the list to float, but leave the rest unchanged, like this
a = ['str',5,'',4.1]
I tried
map(float,a)
but apparently it gave me an error because some string cannot be converted to float. I also tried
a[:] = [float(x) for x in a if x.isdigit()]
but it only gives me
[5]
so the float number and all other strings are lost. What should I do to keep the string and number at the same time?
>>> a = ['str','5','','4.1']
>>> a2 = []
>>> for s in a:
...     try:
...         a2.append(float(s))
...     except ValueError:
...         a2.append(s)
...
>>> a2
['str', 5.0, '', 4.0999999999999996]
If you're doing decimal math, you may want to look at the decimal module:
>>> import decimal
>>> a2 = []
>>> for s in a:
...     try:
...         a2.append(decimal.Decimal(s))
...     except decimal.InvalidOperation:
...         a2.append(s)
...
>>> a2
['str', Decimal('5'), '', Decimal('4.1')]
for i, x in enumerate(a):
    try:
        a[i] = float(x)
    except ValueError:
        pass
This assumes you want to change a in place. To create a new list, you can use the following:
new_a = []
for x in a:
    try:
        new_a.append(float(x))
    except ValueError:
        new_a.append(x)
This try/except approach is standard EAFP and will be more efficient and less error prone than checking to see if each string is a valid float.
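As a sketch of why the try/except route is less error prone: float() accepts several forms that a simple digit check would reject. The helper name to_float_if_possible is made up for illustration.

```python
def to_float_if_possible(value):
    """Return float(value) when the string parses, else the string itself."""
    try:
        return float(value)
    except ValueError:
        return value

a = ['str', '5', '', '4.1']
converted = [to_float_if_possible(x) for x in a]
print(converted)  # ['str', 5.0, '', 4.1]

# Forms that str.isdigit() would miss but float() parses:
extras = [to_float_if_possible(s) for s in ['1e3', ' -2.5 ', '.5']]
print(extras)  # [1000.0, -2.5, 0.5]
```

One thing to keep in mind is that float() also parses strings such as 'inf' and 'nan', so a column that may contain those words as text needs an extra check.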
Here's a way to do it without exception handling, using a bit of regex:
>>> a = ['str','5','','4.1']
>>> import re
>>> [float(x) if re.match(r"[+-]?(?:\d+(?:\.\d+)?|\.\d+)$", x) else x for x in a]
['str', 5.0, '', 4.1]
Note that this regex covers only a basic range of numbers, which is enough in your case. For a more elaborate regex that matches a wider range of floating-point numbers, including exponents, take a look at this question:
Extract float/double value
My version:
def convert(value):
    try:
        return float(value)
    except ValueError:
        return value
map(convert, a)
