My aim is to sort a list of strings alphabetically, except that words starting with "s" should come first in the list (sorted among themselves), followed by the other words.
The below function does that for me.
def mysort(words):
    mylist1 = sorted([i for i in words if i[:1] == "s"])
    mylist2 = sorted([i for i in words if i[:1] != "s"])
    list = mylist1 + mylist2
    return list
I am looking for alternative approaches to achieve this, and would like to know if anyone can find any issues with the code above.
You could do it in one line, with:
sorted(words, key=lambda x: 'a' + x if x.startswith('s') else 'b' + x)
The sorted() function takes a keyword argument key, which is used to translate the values in the list before comparisons are done.
For example:
sorted(words, key=str.lower)
# Will do a sort that ignores the case, since instead
# of checking 'A' vs. 'b' it will check str.lower('A')
# vs. str.lower('b').
sorted(intlist, key=abs)
# Will sort a list of integers by magnitude, regardless
# of whether they're negative or positive:
# >>> sorted([-5,2,1,-8], key=abs)
# [1, 2, -5, -8]
The trick I used translated strings like this when doing the sorting:
"hello" => "bhello"
"steve" => "asteve"
And so "steve" would come before "hello" in the comparisons, since the comparisons are done with the a/b prefix.
Note that this only affects the keys used for comparisons, not the data items that come out of the sort.
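To make the a/b prefix trick concrete, here is a small sketch. The sample list and the helper name prefix_key are made up for illustration; the key logic is the same as in the one-liner above.

```python
def prefix_key(word):
    # Words starting with "s" get an "a" prefix, everything else a "b" prefix,
    # so the "s" group always compares lower than the rest.
    return 'a' + word if word.startswith('s') else 'b' + word

words = ['pear', 'sun', 'apple', 'sand', 'zebra']

# The proxy values that are actually compared during the sort
keys = [prefix_key(w) for w in words]

# The sort returns the original strings, not the prefixed keys
result = sorted(words, key=prefix_key)

print(keys)    # ['bpear', 'asun', 'bapple', 'asand', 'bzebra']
print(result)  # ['sand', 'sun', 'apple', 'pear', 'zebra']
```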
1. You can use a generator expression inside sorted().
2. You can use str.startswith().
3. Don't use list as a variable name.
4. Use key=str.lower in sorted().
def mysort(words):
    mylist1 = sorted((i for i in words if i.startswith(("s","S"))),key=str.lower)
    mylist2 = sorted((i for i in words if not i.startswith(("s","S"))),key=str.lower)
    return mylist1 + mylist2
Why str.lower?
>>> "abc" > "BCD"
True
>>> "abc" > "BCD".lower() #fair comparison
False
>>> l = ['z', 'a', 'b', 's', 'sa', 'sb', '', 'sz']
>>> sorted(l, key=lambda x:(x[0].replace('s','\x01').replace('S','\x01') if x else '') + x[1:])
['', 's', 'sa', 'sb', 'sz', 'a', 'b', 'z']
This key function replaces, for the purpose of sorting, every value starting with S or s with a \x01 which sorts before everything else.
Along the lines of Integer's answer, I like using a tuple slightly better because it is cleaner and also more general (works for arbitrary elements, not just strings):
sorted(words, key=lambda x: ((1 if x[:1] in ("S", "s") else 2), x))
Explanation:
The key parameter allows sorting a list based on the values of f(item) instead of on the values of item, where f is an arbitrary function.
In this case the function is anonymous (lambda) and returns a tuple where the first element is the "group" you want your element to end up in (e.g. 1 if the string starts with an "s" and 2 otherwise).
Using a tuple works because tuple comparison is lexicographical on the elements, so in the sorting the group code weighs more than the element.
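A minimal sketch of the tuple key follows. The sample list and the name group_key are assumptions chosen for illustration; an empty string is included to show that x[:1] does not raise on it.

```python
def group_key(word):
    # Group 1 for words starting with "s" or "S", group 2 for everything else.
    # word[:1] is used instead of word[0] so the empty string does not raise.
    group = 1 if word[:1] in ('S', 's') else 2
    return (group, word)

words = ['pear', 'sun', '', 'Sand', 'apple']
result = sorted(words, key=group_key)

# Tuples compare element by element, so the group number decides first
# and the word itself is only consulted within a group.
print((1, 'zzz') < (2, 'aaa'))  # True
print(result)                   # ['Sand', 'sun', '', 'apple', 'pear']
```

Within each group the comparison is still case-sensitive, which is why 'Sand' comes before 'sun' here.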
I have the following sorting problem:
Given a list of strings, return the list of strings
sorted, but group all strings starting with 'x' first.
Example: ['mix', 'banana' ,'xyz', 'apple', 'xanadu', 'aardvark']
Will return: ['xanadu', 'xyz', 'aardvark', 'apple', 'banana' ,'mix']
I solved by splitting the list into 2:
def front_x(words):
    return [w for w in words if w[0] == "x"] + [w for w in words if w[0] != "x"]
Another Pythonic solution for this problem would be to use the sorted function, like this:
def front_x(words):
    return sorted(words, key=lambda x: x if x[0] == 'x' else f'y{x}')
I am having a hard time understanding what is going on after else. Could anyone help me out? I'd be grateful.
return sorted(words, key=lambda x: x if x[0] == 'x' else f'y{x}')
Translation:
"Return a sorted version of words. For each item x in words, use x for comparison when sorting, but only if x[0] is equal to the character 'x'. Otherwise, prepend 'y' to x and use that for comparison instead"
The f'y{x}' syntax is the f-string syntax. It's equivalent to:
"y" + str(x)
There are plenty of other equivalent ways to insert a character into a string. That's just how it's being done here.
So, if your word list was:
aardvark
xylophone
xamarin
xenophobia
zipper
apple
bagel
yams
The list would be sorted as if it contained the following:
yaardvark
xylophone
xamarin
xenophobia
yzipper
yapple
ybagel
yyams
And therefore the output would be:
xamarin
xenophobia
xylophone
aardvark
apple
bagel
yams
zipper
So, what is happening is that, when the list is sorted, items that start with 'x' will always appear before any other items.
f'y{x}' is an f-string expression that prepends the character 'y' to the original string (x). That way, all items that don't start with 'x' will sort as if they started with 'y', which puts them after all of the items that do start with 'x'.
For example, 'banana' will be sorted as if it was 'ybanana', which naturally places it after 'xyz'.
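The behaviour described above can be checked with a short sketch. It uses startswith() instead of x[0] (an assumption made here so that an empty string would not raise IndexError), and front_x_tuple is a hypothetical variant shown only for comparison.

```python
def front_x(words):
    # Items starting with 'x' keep their own value as the key; every other
    # item is compared as if it started with 'y', which sorts after 'x'.
    return sorted(words, key=lambda x: x if x.startswith('x') else 'y' + x)

def front_x_tuple(words):
    # Same grouping with a tuple key: False sorts before True, so items
    # starting with 'x' (not startswith -> False) come first.
    return sorted(words, key=lambda x: (not x.startswith('x'), x))

words = ['mix', 'banana', 'xyz', 'apple', 'xanadu', 'aardvark']
result = front_x(words)
print(result)
# ['xanadu', 'xyz', 'aardvark', 'apple', 'banana', 'mix']
```

The tuple variant does not depend on 'y' happening to sort after 'x', so it may be easier to adapt to other leading characters.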
I am trying to get the word "Test" by taking each character out of the list using positions within it.
Here is my code:
test1 = ["T", "E", "S", "T"]
one = test1[0:1]
two = test1[1:2]
three = test1[2:3]
four = test1[3:4]
print(one, two, three, four)
At the moment my output from the program is:
['T'] ['E'] ['S'] ['T']
Although that does read "Test" it has [] around each letter which I don't want.
[a:b] returns a list with every value from index a up to, but not including, index b.
If you just want to access a single value from a list, you just need to point to the index of the value to access. E.g.
s = ['T', 'e', 's', 't']
print(s[0]) # T
print(s[0:1]) # ['T']
The problem is that you are using slices of the list, not elements. The syntax l[i1:i2] returns a list with all elements of l from index i1 up to, but not including, index i2. Slice indices that are out of bounds are clipped rather than raising an error. To do what you intended, you can do:
one = test1[0]
two = test1[1]
...
You have slicing and indexing confused. You are using slicing where you should use indexing.
Slicing always returns a new object of the same type, containing the selected elements. Slicing a list always gives you a list again:
>>> test1 = ["T","E","S","T"]
>>> test1[1:3]
['E', 'S']
>>> test1[:1]
['T']
while indexing uses individual positions only (no : colons to separate start and end positions), and gives you the individual elements from the list:
>>> test1[0]
'T'
>>> test1[1]
'E'
Not that you need to use indexing at all. Use the str.join() method instead; given a separator string, this joins the string elements of a list together with that delimiter in between. Use the empty string:
>>> ''.join(test1)
'TEST'
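The difference between slicing, indexing, and joining can be seen side by side in this short sketch (the variable names other than test1 are made up for illustration):

```python
test1 = ["T", "E", "S", "T"]

first_slice = test1[0:1]   # slicing: a new list holding one element
first_item = test1[0]      # indexing: the element itself

# Slice bounds past the end of the list are clipped instead of raising.
clipped = test1[2:10]

# Indexing past the end does raise IndexError.
try:
    test1[10]
    index_failed = False
except IndexError:
    index_failed = True

# join() glues the string elements together with the given separator.
word = ''.join(test1)

print(first_slice, first_item, clipped, index_failed, word)
# ['T'] T ['S', 'T'] True TEST
```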
Try this:
test1 = ["T","E","S","T"]
final = ""
for i in range(0, len(test1)):
    final = final + str(test1[i])
I have a list of strings like this:
['Aden', 'abel']
I want to sort the items, case-insensitive.
So I want to get:
['abel', 'Aden']
But I get the opposite with sorted() or list.sort(), because uppercase appears before lowercase.
How can I ignore the case? I've seen solutions which involve lowercasing all list items, but I don't want to change the case of the list items.
In Python 3.3+ there is the str.casefold method that's specifically designed for caseless matching:
sorted_list = sorted(unsorted_list, key=str.casefold)
In Python 2 use lower():
sorted_list = sorted(unsorted_list, key=lambda s: s.lower())
It works for both normal and unicode strings, since they both have a lower method.
In Python 2 it works for a mix of normal and unicode strings, since values of the two types can be compared with each other. Python 3 doesn't work like that, though: you can't compare a byte string and a unicode string, so in Python 3 you should do the sane thing and only sort lists of one type of string.
>>> lst = ['Aden', u'abe1']
>>> sorted(lst)
['Aden', u'abe1']
>>> sorted(lst, key=lambda s: s.lower())
[u'abe1', 'Aden']
>>> x = ['Aden', 'abel']
>>> sorted(x, key=str.lower) # Or unicode.lower if all items are unicode
['abel', 'Aden']
In Python 3 str is unicode but in Python 2 you can use this more general approach which works for both str and unicode:
>>> sorted(x, key=lambda s: s.lower())
['abel', 'Aden']
You can also try this to sort the list in-place:
>>> x = ['Aden', 'abel']
>>> x.sort(key=lambda y: y.lower())
>>> x
['abel', 'Aden']
This works in Python 3 and does not involve lowercasing the result (!).
values.sort(key=str.lower)
In Python 3 you can use:
list1.sort(key=lambda x: x.lower())  # case-insensitive
list1.sort()  # case-sensitive
I did it this way for Python 3.3:
def sortCaseIns(lst):
    lst2 = [[x for x in range(0, 2)] for y in range(0, len(lst))]
    for i in range(0, len(lst)):
        lst2[i][0] = lst[i].lower()
        lst2[i][1] = lst[i]
    lst2.sort()
    for i in range(0, len(lst)):
        lst[i] = lst2[i][1]
Then you can just call this function:
sortCaseIns(yourListToSort)
Case-insensitive sort, sorting the list in place, in Python 2 OR 3 (tested in Python 2.7.17 and Python 3.6.9):
>>> x = ["aa", "A", "bb", "B", "cc", "C"]
>>> x.sort()
>>> x
['A', 'B', 'C', 'aa', 'bb', 'cc']
>>> x.sort(key=str.lower) # <===== there it is!
>>> x
['A', 'aa', 'B', 'bb', 'C', 'cc']
The key is key=str.lower. Here are just the commands, without the prompts and output, for easy copy-pasting so you can test them:
x = ["aa", "A", "bb", "B", "cc", "C"]
x.sort()
x
x.sort(key=str.lower)
x
Note that if your strings are unicode strings, however (like u'some string'), then in Python 2 only (NOT in Python 3 in this case) the above x.sort(key=str.lower) command will fail and output the following error:
TypeError: descriptor 'lower' requires a 'str' object but received a 'unicode'
If you get this error, then either upgrade to Python 3 where they handle unicode sorting, or convert your unicode strings to ASCII strings first, using a list comprehension, like this:
# for Python2, ensure all elements are ASCII (NOT unicode) strings first
x = [str(element) for element in x]
# for Python2, this sort will only work on ASCII (NOT unicode) strings
x.sort(key=str.lower)
References:
https://docs.python.org/3/library/stdtypes.html#list.sort
Convert a Unicode string to a string in Python (containing extra symbols)
https://www.programiz.com/python-programming/list-comprehension
Python3:
Sorting is discussed in other answers but here is what is going on behind the scenes with the sort options.
Say we would like to sort the following list case-insensitively. We can use 'key=':
strs = ['aa', 'BB', 'zz', 'CC']
strs_sorted = sorted(strs,key=str.lower)
print(strs_sorted)
['aa', 'BB', 'CC', 'zz']
What is happening here?
The key is telling the sort to use "proxy" values. 'key=' transforms each element before comparison. The key function takes in 1 value and returns 1 value, and the returned "proxy" value is used for the comparisons within the sort.
Hence we are employing 'str.lower' to make all of our proxy values lowercase, which eliminates the case differences. The original strings are returned unchanged, ordered by their lowercase forms.
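As a rough illustration of the "proxy value" idea, the sort can be reproduced by hand: pair each element with its proxy, sort the pairs, then drop the proxies. This is only a conceptual sketch (the names decorated and manual are made up here), not a description of how the built-in sort is implemented internally.

```python
strs = ['aa', 'BB', 'zz', 'CC']

# Pair each element with its proxy value. The index i breaks ties, so
# elements with equal proxies keep their original order and the original
# strings themselves are never compared.
decorated = [(s.lower(), i, s) for i, s in enumerate(strs)]
decorated.sort()

# Drop the proxies and keep only the original strings.
manual = [s for _, _, s in decorated]

print(manual)                                  # ['aa', 'BB', 'CC', 'zz']
print(manual == sorted(strs, key=str.lower))   # True
```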
str.lower vs str.casefold
As mentioned in other posts, you can also use str.casefold as the key, or any other one-argument function (for example len to sort by string length). The casefold() method is a more aggressive version of lower() which converts strings to case-folded strings for caseless matching.
sorted(strs,key=str.casefold)
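The difference between the two only shows up for certain characters. A small sketch using the German "ß" (the sample words are chosen here purely for illustration; Python 3 only):

```python
words = ['STRASSE', 'straße', 'Strasse']

lowered = [w.lower() for w in words]
folded = [w.casefold() for w in words]

print(lowered)  # ['strasse', 'straße', 'strasse']
print(folded)   # ['strasse', 'strasse', 'strasse']

# With lower(), 'straße' keeps its 'ß' and therefore sorts after 'strasse'.
by_lower = sorted(words, key=str.lower)
# With casefold(), all three keys are equal, so the stable sort
# leaves the items in their original order.
by_casefold = sorted(words, key=str.casefold)

print(by_lower)     # ['STRASSE', 'Strasse', 'straße']
print(by_casefold)  # ['STRASSE', 'straße', 'Strasse']
```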
What about creating my own sort function?
Generally speaking, it is best to use the built-in functions for sorting unless there is an extreme need not to. The built-in functions are well tested and will most likely be the most reliable.
Python2:
Similar principle,
sorted_list = sorted(strs, key=lambda s: s.lower())
Try this
def cSort(inlist, minisort=True):
    sortlist = []
    newlist = []
    sortdict = {}
    for entry in inlist:
        try:
            lentry = entry.lower()
        except AttributeError:
            # entry has no lower() method (not a string): sort it by its own value
            sortlist.append(entry)
        else:
            try:
                sortdict[lentry].append(entry)
            except KeyError:
                sortdict[lentry] = [entry]
                sortlist.append(lentry)
    sortlist.sort()
    for entry in sortlist:
        try:
            thislist = sortdict[entry]
            if minisort: thislist.sort()
            newlist = newlist + thislist
        except KeyError:
            newlist.append(entry)
    return newlist
lst = ['Aden', 'abel']
print(cSort(lst))
Output
['abel', 'Aden']
So I have a list of keys:
keys = ['id','name', 'date', 'size', 'actions']
and I also have a list of lists of values:
values = [
    ['1','John','23-04-2015','0','action1'],
    ['2','Jane','23-04-2015','1','action2']
]
How can I build a dictionary with those keys matched to the values?
The output should be:
{
'id':['1','2'],
'name':['John','Jane'],
'date':['23-04-2015','23-04-2015'],
'size':['0','1'],
'actions':['action1','action2']
}
EDIT:
I tried to use zip() and dict(), but that would only work if the list of values had 1 list, i.e. values = [['1','John','23-04-2015','0','action1']]
for list in values:
    dic = dict(zip(keys,list))
I also thought about initialising a dic with the keys, then building the list of values on my own, but I felt that there had to be an easier way to do it.
dic = dict.fromkeys(keys)
for list in values:
    ids = list[0]
    names = list[1]
    dates = list[2]
    sizes = list[3]
    actions = list[4]
and then finally
dic['id'] = ids
dic['name'] = names
dic['date'] = dates
dic['size'] = sizes
dic['actions'] = actions
This seemed really silly and I was wondering what a better way of doing it would be.
>>> keys = ['id','name', 'date', 'size', 'actions']
>>> values = [['1','John','23-04-2015','0','action1'], ['2','Jane','23-04-2015','1','action2']]
>>> c = {x:list(y) for x,y in zip(keys, zip(*values))}
>>> c
{'id': ['1', '2'], 'size': ['0', '1'], 'actions': ['action1', 'action2'], 'date': ['23-04-2015', '23-04-2015'], 'name': ['John', 'Jane']}
>>> print(*(': '.join([item, ', '.join(c.get(item))]) for item in sorted(c, key=lambda x: keys.index(x))), sep='\n')
id: 1, 2
name: John, Jane
date: 23-04-2015, 23-04-2015
size: 0, 1
actions: action1, action2
This uses several tools:
c is created with a dictionary comprehension. Comprehensions are a different way of expressing an iterable like a dictionary or a list. Instead of initializing an empty iterable and then using a loop to add elements to it, a comprehension moves these syntactical structures around.
result = [2*num for num in range(10) if num%2]
is equivalent to
result = []
for num in range(10):
    if num%2: # shorthand for "if num%2 results in non-zero", or "if num is not divisible by 2"
        result.append(2*num)
and we get [2, 6, 10, 14, 18].
zip() creates an iterator of tuples, where each element of each tuple is the corresponding element of one of the arguments you passed to zip().
>>> list(zip(['a','b'], ['c','d']))
[('a', 'c'), ('b', 'd')]
zip() takes multiple arguments - if you pass it one large list containing smaller sublists, the result is different:
>>> list(zip([['a','b'], ['c','d']]))
[(['a', 'b'],), (['c', 'd'],)]
and generally not what we want. However, our values list is just such a list: a large list containing sublists. We want to zip() those sublists. This is a great time to use the * operator.
The * operator represents an "unpacked" iterable.
>>> print(*[1,2,3])
1 2 3
>>> print(1, 2, 3)
1 2 3
It is also used in function definitions:
>>> def func(*args):
...     return args
...
>>> func('a', 'b', [])
('a', 'b', [])
So, to create the dictionary, we zip() the lists of values together, then zip() that with the keys. Then we iterate through each of those tuples and create a dictionary out of them, with each tuple's first item being the key and the second item being the value (cast as a list instead of a tuple).
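Broken into separate steps, the construction looks roughly like this. The intermediate names columns, pairs and result are introduced here only for illustration.

```python
keys = ['id', 'name', 'date', 'size', 'actions']
values = [
    ['1', 'John', '23-04-2015', '0', 'action1'],
    ['2', 'Jane', '23-04-2015', '1', 'action2'],
]

# Step 1: unpack the rows into zip() so that it pairs up the columns.
columns = list(zip(*values))
print(columns[0])  # ('1', '2')
print(columns[1])  # ('John', 'Jane')

# Step 2: pair each key with its column.
pairs = list(zip(keys, columns))
print(pairs[0])    # ('id', ('1', '2'))

# Step 3: build the dictionary, converting each column tuple to a list.
result = {key: list(column) for key, column in pairs}
print(result['name'])  # ['John', 'Jane']
```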
To print this, we could make a large looping structure, or we can make generators (quicker to assemble and process than full data structures like a list) and iterate through them, making heavy use of * to unpack things. Remember, in Python 3, print can accept multiple arguments, as seen above.
We will first sort the dictionary, using each element's position in keys as the key. If we use something like key=len, that sends each element to the len() function and uses the returned length as the key. We use lambda to define an inline, unnamed function, that takes an argument x and returns x's index in the list of keys. Note that the dictionary isn't actually sorted; we're just setting it up so we can iterate through it according to a sort order.
Then we can go through this sorted dictionary and assemble its elements into printable strings. At the top level, we join() a key with its value separated by ': '. Each value has its elements join()ed with ', '. Note that if the elements weren't strings, we would have to turn them into strings for join() to work.
>>> list(map(str, [1,2,3]))
['1', '2', '3']
>>> print(*map(str, [1,2,3]))
1 2 3
The generator that yields each of these join()ed lines is then unpacked with the * operator, and each element is sent as an argument to print(), specifying a separator of '\n' (new line) instead of the default ' ' (space).
It's perfectly fine to use loops instead of comprehensions and *, and then rearrange them into such structures after your logic is functional, if you want. It's not particularly necessary most of the time. Comprehensions sometimes execute slightly faster than equivalent loops, and with practice you may come to prefer the syntax of comprehensions. Do learn the * operator, though - it's an enormously versatile tool for defining functions. Also look into ** (often referred to with "double star" or "kwargs"), which unpacks dictionaries into keyword arguments and can also be used to define functions.
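For comparison, here is a sketch of the same dictionary built with plain loops, followed by a small example of ** in a call and in a function definition. The function describe and the record dictionary are hypothetical, made up only to show the syntax.

```python
keys = ['id', 'name', 'date', 'size', 'actions']
values = [
    ['1', 'John', '23-04-2015', '0', 'action1'],
    ['2', 'Jane', '23-04-2015', '1', 'action2'],
]

# Loop version: start with an empty list per key, then append row by row.
result = {key: [] for key in keys}
for row in values:
    for key, item in zip(keys, row):
        result[key].append(item)

print(result['date'])  # ['23-04-2015', '23-04-2015']

def describe(name, size, **others):
    # others collects every keyword argument that is not named explicitly.
    return '{} ({}), other fields: {}'.format(name, size, sorted(others))

record = {'id': '1', 'name': 'John', 'date': '23-04-2015',
          'size': '0', 'actions': 'action1'}

# ** unpacks the dictionary into keyword arguments: name='John', size='0', ...
summary = describe(**record)
print(summary)  # John (0), other fields: ['actions', 'date', 'id']
```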