Find "n" in a list full of strings - python

I'm looking to go through a list and find any element with a number.
This is what I've got so far:
home_pitchers = ['Alvarez, S', 'Crawford, B', 'Fury, 8', 'Mayweather, F', 'Lopez, 44']
num = '8'
for s in home_pitchers:
    if num in s:
        print(s)
>>> Fury, 8
What I'm looking to do is to have num be 0 - 9. I thought about using '[^0-9]' but that didn't work.
Ultimately, I'm looking to print out this:
>>> Fury, 8
>>> Lopez, 44
Just a heads up, I'm pretty new to coding, so some concepts might go over my head.

You can use the str.isdigit method together with the built-in any function. isdigit returns True if every character in the string is a digit (and the string is not empty), and False otherwise.
>>> lst = ['Alvarez, S', 'Crawford, B', 'Fury, 8', 'Mayweather, F', 'Lopez, 44']
>>>
>>> for s in lst:
...     if any(char.isdigit() for char in s):
...         print(s)
...
Fury, 8
Lopez, 44

Using the re library:
import re
lst = ['Alvarez, S', 'Crawford, B', 'Fury, 8', 'Mayweather, F', 'Lopez, 44']
list(filter(lambda x:re.match(".*[0-9]+$",x), lst))
OUTPUT:
['Fury, 8', 'Lopez, 44']
The pattern matches any string that ends with one or more digits, so a string with a digit only in the middle (for example 'Fury 8, T') would not be matched.
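If the digit may appear anywhere in the string rather than only at the end, re.search can be used instead of re.match. The following is a minimal sketch; the extra entry 'Unit 7, Q' and the names has_digit and with_numbers are made up for illustration.

```python
import re

# Hypothetical sample data, extended with one entry whose digit is not at the end.
names = ['Alvarez, S', 'Crawford, B', 'Fury, 8', 'Mayweather, F', 'Lopez, 44', 'Unit 7, Q']

# re.search scans the whole string, so the digit may appear at any position.
has_digit = re.compile(r'[0-9]')
with_numbers = [s for s in names if has_digit.search(s)]

print(with_numbers)
```

Compiling the pattern once is optional here; it mainly keeps the list comprehension short.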

Related

How to find a substring in a string list?

I try to find some string in a list, but have problems because of word order.
list = ['a b c d', 'e f g', 'h i j k']
str = 'e g'
I need to find the 2nd item in a list and output it.
You can use a combination of any() and all() to check for the presence in one line:
>>> my_list = ['a b c d', 'e f g', 'h i j k']
>>> my_str = 'e g'
>>> any(all(s in sub_list for s in my_str.split()) for sub_list in my_list)
True
The expression above returns True or False depending on whether every space-separated piece of my_str occurs (as a substring) in at least one of the strings in the list.
To also get the matching strings as a return value, replace any() with a list comprehension:
>>> [sub_list for sub_list in my_list if all(s in sub_list for s in my_str.split())]
['e f g']
It'll return the list of strings containing your chars.
You can try:
for l in list:
    l_words = l.split(" ")
    if all([x in l_words for x in str.split(" ")]):
        print(l_words)
You can try this
list = ['a b c d', 'e f g', 'h i j k']
str = list[2].split()
for letter in str:
    print(letter)
This can be achieved by using sets and list comprehension
ls = ['a b c d', 'e f g', 'h i j k']
s = 'e g'
print([i for i in ls if len(set(s.replace(" ", "")).intersection(set(i.replace(" ", "")))) == len(s.replace(" ", ""))])
OR
ls = ['a b c d', 'e f g', 'h i j k']
s = 'e g'
s_set = set(s.replace(" ", ""))
print([i for i in ls if len(s_set.intersection(set(i.replace(" ", "")))) == len(s_set)])
Output
['e f g']
The list comprehension keeps only the items of ls that contain every character of s (spaces are ignored), so you get all the items of ls in which all of the characters of s appear.
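The set-based answers above compare individual characters. If whole words should be compared instead, a similar idea works with sets of words. This is only a sketch under that assumption; the names wanted and matches are illustrative.

```python
ls = ['a b c d', 'e f g', 'h i j k']
s = 'e g'

# Treat the search string as a set of whitespace-separated words.
wanted = set(s.split())

# Keep the items whose words include every wanted word.
matches = [item for item in ls if wanted.issubset(item.split())]

print(matches)
```

Because the comparison is on words, 'e' would not match an item that only contains 'eel', which the substring-based checks would accept.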

Sort list of strings by very special key [closed]

I need to sort a list of strings in a way that is very similar to the built-in sorted function, but with one important distinction. As you know, sorted orders the space character before digit characters, so sorted(['1 ', ' 9']) gives [' 9', '1 ']. I need a sort that orders digit characters before spaces, so in this example the result would be ['1 ', ' 9'].
Update
As I understand it, sorted by default relies on the order of characters in the ASCII 'alphabet' (i.e. ''.join([chr(i) for i in range(59, 127)])), so I decided to implement my own ASCII 'alphabet' in the my_ord function.
I planned to use this function in conjunction with a simple my_sort function as a key for sorted,
import string
def my_ord(c):
    punctuation1 = ''.join([chr(i) for i in range(32, 48)])
    other_stuff = ''.join([chr(i) for i in range(59, 127)])
    my_alphabet = string.digits + punctuation1 + other_stuff
    return my_alphabet.find(c)
def my_sort(w):
    return sorted(w, key=my_ord)
like this: sorted([' 1 ', 'abc', ' zz zz', '9 '], key=my_sort).
What I'm expecting in this case is ['9 ', ' 1 ', ' zz zz', 'abc']. Unfortunately, the result not only fails to match the expected output, it also differs from time to time.
You can use str.lstrip as the key function to ignore the whitespace at the front (left side) of each string.
r = sorted(['1 ', ' 9' , ' 4', '2 '], key=str.lstrip)
# r == ['1 ', '2 ', ' 4', ' 9']
key specifies a function of one argument that is used to extract a comparison key from each list element (from the Python documentation).
Try this
import string
MY_ALPHABET = (
    string.digits
    + ''.join([chr(i) for i in range(32, 127) if chr(i) not in string.digits])
)
inp = [' 1 ', 'abc', ' zz zz', '9 ', 'a 1', 'a ']
print(inp, '-->', sorted(inp, key=lambda w: [MY_ALPHABET.index(c) for c in w]))
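The custom-alphabet idea above can also be written with a precomputed lookup table, which avoids a linear index() search for every character. This is a sketch only; RANK and custom_key are illustrative names, and sending unknown characters to the end is an assumption of the sketch, not something the question specifies.

```python
import string

# Custom alphabet: digits first, then the remaining printable ASCII characters.
alphabet = string.digits + ''.join(
    chr(i) for i in range(32, 127) if chr(i) not in string.digits
)

# Precompute each character's rank once so every lookup is a dict access.
RANK = {c: i for i, c in enumerate(alphabet)}

def custom_key(word):
    # Characters outside the alphabet sort last (an assumption of this sketch).
    return [RANK.get(c, len(alphabet)) for c in word]

result = sorted(['1 ', ' 9'], key=custom_key)
print(result)
```

The key is a list of ranks, so Python compares the strings position by position using the custom order instead of the ASCII order.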
You want a combination of lexical and numerical sorting. You can do that by chopping the string up into a tuple and converting the digits to an int. The tuple comparison will then consider each element by its own comparison rules.
I've used a regex to split the string into (beginning text, whitespace, the digits, everything else), created an int from the digits and used the resulting tuple as the key. If the string doesn't match the pattern, it just returns the original string in a tuple so that it can also be used for comparison.
In the key, I moved the whitespace that precedes the digits (group(2)) to after the number, but it may make more sense to leave it out of the comparison completely.
import re
test = ['1 ', ' 9']
wanted = ['1 ', ' 9']
def sort_key(val):
    """Return tuple of (text, int, spaces, remainder) or just
    (text) suitable for sorting text lexicographically but embedded
    number numerically"""
    m = re.match(r"(.*?)(\s*)(\d+)(.*)", val)
    if m:
        return (m.group(1), int(m.group(3)), m.group(2), m.group(4))
    else:
        return (val,)
result = sorted(test, key=sort_key)
print(test, '-->', result)
assert result == wanted, "results compare"
For completeness and maybe efficiency in extreme cases, here is a solution using numpy argsort:
import numpy as np
lst = ['1 ', ' 9' , ' 4', '2 ']
order = np.argsort(np.array([s.lstrip() for s in lst]))
result = list(np.array(lst)[order])
Overall, I think that using sorted(..., key=...) is generally superior, and this solution makes more sense if the input is already a numpy array. On the other hand, it calls lstrip() only once per item and makes use of numpy, so it is possible that it could be faster for large enough lists. Additionally, it produces order, which shows where each sorted element was in the original list.
As a last comment, from the code you provide (but not the example you give), I am not sure whether you just want to strip the leading whitespace or do more, e.g. strip punctuation from the strings (see "Best way to strip punctuation from a string in Python"), or first order on the string without punctuation and then, if they are equal, order on the rest (the solution by tdelaney). In any case, it might not be a bad idea to compile a pattern, e.g.
import numpy as np
import re
pattern = re.compile(r'[^\w]')
lst = ['1 ', ' 9' , ' 4', '2 ']
order = np.argsort(np.array([pattern.sub('',s) for s in lst]))
result = list(np.array(lst)[order])
or:
import re
pattern = re.compile(r'[^\w]')
r = sorted(['1 ', ' 9' , ' 4', '2 '], key= lambda s: pattern.sub('',s))

comparing values in two dictionaries in python

I am stuck in the middle of my coding because of this:
I have two dictionaries as follows:
a = {0:['1'],1:['0','-3']}
b = {'box 4': ['0 and 2', '0 and -3', ' 0 and -1', ' 2 and 3'], 'box 0': [' 1 ', ' 1 and 4 ', ' 3 and 4']}
I want to find whether the values in the first dictionary match the values in the second and, if they do, return the matching keys and values from dictionary b.
For example, the comparison will return box 4, ['0','-3'], because ['0','-3'] is an item in a and it is also found in b as '0 and -3'. However, if only '3' is found, I don't want it to return anything, as no value matches it. The result will also include box 0, ['1'], as it is an item in a and it is also found in b.
Any ideas? I appreciate any help.
You say, "the result of the comparison will return box4 here as ['0','-3'] is an item in a and it has been found also in b ['0 and -3'],". I do not see '0 and -3' in b.
Also, your question is not clear enough. Your code snippets are not complete and you have presented just one case here.
Nevertheless, I will make the mistake of assuming that you want something like this
normalized_values = set([" and ".join(tokens) for tokens in a.values()])
for k in b:
    if normalized_values.intersection(set(b[k])):
        print(k)
Here you go, simply coded:
>>> a_values = a.values()
>>> for x,y in b.items():
...     for i in y:
...         i = i.strip()
...         if len(i)>1:
...             i = i.split()[::2]
...             if i in a_values:
...                 print(x, i)
...         else:
...             if list(i) in a_values:
...                 print(x, list(i))
box 4 ['0', '-3']
box 0 ['1']
Pythonic way (requires import re):
>>> [ [x,i] for x,y in b.items() for i in y if re.findall(r'-?\d',i) in a_values ]
[['box 4', ' 0 and -3'], ['box 0', ' 1 ']]
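The same comparison can be sketched for Python 3 by extracting the signed integers from each string in b and comparing them with the value lists of a. The helper name find_matches and the choice to compare tuples are assumptions made for this illustration, not part of the answers above.

```python
import re

a = {0: ['1'], 1: ['0', '-3']}
b = {
    'box 4': ['0 and 2', '0 and -3', ' 0 and -1', ' 2 and 3'],
    'box 0': [' 1 ', ' 1 and 4 ', ' 3 and 4'],
}

# Compare as tuples so that the value lists of a can be stored in a set.
wanted = {tuple(values) for values in a.values()}

def find_matches(wanted, boxes):
    """Return (key, numbers) pairs whose signed integers equal a wanted tuple."""
    matches = []
    for key, entries in boxes.items():
        for entry in entries:
            numbers = tuple(re.findall(r'-?\d+', entry))
            if numbers in wanted:
                matches.append((key, list(numbers)))
    return matches

matches = find_matches(wanted, b)
print(matches)
```

Because the whole tuple has to match, an entry such as ' 2 and 3' is not reported just because it contains a '3'.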

Concatenating Arbitrary number of items of a string in Python

Given a list ['a','b','c','d','e','f'] and a number of divisions to make (here 2), I want the first string to take elements 0, 2, 4 of the list, joined with a space delimiter, and the second string to take elements 1, 3, 5 in the same way.
The output needs to be in the form of k = ["a c e", "b d f"]
The actual program takes in a string (e.g. {ball,bat,doll,choclate,bat,kite}) and the number of kids who take those gifts (e.g. 2), and then divides the gifts so that the first kid gets a gift and goes to the back of the line, the second kid takes a gift and stands at the back, and so on until all kids have taken a gift. If gifts remain, the first kid takes another gift and the cycle continues.
Desired output for the above example: {"ball doll bat" , "bat choclate kite"}
Here is a general way to do this for any number of groups:
def merge(lst, ngroups):
    return [' '.join(lst[start::ngroups]) for start in range(ngroups)]
Here is how it's used:
>>> lst = ['a','b','c','d','e','f']
>>> merge(lst, 2)
['a c e', 'b d f']
>>> merge(lst, 3)
['a d', 'b e', 'c f']
lst = ['a','b','c','d','e','f']
k = [" ".join(lst[::2]), " ".join(lst[1::2])]
output:
['a c e', 'b d f']
more generic solution:
def group(lst, n):
    return [" ".join(lst[i::n]) for i in range(n)]
lst = ['a','b','c','d','e','f']
print(group(lst, 3))
output:
['a d', 'b e', 'c f']
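Applied to the gift example from the question, the same slicing idea might look like the sketch below. The function name distribute and the way the braces-and-commas input is parsed are assumptions for illustration; the question does not say exactly how the input arrives.

```python
def distribute(gifts, kids):
    # Kid number k takes gifts k, k + kids, k + 2 * kids, ...
    return [' '.join(gifts[k::kids]) for k in range(kids)]

# Parse the braces-and-commas input format shown in the question.
raw = '{ball,bat,doll,choclate,bat,kite}'
gifts = [g.strip() for g in raw.strip('{}').split(',')]

shares = distribute(gifts, 2)
print(shares)
```

The step of the slice equals the number of kids, which is what models each kid going to the back of the line after taking a gift.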

How to sort alpha numeric set in python

I have a set
set(['booklet', '4 sheets', '48 sheets', '12 sheets'])
After sorting I want it to look like
4 sheets,
12 sheets,
48 sheets,
booklet
Any idea please
Jeff Atwood talks about natural sort and gives an example of one way to do it in Python. Here is my variation on it:
import re
def sorted_nicely( l ):
    """ Sort the given iterable in the way that humans expect."""
    convert = lambda text: int(text) if text.isdigit() else text
    alphanum_key = lambda key: [ convert(c) for c in re.split('([0-9]+)', key) ]
    return sorted(l, key = alphanum_key)
Use like this:
s = set(['booklet', '4 sheets', '48 sheets', '12 sheets'])
for x in sorted_nicely(s):
    print(x)
Output:
4 sheets
12 sheets
48 sheets
booklet
One advantage of this method is that it doesn't just work when the strings are separated by spaces. It will also work for other separators such as the period in version numbers (for example 1.9.1 comes before 1.10.0).
Short and sweet:
sorted(data, key=lambda item: (int(item.partition(' ')[0])
                               if item[0].isdigit() else float('inf'), item))
This version:
Works in Python 2 and Python 3, because:
It does not assume you compare strings and integers (which won't work in Python 3)
It doesn't use the cmp parameter to sorted (which doesn't exist in Python 3)
Will sort on the string part if the quantities are equal
If you want printed output exactly as described in your example, then:
data = set(['booklet', '4 sheets', '48 sheets', '12 sheets'])
r = sorted(data, key=lambda item: (int(item.partition(' ')[0])
                                   if item[0].isdigit() else float('inf'), item))
print(',\n'.join(r))
You should check out the third-party library natsort. Its algorithm is general, so it will work for most input.
>>> import natsort
>>> your_list = set(['booklet', '4 sheets', '48 sheets', '12 sheets'])
>>> print(',\n'.join(natsort.natsorted(your_list)))
4 sheets,
12 sheets,
48 sheets,
booklet
A simple way is to split up the strings to numeric parts and non-numeric parts and use the python tuple sort order to sort the strings.
import re
tokenize = re.compile(r'(\d+)|(\D+)').findall
def natural_sortkey(string):
    return tuple(int(num) if num else alpha for num, alpha in tokenize(string))
sorted(my_set, key=natural_sortkey)
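In Python 3, an int and a str cannot be ordered against each other, so a key that mixes them can raise TypeError when a number chunk lines up with a text chunk (for example '4 sheets' against 'booklet'). One possible workaround, sketched here with the illustrative name natural_key, is to tag every chunk so that only values of the same type are ever compared.

```python
import re

_chunks = re.compile(r'(\d+)|(\D+)').findall

def natural_key(text):
    # Tag each chunk: (0, number) for digit runs, (1, string) for everything else.
    # The tag is compared first, so an int is never compared with a str.
    return tuple((0, int(num)) if num else (1, alpha) for num, alpha in _chunks(text))

my_set = {'booklet', '4 sheets', '48 sheets', '12 sheets'}
ordered = sorted(my_set, key=natural_key)
print(ordered)
```

With this tagging, number chunks sort before text chunks at the same position, which is what puts 'booklet' last in this example.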
It was suggested that I repost this answer over here since it works nicely for this case also
from itertools import groupby
def keyfunc(s):
    return [int(''.join(g)) if k else ''.join(g) for k, g in groupby(s, str.isdigit)]
sorted(my_list, key=keyfunc)
Demo:
>>> my_set = {'booklet', '4 sheets', '48 sheets', '12 sheets'}
>>> sorted(my_set, key=keyfunc)
['4 sheets', '12 sheets', '48 sheets', 'booklet']
For Python3 it's necessary to modify it slightly (this version works ok in Python2 too)
def keyfunc(s):
    return [int(''.join(g)) if k else ''.join(g) for k, g in groupby('\0'+s, str.isdigit)]
Generic answer to sort any numbers in any position in an array of strings. Works with Python 2 & 3.
import re
def alphaNumOrder(string):
    """ Pads every number to 5 digits so that the string sorts in numeric order.
    Ex: alphaNumOrder("a6b12.125") ==> "a00006b00012.00125"
    """
    return ''.join([format(int(x), '05d') if x.isdigit()
                    else x for x in re.split(r'(\d+)', string)])
Sample:
s = ['a10b20','a10b1','a3','b1b1','a06b03','a6b2','a6b2c10','a6b2c5']
s.sort(key=alphaNumOrder)
s ===> ['a3', 'a6b2', 'a6b2c5', 'a6b2c10', 'a06b03', 'a10b1', 'a10b20', 'b1b1']
Part of the answer is from there
>>> a = set(['booklet', '4 sheets', '48 sheets', '12 sheets'])
>>> def ke(s):
        i, sp, _ = s.partition(' ')
        if i.isnumeric():
            return int(i)
        return float('inf')
>>> sorted(a, key=ke)
['4 sheets', '12 sheets', '48 sheets', 'booklet']
Based on SilentGhost's answer:
In [4]: a = set(['booklet', '4 sheets', '48 sheets', '12 sheets'])
In [5]: def f(x):
   ...:     num = x.split(None, 1)[0]
   ...:     if num.isdigit():
   ...:         return int(num)
   ...:     return x
...:
In [6]: sorted(a, key=f)
Out[6]: ['4 sheets', '12 sheets', '48 sheets', 'booklet']
Sets are inherently unordered. You'll need to create a list with the same content and sort that.
For people stuck with a pre-2.4 version of Python, without the wonderful sorted() function, a quick way to sort sets is:
l = list(yourSet)
l.sort()
This does not answer the specific question above (12 sheets will come before 4 sheets), but it might be useful to people coming from Google.
b = set(['booklet', '10-b40', 'z94 boots', '4 sheets', '48 sheets',
         '12 sheets', '1 thing', '4a sheets', '4b sheets', '2temptations'])
numList = sorted([x for x in b if x.split(' ')[0].isdigit()],
                 key=lambda x: int(x.split(' ')[0]))
alphaList = sorted([x for x in b if not x.split(' ')[0].isdigit()])
sortedList = numList + alphaList
print(sortedList)
Out: ['1 thing',
'4 sheets',
'12 sheets',
'48 sheets',
'10-b40',
'2temptations',
'4a sheets',
'4b sheets',
'booklet',
'z94 boots']
