I have a loop like this:
ex = [{u'white': []},
      {u'yellow': [u'9241.jpg', []]},
      {u'red': [u'241.jpg', []]},
      {u'blue': [u'59241.jpg', []]}]
for i in ex:
    while not len(i.values()[0]):
        break
    else:
        print i
        break
I always need to return the first dict whose value has a length greater than 0, but I want to do it with a list comprehension.
A list comprehension would produce a whole list, while you want just one item.
Use a generator expression instead, and have the next() function iterate to the first value:
next((i for i in ex if i.values()[0]), None)
I've given next() a default to return as well; if there is no matching dictionary, None is returned instead.
Demo:
>>> ex = [{u'white': []},
... {u'yellow': [u'9241.jpg', []]},
... {u'red': [u'241.jpg', []]},
... {u'blue': [u'59241.jpg', []]}]
>>> next((i for i in ex if i.values()[0]), None)
{u'yellow': [u'9241.jpg', []]}
You should, however, rethink your data structure. Dictionaries with just one key-value pair suggest to me you wanted a different type instead; tuples perhaps:
ex = [
    (u'white', []),
    (u'yellow', [u'9241.jpg', []]),
    (u'red', [u'241.jpg', []]),
    (u'blue', [u'59241.jpg', []]),
]
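With the tuple layout above, the same next() pattern still applies. A minimal sketch, assuming each entry is a (name, files) pair; the name first_non_empty is introduced here for illustration only:

```python
ex = [
    (u'white', []),
    (u'yellow', [u'9241.jpg', []]),
    (u'red', [u'241.jpg', []]),
    (u'blue', [u'59241.jpg', []]),
]

# Pick the first pair whose list is non-empty; fall back to None if none match.
first_non_empty = next(((name, files) for name, files in ex if files), None)
print(first_non_empty)
```

Unpacking the pair in the generator avoids the .values()[0] indexing that the single-key dictionaries required.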
I want to check whether each id equals the previous id + 1 (this should work for any number of similar dicts added to the list).
Code
listing = [
    {
        'id': 1,
        'stuff': "othervalues"
    },
    {
        'id': 2,
        'stuff': "othervalues"
    },
    {
        'id': 3,
        'stuff': "othervalues"
    }
]
for item in listing:
    if item[-1]['id'] == item['id']+1:
        print(True)
Output
Traceback (most recent call last):
File "C:\Users\samuk\Desktop\Master\DV\t2\tester.py", line 10, in <module>
if item[-1]['id'] == item['id']+1:
KeyError: -1
Desired result
True
or, in case of fail,
False
To check if all the ids are in a sequence we can use enumerate here.
def is_sequential(listing):
    start = listing[0]['id']
    for idx, item in enumerate(listing, start):
        if item['id'] != idx:
            return False
    return True
You can save the last id in a variable.
last_id = listing[0]['id']
for item in listing[1:]:
    # Each id must be exactly one more than the previous one
    if last_id + 1 == item['id']:
        print(True)
    else:
        print(False)
    last_id = item['id']
As the saying goes... "There are many ways to skin a cat". So here's another way:
listing = [
    {
        'id': 1,
        'stuff': "othervalues"
    },
    {
        'id': 2,
        'stuff': "othervalues"
    },
    {
        'id': 3,
        'stuff': "othervalues"
    }
]
for i, v in enumerate(listing[1:], 1):
    print(v['id'] - 1 == listing[i-1]['id'])
Of course, rather than just checking like this you could always sort the list!
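One reading of that suggestion is to sort by id first and then run the check, so that the order of the input no longer matters. A rough sketch under that assumption (the shuffled sample data and the name ordered are made up for illustration):

```python
listing = [
    {'id': 3, 'stuff': "othervalues"},
    {'id': 1, 'stuff': "othervalues"},
    {'id': 2, 'stuff': "othervalues"},
]

# Sort a copy by id, then compare each item with its successor.
ordered = sorted(listing, key=lambda item: item['id'])
is_sequential = all(a['id'] + 1 == b['id'] for a, b in zip(ordered, ordered[1:]))
print(is_sequential)
```

Note that this answers a slightly different question: whether the ids form a gapless run, not whether they already appear in order.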
When you do for item in some_list, item is the actual item in that list. You can't get the previous item by item[-1].
So in your case, item is a dictionary that has the keys 'id' and 'stuff'. Obviously, no key called -1, so you get a KeyError.
You can iterate over two slices of the list at once using zip(): one that starts at the zeroth element, and one that starts at the first element. zip() yields corresponding items of these slices together, so you get every item paired with its subsequent item.
Also, you want a result after you've checked all pairs of items, not while you're checking each pair. To do this, create a variable is_sequential with an initial value of True. If you find a single pair that doesn't work, change this to False, and break out of the loop because one pair being non-sequential makes the entire list so.
is_sequential = True
for item1, item2 in zip(listing, listing[1:]):
    if item1['id'] + 1 != item2['id']:
        is_sequential = False
        break
print(is_sequential)
Alternatively, you could use the any() function with the same loop and condition, and then invert the result, or use all() with the opposite condition:
# use any()
is_sequential = not any(
    item1['id'] + 1 != item2['id']
    for item1, item2 in zip(listing, listing[1:])
)
# or, use all()
is_sequential = all(
    item1['id'] + 1 == item2['id']
    for item1, item2 in zip(listing, listing[1:])
)
You could also use sorted and range to evaluate if the identifiers are sequential:
from typing import Dict, List
def is_sequential(records: List[Dict], key: str) -> bool:
    values = [value[key] for value in records]
    return sorted(values) == list(range(min(values), max(values)+1))
print(is_sequential(listing, "id"))
Note: Solution is based on approach #1 in this example and only works with integers.
However, I like Pranav Hosangadi's approach with a generator expression better. But I would replace not any() with all() as suggested by Ch3steR in the comments:
from typing import Dict, List
def is_sequential(records: List[Dict], key: str) -> bool:
    return all(l[key]+1 == r[key] for l, r in zip(records, records[1:]))
print(is_sequential(listing, "id"))
Ch3steR's solution is probably the most readable code.
I have a list of dictionaries and want to sort the items by the values of specific properties.
The list:
[
    {'name': 'alpha', 'status': 'run'},
    {'name': 'alpha', 'status': 'in'},
    {'name': 'alpha-32', 'status': 'in'},
    {'name': 'beta', 'status': 'out'},
    {'name': 'gama', 'status': 'par'},
    {'name': 'gama', 'status': 'in'},
    {'name': 'aeta', 'status': 'run'},
    {'name': 'aeta', 'status': 'unknown'},
    {'pname': 'boc', 'status': 'run'}
]
I know I can do:
newlist = sorted(init_list, key=lambda k: (k['name'], k['status']))
but there are three more conditions:
If the name key is not present in a dict, use the value of the pname key as the name instead.
The status order should be ['out', 'in', 'par', 'run'].
If the status value is not in that list, ignore it (see unknown).
The result should be:
[
    {'name': 'aeta', 'status': 'unknown'},
    {'name': 'aeta', 'status': 'run'},
    {'name': 'alpha', 'status': 'in'},
    {'name': 'alpha', 'status': 'run'},
    {'name': 'alpha-32', 'status': 'in'},
    {'name': 'beta', 'status': 'out'},
    {'pname': 'boc', 'status': 'run'},
    {'name': 'gama', 'status': 'in'},
    {'name': 'gama', 'status': 'par'}
]
Use
from itertools import count
# Use count() instead of range(4) so that we
# don't need to worry about the length of the status list.
newlist = sorted(init_list,
                 key=lambda k: (k.get('name', k.get('pname')),
                                dict(zip(['out', 'in', 'par', 'run'], count())
                                     ).get(k['status'], -1)
                                )
                 )
If k['name'] doesn't exist, fall back to k['pname'] (or None if that doesn't exist either). Likewise, if there is no known integer for the given status, default to -1.
I deliberately put this all in one logical line to demonstrate that at this point, you may want to just define the key function using a def statement.
def list_order(k):
    name_to_use = k.get('name')
    if name_to_use is None:
        name_to_use = k['pname']  # Here, just assume pname is available
    # Explicit definition; you might still write
    # status_orders = dict(zip(['out', ...], count())),
    # or better yet a dict comprehension like
    # { status: rank for rank, status in enumerate(['out', ...]) }
    status_orders = {
        'out': 0,
        'in': 1,
        'par': 2,
        'run': 3
    }
    status_to_use = status_orders.get(k['status'], -1)
    return name_to_use, status_to_use
newlist = sorted(init_list, key=list_order)
The first condition is simple: make the first value of the ordering tuple fall back to pname, i.e.
lambda k: (k.get('name', k.get('pname')), k['status'])
For the second and third rule I would define an order dict for statuses
status_order = {key: i for i, key in enumerate(['out', 'in', 'par', 'run'])}
and then use it in key-function
lambda k: (k.get('name', k.get('pname')), status_order.get(k['status'], -1))
I haven't tested it, so it might need some tweaking
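Putting the pieces together on the question's data gives something like the sketch below. It assumes that unknown statuses should sort before all known ones (rank -1), which is what the expected result implies; sort_key is a name chosen here for illustration.

```python
init_list = [
    {'name': 'alpha', 'status': 'run'},
    {'name': 'alpha', 'status': 'in'},
    {'name': 'alpha-32', 'status': 'in'},
    {'name': 'beta', 'status': 'out'},
    {'name': 'gama', 'status': 'par'},
    {'name': 'gama', 'status': 'in'},
    {'name': 'aeta', 'status': 'run'},
    {'name': 'aeta', 'status': 'unknown'},
    {'pname': 'boc', 'status': 'run'},
]

# Map each known status to its rank in the desired order.
status_order = {status: rank for rank, status in enumerate(['out', 'in', 'par', 'run'])}

def sort_key(record):
    # Use 'name' when present, otherwise fall back to 'pname'.
    name = record.get('name', record.get('pname'))
    # Assumption: statuses outside the known list get rank -1 and sort first.
    return name, status_order.get(record['status'], -1)

newlist = sorted(init_list, key=sort_key)
for record in newlist:
    print(record)
```

The explicit -1 default matters on Python 3, where a None rank cannot be compared with an integer when two records share a name.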
I have a list as below
tlist = [('abc', 'HYD', 'user1'), ('xyz', 'SNG', 'user2'), ('pppp', 'US', 'user3'), ('qq', 'HK', 'user4')]
I want to display the second field of the tuple whose first field matches a given value.
Ex:
tlist('xyz')
SNG
Is there a way to get it?
A list of tuples doesn't have a hash table lookup like a dictionary, so you will need to loop through it in sequence until you find a match:
def find_in_tuple(tlist, search_term):
    for x, y, z in tlist:
        if x == search_term:
            return y
print(find_in_tuple(tlist, 'xyz'))  # prints 'SNG'
If you plan to do this multiple times, you definitely want to convert to a dictionary. I would recommend making the first element of the tuple the key and then the other two the values for that key. You can do this very easily using a dictionary comprehension.
>>> tlist_dict = { k: (x, y) for k, x, y in tlist } # Python 3: { k: v for k, *v in tlist } (values become lists)
>>> tlist_dict
{'qq': ('HK', 'user4'), 'xyz': ('SNG', 'user2'), 'abc': ('HYD', 'user1'), 'pppp': ('US', 'user3')}
You can then select the second element as follows:
>>> tlist_dict['xyz'][0]
'SNG'
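Indexing the dictionary directly raises KeyError when the first field is not present. If that can happen, a small wrapper around dict.get() is one option; this is a sketch, and second_field and its default parameter are hypothetical names, not part of the answer above:

```python
tlist = [('abc', 'HYD', 'user1'), ('xyz', 'SNG', 'user2'),
         ('pppp', 'US', 'user3'), ('qq', 'HK', 'user4')]
tlist_dict = {k: (x, y) for k, x, y in tlist}

def second_field(key, default=None):
    # dict.get() returns None for a missing key instead of raising KeyError.
    entry = tlist_dict.get(key)
    return entry[0] if entry is not None else default

print(second_field('xyz'))
print(second_field('missing', 'n/a'))
```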
If multiple tuples have xyz as their first item, use the following simple approach (with a modified example):
tlist = [('abc','HYD','user1'), ('xyz','SNG','user2'), ('pppp','US','user3'), ('xyz','HK','user4')]
second_fields = [f[1] for f in tlist if f[0] == 'xyz']
print(second_fields) # ['SNG', 'HK']
I'm trying to get the matching IDs and store the data into one list. I have a list of dictionaries:
list = [
    {'id':'123','name':'Jason','location': 'McHale'},
    {'id':'432','name':'Tom','location': 'Sydney'},
    {'id':'123','name':'Jason','location':'Tompson Hall'}
]
Expected output would be something like
# {'id':'123','name':'Jason','location': ['McHale', 'Tompson Hall']},
# {'id':'432','name':'Tom','location': 'Sydney'},
How can I get matching data based on dict ID value? I've tried:
for item in mylist:
    list2 = []
    row = any(list['id'] == list.id for id in list)
    list2.append(row)
This doesn't work (it throws: TypeError: tuple indices must be integers or slices, not str). How can I get all items with the same ID and store into one dict?
First, you're iterating through the list of dictionaries in your for loop, but never referencing the dictionaries, which you're storing in item. I think when you wrote list['id'] you meant item['id'].
Second, any() returns a boolean (true or false), which isn't what you want. Instead, maybe try row = [dic for dic in list if dic['id'] == item['id']]
Third, if you define list2 within your for loop, it will go away every iteration. Move list2 = [] before the for loop.
That should give you a good start. Remember that row is just a list of all dictionaries that have the same id.
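Applied together, the three fixes might look like the sketch below. It assumes the list is named mylist (rather than list) and only collects the matching rows; merging them into one dict is a separate step.

```python
mylist = [
    {'id': '123', 'name': 'Jason', 'location': 'McHale'},
    {'id': '432', 'name': 'Tom', 'location': 'Sydney'},
    {'id': '123', 'name': 'Jason', 'location': 'Tompson Hall'},
]

list2 = []  # created once, before the loop, so it survives every iteration
for item in mylist:
    # All dictionaries that share this item's id (including the item itself)
    row = [dic for dic in mylist if dic['id'] == item['id']]
    list2.append(row)

print([len(row) for row in list2])
```

Each group shows up once per member here (the two '123' rows are collected twice), so some de-duplication is still needed afterwards.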
I would use kdopen's approach along with a merging method after converting the dictionary entries I expect to become lists into lists. Of course if you want to avoid redundancy then make them sets.
mylist = [
    {'id':'123','name':['Jason'],'location': ['McHale']},
    {'id':'432','name':['Tom'],'location': ['Sydney']},
    {'id':'123','name':['Jason'],'location':['Tompson Hall']}
]
def merge(mylist,ID):
    matches = [d for d in mylist if d['id']== ID]
    shell = {'id':ID,'name':[],'location':[]}
    for m in matches:
        shell['name']+=m['name']
        shell['location']+=m['location']
        mylist.remove(m)
    mylist.append(shell)
    return mylist
updated_list = merge(mylist,'123')
Given this input
mylist = [
    {'id':'123','name':'Jason','location': 'McHale'},
    {'id':'432','name':'Tom','location': 'Sydney'},
    {'id':'123','name':'Jason','location':'Tompson Hall'}
]
You can just extract it with a comprehension
matched = [d for d in mylist if d['id'] == '123']
Then you want to merge the locations. Assuming matched is not empty
final = matched[0]
final['location'] = [d['location'] for d in matched]
Here it is in the interpreter
In [1]: mylist = [
...: {'id':'123','name':'Jason','location': 'McHale'},
...: {'id':'432','name':'Tom','location': 'Sydney'},
...: {'id':'123','name':'Jason','location':'Tompson Hall'}
...: ]
In [2]: matched = [d for d in mylist if d['id'] == '123']
In [3]: final=matched[0]
In [4]: final['location'] = [d['location'] for d in matched]
In [5]: final
Out[5]: {'id': '123', 'location': ['McHale', 'Tompson Hall'], 'name': 'Jason'}
Obviously, you'd want to replace '123' with a variable holding the desired id value.
Wrapping it all up in a function:
def merge_all(df):
    ids = {d['id'] for d in df}
    result = []
    for id in ids:
        matches = [d for d in df if d['id'] == id]
        combined = matches[0]
        combined['location'] = [d['location'] for d in matches]
        result.append(combined)
    return result
Also, please don't use list as a variable name. It shadows the builtin list class.
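As an alternative to rescanning the list once per id, the grouping can be done in a single pass with a dictionary keyed by id. This is a sketch rather than the approach of the answers above: merge_by_id is a hypothetical name, it assumes the name is the same for every record sharing an id, and it always stores location as a list (even for a single match). It relies on dicts preserving insertion order (Python 3.7+).

```python
def merge_by_id(records):
    merged = {}
    for record in records:
        # Create the merged entry the first time an id is seen.
        entry = merged.setdefault(
            record['id'],
            {'id': record['id'], 'name': record['name'], 'location': []},
        )
        entry['location'].append(record['location'])
    return list(merged.values())

mylist = [
    {'id': '123', 'name': 'Jason', 'location': 'McHale'},
    {'id': '432', 'name': 'Tom', 'location': 'Sydney'},
    {'id': '123', 'name': 'Jason', 'location': 'Tompson Hall'},
]
print(merge_by_id(mylist))
```

Unlike reusing matches[0], this builds new dictionaries and leaves the input records unmodified.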
I am using a tuple to store the output of a find -exec stat command and need to condense it in order to run du on it. The output is a tuple with each item being (username,/path/to/file)
I want to condense it to combine like usernames so the end result is (username,/path/to/file1,/path/to/file2,etc)
Is there any way to do this?
Here is the current code that returns my tuple
cmd = ['find',dir_loc,'-type','f','-exec','stat','-c','%U %n','{}','+']
process = Popen(cmd,stdout=PIPE)
find_out = process.communicate()
exit_code = process.wait()
find_out = find_out[0].split('\n')
out_tuple = []
for item in find_out:
    out_tuple.append(item.split(' '))
Assuming you have a list of tuples or a list of lists of the form:
out_tuple = [('user_one', 'path_one'),
             ('user_three', 'path_seven'),
             ('user_two', 'path_five'),
             ('user_one', 'path_two'),
             ('user_one', 'path_three'),
             ('user_two', 'path_four')]
You can do:
from itertools import groupby
out_tuple.sort()
total_grouped = []
for key, group in groupby(out_tuple, lambda x: x[0]):
    grouped_list = [key] + [x[1] for x in group]
    total_grouped.append(tuple(grouped_list))
This will give you the list of tuples:
print total_grouped
# Prints:
# [('user_one', 'path_one', 'path_three', 'path_two'),
# ('user_three', 'path_seven'),
# ('user_two', 'path_five', 'path_four')]
If you started with a list of lists, then instead of:
total_grouped.append(tuple(grouped_list))
You can get rid of the tuple construction:
total_grouped.append(grouped_list)
I'll say one thing though: you might be better off using something like a dict, as @BradBeattie suggests. If you're going to perform some operation later on that treats the first item in your tuple (or list) in a special way, then a dict is better.
It not only has a notion of uniqueness in the keys, it's also less cumbersome because the nesting has two distinct levels. First you have the dict, then you have the inner item which is a tuple (or a list). This is much clearer than having two similar collections nested one inside the other.
Just use a dict of lists:
out_tuple = [('user1', 'path1'),
             ('user1', 'path2'),
             ('user2', 'path3'),
             ('user1', 'path4'),
             ('user2', 'path5'),
             ('user1', 'path6')]
d={}
for user_name, path in out_tuple:
    d.setdefault(user_name, []).append(path)
print d
Prints:
{'user2': ['path3', 'path5'], 'user1': ['path1', 'path2', 'path4', 'path6']}
Then if you want the output for each user name as a tuple:
for user_name in d:
    print tuple([user_name]+d[user_name])
Prints:
('user2', 'path3', 'path5')
('user1', 'path1', 'path2', 'path4', 'path6')
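The same grouping can also be written with collections.defaultdict, working directly on "user path" lines like those produced by the stat format '%U %n'. A sketch for Python 3, with made-up sample lines standing in for the real find output; it assumes user names contain no spaces and splits only on the first space so that paths containing spaces stay intact:

```python
from collections import defaultdict

# Hypothetical sample of the decoded, newline-split command output.
lines = ['alice /tmp/a.txt', 'bob /tmp/b file.txt', 'alice /tmp/c.txt', '']

paths_by_user = defaultdict(list)
for line in lines:
    if not line:
        continue  # skip the empty string left after the final newline
    user, path = line.split(' ', 1)
    paths_by_user[user].append(path)

# One tuple per user: (username, path1, path2, ...)
grouped = [(user,) + tuple(paths) for user, paths in paths_by_user.items()]
print(grouped)
```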