Here is the situation: I have a list of ids. Each of those ids can be related to any number (0 to n) of other ids, which in turn can be related to further ids, and so on. As a result I want a list of all relations, no matter the "depth".
At least to me this screams recursion, but I can't quite wrap my head around how to do it.
def dive(rels):
    if dive(rels) == []:
        return rels
    else:
        for item in rels:
            rels.append(getRelation(item))
        rels = list(set(flattenAndClean(rels)))
        return dive(rels)
This is my first (not working) attempt, where the function getRelation returns a list of relations of this item and the function flattenAndClean takes nested lists and returns flat ones.
Edit: Example:
Items={1:[4,5,6],2:[6,8],3:[],4:[7],5:[],6:[],7:[4],8:[]}
List = [1,2,3]
def getRelation(id):
    return Items[id]
In: dive(List)
Out: [4,5,6,7,8]
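def rels(items, L):
    answer = set(L)
    for i in L:  # L grows while we iterate, so newly found ids get visited too
        targs = tuple(t for t in items[i] if t not in L)
        L.extend(targs)  # note: this modifies the list passed in by the caller
        answer.update(set(targs))
    return answer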
Output
In [29]: rels({1:[4,5,6],2:[6,8],3:[],4:[7],5:[],6:[],7:[4],8:[]}, [1,2,3])
Out[29]: {1, 2, 3, 4, 5, 6, 7, 8}
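For comparison, the same traversal can be written with an explicit queue and a set of already seen ids instead of recursion. This is only a sketch under the assumptions of the example in the question: the relations live in a dict shaped like Items, and the names all_relations, items and start_ids are made up for illustration. It returns only the ids reached through a relation, which matches the expected output [4,5,6,7,8].

```python
from collections import deque

def all_relations(items, start_ids):
    """Collect every id reachable from start_ids, at any depth."""
    seen = set()               # ids found so far; also guards against cycles such as 4 -> 7 -> 4
    queue = deque(start_ids)   # ids whose relations still need to be looked up
    while queue:
        current = queue.popleft()
        for related in items.get(current, []):  # assumption: unknown ids have no relations
            if related not in seen:
                seen.add(related)
                queue.append(related)
    return sorted(seen)

Items = {1: [4, 5, 6], 2: [6, 8], 3: [], 4: [7], 5: [], 6: [], 7: [4], 8: []}
print(all_relations(Items, [1, 2, 3]))  # [4, 5, 6, 7, 8]
```

Because every id is added to the seen set at most once, the loop terminates even when relations point back at each other.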
I am trying to remove adjacent duplicates from a list without using list mutations like del or remove. Below is the code I tried:
def remove_dups(L):
    L = [x for x in range(0,len(L)) if L[x] != L[x-1]]
    return L
print(remove_dups([1,2,2,3,3,3,4,5,1,1,1]))
This outputs:
[1, 3, 6, 7, 8]
Can anyone explain to me how this output occurred? I want to understand the flow, but I wasn't able to follow it even with debugging in VS Code.
Input:
[1,2,2,3,3,3,4,5,1,1,1]
Expected output:
[1,2,3,4,5,1]
I'll rename the variables to make this more readable:
def remove_dups(L):
    L = [x for x in range(0,len(L)) if L[x] != L[x-1]]
becomes:
def remove_dups(lst):
    return [index for index in range(len(lst)) if lst[index] != lst[index-1]]
As you can see, instead of looping over the items of the list, it loops over the indices of the list, comparing the value at each index, lst[index], to the value at the previous index, lst[index-1], and keeping an entry only if the two don't match.
The two main issues are:
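for index 0 the previous index is -1, which in Python is the last item of the list, so the first item is compared to the last one
the comprehension collects index, not lst[index], so it actually returns the indices of the non-duplicated items rather than the items themselves.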
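To see both issues at once, the comprehension from the question can be unrolled into a loop that prints each comparison. This is just a tracing sketch; the names kept and previous are introduced here for illustration.

```python
L = [1, 2, 2, 3, 3, 3, 4, 5, 1, 1, 1]

kept = []
for x in range(len(L)):
    previous = L[x - 1]  # for x == 0 this is L[-1], i.e. the last element
    if L[x] != previous:
        kept.append(x)   # the index is stored, not the value L[x]
    print(x, L[x], previous, L[x] != previous)

print(kept)  # [1, 3, 6, 7, 8]
```

Index 0 is dropped because the first and last elements are both 1, and the remaining entries are the positions where a new run of values starts.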
To make this work, I'd use the enumerate function, which returns each item together with its index, as follows:
def remove_dups(lst):
    return [item for index, item in enumerate(lst[:-1]) if item != lst[index+1]] + [lst[-1]]
Here what I'm doing is looping through all of the items except for the last one [:-1] and checking if the item matches the next item, only adding it if it doesn't
Finally, because the last value isn't read we append it to the output + [lst[-1]].
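One caveat: because of lst[-1], the version above raises an IndexError for an empty list. A possible variant compares each item with the previous one instead and keeps the first item unconditionally. This is a sketch, and the name remove_adjacent_dups is made up here.

```python
def remove_adjacent_dups(lst):
    # Keep the first item, then every item that differs from the one before it.
    return [item for index, item in enumerate(lst)
            if index == 0 or item != lst[index - 1]]

print(remove_adjacent_dups([1, 2, 2, 3, 3, 3, 4, 5, 1, 1, 1]))  # [1, 2, 3, 4, 5, 1]
print(remove_adjacent_dups([]))                                  # []
```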
This is a job for itertools.groupby:
from itertools import groupby
def remove_dups(L):
    return [k for k,g in groupby(L)]
L2 = remove_dups([1,2,2,3,3,3,4,5,1,1,1])
Output: [1, 2, 3, 4, 5, 1]
I have list:
[1,4,4,2,[[[4],5]],5,6]
How can I filter this list so that it contains only unique values, but keeps the nesting (dimensions) of the elements that remain?
Expected result:
[1,4,2,[[5]],6]
You can write a recursive function that works like any other unique-finding solution you might find online, except that when it encounters an inner list it calls itself on that inner list (getting only the uniques from it). It also passes along the set of already seen elements, so that "uniques" means elements that are entirely new.
Try this:
lst = [1, 4, 4, 2, [[[4], 5]], 5, 6]
def unique(lst, already_seen):
    result = []
    for item in lst:
        if isinstance(item, list):
            inner_uniques = unique(item, already_seen)
            if len(inner_uniques) > 0:
                result.append(inner_uniques)
        else:  # assuming item is a number
            if item in already_seen:
                continue
            else:  # new number
                result.append(item)
                already_seen.add(item)
    return result
result = unique(lst, set())  # starting with an empty set
print(result)
Output:
[1, 4, 2, [[5]], 6]
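Here's a version that modifies the original list in place instead of making a new one (and also returns it, which isn't so Pythonic but can be useful). Unlike the version above, it leaves emptied inner lists in place:
def remDups(s, seen=None):
    if seen is None:  # avoid a mutable default argument shared between calls
        seen = set()
    to_pop = []
    for i, v in enumerate(s):
        if isinstance(v, list):
            remDups(v, seen)
        elif v in seen:
            to_pop = [i] + to_pop  # highest index first, so pops don't shift later ones
        else:
            seen.add(v)
    for i in to_pop:
        s.pop(i)
    return s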
I'm trying to write a piece of code that can automatically factor an expression. For example,
if I have two lists [1,2,3,4] and [2,3,5], the code should be able to find the common elements in the two lists, [2,3], and combine the rest of the elements together in a new list, being [1,4,5].
From this post: How to find list intersection?
I see that the common elements can be found by
set([1,2,3,4]) & set([2,3,5])
Is there an easy way to retrieve non-common elements from each list, in my example being [1,4] and [5]?
I can go ahead and do a for loop:
lists = [[1,2,3,4],[2,3,5]]
nonCommon = []
common = [2,3]
for eachList in lists:
    for elem in eachList:
        if elem not in common:
            nonCommon.append(elem)
But this seems redundant and inefficient. Does Python provide any handy function that can do that? Thanks in advance!!
Use the symmetric difference operator for sets (aka the XOR operator):
>>> set([1,2,3]) ^ set([3,4,5])
set([1, 2, 4, 5])
Old question, but looks like python has a built-in function to provide exactly what you're looking for: .difference().
EXAMPLE
list_one = [1,2,3,4]
list_two = [2,3,5]
one_not_two = set(list_one).difference(list_two)
# set([1, 4])
two_not_one = set(list_two).difference(list_one)
# set([5])
This could also be written as:
one_not_two = set(list_one) - set(list_two)
Timing
I ran some timing tests on both and it appears that .difference() has a slight edge, to the tune of 10 - 15% but each method took about an eighth of a second to filter 1M items (random integers between 500 and 100,000), so unless you're very time sensitive, it's probably immaterial.
Other Notes
It appears the OP is looking for a solution that provides two separate lists (or sets) - one where the first contains items not in the second, and vice versa. Most of the previous answers return a single list or set that include all of the items.
There is also the question as to whether items that may be duplicated in the first list should be counted multiple times, or just once.
If the OP wants to maintain duplicates, a list comprehension could be used, for example:
one_not_two = [ x for x in list_one if x not in list_two ]
two_not_one = [ x for x in list_two if x not in list_one ]
...which is roughly the same solution as posed in the original question, only a little cleaner. This method would maintain duplicates from the original list but is considerably (like multiple orders of magnitude) slower for larger data sets.
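If duplicates must be kept and the lists are large, one possible middle ground is to keep the list comprehensions but test membership against sets rather than lists. This is a sketch that assumes the items are hashable; the function name non_common is made up for illustration, and no timings are claimed for it.

```python
def non_common(list_one, list_two):
    # Build each set once so every membership test is a hash lookup
    # instead of a scan through the other list.
    set_one, set_two = set(list_one), set(list_two)
    one_not_two = [x for x in list_one if x not in set_two]
    two_not_one = [x for x in list_two if x not in set_one]
    return one_not_two, two_not_one

print(non_common([1, 2, 3, 4, 1], [2, 3, 5]))  # ([1, 4, 1], [5])
```

The original order and any repeated items of each list are preserved in the result.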
You can use set intersection to deal with this kind of problem.
b1 = [1,2,3,4,5,9,11,15]
b2 = [4,5,6,7,8]
set(b1).intersection(b2)
Out[22]: {4, 5}
The best thing about this code is that it also works pretty fast for large data. With 607,139 elements in b1 and 296,029 elements in b2, I get my results in 2.9 seconds using this logic.
You can call the set's .__xor__ method directly (it is what the ^ operator uses):
set([1,2,3,4]).__xor__(set([2,3,5]))
or
a = set([1,2,3,4])
b = set([2,3,5])
a.__xor__(b)
You can use the symmetric_difference method:
x = {1,2,3}
y = {2,3,4}
z = x.symmetric_difference(y)
Output will be: z = {1, 4}
This should get the common and remaining elements
lis1=[1,2,3,4,5,6,2,3,1]
lis2=[4,5,8,7,10,6,9,8]
common = list(dict.fromkeys([l1 for l1 in lis1 if l1 in lis2]))
remaining = list(filter(lambda i: i not in common, lis1+lis2))
# common    -> [4, 5, 6]
# remaining -> [1, 2, 3, 2, 3, 1, 8, 7, 10, 9, 8]
All the good solutions, from a basic DSA-style approach to built-in functions:
# Time: O(n + m)
def solution1(arr1, arr2):
    seen = {}  # value -> [found in arr1, found in arr2]
    for value in arr1:
        seen.setdefault(value, [False, False])[0] = True
    for value in arr2:
        seen.setdefault(value, [False, False])[1] = True
    res = []
    for key, flags in seen.items():
        if not (flags[0] and flags[1]):  # missing from at least one list
            res.append(key)
    return res
def solution2(arr1, arr2):
    return set(arr1) ^ set(arr2)
def solution3(arr1, arr2):
    return (set(arr1).difference(arr2), set(arr2).difference(arr1))
def solution4(arr1, arr2):
    return set(arr1).__xor__(set(arr2))
print(solution1([1,2,3], [2,4,6]))
print(solution2([1,2,3], [2,4,6]))
print(solution3([1,2,3], [2,4,6]))
print(solution4([1,2,3], [2,4,6]))
I'm looking for a way to make a list containing list (a below) into a single list (b below) with 2 conditions:
The order of the new list (b) is based on the number of times the value has occurred in some of the lists in a.
A value can only appear once
Basically turn a into b:
a = [[1,2,3,4], [2,3,4], [4,5,6]]
# value 4 occurs 3 times in list a and gets first position
# value 2 occurs 2 times in list a and get second position and so on...
b = [4,2,3,1,5,6]
I figure one could do this with set and some list magic, but I can't get my head around it when a can contain any number of lists. The a list is created from user input (I guess it can contain between 1 and 20 lists with up to 200-300 items in each).
I'm trying something along the lines of [set(l) for l in a], but I don't know how to perform set(l) & set(l).... to get all matched items.
Is this possible without a for loop that iterates (sublist count * items in sublist) times?
I think this is probably the closest you're going to get:
from collections import defaultdict
a = [[1,2,3,4], [2,3,4], [4,5,6]]
d = defaultdict(int)
for sub in a:
    for val in sub:
        d[val] += 1
print(sorted(d.keys(), key=lambda k: d[k], reverse=True))
# Output: [4, 2, 3, 1, 5, 6]
The order of elements that appear an identical number of times depends on the order of d.keys(). Before Python 3.7 that order is not guaranteed, so ties may come out in an indeterminate order; from 3.7 on, dicts keep insertion order and sorted() is stable, so ties come out in first-seen order.
import itertools
all_items = set(itertools.chain(*a))
b = sorted(all_items, key = lambda y: -sum(x.count(y) for x in a))
Try this -
a = [[1,2,3,4], [2,3,4], [4,5,6]]
s = set()
for l in a:
    s.update(l)
print(s)
# {1, 2, 3, 4, 5, 6}
b = list(s)
This will add each list to the set, which gives you a set of the unique elements across all the lists, if that is what you are after.
Edit: To preserve the order of elements in the original lists, you can't use sets.
a = [[1,2,3,4], [2,3,4], [4,5,6]]
b = []
for l in a:
    for i in l:
        if i not in b:
            b.append(i)
print(b)
# [1, 2, 3, 4, 5, 6] - the same order as the set in this case, since that's the order they appear in the lists
import itertools
from collections import defaultdict
def list_by_count(lists):
    data_stream = itertools.chain.from_iterable(lists)
    counts = defaultdict(int)
    for item in data_stream:
        counts[item] += 1
    return [item for (item, count) in
            sorted(counts.items(), key=lambda x: (-x[1], x[0]))]
Having the x[0] in the sort key ensures that items with the same count are in some kind of sequence as well.
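The question can also be read as asking in how many of the sublists a value appears, rather than how often it appears in total. A sketch of that reading with collections.Counter, assuming the values are hashable and mutually comparable; order_by_frequency is a made-up name.

```python
from collections import Counter
from itertools import chain

def order_by_frequency(lists):
    # set(sub) makes each sublist contribute at most one count per value.
    counts = Counter(chain.from_iterable(set(sub) for sub in lists))
    # Highest count first; ties are broken by the value itself.
    return sorted(counts, key=lambda value: (-counts[value], value))

a = [[1, 2, 3, 4], [2, 3, 4], [4, 5, 6]]
print(order_by_frequency(a))  # [4, 2, 3, 1, 5, 6]
```

If total occurrences are wanted instead, dropping the set(...) call counts repeated values within a sublist as well.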
I have a list of dictionaries. There are several points in the list, and some of them occur multiple times. When a point occurs more than once, I want to calculate the average of the x and the y of this point. My problem is that I don't know how to loop through the list of dictionaries to compare the ids of the points!
When I use something like this:
for i in list:
    for j in list:
        if i['id'] == j['id']:
            point = getPoint(i['geom'])
            ....
Sorry, the formatting is a little bit tricky... the second loop is inside the first one.
I think it compares the first entry of the list with itself, so it's always the same. So I have to start the second loop at the second entry, but I can't do that with i-1 because i is the whole dictionary.
Does anyone have an idea?
Thanks in advance!
for j in range(1, len(NEWPoint)):
    if i['gid']==j['gid']:
        allsamePoints.append(j)
for k in allsamePoints:
    for l in range(1, len(allsamePoints)):
        if k['gid']==l['gid']:
            Point1 = k['geom']
            Point2=l['geom']
            X=(Point1.x()+Point2.x())/2
            Y=(Point1.y()+Point2.y())/2
            AVPoint = QgsPoint(X, Y)
            NEWReturnList.append({'gid': j['gid'], 'geom': AVPoint})
            del l
for m in NEWReturnList:
    for n in range(1, len(NEWReturnList)):
        if m['gid']==n['gid']:
            Point1 = m['geom']
            Point2=n['geom']
            X=(Point1.x()+Point2.x())/2
            Y=(Point1.y()+Point2.y())/2
            AVPoint = QgsPoint(X, Y)
            NEWReturnList.append({'gid': j['gid'], 'geom': AVPoint})
            del n
        else:
            pass
OK, I think... at the moment that's even more confusing :)...
One way would be changing the way you store your points, because as you already noticed, it's hard to get what you want out of it.
A much more useful structure would be a dict where the id maps to a list of points:
from collections import defaultdict
points_dict = defaultdict(list)
# make the new dict
for point in point_list:
    id = point["id"]
    points_dict[id].append(point['geom'])
def avg(lst):
    """Average of the numbers in `lst`."""
    return 1.0 * sum(lst) / len(lst)
# now it's simple to get the average
# (avg() expects plain numbers; for point objects, average the x and y values separately)
for id in points_dict:
    print(id, avg(points_dict[id]))
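Applied to points with separate x and y values, the grouping idea might look like the sketch below. It assumes each 'geom' is a plain (x, y) tuple rather than a real geometry object, so the tuples stand in for whatever point type is actually used; average_points is a hypothetical name.

```python
from collections import defaultdict

def average_points(point_list):
    # Group the geometries by id first.
    grouped = defaultdict(list)
    for point in point_list:
        grouped[point["id"]].append(point["geom"])
    # Then average x and y separately for every id.
    averaged = []
    for point_id, geoms in grouped.items():
        mean_x = sum(x for x, _ in geoms) / len(geoms)
        mean_y = sum(y for _, y in geoms) / len(geoms)
        averaged.append({"id": point_id, "geom": (mean_x, mean_y)})
    return averaged

points = [
    {"id": 1, "geom": (0.0, 0.0)},
    {"id": 2, "geom": (5.0, 5.0)},
    {"id": 1, "geom": (2.0, 4.0)},
]
print(average_points(points))
# [{'id': 1, 'geom': (1.0, 2.0)}, {'id': 2, 'geom': (5.0, 5.0)}]
```

Points that occur only once come back unchanged, since the average of a single value is the value itself.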
I'm not totally sure what you want to do, but I think list filtering would help you. There's a built-in function, filter, which iterates over a sequence and calls a user-defined function for each item to determine whether to include that item in the result. In Python 3 it returns a lazy iterator, so wrap it in list() if you need a list.
For instance:
def is4(number):
    return number == 4
l = [1, 2, 3, 4, 5, 6, 4, 7, 8, 4, 4]
list(filter(is4, l))  # returns [4, 4, 4, 4]
So, given a list of dictionaries, to keep only the dictionaries whose entry is equal to a given value, you could do something like this:
def filter_dicts(dicts, entry, value):
    def filter_function(d):
        if entry not in d:
            return False
        return d[entry] == value
    return list(filter(filter_function, dicts))
With this function, to get all dictionaries with the "id" entry equal to 2, you can do:
result = filter_dicts(your_list, "id", 2)
With this, your main loop could look something like this (using your_list rather than list as the variable name, so the built-in list is not shadowed):
processed_ids = set()
for item in your_list:
    id = item['id']
    if id in processed_ids:
        continue
    processed_ids.add(id)
    same_ids = filter_dicts(your_list, "id", id)
    # now do something with same_ids
I hope I understood you correctly and that this is helpful to you.