Select only one unique element from multiple lists in Python - python

This is not homework; I am trying to solve a problem (here is the link if you are interested: https://open.kattis.com/problems/azulejos).
You don't actually have to understand the problem. What I want to accomplish is to select exactly one element from each of several lists so that no two selections overlap.
For example, at the end of my code, I get an output:
{1: [1, 2, 3], 2: [1, 2, 3, 4], 3: [2, 4], 4: [1, 2, 3, 4]}
I would like to transform this into, for example,
{3: 4, 2: 2, 4:1, 1: 3} -- which is the sample answer that is in the website.
But from my understanding, it can also be simply
{1: 3, 2: 2, 3: 4, 4: 1}
I am struggling to select only one integer per key such that it does not overlap with the others. The dictionary I produce in my code contains lists with multiple integers, and I would like to pick exactly one from each list so that all picks are unique.
import sys
n_tiles_row = int(sys.stdin.readline().rstrip())
# print(n_tiles_row) ==> 4
# BACK ROW - JOAO
back_row_price = sys.stdin.readline().rstrip()
# print(back_row_price) ==> 3 2 1 2
back_row_height = sys.stdin.readline().rstrip()
# print(back_row_height) ==> 2 3 4 3
# FRONT ROW - MARIA
front_row_price = sys.stdin.readline().rstrip()
# print(front_row_price) ==> 2 1 2 1
front_row_height = sys.stdin.readline().rstrip()
# print(front_row_height) ==> 2 2 1 3
br_num1_price, br_num2_price, br_num3_price, br_num4_price = map(int, back_row_price.split())
# br_num1_price = 3; br_num2_price = 2; br_num3_price = 1; br_num4_price = 2;
br_num1_height, br_num2_height, br_num3_height, br_num4_height = map(int, back_row_height.split())
# 2 3 4 3
fr_num1_price, fr_num2_price, fr_num3_price, fr_num4_price = map(int, front_row_price.split())
# 2 1 2 1
fr_num1_height, fr_num2_height, fr_num3_height, fr_num4_height = map(int, front_row_height.split())
# 2 2 1 3
back_row = {1: [br_num1_price, br_num1_height],
2: [br_num2_price, br_num2_height],
3: [br_num3_price, br_num3_height],
4: [br_num4_price, br_num4_height]}
# {1: [3, 2], 2: [2, 3], 3: [1, 4], 4: [2, 3]}
front_row = {1: [fr_num1_price, fr_num1_height],
2: [fr_num2_price, fr_num2_height],
3: [fr_num3_price, fr_num3_height],
4: [fr_num4_price, fr_num4_height]}
# {1: [2, 2], 2: [1, 2], 3: [2, 1], 4: [1, 3]}
_dict = {1: [],
2: [],
3: [],
4: []
}
for i in range(n_tiles_row):
    _list = []
    for n in range(n_tiles_row):
        if(list(back_row.values())[i][0] >= list(front_row.values())[n][0]
                and list(back_row.values())[i][1] >= list(front_row.values())[n][1]):
            _list.append(list(front_row.keys())[n])
    _dict[list(back_row.keys())[i]] = _list
print(_dict)
# {1: [1, 2, 3], 2: [1, 2, 3, 4], 3: [2, 4], 4: [1, 2, 3, 4]}
Please let me know if there is another approach to this problem.

Here is a solution using the same syntax as the code you provided.
The trick here is to order the tiles first by price ascending (the problem asks for non-descending prices) and then by height descending, so that the tallest back-row tile at the next lowest price is matched with the tallest front-row tile at the next lowest price.
To do this sorting I utilized Python's sorted() function. See a Stack Overflow example here.
I assumed that if no such match exists, the code should break immediately and print "impossible", according to the problem you linked.
As a side note, you originally claimed that the Python dictionary
{3: 4, 2: 2, 4:1, 1: 3} is equivalent to {1: 3, 2: 2, 3: 4, 4: 1}. You are correct: dictionary equality ignores key order. Keep in mind, though, that dictionaries are not sorted by key (since Python 3.7 they preserve insertion order), so the two print differently and are not easy to compare by eye.
import sys
n_tiles_row = int(sys.stdin.readline().rstrip())
# print(n_tiles_row) ==> 4
# BACK ROW - JOAO
back_row_price = sys.stdin.readline().rstrip()
# print(back_row_price) ==> 3 2 1 2
back_row_height = sys.stdin.readline().rstrip()
# print(back_row_height) ==> 2 3 4 3
# FRONT ROW - MARIA
front_row_price = sys.stdin.readline().rstrip()
# print(front_row_price) ==> 2 1 2 1
front_row_height = sys.stdin.readline().rstrip()
# print(front_row_height) ==> 2 2 1 3
# preprocess data into lists of ints
back_row_price = [int(x) for x in back_row_price.strip().split(' ')]
back_row_height = [int(x) for x in back_row_height.strip().split(' ')]
front_row_price = [int(x) for x in front_row_price.strip().split(' ')]
front_row_height = [int(x) for x in front_row_height.strip().split(' ')]
# store each tile into lists of tuples
front = list()
back = list()
for i in range(n_tiles_row):
    back.append((i, back_row_price[i], back_row_height[i])) # tuples of (tile_num, price, height)
    front.append((i, front_row_price[i], front_row_height[i]))
# sort tiles by price first (as the price must be non-descending) then by height descending
back = sorted(back, key=lambda x: (x[1], -x[2]))
front = sorted(front, key=lambda x: (x[1], -x[2]))
# print(back) ==> [(2, 1, 4), (1, 2, 3), (3, 2, 3), (0, 3, 2)]
# print(front) ==> [(3, 1, 3), (1, 1, 2), (0, 2, 2), (2, 2, 1)]
possible_back_tile_order = list()
possible_front_tile_order = list()
for i in range(n_tiles_row):
    if back[i][2] > front[i][2]: # if next lowest priced back tile is taller than next lowest priced front tile
        possible_back_tile_order.append(back[i][0])
        possible_front_tile_order.append(front[i][0])
    else:
        break
if len(possible_back_tile_order) < n_tiles_row: # check that all tiles had matching pairs in back and front
    print("impossible")
else:
    print(possible_back_tile_order)
    print(possible_front_tile_order)
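On the side note about dictionary ordering, a minimal sketch may help. It only uses the two dictionaries from the question; the variable names are made up for illustration.

```python
# The two dictionaries from the question, written in different key orders.
sample_answer = {3: 4, 2: 2, 4: 1, 1: 3}
alternative = {1: 3, 2: 2, 3: 4, 4: 1}

# Equality compares key/value pairs and ignores insertion order.
same_mapping = sample_answer == alternative

# Sorting the items gives a canonical form that is easier to compare by eye.
canonical = sorted(sample_answer.items())

print(same_mapping)
print(canonical)
```

Sorting the items is only needed for display; the `==` comparison already treats both dictionaries as the same mapping.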

A possibly inefficient way of solving the issue is to generate all possible "solutions" (with values potentially not present in the lists corresponding to a specific key) and settle for a "valid" one (for which all values are present in the corresponding lists).
One way of doing this with itertools.permutations (which computes all candidate solutions satisfying the uniqueness constraint) would be:
import itertools
def gen_valid(source):
    keys = source.keys()
    possible_values = set(x for k, v in source.items() for x in v)
    for values in itertools.permutations(possible_values):
        result = {k: v for k, v in zip(keys, values)}
        # : check that `result` is valid
        if all(v in source[k] for k, v in result.items()):
            yield result
d = {1: [1, 2, 3], 2: [1, 2, 3, 4], 3: [2, 4], 4: [1, 2, 3, 4]}
next(gen_valid(d))
# {1: 1, 2: 2, 3: 4, 4: 3}
list(gen_valid(d))
# [{1: 1, 2: 2, 3: 4, 4: 3},
# {1: 1, 2: 3, 3: 2, 4: 4},
# {1: 1, 2: 3, 3: 4, 4: 2},
# {1: 1, 2: 4, 3: 2, 4: 3},
# {1: 2, 2: 1, 3: 4, 4: 3},
# {1: 2, 2: 3, 3: 4, 4: 1},
# {1: 3, 2: 1, 3: 2, 4: 4},
# {1: 3, 2: 1, 3: 4, 4: 2},
# {1: 3, 2: 2, 3: 4, 4: 1},
# {1: 3, 2: 4, 3: 2, 4: 1}]
This generates n! candidate solutions.
The "brute force" approach using a Cartesian product over the lists produces prod(n_k) = n_1 * n_2 * ... * n_n candidate solutions (with n_k the length of the k-th list). In the worst case scenario (maximum density) this is n ** n candidates, which is asymptotically much worse than the factorial.
In the best case scenario (minimum density) this is 1 solution only.
In general, this can be either slower or faster than the "permutation solution" proposed above, depending on the "sparsity" of the lists.
For an average n_k of approx. n / 2, n! is smaller/faster for n >= 6.
For an average n_k of approx. n * (3 / 4), n! is smaller/faster for n >= 4.
In this example there are 4! == 4 * 3 * 2 * 1 == 24 permutation solutions, and 3 * 4 * 2 * 4 == 96 Cartesian product solutions.
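For comparison, the Cartesian product approach described above could be sketched as follows. This is only an illustration: the function name `gen_valid_product` is made up, and the uniqueness check is the simplest one I could think of.

```python
import itertools

def gen_valid_product(source):
    """Yield assignments that pick one value per key with no value repeated."""
    keys = list(source)
    # Every tuple takes one value from each key's own list, so membership
    # is guaranteed; only uniqueness has to be checked.
    for values in itertools.product(*(source[k] for k in keys)):
        if len(set(values)) == len(values):
            yield dict(zip(keys, values))

d = {1: [1, 2, 3], 2: [1, 2, 3, 4], 3: [2, 4], 4: [1, 2, 3, 4]}
solutions = list(gen_valid_product(d))
print(len(solutions))
print(solutions[0])
```

Here the product visits 96 candidates and keeps the same valid assignments as the permutation version, so which one is cheaper depends on how sparse the lists are.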

Related

Exporting values & keys from dictionary in specified way

New to Python and hitting a wall with this problem.
Scenario: I have a list with an unknown number of integers. I need to take these, sort them, and extract the most frequent occurrences first. If several items have the same frequency, the higher value should be chosen first.
So far, I have made a dictionary to count an example request list, but I am unsure how to extract the keys and values as specified above.
def frequency(requests):
    freq = {}
    for x in requests:
        if (x in freq):
            freq[x] += 1
        else:
            freq[x] = 1
    print(freq) # provides expected result
    #my attempt to sort dictionary and extract required values
    sorted_freq = dict(sorted(freq.items(), key=lambda x:x[1], reverse=True))
    print(sorted_freq) #printing the keys at this stage doesn't factor in if the key is bigger/smaller for items with same frequency
    print(sorted_freq.keys())
    return
requests = [2,3,6,5,2,7,2,3,6,5,2,7,11,2,77] #example of request
frequency(requests)
#Output for freq = {2: 5, 3: 2, 6: 2, 5: 2, 7: 2, 11: 1, 77: 1}
#Output for sorted_freq = {2: 5, 3: 2, 6: 2, 5: 2, 7: 2, 11: 1, 77: 1}
#Output for sorted_freq.keys = [2, 3, 6, 5, 7, 11, 77]
So in the above, 3, 6, 5 & 7 all have two occurrences; similarly, 11 & 77 both have one occurrence.
The output I am looking for is [2, 7, 6, 5, 3, 77, 11].
I have added in the extra prints above to visualise the problem, will only need the final print in the actual code.
Not sure what the optimal way to approach this is, any help would be appreciated.
Thanks
from collections import Counter
from itertools import groupby
requests = [2,3,6,5,2,7,2,3,6,5,2,7,11,2,77]
c = Counter(requests)
freq = list()
for i,g in groupby(c.items(), key=lambda t:t[1]):
    freq.extend(sorted([j for j,k in g],reverse=True))
print(freq)
Try to use built-ins as they are really useful, don't reinvent the wheel :)
Output:
[2, 7, 6, 5, 3, 77, 11]
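Note that groupby only merges neighbouring items, so the answer above relies on equal counts happening to sit next to each other in the Counter. A sketch that does not depend on that, assuming the desired order is "highest count first, then highest value first", is a single sort with a compound key:

```python
from collections import Counter

requests = [2, 3, 6, 5, 2, 7, 2, 3, 6, 5, 2, 7, 11, 2, 77]
counts = Counter(requests)

# Negate both parts of the key so that an ascending sort yields
# descending frequency, with ties broken by descending value.
ordered = sorted(counts, key=lambda x: (-counts[x], -x))
print(ordered)  # [2, 7, 6, 5, 3, 77, 11]
```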

How do I get a multiple of leaf nodes for each element in dict as tree structure?

Let's say here is input:
input = {1:[2,3], 2:[], 3:[4,5,6], 4:[], 5:[], 6:[]}
This can be represented like below:
tree
If every leaf node counts as 1, we can rewrite the entries like this:
for idx, val in enumerate(input[1]):
    if len(input[val]) == 0:
        input[1][idx] = [] # input[1] = [[], 3]
    else:
        input[1][idx] = input[val] # input[1] = [[], [4,5,6]]
And then somehow, input[1] can be input[1] = [[],[[],[],[]]].
So finally, for each key of the dict, I want to get the number of leaf nodes beneath it.
I am not sure whether my description is clear.
Anyway, what I want to get is:
# 2, 4, 5, 6 are all leaf nodes.
# 3 is including [4, 5, 6] which are all leaf nodes, then 3's value must be 3 (len[4,5,6])
# 1 is including [2, 3]. Among them, only 2 is leaf. And 3's value is 3. So output[1] = 1 + 3 = 4
output = {1:4, 2:1, 3:3, 4:1, 5:1, 6:1}
Simple recursion with functools.lru_cache:
from functools import lru_cache
def leaves_count(tree):
    @lru_cache(maxsize=None)
    def cntr(key):
        value = tree[key]
        return sum(map(cntr, value)) if value else 1
    return {k: cntr(k) for k in tree}
Test:
>>> tree = {1: [2, 3], 2: [], 3: [4, 5, 6], 4: [], 5: [], 6: []}
>>> leaves_count(tree)
{1: 4, 2: 1, 3: 3, 4: 1, 5: 1, 6: 1}
Manually implement the cached version:
def leaves_count(tree):
    def cntr(key):
        cnt = cache.get(key)
        if cnt:
            return cnt
        value = tree[key]
        cache[key] = cnt = sum(map(cntr, value)) if value else 1
        return cnt
    cache = {}
    return {k: cntr(k) for k in tree}
Using recursion method
Code:
dic= {1:[2,3], 2:[], 3:[4,5,6], 4:[], 5:[], 6:[]}
def recur(val, leaf):
    for v in val:
        if v in dic.keys():
            if len(dic[v])==0:
                leaf.append(v)
            else:
                recur(dic[v],leaf)
        else:
            leaf.append(v)
    return len(leaf)
{key : 1 if recur(val,[])==0 else recur(val,[]) for key,val in dic.items()}
Output:
{1: 4, 2: 1, 3: 3, 4: 1, 5: 1, 6: 1}
Input: {1:[2,3], 2:[], 3:[4,5,6], 4:[], 5:[], 6:[7,8], 7:[], 8:[9,10], 9:[]}
Output: {1: 6, 2: 1, 3: 5, 4: 1, 5: 1, 6: 3, 7: 1, 8: 2, 9: 1}

how to subtract one list from another including duplicates

I have 2 lists
One is a big list with some elements having duplicates
super_set_list = [1,1,2,3,3,4,4,4,5,6,7,8,9]
The other is a subset of the big list, also with duplicates
sub_set_list = [1,2,3,3,4,4,6,7,9]
I want the difference, like this
diff = [1,4,5,8]
Not sure how I would go about this
You can use a Counter (note that super_set_list in this example has three 1s, one more than in the question, which is why the result contains 1 twice)
super_set_list = [1,1,1,2,3,3,4,4,4,5,6,7,8,9]
sub_set_list = [1,2,3,3,4,4,6,7,9]
from collections import Counter
super_counter = Counter(super_set_list)
# super_counter == Counter({1: 3, 4: 3, 3: 2, 2: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1})
For every element in sub_set_list, reduce the count in super_counter
for item in sub_set_list:
    super_counter[item]-=1
Now super_counter = Counter({1: 2, 4: 1, 5: 1, 8: 1, 2: 0, 3: 0, 6: 0, 7: 0, 9: 0})
Finally, just pick the elements that have some count left (adding each one as many times as its remaining count).
diff=[]
for k,v in super_counter.items():
    for _ in range(v):
        diff.append(k)
print(diff)
# [1, 1, 4, 5, 8]
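The same idea can be written more compactly, since Counter supports subtraction directly and drops entries whose count falls to zero or below. A short sketch using the lists exactly as given in the question (the name `remaining` is just for illustration):

```python
from collections import Counter

super_set_list = [1, 1, 2, 3, 3, 4, 4, 4, 5, 6, 7, 8, 9]
sub_set_list = [1, 2, 3, 3, 4, 4, 6, 7, 9]

# Counter subtraction keeps only the positive counts.
remaining = Counter(super_set_list) - Counter(sub_set_list)

# elements() repeats each key by its remaining count; sorting makes
# the output order explicit.
diff = sorted(remaining.elements())
print(diff)  # [1, 4, 5, 8]
```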
You can loop through the sub-set list and remove each item from the super-set list one by one, as follows:
super_set_list = [1,1,2,3,3,4,4,4,5,6,7,8,9]
sub_set_list = [1,2,3,3,4,4,6,7,9]
for item in sub_set_list:
    if item in super_set_list:
        super_set_list.remove(item)
print(super_set_list)

How to efficiently count each element in a list in Python? [duplicate]

Given an unordered list of values like
a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]
How can I get the frequency of each value that appears in the list, like so?
# `a` has 4 instances of `1`, 4 of `2`, 2 of `3`, 1 of `4,` 2 of `5`
b = [4, 4, 2, 1, 2] # expected output
In Python 2.7 (or newer), you can use collections.Counter:
>>> import collections
>>> a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]
>>> counter = collections.Counter(a)
>>> counter
Counter({1: 4, 2: 4, 5: 2, 3: 2, 4: 1})
>>> counter.values()
dict_values([2, 4, 4, 1, 2])
>>> counter.keys()
dict_keys([5, 1, 2, 4, 3])
>>> counter.most_common(3)
[(1, 4), (2, 4), (5, 2)]
>>> dict(counter)
{5: 2, 1: 4, 2: 4, 4: 1, 3: 2}
>>> # Get the counts in order matching the original specification,
>>> # by iterating over keys in sorted order
>>> [counter[x] for x in sorted(counter.keys())]
[4, 4, 2, 1, 2]
If you are using Python 2.6 or older, you can download an implementation here.
If the list is sorted, you can use groupby from the itertools standard library (if it isn't, you can just sort it first, although this takes O(n lg n) time):
from itertools import groupby
a = [5, 1, 2, 2, 4, 3, 1, 2, 3, 1, 1, 5, 2]
[len(list(group)) for key, group in groupby(sorted(a))]
Output:
[4, 4, 2, 1, 2]
Python 2.7+ introduces dictionary comprehensions. Building the dictionary from the list gets you the counts and gets rid of duplicates. Note that a.count(x) rescans the whole list for every element, so this takes O(n^2) time.
>>> a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
>>> d = {x:a.count(x) for x in a}
>>> d
{1: 4, 2: 4, 3: 2, 4: 1, 5: 2}
>>> a, b = d.keys(), d.values()
>>> a
[1, 2, 3, 4, 5]
>>> b
[4, 4, 2, 1, 2]
Count the number of appearances manually by iterating through the list and counting them up, using a collections.defaultdict to track what has been seen so far:
from collections import defaultdict
appearances = defaultdict(int)
for curr in a:
    appearances[curr] += 1
In Python 2.7+, you could use collections.Counter to count items
>>> a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
>>>
>>> from collections import Counter
>>> c=Counter(a)
>>>
>>> c.values()
[4, 4, 2, 1, 2]
>>>
>>> c.keys()
[1, 2, 3, 4, 5]
Counting the frequency of elements is probably best done with a dictionary:
b = {}
for item in a:
    b[item] = b.get(item, 0) + 1
To remove the duplicates, use a set:
a = list(set(a))
You can do this:
import numpy as np
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
np.unique(a, return_counts=True)
Output:
(array([1, 2, 3, 4, 5]), array([4, 4, 2, 1, 2], dtype=int64))
The first array is values, and the second array is the number of elements with these values.
So If you want to get just array with the numbers you should use this:
np.unique(a, return_counts=True)[1]
Here's another succinct alternative using itertools.groupby which also works for unordered input:
from itertools import groupby
items = [5, 1, 1, 2, 2, 1, 1, 2, 2, 3, 4, 3, 5]
results = {value: len(list(freq)) for value, freq in groupby(sorted(items))}
results
Format: {value: num_of_occurrences}
{1: 4, 2: 4, 3: 2, 4: 1, 5: 2}
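I would simply use scipy.stats.itemfreq in the following manner (note that itemfreq was deprecated in SciPy 1.0 and is no longer available in recent releases; np.unique(a, return_counts=True) is the suggested replacement):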
from scipy.stats import itemfreq
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
freq = itemfreq(a)
a = freq[:,0]
b = freq[:,1]
you may check the documentation here: http://docs.scipy.org/doc/scipy-0.16.0/reference/generated/scipy.stats.itemfreq.html
import numpy as np
import pandas as pd
from collections import Counter
a=["E","D","C","G","B","A","B","F","D","D","C","A","G","A","C","B","F","C","B"]
counter=Counter(a)
kk=[list(counter.keys()),list(counter.values())]
pd.DataFrame(np.array(kk).T, columns=['Letter','Count'])
seta = set(a)
b = [a.count(el) for el in seta]
a = list(seta) #Only if you really want it.
Suppose we have a list:
fruits = ['banana', 'banana', 'apple', 'banana']
We can find out how many of each fruit we have in the list like so:
import numpy as np
(unique, counts) = np.unique(fruits, return_counts=True)
{x:y for x,y in zip(unique, counts)}
Result:
{'banana': 3, 'apple': 1}
This answer is more explicit
a = [1,1,1,1,2,2,2,2,3,3,3,4,4]
d = {}
for item in a:
    if item in d:
        d[item] = d.get(item)+1
    else:
        d[item] = 1
for k,v in d.items():
    print(str(k)+':'+str(v))
# output
#1:4
#2:4
#3:3
#4:2
#remove dups
d = set(a)
print(d)
#{1, 2, 3, 4}
For your first question, iterate over the list and use a dictionary to keep track of each element's existence.
For your second question, just use the set operator.
def frequencyDistribution(data):
    return {i: data.count(i) for i in data}
print(frequencyDistribution([1,2,3,4]))
...
{1: 1, 2: 1, 3: 1, 4: 1} # originalNumber: count
I am quite late, but this will also work, and will help others:
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
freq_list = []
a_l = list(set(a))
for x in a_l:
    freq_list.append(a.count(x))
print('Freq', freq_list)
print('number', a_l)
will produce this:
Freq [4, 4, 2, 1, 2]
number [1, 2, 3, 4, 5]
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
counts = dict.fromkeys(a, 0)
for el in a: counts[el] += 1
print(counts)
# {1: 4, 2: 4, 3: 2, 4: 1, 5: 2}
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
# 1. Get counts and store in another list
output = []
for i in set(a):
    output.append(a.count(i))
print(output)
# 2. Remove duplicates using set constructor
a = list(set(a))
print(a)
A set does not allow duplicates, so passing a list to the set() constructor gives an iterable of unique objects. The list's count() method returns how many times a given object occurs in the list. Each unique object is counted this way, and each count is appended to the initially empty list output.
The list() constructor then converts set(a) back into a list, which is assigned to the same variable a.
Output
[4, 4, 2, 1, 2]
[1, 2, 3, 4, 5]
Simple solution using a dictionary.
def frequency(l):
    d = {}
    for i in l:
        if i in d.keys():
            d[i] += 1
        else:
            d[i] = 1
    for k, v in d.items():
        if v ==max (d.values()):
            return k,d.keys()
print(frequency([10,10,10,10,20,20,20,20,40,40,50,50,30]))
#!/usr/bin/python
def frq(words):
    freq = {}
    for w in words:
        if w in freq:
            freq[w] = freq.get(w)+1
        else:
            freq[w] =1
    return freq
fp = open("poem","r")
text = fp.read()  # renamed from `list` to avoid shadowing the built-in
fp.close()
words = text.split()  # renamed from `input` to avoid shadowing the built-in
print(words)
d = frq(words)
print("frequency of input\n: ")
print(d)
fp1 = open("output.txt","w+")
for k,v in d.items():
    fp1.write(str(k)+':'+str(v)+"\n")
fp1.close()
from collections import OrderedDict
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
def get_count(lists):
    dictionary = OrderedDict()
    for val in lists:
        dictionary.setdefault(val,[]).append(1)
    return [sum(val) for val in dictionary.values()]
print(get_count(a))
>>>[4, 4, 2, 1, 2]
To remove duplicates and Maintain order:
list(dict.fromkeys(get_count(a)))
>>>[4, 2, 1]
I'm using Counter to generate a frequency dict from the words of a text file in one line of code
import re
from collections import Counter
def _fileIndex(fh):
    ''' create a dict using Counter of a
    flat list of words (re.findall(re.compile(r"[a-zA-Z]+"), lines)) in (lines in file->for lines in fh)
    '''
    return Counter(
        [wrd.lower() for wrdList in
         [words for words in
          [re.findall(re.compile(r'[a-zA-Z]+'), lines) for lines in fh]]
         for wrd in wrdList])
For the record, a functional answer:
>>> L = [1,1,1,1,2,2,2,2,3,3,4,5,5]
>>> import functools
>>> functools.reduce(lambda acc, e: [v+(i==e) for i, v in enumerate(acc,1)] if e<=len(acc) else acc+[0 for _ in range(e-len(acc)-1)]+[1], L, [])
[4, 4, 2, 1, 2]
It's cleaner if you count zeroes too:
>>> functools.reduce(lambda acc, e: [v+(i==e) for i, v in enumerate(acc)] if e<len(acc) else acc+[0 for _ in range(e-len(acc))]+[1], L, [])
[0, 4, 4, 2, 1, 2]
An explanation:
we start with an empty acc list;
if the next element e of L is lower than the size of acc, we just update this element: v+(i==e) means v+1 if the index i of acc is the current element e, otherwise the previous value v;
if the next element e of L is greater or equals to the size of acc, we have to expand acc to host the new 1.
The elements do not have to be sorted (itertools.groupby). You'll get weird results if you have negative numbers.
Another approach of doing this, albeit by using a heavier but powerful library - NLTK.
import nltk
fdist = nltk.FreqDist(a)
fdist.values()
fdist.most_common()
Found another way of doing this, using sets.
#ar is the list of elements
#convert ar to set to get unique elements
sock_set = set(ar)
#create dictionary of frequency of socks
sock_dict = {}
for sock in sock_set:
    sock_dict[sock] = ar.count(sock)
For an unordered list you should use:
[a.count(el) for el in set(a)]
The output is
[4, 4, 2, 1, 2]
Yet another solution with another algorithm, without using collections (it assumes every value is a non-negative integer smaller than len(A)):
def countFreq(A):
    n=len(A)
    count=[0]*n # Create a new list initialized with '0'
    for i in range(n):
        count[A[i]]+= 1 # increase occurrence for value A[i]
    return [x for x in count if x] # return non-zero count
num=[3,2,3,5,5,3,7,6,4,6,7,2]
print ('\nelements are:\t',num)
count_dict={}
for elements in num:
    count_dict[elements]=num.count(elements)
print ('\nfrequency:\t',count_dict)
You can use the built-in list.count() method
l.count(l[i])
d=[]
for i in range(len(l)):
    if l[i] not in d:
        d.append(l[i])
        print(l.count(l[i]))
The above code skips duplicates (d ends up holding each distinct element once) and prints the frequency of each element of the original list.
Two birds for one shot ! X D
This approach can be tried if you don't want to use any library and keep it simple and short!
a = [1,1,1,1,2,2,2,2,3,3,4,5,5]
marked = []
b = [(a.count(i), marked.append(i))[0] for i in a if i not in marked]
print(b)
o/p
[4, 4, 2, 1, 2]

Reducing execution time Python

given that I have a list like this (river discharge tree):
String type (11 elements).
From;To
1;2
2;3
5;4
3;4
4;-999
9;6
6;5
10;7
8;7
7;5
If you imagine it as a tree, it should be like (direction from top to bottom):
1    9   8 10
|    |    \/
2    6    7
|     \  /
3      5
|     /
4
|
I just want to expand the list so I would have all the combinations like
From;To
1;2
2;3
5;4
3;4
4;-999
9;6
6;5
10;7
8;7
7;5
1;3
1;4
6;4
9;4
9;5
7;4
8;4
8;5
10;5
10;4
There must be a connection in the tree, and the order must be from top to bottom.
What is the best way to do this?
I wrote code for this, but it would take about 6 days to run for 6000 rows.
should_restart = False
for b in range(1, lengthchange):
row1 = str(tree[b])
position2 = row1.find(delimeter)
position3 = row1.find(end)
from1 = (row1[0:position2])
to1 = row1[position2+1:position3]
for c in range(1, lengthchange):
row2 = str(tree[c])
position4 = row2.find(delimeter)
position5 = row2.find(end)
from2 = (row2[0:position4])
to2 = row2[position4+1:position5]
if to1 == from2 and not to2 == "-999":
append1 = str(from1)+";"+str(to2)+"\n"
seen = set(tree)
if append1 not in seen:
seen.add(append1)
tree.append(append1)
should_restart = True
count_test = count_test+1
print(count_test)
lengthchange = len(tree)
Could you check my code and give me some advice?
Thank you very much!
So the key to doing this efficiently is ensuring we don't have to revisit nodes over and over again. We can do this by starting with the output and working our way back:
crivers = rivers[:] # copy the rivers list, as this process is destructive
ckeys = set(river.split(";")[0] for river in crivers) # make a set for O(1) lookup
result = {}
while crivers:
    for river in crivers[:]:
        key, value = river.split(";")
        if value in ckeys:
            continue # skip rivers that are flowing into unprocessed rivers
        result[int(key)] = [int(value)] + result.get(int(value), [])
        ckeys.remove(key)
        crivers.remove(river)
If the rivers list is sorted properly, this is O(n), if it's not sorted (or, in the worst case, reverse sorted), it's O(n**2). "Sorted properly", in this case, means they are sorted from root to leaf in our upside down tree... as our processing order is: 4, 5, 3, 6, 7, 2, 9, 10, 8, 1
The final result is:
{1: [2, 3, 4, -999],
2: [3, 4, -999],
3: [4, -999],
4: [-999],
5: [4, -999],
6: [5, 4, -999],
7: [5, 4, -999],
8: [7, 5, 4, -999],
9: [6, 5, 4, -999],
10: [7, 5, 4, -999]}
Which can be converted to your final format via:
fmt_lst = []
for key in result:
    for val in result[key]:
        fmt_lst.append("%s;%s" % (key, val))
['1;2', '1;3', '1;4', '1;-999',
'2;3', '2;4', '2;-999',
'3;4', '3;-999',
'4;-999',
'5;4', '5;-999',
'6;5', '6;4', '6;-999',
'7;5', '7;4', '7;-999',
'8;7', '8;5', '8;4', '8;-999',
'9;6', '9;5', '9;4', '9;-999',
'10;7', '10;5', '10;4', '10;-999']
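As a variation on the answer above, here is a sketch that avoids depending on the input order by first building a from -> to lookup and then reusing already resolved downstream paths. The function name `expand_downstream` is hypothetical, and the sketch assumes each node has exactly one outflow and that the network has no cycles.

```python
def expand_downstream(rivers):
    """Map each node to the full list of nodes downstream of it."""
    # from -> to lookup; assumes one "from;to" string per node.
    downstream = {}
    for river in rivers:
        src, dst = river.split(";")
        downstream[int(src)] = int(dst)

    result = {}
    for start in downstream:
        # Walk downstream until reaching the outlet or a resolved node.
        pending = []
        node = start
        while node in downstream and node not in result:
            pending.append(node)
            node = downstream[node]
        # Unwind, building each path from the one directly below it.
        while pending:
            node = pending.pop()
            nxt = downstream[node]
            result[node] = [nxt] + result.get(nxt, [])
    return result

rivers = ["1;2", "2;3", "5;4", "3;4", "4;-999",
          "9;6", "6;5", "10;7", "8;7", "7;5"]
paths = expand_downstream(rivers)
print(paths[8])
```

Each node is resolved once, so the loop itself does not rescan the list; the remaining cost is copying the path lists, which grows with the depth of the tree. The explicit `pending` stack is used instead of recursion so that long chains do not hit Python's recursion limit.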
