Identify a single difference in a python list - python

I need some help with part of my code.
I have some Python lists, for example:
list1 = (1,1,1,1,1,1,5,1,1,1)
list2 = (6,7,4,4,4,1,6,7,6)
list3 = (8,8,8,8,9)
For each list, I would like to know whether there is a single value that differs from all the others, if and only if all of those other values are the same. For example, in list1 it would identify "5" as the different value, in list2 it would identify nothing because there are more than two distinct values, and in list3 it would identify "9".
What I have done so far is:
for i in list1:
    if list1.count(i) == len(list1) - 1:
        print("One value identified")
The problem is that I get "One value identified" as many times as "1" is present in my list...
What I would like to have is output like this:
The most represented value, which occurs len(list1)-1 times (here "1")
The value that is present only once (here "5")
The position in the list where the "5" is

You could use something like that:
def odd_one_out(lst):
    s = set(lst)
    if len(s) != 2:  # see comment (1)
        return False
    else:
        return any(lst.count(x) == 1 for x in s)  # see comment (2)
which for the examples you provided, yields:
print(odd_one_out(list1)) # True
print(odd_one_out(list2)) # False
print(odd_one_out(list3)) # True
To explain the code I would use the first example list you provided [1,1,1,1,1,1,5,1,1,1].
(1) Converting to a set removes all the duplicate values from your list, leaving you with {1, 5} (in no specific order). If the length of this set is anything other than 2, your list does not fulfill your requirements, so False is returned.
(2) Assuming the set does have a length of 2, what we need to check next is that at least one of the values it contains appears only once in the original list. That is what the any call does.

You can use Counter from the collections module in the standard library:
from collections import Counter
def is_single_diff(iterable):
    c = Counter(iterable)
    # values that occur more than once
    non_single_items = [x for x in c if c[x] > 1]
    return len(non_single_items) == 1
Tests:
list1 = (1,1,1,1,1,1,5,1,1,1)
list2 = (6,7,4,4,4,1,6,7,6)
list3 = (8,8,8,8,9)
In: is_single_diff(list1)
Out: True
In: is_single_diff(list2)
Out: False
In: is_single_diff(list3)
Out: True
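Note that is_single_diff only checks that exactly one value is repeated, so an input such as (1, 1, 1, 5, 6) also returns True. A stricter sketch that also reports the three pieces of information the question asks for might look like this. The name find_odd_one_out and the choice to return None when there is no single odd value are assumptions made for illustration, not part of the answer above.

```python
from collections import Counter

def find_odd_one_out(seq):
    """Return (common_value, odd_value, odd_index), or None if seq is not
    one repeated value plus exactly one different value."""
    counts = Counter(seq)
    if len(counts) != 2:
        return None  # all equal, or more than two distinct values
    # most_common() sorts by count, highest first
    (common, n_common), (odd, n_odd) = counts.most_common()
    if n_odd != 1 or n_common < 2:
        return None  # e.g. (1, 1, 5, 5) or the ambiguous (1, 2)
    return common, odd, list(seq).index(odd)

print(find_odd_one_out((1, 1, 1, 1, 1, 1, 5, 1, 1, 1)))  # (1, 5, 6)
print(find_odd_one_out((6, 7, 4, 4, 4, 1, 6, 7, 6)))     # None
print(find_odd_one_out((8, 8, 8, 8, 9)))                 # (8, 9, 4)
```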

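Use numpy.unique; with return_counts=True it returns the distinct values and how often each one occurs, which is all the information you need.
import numpy as np
myarray = np.array([1,1,1,1,1,1,5,1,1,1])
vals_unique, vals_counts = np.unique(myarray, return_counts=True)
# vals_unique -> array([1, 5]), vals_counts -> array([9, 1])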

You can first check for the most common value. After that, go through the list to see if there is a different value, and keep track of it.
If you later find another value that isn't the same as the most common one, the list does not have a single difference.
list1 = [1,1,1,1,1,1,5,1,1,1]
def single_difference(lst):
    most_common = max(set(lst), key=lst.count)
    diff_idx = None
    diff_val = None
    for idx, i in enumerate(lst):
        if i != most_common:
            if diff_val is not None:
                return "No unique single difference"
            diff_idx = idx
            diff_val = i
    return (most_common, diff_val, diff_idx)
print(single_difference(list1))

Related

How to remove a value if its repeating more than once in a json?

I have a json like below
a = {"infinity_war":["Stark", "Hulk", "Rogers", "Thanos"],
     "end_game":["Stark", "Dr.Strange", "Peter"]}
Since the name "Stark" appears more than once in the whole JSON, I need to keep only its first occurrence and remove the others. I tried using pandas, but it needs all the lists to have the same length. Is there another way? The result I need is
a = {"infinity_war":["Stark", "Hulk", "Rogers", "Thanos"],
     "end_game":["Dr.Strange", "Peter"]}
You can use a simple loop and a set to keep track of the seen elements:
seen = set()
b = {}
for k,l in a.items():
    b[k] = [x for x in l if not (x in seen or seen.add(x))]
output:
{'infinity_war': ['Stark', 'Hulk', 'Rogers', 'Thanos'],
'end_game': ['Dr.Strange', 'Peter']}
How it works:
For each key/list pair, iterate over the elements of the list. If an element is already in the seen set, skip it; otherwise add it to the seen set and to the new list.
seen.add(x) is always falsy because set.add returns None, so (x in seen or seen.add(x)) has the boolean value of x in seen (with the side effect of recording x), which we invert with not.
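If the one-line trick is hard to read, the same idea can be written as an explicit loop. This is only a sketch of the equivalent logic; the function name dedupe_across_lists is made up for illustration, and it relies on dicts keeping insertion order (Python 3.7 or later) to decide which occurrence counts as the first.

```python
def dedupe_across_lists(data):
    """Keep only the first occurrence of each value across all lists of a dict."""
    seen = set()
    result = {}
    for key, values in data.items():
        kept = []
        for value in values:
            if value not in seen:
                seen.add(value)      # remember the value
                kept.append(value)   # and keep this first occurrence
        result[key] = kept
    return result

a = {"infinity_war": ["Stark", "Hulk", "Rogers", "Thanos"],
     "end_game": ["Stark", "Dr.Strange", "Peter"]}
print(dedupe_across_lists(a))
```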

Python function that takes a list and returns a new list with unique elements of the first list

I'm trying to solve this problem by using this code
def unique_list(numbers):
    unique = []
    for item in numbers :
        if item in unique == False:
            unique.append(item)
    return unique
But every time I call this function, I get an empty list.
Can somebody help this beginner? I don't understand where I'm going wrong.
As Oksana mentioned, use list(set(numbers)) (note that a set does not preserve the original order).
As for your code, change if item in unique == False to if item not in unique. But note that this code is slower, since it has to scan the list for every new element that it tries to add:
def unique_list(numbers):
    unique = []
    for item in numbers :
        if item not in unique:
            unique.append(item)
    return unique
print(unique_list([1, 2, 3, 1, 2]))
# [1, 2, 3]
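list(set(numbers)) does not guarantee that the elements come back in their original order. If the order matters and the elements are hashable, one common alternative is dict.fromkeys, assuming Python 3.7 or later, where dicts keep insertion order. The function name below is only illustrative.

```python
def unique_list_ordered(numbers):
    # dict keys are unique and remember the order in which they were first seen
    return list(dict.fromkeys(numbers))

print(unique_list_ordered([3, 1, 3, 2, 1]))  # [3, 1, 2]
```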
Since a set contains only unique values, we can use it to get your answer.
Use this function:
def unique_list(numbers):
    x = set(numbers)  # unique numbers in a set
    y = list(x)       # convert the set to a list, since you want a list as output
    print(y)
Example:
unique_list([2,2,3,3,3])
The output will be a list of unique values:
[2, 3]
Edit:
As pointed out by DeepSpace below, my original explanation was wrong: the expression is evaluated neither as (item in unique) == False nor as item in (unique == False).
It is caused by comparison chaining. In the line:
if item in unique == False:
both in and == are comparison operators, so Python chains them, just as it does for a < b < c.
So, that line becomes
if (item in unique) and (unique == False):
Since unique is a list, unique == False is never true, so the if block is never executed! To fix it, you can wrap item in unique in parentheses, such as:
if (item in unique) == False:
BUT, there's a very useful data structure in Python that serves your purpose: set. A set holds only unique values, and you can create one from an existing list! So, your function can be rewritten as:
def unique_list(numbers):
    return list(set(numbers)) # converts your original list into a set (with only unique values) and back into a list
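The difference between the chained, parenthesized, and idiomatic forms of the condition can be checked directly. A minimal sketch with made-up values (item and unique here are just sample data):

```python
item = 1
unique = []

# in and == are both comparison operators, so they chain:
chained = item in unique == False
# ...which means the same as this explicit form:
explicit = (item in unique) and (unique == False)
# Parentheses force the comparison the question intended:
parenthesized = (item in unique) == False
# The idiomatic spelling:
idiomatic = item not in unique

print(chained, explicit, parenthesized, idiomatic)  # False False True True
```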
We cannot check whether item is in unique by comparing with False like that; use not in instead. Try this:
if item not in unique:
    unique.append(item)
def unique_list(lst):
    a = set(lst)
    return list(a)

Sort a list with certain values staying in constant positions

I have a list of strings. I want to only sort values that meet a certain condition. Consider this list
['foo','bar','testa','python','java','abc']
and I only want to sort the values with an a in them. The result should look like this
['foo','abc','bar','python','java','testa']
The elements with a will change places appropriately, but the other elements retain their original positions.
I have absolutely no idea how to implement this, so I hope someone else does. Can someone show me how to do this?
y = sorted(w for w in x if 'a' in w) # pick and sort only the elements with 'a'
x = [w if 'a' not in w else y.pop(0) for w in x]
The last line leaves words without an 'a' in them unchanged, while those with an 'a' are taken one at a time from the y list (which is already sorted).
EDIT:
@MartijnPieters' solution performs better, since it uses an iterator instead of repeatedly popping from the front of the list (each y.pop(0) has to shift the remaining elements).
y = iter(sorted(w for w in x if 'a' in w)) # iterate over the sorted matches
x = [w if 'a' not in w else next(y) for w in x] # take the next match from the iterator instead of popping from a list
Since it looks like you need this algorithm to work with different conditions, you could put it into a function:
x = ['foo','bar','testa','python','java','abc']
def conditional_sort(ls, f):
    y = iter(sorted(w for w in ls if f(w)))
    return [w if not f(w) else next(y) for w in ls]
conditional_sort(x, lambda w: 'a' in w)
The first parameter would be the list, the second one a function that takes a single parameter and returns a bool value.
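As a quick check of that idea, the sketch below repeats the function and calls it with two different predicates. The second predicate (words longer than three characters) is just an example chosen here, not something from the question.

```python
def conditional_sort(ls, f):
    # sort only the elements for which f is true...
    y = iter(sorted(w for w in ls if f(w)))
    # ...and put them back into the positions where f is true
    return [w if not f(w) else next(y) for w in ls]

x = ['foo', 'bar', 'testa', 'python', 'java', 'abc']

print(conditional_sort(x, lambda w: 'a' in w))
# ['foo', 'abc', 'bar', 'python', 'java', 'testa']

print(conditional_sort(x, lambda w: len(w) > 3))
# ['foo', 'bar', 'java', 'python', 'testa', 'abc']
```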
Find the elements with an a; mark the positions and pull them out.
orig = ['foo','bar','testa','python','java','abc']
just_a = [s for s in orig if 'a' in s]
mark = ['a' in s for s in orig]
This gives us
just_a = ['bar', 'testa', 'java', 'abc']
mark = [False, True, True, False, True, True]
Sort just_a into sort_a. Now, build your result: where there's True in mark, take the next item in the sorted list; otherwise, take the original element.
sort_a = sorted(just_a)
result = []
for pos in range(len(orig)):
    if mark[pos]:
        result.append(sort_a.pop(0))  # take from the front to keep sorted order
    else:
        result.append(orig[pos])
This can be done with much less code. Among other things, this last loop can be done with a list comprehension. This code merely clarifies the process.
A possible approach would be to:
Extract all values with an 'a' in them and note their positions.
Sort the values alphabetically (see this post).
Insert the sorted values into the original list.
This can definitely be simplified, but here's one way of doing it:
def custom_sort(lst):
    sorted_list = [x for x in lst if 'a' in x]  # get a list of everything with an a in it
    sorted_list.sort()  # sort this list of elements containing a
    final_list = []  # make an empty list; we will fill this with what we need
    sorted_counter = 0  # counter to keep track of which elements containing a have been used below
    for x in lst:  # loop over the original list
        if 'a' in x:  # if that element of the original list contains an a
            final_list.append(sorted_list[sorted_counter])  # then we take the next element from our sorted list
            sorted_counter += 1  # increment counter
        else:  # otherwise
            final_list.append(x)  # we add the element from our original list
    return final_list  # return the custom sorted list
I would just use two additional lists, one to keep track of the indices of the words with 'a' and one to hold the sorted words:
L=['foo','bar','testa','python','java','abc']
M=[]
count=[]
for i, t in enumerate(L):
    if 'a' in t:
        M.append(t) # remember both the word and its index
        count.append(i) # enumerate gives the right index even for repeated words
M=sorted(M)
for pos, idx in enumerate(count):
    L[idx]=M[pos]
L
It is probably not very efficient, but it works.

check for duplicates in a python list

I've seen a lot of variations of this question from things as simple as remove duplicates to finding and listing duplicates. Even trying to take bits and pieces of these examples does not get me my result.
My question is how am I able to check if my list has a duplicate entry? Even better, does my list have a non-zero duplicate?
I've had a few ideas -
#empty list
myList = [None] * 9
#all the elements in this list are None
#fill part of the list with some values
myList[0] = 1
myList[3] = 2
myList[4] = 2
myList[5] = 4
myList[7] = 3
#coming from C, I attempt to use a nested for loop
j = 0
k = 0
for j in range(len(myList)):
    for k in range(len(myList)):
        if myList[j] == myList[k]:
            print "found a duplicate!"
            return
If this worked, it would find the duplicate (None) in the list. Is there a way to ignore the None or 0 case? I do not care if two elements are 0.
Another solution I thought of was turn the list into a set and compare the lengths of the set and list to determine if there is a duplicate but when running set(myList) it not only removes duplicates, it orders it as well. I could have separate copies, but it seems redundant.
Try changing the actual comparison line to this (the j != k test keeps an element from being compared with itself):
if j != k and myList[j] == myList[k] and myList[j] not in [None, 0]:
I'm not certain whether you are trying to ascertain whether a duplicate exists or to identify the items that are duplicated (if any). Here is a Counter-based solution for the latter:
# Python 2.7+ (collections.Counter was added in 2.7)
from collections import Counter
#
# Rest of your code
#
counter = Counter(myList)
dupes = [key for (key, value) in counter.items() if value > 1 and key]
print(dupes)
The Counter object will automatically count occurrences of each item in your iterable. The list comprehension that builds dupes filters out all items appearing only once, and also items that evaluate to False in a boolean context (this filters out both 0 and None).
If your purpose is only to detect that duplication has taken place (without enumerating which items were duplicated), you could use the same method and test dupes:
if dupes: print("Something in the list is duplicated")
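Because the "and key" test drops every falsy key (including '' and False), a variant that excludes only chosen values might look like the following Python 3 sketch. The function name find_duplicates and the ignore parameter are assumptions for illustration. Note that the membership test uses equality, so with the default setting False is still ignored because False == 0.

```python
from collections import Counter

def find_duplicates(items, ignore=(None, 0)):
    """Return the values that occur more than once, skipping those in ignore."""
    counts = Counter(items)
    return [value for value, n in counts.items()
            if n > 1 and value not in ignore]

myList = [1, None, None, 2, 2, 4, None, 3, None]
print(find_duplicates(myList))           # [2]
print(find_duplicates(['', '', 0, 0]))   # ['']
```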
If you simply want to check whether it contains duplicates: once the function finds an element that occurs more than once, it returns 'Duplicate'.
my_list = [1, 2, 2, 3, 4]
def check_list(arg):
    for i in arg:
        if arg.count(i) > 1:
            return 'Duplicate'
print(check_list(my_list) == 'Duplicate') # prints True
To remove dups and keep order while leaving 0 and None alone (other falsy values such as '' or False are treated the same way; if you do not want that, test for None and 0 explicitly instead of using not ele):
print([ele for ind, ele in enumerate(lst) if ele not in lst[:ind] or not ele])
If you just want the first dup:
for ind, ele in enumerate(lst[:-1]):
    if ele in lst[ind+1:] and ele:
        print(ele)
        break
Or store seen in a set:
seen = set()
for ele in lst:
    if ele in seen:
        print(ele)
        break
    if ele:
        seen.add(ele)
You can use collections.defaultdict and specify a condition, such as non-zero / Truthy, and specify a threshold. If the count for a particular value exceeds the threshold, the function will return that value. If no such value exists, the function returns False.
from collections import defaultdict
def check_duplicates(it, condition, thresh):
    dd = defaultdict(int)
    for value in it:
        dd[value] += 1
        if condition(value) and dd[value] > thresh:
            return value
    return False
L = [1, None, None, 2, 2, 4, None, 3, None]
res = check_duplicates(L, condition=bool, thresh=1) # 2
Note in the above example the function bool will not consider 0 or None for threshold breaches. You could also use, for example, lambda x: x != 1 to exclude values equal to 1.
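In my opinion, this is the simplest solution I could come up with. It compares every pair of elements, so it should work with any list. The only downside is that it does not count the number of duplicates, but instead just returns True or False.
from itertools import combinations
def has_duplicate(mylist):
    return any(k == j for k, j in combinations(mylist, 2))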
Here's a bit of code that will show you how to remove None and 0 from the sets.
l1 = [0, 1, 1, 2, 4, 7, None, None]
l2 = set(l1)
l2.remove(None)
l2.remove(0)

python intersect of dict items

Suppose I have a dict like:
aDict[1] = '3,4,5,6,7,8'
aDict[5] = '5,6,7,8,9,10,11,12'
aDict[n] = '5,6,77,88'
The keys are arbitrary, and there could be any number of them. I want to consider every value in the dictionary.
I want to treat each string as comma-separated values, and find the intersection across the entire dictionary (the elements common to all dict values). So in this case the answer would be '5,6'. How can I do this?
from functools import reduce # if Python 3
reduce(lambda x, y: x.intersection(y), (set(x.split(',')) for x in aDict.values()))
First of all, you need to convert these strings to real lists.
l1 = '3,4,5,6,7,8'.split(',')
Then you can use sets to do the intersection (l2 and l3 are built the same way from the other values).
result = set(l1) & set(l2) & set(l3)
Python sets are ideal for that task. Consider the following:
intersections = None
for value in aDict.values():
    temp = set(int(num) for num in value.split(","))
    if intersections is None:
        intersections = temp
    else:
        intersections = intersections.intersection(temp)
print(intersections)
result = None
for csv_list in aDict.values():
    aList = csv_list.split(',')
    if result is None:
        result = set(aList)
    else:
        result = result & set(aList)
print(result)
Since set.intersection() accepts any number of sets, you can make do without any use of reduce():
set.intersection(*(set(v.split(",")) for v in aDict.values()))
Note that this version won't work for an empty aDict.
If you are using Python 3, and your dictionary values are bytes objects rather than strings, just split at b"," instead of ",".
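Putting the pieces together, here is a small end-to-end sketch that also turns the result back into the comma-separated string asked for in the question. The sample key 9, the function name common_values, and the choice to sort the result numerically are assumptions; the empty-dict case is handled by returning an empty string.

```python
def common_values(a_dict):
    """Intersect the comma-separated values of a dict and return them as a string."""
    sets = [set(v.split(',')) for v in a_dict.values()]
    if not sets:
        return ''  # set.intersection(*[]) would raise a TypeError
    common = set.intersection(*sets)
    # assumes every element is an integer literal; drop key=int otherwise
    return ','.join(sorted(common, key=int))

aDict = {1: '3,4,5,6,7,8', 5: '5,6,7,8,9,10,11,12', 9: '5,6,77,88'}
print(common_values(aDict))  # 5,6
```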
