python iterate over the two lists while comparing items - python

i have two lists eg x = [1,2,3,4,4,5,6,7,7] y = [3,4,5,6,7,8,9,10], i want to iterate over the two lists while comparing items. For those that match, i would like to call some function and remove them from the lists, in this example i should end up with x= [1,2] and y = [8,9,10]. Sets will not work for this problem because of my type of data and the comparison operator.
for i in x:
for j in y:
if i ==j:
callsomefunction(i,j)
remove i, j from x and y respectively

Edit: After discovering the person asking the question simply didn't know about __hash__ I provided this information in a comment:
To use sets, implement __hash__. So if obj1 == obj2 when obj1.a == obj2.a and ob1.b == obj2.b, __hash__ should be return hash((self.a, self.b)) and your sets will work as expected.
That solved their problem, and they switched to using sets.
The rest of this answer is now obsolete, but it's still correct (but horribly inefficient) so I'll leave it here.
This code does what you want. At the end, newx and newy are the non-overlapping items of x and y specifically.
x = [1,2,3,4,4,5,6,7,7]
y = [3,4,5,6,7,8,9,10]
# you can leave out bad and just compare against
# x at the end if memory is more important than speed
newx, bad, newy = [], [], []
for i in x:
if i in y:
callsomefunction(i)
bad.append(i)
else:
newx.append(i)
for i in y:
if i not in bad:
newy.append(i)
print newx
print newy
However, I know without even seeing your code that this is the wrong way to do this. You can certainly do it with sets, but if you don't want to, that's up to you.

Ok, discard my post, I hadn't seen the point where you mentionned that sets wouldn't work.
Nevertheless, if you're OK with a little work, you might want to use classes so that operators do work as they are expected to.
I think the most "pythonistic" way of doing this is to use sets.
You could then do :
x = set([1,2,3,4,4,5,6,7,7])
y = set([3,4,5,6,7,8,9,10])
for item in x.intersection(y): #y.intersection(x) is fine too.
my_function(item) #You could use my_function(item, item) if that's what your function requires
x.remove(item)
y.remove(item)
I think that sets are also more efficient than lists for this kind of work when it comes down to performance (though this might not be your top priority).
On a sidenote, you could also use:
x,y = x.difference(y), y.difference(x)
This effectively removes items that are in x and y from x and y.

Try this:
for i in x:
if i in y:
callsomefunction(i)
x.remove(i)
y.remove(i)
EDIT: updated answer

how about this:
import itertools
x = [1,2,3,4,4,5,6,7,7]
y = [3,4,5,6,7,8,9,10]
output = map(somefunction, itertools.product(x,y))

Related

what's the difference between filter and comprehention with if?

def anagramwordchecker(z,w):
if sorted([x for x in w])==sorted([x for x in z]):return True
return False
def anagramlistchecker(l,w):
d={}
for x in w:
d.update({x:w.count(x)})
l=list(filter(lambda x:anagramwordchecker(x,w),l))
return l
print(anagramlistchecker(['bcda', 'abce', 'cbda', 'cbea', 'adcb'],'abcd'))
trying to check which words are anagram.
using both of this it will print the same:
l=[x for x in l if anagramwordchecker(x,w)]
l=list(filter(lambda x:anagramwordchecker(x,w),l))
and it will be:
['bcda', 'cbda', 'adcb']
then what's the difference? any advantage using filter? cause comprehension is easier.
If you print the results of the following example, you will know which one is faster (Comments are results I got).
timeit.Timer('''[x for x in range(100) if x % 2 == 0]''' ).timeit(number=100000)
timeit.Timer('''list(filter(lambda x: x % 2 == 0, range(100)))''').timeit(number=100000)
# 0.3664856200000486
# 0.6642515319999802
So in your case, list comprehension would be faster. But let's see the following example.
timeit.Timer('''[x for x in range(100) if x % 2 == 0]''' ).timeit(number=100000)
timeit.Timer('''(x for x in range(100) if x % 2 == 0)''' ).timeit(number=100000)
timeit.Timer('''filter(lambda x: x % 2 == 0, range(100))''').timeit(number=100000)
# 0.5541256509999357
# 0.024836917000016
# 0.017953075000036733
The results show that casting an iterable to list takes much time and filter is faster than generator expression. So if your result does not really have to be a list, returning an iterable in a timely manner would be better.
As stated in here,
Note that filter(function, iterable) is equivalent to the generator expression (item for item in iterable if function(item)) if function is not None and (item for item in iterable if item) if function is None.
But list comprehension can do much more than simply filtering. If filter is given to the interpreter, it will knows it is a filter function. However, if a list comprehension is given to the interpreter, the interpreter does not know what it really is. After taking some time interpreting the list comprehension to something like a function, it would be a filter or filterfalse function in the end. Or, something else completely different.
filter with not condition can do what filterfalse does. But filterfalse is still there. Why? not operator does not need to be applied.
There is no magic. Human-friendly 1-for-many grammars are based on encapsulation. For them to be machine-executable binaries, they need to be decapsulated back and it takes time.
Go with a specific solution if it is enough than taking a more general solutions. Not only in coding, general solutions are usually for convenience, not for best results.

Is defining a variable using a for loop possible in Python?

If I'd want to do for i in range(10): x+= 1 it obviously wouldn't work since x is undefined, I'd have to declare x e.g. x=0 before trying to write to it, but now I have two lines, one for declaring x=0 and one for actually writing the data I want to x (for i in range(10): x+= 1).
I was wondering, is there a way to do this in a single line? Or more specifically, declaring x as the result of a for loop?
Something along the lines of x = for i in range(10): x+= 1 ?
Would this even be possible or am I asking a nonsense question?
Not with a for statement.
However, you can use a similar construct, such as a list comprehension expression or a generator expression:
x = sum(i for i in range(10))
which is equivalent to just saying
x = sum(range(10))
Edit: as Dougal correctly notes, in your example you increment by 1, so the equivalent is
x = sum(1 for i in range(10))
which is in turn the same as x = len(range(10)) or just x = 10.
In most cases, i would imagine you would just want to use x as the loop variable:
for x in range(10): pass#or range(1,11)
An enumerate would also achieve a similar effect as what you are describing, assuming you want a different variable as the loop variable so as to do something else in the loop:
e.g.
for i,x in enumerate(range(x_start,x_end+1)):
#do something

Set comprehension and different comparable relations

I have set objects which are comparable in some way and I want to remove objects from the set. I thought about how this problem changes, for different comparable relations between the elements. I was interested in the development of the search space, the use of memory and how the problem scales.
1st scenario: In the most simple scenario the relation would be bidirectional and thus we could remove both elements, as long as we can ensure that by removing the elements do not remove other 'partners'.
2nd scenario: The comparable relation is not bidirectional. Remove only the element in question, not the one it is comparable to. A simplified scenario would be the set consisting of integers and the comparable operation would be 'is dividable without rest'
I can do the following, not removing elements:
a_set = set([4,2,3,7,9,16])
def compare(x, y):
if (x % y == 0) and not (x is y):
return True
return False
def list_solution():
out = set()
for x in a_set:
for y in a_set:
if compare(x,y):
out.add(x)
break
result = a_set-out
print(result)
Of course, my first question as a junior Python programmer would be:
What is the appropriate set comprehension for this?
Also: I cannot modify the size of Python sets during iteration, without copying, right?
And now for the algo people out there: How does this problem change if the relation between the number of elements an element can be comparable to increases. How does it change if the comparable relation represent partial orders?
I will precede with confirming your claim - changing set while iterating it would trigger a RuntimeError, which would claim something along the lines of "Set changed size during iteration".
Now, lets start from the compare function: since you are using sets, x is y is probably like x == y and the last one is always a better choice when possible.
Furthermore, no need for the condition; you are already performing one:
def compare (x, y):
return x != y and x % y == 0
Now to the set comprehension - that's a messy one. After setting the set as an argument - which is better than using global variable - the normal code would be something like
for x in my_set:
for y in my_set:
if compare(x, y):
for a in (x, y):
temp.append(a)
notice the last two lines, which do not use unpacking because that would be impossible in the comprehension. Now, all what's left is to move the a to the front and make all : disappear - and the magic happens:
def list_solution (my_set):
return my_set - {a for x in my_set for y in my_set if compare(x, y) for a in (x, y)}
you can test it with something like
my_set = set([4, 2, 3, 7, 9, 16])
print(list_solution(my_set)) # {7}
The condition and the iteration on (x, y) can switch places - but I believe that iterating after confirming would be faster then going in and starting to iterate when there's a chance you wouldn't perform any action.
The change for the second case is minor - merely using x instead of the x, y unpacking:
def list_solution_2 (my_set):
return my_set - {x for x in my_set for y in my_set if compare(x, y)}

Difference In List - Python - Syntax Explanation [duplicate]

This question already has answers here:
Explanation of how nested list comprehension works?
(11 answers)
Closed 6 years ago.
Can someone please explain the meaning the syntax behind the following line of code:
temp3 = [x for x in temp1 if x not in s]
I understand it's for finding the differences between 2 lists, but what does the 'x' represent here? Each individual element in the list that is being compared? I understand that temp1 and s are lists. Also, does x for x have to have the same variable or could it be x for y?
[x for x in temp1 if x not in s]
It may help to re-order it slightly, so you can read the whole thing left to right. Let's move the first x to the end.
[for x in temp1 if x not in s yield x]
I've added a fake yield keyword so it reads naturally as English. If we then add some colons it becomes even clearer.
[for x in temp1: if x not in s: yield x]
Really, this is the order that things get evaluated in. The x variable comes from the for loop, that's why you can refer to it in the if and yield clauses. But the way list comprehensions are written is to put the value being yielding at the front. So you end up using a variable name that's not yet defined.
In fact, this final rewrite is exactly how you'd write an explicit generator function.
def func(temp1, s):
for x in temp1:
if x not in s:
yield x
If you call func(temp1, s) you get a generator equivalent to the list. You could turn it into that list with list(func(temp1, s)).
It iterates through each element in temp1 and checks to see if it is not in s before including it in temp3.
It is a shorter and more pythonic way of writing
temp3 = []
for item in temp1:
if item not in s:
temp3.append(item)
Where temp1 and s are the two lists you are comparing.
As for your second question, x for y will work, but probably not in the way you intend to, and certainly not in a very useful way. It will assign each item in temp1 to the variable name y, and then search for x in the scope outside of the list comprehension. Assuming x is defined previously (otherwise you will get NameError or something similar), the condition if x not in s will evaluate to the same thing for every item in temp1, which is why it’s not terribly useful. And if that condition is true, your resulting temp3 will be populated with xs; the y values are unused.
Do not take this as saying that using different variables in a list comprehension is never useful. In fact list comprehensions like [a if condition(x) else b for x in original_sequence] are often very useful. A list comprehension like [a for x in original_sequence if condition(x)] can also be useful for constructing a list containing exactly as many instances of a as the number of items in original_sequence that satisfy condition().
Try yourself:
arr = [1,2,3]
[x+5 for x in arr]
This should give you [6, 7, 8] that are the values on the [1,2,3] list plus 5. This syntax is know as list comprehension (or mapping). It applies the same instructions to all elements on a list. Would be the same as doing this:
for x in arr:
arr += 5
X is same variable and it is not y. It works same as below code
newList = []
for x in temp1:
if x not in s:
newList.append(x)
So x for x, here first is x which is inside append in code and x after for is same as for x in temp1.

Python nested forloop list comprehension

I'm trying to convert this working nested forloop into a single line list comprehension & i cannot seem to get it to work. The pseudo-code is as follows:
result = []
for x in something:
for y in x.address:
m = re.search("some pattern to match",y)
if m:
result += [m.group(1)]
Any pointers on how do i go about this ?
You'll need a generator expression..
matches = ( re.search(r'some pattern to match', y) for x in something
for y in x.address )
result = [ m.group(1) for m in matches if m ]
Nested loops are not really a problem for list comprehensions, as you can nest those there too:
lst = []
for y in z:
for x in y:
lst.append(f(x))
This translates into the following list comprehension:
[f(x) for y in z for x in y]
And you can easily continue that for multiple levels.
Conditions that decide on whether you want to add something to the list or not also work just fine:
lst = []
for x in y:
if t(x):
lst.append(f(x))
This translated into the following list comprehension with a filter:
[f(x) for x in y if t(x)]
Of course you can also combine that with multiple levels.
Now what is some kind of a problem though is when you want to execute something first, then filter on the result of that and append also something that depends on the result. The naive solution would be to move the function call inside and do it twice:
rexpr = re.compile('some pattern to match')
[rexpr.search(y).group(1) for x in something for y in x.address if rexpr.search(y)]
But this obviously runs the search twice which you generally want to avoid. At this point, you could use some hackish solutions which I generally wouldn’t recommend (as they harm readability). Since your result only depends on the result of the regular expression search, you could also solve this in two steps: First, you search on every element and map them to a match object, and then you filter on those matches and just return the valid ones:
[m.group(1) for m in (rexpr.search(y) for x in something for y in x.address) if m]
Note that I’m using generator expressions here: Those are essentially the same as list comprehensions, but don’t create the full result as a list but only yield on element at a time. So it’s more efficient if you only want to consume this one by one (which is the case here). After all, you’re only interested in the result from the list comprehension, so the comprehension will consume the generator expression.
I would do something like this:
# match function
def match(x):
m = re.search("some pattern to match",x)
if m:
return m.group(1)
else:
return None
#list comprehension
results = [match(y) for x in something for y in x.address if match(y)]

Categories