I am having a hard time trying to understand how the "for" function works.
I want to make a script that only outputs the strings in list2 that are not inside list1. For example:
list1 = ["link1.com", "link2.com", "link3.com"]
list2 = ["link2.com", "link123.com"]
for list2 in list1:
print(list2)
{My intention was that the code printed:
link123.com
But instead it printed the strings from list1}
I can't get it to work. Help would be much appreciated. I am using python 3 by the way.
Use Set for this .
set(list2)-set(list1)
Check with python set
The loop for list2 in list1 is actually an assignment: in each iteration the variable list2 gets the value of the next item in list1 - that is why only the elements of list1 are printed.
You could iterate over the elements of list2 and print, if they are not in list1:
for element in list2:
if element not in list1:
print(element)
Or if you want to use a for loop (note that this isn't very efficient for large lists):
for string in list2:
if not string in list1:
print (string)
For loops
For loops allow you to repeat a piece of code multiple times. You could easily just write a conditional and a print statement multiple times for each item in the list. A for loops allows you to write that piece of code once and have it repeated for each item in the list.
You can iterate over item in list2 by doing the following
for item in list2:
print(item)
item is an arbitrary variable name that holds the value of the current item we are on, what follows in is the list that we want to iterate over. print(item) is the piece of code we want to repeat for each element in the list.
What this does is goes through every item in list2 and prints them but that is not what we want. We want to check to make sure that item is not in list1. This can be achieved through a conditional statement.
if item not in list1:
print(item)
Now we can join the two piece of code together.
for item in list2:
if item not in list1:
print(item)
Sets
Are a collection of items in no order where every item is unique. These sets are the same ones we encounter in mathematics, therefore we can perform mathematical set operations on them.
To go from a list of items to a set we use sList1 = set(list1) sList1 is now of type set and stores the elements in list1. The same can be done for list2.
Now that we have sList1 and sList2 we want to eliminate any duplicates in the two for that we can take the difference of sList1 and sList2 and print them out as follows print(sList2-sList1).
We can do all of this in one step.
print( set(list2) - set(list1) )
The semantic, that python will check if these items are in list1 is NOT part of the for-each-loop.
The 'set' - solution is maybe too advanced for you.
So straightforward you would:
for item in list2:
if item not in list1:
print(item)
I am new at Python, so I'm having trouble with something. I have a few string lists in one list.
list=[ [['AA','A0'],['AB','A0']],
[['AA','B0'],['AB','A0']],
[['A0','00'],['00','A0'], [['00','BB'],['AB','A0'],['AA','A0']] ]
]
And I have to find how many lists have the same element. For example, the correct result for the above list is 3 for the element ['AB','A0'] because it is the element that connects the most of them.
I wrote some code...but it's not good...it works for 2 lists in list,but not for more....
Please,help!
This is my code...for the above list...
for t in range(0,len(list)-1):
pattern=[]
flag=True
pattern.append(list[t])
count=1
rest=list[t+1:]
for p in pattern:
for j in p:
if flag==False:
break
pair= j
for i in rest:
for y in i:
if pair==y:
count=count+1
break
if brojac==len(list):
flag=False
break
Since your data structure is rather complex, you might want to build a recursive function, that is a function that calls itself (http://en.wikipedia.org/wiki/Recursion_(computer_science)).
This function is rather simple. You iterate through all items of the original list. If the current item is equal to the value you are searching for, you increment the number of found objects by 1. If the item is itself a list, you will go through that sub-list and find all matches in that sub-list (by calling the same function on the sub-list, instead of the original list). You then increment the total number of found objects by the count in your sub-list. I hope my explanation is somewhat clear.
alist=[[['AA','A0'],['AB','A0']],[['AA','B0'],['AB','A0']],[['A0','00'],['00','A0'],[['00','BB'],['AB','A0'],['AA','A0']]]]
def count_in_list(val, arr):
val_is_list = isinstance(val, list)
ct = 0
for item in arr:
item_is_list = isinstance(item, list)
if item == val or (val_is_list and item_is_list and sorted(item) == sorted(val)):
ct += 1
if item_is_list :
ct += count_in_list(val, item)
return ct
print count_in_list(['AB', 'A0'], alist)
This is an iterative approach that will also work using python3 that will get the count of all sublists:
from collections import defaultdict
d = defaultdict(int)
def counter(lst, d):
it = iter(lst)
nxt = next(it)
while nxt:
if isinstance(nxt, list):
if nxt and isinstance(nxt[0], str):
d[tuple(nxt)] += 1
rev = tuple(reversed(nxt))
if rev in d:
d[rev] += 1
else:
lst += nxt
nxt = next(it,"")
return d
print((counter(lst, d)['AB', 'A0'])
3
It will only work on data like your input, nesting of strings beside lists will break the code.
To get a single sublist count is easier:
def counter(lst, ele):
it = iter(lst)
nxt = next(it)
count = 0
while nxt:
if isinstance(nxt, list):
if ele in (nxt, nxt[::-1]):
count += 1
else:
lst += nxt
nxt = next(it, "")
return count
print(counter(lst, ['AB', 'A0']))
3
Ooookay - this maybe isn't very nice and straightforward code, but that's how i'd try to solve this. Please don't hurt me ;-)
First,
i'd fragment the problem in three smaller ones:
Get rid of your multiple nested lists,
Count the occurence of all value-pairs in the inner lists and
Extract the most occurring value-pair from the counting results.
1.
I'd still use nested lists, but only of two-levels depth. An outer list, to iterate through, and all the two-value-lists inside of it. You can finde an awful lot of information about how to get rid of nested lists right here. As i'm just a beginner, i couldn't make much out of all that very detailed information - but if you scroll down, you'll find an example similar to mine. This is what i understand, this is how i can do.
Note that it's a recursive function. As you mentioned in comments that you think this isn't easy to understand: I think you're right. I'll try to explain it somehow:
I don't know if the nesting depth is consistent in your list. and i don't want to exctract the values themselves, as you want to work with lists. So this function loops through the outer list. For each element, it checks if it's a list. If not, nothing happens. If it is a list, it'll have a look at the first element inside of that list. It'll check again if it's a list or not.
If the first element inside the current list is another list, the function will be called again - recursive - but this time starting with the current inner list. This is repeated until the function finds a list, containing an element on the first position that is NOT a list.
In your example, it'll dig through the complete list-of-lists, until it finds your first string values. Then it gets the list containing this value - and put that in another list, the one that is returned.
Oh boy, that sounds really crazy - tell me if that clarified anything... :-D
"Yo dawg, i herd you like lists, so i put a list in a list..."
def get_inner_lists(some_list):
inner_lists = []
for item in some_list:
if hasattr(item, '__iter__') and not isinstance(item, basestring):
if hasattr(item[0], '__iter__') and not isinstance(item[0], basestring):
inner_lists.extend(get_inner_lists(item))
else:
inner_lists.append(item)
return inner_lists
Whatever - call that function and you'll find your list re-arranged a little bit:
>>> foo = [[['AA','A0'],['AB','A0']],[['AA','B0'],['AB','A0']],[['A0','00'],['00','A0'],[['00','BB'],['AB','A0'],['AA','A0']]]]
>>> print get_inner_lists(foo)
[['AA', 'A0'], ['AB', 'A0'], ['AA', 'B0'], ['AB', 'A0'], ['A0', '00'], ['00', 'A0'], ['00', 'BB'], ['AB', 'A0'], ['AA', 'A0']]
2.
Now i'd iterate through that lists and build a string with their values. This will only work with lists of two values, but as this is what you showed in your example it'll do. While iterating, i'd build up a dictionary with the strings as keys and the occurrence as values. That makes it really easy to add new values and raise the counter of existing ones:
def count_list_values(some_list):
result = {}
for item in some_list:
str = item[0]+'-'+item[1]
if not str in result.keys():
result[str] = 1
else:
result[str] += 1
return result
There you have it, all the counting is done. I don't know if it's needed, but as a side effect there are all values and all occurrences:
>>> print count_list_values(get_inner_lists(foo))
{'00-A0': 1, '00-BB': 1, 'A0-00': 1, 'AB-A0': 3, 'AA-A0': 2, 'AA-B0': 1}
3.
But you want clear results, so let's loop through that dictionary, list all keys and all values, find the maximum value - and return the corresponding key. Having built the string-of-two-values with a seperator (-), it's easy to split it and make a list out of it, again:
def get_max_dict_value(some_dict):
all_keys = []
all_values = []
for key, val in some_dict.items():
all_keys.append(key)
all_values.append(val)
return all_keys[all_values.index(max(all_values))].split('-')
If you define this three little functions and call them combined, this is what you'll get:
>>> print get_max_dict_value(count_list_values(get_inner_lists(foo)))
['AB', 'A0']
Ta-Daa! :-)
If you really have such lists with only nine elements, and you don't need to count values that often - do it manually. By reading values and counting with fingers. It'll be so much easier ;-)
Otherwise, here you go!
Or...
...you wait until some Guru shows up and gives you a super fast, elegant one-line python command that i've never seen before, which will do the same ;-)
This is as simple as I can reasonably make it:
from collections import Counter
lst = [ [['AA','A0'],['AB','A0']],
[['AA','B0'],['AB','A0']],
[['A0','00'],['00','A0'], [['00','BB'],['AB','A0'],['AA','A0']] ]
]
def is_leaf(element):
return (isinstance(element, list) and
len(element) == 2 and
isinstance(element[0], basestring)
and isinstance(element[1], basestring))
def traverse(iterable):
for element in iterable:
if is_leaf(element):
yield tuple(sorted(element))
else:
for value in traverse(element):
yield value
value, count = Counter(traverse(lst)).most_common(1)[0]
print 'Value {!r} is present {} times'.format(value, count)
The traverse() generate yields a series of sorted tuples representing each item in your list. The Counter object counts the number of occurrences of each, and its .most_common(1) method returns the value and count of the most common item.
You've said recursion is too difficult, but I beg to differ: it's the simplest way possible to attack this problem. The sooner you come to love recursion, the happier you'll be. :-)
Hopefully soemthing like this is what you were looking for. It is a bit tenuous and would suggest that recursion is better. But Since you didn't want it that way here is some code that might work. I am not super good at python but hope it will do the job:
def Compare(List):
#Assuming that the list input is a simple list like ["A1","B0"]
myList =[[['AA','A0'],['AB','A0']],[['AA','B0'],['AB','A0']],[['A0','00'],['00','A0'],[['00','BB'],['AB','A0'],['AA','A0']]]]
#Create a counter that will count if the elements are the same
myCounter = 0;
for innerList1 in myList:
for innerList2 in innerList1
for innerList3 in innerList2
for element in innerList3
for myListElements in myList
if (myListElements == element)
myCounter = myCounter + 1;
#I am putting the break here so that it counts how many lists have the
#same elements, not how many elements are the same in the lists
break;
return myCounter;
I'm trying to right a program that takes in a nested list, and returns a new list that takes out proper nouns.
Here is an example:
L = [['The', 'name', 'is', 'James'], ['Where', 'is', 'the', 'treasure'], ['Bond', 'cackled', 'insanely']]
I want to return:
['the', 'name', 'is', 'is', 'the', 'tresure', 'cackled', 'insanely']
Take note that 'where' is deleted. It is ok since it does not appear anywhere else in the nested list. Each nested list is a sentence. My approach to it is append every first element in the nested list to a newList. Then I compare to see if elements in the newList are in the nested list. I would lowercase the element's in the newList to check. I'm half way done with this program, but I'm running into an error when I try to remove the element from the newList at the end. Once i get the new updated list, I want to delete items from the nestedList that are in the newList. I'd lastly append all the items in the nested list to a newerList and lowercase them. That should do it.
If someone has a more efficient approach I'd gladly listen.
def lowerCaseFirst(L):
newList = []
for nestedList in L:
newList.append(nestedList[0])
print newList
for firstWord in newList:
sum = 0
firstWord = firstWord.lower()
for nestedList in L:
for word in nestedList[1:]:
if firstWord == word:
print "yes"
sum = sum + 1
print newList
if sum >= 1:
firstWord = firstWord.upper()
newList.remove(firstWord)
return newList
Note this code is not finished due to the error in the second to last line
Here is with the newerList (updatedNewList):
def lowerCaseFirst(L):
newList = []
for nestedList in L:
newList.append(nestedList[0])
print newList
updatedNewList = newList
for firstWord in newList:
sum = 0
firstWord = firstWord.lower()
for nestedList in L:
for word in nestedList[1:]:
if firstWord == word:
print "yes"
sum = sum + 1
print newList
if sum >= 1:
firstWord = firstWord.upper()
updatedNewList.remove(firstWord)
return updatedNewList
error message:
Traceback (most recent call last):
File "/Applications/WingIDE.app/Contents/MacOS/src/debug/tserver/_sandbox.py", line 1, in <module>
# Used internally for debug sandbox under external interpreter
File "/Applications/WingIDE.app/Contents/MacOS/src/debug/tserver/_sandbox.py", line 80, in lowerCaseFirst
ValueError: list.remove(x): x not in list
The error in your first function is because you try to remove an uppercased version of firstWord from newlist where there are no uppercase words (you see that from the printout). Remember that you store a upper/lowercased version of your words in a new variable, but you don't change the contents of the original list.
I still don't understand your approach. You want to do to things as you describe your task; 1) flatten the a lists of lists to a list of elements (always an interesting programming exercise) and 2) remove proper nouns from this list. This means that you have to decide what is a proper noun. You could do that rudimentarily (all non-starting capitalized words, or an exhaustive list), or you could use a POS tagger (see: Finding Proper Nouns using NLTK WordNet). Unless I misunderstand your task completely, you needn't worry about the casing here.
The first task can be solved in many ways. Here is a nice way that illustrates well what actually happenes in the simple case where your list L is a list of lists (and not lists that can be infinitely nested):
def flatten(L):
newList = []
for sublist in L:
for elm in sublist:
newList.append(elm)
return newList
this function you could make into flattenAndFilter(L) by checking each element like this:
PN = ['James', 'Bond']
def flattenAndFilter(L):
newList = []
for sublist in L:
for elm in sublist:
if not elm in PN:
newList.append(elm)
return newList
You might not have such a nice list of PNs, though, then you would have to expand on the checking, as for instance by parsing the sentence and checking the POS tags.