Check if list elements are in a DataFrame column - python

Objective: I have a list of 200 elements (URLs) and I would like to check whether each one is in a specific column of the DataFrame. If it is, I would like to remove the element from the list.
Problem: I tried a similar approach, adding the elements that are not in the column to a new list, but it adds all of them.
pruned = []
for element in list1:
    if element not in transfer_history['Link']:
        pruned.append(element)
I have also tried removing the elements from the list directly, without success. I think it's a simple thing, but I can't find the key.
for element in list1:
    if element in transfer_history['Link']:
        list1.remove(element)

When you use in with a pandas series, you are searching the index, not the values. To get around this, convert the column to a list using transfer_history['Link'].tolist(), or better, convert it to a set.
links = set(transfer_history["Link"])
A good way to filter the list is like this:
pruned = [element for element in list1 if element not in links]
Don't remove elements from a list while iterating over it: the loop can skip items and give unexpected results.
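To illustrate the last point, here is a minimal sketch with made-up data (the names urls, seen and buggy are not from the question). It shows how removing during iteration can skip an item, compared with building a new list.

```python
# Hypothetical data, chosen so two matching items sit next to each other.
urls = ["a", "b", "c", "d"]
seen = {"a", "b"}

# Buggy: removing while iterating shifts later items left, so the loop
# skips the item that slides into the slot that was just vacated.
buggy = list(urls)
for url in buggy:
    if url in seen:
        buggy.remove(url)

# Safe: build a new list instead of mutating the one being iterated.
pruned = [url for url in urls if url not in seen]

print(buggy)   # "b" survives even though it is in seen
print(pruned)
```

After "a" is removed, "b" moves to position 0 and the loop moves on to position 1, so "b" is never tested.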

Remember that transfer_history['Link'] is the entire column, not its individual values. To reach a single item you have to index into the column, as in transfer_history['Link'][x], for example inside a for loop over the column.
A much easier way is to check whether the item is in a list built from the entire column:
pruned = []
for element in list1:
    if element not in [link for link in transfer_history['Link']]:
        pruned.append(element)
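A small sketch of why the original test never matched, assuming pandas is installed. The DataFrame contents and URLs below are invented for illustration. It also builds the lookup once instead of rebuilding a list on every pass through the loop.

```python
import pandas as pd

# Hypothetical stand-ins for the question's data.
transfer_history = pd.DataFrame(
    {"Link": ["https://a.example", "https://b.example"]}
)
list1 = ["https://a.example", "https://c.example"]

column = transfer_history["Link"]

in_index = "https://a.example" in column            # tests the index labels (0, 1)
label_hit = 0 in column                             # 0 is an index label
in_values = "https://a.example" in column.tolist()  # tests the stored values

# Build the lookup once, outside the loop.
links = set(column)
pruned = [url for url in list1 if url not in links]

print(in_index, label_hit, in_values, pruned)
```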

If the order of the urls doesn't matter, this can be simplified a lot using sets:
list1 = list(set(list1) - set(transfer_history['Link']))
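If the order does matter, or duplicates in list1 should be kept, a sketch along these lines may help. The values are made up, and column_values stands in for transfer_history['Link'].

```python
list1 = ["u3", "u1", "u2", "u1"]   # hypothetical URLs, note the duplicate
column_values = ["u2", "u9"]       # stands in for transfer_history['Link']

links = set(column_values)

# Set difference: order is not guaranteed and duplicates disappear.
unordered = list(set(list1) - links)

# Comprehension: original order and duplicates are kept.
ordered = [u for u in list1 if u not in links]

# dict.fromkeys keeps first-seen order (Python 3.7+) while dropping duplicates.
ordered_unique = list(dict.fromkeys(ordered))

print(ordered, ordered_unique)
```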

Related

Insert element in list of list from list

I want to insert the elements of a list into a list of lists, so that the first element of the list goes to the first index of the first inner list, the second element goes to the first index of the second inner list, and so on.
For eg.
lst_of_lst = [[1,2,3,4][5,6,7,8][9,10,11,12][13,14,15,16]]
list = ['a','b','c','d']
output - lst_of_lst=[['a',1,2,3,4]['b',5,6,7,8]['c',9,10,11,12]['d',13,14,15,16]]
All you need to do is iterate over the list you want to insert the items from and insert each item at position 0 of the corresponding inner list.
It can be done as follows:
for i in range(len(list)):
    lst_of_lst[i].insert(0, list[i])
That's it!
Also, you missed the commas between the inner lists when defining lst_of_lst, which will give you an error. It is also not a good idea to name a variable after a built-in data type, as you did with list. You can change it to _list if you want.
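The same idea can be sketched with zip, which avoids the index bookkeeping. The name prefixes is only a suggestion for renaming list, and the second half shows an optional variant that leaves the input untouched.

```python
lst_of_lst = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
prefixes = ['a', 'b', 'c', 'd']  # renamed from `list` to avoid shadowing the built-in

# Non-mutating variant first: build new inner lists from the untouched input.
combined = [[prefix, *inner] for prefix, inner in zip(prefixes, lst_of_lst)]

# In-place variant: insert each prefix at the front of its inner list.
for inner, prefix in zip(lst_of_lst, prefixes):
    inner.insert(0, prefix)

print(lst_of_lst)
```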
Little trickery for fun/speed:
lst_of_lst = [[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]]
list = ['a','b','c','d']
for L, L[:0] in zip(lst_of_lst, zip(list)):
    pass
print(lst_of_lst)
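For readers puzzled by the loop header: the targets are assigned left to right, so L is bound to an inner list first and then a one-element tuple is slice-assigned to L[:0]. A spelled-out sketch of the same steps, using shortened data and the assumed names letters and letter_tuple:

```python
lst_of_lst = [[1, 2, 3, 4], [5, 6, 7, 8]]
letters = ['a', 'b']

# zip(letters) yields 1-tuples such as ('a',).
for inner, letter_tuple in zip(lst_of_lst, zip(letters)):
    # Assigning to the empty slice [:0] inserts the items at the front.
    inner[:0] = letter_tuple

print(lst_of_lst)
```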

Is there any way to get the name of a list once a string within has been searched for?

First of all, complete python/programming newbie here, so I apologise if this is a stupid question. Also, sorry for the awkward title.
I'm attempting to adapt the top answer in this question:
Check if a Python list item contains a string inside another string. However, I'm not sure if what I'm trying to achieve is even possible.
list6 = [list1,list2,list3,list4,list5]
'list6' is a list containing several other lists.
Then, this line can be used to check if 'Cool' is a string inside of any of those lists:
if any("Cool" in s for s in list6):
My question is: Assuming 'Cool' is only in 'list1', is it possible to fetch and store 'list1' in order to use other values within that list? For example:
if "Cool" is in any list in list6:
list = list with "Cool"
something = list[4]
You can use next with a generator expression (Thanks ShadowRanger)
next(sublist for sublist in list6 if "Cool" in sublist)
or you can use next with an iterator such as filter to get the first item that contains the element
next(filter(lambda x: "Cool" in x, list6))
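Note that next raises StopIteration when nothing matches. A minimal sketch that passes a default instead; the list contents are invented, and "Tepid" is just a string that appears nowhere.

```python
# Hypothetical contents for the question's lists.
list1 = ["Hot", "Warm", "Cool", "Cold", "Frozen"]
list2 = ["Red", "Green", "Blue", "Black", "White"]
list6 = [list1, list2]

# The second argument to next() is returned when the generator is exhausted.
match = next((sublist for sublist in list6 if "Cool" in sublist), None)
if match is not None:
    something = match[4]

missing = next((sublist for sublist in list6 if "Tepid" in sublist), None)

print(something, missing)
```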
You would have to use a for loop to check every sublist separately; then you can keep the sublist or get other elements from it.
for sublist in list6:
    if "Cool" in sublist:
        something = sublist[4]

How to loop through a list of lists, finding the value in the same index position?

I have a list of lists containing information about smartphone applications. Each list (within the list) contains the same type of information, in the same order.
[id, name, ..., ].
The list of lists looks like this: [[id1, name1,...], [id2, name2, ...]]
I want to access the 10th index in each list and check its value.
I tried this, but it does not work. I imagined this would iterate over every list, except the first which is a header, and would select the 10th item in each list.
for c_rating in apps_data[1:][10]:
    print(c_rating)
Instead, it prints out every item within the 10th list.
The given solution is:
for row in apps_data[1:]:
    c_rating = row[10]
    print(c_rating)
I understand why this code works. It breaks the process into two steps. I don't understand why the first code does not work. Any help would be appreciated.
That's due to the order in which Python evaluates the expression.
apps_data[1:][10] is evaluated in this order:
apps_data[1:] -> this gives a list of all the inner lists except the first one. Let's call this inner_lists
inner_lists[10] -> this gives you the element at index 10 of that list of lists, which is one of the inner lists.
So you end up with a select + select.
What you want is an iterate + select. You can do it like this:
for x in apps_data[1:]:
    print(x[10])
This goes through all the inner lists except the header, selects the element at index 10 from each, and prints it.
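The difference is easy to see on a tiny made-up table. This sketch uses index 2 instead of 10 so the rows stay short; the app names and ratings are invented.

```python
apps_data = [
    ["id", "name", "rating"],   # header row
    [1, "Alpha", "4+"],
    [2, "Beta", "12+"],
    [3, "Gamma", "9+"],
]

# select + select: slice off the header, then pick ONE row by position.
sliced_then_indexed = apps_data[1:][2]

# iterate + select: visit every data row and pick one field from each.
ratings = [row[2] for row in apps_data[1:]]

print(sliced_then_indexed)
print(ratings)
```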

Python: check if item exists in variable amount of lists

I'm working on a small search engine and I'm lost at a certain point. I have multiple lists containing items, and I want to check which items exist in all lists. The number of lists can vary, since they are created based on the number of words in the search query, done with:
index_list = [[] for i in range((len(query)+1))]
I figured I start with finding out what the shortest list is, since that is the maximum amount of items that need to be checked. So for example, with a three-word-search-query:
index_list[1]=[set(1,2,3,4,5)]
index_list[2]=[set(3,4,5,6,7)]
index_list[3]=[set(4,5,6,7)]
shortest_list = index_list[3]
(What the shortest list is, is figured out with a function, not relevant for now).
Now I want to check if the items of the shortest list, index_list[3], also exist in the other lists. In this case there are 3 lists in total, but when entering a longer search query, the amount of lists increase. I thought to do something with loops, like:
result = []
for element in shortest_list:
    for subelement in element:
        for element2 in index_list[1]:
            if subelement in element2:
                for element3 in index_list[2]:
                    if subelement in element3:
                        result.append(subelement)
So, the result should be:
[4, 5]
since these items exist in all lists.
But, the loop above won't work when there are more lists. As described earlier, I don't know the amount of lists beforehand because it depends on the amount of words in the search query. So basically the depth of my loop depends on the amount of lists I have.
When doing research I found some postings suggesting recursion may do the job. Unfortunately, I'm not that skilled in Python.
Any suggestions?
Thanks in advance!
Just make them all sets and use set.intersection to find the common elements. Also, {1,2,3,4,5} is how to create a set of ints, not set(1,2,3,4,5):
index_list = [set() for i in range(4)]
index_list[0].update({1,2,3,4,5})
index_list[1].update({3,4,5,6,7})
index_list[2].update({4,5,6,7})
shortest_list = index_list[2]
print(shortest_list.intersection(*index_list[:2]))
set([4, 5])
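Because set.intersection accepts any number of arguments, the number of lists does not need to be known in advance. A minimal sketch, where common_items is a hypothetical helper name and the empty-input behaviour is my own choice:

```python
def common_items(sets):
    """Return the items present in every set; an empty input gives an empty set."""
    if not sets:
        return set()
    # Unpacking passes the first set as `self` and the rest as arguments.
    return set.intersection(*sets)


index_list = [{1, 2, 3, 4, 5}, {3, 4, 5, 6, 7}, {4, 5, 6, 7}]
result = sorted(common_items(index_list))
print(result)
```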
Try to go about it the opposite way: First make a list of all the index lists by doing something like
index_list_list = []
for ix_list in get_index_lists(): #Or whatever
    index_list_list.append(ix_list)
Then you can loop through all of these, removing the elements in your 'remaining_items' list if they are not contained in the others:
from copy import copy

remaining_items = shortest_list
for index_list in index_list_list:
    # iterate over a copy so removing from remaining_items is safe
    curr_remaining_items = copy(remaining_items)
    for element in curr_remaining_items:
        if element not in index_list:
            remaining_items.remove(element)
Your final 'remaining_items' list would then contain the elements that are common to all the lists.
I wrote code following your approach. You can try out the following:
index_list=['1','2','3','4','5']
index_list1=['3','4','5','6','7']
index_list2=['4','5','6','7']
result = []
for element in index_list:
    for subelement in element:
        for element2 in index_list1:
            if subelement in element2:
                for element3 in index_list2:
                    if subelement in element3:
                        result.append(subelement)
print(result)
output:
['4', '5']
It is a little confusing that you appear to have something shadowing the built in type set, which happens to be built for precisely this type of job.
from functools import reduce  # a built-in on Python 2; must be imported on Python 3

subset = set(shortest_list)
# Use map here to only look up the method once.
# We don't need the result, which will be a list of None.
# (On Python 3, map is lazy, so wrap it in list(...) to force the calls.)
map(subset.intersection_update, index_lists)
# Alternative: describe the reduction more directly
# Cost: builds a new set for each list
subset = reduce(set.intersection, index_lists, set(shortest_list))
Note: As Padraic indicated in his answer, set.intersection and set.intersection_update both take an arbitrary number of arguments so it is unnecessary to use map or reduce in this case.
It is also by far preferable that all the lists already be sets, since the intersection can be optimized to the size of the smaller set, but a list intersection requires scanning the list.
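As a rough sketch of that last point, one could convert everything to sets once, start from the smallest, and stop as soon as nothing is left. The function name intersect_smallest_first and the early exit are my own additions, not part of the answers above.

```python
def intersect_smallest_first(collections):
    """Intersect any number of iterables, starting from the smallest one."""
    sets = sorted((set(c) for c in collections), key=len)
    if not sets:
        return set()
    result = set(sets[0])          # copy so the input is not modified
    for other in sets[1:]:
        result &= other            # in-place intersection
        if not result:             # nothing in common, no need to continue
            break
    return result


index_lists = [[1, 2, 3, 4, 5], [3, 4, 5, 6, 7], [4, 5, 6, 7]]
print(intersect_smallest_first(index_lists))
```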

Append specific rows from one list to another

Having some difficulty trying to take a 2d list with 7 columns and 10 rows, and append all rows from only columns 4,5 and 6 (or 3,4,5 from index 0) to a new list. The original list is actually a csv and is much, much longer but I've just put part of it in the function for troubleshooting purposes.
What I have so far is...
def coords():
    # just an example of first couple lines...
    bigList = [['File','FZone','Type','ID','Lat','Lon','Ref','RVec'],
               ['20120505','Cons','mit','3_10','-21.77','119.11','mon_grs','14.3']]
    newList = []
    for row in bigList[1:]: # skip the header
        newList.append(row[3])
    return newList # return newList to main so it can be sent to other functions
This code gives me a new list with 'ID' only but I also want 'Lat' and 'Lon'.
The new list should look like...['3_10', '-21.77','119.11']['4_10','-21.10'...]
I tried re-writing newList.append(row[3,4,5])...and of course that doesn't work but not sure how to go about it.
row[3] refers to the fourth element. You seem to want the fourth through sixth elements, so slice it:
row[3:6]
You could also do this all with a list comprehension:
newList = [row[3:6] for row in bigList[1:]]
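Since the real data comes from a CSV, the same slice can be applied while reading the file. A sketch using the standard csv module: the in-memory text stands in for the actual file, and the second data row is invented to complete the example.

```python
import csv
import io

# Stand-in for the real file; the second data row is made up for illustration.
raw = io.StringIO(
    "File,FZone,Type,ID,Lat,Lon,Ref,RVec\n"
    "20120505,Cons,mit,3_10,-21.77,119.11,mon_grs,14.3\n"
    "20120506,Cons,mit,4_10,-21.10,119.20,mon_grs,15.1\n"
)

reader = csv.reader(raw)
header = next(reader)                     # consume the header row
coords = [row[3:6] for row in reader]     # keep ID, Lat, Lon from each row

print(header[3:6])
print(coords)
```

With a real file, open(path, newline='') would replace the io.StringIO object.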
