Remove the common elements in an array based on index value - Python

I have an array and want to remove duplicate values based on their index.
E.g.: a = [1,3,4,3]
Expected array: a = [1,4,3]
I want to remove the duplicate elements that have the lower index value.

It's not optimal, but since you don't seem to look for a fast algorithm, this should be enough (especially with small arrays):
[1,3,4,3].reverse.uniq.reverse
This code is for Ruby only.

You will need to loop through the list in reverse order:
for index in range(len(my_list)-1, -1, -1):
    if my_list[index] in my_list[index+1:]:
        del my_list[index]
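The loop above rescans the tail of the list for every element, so its cost grows quadratically. For larger lists, a linear-time sketch that keeps the last occurrence of each value could look like this. It assumes the elements are hashable and Python 3.7+ (where dict preserves insertion order); the name keep_last_occurrence is only illustrative:

```python
def keep_last_occurrence(values):
    """Return a new list with duplicates removed, keeping each value's last occurrence."""
    # Walk the list backwards so the first time a value is seen is its last occurrence;
    # dict.fromkeys drops the later repeats while preserving that order.
    deduped_reversed = list(dict.fromkeys(reversed(values)))
    # Reverse again to restore the original left-to-right order.
    return deduped_reversed[::-1]


a = [1, 3, 4, 3]
print(keep_last_occurrence(a))  # [1, 4, 3]
```

Unlike the in-place loop, this returns a new list and leaves the input untouched.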

Related

Find indices of x minimum values of a list

I have a list of length n.
I want to find the indices that hold the 5 minimum values of this list.
I know how to find the index holding the minimum value using operator
min_index,min_value = min(enumerate(list), key=operator.itemgetter(1))
Can this code be altered to get a list of the 5 indices I am after?
Although this requires sorting the entire list, you can get a slice of the sorted list:
data = sorted(enumerate(list), key=operator.itemgetter(1))[:5]
If you use the heapq package, it can be done with nsmallest:
heapq.nsmallest(5, enumerate(list), key=operator.itemgetter(1))
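As a rough illustration of the heapq approach, the sketch below pulls out just the indices. The list name values and the sample numbers are made up for the example:

```python
import heapq
import operator

values = [9, 4, 7, 1, 8, 3, 6, 2, 5]

# nsmallest returns (index, value) pairs ordered by value.
smallest_pairs = heapq.nsmallest(5, enumerate(values), key=operator.itemgetter(1))

# Keep only the index from each pair.
smallest_indices = [index for index, _ in smallest_pairs]
print(smallest_indices)  # [3, 7, 5, 1, 8]
```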
What about something like this?
list(map(lambda x: [a.index(x), x], sorted(a)[:5]))
That returns a list of lists (with a being your list), where result[i][0] is the index and result[i][1] is the value.
EDIT:
This assumes the list doesn't have repeated minimum values. As adhg McDonald-Jensen pointed out, this will only return the first instance of the given value.
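If repeated values matter, one possible workaround is to sort the indices rather than the values, so that each position is reported once. A minimal sketch, with values as a made-up sample list:

```python
values = [4, 1, 3, 1, 2, 1, 5]

# Sort the positions by the value stored at each one; sorted() is stable,
# so equal values keep their original left-to-right order.
indices = sorted(range(len(values)), key=values.__getitem__)[:5]
print(indices)  # [1, 3, 5, 4, 2]

# For comparison, index() always reports the first match of a repeated value.
first_match_only = [values.index(x) for x in sorted(values)[:5]]
print(first_match_only)  # [1, 1, 1, 4, 2]
```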

Indices of cross-referenced lists

I'm cross-referencing two lists to find which items coincide between the two lists. The first list orig is 32 items in size, and I'm cross-referencing it with a much-larger list sdss which is 112,000 items in size. So far this is what I've got:
for i in range(0, len(orig), 1):
    if orig[i] in sdss:
        print('\n %s' % (orig[i]))
This gives me the items that are the same between the two lists, however, how would I efficiently return the indices (or location) of cross-referenced items inside of the sdss list (the larger list)?
EDIT: I guess I should've been clearer. I am actually cross-referencing two arrays with ints, not strings.
If the order is not important, you can use set intersection to find the unique common elements and a list comprehension to get each index and element as a tuple:
[(sdss.index(common_element),common_element) for common_element in set(orig) & set(sdss)]
Note that "index" raises ValueError if the value is not found in the list but in this case the value WILL exist in sdss. So, no need to worry about nonexistent elements throwing errors.
You can also use numpy.intersect1d
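You can use .index(), which gives the index of an item in a list but raises ValueError on failure (lists have no .find() method; that one belongs to str):
for item in orig:
    try:
        index = sdss.index(item)
    except ValueError:
        continue  # item is not in sdss
    print("\n %d" % index)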
I modified how you iterated because you don't need the index in orig; you need each item. BTW, you could have used range(len(orig)) because your start and step arguments are already the defaults.
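For a list as large as sdss, every membership test or index() call scans the list again. One alternative is to build a value-to-index dictionary once and look each item up in it. This is only a sketch: the small sample lists stand in for the real data, and it records only the first position of each value in sdss:

```python
orig = [7, 42, 13]
sdss = [5, 13, 99, 42, 13, 8]

# Map each value to the first index where it appears in sdss (one pass over sdss).
first_index = {}
for position, value in enumerate(sdss):
    first_index.setdefault(value, position)

# Look up each item of orig; items missing from sdss are skipped.
matches = [(item, first_index[item]) for item in orig if item in first_index]
print(matches)  # [(42, 3), (13, 1)]
```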

Python: How to insert into a nested list via iteration at a variable index position?

I've been banging my head over this one for a while, so hopefully you can help me! So here is what I have:
grouped_list = [[["0","1","1","1"],["1","0","1","1"]],[["1","1","0","1","1","1"]],[["1","1","1","0","1"]]]
index_list = [[2,3],[],[4]]
and I want to insert a "-" into the sublists of grouped_list at the corresponding index positions indicated in the index_list. The result would look like:
[[["0","1","-","-","1","1"],["1","0","-","-","1","1"]],[["1","1","0","1","1","1"]],[["1","1","1","0","-","1"]]]
And since I'm new to python, here is my laughable attempt at this:
for groups in grouped_list:
    for columns in groups:
        [[columns[i:i] = ["-"] for i in index] for index in index_list]
I get a syntax error, pointing at the = in the list comprehension, but I didn't think it would really work to start. I would prefer not to do this manually, because I'm dealing with rather large datasets, so some sort of iteration would be nice! Do I need to use numpy or pandas for something like this? Could this be solved with clever use of zipping? Any help is greatly appreciated!
I am sadly unable to make this a one liner:
def func(x, il):
    for i in il:
        x.insert(i, '-')
    return x
s = [[func(l, il) for l in ll] for (ll, il) in zip(grouped_list, index_list)]
I think what you want is
for k, groups in enumerate(grouped_list):
    for columns in groups:
        for i in sorted(index_list[k], reverse=True):
            columns.insert(i, "-")
Here, I iterate over the grouped lists and keep the index k to determine which indices to use from index_list. I modify the lists in place using list.insert. Note that the indices are applied from the largest to the smallest so that each one refers to a position in the original sublist; otherwise earlier insertions would shift the later positions. This is why I use sorted with reverse=True in the loop.
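The order of insertion changes what the indices mean. The following sketch compares the two orders on the first sublist from the question; insert_dashes is a hypothetical helper written only for this comparison:

```python
def insert_dashes(column, indices, reverse):
    """Return a copy of column with "-" inserted at each index, in the given order."""
    result = list(column)  # copy so the input is left untouched
    for i in sorted(indices, reverse=reverse):
        result.insert(i, "-")
    return result


column = ["0", "1", "1", "1"]

# Largest index first: every index refers to a position in the original list.
print(insert_dashes(column, [2, 3], reverse=True))   # ['0', '1', '-', '1', '-', '1']

# Smallest index first: every index refers to a position in the final list,
# which is the layout shown in the question's expected result.
print(insert_dashes(column, [2, 3], reverse=False))  # ['0', '1', '-', '-', '1', '1']
```

Which order is appropriate depends on whether index_list describes positions before or after the dashes are added.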

Python: Listing the duplicates in a list

I am fairly new to Python and I am interested in listing duplicates within a list. I know how to remove the duplicates ( set() ) within a list and how to list the duplicates within a list by using collections.Counter; however, for the project that I am working on this wouldn't be the most efficient method to use since the run time would be n(n-1)/2 --> O(n^2) and n is anywhere from 5k-50k+ string values.
So, my idea is that since python lists are linked data structures and are assigned to the memory when created that I begin counting duplicates from the very beginning of the creation of the lists.
List is created and the first index value is the word 'dog'
Second index value is the word 'cat'
Now, it would check if the second index is equal to the first index, if it is then append to another list called Duplicates.
Third index value is assigned 'dog', and the third index would check if it is equal to 'cat' then 'dog'; since it matches the first index, it is appended to Duplicates.
Fourth index is assigned 'dog', but it would check the third index only, and not the second and first, because now you can assume that since the third and second are not duplicates that the fourth does not need to check before, and since the third/first are equal, the search stops at the third index.
My project gives me these values and appends them to a list, so I want to implement the algorithm above: I don't care how many duplicates there are, I just want to know whether there are duplicates.
I can't think of how to write the code, but I have figured out its basic structure, though I might be completely off (using a random numgen for easier use):
for x in xrange(0,10):
list1.append(x)
for rev, y in enumerate(reversed(list1)):
while x is not list1(y):
cond()
if ???
I really don't think you'll get better than a collections.Counter for this:
from collections import Counter
c = Counter(mylist)
duplicates = [x for x, y in c.items() if y > 1]
Building the Counter should be O(n) (unless you're using keys that are particularly bad for hashing, but in my experience you need to try pretty hard to make that happen), and then getting the duplicates list is also O(n), giving you a total complexity of O(2n) == O(n) (for typical uses).
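Since the question only needs to know whether a value has been seen before, a set checked as each item arrives gives the same linear behavior without a second pass. A minimal sketch, assuming hashable items; find_duplicates and the sample words are illustrative:

```python
def find_duplicates(items):
    """Return the values that occur more than once, in order of first repeat."""
    seen = set()       # every value encountered so far
    reported = set()   # values already added to duplicates
    duplicates = []
    for item in items:
        if item in seen and item not in reported:
            duplicates.append(item)  # second sighting: record it once
            reported.add(item)
        seen.add(item)
    return duplicates


words = ["dog", "cat", "dog", "dog", "bird", "cat"]
print(find_duplicates(words))  # ['dog', 'cat']
```

The same membership check can run inside whatever loop appends the incoming values, so no separate scan of the finished list is needed.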

How do I extract the last two items from the list, strings or tuples in Python?

The user will input a string, list, or tuple.
I have to extract the first two and the last two values. For the first two values:
ls[:2]
How can I do it for the last two values?
If n is the total number of values, the last two items can be sliced as:
[n-1:]
How can I write this in code?
ls[-2:]
Negative numbers in slices are simply evaluated by adding len(ls), so this is the same as ls[len(ls) - 2:]. For more information on slices, refer to the Python tutorial or this excellent Stack Overflow answer.
ls[-2:]
would be the way to do it, as negative indexes count from the end.
If you have a list like a_list = [1,2,3], you can get the last two elements, [2,3], with a_list[-2:].
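The same slice works on any sequence type, which matters here because the input may be a string, a list, or a tuple. A short sketch with made-up sample values:

```python
samples = ["python", [1, 2, 3, 4], (10, 20, 30)]

for ls in samples:
    # ls[:2] is the first two items and ls[-2:] the last two;
    # each result has the same type as the input.
    print(ls[:2], ls[-2:])

# A sequence shorter than two items does not raise; the slice returns what is there.
print([1][-2:])  # [1]
```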
