I have a DataFrame of compounds, shown below, in which BaSiO3 has two possible oxidation-state pairs. I want to keep only one of them.
(ABO3)        (A_OS, B_OS)
['CaFeO3']    [3, 2]
['BaSiO3']    [1, 5], [4, 2]
['BaGeO3']    [4, 2]
The desired output is:
(ABO3)        (A_OS, B_OS)
['CaFeO3']    [3, 2]
['BaSiO3']    [1, 5]
['BaGeO3']    [4, 2]
I have tried to create this dataframe as follows:
from mendeleev import element  # assumption: element() comes from the mendeleev package (the import is not shown in the original)

res = ['CeTmO3', 'CeLuO3', 'CeRhO3']
list1 = [['Ce', 'Tm', 'O'], ['Ce', 'Lu', 'O'], ['Ce', 'Rh', 'O']]  # to extract the properties of elements
A = []
B = []
final = []
for i in range(len(list1)):
    A = element(list1[i][0]).oxistates  # only the oxidation
    B = element(list1[i][1]).oxistates  # states of A and B
    D = A, B
    final.append(D)
# print(final)
solutions = []
for x in range(len(final)):
    # keep the (A, B) pairs whose oxidation states sum to 6
    E = [(x1, x2) for x1 in final[x][0] for x2 in final[x][1] if sum([x1, x2]) == 6]
    solutions.append(E)
# print(solutions)
list2 = [(x, y) for x, y in zip(res, solutions)]
list2
and then
import pandas as pd
import numpy as np
dataPd1 = pd.DataFrame(res)
data = np.array(solutions)
dataPd2 = pd.DataFrame(data = data)
df_os = pd.concat([dataPd1,dataPd2],axis=1)
df_os.columns = ["ABO3", "Oxi_State"]
df_os
In the code above, 'CeTmO3' and 'CeRhO3' each have two possibilities. I want to keep only one and delete the other. The current output is:
ABO3 Oxi_State
0 CeTmO3 [(4, 2), (3, 3)]
1 CeLuO3 [(3, 3)]
2 CeRhO3 [(4, 2), (3, 3)]
The desired one is:
ABO3 Oxi_State
0 CeTmO3 [(4, 2)]
1 CeLuO3 [(3, 3)]
2 CeRhO3 [(4, 2)]
Please note that a few compounds in the original data set have more than two possibilities. For example:
BrNO3 [(7,-1),(5,1),(3,3),(1,5)]
The simplest approach, assuming the desired pair is always the first one listed, given a df of the form:
ABO3 Oxi_State
0 CeTmO3 [(4, 2), (3, 3)]
1 CeLuO3 [(3, 3)]
2 CeRhO3 [(4, 2), (3, 3)]
is to keep only the first element of each list:
df['Oxi_State'] = [[x[0]] for x in df['Oxi_State'].to_list()]
This produces:
ABO3 Oxi_State
0 CeTmO3 [(4, 2)]
1 CeLuO3 [(3, 3)]
2 CeRhO3 [(4, 2)]
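One caveat with x[0]: it raises an IndexError if a row holds an empty list, which could happen when no pair sums to 6. A minimal sketch, assuming the same "keep the first pair" rule, that trims the plain solutions list with a slice before the DataFrame is built. The sample data and the name first_only are illustrative and not taken from the original data set.

```python
# Sample data mirroring the question; the empty list stands for a
# hypothetical compound with no valid oxidation-state pair.
solutions = [[(4, 2), (3, 3)], [(3, 3)], [(7, -1), (5, 1), (3, 3), (1, 5)], []]

# pairs[:1] keeps at most the first pair and never raises on an empty list.
first_only = [pairs[:1] for pairs in solutions]

print(first_only)
```

The same slice can be applied to an existing column, for example through a list comprehension over df['Oxi_State'].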
I have two lists like
num = [1,2,3,4]
names = ['shiva','naga','sharath','krishna','pavan','adi','mulagala']
I want to print the two lists in parallel, and if the first list (num) ends, I want to repeat it until the second list (names) ends.
I want the output to be:
1 for shiva
2 for naga
3 for sharath
4 for krishna
1 for pavan
2 for adi
3 for mulagala
Using itertools.cycle and zip:
>>> num = [1,2,3,4]
>>> names = ['shiva','naga','sharath','krishna','pavan','adi','mulagala']
>>> import itertools
>>> for i, name in zip(itertools.cycle(num), names):
...     print('{} for {}'.format(i, name))
...
1 for shiva
2 for naga
3 for sharath
4 for krishna
1 for pavan
2 for adi
3 for mulagala
In Python 2, you'll want to use a combination of itertools.cycle and itertools.izip (in Python 3, itertools.izip no longer exists and the built-in zip is already lazy). For example:
>>> num = [1,2,3,4]
>>> names = ['shiva','naga','sharath','krishna','pavan','adi','mulagala']
>>> import itertools
>>> list(itertools.izip(itertools.cycle(num), names))
[(1, 'shiva'), (2, 'naga'), (3, 'sharath'), (4, 'krishna'), (1, 'pavan'), (2, 'adi'), (3, 'mulagala')]
list(roundrobin('ABC', 'D', 'EF'))
output : ['A', 'D', 'E', 'B', 'F', 'C']
from itertools import chain, izip_longest
def roundrobin(*iterables):
    sentinel = object()
    return (x for x in chain(*izip_longest(fillvalue=sentinel, *iterables)) if x is not sentinel)
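The snippet above is Python 2 (izip_longest). A minimal sketch of the same sentinel-based idea for Python 3, where the function is called itertools.zip_longest; apart from that rename, the logic is unchanged.

```python
from itertools import chain, zip_longest


def roundrobin(*iterables):
    # A unique sentinel pads the shorter iterables, so real values
    # such as None are never mistaken for padding.
    sentinel = object()
    padded = zip_longest(*iterables, fillvalue=sentinel)
    return (x for x in chain.from_iterable(padded) if x is not sentinel)


result = list(roundrobin('ABC', 'D', 'EF'))
print(result)
```

Using chain.from_iterable instead of chain(*...) avoids unpacking all the padded tuples up front.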
aList = [2, 1, 4, 3, 5]
aList.sort()
=[1, 2, 3, 4, 5]
del aList[2]
=[1, 2, 4, 5]
**unsort the list back to original sequence with '3' deleted**
=[2, 1, 4, 5]
In reality I have a list of tuples that contain (Price, Quantity, Total).
I want to sort the list, allow the user to delete items in the list and
then put it back in the original order minus the deleted items.
One thing to note is that the values in the tuples can repeat in the list,
such as:
aList = [(4.55, 10, 45.5), (4.55, 10, 45.5), (1.99, 3, 5.97), (1.99, 1, 1.99)]
You cannot unsort the list but you could keep the original unsorted index to restore positions.
E.g.
from operator import itemgetter
aList = [(4.55, 10, 45.5), (4.55, 10, 45.5), (1.99, 3, 5.97), (1.99, 1, 1.99)]
# In keyList:
# * every element has a unique id (it also saves the original position in aList)
# * list is sorted by some criteria specific to your records
keyList = sorted(enumerate(aList), key = itemgetter(1))
# User wants to delete item 1
for i, (key, record) in enumerate(keyList):
    if key == 1:
        del keyList[i]
        break
# "Unsort" the list
theList = sorted(keyList, key = itemgetter(0))
# We don't need the unique id anymore
result = [record for key, record in theList]
As you can see this works with duplicate values.
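The steps above can be wrapped in a small helper. This is a sketch only: the function name delete_by_sorted_position and the choice to delete by position in the sorted view (rather than by original index) are assumptions made for illustration.

```python
from operator import itemgetter


def delete_by_sorted_position(items, sorted_pos):
    """Return a copy of items, in original order, without the element
    that sits at index sorted_pos when the items are sorted."""
    # Decorate each record with its original index, then sort by record.
    decorated = sorted(enumerate(items), key=itemgetter(1))
    del decorated[sorted_pos]
    # Restore the original order using the saved index.
    decorated.sort(key=itemgetter(0))
    return [record for _, record in decorated]


a_list = [(4.55, 10, 45.5), (4.55, 10, 45.5), (1.99, 3, 5.97), (1.99, 1, 1.99)]
remaining = delete_by_sorted_position(a_list, 0)  # drops the smallest tuple
print(remaining)
```

Because the saved index is unique, duplicate records are handled the same way as in the walkthrough above.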
Unsorting can be done
This approach is like others - the idea is to keep the original indices to restore the positions. I wanted to add a clearer example on how this is done.
In the example below, we keep track of the original positions of the items in a by associating them with their list index.
>>> a = [4, 3, 2, 1]
>>> b = [(a[i], i) for i in range(len(a))]
>>> b
[(4, 0), (3, 1), (2, 2), (1, 3)]
b serves as a mapping between the list values and their indices in the unsorted list.
Now we can sort b. Below, each item of b is sorted by the first tuple member, which is the corresponding value in the original list.
>>> c = sorted(b)
>>> c
[(1, 3), (2, 2), (3, 1), (4, 0)]
There it is... sorted.
Going back to the original order requires another sort, except using the second tuple item as the key.
>>> d = sorted(c, key=lambda t: t[1])
>>> d
[(4, 0), (3, 1), (2, 2), (1, 3)]
>>>
>>> d == b
True
And now it's back in its original order.
One use for this could be to transform a list of non-sequential values into their ordinal values while maintaining the list order. For instance, a sequence like [1034, 343, 5, 72, 8997] could be transformed to [3, 2, 0, 1, 4].
>>> # Example for converting a list of non-contiguous
>>> # values in a list into their relative ordinal values.
>>>
>>> def ordinalize(a):
...     idxs = list(range(len(a)))
...     b = [(a[i], i) for i in idxs]
...     b.sort()
...     c = [(*b[i], i) for i in idxs]
...     c.sort(key=lambda item: item[1])
...     return [c[i][2] for i in idxs]
...
>>> ordinalize([58, 42, 37, 25, 10])
[4, 3, 2, 1, 0]
The same operation, written more compactly:
>>> def ordinalize(a):
...     idxs = range(len(a))
...     a = sorted((a[i], i) for i in idxs)
...     a = sorted(((*a[i], i) for i in idxs),
...                key=lambda item: item[1])
...     return [a[i][2] for i in idxs]
You can't really do an "unsort"; the best you can do is remove the value from the original list directly:
>>> aList = [2, 1, 4, 3, 5]
>>> aList.remove(sorted(aList)[2])
>>> print(aList)
[2, 1, 4, 5]
Try this to unsort a sorted list
import random
li = list(range(101))
random.shuffle(li)
Here's how I recommend to sort a list, do something, then unsort back to the original ordering:
import numpy as np
# The argsort of an argsort is the inverse permutation, so we use that
# for undoing the sorting.
sorter = np.argsort(keys)
unsorter = np.argsort(sorter)
sorted_keys = np.array(keys)[sorter]
result = do_a_thing_that_preserves_order(sorted_keys)
unsorted_result = np.array(result)[unsorter]
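The same double-argsort idea can be sketched without NumPy. The helper name argsort and the sample keys below are illustrative assumptions; the list comprehension stands in for do_a_thing_that_preserves_order.

```python
def argsort(seq):
    # Indices that would sort seq, like np.argsort for a 1-D input.
    return sorted(range(len(seq)), key=lambda i: seq[i])


keys = [30, 10, 20]
sorter = argsort(keys)        # positions of the keys in sorted order
unsorter = argsort(sorter)    # inverse permutation of sorter

sorted_keys = [keys[i] for i in sorter]
# Any order-preserving step could go here; doubling is just a placeholder.
result = [k * 2 for k in sorted_keys]
unsorted_result = [result[i] for i in unsorter]
print(unsorted_result)
```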
I had the same use case and found an easy solution, which is basically to shuffle the list (note that this produces a random order, not the original one):
import random
sorted_list = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k']
unsorted_list = random.sample(sorted_list, len(sorted_list))
I have a list of many 2-tuples.
I would like to split the list into two lists, one list consisting of the first elements of all the tuples in the list, and the other list consisting of the second elements of all the tuples. I wonder how to do that efficiently? Thanks!
For example, I have a list y:
>>> y = [('ab',1), ('cd', 2), ('ef', 3) ]
>>> type(y)
<type 'list'>
I hope to get two lists ['ab', 'cd', 'ef'] and [1, 2, 3].
a,b = zip(*y)
is all you need ...
or if you need them as lists and not tuples
a,b = map(list,zip(*y))
zip with * argument unpacking will give you tuples:
>>> a, b = zip(*y)
>>> a
('ab', 'cd', 'ef')
>>> b
(1, 2, 3)
If you need lists, you can use map on that:
>>> a, b = map(list, zip(*y))
>>> a
['ab', 'cd', 'ef']
>>> b
[1, 2, 3]
Use zip and a list comprehension:
>>> y = [('ab', 1), ('cd', 2), ('ef', 3)]
>>> a,b = [list(c) for c in zip(*y)]
>>> a
['ab', 'cd', 'ef']
>>> b
[1, 2, 3]
>>>
Try this:
def get_list(tuples):
    list1 = []
    list2 = []
    for i in tuples:
        list1.append(i[0])
        list2.append(i[1])
    return list1, list2

y = [('ab', 1), ('cd', 2), ('ef', 3)]
letters, numbers = get_list(y)
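One way to do it is to first convert the list into a temporary dictionary, then assign its keys and values to two lists. Note that this only works when the first elements are unique and hashable, since duplicate keys collapse into a single entry, and the original order is only guaranteed on Python 3.7+:
y = [('ab', 1), ('cd', 2), ('ef', 3)]
temp_d = dict(y)
list1 = list(temp_d.keys())
list2 = list(temp_d.values())
print(list1)
print(list2)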
l1 = []
l2 = []
for i in y:
    l1.append(i[0])
    l2.append(i[1])
l1
['ab', 'cd', 'ef']
l2
[1, 2, 3]
Appending each value into another
Say I have two lists, one longer than the other, x = [1,2,3,4,5,6,7,8] and y = ['a','b','c'], and I want to insert each element of y at every 3rd position of x, so the resulting list z would look like: z = [1,2,'a',3,4,'b',5,6,'c',7,8]
What would be the best way of going about this in python?
Here is an adapted version of the roundrobin recipe from the itertools documentation that should do what you want:
from itertools import cycle, islice

def merge(a, b, pos):
    "merge('ABCDEF', [1,2,3], 3) --> A B 1 C D 2 E F 3"
    iterables = [iter(a)]*(pos-1) + [iter(b)]
    pending = len(iterables)
    nexts = cycle(iter(it).next for it in iterables)
    while pending:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            pending -= 1
            nexts = cycle(islice(nexts, pending))
Example:
>>> list(merge(xrange(1, 9), 'abc', 3)) # note that this works for any iterable!
[1, 2, 'a', 3, 4, 'b', 5, 6, 'c', 7, 8]
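The merge() recipe above is written for Python 2 (the .next method and xrange). A minimal sketch of a Python 3 port, assuming nothing beyond renaming .next to __next__ and xrange to range; the local variable nxt is renamed only to avoid shadowing the built-in next.

```python
from itertools import cycle, islice


def merge(a, b, pos):
    "merge('ABCDEF', [1,2,3], 3) --> A B 1 C D 2 E F 3"
    # The same iterator over a is repeated pos-1 times, so each round
    # pulls pos-1 items from a and then one item from b.
    iterables = [iter(a)] * (pos - 1) + [iter(b)]
    pending = len(iterables)
    nexts = cycle(it.__next__ for it in iterables)
    while pending:
        try:
            for nxt in nexts:
                yield nxt()
        except StopIteration:
            # Drop one exhausted slot and keep cycling over the rest.
            pending -= 1
            nexts = cycle(islice(nexts, pending))


merged = list(merge(range(1, 9), 'abc', 3))
print(merged)
```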
Or here is how you could use roundrobin() as it is without any modifications:
>>> x = [1,2,3,4,5,6,7,8]
>>> y = ['a','b','c']
>>> list(roundrobin(*([iter(x)]*2 + [y])))
[1, 2, 'a', 3, 4, 'b', 5, 6, 'c', 7, 8]
Or an equivalent but slightly more readable version:
>>> xiter = iter(x)
>>> list(roundrobin(xiter, xiter, y))
[1, 2, 'a', 3, 4, 'b', 5, 6, 'c', 7, 8]
Note that both of these methods work with any iterable, not just sequences.
Here is the original roundrobin() implementation:
from itertools import cycle, islice

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Recipe credited to George Sakkis
    pending = len(iterables)
    nexts = cycle(iter(it).next for it in iterables)
    while pending:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            pending -= 1
            nexts = cycle(islice(nexts, pending))
This approach modifies x in place. Alternatively, you could make a copy of x and return the modified copy if you didn't want to change the original.
def merge(x, y, offset):
    for i, element in enumerate(y, 1):
        x.insert(i * offset - 1, element)
>>> x = [1,2,3,4,5,6,7,8]
>>> y = ['a','b','c']
>>> merge(x, y, 3)
>>> x
[1, 2, 'a', 3, 4, 'b', 5, 6, 'c', 7, 8]
All extra elements of y past the end of x just get appended to the end.
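The copy-returning alternative mentioned above could look like the following sketch. The name merged_copy is hypothetical; the insertion logic is the same as in merge().

```python
def merged_copy(x, y, offset):
    # Work on a shallow copy so the caller's list is left untouched.
    result = list(x)
    for i, element in enumerate(y, 1):
        # list.insert appends when the index is past the end.
        result.insert(i * offset - 1, element)
    return result


x = [1, 2, 3, 4, 5, 6, 7, 8]
y = ['a', 'b', 'c']
z = merged_copy(x, y, 3)
print(z)
```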
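>>> from itertools import chain
>>> def solve(x, y):
...     it = iter(y)
...     for i in xrange(0, len(x), 2):
...         try:
...             yield x[i:i+2] + [next(it)]
...         except StopIteration:
...             # y is exhausted: emit the rest of x once and stop,
...             # otherwise later iterations would repeat the tail
...             yield x[i:]
...             return
...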
>>> x = [1,2,3,4,5,6,7,8]
>>> y = ['a','b','c']
>>> list(chain.from_iterable(solve(x,y)))
[1, 2, 'a', 3, 4, 'b', 5, 6, 'c', 7, 8]
Here's another way:
x = range(1, 9)
y = list('abc')
from itertools import count, izip
from operator import itemgetter
from heapq import merge
print map(itemgetter(1), merge(enumerate(x), izip(count(1, 2), y)))
# [1, 2, 'a', 3, 4, 'b', 5, 6, 'c', 7, 8]
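This keeps it all lazy before building the new list, and lets merge naturally merge the sequences... kind of a decorate/undecorate... It does require Python 2.7 for count to have a step argument though. Note that this is Python 2 code (izip, print statement): when two indices tie, as with (1, 2) and (1, 'a'), the tuples fall back to comparing an int with a str, which Python 2 allows but Python 3 rejects with a TypeError.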
So, to walk it through a bit:
a = list(enumerate(x))
# [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8)]
b = zip(count(1, 2), y)
# [(1, 'a'), (3, 'b'), (5, 'c')]
print list(merge(a, b))
# [(0, 1), (1, 2), (1, 'a'), (2, 3), (3, 4), (3, 'b'), (4, 5), (5, 6), (5, 'c'), (6, 7), (7, 8)]
Then the itemgetter(1) just takes the actual value removing the index...
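On Python 3.5 or later, heapq.merge accepts a key argument, so the merge can compare on the index alone and never has to compare a number with a string. A minimal sketch of the same decorate/merge/undecorate walkthrough under that assumption:

```python
from heapq import merge
from itertools import count
from operator import itemgetter

x = list(range(1, 9))
y = list('abc')

decorated_x = enumerate(x)          # (0, 1), (1, 2), (2, 3), ...
decorated_y = zip(count(1, 2), y)   # (1, 'a'), (3, 'b'), (5, 'c')

# Compare on the index only; on ties, items from the earlier
# iterable (x) come first.
merged = merge(decorated_x, decorated_y, key=itemgetter(0))
z = [value for _, value in merged]
print(z)
```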
The above solutions are really cool. Here's an alternative that doesn't involve roundrobin or itertools.
def merge(x, y):
    result = []
    while y:
        for i in range(0, 2): result.append(x.pop(0))
        for i in range(0, 1): result.append(y.pop(0))
    result.extend(x)
    return result
where 2 and 1 are arbitrary and list y is assumed to be shorter than list x.
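from itertools import islice

def merge(xs, ys):
    ys = iter(ys)
    for i, x in enumerate(xs, 1):
        yield x
        if i % 2 == 0:
            # islice yields nothing once ys is exhausted, whereas a bare
            # next(ys) would raise StopIteration inside the generator
            # (a RuntimeError on Python 3.7+).
            yield from islice(ys, 1)

''.join(merge('12345678', 'abc'))  # => '12a34b56c78'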
Using itertools.izip_longest:
>>> from itertools import izip_longest, chain
>>> y = ['a','b','c']
>>> x = [1,2,3,4,5,6,7,8]
>>> lis = (x[i:i+2] for i in xrange(0, len(x), 2))  # generator expression
>>> list(chain.from_iterable([(a + [b]) if b is not None else a
...                           for a, b in izip_longest(lis, y)]))
[1, 2, 'a', 3, 4, 'b', 5, 6, 'c', 7, 8]
sep, lst = 2, []
for i in range(len(y)+1):
    lst += x[i*sep:(i+1)*sep] + y[i:i+1]
Where sep is the number of elements of x before an element of y is inserted.
Performance:
>>> timeit.timeit(stmt="for i in range(len(y)+1): lst += x[i*sep:(i+1)*sep] + y[i:i+1]", setup="lst = [];x = [1,2,3,4,5,6,7,8];y = ['a','b','c'];sep = 2", number=1000000)
2.8565280437469482
Pretty damn good. I wasn't able to get the stmt to begin with lst = [], and since timeit runs the setup only once, lst kept growing across all million runs, but still... pretty good for a million times.
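The slice-based loop above stops after len(y)+1 chunks, so any elements of x beyond that point are dropped. A sketch of a function form that appends the remainder explicitly; the name interleave is an assumption for illustration.

```python
def interleave(x, y, sep):
    """Insert one element of y after every sep elements of x."""
    result = []
    for i in range(len(y)):
        # A chunk of up to sep elements from x, then one element of y.
        result += x[i * sep:(i + 1) * sep] + y[i:i + 1]
    # Whatever is left of x after the last inserted element.
    result += x[len(y) * sep:]
    return result


x = [1, 2, 3, 4, 5, 6, 7, 8]
y = ['a', 'b', 'c']
z = interleave(x, y, 2)
print(z)
```

Slicing never raises on out-of-range indices, so the function also tolerates an x that is shorter than len(y) * sep.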