Loop print of 2 columns into 1 - python

Suppose I have an array of 2 columns. It looks like this:
column1 = [1,2,3,...,830]
column2 = [a,b,c,...]
I want the output printed as a single column that includes the values of both columns one by one. Output form: column = [1, a, 2, b, ...]
I tried to do this with the following code:
dat0 = np.genfromtxt("\", delimiter = ',')
mu = dat0[:,0]
A = dat0[:,1]
print(mu,A)
R = np.arange(0,829,1)
l = len(mu)
K = np.zeros((l, 1))
txtfile = open("output_all.txt",'w')
for x in mu:
    i = 0
    K[i,0] = x
    dat0[i,1] = M
    txtfile.write(str(x))
    txtfile.write('\n')
    txtfile.write(str(M))
    txtfile.write('\n')
print K

I do not understand your code completely; is the reference to numpy really relevant to your question? What is M?
If you have two lists of the same lengths you can get pairs of elements using the zip builtin.
A = [1, 2, 3]
B = ['a', 'b', 'c']
for a, b in zip(A, B):
    print(a)
    print(b)
This will print
1
a
2
b
3
c
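Since the question's data actually comes from numpy (via np.genfromtxt), you can also interleave the two columns without an explicit pairing loop and write them out in one go. A minimal sketch, assuming mu and A are the two columns read from the file (the small arrays below are stand-ins):
import numpy as np

# stand-in data for the two columns read from the file
mu = np.array([1, 2, 3])
A = np.array(['a', 'b', 'c'])

# stack the columns side by side, then flatten row by row: 1, a, 2, b, 3, c
interleaved = np.column_stack((mu, A)).ravel()

with open("output_all.txt", "w") as txtfile:
    for value in interleaved:
        txtfile.write(str(value) + "\n")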

I'm sure there is a better way to do this, but one method is
>>> import numpy
>>> a = numpy.array([[1,2,3], ['a','b','c'], ['d','e','f']])
>>> new_a = []
>>> for column in range(0, a.shape[1]):  # a.shape[1] is the number of columns in a
...     for row in range(0, a.shape[0]):  # a.shape[0] is the number of rows in a
...         new_a.append(a[row][column])
...
>>> numpy.array(new_a)
array(['1', 'a', 'd', '2', 'b', 'e', '3', 'c', 'f'],
      dtype='|S1')
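As a follow-up, the same column-major flattening can be done without the Python loops by letting numpy ravel the array in Fortran (column-wise) order; a short sketch of that alternative:
import numpy as np

a = np.array([[1, 2, 3], ['a', 'b', 'c'], ['d', 'e', 'f']])

# order='F' walks down each column in turn instead of across each row
new_a = a.ravel(order='F')
print(new_a)  # ['1' 'a' 'd' '2' 'b' 'e' '3' 'c' 'f']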

Related

How to create multiple rows of a data frame based on some original values

I am a Python newbie and have a question.
As a simple example, I have three variables:
a = 3
b = 10
c = 1
I'd like to create a data frame with three columns ('a', 'b', and 'c') with:
each column holding the original value +/- a certain constant, restricted to values > 0 and <= 10.
If the constant is 1 then:
the possible values of 'a' will be 2, 3, 4
the possible values of 'b' will be 9, 10
the possible values of 'c' will be 1, 2
The final data frame will consist of all possible combination of a, b and c.
Do you know any Python code to do so?
Here is some code to start with:
import pandas as pd
data = [[3 , 10, 1]]
df1 = pd.DataFrame(data, columns=['a', 'b', 'c'])
You may use itertools.product for this.
Create 3 separate lists with the accepted values. This can be done with a helper function that returns the list of possible values.
def list_of_values(n):
    if 1 < n < 10:
        return [n - 1, n, n + 1]
    elif n == 1:
        return [1, 2]
    elif n == 10:
        return [9, 10]
    return []
So you will have the following:
a = [2, 3, 4]
b = [9, 10]
c = [1,2]
Next, do the following:
from itertools import product
l = product(a, b, c)
data = list(l)
pd.DataFrame(data, columns=['a', 'b', 'c'])
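Putting the pieces together, here is a self-contained sketch of the whole approach; the list_of_values helper below is a slightly generalised variant (the constant of 1 and the 0 < n <= 10 bounds from the question are passed as defaults):
from itertools import product

import pandas as pd

def list_of_values(n, constant=1, low=1, high=10):
    # keep only the values within +/- constant of n that stay inside [low, high]
    return [v for v in range(n - constant, n + constant + 1) if low <= v <= high]

a, b, c = 3, 10, 1
combos = list(product(list_of_values(a), list_of_values(b), list_of_values(c)))
df = pd.DataFrame(combos, columns=['a', 'b', 'c'])
print(df)  # 3 * 2 * 2 = 12 rows, one per combination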

python: sort array when sorting other array

I have two arrays:
a = np.array([1,3,4,2,6])
b = np.array(['c', 'd', 'e', 'f', 'g'])
These two arrays are linked (in the sense that there is a 1-1 correspondence between the elements of the two arrays), so when I sort a in decreasing order I would like to sort b in the same order.
For instance, when I do:
a = np.sort(a)[::-1]
I get:
a = [6, 4, 3, 2, 1]
and I would like to be able to get also:
b = ['g', 'e', 'd', 'f', 'c']
I would do something like this:
import numpy as np
a = np.array([1,3,4,2,6])
b = np.array(['c', 'd', 'e', 'f', 'g'])
idx_order = np.argsort(a)[::-1]
a = a[idx_order]
b = b[idx_order]
output:
a = [6 4 3 2 1]
b = ['g' 'e' 'd' 'f' 'c']
I don't know how, or even if, you can do this with numpy arrays. However, there is a way using standard lists, albeit a slightly convoluted one. Consider this:
a = [1, 3, 4, 2, 6]
b = ['c', 'd', 'e', 'f', 'g']
assert len(a) == len(b)
c = []
for i in range(len(a)):
    c.append((a[i], b[i]))
r = sorted(c)
for i in range(len(r)):
    a[i], b[i] = r[i]
print(a)
print(b)
In your problem statement, there is no relationship between the two arrays. What happens here is that we make a relationship by grouping the relevant data from each array into a temporary list of tuples. In this scenario, sorted() will carry out an ascending sort on the first element of each tuple. We then just rebuild our original arrays.
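The same idea also fits in a couple of lines with zip, sorting the pairs and splitting them back apart; a compact sketch (descending, to match the question):
a = [1, 3, 4, 2, 6]
b = ['c', 'd', 'e', 'f', 'g']

# sort the (a, b) pairs by the first element in descending order,
# then unpack them back into two lists
pairs = sorted(zip(a, b), reverse=True)
a = [x for x, _ in pairs]
b = [y for _, y in pairs]

print(a)  # [6, 4, 3, 2, 1]
print(b)  # ['g', 'e', 'd', 'f', 'c']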

Python Reshape a list with odd number of elements

I have a list with an odd number of elements. I want to reshape it to a specific size.
My code:
alist = ['a','b','c']
cols= 2
rows = int(len(alist)/cols)+1 # 2
anarray = np.array(alist.extend([np.nan]*((rows*cols)-len(months_list)))).reshape(rows,cols)
Present output:
ValueError: cannot reshape array of size 1 into shape (2,2)
Expected output:
anarray = [['a','b'],['c',nan]]
You can try:
out = np.full((rows,cols), np.nan, dtype='object')
out.ravel()[:len(alist)] = alist
Output:
array([['a', 'b'],
       ['c', nan]], dtype=object)
As a side note, this might be better for you:
rows = int(np.ceil(len(alist)/cols))
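Putting both points together, a self-contained sketch of this approach:
import numpy as np

alist = ['a', 'b', 'c']
cols = 2
rows = int(np.ceil(len(alist) / cols))  # 2

# pre-fill the target shape with nan, then overwrite the leading slots
out = np.full((rows, cols), np.nan, dtype='object')
out.ravel()[:len(alist)] = alist
print(out)
# [['a' 'b']
#  ['c' nan]]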
You can use a list comprehension to achieve the result:
li = ['a','b','c']
l = len(li)
new_list = [li[x:x+2] for x in range(0, l - l % 2, 2)]
if l % 2 != 0:
    new_list.append([li[-1], None])
print(new_list)  # [['a', 'b'], ['c', None]]
Try this (without any external library):
import math
alist = ['a', 'b', 'c']
cols = 2
new_list = []
steps = math.ceil(len(alist) / cols)
start = 0
for x in range(0, steps):
    new_list.append(alist[x * cols: (x + 1) * cols])
new_list[-1].extend([None for t in range(cols - len(new_list[-1]))])
print(new_list)
Output:
[['a', 'b'], ['c', None]]
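Another standard-library option is itertools.zip_longest, which handles the padding for you; a short sketch using None as the fill value, as in the answers above:
from itertools import zip_longest

alist = ['a', 'b', 'c']
cols = 2

# feed the same iterator to zip_longest `cols` times so it consumes
# the list in chunks, padding the last chunk with None
new_list = [list(row) for row in zip_longest(*[iter(alist)] * cols)]
print(new_list)  # [['a', 'b'], ['c', None]]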

index a list of lists in one loop [closed]

What logic can I use to iterate through two lists in one loop with a single index, e.g. using indexes [0,0], [0,1], [1,0], [1,1] for a 2x2 list while iterating through 1 2 3 4? Here is my best attempt so far:
numbers_list = [[1,2],[3,4]]
letters_list = ['a', 'b', 'c', 'd']
for i in [1,2,3,4]:
    indx1, = i%2,
    indx2 = i % 2 + i-2
    print indx1, indx2
    print numbers_list[indx1][indx2], letters_list[i]
desired output is
0 a
1 b
2 c
3 d
Since I don't know the generic structure of your lists, I'm going to use the lists you provided. So, in a single loop:
for i in range(4):
    div, rem = divmod(i, 2)
    print(numbers_list[div][rem], letters_list[i])
So we get:
IN : letters_list = ['a', 'b', 'c', 'd']
IN : numbers_list = [[1,2],[3,4]]
OUT : 1 a
2 b
3 c
4 d
Just nest the loops.
lists = [[1,2],[3,4]]
for i in range(len(lists)):
    sublist = lists[i]
    for j in range(len(sublist)):
        element = sublist[j]
        print element
This will give:
1
2
3
4
Your method will work, but you have to know the length of the lists. Also, all your sub-lists have to be the same length, whereas with this method they can vary.
If you're not opposed to using a built-in function like enumerate() this may work:
numbers_list = [[1,2],[3,4]]
letters_list = ['a', 'b', 'c', 'd']
for i, n in enumerate([1,2,3,4]):
    print i, letters_list[i]
output:
0 a
1 b
2 c
3 d
Use itertools.product.
For a 2x3 list:
from itertools import product
numbers_list = [[1, 2, 3], [3, 4, 5]]
for i, j in product(range(2), range(3)):
    print("indices:")
    print(i)
    print(j)
    print("Item:")
    print(numbers_list[i][j])
If you just want to flatten the list into a list of tuples containing (index_1, index_2, item), you could do a nested list comprehension:
letter_list = [['A', 'B', 'C'], ['D', 'E', 'F']]
[(i, j, x) for i, y in enumerate(letter_list) for j, x in enumerate(y)]
#Returns [(0, 0, 'A'), (0, 1, 'B'), (0, 2, 'C'), (1, 0, 'D'), (1, 1, 'E'), (1, 2, 'F')]
If you want to be able to handle any shape of two-deep nested lists in a single top-level loop, you could first define a helper iterator function (you only need to do this once):
def nest_iter(x):
    for i, a in enumerate(x):
        for j, b in enumerate(a):
            yield i, j, b
Once you have that, you can use it in the rest of your code as follows:
numbers_list = [[1, 2], [3, 4, 5]]
for i, j, b in nest_iter(numbers_list):
    print i, j, b
The output is:
0 0 1
0 1 2
1 0 3
1 1 4
1 2 5
In this example, there are two sub-lists of different lengths.

How do I do a SQL style disjoint or set difference on two Pandas DataFrame objects?

I'm trying to use Pandas to solve an issue courtesy of an idiot DBA not doing a backup of a now crashed data set, so I'm trying to find differences between two columns. For reasons I won't get into, I'm using Pandas rather than a database.
What I'd like to do is, given:
Dataset A = [A, B, C, D, E]
Dataset B = [C, D, E, F]
I would like to find values which are disjoint.
Dataset A!=B = [A, B, F]
In SQL, this is standard set logic, accomplished differently depending on the dialect, but still a standard operation. How do I elegantly apply this in Pandas? I would love to include some code, but nothing I have is even remotely correct. It's a situation in which I don't know what I don't know... Pandas has set logic for intersection and union, but nothing for disjoint/set difference.
Thanks!
You can use the set.symmetric_difference method:
In [1]: from pandas import DataFrame; df1 = DataFrame(list('ABCDE'), columns=['x'])
In [2]: df1
Out[2]:
   x
0  A
1  B
2  C
3  D
4  E
In [3]: df2 = DataFrame(list('CDEF'), columns=['y'])
In [4]: df2
Out[4]:
   y
0  C
1  D
2  E
3  F
In [5]: set(df1.x).symmetric_difference(df2.y)
Out[5]: set(['A', 'B', 'F'])
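If you'd rather stay inside pandas, a rough equivalent with boolean masks and isin (a sketch, assuming the column values are hashable):
import pandas as pd

df1 = pd.DataFrame(list('ABCDE'), columns=['x'])
df2 = pd.DataFrame(list('CDEF'), columns=['y'])

# keep the values that appear in exactly one of the two columns
only_in_df1 = df1.x[~df1.x.isin(df2.y)]
only_in_df2 = df2.y[~df2.y.isin(df1.x)]
disjoint = pd.concat([only_in_df1, only_in_df2], ignore_index=True)
print(disjoint.tolist())  # ['A', 'B', 'F']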
Here's a solution for multiple columns. It's probably not very efficient, and I would love to get some feedback on making it faster:
input = pd.DataFrame({'A': [1, 2, 2, 3, 3], 'B': ['a', 'a', 'b', 'a', 'c']})
limit = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'c']})
def set_difference(input_set, limit_on_set):
    limit_on_set_sub = limit_on_set[['A', 'B']]
    limit_on_tuples = [tuple(x) for x in limit_on_set_sub.values]
    limit_on_dict = dict.fromkeys(limit_on_tuples, 1)
    entries_in_limit = input_set.apply(
        lambda row: (row['A'], row['B']) in limit_on_dict, axis=1)
    return input_set[~entries_in_limit]
>>> set_difference(input, limit)
   A  B
1  2  a
3  3  a
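One way to speed this up is to let pandas do the row matching itself with a merge and indicator=True (an anti-join); a sketch using the same frames (input is renamed input_df here only to avoid shadowing the built-in):
import pandas as pd

input_df = pd.DataFrame({'A': [1, 2, 2, 3, 3], 'B': ['a', 'a', 'b', 'a', 'c']})
limit = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'c']})

# left merge with an indicator column, then keep only the rows that had
# no match in `limit` -- the same rows set_difference() returns
merged = input_df.merge(limit, on=['A', 'B'], how='left', indicator=True)
diff = merged[merged['_merge'] == 'left_only'].drop(columns='_merge')
print(diff)
#    A  B
# 1  2  a
# 3  3  a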
