Find rows where values change in array - python

How do I find the rows (indices) of my array where the values change?
For example, I have this array:
0 -0.638127 0.805294 1.30671
1 -0.638127 0.805294 1.30671
2 -0.085362 0.523378 0.550509
3 -0.085362 0.523378 0.550509
4 -0.323397 0.94502 0.49001
5 -0.323397 0.94502 0.49001
6 -0.323397 0.94502 0.49001
7 -0.291798 0.421398 0.962115
I want a result like:
[0 2 4 7]
I am happy to use existing libraries and am not limited to anything. All I want are the row numbers. How would I calculate that?
I tried
a = []
for i, row in enumerate(vecarray):
    if i > 0:
        a[i] = vecarray[i] - vecarray[i-1]
b = np.where(a != 0)
but that gives me IndexError: list assignment index out of range

arr = [
    (-0.638127, 0.805294, 1.30671),
    (-0.638127, 0.805294, 1.30671),
    (-0.085362, 0.523378, 0.550509),
    (-0.085362, 0.523378, 0.550509),
    (-0.323397, 0.94502, 0.49001),
    (-0.323397, 0.94502, 0.49001),
    (-0.323397, 0.94502, 0.49001),
    (-0.291798, 0.421398, 0.962115)
]
i = 0
prev_t = None
for t in arr:
    # Print the index whenever the row differs from the previous one.
    if t != prev_t:
        prev_t = t
        print(i)
    i += 1
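The IndexError in the question comes from assigning to a[i] on an empty list; list.append avoids that. Below is a minimal sketch that collects the indices in a list instead of printing them. It assumes each row is a tuple or list that can be compared with !=, and the function name change_indices is made up for illustration.

```python
def change_indices(rows):
    """Return the indices of rows that differ from the row before them."""
    indices = []
    for i, row in enumerate(rows):
        # The first row always starts a run; later rows count only if they differ.
        if i == 0 or row != rows[i - 1]:
            indices.append(i)
    return indices


arr = [
    (-0.638127, 0.805294, 1.30671),
    (-0.638127, 0.805294, 1.30671),
    (-0.085362, 0.523378, 0.550509),
    (-0.085362, 0.523378, 0.550509),
    (-0.323397, 0.94502, 0.49001),
    (-0.323397, 0.94502, 0.49001),
    (-0.323397, 0.94502, 0.49001),
    (-0.291798, 0.421398, 0.962115),
]
print(change_indices(arr))  # [0, 2, 4, 7]
```

If the rows are NumPy arrays rather than tuples, row != rows[i - 1] yields an array, so the condition would need to be reduced first, for example with (row != rows[i - 1]).any().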


How to form a matrix from one column in a file?

I have a file that contains this column of values:
1.0000000000000002
0.6593496737729044
1.0000000000000002
I can read this data from the file, and I want to form a 2*2 matrix from it. I have tried a lot, but I get the wrong output.
My code:
with open("final_overlap.txt", "r") as final_over:
    for i in range(2):
        for j in range(2):
            i = final_over.readline()
            j = final_over.readline()
            S = np.array([i,j])
print(S)
The output I want looks like this:
[[1.0000000000000002 0.6593496737729044]
 [0.6593496737729044 1.0000000000000002]]
How can I form this matrix?
Keep in mind that I have another input with more data, so I want a method that can form matrices of other sizes, not only 2*2.
For example, this input:
1 1 1.0000000000000002
2 1 0.6593496737729044
2 2 1.0000000000000002
3 1 0.1192165290691592
3 2 0.0954901018165798
3 3 1.0000000000000002
4 1 0.0954901018165798
4 2 0.1192165290691592
4 3 0.6593496737729044
4 4 1.0000000000000002
and the matrix will be 4*4.
One more question about the matrix. I got the right answer for that input, but what if I have input like this?
1 1 1 1 0.7746059439198979
2 1 1 1 0.4441350695399573
2 1 2 1 0.2970603935859659
2 2 1 1 0.5696940113278337
2 2 2 1 0.4441350695399575
2 2 2 2 0.7746059439198979
I tried this code, but I got the error "list index out of range":
for line in open('Two_Electron.txt'):
    r,c,d,e,v = line.split()
    r = int(r)-1
    c = int(c)-1
    d = int(d)-1
    e = int(e)-1
    v = float(v)
    if c == 0:
        data.append( [v] )
    else:
        data[-1].append(v)
print(data)
# Fill in the upper triangle.
for i in range(len(data)-1):
    for j in range(i+1,len(data)):
        data[i].append( data[j][i] )
for k in range(len(data)-1):
    for l in range(k+1,len(data)):
        data[k].append( data[l][k] )
V_ee = np.array(data)
The output I should get:
[[[[0.77460594 0.4441351 ]
   [0.4441351  0.56969403]]
  [[0.4441351  0.29706043]
   [0.29706043 0.4441351 ]]]
 [[[0.4441351  0.29706043]
   [0.29706043 0.4441351 ]]
  [[0.56969403 0.4441351 ]
   [0.4441351  0.77460594]]]]
Load the data into a simple list, then build the rows from the list.
import numpy as np
with open("final_overlap.txt", "r") as final_over:
    data = [float(line) for line in final_over]
# The three values are the lower triangle of a symmetric 2*2 matrix.
S = np.array( [data[0:2], data[1:]] )
print(S)
Output:
[[1.         0.65934967]
 [0.65934967 1.        ]]
Followup
OK, assuming your data has row and column numbers like your second example, this will read the data, fill in the upper triangle, and convert to np.array.
import numpy as np
# Read in the data to find out the size.
data = []
for line in open('x.txt'):
    r,c,v = line.split()
    r = int(r)-1
    c = int(c)-1
    v = float(v)
    # Column 0 starts a new row; other columns extend the current row.
    if c == 0:
        data.append( [v] )
    else:
        data[-1].append(v)
# Fill in the upper triangle.
for i in range(len(data)-1):
    for j in range(i+1,len(data)):
        data[i].append( data[j][i] )
array = np.array(data)
print(array)
Output:
[[1.         0.65934967 0.11921653 0.0954901 ]
 [0.65934967 1.         0.0954901  0.11921653]
 [0.11921653 0.0954901  1.         0.65934967]
 [0.0954901  0.11921653 0.65934967 1.        ]]
It would still be possible to do this, even if you don't have the row and column numbers, just by keeping an internal counter.
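As a rough sketch of the counter idea, the version below takes only the values (no row or column numbers), assumes they list the lower triangle row by row, and uses the number of rows built so far as the counter. It returns nested lists, which can be passed to np.array afterwards; the name symmetric_from_lower is hypothetical.

```python
def symmetric_from_lower(values):
    """Build a full symmetric matrix from lower-triangle values given row by row."""
    matrix = []
    row = []
    for v in values:
        row.append(v)
        # Row k (0-based) of the lower triangle holds k + 1 values.
        if len(row) == len(matrix) + 1:
            matrix.append(row)
            row = []
    if row:
        raise ValueError("the number of values does not fill a triangle")
    # Fill in the upper triangle by mirroring the lower one.
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i].append(matrix[j][i])
    return matrix


S = symmetric_from_lower([1.0000000000000002, 0.6593496737729044, 1.0000000000000002])
for line in S:
    print(line)
```

The same function handles the ten values of the 4*4 example, because the row length check grows with each completed row.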

how to find minimum element of adjacent elements of a position in a matrix

I have a 5x5 matrix. For each position, I have to find the minimum of its adjacent elements and add that minimum to the value at that position. This has to be done for every element in the matrix except those in the first row and first column.
This is the matrix:
A= [[1 1 2 2 3],[1 1 0 1 0],[2 0 1 0 1],[3 2 1 2 1],[4 0 1 0 1]]
import numpy as np
a = [1,2,1,3,1]
b = [2,1,2,1,2]
First Matrix
def get_matrix1(a,b):
    d = []
    for x in a:
        for y in b:
            d.append(abs(y-x))
    return np.reshape(d,(5,5))
Second Matrix
def get_matrix2():
    # Matrix
    m1 = get_matrix1(a,b)
    print('First Matrix : {}'.format(m1))
    # Cumulative Addition
    m1[0] = np.cumsum(m1[0])
    m1[:,0] = np.cumsum(m1[:,0])
    m2 = m1.copy()
    print('\nCumulative Addition Matrix : {}'.format(m2))
    # Second Matrix
    i_rows,j_cols = [0,1,2,3],[0,1,2,3]
    edge_rows,edge_cols = [1,2,3,4],[1,2,3,4]
    for i,row in zip(i_rows, edge_rows):
        for j,col in zip(j_cols, edge_cols):
            # old
            old = m2[row,col]
            print('\nOld : {}'.format(old))
            # edges: upper-left, upper and left neighbours of (row, col)
            c,u,l = m2[i,j],m2[i,j+1],m2[i+1,j]
            r = (c,u,l)
            print('Edges : {}'.format(r))
            # new
            new = min(r) + old
            print('New : {}'.format(new))
            # update
            m2[row,col] = new
    print('Updated Matrix :')
    print(m2)
get_matrix2()
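For comparison, here is a compact sketch of the same update on plain nested lists, without the intermediate printing. It assumes, as in the code above, that "adjacent" means the upper-left, upper and left neighbours, and that cells already updated feed into later ones. The function name add_min_neighbour is made up for this example.

```python
def add_min_neighbour(matrix):
    """Add to each cell outside the first row/column the minimum of its
    upper-left, upper and left neighbours, scanning row by row."""
    result = [row[:] for row in matrix]  # work on a copy, leave the input alone
    for r in range(1, len(result)):
        for c in range(1, len(result[r])):
            neighbours = (result[r - 1][c - 1], result[r - 1][c], result[r][c - 1])
            result[r][c] += min(neighbours)
    return result


A = [[1, 1, 2, 2, 3],
     [1, 1, 0, 1, 0],
     [2, 0, 1, 0, 1],
     [3, 2, 1, 2, 1],
     [4, 0, 1, 0, 1]]
for row in add_min_neighbour(A):
    print(row)
```

Starting both loops at index 1 is what leaves the first row and first column untouched.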

what can I do to make long to wide format in python

I have this long column of data. I would like to split it into groups of 30 and save each group separately.
The data prints like this:
A292340
A291630
A278240
A267770
A267490
A261250
A261110
A253150
A252400
A253250
A243890
A243880
A236350
A233740
A233160
A225800
A225060
A225050
A225040
A225130
A219900
A204450
A204480
A204420
A196030
A196220
A167860
A152500
A123320
A122630
.
This is a fairly simple question, but I need your help.
Thank you.
(Also, how can I make a list out of the printed results? List addition?)
I believe you need to create a MultiIndex from np.arange over the length of the DataFrame, using modulo for the first level and floor division for the second, and then unstack.
If the length is not a multiple of N (e.g. 30 % 12 is not 0), the last column is only partly filled and the remaining cells are set to None:
N = 12
r = np.arange(len(df))
df.index = [r % N, r // N]
df = df['col'].unstack()
print (df)
          0        1        2
0   A292340  A236350  A196030
1   A291630  A233740  A196220
2   A278240  A233160  A167860
3   A267770  A225800  A152500
4   A267490  A225060  A123320
5   A261250  A225050  A122630
6   A261110  A225040     None
7   A253150  A225130     None
8   A252400  A219900     None
9   A253250  A204450     None
10  A243890  A204480     None
11  A243880  A204420     None
Setup:
import numpy as np
import pandas as pd
d = {'col': ['A292340', 'A291630', 'A278240', 'A267770', 'A267490', 'A261250', 'A261110', 'A253150', 'A252400', 'A253250', 'A243890', 'A243880', 'A236350', 'A233740', 'A233160', 'A225800', 'A225060', 'A225050', 'A225040', 'A225130', 'A219900', 'A204450', 'A204480', 'A204420', 'A196030', 'A196220', 'A167860', 'A152500', 'A123320', 'A122630']}
df = pd.DataFrame(d)
print (df.head())
       col
0  A292340
1  A291630
2  A278240
3  A267770
4  A267490
If you don't have the pandas and NumPy modules, you can use this:
Setup:
long_list = ['A292340', 'A291630', 'A278240', 'A267770', 'A267490', 'A261250', 'A261110', 'A253150', 'A252400',
'A253250', 'A243890', 'A243880', 'A236350', 'A233740', 'A233160', 'A225800', 'A225060', 'A225050',
'A225040', 'A225130', 'A219900', 'A204450', 'A204480', 'A204420', 'A196030', 'A196220', 'A167860',
'A152500', 'A123320', 'A122630', 'A292340', 'A291630', 'A278240', 'A267770', 'A267490', 'A261250',
'A261110', 'A253150', 'A252400', 'A253250', 'A243890', 'A243880', 'A236350', 'A233740', 'A233160',
'A225800', 'A225060', 'A225050', 'A225040', 'A225130', 'A219900', 'A204450', 'A204480', 'A204420',
'A196030', 'A196220', 'A167860', 'A152500', 'A123320', 'A122630']
Code:
number_elements_in_sublist = 30
sublists = []
sublists.append([])
sublist_index = 0
for index, element in enumerate(long_list):
    sublists[sublist_index].append(element)
    if index > 0:
        # Start a new sublist after every 30th element, unless it was the last one.
        if (index+1) % number_elements_in_sublist == 0:
            if index == len(long_list)-1:
                break
            sublists.append([])
            sublist_index += 1
for index, sublist in enumerate(sublists):
    print("Sublist Nr." + str(index+1))
    for element in sublist:
        print(element)
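The same grouping can also be sketched with list slicing, which avoids the manual index bookkeeping. This is only an illustration: split_into_chunks is a made-up helper name, and the sample list is placeholder data rather than the codes from the question.

```python
def split_into_chunks(items, size):
    """Split a list into consecutive sublists of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    # Slicing past the end of a list is safe, so the last chunk may be shorter.
    return [items[i:i + size] for i in range(0, len(items), size)]


sample = ["item%02d" % n for n in range(70)]  # placeholder data
chunks = split_into_chunks(sample, 30)
for number, chunk in enumerate(chunks, start=1):
    print("Sublist Nr.%d has %d elements" % (number, len(chunk)))
```

Because each chunk is an ordinary list, two of them can be combined again with + if a single list is needed.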

How to cycle through the index of an array?

Line 14 is where my main problem is. I need to cycle through each item in the array and use its index to determine whether or not it is a multiple of four, so I can create proper spacing for binary numbers.
def decimalToBinary(hu):
    bits = []
    h = []
    while hu > 0:
        kla = hu%2
        bits.append(kla)
        hu = int(hu/2)
    for i in reversed(bits):
        h.append(i)
    if len(h) <= 4:
        print (''.join(map(str,h)))
    else:
        for j in range(len(h)):
            h.index(1) = h.index(1)+1
            if h.index % 4 != 0:
                print (''.join(map(str,h)))
            elif h.index % 4 == 0:
                print (' '.join(map(str,h)))
decimalToBinary( 23 )
If what you're looking for is the index of the list from range(len(h)) in the for loop, then you can change that line to for idx,j in enumerate(range(len(h))): where idx is the index of the range.
The line h.index(1) = h.index(1)+1 is incorrect: you cannot assign to a function call. I modified your function so that it at least executes and generates output, but I don't know whether the result is what you want. Anyway, I hope it helps:
def decimalToBinary(hu):
    bits = []
    h = []
    while hu > 0:
        kla = hu%2
        bits.append(kla)
        hu = int(hu/2)
    for i in reversed(bits):
        h.append(i)
    if len(h) <= 4:
        print (''.join(map(str,h)))
    else:
        for j in range(len(h)):
            h_index = h.index(1)+1 # use h_index variable instead of h.index(1)
            if h_index % 4 != 0:
                print (''.join(map(str,h)))
            elif h_index % 4 == 0:
                print (' '.join(map(str,h)))
decimalToBinary( 23 )
# get binary version to check your result against.
print(bin(23))
This results in:
#output from decimalToBinary
10111
10111
10111
10111
10111
#output from bin(23)
0b10111
You're trying to join the bits into a string and separate them every 4 bits. You could modify your code with Marcin's correction (replacing the line with the syntax error and making some other improvements), but I suggest doing it more "Pythonically".
Here's my version:
def decimalToBinary(hu):
    bits = []
    while hu > 0:
        kla = hu%2
        bits.append(kla)
        hu = int(hu/2)
    h = [''.join(map(str, bits[i:i+4])) for i in range(0,len(bits),4)]
    bu = ' '.join(h)
    print(bu[::-1])
Explanation for the h assignment line:
range(0,len(bits),4): the indices from 0 up to the length of bits with step = 4, e.g. 0, 4, 8, ...
[bits[i:i+4] for i in [0, 4, 8]]: a list of lists, each holding four consecutive elements of bits,
e.g. [ [1,0,1,0], [0,1,0,1] ...]
[''.join(map(str, bits[i:i+4])) for i in range(0,len(bits),4)]: converts each inner list to a string
bu[::-1]: reverses the string
If you are learning Python, it's good to do it your way. As @roippi pointed out,
for index, value in enumerate(h):
will give you access to both the index and the value of each member of h on every iteration.
To group the digits in fours, I would do it like this:
def decimalToBinary(num):
    binary = str(bin(num))[2:][::-1]
    index = 0
    spaced = ''
    while index + 4 < len(binary):
        spaced += binary[index:index+4]+' '
        index += 4
    else:
        spaced += binary[index:]
    return spaced[::-1]
print(decimalToBinary(23))
The result is:
1 0111
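The reverse, slice and reverse-again idea from the answers above can be condensed further. The sketch below uses the built-in format(number, "b") to get the binary digits and assumes a non-negative integer; group_bits and its group parameter are illustrative names, not part of any answer.

```python
def group_bits(number, group=4):
    """Return the binary digits of a non-negative integer in groups of `group`,
    counted from the right and separated by spaces."""
    digits = format(number, "b")[::-1]  # least significant bit first
    chunks = [digits[i:i + group] for i in range(0, len(digits), group)]
    # Reversing the joined string restores the usual most-significant-first order.
    return " ".join(chunks)[::-1]


print(group_bits(23))   # 1 0111
print(group_bits(255))  # 1111 1111
```

Grouping is done on the reversed string so that any short group ends up on the left, as in "1 0111".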

Taking 2d list, and writing program for column max and average

I have a homework assignment that I just finished, but it looks pretty horrendous. I know there is a much simpler and more efficient way to get the correct output, but I just can't seem to figure it out.
Here is the objective of the assignment.
Write a program that stores the following values in a 2D list (these will be hardcoded):
 2.42 11.42 13.86 72.32
56.59 88.52  4.33 87.70
73.72 50.50  7.97 84.47
The program should determine the maximum and average of each column.
The output looks like this:
 2.42 11.42 13.86 72.32
56.59 88.52  4.33 87.70
73.72 50.50  7.97 84.47
============================
73.72 88.52 13.86 87.70 column max
44.24 50.15  8.72 81.50 column average
The printing of the 2D list is done below; my problem is calculating the maxima and averages.
data = [ [ 2.42, 11.42, 13.86, 72.32],
         [ 56.59, 88.52, 4.33, 87.70],
         [ 73.72, 50.50, 7.97, 84.47] ]
emptylist = []
r = 0
while r < 3:
    c = 0
    while c < 4 :
        print("%5.2f" % data[r][c], end=" ")
        c = c + 1
    r = r + 1
    print()
print("=" * 25)
This prints the top half, but the code I wrote to calculate the max and average is bad. For the max, I basically compared all the values in each column to each other with if/elif statements, and for the average I added up the entries of each column by hand, averaged them, then printed. Is there any way to calculate the bottom part with some sort of loop? Maybe something like the following:
for numbers in data:
    r = 0 #row index
    c = 0 #column index
    emptylist= []
    while c < 4 :
        while r < 3 :
            sum = data[r][c]
            totalsum = totalsum + sum
            avg = totalsum / float(rows)
            emptylist.append(avg) #not sure if this would work? here im just trying to
            r = r + 1 #dump averages into an emptylist to print the values
        c = c + 1 #in it later?
or something like that, where I'm not manually adding each index number for each column and row. I have no clue how to do the max in a loop. Also, NO LIST METHODS can be used; only append and len() are allowed. Any help?
Here is what you're looking for:
num_rows = len(data)
num_cols = len(data[0])
max_values = [0]*num_cols # Assuming the numbers in the array are all positive
avg_values = [0]*num_cols
for row_data in data:
    for col_idx, col_data in enumerate(row_data):
        max_values[col_idx] = max(max_values[col_idx],col_data) # Max of two values
        avg_values[col_idx] += col_data
for i in range(num_cols):
    avg_values[i] /= num_rows
Then max_values will contain the maximum for each column, while avg_values will contain the average for each column. You can then print them as usual:
for num in max_values:
    print(num, end=" ")
print()
for num in avg_values:
    print(num, end=" ")
or simply (if allowed):
print(' '.join(map(str, max_values)))
print(' '.join(map(str, avg_values)))
I would suggest making two new lists, each the same size as one of your rows, and keeping a running max in one and a running average in the other:
maxes = [0] * 4 # equivalent to [0, 0, 0, 0]
avgs = [0] * 4
for row in data: # this gives one row at a time
    for c in range(4): # equivalent to for c in [0,1,2,3]:
        #first, check if the max is big enough:
        if row[c] > maxes[c]:
            maxes[c] = row[c]
        # next, add this value's share of the average (there are len(data) rows):
        avgs[c] += row[c] / float(len(data))
You can print them like so:
for m in maxes:
    print("%5.2f" % m, end=" ")
print()
for s in avgs:
    print("%5.2f" % s, end=" ")
If you are allowed to use the enumerate function, this can be done a little more nicely:
for i, val in enumerate(row):
    print(i, val)
For the first row, this prints:
0 2.42
1 11.42
2 13.86
3 72.32
So it gives us both the index and the value, and we can use it like this:
maxes = [0] * 4
sums = [0] * 4
for row in data:
    for c, val in enumerate(row):
        #first, check if the max is big enough:
        if val > maxes[c]:
            maxes[c] = val
        # next, add that value to the sum:
        sums[c] += val
# afterwards, sums[c] / len(data) gives the average of column c
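Putting the pieces together, here is a hedged end-to-end sketch in Python 3 that computes both statistics with plain loops, len() and append. It starts each column maximum from the first row instead of 0, so it does not rely on the values being positive. The name column_stats is invented for this example, and the printing uses str.join for brevity, which may or may not be allowed by the assignment.

```python
data = [[2.42, 11.42, 13.86, 72.32],
        [56.59, 88.52, 4.33, 87.70],
        [73.72, 50.50, 7.97, 84.47]]


def column_stats(table):
    """Return (maxes, averages), one entry per column of a rectangular 2D list."""
    num_rows = len(table)
    num_cols = len(table[0])
    maxes = []
    averages = []
    for c in range(num_cols):
        col_max = table[0][c]  # start from the first row rather than from 0
        total = 0.0
        for r in range(num_rows):
            value = table[r][c]
            if value > col_max:
                col_max = value
            total += value
        maxes.append(col_max)
        averages.append(total / num_rows)
    return maxes, averages


maxes, averages = column_stats(data)
for row in data:
    print(" ".join("%5.2f" % v for v in row))
print("=" * 28)
print(" ".join("%5.2f" % v for v in maxes), "column max")
print(" ".join("%5.2f" % v for v in averages), "column average")
```

Looping over columns in the outer loop keeps the running maximum and total as simple local variables, at the cost of indexing table[r][c] directly.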
