Finding same values in a row of csv in python

I have code that looks for numbers in a csv file that are within 1.0 of each other in the same row. However, when I run it, it prints everything, not just the rows that meet the condition I want, i.e. that the values in the 2nd and 3rd columns are within 1.0 of each other. I want the code to display the first column (the time at which the value was recorded, or better yet the column number) together with the 2nd and 3rd columns, since those are the values that should be within 1.0 of each other. This is what the data file looks like:
Time Chan1 Chan2
04:07.0 52.31515503 16.49450684
04:07.1 23.55230713 62.48802185
04:08.0 46.06217957 24.94955444
04:08.0 41.72077942 31.32516479
04:08.0 19.80723572 25.73182678
Here's my code:
import numpy as np
from matplotlib import *
from pylab import *
filename = raw_input("Enter file name: ") + '.csv'
filepath = '/home/david/Desktop/' + filename
data = np.genfromtxt(filepath, delimiter=',', dtype=float)
first=[row[0] for row in data]
rownum1=[row[1] for row in data]
rownum2=[row[2] for row in data]
for row in data:
    if ((abs(row[1]-row[2]) <= 1.0)):
        print("The values in row 0 are 1 and 2, are within 1.0 of each other.", first, rownum1, rownum2)
This is my output:
26.3460998535, 44.587371826199998, 42.610519409200002, 24.7272491455, 89.397918701199998, 25.479614257800002, 30.991180419900001, 25.676086425800001
But I want this as an output:
4:09.0, 23.456, 22.5

You can do that like this:
data = np.genfromtxt(filepath, names=True, dtype=None)
idx = np.abs(data['Chan1'] - data['Chan2']) < 1
print(data[idx])
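
For comparison, here is a minimal sketch of the same filter using only the standard library. The loop in the question prints the whole `first`, `rownum1` and `rownum2` lists on every match, which is why everything shows up; the sketch keeps only the values of the matching row. The name `rows_within` and the in-memory `SAMPLE` text are assumptions made for illustration (two rows come from the question's data, the last one is the row the asker expects to see), and the file is assumed to be whitespace-separated as shown.

```python
import io

# Two rows from the question's data plus the row the asker expects as output.
SAMPLE = """Time Chan1 Chan2
04:07.0 52.31515503 16.49450684
04:08.0 19.80723572 25.73182678
04:09.0 23.456 22.5
"""

def rows_within(lines, tolerance=1.0):
    """Return (time, chan1, chan2) for rows whose channels differ by at most tolerance."""
    matches = []
    for number, line in enumerate(lines):
        fields = line.split()
        if number == 0 or len(fields) < 3:
            continue  # skip the header line and any blank lines
        time, chan1, chan2 = fields[0], float(fields[1]), float(fields[2])
        if abs(chan1 - chan2) <= tolerance:
            matches.append((time, chan1, chan2))
    return matches

matches = rows_within(io.StringIO(SAMPLE))
for time, chan1, chan2 in matches:
    print(time, chan1, chan2)
```

With a real file you would pass the open file object instead of the `io.StringIO` wrapper.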

Related

How to copy one row in a .csv file to another row with python

I am trying to merge the strings from the first two rows of a .csv file into one row in python.
I am using a file with over 150 columns.
Right now it looks like this:
;; Test1; Test1; Test1; Test 2; Test 2; Test2
Name;Birthday; Points Q1; Points Q2; Points Q3; Points Q1; Points Q2; Points Q3
but I need to merge the information into one row, that looks like this:
Name;Birthday;Test1 PointsQ1; Test1 Points Q2; Test1 Points Q3; Test2 Points Q1;...

Using the pandas library you can do this.
First install the pandas library if you don't have it, using this command:
pip install pandas
Then add this to your code:
import pandas as pd
data=pd.read_csv("YOUR FILE LOCATION WITH FILE NAME")
This line prints the first row of your CSV file:
print(data.iloc[0])
You can combine the first two rows into one row and save the result in the first row:
data.iloc[0]=[data['YOUR COLUMN NAME'][0]+data['YOUR COLUMN NAME'][1],data[]+.......]
Do this for every column and the result is stored in the first row.
Then run a loop over the remaining rows to update them.
To save the DataFrame held in the data variable to CSV:
data.to_csv('your-file-name.csv', sep=',')

Ok, I managed to do it by reading the .csv file into an array.
import csv
def addheader(inputfile):
    results = []
    with open(inputfile, "r") as r:
        reader = csv.reader(r, delimiter=';')
        for row in reader:
            results.append(row)
    # Prefix every header in the second row with the label from the first row
    for i in range(len(results[0])):
        results[1][i] = (results[0][i] + " " + results[1][i]).strip()
    return results
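
A possible variant of the same idea, sketched with `zip` so that the two header rows are paired cell by cell and the padding spaces are stripped before joining. The helper name `merge_header_rows` and the in-memory `SAMPLE` text are assumptions for illustration; the two lines are the ones shown in the question.

```python
import csv
import io

# First two lines of the file, as shown in the question.
SAMPLE = (
    ";; Test1; Test1; Test1; Test 2; Test 2; Test2\n"
    "Name;Birthday; Points Q1; Points Q2; Points Q3; Points Q1; Points Q2; Points Q3\n"
)

def merge_header_rows(top, bottom):
    """Join each pair of cells with one space, ignoring empty cells and padding."""
    merged = []
    for upper, lower in zip(top, bottom):
        parts = [part.strip() for part in (upper, lower) if part.strip()]
        merged.append(" ".join(parts))
    return merged

rows = list(csv.reader(io.StringIO(SAMPLE), delimiter=";"))
header = merge_header_rows(rows[0], rows[1])

# Write the merged header followed by any remaining data rows.
out = io.StringIO()
writer = csv.writer(out, delimiter=";", lineterminator="\n")
writer.writerow(header)
writer.writerows(rows[2:])
print(out.getvalue())
```

Note that `zip` stops at the shorter row, so this assumes both header rows have the same number of columns.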

starting at index 1 of row when using writerow at python

I have a script that creates a matrix of size n and writes it to a csv file.
I want the matrix to have "borders" of size n.
My code:
a = []
firstRow = []
for i in range(n):
    row = []
    row.append(i+1)
    firstRow.append(i+1)
    for j in range(n):
        row.append(random.randint(x,y))
    a.append(row)
writer.writerow(firstRow)
writer.writerows(a)
Output when using n = 3:
1,2,3
1,74,82,68
2,87,70,72
3,68,71,74
I need the output to be like this:
1, 2, 3
1,74,82,68
2,87,70,72
3,68,71,74
with a blank cell at csv index 0,0.
I also need the whole matrix to start at row 1 instead of 0.

Using pandas we can get the following valid csv with a few lines of easy-to-understand code:
,1,2,3
1,91,66,70
2,82,24,79
3,57,56,73
Example code used:
import pandas as pd
import numpy as np
# Create random numbers 0-99, 3x3
data = np.random.randint(0,100, size=(3,3))
df = pd.DataFrame(data)
# Add 1 to index and columns
df.columns = df.columns + 1
df.index = df.index + 1
#df.to_csv('output.csv') # Uncomment this row to write to file.
print(df.to_csv())
And if you insist on removing the leading comma:
with open('output.csv', 'w') as f:
    f.write(df.to_csv()[1:])
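
If you would rather stay with the csv module from the question, a minimal sketch could look like the following. It puts an empty string in front of the header row, which gives the blank cell at 0,0. The function name `write_bordered_matrix` and the value range 60 to 90 are assumptions for illustration, and the output goes to an in-memory buffer rather than a file.

```python
import csv
import io
import random

def write_bordered_matrix(stream, n, low, high):
    """Write an n x n random matrix with 1..n labels along the top and left edges."""
    writer = csv.writer(stream, lineterminator="\n")
    # The empty string in the first position produces the blank cell at (0, 0).
    writer.writerow([""] + list(range(1, n + 1)))
    for i in range(1, n + 1):
        writer.writerow([i] + [random.randint(low, high) for _ in range(n)])

buffer = io.StringIO()
write_bordered_matrix(buffer, n=3, low=60, high=90)
print(buffer.getvalue())
```

The first line written is `,1,2,3`, so the column labels line up with the matrix columns instead of being shifted one cell to the left.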

Subtract elements from 2 arrays found in 2 different .csv files

I have two csv files with 1 row of data each and multiple columns
csv1: 0.1924321564, 0.8937481241, 0.6080270062, ........
csv2: 0.1800000000, 0.7397439374, 0.3949274792, ........
I want to subtract the first value in csv2 from the first value in csv1:
e.g. 0.1924321564 - 0.1800000000 = 0.0124321564
0.8937481241 - 0.7397439374 = 0.1540041867
and continue this for the remaining columns.
I then want to take the results of the subtraction for each column and sum them into one value, e.g. sum(0.0124321564 + 0.1540041867 + n)
I am very new to python so this is the code I started with:
import numpy as np
import csv
array1 = np.array('1.csv')
array2 = np.array('2.csv')
array3 = np.subtract(array1, array2)
total = np.sum(array3)

Use genfromtxt.
Note: the delimiter is a comma followed by a space because that is what you showed. Please change it accordingly.
import numpy as np
array1 = np.genfromtxt('1.csv', delimiter=', ')
array2 = np.genfromtxt('2.csv', delimiter=', ')
(array1 - array2).sum()
0.37953587010000012
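
The same calculation can also be sketched without numpy, which may help show what happens step by step. The helper `read_single_row` and the in-memory strings are assumptions for illustration; they hold only the three values the question shows for each file, since the rest is elided there.

```python
import csv
import io

# The three values shown for each file in the question.
CSV1 = "0.1924321564, 0.8937481241, 0.6080270062\n"
CSV2 = "0.1800000000, 0.7397439374, 0.3949274792\n"

def read_single_row(stream):
    """Read the first row of a csv stream as a list of floats."""
    row = next(csv.reader(stream, skipinitialspace=True))
    return [float(cell) for cell in row]

row1 = read_single_row(io.StringIO(CSV1))
row2 = read_single_row(io.StringIO(CSV2))
if len(row1) != len(row2):
    raise ValueError("the two rows must have the same number of columns")

# Subtract column by column, then add up the differences.
differences = [a - b for a, b in zip(row1, row2)]
total = sum(differences)
print(total)
```

For these three columns the total is roughly 0.3795358701, in line with the numpy result above.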

How to Perform Mathematical Operation on One Value of a CSV file?

I am dealing with a csv file that contains three columns and three rows containing numeric data. The csv data file simply looks like the following:
Colum1,Colum2,Colum3
1,2,3
1,2,3
1,2,3
My question is how to write Python code that takes a single value from one of the columns and performs a specific operation on it. For example, let's say I want to take the first value in 'Colum1' and subtract it from the sum of all the values in the column.
Here is my attempt:
import csv
f = open('columns.csv')
rows = csv.DictReader(f)
value_of_single_row = 0.0
for i in rows:
    value_of_single_Row += float(i) # trying to isolate a single value here!
print value_of_single_row - sum(float(r['Colum1']) for r in rows)
f.close()

Based on the code you provided, I suggest you take a look at the documentation to see the preferred approach for reading through a csv file. Take a look here:
How to use CsvReader
With that being said, you can modify the beginning of your code slightly to this:
import csv
with open('data.csv', 'r') as f:
    rows = csv.DictReader(f)
    for row in rows:
        # perform operation per row
        pass
From there you now have access to each row.
This should give you what you need to do proper row-by-row operations.
What I suggest you do is play around with printing out your rows to see what your data looks like. You will see that each row is output as a dictionary.
So if you are going through each row, you can simply do something like this:
for row in rows:
    row['Colum1'] # or row.get('Colum1')
    # to do some math to add everything in Colum1
    s += float(row['Colum1'])
So all of that will look like this:
import csv
s = 0
with open('data.csv', 'r') as f:
    rows = csv.DictReader(f)
    for row in rows:
        s += float(row['Colum1'])
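
To finish the calculation from the question, here is a minimal sketch under a couple of assumptions. A `csv.DictReader` can only be iterated once, so looping over `rows` and then calling `sum(...)` over `rows` again yields nothing the second time; the sketch therefore collects the column into a list first. The helper name `column_values` and the in-memory `SAMPLE` text are made up for illustration, and "subtract it from the sum" is read as sum minus first value.

```python
import csv
import io

# Contents of the csv file shown in the question.
SAMPLE = """Colum1,Colum2,Colum3
1,2,3
1,2,3
1,2,3
"""

def column_values(stream, name):
    """Return every value of the named column as a float."""
    return [float(row[name]) for row in csv.DictReader(stream)]

values = column_values(io.StringIO(SAMPLE), "Colum1")
first_value = values[0]
column_total = sum(values)

# Subtract the first value from the sum of the whole column.
result = column_total - first_value
print(result)
```

If you want the opposite sign (first value minus the sum, as in the original attempt), swap the two operands.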

You can do pretty much all of this with pandas:
import pandas as pd
Location = r'path/test.csv'
df = pd.read_csv(Location, names=['Colum1','Colum2','Colum3'])
df = df[1:] #Remove the headers since they're unnecessary
print(df)
df.loc[1, 'Colum1'] = int(df.loc[1, 'Colum1']) + 5
print(df)
You can write back to your csv using df.to_csv('File path', index=False, header=True). Having header=True will add the headers back in.
To do this more along the lines of what you have, you can do it like this:
import csv
Location = r'C:/Users/tnabrelsfo/Documents/Programs/Stack/test.csv'
data = []
with open(Location, 'r') as f:
    for line in f:
        data.append(line.replace('\n','').replace(' ','').split(','))
data = data[1:]
print(data)
data[1][1] = 5
print(data)
This reads in each row, cuts out the column names, and then lets you modify the values by index.
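
One thing to watch with the manual approach is that the cells are strings until you convert them. A rough sketch of reading the rows, converting them to integers, changing one cell by index and writing everything back with the header might look like this. It uses in-memory buffers instead of real files, and the choice of adding 5 to the cell is just an example.

```python
import csv
import io

# Contents of the csv file shown in the question.
SAMPLE = "Colum1,Colum2,Colum3\n1,2,3\n1,2,3\n1,2,3\n"

reader = csv.reader(io.StringIO(SAMPLE))
header = next(reader)  # keep the column names for writing back later
data = [[int(cell) for cell in row] for row in reader]

# Second data row, second column: 2 becomes 7.
data[1][1] = data[1][1] + 5

out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(header)
writer.writerows(data)
print(out.getvalue())
```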

So here is my simple solution using the pandas library. Suppose we have a sample.csv file:
import pandas as pd
df = pd.read_csv('sample.csv') # df is now a DataFrame
df['Colum1'] = df['Colum1'] - df['Colum1'].sum() # replace the column with each value minus the column sum
print(df)
df.to_csv('sample.csv', index=False) # save dataframe back to csv file
You can also use the map function to apply an operation to one column, for example:
import pandas as pd
df = pd.read_csv('sample.csv')
col_sum = df['Colum1'].sum() # sum of the first column
df['Colum1'] = df['Colum1'].map(lambda x: x - col_sum)

How can I combine two csv files into one, by adding one column to the end of the first one?

I simply need to add the column of the second CSV file to the first CSV file.
Example CSV file #1
Time Press RH Dewpt Alt
Value Value Value Value Value
For N number of rows.
Example CSV file #2
SmoothedTemperature
Value
I simply want to make it
Time Press RH Dewpt Alt SmoothedTemperature
Value Value Value Value Value Value
Also, one has headers and the other does not.
Here is sample code showing what I have so far; however, the output is the final row of file #1 repeated, with the full data set of file #2 next to it.
##specifying what they want to open
File = open(askopenfilename(), 'r')
##reading in the other file
Averaged = open('Moving_Average_Adjustment.csv','r')
##opening the new file via raw input to write to
filename = raw_input("Enter desired filename, EX: YYYYMMDD_SoundingNumber_Time.csv; must end in csv")
New_File = open(filename,'wb')
R = csv.reader(File, delimiter = ',')
## i feel the issue is here in my loop, i don't know how to print the first columns
## then also print the last column from the other CSV file on the end to make it mesh well
Write_New_File = csv.writer(New_File)
data = ["Time,Press,Dewpt,RH,Alt,AveragedTemp"]
Write_New_File.writerow(data)
for i, line in enumerate(R):
    if i <=(header_count + MovingAvg/2):
        continue
    Time,Press,Temp,Dewpt,RH,Ucmp,Vcmp,spd,Dir,Wcmp,Lon,Lat,Ele,Azi,Alt,Qp,Qt,Qrh,Qu,Qv,QdZ=line
for i, line1 in enumerate(Averaged):
    if i == 1:
        continue
    SmoothedTemperature = line1
    Calculated_Data = [Time,Press,Dewpt,RH,Alt,SmoothedTemperature]
    Write_New_File.writerow(Calculated_Data)

If you want to go down this path, pandas makes csv manipulation very easy. Say your first two sample tables are in files named test1.csv and test2.csv:
>>> import pandas as pd
>>> test1 = pd.read_csv("test1.csv")
>>> test2 = pd.read_csv("test2.csv")
>>> test3 = pd.concat([test1, test2], axis=1)
>>> test3
   Time  Press  RH  Dewpt  Alt  SmoothedTemperature
0     1      2   3      4    5                    6

[1 rows x 6 columns]
This new table can be saved to a .csv file with the DataFrame method to_csv.
If, as you mention, one of the files has no headers, you can specify this when reading the file:
>>> test2 = pd.read_csv('test2.csv', header=None)
and then change the header row manually in pandas.
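
If pandas is not an option, the same side-by-side join can be sketched with the csv module by reading both files and pairing their rows with `zip`. The function name `append_column`, the in-memory stand-ins for the two files and their placeholder values are all assumptions for illustration; the second file is assumed to have no header line.

```python
import csv
import io

# Hypothetical stand-ins for the two files; the values are placeholders.
FILE1 = "Time,Press,RH,Dewpt,Alt\n1,2,3,4,5\n10,20,30,40,50\n"
FILE2 = "6\n60\n"  # this file has no header line

def append_column(main_stream, extra_stream, extra_name):
    """Append the rows of a header-less csv to a csv that has a header."""
    main_rows = list(csv.reader(main_stream))
    extra_rows = list(csv.reader(extra_stream))
    header, body = main_rows[0], main_rows[1:]
    if len(body) != len(extra_rows):
        raise ValueError("the two files must have the same number of data rows")
    combined = [header + [extra_name]]
    for row, extra in zip(body, extra_rows):
        combined.append(row + extra)  # pair rows by position, not in nested loops
    return combined

combined = append_column(io.StringIO(FILE1), io.StringIO(FILE2), "SmoothedTemperature")
for row in combined:
    print(",".join(row))
```

Pairing the rows by position avoids repeating one row of the first file next to every row of the second. The resulting rows can be written out with `csv.writer(...).writerows(combined)`.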
