How to write to a CSV file inside a loop in Python

I've got the outputs I want to store in a CSV file, but I don't know how to write them to the file because the result is a NumPy array.
import glob
import cv2
import numpy as np
def find_mode(np_array):
    # return the most frequent value in the array
    vals, counts = np.unique(np_array, return_counts=True)
    index = np.argmax(counts)
    return vals[index]
folder = "C:/Users/ROG FLOW/Desktop/Untuk SIDANG TA/Sudah Aman/testbikincsv/folderdatacitra/*.jpg"
for file in glob.glob(folder):
    a = cv2.imread(file)
    rows = a.shape[0]
    cols = a.shape[1]
    middlex = cols/2
    middley = rows/2
    middle = [middlex, middley]
    # 20x20 pixel window around the image centre
    titikawalx = middlex - 10
    titikawaly = middley - 10
    titikakhirx = middlex + 10
    titikakhiry = middley + 10
    crop = a[int(titikawaly):int(titikakhiry), int(titikawalx):int(titikakhirx)]
    c = cv2.cvtColor(crop, cv2.COLOR_BGR2HSV)
    H, S, V = cv2.split(c)
    hsv_split = np.concatenate((H, S, V), axis=1)
    Modus_citra = find_mode(H)  # how to put this in csv
My output is Modus_citra, which is a np.uint8 value. I'm trying to put it in a CSV file, but I'm still confused about how to write it because the result is produced inside a loop.
Can someone show me how to write it to a CSV file? I appreciate any help.

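Run your loop and collect the results in a list,
e.g. mydata = [result1, result2, result3]
Then create a writer with csv.writer() and call its writerows(mydata) method to write the list as CSV rows. writerows() expects each item to be a row (an iterable of fields), so wrap single values, e.g. [[result1], [result2], [result3]].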
https://docs.python.org/3/library/csv.html#csv.csvwriter.writerows
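A minimal sketch of that approach, assuming the per-image arrays are already available as (name, array) pairs. The function name write_modes, the column names "file" and "mode_hue", and the output path are made up for illustration; find_mode is the helper from the question.

```python
import csv
import numpy as np

def find_mode(np_array):
    # most frequent value in the array (helper from the question)
    vals, counts = np.unique(np_array, return_counts=True)
    return vals[np.argmax(counts)]

def write_modes(named_arrays, csv_path):
    """Collect one [name, mode] row per array, then write all rows at once."""
    rows = []
    for name, arr in named_arrays:
        mode = find_mode(arr)
        # int() turns the np.uint8 scalar into a plain Python int
        rows.append([name, int(mode)])
    # newline="" is what the csv module docs recommend when opening the file
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["file", "mode_hue"])  # hypothetical header row
        writer.writerows(rows)
    return rows
```

In the question's loop, the rows.append(...) call would go where Modus_citra is computed, and the file would be written once after the loop finishes.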

You can save your NumPy arrays to CSV files using the savetxt() function. This function takes a filename and array as arguments and saves the array into CSV format. You must also specify the delimiter; this is the character used to separate each variable in the file, most commonly a comma. For example:
import numpy as np
my_array = np.array([1,2,3,4,5,6,7,8,9,10])
# savetxt() writes the file and returns None, so there is nothing to assign or print
np.savetxt('randomtext.csv', my_array, delimiter=',', fmt='%d')

Related

Is there a faster way to write python outputs back to excel using xlwings in Python?

I have an Excel file with columns A, B and C as inputs. I want to do a calculation in Python and then return the outputs back to Excel columns D and E. Is there a faster way than a for loop?
import xlwings as xw
import pandas as pd
def square(inputs):
    age = inputs['AGE']
    weight = inputs['WEIGHT']
    outputs = {}
    outputs['output_age_square'] = age*age
    outputs['output_weight_square'] = weight*weight
    return outputs
wb = xw.Book(r'C:\Users\TEST.xlsx') #connect to the daily file xlsm
sheet = wb.sheets['Sheet1']
end_row_num = sheet.range('A' + str(sheet.cells.last_cell.row)).end('up').row
df = sheet.range('A1'+':'+'C'+str(end_row_num)).options(pd.DataFrame, header=1, index=False).value #read all inputs
inputs = df.to_dict('records') #inputs is a list of dicts
outputs = [square(single_input) for single_input in inputs]
for i in range(len(inputs)):
    row = 2+i
    ###########Is there faster way to return back outputs to excel cells#######
    sheet.range('D'+str(row)).value = outputs[i]['output_age_square']
    sheet.range('E'+str(row)).value = outputs[i]['output_weight_square']
With xlwings (as with VBA), you have to assign whole arrays to the range, instead of looping through individual cells to make it fast. E.g., you can assign a DataFrame directly to the top left cell like so:
sheet.range('D1').value = df
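A minimal sketch of that idea applied to the question, assuming the input columns are named AGE and WEIGHT as in the question's code. The NAME column, the sample values, and the function names build_outputs and write_outputs are made up for illustration. The calculation runs on whole columns with pandas, and the result is written with a single range assignment; options(index=False) keeps the DataFrame index out of the sheet.

```python
import pandas as pd

def build_outputs(df):
    """Compute both output columns for all rows at once (no Python loop)."""
    return pd.DataFrame({
        "output_age_square": df["AGE"] ** 2,
        "output_weight_square": df["WEIGHT"] ** 2,
    })

def write_outputs(sheet, out_df):
    """Write the whole block starting at D1 in one assignment.

    `sheet` is assumed to be an xlwings Sheet such as wb.sheets['Sheet1'].
    """
    sheet.range("D1").options(index=False).value = out_df

# stand-in for the DataFrame read from columns A:C (values are invented)
df = pd.DataFrame({"NAME": ["a", "b"], "AGE": [30, 41], "WEIGHT": [70.5, 82.0]})
out_df = build_outputs(df)
```

With a real workbook open, write_outputs(sheet, out_df) would place the two headers in D1:E1 and the values below them.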

creating columns with continuous values on individual csv files

I have a large csv file which I have split into six individual files. I am using a 'for loop' to read each file and create a column in which the values ascend by one.
whole_file=['100Hz1-raw.csv','100Hz2-raw.csv','100Hz3-raw.csv','100Hz4-raw.csv','100Hz5-raw.csv','100Hz6-raw.csv']
first_file=True
for piece in whole_file:
    if not first_file:
        skip_row = [0] # if it is not the first csv file then skip the header row (row 0) of that file
    else:
        skip_row = []
    V_raw = pd.read_csv(piece)
    V_raw['centiseconds'] = np.arange(len(V_raw)) #label each centisecond
My output:
My desired output
Is there a clever way of doing what I intend?
Store the last value for centiseconds and count from there:
import numpy as np
import pandas as pd
whole_file=['100Hz1-raw.csv','100Hz2-raw.csv','100Hz3-raw.csv','100Hz4-raw.csv','100Hz5-raw.csv','100Hz6-raw.csv']
first_file=True
## create old_centiseconds variable
old_centiseconds = 0
for piece in whole_file:
    if not first_file:
        skip_row = [0] # if it is not the first csv file then skip the header row (row 0) of that file
    else:
        skip_row = []
    V_raw = pd.read_csv(piece)
    # add old_centiseconds onto what you had before
    V_raw['centiseconds'] = np.arange(len(V_raw)) + old_centiseconds #label each centisecond
    # update old_centiseconds
    old_centiseconds += len(V_raw)
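Another way to get the same continuous numbering is to concatenate the pieces first and then number the rows once. This is only a sketch: the in-memory pieces stand in for the six CSV files, and the column name V and the sample values are assumptions, not taken from the question.

```python
import io
import numpy as np
import pandas as pd

# hypothetical stand-ins for the CSV pieces; real code would pass file names
pieces = [
    io.StringIO("V\n0.1\n0.2\n"),
    io.StringIO("V\n0.3\n0.4\n0.5\n"),
]

# read every piece, then stack them into one DataFrame with a fresh 0..n-1 index
frames = [pd.read_csv(piece) for piece in pieces]
combined = pd.concat(frames, ignore_index=True)

# one counter over the combined rows, so it never restarts between pieces
combined["centiseconds"] = np.arange(len(combined))
```

This holds all pieces in memory at once, so the running-offset loop above may be the better fit if each piece has to be processed and saved separately.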
As I said in my comment, you may want to view the data as a numpy array, as this requires less memory. You can do this by opening the .csv files as numpy arrays and appending each one to an initially empty list. If you would like to join these numpy arrays together, you can use np.vstack. The following code should be able to do this:
import numpy as np
from numpy import genfromtxt
whole_file=['100Hz1-raw.csv','100Hz2-raw.csv','100Hz3-raw.csv','100Hz4-raw.csv','100Hz5-raw.csv','100Hz6-raw.csv']
whole_file_numpy_array = []
for file_name in whole_file:
    my_data = genfromtxt(file_name, delimiter=',')
    # append the loaded array, not the file name
    whole_file_numpy_array.append(my_data)
combined_numpy_array = np.vstack(whole_file_numpy_array)

export data to xls file format

I have a text file with data of 6000 records in this format
{"id":"1001","user":"AB1001","first_name":"David ","name":"Shai","amount":"100","email":"me#no.mail","phone":"9999444"}
{"id":"1002","user":"AB1002","first_name":"jone ","name":"Miraai","amount":"500","email":"some1#no.mail","phone":"98894004"}
I want to export all the data to an Excel file, as shown in the example below.
I would recommend reading in the text file, then converting to a dictionary with json, and using pandas to save a .csv file that can be opened with excel.
In the example below, I copied your text into a text file, called "myfile.txt", and I saved the data as "myfile2.csv".
import pandas as pd
import json
# read lines of text file
with open('myfile.txt') as f:
    lines = f.readlines()
# remove empty lines
lines2 = [line for line in lines if not(line == "\n")]
# convert to dictionaries
dicts = [json.loads(line) for line in lines2]
# save to .csv
pd.DataFrame(dicts).to_csv("myfile2.csv", index=False)
You can use VBA and a json-parser
Your two lines are not valid JSON as they stand. However, it is easy to convert them to valid JSON, as shown in the code below. Then it is a relatively simple matter to parse it and write it to a worksheet.
The code assumes there are no blank lines in your text file, but it is easy to fix if that is not the case.
I tested with your data on two separate lines in a Windows text file. (If the file does not come from Windows, you may have to change the replacement of the newline token with a comma, depending on what the generating system uses for newline.)
I used the JSON Converter by Tim Hall
'Set reference to Microsoft Scripting Runtime or
' use late binding
Option Explicit
Sub parseData()
    Dim JSON As Object
    Dim strJSON As String
    Dim FSO As FileSystemObject, TS As TextStream
    Dim I As Long, J As Long
    Dim vRes As Variant, v As Variant, O As Object
    Dim wsRes As Worksheet, rRes As Range
    Set FSO = New FileSystemObject
    Set TS = FSO.OpenTextFile("D:\Users\Ron\Desktop\New Text Document.txt", ForReading, False, TristateUseDefault)
    'Convert to valid JSON
    strJSON = "[" & TS.ReadAll & "]"
    strJSON = Replace(strJSON, vbLf, ",")
    Set JSON = parsejson(strJSON)
    ReDim vRes(0 To JSON.Count, 1 To JSON(1).Count)
    'Header row
    J = 0
    For Each v In JSON(1).Keys
        J = J + 1
        vRes(0, J) = v
    Next v
    'populate the data
    I = 0
    For Each O In JSON
        I = I + 1
        J = 0
        For Each v In O.Keys
            J = J + 1
            vRes(I, J) = O(v)
        Next v
    Next O
    'write to a worksheet
    Set wsRes = Worksheets("sheet6")
    Set rRes = wsRes.Cells(1, 1)
    Set rRes = rRes.Resize(UBound(vRes, 1) + 1, UBound(vRes, 2))
    Application.ScreenUpdating = False
    With rRes
        .EntireColumn.Clear
        .Value = vRes
        .Style = "Output"
        .EntireColumn.AutoFit
    End With
End Sub
Results from your posted data
Try using the pandas module in conjunction with the eval() function:
import pandas as pd
with open('textfile.txt', 'r') as f:
    data = f.readlines()
df = pd.DataFrame(data=[eval(i) for i in data])
df.to_excel('filename.xlsx', index=False)
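Since each line of the file is JSON, json.loads from the standard library can stand in for eval(), which would execute whatever code the text file happens to contain. A minimal sketch using only the standard library, assuming every record has the same keys as the first one; the function name json_lines_to_csv and the file paths are hypothetical, and the result is a .csv file that Excel can open rather than a native .xlsx file.

```python
import csv
import json

def json_lines_to_csv(src_path, dst_path):
    """Parse one JSON object per line and write the records as CSV rows."""
    with open(src_path, encoding="utf-8") as src:
        # skip blank lines; every remaining line must be a JSON object
        records = [json.loads(line) for line in src if line.strip()]
    if not records:
        return 0
    with open(dst_path, "w", newline="", encoding="utf-8") as dst:
        # column order follows the keys of the first record (an assumption)
        writer = csv.DictWriter(dst, fieldnames=list(records[0].keys()))
        writer.writeheader()
        writer.writerows(records)
    return len(records)
```

A call such as json_lines_to_csv('textfile.txt', 'filename.csv') returns the number of records written.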

How to write continuous outputs in a single txt file

I am working with multiple data files (File_1, File_2, .....). I want the desired outputs for each data file to be saved in the same txt file as row values of a new column.
I tried the following code for my first data file (File_1). The desired outputs (Av_Age_btwn_0_to_5, Av_Age_btwn_5_to_10) are stored as row values of a column in the output txt file (Result.txt). Now, I want these outputs to be stored as row values of a next column of the same txt file when I work with File_2. Then for File_3, in a similar manner, I want the outputs in the next column and so on.
import numpy as np
data=np.loadtxt('C:/Users/Hrihaan/Desktop/File_1.txt')
Age=data[:,0]
Age_btwn_0_to_5=Age[(Age<5) & (Age>0)]
Age_btwn_5_to_10=Age[(Age<10) & (Age>=5)]
Av_Age_btwn_0_to_5=np.mean(Age_btwn_0_to_5)
Av_Age_btwn_5_to_10=np.mean(Age_btwn_5_to_10)
np.savetxt('/Users/Hrihaan/Desktop/Result.txt', (Av_Age_btwn_0_to_5, Av_Age_btwn_5_to_10), delimiter=',')
Any help would be appreciated.
If I understand correctly, each of your files is a column, and you want to combine them into a matrix (one file per column).
Maybe something like this could work?
import numpy as np
# Simulate some dummy data
def simulate_data(n_files):
    for i in range(n_files):
        ages = np.random.randint(0,10,100)
        np.savetxt("/tmp/File_{}.txt".format(i),ages,fmt='%i')
# Your file processing
def process(age):
    age_btwn_0_to_5=age[(age<5) & (age>0)]
    age_btwn_5_to_10=age[(age<10) & (age>=5)]
    av_age_btwn_0_to_5=np.mean(age_btwn_0_to_5)
    av_age_btwn_5_to_10=np.mean(age_btwn_5_to_10)
    return (av_age_btwn_0_to_5, av_age_btwn_5_to_10)
n_files = 5
simulate_data(n_files)
results = []
for i in range(n_files):
    # load data
    data=np.loadtxt('/tmp/File_{}.txt'.format(i))
    # Process your file and extract your information
    data_processed = process(data)
    # Store the result
    results.append(data_processed)
results = np.asarray(results)
np.savetxt('/tmp/Result.txt',results.T,delimiter=',',fmt='%.3f')
In the end, you have something like that:
2.649,2.867,2.270,2.475,2.632
7.080,6.920,7.288,7.231,6.880
Is it what you're looking for?
import numpy as np
# some data
age = np.arange(10)
time = np.arange(10)
mean = np.arange(10)
output = np.array(list(zip(age,time,mean)))
np.savetxt('FooFile.txt', output, delimiter=',', fmt='%s')
# fmt='%s' writes the values as plain integers; without it, savetxt uses its default float format '%.18e'.
output:
0,0,0
1,1,1
2,2,2
3,3,3
4,4,4
5,5,5
6,6,6
7,7,7
8,8,8
9,9,9

Pandas ValueError: Shape of passed values

In the following code I iterate through a list of images and count the frequencies of a given number, in this case zeros and ones. I then write this out to a csv. This works fine when I write out the list of frequencies only, but when I try to add the filename then I get the error:
ValueError: Shape of passed values is (1, 2), indices imply (2, 2)
When I try to write out one list of frequencies (number of ones) and the filenames it works fine.
My code is as follows:
import os
from osgeo import gdal
import pandas as pd
import numpy as np
# Input directory to the .kea files
InDir = "inDirectory"
# Make a list of the files
files = [file for file in os.listdir(InDir) if file.endswith('.kea')]
# Create empty list to store the counts
ZeroValues = []
OneValues = []
# Iterate through each kea file and open it
for file in files:
    print('opening ' + file)
    # Open file
    ds = gdal.Open(os.path.join(InDir, file))
    # Specify the image band
    band = ds.GetRasterBand(1)
    # Read the pixel values as an array
    arr = band.ReadAsArray()
    # remove values that are not equal (!=) to 0 (no data)
    ZeroPixels = arr[arr==0]
    OnePixels = arr[arr==1]
    print('Number of 0 pixels = ' + str(len(ZeroPixels)))
    print('Number of 1 pixels = ' + str(len(OnePixels)))
    # Count the number of values in the array (length) and add to the list
    ZeroValues.append(len(ZeroPixels))
    OneValues.append(len(OnePixels))
    # Close file
    ds = None
# Pandas datagram and out to csv
out = pd.DataFrame(ZeroValues, OneValues, files)
# Write the pandas dataframe to a csv
out.to_csv("out.csv", header=False, index=files)
Pandas thinks you're trying to pass OneValues and files as positional index and columns arguments. See docs.
Try wrapping your fields in a dict:
import pandas as pd
ZeroValues = [2,3,4]
OneValues = [5,6,7]
files = ["A.kea","B.kea","C.kea"]
df = pd.DataFrame(dict(zero_vals=ZeroValues, one_vals=OneValues, fname=files))
Output:
   fname  one_vals  zero_vals
0  A.kea         5          2
1  B.kea         6          3
2  C.kea         7          4
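To go from that DataFrame to the CSV the question asks for, a sketch along these lines may help. The column names and their order are a choice made here for illustration, and the in-memory buffer stands in for a path such as "out.csv". Recent pandas versions keep the dict's insertion order for the columns, so the file name can be placed first.

```python
import io
import pandas as pd

ZeroValues = [2, 3, 4]
OneValues = [5, 6, 7]
files = ["A.kea", "B.kea", "C.kea"]

# one dict entry per column; all three lists must have the same length
df = pd.DataFrame({"fname": files, "zero_vals": ZeroValues, "one_vals": OneValues})

buffer = io.StringIO()  # stand-in for a real file path
# index=False leaves out the 0, 1, 2 row labels
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

Passing a file path instead of the buffer writes the same text to disk.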
