Sort a column in Worksheet using Python - python

In the program below I create a workbook containing a worksheet named sort,
with words in one column and numbers in another.
The .xlsx file is written successfully,
but I need the numbers sorted in descending order.
I don't know where to put the code for that.
Code
=====
import csv
import xlsxwriter
import re
workbook = xlsxwriter.Workbook('wordsandnumbers.xlsx')
worksheet = workbook.add_worksheet('sort')
with open('sort.csv') as f:
    reader = csv.reader(f)
    alist = list(reader)
worksheet.write(2,0,'words')
worksheet.write(2,1,'Numbers')
newlist = []
for values in alist:
    convstr = str(values)
    convstr = convstr.split(",")
    newlist.extend(convstr)
a=3
for i in range(3,10):
    newlist[a] = re.sub('[^a-zA-Z]','',newlist[a])
    worksheet.write(i,0,newlist[a].strip('['))
    a=a+1
    newlist[a] = re.sub('[^0-9]','',newlist[a])
    int(newlist[a])
    worksheet.write(i,1,newlist[a])
    a=a+1
workbook.close()
The output I'm getting in the .xlsx sheet is:
Needed output:
(The word in the same row as each number should move with it when the numbers are sorted.)

I would recommend loading your original csv as a dataframe and then sorting it by a particular column. I've provided a fully reproducible example below that illustrates this.
I make my own version of sort.csv for demonstration purposes, then read it in as a dataframe using pandas.read_csv, and then sort using pandas.DataFrame.sort_values.
import pandas as pd
sort = open('sort.csv', 'w+')
sort.write('May, 5227\n')
sort.write('June, 417\n')
sort.write('Jan, 4\n')
sort.write('Feb, 424\n')
sort.write('Dec, 36\n')
sort.write('Mar, 4981\n')
sort.write('Apr, 3460\n')
sort.close()
df = pd.read_csv('sort.csv', names = ['words', 'Numbers'])
df = df.sort_values(['Numbers'], ascending=[False])
writer = pd.ExcelWriter('wordsandnumbers.xlsx', engine='xlsxwriter')
df.to_excel(writer, index=False, startrow=2)
writer.close()  # writes the file; ExcelWriter.save() was removed in pandas 2.0
Outputted sort.csv:
Outputted wordsandnumbers.xlsx:

Once you get the data into a list, it's straightforward to sort it while keeping each word paired with its number. Use the built-in sort and give it a key, which is a function returning the value the list should be sorted by.
import csv
import xlsxwriter
import re
workbook = xlsxwriter.Workbook('wordsandnumbers.xlsx')
worksheet = workbook.add_worksheet('sort')
with open('./sort.csv') as f:
    reader = csv.reader(f)
    alist = list(reader)
worksheet.write(2,0,'words')
worksheet.write(2,1,'Numbers')
#Here convert the number to an integer
newerlist = [[x[0], int(x[1])] for x in alist[1:]]
print(newerlist)
#key is the function applied to the arguments to get the answer and lambda
#is just a 1 line way to write a function f(x) which returns x[1] (the number in the rows)
newerlist.sort(key = lambda x : x[1], reverse = True)
a=3
for i in range(3,9):
    for j in range(0,2):
        worksheet.write(i,j,str(newerlist[i-a][j]))
workbook.close()
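The key-based sort described in this answer can be tried without any files. The following is a minimal sketch, assuming a few hypothetical sample rows in place of the parsed sort.csv contents; it shows that each word stays attached to its number because whole rows are reordered.

```python
# Hypothetical sample rows standing in for the parsed sort.csv contents.
rows = [["May", 5227], ["June", 417], ["Jan", 4], ["Feb", 424]]

# sorted() returns a new list; key picks the number in each row,
# and reverse=True puts the largest number first.
by_number_desc = sorted(rows, key=lambda row: row[1], reverse=True)

for word, number in by_number_desc:
    print(word, number)
```

Unlike list.sort(), sorted() leaves the original list untouched, which can help if the unsorted order is still needed later.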

Related

How to convert list elements in column

import csv
import pandas as pd
imp=[]
feature1 = []
issued = []
used = []
with open("lmutil_lmstat.txt", "r") as input:
    f=open("lmutil_lmstat.txt","r")
    found = False
    for x in f.readlines():
        if ("Users" in x):
            found = True
            feature1.append(x.split(" ")[2][:-1])
            issued.append(x.split(" ")[6][:])
            used.append(x.split(" ")[12][:])
            #print(x)
data = pd.DataFrame({'Feature_Name': [feature1], 'Licesence_Issued': [issued], 'Licesence_Used':[used]})
data_frame = pd.DataFrame(data)
with open('license_summery1.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(data_frame)
This is my code. I am extracting specific fields from a file and storing them in lists. When I create the data frame, its shape is (1, 3), but I want a table with shape (57, 3).
Please check the code above and give suggestions. Any help will be appreciated.
I have actually just answered the same question here: how can change numpy array to single value?
You can use explode on your DataFrame. When every cell holds a list of values, it expands each list into one row per element.
data_frame = data_frame.apply(pd.Series.explode)
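To see what explode does here, the following is a small sketch under stated assumptions: the three short lists are hypothetical stand-ins for the values parsed from the license file, and the column names are copied from the question. It also shows that passing the lists directly, without the extra brackets, avoids the single-row frame in the first place.

```python
import pandas as pd

# Hypothetical values standing in for the lists parsed from the license file.
feature1 = ["featA", "featB", "featC"]
issued = ["10", "5", "2"]
used = ["3", "0", "1"]

# Wrapping each list in another list puts the whole list into one cell: shape (1, 3).
nested = pd.DataFrame({"Feature_Name": [feature1],
                       "Licesence_Issued": [issued],
                       "Licesence_Used": [used]})
print(nested.shape)

# explode turns each list element into its own row; reset_index renumbers the rows.
flat = nested.apply(pd.Series.explode).reset_index(drop=True)
print(flat.shape)

# Passing the lists directly builds the same kind of table without the detour.
direct = pd.DataFrame({"Feature_Name": feature1,
                       "Licesence_Issued": issued,
                       "Licesence_Used": used})
print(direct.shape)
```

This relies on all three lists having the same length, as they do when every matching line contributes one value to each list.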

How do you order row values from an Excel file in Python with a dictionary?

Let's say I have an excel sheet with 2 rows:
0.296178 0.434362 0.033033 0.758968
0.559323 0.455792 0.780323 0.770423
How could I go about putting each rows values in order from highest to lowest with a dictionary?
For example, the dictionary input for row 1 should look like: {1:[4,2,1,3]} since the 4th value in row 1 is highest and the 3rd value in row 1 is lowest.
(not indexing from 0 due to it being an Excel file)
First, you need a module that can read Excel files. I recommend pandas, as it is widely used (install it with 'pip install pandas' if you haven't already).
After that, use this code:
import pandas as pd
path = r'C:\Users\tanuj\Desktop\temp.xlsx' # replace it with your file path
df = pd.read_excel(path, header = None)
df.head() # to visualise the file
#And then, use this simple logic to get the required dictionary
d = {}
for x in range(df.shape[0]):
    temp = {}
    values = list(df.iloc[x])
    for y in range(len(values)):
        temp[df.loc[x][y]] = y+1
    l = []
    for t in sorted(temp):
        l.append(temp[t])
    l.reverse()
    d[x+1] = l
print(d)
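The argsort function in numpy will do the trick. Consider this code:
import numpy as np
import pandas as pd
df = pd.read_csv('excel.csv', delimiter=',', header=None)
result = {}  # not named 'dict', which would shadow the built-in type
# number the rows from 1, as the question asks
for i, row in enumerate(df.values, start=1):
    arg = np.argsort(row)             # 0-based positions, smallest value first
    iarg = [int(x) + 1 for x in arg]  # convert to 1-based positions
    iarg.reverse()                    # largest value first
    result[i] = iarg
print(result)
It reads the input data from a CSV file and gives you the desired output.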
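The same argsort idea can be checked against the two rows from the question without reading any file. This is a minimal sketch, assuming the rows are hard-coded as a NumPy array; negating each row makes argsort order the positions from largest to smallest value directly.

```python
import numpy as np

# The two rows from the question, hard-coded so no Excel or CSV file is needed.
rows = np.array([
    [0.296178, 0.434362, 0.033033, 0.758968],
    [0.559323, 0.455792, 0.780323, 0.770423],
])

ranking = {}
for row_number, row in enumerate(rows, start=1):
    # argsort on the negated row lists positions from largest to smallest value;
    # adding 1 converts 0-based positions to the 1-based numbering the question uses.
    ranking[row_number] = (np.argsort(-row) + 1).tolist()

print(ranking)
```

For these rows the result is {1: [4, 2, 1, 3], 2: [3, 4, 1, 2]}, which matches the dictionary format requested in the question.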
After reading your question, I think you want to read the row values from an Excel sheet, store them in a dictionary, and then sort the values in each row from highest to lowest.
First you have to read the Excel file that stores the values. For that you can use the
openpyxl module:
from openpyxl import load_workbook
wb = load_workbook("values.xlsx")
ws = wb['Sheet1']
for row in ws.iter_rows():
    print([cell.value for cell in row])
The code above prints one list per row of the Excel file.
In your case:
[0.296178, 0.434362, 0.033033, 0.758968]
[0.559323, 0.455792, 0.780323, 0.770423]
Now store the rows in a dictionary and sort them:
from openpyxl import load_workbook
wb = load_workbook("values.xlsx")
ws = wb['Sheet1']
value_dict={}
n=1
#extracting value from excel
for row in ws.iter_rows():
    values=[cell.value for cell in row]
    value_dict[n]=values
    n=n+1
print(value_dict)
#Sorting Values
for keys,values in value_dict.items():
    values.sort(reverse=True)
    print("Row "+ str(keys),values)
The code above performs the task you described.
For each row in a df you can compare each element to a sorted version of that row and get the indexes.
import pandas as pd
a = [0.296178, 0.434362, 0.033033, 0.758968]
b = [0.559323, 0.455792, 0.780323, 0.770423]
df = pd.DataFrame(columns = ['1', '2'], data = zip(a, b)).T
def compare_sort(x):
    x = list(x)
    y = sorted(x.copy())[::-1]
    return([x.index(y[count]) +1 for count, _ in enumerate(y) ])
print(df.apply(compare_sort, axis=1)) # apply func. to each row of df
# 1    [4, 2, 1, 3]
# 2    [3, 4, 1, 2]
# Get data by row name
df = df.apply(compare_sort, axis=1)
print(df['1'])
# [4, 2, 1, 3]
Useful links.
get-indices-of-items
reverse-a-list

How do I add values from a csv file to a list?

x,y
6.1101,17.592
5.5277,9.1302
8.5186,13.662
7.0032,11.854
5.8598,6.8233
8.3829,11.886
7.4764,4.3483
8.5781,12
6.4862,6.5987
5.0546,3.8166
5.7107,3.2522
14.164,15.505
How do I put each value for x in a list and the same for y values ?
I'm basically trying to create a plot.
You could do:
import csv
from collections import defaultdict
columns = defaultdict(list)
with open("my.csv") as fin:
    dr = csv.DictReader(fin)
    for row in dr:
        for key, val in row.items():
            columns[key].append(float(val))
print(columns["x"])
print(columns["y"])
Gives:
[6.1101, 5.5277, 8.5186, 7.0032, 5.8598, 8.3829, 7.4764, 8.5781, 6.4862, 5.0546, 5.7107, 14.164]
[17.592, 9.1302, 13.662, 11.854, 6.8233, 11.886, 4.3483, 12.0, 6.5987, 3.8166, 3.2522, 15.505]
Obviously this is assuming that the contents will be numeric data that needs to be converted to float (as the question says that you are trying to create a plot). If there were non-numeric values, this would raise a ValueError, so if this might be the case then you would need to test for this or handle the exception.
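One way to handle the exception mentioned above is sketched below. It assumes a small in-memory CSV with one deliberately bad cell ("n/a") in place of my.csv, and it skips any row that fails to convert so that the x and y lists stay the same length; the skipped list is a hypothetical helper for reporting which lines were dropped.

```python
import csv
import io
from collections import defaultdict

# Hypothetical in-memory CSV with one bad cell, standing in for my.csv.
sample = io.StringIO("x,y\n6.1101,17.592\n5.5277,n/a\n8.5186,13.662\n")

columns = defaultdict(list)
skipped = []  # line numbers of rows that could not be converted

reader = csv.DictReader(sample)
for row in reader:
    try:
        # Convert the whole row first so x and y always stay the same length.
        converted = {key: float(val) for key, val in row.items()}
    except ValueError:
        skipped.append(reader.line_num)
        continue
    for key, val in converted.items():
        columns[key].append(val)

print(columns["x"])
print(columns["y"])
print(skipped)
```

Whether to skip a bad row, substitute a default, or stop with an error depends on the data; skipping is only one reasonable choice for plotting.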
Use pandas:
import pandas as pd
df=pd.read_csv('myfile.csv', sep=',')  # the first row (x,y) becomes the header; header=None would read it as data
Use pandas, an example below.
import pandas as pd
df = pd.read_csv('data.csv')
x = df.x.tolist()
y = df.y.tolist()
x and y variables will contain values from column x and y in your CSV as lists respectively.
you can use pandas to do this:
import pandas as pd
df = pd.read_csv('cord.csv', sep=',')
x = df['x'].tolist()
y = df['y'].tolist()
output:
[6.1101, 5.5277, 8.5186, 7.0032, 5.8598, 8.3829, 7.4764, 8.5781, 6.4862, 5.0546, 5.7107, 14.164]
[17.592, 9.1302, 13.662, 11.854, 6.8233, 11.886, 4.3483, 12.0, 6.5987, 3.8166, 3.2522, 15.505]
import pandas as pd
df = pd.read_csv('data.csv', header = None)
print(type(df.columns))
print(type(df.index))
Then, once you know the default types of the column and index objects, you can convert them to lists with
df.columns.tolist()
df.index.tolist()
print(type(df.columns.tolist()))
print(type(df.index.tolist()))
An easy way is to use the csv module's functionality.
First create a CSV reader function:
import csv
def csv_dict_reader(file, has_header=False, skip_comment_char=None, **kwargs):
    """
    Reads CSV file into memory
    :param file: (str) path to csv file to read
    :param has_header: (bool) skip first line (use together with fieldnames; without fieldnames, DictReader already consumes the header row)
    :param skip_comment_char: (str) optional character which, if found on first row, will skip row
    :param delimiter: (char) CSV delimiter char
    :param fieldnames: (list) CSV field names for dictionary creation
    :param kwargs:
    :return: csv object that can be iterated
    """
    with open(file) as fp:
        csv_data = csv.DictReader(fp, **kwargs)
        # Skip header
        if has_header:
            next(csv_data)
        fieldnames = kwargs.get('fieldnames')
        for row in csv_data:
            # Skip commented out entries
            if fieldnames is not None:
                if skip_comment_char is not None:
                    if not row[fieldnames[0]].startswith(skip_comment_char):
                        yield row
                else:
                    yield row
            else:
                # list(row)[0] is key from row, works with Python 3.7+
                if skip_comment_char is not None:
                    if not row[list(row)[0]].startswith(skip_comment_char):
                        yield row
                else:
                    yield row
The function above is a generator, which is useful if your CSV file is very large, because the file doesn't have to fit in memory all at once.
Then use that function to read your data and iterate over the values:
fieldnames = ('x', 'y')
data = csv_dict_reader('/path/to/my/file', has_header=True, fieldnames=fieldnames)
x_list = []
y_list = []
for row in data:
    x_list.append(float(row['x']))  # DictReader yields strings; convert them for plotting
    y_list.append(float(row['y']))
Btw, perhaps using two separate lists isn't the most optimized way.
You could remove both lists and simply use row['x'] directly.

Convert this list of lists in CSV

I am a novice in Python, and after several searches about how to convert my list of lists into a CSV file, I didn't find how to correct my issue.
Here is my code :
#!C:\Python27\read_and_convert_txt.py
import csv
if __name__ == '__main__':
    with open('c:/python27/mytxt.txt',"r") as t:
        lines = t.readlines()
        list = [ line.split() for line in lines ]
    with open('c:/python27/myfile.csv','w') as f:
        writer = csv.writer(f)
        for sublist in list:
            writer.writerow(sublist)
The first open() creates a list of lists from the text file, like
list = [["hello","world"], ["my","name","is","bob"], .... , ["good","morning"]]
The second part then writes the list of lists to a CSV file, but everything ends up in the first column.
What I need is to write this list of lists to a CSV file like this:
Column 1, Column 2, Column 3, Column 4 ......
hello world
my name is bob
good morning
To summarize, this is what I want to see when I open the CSV file in a text editor:
hello;world
my;name;is;bob
good;morning
Simply use pandas dataframe
import pandas as pd
df = pd.DataFrame(list)
df.to_csv('filename.csv')
By default, missing values will be filled in with None. To replace None, use
df.fillna('', inplace=True)
So your final code should look like this:
import pandas as pd
df = pd.DataFrame(list)
df.fillna('', inplace=True)
df.to_csv('filename.csv')
Cheers!!!
Note: You should not use list as a variable name, because it shadows Python's built-in list type.
I do not know if this is what you want:
import csv
rows = [["hello","world"], ["my","name","is","bob"] , ["good","morning"]]
with open("d:/test.csv","w") as f:
    writer = csv.writer(f, delimiter=";")
    writer.writerows(rows)
This gives the following output file:
hello;world
my;name;is;bob
good;morning
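If every line should have the same number of fields, as in the padded table the pandas answer produces, the rows can be padded with empty strings before writing. The following is a sketch using only the standard library; io.StringIO stands in for a real file, and the semicolon delimiter is taken from the desired output in the question.

```python
import csv
import io

rows = [["hello", "world"], ["my", "name", "is", "bob"], ["good", "morning"]]

# Pad every row with empty strings up to the length of the longest row.
width = max(len(row) for row in rows)
padded = [row + [""] * (width - len(row)) for row in rows]

# io.StringIO stands in for a real file so the sketch runs anywhere.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=";", lineterminator="\n")
writer.writerows(padded)

print(buffer.getvalue())
```

Spreadsheet programs then show four columns for every row, with the missing cells left blank.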

How to Perform Mathematical Operation on One Value of a CSV file?

I am dealing with a csv file that contains three columns and three rows containing numeric data. The csv data file simply looks like the following:
Colum1,Colum2,Colum3
1,2,3
1,2,3
1,2,3
My question is how to write Python code that takes a single value from one of the columns and performs a specific operation on it. For example, let's say I want to take the first value in 'Colum1' and subtract it from the sum of all the values in that column.
Here is my attempt:
import csv
f = open('columns.csv')
rows = csv.DictReader(f)
value_of_single_row = 0.0
for i in rows:
    value_of_single_Row += float(i) # trying to isolate a single value here!
print value_of_single_row - sum(float(r['Colum1']) for r in rows)
f.close()
Based on the code you provided, I suggest you take a look at the doc to see the preferred approach on how to read through a csv file. Take a look here:
How to use CsvReader
With that being said, you can modify the beginning of your code slightly to this:
import csv
with open('data.csv', newline='') as f: # text mode; the original 'rb' only works with Python 2's csv module
    rows = csv.DictReader(f)
    for row in rows:
        pass # perform operation per row
From there you now have access to each row.
This should give you what you need to do proper row-by-row operations.
What I suggest you do is play around with printing out your rows to see what your data looks like. You will see that each row being outputted is a dictionary.
So if you were going through each row, you can just simply do something like this:
for row in rows:
    row['Colum1'] # or row.get('Colum1')
    # to do some math to add everything in Colum1
    s += float(row['Colum1'])
So all of that will look like this:
import csv
s = 0
with open('data.csv', newline='') as f:
    rows = csv.DictReader(f)
    for row in rows:
        s += float(row['Colum1'])
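The loop above only accumulates the column sum. To finish the operation the question describes (the sum of 'Colum1' minus its first value), one option is to read the column into a list first, since a DictReader can only be iterated once. The sketch below assumes an in-memory copy of the sample file from the question instead of a file on disk.

```python
import csv
import io

# In-memory copy of the sample file from the question, so the sketch is self-contained.
sample = io.StringIO("Colum1,Colum2,Colum3\n1,2,3\n1,2,3\n1,2,3\n")

# Read the column once into a list; a DictReader is exhausted after one pass.
values = [float(row["Colum1"]) for row in csv.DictReader(sample)]

first_value = values[0]
result = sum(values) - first_value
print(result)
```

Keeping the values in a list makes it easy to pick any single entry by index, not just the first one.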
You can do pretty much all of this with pandas:
import pandas as pd
Location = r'path/test.csv'
df = pd.read_csv(Location, names=['Colum1','Colum2','Colum3'])
df = df[1:].copy() #Remove the header row, which was read in as data because names= was given
print(df)
df.loc[1,'Colum1']=int(df.loc[1,'Colum1'])+5 #assign through .loc; df.xs(1)[...] = ... would modify a temporary copy
print(df)
You can write back to your CSV using df.to_csv('File path', index=False, header=True). Passing header=True adds the headers back in.
To do this more along the lines of what you have, you can do it like this:
Location = r'C:/Users/tnabrelsfo/Documents/Programs/Stack/test.csv'
data = []
with open(Location, 'r') as f:
    for line in f:
        data.append(line.replace('\n','').replace(' ','').split(','))
data = data[1:]
print(data)
data[1][1] = 5
print(data)
It reads in each row, cuts out the column names, and then lets you modify the values by index.
Here is my simple solution using the pandas library. Suppose we have a sample.csv file:
import pandas as pd
df = pd.read_csv('sample.csv') # df is now a DataFrame
df['Colum1'] = df['Colum1'] - df['Colum1'].sum() # replace the column with each value minus the column sum
print(df)
df.to_csv('sample.csv', index=False) # save dataframe back to csv file
You can also use the map function to apply an operation to one column, for example:
import pandas as pd
df = pd.read_csv('sample.csv')
col_sum = df['Colum1'].sum() # sum of the first column
df['Colum1'] = df['Colum1'].map(lambda x: x - col_sum)
