csv file to excel, resulting in messy table - python

I want to convert my CSV file to Excel, but the first line of the CSV gets read as the header.
I first created a CSV from the lists below, then used pandas to convert it to Excel:
import csv
import pandas as pd
id=["id",1,2,3,4,5]
name=["name","Salma","Ahmad","Manar","Mustapha","Zainab"]
age=["age",14,12,15,13,10]
#this is how i created the csv file
Csv='path/csvfile.csv'
open_csv=open(Csv, 'w')
outfile=csv.writer(open_csv)
#each list is written as one ROW, so the whole "id" list ends up as the header line
outfile.writerows([id]+[name]+[age])
open_csv.close()
#Excel file
Excel='path/Excelfile.xlsx'
csv_file=pd.read_csv(Csv)
csv_file.to_excel(Excel)
This is what I get from this code
"Results"
I want the Id title to be in the same column as name and age

I would suggest this instead:
import pandas as pd
df = pd.DataFrame({
"id": [1,2,3,4,5],
"name":["Salma","Ahmad","Manar","Mustapha","Zainab"],
"age":[14,12,15,13,10]
})
df.to_excel("excel_file.xlsx", index=False)
This way the DataFrame is easier to create and to read: each list becomes a column, and the dictionary keys become the column headers.
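If you would rather keep the CSV step, the lists from the question can be transposed with zip() so that each list becomes a column instead of a row. This is only a minimal sketch: the in-memory buffer stands in for 'path/csvfile.csv', and the list is renamed from id to ids (my choice, to avoid shadowing the built-in).

```python
import csv
import io

import pandas as pd

ids = ["id", 1, 2, 3, 4, 5]
names = ["name", "Salma", "Ahmad", "Manar", "Mustapha", "Zainab"]
ages = ["age", 14, 12, 15, 13, 10]

# In-memory stand-in for the CSV file on disk (assumption for this sketch).
buffer = io.StringIO()
writer = csv.writer(buffer)

# zip() pairs the n-th items of each list, so every CSV row becomes
# (id, name, age) and the first row holds the three titles.
writer.writerows(zip(ids, names, ages))

buffer.seek(0)
df = pd.read_csv(buffer)
print(df)
```

From here, df.to_excel("Excelfile.xlsx", index=False) should give one column per list, provided an Excel engine such as openpyxl is installed.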

Related

how to insert data from list into excel in python

How can I insert data from a list into an Excel file in Python?
For example, I exported this data from a log file:
data= ["101","am1","123450","2015-01-01 11:19:00","test1 test1".....]
["102","am2","123451","2015-01-01 11:20:00","test2 test3".....]
["103","am3","123452","2015-01-01 11:21:00","test3 test3".....]
Output result:
[1]: https://i.stack.imgur.com/7uTOE.png
The module pandas has a DataFrame.to_excel() function that would do that.
import pandas as pd
data= [["101","am1","123450","2015-01-01 11:19:00","test1 test1"],
["102","am2","123451","2015-01-01 11:20:00","test2 test3"],
["103","am3","123452","2015-01-01 11:21:00","test3 test3"]]
df = pd.DataFrame(data)
df.to_excel('my_data.xlsx')
That should do it.
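By default the sheet gets numeric column headers (0, 1, 2, ...) and an extra index column. A small sketch of one way to control that follows; the column names are hypothetical, since the question does not say what the fields mean, and save_as_excel is a helper name invented for this example.

```python
import pandas as pd

# Hypothetical column names: the question does not name the fields.
COLUMNS = ["id", "name", "code", "timestamp", "message"]

data = [
    ["101", "am1", "123450", "2015-01-01 11:19:00", "test1 test1"],
    ["102", "am2", "123451", "2015-01-01 11:20:00", "test2 test3"],
    ["103", "am3", "123452", "2015-01-01 11:21:00", "test3 test3"],
]

# Each inner list becomes one row; COLUMNS supplies the header row.
df = pd.DataFrame(data, columns=COLUMNS)


def save_as_excel(frame, path):
    """Write the frame without the extra index column (needs an Excel engine)."""
    frame.to_excel(path, index=False)
```

Calling save_as_excel(df, "my_data.xlsx") would then write the three rows under the chosen headers.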

Python pandas xlsx/ csv

I want to convert an xlsx file to CSV, and it works, but after the conversion Python adds ".0" to the values...
Sample xlsx :
Name, Age
Mark, 20
CSV after conversion :
Name, Age
Mark, 20.0 <- add ".0"
What could the problem be?
#importing pandas as pd
import pandas as pd
# Read and store content
# of an excel file
read_file = pd.read_excel ("EXPORT.xlsx")
# Write the dataframe object
# into csv file
read_file.to_csv ("data.csv",
index = True,
header=True,
encoding='utf-8-sig')
# read csv file and convert
# into a dataframe object
df = pd.DataFrame(pd.read_csv("data.csv"))
# show the dataframe
df
I tried to reproduce this behavior, but in my case pd.read_excel() automatically assigned the int64 dtype to the Age column of the Excel sheet shown.
However, this case is easily solved with the df.astype() function, which converts data types, e.g. in your case from float to integer.
#importing pandas as pd
import pandas as pd
# Read and store content
# of an excel file
read_file = pd.read_excel ("EXPORT.xlsx")
# transform data type of column "Age" to int64
read_file = read_file.astype({'Age': 'int64'})
# Write the dataframe object
# into csv file
read_file.to_csv ("data.csv",
index = True,
header=True,
encoding='utf-8-sig')
# read csv file and convert
# into a dataframe object
df = pd.DataFrame(pd.read_csv("data.csv"))
# show the dataframe
print(df)
I added the float_format option, and that seems to work:
read_file.to_csv ("basf.csv",
index = None,
header=True,
encoding='utf-8-sig',
decimal=',',
float_format='%d'
)
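One common reason a whole-number column is read as float is a blank cell: pandas stores missing values as NaN, which forces the column to float64, and astype('int64') then fails on the NaN. The sketch below assumes that is the cause here (the question does not confirm it) and uses the nullable "Int64" dtype; the second row is invented to show the blank cell.

```python
import pandas as pd

# Invented sample: one Age is missing, so pandas holds the column as float64.
df = pd.DataFrame({"Name": ["Mark", "Anna"], "Age": [20.0, None]})

# "Int64" (capital I) is the nullable integer dtype; it keeps the missing
# value as <NA> instead of raising the way astype("int64") would.
df["Age"] = df["Age"].astype("Int64")

# With no path given, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Unlike float_format='%d', this only touches the one column, so genuine decimals elsewhere in the sheet are left alone.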

How can I convert the column names(task, asset,name,owner) as row and store it in a new .csv file using Python?

In Python, how can I take the column names (task, asset, name, owner), turn them into rows, and store them in a new .csv file?
Data Set (sample_change.csv) :
task asset name owner
JJJ01 61869 assetdev hoskot,john (100000)
JJJ02 87390 assetprod hope, ricky (100235)
JJJ10 28403 assetprod shaw, adam (199345)
Below is the code I started to write, but I couldn't think of an approach.
import pandas as pd
import csv
#reading csv file and making the data frame
dataframe = pd.read_csv(r"C:\AWSGEEKS\dataset\sample_change.csv")
columns = list(dataframe.head(0))
print(columns)
Output :
columns
task
asset
name
owner
To write the names as a single row:
pd.DataFrame(columns=dataframe.columns).to_csv('header.csv', index=False)
To write them as a single column:
pd.DataFrame(dataframe.columns).to_csv('header.csv', index=False, header=['Name'])
Alternatively, build a one-column DataFrame of the names, which can then be written with to_csv:
df = pd.DataFrame(dataframe.columns, columns=['column names'])
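To see what the two layouts look like without touching the disk, here is a small sketch. It assumes a comma-separated version of the sample (the question shows it whitespace-aligned), uses only the first data row with a simplified owner value, and calls to_csv() without a path so that it returns the text.

```python
import io

import pandas as pd

# Assumed comma-separated stand-in for sample_change.csv (owner simplified).
sample = io.StringIO("task,asset,name,owner\nJJJ01,61869,assetdev,hoskot\n")
dataframe = pd.read_csv(sample)

# Single row: an empty frame that only carries the column labels.
as_row = pd.DataFrame(columns=dataframe.columns).to_csv(index=False)

# Single column: one name per line under a header of your choosing.
as_column = pd.DataFrame({"columns": list(dataframe.columns)}).to_csv(index=False)

print(as_row)
print(as_column)
```

Passing a file name such as 'header.csv' to to_csv() writes the same text to disk.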

Overwrite specific columns after modification pandas python

I have a CSV file in which I made some modifications to two columns. My question is the following: how can I write out my CSV file with the updated columns? My code is the following:
import pandas as pd
import csv
data = pd.read_csv("testdataset.csv")
data = data.join(pd.get_dummies(data["ship_from"]))
data = data.drop("ship_from", axis=1)
data['market_name'] = data['market_name'].map(lambda x: str(x)[39:-1])
data = data.join(pd.get_dummies(data["market_name"]))
data = data.drop("market_name", axis=1)
Thank you in advance!
You can write to a file with pandas.DataFrame.to_csv
data.to_csv('your_file.csv')
However, you can view it without writing with
print(data.to_csv())
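As a rough illustration of what ends up in the file, the sketch below runs the same get_dummies/join/drop steps on a tiny invented frame (the values and the price column are made up; only the ship_from column name comes from the question).

```python
import pandas as pd

# Invented stand-in for testdataset.csv.
data = pd.DataFrame({"ship_from": ["US", "DE", "US"], "price": [1.0, 2.5, 3.0]})

# Replace the text column with one indicator column per distinct value.
data = data.join(pd.get_dummies(data["ship_from"]))
data = data.drop("ship_from", axis=1)

# index=False leaves out the row numbers, so only the real columns are written.
csv_text = data.to_csv(index=False)
print(csv_text)
```

Without index=False, to_csv() also writes the row index as an unnamed first column, which then shows up as "Unnamed: 0" when the file is read back.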

Python Pandas performing operation on each row of CSV file

I have a 1-million-line CSV file. I want to call a lookup function on the second field of each row (row[1]) and append its result as a new column in the same CSV (if possible).
What I want is something like this:
for each row in dataframe:
    string = row[1]
    result = lookupFunction(string)
    row.append(result)
I know I could do it using Python's csv library by opening my CSV, reading each row, doing my operation, and writing the results to a new CSV.
This is my code using Python's csv library:
import csv
with open(rawfile, 'r') as f:
    with open(newFile, 'a') as csvfile:
        csvwriter = csv.writer(csvfile, delimiter=' ')
        for row in csv.reader(f):
            # do operation on row here, then write it out
            csvwriter.writerow(row)
However I really want to do it with Pandas because it would be something new to me.
This is what my data looks like
77,#oshkosh # tannersville pa,,PA,US
82,#osithesakcom ca,,CA,US
88,#osp open records or,,OR,US
89,#ospbco tel ord in,,IN,US
98,#ospwmnwithn return in,,IN,US
99,#ospwmnwithn tel ord in,,IN,US
100,#osram sylvania inc ma,,MA,US
106,#osteria giotto montclair nj,,NJ,US
Any help and guidance will be appreciated. Thanks.
Here is a simple example that adds two columns of your CSV file together and stores the result in a new column:
import pandas as pd
df = pd.read_csv("yourpath/yourfile.csv")
df['newcol'] = df['col1'] + df['col2']
Create a DataFrame and write it to CSV:
import pandas as pd
df = pd.DataFrame(dict(A=[1, 2], B=[3, 4]))
df.to_csv('test_add_column.csv')
Read the CSV into dfromcsv:
dfromcsv = pd.read_csv('test_add_column.csv', index_col=0)
Create the new column:
dfromcsv['C'] = dfromcsv['A'] * dfromcsv['B']
dfromcsv
Write the CSV:
dfromcsv.to_csv('test_add_column.csv')
Read it again:
dfromcsv2 = pd.read_csv('test_add_column.csv', index_col=0)
dfromcsv2
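For the lookup case in the question, a sketch along these lines may be closer to what is wanted. It assumes the file has no header row, as in the sample data, and lookup_function is a hypothetical stand-in that just returns the length of the text.

```python
import io

import pandas as pd


def lookup_function(text):
    # Hypothetical stand-in for the real lookup; replace with your own logic.
    return len(text)


# Two rows from the question's sample, held in memory instead of a file.
raw = io.StringIO(
    "77,#oshkosh # tannersville pa,,PA,US\n"
    "82,#osithesakcom ca,,CA,US\n"
)

# header=None: the data has no title row, so columns are labelled 0, 1, 2, ...
df = pd.read_csv(raw, header=None)

# Run the lookup on every value of column 1 and store it as a new column.
df["result"] = df[1].map(lookup_function)

# header=False keeps the output in the same header-less layout as the input.
csv_text = df.to_csv(index=False, header=False)
print(csv_text)
```

If a million rows turn out to be too much to hold in memory at once, pd.read_csv() also accepts a chunksize argument, which lets the same map() call be applied piece by piece.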
