col1, col2, geometry
11.54000000,0.00000000,"{"type":"Polygon","coordinates":[[[-61.3115751786311,-33.83968838375797],[-61.29737019968823,-33.83207774370677],[-61.29443049860791,-33.83592770721248],[-61.29241347742871,-33.83489393774538],[-61.28994584513501,-33.83806650089736],[-61.292499308117186,-33.83938539699006],[-61.28958106470898,-33.8431993873636],[-61.29307859612687,-33.84495487100211],[-61.295256567865046,-33.846135537383866],[-61.296388484054326,-33.84676149889543],[-61.296747927196776,-33.84651421268175],[-61.297498943449426,-33.84670133707654],[-61.297992472179686,-33.847120134589964],[-61.299741220055196,-33.84901812154847],[-61.3012164422457,-33.85018089588664],[-61.3015892874819,-33.850566250375365],[-61.30284190607861,-33.85079121660985],[-61.30496105223345,-33.848193766906206],[-61.306084952130036,-33.84682375029292],[-61.30707604410075,-33.845532812572294],[-61.30672627175046,-33.84527169005647],[-61.306290670206494,-33.845188781884744],[-61.304604048903514,-33.847304098561025],[-61.30309763921784,-33.84654473836309],[-61.30013213880613,-33.84478736144466],[-61.30110629620797,-33.8431690707163],[-61.303046037678854,-33.844170576767105],[-61.30433047221653,-33.84266156764314],[-61.30484242472771,-33.842899106713375],[-61.30696068650711,-33.844104878773436],[-61.306418212892446,-33.84505221083753],[-61.307163201216696,-33.845464893960255],[-61.30760172622554,-33.84490909256552],[-61.307932962646014,-33.844513681420494],[-61.309176116985405,-33.84280834206188],[-61.30596211112515,-33.841126948963954],[-61.3056475423994,-33.841449215098756],[-61.30526859890979,-33.841557611902374],[-61.30483601097522,-33.84149669494795],[-61.30448925534122,-33.84120408616046],[-61.30410688411086,-33.840609953572034],[-61.30400151682434,-33.839925243738094],[-61.30240379835875,-33.83889223688216],[-61.30188418287129,-33.838444480832685],[-61.301130848179525,-33.83943255499186],[-61.30078636095504,-33.83996223583909],[-61.30059265818967,-33.84016469670277],[-61.30048478527255,-33.840438447848506],[-61.300252198180424,-33.84026774340676],[-61.29876711207748,-33.839489883020924],[-61.29799408649143,-33.840597902688785],[-61.297669258508,-33.84103160870988],[-61.297566592962134,-33.84112444052047],[-61.29748538503245,-33.841083604060834],[-61.297140578061956,-33.84134946797752],[-61.29709617977233,-33.84160419097128],[-61.297170540239335,-33.84168254110631],[-61.297341460506956,-33.84179653572337],[-61.297243418161194,-33.84197105818567],[-61.29699517169225,-33.84200300239938],[-61.29680176950715,-33.84179064473802],[-61.29691703393983,-33.8416707218475],[-61.297053755769845,-33.841604265738546],[-61.29707920124143,-33.84154875978832],[-61.29709391784669,-33.84147543150246],[-61.29711262215961,-33.84133768608576],[-61.296951411710374,-33.84119216012805],[-61.297262269660294,-33.84089514360839],[-61.297626491077864,-33.84051497848962],[-61.29865532547658,-33.83935363544152],[-61.30027710358755,-33.84011486145675],[-61.30046658230606,-33.83996490243917],[-61.30063460268783,-33.83979712050095],[-61.300992098665965,-33.8393813535522],[-61.301799802937595,-33.83832425565103],[-61.30135527704997,-33.837671541923235],[-61.30082030025984,-33.83731962483044],[-61.299512855628244,-33.83689640801839],[-61.29879550338594,-33.8363083288346],[-61.29831419490918,-33.835559835856905],[-61.298360098160686,-33.83408067231082],[-61.29976541168753,-33.83467181800819],[-61.30104200723692,-33.83586895614681],[-61.30133434017162,-33.83606352507277],[-61.30153415160492,-33.836339043812224],[-61.30164813329583,-33.83657891551336],[-61.30124575062752,-33.83743146168004],[-61.30195917352424,-33.83831965157767],[-61.30196183786503,-33.83843401993221],[-61.30250094586367,-33.83890484694379],[-61.304002690127376,-33.83984352469762],[-61.30473149692381,-33.8397514189025],[-61.3054487998093,-33.839941491549894],[-61.30582354557356,-33.84016574092716],[-61.30604808932503,-33.84046128014441],[-61.306143888278996,-33.840801374736316],[-61.30598219492593,-33.841088001849094],[-61.30757239940571,-33.841967156609876],[-61.30920555104759,-33.84277500140921],[-61.3115751786311,-33.83968838375797],[-61.3115751786311,-33.83968838375797]]]}"
How do I read a csv with syntax like above?
I am doing:
import pandas as pd
df = pd.read_csv('file.csv')
However, read_csv gets confused with the , within "{"type":"Polygon","coordinates": I want it to ignore the , within the quotes.
Your csv file contains a MultiIndex, which is causing your read and split issues.
I have tried multiple methods to read your file correctly. The best method that I have found so far is using the Python engine with an advanced separator in the read_csv function.
import pandas as pd
# these are for viewing the output
pd.set_option('display.max_columns', 30)
pd.set_option('display.max_rows', 100)
pd.set_option('display.width', 120)
# The separator matches the format of the string that you provided.
# I'm sure that it can be modified to be more efficient.
df = pd.read_csv('test.csv', skiprows=1, sep='(\d{1,2}.\d{1,8}),(\d{1,2}.\d{1,8}),("{"type":.*)',engine="python")
# some cleanup
df = df.drop(df.columns[0], axis=1)
# I had to save the processed file
df.to_csv('test_01.csv')
# read in the new file
df = pd.read_csv('test_01.csv', header=None, index_col=0)
print(df.to_string(index=False))
11.54 0.0 "{"type":"Polygon","coordinates":[[[-61.3115751786311,-33.83968838375797],[-61.29737019968823,-33.83207774370677],[-61.29443049860791,-33.83592770721248],[-61.29241347742871,-33.83489393774538],[-61.28994584513501,-33.83806650089736],[-61.292499308117186,-33.83938539699006],[-61.28958106470898,-33.8431993873636],[-61.29307859612687,-33.84495487100211],[-61.295256567865046,-33.846135537383866],[-61.296388484054326,-33.84676149889543],[-61.296747927196776,-33.84651421268175],[-61.297498943449426,-33.84670133707654],[-61.297992472179686,-33.847120134589964],[-61.299741220055196,-33.84901812154847],[-61.3012164422457,-33.85018089588664],[-61.3015892874819,-33.850566250375365],[-61.30284190607861,-33.85079121660985],[-61.30496105223345,-33.848193766906206],[-61.306084952130036,-33.84682375029292],[-61.30707604410075,-33.845532812572294],[-61.30672627175046,-33.84527169005647],[-61.306290670206494,-33.845188781884744],[-61.304604048903514,-33.847304098561025],[-61.30309763921784,-33.84654473836309],[-61.30013213880613,-33.84478736144466],[-61.30110629620797,-33.8431690707163],[-61.303046037678854,-33.844170576767105],[-61.30433047221653,-33.84266156764314],[-61.30484242472771,-33.842899106713375],[-61.30696068650711,-33.844104878773436],[-61.306418212892446,-33.84505221083753],[-61.307163201216696,-33.845464893960255],[-61.30760172622554,-33.84490909256552],[-61.307932962646014,-33.844513681420494],[-61.309176116985405,-33.84280834206188],[-61.30596211112515,-33.841126948963954],[-61.3056475423994,-33.841449215098756],[-61.30526859890979,-33.841557611902374],[-61.30483601097522,-33.84149669494795],[-61.30448925534122,-33.84120408616046],[-61.30410688411086,-33.840609953572034],[-61.30400151682434,-33.839925243738094],[-61.30240379835875,-33.83889223688216],[-61.30188418287129,-33.838444480832685],[-61.301130848179525,-33.83943255499186],[-61.30078636095504,-33.83996223583909],[-61.30059265818967,-33.84016469670277],[-61.30048478527255,-33.840438447848506],[-61.300252198180424,-33.84026774340676],[-61.29876711207748,-33.839489883020924],[-61.29799408649143,-33.840597902688785],[-61.297669258508,-33.84103160870988],[-61.297566592962134,-33.84112444052047],[-61.29748538503245,-33.841083604060834],[-61.297140578061956,-33.84134946797752],[-61.29709617977233,-33.84160419097128],[-61.297170540239335,-33.84168254110631],[-61.297341460506956,-33.84179653572337],[-61.297243418161194,-33.84197105818567],[-61.29699517169225,-33.84200300239938],[-61.29680176950715,-33.84179064473802],[-61.29691703393983,-33.8416707218475],[-61.297053755769845,-33.841604265738546],[-61.29707920124143,-33.84154875978832],[-61.29709391784669,-33.84147543150246],[-61.29711262215961,-33.84133768608576],[-61.296951411710374,-33.84119216012805],[-61.297262269660294,-33.84089514360839],[-61.297626491077864,-33.84051497848962],[-61.29865532547658,-33.83935363544152],[-61.30027710358755,-33.84011486145675],[-61.30046658230606,-33.83996490243917],[-61.30063460268783,-33.83979712050095],[-61.300992098665965,-33.8393813535522],[-61.301799802937595,-33.83832425565103],[-61.30135527704997,-33.837671541923235],[-61.30082030025984,-33.83731962483044],[-61.299512855628244,-33.83689640801839],[-61.29879550338594,-33.8363083288346],[-61.29831419490918,-33.835559835856905],[-61.298360098160686,-33.83408067231082],[-61.29976541168753,-33.83467181800819],[-61.30104200723692,-33.83586895614681],[-61.30133434017162,-33.83606352507277],[-61.30153415160492,-33.836339043812224],[-61.30164813329583,-33.83657891551336],[-61.30124575062752,-33.83743146168004],[-61.30195917352424,-33.83831965157767],[-61.30196183786503,-33.83843401993221],[-61.30250094586367,-33.83890484694379],[-61.304002690127376,-33.83984352469762],[-61.30473149692381,-33.8397514189025],[-61.3054487998093,-33.839941491549894],[-61.30582354557356,-33.84016574092716],[-61.30604808932503,-33.84046128014441],[-61.306143888278996,-33.840801374736316],[-61.30598219492593,-33.841088001849094],[-61.30757239940571,-33.841967156609876],[-61.30920555104759,-33.84277500140921],[-61.3115751786311,-33.83968838375797],[-61.3115751786311,-33.83968838375797]]]}"
Try this:
pd.read_csv('file.csv',quotechar='"',skipinitialspace=True)
In Python, I'm writing my Pandas Dataframe to a csv file and want to change the decimal delimiter to a comma (,). Like this:
results.to_csv('D:/Data/Kaeashi/BigData/ProcessMining/Voorbeelden/Voorbeeld/CaseEventsCel.csv', sep=';', decimal=',')
But the decimal delimiter in the csv file still is a .
Why? What do I do wrong?
If the decimal parameter doesn't work, maybe it's because the type of the column is object. (check the dtype value in the last line when you do df[column_name])
That can happen if some rows have values that couldn't be parsed as numbers.
You can force the column to change type:
Change data type of columns in Pandas.
But that can make you lose non numerical data in that column.
This functionality wasn't added until 0.16.0
Added decimal option in to_csv to provide formatting for non-‘.’ decimal separators (GH781)
Upgrade pandas to something more recent and it will work. The code below uses the 10 minute tutorial and pandas version 0.18.1
>>> import pandas as pd
>>> import numpy as np
>>> dates = pd.date_range('20130101', periods=6)
>>> df = pd.DataFrame(np.random.randn(6,4), index=dates, columns=list('ABCD'))
>>> df
A B C D
2013-01-01 -0.157833 1.719554 0.564592 -0.228870
2013-01-02 -0.316600 1.545763 -0.206499 0.793412
2013-01-03 1.905803 1.172803 0.744010 1.563306
2013-01-04 -0.142676 -0.362548 -0.554799 -0.086404
2013-01-05 1.708246 -0.505940 -1.135422 0.810446
2013-01-06 -0.150899 0.794215 -0.628903 0.598574
>>> df.to_csv("test.csv", sep=';', decimal=',')
This creates a "test.csv" file that looks like this:
;A;B;C;D
2013-01-01;-0,157833276159;1,71955439009;0,564592278787;-0,228870244247
2013-01-02;-0,316599953358;1,54576303958;-0,206499307398;0,793411528039
2013-01-03;1,90580284184;1,17280324924;0,744010110291;1,56330623177
2013-01-04;-0,142676406494;-0,36254842687;-0,554799190671;-0,0864039782679
2013-01-05;1,70824597265;-0,50594004498;-1,13542154086;0,810446051841
2013-01-06;-0,150899136973;0,794214730009;-0,628902891897;0,598573645748
In the case when data is an object, and not a plain float type, for example python decimal.Decimal(10.12). First, change a type and then write to CSV file:
import pandas as pd
from decimal import Decimal
data_frame = pd.DataFrame(data={'col1': [1.1, 2.2], 'col2': [Decimal(3.3), Decimal(4.4)]})
data_frame.to_csv('report_decimal_dot.csv', sep=';', decimal=',', float_format='%.2f')
data_frame = data_frame.applymap(lambda x: float(x) if isinstance(x, Decimal) else x)
data_frame.to_csv('report_decimal_comma.csv', sep=';', decimal=',', float_format='%.2f')
Somehow i don't get this to work either. I always just end up using the following script to rectify it. It's dirty but it works for my ends:
for col in df.columns:
try:
df[col] = df[col].apply(lambda x: float(x.replace('.','').replace(',','.')))
except:
pass
EDIT: misread the question, you might use the same tactic the other way around by changing all your floats to strings :). Then again, you should probably just figure out why it's not working. Due post it if you get it to work.
This example suppose to work (as it works for me):
import pandas as pd
import numpy as np
s = pd.Series(np.random.randn(10))
with open('Data/out.csv', 'w') as f:
s.to_csv(f, index=True, header=True, decimal=',', sep=';', float_format='%.3f')
out.csv:
;0 0;0,091 1;-0,009 2;-1,427 3;0,022 4;-1,270
5;-1,134 6;-0,965 7;-1,298 8;-0,854 9;0,150
I don't see exactly why your code doesn't work, but anyway, try to use the above example to your needs.