There is an extra id column in dataFrame read from csv [duplicate] - python

I am trying to save a csv to a folder after making some edits to the file.
Every time I use pd.to_csv('C:/Path of file.csv') the csv file has a separate column of indexes. I want to avoid printing the index to csv.
I tried:
pd.read_csv('C:/Path to file to edit.csv', index_col = False)
And to save the file...
pd.to_csv('C:/Path to save edited file.csv', index_col = False)
However, I still got the unwanted index column. How can I avoid this when I save my files?

Use index=False.
df.to_csv('your.csv', index=False)
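To see the difference, here is a minimal sketch. The sample DataFrame and its column names are made up for illustration; calling to_csv without a path returns the CSV text instead of writing a file, which makes the output easy to inspect. Note that index_col is a read_csv parameter; the to_csv parameter is index.

```python
import pandas as pd

# Hypothetical sample data, only for illustration.
df = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})

# With no path argument, to_csv returns the CSV text as a string.
with_index = df.to_csv()                # default index=True
without_index = df.to_csv(index=False)  # index column omitted

print(with_index)     # header line starts with an empty field for the index
print(without_index)  # header line is just "name,value"
```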

There are two ways to keep the index out of the csv file.
As others have stated, you can pass index=False when saving your DataFrame:
df.to_csv('file_name.csv', index=False)
Or you can save the DataFrame with its index and, when reading the file back, drop the resulting 'Unnamed: 0' column, which holds the previous index:
df.to_csv('file_name.csv')
df_new = pd.read_csv('file_name.csv').drop(columns=['Unnamed: 0'])
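A small round trip shows where the extra column comes from. This is only a sketch: the sample data is invented and an in-memory buffer stands in for the file on disk.

```python
import io

import pandas as pd

# Hypothetical sample data, only for illustration.
df = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})

# Save with the default index=True; the CSV text stands in for a file.
csv_text = df.to_csv()

# Reading it back without index_col turns the old index into a regular column.
df_new = pd.read_csv(io.StringIO(csv_text))
print(list(df_new.columns))

# Drop that column by its exact name (capital U, colon, space).
df_clean = df_new.drop(columns=["Unnamed: 0"])
print(list(df_clean.columns))
```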

If the file already contains a saved index column, read it with that column as the index:
import pandas as pd
df = pd.read_csv('file.csv', index_col=0)
and then save it without the index:
df.to_csv('file.csv', index=False)

As others have stated, if you don't want to save the index column in the first place, you can use df.to_csv('processed.csv', index=False)
However, since the data you work with usually has some sort of index of its own, say a 'timestamp' column, I would keep the index and load the data using it.
So, to save the indexed data, first set the index (set_index returns a new DataFrame, so assign the result) and then save the DataFrame:
df = df.set_index('timestamp')
df.to_csv('processed.csv')
Afterwards, you can either read the data with the index:
df = pd.read_csv('processed.csv', index_col='timestamp')
or read the data, and then set the index:
df = pd.read_csv('filename.csv')
df = df.set_index('column_name')
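Here is a sketch of that round trip with a named index. The 'timestamp' and 'value' columns and their contents are assumptions for illustration, and a StringIO buffer is used in place of 'processed.csv'.

```python
import io

import pandas as pd

# Hypothetical sample data, only for illustration.
df = pd.DataFrame({
    "timestamp": ["2020-01-01", "2020-01-02"],
    "value": [1.5, 2.5],
})

# set_index returns a new DataFrame, so the result has to be assigned.
df = df.set_index("timestamp")

# Write to an in-memory buffer; the index is saved as a named column.
buffer = io.StringIO()
df.to_csv(buffer)
buffer.seek(0)

# Read it back and restore the same column as the index.
restored = pd.read_csv(buffer, index_col="timestamp")
print(restored)
```

Because the index has a name, the saved file has a proper 'timestamp' header and no 'Unnamed: 0' column appears.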

Another solution, if you want to keep this column as the index:
pd.read_csv('filename.csv', index_col='Unnamed: 0')

If you want to spell out the format explicitly, use:
dataframe_prediction.to_csv('filename.csv', sep=',', encoding='utf-8', index=False)
This writes a csv file with ',' as the separator between columns and utf-8 encoding (both are already the pandas defaults).
In addition, the numerical index won't appear.

How to avoid the 'Unnamed: 0' column so that I only get the remaining data when I save the dataframe as a new csv file

I have a dataframe that I saved as a new csv file, but when I read this new csv file I get an 'Unnamed: 0' column containing the row indexes. I need to avoid this column.
I even tried deleting this column and saving the dataframe to a new csv file. The column was dropped in the code, but when I saved the dataframe as a new csv file I got the same 'Unnamed: 0' column in the new file as well.
When writing to csv, use index=False to avoid including the index.
df.to_csv("data.csv", index=False)
When reading the csv file, you can use index_col to specify the column to use as an index:
df = pd.read_csv("data.csv", index_col=0) # Use 1st column as index
You can also use usecols to specify the columns to read (cols_to_include is a list of column names or positions):
df = pd.read_csv("data.csv", usecols=cols_to_include)
See pandas I/O for more info.
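As a rough illustration of usecols, assuming a file that was saved with its index and therefore starts with an unnamed column (the CSV text and column names below are made up):

```python
import io

import pandas as pd

# Hypothetical CSV content that was saved with the index included.
csv_text = ",name,value\n0,a,1\n1,b,2\n"

# Only the listed columns are read, so the unnamed index column is skipped.
cols_to_include = ["name", "value"]
df = pd.read_csv(io.StringIO(csv_text), usecols=cols_to_include)
print(df)
```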

Delete first column and then take them as a index pandas

I have word2vec vectors that I saved to a txt file with Gensim's save_word2vec_format and then read with pandas (picture below). How do I delete the first row and make it the index?
My txt file: https://drive.google.com/file/d/1O206N93hPSmvMjwc0W5ATyqQMdMwhRlF/view?usp=sharing
Try this.
To use the column headers as the index:
_X_T.index = _X_T.columns
To use the first row as the index:
_X_T.index = _X_T.iloc[0]
Both assignments only work when the new labels have the same length as the number of rows.

Save the row:
new_index = df.iloc[0]
Drop it to avoid a length mismatch:
df.drop(df.index[0], inplace=True)
And set it:
df.set_index(new_index, inplace=True)
You may get a SettingWithCopyWarning, but that's the most elegant solution I could come up with.
If you want to use the headers (and not the first row) as the index, do:
df.index = df.columns
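An alternative is to handle this while reading the file rather than afterwards. The sketch below assumes the file is in the usual word2vec text format (a first line with the vocabulary size and vector size, then one line per word with space-separated values); the sample text is invented, and the linked file may differ.

```python
import csv
import io

import pandas as pd

# Hypothetical stand-in for the txt file: 3 words, 4 dimensions.
sample = (
    "3 4\n"
    "the 0.1 0.2 0.3 0.4\n"
    "cat 0.5 0.6 0.7 0.8\n"
    "sat 0.9 1.0 1.1 1.2\n"
)

vectors = pd.read_csv(
    io.StringIO(sample),
    sep=" ",
    skiprows=1,              # skip the "vocab_size vector_size" line
    header=None,             # the remaining lines are data, not headers
    index_col=0,             # the first column (the word) becomes the index
    quoting=csv.QUOTE_NONE,  # treat quote characters in words literally
)
vectors.index.name = "word"
print(vectors)
```

With a real file, the StringIO object would be replaced by the path to the txt file.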

Creating a new pandas Dataframe from CSV file with no header

I am trying to create a new dataframe from csv:
frame = DataFrame(data=pd.read_csv(path))
The result is correct except that the first line becomes the column names.
So I add columns to the dataframe:
columns = ['person-id','time-stamp','loc-id']
frame = DataFrame(data=pd.read_csv(path),columns=columns)
Then it goes wrong: the dataframe is all NaN.
This confuses me. Can anyone tell me what is going on?
You don't need the DataFrame constructor, because read_csv already returns a DataFrame (in older pandas versions, squeeze=True returned a Series for single-column data):
frame = pd.read_csv(path)
You need to tell read_csv() that your input has no column headers; by the time you give DataFrame the column names, it's too late. The constructor only selects columns with those names from the data it is given, and since none of them exist, every value is NaN. Try this:
columns = ['person-id','time-stamp','loc-id']
frame = pd.read_csv(path, names=columns)
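Both behaviours can be reproduced with a few lines. This is a sketch with made-up rows; the StringIO buffer stands in for path.

```python
import io

import pandas as pd

# Hypothetical header-less CSV content, only for illustration.
csv_text = "1,2020-01-01,10\n2,2020-01-02,20\n"
columns = ["person-id", "time-stamp", "loc-id"]

# names= tells read_csv there is no header row and supplies the labels.
frame = pd.read_csv(io.StringIO(csv_text), names=columns)
print(frame)

# For contrast: without names=, the first row is consumed as the header,
# and the constructor then looks up columns that do not exist.
wrong = pd.DataFrame(pd.read_csv(io.StringIO(csv_text)), columns=columns)
print(wrong)  # one row, all NaN
```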
