I am working with a CSV file in PyCharm and want to delete the automatically generated index column. When I print the DataFrame, however, the terminal shows "None". All the answers from other users indicate that the reset_index method should work.
If I just say "df = df.reset_index(drop=True)" it does not delete the column, either.
import pandas as pd
df = pd.read_csv("music.csv")
df['id'] = df.index + 1
cols = list(df.columns.values)
df = df[[cols[-1]]+cols[:3]]
df = df.reset_index(drop=True, inplace=True)
print(df)
I agree with #It_is_Chris. Also,
this is incorrect, because reset_index returns None when inplace=True:
df = df.reset_index(drop=True, inplace=True)
It should be like this:
df.reset_index(drop=True, inplace=True)
or
df = df.reset_index(drop=True)
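A minimal sketch of the difference, using a small made-up DataFrame in place of the asker's music.csv (the column name and index values are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical data standing in for music.csv
df = pd.DataFrame({"title": ["a", "b", "c"]}, index=[10, 20, 30])

# With inplace=True the method modifies df itself and returns None
result = df.reset_index(drop=True, inplace=True)
print(result)          # None
print(list(df.index))  # [0, 1, 2]

# Without inplace, a new DataFrame is returned and must be assigned
df2 = pd.DataFrame({"title": ["a", "b", "c"]}, index=[10, 20, 30])
df2 = df2.reset_index(drop=True)
print(list(df2.index))  # [0, 1, 2]
```

Note that reset_index(drop=True) only replaces the index with a fresh 0..n-1 range; a DataFrame always has some index, so it still appears when the frame is printed.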
Since you said you're trying to "delete the automatically-generated index column", I can think of two solutions.
First solution:
Use an existing column of your dataset as the index. Let's say your dataset has already been indexed/numbered; then you could do something like this:
#assuming your first column in the dataset is your index column which has the index number of zero
df = pd.read_csv("yourfile.csv", index_col=0)
#you won't see the automatically-generated index column anymore
df.head()
Second solution:
You could delete it in the final csv:
#To export your df to a csv without the automatically-generated index column
df.to_csv("yourfile.csv", index=False)
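Both solutions can be tried without touching any file. The sketch below assumes a tiny made-up DataFrame (the column names "id" and "title" are illustrative) and writes the CSV to a string instead of "yourfile.csv":

```python
import io
import pandas as pd

# Hypothetical data; the column names are assumptions for illustration
df = pd.DataFrame({"id": [1, 2], "title": ["a", "b"]})

# Default: the index is written as an unnamed first column
with_index = df.to_csv()
# index=False: only the real columns are written
without_index = df.to_csv(index=False)

print(with_index.splitlines()[0])     # ,id,title
print(without_index.splitlines()[0])  # id,title

# Reading back with index_col=0 uses the first column as the index
reloaded = pd.read_csv(io.StringIO(with_index), index_col=0)

# To hide the index only when printing, to_string(index=False) can be used
print(df.to_string(index=False))
```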
I was trying to add a new column to my dataset, but when I did, the column only had 1 index.
Is there a way to put one value in all indexes of a column?
import pandas as pd
df = pd.read_json('file_1.json', lines=True)
df2 = pd.read_json('file_2.json', lines=True)
df3 = pd.concat([df,df2])
df3 = df.loc[:, ['renderedContent']]
görüş_column = ['Milet İttifakı']
df3['Siyasi Yönelim'] = görüş_column
As I understand it, this could be a possible solution.
You have these lines of code:
df3 = pd.concat([df,df2])
df3 = df.loc[:, ['renderedContent']]
You can change the first one to:
df3 = pd.concat([df,df2],axis=1) ## axis=1 means second dataframe will add to columns, default value is axis=0 which adds to the rows
Second point is,
df3 = df3.loc[:, ['renderedContent']]
I think you meant to write this instead of df3 = df.loc[:, ['renderedContent']], which overwrites the concatenated result with a slice of df alone.
Hope it will solve your problem.
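As a separate point from the answer above, and only as a sketch under assumptions: assigning a plain scalar instead of a one-element list lets pandas broadcast the value to every row. The two small frames below are hypothetical stand-ins for the JSON files, and the row-wise concat with ignore_index=True is an illustrative choice, not the answerer's suggestion.

```python
import pandas as pd

# Hypothetical stand-ins for file_1.json and file_2.json
df = pd.DataFrame({"renderedContent": ["x", "y"]})
df2 = pd.DataFrame({"renderedContent": ["z"]})

# Stack the rows and keep only the column of interest
df3 = pd.concat([df, df2], ignore_index=True)
df3 = df3.loc[:, ["renderedContent"]]

# A scalar is broadcast to every row; a one-element list
# only matches a frame that has exactly one row
df3["Siyasi Yönelim"] = "Milet İttifakı"
print(df3)
```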
Using df.drop() I removed the "ID" column from the df, and now I want to restore that column.
df.drop('ID', axis=1, inplace=True)
df
# shows me df without ID column
What method should I use?
I found that all you need to do is rerun the original command that imports the df :)
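If reloading the data is inconvenient, one alternative (a sketch, not what the poster did) is to remove the column with pop, which returns it, and put it back later with insert. The sample data here is made up:

```python
import pandas as pd

# Hypothetical data with an "ID" column
df = pd.DataFrame({"ID": [1, 2, 3], "name": ["a", "b", "c"]})

# pop removes the column from df and returns it as a Series
id_column = df.pop("ID")
print(list(df.columns))  # ['name']

# Later, put the column back at its original position
df.insert(0, "ID", id_column)
print(list(df.columns))  # ['ID', 'name']
```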
See the attached screenshot. I want to delete all the rows which contain entries in the 'Unnamed' column.
I know that the column can be removed with data.drop(data.columns[27], axis=1, inplace=True), but that won't delete the entire rows with it.
import pandas as pd
import numpy as np
data = pd.read_csv('/home/syed/ML-Notebook/FL-P1/DATASET_FRAUDE.csv',
engine='python',
encoding=('latin1'),
parse_dates=['FECHA_SINIESTRO','FECHA_INI_VIGENCIA','FECHA_FIN_VIGENCIA','FECHA_DENUNCIO'])
#data.drop(data.columns[27], axis=1, inplace=True)
print(data.info())
df = df[df['Unnamed: 27'].astype(str).map(len) >0]
df
Drop the column:
df = df.loc[:, ~df.columns.str.contains('^Unnamed')]
To delete rows matching a condition you can do:
df = df.drop(df[df.column_name == 'Unnamed'].index)
However, this question should be helpful: Deleting DataFrame row in Pandas based on column value
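Putting the two steps together for the case described in the question, here is a minimal sketch that assumes the stray entries live in a column named 'Unnamed: 27' and that empty cells were read as NaN. The "CLAIM" column and its values are made up. Checking isna() is used here because converting to str turns NaN into the text 'nan', so a length test passes for every row.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one row has a stray entry in the unnamed column
data = pd.DataFrame({
    "CLAIM": [1, 2, 3],
    "Unnamed: 27": [np.nan, "stray text", np.nan],
})

# Keep only the rows where the unnamed column is empty (NaN)
data = data[data["Unnamed: 27"].isna()]

# Then drop the now-empty unnamed column(s)
data = data.loc[:, ~data.columns.str.contains("^Unnamed")]
print(data)
```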
I want to get the row number of each row in the dataframe and store it in a new column called Id. Please advise me on how to code it.
Current dataframe:
expected outcome with new Id column:
If the first column is the index, use DataFrame.insert, subtracting 1 if necessary:
df.insert(0, 'Id', df.index - 1)
If you need a general solution that adds a counter column regardless of the index values:
df.insert(0, 'Id', np.arange(len(df)))
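A small self-contained sketch of the general variant, using a made-up frame whose index is not numeric, to show that the counter does not depend on the index values:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-numeric index
df = pd.DataFrame({"a": [34, 23, 37, 38]}, index=["w", "x", "y", "z"])

# Insert a 0-based row counter as the first column
df.insert(0, "Id", np.arange(len(df)))
print(df)
```

Use np.arange(1, len(df) + 1) instead if the numbering should start at 1.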
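import pandas as pd
df = pd.DataFrame({'a':[34, 23,37,38],'b':[1,2,3,4]})
# use column 'a' as the index (the column name must be passed as a string)
df.set_index('a', inplace=True)
# copy the index values into a new 'Id' column
Id = list(df.index)
df['Id'] = Id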
First, reset the index; it will be added as a new column named index. Then rename that column to the desired name.
import pandas as pd
df = (df.reset_index()
        .rename(columns={'index': 'ID'})
)
How can I set my indexes from "Unnamed" to the first row of my dataframe in Python?
import pandas as pd
df = pd.read_excel('example.xls','Day_Report',index_col=None ,skip_footer=31 ,index=False)
df = df.dropna(how='all',axis=1)
df = df.dropna(how='all')
df = df.drop(2)
To set the column names (assuming that's what you mean by "indexes") to the first row, you can use
df.columns = df.loc[0, :].values
Following that, if you want to drop the first row, you can use
df.drop(0, inplace=True)
Edit
As coldspeed correctly notes below, if the source of this is reading a CSV, then adding the skiprows=1 parameter is much better.
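Both approaches can be compared on made-up data. The frame and the CSV text below are assumptions for illustration, not the asker's 'example.xls':

```python
import io
import pandas as pd

# Hypothetical frame whose real header ended up in the first data row
df = pd.DataFrame(
    [["name", "score"], ["ann", 1], ["bob", 2]],
    columns=["Unnamed: 0", "Unnamed: 1"],
)

# Approach 1: promote the first row to column names, then drop it
df.columns = df.loc[0, :].values
df = df.drop(0).reset_index(drop=True)
print(list(df.columns))  # ['name', 'score']

# Approach 2: when reading a CSV, skip the junk line up front
csv_text = "junk,\nname,score\nann,1\nbob,2\n"
df_csv = pd.read_csv(io.StringIO(csv_text), skiprows=1)
print(list(df_csv.columns))  # ['name', 'score']
```

With the first approach the remaining columns keep the object dtype they had while the header row was mixed in; skipping the line at read time lets pandas infer the types directly.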