This question already has answers here:
Append to Series in python/pandas not working
(2 answers)
Closed 2 years ago.
I have around 8 .csv files in a given directory. When I run this code, I get an empty DataFrame (the new_df I specified).
I have already seen how to use the concat function to get the job done, but I'm wondering what I am doing wrong in my approach, since I read the documentation on DataFrame.append() and it should have worked.
import pandas as pd
from pathlib import Path

path = Path("/content/Sales_data/")
new_df = pd.DataFrame()
for file in path.glob("*.csv"):
    df = pd.read_csv(file)
    new_df.append(df, ignore_index=True)
new_df
Appreciate any recommendation.
Try setting new_df to the DataFrame with appended data:
new_df = new_df.append(df, ignore_index=True)
The problem with your code is that append returns a new object; it does not modify the existing DataFrame in place.
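Note also that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on current pandas the concat approach is the way to go. A minimal self-contained sketch (using a temporary directory with made-up files in place of /content/Sales_data/):

```python
import tempfile
from pathlib import Path

import pandas as pd

# Demo setup: two small CSV files in a temporary directory,
# standing in for the question's /content/Sales_data/
tmp = Path(tempfile.mkdtemp())
pd.DataFrame({"a": [1, 2]}).to_csv(tmp / "jan.csv", index=False)
pd.DataFrame({"a": [3, 4]}).to_csv(tmp / "feb.csv", index=False)

# Collect each file's DataFrame in a list, then concatenate once at the end
frames = [pd.read_csv(f) for f in sorted(tmp.glob("*.csv"))]
new_df = pd.concat(frames, ignore_index=True)
print(len(new_df))  # 4
```

Building a list and calling pd.concat once is also faster than appending inside the loop, which copies the accumulated data on every iteration.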
This question already has answers here:
How to avoid pandas creating an index in a saved csv
(6 answers)
Closed 5 months ago.
I'm using the pandas split function to create new columns from an existing one. All of that works fine and I get my expected columns created. The issue is that an additional column appears in the exported CSV file: the DataFrame reports 3 columns, but the file actually has 4.
I've tried various functions to drop that column, but it isn't recognized as part of the data frame so it can't be successfully removed.
Hopefully someone has had this issue and can offer a possible solution.
[example of the csv data frame output with the unnecessary column added]
Column A doesn't come from split; it's the default index of your DataFrame, which to_csv writes out unless told otherwise. You can suppress it by setting index=False in df.to_csv:
df.to_csv('{PATH}.csv', index=False)
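A minimal demonstration of the difference, writing to an in-memory buffer instead of a file (the column names here are made up for the example):

```python
import io

import pandas as pd

# Split one column into two, as in the question
df = pd.DataFrame({"name": ["a-1", "b-2"]})
df[["letter", "num"]] = df["name"].str.split("-", expand=True)

buf = io.StringIO()
df.to_csv(buf)  # default index=True: a leading unnamed index column
with_index = buf.getvalue()

buf2 = io.StringIO()
df.to_csv(buf2, index=False)  # no extra column
without_index = buf2.getvalue()

print(with_index.splitlines()[0])     # ",name,letter,num"
print(without_index.splitlines()[0])  # "name,letter,num"
```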
This question already has answers here:
Pandas read in table without headers
(5 answers)
Closed 1 year ago.
Okay, so I was reading a text file with .read_csv() and ended up with this dataframe:
But the problem is that the "im feeling rather rotten..." text ended up as a column header rather than a row of the dataframe, and when I try to rename the column I just lose that row altogether, skipping to the 2nd value in the dataframe:
EDIT:
This is how I read in the text file.
Any answers, comments are heartfully accepted.
The final solution (as concluded by @luigigi) would be:
pd.read_csv("emotions.txt", sep=";", header=None)
Thanks!
You can pre-define the column names in the call:
df = pd.read_csv('emotions.txt', sep =';', names=['TEXT','EMOTION'], header=None)
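A self-contained sketch of that call, using an in-memory string in place of emotions.txt (the two sample lines are invented for the example):

```python
import io

import pandas as pd

# Stand-in for emotions.txt: semicolon-separated lines with no header row
data = io.StringIO(
    "im feeling rather rotten;sadness\n"
    "im updating my blog;sadness\n"
)

# header=None stops read_csv from treating the first line as column names;
# names= assigns the column labels explicitly
df = pd.read_csv(data, sep=";", names=["TEXT", "EMOTION"], header=None)
print(df.shape)  # (2, 2)
```

Without header=None, read_csv would consume the first data line as the header, which is exactly the symptom described in the question.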
This question already has answers here:
float64 with pandas to_csv
(3 answers)
Closed 2 years ago.
I searched and found this answer
While I understand the answer, my question is: Are there any possible ways to write data from DF explicitly to CSV and without any possible conversion? Is there an option to do that?
Example: the value "0.227" is stored in the CSV as "0.22699999999999998".
I have this simple code: after scraping some data with BS, I load it into a DF and then write it to CSV:
table = soup.find('table', id='awer')
df1 = pd.read_html(str(table))
df1[0]['seas'] = season
print(df1[0])
df1[0].to_csv('abc.csv', encoding='utf-8', index=False, header=None, mode='a')
To verify: before appending, I printed out the DF and everything is fine. So some kind of conversion is happening while the data is written to CSV.
Any ideas how to solve it?
Thanks
Do you mean something like df.astype('unicode')?
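If the goal is just to control how the floats are rendered in the file, to_csv's float_format parameter is another option (not mentioned in the answer above); a minimal sketch with made-up data:

```python
import io

import pandas as pd

df = pd.DataFrame({"x": [0.227, 1.5]})

buf = io.StringIO()
# float_format controls the text rendering of floats in the CSV,
# avoiding long binary-representation tails like 0.22699999999999998
df.to_csv(buf, index=False, float_format="%.3f")
print(buf.getvalue())
```

The underlying cause is that 0.227 has no exact float64 representation, so any fix here is about formatting the output text, not about avoiding the conversion itself.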
This question already has answers here:
How can you dynamically create variables? [duplicate]
(8 answers)
Closed 5 years ago.
I have a function that I made to analyze experimental data (all individual .txt files)
This function outputs a dictionary ({}) of Pandas Dataframes
Is there an efficient way to iterate over this dictionary and output individual dataframes?
Let's say my dictionary is called analysisdict
for key in analysisdict.keys():
    dfx = pd.concat([analysisdict[key]['X'], analysisdict[key]['Y']], axis=1)
Where dfx would be an individual dataframe. (I'm guessing a second loop might be required? Perhaps I should iterate through a list of df names?)
The output would be df1...dfn
EDIT: I initially misread your question, and thought you wanted to concatenate all the DataFrames into one. This does that:
dfx = pd.concat([df for df in analysisdict.values()], ignore_index=True)
(Thanks to #paul-h for the ignore_index=True tip)
I read your question more carefully and realized that you're asking how to assign each DataFrame in your dictionary to its own variable, resulting in separate DataFrames named df1, df2, ..., dfn. Everything in my experience says that dynamically creating variables in this way is an anti-pattern, and best left to dictionaries. Check out the discussion here: How can you dynamically create variables via a while loop?
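A sketch of the dictionary-based approach, using a made-up analysisdict with the 'X'/'Y' layout described in the question:

```python
import pandas as pd

# Hypothetical stand-in for the question's analysisdict:
# each value holds an 'X' and a 'Y' Series
analysisdict = {
    "run1": {"X": pd.Series([1, 2]), "Y": pd.Series([3, 4])},
    "run2": {"X": pd.Series([5, 6]), "Y": pd.Series([7, 8])},
}

# Keep each per-key result in another dict instead of creating
# df1, df2, ... variables dynamically
results = {
    key: pd.concat([d["X"], d["Y"]], axis=1)
    for key, d in analysisdict.items()
}
print(results["run1"].shape)  # (2, 2)
```

Each result is still addressable by name (results["run1"]), but the names live as dictionary keys rather than as dynamically created variables.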
This question already has an answer here:
pandas dataframe, copy by value
(1 answer)
Closed 5 years ago.
I have two pandas DataFrames; one of them is sdm. I wanted to create a copy of that DataFrame and work on it, and later create another copy from sdm for a different analysis. However, when I create a new DataFrame like this,
new_df = sdm
it appears to create a copy; however, when I alter new_df, it makes changes to my old DataFrame sdm. How can I handle this without using =?
Python assignment just binds another name to the same object; no copy is made. Try this:
new_df = sdm.copy()
I think you should have searched more; I am sure there are lots of questions on this topic!
You need to use new_df = sdm.copy() instead, which is described here in the official documentation. new_df = sdm doesn't work because this assignment operation performs a copy by reference and not by value, which means, in a nutshell, that both new_df and sdm will reference the same data in memory.
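A quick demonstration of the difference, using a small throwaway DataFrame in place of the sdm from the question:

```python
import pandas as pd

sdm = pd.DataFrame({"a": [1, 2, 3]})

alias = sdm             # plain assignment: both names point at the same object
alias.loc[0, "a"] = 99
print(sdm.loc[0, "a"])  # 99 -- the "copy" changed the original

sdm = pd.DataFrame({"a": [1, 2, 3]})
new_df = sdm.copy()     # independent copy of the data
new_df.loc[0, "a"] = 99
print(sdm.loc[0, "a"])  # 1 -- original untouched
```

By default DataFrame.copy makes a deep copy of the data; copy(deep=False) would reproduce the shared-data behavior of plain assignment.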