While true, pandas dataframe won't show? - python

Using Jupyter Notebook, if I put in the following code:
import pandas as pd
df = pd.read_csv('path/to/csv')
while True:
    df
The dataframe won't show. Can anyone tell me why this is the case? I'm guessing it's because the constant looping is preventing the dataframe from loading fully. Is that what's happening here?
I need code that would let me get a user's input. If they type in a name, for example, I'll extract the person with that name's info from the dataframe and display it, then the program needs to ask them to give another name. This will continue until they type in "quit". I figured a while loop would be the best for that, but it looks like there's just something about while loops and pandas that won't mix. Does anyone have any suggestions on what I can do instead?
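One possible explanation, offered as a sketch rather than a definitive answer: by default the notebook only auto-displays the value of the last top-level expression in a cell, so a bare `df` inside a loop body is evaluated and then discarded (and a `while True:` loop never finishes anyway). Printing explicitly inside the loop sidesteps that. The 'name' column, the sample rows, and the `lookup`/`run` helpers below are assumptions made up for illustration; the `read` and `show` parameters exist only so the loop can be exercised without a keyboard.

```python
import pandas as pd

# Hypothetical stand-in data; the question reads its frame from 'path/to/csv'.
df = pd.DataFrame({"name": ["Ann", "Bob"], "age": [31, 45]})


def lookup(frame, name):
    """Return the rows whose 'name' column equals the given name."""
    return frame[frame["name"] == name]


def run(frame, read=input, show=print):
    """Keep asking for a name and showing matches until 'quit' is typed."""
    while True:
        name = read("Name (or 'quit'): ").strip()
        if name == "quit":
            break
        # Explicit output: a bare expression inside the loop would show nothing.
        show(lookup(frame, name).to_string(index=False))
```

In a notebook, calling `run(df)` would then prompt via `input` until "quit" is entered.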

Related

Can I make a dataframe "active" in pandas

I don't know if I'm asking this question right, but feel free to ask for more info if needed.
I create this dataframe by reading a CSV file. Then I want to use it for other tasks. I want that df to be "active", but it seems the dataframe isn't recognised outside of the button callback.
def on_button_clicked(b):
    df = pd.read_csv(f"./siivous/cleanedfiles/node_{karry.value}.csv")
    with output:
        display(df)
        display(img)
        clear_output(wait=True)
So how can I make that dataframe active with just a click of the button? For example, if I write print(df), it should print that df.
Your dataframe named df is defined inside a function, so it is a local variable and cannot be accessed outside of that function.
I suggest you check out this thread.
I hope it helped!
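A minimal sketch of one way around the scoping issue, assuming it is acceptable to keep the frame in a module-level container that the callback fills in. The `state` dict and the `source` parameter are hypothetical names introduced here so the example runs without a widget or a file on disk; in the real notebook the path from the question would be used instead.

```python
import io

import pandas as pd

# Shared container that outlives the callback (hypothetical name).
state = {"df": None}


def on_button_clicked(b, source=None):
    """Load the CSV and store the frame where later cells can reach it."""
    # `source` is a hypothetical stand-in for the file path built in the question.
    state["df"] = pd.read_csv(source)


# Simulate a click with in-memory CSV text instead of a real button and file.
on_button_clicked(None, source=io.StringIO("node,value\n1,10\n2,20\n"))

# Outside the function, the frame is now reachable through the container.
print(state["df"])
```

Declaring `global df` inside the callback would be another option; the dict just avoids rebinding a global name.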

Working with .csv data as a Pandas DataFrame, getting redundancy error when applying logic

Been working on this project all day and it's destroying me. Currently have finished web scraping and have a final .csv which contains the elements of a pandas dataframe. Working with this dataframe in a new file, and currently have the following:
df = pd.read_csv('active_homes.csv')
for i in range(len(df)):
    add = df['Address'][i]
    price = df['Price'][i]
    if (price<100000) == True:
        print(price)
'active_homes.csv' looks like this:
Address,Status,Price,Meta
"387 8th St, Burlington, CO 80807",For Sale,169500,"4bed2bath1,560sqft"
and the resulting df's shape is (1764, 4).
This should, in theory, print the price for each iteration of price<100000.
In practice, it prints this:
I have confirmed that at each iteration of the above for loop, it is collecting the correct 'Price' and 'Address' information, and have also confirmed that at each interval the logic (price<100000) is working correctly. However, it is still doing the above. I was originally trying to just drop the rows of the dataframe that were <100000 but that wasn't doing anything. I was also trying to reassign the data to a new dataframe and it would either return an empty dataframe, or return a dataframe with duplicate data of this house (with the 'Price' of 58900).
So far, from all of that, I believe that the program is recognizing the amount of correct houses < 100000, but for some reason the assignment is sticking for the one address. It also does the same thing without assignment, as in:
for i in range(len(df)):
    if (df['Price'][i]<100000) == True:
        print(df['Price'][i])
Any help in identifying the error would be much appreciated.
With pandas, you generally try to avoid iterating row by row in the traditional Python way. Instead, you can get the desired result by filtering with a boolean condition:
df = pd.read_csv('active_homes.csv')
temp_df = df[df["Price"]<100000] # creating a new df isn't required, just force of habit
print(temp_df["Price"]) # displays a Series of the houses below 100K; imo a prettier print
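To make the filtering idea concrete, here is a small sketch on in-memory data. The first row is the sample from the question; the second row (including its address) is made up for illustration. The same boolean mask can select the cheap houses or, when negated, drop them.

```python
import io

import pandas as pd

# First data row is from the question; the second is a hypothetical example row.
csv_text = '''Address,Status,Price,Meta
"387 8th St, Burlington, CO 80807",For Sale,169500,"4bed2bath1,560sqft"
"1 Example Rd, Sampletown, CO 80000",For Sale,58900,"2bed1bath900sqft"
'''

df = pd.read_csv(io.StringIO(csv_text))

# One boolean per row: True where the price is under 100,000.
mask = df["Price"] < 100000

cheap = df[mask]    # rows under 100K
kept = df[~mask]    # the same frame with those rows dropped

print(cheap["Price"].tolist())
```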

(Python) manually copy/paste data from pandas table without copying the index

I've been looking around but could not find a similar post, so I thought I'd give it a go.
I wrote a pandas program that successfully displays the resulting dataframe in pandas table format in a tkinter textbox. The aim is that the user can select the data and copy/paste it into an (existing) Excel sheet. When doing this, the index is always copied as well. I was wondering if one could programmatically select the complete table except the index?
I know that one can save to Excel or other formats with index=False, but I could not find something like df.select... with index=False. I hope my explanation is more or less clear ;-)
Thanks a lot
You could use the dataframe's to_string method, which accepts index=False as one of its parameters. For example, say we have this df:
import pandas as pd
df = pd.DataFrame({'a': ['yes', 'no', 'yes' ], 'b': [10, 5, 20]})
print(df.to_string(index = False))
this would give you:
a b
yes 10
no 5
yes 20
Hope this helps!
I finally found it.
Instead of using something like self.mytable.copy('columns') to select everything and then switch to Excel and paste it, I use this line of code which does exactly what I need :
df.to_clipboard(sep="\t", index=False)
The sep="\t" makes it split up amongst columns in Excel.
Hopefully someone can use this at some stage.
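For anyone who wants to see what kind of text this produces without touching the system clipboard, the sketch below builds the equivalent tab-separated string with `to_csv`. That the clipboard receives text like this is my understanding of how `to_clipboard` behaves in its default Excel mode, so treat it as an assumption; the frame is the sample one from the earlier answer.

```python
import pandas as pd

df = pd.DataFrame({'a': ['yes', 'no', 'yes'], 'b': [10, 5, 20]})

# Tab-separated text without the index column, one line per row.
text = df.to_csv(sep="\t", index=False)
print(text)
```

Each tab becomes a column boundary when the text is pasted into a spreadsheet.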

Changing the dimension (axis) of the dataframe

I know the community hates people uploading an image, but it is hard to explain without showing the dataframe I have.
Is there any way that I can group the data by the columns 'Open','High','Low','Close','Adj Close','Volume','Symbol' like this:
I have been browsing through the pandas documentation for days and tried plenty of methods, but none of them work. Thank you, and sorry for uploading an image.
Update:
The code for the df is as follows:
import yfinance as yf
stock_df = yf.download(["AAPL","GOOG"], start="2020-05-19", end="2020-05-20", interval='1m',group_by='ticker')
stock_df
You need to pip install yfinance first. I hope this helps you test it, thanks. The group_by= argument can be deleted, in which case the stocks are grouped by column. However, they are still separated: there are 12 columns, 6 of which are repeated. Is there any way to add a Symbol column like in my expected output? Thanks
You can try this:
df.rename_axis(('Symbol', None), axis=1).stack(level=0).reset_index(level=1).sort_values('Symbol')
Though I am unsure how your AACG rows have data when the values in your original frame are NaN.
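To show what that chain does without downloading anything, here is a sketch on a tiny hand-built frame. The prices and timestamps are made up, and the two-level columns (ticker on top, field below) are an assumption about what `group_by='ticker'` returns.

```python
import pandas as pd

# Hypothetical miniature of the downloaded frame: ticker on the top column level.
cols = pd.MultiIndex.from_product([["AAPL", "GOOG"], ["Open", "Close"]])
idx = pd.to_datetime(["2020-05-19 09:30", "2020-05-19 09:31"])
wide = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]], index=idx, columns=cols
)

tidy = (
    wide.rename_axis(("Symbol", None), axis=1)  # name the ticker level
        .stack(level=0)                         # move tickers from columns to rows
        .reset_index(level=1)                   # turn the ticker level into a column
        .sort_values("Symbol")
)
print(tidy)
```

The result has one row per timestamp and ticker, with a single set of price columns plus a Symbol column.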

Applying for loops on dataframe?

I am applying a for loop to a column in Python, but I am not able to execute it; it produces an error. I want the square of a column. Please show me where I am making a mistake. I know I can do this with a lambda, but I want to do it the traditional way.
import pandas as pd
output=[]
for i in pd.read_csv("infy.csv"):
    output.append(i['Close']**2)
print(output)
The whole point of pandas is not to loop:
output = pd.read_csv("infy.csv")['Close']**2
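As for why the original loop fails: iterating over a DataFrame directly yields its column labels (strings), not rows, so `i['Close']` ends up indexing a string. If a traditional loop is really wanted, looping over the column itself works. The sketch below uses a small in-memory stand-in for infy.csv with made-up values.

```python
import io

import pandas as pd

# Hypothetical stand-in for infy.csv.
df = pd.read_csv(io.StringIO("Date,Close\n2020-01-01,2.0\n2020-01-02,3.0\n"))

# Iterating over the frame itself gives column labels, not rows.
labels = [label for label in df]

# Traditional loop over the values of one column.
output = []
for value in df["Close"]:
    output.append(value ** 2)

# Vectorised equivalent, as in the answer above.
vectorised = (df["Close"] ** 2).tolist()

print(output)
```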
