Pandas shows up with a bunch of errers when running simple code - python

I am trying to write a program that chooses a recipie for me to cook at random so i found a csv file to get the recipies from but when i tried running some code this happened.
I'm new to pandas so sorry if this is stupid
`
import random
import pandas as pd
import numpy as np
pd.read_csv('recipes.csv')
pd.isnull()
`
https://github.com/cweber/cookbook/blob/master/recipes.csv (this is the link to the csv file)

Related

Can explain the code at second last and last line, new to python

import os
import geopandas as gpd
import pandas as pd
file=os.listdir(r\\IP_Address\winshare\Affected_Area_Files)
path=[os.path.join(r\\IP_Address\winshare\Affected_Area_Files,i)
for i in file:
if".shp" in i:
# Below code need to be explained:
gdf=gpd.GeoDataFame(pd.concat([gpd.read_file(i).to_crs(gpd.read_file(path[0]).crs)for i in path],ignore_index=True),crs=gpd.read_file(path[0]).crs)
gdf.to_file(r\\IP_Address\winshare\affected_area\combined-AffectedArea_Test.shp)
I am new to python, the code has to be optimized for better performance.
It was running fine but now it is taking quite long to execute

Pandas read_csv gives decimal column numbers

I've been pulling my hair out trying to make a bipartite graph from a csv file and so far all I have is a panda matrix that looks like this
My code so far is just
`
import networkx as nx
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# import pyexcel as pe
# import pyexcel.ext.xlsx
from networkx.algorithms import bipartite
mat = pd.read_csv("networkdata3.csv")
# mat = pd.read_excel("networkdata1.xlsx",sheet_name="sheet_name_1")
print(mat.info)
sand = nx.from_pandas_adjacency(mat)
`
and I have no clue what I'm doing wrong. Initially I was trying to read it in as the original xlsx file but then I just converted it to a csv and it started reading. I assume I can't make the graph because the column numbers are decimals and the error that spits out claims that the column numbers don't match up. So how else should I be doing this to actually start making some progress?

Plotly express doesn't load and refuse to connect

I have this simple program that should display a pie chart, but whenever I run the program, it opens a page on Chrome and just keeps loading without any display, and sometimes it refuses to connect. How do I solve this?
P.S: I would like to use it offline, and I'm running it using cmd on windows10
import pandas as pd
import numpy as np
from datetime import datetime
import plotly.express as px
def graph(dataframe):
figure0 = px.pie(dataframe,values=dataframe['POPULATION'],names=dataframe['CONTINENT'])
figure0.show()
df = pd.DataFrame({'POPULATION':[60,17,9,13,1],'CONTINENT':['Asia','Africa','Europe','Americas','Oceania']})
graph(df)
Disclaimer: I extracted this answer from the OPs question. Answers should not be contained in the question itself.
Answer provided by g_odim_3:
So instead of figure0.show(), I used figure0.write_html('first_figure.html', auto_open=True) and it worked:
import pandas as pd
import numpy as np
from datetime import datetime
import plotly.express as px
def graph(dataframe):
figure0 = px.pie(dataframe,values=dataframe['POPULATION'],names=dataframe['CONTINENT'],title='Global Population')
# figure0.show()
figure0.write_html('first_figure.html', auto_open=True)
df = pd.DataFrame({'POPULATION':[60,17,9,13,1],'CONTINENT':['Asia','Africa','Europe','Americas','Oceania']})
graph(df)
I'm 99% sure that this is a version issue. It's a long time since you needed an internet connection to build Plotly figures. Follow the instructions here on how to upgrade your system. I've tried your exact code on my end at it produces the following plot:
I'm on Plotly 5.2.2. Run import plotly and plotly.__version__ to check the version on your end.

Is it possible to reuse import code in Python?

There are several imports that are common between some files in my project. I would like to reuse this code, concentrating it in a unique file and have just one import in the other files. Is it possible?
Or is there another way not to replicate the desired import list in multiple files?
Yes its possible. You can create a Python file with imports and then import that Python file in your code.
For Eg:
ImportFile.py
import pandas as pd
import numpy as np
import os
MainCode.py:
from ImportFile import *
#Here you can use pd,np,os and complete your code
OR
from ImportFile import pd,np
#And then use pd and np

Using Dask with Python causes issues when running Pandas code

I am trying to work with Dask because my dataframe has become large and that pandas by itself can't simply process it. I read my dataset in as follows and get the following result that looks odd, not sure why its not outputting the dataframe:
import pandas as pd
import numpy as np
import seaborn as sns
import scipy.stats as stats
import matplotlib.pyplot as plt
import dask.bag as db
import json
%matplotlib inline
Leads = db.read_text('Leads 6.4.18.txt')
Leads
This returns (instead of my pandas dataframe):
dask.bag<bag-fro..., npartitions=1>
Then when I try to rename a few columns:
Leads_updated = Leads.rename(columns={'Business Type':'Business_Type','Lender
Type':'Lender_Type'})
Leads_updated
I get:
AttributeError: 'Bag' object has no attribute 'rename'
Can someone please explain what I am not doing correctly. The ojective is to just use Dask for all these steps since it is too big for regular Python/Pandas. My understanding is the syntax used under Dask should be the same as Pandas.

Categories