So I'm trying to create a histogram using Python, and was wondering why my code wasn't working.
import pandas as pd
import plotly.express as px

# Load the dataset and preview the first rows
data = pd.read_csv("/kaggle/input/dataset1csv/test.csv")
print(data.head())

# Histogram of "sex", with bars colored by "age_approx"
figure = px.histogram(data, x="sex",
                      color="age_approx",
                      title="Datadistribution")
figure.show()
After running this, I just get some data printed out to me.
What am I doing wrong?
I am new to Python and Pandas so any help is much appreciated.
I am trying to make the graph below interactive. It would also be good to be able to choose which attributes are shown, rather than showing them all.
Here is what I have so far:
df.set_index('Current Year').plot(rot=45)
plt.xlabel("Year",size=16)
plt.ylabel("",size=16)
plt.title("Current year time series plot", size=18)
I know that I need to import plotly.graph_objects as go, but I have no idea how to use it with the time series graph above. Thanks
EDIT
I am getting this error when trying to enter my plotted data.
All you need is:
df.plot()
As long as you import the correct libraries and set plotly as the plotting backend for pandas like this:
import pandas as pd
pd.options.plotting.backend = "plotly"
df = pd.DataFrame({'year':['2020','2021','2022'], 'value':[1,3,2]}).set_index('year')
fig = df.plot(title = "Current year time series plot")
fig.show()
I am trying to build a heatmap slice by slice. How can I update the data shown in the graph without generating a whole new figure? The following code produces a new plot on every iteration. I tried to use fig.update_traces() but it didn’t work.
What am I missing?
Thanks
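import numpy as np
import pandas as pd
import plotly.express as px
import time

df = pd.DataFrame(np.random.rand(1, 100))
for i in range(0, 10):
    # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
    df = pd.concat([df, pd.DataFrame(np.random.rand(1, 100))], ignore_index=True)
    time.sleep(1)
    # A new figure is created and shown on every iteration
    fig = px.imshow(df)
    fig.show()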
I am trying to make this bar graph appear in the Python run screen, but for some reason the graph does not show up. However, if I run the same code on the Google online coding website, the bar graph displays fine. Can anyone tell me what the problem is?
import pandas as pd
import plotly.express as px
df = pd.read_csv("Diversity2.csv")
df = df.groupby(['School','White'], as_index=False)[['School']].sum()
# df = df.groupby(['School','White']).sum().plot(kind='bar')
df['White']=df['White'].astype(float)
# df.plot(kind='bar', x='School', y='White', figsize=(20,10))
barchart = px.bar(
    data_frame=df,
    x="School",
    y="White",
    color="School",
    opacity=0.9,
    orientation="v",
    barmode='overlay')
barchart.show()
I am working on a choropleth map and it is showing a white page instead of the map as shown here
https://i.stack.imgur.com/boYKY.png
I have both the geojson and the excel file downloaded in the same folder.
geojson https://drive.google.com/file/d/1N-rp9yHqE1Rzn2VxoAAweJ8-5XIjk61j/view?usp=sharing
excel https://docs.google.com/spreadsheets/d/1NKeUg20XxJe0jccMgjj9pMxrTIIWeuQk/edit?usp=sharing&ouid=100050178655652050254&rtpof=true&sd=true
Here is my code
import json
import numpy as np
import pandas as pd
import plotly.express as px
df = pd.read_excel('kraje.xlsx', sheet_name='List1')
regions_json = json.load(open("KRAJE.geojson", "r"))
fig = px.choropleth(df,
                    locations="K_KRAJ",
                    geojson=regions_json,
                    color='OB1506')
fig.show()
The console of my browser in which I am viewing the map shows
this
I am using a Jupyter notebook in the Brave browser.
Can anyone please help me solve this? Thanks
EDIT:
I found the correct geojson file, but now I have a different issue. Only one region is colored, and not even in the correct color, while the rest of the map, even outside my regions, is filled with the same color. When I hover over my regions I can see that they are in the correct place but have the wrong color. I also have no idea why the code colored the whole map and not only the regions from the geojson file. Here is an image of the output
new (should be correct) geojson https://drive.google.com/file/d/1S03NX5Q0pqgAsbJnjqt8O5w8gUHH1rt_/view?usp=sharing
import json
import numpy as np
import pandas as pd
import plotly.express as px
df = pd.read_excel('kraje.xlsx', sheet_name='List1')
regions_json = json.load(open("KRAJE.geojson", "r"))
for feature in regions_json['features']:
    feature["id"] = feature["properties"]["K_KRAJ"]
fig = px.choropleth(df,
                    locations="K_KRAJ",
                    geojson=regions_json,
                    color='OB1506')
fig.update_geos(fitbounds="locations", visible=False)
fig.show()
SOLUTION
Thanks to Rob Raymond it finally works. There was an issue with the geojson file. I also had a ton of problems installing geopandas and the only tutorial that actually worked was installing each package separately (https://stackoverflow.com/a/69210111/17646343)
There are multiple issues with your geojson:
You need to define the CRS. It is clearly not EPSG:4326 and appears to be a UTM CRS for the Czech Republic.
Even with the CRS defined, there are invalid polygons.
With valid geojson, there are a few points you have missed:
locations needs to refer to values that are common to your data frame and the geojson.
featureidkey needs to be used to state which geojson property you are joining on (here the name).
import json
import numpy as np
import pandas as pd
import plotly.express as px
import geopandas as gpd
from pathlib import Path

# Map file suffix -> path for the KRAJE files in the Downloads folder
files = {
    f.suffix: f
    for p in ["KRAJE*.*", "KRAJE*.*".lower()]
    for f in Path.home().joinpath("Downloads").glob(p)
}
# df = pd.read_excel('kraje.xlsx', sheet_name='List1')
df = pd.read_excel(files[".xlsx"], sheet_name="List1")
# regions_json = json.load(open("KRAJE.geojson", "r"))
regions_json = json.load(open(files[".geojson"], "r"))
# Re-read with geopandas: force the source CRS, then reproject to lon/lat
regions_json = (
    gpd.read_file(files[".geojson"])
    .dropna()
    .set_crs("EPSG:32633", allow_override=True)
    .to_crs("epsg:4326")
    .__geo_interface__
)
fig = px.choropleth(
    df,
    locations="N_KRAJ",
    featureidkey="properties.name",
    geojson=regions_json,
    color="OB1506",
)
fig.update_geos(fitbounds="locations", visible=True)
fig
Updated
There are still issues with your geojson. I have fixed them using geopandas and buffer(0) (see "Fix invalid polygon in Shapely").
With this and a change to the plotly parameters, I can now generate a figure:
import json
import numpy as np
import pandas as pd
import plotly.express as px
import geopandas as gpd
from pathlib import Path
files = {
    f.suffix: f
    for p in ["KRAJ_*.*", "KRAJE*.*".lower()]
    for f in Path.home().joinpath("Downloads").glob(p)
}
# df = pd.read_excel('kraje.xlsx', sheet_name='List1')
df = pd.read_excel(files[".xlsx"], sheet_name="List1")
# regions_json = json.load(open("KRAJE.geojson", "r"))
regions_json = json.load(open(files[".json"], "r"))
# geometry is still invalid!!! force it to valid by buffer(0)
regions_json = (
    gpd.read_file(files[".json"])
    .assign(geometry=lambda d: d["geometry"].buffer(0))
    .__geo_interface__
)
fig = px.choropleth(
    df,
    locations="K_KRAJ",
    featureidkey="properties.K_KRAJ",
    geojson=regions_json,
    color="OB1506",
)
fig.update_geos(fitbounds="locations", visible=True)
fig
I have produced a bar chart in plotly, but it is not in any particular order. How would I sort it in ascending or descending order?
What I am doing:
fig = px.bar(data, x='Old_SKU', y='u_power')
fig = data.sort_values('u_power', ascending=True)
fig.show()
I'm not sure what your desired output is, or what your data looks like. In any case, fig in plotly terms is normally a plotly figure object. When you run fig = data.sort_values('u_power', ascending=True) you're not building a figure, but sorting a dataframe and overwriting the figure with the result. So far I can only imagine that you'd like to sort a dataset that looks like this:
... into this:
Or maybe you're expecting a continuous increase or decrease? In that case you will have to share a dataset. Nevertheless, with a few tweaks depending on your dataset, the following snippet should not be far from a working solution:
import plotly.express as px
import numpy as np
import pandas as pd
var = np.random.randint(low=2, high=6, size=20).tolist()
data = pd.DataFrame({'u_power': var,
                     'Old_SKU': np.arange(0, len(var))})
# fig = px.bar(data, x='Old_SKU', y='u_power', barmode='stack')
fig = px.bar(data.sort_values('u_power'), x='Old_SKU', y='u_power', barmode='stack')
fig.show()