Importing from Excel in Pandas Python [duplicate] - python

This question already has answers here:
Reading an Excel file in python using pandas
(10 answers)
Closed 3 years ago.
I am trying to import data from an Excel file. I would like to read one specific sheet and, from that sheet, import only certain columns.
The code I am using is as follows:
%matplotlib inline
import pandas as pd
import math
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style="darkgrid")
# Input Portfolio data
xl_file=pd.ExcelFile('C:/Users/e877780/Desktop/DEV/S2_SK_S_06_02_input.xlsx')
s2_0602=xl_file.parse("S.06.02")[('C0080','C0090')]
My idea is to import only columns C0080 and C0090. If I use just one column, it works, but the column name is not kept.
Can someone help?
Thanks

You can try this:
xl_file=pd.ExcelFile('C:/Users/e877780/Desktop/DEV/S2_SK_S_06_02_input.xlsx')
sheet1=xl_file.parse("S.06.02")
s2_0602 = sheet1[['C0080','C0090']]
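The double brackets pass a list of column names, which returns a DataFrame with the column names kept. Hope it helps.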
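To see why the original indexing fails, here is a minimal sketch on an in-memory frame. The data and the extra column C0070 are made up for illustration; only C0080 and C0090 come from the question. A tuple key is treated as one single column label, while a list selects several columns.

```python
import pandas as pd

# Made-up stand-in for the parsed sheet; C0070 is a hypothetical extra column.
sheet1 = pd.DataFrame({
    "C0070": [1, 2],
    "C0080": [10, 20],
    "C0090": [0.1, 0.2],
})

# A tuple is looked up as ONE column label, which does not exist here.
try:
    sheet1[("C0080", "C0090")]
    tuple_lookup_failed = False
except KeyError:
    tuple_lookup_failed = True

# A list selects several columns and keeps their names.
s2_0602 = sheet1[["C0080", "C0090"]]
selected_columns = list(s2_0602.columns)

# When reading from the file directly, pd.read_excel also accepts
# sheet_name= and usecols= (a list of column names), for example:
# pd.read_excel(path, sheet_name="S.06.02", usecols=["C0080", "C0090"])
```

Selecting a single column with single brackets, such as sheet1["C0080"], returns a Series rather than a DataFrame, which is probably why the column header seemed to disappear.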

Related

How to save a Matplotlib figure in a HTML file? [duplicate]

This question already has answers here:
Dynamically serving a matplotlib image to the web using python
(6 answers)
Closed 1 year ago.
I know that I can save my diagram with
plt.savefig('/home/pi/test.png')
But I don't really know how to save and display my diagram in an HTML file.
For my website it would be easier to use an HTML file to display my data. Is it possible to save my diagram as HTML, and if so, how?
If it helps here is my code:
from pandas import DataFrame
import sqlite3
import matplotlib.pyplot as plt
import pandas as pd
con = sqlite3.connect("/home/pi/test2.db")
df = pd.read_sql_query("SELECT * from data4 limit 79;",con)
df.plot(x = 'zeit', y = 'temp', kind ='line')
plt.savefig('/home/pi/test.png')
#plt.show()
I'm sorry if I made any mistakes, I'm a beginner :)
As far as I know, matplotlib has no built-in HTML export, but it is possible with mpld3 (https://mpld3.github.io/), specifically with this function: https://mpld3.github.io/modules/API.html#mpld3.fig_to_html
mpld3.fig_to_html(fig, d3_url=None, mpld3_url=None, no_extras=False, template_type='general', figid=None, use_http=False, **kwargs)
Output html representation of the figure
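If installing mpld3 is not an option, a different technique is to embed the PNG itself in an HTML page as a base64 data URI. This is a rough sketch, not the mpld3 approach: the result is a static image, the helper name figure_to_html is made up, and the sample values stand in for the 'zeit' and 'temp' columns from the question.

```python
import base64
import io

import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt


def figure_to_html(fig):
    """Return a small HTML page with the figure embedded as a PNG data URI."""
    buffer = io.BytesIO()
    fig.savefig(buffer, format="png")
    encoded = base64.b64encode(buffer.getvalue()).decode("ascii")
    return (
        "<html><body>"
        f'<img src="data:image/png;base64,{encoded}">'
        "</body></html>"
    )


# Made-up sample data in place of the SQLite query result.
fig, ax = plt.subplots()
ax.plot([0, 1, 2], [20.5, 21.0, 20.8])
ax.set_xlabel("zeit")
ax.set_ylabel("temp")

html = figure_to_html(fig)
plt.close(fig)
```

The resulting string can be written to a file with open(...).write(html) or returned by a web handler.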

Del Command Not Executing Properly Pandas

I have a CSV file that I am loading into Jupyter, and I am trying to delete multiple columns at once. I thought the del statement would be the best way, but I can't get it to work.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
tmbd_movies = pd.read_csv('tmdb-movies.csv')
tmbd_movies.head()
del(tmbd_movies['imdb_id','homepage','tagline','keywords','overview'])
The goal was to remove the following columns:
'imdb_id', 'homepage', 'tagline', 'keywords', 'overview'
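del removes only one column at a time, so it cannot take a list of names. You want drop instead, with the axis given by keyword (recent pandas versions no longer accept it positionally):
tmbd_movies.drop(columns=['imdb_id','homepage','tagline','keywords','overview'], inplace=True)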
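Here is a small sketch of the difference between del and drop. The frame and its values are invented for illustration and only reuse a few column names from the question; original_title is a hypothetical column that is kept.

```python
import pandas as pd

# Invented stand-in for tmdb-movies.csv with a handful of columns.
movies = pd.DataFrame({
    "imdb_id": ["tt001", "tt002"],
    "homepage": ["a", "b"],
    "tagline": ["x", "y"],
    "original_title": ["First", "Second"],
})

# del works, but only for a single column per statement.
del movies["imdb_id"]

# drop removes several columns at once. Without inplace=True it returns
# a new frame and leaves the original untouched.
trimmed = movies.drop(columns=["homepage", "tagline"])
remaining = list(trimmed.columns)
```

Assigning the result to a new name, as here, is usually easier to follow than inplace=True, but both remove the same columns.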

Using Dask with Python causes issues when running Pandas code

I am trying to work with Dask because my dataframe has become so large that pandas by itself simply can't process it. I read my dataset in as follows and get a result that looks odd; I am not sure why it's not outputting the dataframe:
import pandas as pd
import numpy as np
import seaborn as sns
import scipy.stats as stats
import matplotlib.pyplot as plt
import dask.bag as db
import json
%matplotlib inline
Leads = db.read_text('Leads 6.4.18.txt')
Leads
This returns (instead of my pandas dataframe):
dask.bag<bag-fro..., npartitions=1>
Then when I try to rename a few columns:
Leads_updated = Leads.rename(columns={'Business Type':'Business_Type','Lender Type':'Lender_Type'})
Leads_updated
I get:
AttributeError: 'Bag' object has no attribute 'rename'
Can someone please explain what I am doing wrong? The objective is to use Dask for all of these steps, since the data is too big for regular Python/Pandas. My understanding is that the syntax used with Dask should be the same as with Pandas.

Using .plot() functionality of pandas dataframe in a script [duplicate]

This question already has answers here:
How to show matplotlib plots?
(6 answers)
Closed 5 years ago.
I have a pandas data frame with the following properties:
Name: df_name,
Concerned Column: col1
If I want to plot a column, I can execute the following code in the Python shell (>>>) or an IPython notebook.
>>> df_name['col1'].plot(kind='bar')
However, when I use the same call in a script and run it from the command line, the plot doesn't appear.
The script I want to write looks like the following:
import pandas as pd
.
.
.
df_name=pd.read_csv('filename')
# Printing bar chart of 'col1'
df_name['col1'].plot(kind='bar')
Any ideas on how to make it work from a script?
I think you need to import matplotlib.pyplot and call its show() function, as in this example:
import pandas as pd
import matplotlib.pyplot as plt
df_name=pd.DataFrame([1,2,3])
df_name[0].plot(kind='bar')
plt.show()
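If the script runs somewhere without a display (a server or a cron job, for example), plt.show() has nothing to open. A possible variant, sketched here with made-up data and a temporary output path, is to save the figure to a file instead:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend for machines without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data in place of pd.read_csv('filename').
df_name = pd.DataFrame({"col1": [1, 2, 3]})

# Series.plot returns the matplotlib Axes it drew on.
ax = df_name["col1"].plot(kind="bar")

# Write the figure to a PNG file; the temporary directory is only for illustration.
out_path = os.path.join(tempfile.mkdtemp(), "col1_bar.png")
ax.get_figure().savefig(out_path)
plt.close(ax.get_figure())
```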

Scatter_Matrix Will Not Display Using Pandas and

I am working through the following machine learning tutorial:
http://machinelearningmastery.com/machine-learning-in-python-step-by-step/
Specifically, Section 4.2. Unfortunately, my code is throwing an error:
NameError: name 'scatter_matrix' is not defined
Here is my code:
import pandas
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
dataset = pandas.read_csv(url, names=names)
scatter_matrix(dataset)
plt.show()
There's at least one Stack Overflow question on scatter_matrix, but I haven't been able to figure out what's missing:
Pandas scatter_matrix - plot categorical variables
You will have to import it like this:
from pandas.plotting import scatter_matrix
Since you have already imported pandas as pd, older pandas versions also let you call it like this:
pd.scatter_matrix(dataset)
However, the top-level pandas.scatter_matrix() was deprecated and has since been removed from pandas, so use pandas.plotting.scatter_matrix() instead.
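For reference, here is a short self-contained sketch of the pandas.plotting import. It uses a few invented rows instead of downloading the iris file, and only three of the numeric column names from the question, so it runs offline.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line to see the window via plt.show()
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import scatter_matrix

# Invented sample rows; the real tutorial loads the iris data from a URL.
dataset = pd.DataFrame({
    "sepal-length": [5.1, 4.9, 6.3, 5.8, 7.0],
    "sepal-width": [3.5, 3.0, 3.3, 2.7, 3.2],
    "petal-length": [1.4, 1.4, 6.0, 5.1, 4.7],
})

# Returns a 2-D array of Axes, one row and one column per numeric column.
axes = scatter_matrix(dataset)
grid_shape = axes.shape
plt.close("all")
```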
