Use Matplotlib to plot 100% Stacked bar from Excel data - python

I can plot a 100% stacked bar in Excel. Is it possible to achieve the same with matplotlib?
import pandas as pd
import matplotlib.pyplot as plt

data = [
    [0.4, 0.3, 0.2, 0.1],
    [0.5, 0.3, 0.6, 0.1],
    [0.1, 0.4, 0.2, 0.8],
]
columns = ["A", "B", "C", "D"]
df = pd.DataFrame(data=data, columns=columns, index=["Empty", "Wrong", "Correct"])
df.plot(kind="barh", stacked=True)
plt.ylabel("Percentages")
plt.show()
print(df)
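Yes; one way to reproduce Excel's 100% stacked bar is to normalize each row so it sums to 100 before plotting. A minimal sketch reusing the df built above (the df_pct name is my own addition):

# Divide every row by its own sum so each bar totals 100%
df_pct = df.div(df.sum(axis=1), axis=0) * 100
ax = df_pct.plot(kind="barh", stacked=True)
ax.set_xlabel("Percentage")
plt.show()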

Unable to turn off scientific notation in Matplotlib [duplicate]

This question already has an answer here: Prevent scientific notation (1 answer). Closed 2 years ago.
I am making a simple plot in Matplotlib (Python) using the following code:
import numpy as np
import matplotlib.pyplot as plt

temp = np.array([1. , 1. , 1. , 1. , 1. ,
1. , 1. , 1. , 1. , 1. ,
1. , 1. , 1. , 1. , 0.99999999,
0.99999999, 0.99999998, 0.99999996, 0.99999993, 0.99999989,
0.99999982, 0.99999972, 0.99999958, 0.99999933, 0.99999906,
0.99999857, 0.99999791, 0.9999971 , 0.99999611, 0.99999459,
0.99999276, 0.99999014, 0.99998735, 0.99998418, 0.99997975,
0.99997557, 0.99997059, 0.9999657 , 0.99996077])
temp2=np.array([0.025, 0.05 , 0.075, 0.1 , 0.125, 0.15 , 0.175, 0.2 , 0.225,
0.25 , 0.275, 0.3 , 0.325, 0.35 , 0.375, 0.4 , 0.425, 0.45 ,
0.475, 0.5 , 0.525, 0.55 , 0.575, 0.6 , 0.625, 0.65 , 0.675,
0.7 , 0.725, 0.75 , 0.775, 0.8 , 0.825, 0.85 , 0.875, 0.9 ,
0.925, 0.95 , 0.975])
plt.plot(temp2,temp)
plt.xlabel(r'$\frac{\tau}{\tau_c}$')
plt.ylabel(r'$\frac{\alpha ^{ss}}{\alpha {_0} ^{ss}}$')
plt.ticklabel_format(style='plain')
plt.rcParams.update({'font.size': 16})
I am getting the following figure with the axis tick labels in scientific notation despite specifying the style as plain.
What is the issue here, and how do I resolve it?
Setting useOffset=False will do it, like this:
plt.ticklabel_format(style='plain', useOffset=False)
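If you want this behaviour for every plot in a script, the offset can also be switched off globally through rcParams (a brief sketch; set it before creating the figure):

import matplotlib.pyplot as plt

plt.rcParams['axes.formatter.useoffset'] = False   # disable the tick-label offset on all axes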

Numpy array references behaving strangely when inside a for loop

I'm writing a Python implementation of Euler's method, using an example from Paul's math notes here.
I'm using an n x 3 numpy array to store the results. The goal is to have the t-value in the first column, y in the second, and the value of y' computed from the current row in the third.
When I did the first problem listed on the page, using only ten iterations, everything behaved exactly as expected. The step size was 0.1, so the values in the first column incremented by 0.1 with each iteration of the for loop.
But now that I've copied the code over and attempted to apply it to problem 3, the first column behaves very strangely. I set the step size to 0.01, but for the first ten iterations it increments by 0.1. After the tenth iteration it appears to reset to zero and then uses the expected 0.01, but later on it resets again in a similar fashion.
Here's my code:
import numpy as np

def ex3(t, y):
    return y + (-0.5 * np.exp(t/2) * np.sin(5*t)) + (5 * np.exp(t/2) * np.cos(5*t))

ex3out = np.empty((0,3), float)
# Input the initial conditions and first y' computation
ex3out = np.append(ex1out, np.array([[0,0,ex3(0,0)]]), axis=0)
h = 0.01
n = 500
for i in range(1, n+1):
    # Compute the new t and y values and put in 0 as a dummy y' for now
    new = np.array([[ex3out[i - 1,0] + h, ex3out[i - 1,1] + h * ex3out[i - 1,2], 0]])
    # Append the new row
    ex3out = np.append(ex3out, new, axis=0)
    # Replace the dummy 0 with y' based on the new values
    ex3out[i,2] = ex3(ex3out[i,0], ex3out[i,1])
And here are the first several rows of ex3out after running the above code:
array([[ 0. , 1. , -1. ],
[ 0.1 , 0.9 , 5.2608828 ],
[ 0.2 , 0.852968 , 3.37361534],
[ 0.3 , 0.8374415 , 0.6689041 ],
[ 0.4 , 0.83983378, -2.25688988],
[ 0.5 , 0.85167737, -4.67599317],
[ 0.6 , 0.86780837, -5.90918813],
[ 0.7 , 0.8851749 , -5.51040903],
[ 0.8 , 0.90205891, -3.40904125],
[ 0.9 , 0.91757091, 0.031139 ],
[ 1. , 0.93132436, 4.06022317],
[ 0. , 0. , 5. ],
[ 0.01 , 0.99 , 5.98366774],
[ 0.02 , 0.95260883, 5.92721107],
[ 0.03 , 0.88670415, 5.82942804],
[ 0.04 , 0.84413054, 5.74211536],
[ 0.05 , 0.81726488, 5.65763415],
[ 0.06 , 0.80491744, 5.57481145],
[ 0.07 , 0.80871649, 5.4953251 ],
[ 0.08 , 0.83007081, 5.42066644],
[ 0.09 , 0.8679685 , 5.34993924],
[ 0.1 , 0.9178823 , 5.2787651 ],
[ 0.11 , 0.97192659, 5.19944036],
[ 0.12 , 0.05 , 4.13207859],
[ 0.13 , 1.04983668, 4.97466166],
[ 0.14 , 1.01188094, 4.76791408],
[ 0.15 , 0.94499843, 4.5210138 ],
[ 0.16 , 0.90155169, 4.28666725],
[ 0.17 , 0.87384122, 4.0575499 ],
[ 0.18 , 0.86066555, 3.83286568],
[ 0.19 , 0.86366974, 3.61469476],
[ 0.2 , 0.88427747, 3.40492482],
[ 0.21 , 0.92146789, 3.20302701],
I wondered if this might be a floating point issue, so I tried enclosing various parts of the for loop in float() with the same results.
I must've made a typo somewhere, right?
You did make a typo: the initialization line appends to ex1out, the array left over from the first problem (eleven rows stepping by 0.1), so those rows end up at the start of ex3out and the loop reads its starting values from them. Changing ex1out to ex3out on that line fixes the output. Separately, repeatedly calling np.append is awkward; here is a simpler loop that builds a plain Python list and converts it to an array at the end:
ex3out = [[0, 0, ex3(0,0)]]
h = 0.01
n = 50
for i in range(1, n+1):
    # Compute the new t and y values and put in 0 as a dummy y' for now
    last = ex3out[-1]
    new = [last[0] + h, last[1] + h * last[2], 0]
    new[2] = ex3(new[0], new[1])
    # Append the new row
    ex3out.append(new)
print(np.array(ex3out))  # for pretty numpy display
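If you want to keep the n x 3 numpy array, preallocating it avoids the repeated np.append calls entirely. A sketch, assuming the same ex3, step size, and initial condition as above (the out name is mine):

import numpy as np

h, n = 0.01, 500
out = np.empty((n + 1, 3))
out[0] = [0.0, 0.0, ex3(0.0, 0.0)]           # t0, y0, y'(t0, y0)
for i in range(1, n + 1):
    t = out[i - 1, 0] + h                     # advance t by one step
    y = out[i - 1, 1] + h * out[i - 1, 2]     # Euler update using the stored slope
    out[i] = [t, y, ex3(t, y)]                # recompute y' at the new point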

Create array of lists with random values [duplicate]

This question already has answers here: How to generate n dimensional random variables in a specific range in python (2 answers). Closed 4 years ago.
I’m trying to create sample data like the smpl_dt array example below. I want to create an array where each element is a list of 8 random numbers between 0.0001 and 1.
I can easily create the list of 8 random numbers between 0 and 1 using:
Code:
[rd.uniform(0.0001,1) for _ in range(8)]
But I’m having trouble creating the array. Any tips are greatly appreciated.
Sample Data:
print(smpl_dt[0:5])
array([[0.0001, 0.0001, 0.3 , 0.0001, 0.2 , 0.0001, 0.2 , 0.3 ],
[0.1 , 0.1 , 0.1 , 0.2 , 0.2 , 0.2 , 0.1 , 0.0001],
[0.1 , 0.0001, 0.2 , 0.0001, 0.1 , 0.2 , 0.0001, 0.4 ],
[0.3 , 0.0001, 0.0001, 0.2 , 0.1 , 0.2 , 0.2 , 0.0001],
[0.2 , 0.3 , 0.1 , 0.0001, 0.2 , 0.1 , 0.1 , 0.0001]])
Use the size argument. For example,
>>> import numpy as np
>>> arr = np.random.uniform(0.0001, 1, size=(4, 8))
>>> arr
array([[0.67011692, 0.06662612, 0.13316262, 0.80666553, 0.88362879, 0.21492319, 0.22063457, 0.90038505],
[0.87799324, 0.6486384 , 0.27700837, 0.54103365, 0.52688455, 0.93159481, 0.09245974, 0.54593494],
[0.4680346 , 0.17802325, 0.21506341, 0.95917602, 0.20481784, 0.53165515, 0.1657028 , 0.39784648],
[0.38951888, 0.03457946, 0.90076103, 0.13769038, 0.303991 , 0.57457931, 0.64236861, 0.85915101]])
Without using numpy:
import random as rd

arrlen = 5  # however many rows (lists of 8) you need
arr = [[rd.uniform(0.0001, 1) for _ in range(8)] for _ in range(arrlen)]
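On recent NumPy versions, the Generator interface is the recommended way to draw the same samples; a brief sketch (the rng and smpl_dt names are mine):

import numpy as np

rng = np.random.default_rng()
smpl_dt = rng.uniform(0.0001, 1, size=(5, 8))   # 5 rows of 8 values drawn from [0.0001, 1)
print(smpl_dt[0:5])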

Python : Plotting in the same graph

I have a dataset like this, but for many ids:
Information = [{'id': 1,
                'a': array([0.7, 0.5, 0.20, 0.48, 0.79]),
                'b': array([0.1, 0.5, 0.96, 0.8, 0.7])},
               {'id': 2,
                'a': array([0.37, 0.55, 0.27, 0.47, 0.79]),
                'b': array([0.1, 0.5, 0.9, 0.87, 0.7])}]
I would like to plot these in one graph, with a on the x-axis and b on the y-axis, for many different ids.
I can make one plot by doing this:
a_info = information[1]['a']
b_info = information[2]['b']
plt.scatter(a_info, b_info)
plt.show()
but how do I do it for all plots?
e = [d['id'] for d in information]
for i in e:
    a_info = information[i]['a']
    b_info = information[i]['b']
    plt.scatter(a_info, b_info)
plt.show()
You can loop over the ids, and create plots for each substructure:
import matplotlib.pyplot as plt
from numpy import array

information = [{'id': 1, 'a': array([0.7, 0.5, 0.20, 0.48, 0.79]), 'b': array([0.1, 0.5, 0.96, 0.8, 0.7])},
               {'id': 2, 'a': array([0.37, 0.55, 0.27, 0.47, 0.79]), 'b': array([0.1, 0.5, 0.9, 0.87, 0.7])}]
colors = iter(['b', 'g', 'r', 'c', 'm', 'y', 'k', 'w'])
for i in information:
    plt.scatter(i['a'], i['b'], label='id{}'.format(i['id']), color=next(colors))
plt.legend(loc='upper left')
plt.show()
You can loop over all ids and plot them:
for i in Information:
    plt.scatter(i['a'], i['b'], label=i['id'])
plt.legend()
plt.show()
Output: a single scatter plot containing both ids, each with its own legend entry.
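Since there may be many ids, a fixed list of eight colors can run out. One option is to sample a colormap so every id gets its own color; a sketch, not from the original answers, assuming Information is the list from the question:

import numpy as np
import matplotlib.pyplot as plt

# One evenly spaced colormap sample per id
colors = plt.cm.viridis(np.linspace(0, 1, len(Information)))
for entry, color in zip(Information, colors):
    plt.scatter(entry['a'], entry['b'], label=entry['id'], color=color)
plt.legend()
plt.show()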

How to normalize data in a text file while preserving the first variable

I have a text file with this format:
1 10.0e+08 1.0e+04 1.0
2 9.0e+07 9.0e+03 0.9
2 8.0e+07 8.0e+03 0.8
3 7.0e+07 7.0e+03 0.7
I would like to preserve the first variable of every line and then normalize the data in all lines by the data on the first line. The end result would look something like:
1 1.0 1.0 1.0
2 0.9 0.9 0.9
2 0.8 0.8 0.8
3 0.7 0.7 0.7
so essentially, we are doing the following:
1 10.0e+08/10.0e+08 1.0e+04/1.0e+04 1.0/1.0
2 9.0e+07/10.0e+08 9.0e+03/1.0e+04 0.9/1.0
2 8.0e+07/10.0e+08 8.0e+03/1.0e+04 0.8/1.0
3 7.0e+07/10.0e+08 7.0e+03/1.0e+04 0.7/1.0
I'm still researching how to do this and will upload my attempt shortly. Also, can anyone point me to a place where I can learn more about manipulating data files?
Read your file into a numpy array and use numpy's broadcasting feature:
import numpy as np
data = np.loadtxt('foo.txt')
data = data / data[0]
#array([[ 1. , 1. , 1. , 1. ],
# [ 2. , 0.09, 0.9 , 0.9 ],
# [ 2. , 0.08, 0.8 , 0.8 ],
# [ 3. , 0.07, 0.7 , 0.7 ]])
np.savetxt('new.txt', data)
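Note that this divides the first column as well; it is preserved here only because data[0, 0] happens to be 1. (Also, with the numbers exactly as shown, the second column comes out as 0.09, 0.08, 0.07, because 10.0e+08 is 1e9 while the later rows use e+07.) To preserve the ID column explicitly regardless of its value, normalize only the remaining columns, something like:

import numpy as np

data = np.loadtxt('foo.txt')
data[:, 1:] = data[:, 1:] / data[0, 1:]   # divide every row, column-wise, by the first row; column 0 untouched
np.savetxt('new.txt', data)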
