Values on the X-axis shifting toward zero - python

I generated a graph in which the values on the X-axis start from 0 and go to 1000, fifty by fifty, like 0, 50, 100, 150, ..., 900, 950, 1000. However, I want to divide the values on the X-axis by 10 (I want to convert the values on the x-axis into 0, 5, 10, 15, ..., 90, 95, 100).
Index_time is 1001
index_time = len(df.index)
ax.plot(np.arange(index_time), df["SoluteHBonds"], color="blue")
ranges=(np.arange(0,index_time,50))
ax.set_xticks(ranges)
When I divide the values on the X-axis via np.true_divide(ranges, 10), all the ticks on the X-axis shift toward 0.
I also tried creating a list first and then dividing each element by 10, but the result is still the same.
lst_range=list(range(0,int((index_time-1)/10),5))
ax.set_xticks([time/10 for time in lst_range])
What could be the problem or what is the thing that I am missing in this case?
Thanks in advance!
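A likely explanation, offered as a sketch rather than a confirmed diagnosis: set_xticks controls where the ticks are placed in data coordinates, so dividing the positions by 10 moves every tick into the 0-100 part of an axis that still spans 0-1000. One way around this is to keep the tick positions unchanged and only change the text shown at each tick. The helper name scaled_ticks below is hypothetical, and ax is assumed to be the Axes object from the question.

```python
def scaled_ticks(n_points, step=50, divisor=10):
    """Return tick positions in data units and the labels to display at them."""
    positions = list(range(0, n_points, step))
    # The labels are the positions divided by `divisor`; the positions stay put.
    labels = [str(p // divisor) for p in positions]
    return positions, labels


positions, labels = scaled_ticks(1001)

# With the Axes from the question (assumed to be named ax), these would be applied as:
# ax.set_xticks(positions)
# ax.set_xticklabels(labels)
```

An alternative with the same effect is to plot against np.arange(index_time) / 10 so the data itself is in the new units, in which case the ticks would be set at 0, 5, ..., 100 directly.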

Related

How do I draw seaborn boxplot with two data sets of different length?

I have two data sets, NA and HG
len(NA)=267
NA = [73,49,53...]
len(HG)=176 (HG is a list similar to NA)
I want to draw the two data sets in one plot like this (I got this one by drawing them independently and then modifying the plot in Photoshop, which I cannot do for another data set because the axes differ):
Seaborn accepts data as NumPy arrays or pandas DataFrames, both of which require the columns to be of equal length. That does not hold in my case, because HG has 176 data points and NA has 267.
Currently, I convert the list to a pandas DataFrame and then plot via
HG = sns.boxplot(data=HG, width = 0.2)
I tried HG = sns.boxplot(data=HG+NA, width = 0.2) and it returned an empty plot, so... help please.
Thank you so much!
The following creates a DataFrame where the missing HG values are filled with NaNs. These are ignored by the boxplot. There are many ways to generate a DataFrame with the length of the longest list. Alternatives to the join shown below would be itertools.zip_longest or pd.concat, as suggested by @Kenan.
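import pandas as pd
import seaborn as sns
NA = [73, 49, 53, 20, 20, 20, 20, 20, 20, 20, 20, 20]
HG = [73, 30, 60]
# join aligns on the index, so the shorter HG column is padded with NaN
df = pd.Series(NA, name="NA").to_frame().join(pd.Series(HG, name="HG"))
sns.boxplot(data=df, width = 0.2)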
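For the itertools.zip_longest alternative mentioned above, a minimal sketch could look like the following. The sample values are shortened, made-up data, and the final DataFrame step is only indicated in a comment because it assumes pandas and seaborn are imported as pd and sns.

```python
from itertools import zip_longest
import math

NA = [73, 49, 53, 20, 20]
HG = [73, 30, 60]

# Pair the values row by row, padding the shorter list with NaN.
rows = list(zip_longest(NA, HG, fillvalue=math.nan))

# Transpose back into two equal-length columns.
na_col, hg_col = (list(col) for col in zip(*rows))

# With pandas and seaborn available, the rows could then be plotted as:
# df = pd.DataFrame(rows, columns=["NA", "HG"])
# sns.boxplot(data=df, width=0.2)
```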
Or maybe you are interested in using Holoviews, which gives you fully interactive plots in a very simple manner (when used in Jupyter with a bokeh backend). For your case that would look like the following:
import holoviews as hv
NA = [73, 49, 53, 20, 20, 20, 20, 20, 20, 20, 20, 20]
HG = [73, 30, 60]
hv.BoxWhisker(NA, label="NA") * hv.BoxWhisker(HG, label="HG")
I assume NA and HG are DataFrames with one column each, since you're plotting a box plot. You can concat them into one DataFrame and then plot. There will be NaNs in the shorter column, but seaborn will ignore those.
df = pd.concat([NA, HG], axis=1)
sns.boxplot(data=df)

Extracting data from a histogram with custom bins in Python

I have a data set of distances between two particles, and I want to bin these data in custom bins. For example, I want to see how many distance values lie in the interval from 1 to 2 micrometers, and so on. I wrote some code for it, and it seems to work. This is my code for this part:
#Custom binning of data
bins= [0,1,2,3,4,5,6,7,8,9,10]
fig, ax = plt.subplots(n,m,figsize = (30,10)) #using this because I actually have 5 histograms, but only posted one here
ax.hist(dist_from_spacer1, bins=bins, edgecolor="k")
ax.set_xlabel('Distance from spacer 1 [µm]')
ax.set_ylabel('counts')
plt.xticks(bins)
plt.show()
However, now I wish to extract those data values from the intervals, and store them into lists. I tried to use:
np.histogram(dist_from_spacer1, bins=bins)
However, this just gives how many data points are on each bin and the bin intervals, just like this:
(array([ 0, 0, 44, 567, 481, 279, 309, 202, 117, 0]),
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))
How can I get the exact data that belong to each histogram bin?
Yes, np.histogram calculates only what is needed to draw a histogram, so it does not keep the specific data points, just the bin boundaries and the count for each bin. However, the bin boundaries are sufficient to achieve what you want by using np.digitize:
counts, bins = np.histogram(dist_from_spacer1)
indices = np.digitize(dist_from_spacer1, bins)
# np.digitize returns indices from 0 to len(bins), so allocate len(bins) + 1 lists
lists = [[] for _ in range(len(bins) + 1)]
for i, x in zip(indices, dist_from_spacer1):
    lists[i].append(x)
In your case, the bins' boundaries are predefined, so you can use np.digitize directly
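As a rough illustration with the predefined bins from the question, here is a dependency-free sketch. It swaps np.digitize for the standard library's bisect.bisect_right, which returns the same index for increasing bin edges, and it puts values equal to the last edge into the last bin, as np.histogram does. The helper name values_per_bin and the sample distances are made up for the example.

```python
from bisect import bisect_right

bins = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
dist_from_spacer1 = [2.3, 2.9, 3.1, 4.75, 8.0, 10.0]  # made-up sample data


def values_per_bin(values, edges):
    """Group values by bin: [edges[i], edges[i+1]) for all bins, last bin closed."""
    groups = [[] for _ in range(len(edges) - 1)]
    last = len(groups) - 1
    for x in values:
        if edges[0] <= x <= edges[-1]:  # skip values outside the bin range
            # bisect_right(edges, x) is the index np.digitize(x, edges) would give
            i = min(bisect_right(edges, x) - 1, last)
            groups[i].append(x)
    return groups


groups = values_per_bin(dist_from_spacer1, bins)
```

Here groups[2] holds the values between 2 and 3 micrometers, groups[3] those between 3 and 4, and so on, so len(groups[i]) should agree with the counts returned by np.histogram for the same bins.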

Change linear x-axis to circular x-axis

I have a series of numbers from 0 to 360 that I want to plot on the x-axis. The x-axis should be "circular", ie there should be no negative numbers before zero, but 359 instead of -1, 358 instead of -2, etc.
I would like a plot whose x-axis goes from 320 to 40, something like:
https://imgur.com/k1Ss2WJ
I don't want to manually change the data and the ticks on the axes, but I'd like to know if there is a more direct way, keeping the data as it is.
It's pretty simple. You need to use %, known as the modulo operator. This is how you'll convert your x axis numbers:
# Say your numbers are like these:
xaxis = [-1, -2, 600, 200, 360, 0, 6]
mod_xaxis = [x % 360 for x in xaxis]
# mod_xaxis is now [359, 358, 240, 200, 0, 0, 6]
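For the 320-to-40 window in the question, the modulo idea can also be applied to the tick labels instead of the data. The sketch below is only one possible approach and does shift the plotted positions: angles are mapped to the range -180 to 180 so the window is continuous around zero, and the labels are wrapped back with % 360. The helper name to_signed is hypothetical.

```python
def to_signed(angle):
    """Map an angle in degrees to the equivalent value in [-180, 180)."""
    return (angle + 180) % 360 - 180


angles = [320, 340, 359, 0, 20, 40]

# Positions used for plotting: continuous across the 359 -> 0 wrap.
plot_x = [to_signed(a) for a in angles]

# Ticks are placed at the signed positions but labelled with the wrapped angle.
tick_positions = list(range(-40, 41, 20))
tick_labels = [str(t % 360) for t in tick_positions]

# With a Matplotlib Axes (assumed to be named ax), these would be applied as:
# ax.set_xticks(tick_positions)
# ax.set_xticklabels(tick_labels)
```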

The bin count displayed by pyplot hist doesn't match the actual counts?

I have a list of ints--I call it 'hours1'--ranging from 0-23. Now this list is for 'hours' of a day in a 24 hour clock. I, however, want to transform it to a different time zone (move up 7 hours). This is simple enough, and I do it so that now I have 2 lists: hours1 and hours2.
I use the following code to plot a histogram:
bins = range(24)
plt.hist(hours, bins=bins, density=False, facecolor='red', alpha=0.5)
plt.axis([0, 23, 0, 1000])
It works perfectly for hours1. For hours2 the last value (that of the bin for 23s) is too high. This is not a counting or transformation error, because when I count my hours2 list I get 604 23s, which matches what I expect (having 604 16s in hours1).
So this is a very long-winded way of saying that the heights of the bins do not match what I get when I count the data myself...
The issue was a binning one. In short, I wasn't paying attention to what I wanted to display. More specifically, this was the correct code:
bins = range(25)
plt.hist(hours, density=False, facecolor='green', alpha=0.5, bins=bins)
plt.axis([0, 24, 0, 1500])
That is, the hours run from 0 to 23, which means 24 separate 'hour bins' counting 0. With bins = range(24) there are only 23 bins, and the last one covers both 22 and 23. The correct edge values are bins = range(25) (so that 23 gets placed in the 23-24 bin), and the correct axis is 0 to 24 (so bin 23 has a width of 1). Simple mistake, but I guess we've all bin there and done that?
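To see the effect without plotting, here is a small dependency-free sketch that counts values the way np.histogram and plt.hist do: every bin is half-open except the last one, which also includes its right edge. The function name hist_counts and the sample hours are made up for the illustration.

```python
from bisect import bisect_right


def hist_counts(values, edges):
    """Count values per bin: [edges[i], edges[i+1]) for all bins, last bin closed."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue  # values outside the bin range are ignored
        i = min(bisect_right(edges, v) - 1, len(counts) - 1)
        counts[i] += 1
    return counts


hours = [22] * 3 + [23] * 5  # made-up sample: three 22s and five 23s

# Edges 0..23 give 23 bins, so the 22s and 23s share the last bin.
merged = hist_counts(hours, list(range(24)))

# Edges 0..24 give 24 bins, so 22 and 23 are counted separately.
separate = hist_counts(hours, list(range(25)))
```

With these inputs, merged[-1] is 8 while separate[-2:] is [3, 5], which mirrors the inflated last bar described in the question.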

Set Y labels from a text file - matplotlib

Using matplotlib, I have plotted the following graph from a text file. I have a series of values stored in another text file that I want to use to represent the Y axis on this graph.
The values will therefore be this:
Essentially, instead of displaying 0, 24, 40, ....., 16 it will show the frequencies represented in the text file, between 0, 1000, ....., 6000
Any help would be greatly appreciated!
