I am trying to set the value of "a" like a switch-case. There are only two possible values for "a": 0 or 18. Take 18 as the initial value: the first name, "Devin", has a length less than or equal to 5, so "a" starts at 18. When a name longer than 5 characters appears, the value has to switch to 0 and stay 0 until the next name longer than 5 characters shows up. The output should be 18,0,0,0,18,0. The program should flip the value only when a name with length greater than 5 appears.
This is what I tried:
names = ["Devin","Ashish","Rachi","David","Dohyun","Seonhwa"]
for i in range(len(names)):
    #print(len(names[i]))
    if len(names[i])<=5:
        a =18
    else:
        a=0
    print(a)
names = ["Devin", "Ashish", "Rachi", "David", "Dohyun", "Seonhwa"]
# Pick a start value based on the first name
a = 18 if len(names[0]) <= 5 else 0
print(a)
for name in names[1:]:
    # Flip the value only when a name longer than 5 characters appears
    if len(name) > 5:
        a = 0 if a == 18 else 18
    print(a)
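The same toggle can be wrapped in a small function so that the sequence is returned rather than printed. This is only a sketch: the name `toggle_values` and its parameters are made up for illustration and are not part of the original answer.

```python
def toggle_values(names, threshold=5, high=18, low=0):
    """Return one value per name, flipping whenever a name is longer than threshold."""
    if not names:
        return []
    # The start value depends on the first name only.
    current = high if len(names[0]) <= threshold else low
    values = [current]
    for name in names[1:]:
        if len(name) > threshold:
            # A long name flips the current value; short names leave it alone.
            current = low if current == high else high
        values.append(current)
    return values


names = ["Devin", "Ashish", "Rachi", "David", "Dohyun", "Seonhwa"]
print(toggle_values(names))  # [18, 0, 0, 0, 18, 0]
```

Keeping the state in a single variable and flipping it in place avoids recomputing the value from each name's length, which is what made the original attempt print 18 for every short name.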
I am new to pandas DataFrames and was curious why my basic idea of adding new values row by row doesn't work here.
I also tried different ways with .loc[] and .append(), but obviously used them incorrectly (still plenty to learn).
Instructions
Add a column to data named length, defined as the length of each word.
Add another column named frequency, which is defined as follows for each word in data:
If count > 10, frequency is "frequent".
If 1 < count <= 10, frequency is "infrequent".
If count == 1, frequency is "unique".
My if statements fill the whole DataFrame with only the last value from the dictionary-like object (a Counter, from pandas/numpy?). The word and count values are all produced correctly inside the for loop, so I don't understand why the DataFrame doesn't take a new value on each iteration.
data['length'] = ''
data['frequency'] = ''
for word, count in counted_text.items():
    if count > 10:
        data.length = len(word)
        data.frequency = 'frequent'
    if 1 < count <=10:
        data.length = len(word)
        data.frequency = 'infrequent'
    if count == 1:
        data.length = len(word)
        data.frequency = 'unique'
print(word, len(word), '\n')
"""
This is working code that I googled
-----------------------------------
data = pd.DataFrame({
    "word": list(counted_text.keys()),
    "count": list(counted_text.values())
})
data["length"] = data["word"].apply(len)
data.loc[data["count"] > 10, "frequency"] = "frequent"
data.loc[data["count"] <= 10, "frequency"] = "infrequent"
data.loc[data["count"] == 1, "frequency"] = "unique"
"""
print(data.head(), '\n')
print(data.tail())
Output:
finis 5
word count length frequency
1 the 935 5 unique
2 tragedie 3 5 unique
3 of 576 5 unique
4 hamlet 97 5 unique
5 45513 5 unique
word count length frequency
5109 shooteexeunt 1 5 unique
5110 marching 1 5 unique
5111 peale 1 5 unique
5112 ord 1 5 unique
5113 finis 1 5 unique
Assuming you have only word and count in the data DataFrame and that count never has a value of 0, you could try the following:
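import numpy as np
# Length of each word, computed for the whole column at once
data['length'] = data['word'].str.len()
# Nested np.where: frequent, else infrequent, else unique
data['frequency'] = np.where(data['count'] > 10, 'frequent',
                    np.where((data['count'] > 1) & (data['count'] <= 10),
                             'infrequent', 'unique'))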
After @Sajan gave valid code, I came to the conclusion that a DataFrame doesn't need a for loop at all.
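The loop in the question ends with every row showing the last word's values because an assignment such as data.frequency = 'unique' sets the whole column, not a single row, so each iteration overwrites the previous one. As a hedged sketch, the labelling rule can be isolated in a plain function and applied once per word. The name `frequency_label` and the sample text are made up for illustration; with a DataFrame, the function could be applied with something like `data['count'].map(frequency_label)`.

```python
from collections import Counter


def frequency_label(count):
    """Map a word count to the label defined in the instructions."""
    if count > 10:
        return "frequent"
    if count > 1:
        return "infrequent"
    return "unique"  # assumes count is at least 1


# Hypothetical sample text, not the Hamlet data from the question.
counted_text = Counter(["the"] * 12 + ["of"] * 2 + ["finis"])

# One record per word, each with its own length and label.
rows = [
    {"word": word, "count": count, "length": len(word),
     "frequency": frequency_label(count)}
    for word, count in counted_text.items()
]
for row in rows:
    print(row)
```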
In a pandas Series, I want to walk through the values and stop as soon as the value has increased 5 times in a row. With a simple example it works so far:
list2 = pd.Series([2,3,3,4,5,1,4,6,7,8,9,10,2,3,2,3,2,3,4])
def cut(x):
    y = iter(x)
    for i in y:
        if x[i] < x[i+1] < x[i+2] < x[i+3] < x[i+4] < x[i+5]:
            return x[i]
            break
out = cut(list2)
index = list2[list2 == out].index[0]
So I get the correct output of 1 and an index of 5.
But if I use a second Series with shape (23999,) instead of (19,), I get this error:
pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 3489660928
You can do something like this:
# compare list2 with the previous values
s = list2.gt(list2.shift())
# looking at last 5 values
s = s.rolling(5).sum()
# select those equal 5
list2[s.eq(5)]
Output:
10 9
11 10
dtype: int64
The first index where it happens is
s.eq(5).idxmax()
# output 10
Also, you can chain them together:
(list2.gt(list2.shift())
      .rolling(5).sum()
      .eq(5).idxmax()
)
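Note that the rolling result marks the end of the run (index 10), while the question asked for its start (value 1 at index 5); with five consecutive increases the start lies five positions earlier. As a cross-check that does not rely on pandas, the same search can be sketched with a plain loop over positions. The function name `first_run_of_increases` is hypothetical.

```python
def first_run_of_increases(values, run=5):
    """Return (start, end) positions of the first `run` consecutive increases, or None."""
    streak = 0  # number of consecutive increases seen so far
    for pos in range(1, len(values)):
        if values[pos] > values[pos - 1]:
            streak += 1
            if streak == run:
                return pos - run, pos
        else:
            streak = 0  # any non-increase resets the count
    return None


data = [2, 3, 3, 4, 5, 1, 4, 6, 7, 8, 9, 10, 2, 3, 2, 3, 2, 3, 4]
start, end = first_run_of_increases(data)
print(start, data[start])  # 5 1
print(end, data[end])      # 10 9
```

Because the loop walks positions rather than using the values themselves as lookups, it does not depend on the values happening to be valid index labels.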
I'm attempting to build a dataframe that adds 1 to the prior row in a column until a condition is met. In this case, I want to continue to add rows until column 'AGE' = 100.
import pandas as pd
import numpy as np
RP = {'AGE' : pd.Series([10]),
      'SI' : pd.Series([60])}
RPdata = pd.DataFrame(RP)
i = RPdata.tail(1)['AGE']
RPdata2 = pd.DataFrame()
while [i < 100]:
    RPdata2['AGE'] = i + 1
    RPdata2['SI'] = RPdata.tail(1)['SI']
    RPdata = pd.concat([RPdata, RPdata2], axis = 0)
    break
print RPdata
Results
Age SI
0 10 60
0 11 60
I understand that the break statement prevents multiple iterations, but the loop appears to be infinite without it.
I'm attempting to achieve:
Age SI
0 10 60
0 11 60
0 12 60
0 13 60
0 14 60
. . 60
0 100 60
Is there a way to accomplish this with a while loop? Should I pursue a for loop solution instead?
There may be other problems, but you're going to get an infinite loop with while [i < 100]: since a non-empty list always evaluates to True. Change that to while (i < 100): (parens optional) and remove your break statement, which forces just one iteration.
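The point about list truthiness is easy to check in isolation, and the target table can be sketched by building the rows with a plain integer counter first. This is a minimal sketch under assumptions: the names `rows`, `age` and `si` are illustrative, and turning the result into a DataFrame (for example with `pd.DataFrame(rows)`) is left as a final step.

```python
# A one-element list is truthy even when the element inside it is False.
print(bool([False]))  # True

# Build one row per age with a while loop on a plain integer.
age, si = 10, 60
rows = [{"AGE": age, "SI": si}]
while age < 100:
    age += 1
    rows.append({"AGE": age, "SI": si})

print(len(rows))  # 91
print(rows[-1])   # {'AGE': 100, 'SI': 60}
```

Collecting plain rows and constructing the table once at the end also avoids concatenating DataFrames on every iteration.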
I have code that generates either 0 or 9 randomly. This code is run 289 times...
import random
track = 0
if track < 35:
    val = random.choice([0, 9])
    if val == 9:
        track += 1
else:
    val = 0
According to this code, once 9 has been generated 35 times, only 0 is generated. So the 9s are heavily biased toward the start, and toward the end the output is mostly 0.
Is there a way to reduce this bias so that the 9s are spread out fairly evenly over the 289 runs?
Thanks for any help in advance
Apparently you want 9 to occur 35 times, and 0 to occur for the remainder - but you want the 9's to be evenly distributed. This is easy to do with a shuffle.
values = [9] * 35 + [0] * (289 - 35)
random.shuffle(values)
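For completeness, here is the shuffle idea as a self-contained sketch. The function name `spread_nines` and the optional `seed` parameter are assumptions added for illustration. Every position is equally likely to hold a 9, so there is no bias toward the start, though the spacing is random rather than perfectly regular.

```python
import random


def spread_nines(total=289, nines=35, seed=None):
    """Return `total` values containing exactly `nines` 9s at random positions."""
    rng = random.Random(seed)  # seed is optional, only for reproducible runs
    values = [9] * nines + [0] * (total - nines)
    rng.shuffle(values)
    return values


values = spread_nines(seed=1)
print(len(values), values.count(9))  # 289 35
```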
It sounds like you want to add some bias to the numbers that are generated by your script. Accordingly, you'll want to think about how you can use probability to assign a correct bias to the numbers being assigned.
For example, let's say you want to generate a list of 289 integers where there is a maximum of 35 nines. 35 is approximately 12% of 289, and as such, you would assign a probability of .12 to the number 9. From there, you could assign some other (relatively small) probability to the numbers 1 - 8, and some relatively large probability to the number 0.
Walker's Alias Method appears to be able to do what you need for this problem.
General Example (strings A B C or D with probabilities .1 .2 .3 .4):
abcd = dict( A=1, D=4, C=3, B=2 )
# keys can be any immutables: 2d points, colors, atoms ...
wrand = Walkerrandom( abcd.values(), abcd.keys() )
wrand.random() # each call -> "A" "B" "C" or "D"
# fast: 1 randint(), 1 uniform(), table lookup
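Specific Example:
# Walkerrandom is not part of the standard library; it has to be defined or imported separately
# Integer keys need a dict literal; dict(1=725, ...) is a syntax error
numbers = {1: 725, 2: 725, 3: 725, 4: 725, 5: 725, 6: 725, 7: 725, 8: 725, 9: 12, 0: 3}
wrand = Walkerrandom( numbers.values(), numbers.keys() )
#Add looping logic + counting logic to keep track of 9's here
track = 0
i = 0
# 289 iterations, matching the number of runs in the question
while i < 289:
    if track < 35:
        val = wrand.random()
        if val == 9:
            track += 1
    else:
        val = 0
    i += 1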
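Walkerrandom above is not defined in the answer and is not part of the standard library. As a hedged alternative that uses only the standard library, `random.choices` accepts per-item weights. The sketch below assumes that only 0 and 9 are wanted, gives 9 a weight of 35/289, and keeps the cap of 35 from the question. The function name `weighted_run` is hypothetical.

```python
import random


def weighted_run(total=289, max_nines=35, seed=None):
    """Draw `total` values from {0, 9}, with 9 weighted at max_nines/total and capped."""
    rng = random.Random(seed)
    p_nine = max_nines / total
    values = []
    nines = 0
    for _ in range(total):
        if nines < max_nines:
            # Weighted draw: 9 with probability p_nine, otherwise 0.
            val = rng.choices([9, 0], weights=[p_nine, 1 - p_nine])[0]
            if val == 9:
                nines += 1
        else:
            val = 0  # cap reached, only zeros from here on
        values.append(val)
    return values


values = weighted_run(seed=7)
print(len(values), values.count(9))
```

Unlike the shuffle approach, the number of 9s varies from run to run and can fall short of 35; the cap only guarantees that it never exceeds 35.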
I have this dictionary:
goodDay= {'Class':[1,1,0,0,0,1,0,1,0,1], 'grade':[1,0,0,1,0,1,0,1,0,1]}
I want to traverse the values of the first key (K1) and of the second key (K2) together and check the following:
when the value of K2 is 1, how many times K1 is 1 and how many times K1 is 0,
and when K2 is 0, how many times K1 is 0 and how many times K1 is 1.
c = [[0,0],[0,0]]
for first, second in zip(goodDay['Class'], goodDay['grade']):
    c[second][first] += 1
You compare the two lists in the dictionary pairwise. Since each list contains only two values (0 and 1), together (as a Cartesian product) there are 4 possible combinations (00, 01, 10, 11), so we use a 2*2 list to store the counts. We then iterate through both lists and record each combination in that list. The first index of c is the grade value and the second index is the Class value, so at the end of the lines above the results can be read from c as follows:
c[0][0] is the number of zeros in goodDay['Class'] where the value at the same position in goodDay['grade'] is zero
c[0][1] is the number of ones in goodDay['Class'] where the value at the same position in goodDay['grade'] is zero
c[1][0] is the number of zeros in goodDay['Class'] where the value at the same position in goodDay['grade'] is one
c[1][1] is the number of ones in goodDay['Class'] where the value at the same position in goodDay['grade'] is one
Code:
good_day= {'class':[1,1,0,0,0,1,0,1,0,1], 'grade':[1,0,0,1,0,1,0,1,0,1]}
# grade_class[grade][class_] holds the count for each (grade, class) pair
grade_class = [[0,0],
               [0,0]]
for grade, class_ in zip(good_day['grade'], good_day['class']):
    grade_class[grade][class_] += 1
for class_ in (0,1):
    for grade in (0,1):
        print('You had', grade_class[grade][class_], 'grade',
              grade, 'when your class was', class_)
Output:
You had 4 grade 0 when your class was 0
You had 1 grade 1 when your class was 0
You had 1 grade 0 when your class was 1
You had 4 grade 1 when your class was 1
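The nested list works because both values are 0 or 1 and can double as indexes. As a sketch of an alternative that does not rely on that, `collections.Counter` can count the (grade, class) pairs directly; the variable name `pair_counts` is made up for illustration.

```python
from collections import Counter

good_day = {'class': [1, 1, 0, 0, 0, 1, 0, 1, 0, 1],
            'grade': [1, 0, 0, 1, 0, 1, 0, 1, 0, 1]}

# Count how often each (grade, class) pair occurs at the same position.
pair_counts = Counter(zip(good_day['grade'], good_day['class']))

for (grade, class_), n in sorted(pair_counts.items()):
    print('grade', grade, 'class', class_, 'count', n)
```

Keying the counts by the pair itself means the same code would still work if the values were labels other than 0 and 1.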