I have a series of numbers:
5138
22498
42955
I would like to add the 3 preceding numbers (each decreasing by 1) before each number above:
5135
5136
5137
5138
22495
22496
22497
22498
42952
42953
42954
42955
How to do that? Thanks.
Use a list comprehension to flatten the new values created by range:
import pandas as pd
s = pd.Series([5138, 22498, 42955])
N = 3
a = pd.Series([y for x in s for y in range(x - N, x + 1)])
print(a)
0 5135
1 5136
2 5137
3 5138
4 22495
5 22496
6 22497
7 22498
8 42952
9 42953
10 42954
11 42955
dtype: int64
Alternatively, create the ranges and flatten them with Series.explode; the final Series.reset_index(drop=True) restores the default index:
N = 3
a = s.apply(lambda x: range(x-N, x+1)).explode().reset_index(drop=True)
I did this in Python:
numbers = [5138, 22498, 42955]
for number in numbers:
    for i in reversed(range(number, number-4, -1)):
        print(i)
Hope this helps :)
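The same loop can be wrapped in a small reusable function. This is only a sketch: the name expand_backwards and the parameter n are made up for illustration and do not come from the answers above.

```python
def expand_backwards(numbers, n=3):
    """Return each number preceded by its n predecessors, in ascending order."""
    result = []
    for number in numbers:
        # range(number - n, number + 1) yields number - n, ..., number
        result.extend(range(number - n, number + 1))
    return result


print(expand_backwards([5138, 22498, 42955]))
```

Returning a list instead of printing makes it easy to pass the result on, for example to pd.Series if pandas is in use.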
How do I convert this column of values, mostly integers with some strings, to all integers?
The column looks like this,
x1
___
128455551
92571902
123125
985166
np.NaN
2241
1.50000MMM
2.5255MMM
1.2255MMMM
np.NaN
...
I want it to look like the following. In rows with MMM, the characters are dropped and the number is multiplied by a billion (10**9) and converted to an integer.
In rows with MMMM, the characters are dropped and the number is multiplied by a trillion (10**12) and converted to an integer.
Basically, each M means a factor of 1,000. There are other columns, so I cannot drop the np.NaN rows.
x1
___
128455551
92571902
123125
985166
np.NaN
2241
1500000000
2525500000
1225500000000
np.NaN
...
I tried this,
df['x1'] =np.where(df.x1.astype(str).str.contains('MMM'), (df.x1.str.replace('MMM', '').astype(float) * 10**9).astype(int), df.x1)
When I do it with just the 2 rows it works fine, but when I do it with the whole dataframe I get this error, IntCastingNaNError: Cannot convert non-finite values (NA or inf) to integer.
How do I fix it?
A possible solution:
def f(x):
    if isinstance(x, str):
        ms = x.count('M')
        return float(x.replace('M' * ms, '')) * 10**(3 * ms)
    else:
        return x
df['x1'] = df['x1'].map(f).astype('Int64')
Output:
x1
0 128455551
1 92571902
2 123125
3 985166
4 <NA>
5 2241
6 1500000000
7 2525500000
8 1225500000000
9 <NA>
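Multiplying a float by a power of ten can, in general, leave a tiny rounding error before the integer cast. A minimal sketch that sidesteps this with decimal.Decimal follows. The name parse_m_suffix is a hypothetical helper, and the code assumes the M characters only ever appear as a trailing suffix.

```python
from decimal import Decimal


def parse_m_suffix(value):
    """Hypothetical helper: turn '1.5MMM' into 1500000000.

    Non-string values (plain integers, NaN) are returned unchanged.
    """
    if not isinstance(value, str):
        return value
    digits = value.rstrip('M')            # numeric part without the suffix
    m_count = len(value) - len(digits)    # number of trailing M characters
    # Decimal keeps the digits exact, so the int() conversion does not truncate
    return int(Decimal(digits) * 1000 ** m_count)


print(parse_m_suffix('2.5255MMM'))
```

It could be passed to Series.map in place of f above.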
You can also try the following solution:
import numpy as np
(df.x1.str.extract('([^M]+)(M+)?').replace({np.nan: None})
 .assign(power=lambda x: 10 ** (3 * x.loc[:, 1].str.count('M').fillna(0)))
 .pipe(lambda d: d.loc[:, 0].replace({'np.NaN': None}).astype(float).mul(d.power)))
0 1.284556e+08
1 9.257190e+07
2 1.231250e+05
3 9.851660e+05
4 NaN
5 2.241000e+03
6 1.500000e+09
7 2.525500e+09
8 1.225500e+12
9 NaN
dtype: float64
For string values containing M, the cleaned number can be multiplied by 1000 once per M occurrence (per your condition "Basically each M means 1,000"):
df['x1'] = np.where(df.x1.str.contains('M'),
                    (df.x1.str.replace('M', '').astype(float)
                     * pow(1000, df.x1.str.count('M'))).astype('Int64'), df.x1)
print(df)
x1
0 128455551
1 92571902
2 123125
3 985166
4 NaN
5 2241
6 1500000000
7 2525500000
8 1225500000000
9 NaN
I'm attempting to write a function called multChart(x,y) that prints a multiplication table based on two inputs, one specifying the number of rows to print and another specifying the number of columns. So it would look like this:
>>> multChart(4,5):
1: 1 2 3 4 5
2: 2 4 6 8 10
3: 3 6 9 12 15
4: 4 8 12 16 20
Here's what my current code looks like:
def multChart(x,y):
    for i in range(1,x+1):
        print(i,':',i*1,i*2,i*3,i*4,i*5)
I'm totally stuck on how to implement the y value. I also know there should be a better way of printing the multiplication instead of i * multiples of five, but I'm not sure what loop to use. Any help would be greatly appreciated.
You need another loop inside your print for looping over the y range:
def multChart(x, y):
    for i in range(1, x+1):
        print(i, ':', *[i * z for z in range(1, y+1)])
def multChart(x,y):
    for i in range(1,x+1):
        print(i, ':', end=" ")
        for j in range(1,y+1):
            print(i*j, end=" ")
        print()
multChart(4,5)
produces
1 : 1 2 3 4 5
2 : 2 4 6 8 10
3 : 3 6 9 12 15
4 : 4 8 12 16 20
You can use a second for loop for the second index. Also, note that you can use the end argument of the print function.
def multChart(x,y):
    for i in range(1,x+1):
        print(i, ':', *map(lambda k: i*k, range(1, y+1)))
multChart(4,5)
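If the columns should line up as in the expected output, one option is to build each row as a string and right-align the cells. This is a sketch only: mult_chart_lines is a made-up name, and padding every cell to the width of the largest product is an assumption about the desired layout.

```python
def mult_chart_lines(rows, cols):
    """Return the multiplication table as a list of aligned text lines."""
    width = len(str(rows * cols))  # width of the largest product in the table
    lines = []
    for i in range(1, rows + 1):
        cells = " ".join(str(i * j).rjust(width) for j in range(1, cols + 1))
        lines.append(f"{i}: {cells}")
    return lines


for line in mult_chart_lines(4, 5):
    print(line)
```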
I want to print the following sequence of integers in a pyramid (odd rows sorted ascending, even rows sorted descending). If S=4, it must print four rows and so on.
Expected output:
1
3 2
4 5 6
10 9 8 7
I tried out the following code but it produced the wrong output.
S=int(input())
for i in range(1,S+1):
    y=i+(i-1)
    if i%2!=0:
        print(*range(i,y+1))
    elif i%2==0:
        print(*range(y,i-1,-1))
# Output:
# 1
# 3 2
# 3 4 5
# 7 6 5 4
You need some way of either keeping track of where you are in the sequence when printing each row, generating the entire sequence and then chunking it into rows, or... (the list of possible approaches goes on and on).
Below is a fairly simple approach that just keeps track of a range start value, calculates the range stop value based on the row number, and reverses even rows.
rows = int(input())
start = 1
for n in range(1, rows + 1):
    stop = n * (n + 1) // 2 + 1
    row = range(start, stop) if n % 2 else reversed(range(start, stop))
    start = stop
    print(*row)
# If rows input is 4, then output:
# 1
# 3 2
# 4 5 6
# 10 9 8 7
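The stop value above is based on the triangular numbers: row n ends at n(n+1)/2, so it starts at n(n-1)/2 + 1. As a sketch, each row can therefore be computed without carrying a running start value. The function name zigzag_rows is hypothetical.

```python
def zigzag_rows(rows):
    """Build the pyramid rows; even-numbered rows are reversed."""
    result = []
    for n in range(1, rows + 1):
        first = n * (n - 1) // 2 + 1   # one past the previous triangular number
        last = n * (n + 1) // 2        # the n-th triangular number
        row = list(range(first, last + 1))
        if n % 2 == 0:
            row.reverse()
        result.append(row)
    return result


for row in zigzag_rows(4):
    print(*row)
```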
Using itertools.count and just reversing the sublist before printing on even rows:
from itertools import count
s = 4
l = count(1)
for i in range(1, s+1):
    temp = []
    for j in range(i):
        temp.append(next(l))
    if i % 2:
        print(' '.join(map(str, temp)))
    else:
        print(' '.join(map(str, temp[::-1])))
1
3 2
4 5 6
10 9 8 7
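The inner loop that calls next repeatedly can also be expressed with itertools.islice, which takes the next n items from the counter in one step. A minimal sketch, with pyramid_rows as a made-up generator name:

```python
from itertools import count, islice


def pyramid_rows(rows):
    """Yield the pyramid rows, reversing every even-numbered row."""
    counter = count(1)
    for n in range(1, rows + 1):
        chunk = list(islice(counter, n))  # next n integers from the counter
        yield chunk if n % 2 else chunk[::-1]


for row in pyramid_rows(4):
    print(' '.join(map(str, row)))
```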
Assume an easy dataframe, for example
A B
0 1 0.810743
1 2 0.595866
2 3 0.154888
3 4 0.472721
4 5 0.894525
5 6 0.978174
6 7 0.859449
7 8 0.541247
8 9 0.232302
9 10 0.276566
How can I retrieve an index value of a row, given a condition?
For example:
dfb = df[df['A']==5].index.values.astype(int)
returns [4], but what I would like to get is just 4. This is causing me trouble later in the code.
Based on some conditions, I want to have a record of the indexes where that condition is fulfilled, and then select rows between.
I tried
dfb = df[df['A']==5].index.values.astype(int)
dfbb = df[df['A']==8].index.values.astype(int)
df.loc[dfb:dfbb,'B']
for a desired output
A B
4 5 0.894525
5 6 0.978174
6 7 0.859449
but I get TypeError: '[4]' is an invalid key
The easiest way is to add [0], which selects the first value of the one-element array (the second pair of lines is an equivalent alternative):
dfb = df[df['A']==5].index.values.astype(int)[0]
dfbb = df[df['A']==8].index.values.astype(int)[0]
dfb = int(df[df['A']==5].index[0])
dfbb = int(df[df['A']==8].index[0])
But if some values might not match, an error is raised because the first value does not exist.
A solution is to use next with iter, which returns a default value if nothing matches:
dfb = next(iter(df[df['A']==5].index), 'no match')
print (dfb)
4
dfb = next(iter(df[df['A']==50].index), 'no match')
print (dfb)
no match
Then it seems you need to subtract 1, because loc includes both endpoints of the slice:
print (df.loc[dfb:dfbb-1,'B'])
4 0.894525
5 0.978174
6 0.859449
Name: B, dtype: float64
Another solution with boolean indexing or query:
print (df[(df['A'] >= 5) & (df['A'] < 8)])
A B
4 5 0.894525
5 6 0.978174
6 7 0.859449
print (df.loc[(df['A'] >= 5) & (df['A'] < 8), 'B'])
4 0.894525
5 0.978174
6 0.859449
Name: B, dtype: float64
print (df.query('A >= 5 and A < 8'))
A B
4 5 0.894525
5 6 0.978174
6 7 0.859449
To answer the original question on how to get the index as an integer for the desired selection, the following will work:
df[df['A']==5].index.item()
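A one-element lookup like this only makes sense when exactly one row matches. The sketch below makes that assumption explicit and raises a descriptive error otherwise. The helper name single_index_label and the choice of LookupError are illustrative, not part of pandas.

```python
import pandas as pd


def single_index_label(df, column, value):
    """Return the index label of the only row where df[column] == value."""
    mask = (df[column] == value).to_numpy()
    matches = df.index[mask]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one row with {column} == {value!r}, "
            f"found {len(matches)}"
        )
    return matches[0]


df = pd.DataFrame({"A": range(1, 11)})
start = single_index_label(df, "A", 5)
stop = single_index_label(df, "A", 8)
# loc includes both endpoints, so subtract 1 to exclude the row where A == 8
print(df.loc[start:stop - 1, "A"].tolist())
```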
A short summary for searching by row position:
This can be useful if you don't know the column values or if the columns have non-numeric values.
If you want to get the index label as an integer, you can also do:
item = df[4:5].index.item()
print(item)
4
It also works via NumPy or a list:
first_np = df[4:7].index.to_numpy()[0]
first_list = df[4:7].index.to_list()[0]
The number in [x] picks a position within the range [4:7]; for example, if you want 6:
third_np = df[4:7].index.to_numpy()[2]
print(third_np)
6
for DataFrame:
df[4:7]
A B
4 5 0.894525
5 6 0.978174
6 7 0.859449
or:
df[(df.index>=4) & (df.index<7)]
A B
4 5 0.894525
5 6 0.978174
6 7 0.859449
Wanting to include the row where A == 5 and all rows up to but not including the row where A == 8 means we will end up using iloc (loc includes both ends of a slice).
In order to get the index labels we use idxmax. This will return the first position of the maximum value. I run this on a boolean series where A == 5 (then when A == 8) which returns the index value of when A == 5 first happens (same thing for A == 8).
Then I use searchsorted to find the ordinal position of where the index label (that I found above) occurs. This is what I use in iloc.
i5, i8 = df.index.searchsorted([df.A.eq(5).idxmax(), df.A.eq(8).idxmax()])
df.iloc[i5:i8]
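The difference between the two slicing styles can be seen in a small sketch. The frame below is a stand-in with a default integer index, not the asker's data.

```python
import pandas as pd

df = pd.DataFrame({"A": range(1, 11)})

# Label-based slice: both endpoints (labels 4 and 7) are included
inclusive = df.loc[4:7]

# Position-based slice: the stop position 7 is excluded
exclusive = df.iloc[4:7]

print(len(inclusive), len(exclusive))
```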
numpy
You can further enhance this by using the underlying numpy objects and the analogous numpy functions. I wrapped it up into a handy function.
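def find_between(df, col, v1, v2):
    vals = df[col].values
    # argmax on a boolean array gives the position of the first True
    mx1, mx2 = (vals == v1).argmax(), (vals == v2).argmax()
    idx = df.index.values
    # look up the labels at those positions, then find their ordinal positions
    i1, i2 = idx.searchsorted([idx[mx1], idx[mx2]])
    return df.iloc[i1:i2]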
find_between(df, 'A', 5, 8)
Or you can use a for loop to unpack the single element of each array:
for i in dfb:
    dfb = i
for j in dfbb:
    dfbb = j
This way the element 4 is taken out of the array.
I would like to create a triangle from user input. I have already created the function for creating triangles.
Function:
def triangle(rows):
    PrintingList = list()
    for rownum in range (rows ):
        PrintingList.append([])
        for iteration in range (rownum):
            newValue = raw_input()
            PrintingList[rownum].append(newValue)
But this takes input in this way..
3
7
4
2
4
6
8
5
9
3
I need it to take input like this:
3
7 4
2 4 6
8 5 9 3
How do I change it to take input in this way? I need some guidance on this.
for rownum in range (rows ):
    PrintingList.append([])
    newValues = raw_input().strip().split()
    PrintingList[rownum] += newValues
It isn't clear from the question whether you need to convert the input from strings to ints. If you do, it will look like this:
for rownum in range (rows ):
    PrintingList.append([])
    newValues = map(int, raw_input().strip().split())
    PrintingList[rownum] += newValues
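The snippets above use raw_input, which exists only in Python 2. A rough Python 3 sketch of the same idea follows. The name read_triangle and the read_line parameter are assumptions made for illustration, and checking that row k contains k values is an extra not asked for in the question.

```python
def read_triangle(rows, read_line=input):
    """Read `rows` lines of space-separated integers into a list of lists."""
    triangle = []
    for rownum in range(rows):
        values = [int(token) for token in read_line().split()]
        # Row 0 should hold 1 value, row 1 should hold 2 values, and so on
        if len(values) != rownum + 1:
            raise ValueError(
                f"row {rownum + 1} should contain {rownum + 1} values, "
                f"got {len(values)}"
            )
        triangle.append(values)
    return triangle
```

Calling read_triangle(4) reads four lines from standard input; passing a different read_line callable makes the function easy to try without typing the rows by hand.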