GeeksforGeeks Frequencies of Limited Range Array Elements problem - Python

This is an exercise from the GfG must-do questions, but my code is not passing all the test cases.
Can anyone please help me out here?
Wrong Answer.
Possibly your code doesn't work correctly for multiple test-cases (TCs).
The first test case where your code failed:
Input:
37349
27162 38945 3271 34209 37960 17314 13663 17082 37769 2714 19280 17626 34997 33512 29275 25207 4706 12532 34909 23823 272 29688 19312 8154 5091 26858 30814 19105 14105 11303 16863 1861 2961 36601 10157 114 11491 31810 29152 2627 14327 30116 14828 37781 38925 16319 10972 4506 18669 19366 28984 6948 15170 24135 6256 38121 3835 38031 9855 25152 19132 23573 29587 1719 33440 26311 12647 23022 34206 39955 3791 18555 336 7317 12033 7278 27508 5521 24935 15078 915 35478 37253 6863 39182 23429 33867.................
Its Correct output is:
2 4 1 2 5 2 0 4 1 3 1 2 1 3 2 4 4 1 1 0 2 0 4 1 3 5 1 0 1 2 1 3 2 0 1 1 2 0 0 2 1 2 2 1 4 2 0 1 2 2 0 1 2 0 2 4 4 5 2 5 2 1 5 1 2 1 0 1 1 2 2 1 3 1 2 0 3 4 1 2 0 2 3 5 2 2 1 3 1 4 0 3 5 1 1 3 1 2 2 3 2 2 4 1 1 3 1 4 3 4 0 2 1 4 4 2 2 3 3 0 0 0 4 1 2 1 2 4 1 3 1 2 4 0 2 1 1 1 0 3 4 3 2 0 3 0 0 0 1 1 0 0 2 2 3 0 1 2 2 2 0 2 3 2 1 1 3 0 1 5 1 1 1 0 2 0 3 1 2 1 1 1 2 3 3 1 1 3 1 4 1 3 1 1 1 2 2 0 1 0 2 2 0 2 2 2 1 4 1 0 3 1 2 0 3 1 2 1 8 3 0 0 1 1 1 1 2 1 1 4 1 3 0 3 2 1 1 1 1 2 4 2 2 1 4 2 1 3 1 0 .................
And Your Code's output is:
1 0 0 0 1 0 0 1 0 2 1 2 0 0 1 1 2 1 1 0 0 0 2 1 2 2 0 0 1 0 1 2 2 0 1 1 1 0 0 1 0 1 0 0 1 0 0 0 1 1 0 1 1 0 1 2 3 3 2 2 2 1 3 1 1 1 0 1 0 1 0 0 1 1 2 0 1 3 1 0 0 2 1 4 0 1 0 1 0 3 0 1 2 0 0 1 1 1 1 3 1 0 2 1 0 3 1 3 2 2 0 2 0 2 3 1 0 0 1 0 0 0 3 0 0 0 1 3 1 2 0 1 2 0 2 0 0 0 0 2 2 2 1 0 0 0 0 0 1 0 0 0 2 2 2 0 0 0 1 2 0 0 2 1 1 1 2 0 1 3 0 1 0 0 1 0 1 1 1 0 0 0 1 1 0 1 0 2 0 2 1 3 1 0 1 2 0 0 1 0 1 1 0 1 2 1 0 3 0 0 1 0 1 0 2 0 2 1 4 2 0 0 1 0 0 1 2 0 0 1 0 2 0 2 2 1 0 0 0 0 2 0 1 0 2 1 0 2 0 0 .................
Given an array A[] of n positive integers, which can contain integers from 1 to n, where elements can be repeated or absent from the array, your task is to count the frequency of all elements from 1 to n.
Input:
n = 5
A[] = {2,3,2,3,5}
Output:
0 2 2 0 1
Explanation:
Counting frequencies of each array element
We have:
1 occurring 0 times.
2 occurring 2 times.
3 occurring 2 times.
4 occurring 0 times.
5 occurring 1 time.
Problem link: https://practice.geeksforgeeks.org/problems/frequency-of-array-elements-1587115620/1
class Solution:
    # Function to count the frequency of all elements from 1 to N in the array.
    def frequencycount(self, A, N):
        s = {}
        for i in A:  # build a value -> count map in one pass
            if i in s:
                s[i] += 1
            else:
                s[i] = 1
        for i in range(1, len(A) + 1):  # write the frequency of i into A[i-1]
            if i in s:
                A[i - 1] = s[i]
            else:
                A[i - 1] = 0
        return A

#{
# Driver Code Starts
# Initial Template for Python 3
import math

if __name__ == "__main__":
    T = int(input())
    while T > 0:
        N = int(input())
        A = [int(x) for x in input().strip().split()]
        ob = Solution()
        ob.frequencycount(A, N)
        for i in range(len(A)):
            print(A[i], end=" ")
        print()
        T -= 1
# } Driver Code Ends

Suppose you have a list:
l = [0, 2, 5, 6, 0, 5, 3]
Then you can do:
res = [l.count(e) for e in range(max(l) + 1)]  # [2, 0, 1, 1, 0, 2, 1]
This gives a list with the number of times each value e from 0 to max(l) appears in the original list. Note the + 1: range(max(l)) stops one short and would miss max(l) itself.
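Adapted to this problem's 1 to n range, a hedged sketch: list.count rescans the whole list for every counted value, so the comprehension is O(n^2) and may time out on large GfG inputs, while a single collections.Counter pass is O(n).

from collections import Counter

A = [2, 3, 2, 3, 5]
n = 5

# O(n^2): one full scan of A per counted value
res = [A.count(e) for e in range(1, n + 1)]        # [0, 2, 2, 0, 1]

# O(n): count everything in one pass, then look up 1..n
counts = Counter(A)
res = [counts.get(e, 0) for e in range(1, n + 1)]  # [0, 2, 2, 0, 1]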

Related

Pandas group consecutive and label the length

I want to label each row with the length of the consecutive run it belongs to:
a
---
1
0
1
1
0
1
1
1
0
1
1
I want:
a | c
--------
1 1
0 0
1 2
1 2
0 0
1 3
1 3
1 3
0 0
1 2
1 2
Then I can calculate the mean of the "b" column grouped by "c". I tried shift, cumsum, and cumcount, but none of them worked.
Use GroupBy.transform over consecutive groups, then set 0 where a is not 1:
df['c1'] = (df.groupby(df.a.ne(df.a.shift()).cumsum())['a']
              .transform('size')
              .where(df.a.eq(1), 0))
print (df)
a b c c1
0 1 1 1 1
1 0 2 0 0
2 1 3 2 2
3 1 2 2 2
4 0 1 0 0
5 1 3 3 3
6 1 1 3 3
7 1 3 3 3
8 0 2 0 0
9 1 2 2 2
10 1 1 2 2
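To see why df.a.ne(df.a.shift()).cumsum() forms the groups: comparing a against its shifted self is True exactly where a new run starts, so the cumulative sum bumps a counter at every run boundary and yields one id per consecutive run. A quick illustration on the df above:

g = df.a.ne(df.a.shift()).cumsum()
print(g.tolist())  # [1, 2, 3, 3, 4, 5, 5, 5, 6, 7, 7] - one id per run of equal values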
If there are only 0 and 1 values, it is possible to multiply by a instead:
df['c1'] = (df.groupby(df.a.ne(df.a.shift()).cumsum())['a']
              .transform('size')
              .mul(df.a))
print (df)
a b c c1
0 1 1 1 1
1 0 2 0 0
2 1 3 2 2
3 1 2 2 2
4 0 1 0 0
5 1 3 3 3
6 1 1 3 3
7 1 3 3 3
8 0 2 0 0
9 1 2 2 2
10 1 1 2 2
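With c1 in place, the asker's final step (the mean of the "b" column per run label) is a plain groupby. A hedged sketch, assuming the rows labeled 0 (the non-1 runs) should be excluded:

# mean of b per run-length label, ignoring the 0 label
print(df[df.c1.ne(0)].groupby('c1')['b'].mean())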

Cumulative sum problem considering data till last record with multiple IDs

I have a dataset with multiple IDs and dates, in which I have created a column for cumulative supply in Python.
My data is as follows
SKU Date Demand Supply Cum_Supply
1 20160207 6 2 2
1 20160214 5 0 2
1 20160221 1 0 2
1 20160228 6 0 2
1 20160306 1 0 2
1 20160313 101 0 2
1 20160320 1 0 2
1 20160327 1 0 2
2 20160207 0 0 0
2 20160214 0 0 0
2 20160221 2 0 0
2 20160228 2 0 0
2 20160306 2 0 0
2 20160313 1 0 0
2 20160320 1 0 0
2 20160327 1 0 0
where Cum_Supply was calculated by:
import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_product([np.unique(data.Date), data.SKU.unique()])
data2 = data.set_index(['Date', 'SKU']).reindex(idx).fillna(0)
data2 = (pd.concat([data2, data2.groupby(level=1).cumsum().add_prefix('Cum_')], axis=1)
           .sort_index(level=1)
           .reset_index())
I want to create a column 'True_Demand', which is the maximum unfulfilled demand up to that date: max(Demand - Supply) + Cum_Supply.
So my output would be something like this:
SKU Date Demand Supply Cum_Supply True_Demand
1 20160207 6 2 2 6
1 20160214 5 0 2 7
1 20160221 1 0 2 7
1 20160228 6 0 2 8
1 20160306 1 0 2 8
1 20160313 101 0 2 103
1 20160320 1 0 2 103
1 20160327 1 0 2 103
2 20160207 0 0 0 0
2 20160214 0 0 0 0
2 20160221 2 0 0 2
2 20160228 2 0 0 2
2 20160306 2 0 0 2
2 20160313 1 0 0 2
2 20160320 1 0 0 2
2 20160327 1 0 0 2
So for the 3rd record (20160221), the max unfulfilled demand before 20160221 was 5, so the true demand is 5 + 2 = 7, even though the unfulfilled demand on that date itself was only 1 + 2.
Code for the dataframe:
data = pd.DataFrame({'SKU': [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2],
                     'Date': [20160207, 20160214, 20160221, 20160228, 20160306, 20160313, 20160320, 20160327,
                              20160207, 20160214, 20160221, 20160228, 20160306, 20160313, 20160320, 20160327],
                     'Demand': [6, 5, 1, 6, 1, 101, 1, 1, 0, 0, 2, 2, 2, 1, 1, 1],
                     'Supply': [2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]},
                    columns=['Date', 'SKU', 'Demand', 'Supply'])
Would you try this pretty fun one-liner?
(data.groupby('SKU', as_index=False, group_keys=False)
     .apply(lambda x: x.assign(Cum_Supply=x.Supply.cumsum())
                       .pipe(lambda x: x.assign(
                           True_Demand=(x.Demand - x.Supply + x.Cum_Supply).cummax()))))
Output:
Date SKU Demand Supply Cum_Supply True_Demand
0 20160207 1 6 2 2 6
1 20160214 1 5 0 2 7
2 20160221 1 1 0 2 7
3 20160228 1 6 0 2 8
4 20160306 1 1 0 2 8
5 20160313 1 101 0 2 103
6 20160320 1 1 0 2 103
7 20160327 1 1 0 2 103
8 20160207 2 0 0 0 0
9 20160214 2 0 0 0 0
10 20160221 2 2 0 0 2
11 20160228 2 2 0 0 2
12 20160306 2 2 0 0 2
13 20160313 2 1 0 0 2
14 20160320 2 1 0 0 2
15 20160327 2 1 0 0 2
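The same result can also be computed without apply, which tends to scale better with many SKUs. A sketch of an equivalent groupby-based version (assuming, as above, that data is sorted by SKU and Date):

data['Cum_Supply'] = data.groupby('SKU')['Supply'].cumsum()
# running max of unfulfilled demand, taken per SKU
unfulfilled = data['Demand'] - data['Supply'] + data['Cum_Supply']
data['True_Demand'] = unfulfilled.groupby(data['SKU']).cummax()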

alternative way to construct pivot table

>>> df = pd.DataFrame({'a': [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
...                    'b': [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1],
...                    'c': [5, 5, 5, 8, 9, 9, 6, 6, 7, 8, 9, 9]})
>>> df
a b c
0 1 0 5
1 1 0 5
2 1 1 5
3 1 1 8
4 2 0 9
5 2 0 9
6 2 1 6
7 2 1 6
8 3 0 7
9 3 0 8
10 3 1 9
11 3 1 9
Is there an alternative way to get this output?
>>> pd.pivot_table(df, index=['a','b'], columns='c', aggfunc=len, fill_value=0).reset_index()
c a b 5 6 7 8 9
0 1 0 2 0 0 0 0
1 1 1 1 0 0 1 0
2 2 0 0 0 0 0 2
3 2 1 0 2 0 0 0
4 3 0 0 0 1 1 0
5 3 1 0 0 0 0 2
I have a large df (>~1M rows) with len(df.c.unique()) being 134, so the pivot is taking forever.
I was thinking that, given that this result is returned within a second in my actual df:
>>> df.groupby(by = ['a', 'b', 'c']).size().reset_index()
a b c 0
0 1 0 5 2
1 1 1 5 1
2 1 1 8 1
3 2 0 9 2
4 2 1 6 2
5 3 0 7 1
6 3 0 8 1
7 3 1 9 2
whether I could manually construct the desired outcome from this output above.
1. Here's one:
df.groupby(by = ['a', 'b', 'c']).size().unstack(fill_value=0).reset_index()
Output:
c a b 5 6 7 8 9
0 1 0 2 0 0 0 0
1 1 1 1 0 0 1 0
2 2 0 0 0 0 0 2
3 2 1 0 2 0 0 0
4 3 0 0 0 1 1 0
5 3 1 0 0 0 0 2
2. Here's another way:
pd.crosstab([df.a,df.b], df.c).reset_index()
Output:
c a b 5 6 7 8 9
0 1 0 2 0 0 0 0
1 1 1 1 0 0 1 0
2 2 0 0 0 0 0 2
3 2 1 0 2 0 0 0
4 3 0 0 0 1 1 0
5 3 1 0 0 0 0 2
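If even these are slow at ~1M rows, the counting can be pushed down to numpy. A hypothetical sketch (the factorize-plus-np.add.at route below is an assumption about what is fast here, not a benchmarked claim):

import numpy as np
import pandas as pd

# integer id per (a, b) row key and per c column value
rows, row_index = pd.MultiIndex.from_frame(df[['a', 'b']]).factorize()
cols, col_index = pd.factorize(df.c, sort=True)

out = np.zeros((len(row_index), len(col_index)), dtype=int)
np.add.at(out, (rows, cols), 1)  # bump one cell per record

res = pd.DataFrame(out, index=row_index, columns=col_index).reset_index()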

Select the values in pop dataframe whose index is in index_par dataframe

I'm building a genetic algorithm for feature selection, and I'm having some difficulties.
I have a pop dataframe (population), consisting of 20 individuals and 9 features:
0 1 2 3 4 5 6 7 8
0 0 1 1 1 0 0 0 0 1
1 0 0 1 1 1 0 0 1 0
2 0 1 0 0 1 0 0 0 1
3 0 0 0 1 1 0 0 1 1
4 1 0 0 1 1 1 1 1 0
5 1 1 0 0 0 1 0 1 1
6 0 0 1 1 0 1 1 1 1
7 1 1 0 0 1 1 1 1 1
8 0 0 0 0 1 0 0 1 1
9 1 0 1 1 1 1 1 1 1
10 0 0 1 1 0 1 0 1 1
11 1 1 1 0 1 1 0 0 0
12 0 0 1 0 0 0 1 1 0
13 0 0 1 1 1 1 1 1 0
14 1 1 1 1 0 0 0 1 0
15 1 1 0 1 1 1 0 1 1
16 1 0 1 0 1 1 1 0 0
17 1 1 0 0 1 1 0 0 1
18 1 0 1 0 0 0 1 0 0
19 1 1 1 1 1 1 1 0 0
And I have an index_par dataframe, consisting of index numbers:
0
0 0
1 1
2 4
3 5
4 8
5 10
6 11
7 13
8 14
9 19
The index_par dataframe holds the indexes of the parents selected for crossover.
How can I select the rows of the pop dataframe whose index is in index_par? Thanks in advance.
I think you need loc, selecting by column 0 of index_par:
index_par = pd.DataFrame({0:[0,1,4,5,8,10,11,13,14,19]})
df3 = pop.loc[index_par[0]]
print (df3)
0 1 2 3 4 5 6 7 8
0 0 1 1 1 0 0 0 0 1
1 0 0 1 1 1 0 0 1 0
4 1 0 0 1 1 1 1 1 0
5 1 1 0 0 0 1 0 1 1
8 0 0 0 0 1 0 0 1 1
10 0 0 1 1 0 1 0 1 1
11 1 1 1 0 1 1 0 0 0
13 0 0 1 1 1 1 1 1 0
14 1 1 1 1 0 0 0 1 0
19 1 1 1 1 1 1 1 0 0
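If the selected parents should then be renumbered 0..n-1 for the crossover step, drop the original labels after selecting (this assumes the surrounding GA code indexes parents positionally, which is a guess):

parents = pop.loc[index_par[0]].reset_index(drop=True)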

Calculate changes of column in Pandas

I have a dataframe and I want to calculate how values change over time.
UserId DateTime Value
1 1 0
1 2 0
1 3 0
1 4 1
1 6 1
1 7 1
2 1 0
2 2 1
2 3 1
2 4 0
2 6 1
2 7 1
After the script runs I want to get a column with a change identifier (per user and date). Only changes from 0 to 1 are interesting.
UserId DateTime Value IsChanged
1 1 0 0
1 2 0 0
1 3 0 0
1 4 1 1 <- Value was changed from 0 to 1
1 6 1 0
1 7 1 0
2 1 0 0
2 2 1 1 <- Value was changed from 0 to 1
2 3 1 0
2 4 0 0 <- Change from 1 to 0 not interesting
2 6 1 1 <- Value was changed from 0 to 1 for the user
2 7 1 0
What about this?
# df is your dataframe
df['IsChanged'] = (df['Value'].diff()==1).astype(int)
The only case you care about is Value being 0 before and 1 after, so you can simply calculate the change in value and check if it is equal to 1.
UserId DateTime Value IsChanged
0 1 1 0 0
1 1 2 0 0
2 1 3 0 0
3 1 4 1 1
4 1 6 1 0
5 1 7 1 0
6 2 1 0 0
7 2 2 1 1
8 2 3 1 0
9 2 4 0 0
10 2 6 1 1
11 2 7 1 0
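One caveat: a plain diff runs across user boundaries, so if one user's series ended in 0 and the next user's began with 1, that first row would be flagged incorrectly (the sample data happens not to hit this). A hedged per-user variant:

df['IsChanged'] = (df.groupby('UserId')['Value'].diff() == 1).astype(int)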
