I have a data frame which looks like:
Now I am checking whether two columns (Complaint and Compliment) have equal values. I have written a function:
def col_comp(x):
    return x['Complaint'].isin(x['Compliment'])
When I apply this function to the dataframe, i.e.
df.apply(col_comp,axis=1)
I get an error message:
AttributeError: ("'float' object has no attribute 'isin'", 'occurred at index 0')
Any suggestions as to where I am making the mistake?
isin is a method of a Series (or DataFrame), and it expects an iterable of values. With apply(..., axis=1), col_comp receives one row at a time, so x['Complaint'] is a single float, which has no isin method. Use == in col_comp instead of isin. Even better, you can compare the columns in one call:
df['Complaint'] == df['Compliment']
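As a minimal sketch of that comparison, here is a small made-up frame (the values are assumptions, not the asker's data) showing the one-call form and the corrected row-wise function side by side:

```python
import pandas as pd

# Hypothetical data standing in for the frame in the question.
df = pd.DataFrame({
    "Complaint": [1.0, 2.0, 3.0],
    "Compliment": [1.0, 5.0, 3.0],
})

# Element-wise comparison of the two columns; the result is a boolean Series.
same = df["Complaint"] == df["Compliment"]

# The row-wise version also works once == replaces isin, but it is slower.
def col_comp(row):
    return row["Complaint"] == row["Compliment"]

same_rowwise = df.apply(col_comp, axis=1)
print(same.tolist())
```

Note that NaN never compares equal to NaN, so rows where both columns are missing come out as False with either form.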
I have a dataframe with product name and volumes. I also have two variables with per unit cost.
LVP_Cost=xxxx
HVP_Cost=xxxx
However, I would like to apply the per unit cost only to selected product types. To achieve this I am using isin() within a user-defined function.
I am getting an error message:
AttributeError: 'str' object has no attribute 'isin'
Here is my code:
LVP_list=['BACS','FP','SEPA']
HVP_list=['HVP','CLS']
def calclate_cost (row):
    if row['prod_final'].isin(LVP_list):
        return row['volume']*LVP_per_unit_cost
    elif row['prod_final']==(HVP_list):
        return row['volume']*HVP_per_unit_cost
    else:
        return 0
mguk['cost_usd']=mguk.apply(calclate_cost,axis=1)
Could you please help?
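row['prod_final'] is a string containing the value of that column in the current row, not a pandas Series, so it has no isin method. Use the regular in operator instead (the elif branch needs it too, since comparing a string with a whole list using == is never true):
if row['prod_final'] in LVP_list: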
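A sketch of the corrected function, followed by a vectorized alternative where isin is valid because it is called on the whole column. The unit costs and the sample rows are placeholders, since the question does not give the real values, and the function name is spelled out here for readability:

```python
import pandas as pd

LVP_list = ["BACS", "FP", "SEPA"]
HVP_list = ["HVP", "CLS"]
# Placeholder unit costs; the real values are not given in the question.
LVP_per_unit_cost = 0.5
HVP_per_unit_cost = 2.0

# Hypothetical data standing in for mguk.
mguk = pd.DataFrame({
    "prod_final": ["BACS", "HVP", "OTHER", "CLS"],
    "volume": [10, 3, 7, 4],
})

def calculate_cost(row):
    # row["prod_final"] is a plain string here, so use the `in` operator.
    if row["prod_final"] in LVP_list:
        return row["volume"] * LVP_per_unit_cost
    elif row["prod_final"] in HVP_list:
        return row["volume"] * HVP_per_unit_cost
    return 0

mguk["cost_usd"] = mguk.apply(calculate_cost, axis=1)

# Vectorized alternative: isin works on the whole column (a Series).
is_lvp = mguk["prod_final"].isin(LVP_list)
is_hvp = mguk["prod_final"].isin(HVP_list)
cost = pd.Series(0.0, index=mguk.index)
cost.loc[is_lvp] = mguk.loc[is_lvp, "volume"] * LVP_per_unit_cost
cost.loc[is_hvp] = mguk.loc[is_hvp, "volume"] * HVP_per_unit_cost
```

On large frames the vectorized form is usually much faster than apply with axis=1, because it avoids calling a Python function once per row.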
I created a numpy matrix called my_stocks with two columns:
column 0 is made by specific objects that I defined
column 1 is made of integers
I try to sort it by column 1 with np.sort(my_stocks), but this raises a ValueError.
The full method:
def sort_stocks(self):
    my_stocks = np.empty(2)
    for data in self.datas:
        if data.buflen() > self.p.period:
            new_stock=[data,get_slope(self,data)]
            my_stocks = np.vstack([my_stocks,new_stock])
    self.stocks = np.sort(my_stocks)
It doesn't cause problems if instead of sorting the matrix, I print it. A print example is below:
[<backtrader.feeds.yahoo.YahooFinanceData object at 0x124ec2eb0>
1.1081551020408162]
[<backtrader.feeds.yahoo.YahooFinanceData object at 0x124ec2190>
0.20202275819418677]
[<backtrader.feeds.yahoo.YahooFinanceData object at 0x124eda610>
0.08357118119975258]
[<backtrader.feeds.yahoo.YahooFinanceData object at 0x124ecc400>
0.5487027829313539]]
I figured out that np.sort was sorting the values within each row of my matrix (its default is the last axis) rather than ordering the rows by a column.
This works, even if it looks rather counterintuitive:
my_stocks[my_stocks[:, 1].argsort()]
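To see why the argsort form reorders whole rows, here is a small sketch with a purely numeric array. The ids in column 0 are hypothetical stand-ins for the data feed objects in the question; the same indexing works on an object array as long as column 1 holds comparable numbers:

```python
import numpy as np

# Hypothetical numeric stand-in: column 0 is an id, column 1 is the sort key.
my_stocks = np.array([
    [10, 1.108],
    [11, 0.202],
    [12, 0.084],
    [13, 0.549],
])

# argsort on column 1 returns the row order that sorts that column ascending.
order = my_stocks[:, 1].argsort()

# Indexing with that order moves entire rows, keeping each id with its key.
sorted_stocks = my_stocks[order]

# Reverse the order to get the largest key first.
descending = my_stocks[order[::-1]]
```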
Dataset
I'm trying to check for a win from the WINorLOSS column, but I'm getting the following error:
Code and Error Message
The variable combined.WINorLOSS seems to be a Series, and you can't compare an iterable (a list, dict, Series, etc.) with a single string and use the result as one condition. I think you meant to loop over its values:
for i in combined.WINorLOSS:
    if i=='W':
        hteamw+=1
    else:
        ateamw+=1
You can't compare a Series of values (like your WINorLOSS column) to a single string value and treat the result as one True/False. However, you can use the following to count the 'L' and 'W' entries in the column:
hteamw = combined['WINorLOSS'].value_counts()['W']
hteaml = combined['WINorLOSS'].value_counts()['L']
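A short sketch of the counting approach on made-up data (the frame contents are an assumption). It calls value_counts once and uses .get so that a label that never occurs yields 0 instead of a KeyError:

```python
import pandas as pd

# Hypothetical data standing in for the combined frame.
combined = pd.DataFrame({"WINorLOSS": ["W", "L", "W", "W", "L"]})

# Count each distinct value once, then look the labels up.
counts = combined["WINorLOSS"].value_counts()

# .get returns the default instead of raising KeyError when a label is absent.
hteamw = counts.get("W", 0)
hteaml = counts.get("L", 0)

# Equivalent count via an element-wise comparison and a sum of booleans.
wins = (combined["WINorLOSS"] == "W").sum()
```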
If I pass the CSV data in the following way, it produces the expected output:
data = pd.read_csv("abc.csv")
avg = data['A'].rolling(3).mean()
print(avg)
But if I pass the values in the following way, it produces an error:
dff=[]
dff1=[]
dff1=abs(data['A'])
b, a = scipy.signal.butter(2, 0.05, 'highpass')
dff = scipy.signal.filtfilt(b, a, dff1)
avg = dff.rolling(3).mean()
print(avg)
Error is:
AttributeError: 'numpy.ndarray' object has no attribute 'rolling'
I can't figure out what is wrong with the code.
After applying dff = pd.DataFrame(dff), a new problem arises: one unexpected zero is displayed at the top.
What is the reason behind this? How can I get rid of it?
rolling is a method of pandas Series and DataFrames. SciPy knows nothing about these and returns NumPy ndarrays as output. It can accept DataFrames and Series as input because the pandas types can behave like ndarrays when needed.
The solution might be as simple as re-wrapping the ndarray as a DataFrame using
dff = pd.DataFrame(dff)
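The stray zero at the top of the printed output is most likely the default column label (0) that pandas assigns when a DataFrame is built from an unnamed array. A sketch that avoids it by wrapping the result in a Series instead; a plain NumPy array stands in for the filtfilt output here, and the sample values and the name A_filtered are assumptions:

```python
import numpy as np
import pandas as pd

# Hypothetical input column; the real data comes from abc.csv.
data = pd.DataFrame({"A": [1.0, -2.0, 3.0, -4.0, 5.0]})

# Stand-in for the filtfilt result: any NumPy array of the same length.
filtered = np.abs(data["A"].to_numpy())

# Wrap as a Series, reusing the original index, so rolling is available.
dff = pd.Series(filtered, index=data.index, name="A_filtered")
avg = dff.rolling(3).mean()
print(avg)
```

If a DataFrame is preferred, passing columns=["A_filtered"] to pd.DataFrame replaces the 0 label in the same way.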
The dataframe "name" contains the names of each person's first 10 employers.
I want to retrieve all the employer names that contain "foundation".
My purpose is to better understand the employer names that contain "foundation".
Here is the code that I screwed up:
name=employ[['nameCurrentEmployer',
'name2ndEmployer', 'name3thEmployer',
'name4thEmployer', 'name5thEmployer',
'name6thEmployer', 'name7thEmployer',
'name8thEmployer', 'name9thEmployer',
'name10thEmployer']]
print(name.loc[name.str.contains('foundation', case=False)][['Answer.nameCurrentEmployer',
'Answer.nameEighthEmployer', 'Answer.nameFifthEmployer',
'Answer.nameFourthEmployer', 'Answer.nameNinethEmployer',
'Answer.nameSecondEmployer', 'Answer.nameSeventhEmployer',
'Answer.nameSixthEmployer', 'Answer.nameTenthEmployer',
'Answer.nameThirdEmployer']])
And the error is:
AttributeError: 'DataFrame' object has no attribute 'str'
Thank you!
You get AttributeError: 'DataFrame' object has no attribute 'str', because str is an accessor of Series and not DataFrame.
From the docs:
Series.str can be used to access the values of the series as strings
and apply several methods to it. These can be accessed like
Series.str.<function/property>.
So if you have multiple columns like ["name6thEmployer", "name7thEmployer"] and so on in your DataFrame called name, then the simplest way to approach it is to apply .str to one column at a time:
columns = ["name6thEmployer", "name7thEmployer", ...]
for column in columns:
    # for example, if you just want to count them up
    print(name[name[column].str.contains("foundation")][column].value_counts())
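A sketch of the same per-column idea without the explicit loop, assuming a small made-up frame that reuses two of the column names from the question. DataFrame.apply hands each column to the lambda as a Series, so the .str accessor is available there; case=False and na=False are added so that capitalisation and missing names do not break the match:

```python
import pandas as pd

# Hypothetical data; column names follow the question, values are made up.
name = pd.DataFrame({
    "nameCurrentEmployer": ["Gates Foundation", "Acme Corp", None],
    "name2ndEmployer": ["Acme Corp", "Ford foundation", "Initech"],
})

# One boolean mask per column; each column arrives as a Series.
# na=False treats missing names as "no match" instead of propagating NaN.
mask = name.apply(
    lambda col: col.str.contains("foundation", case=False, na=False)
)

# Rows where at least one employer name matches.
rows_with_match = name[mask.any(axis=1)]

# All matching names, collected column by column into a flat list.
matches = [v for col in name.columns for v in name.loc[mask[col], col]]
print(matches)
```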
Try:
foundation_serie=df['name'].str.contains('foundation', regex=True)
print(df[foundation_serie.values])