Initially I was getting a "'list' object is not callable" error, but after adding "import list" a new error appeared, as shown below.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import list
data_cols=['user id','movie id','rating','timestamp']
item_cols=['movie id','movie title','release date','video release date','IMDb URL','unknown','Action','Adventure','Animation','Childrens','Comedy','Crime','Documentary','Drama','Fantasy','Film-Noir','Horror','Musical','Mystery','Romance ','Sci-Fi','Thriller','War' ,'Western']
user_cols = ['user id','age','gender','occupation','zip code']
#importing the data files onto dataframes
users=pd.read_csv('u.user',sep='|',names=user_cols,encoding='latin-1')
item=pd.read_csv('u.item',sep='|',names=item_cols,encoding='latin-1')
data=pd.read_csv('u.data',sep='\t',names=data_cols,encoding='latin-1')
dataset=pd.merge(pd.merge(item,data),users)
#print(dataset.head())
rating_total=dataset.groupby('movie title').size()
rating_mean=(dataset.groupby('movie title'))['movie title','rating']
rating_mean=rating_mean.mean()
rating_total=pd.DataFrame({'movie title':rating_total.index,'total ratings':rating_total.values})
rating_mean['movie title']=rating_mean.index
final=pd.merge(rating_mean,rating_total).sort_values(by='total ratings',ascending=False)
pop=final[:300].sort_values(by='rating',ascending=False)
pop=pop['movie title']
pop1=list(pop.head(10))
Output
TypeError Traceback (most recent call last)
<ipython-input-57-0b36af3a9876> in <module>
30 pop=pop['movie title']
31 #print(pop.head())
---> 32 pop1=list(pop.head(10))
TypeError: 'module' object is not callable
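A possible explanation, which the post does not confirm: the first error suggests that some earlier cell assigned a value to the name list, and "import list" then rebound that name to a module, which is not callable either. A minimal sketch that avoids the name entirely, using made-up titles in place of the real pop Series:

```python
import pandas as pd

# Hypothetical stand-in for the `pop` Series built in the question.
pop = pd.Series(["Movie A", "Movie B", "Movie C"], name="movie title")

# Series.tolist() does not look up the name `list`, so it still works
# when `list` has been shadowed by a variable or a module.
pop1 = pop.head(10).tolist()
print(pop1)

# In the notebook itself, removing the shadowing name restores the built-in:
#     del list   # or restart the kernel and drop the "import list" line
```

After the shadowing name is removed, the original list(pop.head(10)) call should behave as expected again.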
I am working with the OpenAI API and am stuck; I have tried to sort this out on my own without success. I want my code to run the sentence-generation operation on every row of the Input_Description_OAI column and write the output to another column (OpenAI_Description). Can someone please help me complete this task? I am new to Python.
The dataset looks like:
import os
import openai
import wandb
import pandas as pd
openai.api_key = "MY-API-Key"
data=pd.read_excel("/content/OpenAI description.xlsx")
data
data["OpenAI_Description"] = data.apply(lambda _: ' ', axis=1)
data
gpt_prompt = ("Write product description for: Brand: COILCRAFT ; MPN: DO5010H-103MLD..")
response = openai.Completion.create(engine="text-curie-001", prompt=gpt_prompt,
temperature=0.7, max_tokens=1000, top_p=1.0, frequency_penalty=0.0, presence_penalty=0.0)
print(response['choices'][0]['text'])
data['OpenAI_Description'] = data.apply(gpt_prompt,response['choices'][0]['text'], axis=1)
I got this error when running it on the first row:
---------------------------------------------------------------------------
TypeError
Traceback (most recent call last)
<ipython-input-32-c798fbf9bc16> in <module>
15 print(response['choices'][0]['text'])
16 #data.add_data(gpt_prompt,response['choices'][0]['text'])
---> 17 data['OpenAI_Description'] = data.apply(gpt_prompt,response['choices'][0]['text'], axis=1)
18
TypeError: apply() got multiple values for argument 'axis'
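The error arises because the second positional parameter of DataFrame.apply is axis, so the response text is taken as axis and then clashes with axis=1. One way to fill the column row by row is to apply a function to the input column. The sketch below assumes a hypothetical helper generate_description that stands in for the real OpenAI request, and a small made-up frame in place of the spreadsheet:

```python
import pandas as pd

# Hypothetical stand-in for the Excel sheet from the question.
data = pd.DataFrame({
    "Input_Description_OAI": [
        "Brand: COILCRAFT ; MPN: DO5010H-103MLD",
        "Brand: ACME ; MPN: X-1",  # invented row, for illustration only
    ]
})

def generate_description(text):
    """Hypothetical placeholder; replace the body with the real API call."""
    prompt = "Write product description for: " + text
    # In the real code this is where the request from the question would go,
    # e.g. openai.Completion.create(engine="text-curie-001", prompt=prompt, ...)
    # followed by returning response['choices'][0]['text'].
    return "[generated for] " + prompt

# Series.apply calls the function once per row of the input column.
data["OpenAI_Description"] = data["Input_Description_OAI"].apply(generate_description)
print(data)
```

Building the prompt inside the function means each row gets its own request instead of reusing one fixed gpt_prompt.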
I want to use the Class variable to measure the accuracy of my k-means model, so I need it to be a plain integer, 1 or 2, and I think I need to convert it to a string first. But I'm getting an error when using the decode function:
from scipy.io import arff
import pandas as pd
import numpy as np
import seaborn as sb
import matplotlib.pyplot as plt
data = arff.loadarff('bnknote.arff')
df = pd.DataFrame(data[0])
df.head()
V1 V2 V3 V4 Class
0 3.62160 8.6661 -2.8073 -0.44699 b'1'
1 4.54590 8.1674 -2.4586 -1.46210 b'1'
2 3.86600 -2.6383 1.9242 0.10645 b'1'
3 3.45660 9.5228 -4.0112 -3.59440 b'1'
4 0.32924 -4.4552 4.5718 -0.98880 b'1'
import codecs
Class=df['Class']
Class=codecs.decode(Class,'UTF-8')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/opt/conda/lib/python3.6/encodings/utf_8.py in decode(input, errors)
15 def decode(input, errors='strict'):
---> 16 return codecs.utf_8_decode(input, errors, True)
17
TypeError: a bytes-like object is required, not 'Series'
The above exception was the direct cause of the following exception:
TypeError Traceback (most recent call last)
<ipython-input-20-face6f646db6> in <module>()
1 import codecs
2 Class=df['Class']
----> 3 Class=codecs.decode(Class,'UTF-8')
TypeError: decoding with 'UTF-8' codec failed (TypeError: a bytes-like object is required, not 'Series')
TypeError: decoding with 'UTF-8' codec failed (TypeError: a bytes-like object is required, not 'Series')
means that you tried to decode a whole pandas.DataFrame column at once. You do not need codecs here; .astype is sufficient, because int can parse the ASCII-encoded bytes representation of an integer. Consider the following simple example:
import pandas as pd
df = pd.DataFrame({'x':['A','B','C'],'y':[b'1',b'0',b'1']})
df['y'] = df['y'].astype(int)
print(df)
output
x y
0 A 1
1 B 0
2 C 1
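If the string form is wanted as well, the element-wise .str.decode accessor is an alternative to codecs.decode. A small sketch, assuming a made-up frame that mimics the bytes-valued Class column from the question:

```python
import pandas as pd

# Hypothetical frame mimicking the bytes-valued Class column.
df = pd.DataFrame({"V1": [3.6216, 4.5459, 3.866],
                   "Class": [b"1", b"1", b"2"]})

# .str.decode works element by element, unlike codecs.decode,
# which expects a single bytes object rather than a Series.
class_as_str = df["Class"].str.decode("utf-8")

# Convert the decoded strings to integers.
df["Class"] = class_as_str.astype(int)
print(df)
```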
I am trying datatable in Python for the first time and was following the examples from this link: Grouping with by() to explore datatable further, but I get a NameError when I run the code below.
import numpy as np
import pandas as pd
import datatable as dt
df = dt.Frame([[1, 1, 5], [2, 3, 6]], names=['A', 'B'])
df[:, update(filter_col = count()), by('A')]
Error:
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_2040/2701559568.py in <module>
----> 1 df[:, update(filter_col = count()), by('A')]
NameError: name 'update' is not defined
This works fine in the example at the link above, so I am not sure why I am getting this error. I also tried calling help on it:
help(update())
But got this error:
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_2040/1402169417.py in <module>
----> 1 help(update())
NameError: name 'update' is not defined
You're not using the right name to access update(). The very first example has:
from datatable import (dt, f, by, ifelse, update, sort,
count, min, max, mean, sum, rowsum)
Meaning that they can refer to datatable.update as just update.
However your import is like:
import datatable as dt
Meaning that to access datatable.update, you have to use dt.update. The same goes for datatable.count and datatable.by.
So the solution would look like:
df[:, dt.update(filter_col = dt.count()), dt.by('A')]
The function update is not imported directly, but through datatable (dt). You can access it with dt.update.
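The same distinction between the two import styles can be seen with any module. The sketch below uses the standard-library math module purely as an analogy, not datatable itself:

```python
import math as m          # names are reached through the alias: m.sqrt
from math import sqrt     # the bare name sqrt is available directly

# Works because the module is bound to the alias m.
via_alias = m.sqrt(16)

# Works because sqrt itself was imported by name.
via_direct = sqrt(16)

print(via_alias, via_direct)
```

With only the first import line, calling sqrt(16) would raise the same kind of NameError as update did.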
Problem: Importing a Python file (EDA.py) into a Jupyter notebook. The Python file uses pandas and has an "import pandas as pd" in it, but in Jupyter I get the error that pd is not defined.
Python file:
<EDA.py>
def eda_df(df):
import pandas as pd
print('=================Unique Values============================')
unique_series = df.apply(pd.Series.nunique).sort_values()
print(unique_series)
Jupyter Notebook:
import EDA
train = pd.read_csv(r'.\kaggle\housing\house-prices-advanced-regression-techniques\train.csv')
eda_df(train)
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-269-86ee9695b171> in <module>
----> 1 eda_df(train)
~\iCloudDrive\Adnan PC\Data Science\Jupyter NB\EDA.py in eda_df(df)
13 print('Features missing more than 40% data: ',len(missing_data_list))
14 print(missing_data_list)
---> 15 print('=================Unique Values============================')
16 unique_series = df.apply(pd.Series.nunique).sort_values()
17 unique_list = unique_series[unique_series<15].index.to_list()
NameError: name 'pd' is not defined
You just need to import pandas as pd at the top of EDA.py, at module level:
import pandas as pd
def eda_df(df):
    unique_series = df.apply(pd.Series.nunique).sort_values()
    return (unique_series)
Hi I am having problems with this code:
import numpy as np
# Summarize the data about minutes spent in the classroom
#total_minutes = total_minutes_by_account.values()
total_minutes = list(total_minutes_by_account.values())
type(total_minutes)
# Printing out the samething converting to a list
print('Printing out the samething converting to a list ')
print(type(total_minutes))
print ('Mean:', np.mean(total_minutes))
print ('Standard deviation:', np.std(total_minutes))
print ('Minimum:', np.min(total_minutes))
print ('Maximum:', np.max(total_minutes))
The error I get is:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-93-945375bf6098> in <module>()
3 # Summarize the data about minutes spent in the classroom
4 #total_minutes = total_minutes_by_account.values()
----> 5 total_minutes = list(total_minutes_by_account.values())
6 type(total_minutes)
7 #print(total_minutes)
AttributeError: 'list' object has no attribute 'values'
I would really like to know how I can make this work. I can do it with pandas by converting it to a NumPy array and then getting the statistics I want with NumPy.
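The traceback says that total_minutes_by_account is already a list, so it has no .values() method; only a dict does. A minimal sketch that handles both cases, using made-up numbers because the real data is not shown in the question:

```python
import numpy as np

# Hypothetical sample; the real total_minutes_by_account is not shown.
total_minutes_by_account = [120.0, 45.5, 0.0, 300.25]

# Only a dict has .values(); a list can be used as it is.
if isinstance(total_minutes_by_account, dict):
    total_minutes = list(total_minutes_by_account.values())
else:
    total_minutes = list(total_minutes_by_account)

summary = {
    "Mean": np.mean(total_minutes),
    "Standard deviation": np.std(total_minutes),
    "Minimum": np.min(total_minutes),
    "Maximum": np.max(total_minutes),
}
for name, value in summary.items():
    print(name + ":", value)
```

If the variable was a dict in an earlier cell, it may have been overwritten by a later assignment, so rerunning the notebook from the top is worth trying.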