Getting ValueError after last iteration when using .apply() method

I wrote a function that queries the senpy emotion-analysis API for each review. I print every row with its index to check that it works. The dataset has 5684 rows. However, after the last row has been processed I get a ValueError. I also passed this last review to my function on its own and received the corresponding values successfully.
This is the function I wrote:
def query_emotion(review):
    params = {'input': review}
    res = requests.get('http://senpy.gsi.upm.es/api/emotion-depechemood',
                       params=params)
    if res.status_code != 200:
        raise Exception(res)
    data = json.loads(res.text)
    negative_fear = data['entries'][0]['onyx:hasEmotionSet'][0]['onyx:hasEmotion'][0]['onyx:hasEmotionIntensity']
    amusement = data['entries'][0]['onyx:hasEmotionSet'][0]['onyx:hasEmotion'][1]['onyx:hasEmotionIntensity']
    anger = data['entries'][0]['onyx:hasEmotionSet'][0]['onyx:hasEmotion'][2]['onyx:hasEmotionIntensity']
    annoyance = data['entries'][0]['onyx:hasEmotionSet'][0]['onyx:hasEmotion'][3]['onyx:hasEmotionIntensity']
    indifference = data['entries'][0]['onyx:hasEmotionSet'][0]['onyx:hasEmotion'][4]['onyx:hasEmotionIntensity']
    joy = data['entries'][0]['onyx:hasEmotionSet'][0]['onyx:hasEmotion'][5]['onyx:hasEmotionIntensity']
    awe = data['entries'][0]['onyx:hasEmotionSet'][0]['onyx:hasEmotion'][6]['onyx:hasEmotionIntensity']
    sadness = data['entries'][0]['onyx:hasEmotionSet'][0]['onyx:hasEmotion'][7]['onyx:hasEmotionIntensity']
    print(X[X.Text == review].index, ": ", [negative_fear, amusement, anger, annoyance, indifference, joy, awe, sadness])
    return negative_fear, amusement, anger, annoyance, indifference, joy, awe, sadness
I called the function using the .apply() method:
X['Text_negative_fear'], X['Text_amusement'], X['Text_anger'], X['Text_annoyance'], X['Text_indifference'], X['Text_joy'], X['Text_awe'], X['Text_sadness'] = X.Text.apply(query_emotion)
And these are the last lines from the output after calling the function:
Int64Index([5681], dtype='int64') : [0.0792444511959933, 0.1580643288473154, 0.11423923859401869, 0.1399028217635615, 0.13737889283844476, 0.10330318175060896, 0.1746433112249919, 0.09322377378506547]
Int64Index([5682], dtype='int64') : [0.08308025773764179, 0.1820866455048511, 0.09436993092693748, 0.12984061502089508, 0.1281518690206751, 0.10726563771574184, 0.19287900349802356, 0.08232604057523404]
Int64Index([5683], dtype='int64') : [0.09470651839679665, 0.19571514056396988, 0.10608728359324908, 0.12185687329212973, 0.12744650875201016, 0.10307696316708366, 0.150327288948556, 0.10078342328620486]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-12-6f8cc4431e17> in <module>()
----> 1 X['Text_negative_fear'], X['Text_amusement'], X['Text_anger'], X['Text_annoyance'], X['Text_indifference'], X['Text_joy'], X['Text_awe'], X['Text_sadness'] = X.Text.apply(query_emotion)
ValueError: too many values to unpack (expected 8)
Thank you for any advice that might fix this issue!
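
A likely reading of the traceback: .apply() finishes all 5684 calls first and returns one result per row, and only then does Python try to unpack that result into the 8 names on the left. Iterating the result yields rows, not columns, so there are 5684 values for 8 names. The sketch below reproduces this with a plain list of tuples so that it runs without pandas or network access; fake_query_emotion and the sample reviews are hypothetical stand-ins, not part of the original code.

```python
def fake_query_emotion(review):
    # Hypothetical stand-in for query_emotion: returns 8 numbers, no network call.
    return tuple(len(review) + i for i in range(8))

# Ten reviews of length 1..10, standing in for the 5684 rows.
reviews = ["r" * n for n in range(1, 11)]

# One 8-tuple per review, like the result of X.Text.apply(query_emotion).
results = [fake_query_emotion(r) for r in reviews]

try:
    # Unpacking iterates over the rows: 10 rows do not fit into 8 names.
    a, b, c, d, e, f, g, h = results
except ValueError as exc:
    print("ValueError:", exc)

# zip(*rows) transposes rows into columns: 8 tuples, one per emotion.
columns = list(zip(*results))
negative_fear, amusement, anger, annoyance, indifference, joy, awe, sadness = columns
print(negative_fear)
```

Assuming the same idiom carries over to the Series returned by .apply(), wrapping the right-hand side as zip(*X.Text.apply(query_emotion)) should give 8 column-shaped sequences that can be assigned to the 8 new columns.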

Related

Setting number of lags

I am teaching myself the use of time series in Python.
I was following
https://arch.readthedocs.io/en/latest/unitroot/unitroot_examples.html
I performed
adf = ADF(default)
print(adf.summary().as_text())
It worked perfectly.
However, when I wanted to change lags like
adf.lags = 5
print(adf.summary().as_text())
or change the type of trend as in the snip below
it gives me the following error (same for the trend):
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Input In [11], in <module>
----> 1 adf.lags = 5
2 print(adf.summary().as_text())
AttributeError: can't set attribute
Even when I follow all instructions on the exercise page and recreate it with the data used there I cannot get the lags to change. What am I doing wrong?

Although the documentation (Read the Docs) shows it the way you tried, a quick look at the ADF class in arch/unitroot/unitroot.py suggests that lags is taken as a parameter when the object is created. I tried it and it works.
Instead of adf.lags = 5 you can use the version below.
adf = ADF(default, lags=5)
class ADF:
    def __init__(
        self,
        y: ArrayLike,
        lags: Optional[int] = None,
        trend: UnitRootTrend = "c",
        max_lags: Optional[int] = None,
        method: Literal["aic", "bic", "t-stat"] = "aic",
        low_memory: Optional[bool] = None,
    ) -> None:
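
For context, "can't set attribute" is the kind of AttributeError Python raises when you assign to a property that has no setter. The sketch below uses a hypothetical UnitRootTest class to show the effect; it is not arch's implementation, and whether your installed version defines lags as a read-only property is an assumption.

```python
class UnitRootTest:
    """Hypothetical class used only for illustration (not arch's ADF)."""

    def __init__(self, lags=None):
        self._lags = lags

    @property
    def lags(self):
        # Read-only: a getter is defined, but no setter.
        return self._lags


test = UnitRootTest(lags=2)
try:
    test.lags = 5  # assignment to a property without a setter
except AttributeError as exc:
    print("AttributeError:", exc)

# Passing the value at construction time works, as with ADF(default, lags=5).
test = UnitRootTest(lags=5)
print(test.lags)
```

The exact wording of the message differs between Python versions, so the sketch only relies on the exception type.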

How to get a list of all tokens from Lucene 8.6.1 index using PyLucene?

I have got some direction from this question. I first make the index like below.
import os
import lucene
from org.apache.lucene.analysis.standard import StandardAnalyzer
from org.apache.lucene.index import IndexWriterConfig, IndexWriter, DirectoryReader
from org.apache.lucene.store import SimpleFSDirectory
from java.nio.file import Paths
from org.apache.lucene.document import Document, Field, TextField
from org.apache.lucene.util import BytesRefIterator
index_path = "./index"
lucene.initVM()
analyzer = StandardAnalyzer()
config = IndexWriterConfig(analyzer)
if len(os.listdir(index_path)) > 0:
    config.setOpenMode(IndexWriterConfig.OpenMode.APPEND)
store = SimpleFSDirectory(Paths.get(index_path))
writer = IndexWriter(store, config)
doc = Document()
doc.add(Field("docid", "1", TextField.TYPE_STORED))
doc.add(Field("title", "qwe rty", TextField.TYPE_STORED))
doc.add(Field("description", "uio pas", TextField.TYPE_STORED))
writer.addDocument(doc)
writer.close()
store.close()
I then try to get all the terms in the index for one field like below.
store = SimpleFSDirectory(Paths.get(index_path))
reader = DirectoryReader.open(store)
Attempt 1: trying to use the next() as used in this question which seems to be a method of BytesRefIterator implemented by TermsEnum.
for lrc in reader.leaves():
    terms = lrc.reader().terms('title')
    terms_enum = terms.iterator()
    while terms_enum.next():
        term = terms_enum.term()
        print(term.utf8ToString())
However, I can't seem to be able to access that next() method.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-47-6515079843a0> in <module>
2 terms = lrc.reader().terms('title')
3 terms_enum = terms.iterator()
----> 4 while terms_enum.next():
5 term = terms_enum.term()
6 print(term.utf8ToString())
AttributeError: 'TermsEnum' object has no attribute 'next'
Attempt 2: trying to change the while loop as suggested in the comments of this question.
while next(terms_enum):
    term = terms_enum.term()
    print(term.utf8ToString())
However, it seems TermsEnum is not understood to be an iterator by Python.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-48-d490ad78fb1c> in <module>
2 terms = lrc.reader().terms('title')
3 terms_enum = terms.iterator()
----> 4 while next(terms_enum):
5 term = terms_enum.term()
6 print(term.utf8ToString())
TypeError: 'TermsEnum' object is not an iterator
I am aware that my question can be answered as suggested in this question. Then I guess my question really is, how do I get all the terms in TermsEnum?

I found that the code below works. It is based on the example here and on test_FieldEnumeration() in the test_Pylucene.py file, which is in pylucene-8.6.1/test3/.
for term in BytesRefIterator.cast_(terms_enum):
    print(term.utf8ToString())
Happy to accept an answer that has more explanation than this.

KeyError in Python, even though the key exists

I have been scratching my head over this for a few days now and cannot find a solution online that works for my problem. I am trying to access data on Zendesk and walk through the pagination. For some reason I am getting a KeyError, even though I can see that the key does exist. Here is my code:
data_users2 = [[]]
while url_users:
    users_pagination = requests.get(url_users,auth=(user, pwd))
    data_user_page = json.loads(users_pagination.text)
    print (data_user_page.keys())
    for user in data_user_page['users']:
        data_users2.append(user)
    url = data_user_page['next_page']
Here is the output :
dict_keys(['users', 'next_page', 'previous_page', 'count'])
dict_keys(['error'])
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-22-fab95d95ddeb> in <module>
6 data_user_page = json.loads(users_pagination.text)
7 print (data_user_page.keys())
----> 8 for user in data_user_page["users"]:
9 data_users2.append(user)
10 url = data_user_page["next_page"]
KeyError: 'users'
As you can see, users does exist. The same thing happens if I try to print next_page: I get a KeyError for next_page.
Any help would be appreciated ! Thanks!

Your code fails in the second iteration of the loop. At that point the only key in data_user_page is "error", as you can see in the output you pasted:
dict_keys(['users', 'next_page', 'previous_page', 'count']) <----- FIRST ITERATION
dict_keys(['error']) <---- SECOND ITERATION, THEREFORE YOUR KEY DOES NOT EXIST
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-22-fab95d95ddeb> in <module>
6 data_user_page = json.loads(users_pagination.text)
7 print (data_user_page.keys())
----> 8 for user in data_user_page["users"]:
9 data_users2.append(user)
10 url = data_user_page["next_page"]
KeyError: 'users'
EDIT: This could be because you are saving the next URL in a variable called url, not url_users.
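
A minimal sketch of a pagination loop that reassigns the same variable the while condition tests and stops with a clear message if the response carries an "error" key instead of "users". The fetch_page callable and the fake_pages data are hypothetical stand-ins for the requests.get call and the Zendesk responses, so that the example runs without network access.

```python
def collect_users(first_url, fetch_page):
    """Follow 'next_page' links until there are none left."""
    users = []
    url = first_url
    while url:
        page = fetch_page(url)
        if "error" in page:
            # Fail loudly instead of hitting a KeyError on page["users"].
            raise RuntimeError(f"API returned an error for {url}: {page['error']}")
        users.extend(page["users"])
        # Reassign the loop variable itself, not a differently named one.
        url = page.get("next_page")
    return users


# Hypothetical canned responses keyed by URL.
fake_pages = {
    "page1": {"users": [{"id": 1}, {"id": 2}], "next_page": "page2"},
    "page2": {"users": [{"id": 3}], "next_page": None},
}

all_users = collect_users("page1", fake_pages.__getitem__)
print(all_users)
```

With the real API, fetch_page would wrap the authenticated request and JSON decoding from the question.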

Python Error :'numpy.float64' object is not callable

I have written some Python code to fit a sequence of ARIMA models and compare their AIC values. The code is below:
p=0
q=0
d=0
for p in range(5):
    for d in range(1):
        for q in range(4):
            arima_mod=sm.tsa.ARIMA(df,(p,d,q)).fit()
            print(arima_mod.params)
            print arima_mod.aic()
I am getting the error message below:
TypeError Traceback (most recent call last)
<ipython-input-60-b662b0c42796> in <module>()
8 arima_mod=sm.tsa.ARIMA(df,(p,d,q)).fit()
9 print(arima_mod.params)
---> 10 print arima_mod.aic()
global arima_mod.aic = 1262.2449736558815
11
TypeError: 'numpy.float64' object is not callable

Remove the brackets after arima_mod.aic. As I read it, arima_mod.aic is 1262.2449736558815, and thus a float. The brackets make Python treat it as a function and try to call it. You do not want that (because it breaks); you just want the value. So remove the brackets and you'll be fine.
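
The same error can be reproduced without statsmodels. In this sketch FittedModel is a hypothetical stand-in for the fitted results object, and it uses a plain Python float, so the message names 'float' rather than 'numpy.float64'.

```python
class FittedModel:
    """Hypothetical stand-in for a fitted results object."""

    def __init__(self, aic):
        # aic is stored as a plain value, not defined as a method.
        self.aic = aic


result = FittedModel(1262.2449736558815)

# Reading the attribute works.
print(result.aic)

# Adding brackets tries to call the float and fails.
try:
    result.aic()
except TypeError as exc:
    print("TypeError:", exc)
```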

Numpy Array creation causing "ValueError: invalid literal for int() with base 10: 'n'"

I'm trying to run a predictive RNN from this repo, https://github.com/jgpavez/LSTM---Stock-prediction, with "python lstm_forex.py".
It seems to be having trouble creating a Numpy array.
This is the function giving me problems; the error starts at the line beginning with 'days', fourth from the bottom.
def read_data(path="full_USDJPY.csv", dir="/Users/Computer/stock/LSTM2/",
              max_len=30, valid_portion=0.1, columns=4, up=False, params_file='params.npz',min=False):
    '''
    Reading forex data, daily or minute
    '''
    path = os.path.join(dir, path)
    #data = read_csv(path,delimiter=delimiter)
    data = genfromtxt(path, delimiter=',',skip_header=1)
    # Adding data by minute
    if min == False:
        date_index = 1
        values_index = 3
        hours = data[:,2]
    else:
        date_index = 0
        values_index = 1
    dates = data[:,date_index]
    print (dates)
    days = numpy.array([datetime.datetime(int(str(date)[0:-2][0:4]),int(str(date)[0:-2][4:6]),
                        int(str(date)[0:-2][6:8])).weekday() for date in dates])
    months = numpy.array([datetime.datetime(int(str(date)[0:-2][0:4]),int(str(date)[0:-2][4:6]),
                          int(str(date)[0:-2][6:8])).month for date in dates])
Gives the error...
Traceback (most recent call last):
File "lstm_forex.py", line 778, in <module>
tick=tick
File "lstm_forex.py", line 560, in train_lstm
train, valid, test, mean, std = read_data(max_len=n_iter, path=dataset, params_file=params_file,min=(tick=='minute'))
File "/Users/Computer/stock/LSTM2/forex.py", line 85, in read_data
int(str(date)[0:-2][6:8])).weekday() for date in dates])
ValueError: invalid literal for int() with base 10: 'n'
I've seen a similar problem where the fix involved putting '.strip' at the end of something. This code is so complicated that I don't quite know where to put it. I tried it in several places and usually got either the same error or a 'has no attribute' error. Now I'm not sure what might fix it.

You're trying to call int() on the string 'n' in your list comprehension. To get the same error:
int('n')
ValueError Traceback (most recent call last)
<ipython-input-18-35fea8808c96> in <module>()
----> 1 int('n')
ValueError: invalid literal for int() with base 10: 'n'
What exactly are you trying to pull out in that list comprehension? It looks like sort of a tuple of date information, but a bit more information about what you're trying to pull out, or comments in the code explaining the logic more clearly would help us get you to the solution.
EDIT: If you use pandas.Timestamp it may do all that conversion for you. Now that I look at the code, it looks like you're just trying to pull out the day of the week and the month. It may not work if it can't convert the timestamp for you, but it's pretty likely that it would. A small sample of the CSV data you're using would confirm easily enough.
days = numpy.array([pandas.Timestamp(date).weekday() for date in dates])
months = numpy.array([pandas.Timestamp(date).month for date in dates])
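
One way the string 'n' can arise from the slicing in the question: if a date value is NaN, str() gives 'nan', and dropping the last two characters leaves 'n'. Whether genfromtxt actually produced NaN for your file is an assumption that a sample of the CSV would confirm. The sketch below shows the slice and a more defensive parse, assuming the dates are stored as floats in YYYYMMDD form (which is what the slicing implies); parse_yyyymmdd and the sample values are hypothetical.

```python
import math
from datetime import datetime


def parse_yyyymmdd(value):
    """Return a datetime for a float such as 20150102.0, or None for NaN."""
    if math.isnan(value):
        return None
    return datetime.strptime(str(int(value)), "%Y%m%d")


# The slice used in the question, applied to a NaN value.
nan_slice = str(float("nan"))[0:-2][0:4]
print(repr(nan_slice))

# Hypothetical sample column with one unparseable entry.
sample = [20150102.0, float("nan"), 20151231.0]
parsed = [parse_yyyymmdd(v) for v in sample]

# Skip the NaN entry instead of passing it to int().
days = [d.weekday() for d in parsed if d is not None]
months = [d.month for d in parsed if d is not None]
print(days, months)
```

Dropping the NaN rows is only one option; depending on the data it may be more appropriate to fix how the date column is read in the first place.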
