I have Python code for a document similarity server that is already running.
The code runs fine from the command line, but when I try to run it from a Jupyter notebook I get the following error (you can find the code below).
AttributeError Traceback (most recent call last)
in ()
----> 1 simServer.queryIndex('National Intergroup Inc said it plans to file a registration statement')
<ipython-input-2-81df834abc60> in queryIndex(self, queryText)
58 print "Querying the INDEX"
59 doc = {'tokens': utils.simple_preprocess(queryText)}
---> 60 print(self.service.find_similar(doc, min_score=0.4, max_results=50))
At first I got a different error, which I solved by installing the simserver library from within the Jupyter notebook using !pip install --upgrade simserver, but now I do not think a missing library is the problem.
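One quick way to confirm that the notebook kernel sees the same environment as the command line (a diagnostic sketch, not part of the original code) is to print the interpreter path and the location simserver is imported from:
import sys
import simserver
print(sys.executable)      # interpreter the notebook kernel runs on
print(simserver.__file__)  # where the simserver package is loaded from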
Relevant code from jupyter notebook:
Line where the issue occurs
simServer.queryIndex('National Intergroup Inc said it plans to file a registration statement')
#!/usr/bin/env python
import pickle
import os
import re
import glob
import pprint
import json
from gensim import utils
from simserver import SessionServer
import logging

logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)

class SimilarityServer(object):
    def __init__(self):
        print "Opening session and setting autosession to True"
        self.service = SessionServer('tmp/my_server/')
        self.service.open_session()
        self.service.set_autosession(True)

    def indexDocs(self):
        print "Docs indexing and training"
        # train and index
        print "Training"
        self.service.session.train(None, method='lsi', clear_buffer=False)
        print "Indexing"
        self.service.session.index(None)

    def queryIndex(self, queryText):
        print "Querying the INDEX"
        doc = {'tokens': utils.simple_preprocess(queryText)}
        print(self.service.find_similar(doc, min_score=0.4, max_results=50))

simServer = SimilarityServer()
simServer.queryIndex('National Intergroup Inc said it plans to file a registration statement')
Related
I'm trying to use https://github.com/uber/orbit. As far as I can tell, the package installed without errors.
I can run everything just fine in my terminal on my Mac. However, when I open VS Code and try to run there, I receive the error below. I can't figure out what the problem is or how to configure VS Code to avoid this issue.
The following imports cause the error; there are no issues in the terminal.
from orbit.utils.dataset import load_iclaims
from orbit.models import DLT
from orbit.diagnostics.plot import plot_predicted_data
My Python version is 3.9.15, both in the terminal and in VS Code.
If anyone has an idea, please be specific about the steps to fix this. I have been hunting around VS Code for a while and can't figure out why I only have this issue in VS Code and not in the terminal.
OperationalError Traceback (most recent call last)
Cell In[3], line 1
----> 1 from orbit.models import DLT
File ~/opt/anaconda3/lib/python3.9/site-packages/orbit/__init__.py:3
1 __all__ = ["satellite", "tle", "utilities"]
----> 3 from .satellite import satellite
File ~/opt/anaconda3/lib/python3.9/site-packages/orbit/satellite.py:3
1 from math import degrees
----> 3 from . import tle, utilities
5 class satellite:
6 def __init__(self,catnr):
File ~/opt/anaconda3/lib/python3.9/site-packages/orbit/tle.py:10
6 import ephem
8 from . import utilities
---> 10 requests_cache.install_cache(expire_after=86400)
12 def get(catnr):
13 page = html.fromstring(requests.get('http://www.celestrak.com/cgi-bin/TLE.pl?CATNR=%s' % catnr).text)
File ~/opt/anaconda3/lib/python3.9/site-packages/requests_cache/patcher.py:48, in install_cache(cache_name, backend, expire_after, urls_expire_after, allowable_codes, allowable_methods, filter_fn, stale_if_error, session_factory, **kwargs)
23 def install_cache(
24 cache_name: str = 'http_cache',
...
--> 168 self._local_context.con = sqlite3.connect(self.db_path, **self.connection_kwargs)
169 if self.fast_save:
170 self._local_context.con.execute('PRAGMA synchronous = 0;')
OperationalError: unable to open database file
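Since the traceback walks through modules like satellite.py and tle.py, a quick check of which interpreter and which installed orbit package VS Code is actually using can help narrow down the terminal-vs-VS Code difference. A minimal diagnostic sketch (not part of the original question):
import sys
import orbit
print(sys.executable)  # interpreter VS Code selected for this run
print(orbit.__file__)  # which installed 'orbit' package is being imported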
I am trying to run a Jupyter notebook on AWS Lambda. I created a layer with all the dependencies; the notebook itself is simple code that pulls a CSV file from Amazon S3 and displays the data as a bar graph. Below is the Lambda function written to download the .ipynb file and execute the notebook with papermill. I am not sure why it fails with a boto3 module-not-found error.
import json
import sys
import os
import boto3
# papermill to execute notebook
import papermill as pm
import pandas as pd
import logging
import matplotlib.pyplot as plt
sys.path.append("/opt/bin")
sys.path.append("/opt/python")
os.environ["PYTHONPATH"]='/var/task'
os.environ["PYTHONPATH"]='/opt/python/'
os.environ["MPLCONFIGDIR"] = '/tmp/'
# ipython needs a writeable directory
os.environ["IPYTHONDIR"]='/tmp/ipythondir'
logger = logging.getLogger()
logger.setLevel(logging.INFO)
def lambda_handler(event, context):
    s3 = boto3.resource('s3')
    s3.meta.client.download_file('test-boto', 'testing.ipynb', '/tmp/test.ipynb')
    pm.execute_notebook('/tmp/test.ipynb', '/tmp/juptest_output.ipynb', kernel_name='python3')
    s3_client.upload_file('/tmp/juptest_output.ipynb', 'test-boto', 'temp/juptest_output.ipynb')
    logger.info(event)
Error o/p:
START RequestId: c4da3406-c829-4f99-9fbf-b231a0d3dc06 Version: $LATEST
[INFO] 2020-08-07T17:55:16.602Z c4da3406-c829-4f99-9fbf-b231a0d3dc06 Input Notebook: /tmp/test.ipynb
[INFO] 2020-08-07T17:55:16.603Z c4da3406-c829-4f99-9fbf-b231a0d3dc06 Output Notebook: /tmp/juptest_output.ipynb
Executing: 0%| | 0/15 [00:00<?, ?cell/s][INFO] 2020-08-07T17:55:17.311Z c4da3406-c829-4f99-9fbf-b231a0d3dc06 Executing notebook with kernel: python3
OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k
Executing: 7%|▋ | 1/15 [00:01<00:14, 1.06s/cell]
Executing: 7%|▋ | 1/15 [00:01<00:20, 1.46s/cell]
[ERROR] PapermillExecutionError:
---------------------------------------------------------------------------
Exception encountered at "In [1]":
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-1-9c332490c231> in <module>
1 import pandas as pd
2 import os
----> 3 import boto3
4 import matplotlib.pyplot as plt
5 client = boto3.client('s3')
ModuleNotFoundError: No module named 'boto3'
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 28, in lambda_handler
pm.execute_notebook('/tmp/test.ipynb', '/tmp/juptest_output.ipynb', kernel_name='python3')
File "/opt/python/papermill/execute.py", line 110, in execute_notebook
raise_for_execution_errors(nb, output_path)
File "/opt/python/papermill/execute.py", line 222, in raise_for_execution_errors
raise error
END RequestId: c4da3406-c829-4f99-9fbf-b231a0d3dc06
REPORT RequestId: c4da3406-c829-4f99-9fbf-b231a0d3dc06
Duration: 1624.78 ms Billed Duration: 1700 ms Memory Size: 3008 MB Max Memory Used: 293 MB
Jupyter Notebook:
import pandas as pd
import os
import boto3
import matplotlib.pyplot as plt
client = boto3.client('s3')
path = 's3://test-boto/aws-costs-Owner-Month-08.csv'
monthly_owner = pd.read_csv(path)
plt.bar(monthly_owner.Owner.head(6),monthly_owner.Amount.head(6))
plt.xlabel('Owner', fontsize=15)
plt.ylabel('Amount', fontsize=15)
plt.title('AWS Monthly Cost by Owner')
plt.show()
It looks like the papermill kernel is not able to find the boto3 package even though your Lambda handler can. I see you are overriding (not appending to) PYTHONPATH in your Lambda handler. This removes the other directories Python would search for packages, and the papermill child process subsequently inherits that reduced path.
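A minimal sketch of that idea (the paths are taken from your handler; treat this as an illustration rather than a drop-in fix):
import os

# Keep the existing PYTHONPATH and add the task/layer directories to it,
# instead of replacing it, so the kernel papermill spawns can still find
# packages such as boto3.
existing = os.environ.get("PYTHONPATH", "")
os.environ["PYTHONPATH"] = os.pathsep.join(
    p for p in ("/var/task", "/opt/python", existing) if p
)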
You might also find this useful. It allows you to deploy Jupyter notebooks directly as serverless functions; it uses papermill behind the scenes.
Disclaimer: I work for Clouderizer.
I am getting the error ModuleNotFoundError: No module named 'rasa_nlu', even though I installed rasa_nlu and rasa.
My code :
from rasa_nlu.training_data import load_data
from rasa_nlu.config import RasaNLUConfig
from rasa_nlu.model import Trainer

def train_nlu(data, config, model_dir):
    training_data = load_data(data)
    trainer = Trainer(RasaNLUConfig(config))
    trainer.train(training_data)
    model_directory = trainer.persist(model_dir, fixed_model_name='weathernlu')

if __name__ == '__main__':
    train_nlu('.data/data.json', 'config_spacy.json', './models/nlu')
Error message:
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-2-6ab2834ad68f> in <module>()
----> 1 from rasa_nlu.training_data import load_data
2 #from rasa_nlu.converters import load_data
3 from rasa_nlu.config import RasaNLUConfig
4 from rasa_nlu.model import Trainer
5
ModuleNotFoundError: No module named 'rasa_nlu'
Someone please help me
In Rasa >= 1.0 there is no separate NLU installation. It's just rasa, and in code you access it as rasa.nlu. Make sure you're looking at the latest version of the docs and have installed the latest version of rasa: https://rasa.com/docs/rasa/user-guide/installation/
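For reference, the training code from the question looks roughly like this under the Rasa 1.x layout (a sketch only; the exact imports and config format depend on the version you install, and later Rasa releases changed the API again, so check the docs for your version):
from rasa.nlu import config
from rasa.nlu.model import Trainer
from rasa.nlu.training_data import load_data

def train_nlu(data, config_file, model_dir):
    # load the NLU training data and train with the pipeline from the config file
    training_data = load_data(data)
    trainer = Trainer(config.load(config_file))
    trainer.train(training_data)
    # persist returns the directory where the trained model was saved
    return trainer.persist(model_dir, fixed_model_name='weathernlu')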
I have a simple Python script that uses the Elasticsearch module "curator" to make snapshots.
I've tested my code locally and it works.
Now I want to run it in an AWS Lambda, but I get this error:
Unable to import module 'lambda_function': No module named 'error'
Here is how I proceeded:
I created a Lambda manually and gave it the "AISA-BasicLambdaExecutionRole" role. Then I created my package with my function and the dependencies, which I installed with the command:
pip install elasticsearch-curator -t /<path>/myRepository
I zipped the contents (not the folder) and uploaded the zip to my Lambda.
I changed the handler name to "lambda_function.lambda_handler" (my file is named "lambda_function.py").
Did I miss something? This is my first time working with Lambda and Python.
I've seen the other questions about this error :
"errorMessage": "Unable to import module 'lambda_function'"
But nothing works for me.
EDIT:
Here is my lambda_function:
from __future__ import print_function
import curator
import time
from curator.exceptions import NoIndices
from elasticsearch import Elasticsearch

def lambda_handler(event, context):
    es = Elasticsearch()
    index_list = curator.IndexList(es)
    index_list.filter_by_regex(kind='prefix', value="logstash-")
    Number = 1
    try:
        while Number <= 3:
            Name = "snapshotLmbd_n_" + str(Number)
            curator.Snapshot(index_list, repository="s3-backup", name=Name, wait_for_completion=True).do_action()
            Number += 1
            print('Just taking a nap ! will be back soon')
            time.sleep(30)
    except KeyboardInterrupt:
        print('My bad ! I interrupted this')
        return
Thank you for your time.
OK, since you have everything else correct, check the permissions of the Python script.
It must have executable permissions (755) before you zip and upload it.
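A minimal sketch of one way to do that before creating the zip (assuming the package directory is the one used with pip install -t above; adjust the path to your own):
import os

# Give every file in the deployment package 755 permissions so Lambda
# can read and execute them after the upload.
package_dir = "myRepository"
for root, _dirs, files in os.walk(package_dir):
    for name in files:
        os.chmod(os.path.join(root, name), 0o755)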
I am trying to write a Volatility plugin to extract a configuration file used by a malware sample from a memory dump. However, when I run the plugin without root privileges (without 'sudo'), it crashes at the yara.compile line. If I run it with 'sudo', the code after the yara.compile line is not executed. I am not sure why yara.compile is causing this problem. Could someone help me with this? Here is the code I have written:
import volatility.plugins.common as common
import volatility.utils as utils
import volatility.win32.tasks as tasks
import volatility.debug as debug
import volatility.plugins.malware.malfind as malfind
import volatility.conf as conf
import volatility.plugins.taskmods as taskmods

try:
    import yara
    HAS_YARA = True
except ImportError:
    HAS_YARA = False

YARA_SIGS = {
    'malware_conf': 'rule malware_conf {strings: $a = /<settings/ condition: $a}'
}

class malwarescan(taskmods.PSList):
    def get_vad_base(self, task, address):
        for vad in task.VadRoot.traverse():
            if address >= vad.Start and address < vad.End:
                return vad.Start
        return None

    def calculate(self):
        if not HAS_YARA:
            debug.error('Yara must be installed for this plugin')
        print "in calculate function"
        kernel_space = utils.load_as(self._config)
        print "before yara compile"
        rules = yara.compile(sources=YARA_SIGS)
        print "after yara compile"
        for process in tasks.pslist(kernel_space):
            if "IEXPLORE.EXE".lower() == process.ImageFileName.lower():
                scanner = malfind.VadYaraScanner(task=process, rules=rules)
                for hit, address in scanner.scan():
                    vad_base_addr = self.get_vad_base(process, address)
                    yield process, address

    def render_text(self, outfd, data):
        for process, address in data:
            outfd.write("Process: {0}, Pid: {1}\n".format(process.ImageFileName, process.UniqueProcessId))
So when I run this plugin with root privileges, I don't see the line print "after yara compile" get executed. What could be the reason? Thank you.
I installed "yara" through "pip". If you install yara through pip, you actually get yara-ctypes (https://github.com/mjdorma/yara-ctypes) which is a bit different than yara-python. So I uninstalled yara-ctypes and installed yara-python. Then it worked.