I'm thinking of creating a multilingual web page with fastapi-babel.
I configured it according to the documentation.
The translation from English to French was successful.
However, when I created a .po file for another language, translated it, and compiled it, the translated text was not applied.
from fastapi_babel import _
from fastapi_babel.middleware import InternationalizationMiddleware as I18nMiddleware
from fastapi_babel import Babel
from fastapi_babel import BabelConfigs

configs = BabelConfigs(
    ROOT_DIR=__file__,
    BABEL_DEFAULT_LOCALE="en",
    BABEL_TRANSLATION_DIRECTORY="lang",
)
logger.info(f"configs: {configs.__dict__}")
babel = Babel(configs)
babel.install_jinja(templates)
app.add_middleware(I18nMiddleware, babel=babel)

@app.get("/items/{id}", response_class=HTMLResponse)
async def read_item(request: Request, id: str):
    babel.locale = "en"
    logger.info(_("Hello World"))
    babel.locale = "fa"
    logger.info(_("Hello World"))
    babel.locale = "ja"
    logger.info(_("Hello World"))
    return templates.TemplateResponse('item.html', {'request': request, 'id': id})
The code above logs:
INFO: Hello World
INFO: Bonjour le monde
INFO: Hello World
How can the translation be applied to languages other than French?
I was using the old version 0.0.3.
After I upgraded to the latest version, 0.0.8, translations were also applied for languages other than French.
pip install fastapi-babel==0.0.8
Note
You need to restart the FastAPI server after running pybabel compile -d lang.
If BABEL_DEFAULT_LOCALE and babel.locale are the same, no translation is applied.
babel = Babel(
    configs=BabelConfigs(
        ROOT_DIR=__file__,
        BABEL_DEFAULT_LOCALE="en",
        BABEL_TRANSLATION_DIRECTORY="lang",
    )
)
babel.locale = "en"
When you update translation files, run these two commands:
pybabel extract -F babel.cfg -o messages.pot .
pybabel compile -d lang
Do not run the following command again after you have created the .po file:
pybabel init -i messages.pot -d lang -l fa
If you do, your .po file will be reset and all your translations will be deleted.
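A quick way to confirm that the compile step produced a catalog for every locale is to look for the compiled files on disk. The sketch below is only a diagnostic aid, not part of fastapi-babel: it assumes the default pybabel layout (lang/<locale>/LC_MESSAGES/messages.po with messages.mo next to it), and the helper name find_uncompiled_locales is made up for this example.

```python
from pathlib import Path


def find_uncompiled_locales(translation_dir, domain="messages"):
    """Return locale names that have a .po file but no compiled .mo file.

    Assumes the layout that pybabel writes by default:
    <translation_dir>/<locale>/LC_MESSAGES/<domain>.po and <domain>.mo
    """
    missing = []
    pattern = f"*/LC_MESSAGES/{domain}.po"
    for po_file in sorted(Path(translation_dir).glob(pattern)):
        if not po_file.with_suffix(".mo").exists():
            # The locale name is the directory two levels above the .po file.
            missing.append(po_file.parent.parent.name)
    return missing
```

Calling find_uncompiled_locales("lang") from the project root lists the locales that still need pybabel compile -d lang; an empty list means every .po file has a compiled counterpart.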
I am using a transformers pipeline to perform sentiment analysis on sample texts in 6 different languages. I tested the code in my local JupyterHub and it worked fine. But when I wrap it in a Flask application and build a Docker image from it, execution hangs at the pipeline inference line and takes forever to return the sentiment scores.
macOS Catalina 10.15.7 (no GPU)
Python version : 3.8
Transformers package : 4.4.2
torch version : 1.6.0
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
classifier = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)
results = classifier(["We are very happy to show you the Transformers library.", "We hope you don't hate it."])
print([i['score'] for i in results])
The code above works fine in a Jupyter notebook and gives the expected result:
[0.7495927810668945,0.2365245819091797]
But when I build a Docker image with the Flask wrapper, it gets stuck at the results = classifier([input_data]) line and runs forever.
My folder structure is as follows:
- src
  |-- app
  |     |-- main.py
  |-- Dockerfile
  |-- requirements.txt
I used the below Dockerfile to create the image
FROM tiangolo/uwsgi-nginx-flask:python3.8
COPY ./requirements.txt /requirements.txt
COPY ./app /app
WORKDIR /app
RUN pip install -r /requirements.txt
RUN echo "uwsgi_read_timeout 1200s;" > /etc/nginx/conf.d/custom_timeout.conf
And my requirements.txt file is as follows:
pandas==1.1.5
transformers==4.4.2
torch==1.6.0
My main.py script looks like this:
from flask import Flask, json, request, jsonify
import traceback
import pandas as pd
from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline

app = Flask(__name__)
app.config["JSON_SORT_KEYS"] = False

model_name = 'nlptown/bert-base-multilingual-uncased-sentiment'
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
nlp = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)

@app.route("/")
def hello():
    return "Model: Sentiment pipeline test"

@app.route("/predict", methods=['POST'])
def predict():
    json_request = request.get_json(silent=True)
    input_list = [i['text'] for i in json_request["input_data"]]
    results = nlp(input_list)  ########## Getting stuck here
    for result in results:
        print(f"label: {result['label']}, with score: {round(result['score'], 4)}")
    score_list = [round(i['score'], 4) for i in results]
    return jsonify(score_list)

if __name__ == "__main__":
    app.run(host='0.0.0.0', debug=False, port=80)
My input payload is of the form
{"input_data" : [{"text" : "We are very happy to show you the Transformers library."},
{"text" : "We hope you don't hate it."}]}
I tried looking into the transformers GitHub issues but couldn't find a matching one. Execution works fine even with the Flask development server, but it runs forever when I wrap the app in a Docker image. I am not sure whether I am missing an additional dependency that should be included when building the Docker image.
Thanks.
I was having a similar issue. It seems that starting the app somehow pollutes the memory of the transformers models. It probably has something to do with how Flask handles threading, but I have no idea why. What fixed it for me was doing the things that cause trouble (loading the models) in a different thread.
import threading

def preload_models():
    "LOAD MODELS"
    return 0

def start_app():
    app = create_app()
    register_handlers(app)
    preloading = threading.Thread(target=preload_models)
    preloading.start()
    preloading.join()
    return app
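The snippet above leaves the loading step as a placeholder. A slightly fuller sketch of the same idea follows; load_model is a hypothetical stand-in for the real, expensive call (for example building the transformers pipeline), and the shared MODELS dict is an assumption about how the loaded objects reach the request handlers.

```python
import threading

# Filled by the background thread; request handlers read from it after startup.
MODELS = {}


def load_model(name):
    """Hypothetical stand-in for an expensive loader such as transformers.pipeline(...)."""
    return f"<model {name}>"


def preload_models(names):
    """Load every requested model and store it in the shared MODELS dict."""
    for name in names:
        MODELS[name] = load_model(name)


def start_app(model_names):
    """Load the models in a separate thread and wait for it to finish."""
    preloading = threading.Thread(target=preload_models, args=(model_names,))
    preloading.start()
    preloading.join()  # block until loading is done, as in the answer above
    return MODELS
```

Because join() is called before returning, the models are guaranteed to be in MODELS by the time the first request is served.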
First reply here. I would be really glad if this helps.
Flask's development server listens on port 5000 by default. When creating a Docker image, make sure the port is set up accordingly. Replace the last line with the following:
app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 5000)))
Also be sure to import os at the top.
Lastly, add the following to the Dockerfile:
EXPOSE 5000
CMD ["python", "./main.py"]
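The port lookup in that replacement line can be tried in isolation. This is a minimal sketch; the helper name resolve_port and its environ argument are made up here so the behaviour can be checked without touching the real environment.

```python
import os


def resolve_port(environ=None, default=5000):
    """Return the port from the PORT environment variable, or the default."""
    environ = os.environ if environ is None else environ
    # Environment values are strings, so convert to int for app.run(port=...).
    return int(environ.get("PORT", default))
```

With no PORT variable set, resolve_port({}) returns 5000; with PORT set to "8080" it returns 8080, which is what app.run would then receive.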
I am using Open Semantic Search (OSS) and would like to monitor its processes with the Flower tool. According to the OSS website, the workers that Celery needs are already provided:
The workers will do tasks like analysis and indexing of the queued files. The workers are implemented by etl/tasks.py and will be started automatically on boot by the service opensemanticsearch.
This tasks.py file looks as follows:
#!/usr/bin/python3
# -*- coding: utf-8 -*-
#
# Queue tasks for batch processing and parallel processing
#
# Queue handler
from celery import Celery
# ETL connectors
from etl import ETL
from etl_delete import Delete
from etl_file import Connector_File
from etl_web import Connector_Web
from etl_rss import Connector_RSS
verbose = True
quiet = False
app = Celery('etl.tasks')
app.conf.CELERYD_MAX_TASKS_PER_CHILD = 1
etl_delete = Delete()
etl_web = Connector_Web()
etl_rss = Connector_RSS()
#
# Delete document with URI from index
#
@app.task(name='etl.delete')
def delete(uri):
    etl_delete.delete(uri=uri)
#
# Index a file
#
@app.task(name='etl.index_file')
def index_file(filename, wait=0, config=None):
    if wait:
        time.sleep(wait)
    etl_file = Connector_File()
    if config:
        etl_file.config = config
    etl_file.index(filename=filename)
#
# Index file directory
#
@app.task(name='etl.index_filedirectory')
def index_filedirectory(filename):
    from etl_filedirectory import Connector_Filedirectory
    connector_filedirectory = Connector_Filedirectory()
    result = connector_filedirectory.index(filename)
    return result
#
# Index a webpage
#
@app.task(name='etl.index_web')
def index_web(uri, wait=0, downloaded_file=False, downloaded_headers=[]):
    if wait:
        time.sleep(wait)
    result = etl_web.index(uri, downloaded_file=downloaded_file, downloaded_headers=downloaded_headers)
    return result
#
# Index full website
#
@app.task(name='etl.index_web_crawl')
def index_web_crawl(uri, crawler_type="PATH"):
    import etl_web_crawl
    result = etl_web_crawl.index(uri, crawler_type)
    return result
#
# Index webpages from sitemap
#
@app.task(name='etl.index_sitemap')
def index_sitemap(uri):
    from etl_sitemap import Connector_Sitemap
    connector_sitemap = Connector_Sitemap()
    result = connector_sitemap.index(uri)
    return result
#
# Index RSS Feed
#
@app.task(name='etl.index_rss')
def index_rss(uri):
    result = etl_rss.index(uri)
    return result
#
# Enrich with / run plugins
#
@app.task(name='etl.enrich')
def enrich(plugins, uri, wait=0):
    if wait:
        time.sleep(wait)
    etl = ETL()
    etl.read_configfile('/etc/opensemanticsearch/etl')
    etl.read_configfile('/etc/opensemanticsearch/enhancer-rdf')
    etl.config['plugins'] = plugins.split(',')
    filename = uri
    # if exist delete protocoll prefix file://
    if filename.startswith("file://"):
        filename = filename.replace("file://", '', 1)
    parameters = etl.config.copy()
    parameters['id'] = uri
    parameters['filename'] = filename
    parameters, data = etl.process(parameters=parameters, data={})
    return data
#
# Read command line arguments and start
#
#if running (not imported to use its functions), run main function
if __name__ == "__main__":
    from optparse import OptionParser
    parser = OptionParser("etl-tasks [options]")
    parser.add_option("-q", "--quiet", dest="quiet", action="store_true", default=False, help="Don\'t print status (filenames) while indexing")
    parser.add_option("-v", "--verbose", dest="verbose", action="store_true", default=False, help="Print debug messages")
    (options, args) = parser.parse_args()
    if options.verbose == False or options.verbose==True:
        verbose = options.verbose
        etl_delete.verbose = options.verbose
        etl_web.verbose = options.verbose
        etl_rss.verbose = options.verbose
    if options.quiet == False or options.quiet==True:
        quiet = options.quiet
    app.worker_main()
I read multiple tutorials about Celery and from my understanding, this line should do the job
celery -A etl.tasks flower
but it doesn't. The result is the statement
Error: Unable to load celery application. The module etl was not found.
Same for
celery -A etl.tasks worker --loglevel=debug
so Celery itself seems to be causing the trouble, not flower. I also tried e.g. celery -A etl.index_filedirectory worker --loglevel=debug but with the same result.
What am I missing? Do I have to somehow tell Celery where to find etl.tasks? Online research doesn't really show a similar case, most of the "Module not found" errors seem to occur while importing stuff. So possibly it's a silly question but I couldn't find a solution anywhere. I hope you guys can help me. Unfortunately, I won't be able to respond until Monday though, sorry in advance.
I had the same issue. I installed and configured my queue as follows, and it works.
Install RabbitMQ
MacOS
brew install rabbitmq
sudo vim ~/.bash_profile
In bash_profile add the following line:
PATH=$PATH:/usr/local/sbin
Then reload bash_profile (source is a shell builtin, so it is run without sudo):
source ~/.bash_profile
Linux
sudo apt-get install rabbitmq-server
Configure RabbitMQ
Launch the queue:
sudo rabbitmq-server
In another Terminal, configure the queue:
sudo rabbitmqctl add_user myuser mypassword
sudo rabbitmqctl add_vhost myvhost
sudo rabbitmqctl set_user_tags myuser mytag
sudo rabbitmqctl set_permissions -p myvhost myuser ".*" ".*" ".*"
Launch Celery
I suggest changing into the folder that contains task.py and running the following command:
celery -A task worker -l info -Q celery --concurrency 5
Be aware that this error can mean one of two things:
The module is missing.
The module exists but cannot be loaded, for instance because it contains an error such as a SyntaxError.
To check that it's not the latter, run:
python -c "import <myModuleContainingTasksDotPyFile>"
In the context of this question:
python -c "import etl"
If it crashes, fix that first (unlike with celery, you'll get a detailed error message).
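The same check can be scripted so that the two cases are reported separately. The following is a sketch using only the standard library; the function name diagnose_import and its return values are made up for this example and are not part of Celery.

```python
import importlib
import importlib.util


def diagnose_import(module_name):
    """Classify a module as 'ok', 'missing', or 'broken'; return (status, detail)."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError as exc:
        # Raised when a parent package of a dotted name cannot be found.
        return "missing", str(exc)
    except Exception as exc:
        # A parent package was found but failed while being imported.
        return "broken", f"{type(exc).__name__}: {exc}"
    if spec is None:
        return "missing", f"No module named {module_name!r}"
    try:
        importlib.import_module(module_name)
    except Exception as exc:
        # The file exists but raises on import (SyntaxError, ImportError, ...).
        return "broken", f"{type(exc).__name__}: {exc}"
    return "ok", ""
```

For the question above, diagnose_import("etl.tasks") run from the directory where the celery command is started would show whether the module is not on the search path at all or is found but fails while loading.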
Solutions above did not work for me.
I had the same issue and my problem was that in main celery.py (that was in SmartCalend folder) I had:
app = Celery('proj')
but it should have been:
app = Celery('SmartCalend')
where SmartCalend is the actual name of the app that celery.py belongs to (!). Not any random word, but precisely the app name. This is mentioned nowhere except in the official docs.
Try export PYTHONPATH=<parent directory> where parent directory is the folder that contains etl. Run the Celery worker, and see if it fixes your problem. This is probably one of the most common Celery "issues" (not really Celery, but Python in general). Alternatively, run the Celery worker from that folder.
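What the PYTHONPATH change achieves can be reproduced inside Python: a module only becomes importable once its parent folder is on the search path. The sketch below is an illustration under that assumption; import_from_parent is a made-up helper, and modifying sys.path affects only the current process.

```python
import importlib
import sys
from pathlib import Path


def import_from_parent(parent_dir, module_name):
    """Import module_name after putting parent_dir on the module search path."""
    parent = str(Path(parent_dir).resolve())
    if parent not in sys.path:
        # Similar effect, for this process only, to adding the folder to PYTHONPATH.
        sys.path.insert(0, parent)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)
```

If import_from_parent("<parent directory>", "etl.tasks") succeeds while a plain import fails, the search path was the problem and exporting PYTHONPATH before starting the worker should resolve it.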
Answer for MacOS Catalina:
When you install celery with pip (pip install celery), python can import celery, but you are not able to launch celery from the terminal because the terminal does not know of the celery executable.
Add celery to the path to fix:
nano ~/.bash_profile
In the file add: export PATH="/Users/gavinbelson/Library/Python/2.7/bin:$PATH"
To save the file in the nano editor: ctrl+o, then enter, then ctrl+x
To update the terminal with your change type: source ~/.bash_profile
Now you should be able to type celery in the terminal window
Note: this is for the default python terminal command, which runs version 2.7. If you use python3 to run Python, you need to alter the path variable accordingly.
I have an i18nized Python Django application. It currently uses two languages; German (DE) and French (FR).
I have all my keys (.po-/.mo-files) translated and ready in German, however for French, some are missing.
In the Django settings I specified 'de' as the LANGUAGE_CODE.
I can switch from one language to the other just fine without issues. The routing works fine and every other feature I need is handled by the Django Middleware.
However, in the current scenario, when I switch from German to French, all the keys that are missing in French just fall back to the German values. I would like them to default to their keys instead.
E.g.
Current Scenario
Sortiment (available in French) -> Assortiment
Gratis Lieferung (not available in French) -> Gratis Lieferung
Expected Scenario
Sortiment (available in French) -> Assortiment
Gratis Lieferung (not available in French) -> free.shipping.info
What would be a clean solution to solve this? I couldn't find anything in the Django documentation. I'd like to solve this without using additional plugins.
One solution I could come up with would be to add all the missing keys to the French translations, with their values set to their keys, but this doesn't feel right.
E.g. in django.po
msgid "searchsuggest.placeholder"
msgstr "searchsuggest.placeholder"
Another possible solution is to not set LANGUAGE_CODE in settings.py, which works as I want for French: when I go to mypage.com/fr/, all translated keys show the correct corresponding value, while untranslated keys are shown as keys (see 'Expected Scenario'). But when I do this, the German version only shows the keys, no values. E.g. I go to mypage.com/ (German should be implicit) and this is what I see:
assortment.menu.title
free.shipping.info
More information
My urls.py
urlpatterns = i18n_patterns(
    # app endpoints
    url(r'^$', home, name='home'),
    url(r'^cart', include('app.cart.urls')),
    url(r'^', include('app.infra.url.urls')),
    prefix_default_language=False,
)
My settings.py
TIME_ZONE = 'UTC'
LANGUAGE_CODE = 'de'
LANGUAGES = [
    ('de', _('German')),
    ('fr', _('French')),
]
LOCALE_PATHS = [
    ../a/dir
]
USE_I18N = True
USE_L10N = True
USE_TZ = True
# And somewhere I use this
'django.middleware.locale.LocaleMiddleware',
My jinja template global translation function:
from django.utils.translation import ugettext as _
from jinja2.ext import Extension

class ViewExtension(Extension):
    def __init__(self, environment):
        super(ViewExtension, self).__init__(environment)
        environment.globals['trans'] = trans

# defaults back to german if not found
def trans(translation_key, **kwargs):
    translation = _(translation_key) % kwargs
    if translation == translation_key:
        # this only happens if my LANGUAGE_CODE is not set
        translation_logger.warning(f'Missing translation key "{translation_key}".')
    return translation
I'm still happy if anyone has a "proper" solution for this but I ended up solving it with a shell script:
#!/bin/bash
# A simple script to make sure that all translation files contain all the msgids.
# If a msgid previously didn't exist in a file, it will be created with the msgstr set to the same as the msgid.
SCRIPTPATH=`dirname $0`
MSGIDS=`find $SCRIPTPATH -name "*.po" -type f -print0 | xargs grep -h msgid | sort | uniq | awk '{print $2}'`
find $SCRIPTPATH -name "*.po" -type f | while read FILE; do
    current_msgids=`grep -h msgid $FILE | awk '{print $2}'`
    for msg in $MSGIDS; do
        [[ $current_msgids =~ (^|[[:space:]])"$msg"($|[[:space:]]) ]] || printf "\nmsgid $msg\nmsgstr $msg\n" >> $FILE
    done
done
I just included this script before running compilemessages in our Makefile.
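For projects that would rather keep this step in Python, the same idea can be sketched with the standard library. This is an approximation of the shell script, not a drop-in replacement: it assumes every msgid fits on a single line (no multi-line or plural entries), and the names collect_msgids and fill_missing_msgids are made up for this example.

```python
import re
from pathlib import Path

# Matches single-line entries such as: msgid "free.shipping.info"
MSGID_RE = re.compile(r'^msgid\s+"(.*)"\s*$', re.MULTILINE)


def collect_msgids(po_text):
    """Return the set of non-empty single-line msgids in a .po file's text."""
    return {msgid for msgid in MSGID_RE.findall(po_text) if msgid}


def fill_missing_msgids(locale_root):
    """Append every msgid missing from a .po file, using the msgid as its msgstr.

    Returns a dict mapping each .po path to the sorted list of msgids added.
    """
    po_files = sorted(Path(locale_root).rglob("*.po"))
    texts = {path: path.read_text(encoding="utf-8") for path in po_files}
    all_ids = set().union(*(collect_msgids(text) for text in texts.values()))
    added = {}
    for path, text in texts.items():
        missing = sorted(all_ids - collect_msgids(text))
        if missing:
            with path.open("a", encoding="utf-8") as handle:
                for msgid in missing:
                    handle.write(f'\nmsgid "{msgid}"\nmsgstr "{msgid}"\n')
        added[path] = missing
    return added
```

Like the shell version, it would be run on the locale directory before compilemessages; the header entry with an empty msgid is skipped on purpose.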
How to pass user_data script to Python Heat-API client.
I have the following script in a file that I want to pass into an instance as user_data at creation time, but I am not sure how to go about it. I am using the Heat API to create the instance. The code below creates the stack from the Heat template file, without any user_data.
Any pointers would be appreciated.
env.yml
user_data:
#!/bin/bash
rpm install -y git vim
template_file = 'heattemplate.yaml'
template = open(template_file, 'r')
stack = heat.stacks.create(stack_name='Tutorial', template=template.read(), parameters={})
In your YAML Heat template, add:
parameters:
  install_command:
    type: string
    description: Command to run from user_data
    default: |
      #!/bin/bash
      rpm install -y git vim
...
  myserver:
    type: OS::Nova::Server
    properties:
      ...
      user_data_format: RAW
      user_data: { get_param: install_command }
Then pass the new parameter through the parameters argument of your create call in Python:
heat.stacks.create(stack_name='Tutorial', template=template.read(),
                   parameters={'install_command': '...'})
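Since the script lives in its own file, its contents can be read and handed over as the parameter value. The sketch below only builds the keyword arguments; build_stack_kwargs and the file name user_data.sh are hypothetical, and heat is assumed to be an already authenticated Heat client as in the question.

```python
from pathlib import Path


def build_stack_kwargs(stack_name, template_path, script_path):
    """Build keyword arguments for heat.stacks.create from two local files."""
    return {
        "stack_name": stack_name,
        "template": Path(template_path).read_text(),
        # The whole script becomes the value of the install_command parameter.
        "parameters": {"install_command": Path(script_path).read_text()},
    }


# Usage, assuming `heat` is an authenticated client and user_data.sh is a
# hypothetical file that holds the script:
# kwargs = build_stack_kwargs("Tutorial", "heattemplate.yaml", "user_data.sh")
# stack = heat.stacks.create(**kwargs)
```

Reading the file with Path.read_text() also closes it again, which the open(...).read() pattern in the question does not do explicitly.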
I am about to make a wxPython program translatable.
I invoke python gettext with:
import gettext
import locale
languagelist = [locale.getdefaultlocale()[0], 'en_US']
t = gettext.translation('myProgram', localedir, ['de_DE','en_US'])
_ = t.ugettext
This works fine for everything like:
self.statusbar.PushStatusText(_('Connecting service ...'))
But now there is this wx.AboutBox
info = wx.AboutDialogInfo()
info.Name = swname
info.Version = swversion
info.Developers = swdevelopers
info.License = wordwrap(swlicense, 500, wx.ClientDC(self))
wx.AboutBox(info)
This AboutBox has buttons labeled "Developers" and "License" and these buttons do not get translated.
No surprise, since I just ran pygettext -d myProgram mainFile.py to create the .pot file.
So how do I get the text from wx.AboutBox into my .pot file?
They are already translated for you and are contained in the wxstd.pot, respectively in the wxstd.mo of the relevant language.
The wxPython Phoenix documentation has more information and a small sample application: http://wxpython.org/Phoenix/docs/html/internationalization.html?highlight=i18n. This also works for wxPython Classic, which you are probably using.