Error loading vectors in fasttext/torchtext - python

I get the following error:
urllib.error.HTTPError: HTTP Error 403: Forbidden
When running this:
class FastText(Vectors):
    url_base = 'https://s3-us-west-1.amazonaws.com/fasttext-vectors/wiki.{}.vec'
    def __init__(self, language="en", **kwargs):
        url = self.url_base.format(language)
        name = os.path.basename(url)
        super(FastText, self).__init__(name, url=url, **kwargs)
How can I fix it?
I tried changing the URL, but that did not work.

An HTTP 403 "Forbidden" error means that the server to which you've sent your request is refusing to let you access that URL, perhaps because it's not open to the public, or requires extra authentication.
If I try the URL https://s3-us-west-1.amazonaws.com/fasttext-vectors/wiki.en.vec from a web browser, I get the same error. So this isn't really an issue with your code, or Python, or PyTorch, or fasttext. You've just got an improper expectation that the given URL will return what you want it to.
What reference, or entity, made you think https://s3-us-west-1.amazonaws.com/fasttext-vectors/wiki.en.vec would give you what you need? You may need to follow-up with whoever recommended that URL, and let them know their recommendation does not work, or find an alternate source.
Only if a URL works for you in your browser should you expect it to also work in your code with no extra authentication steps. Even then, it might still break for other reasons, such as the system where your code runs having a disfavored network IP address. So before supplying such a URL to configure a custom subclass of Vectors, you might run some code from that same system to verify that the URL loads at all.
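One way to run that check is a short standard-library script like the sketch below. The helper name check_url is made up for illustration, and it assumes the server answers HEAD requests the same way it answers the GET that a real download would use, which is not guaranteed for every server.

```python
import urllib.error
import urllib.request


def check_url(url, timeout=10):
    """Send a HEAD request and return (ok, detail).

    detail is the HTTP status code when the server answered,
    otherwise a string describing the network failure.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return True, response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 403.
        return False, err.code
    except OSError as err:
        # DNS failure, refused connection, timeout, and similar problems.
        return False, str(err)


# Example (needs network access), using the URL from the question:
# print(check_url("https://s3-us-west-1.amazonaws.com/fasttext-vectors/wiki.en.vec"))
```

If this returns (False, 403) on the machine where the training code runs, the problem is the URL or the access rights, not the Vectors subclass.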

Related

How to see if user is connected to internet or NOT?

I am building a blog app and I am stuck on a problem.
What I am trying to do
I am trying to implement a feature: if the user is connected to the internet, everything works as normal, but if the user is not connected, show a message like "You're not connected to the internet".
What I have tried
I tried Channels, but I think internet connectivity has little to do with Django Channels.
I also tried this:
url = "http://127.0.0.1:8000/"
timeout = 5
try:
    request = requests.get(url, timeout=timeout)
    print("Connected to the Internet")
except (requests.ConnectionError, requests.Timeout) as exception:
    print("No INTERNET")
But it keeps showing me:
'Response' object has no attribute 'META'
I don't know what to do.
Any help would be appreciated.
Thank you in advance.

It is not easy to know whether you're connected to the internet. In fact, it is not even clear what this means; it depends a lot on the context.
In many practical cases it means that your network is set up such that you can reach a DNS server and at least one machine on the internet.
You could just use one known URL, for example "https://google.com" or "https://stackoverflow.com".
However, this means that:
your test will fail if the given service is down for any reason
you create requests to a server that isn't yours.
If you know that the application should access your own web service, then you could use the URL of that web service:
url = "https://your_special_webservice.yourdomain"
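If "connected" is taken to mean "the name resolves and the machine accepts a TCP connection", the check can also be done without HTTP at all. The following is a minimal sketch under that assumption; the function name can_reach and the default port 443 are choices made for this example.

```python
import socket


def can_reach(host, port=443, timeout=5):
    """Return True if host resolves via DNS and accepts a TCP connection."""
    try:
        # create_connection does the DNS lookup and the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS errors, refused connections and timeouts.
        return False


# Example (needs network access):
# print(can_reach("stackoverflow.com"))
```

Like the HTTP variant, this only tells you whether the machine running the code can reach that one host. In a Django app that machine is the server, not the visitor's browser.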
Side note:
If you put the code from your question into a Django view that handles HTTP requests, then you should probably write something like:
response = requests.get(url, timeout=timeout)
instead of
request = requests.get(url, timeout=timeout)
Otherwise you will overwrite the request object of your Django view, and this is probably what provoked your error message:
'Response' object has no attribute 'META'

Sending a redirect with code 401 from Flask doesn't redirect

When I do this, it works:
return redirect('/admin-login')
But when I pass an HTTP status code as a parameter, the redirect doesn't work and I get this page instead:
return redirect('/admin-login', code=401)
Redirecting...
You should be redirected automatically to target URL: /admin-login. If
not click the link.
How do I fix this? Thanks

The message is actually there in both cases. Flask uses the Location header field to trigger a redirect. When your browser sees that header on a redirect response, it skips displaying the content and immediately redirects.
The reason it doesn't work for code=401 is that 401 Unauthorized is a client error. This results in the browser ignoring the redirect (at least with Chrome).
The Flask documentation for flask.redirect mentions the following:
Supported codes are 301, 302, 303, 305, and 307.
It doesn't specifically mention why they are the only supported ones, but the most likely reason is that browsers generally don't follow a redirect unless the status code is 3XX (Redirection).
The solution is thus to use one of the supported status codes.
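The difference between a 3XX and a 401 response that both carry a Location header can be reproduced without Flask. The sketch below is a stand-in, not Flask code: it uses Python's built-in HTTP server and urllib as the client, and the handler class and the paths /as-302 and /as-401 are invented for the demo. urllib is not a browser, but it treats the two cases the same way as described above.

```python
import http.server
import threading
import urllib.error
import urllib.request


class DemoHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical handler: /as-302 and /as-401 both send a Location header."""

    def do_GET(self):
        if self.path == "/admin-login":
            body = b"login page"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
            return
        if self.path == "/as-302":
            status = 302
        elif self.path == "/as-401":
            status = 401
        else:
            status = 404
        self.send_response(status)
        self.send_header("Location", "/admin-login")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(url):
    """Return (status, final_url); urllib follows 3XX but raises on 4XX."""
    # An empty ProxyHandler bypasses any system proxy for the local demo.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=5) as response:
            return response.status, response.geturl()
    except urllib.error.HTTPError as err:
        return err.code, url


def run_demo():
    """Start the server on a free port and fetch both demo paths."""
    server = http.server.HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:%d" % server.server_port
    try:
        return fetch(base + "/as-302"), fetch(base + "/as-401")
    finally:
        server.shutdown()
        server.server_close()
```

Calling run_demo() returns one (status, final_url) pair per path: the 302 request ends at /admin-login with status 200, while the 401 request stays at /as-401 with status 401 even though the same Location header was sent.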
Solution that worked for me:
@app.errorhandler(401)
def page_not_found(e):
    return redirect('http://example.com')
Previously I had been trying:
@app.errorhandler(401)
def page_not_found(e):
    return redirect('http://example.com', code=401)
...but it turns out the ", code=401" argument is not needed inside the error handler; without it, the redirect happens automatically.

My solution was to create custom error pages and add a refresh meta tag to the page head. This redirects to the homepage after the page has been open for 5 seconds:
<meta http-equiv="refresh" content="5;url=/" />

Change URL to another URL using mitmproxy

I am trying to redirect one page to another by using mitmproxy and Python. I can run my inline script together with mitmproxy without issues, but I am stuck when it comes to changing the URL to another URL. For example, if I went to google.com, it would redirect to stackoverflow.com.
def response(context, flow):
    print("DEBUG")
    if flow.request.url.startswith("http://google.com/"):
        print("It does contain it")
        flow.request.url = "http://stackoverflow/"
This should in theory work. I see http://google.com/ in the GUI of mitmproxy (as GET) but the print("It does contain it") never gets fired.
When I try to just put flow.request.url = "http://stackoverflow.com" right under the print("DEBUG"), it doesn't work either.
What am I doing wrong? I have also tried if "google.com" in flow.request.url to check if the URL contains google.com, but that doesn't work either.
Thanks

The following mitmproxy script will:
Redirect requests from mydomain.com to newsite.mydomain.com
Change the request method path (supposed to be something like /getjson?) to a new one, /getxml
Change the destination host scheme
Change the destination server port
Overwrite the request header Host to pretend to be the origin
import mitmproxy
from mitmproxy.models import HTTPResponse
from netlib.http import Headers
def request(flow):
    if flow.request.pretty_host.endswith("mydomain.com"):
        mitmproxy.ctx.log( flow.request.path )
        method = flow.request.path.split('/')[3].split('?')[0]
        flow.request.host = "newsite.mydomain.com"
        flow.request.port = 8181
        flow.request.scheme = 'http'
        if method == 'getjson':
            flow.request.path=flow.request.path.replace(method,"getxml")
        flow.request.headers["Host"] = "newsite.mydomain.com"
You can set the .url attribute, which will update the underlying attributes. Looking at your code, your problem is that you change the URL in the response hook, after the request has already been made. You need to change the URL in the request hook, so that the change is applied before resources are requested from the upstream server.
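As a rough sketch of that advice, the script from the question could be moved into the request hook like this. It assumes the one-argument hook signature request(flow) used by the other script in this thread; older mitmproxy releases passed (context, flow) instead. Matching on pretty_host rather than on the full URL is a choice made for this example.

```python
def request(flow):
    """mitmproxy request hook: runs before the request goes upstream."""
    # Match on the host only; scheme and path are not part of the test.
    if flow.request.pretty_host in ("google.com", "www.google.com"):
        # Setting .url rewrites scheme, host, port and path in one step.
        flow.request.url = "http://stackoverflow.com/"
```

Because the change happens before the upstream request is sent, the client still believes it is talking to the original host and simply receives the other site's response.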
Setting the url attribute will not help you, as it is merely constructed from underlying data. [EDIT: I was wrong, see Maximilian’s answer. The rest of my answer should still work, though.]
Depending on what exactly you want to accomplish, there are two options.
(1) You can send an actual HTTP redirection response to the client. Assuming that the client understands HTTP redirections, it will submit a new request to the URL you give it.
from mitmproxy.models import HTTPResponse
from netlib.http import Headers
def request(context, flow):
    if flow.request.host == 'google.com':
        flow.reply(HTTPResponse('HTTP/1.1', 302, 'Found',
                                Headers(Location='http://stackoverflow.com/',
                                        Content_Length='0'),
                                b''))
(2) You can silently route the same request to a different host. The client will not see this, it will assume that it’s still talking to google.com.
def request(context, flow):
    if flow.request.url == 'http://google.com/accounts/':
        flow.request.host = 'stackoverflow.com'
        flow.request.path = '/users/'
These snippets were adapted from an example found in mitmproxy’s own GitHub repo. There are many more examples there.
For some reason, I can’t seem to make these snippets work for Firefox when used with TLS (https://), but maybe you don’t need that.

How to read JSON from URL in Python?

I am trying to use Python to get a JSON file from the Web. If I open the URL in my browser (Mozilla or Chromium) I do see the JSON. But when I do the following in Python:
response = urllib2.urlopen(url)
data = json.loads(response.read())
I get an error message that says (translated into English): Errno 10060, a connection attempt failed because the server did not respond after a certain period of time, or the established connection failed because the host did not respond.
ADDED
It looks like many people have faced this problem, and there are answers to similar (or identical) questions. For example, one of them gives the following solution:
import requests
r = requests.get("http://www.google.com", proxies={"http": "http://61.233.25.166:80"})
print(r.text)
That is already a step forward for me (I think it is very likely that the proxy is the cause of the problem). However, I still have not solved it, since I do not know the URL of my proxy and I will probably need a user name and password. How can I find them? How is it that my browsers have them and I do not?
ADDED 2
I think I am now one step further. I have used this site to find out what my proxy is: http://www.whatismyproxy.com/
Then I have used the following code:
proxies = {'http':'my_proxy.blabla.com/'}
r = requests.get(url, proxies = proxies)
print r
As a result I get
<Response [404]>
That does not look good, but at least I think my proxy is correct, because when I randomly change the proxy address I get a different error:
Cannot connect to proxy
So I can connect to the proxy, but something is not found.

I think something might be going wrong when you try to get the JSON from the online source (URL). Just to make things clear, here is a small code snippet:
#!/usr/bin/env python
try:
    # For Python 3+
    from urllib.request import urlopen
except ImportError:
    # For Python 2
    from urllib2 import urlopen
import json
def get_jsonparsed_data(url):
    response = urlopen(url)
    # Decode the bytes explicitly; str() on bytes yields "b'...'" in Python 3
    data = response.read().decode("utf-8")
    return json.loads(data)
If you still get a connection error, you can try a couple of steps:
Try to urlopen() a random site from the interpreter (interactive mode). If you are able to grab the source code, you're good. If not, check your internet connection or try the requests module.
Check whether the JSON at the URL has correct syntax.
Try the simplejson module.
Edit 1:
If you want to access websites through a system-wide proxy, you will have to use a proxy handler that connects to that proxy via loopback (localhost). Sample code is shown below.
proxy = urllib2.ProxyHandler({
    'http': '127.0.0.1',
    'https': '127.0.0.1'
})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
# this way you can send both http and https request using proxies
urllib2.urlopen('http://www.google.com')
urllib2.urlopen('https://www.google.com')
I have not worked a lot with ProxyHandler. I just know the theory and the code. I am sure there are better ways to access websites through proxies, such as one that does not involve installing the opener every time you run the program. But hopefully it will point you in the right direction.
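For Python 3, one possible variant that avoids installing a global opener is sketched below. The function name fetch_json and the proxy address in the docstring are placeholders, not values taken from the question, and the response body is assumed to be UTF-8 encoded.

```python
import json
import urllib.request


def fetch_json(url, proxies=None, timeout=10):
    """Fetch url and parse the body as JSON.

    proxies=None lets urllib use the system proxy settings; pass a dict such
    as {"http": "http://proxy.example.com:8080"} (hypothetical address) to
    choose a proxy explicitly, or {} to bypass proxies entirely.
    """
    # The opener is local to this call, so nothing is installed globally.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxies))
    with opener.open(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A proxy that needs a user name and password is usually given as part of the proxy URL, in the form http://user:password@host:port. Whether that applies to a particular corporate proxy is something only its administrator can confirm.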

Missing OAuth request token cookie error using tornado and TwitterMixin

I'm using tornado and the TwitterMixin and I use the following basic code:
class OauthTwitterHandler(BaseHandler, tornado.auth.TwitterMixin):
    @tornado.web.asynchronous
    def get(self):
        if self.get_argument("oauth_token", None):
            self.get_authenticated_user(self.async_callback(self._on_auth))
            return
        self.authorize_redirect()
    def _on_auth(self, user):
        if not user:
            raise tornado.web.HTTPError(500, "Twitter auth failed")
        self.write(user)
        self.finish()
For me it works very well, but sometimes users of my application get a 500 error which says:
Missing OAuth request token cookie
I don't know if it comes from the browser or from the Twitter API callback configuration.
I've looked through the tornado code and I don't understand why this error appears.

Two reasons why this might happen:
Some users may have cookies turned off, in which case this won't work.
The cookie hasn't been set. It's possible that the oauth_token argument is set, but the cookie is not. I'm not sure why this would happen; you'd have to add some logging to understand why.
At any rate, this isn't an "error" but rather a sign that the user isn't authenticated. If you see it, you could just redirect them to the authorize URL and let them try again.

I found the solution!
It was due to my DNS.
I hadn't set up the redirect between www.mydomain.com and mydomain.com, so the cookie was sometimes set on the www. domain and sometimes not. My server then looked in the wrong place, didn't find the cookie, and sent a 500 error.

The reason this was happening to me is that the Callback URL configuration was pointing to a different domain.
Take a look at the settings tab for your application at https://dev.twitter.com/apps/, or check whether the users getting the error are accessing your site from a different domain.
See: http://groups.google.com/group/python-tornado/browse_thread/thread/55aa42eef42fa1ac
