How can I change my IP when I use Instaloader? - python

I use Instaloader and I have a problem.
I have a large number of usernames, and while I am downloading profiles Instagram blocks my IP, so I cannot make requests for a while.
Can anyone help?
If it is possible to use a proxy, I think that would be a good solution.
The job has to finish in a short time for these users, so I can't use time.sleep().

Related

'406 Not Acceptable' after scraping web using python

The website I scraped blocked me by showing 406 Not Acceptable in the browser. I might have mistakenly sent too many requests at once from my Python code.
So I put time.sleep(10) in each loop so that it would not look like a DDoS attack, and it seems to have worked.
My questions are:
How long is it reasonable to wait between requests? Sleeping 10 seconds in each loop makes my code run too slowly.
How can I fix the 406 Not Acceptable error in my browsers? They still block me; it only goes away if I change my IP address, but that is not a permanent solution.
Thank you all for your answers and comments. Good day!
Any rate-limit errors are all subject to which website you choose to scrape / interact with. I could set up a website that only allows you to view it once per day, before throwing HTTP errors at your screen. So to answer your first question, there is no definitive answer. You must test for yourself and see what's the fastest speed you can go, without getting blocked.
However, there is a workaround. If you use proxies, it becomes much harder for the site to detect and stop your requests, so you are far less likely to be hit by HTTP errors. HOWEVER, JUST BECAUSE YOU CAN, DOESN'T MEAN THAT YOU SHOULD. I am a programmer, not a lawyer. I'm sure there's a rule somewhere that says that spamming a page, even after it tells you to stop, is illegal.
Your second question isn't exactly related to programming, but I will answer it anyway: try clearing your cookies or refreshing your IP (try using a VPN or such). Other than your IP or cookies, there are not many other ways that a page can fingerprint you (in order to block you).
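Since there is no definitive delay for the first question, one way to "test for yourself" is to let the script adjust its own pause. The following is only a sketch under assumptions: the class name `AdaptiveDelay`, its default values, and the choice to treat 406 and 429 as the only "blocked" signals are all made up for illustration and are not taken from any library.

```python
import time


class AdaptiveDelay:
    """Grow the pause after a block, shrink it slowly after successes."""

    def __init__(self, start=1.0, minimum=0.5, maximum=60.0):
        self.delay = start
        self.minimum = minimum
        self.maximum = maximum

    def record(self, status_code):
        # 406 and 429 are the "blocked" statuses mentioned in this thread;
        # treating them as the only block signals is an assumption.
        if status_code in (406, 429):
            self.delay = min(self.delay * 2, self.maximum)
        else:
            self.delay = max(self.delay * 0.9, self.minimum)
        return self.delay

    def wait(self):
        # Sleep for the current delay before the next request.
        time.sleep(self.delay)
```

Call `record(response.status_code)` after each request and `wait()` before the next one. The delay doubles whenever the site pushes back and creeps down again while requests succeed, so it tends to settle near the fastest rate the site tolerates.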

How can I bypass the 429-error from www.instagram.com?

I'm asking for your help today because I have a problem with Selenium.
My goal is to make a fully automated bot that creates an account with parsed details (mail, pass, birth date...). So far, I've almost finished the bot (I just need to access Gmail and get the confirmation code).
My problem is this: after trying a lot of things, I get "Failed to load resource: the server responded with a status of 429 ()".
So, I guess, Instagram is blocking me.
How could I bypass this?
The answer is in the description of the HTTP error code. You are being blocked because you made too many requests in a short time.
Reduce the rate at which your bot makes requests and see if that helps. As far as I know there's no way to "bypass" this check by the server.
Check if the response header has a Retry-After value to tell you when you can try again.
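A Retry-After value can be either a number of seconds or an HTTP date, so it needs a little parsing before you can sleep on it. Here is a minimal sketch using only the standard library; the function name `retry_after_seconds` and its `now` parameter (there to make it testable) are my own choices, not part of any library.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, now=None):
    """Return the seconds to wait, or None if the header is missing or unparseable."""
    if not header_value:
        return None
    value = header_value.strip()
    # Form 1: a plain number of seconds, e.g. "120".
    if value.isascii() and value.isdigit():
        return float(value)
    # Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Assume UTC when the date carries no usable timezone.
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means there is no need to wait any longer.
    return max((when - now).total_seconds(), 0.0)
```

With `requests`, you would pass in `response.headers.get("Retry-After")` and fall back to your own delay when the function returns None.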
A status code of 429 means that you have sent too many requests to Instagram's server, and that is why Instagram has blocked your IP.
This is done mainly to prevent DDoS attacks.
The best thing would be to try again after some time (there might be a Retry-After header in the response).
Also, increase the time interval between requests and cap the number of requests made within a specified time (let's say 1 hr).
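Capping the number of requests within a specified time can be done with a small rolling-window counter. This is a sketch, not a tested production limiter: the class name `WindowLimiter` and the injectable `clock` argument are assumptions made for illustration, and the real limit Instagram applies is not known here.

```python
import time
from collections import deque


class WindowLimiter:
    """Allow at most max_requests in any rolling window of `window` seconds."""

    def __init__(self, max_requests, window, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window
        self.clock = clock
        self.sent = deque()  # timestamps of recent requests, oldest first

    def seconds_until_allowed(self):
        now = self.clock()
        # Forget requests that have already left the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.max_requests:
            return 0.0
        # Wait until the oldest remembered request leaves the window.
        return self.window - (now - self.sent[0])

    def record(self):
        # Call this right after sending a request.
        self.sent.append(self.clock())
```

Before each request, sleep for `seconds_until_allowed()`, send the request, then call `record()`. For "200 requests per hour" you would create it as `WindowLimiter(200, 3600)`; that number is only an example.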
Retry-After header is the best practice. However, there's no such response header in this scenario.

Suggestion: The best way to send messages on Facebook without getting banned using python

I'm using a Python script that monitors a website and sends me messages on Facebook if there are any specific updates.
I have tried a module called 'fbchat', which is simple and easy to use, but the problem is that I'm using real Facebook accounts and somehow Facebook detected that it's a bot and banned that profile, even though I added random pauses to my code.
I know that I can send those notifications by email, but for me Facebook messages are better... Any ideas about how I can make this possible (maybe through bots!!)?
Thank you!
First take a look at which parameters (headers and payloads) the POST method takes (using the network tools in Google Chrome, for example), and try again with as many parameters as possible, while also using a session so cookies are enabled as well.
Different websites use different methods of detecting bots, and you'll just have to test and see what works.
P.S: take a look at this answer for more info.
Multiple Accounts are not allowed on Facebook, and there is no (allowed) way to send messages between users. You can only send messages from Pages to Users, and only if the User started the conversation. You can find more information about that in the docs: https://developers.facebook.com/docs/messenger-platform/

Bypass rate limit for requests.get

I want to constantly scrape a website - once every 3-5 seconds with
requests.get('http://www.example.com', headers=headers2, timeout=35).json()
But the example website has a rate limit and I want to bypass that. How can I do so?? I thought about doing it with proxies but was hoping there were some other ways?
You would have to do some very low-level stuff, likely using socket and urllib2.
First do your research. How are they limiting your query rate? Is it by IP, or session based (server side cookie) or local cookies? I suggest going to the site manually as your first step of research, and using a web-developer tool to view all headers communicated.
Once you figure this out, create a plan to manipulate it.
Let's say it is session based: you could use multiple threads to control several individual instances of a scraper, each with its own session.
Now, if it is IP based, then you must spoof your IP which is much more complex.
Just buy quite a lot of proxies,
and configure the script to switch to the next proxy after the server's rate limit time.
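Switching to the next proxy could look roughly like the sketch below. The helper name `proxy_rotation` is made up for illustration; the only real API detail relied on is that `requests` accepts a `proxies` mapping with `"http"` and `"https"` keys.

```python
from itertools import cycle


def proxy_rotation(proxy_urls):
    """Yield requests-style proxy mappings, looping over proxy_urls forever."""
    for url in cycle(proxy_urls):
        yield {"http": url, "https": url}
```

Each time the rate limit window is reached, take `next(rotation)` and pass the result as `requests.get(url, proxies=...)`. The proxy addresses themselves have to come from your provider.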

How to use a Proxy with Youtube API? (Python)

I'm working on a script that will upload videos to YouTube with different accounts. Is there a way to use HTTPS or SOCKS proxies to filter all the requests? My client doesn't want to leave any footprints for Google. The only way I found was to set the proxy environment variable beforehand but this seems cumbersome. Is there some way I'm missing?
Thanks :)
Setting an environment variable (e.g. import os; os.environ['BLAH']='BLUH') once at the start of your program "seems cumbersome"?! What does count as "non-cumbersome" for you, pray?
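To make the environment variable approach concrete, here is a small sketch. The helper name `use_proxy` and the address `http://192.0.2.10:8080` are placeholders invented for this example. Whether the YouTube client library you use honors these variables is something to check for yourself; the standard library's `urllib` does, and so does `requests` by default.

```python
import os
import urllib.request


def use_proxy(proxy_url):
    """Set the conventional proxy environment variables for this process."""
    # Both spellings are set because tools differ in which one they read.
    for name in ("HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy"):
        os.environ[name] = proxy_url


# Hypothetical proxy address; replace it with a real one.
use_proxy("http://192.0.2.10:8080")

# urllib reads these variables when it decides how to connect, so this
# shows what a later request would pick up.
print(urllib.request.getproxies())
```

Run this once at the start of the program, before any HTTP client objects are created, so that everything built afterwards sees the same settings.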
