Changing IP for a scraping script - python

I am trying to scrape a website, but my IP gets banned after a while. I have tried using a Tor proxy, but it's unstable and slow. Therefore I think the best solution might be a standard proxy that rotates its IP, say, once every 12 hours. Or do you have any other suggestions?

IP spoofing is useless since the server's response would be delivered to the spoofed address rather than to you. You'll either have to ask the site owner or set up something like a botnet, which will be neither easy nor cheap.

Related

Is it possible to send packets using spoofed public IP?

Is it possible to send a web packet using a spoofed public IP (custom source IP header) to a server (Raspberry Pi), and have the Pi log the packet? The response is not important, nor is the method used (TCP, UDP, HTTP), only the initial one-way communication.
I have searched around on the first and second pages of Google, but all the examples I could find demonstrate this on a local IP such as 10.0.2.12. Will these examples work if I use a destination such as 67.70.XX.XX?
I'm a newbie to python networking, any help at all, or links to other resources is greatly appreciated.
Thanks everyone for your time! :)
It's much harder to spoof your public IP than it seems. You'll need to act as your own router.
@Number File's answer is pretty wrong. It's easy to spoof an IP on the local network (basically the src field of the IP header) but much harder on a public level.
Have a look at https://superuser.com/a/619483
Yes. You can do so with tools such as nmap, though any replies will go to the IP you spoof, not to you. For this reason, doing so is generally pointless unless you're trying to make an ISP or website look bad. Note: nmap is not a Python program; to use it from Python, you need to call it with something like os.system("nmap " + args).
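To make the "src field" point concrete, here is a minimal sketch (not from the original answers) that builds a raw IPv4 header with an arbitrary source address using only the standard library. The function names and the documentation-range addresses are made up for illustration. It only constructs the bytes; actually sending them needs a raw socket and root privileges, and many networks filter packets whose source address does not belong to them, which is consistent with the examples only working on a local network.

```python
import socket
import struct


def ipv4_checksum(header: bytes) -> int:
    """Return the 16-bit one's-complement checksum of an IPv4 header."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    # Fold carry bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_header(src: str, dst: str, payload_len: int,
                      proto: int = socket.IPPROTO_UDP) -> bytes:
    """Build a 20-byte IPv4 header; `src` can be any address you like."""
    layout = "!BBHHHBBH4s4s"
    fields = [
        (4 << 4) | 5,            # version 4, header length 5 x 32-bit words
        0,                       # DSCP/ECN
        20 + payload_len,        # total length
        0x1234,                  # identification (arbitrary)
        0,                       # flags / fragment offset
        64,                      # TTL
        proto,                   # payload protocol
        0,                       # checksum placeholder
        socket.inet_aton(src),   # the (possibly spoofed) source address
        socket.inet_aton(dst),
    ]
    fields[7] = ipv4_checksum(struct.pack(layout, *fields))
    return struct.pack(layout, *fields)


header = build_ipv4_header("203.0.113.7", "198.51.100.9", payload_len=8)
```

Nothing in the header proves where the packet came from, which is why the src field is easy to forge locally; routing the packet across the public internet is the hard part.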

Changing IP of python requests

How do I change the IP of HTTP requests in python?
My friend built an API for a website, and sometimes it blocks certain IPs, so we need to change the IP of the request... here is an example:
login_req = self.sess.post('https://www.XXX/YYY', params={...})
Now, each request that it sends goes through the computer's IP, and we need it basically to pass through an imaginary VPN.
Thanks for the help. If something isn't clear I will explain.
Short answer: you can't.
Long answer: it seems like you're misunderstanding how IP addresses work. Your IP address is the network address that corresponds to your computer - when you send a request to a server, you attach your IP as a "return address" of sorts, so that the server can send a response back to you.
However, just like a physical address, you don't get to choose what your IP address is – you live on a street, and that's your address, you don't get to change what the street is called or what your house number is. In general, when you send a request from your computer, the message passes through a chain of devices. For example:
Your computer --> Your router --> Your ISP --> The Server
In a lot of cases, each of these assigns a different IP address to whatever's below it. So, when your request passes through your router, your router records your IP address and then forwards the request through your ISP using its own IP address. This is how several users on the same network can share the same public IP address.
There are physical IP addresses that correspond directly to devices, but there is a limited number of these. Mostly, each Internet Service Provider has a few blocks of IP addresses that it can attach to things; an ISP can keep a specific IP address pointed to a specific computer all of the time, but they don't have to, and for many of their regular users, they don't.
Your computer has basically no power to determine what its own public IP address is. There's nothing Python can do about that.
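As a small illustration of the router/ISP chain described above, the standard-library ipaddress module can tell you whether an address is one that only makes sense behind a router or one that is routable on the public internet. This is just a sketch; the describe helper and the sample addresses are chosen for illustration.

```python
import ipaddress


def describe(address: str) -> str:
    """Classify an IP address as loopback, private, public, or other."""
    ip = ipaddress.ip_address(address)
    if ip.is_loopback:
        return "loopback"
    if ip.is_private:
        return "private"   # handed out by your router, never seen by the server
    if ip.is_global:
        return "public"    # what the remote server actually sees
    return "other"


for sample in ("192.168.1.10", "10.0.2.12", "8.8.8.8"):
    print(sample, "->", describe(sample))
```

The address your machine reports locally is usually one of the private ones; the server only ever sees the public address of whatever device sits at the edge of your network.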
Your Question:
we need [the request] basically to pass through an imaginary VPN.
It'd be easier to actually requisition a real proxy or VPN from somewhere and push your request through it. You'd have to talk with your internet service provider to get them to set something like that up for you specifically, and unless you're representing a reasonably big company they're unlikely to want to put in that effort. Most python libraries that deal with HTTP can easily handle proxy servers, so once you figure it out it shouldn't be a problem.
You can use an IP address from https://www.sslproxies.org/
For example,
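import requests
# Route both HTTP and HTTPS traffic through the proxy (port 80 suggests a plain-HTTP proxy, hence http://).
proxies = {'http': 'http://219.121.1.93:80', 'https': 'http://219.121.1.93:80'}
response = requests.get("yourURL", proxies=proxies)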
The IP addresses on that site are pretty crappy and sometimes don't work, so it would be best to find a way to constantly scrape IP addresses from the site so you have a couple to try. Check out this article: https://www.scrapehero.com/how-to-rotate-proxies-and-ip-addresses-using-python-3/
Warning: These should not be used for sensitive information as they are not secure. Don't use those IP addresses unless you are OK with anyone in the world knowing what you're doing.
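Since free proxies fail often, one way to "have a couple to try" is to loop over a list and fall back to the next proxy when one errors out. The following is only a sketch using the standard library's urllib rather than requests; fetch_via_proxy and fetch_with_rotation are hypothetical helper names, and the proxy URLs you pass in are up to you.

```python
import urllib.request


def fetch_via_proxy(url: str, proxy: str, timeout: float = 10.0) -> bytes:
    """Fetch `url` through a single proxy such as 'http://219.121.1.93:80'."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as response:
        return response.read()


def fetch_with_rotation(url: str, proxies, fetch=fetch_via_proxy) -> bytes:
    """Try each proxy in turn and return the first successful response body."""
    last_error = None
    for proxy in proxies:
        try:
            return fetch(url, proxy)
        except OSError as exc:
            # URLError, timeouts and refused connections are all OSError subclasses.
            last_error = exc
    raise RuntimeError("all proxies failed") from last_error
```

The fetch parameter exists so the rotation logic can be exercised without touching the network; in real use you would leave it at its default.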

Different IP for each bot?

I'm writing a Python bot that will request a URL under different IP addresses on one computer. Is there a way to change my IP address for free and apply it to the bot? I have looked around and it seems like people say that I should use proxies for this. But I'm not familiar with proxies and how to implement them in Python. It'd be great if someone could guide me.
Thanks
You can change your IP in Python, but your gateway will not be able to route an IP other than one in your subnet.
Therefore, you have to use a proxy or a different router.
If you have/know an active router that will forward your packets using NAT, you can use it as the gateway for the IP of the URL you are going to request.
For changing routes you can use this package: https://pypi.python.org/pypi/pyroute2
For using proxies directly in your bot, assuming you are using requests (which is built on urllib3), you can check this documentation: http://docs.python-requests.org/en/latest/user/advanced/.
Another thing you might do is to rent some VPS servers for different worldwide IPs, check this search for examples.
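If you do end up with several proxies (or VPS machines running a proxy), giving each bot its own exit IP is mostly bookkeeping. A minimal sketch, assuming you already have a list of working proxy URLs; the helper names and the bot names are placeholders, and it uses the standard library's urllib rather than requests:

```python
import itertools
import urllib.request


def assign_proxies(bot_names, proxies):
    """Map each bot to a proxy, reusing proxies round-robin if bots outnumber them."""
    if not proxies:
        raise ValueError("need at least one proxy")
    pool = itertools.cycle(proxies)
    return {name: next(pool) for name in bot_names}


def opener_for(proxy: str) -> urllib.request.OpenerDirector:
    """Build an opener whose HTTP and HTTPS requests all go through `proxy`."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    return urllib.request.build_opener(handler)


assignment = assign_proxies(
    ["bot-1", "bot-2", "bot-3"],
    ["http://proxy-a.example:8080", "http://proxy-b.example:8080"],
)
```

Each bot would then call opener_for(assignment[name]).open(url) instead of a plain urlopen, so its requests leave through its assigned proxy.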

Handling https requests using a SOCK_STREAM proxy

I'm working on a project that allows a user to redirect his browsing through a proxy. The system works like this - a user runs this proxy on a remote PC and then also runs the proxy on his laptop. The user then changes his browser settings on the laptop to use localhost:8080 to make use of that local proxy, which in turn forwards all browser traffic to the proxy running on the remote PC.
This is where I ran into HTTPS. I was able to get normal HTTP requests working fine and dandy, but as soon as I clicked on google.com, Firefox skipped my proxy and connected to https://google.com directly.
My idea was to watch for browser requests that say CONNECT host:443 and then use the Python ssl module to wrap that socket. This would give me a secure connection between the outer proxy and the target server. However, when I run Wireshark to see what a browser request looks like before SSL kicks in, it's already there, meaning it looks like the browser connects to port 443 directly, which explains why it omitted my local proxy.
I would like to be able to handle HTTPS as that would make for a complete browsing experience.
I'd really appreciate any tips that could push me in the right direction.
Well, after doing a fair amount of reading on proxies, I found out that my understanding of the problem was insufficient.
For anyone else that might end up in the same spot as me, know that there's a pretty big difference between HTTP, HTTPS, and SOCKS proxies.
HTTP proxies usually take a quick look into the HTTP headers to determine where to forward the whole packet. These are quite easy to code on your own with some basic knowledge of sockets.
HTTPS proxies, on the other hand, have to work differently. They either have to do the whole SSL magic for the client, or they can pass the traffic through unchanged; however, if the latter solution is chosen, the user's IP will be known. This is a wee bit more demanding when it comes to coding.
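For the pass-through variant: a browser that is configured to send HTTPS through an HTTP proxy first issues a CONNECT host:port request, and the proxy then simply relays bytes in both directions without decrypting them. A rough sketch of that idea, assuming a blocking design with one thread per direction (the function names are illustrative and error handling is minimal):

```python
import socket
import threading


def parse_connect(request: bytes):
    """Return (host, port) from a CONNECT request, or None if it is not one."""
    request_line = request.split(b"\r\n", 1)[0].decode("latin-1")
    parts = request_line.split()
    if len(parts) != 3 or parts[0].upper() != "CONNECT":
        return None
    host, sep, port = parts[1].rpartition(":")
    if not sep or not host or not port.isdigit():
        return None
    return host, int(port)


def pipe(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src closes or an error occurs."""
    try:
        while True:
            data = src.recv(65536)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # pass end-of-stream along
        except OSError:
            pass


def handle_client(client: socket.socket) -> None:
    """Serve one browser connection: answer CONNECT, then relay raw bytes."""
    with client:
        # Simplification: assumes the whole request head arrives in one recv().
        target = parse_connect(client.recv(65536))
        if target is None:
            client.sendall(b"HTTP/1.1 405 Method Not Allowed\r\n\r\n")
            return
        try:
            upstream = socket.create_connection(target, timeout=10)
        except OSError:
            client.sendall(b"HTTP/1.1 502 Bad Gateway\r\n\r\n")
            return
        with upstream:
            upstream.settimeout(None)
            client.sendall(b"HTTP/1.1 200 Connection established\r\n\r\n")
            worker = threading.Thread(target=pipe, args=(client, upstream), daemon=True)
            worker.start()
            pipe(upstream, client)
            worker.join()
```

handle_client would be called from an accept loop on the proxy's listening socket, typically in its own thread per connection. The TLS handshake then happens end to end between the browser and the target, so the proxy never sees the plaintext.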
SOCKS proxies are a whole different, albeit really cool, beast. They work on the 5th layer of the OSI model and honestly, I have no clue as to where I would even begin creating one. They achieve both security and anonymity. However, I do know that a person may be able to use SSH to start a SOCKS proxy on their machine; just read this: http://www.revsys.com/writings/quicktips/ssh-tunnel.html. That link also gave me an idea: it should be possible to use SSH from a Python script to make it much more convenient.
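The "SSH from a Python script" idea could look roughly like this. It is a sketch that assumes an OpenSSH client is installed and that key-based login to the remote machine already works; the helper names, the remote address and the default port are placeholders.

```python
import subprocess


def ssh_socks_command(remote: str, local_port: int = 1080) -> list:
    """Build the ssh command line for a local SOCKS proxy tunnelled to `remote`."""
    # -D opens a local SOCKS listener; -N tells ssh not to run a remote command.
    return ["ssh", "-N", "-D", str(local_port), remote]


def start_socks_tunnel(remote: str, local_port: int = 1080) -> subprocess.Popen:
    """Start ssh in the background and return the process handle."""
    return subprocess.Popen(ssh_socks_command(remote, local_port))
```

After start_socks_tunnel("user@remote-pc.example"), the browser's SOCKS proxy setting would point at localhost:1080, and calling terminate() on the returned process closes the tunnel.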
Hope this helps anyone with the same question as I had. Good luck!

Overriding hostname IP address in qtwebkit request

I'm downloading a web page (with PyQt4/QtWebKit) using given hostname, but I would like to use a pre-defined IP address for that hostname. For example, I need to hit "http://www.mysite.com" but use the IP address 1.2.3.4 instead of the actual resolved IP address. Is this at all possible in QtWebKit? I've tried a couple things so far:
Hitting http://1.2.3.4/ and sending a "Host" header of "www.mysite.com". This almost works, but ends up failing for a number of reasons (I'd be happy to go into more detail here).
Using a global /etc/hosts setting. This didn't work because it is hard to automate and I will be doing multiple downloads at once.
Is there a way, either in Python or in PyQt4/QtWebKit, to override the IP address associated with a hostname?
This is big for me. Any help at all would be greatly appreciated.
Use a custom network access manager, something like this (C++): http://ariya.blogspot.com/2010/05/qnetworkaccessmanager-tracenet-speed.html, so that you can "hijack" the network request and "redirect" it to another domain.
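For comparison, outside QtWebKit the first approach from the question (connect to the IP and send the real hostname in the Host header) is short in plain Python. This is only a sketch for plain HTTP with a hypothetical helper name; it does not cover HTTPS, where certificate and SNI checks against the hostname get in the way, and it is not a substitute for the custom network access manager inside a Qt application.

```python
import http.client


def fetch_from_ip(ip: str, hostname: str, path: str = "/",
                  port: int = 80, timeout: float = 10.0):
    """GET `path` from the server at `ip`, presenting `hostname` as the Host header."""
    conn = http.client.HTTPConnection(ip, port, timeout=timeout)
    try:
        # Supplying our own Host header keeps http.client from adding one based on the IP.
        conn.request("GET", path, headers={"Host": hostname})
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

For the example in the question this would be called as fetch_from_ip("1.2.3.4", "www.mysite.com"). Because the target IP is an argument rather than a global /etc/hosts entry, several downloads with different overrides can run at once.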
