Cannot save a BufferedReader to a file - python

I am trying to use http-parser and write the response to a file, following the example here. I changed the GET request to fetch an image and am now trying to save it to a file:
open('image.jpg', 'wb').write(p.body_file().read())
But the file has zero bytes. What am I missing here?
Complete code:
#!/usr/bin/env python
import socket
from http_parser.http import HttpStream
from http_parser.reader import SocketReader
def main():
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.connect(('www.linux-mag.com', 80))
        s.send("GET http://www.linux-mag.com/s/i/topics/tux.jpg HTTP/1.1\r\nHost: www.linux-mag.com\r\n\r\n")
        r = SocketReader(s)
        p = HttpStream(r)
        print p.body_file()
        open('image.jpg', 'wb').write(p.body_file().read())
    finally:
        s.close()
if __name__ == "__main__":
    main()

It turns out that I needed to run the script with sudo. After running sudo python <script>, it works fine.
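Independently of the sudo fix, the one-liner open(...).write(...) never closes the file explicitly and gives no hint when the body is empty. Below is a minimal sketch (Python 3) of a more defensive way to save a binary stream. save_stream is a hypothetical helper, and io.BytesIO only stands in for p.body_file(); it does not reproduce http-parser's behaviour.

```python
import io
import os
import shutil
import tempfile


def save_stream(stream, path, chunk_size=64 * 1024):
    """Copy a binary file-like object to path and return the bytes written."""
    # The with-block guarantees the file is flushed and closed before we measure it.
    with open(path, "wb") as out:
        shutil.copyfileobj(stream, out, chunk_size)
    return os.path.getsize(path)


if __name__ == "__main__":
    # Assumption: io.BytesIO stands in for p.body_file(); any object with read() works.
    fake_body = io.BytesIO(b"\xff\xd8\xff\xe0 not a real JPEG")
    target = os.path.join(tempfile.mkdtemp(), "image.jpg")
    written = save_stream(fake_body, target)
    print("wrote", written, "bytes")
```

Returning the byte count makes a zero-byte result visible right away instead of only after inspecting the file.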

Related

Testing-containers and clickhouse-driver error:Unexpected EOF while reading bytes

I have these libraries installed:
testcontainers==2.5
clickhouse-driver==0.1.0
This code:
from testcontainers.core.generic import GenericContainer
from clickhouse_driver import Client
def test_docker_run_clickhouse():
    ch_container = GenericContainer("yandex/clickhouse-server")
    ch_container.with_bind_ports(9000, 9000)
    with ch_container as ch:
        client = Client(host='localhost')
        print(client.execute("SHOW TABLES"))
if __name__ == '__main__':
    test_docker_run_clickhouse()
I am trying to get a generic container running with a ClickHouse database, but it gives me: EOFError: Unexpected EOF while reading bytes.
I am using Python 3.5.2. How can I fix this?
The container takes some time to start. Add a delay before executing any operations against it.
import time
with ch_container as ch:
    time.sleep(3)
    client = Client(host='localhost')
    print(client.execute("SHOW TABLES"))
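A fixed sleep can be too short on a slow machine and wastes time on a fast one. As a rough sketch, the delay could be replaced by polling until the first call succeeds. wait_until is a hypothetical helper written for this illustration, not part of testcontainers or clickhouse-driver, and the exception types to retry on are an assumption you would need to adapt.

```python
import time


def wait_until(action, timeout=30.0, interval=0.5, retry_on=(Exception,)):
    """Call action() repeatedly until it succeeds or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            return action()
        except retry_on:
            # Give up and re-raise the last error once the deadline has passed.
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval)


# Possible use inside the with-block from the question (assumption: the
# EOFError shown there is what a not-yet-ready server produces):
#   tables = wait_until(lambda: Client(host='localhost').execute("SHOW TABLES"),
#                       retry_on=(EOFError, OSError))
```

The helper returns as soon as the server answers, so the test only waits as long as the container actually needs.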

Program class is stuck/idle and does not execute remaining calls after 1st call in Anaconda/Command Line Prompt but works in Spyder

I am trying to run my Python script from the Anaconda prompt. It runs smoothly through the first call but then stops there. The same script works in Spyder, but I would like it to work from the Anaconda prompt or the command line. Any idea why it doesn't?
from decompress import decompress
from reddit import reddit
from clean import clean
from wikipedia import wikipedia
def main():
    dir_of_file = r"D:\Users\Jonathan\Desktop\Reddit Data\Demo\\"
    print('0. Path: ' + dir_of_file)
    reddit_repo = reddit()
    wikipedia_repo = wikipedia()
    pattern_filter = "*2007*&*2008*"
    print('1. Creating data lake')
    reddit_repo.download_files(pattern_filter,"https://files.pushshift.io/reddit/submissions/",dir_of_file,'s')
    reddit_repo.download_files(pattern_filter,"https://files.pushshift.io/reddit/comments/",dir_of_file,'c')
if __name__ == "__main__":
    main()
The "RS Downloaded" output comes from this line of code being run:
reddit_repo.download_files(pattern_filter,"https://files.pushshift.io/reddit/submissions/",dir_of_file,'s')
Update:
Added the class/function
class reddit:
    def multithread_download_files_func(self,list_of_file):
        filename = list_of_file[list_of_file.rfind("/")+1:]
        path_to_save_filename = self.ptsf_download_files + filename
        if not os.path.exists(path_to_save_filename):
            data_content = None
            try:
                request = urllib.request.Request(list_of_file)
                response = urllib.request.urlopen(request)
                data_content = response.read()
            except urllib.error.HTTPError:
                print('HTTP Error')
            except Exception as e:
                print(e)
            if data_content:
                with open(path_to_save_filename, 'wb') as wf:
                    wf.write(data_content)
                print(self.present_download_files + filename)
    def download_files(self,filter_files_df,url_to_download_df,path_to_save_file_df,prefix):
        #do some processing
        matching_fnmatch_list.sort()
        p = ThreadPool(200)
        p.map(self.multithread_download_files_func, matching_fnmatch_list)
It was the download itself that was taking a long time. I switched to a different network and it worked as expected, so there is no issue with cmd or the Anaconda prompt.
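Since a slow download looks exactly like a hang when nothing is printed until p.map returns, reporting each file as it finishes helps tell the two apart. The following is only a sketch (Python 3): download_all and its fetch argument are hypothetical names, and the fetcher is passed in so the example runs without network access.

```python
from multiprocessing.pool import ThreadPool


def download_all(urls, fetch, workers=8):
    """Run fetch(url) for every URL on a thread pool and report progress."""
    results = {}

    def task(url):
        # Catch errors per URL so one failure does not abort the whole batch.
        try:
            return url, fetch(url), None
        except Exception as exc:
            return url, None, exc

    with ThreadPool(workers) as pool:
        # imap_unordered yields each result as soon as that download finishes.
        for url, data, error in pool.imap_unordered(task, urls):
            if error is None:
                results[url] = data
                print("done:", url, flush=True)
            else:
                print("failed:", url, error, flush=True)
    return results
```

A real fetcher could be something like lambda u: urllib.request.urlopen(u, timeout=30).read(); the timeout makes a stalled connection raise an error instead of blocking indefinitely.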

RTLO in filename in windows

I wrote a simple script in Python:
#!/usr/bin/python
from uuid import getnode as get_mac
import socket
import requests
import datetime
#import tkMessageBox as messagebox
#import Tkinter as tk
def main():
    print('start')
    i = datetime.datetime.now()
    headers = {"Content-Type": "text/html; charset=UTF-8"}
    r = requests.post("http://michulabs.pl", data={'name' : 'CI17nH', 'ip' : getIp(), 'mac' : getMac(), 'source' : 'so', 'join_date' : i})
    print(r.status_code, r.reason)
    print(r.text) # TEXT/HTML
    print(r.status_code, r.reason) # HTTP
"""
method to read ip from computer
it will be saved in database
"""
def getIp():
    ip = socket.gethostbyname(socket.gethostname())
    print 'ip: ' + str(ip)
    return ip
"""
method to read mac from computer
it will be saved in database
"""
def getMac():
    mac = get_mac()
    print 'mac: ' + str(mac)
    return mac
if __name__ == "__main__":
    main()
Then I generated an .exe file with py2exe and tried to use the RTLO character in the filename, so that the script is displayed as moc.pdf instead of pdf.com. It works fine when named pdf.com, but not after adding the RTLO character. Has Windows blocked that trick in filenames, or am I doing something wrong?
PS: Windows doesn't seem to block the trick in general, because it does work with another file.
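For context, the RTLO character (U+202E, RIGHT-TO-LEFT OVERRIDE) only changes how a name is displayed; the stored name, and therefore the real extension, stays the same. A minimal sketch (Python 3) that inspects a name for such characters follows. inspect_filename and the sample name are made up for illustration, and this does not diagnose why the py2exe build in the question fails.

```python
import os
import unicodedata

# Unicode bidirectional control characters that can reorder displayed text.
BIDI_CONTROLS = {
    "\u200e", "\u200f",
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}


def inspect_filename(name):
    """Return the real extension and the names of any bidi controls found."""
    controls = [unicodedata.name(ch) for ch in name if ch in BIDI_CONTROLS]
    extension = os.path.splitext(name)[1]
    return extension, controls


if __name__ == "__main__":
    # Hypothetical name: the stored extension is .exe, whatever a file
    # manager renders after the override character.
    print(inspect_filename("invoice\u202efdp.exe"))
```

A check like this is one way to flag suspicious attachments or uploads before they reach a user.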

PySerial client unable to write data

I'm trying to write a Python program which can communicate over a serial interface using the PySerial module, as follows:
import serial
if __name__ == '__main__':
    port = "/dev/tnt0"
    ser = serial.Serial(port, 38400)
    print ser.name
    print ser.isOpen()
    x = ser.write('hello')
    ser.close()
    print "Done!"
But if I execute the above I get the following error:
/dev/tnt0
True
Traceback (most recent call last):
  File "/home/root/nested/test.py", line 15, in <module>
    x = ser.write('hello')
  File "/usr/local/lib/python2.7/dist-packages/serial/serialposix.py", line 518, in write
    raise SerialException('write failed: %s' % (v,))
serial.serialutil.SerialException: write failed: [Errno 22] Invalid argument
I referred to the pyserial documentation, and according to it this should work without an issue. Please let me know what I'm doing wrong here.
TIA!
For some reason, in order to use the tty0tty module, you need to open both /dev/tnt0 and /dev/tnt1, or any of the other pairs (e.g. /dev/tnt2 and /dev/tnt3).
The code below works:
import time
import serial
def main():
    vserial0 = serial.Serial(port='/dev/tnt0', baudrate=9600, bytesize=8, parity=serial.PARITY_EVEN, stopbits=1)
    vserial1 = serial.Serial(port='/dev/tnt1', baudrate=9600, bytesize=8, parity=serial.PARITY_EVEN, stopbits=1)
    n_bytes = 0
    while n_bytes == 0:
        vserial0.write('test')
        n_bytes = vserial1.inWaiting()
        time.sleep(0.05)
    print vserial1.read(n_bytes)
if __name__ == '__main__':
    main()
The /dev/tntX devices are emulated port pairs, and to perform a successful read or write you need to open both ports of a pair.
Think of it as a pipe: if one end is closed, you will not be able to push data through.
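The pipe analogy can be made concrete with an ordinary OS pipe. This is only an illustration for Python 3 on a POSIX system, not tty0tty itself, and the exact error differs (a broken pipe here, Errno 22 in the question). write_to_pipe is a hypothetical helper.

```python
import os


def write_to_pipe(close_reader_first):
    """Write to a pipe, optionally after closing its read end."""
    read_fd, write_fd = os.pipe()
    try:
        if close_reader_first:
            os.close(read_fd)
            read_fd = None
        try:
            written = os.write(write_fd, b"hello")
            return "wrote %d bytes" % written
        except BrokenPipeError:
            # Nobody holds the other end open, so the write is rejected.
            return "write failed: BrokenPipeError"
    finally:
        os.close(write_fd)
        if read_fd is not None:
            os.close(read_fd)


if __name__ == "__main__":
    print(write_to_pipe(close_reader_first=False))
    print(write_to_pipe(close_reader_first=True))
```

With both ends open the write goes through; with the reader closed it fails, which mirrors opening only one port of a tnt pair.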

Use REMOTE_ADDR in an other file [duplicate]

This question already exists:
How can I display environment variable [duplicate]
Closed 8 years ago.
I code in python and I have a problem.
I have file1.py :
import os, sys, platform, getpass, tempfile
import webbrowser
import string
import json
import cgi, cgitb
def main( addr, name):
    os.environ["REMOTE_ADDR"] = addr
    print os.environ ["REMOTE_ADDR"]
    template = open('file2.py').read()
    tmpl = string.Template(template).substitute(
        name = name,
        addr = cgi.escape(os.environ["REMOTE_ADDR"]),
        os = user_os,
        user_name = user_login,
    )
    f = tempfile.NamedTemporaryFile(prefix='/tmp/info.html', mode='w', delete=False)
    f.write(contenu)
    f.close()
    webbrowser.open(f.name)
if __name__ == "__main__":
    addr = sys.argv[1]
    name = sys.argv[2]
    user_os = sys.platform
    sys.argv.append(user_os)
    user_login = getpass.getuser()
    sys.argv.append(user_login)
    main(addr, name)
In file2.py I have:
<form name="sD" method="get" action="${addr}">
But I get this error. I have tried to resolve it, but I don't know how to do that :(
Traceback (most recent call last):
  File "./file1.py", line 47, in <module>
    main(addr, name)
  File "./file1.py", line 22, in main
    addr = cgi.escape(os.environ["REMOTE_ADDR"])
  File "/usr/lib/python2.6/UserDict.py", line 22, in __getitem__
    raise KeyError(key)
KeyError: 'REMOTE_ADDR'
My problem is that I don't know how to pass an addr variable on the command line and then get that IP address back in a URL when I click the OK button.
Please help me :(
You have multiple problems with your code.
First, as mentioned in your previous question:
you dont (I repeat: you dont) want the IP of the client as the url for
your form's action
What, exactly, do you think this line of code is going to do?
<form name="sD" method="get" action="${addr}">
It will attempt to send the form to your end user's IP address. This will fail because:
1. They likely don't have a web server running.
2. Even if they do, they likely don't have a script built to handle your form.
You should be submitting the form to a page you control so that you can process it.
As for your missing key error, you don't have the environment variable set. You can set it in a few ways:
1. From outside of your Python script, use this command: set REMOTE_ADDR=<value> (this is the Windows cmd syntax; in a Unix shell such as bash, use export REMOTE_ADDR=<value>). Replace <value> with an appropriate value.
2. From within your Python script, use this code (remember to import os):
import os
os.environ["REMOTE_ADDR"] = "value"
Again, value should be an appropriate value.
A very simple example of what you want:
import os, sys
def main( addr, name):
    os.environ["REMOTE_ADDR"] = addr
    print os.environ["REMOTE_ADDR"]
if __name__ == "__main__":
    addr = sys.argv[1]
    name = sys.argv[2]
    main(addr, name)
This outputs:
>python test.py "address" "name"
address
>python test.py "http://www.google.com" "name"
http://www.google.com
Finally, as mentioned in your previous question:
you dont (I repeat: you dont) want the IP of the client as the url for
your form's action
From your shell (i.e. the command line; set is the Windows cmd syntax, in a Unix shell use export instead):
$> set REMOTE_ADDR=<some url>
$> python
>>> import os
>>> print os.environ['REMOTE_ADDR']
<some url>
If you define it inside the Python instance, it is only available to that instance (and to any child processes it starts).
By putting it in the environment before starting Python, it is "globally" available to every module.
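The scoping described above can be checked with a short sketch (Python 3). read_in_child is a hypothetical helper and 203.0.113.7 is a documentation-range placeholder address. It also uses os.environ.get with a default, which avoids the KeyError from the question when the variable is missing.

```python
import os
import subprocess
import sys


def read_in_child(name, default="<unset>"):
    """Start a fresh Python process and return what it sees for name."""
    code = "import os, sys; sys.stdout.write(os.environ.get(%r, %r))" % (name, default)
    return subprocess.check_output([sys.executable, "-c", code]).decode()


if __name__ == "__main__":
    # Set in this process: children started afterwards inherit the value.
    os.environ["REMOTE_ADDR"] = "203.0.113.7"
    print(read_in_child("REMOTE_ADDR"))

    # Removed again: the child falls back to the default instead of failing.
    del os.environ["REMOTE_ADDR"]
    print(read_in_child("REMOTE_ADDR"))
```

Neither change affects the shell that launched the script; a process can only modify its own environment and what its children inherit.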
