I've been trying to load raw cookies (not JSON) from a txt file instead of declaring them directly inside a variable, but with no success. Here's my code:
cookies.txt
{"PHPSESSID": "ibd1biktq4tfm3k4j790juf19d", "security": "impossible"}
My Python script:
file = open("cookies.txt", "r")
file = file.readlines()
str1 = " "
cookies = str1.join(file) # for converting list into string
requests.get("http://localhost/dvwa", cookies=cookies)
Output: TypeError: string indices must be integers
Also, if I do print(cookies) it outputs {"PHPSESSID": "ibd1biktq4tfm3k4j790juf19d", "security": "impossible"}, and declaring that same output directly into the variable cookies works...
Can anyone please clarify what I am doing wrong?
Try parsing the string into a Python dictionary using ast.literal_eval:

from ast import literal_eval

import requests

with open("cookies.txt", "r") as f_in:
    cookies = literal_eval(f_in.read())

requests.get("http://localhost/dvwa", cookies=cookies)
I am getting a JSON file from a curl request and I want to read a specific value from it.
Suppose that I have a JSON file, like the following one. How can I insert the "result_count" value into a variable?
Currently, after getting the response from curl, I am writing the JSON objects into a txt file like this:
json_response = connect_to_endpoint(url, headers)
f.write(json.dumps(json_response, indent=4, sort_keys=True))
Your json_response isn't JSON content (JSON is a formatted string) but a Python dict, so you can access it using its keys:
res_count = json_response['meta']['result_count']
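For illustration, a minimal sketch of that flow; the response dict here is hypothetical, shaped the way the key access above assumes:

import json

json_response = {"meta": {"result_count": 100}}  # hypothetical response

res_count = json_response['meta']['result_count']
print(res_count)  # 100

# the full response can still be written to a file for later use
with open("response.json", "w") as f:
    json.dump(json_response, f, indent=4, sort_keys=True)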
Use the json module from the Python standard library.
data itself is just a Python dictionary, and can be accessed as such.

import json

with open('path/to/file/filename.json') as f:
    data = json.load(f)

result_count = data['meta']['result_count']
You can parse a JSON string using the json.loads() method of the json module:

import json

response = connect_to_endpoint(url, headers)
json_response = json.loads(response)

After that you can extract an element by specifying its key in brackets:

result_count = json_response['meta']['result_count']
If my JSON file is huge (it contains many dictionaries and lists inside the dictionary) and it is enclosed in double quotes, how can I proceed? What is deserialization, and how do I use it?
Use the json module.
If you have the JSON in a file, you can use:

import json

with open("json_data.json", "r") as data:
    print(json.load(data))

OR

with open("json_data.json", "r") as data:
    print(json.loads(data.read()))
If you have the JSON in a variable, you can use:
jsonData = '{}'
jsonVal = json.loads(jsonData)
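For illustration, with a small inline document (the keys and values here are made up):

import json

jsonData = '{"name": "sample", "count": 3}'
jsonVal = json.loads(jsonData)  # parse the string into a dict
print(jsonVal["count"])  # 3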
There is a module called json in the Python standard library, which you can use to serialize and deserialize a dictionary. Serializing turns a dictionary into JSON text; deserializing turns JSON text back into a dictionary.
If you want to serialize, use the following:

import json

with open("huge_json_file.json", "w") as f:
    json.dump(json_dict, f)

If you want to deserialize, use the following:

with open("huge_json_file.json", "r") as f:
    json_dict = json.load(f)
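If the file is too large to hold comfortably in memory, a streaming parser avoids loading it all at once. A sketch using the third-party ijson package (pip install ijson); the "records.item" path is hypothetical and depends on your actual structure:

import ijson  # third-party streaming JSON parser

with open("huge_json_file.json", "rb") as f:
    # yield the objects under a top-level "records" array one at a time,
    # without reading the whole file into memory
    for record in ijson.items(f, "records.item"):
        print(record)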
I am trying to download a file from GitHub (a raw file) and then run it as a .sql file.
import snowflake.connector
from codecs import open
import logging
import requests
from os import getcwd
import os
import sys
# logging
logging.basicConfig(
    filename='C:/Users/abc/Documents/Test.log',
    level=logging.INFO
)
url = "https://github.com/raw/abc/master/file_name?token=Anvn3lJXDks5ciVaPwA%3D%3D"
directory = getcwd()
filename = os.path.join(getcwd(),'VIEWS.SQL')
r = requests.get(url)
filename.decode("utf-8")
f = open(filename,'w')
f.write(str(r.content))
with open(filename,'r') as theFile, open(filename,'w') as outFile:
    data = theFile.read().split('\n')
    data = theFile.read().replace('\n','')
    data = theFile.read().replace("b'","")
    data = theFile.read()
    outFile.write(data)
However I get this error
syntax error line 1 at position 0 unexpected 'b'
My converted SQL file has a b at the beginning and a bunch of newline \n characters in the file. Also, the entire output file is wrapped in single quotes ('text'). Can anyone help me get rid of these? It looks like replace isn't working.
OS: Windows
Python Version: 3.7.0
You introduced a b'.. prefix by converting the response.content bytes value to a string with str():
>>> import requests
>>> r = requests.get("https://github.com/raw/abc/master/file_name?token=Anvn3lJXDks5ciVaPwA%3D%3D")
>>> r.content
b'Not Found'
>>> str(r.content)
"b'Not Found'"
Of course, the specific dummy URL you gave in your question produces a 404 Not Found response, hence the Not Found content of the response body:
>>> r.status_code
404
so the contents in this demonstration are not actually all that useful. However, even for your real URL you probably want to test for a 200 status code before moving to write the data to a file!
What is going wrong in the above is that str(bytesvalue) converts a bytes object to its representation. You'd normally want to decode a bytes value with a text codec, using the bytes.decode() method. But because you are writing the data to a file here, you should instead just open the file in binary mode and write the bytes object without decoding:
r = requests.get(url)
if r.status_code == 200:
    with open(filename, 'wb') as f:
        f.write(r.content)
The 'wb' mode opens the file for writing in binary mode. Writing binary content to a binary file is the most efficient; decoding it first then writing to a text file requires that it is encoded again. Better to avoid doing double work.
As a side note: there is no need to join a local filename with getcwd(); relative paths always end up in the current working directory, and otherwise it's better to use os.path.abspath(filename).
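For illustration (my example, not part of the original code), the two equivalent spellings would be:

import os

filename = 'VIEWS.SQL'                   # relative paths resolve against the cwd
filename = os.path.abspath('VIEWS.SQL')  # or make the absolute path explicit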
You could also trust that GitHub sets the correct character set in the Content-Type header and have requests decode the value to str for you, in the form of the response.text attribute:
r = requests.get(url)
if r.status_code == 200:
    with open(filename, 'w') as f:
        f.write(r.text)
but again, that's really doing extra work for nothing, first decoding the binary content from the request, then encoding again when writing to a text file.
Finally, for larger file responses it is better to stream the data and copy it directly to a file. The shutil.copyfileobj() function can take a raw response fileobject directly, provided you enable transparent transport decompression:
import shutil

r = requests.get(url, stream=True)
if r.status_code == 200:
    with open(filename, 'wb') as f:
        # enable transparent transport decompression handling
        r.raw.decode_content = True
        shutil.copyfileobj(r.raw, f)
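If you prefer an exception over an explicit status check, requests also provides raise_for_status(); a minimal variant of the same streaming download, reusing the url and filename from above:

import shutil
import requests

r = requests.get(url, stream=True)
r.raise_for_status()  # raises requests.HTTPError on 4xx/5xx responses
with open(filename, 'wb') as f:
    r.raw.decode_content = True  # transparent transport decompression
    shutil.copyfileobj(r.raw, f)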
Depending on your version of Python/OS it could be as simple as writing the file in binary mode (and, if any artifacts are still there, doing the replaces after reading the text back):

# write the raw bytes first, then post-process if artifacts remain
with open(filename, 'wb') as outFile:
    outFile.write(r.content)
with open(filename, 'r') as theFile:
    data = theFile.read().replace("b'", "")
with open(filename, 'w') as outFile:
    outFile.write(data)
It would help to have a copy of the file and the line the error is occurring on.
I got some data from an API with Python, and I'm trying to print it to a file. My understanding was that the indent argument lets you pretty print. Here's my code:
import urllib2, json

APIKEY_VALUE = "APIKEY"
APIKEY = "?hapikey=" + APIKEY_VALUE
HS_API_URL = "http://api.hubapi.com"

def getInfo():
    xulr = "/engagements/v1/engagements/paged"
    url = HS_API_URL + xulr + APIKEY + params
    response = urllib2.urlopen(url).read()
    with open("hubdataJS.json", "w") as outfile:
        json.dump(response, outfile, sort_keys=True, indent=4, ensure_ascii=False)

getInfo()
What I expected hubdataJS.json to look like when I opened it in Sublime Text was some JSON formatted like this:
{
a: some data
b: [
some list of data,
more data
]
c: some other data
}
What I got instead was all the data on one line, in quotes (I thought dumps was for outputting a string), with lots of \s, \rs, and \ns.
I'm confused about what I'm doing wrong.
In your code, response is a bytestring that contains the data serialized in the JSON format. When you pass it to json.dump, you're serializing that string to JSON again. You end up with a JSON-formatted file containing a string, and inside that string is another JSON document: JSON inside JSON.
To solve that, you have to decode (deserialize) the bytestring you got from the network before re-encoding it to JSON for the file:
response = json.load(urllib2.urlopen(url))
That will convert the serialized data from the web into a real Python object.
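Putting it together, a sketch of the corrected getInfo() (reusing the URL variables from the question unchanged):

import urllib2, json

def getInfo():
    xulr = "/engagements/v1/engagements/paged"
    url = HS_API_URL + xulr + APIKEY + params
    data = json.load(urllib2.urlopen(url))  # deserialize the response once
    with open("hubdataJS.json", "w") as outfile:
        # now a real dict is dumped, so indent=4 pretty-prints as expected
        json.dump(data, outfile, sort_keys=True, indent=4, ensure_ascii=False)

getInfo()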
I am getting a JSON file in the following format:
// 20170407
// http://info.employeeportal.org
{
"EmployeeDataList": [
{
"EmployeeCode": "200005ABH9",
"Skill": CT70,
"Sales": 0.0,
"LostSales": 1010.4
}
]
}
I need to remove the extra comment lines present in the file.
I tried the following code:
import json
import commentjson

with open('EmployeeDataList.json') as json_data:
    employee_data = json.load(json_data)
    '''employee_data = json.dump(json.load(json_data))'''
    '''employee_data = commentjson.load(json_data)'''
    print(employee_data)
I'm still not able to remove the comments from the file and bring the JSON into the correct format.
I don't see where things are going wrong. Any direction in this regard is highly appreciated. Thanks in advance.
You're not using commentjson correctly. It has the same interface as the json module:
import commentjson

with open('EmployeeDataList.json', 'r') as handle:
    employee_data = commentjson.load(handle)

print(employee_data)
Although in this case, your comments are simple enough that you probably don't need to install an extra module to remove them:
import json

with open('EmployeeDataList.json', 'r') as handle:
    fixed_json = ''.join(line for line in handle if not line.startswith('//'))

employee_data = json.loads(fixed_json)
print(employee_data)
Note the difference between the two snippets: json.loads is used instead of json.load, since you're parsing a string instead of a file object.
Try JSON-minify:
JSON-minify minifies blocks of JSON-like content into valid JSON by removing all whitespace and JS-style comments (single-line // and multiline /* .. */).
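A sketch of how that would look here, assuming the json_minify import name from the PyPI package (pip install JSON-minify):

import json
from json_minify import json_minify

with open('EmployeeDataList.json', 'r') as handle:
    # strip the // comments, then parse the remaining text as normal JSON
    employee_data = json.loads(json_minify(handle.read()))

print(employee_data)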
I usually read the JSON as a normal file, delete the comments and then parse it as a JSON string. It can be done in one line with the following snippet:
with open(path,'r') as f: jsonDict = json.loads('\n'.join(row for row in f if not row.lstrip().startswith("//")))
IMHO it is very convenient because it does not need commentjson or any other non-standard library.
Well, that's not valid JSON, so just open it like you would a text document and skip everything from // to \n:

with open("EmployeeDataList.json", "r") as rf:
    with open("output.json", "w") as wf:
        for line in rf.readlines():
            if line[0:2] == "//":
                continue
            wf.write(line)
Your file is parsable using HOCON.
pip install pyhocon
>>> from pyhocon import ConfigFactory
>>> conf = ConfigFactory.parse_file('data.txt')
>>> conf
ConfigTree([('EmployeeDataList',
[ConfigTree([('EmployeeCode', '200005ABH9'),
('Skill', 'CT70'),
('Sales', 0.0),
('LostSales', 1010.4)])])])
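The resulting ConfigTree supports dict-style access, so (to illustrate) the nested values can be read directly:

>>> conf['EmployeeDataList'][0]['Skill']
'CT70'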
If it is the same number of lines every time, you can just slice the leading comment lines off:

with open('EmployeeDataList.NOTjson', "r") as fh:
    rawText = fh.read()
# drop everything up to and including the second newline (the two // lines)
json_data = rawText.split("\n", 2)[2]

This way json_data is now the string of text without the two leading comment lines.