I tried to get distinct_id with request.COOKIES.get('distinct_id'). However, Mixpanel stores the data in a form I cannot extract directly. Does anyone know why the value contains all these %22%3A%20%22 sequences, and how to extract distinct_id?
print(request.COOKIES):
{
'djdt': 'hide',
'cookie_bar': '1',
'mp_1384c4d0e46aaaaad007e3d8b5d6eda_mixpanel': '%7B%22distinct_id%22%3A%20%22165edf326870-00fc0e7eb72ed3-34677908-fa000-165e40c268947b%22%2C%22%24initial_referrer%22%3A%20%22%24direct%22%2C%22%24initial_referring_domain%22%3A%20%22%24direct%22%2C%22__alias%22%3A%20%22maz%2B1024%40gmail.com%22%7D',
'csrftoken': 'nvWzsrp3t6Sivkrsyu0gejjjjjiTfc36ZfkH7U7fgHaI40EF',
'sessionid': '7bkel6r27ebd55x262cv9lzv61gzoemw'
}
Check this code. You can run it as-is because it uses the example you shared. The %22, %3A and %20 sequences are the percent-encoded forms of the double quote, colon and space: the cookie value is URL-encoded JSON. First, unquote the data in the Mixpanel cookie value; I used the suffix of the cookie key to find it. After unquoting, load the JSON to get back a dictionary.
The code here prints all the keys in the dictionary, but you can easily get the distinct_id using mixpanel_dict.get('distinct_id').
Try it.
from urllib import parse
import json
cookie = {'djdt': 'hide',
'cookie_bar': '1',
'mp_1384c4d0e46aaaaad007e3d8b5d6eda_mixpanel': '%7B%22distinct_id%22%3A%20%22165edf326870-00fc0e7eb72ed3-34677908-fa000-165e40c268947b%22%2C%22%24initial_referrer%22%3A%20%22%24direct%22%2C%22%24initial_referring_domain%22%3A%20%22%24direct%22%2C%22__alias%22%3A%20%22maz%2B1024%40gmail.com%22%7D',
'csrftoken': 'nvWzsrp3t6Sivkrsyu0gejjjjjiTfc36ZfkH7U7fgHaI40EF',
'sessionid': '7bkel6r27ebd55x262cv9lzv61gzoemw'
}
def get_value_for_mixpanel(cookie):
    mixpanel_dict = {}
    for key in cookie.keys():
        if '_mixpanel' in key:
            value = parse.unquote(cookie.get(key))
            mixpanel_dict = json.loads(value)
    return mixpanel_dict
if __name__ == "__main__":
    mixpanel_dict = get_value_for_mixpanel(cookie)  # type: dict
    for key,value in mixpanel_dict.items():
        print("%s:%s" %(key, value))
Result
distinct_id:165edf326870-00fc0e7eb72ed3-34677908-fa000-165e40c268947b
$initial_referrer:$direct
$initial_referring_domain:$direct
__alias:maz+1024@gmail.com
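For use inside a Django view, the same two steps can be folded into a small helper that returns only the distinct_id. This is a minimal sketch, not Mixpanel's own API: the function name get_mixpanel_distinct_id is hypothetical, the sample cookie is a shortened stand-in, and it assumes the cookie key starts with mp_ and ends with _mixpanel, as in the example above.

```python
import json
from urllib.parse import unquote


def get_mixpanel_distinct_id(cookies):
    """Return distinct_id from the first Mixpanel cookie found, or None."""
    for key, raw_value in cookies.items():
        # Assumption: Mixpanel cookie names look like mp_<token>_mixpanel.
        if key.startswith('mp_') and key.endswith('_mixpanel'):
            try:
                payload = json.loads(unquote(raw_value))
            except ValueError:
                # Malformed cookie value: skip it rather than fail the request.
                continue
            return payload.get('distinct_id')
    return None


# Shortened, made-up stand-in for request.COOKIES.
sample_cookies = {
    'cookie_bar': '1',
    'mp_abc123_mixpanel': '%7B%22distinct_id%22%3A%20%22165edf326870%22%7D',
}
print(get_mixpanel_distinct_id(sample_cookies))
```

In a view this would be called as get_mixpanel_distinct_id(request.COOKIES).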
Try unquote(). In Python 3 it lives in urllib.parse (in Python 2 it was urllib.unquote):
>>> s = '/path/to/my/handler/?action=query&id=112&type=vca&info=ch%3D0%26type%3Devent%26ev46[sts%3Dbegin'
>>> from urllib.parse import unquote
>>> unquote(s)
'/path/to/my/handler/?action=query&id=112&type=vca&info=ch=0&type=event&ev46[sts=begin'
Credits : https://stackoverflow.com/a/11215316/5647272
Related
I have a problem: I print a message response from a website (a JSON response), and the response I get is this.
Here is my model with fake data:
{"token": "MTAxOTAwNjM4NjEyMzg0OTkwMQ.8hkyLV.n0ir2UA4qFE5pXen9YnPtFzgn4xP8tHmVmmkrl", "user_settings": {"locale": "en-US", "theme": "dark"}, "user_id": "101900638614857883"}
If I only want the value of "token" (MTAxOTAwNjM4NjEyMzg0OTkwMQ.8hkyLV.n0ir2UA4qFE5pXen9YnPtFzgn4xP8tHmVmmkrl) and I want to store it in a txt file, is there a good way to do it?
Thank you, guys!
I tried print(r.text('token')), but it did not work; it only works for printing the data as a whole (like: Category: {"token": 'daefafa', "user-id": 'er121231231', more}).
In Python, a parsed JSON object is represented as a dictionary.
To filter it, use a dictionary comprehension:
tokenData = {key: val for key,val in data_json.items() if key == 'token'}
Full Code Snippet :
from urllib.request import urlopen
import json
url = "enter-your-url"
response = urlopen(url)
data_json = json.loads(response.read())
print(type(data_json)) # <class 'dict'>
# use a dict comprehension to keep only the 'token' entry
jsonToken = {key: val for key,val in data_json.items() if key == 'token'}
strToken = json.dumps(jsonToken)
# Only strings can be written to a text file
with open('data.txt','w') as file:
    file.write(strToken)
You need to parse the JSON into a dictionary using json.loads(). Like this:
import json
# ...
# request-getting code
# ...
data = json.loads(r.text)
print(data['token'])
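To also store the token in a text file, as the question asks, the parsed value can be written out directly. A minimal sketch, assuming the response body has the shape of the sample above: response_text stands in for r.text, and both the file name token.txt and the shortened token value are placeholders.

```python
import json

# Stand-in for r.text; the token value here is a shortened placeholder.
response_text = '{"token": "example-token", "user_id": "101900638614857883"}'

data = json.loads(response_text)
token = data['token']  # raises KeyError if the response has no "token" field

# Write only the token string, without the surrounding JSON.
with open('token.txt', 'w') as token_file:
    token_file.write(token)
```

With the requests library, r.json() performs the same parsing as json.loads(r.text).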
I am trying to pass data back to a URL fetch request. We are using Python 3.x.
user_type_data = {'user_type': 'admin',
                  'user_name': 'myname',
                  'user_check_flag': 'yes'}
return_data = json.dumps({
    l_user_type_data : user_type_data
},default = date_handler)
return return_data
When we do this for a dict, I get the following error: TypeError("unhashable type: 'dict'"). According to this, we cannot use a dict as a key because it is not hashable.
How do we fix this?
A valid dictionary key string should be enclosed in single or double quotes.
a_dict = {'key': 'value'} # Valid
b_dict = {"key": "value"} # Valid
Or, if you wish to use a string stored in a variable as the dictionary key, you can do this instead:
st = "key"
a_dict = dict()
a_dict[st] = 'value'
Since json.dumps requires a valid Python dictionary, you may need to rearrange your code.
If l_user_type_data is a variable that contains a string, you should do:
temp_dict = dict()
temp_dict[l_user_type_data] = user_type_data
result = json.dumps(temp_dict, default = date_handler)
Otherwise, if l_user_type_data is meant to be the key string itself, simply enclose it in either single or double quotes.
return_data = json.dumps({
    "l_user_type_data" : user_type_data
},default = date_handler)
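The error message suggests that l_user_type_data held a dict when it was used as a key. A small sketch of that failure and of the string-key fix follows. The key name 'user_type_data' is an assumption for illustration, and date_handler is left out because its definition is not shown in the question.

```python
import json

user_type_data = {'user_type': 'admin',
                  'user_name': 'myname',
                  'user_check_flag': 'yes'}

# Using a dict as a dictionary key reproduces the error from the question.
try:
    broken = {user_type_data: user_type_data}
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Keys must be hashable; a plain string works.
return_data = json.dumps({'user_type_data': user_type_data})
print(return_data)
```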
The following code is giving me:
Runtime.MarshalError: Unable to marshal response: {'Yes'} is not JSON serializable
from calendar import monthrange
def time_remaining_less_than_fourteen(year, month, day):
    a_year = int(input['year'])
    b_month = int(input['month'])
    c_day = int(input['day'])
    days_in_month = monthrange(int(a_year), int(b_month))[1]
    time_remaining = ""
    if (days_in_month - c_day) < 14:
        time_remaining = "No"
        return time_remaining
    else:
        time_remaining = "Yes"
        return time_remaining
output = {time_remaining_less_than_fourteen((input['year']), (input['month']), (input['day']))}
#print(output)
When I remove {...} it then throws: 'unicode' object has no attribute 'copy'
I encountered this issue when working with the Lambda transformation blueprint kinesis-firehose-process-record-python for Kinesis Firehose, which led me here. So I will post a solution for anyone who also finds this question when having issues with the Lambda.
The blueprint is:
from __future__ import print_function
import base64
print('Loading function')
def lambda_handler(event, context):
    output = []
    for record in event['records']:
        print(record['recordId'])
        payload = base64.b64decode(record['data'])
        # Do custom processing on the payload here
        output_record = {
            'recordId': record['recordId'],
            'result': 'Ok',
            'data': base64.b64encode(payload)
        }
        output.append(output_record)
    print('Successfully processed {} records.'.format(len(event['records'])))
    return {'records': output}
The thing to note is that the Firehose Lambda blueprints for Python provided by AWS are for Python 2.7, and they don't work with Python 3. The reason is that in Python 3, text strings (str) and byte strings (bytes) are different types.
The key change to make it work with lambda powered by Python 3.x runtime was:
changing
'data': base64.b64encode(payload)
into
'data': base64.b64encode(payload).decode("utf-8")
Otherwise, the Lambda fails because the bytes object returned by base64.b64encode cannot be serialized to JSON.
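The difference can be reproduced outside Lambda with the standard library alone. A minimal sketch, using a made-up payload rather than a real Firehose record:

```python
import base64
import json

payload = base64.b64decode('aGVsbG8=')   # made-up record data
encoded = base64.b64encode(payload)      # a bytes object in Python 3

# bytes cannot be serialized by json.dumps.
try:
    json.dumps({'data': encoded})
except TypeError as exc:
    print(exc)

# Decoding to str first makes the record serializable.
record = {'data': encoded.decode('utf-8')}
serialized = json.dumps(record)
print(serialized)
```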
David here, from the Zapier Platform team.
Per the docs:
output: A dictionary or list of dictionaries that will be the "return value" of this code. You can explicitly return early if you like. This must be JSON serializable!
In your case, output is a set:
>>> import json
>>> output = {'Yes'}
>>> type(output)
<class 'set'>
>>> json.dumps(output)
TypeError: Object of type set is not JSON serializable
To be serializable, you need a dict (which has keys and values). Change your last line to include a key and it'll work like you expect:
#         \ here /
output = {'result': time_remaining_less_than_fourteen((input['year']), (input['month']), (input['day']))}
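Putting the pieces together, here is a hedged sketch of the whole step that can run outside the platform. input_data is a stand-in for the input mapping the platform supplies, with made-up values, and the function is rewritten to use its own parameters instead of reading the global mapping; the "No"/"Yes" logic is kept as in the question.

```python
import json
from calendar import monthrange


def time_remaining_less_than_fourteen(year, month, day):
    """Return "No" when fewer than 14 days remain in the month, else "Yes"."""
    days_in_month = monthrange(int(year), int(month))[1]
    return "No" if (days_in_month - int(day)) < 14 else "Yes"


# Stand-in for the platform-provided input mapping (values arrive as strings).
input_data = {'year': '2024', 'month': '2', 'day': '20'}

# A dict with a key is JSON serializable; a bare set is not.
output = {'result': time_remaining_less_than_fourteen(
    input_data['year'], input_data['month'], input_data['day'])}
print(json.dumps(output))
```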
I am getting JIRA data using the following Python code.
How do I store the response for more than one key (my example shows only one key, but in general I get a lot of data) and print only the values corresponding to total, key, customfield_12830, and summary?
import requests
import json
import logging
import datetime
import base64
import urllib
serverURL = 'https://jira-stability-tools.company.com/jira'
user = 'username'
password = 'password'
query = 'project = PROJECTNAME AND "Build Info" ~ BUILDNAME AND assignee=ASSIGNEENAME'
jql = '/rest/api/2/search?jql=%s' % urllib.quote(query)
response = requests.get(serverURL + jql,verify=False,auth=(user, password))
print response.json()
response.json() OUTPUT:-
http://pastebin.com/h8R4QMgB
From the link you pasted to pastebin and the JSON I saw there, the response holds your issues as a list, each containing key, fields (which holds the custom fields), self, id, and expand.
You can simply iterate through this response and extract the values for the keys you want, like this:
data = response.json()
issues = data.get('issues', list())
x = list()
for issue in issues:
    temp = {
        'key': issue['key'],
        'customfield': issue['fields']['customfield_12830'],
        'total': issue['fields']['progress']['total']
    }
    x.append(temp)
print(x)
x is a list of dictionaries containing the data for the fields you mentioned. Let me know if I have been unclear somewhere or if what I have given is not what you are looking for.
PS: It is always advisable to use dict.get('keyname', None) to get values, as you can supply a default value if the key is not found. I didn't do it in this solution because I just wanted to show the approach.
Update: In the comments you (OP) mentioned that it gives an AttributeError. Try this code:
data = response.json()
issues = data.get('issues', list())
x = list()
for issue in issues:
    temp = dict()
    key = issue.get('key', None)
    if key:
        temp['key'] = key
    fields = issue.get('fields', None)
    if fields:
        customfield = fields.get('customfield_12830', None)
        temp['customfield'] = customfield
        progress = fields.get('progress', None)
        if progress:
            total = progress.get('total', None)
            temp['total'] = total
    x.append(temp)
print(x)
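The repeated get-then-check pattern above can also be factored into a small helper. This is only a sketch: dig is a hypothetical helper name, not part of requests or the JIRA API, and the sample data is made up to mirror the fields used above.

```python
def dig(mapping, *keys, default=None):
    """Follow keys through nested dicts; return default if any step is missing."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Made-up data shaped like the fields used above.
sample = {
    'issues': [
        {'key': 'PROJ-1',
         'fields': {'customfield_12830': 'build-42',
                    'summary': 'Crash on start',
                    'progress': {'total': 3600}}},
        {'key': 'PROJ-2', 'fields': None},
    ]
}

rows = []
for issue in sample.get('issues', []):
    rows.append({
        'key': dig(issue, 'key'),
        'customfield': dig(issue, 'fields', 'customfield_12830'),
        'summary': dig(issue, 'fields', 'summary'),
        'total': dig(issue, 'fields', 'progress', 'total'),
    })
print(rows)
```

An issue whose fields entry is missing or null yields None for the nested values instead of raising an exception.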
I have a Python script which processes a .txt file which contains report usage information. I'd like to find a way to cleanly print the attributes of an object using pprint's pprint(vars(object)) function.
The script reads the file and creates instances of a Report class. Here's the class.
class Report(object):
    def __init__(self, line, headers):
        self.date_added=get_column_by_header(line,headers,"Date Added")
        self.user=get_column_by_header(line,headers,"Login ID")
        self.report=get_column_by_header(line,headers,"Search/Report Description")
        self.price=get_column_by_header(line,headers,"Price")
        self.retail_price=get_column_by_header(line,headers,"Retail Price")
    def __str__(self):
        from pprint import pprint
        return str(pprint(vars(self)))
I'd like to be able to print instances of Report cleanly a-la-pprint.
for i,line in enumerate(open(path+file_1,'r')):
    line=line.strip().split("|")
    if i==0:
        headers=line
    if i==1:
        record=Report(line,headers)
        print record
When I call
print record
for a single instance of Report, this is what I get in the shell.
{'date_added': '1/3/2012 14:06',
'price': '0',
'report': 'some_report',
'retail_price': '0.25',
'user': 'some_username'}
None
My question is two-fold.
First, is this a good / desired way to print an object's attributes cleanly? Is there a better way to do this with or without pprint?
Second, why does
None
print to the shell at the end? I'm confused where that's coming from.
Thanks for any tips.
Dan's solution is just wrong, and Ismail's is incomplete.
When pprint formats an object nested inside another object or container, __str__() is not called; __repr__() is called.
__repr__() should return a string, as pformat does.
pprint normally indents only 1 character and tries to save lines. If you are trying to figure out structure, set the width low and the indent high.
Here is an example
class S:
    def __repr__(self):
        from pprint import pformat
        return pformat(vars(self), indent=4, width=1)
a = S()
a.b = 'bee'
a.c = {'cats': ['blacky', 'tiger'], 'dogs': ['rex', 'king'] }
a.d = S()
a.d.more_c = a.c
print(a)
This prints
{ 'b': 'bee',
'c': { 'cats': [ 'blacky',
'tiger'],
'dogs': [ 'rex',
'king']},
'd': { 'more_c': { 'cats': [ 'blacky',
'tiger'],
'dogs': [ 'rex',
'king']}}}
Which is not perfect, but passable.
pprint.pprint doesn't return a string; it actually does the printing (by default to stdout, but you can specify an output stream). So when you write print record, record.__str__() gets called, which calls pprint, which returns None. str(None) is 'None', and that gets printed, which is why you see None.
You should use pprint.pformat instead. (Alternatively, you can pass a StringIO instance to pprint.)
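Both suggestions can be sketched as follows. The Report class here is a simplified stand-in for the one in the question, with only two attributes and made-up values.

```python
from io import StringIO
from pprint import pformat, pprint


class Report(object):
    def __init__(self, user, price):
        self.user = user
        self.price = price

    def __str__(self):
        # pformat returns the formatted text instead of printing it.
        return pformat(vars(self))


record = Report('some_username', '0')
print(record)

# Alternative: let pprint write into an in-memory stream.
buffer = StringIO()
pprint(vars(record), stream=buffer)
text = buffer.getvalue()
```

Because __str__ now returns a real string, no stray None appears after the formatted attributes.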
pprint is just another form of print. When you call pprint(vars(self)), it prints the attributes to stdout and returns None, because it has no return value. When you convert that result to a string, None (returned by pprint) becomes the string 'None', which is then printed by the initial print statement. I would suggest changing your print to pprint, or redefining print as pprint if that is all you use it for.
def __str__(self):
    return str(vars(self))
from pprint import pprint
for i,line in enumerate(open(path+file_1,'r')):
    line = line.strip().split("|")
    if i == 0:
        headers = line
    if i == 1:
        record = Report(line,headers)
        # pprint(record) would only show the default repr, so pass the attribute dict
        pprint(vars(record))
One alternative is to use a formatted output:
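def __str__(self):
    # Name each attribute explicitly so the labels never depend on dict ordering
    return "date added: %s\nPrice: %s\nReport: %s\nretail price: %s\nuser: %s" % (self.date_added, self.price, self.report, self.retail_price, self.user)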
Hope this helped
I think beeprint is what you need.
Just pip install beeprint and change your code to:
def __str__(self):
    from beeprint import pp
    return pp(self, output=False)
For pretty-printing objects which contain other objects, etc. pprint is not enough. Try IPython's lib.pretty, which is based on a Ruby module.
from IPython.lib.pretty import pprint
pprint(complex_object)
@Anyany Pan's way is the best.
Here I share a real case from working with Azure resources.
With AWS resources, I can use pprint to print the resource details easily, but it doesn't work with Azure resources, because they are different types.
from azure.identity import AzureCliCredential
from azure.mgmt.compute import ComputeManagementClient
#from pprint import pprint
from beeprint import pp
import os
# Acquire a credential object using CLI-based authentication.
credential = AzureCliCredential()
# Retrieve subscription ID from environment variable.
subscription_id = os.environ["AZURE_SUBSCRIPTION_ID"]
compute_client = ComputeManagementClient(credential, subscription_id)
vm_list = compute_client.virtual_machines.list_all()
for vm in vm_list:
    print(type(vm))
    # pprint(vm)  # doesn't work for Azure resource
    pp(vm)
Output from beeprint, for reference:
<class 'azure.mgmt.compute.v2020_12_01.models._models_py3.VirtualMachine'>
instance(VirtualMachine):
_attribute_map: {
'additional_capabilities': {
'key': 'properties.additionalCapabilities',
'type': 'AdditionalCapabilities',
},
'availability_set': {
'key': 'properties.availabilitySet',
'type': 'SubResource',
},
'billing_profile': {
'key': 'properties.billingProfile',
...
Output from pprint:
<class 'azure.mgmt.compute.v2020_12_01.models._models_py3.VirtualMachine'>
<azure.mgmt.compute.v2020_12_01.models._models_py3.VirtualMachine object at 0x1047cf4f0>
<class 'azure.mgmt.compute.v2020_12_01.models._models_py3.VirtualMachine'>
<azure.mgmt.compute.v2020_12_01.models._models_py3.VirtualMachine object at 0x1047cf5b0>