Parse a json file and add the strings to a URL - python

How do I parse JSON output, extract only the list under data, and then append each string to a URL, for example google.com/Confidential and so on for the other strings in the list?
My JSON output, which I will call "text", is:
text = '{"success":true,"code":200,"data":["Confidential","L1","Secret","Secret123","foobar","maret1","maret2","posted","rontest"],"errs":[],"debugs":[]}'
What I want is only the list under data. So far my script gives me the entire JSON output:
json.loads(text)
print text
output = urllib.urlopen("http://google.com" % text)
print output.geturl()
print output.read()

jsonobj = json.loads(text)
print jsonobj['data']
This will print the list in the data section of your JSON.
If you want to open each item as a path after google.com, you could try this:
def processlinks(text):
    # %s is the placeholder that receives each item from the list
    output = urllib.urlopen('http://google.com/%s' % text)
    print output.geturl()
    print output.read()
map(processlinks, jsonobj['data'])
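For readers on Python 3, where urllib.urlopen no longer exists, here is a minimal sketch of the URL-building step only. It assumes the JSON arrives as a string; build_urls and base_url are names made up for this illustration, the data list is shortened, and no request is actually sent.

```python
import json
from urllib.parse import quote

# Shortened version of the JSON from the question, as a string
text = '{"success":true,"code":200,"data":["Confidential","L1","Secret"],"errs":[],"debugs":[]}'
base_url = "http://google.com/"  # hypothetical base, taken from the question


def build_urls(json_text, base):
    """Return one URL per entry of the "data" list in json_text."""
    items = json.loads(json_text)["data"]
    # quote() percent-encodes characters that are not safe in a URL path
    return [base + quote(item) for item in items]


urls = build_urls(text, base_url)
for url in urls:
    print(url)
```

Each resulting URL could then be passed to urllib.request.urlopen() if you really want to fetch it.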

info = json.loads(text)
json_text = json.dumps(info["data"])
Using json.dumps converts the Python data structure returned by json.loads back to regular JSON text.
So you could then use json_text wherever you were using text before, and it will contain only the value of the selected key, in your case "data".
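To make the difference between the two objects concrete, here is a small Python 3 sketch. The JSON is a shortened version of the one in the question, and data_list is a name introduced only for this example.

```python
import json

text = '{"success":true,"code":200,"data":["Confidential","L1"],"errs":[]}'

info = json.loads(text)            # a Python dict
data_list = info["data"]           # a Python list you can loop over
json_text = json.dumps(data_list)  # a JSON string containing only that list

print(type(data_list).__name__, data_list)
print(type(json_text).__name__, json_text)
```

Note that json_text is a string, so if the goal is to visit each item (for example to build URLs), looping over data_list directly is probably what you want.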

Perhaps something like this where result is your JSON data:
from itertools import product
base_domains = ['http://www.google.com', 'http://www.example.com']
result = {"success":True,"code":200,"data":["Confidential","L1","Secret","Secret123","foobar","maret1","maret2","posted","rontest"],"errs":[],"debugs":[]}
for path in product(base_domains, result['data']):
    print '/'.join(path) # do whatever
http://www.google.com/Confidential
http://www.google.com/L1
http://www.google.com/Secret
http://www.google.com/Secret123
http://www.google.com/foobar
http://www.google.com/maret1
http://www.google.com/maret2
http://www.google.com/posted
http://www.google.com/rontest
http://www.example.com/Confidential
http://www.example.com/L1
http://www.example.com/Secret
http://www.example.com/Secret123
http://www.example.com/foobar
http://www.example.com/maret1
http://www.example.com/maret2
http://www.example.com/posted
http://www.example.com/rontest

Related

How do I import an external text file from a website and use a list from inside it

I'm having trouble trying to make this work
import requests
import random
response = requests.get("https://cdn.discordapp.com/attachments/480168592164257792/557872162661335040/aaaaa.txt")
data = response.text
for line in data:
    print(line)
I am trying to pull a txt file from the internet, and be able to use the list inside of the text file.
Right now all it does is assume each letter is a different string(?)
response.text is a string, so if you loop over it you get one character at a time. (Read about how Python handles strings.)
In this case Python doesn't know what a "line" is. So split the data on newlines and try again:
import requests
import random
response = requests.get("https://cdn.discordapp.com/attachments/480168592164257792/557872162661335040/aaaaa.txt")
data = response.text
for line in data.split("\n"):
    print(line)
The attribute response.text is a string, so iterating over it gives you individual characters. You can split the string on whitespace (or on newlines) to get what you need (I also added a few print statements to show the steps):
import requests
response = requests.get(
    "https://cdn.discordapp.com/attachments/480168592164257792/557872162661335040/aaaaa.txt")
print('response.text type:', type(response.text))
print('response.text len:', len(response.text))
print(response.text)
print()
print('splitting by spaces:')
for i, s in enumerate(response.text.split()):
    print(i, s)
print()
print('splitting by newlines:')
for i, line in enumerate(response.text.split('\n')):
    print(i, line)
The code gives this output:
response.text type: <class 'str'>
response.text len: 21
a = ["please","work"]
splitting by spaces:
0 a
1 =
2 ["please","work"]
splitting by newlines:
0 a = ["please","work"]
bruno suggested in a comment to use str.splitlines(); this also works if the response is bytes, since bytes.splitlines() exists as well.
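If the goal is to end up with the actual list stored in the file, one possible approach is to split into lines and evaluate the right-hand side of each assignment as a literal. This is only a sketch: it assumes the file content looks like the output shown above, uses a hard-coded sample_text instead of a real download, and parse_assigned_lists is a hypothetical helper name.

```python
import ast

# Assumption: the downloaded text looks like the output shown above
sample_text = 'a = ["please","work"]\n'


def parse_assigned_lists(text):
    """Map each variable name to the list literal assigned on its line."""
    parsed = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or unrelated lines
        name, _, literal = line.partition("=")
        # literal_eval only accepts Python literals, so no code is executed
        parsed[name.strip()] = ast.literal_eval(literal.strip())
    return parsed


lists = parse_assigned_lists(sample_text)
print(lists["a"])
```

With a real download you would pass response.text instead of sample_text.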

Python removing a comma from last line within a for loop

I'd like to know how to remove the comma from the last line printed inside a for loop in Python. When I run the script it gives me the output below (after the code section). I want to remove the comma at the end of the fourth line, "{"{#MACRO}":"queue4"},". Can someone please help?
By the way, if there is a better way to construct the block, please share your ideas. I'm a beginner and would like to learn. :)
Code:
import json
import urllib
import string
Url = "http://guest:guest@localhost:55672/api/queues"
Response = urllib.urlopen(Url)
Data = Response.read()
def Qlist(Name):
    Text = ''' {{"{{#MACRO}}":"{Name}"}},'''.format(Name=Name)
    print Text
X_json = json.loads(Data)
print '''{
"data":['''
for i in X_json:
    VV = i['name']
    Qlist(VV)
print ''']
}'''
Below is the Output:
{
"data":[
{"{#MACRO}":"queue1"},
{"{#MACRO}":"queue2"},
{"{#MACRO}":"queue3"},
{"{#MACRO}":"queue4"},
]
}
Thanks so much
You can modify your loop as follows.
# Create and initialize a dictionary (associative array).
# data['data'] is an empty list.
# The variable name (data in this case) can be anything you want.
# 'data' is a key. Notice the quotation marks around it; it's not a variable.
# I used 'data' as the key because you wanted your final output to include that part.
data = {"data": []}
for i in X_json:
    # We are not calling the data dictionary here.
    # We are accessing the empty list we created inside the `data` dict (above) using the data['data'] syntax.
    # We can use the append function to add an item to a list.
    # We create a new dictionary for every `name` item found in your JSON array and
    # append that new dictionary to the data['data'] list.
    data['data'].append({"{#MACRO}": i['name']})
print(json.dumps(data))
# or print(json.dumps(data, indent=4))
You can read more about json.dumps(), lists, and dictionaries in the Python documentation.
Don't print inside Qlist - instead return a value; then you can join all returned values using comma as separator:
def Qlist(Name):
    Text = ''' {{"{{#MACRO}}":"{Name}"}}'''.format(Name=Name)
    return Text
# The parentheses let the expression continue across several lines
print ('''{
"data":[''' +
',\n'.join([ Qlist(i['name']) for i in X_json ]) +
''']
}''')
And anyway, using json.dumps is likely a better idea.
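As a rough illustration of the json.dumps route in Python 3: the sketch below uses a hard-coded queues list as a stand-in for the decoded API response (X_json in the question), and build_discovery is a name invented for this example. Because the serializer writes the separators itself, there is no trailing comma to remove.

```python
import json

# Assumption: stand-in for the decoded API response (X_json in the question)
queues = [{"name": "queue1"}, {"name": "queue2"},
          {"name": "queue3"}, {"name": "queue4"}]


def build_discovery(queue_list):
    """Wrap each queue name in a {"{#MACRO}": name} dict under a "data" key."""
    return {"data": [{"{#MACRO}": q["name"]} for q in queue_list]}


payload = json.dumps(build_discovery(queues), indent=4)
print(payload)
```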

Extract values from within strings in a list - python

I have a list in my python code with the following structure:
file_info = ['{file:C:\\samples\\123.exe, directory:C:\\}','{file:C:\\samples\\345.exe, directory:C:\\}',...]
I want to extract just the file and directory values for every value of the list and print it. With the following code, I am able to extract the directory values:
for item in file_info:
    print item.split('directory:')[1].strip('}')
But I am not able to figure out a way to extract the 'file' values. The following doesn't work:
print item.split('file:')[1].strip(', directory:C:\}')
Suggestions? If there is any better method to extract the file and directory values other than this, that would be great too. Thanks in advance.
If the format is exactly what you've provided, you'd be better off using re:
import re
file_info = ['{file:file1, directory:dir1}', '{file:file2, directory:directory2}']
pattern = re.compile(r'\w+:(\w+)')
for item in file_info:
    print re.findall(pattern, item)
or, using string replace(), strip() and split() (a bit hackish and fragile):
file_info = ['{file:file1, directory:dir1}', '{file:file2, directory:directory2}']
for item in file_info:
    item = item.strip('}{').replace('file:', '').replace('directory:', '')
    print item.split(', ')
Both code snippets print:
['file1', 'dir1']
['file2', 'directory2']
If the file_info items are just dumped JSON items (watch the double quotes), you can use json to load them into dictionaries:
import json
file_info = ['{"file":"file1", "directory":"dir1"}', '{"file":"file2", "directory":"directory2"}']
for item in file_info:
    item = json.loads(item)
    print item['file'], item['directory']
or, using literal_eval():
from ast import literal_eval
file_info = ['{"file":"file1", "directory":"dir1"}', '{"file":"file2", "directory":"directory2"}']
for item in file_info:
    item = literal_eval(item)
    print item['file'], item['directory']
Both code snippets print:
file1 dir1
file2 directory2
Hope that helps.
I would do:
import re
regx = re.compile(r'{\s*file\s*:\s*([^,\s]+)\s*'
                  r','
                  r'\s*directory\s*:\s*([^}\s]+)\s*}')
file_info = ['{file:C:\\samples\\123.exe, directory : C:\\}',
             '{ file: C:\\samples\\345.exe,directory:C:\\}'
             ]
for item in file_info:
    print '%r\n%s\n' % (item,
                        regx.search(item).groups())
result
'{file:C:\\samples\\123.exe, directory : C:\\}'
('C:\\samples\\123.exe', 'C:\\')
'{ file: C:\\samples\\345.exe,directory:C:\\}'
('C:\\samples\\345.exe', 'C:\\')
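A variation on the same idea for Python 3, using named groups so each item becomes a small dictionary. This is a sketch under the assumption that paths contain no spaces, commas, or closing braces; ENTRY and parse_entry are names invented here.

```python
import re

# Sample data in the format shown in the question
file_info = ['{file:C:\\samples\\123.exe, directory:C:\\}',
             '{file:C:\\samples\\345.exe, directory:C:\\}']

# Named groups; whitespace around keys and values is optional
ENTRY = re.compile(r'\{\s*file\s*:\s*(?P<file>[^,\s]+)\s*,'
                   r'\s*directory\s*:\s*(?P<directory>[^}\s]+)\s*\}')


def parse_entry(item):
    """Return {'file': ..., 'directory': ...}, or None if item does not match."""
    match = ENTRY.search(item)
    return match.groupdict() if match else None


records = [parse_entry(item) for item in file_info]
for record in records:
    print(record['file'], record['directory'])
```

Returning None for a non-matching item lets the caller decide whether to skip it or report it.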

Manipulate string data

I'm new to python and trying to create a script to modify the output of a JS file to match what is required to send data to an API. The JS file is being read via urllib2.
def getPage():
    url = "http://url:port/min_day.js"
    req = urllib2.Request(url)
    response = urllib2.urlopen(req)
    return response.read()
# JS Data
# m[mi++]="19.12.12 09:30:00|1964;2121;3440;293;60"
# m[mi++]="19.12.12 09:25:00|1911;2060;3277;293;59"
# Required format for API
# addbatchstatus.jsp?data=20121219,09:25,3277.0,1911,-1,-1,59.0,293.0;20121219,09:30,3440.0,1964,-1,-1,60.0,293.0
As a breakdown (Required values are bold)
m[mi++]="19.12.12 09:30:00|1964;2121;3440;293;60"
and need to add values of -1,-1 into the string
I've managed to get the date into the correct format and to replace characters and line breaks so that the output looks like the line below, but I have a feeling I'm heading down the wrong track if I need to be able to reorder the values in the string. It also looks like the entries are in reverse order with regard to time.
20121219,09:30:00,1964,2121,3440,293,60;20121219,09:25:00,1911,2060,3277,293,59
Any help would be greatly appreciated! I'm thinking regex might be what I need.
Here's a regex pattern to strip out the bits you don't want:
m\[mi\+\+\]="(?P<day>\d{2})\.(?P<month>\d{2})\.(?P<year>\d{2}) (?P<time>[\d:]{8})\|(?P<v1>\d+);(?P<v2>\d+);(?P<v3>\d+);(?P<v4>\d+);(?P<v5>\d+).+
and replace with (using Python's \g<name> syntax for named groups):
20\g<year>\g<month>\g<day>,\g<time>,\g<v3>,\g<v1>,-1,-1,\g<v5>,\g<v4>
This pattern assumes that the characters before the date are constant. You can replace m\[mi\+\+\]=" with [^\d]+ if you want more general handling of that bit.
So to put this in practice in python:
import re
import urllib2
def getPage():
    url = "http://url:port/min_day.js"
    req = urllib2.Request(url)
    response = urllib2.urlopen(req)
    return response.read()
def repl(match):
    # Reorder the captured groups into the field order the API expects
    return '20%s%s%s,%s,%s,%s,-1,-1,%s,%s'%(match.group('year'),
                                            match.group('month'),
                                            match.group('day'),
                                            match.group('time'),
                                            match.group('v3'),
                                            match.group('v1'),
                                            match.group('v5'),
                                            match.group('v4'))
pattern = re.compile(r'm\[mi\+\+\]="(?P<day>\d{2})\.(?P<month>\d{2})\.(?P<year>\d{2}) (?P<time>[\d:]{8})\|(?P<v1>\d+);(?P<v2>\d+);(?P<v3>\d+);(?P<v4>\d+);(?P<v5>\d+).+')
data = [re.sub(pattern, repl, line).split(',') for line in getPage().split('\n')]
# If you want to sort your data
data = sorted(data, key=lambda x:x[0], reverse=True)
# If you want to write your data back to a formatted string
new_string = ';'.join(','.join(x) for x in data)
# If you want to write it back to file
with open('new/file.txt', 'w') as f:
    f.write(new_string)
Hope that helps!
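For comparison, here is a Python 3 sketch that works on the two sample lines from the question instead of a download and aims for the exact required format (time without seconds, some values with a trailing .0, oldest entry first). The names LINE, to_record and to_batch are invented for this example, and like the pattern above it assumes every year is 20xx.

```python
import re

# Sample lines from the question
js_data = '''m[mi++]="19.12.12 09:30:00|1964;2121;3440;293;60"
m[mi++]="19.12.12 09:25:00|1911;2060;3277;293;59"'''

LINE = re.compile(
    r'm\[mi\+\+\]="(?P<day>\d{2})\.(?P<month>\d{2})\.(?P<year>\d{2}) '
    r'(?P<hhmm>\d{2}:\d{2}):\d{2}\|'
    r'(?P<v1>\d+);(?P<v2>\d+);(?P<v3>\d+);(?P<v4>\d+);(?P<v5>\d+)"')


def to_record(match):
    """Reorder one matched line into the comma-separated API field order."""
    g = match.groupdict()
    date = '20' + g['year'] + g['month'] + g['day']
    # float() renders 3277 as 3277.0, as in the required format
    fields = [date, g['hhmm'], float(g['v3']), g['v1'], -1, -1,
              float(g['v5']), float(g['v4'])]
    return ','.join(str(field) for field in fields)


def to_batch(text):
    """Convert every matching line and join the records oldest first."""
    records = [to_record(m) for m in LINE.finditer(text)]
    # Date and time are zero-padded, so plain string sorting is chronological
    return ';'.join(sorted(records))


batch = to_batch(js_data)
print(batch)
```

The resulting string is what would go after addbatchstatus.jsp?data= in the request.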

How can I get Python to append a variable without changing it?

import twitter
import unicodedata
import string
def get_tweets(user):
    resultado=[]
    temp=[]
    api=twitter.Api()
    statuses=api.GetUserTimeline(user)
    for tweet in statuses:
        var = unicodedata.normalize('NFKD', tweet.text).encode('utf-8', 'replace')
        print var # Horóscopo when I don't append it
        resultado.append(var)
    print resultado # Horo\xcc\x81scopo, mie\xcc\x81rcoles I get these when I append them
get_tweets('HoroscopoDeHoy')
I assume you want to put the unicode into a list:
var = unicodedata.normalize('NFKD', tweet.text)
resultado.append( var )
temp.append(var.encode('utf-8', 'ignore'))
I think the issue is with the print command. Print runs string conversion on the list which escapes any "funny" characters before writing them to standard output. If you want to display each item on the same line I would suggest doing:
for item in resultado:
    print item,
This should bypass the string conversion on the list.
Sources:
http://docs.python.org/reference/simple_stmts.html#the-print-statement
http://docs.python.org/reference/expressions.html#string-conversions
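The same effect can be reproduced in Python 3 without the Twitter API. In this sketch a hard-coded word stands in for tweet.text (an assumption), and the variable names are chosen only for illustration. The list is displayed using repr() of each item, which is where the escapes come from; the stored bytes themselves are unchanged.

```python
import unicodedata

# Assumption: a sample word standing in for tweet.text ("\u00f3" is o with acute)
decomposed = unicodedata.normalize("NFKD", "Hor\u00f3scopo")
encoded = decomposed.encode("utf-8")
resultado = [encoded]

item_repr = repr(encoded)    # escaped form of a single item
list_text = str(resultado)   # what print(resultado) would display

print(item_repr)
print(list_text)

# Decoding an item gives back the readable text, so nothing was lost
decoded = resultado[0].decode("utf-8")
```

So appending does not alter the value; only the way a list is displayed differs from the way a single item is printed.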
