How to specify Python API results

I'm trying to use the Lobbying Disclosure API from the US Congress database in Python.
I want to get only the contributions for specific members of Congress, along with who contributed (lobbyist or organization).
import requests
import json

parameters = {
    "contribution_payee": "Rashida Tlaib for Congress"
}
response = requests.get("https://lda.senate.gov/api/v1/contributions", params=parameters)
# print(response.json())

def jprint(obj):
    # Dump the JSON to a file for inspection.
    text = json.dumps(obj, sort_keys=True, indent=4)
    with open('Test.txt', 'w') as f:
        f.write(text)

jprint(response.json())
I am getting results not just for Rashida Tlaib, for example, but for everyone on the LDA-203 form who also received a donation from that lobbyist or organization.
In the output text file I only want the data for Rashida Tlaib, not for John Thune or Ted Lieu, for example. But I still want the name of the org and lobbyist who donated.
Here is an example of what each LDA-203 contribution form looks like; it includes all candidates who received a donation from a specific org or lobbyist. I am using Python to narrow the data down to specific members of Congress rather than sifting through it by hand: https://lda.senate.gov/filings/public/contribution/6285c999-2ec6-4d27-8963-b40bab7def55/print/
Is there a way I can narrow down my results to only include certain members of congress that I pass as a parameter, while excluding the information of everyone else who received a donation from that lobbyist or org?
I was thinking regular expressions could do the trick, but I am not very good at implementing them. Should I try to do this in R instead of Python?
Thank you!
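Absent a server-side parameter for this, one option is to filter the parsed JSON client-side after downloading it. Below is a minimal sketch; the field names (contribution_items, payee_name, contributor_name, registrant) are illustrative assumptions, not the documented LDA schema, so adjust them to match the keys you actually see in Test.txt:

```python
# Sketch: keep only the contribution items paid to one member of Congress.
# Field names here are assumed for illustration -- match them to the real
# keys in your downloaded JSON before using this.

def filter_contributions(filings, payee):
    """Collect contributor/registrant/amount for items paid to `payee`."""
    matches = []
    for filing in filings:
        registrant = filing.get("registrant", {}).get("name")
        for item in filing.get("contribution_items", []):
            if item.get("payee_name") == payee:
                matches.append({
                    "contributor": item.get("contributor_name"),
                    "registrant": registrant,
                    "amount": item.get("amount"),
                })
    return matches

# Toy data shaped like one LDA-203 filing:
sample = [{
    "registrant": {"name": "Example Lobbying LLC"},
    "contribution_items": [
        {"payee_name": "Rashida Tlaib for Congress",
         "contributor_name": "Jane Doe", "amount": "500.00"},
        {"payee_name": "Friends of John Thune",
         "contributor_name": "Jane Doe", "amount": "250.00"},
    ],
}]

matches = filter_contributions(sample, "Rashida Tlaib for Congress")
print(matches)
```

This also sidesteps regular expressions entirely: the response is already structured JSON, so plain dictionary lookups and string comparisons are enough, in Python or R alike.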

Related

How do I get the base of a synonym/plural of a word in python?

I would like to use Python to convert all synonyms and plural forms of words to the base version of the word.
E.g. "babies" would become "baby", and so would "infant" and "infants".
I tried creating a naive version of plural-to-root code, but it doesn't always work correctly and can't handle a large number of cases.
contents = ["buying", "stalls", "responsibilities"]
for i, token in enumerate(contents):
    if token.endswith("ies"):
        token = token[:-3] + "y"
    elif token.endswith("s"):
        token = token[:-1]
    elif token.endswith("ed"):
        token = token[:-2]
    elif token.endswith("ing"):
        token = token[:-3]
    contents[i] = token  # write the result back so the list actually changes
print(contents)
I have not used this library before, so take this with a grain of salt. However, NodeBox Linguistics seems to be a reasonable set of scripts that will do exactly what you are looking for if you are on macOS. Check the link here: https://www.nodebox.net/code/index.php/Linguistics
Based on their documentation, it looks like you will be able to use lines like so:
print(en.noun.singular("people"))
>>> person
print(en.verb.infinitive("swimming"))
>>> swim
etc.
In addition to the example above, another option to consider is a natural language processing library like NLTK. The reason I recommend an external library is that English has a lot of exceptions. As mentioned in my comment, consider words like class, fling, red, geese, etc., which would trip up the rules mentioned in the original question.
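To illustrate why a lookup table is needed on top of suffix rules, here is a small self-contained sketch; the IRREGULARS table is a toy stand-in for the full exception lists that a real lemmatizer such as NLTK's WordNetLemmatizer ships with:

```python
# Minimal sketch: check an irregular-word table before applying suffix rules.
# IRREGULARS is a toy example; a real lemmatizer carries complete lists.

IRREGULARS = {"geese": "goose", "people": "person", "mice": "mouse"}

def naive_base(token):
    if token in IRREGULARS:  # exceptions first
        return IRREGULARS[token]
    if token.endswith("ies"):
        return token[:-3] + "y"
    if token.endswith("s") and not token.endswith("ss"):  # keep "class"
        return token[:-1]
    return token

print([naive_base(w) for w in ["babies", "geese", "class", "stalls"]])
```

Even this small patch shows the pattern: every rule you add needs a guard (the "ss" check) or a table entry, which is exactly the bookkeeping a mature library has already done.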
I built a Python library, Plurals and Countable, which is open source on GitHub. Its main purpose is to get plurals (yes, multiple plurals for some words), but it also solves this particular problem.
import plurals_counterable as pluc
pluc.pluc_lookup_plurals('men', strict_level='dictionary')
will return a dictionary of the following.
{
    'query': 'men',
    'base': 'man',
    'plural': ['men'],
    'countable': 'countable'
}
The base field is what you need.
The library actually looks up the words in dictionaries, so it takes some time to request, parse and return. Alternatively, you might use the REST API provided by Dictionary.video. You'll need to contact admin#dictionary.video to get an API key. The call will look like:
import requests
import json
import logging

def get_base(word):
    url = 'https://dictionary.video/api/noun/plurals/' + word + '?key=YOUR_API_KEY'
    response = requests.get(url)
    if response.status_code == 200:
        return json.loads(response.text)['base']
    else:
        logging.error(url + ' response: status_code[%d]' % response.status_code)
        return None

Output of python code is one character per line

I'm new to Python and having some trouble with an API scraping I'm attempting. What I want to do is pull a list of book titles using this code:
import requests
import json

r = requests.get('https://api.dp.la/v2/items?q=magic+AND+wizard&api_key=09a0efa145eaa3c80f6acf7c3b14b588')
data = json.loads(r.text)
for doc in data["docs"]:
    for title in doc["sourceResource"]["title"]:
        print(title)
Which works to pull the titles, but most (not all) titles are outputting as one character per line. I've tried adding .splitlines() but this doesn't fix the problem. Any advice would be appreciated!
The problem is that you have two types of title in the response, some are plain strings "Germain the wizard" and some others are arrays of string ['Joe Strong, the boy wizard : or, The mysteries of magic exposed /']. It seems like in this particular case, all lists have length one, but I guess that will not always be the case. To illustrate what you might need to do I added a join here instead of just taking title[0].
import requests
import json

r = requests.get('https://api.dp.la/v2/items?q=magic+AND+wizard&api_key=09a0efa145eaa3c80f6acf7c3b14b588')
data = json.loads(r.text)
for doc in data["docs"]:
    title = doc["sourceResource"]["title"]
    if isinstance(title, list):
        print(" ".join(title))
    else:
        print(title)
In my opinion this should never happen: an API should return predictable types; otherwise it looks messy on the user's side.

How to Return Nested Values from Complicated JSON API

I am setting up a weather camera which will provide a live stream of the current conditions outside, but I also would like to overlay continuously updated weather conditions (temperature, wind speed/direction, current weather) from a local National Weather Service weather station, from a browser API source provided in JSON format.
I have had success extracting the desired values from a different API source using a Python script I wrote; however, long story short, that API source is unreliable. Therefore I am using the API from the official National Weather Service ASOS station at my nearby airport. The output from the new API source I am polling is rather complicated, however, with various tiers of nesting. I have not worked with Python very long, and the tutorials and guides online have either been for other languages (Java or C++ mostly) or have not worked for my specific case.
First off, the values I am trying to extract are listed under the OBSERVATIONS section of the JSON I am receiving, associated with precip_accum_24_hour_value_1, wind_gust_value_1, wind_cardinal_direction_value_1d, and so on. The issue is that there are two values nested underneath each observation, so the script I have tried isn't returning the values I want. Here is the code I have tried:
import urllib.request
import json

f = urllib.request.urlopen('https://api.synopticdata.com/v2/stations/latest?token=8c96805fbf854373bc4b492bb3439a67&stid=KSTC&complete=1&units=english&output=json')
json_string = f.read()
parsed_json = json.loads(json_string)

for each in parsed_json['STATION']:
    observations = each['OBSERVATIONS']
    print(observations)
This prints out everything underneath the OBSERVATIONS in the JSON as expected, as one long string.
{'precip_accum_24_hour_value_1': {'date_time': '2018-12-06T11:53:00Z', 'value': 0.01}, 'wind_gust_value_1': {'date_time': '2018-12-12T01:35:00Z', 'value': 14.0},
to show a small snippet of the output I am receiving. I was hoping I could individually extract the values I want from this string, but everything I have attempted is not working. I would really appreciate some guidance for finishing this piece of code so I can return the values I am looking for. I realize it may be some kind of loop or special syntax.
Try something like this:
for each in parsed_json['STATION']:
    observations = each['OBSERVATIONS']
    for k, v in observations.items():
        print(k, v["value"])
JSON maps well into python's dictionary and list types, so accessing substructures can be done with a[<index-or-key>] syntax. Iteration over key-value pairs of a dictionary can be done as I've shown above. If you're not familiar with dictionaries in python yet, I'd recommend reading about them. Searching online should yield a lot of good tutorials.
Does this help?
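Applied to the snippet quoted in the question, picking out just a few named observations could look like the following sketch (the sample dict is pasted from the question's output; the wanted list is whatever subset of keys you care about):

```python
# Each OBSERVATIONS entry nests its reading under "value"; select a few
# named observations and pull out just that inner number.
observations = {
    "precip_accum_24_hour_value_1": {"date_time": "2018-12-06T11:53:00Z",
                                     "value": 0.01},
    "wind_gust_value_1": {"date_time": "2018-12-12T0135:00Z".replace("T01", "T01:"),
                          "value": 14.0},
}

wanted = ["precip_accum_24_hour_value_1", "wind_gust_value_1"]
readings = {key: observations[key]["value"]
            for key in wanted if key in observations}
print(readings)
```

The `if key in observations` guard matters with live data: a station that has not reported, say, precipitation simply omits that key rather than returning a null value.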
When you say the JSON is complicated, it really is just nested dictionaries within the main JSON response. You would access them in the same way as you would the initial JSON blob:
import urllib.request
import json

f = urllib.request.urlopen('https://api.synopticdata.com/v2/stations/latest?token=8c96805fbf854373bc4b492bb3439a67&stid=KSTC&complete=1&units=english&output=json')
json_string = f.read()
parsed_json = json.loads(json_string)

for each in parsed_json['STATION']:
    for key in each:
        print(key, each[key])

Tracing Canadian borders with a function. (longitude and latitude boundaries of Canada)

I'm writing a python script that generates random addresses in Canada. To do this, I have to generate random tuples (longitude,latitude) that are within Canadian borders (not in the ocean).
I figured that I can approximate the borders with small rectangles (just like in calculus), which does the job, but it is neither optimal nor accurate.
I couldn't find any academic paper/discussion on the web, maybe my searches did not contain the right keywords. Can you help me find the right resources or even answer this question? The programming part is fine, I just need the math!
Thank you
You are talking about reverse geocoding. The easiest way to do this is to make use of the Google Maps Geocoding API.
You can do this without registering, but then you are limited to around 4-5 calls per day. You can register for free for a relatively high number of calls (20-25k last I checked), and if you exceed that you have to pay.
import requests
import json

def getplace(lat, lon):
    url = "http://maps.googleapis.com/maps/api/geocode/json?"
    url += "latlng=%s,%s&sensor=false" % (lat, lon)
    # If using your free 5 calls, include no data and just do a GET request on the url.
    data = {'key': 'your-api-key-goes-here'}
    v = requests.post(url=url, data=data)
    j = json.loads(v.text)
    components = j['results'][0]['address_components']
    country = town = None
    for c in components:
        if "country" in c['types']:
            country = c['long_name']
        if "locality" in c['types']:
            town = c['long_name']
    return town, country

print(getplace(45.425533, -75.69248))
print(getplace(45.525533, -77.69248))
The above outputs:
('Ottawa', 'Canada')
("Barry's Bay", 'Canada')
You can print out the raw response with print(v.text) to see the data object and find the fields you actually care about.
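If you'd rather stick with the question's original framing (no external service, just math), the standard approach is rejection sampling against a border polygon: draw a random point in the bounding box and keep it only if it falls inside the polygon. Real border polygons are available as shapefiles (e.g. from Natural Earth); the sketch below uses a toy square in place of real border data:

```python
import random

def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: count how many polygon edges a rightward ray crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):  # edge spans the point's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def random_point_inside(polygon, bbox, rng=random):
    """Rejection sampling: bbox is (min_lon, min_lat, max_lon, max_lat)."""
    while True:
        lon = rng.uniform(bbox[0], bbox[2])
        lat = rng.uniform(bbox[1], bbox[3])
        if point_in_polygon(lon, lat, polygon):
            return lon, lat

# Toy "border": a unit square standing in for a real border polygon.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(point_in_polygon(0.5, 0.5, square))   # True
print(point_in_polygon(1.5, 0.5, square))   # False
```

With a real, highly concave border (fjords, islands), the rejection rate goes up but the method stays correct; splitting the country into a few sub-polygons with tighter bounding boxes speeds it up.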

How to generate a summary from JSON data using Watson Discovery News services

How can I generate a summary, like IBM does, from JSON using the Discovery News services with Python?
qopts = {'nested':'(enriched_text.entities)','filter':'(enriched_text.entities.type::Person)','term':'(enriched_text.entities.text,count:10)','filter':'(enriched_text.concepts.text:infosys)','filter':'(enriched_text.concepts.text:ceo)'}
my_query = discovery.query('system', 'news', qopts)
print(json.dumps(my_query, indent=2))
Is this query correct for finding the CEO of Infosys?
The output comes back as a large JSON document; how do I identify the answer in it, or create a summary such as the top ten CEOs or people?
I believe there are two questions here.
In order to answer a question like "Who is the CEO of Infosys?" I would instead make use of the natural_language_query parameter as follows:
qopts = {'natural_language_query':'Who is the CEO of Infosys?','count':'5'}
response = discovery.query(environment_id='system',collection_id='news',query_options=qopts)
print(json.dumps(response,indent=2))
In order to make use of aggregations, they must be specified in a single aggregation parameter combined with filter aggregations in the query options as follows:
qopts = {'aggregation': 'nested(enriched_text.entities).filter(enriched_text.entities.type::Person).term(enriched_text.entities.text,count:10)', 'filter':'enriched_text.entities:(text:Infosys,type:Company)','count':'0'}
response = discovery.query(environment_id='system',collection_id='news',query_options=qopts)
print(json.dumps(response,indent=2))
Notice that aggregations are chained/combined with the . symbol.