Access online data using Python [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I am trying to access the federal reserve bank data at https://fred.stlouisfed.org/series/FEDFUNDS
what is the code I can write to access this database and then put it in a dictionary? Or do I have to download the file first and save it on my computer?

The easiest way to pull that data in would be to download and parse the CSV file listed under the "Download" button.
You can use the Requests library to download the file, then parse it with the built-in csv module.

See https://stackoverflow.com/a/32400969/9214517 for how to do it.
Assuming you are happy to keep the data in a pandas DataFrame (as the linked answer does), this is the code:
import pandas as pd
import requests
import io
url = "https://fred.stlouisfed.org/graph/fredgraph.csv?bgcolor=%23e1e9f0&chart_type=line&drp=0&fo=open%20sans&graph_bgcolor=%23ffffff&height=450&mode=fred&recession_bars=on&txtcolor=%23444444&ts=12&tts=12&width=968&nt=0&thu=0&trc=0&show_legend=yes&show_axis_titles=yes&show_tooltip=yes&id=FEDFUNDS&scale=left&cosd=1954-07-01&coed=2018-10-01&line_color=%234572a7&link_values=false&line_style=solid&mark_type=none&mw=3&lw=2&ost=-99999&oet=99999&mma=0&fml=a&fq=Monthly&fam=avg&fgst=lin&fgsnd=2009-06-01&line_index=1&transformation=lin&vintage_date=2018-11-28&revision_date=2018-11-28&nd=1954-07-01"
s = requests.get(url).content.decode("utf-8")
df = pd.read_csv(io.StringIO(s))
Then your df will be:
DATE FEDFUNDS
0 1954-07-01 0.80
1 1954-08-01 1.22
2 1954-09-01 1.06
3 1954-10-01 0.85
4 1954-11-01 0.83
....
And if you insist on a dict, use this instead of the last line above to convert the CSV text s (skipping the header row):
mydict = dict(line.split(",") for line in s.splitlines()[1:])
The key is how to get the URL: Hit the download button on the page you quoted, and copy the link to CSV.
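Since the question asked for a dictionary, here is a sketch using only the built-in csv module, shown on a short sample string standing in for the downloaded text (the real text would come from requests.get(url).content.decode("utf-8")):

```python
import csv
import io

# Short sample standing in for the downloaded CSV text
s = "DATE,FEDFUNDS\n1954-07-01,0.80\n1954-08-01,1.22\n"

reader = csv.reader(io.StringIO(s))
next(reader)  # skip the header row
rates = {date: float(rate) for date, rate in reader}
print(rates)  # {'1954-07-01': 0.8, '1954-08-01': 1.22}
```

Using csv.reader instead of a bare split(",") also keeps you safe if a field ever contains a quoted comma.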


What is the best way to fetch all data (posts and their comments) from Reddit? [closed]

I have a requirement to analyse all the comments from a subreddit (e.g. r/dogs, say from 2015 onward) using the Python Reddit API Wrapper (PRAW). I'd like the data to be stored as JSON.
I have found a way to get all of the top, new, hot, etc. submissions as a JSON using PRAW:
new_submissions = reddit.subreddit("dogs").new()
top_submissions = reddit.subreddit("dogs").top("all")
However, these are only the submissions for the subreddit, and do not include comments. How can I get the comments for these posts as well?
You can use the Python Pushshift.io API Wrapper (PSAW) to get all the most recent submissions and comments from a specific subreddit, and can even do more complex queries (such as searching for specific text inside a comment). The docs are available here.
For example, you can use the search_submissions() function to get the top 1000 submissions from r/dogs from 2015:
import datetime as dt
import praw
from psaw import PushshiftAPI
r = praw.Reddit(...)
api = PushshiftAPI(r)
start_epoch = int(dt.datetime(2015, 1, 1).timestamp())  # Could be any date
submissions_generator = api.search_submissions(after=start_epoch, subreddit='dogs', limit=1000)  # Returns a generator object
submissions = list(submissions_generator)  # You can then use this, store it in MongoDB, etc.
Alternatively, to get the first 1000 comments from r/dogs in 2015, you can use the search_comments() function:
start_epoch = int(dt.datetime(2015, 1, 1).timestamp())  # Could be any date
comments_generator = api.search_comments(after=start_epoch, subreddit='dogs', limit=1000)  # Returns a generator object
comments = list(comments_generator)
As you can see, PSAW still uses PRAW, and so returns PRAW objects for submissions and comments, which may be handy.
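The question also asks for the data as JSON, and PRAW comment objects are not directly JSON-serializable. One sketch is to copy the fields you need into plain dicts first (id, body and created_utc are standard PRAW comment attributes; pick whichever fields matter to you):

```python
import json

def comments_to_records(comments):
    # Copy the JSON-serializable fields out of each PRAW-style comment object.
    return [
        {"id": c.id, "body": c.body, "created_utc": c.created_utc}
        for c in comments
    ]

# e.g. with comments = list(comments_generator) from above:
# with open("comments.json", "w", encoding="utf-8") as f:
#     json.dump(comments_to_records(comments), f, indent=2)
```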

Saving csv in specific json format [closed]

How do I save my CSV to JSON, keeping only the column "Question"?
Particularly in this format:
[
"What is dewa?" , "what is regulations?" ,"What is the fire rating for building having more than 2 basements?"
]
Right now my JSON comes out in this format:
{"Question":{"0":"what is dewa?","1":"what is regulations?","2":"What is the fire rating for building having more than 2 basements?"}}
Code for CSV to JSON:
import pandas as pd
read_csv = pd.read_csv(r'C:\Users\heba.fatima\Desktop\final-fire/answers.csv') # or delimiter = ';'
read_csv=read_csv[["Question"]]
read_csv.head()
read_csv.to_json (r'C:\Users\heba.fatima\Desktop\flaskapi\data\answers.json')
You can select the column as a Series and use the orient argument of to_json (for a Series, orient='records' produces a plain JSON array; 'values' is only accepted for DataFrames):
read_csv['Question'].to_json(orient='records')
output:
["what is dewa?", "what is regulations?", "What is the fire rating for building having more than 2 basements?"]
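If you'd rather avoid pandas, the same array can be produced with the standard csv and json modules. A sketch, shown on an in-memory sample rather than the answers.csv file:

```python
import csv
import io
import json

# Sample standing in for the contents of answers.csv
sample = 'Question\n"what is dewa?"\n"what is regulations?"\n'

rows = csv.DictReader(io.StringIO(sample))
questions = json.dumps([row["Question"] for row in rows])
print(questions)  # ["what is dewa?", "what is regulations?"]
```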

Filter csv file without using pandas or csv module PYTHON [closed]

Without using any modules (pandas, csv, etc.) I need to filter this csv file (https://data.world/prasert/rotten-tomatoes-top-movies-by-genre/workspace/file?filename=rotten_tomatoes_top_movies_2019-01-15.csv), keeping ONLY the movies in the animation genre and dropping the other movies.
I have used open, split and the for loop to read the data, but I am struggling to filter the movies into genres.
I have created a list called genres, and then appended it with genres.append(line.split(",") [4]), but this only gives a list of genres from the genre column rather than giving me info of each movie in a particular genre.
I know it is crazy to attempt this without the modules(this is for school), but is it even possible to do this without them?
Thanks in advance.
Try this:
f = open("file_name", "r", encoding="utf-8")
new_list = []
header = True
for line in f.readlines():
    # keep the header row if the file has one
    if header:
        new_list.append(line)
        header = False
        continue
    # add the genre name to filter on (5th column)
    if line.split(',')[4] == 'genre_name':
        new_list.append(line)
f.close()

# write the filtered list to the output file
out_file = open('output.txt', 'w', encoding="utf-8")
for element in new_list:
    out_file.write(element)  # lines from readlines() already end with '\n'
out_file.close()
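The same logic reads a little more safely wrapped in a function with `with` blocks so the files always get closed. One caveat for both versions: a bare split(',') will misparse quoted fields that themselves contain commas, which is exactly what the csv module normally handles for you:

```python
def filter_by_genre(in_path, out_path, genre, col=4):
    # Keep the header plus every row whose genre column matches.
    # Note: split(',') breaks on quoted fields containing commas.
    with open(in_path, "r", encoding="utf-8") as f:
        lines = f.readlines()
    kept = lines[:1]  # header row
    kept += [ln for ln in lines[1:] if ln.split(",")[col].strip() == genre]
    with open(out_path, "w", encoding="utf-8") as out:
        out.writelines(kept)  # lines already end with '\n'
    return kept
```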

Dynamically scrape JSON values [closed]

I wanted to scrape some data from a JSON response. Here is the link
I need the values in lessonTypes. I want to export all the values separated with a comma.
So Theorieopleidingen has 4 values Beroepsopleidingen has 8 and so on.
I want to scrape dynamically, so that even if the number of values changes, it always scrapes all of them, comma separated.
Sorry if my explanation is weak.
Since it's a JSON object, you can just use requests and do whatever you want with the data.
For example:
import requests
url = "https://www.cbr.nl/web/show?id=289168&langid=43&channel=json&cachetimeout=-1&elementHolder=289170&ssiObjectClassName=nl.gx.webmanager.cms.layout.PagePart&ssiObjectId=285674&contentid=3780&examtype=B"
for value in requests.get(url).json()['lessonTypes'].values():
    print(value)
Output:
['Motor', 'Auto', 'Bromfiets', 'Tractor']
['Bus', 'Aanhangwagen achter bus', 'Vrachtauto', 'Aanhangwagen achter vrachtauto', 'Heftruck', 'ADR', 'Taxi', 'Tractor']
['Aangepaste auto', 'Automaat personenauto']
['Motor', 'Auto', 'Aanhangwagen achter auto', 'Bromfiets', 'Brommobiel']
EDIT:
To access individual keys and their values you might want to try this for example:
import requests
url = "https://www.cbr.nl/web/show?id=289168&langid=43&channel=json&cachetimeout=-1&elementHolder=289170&ssiObjectClassName=nl.gx.webmanager.cms.layout.PagePart&ssiObjectId=285674&contentid=3780&examtype=B"
lesson_types = requests.get(url).json()['lessonTypes']
print(list(lesson_types.keys()))
print("\n".join(lesson_types['Theorieopleidingen']))
Output:
['Theorieopleidingen', 'Beroepsopleidingen', 'Bijzonderheden', 'Praktijkopleidingen']
Motor
Auto
Bromfiets
Tractor
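To get the comma-separated strings the question asked for, join each list of values. A sketch, shown on a small sample standing in for the live lessonTypes response:

```python
# Sample standing in for requests.get(url).json()['lessonTypes']
lesson_types = {
    "Theorieopleidingen": ["Motor", "Auto", "Bromfiets", "Tractor"],
    "Bijzonderheden": ["Aangepaste auto", "Automaat personenauto"],
}

for name, values in lesson_types.items():
    print(f"{name}: {', '.join(values)}")
# Theorieopleidingen: Motor, Auto, Bromfiets, Tractor
# Bijzonderheden: Aangepaste auto, Automaat personenauto
```

Because the loop iterates over whatever keys and values the response contains, it keeps working when the number of values changes.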

Python youtube api status.rejectionReason [closed]

I've looked through the documentation here: https://developers.google.com/youtube/v3/docs/videos
But I am unsure how to access status.rejectionReason with Python. I've been using youtube-uploader for my uploading, and I don't believe there are any commands to return the reason a video was rejected. The ideal scenario would be to get a list of all my videos, check which ones have been rejected, then return the links of the ones that have been rejected.
From what I can see, the rejectionReason sits inside a JSON "videos resource". You can access this with Python's built-in json library:
from json import load
with open('video.json') as f:  # opens the JSON file; f is only valid inside this block
    data = load(f)  # loads the file into a dictionary you can access with key/value pairs
The JSON file provided as a sample on the site follows this format for the rejectionReason:
"status": {
"uploadStatus": string,
"failureReason": string,
"rejectionReason": string,
"privacyStatus": string,
"publishAt": datetime,
"license": string,
"embeddable": boolean,
"publicStatsViewable": boolean
}
So your final script would look like this, I believe:
from json import load
def get_rejection_reason(file):
    with open(file) as f:
        data = load(f)
    return data["status"]["rejectionReason"]
get_rejection_reason("video.json")
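For the "list all rejected videos" part of the question, once you have a parsed videos.list response (fetched with part="status"), you can filter on uploadStatus, for which "rejected" is one of the documented values. A sketch; the structure mirrors the status block quoted above, and the sample data is made up:

```python
def rejected_videos(response):
    # response: parsed JSON from a videos.list call made with part="status".
    return [
        (item["id"], item["status"].get("rejectionReason"))
        for item in response.get("items", [])
        if item.get("status", {}).get("uploadStatus") == "rejected"
    ]

sample = {"items": [
    {"id": "abc123", "status": {"uploadStatus": "rejected", "rejectionReason": "duplicate"}},
    {"id": "def456", "status": {"uploadStatus": "processed"}},
]}
print(rejected_videos(sample))  # [('abc123', 'duplicate')]
```

The returned video IDs can then be turned into links, e.g. "https://www.youtube.com/watch?v=" + video_id.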
