Extracting data from python wsdl client using suds - python

I have created the following Python code that reads a method from a webservice:
def GetWeatherParameters():
""""""
client = Client('www.address.asmx?wsdl')
#WebServiceClient.GetWeatherParameters()
return client.service.GetWeatherParameters()
It works fine and I get the data returned and can print it, however the data returned contains mutltiple columns and this code just prints out everything at once.
Does anybody know how I can extract the returned data column by column?

It all depends on the returned data - a handy way to display it nicely is to use pprint:
from pprint import pprint
pprint(your_data)
That'll format it nicely so it's easier to see the structure. Then if it's a list or similar, to get the first row you can do your_data[0] to get the first one, or loop, to print it row by row:
for row in your_data:
print row
print row[0] # could be the first column...
And go from there...

Related

Looping through json.loads(response.text) with Python

i'm learning and would appreciate any help in this code.
The issue is trying to print the values in the data that are contained in one line of the JSON using Python.
import json
import requests
data = json.loads(response.text)
print(len(data)) #showing correct value
#where i'm going wrong below obviously this will print the first value then the second as it's indexed. Q how do I print all values when using seperate print statements when the total indexed value is unknown?
for item in data:
print(data[0]['full_name'])
print(data[1]['full_name'])
I tried without the index value this gave me the first value multiple times depending on the length.
I expect to be able to access from the JSON file each indexed value separately even though they are named the same thing "full_name" for example.
import json
import requests
data = json.loads(response.text)
print(len(data)) #showing correct value
for item in data:
print(item['full_name'])
#the below code will throw error.. because python index starts with 0
print(data[0]['full_name'])
print(data[1]['full_name'])
hope this help
Presuming data is a list of dictionaries, where each dictionary contains a full_name key:
for item in data:
print(item['full_name'])
This code sample from your post makes no sense:
for item in data:
print(data[0]['full_name'])
print(data[1]['full_name'])
Firstly it's a syntax error because there is nothing indented underneath the loop.
Secondly it's a logic error, because the loop variable is item but then you never refer to that variable.

Reading a dictionary from within a dictionary

I have a json file for tweet data. The data that I want to look at is the text of the tweet. For some reason, some of the tweets are too long to put into the normal text part of the dictionary.
It seems like there is a dictionary within another dictionary and I can't figure out how to access it very well.
Basically, what I want in the end is one column of a data frame that will have all of the text from each individual tweet. Here is a link to a small sample of the data that contains a problem tweet.
Here is the code I have so far:
import json
import pandas as pd
tweets = []
#This writes the json file so that I can work with it. This part works correctly.
with open("filelocation.txt") as source
for line in source:
if line.strip():
tweets.append(json.loads(line))
print(len(tweets)
df = pd.DataFrame.from_dict(tweets)
df.info()
When looking at the info you can see that there will be a column called extended_tweet that only encompasses one of the two sample tweets. Within this column, there seems to be another dictionary with one of those keys being full_text.
I want to add another column to the dataframe that just has this information along with the normal text column when the full_text is null.
My first thought was to try and read that specific column of the dataframe as a dictionary again using:
d = pd.DataFrame.from_dict(tweets['extended_tweet]['full_text])
But this doesn't work. I don't really understand why that doesn't work as that is how I read the data the first time.
My guess is that I can't look at the specific names because I am going back to the list and it would have to read all or none. The error it gives me says "KeyError: 'full_text' "
I also tried using the recommendation provided by this website. But this gave me a None value no matter what.
Thanks in advance!
I tried to do what #Dan D. suggested, however, this still gave me errors. But it gave me the idea to try this:
tweet[0]['extended_tweet']['full_text']
This works and gives me the value that I am looking for. But I need to run through the whole thing. So I tried this:
df['full'] = [tweet[i]['extended_tweet']['full_text'] for i in range(len(tweet))
This gives me "Key Error: 'extended_tweet' "
Does it seem like I am on the right track?
I would suggest to flatten out the dictionaries like this:
tweet = json.loads(line)
tweet['full_text'] = tweet['extended_tweet']['full_text']
tweets.append(tweet)
I don't know if the answer suggested earlier works. I never got that successfully. But I did figure out something else that works well for me.
What I really needed was a way to display the full text of a tweet. I first loaded the tweets from the json with what I posted above. Then I noticed that in the data file, there is something called truncated. If this value is true, the tweet is cut short and the full tweet is placed within the
tweet[i]['extended_tweet]['full_text]
In order to access it, I used this:
tweet_list = []
for i in range(len(tweets)):
if tweets[i]['truncated'] == 'True':
tweet_list.append(tweets[i]['extended_tweet']['full_text']
else:
tweet_list.append(tweets[i]['text']
Then I can work with the data using the whol text from each tweet.

Web scraping with Python3, and string formatting ?

this script is meant to parse Bloomberg finance to find the GBP value during the day, this following script does that however when it returns you get this:
{'dateTime': '2017-01-17T22:00:00Z', 'value': 1.6406}
I don't want the dateTime, or the value text there. I don't know how to get rid of it. and when I try it gives me errors like this: list index out of range.
any answers will be greatly appreciated. here is the script (in python3):
import urllib.request
import json
htmltext = urllib.request.urlopen('https://www.bloomberg.com/markets/api/bulk- time-series/price/GBPAUD%3ACUR?timeFrame=1_DAY').read().decode('utf8')
data = json.loads(htmltext)
datapoints = data[1]['price']
print(datapoints)
This should work for you.
print (data[0]['price'][-1]['value'])
EDIT: To get all the values,
for data_row in data[0]['price']:
print data_row['value']
EXPLANATION: data[0] gets the first and only element of the list, which is a dict. ['price'] gets the list corresponding to the price key. [-1] gets the last element of the list, which is presumably the data you'll be looking for as it's the latest data point.
Finally, ['value'] gets the value of the currency conversion from the dict we obtained earlier.

Python, make a list for a for loop when parsing a CSV

I'm working on parsing a CSV from an export of my company's database. This is a slimmed down version has around 15 columns, the actual CSV has over 400 columns of data (all necessary). The below works perfectly:
inv = csv.reader(open('inventory_report.txt', 'rU'), dialect='excel', delimiter="\t")
for PART_CODE,MODEL_NUMBER,PRODUCT_NAME,COLOR,TOTAL_ONHAND,TOTAL_ON_ORDER,TOTAL_SALES,\
SALES_YEAR_TO_DATE,SALES_LASTYEAR_TO_DATE,TOTAL_NUMBER_OF_QTYsSOLD,TOTAL_PURCHASES,\
PURCHASES_YEAR_TO_DATE,PURCHASES_LASTYEAR_TO_DATE,TOTAL_NUMBER_OF_QTYpurchased,\
DATE_LAST_SOLD,DATE_FIRST_SOLD in inv:
print ('%-20s %-90s OnHand: %-10s OnOrder: %-10s') % (MODEL_NUMBER,PRODUCT_NAME,\
TOTAL_ONHAND,TOTAL_ON_ORDER)
As you can already tell, it will be very painful to read when the 'for' loop has 400+ names attached to it for each of item of the row in the CSV. However annoying, it is however very handy for being able to access the output I'm after by this method. I can easily get specific items and perform calculations within the common names we're already familiar with in our point of sale database.
I've been attempting to make this more readable. Trying to figure out a way where I could define a list of all these names in the for loop but still be able to call for them by name when it's time to do calculations and print the output.
Any thoughts?
you can use csv.DictReader. Elements are read as dict. Assuming u have first line as column name.
inv = csv.DictReader(open('file.csv')):
for i in inv:
print ('%-20s %-90s OnHand: %-10s OnOrder: %-10s') % (i['MODEL_NUMBER'],i['PRODUCT_NAME'],i['TOTAL_ONHAND'],i['TOTAL_ON_ORDER'])
And if you want the i[MODEL_NUMBER] to come from list. Define a list with all column name. Assuming, l = ['MODEL_NUMBER','PRODUCT_NAME','TOTAL_ONHAND','TOTAL_ON_ORDER']. Then my print statement in above code will be,
print ('%-20s %-90s OnHand: %-10s OnOrder: %-10s') % (i[l[0]],i[l[1]],i[l[2]],i[l[3]])
Code not checked.. :)
To make your code more readable and easier to reuse, you should read the name of the columns dynamically. CSV files use to have a header with this information on top of the file, so you can read the first line and store it in a tuple or a list.

Search a single column for a particular value in a CSV file and return an entire row

Issue
The code does not correctly identify the input (item). It simply dumps to my failure message even if such a value exists in the CSV file. Can anyone help me determine what I am doing wrong?
Background
I am working on a small program that asks for user input (function not given here), searches a specific column in a CSV file (Item) and returns the entire row. The CSV data format is shown below. I have shortened the data from the actual amount (49 field names, 18000+ rows).
Code
import csv
from collections import namedtuple
from contextlib import closing
def search():
item = 1000001
raw_data = 'active_sanitized.csv'
failure = 'No matching item could be found with that item code. Please try again.'
check = False
with closing(open(raw_data, newline='')) as open_data:
read_data = csv.DictReader(open_data, delimiter=';')
item_data = namedtuple('item_data', read_data.fieldnames)
while check == False:
for row in map(item_data._make, read_data):
if row.Item == item:
return row
else:
return failure
CSV structure
active_sanitized.csv
Item;Name;Cost;Qty;Price;Description
1000001;Name here:1;1001;1;11;Item description here:1
1000002;Name here:2;1002;2;22;Item description here:2
1000003;Name here:3;1003;3;33;Item description here:3
1000004;Name here:4;1004;4;44;Item description here:4
1000005;Name here:5;1005;5;55;Item description here:5
1000006;Name here:6;1006;6;66;Item description here:6
1000007;Name here:7;1007;7;77;Item description here:7
1000008;Name here:8;1008;8;88;Item description here:8
1000009;Name here:9;1009;9;99;Item description here:9
Notes
My experience with Python is relatively little, but I thought this would be a good problem to start with in order to learn more.
I determined the methods to open (and wrap in a close function) the CSV file, read the data via DictReader (to get the field names), and then create a named tuple to be able to quickly select the desired columns for the output (Item, Cost, Price, Name). Column order is important, hence the use of DictReader and namedtuple.
While there is the possibility of hard-coding each of the field names, I felt that if the program can read them on file open, it would be much more helpful when working on similar files that have the same column names but different column organization.
Research
CSV Header and named tuple:
What is the pythonic way to read CSV file data as rows of namedtuples?
Converting CSV data to tuple: How to split a CSV row so row[0] is the name and any remaining items are a tuple?
There were additional links of research, but I cannot post more than two.
You have three problems with this:
You return on the first failure, so it will never get past the first line.
You are reading strings from the file, and comparing to an int.
_make iterates over the dictionary keys, not the values, producing the wrong result (item_data(Item='Name', Name='Price', Cost='Qty', Qty='Item', Price='Cost', Description='Description')).
for row in (item_data(**data) for data in read_data):
if row.Item == str(item):
return row
return failure
This fixes the issues at hand - we check against a string, and we only return if none of the items matched (although you might want to begin converting the strings to ints in the data rather than this hackish fix for the string/int issue).
I have also changed the way you are looping - using a generator expression makes for a more natural syntax, using the normal construction syntax for named attributes from a dict. This is cleaner and more readable than using _make and map(). It also fixes problem 3.

Categories