Unable to assign Python dataframe to values properly - python

I'm trying to get the result of a function, which is a DataFrame, and assign it to variables that will then be returned in an API call response.
However, the variables I get have [''] around each value.
result = callreport_sentiment2('','i love this send payments')
print(result)
print(type(result))
ROW_ID, KEYWORDS, line_Sentence, LINE_POS_SCORE = callreport_sentiment2('', 'i love this send payments').values.T
I use this code to put the variable value into the response:
response = {
    "KEYWORDS": KEYWORDS,
    "operation": 'sentiment'
}
Here is the result I got:
PAR_ROW_ID KEYWORDS line_Sentence LINE_POS_SCORE
0 Missing_ROW_ID send payments i love this send payments 0
<class 'pandas.core.frame.DataFrame'>
['Missing_ROW_ID'] ['send payments'] ['i love this send payments'] ['0']
...
raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type ndarray is not JSON serializable
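A minimal sketch of one way around this, assuming the function always returns a single-row DataFrame with the four columns shown above. The DataFrame below is a hand-built stand-in for the output of callreport_sentiment2, which is not available here. Taking the first row with .iloc[0] and calling .tolist() yields plain Python values rather than one-element arrays, so json can serialize them.

```python
import json

import pandas as pd

# Stand-in for the DataFrame returned by callreport_sentiment2 (assumed shape).
df = pd.DataFrame({
    "PAR_ROW_ID": ["Missing_ROW_ID"],
    "KEYWORDS": ["send payments"],
    "line_Sentence": ["i love this send payments"],
    "LINE_POS_SCORE": ["0"],
})

# .values.T yields one 1-element ndarray per column, hence the [''] wrappers.
# Taking the first row and converting it to a list gives plain Python values.
row_id, keywords, line_sentence, line_pos_score = df.iloc[0].tolist()

response = {"KEYWORDS": keywords, "operation": "sentiment"}
payload = json.dumps(response)  # no ndarray inside, so this serializes
print(payload)
```

If the function can return more than one row, something like df["KEYWORDS"].tolist() would give a JSON-friendly list for the whole column instead.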

Related

PySpark: Converting config string items for dataframe operation

Input:
source_dataframe = spark.createDataFrame(
[
(1,"1", "2020-01-01",10),
(1,"2", "2020-01-01",20),
(1,"2", "2020-02-01",30)
],
("country_code", "cust_id","day","value")
)
Config:
input_config = """
[ {
"source":"source_dataframe",
"opearation":"max",
"group":["country_code", "cust_id"]
}
]
"""
import json
config_dict = json.loads(input_config)
print(config_dict)
Read from the config and apply the operation to the input dataframe. Here I have hardcoded the dataframe (source_dataframe) and the operation (max). This works fine:
for each in config_dict:
    result = source_dataframe.groupBy(["country_code", "cust_id"]).agg(max("value"))
    result.show()
However, instead of hardcoding, if I try to read the dataframe and the operation from the config dynamically and apply them to the input, I run into different errors. This could be because the config values are read as strings. How do I convert the string objects so that they work?
Error: 'str' object has no attribute 'groupBy'
result = each['source'].groupBy(["country_code", "cust_id"]).agg(max("value"))
Error: TypeError: 'str' object is not callable
result = source_dataframe.groupBy(["country_code", "cust_id"]).agg(each['opearation']("value"))
This section where I read groupBy dynamically works fine.
result = source_dataframe.groupBy(each["group"]).agg(max("value"))
I tried looking at other posts but could not figure out a solution. Can anyone please help?
Maybe you should evaluate the string, which would give you access to the underlying dataframe.
result = eval(each['source']).groupBy(["country_code", "cust_id"]).agg(max("value"))
I can't verify this, since I got an error running the first part of your code.
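An alternative to eval is to resolve the config strings through explicit lookup dictionaries, so that only names you have registered can be referenced from the config. The sketch below is an assumption-laden illustration rather than a tested PySpark solution: it uses pandas as a stand-in so that it runs without a Spark session, and the DATAFRAMES and OPERATIONS registries are hypothetical names. The config key is spelled "opearation" as in the question.

```python
import json

import pandas as pd

# pandas stand-in for the Spark dataframe from the question.
source_dataframe = pd.DataFrame({
    "country_code": [1, 1, 1],
    "cust_id": ["1", "2", "2"],
    "day": ["2020-01-01", "2020-01-01", "2020-02-01"],
    "value": [10, 20, 30],
})

input_config = """
[{"source": "source_dataframe", "opearation": "max", "group": ["country_code", "cust_id"]}]
"""

# Hypothetical registries: map each config string to the real object.
DATAFRAMES = {"source_dataframe": source_dataframe}
OPERATIONS = {"max": "max", "min": "min", "sum": "sum"}

results = []
for each in json.loads(input_config):
    frame = DATAFRAMES[each["source"]]          # string -> dataframe object
    operation = OPERATIONS[each["opearation"]]  # string -> aggregation
    result = frame.groupby(each["group"], as_index=False)["value"].agg(operation)
    results.append(result)
    print(result)
```

In PySpark the same idea would presumably map operation names to callables from pyspark.sql.functions (for example via getattr on that module) and then call the looked-up function on the column name.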

Import JSON key value into pandas dataframe error

I'm trying to fetch a JSON response and load it into a pandas dataframe. The JSON has the following format:
{"#odata.context":"https://was-p.bcnet.bcb.gov.br/olinda/servico/PTAX/versao/v1/odata$metadata#_CotacaoDolarPeriodo(cotacaoCompra,cotacaoVenda,dataHoraCotacao)","value":[{"cotacaoCompra":1.80030,"cotacaoVenda":1.80110,"dataHoraCotacao":"2000-01-03 19:43:00.0"},{"cotacaoCompra":1.83290,"cotacaoVenda":1.83370,"dataHoraCotacao":"2000-01-04 19:13:00.0"},{"cotacaoCompra":1.85360,"cotacaoVenda":1.85440,"dataHoraCotacao":"2000-01-05 19:11:00.0"},...{"cotacaoCompra":1.83840,"cotacaoVenda":1.83920,"dataHoraCotacao":"2000-05-25 19:14:00.0"}]}
However, I want only the contents of its 'value' key, as a pandas dataframe in the following format:
cotacaoCompra  cotacaoVenda  dataHoraCotacao
1.80030        1.80110       2000-01-03 19:43:00.0
1.83290        1.83370       2000-01-04 19:13:00.0
...            ...           ...
1.83840        1.83920       2000-05-25 19:14:00.0
But when I run the code, the following error appears: ValueError: Invalid file path or buffer object type: <class 'dict'>
Here is the code I am using:
url = 'https://olinda.bcb.gov.br/olinda/servico/PTAX/versao/v1/odata/CotacaoDolarPeriodo(dataInicial=@dataInicial,dataFinalCotacao=@dataFinalCotacao)?@dataInicial=%2701-01-2000%27&@dataFinalCotacao=%2711-01-2020%27&$top=100&$format=json&$select=cotacaoCompra,cotacaoVenda,dataHoraCotacao'
resp = requests.get(url=url)
data = resp.json()
json.dumps(data['value'])[1:-1]
json.loads(json.dumps(data['value']))
df = pd.read_json(data)
df.head()
Can you help me move json to the dataframe in the format above?
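A minimal sketch of one likely fix: pd.read_json expects a path, URL, or file-like object, while resp.json() has already parsed the response into a dict, so the list under 'value' can be passed straight to the DataFrame constructor. The data dict below is a hand-copied two-record sample of the response shown above, used so that the example runs without a network call.

```python
import pandas as pd

# Sample of what resp.json() returns (two records copied from the question).
data = {
    "value": [
        {"cotacaoCompra": 1.80030, "cotacaoVenda": 1.80110,
         "dataHoraCotacao": "2000-01-03 19:43:00.0"},
        {"cotacaoCompra": 1.83290, "cotacaoVenda": 1.83370,
         "dataHoraCotacao": "2000-01-04 19:13:00.0"},
    ]
}

# A list of dicts maps directly to rows; the keys become the column names.
df = pd.DataFrame(data["value"])
print(df.head())
```

With the real response, the same line would be df = pd.DataFrame(resp.json()['value']); the json.dumps and json.loads calls in the question are not needed.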

How to pass row of dataframe into API if System.ArgumentException being raised?

I have a df that looks like this:
id
1
2
3
I need to iterate through the dataframe (df has only one column) and pass each ID into an API call, after the equals sign of the &leadId= parameter, e.g. &leadId=1.
I have been trying this code:
lst = []
for index,row in df.iterrows():
    url = 'https://url.com/path/to?username=string&password=string&leadId=index'
    xml_data1 = requests.get(url).text
    lst.append(xml_data1)
    print(xml_data1)
but I get error:
System.ArgumentException: Cannot convert index to System.Int32.
What am I doing wrong that stops the index value from being passed into the API parameter? I have also tried passing row into the API and get the same error.
Thank you in advance.
edit:
I converted the dataframe to int with this line of code:
df = df.astype(int)
and changed index to row in the API parameters:
for index,row in df.iterrows():
    url = 'https://url.com/path/to?username=string&password=string&leadId=row'
    xml_data1 = requests.get(url).text
    lst.append(xml_data1)
    print(xml_data1)
I am getting the same error.
edit2:
full traceback:
System.ArgumentException: Cannot convert index to System.Int32.
Parameter name: type ---> System.FormatException: Input string was not in a correct format.
at System.Number.StringToNumber(String str, NumberStyles options, NumberBuffer& number, NumberFormatInfo info, Boolean parseDecimal)
at System.Number.ParseInt32(String s, NumberStyles style, NumberFormatInfo info)
at System.String.System.IConvertible.ToInt32(IFormatProvider provider)
at System.Convert.ChangeType(Object value, Type conversionType, IFormatProvider provider)
at System.Web.Services.Protocols.ScalarFormatter.FromString(String value, Type type)
--- End of inner exception stack trace ---
at System.Web.Services.Protocols.ScalarFormatter.FromString(String value, Type type)
at System.Web.Services.Protocols.ValueCollectionParameterReader.Read(NameValueCollection collection)
at System.Web.Services.Protocols.HttpServerProtocol.ReadParameters()
at System.Web.Services.Protocols.WebServiceHandler.CoreProcessRequest()
You should convert your int to the type that the API expects:
df.id=df.id.astype('int32')
With your current URL string, the API receives the literal text 'row' (or 'index'), which is a string, and tries to convert it to an integer, which is not possible.
Update your URL string so that the value from the id column is substituted in:
url = 'https://url.com/path/to?username=string&password=string&leadId={}'.format(row['id'])
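As a rough sketch of the same idea, the loop can iterate over the id column directly and let urlencode build the query string. The base URL and credentials are the placeholders from the question, and the requests.get call is left out so that the example runs offline.

```python
from urllib.parse import urlencode

import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3]})

BASE_URL = "https://url.com/path/to"  # placeholder URL from the question

urls = []
for lead_id in df["id"]:
    # int() turns the NumPy integer into a plain Python int before encoding.
    query = urlencode({
        "username": "string",
        "password": "string",
        "leadId": int(lead_id),
    })
    urls.append(f"{BASE_URL}?{query}")

print(urls[0])
```

Each entry of urls could then be passed to requests.get(url).text as in the original loop.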

pandas read_json: "If using all scalar values, you must pass an index"

I have some difficulty in importing a JSON file with pandas.
import pandas as pd
map_index_to_word = pd.read_json('people_wiki_map_index_to_word.json')
This is the error that I get:
ValueError: If using all scalar values, you must pass an index
The file structure is simplified like this:
{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}
It is from the machine learning course of University of Washington on Coursera. You can find the file here.
Try
ser = pd.read_json('people_wiki_map_index_to_word.json', typ='series')
That file only contains key value pairs where values are scalars. You can convert it to a dataframe with ser.to_frame('count').
You can also do something like this:
import json
with open('people_wiki_map_index_to_word.json', 'r') as f:
    data = json.load(f)
Now data is a dictionary. You can pass it to a dataframe constructor like this:
df = pd.DataFrame({'count': data})
You can do as @ayhan mentions, which will give you a column-based format.
Or you can enclose the object in [ ] (source) as shown below to give you a row format, which is convenient if you are loading multiple values and planning to use a matrix for your machine learning models.
df = pd.DataFrame([data])
I think what is happening is that the data in
map_index_to_word = pd.read_json('people_wiki_map_index_to_word.json')
is being read as a string instead of as JSON:
{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}
is actually
'{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}'
Since a string is a scalar, pandas asks for an index. You have to convert it to a dict first, which is exactly what the other response is doing.
The best way is to call json.loads on the string to convert it to a dict and load that into pandas, wrapped in a list so that the scalar values form one row:
with open('people_wiki_map_index_to_word.json') as f:
    myfile = f.read()
jsonData = json.loads(myfile)
df = pd.DataFrame([jsonData])
{
"biennials": 522004,
"lb915": 116290
}
df = pd.read_json('values.json')
pd.read_json expects a list for each key, like this:
{
    "biennials": [522004],
    "lb915": [116290]
}
so with scalar values it returns an error saying
If using all scalar values, you must pass an index.
You can resolve this by specifying the 'typ' argument in pd.read_json:
map_index_to_word = pd.read_json('Datasets/people_wiki_map_index_to_word.json', typ='series')
For newer pandas, 0.19.0 and later, use the lines parameter, set it to True.
The file is read as a json object per line.
import pandas as pd
map_index_to_word = pd.read_json('people_wiki_map_index_to_word.json', lines=True)
It fixed the following errors that I encountered, especially when some of the JSON files have only one value:
ValueError: If using all scalar values, you must pass an index
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
ValueError: Trailing data
For example
cat values.json
{
    "name": "Snow",
    "age": "31"
}
df = pd.read_json('values.json')
Chances are you might end up with this
Error: if using all scalar values, you must pass an index
Pandas looks for a list or dictionary in each value. Something like
cat values.json
{
    "name": ["Snow"],
    "age": ["31"]
}
So try reading the file as a series and wrapping it in a list. Later on you can convert it to HTML with to_html():
df = pd.DataFrame([pd.read_json(report_file, typ='series')])
result = df.to_html()
I solved this by converting it into an array like so
[{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}]
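To compare the main suggestions above side by side, here is a small sketch that uses a three-key excerpt of the file contents as an in-memory string (wrapped in StringIO so that no file is needed). It shows the typ='series' route, the column layout from to_frame, and the one-row layout from wrapping the dict in a list.

```python
import json
from io import StringIO

import pandas as pd

# Three-key excerpt of the file contents from the question.
text = '{"biennials": 522004, "lb915": 116290, "shatzky": 127647}'

# Option 1: read the scalar mapping as a Series (keys become the index).
ser = pd.read_json(StringIO(text), typ="series")

# Option 2: one row per word, with a single 'count' column.
by_word = ser.to_frame("count")

# Option 3: a single row with one column per word.
one_row = pd.DataFrame([json.loads(text)])

print(by_word)
print(one_row)
```

Which layout is more convenient depends on whether the words are used as row labels for lookups or as feature columns.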

Python coinbase API prices as float

I am trying to work with Coinbase's API and would like to use their prices as floats, but the call returns an API object and I don't know how to convert it.
For example if I call client.get_spot_price() it will return this:
{
"amount": "316.08",
"currency": "USD"
}
And I just want the 316.08. How can I solve it?
data = {
    "amount": "316.08",
    "currency": "USD"
}
price = float(data['amount'])
With the API, use a JSON parser:
import json
data = client.get_spot_price()
price = float(json.loads(data)['amount'])
print(price)
It looks like JSON output. You could import the json library in Python and parse it with the loads method, for example:
import json
# get data from the API's method
response = client.get_spot_price()
# parse the content of the response using json format
data = json.loads(response)
# get the amount and convert to float
amount = float(data['amount'])
print(amount)
First, put the returned object into a variable and check its type,
like this:
print(type(your_returned_object))
If it is a dictionary, you can access its data by key. A dictionary has this structure:
dict = {key_1 : value_1 , key_2 : value_2, ......key_n : value_n}
You can access any value of the dictionary by its key,
like below:
print(dict[key_1]) # prints value_1
Then you can convert the returned data to an integer or a float.
To convert to an integer:
int(your_data)
To convert to a float:
float(your_data)
If it is not a dictionary, you need to parse it into one with:
json.loads(your_returned_object)
In your case you can do:
variable = client.get_spot_price()
print(type(variable)) # check whether it is a dictionary
print(float(variable["amount"])) # prints your price as a float
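A small self-contained sketch of the conversion step, using the JSON text from the question as a stand-in for the API response (the real client.get_spot_price() object is not available here, and whether it needs json.loads at all depends on the client version). It also shows Decimal as an option when exact decimal arithmetic on prices matters.

```python
import json
from decimal import Decimal

# Stand-in for the API response, copied from the question.
raw = '{"amount": "316.08", "currency": "USD"}'

data = json.loads(raw)            # parse the JSON text into a dict
price = float(data["amount"])     # the amount arrives as a string
exact = Decimal(data["amount"])   # keeps the decimal digits exactly

print(price, exact, data["currency"])
```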
