I am trying to create a JSON file using json.dumps, and it prints successfully.
I have a question.
The format I want is:
from collections import OrderedDict

channel_info = OrderedDict()
table = OrderedDict()
table2 = OrderedDict()
channel_info["KIND1"] = pkind[2].text
table[ptime[10].text] = pnk[11].text
table[ptime[11].text] = pnk[12].text
channel_info["TABLE1"] = table
channel_info["KIND2"] = pkind[2].text
table2[ptime[10].text] = pnk[11].text
table2[ptime[11].text] = pnk[12].text
channel_info["TABLE2"] = table2
result:
{
    "KIND1": "xxxx",
    "TABLE1": {
        "09:10": "aaaa",
        "10:10": "bbbb"
    },
    "KIND2": "yyyy",
    "TABLE2": {
        "09:10": "cccc",
        "10:10": "dddd"
    }
}
How can I output the same format using a while loop?
The names of the JSON objects are KIND1, TABLE1, KIND2, TABLE2, and so on.
I wonder how to change these names dynamically inside a while loop.
Thank you.
You could do something like this (assuming the table dictionary content is the same in each iteration, as it seems to be in the example you give):
channel_info = dict()
# n_tables is the number of iterations you need
for i in range(n_tables):
    table = dict()
    channel_info["KIND%s" % (i+1)] = pkind[1].text
    table[ptime[10].text] = pnk[11].text
    table[ptime[11].text] = pnk[12].text
    channel_info["TABLE%s" % (i+1)] = table
You don't need the table name dynamically since you assign it to a dictionary key.
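As a quick side note, here is a minimal sketch of dumping such a dict with json.dumps (placeholder values standing in for the pkind/ptime/pnk data; on Python 3.7+ plain dicts already preserve insertion order):
import json

# Placeholder result of the loop above; values stand in for the question's data.
channel_info = {
    "KIND1": "xxxx",
    "TABLE1": {"09:10": "aaaa", "10:10": "bbbb"},
    "KIND2": "yyyy",
    "TABLE2": {"09:10": "cccc", "10:10": "dddd"},
}
print(json.dumps(channel_info, indent=4))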
Basically, if I understood your question right:
...
i = 0
no_of_tables = 4
while i <= no_of_tables:
    table_counter = str(i + 1)
    kind = 'KIND' + table_counter
    table_name = 'TABLE' + table_counter
    table = OrderedDict()  # use a separate dict here instead of reusing the name string
    channel_info[kind] = pkind[2].text
    table[ptime[10].text] = pnk[11].text
    table[ptime[11].text] = pnk[12].text
    channel_info[table_name] = table
    i += 1  # increment the counter so the loop terminates
Note: I know it can be optimized, but for the sake of simplicity I left it as is.
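For reference, a slightly tighter sketch of the same idea using a for loop and f-strings (Python 3.6+); the values here are placeholders standing in for the pkind/ptime/pnk data in the question:
from collections import OrderedDict
import json

kind_value = "xxxx"                 # placeholder for pkind[2].text
times = ["09:10", "10:10"]          # placeholders for ptime[...].text
values = ["aaaa", "bbbb"]           # placeholders for pnk[...].text

no_of_tables = 2
channel_info = OrderedDict()
for i in range(1, no_of_tables + 1):
    # Build the dynamic key names and assign a fresh table dict per iteration.
    channel_info[f"KIND{i}"] = kind_value
    channel_info[f"TABLE{i}"] = OrderedDict(zip(times, values))

print(json.dumps(channel_info, indent=4))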
What I get from the API:
"name":"reports"
"col_type":"array<struct<imageUrl:string,reportedBy:string>>"
So in the Hive schema I got:
reports array<struct<imageUrl:string,reportedBy:string>>
Note: I get the Hive array schema as a string from the API.
My target:
bigquery.SchemaField("reports", "RECORD", mode="NULLABLE",
    fields=(
        bigquery.SchemaField('imageUrl', 'STRING'),
        bigquery.SchemaField('reportedBy', 'STRING')
    )
)
Note: I would like to write generic code that can handle any number of struct fields inside the array.
Any tips are welcome.
I tried creating a script that parses your input, which is reports array<struct<imageUrl:string,reportedBy:string>>. It converts your input into a dictionary that can be used as the schema when creating a table. The main idea of the approach is that instead of using SchemaField(), you build a plain dictionary, which is much easier than constructing SchemaField() objects with parameters from your example input.
NOTE: The script is only tested against your input; it can also parse additional fields added inside struct<.
import re
from google.cloud import bigquery

def is_even(number):
    if (number % 2) == 0:
        return True
    else:
        return False

def clean_string(str_value):
    return re.sub(r'[\W_]+', '', str_value)

def convert_to_bqdict(api_string):
    """
    This only works for a struct with multiple fields
    This could give you an idea on constructing a schema dict for BigQuery
    """
    num_even = True
    main_dict = {}
    struct_dict = {}
    field_arr = []
    schema_arr = []

    # Split the input (e.g. "reports array<struct<...>>") into the field name and the type string
    init_struct = api_string.split(' ')
    main_dict["name"] = init_struct[0]
    main_dict["type"] = "RECORD"
    main_dict["mode"] = "NULLABLE"

    cont_struct = init_struct[1].split('<')
    num_elem = len(cont_struct)

    # parse fields inside of struct<
    for i in range(0, num_elem):
        num_even = is_even(i)
        # fields are seen on even indices
        if num_even and i != 0:
            temp = list(filter(None, cont_struct[i].split(',')))  # remove blank elements
            for elem in temp:
                fields = list(filter(None, elem.split(':')))
                struct_dict["name"] = clean_string(fields[0])
                # "type" works for STRING as of the moment; refer to
                # https://cloud.google.com/bigquery/docs/schemas#standard_sql_data_types
                # for the accepted data types
                struct_dict["type"] = clean_string(fields[1]).upper()
                struct_dict["mode"] = "NULLABLE"
                field_arr.append(struct_dict)
                struct_dict = {}

    main_dict["fields"] = field_arr  # assign the list of parsed fields
    schema_arr.append(main_dict)
    return schema_arr
sample = "reports array<struct<imageUrl:string,reportedBy:string,newfield:bool>>"
bq_dict = convert_to_bqdict(sample)
client = bigquery.Client()
project = client.project
dataset_ref = bigquery.DatasetReference(project, '20211228')
table_ref = dataset_ref.table("20220203")
table = bigquery.Table(table_ref, schema=bq_dict)
table = client.create_table(table)
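For reference, tracing the parsing logic on the sample string above should produce a schema list roughly like the following (a sketch derived from the code, not captured from a live run):
# Expected shape of bq_dict for the sample input (assumption, not a verified run):
[
    {
        "name": "reports",
        "type": "RECORD",
        "mode": "NULLABLE",
        "fields": [
            {"name": "imageUrl", "type": "STRING", "mode": "NULLABLE"},
            {"name": "reportedBy", "type": "STRING", "mode": "NULLABLE"},
            {"name": "newfield", "type": "BOOL", "mode": "NULLABLE"},
        ],
    }
]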
I have a question about adding dictionary keys and values to a method call using a loop.
This is what I was thinking of writing, but it doesn't work the way I want because it creates a packet with just one key/value each time:
for key in packetData:
    for name in packetData[key]:
        packets = Ether()/IP()/UDP()/createsPacket(key, name=packetData[key][name])
        print("as name " + name + " \n as value " + str(packetData[key][name]))
Instead of writing it manually like this:
packets1 = Ether()/IP()/UDP()/createsPacket("65", UserID = "name", Password = "pass123", ETX = 123)
packets2 = Ether()/IP()/UDP()/createsPacket("72", PriceID = 123, Side = 12, MaxAmount = 123, MinAmount = 123, Price = 123000)
This is the JSON (converted to a dictionary in Python) that I want to pass in:
{
    "65": {
        "UserID": "vcjazfan",
        "Password": "ejujwlhk",
        "SessionID": 115,
        "ETX": 192
    },
    "66": {
        "UserID": "dzmtrssy",
        "SessionID": 35,
        "Reason": "zbwivjcv",
        "ETX": 43
    },
    "72": {
        "InstrumentIndex": 171,
        "PriceID": 217,
        "Side": 226,
        "MaxAmount": 210,
        "MinAmount": 219,
        "Price": 47,
        "PriceProvider": 207,
        "ETX": 78
    },
Made more generic for easier understanding, hoping it helps.
Generic code:
dictionary = {"65":{ "UserID":"vcjazfan", "Password":"ejujwlhk", "ETX":192} , "72":{ "InstrumentIndex":171, "PriceID":217, } }
#This is what i was thinking to write but it doesn't work how i want because it creates a packet just with one key/value every time
for key in dictionary:
for name in dictionary[key]:
value=dictionary[key][name]
packets = method(key, name=value) # in first iteration when key is 65 , name = "UserID" , value = "vcjazfan"
# in second iteration when key is 65 , name = "Password" , value = "ejujwlhk"
#Instead of writing this manually like that :
packets1 = method("65", UserID = "name", Password = "pass123", ETX = 123)
packets2 = method("72", InstrumentIndex = 123, PriceID = 12,)
This question solved my problem: How to pass dictionary items as function arguments in python?
Solution for my original code:
Allpackets = []
for key in packetData:
    Allpackets.append(packets/createsPacket(key, **packetData[key]))
Solution for the generic one:
dictionary = {"65":{ "UserID":"vcjazfan", "Password":"ejujwlhk", "ETX":192} , "72":{ "InstrumentIndex":171, "PriceID":217, } }
Allpackets = []
for key in dictionary:
Allpackets.append( method(key, **dictionary))
#Instead of writing this manually like that :
packets1 = method("65", UserID = "name", Password = "pass123", ETX = 123)
packets2 = method("72", InstrumentIndex = 123, PriceID = 12,)
I have some functions with SQL statements. My first function is fine because I only get one result.
My second function returns a large list of error codes and I don't know how to return them in the response:
TypeError: <sqlalchemy.engine.result.ResultProxy object at 0x7f98b85ef910> is not JSON serializable
I've tried everything and need help.
My code:
def topalarms():
    customer_name = request.args.get('customer_name')
    machine_serial = request.args.get('machine_serial')
    #ts = request.args.get('ts')
    #ts_start = request.args.get('ts')

    if (customer_name is None) or (machine_serial is None):
        return missing_param()

    # def form_response(response, session):
    #     response['customer'] = customer_name
    #     response['serial'] = machine_serial
    # return do_response(customer_name, form_response)

    def form_response(response, session):
        result_machine_id = machine_id(session, machine_serial)
        if not result_machine_id:
            response['Error'] = 'Seriennummer nicht vorhanden/gefunden'  # "serial number not present/found"
            return
        #response[''] = result_machine_id[0]["id"]
        machineid = result_machine_id[0]["id"]
        result_errorcodes = error_codes(session, machineid)
        response['ErrorCodes'] = result_errorcodes

    return do_response(customer_name, form_response)

def machine_id(session, machine_serial):
    stmt_raw = '''
        SELECT
            id
        FROM
            machine
        WHERE
            machine.serial = :machine_serial_arg
    '''
    utc_now = datetime.datetime.utcnow()
    utc_now_iso = pytz.utc.localize(utc_now).isoformat()
    utc_start = datetime.datetime.utcnow() - datetime.timedelta(days=30)
    utc_start_iso = pytz.utc.localize(utc_start).isoformat()

    stmt_args = {
        'machine_serial_arg': machine_serial,
    }

    stmt = text(stmt_raw).columns(
        #ts_insert = ISODateTime
    )

    result = session.execute(stmt, stmt_args)

    ts = utc_now_iso
    ts_start = utc_start_iso
    ID = []
    for row in result:
        ID.append({
            'id': row[0],
            'ts': ts,
            'ts_start': ts_start,
        })
    return ID

def error_codes(session, machineid):
    stmt_raw = '''
        SELECT
            name
        FROM
            identifier
        WHERE
            identifier.machine_id = :machineid_arg
    '''
    stmt_args = {
        'machineid_arg': machineid,
    }

    stmt = text(stmt_raw).columns(
        #ts_insert = ISODateTime
    )

    result = session.execute(stmt, stmt_args)

    errors = []
    for row in result:
        errors.append(result)
        #({'result': [dict(row) for row in result]})
    #errors = {i: result[i] for i in range(0, len(result))}
    #errors = dict(result)
    return errors
My problem is in the error_codes function; something is wrong with my result.
My output should look like this:
ABCNormal
ABCSafety
Alarm_G01N01
Alarm_G01N02
Alarm_G01N03
Alarm_G01N04
Alarm_G01N05
I think you need to take a closer look at what you are doing correctly with your working function and compare that to your non-working function.
Firstly, what do you think this code does?
for row in result:
    errors.append(result)
This adds to errors one copy of the result object for each row in result. So if you have six rows in result, errors contains six copies of result. I suspect this isn't what you are looking for. You want to be doing something with the row variable.
Taking a closer look at your working function, you are taking the first value out of the row, using row[0]. So, you probably want to do the same in your non-working function:
for row in result:
    errors.append(row[0])
I don't have SQLAlchemy set up so I haven't tested this: I have provided this answer based solely on the differences between your working function and your non-working function.
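As a hedged sketch, the fixed loop can also be written as a list comprehension, which yields a plain Python list that standard JSON serialization can handle (how the response is finally serialized depends on do_response, which isn't shown in the question):
# Sketch only: collect the first column of every row into a plain list.
errors = [row[0] for row in result]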
You need a JSON serializer. I suggest using Marshmallow: https://marshmallow.readthedocs.io/en/stable/
There are some great tutorials online on how to do this.
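A minimal sketch of that suggestion, assuming the rows expose the selected name column; the schema and function names here are illustrative, not from the question:
from marshmallow import Schema, fields

class ErrorCodeSchema(Schema):
    # one field per selected column
    name = fields.Str()

def serialize_error_codes(rows):
    # dump() returns plain dicts/lists, which are JSON serializable
    return ErrorCodeSchema(many=True).dump(rows)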
I need to create the body for multiple updates to a Google Spreadsheet using Python.
I used a Python dict(), but that doesn't work for repeated values because dict() doesn't allow duplicate keys.
My code snippet is:
body = {}

for i in range(0, len(deltaListcolNames)):
    rangeItem = deltaListcolNames[i]
    batch_input_value = deltaListcolVals[i]
    body["range"] = rangeItem
    body["majorDimension"] = "ROWS"
    body["values"] = "[[" + str(batch_input_value) + "]]"

batch_update_values_request_body = {
    # How the input data should be interpreted.
    'value_input_option': 'USER_ENTERED',
    # The new values for the input sheet... to apply to the spreadsheet.
    'data': [
        dict(body)
    ]
}

print(batch_update_values_request_body)

request = service.spreadsheets().values().batchUpdate(
    spreadsheetId=spreadsheetId,
    body=batch_update_values_request_body)
response = request.execute()
Thanks for the answer, Graham.
I doubled back, moved away from the dict paradigm, and found that by using this grid I was able to build the data structure. Here is how I coded it...
Perhaps a bit quirky, but it works nicely:
range_value_data_list = []
width = 1
#
height = 1

for i in range(0, len(deltaListcolNames)):
    rangeItem = deltaListcolNames[i]
    # print(" the value for rangeItem is : ", rangeItem)
    batch_input_value = str(deltaListcolVals[i])
    print(" the value for batch_input_value is : ", batch_input_value)
    # construct the data structure for the value
    grid = [[None] * width for i in range(height)]
    grid[0][0] = batch_input_value
    range_value_item_str = {'range': rangeItem, 'values': (grid)}
    range_value_data_list.append(range_value_item_str)
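For completeness, a hedged sketch of how that list could then be sent in a single batchUpdate call (assuming the same service and spreadsheetId objects as in the question; as far as I know the REST body expects the camelCase key valueInputOption):
batch_update_values_request_body = {
    'valueInputOption': 'USER_ENTERED',   # how the input data should be interpreted
    'data': range_value_data_list,        # one {'range': ..., 'values': ...} dict per update
}

response = service.spreadsheets().values().batchUpdate(
    spreadsheetId=spreadsheetId,
    body=batch_update_values_request_body).execute()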
Review the documentation for the Python client library methods: The data portion is a list of dict objects.
So your construct is close, you just need a loop that fills the data list:
data = []
for i in range(0, len(deltaListcolNames)):
    body = {}
    # fill out the body
    rangeItem = deltaListcolNames[i]
    ....
    # Add this update's body to the array with the other update bodies.
    data.append(body)

# build the rest of the request
...

# send the request
...
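A hedged, filled-in version of that skeleton (assuming deltaListcolNames holds A1 ranges and deltaListcolVals the corresponding single values, as in the question; this is a sketch, not the answerer's exact code):
data = []
for i in range(0, len(deltaListcolNames)):
    body = {
        'range': deltaListcolNames[i],
        'majorDimension': 'ROWS',
        'values': [[str(deltaListcolVals[i])]],  # a real nested list, not a string
    }
    data.append(body)

request_body = {
    'valueInputOption': 'USER_ENTERED',
    'data': data,
}

response = service.spreadsheets().values().batchUpdate(
    spreadsheetId=spreadsheetId, body=request_body).execute()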
Basically, what I am trying to do is generate a JSON list of the SSH keys (public and private) on a server using Python. I am using nested dictionaries, and while it works to an extent, the issue is that it lists every other user's keys as well; I need it to list only the keys that belong to each respective user.
Below is my code:
def ssh_key_info(key_files):
    for f in key_files:
        c_time = os.path.getctime(f)  # gets the creation time of file (f)
        username_list = f.split('/')  # splits on the / character
        user = username_list[2]  # assigns the 2nd field from the above split to the user variable
        key_length_cmd = check_output(['ssh-keygen', '-l', '-f', f])  # run the ssh-keygen command on the file (f)

        attr_dict = {}
        attr_dict['Date Created'] = str(datetime.datetime.fromtimestamp(c_time))  # converts file create time to string
        attr_dict['Key_Length]'] = key_length_cmd[0:5]  # assigns the first 5 characters of the ssh-keygen output

        ssh_user_key_dict[f] = attr_dict
        user_dict['SSH_Keys'] = ssh_user_key_dict
        main_dict[user] = user_dict
A list containing the absolute paths of the keys (/home/user/.ssh/id_rsa, for example) is passed to the function. Below is an example of what I receive:
{
    "user1": {
        "SSH_Keys": {
            "/home/user1/.ssh/id_rsa": {
                "Date Created": "2017-03-09 01:03:20.995862",
                "Key_Length]": "2048 "
            },
            "/home/user2/.ssh/id_rsa": {
                "Date Created": "2017-03-09 01:03:21.457867",
                "Key_Length]": "2048 "
            },
            "/home/user2/.ssh/id_rsa.pub": {
                "Date Created": "2017-03-09 01:03:21.423867",
                "Key_Length]": "2048 "
            },
            "/home/user1/.ssh/id_rsa.pub": {
                "Date Created": "2017-03-09 01:03:20.956862",
                "Key_Length]": "2048 "
            }
        }
    },
As can be seen, user2's key files are included in user1's output. I may be going about this completely wrong, so any pointers are welcome.
Thanks for the replies. I read up on nested dictionaries and found that the best answer on this post helped me solve the issue: What is the best way to implement nested dictionaries?
Instead of all the dictionaries, I simplified the code and now have just one dictionary. This is the working code:
class Vividict(dict):
    def __missing__(self, key):  # sets and returns a new instance
        value = self[key] = type(self)()  # retain local pointer to value
        return value  # faster to return than dict lookup

main_dict = Vividict()

def ssh_key_info(key_files):
    for f in key_files:
        c_time = os.path.getctime(f)
        username_list = f.split('/')
        user = username_list[2]
        key_bit_cmd = check_output(['ssh-keygen', '-l', '-f', f])

        date_created = str(datetime.datetime.fromtimestamp(c_time))
        key_type = key_bit_cmd[-5:-2]
        key_bits = key_bit_cmd[0:5]

        main_dict[user]['SSH Keys'][f]['Date Created'] = date_created
        main_dict[user]['SSH Keys'][f]['Key Type'] = key_type
        main_dict[user]['SSH Keys'][f]['Bits'] = key_bits
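For illustration, a small self-contained sketch of why this works: Vividict creates the intermediate dictionaries on first access, so each user gets their own nested structure (dummy values here, not real key data):
import json

class Vividict(dict):
    def __missing__(self, key):
        value = self[key] = type(self)()
        return value

demo = Vividict()
# Intermediate dicts are created automatically on first access.
demo['user1']['SSH Keys']['/home/user1/.ssh/id_rsa']['Bits'] = '2048'
demo['user2']['SSH Keys']['/home/user2/.ssh/id_rsa']['Bits'] = '4096'
print(json.dumps(demo, indent=4))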