I have this code:
def replaceJSONFilesList(JSONFilePath, JSONsDataPath, newJSONData):
    JSONFileHandleOpen = open(JSONFilePath, 'r')
    ReadedJSONObjects = json.load(JSONFileHandleOpen)
    JSONFileHandleOpen.close()
    ReadedJSONObjectsModifyingSector = ReadedJSONObjects[JSONsDataPath]
    for newData in newJSONData:
        ReadedJSONObjectsModifyingSector.append(newData)
    JSONFileHandleWrite = open(JSONFilePath, 'w')
    json.dump(ReadedJSONObjects, JSONFileHandleWrite)
    JSONFileHandleWrite.close()

def modifyJSONFile(Path):
    JSONFilePath = '/path/file'
    JSONsDataPath = "['first']['second']"
    newJSONData = 'somedata'
    replaceJSONFilesList(JSONFilePath, JSONsDataPath, newJSONData)
Now I have an error:
KeyError: "['first']['second']"
But if I try:
ReadedJSONObjectsModifyingSector = ReadedJSONObjects['first']['second']
Everything is okay.
How should I send the path to the list inside the JSON dictionary from one function to the other?
You cannot pass language syntax elements as if they were data strings. Similarly, you could not pass the string "2 > 1 and False", and expect the function to be able to insert that into an if condition.
Instead, extract the data items and pass them as separate strings (which matches their syntax in the calling routine), or as a tuple of strings. For instance:
JSONsDataPath = ('first', 'second')
...
Then, inside the function ...
ReadedJSONObjects[JSONsDataPath[0]][JSONsDataPath[1]]
If you have a variable sequence of indices, then you need to write code to handle that case; research that on Stack Overflow.
The iterative way to handle an unknown quantity of indices is like this:
obj = ReadedJSONObjects
for index in JSONsDataPath:
    obj = obj[index]
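Wrapped up as a function, a minimal sketch (the helper name and sample data here are my own):

```python
def get_nested(obj, path):
    """Follow a sequence of keys down into nested dictionaries."""
    for key in path:
        obj = obj[key]  # raises KeyError if any key along the path is missing
    return obj

data = {'first': {'second': [1, 2]}}
result = get_nested(data, ('first', 'second'))  # the inner list [1, 2]
```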
Related
I am having an issue where I am trying to yield dictionaries back to the caller and cast the returned generator to a list, but when I print event_list, it still says it is a generator object.
My goal is to multi-process a function over a list of files; each call builds a local dictionary and returns it to the caller, so that I can make a single list containing the returned dictionaries from that method. I'm not entirely sure where I am going wrong.
import multiprocessing as mp
import json

class Events(object):
    def __init__(self):
        self._parse_events()

    def _parse_events(self):
        my_list = ['file1', 'file2', 'file3']
        event_results = list()
        with mp.Pool() as pool:
            results = list(pool.map(self._get_event, my_list))
        for result in results:
            event_results.append(result)
        print(event_results)  # <------- this somehow holds generators, although I thought I cast the return to a list
        print(sum(event_results, []))  # <--------- this doesn't work now that I'm dealing with generators rather than lists

    def _get_event(self, filename):
        key_identifier = 'role'
        with open(filename, 'r') as data:
            for line in data:
                if key_identifier in line:
                    temp_dict = dict()
                    try:
                        contents = json.loads(line)
                        temp_dict['UTC'] = contents.get('utc', 'None')
                        temp_dict['ServiceID'] = contents[key_identifier].get('ServiceID', 'None')
                    except (KeyError, ValueError):
                        continue
                    if temp_dict:
                        yield temp_dict
Your code is creating a list of generators. It's not the top-level object that has the wrong type; it's the inner values, and you're not converting those at all. You may have intended to, since you currently have a mostly pointless extra loop that moves the generator objects from results into event_results without doing anything else to them.
You could change that loop to put the inner values into the list:
for result in results:
    event_results.extend(result)  # extend consumes an iterable
Or if you want a list of lists, rather than just one single flat list, you could do:
for result in results:
    event_results.append(list(result))  # convert each generator into a list
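To see the difference between the two loops without the multiprocessing machinery, here is a small stand-alone illustration (the generator and its data are made up):

```python
def gen(n):
    """Stand-in for _get_event: yields dictionaries."""
    for i in range(n):
        yield {'id': i}

results = [gen(2), gen(2)]  # what pool.map effectively hands back

flat = []
for result in results:
    flat.extend(result)          # one flat list of dicts

nested = []
for result in [gen(2), gen(2)]:
    nested.append(list(result))  # a list of lists of dicts
```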
I have a string 'request.context.user_id' and I want to split the string by '.' and use each element in the list as a dictionary key. Is there a way to do this for lists of varying lengths without trying to hard code all the different possible list lengths after the split?
parts = string.split('.')
if len(parts) == 1:
    data = [x for x in logData if x[parts[0]] in listX]
elif len(parts) == 2:
    data = [x for x in logData if x[parts[0]][parts[1]] in listX]
else:
    print("Add more hard code")
listX is a list of string values that should be matched by x[parts[0]][parts[1]].
logData is a list obtained from reading a JSON file; the list can then be read into a dataframe using json_normalize. The df portion is provided to give some context about its structure: a list of dicts.
import json
from pandas.io.json import json_normalize

with open(project_root+"filename") as f:
    logData = json.load(f)
df = json_normalize(logData)
If you want arbitrary counts, that means you need a loop. You can index repeatedly to drill down through the layers of dictionaries.
parts = "request.context.user_id".split(".")
logData = [{"request": {"context": {"user_id": "jim"}}}]
listX = ["jim"]

def generate(logData, parts):
    for x in logData:
        ref = x
        # ref will be, successively, x, then the 'request' dictionary, then the
        # 'context' dictionary, then the 'user_id' value 'jim'.
        for key in parts:
            ref = ref[key]
        if ref in listX:
            yield x

data = list(generate(logData, parts))  # [{'request': {'context': {'user_id': 'jim'}}}]
I just realized in the comments you said that you didn't want to create a new dictionary, but rather access an existing one, x, by chaining up the parts in the list.
(3.b) use a for loop to get/set the value at the key path
In case you want to only read the value at the end of the path:
import copy

def get_val(key_list, dict_):
    reduced = copy.deepcopy(dict_)
    for i in range(len(key_list)):
        reduced = reduced[key_list[i]]
    return reduced

# this solution isn't mine, see the link below
def set_val(dict_, key_list, value_):
    for key in key_list[:-1]:
        dict_ = dict_.setdefault(key, {})
    dict_[key_list[-1]] = value_
get_val()
Where the key_list is the result of string.split('.') and dict_ is the x dictionary in your case.
You can leave out the copy.deepcopy() part; that's just for paranoid people like me. The reason is that a Python dict is not immutable, so working on a deepcopy (a separate but exact copy in memory) is the safe option.
set_val() As I said, it's not my idea; credit to @Bakuriu.
dict.setdefault(key, default_value) will take care of non-existing keys in x.
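For instance, starting from an empty dict (toy data of my own):

```python
x = {}
key_list = ['request', 'context', 'user_id']

d = x
for key in key_list[:-1]:
    d = d.setdefault(key, {})  # creates each missing level as an empty dict
d[key_list[-1]] = 'jim'
# x is now {'request': {'context': {'user_id': 'jim'}}}
```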
(3) evaluating a string as code with eval() and/or exec()
So here's an ugly unsafe solution:
def chainer(key_list):
    new_str = ''
    for key in key_list:
        new_str = "{}['{}']".format(new_str, key)
    return new_str
x = {'request': {'context': {'user_id': 'is this what you are looking for?'}}}
keys = 'request.context.user_id'.split('.')
chained_keys = chainer(keys)
# quite dirty but you may use eval() to evaluate a string
print( eval("x{}".format(chained_keys)) )
# will print
is this what you are looking for?
which is the innermost value of the mockup x dict
I assume you could use this in your code like this
data = [x for x in logData if eval("x{}".format(chained_keys)) in listX]
# or in python 3.x with f-string
data = [x for x in logData if eval(f"x{chained_keys}") in listX]
...or something similar.
Similarly, you can use exec() to execute a string as code if you wanted to write to x, though it's just as dirty and unsafe.
exec("x{} = '...or this, maybe?'".format(chained_keys))
print(x)
# will print
{'request': {'context': {'user_id': '...or this, maybe?'}}}
(2) An actual solution could be a recursive function, like so:
def nester(key_list):
    if len(key_list) == 0:
        return 'value'  # can change this to whatever you like
    else:
        return {key_list.pop(0): nester(key_list)}
keys = 'request.context.user_id'.split('.')
# ['request', 'context', 'user_id']
data = nester(keys)
print(data)
# will result
{'request': {'context': {'user_id': 'value'}}}
(1) A solution with a list comprehension to split the string by '.' and use each element in the list as a dictionary key
data = {}
parts = 'request.context.user_id'.split('.')
if parts:  # one or more items
    [data.update({part: 'value'}) for part in parts]
print(data)
# the result
{'request': 'value', 'context': 'value', 'user_id': 'value'}
You can overwrite the values in data afterwards.
Say you have a piece of code that accepts either a list or a file name, and must filter through each item of either one provided by applying the same criteria:
import argparse

parser = argparse.ArgumentParser()
group = parser.add_mutually_exclusive_group(required = True)
group.add_argument('-n', '--name', help = 'single name', action = 'append')
group.add_argument('-N', '--names', help = 'text file of names')
args = parser.parse_args()

results = []
if args.name:
    # We are dealing with a list.
    for name in args.name:
        name = name.strip().lower()
        if name not in results and len(name) > 6:
            results.append(name)
else:
    # We are dealing with a file name.
    with open(args.names) as f:
        for name in f:
            name = name.strip().lower()
            if name not in results and len(name) > 6:
                results.append(name)
I'd like to remove as much redundancy as possible in the above code. I tried creating the following function for strip and lower, but it didn't remove much repeat code:
def getFilteredName(name):
    return name.strip().lower()
Is there any way to iterate over both a list and a file in the same function? How should I go about reducing as much code as possible?
You have duplicate code that you can simplify: lists and file objects are both iterables. If you create a method that takes an iterable and returns the correct output, you have less code duplication (DRY).
Choice of data structure:
You do not want duplicate items, meaning set() or dict() are better suited to collect the data you want to parse: they eliminate duplicates by design, which is faster than checking whether an item is already in a list.
If the order of names matters, use
an OrderedDict from collections on Python 3.6 or earlier, or
a normal dict on 3.7 or later (dicts guarantee insertion order);
more info: Are dictionaries ordered in Python 3.6+?
If name order is not important, use a set().
Either one of the above choices removes duplicates for you.
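A quick illustration of the order-preserving de-duplication this relies on (sample names are made up):

```python
names = ['Alice', 'BOB ', 'alice', 'Carol']
lowered = [n.strip().lower() for n in names]

# dict keys are unique; on 3.7+ a plain dict keeps insertion order
ordered_unique = list(dict.fromkeys(lowered))  # ['alice', 'bob', 'carol']

# a set is also unique, but its iteration order is arbitrary
unordered_unique = set(lowered)
```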
import argparse
from collections import OrderedDict  # use a normal dict on 3.7+; it has input order

def get_names(args):
    """Takes an iterable and returns a list of all unique lower-cased elements
    that are longer than 6 characters."""
    seen = OrderedDict()  # or dict or set

    def add_names(iterable):
        """Takes care of adding the stuff to your return collection."""
        k = [n.strip().lower() for n in iterable]  # do the strip().lower()ing only once
        # using a generator expression to update - use .add() for set()
        seen.update((n, None) for n in k if len(n) > 6)

    if args.name:
        # We are dealing with a list:
        add_names(args.name)
    elif args.names:
        # We are dealing with a file name:
        with open(args.names) as f:
            add_names(f)

    # return as list
    return list(seen)
Test code:
parser = argparse.ArgumentParser()
group = parser.add_mutually_exclusive_group(required = True)
group.add_argument('-n', '--name', help = 'single name', action = 'append')
group.add_argument('-N', '--names', help = 'text file of names')
args = parser.parse_args()
results = get_names(args)
print(results)
Output for -n Joh3333n -n Ji3333m -n joh3333n -n Bo3333b -n bo3333b -n jim:
['joh3333n', 'ji3333m', 'bo3333b']
Input file:
with open("names.txt","w") as names:
    for n in ["a"*k for k in range(1,10)]:
        names.write(f"{n}\n")
Output for -N names.txt:
['aaaaaaa', 'aaaaaaaa', 'aaaaaaaaa']
Subclass list and make the subclass a context manager:
class F(list):
    def __enter__(self):
        return self
    def __exit__(self, *args, **kwargs):
        pass
Then the conditional can decide what to iterate over
if args.name:
    # We are dealing with a list.
    thing = F(args.name)
else:
    # We are dealing with a file name.
    thing = open(args.names)
And the iteration code can be factored out.
results = []
with thing as f:
    for name in f:
        name = name.strip().lower()
        if name not in results and len(name) > 6:
            results.append(name)
Here is a similar solution that makes an io.StringIO object from either the file or the list then uses a single set of instructions to process them.
import io

if args.name:
    # We are dealing with a list.
    f = io.StringIO('\n'.join(args.name))
else:
    # We are dealing with a file name.
    with open(args.names) as fileobj:
        f = io.StringIO(fileobj.read())

results = []
for name in f:
    name = name.strip().lower()
    if name not in results and len(name) > 6:
        results.append(name)
If the file is huge and memory is scarce, this has the disadvantage of reading the entire file into memory.
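If that matters, one variation (my own sketch, not from the answers above) is to skip the StringIO copy and hand either iterable straight to a shared helper, since a file object already yields lines lazily:

```python
def filter_names(iterable):
    """Strip, lower-case, de-duplicate, and keep names longer than 6 characters."""
    results = []
    for name in iterable:
        name = name.strip().lower()
        if name not in results and len(name) > 6:
            results.append(name)
    return results

# works on a list...
print(filter_names(['Jonathan\n', 'JONATHAN ', 'Al']))  # ['jonathan']
# ...and equally on an open file object, one line at a time, without
# reading the whole file into memory
```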
Suppose I have the following function:
def function3(start, end):
    """Read MO information."""
    config_found = False
    var = []
    for line in v['molecular orbital primitive coefficients']:
        if line.strip() == end:
            config_found = False
        elif config_found:
            i = line.rstrip()
            var.append(i)
        elif line.strip() == start:
            config_found = True
    var1 = [elem.strip() for elem in var]
    var2 = var1[1:-1]
    var3 = np.array([line.split() for line in var2])
    var3 = np.asarray([list(map(float, item)) for item in var3])
    return var3
And suppose I store its output in variables like so:
monumber1=function3('1','2')
monumber2=function3('2','3')
monumber3=function3('3','4')
etc.
Is there a way for me to execute this function a set number of times and store the output in a set number of variables without manually setting the variable name and function arguments every time? Maybe using a for loop? This is my attempt, but I'm struggling to make it functional:
for i in xrange(70):
    monumber[x] = function3([i],[i+1])
Thank you!
The problem is your use of square brackets. Here is code that should work:
monumber = []  # make it an empty list
for i in xrange(70):
    monumber.append(function3(str(i), str(i+1)))  # you had string integers, so cast
For the more Pythonic one-liner, you can use a list comprehension:
monumber = [function3(str(i),str(i+1)) for i in xrange(70)]
Now that the monumber variable has been created, I can access the element at any given index i using the syntax monumber[i]. Some examples:
first = monumber[0] # gets the first element of monumber
last = monumber[-1] # gets the last index of monumber
for i in xrange(10,20):  # starts at i = 10 and ends at i = 19
    print(monumber[i])   # print the i-th element of monumber
You've almost got it. Except you should use i on the left hand side, too:
monumber[i] = function3([i],[i+1])
Now, this is the basic idea, but the code will only work if monumber is already a list with enough elements, otherwise an IndexError will occur.
Instead of creating a list and filling it with placeholders in advance, we can dynamically append new values to it:
monumber = []
for i in xrange(70):
    monumber.append(function3([i],[i+1]))
Another problem is that you seem to be confusing the types of arguments your function works with. In the function body, it looks like start and end are strings, but in your code you pass two lists with one integer each. Without changing the function, you can do:
monumber = []
for i in xrange(70):
    monumber.append(function3(str(i), str(i+1)))
So I'm running into an issue trying to get my dictionary to change within a function without returning anything. Here is my code:
def load_twitter_dicts_from_file(filename, emoticons_to_ids, ids_to_emoticons):
    in_file = open(filename, 'r')
    emoticons_to_ids = {}
    ids_to_emoticons = {}
    for line in in_file:
        data = line.split()
        if len(data) > 0:
            emoticon = data[0].strip('"')
            id = data[2].strip('"')
            if emoticon not in emoticons_to_ids:
                emoticons_to_ids[emoticon] = []
            if id not in ids_to_emoticons:
                ids_to_emoticons[id] = []
            emoticons_to_ids[emoticon].append(id)
            ids_to_emoticons[id].append(emoticon)
Basically, what I'm trying to do is pass in two dictionaries and fill them with information from the file, which works fine, but after I call it in main and try to print the two dictionaries, it says they are empty. Any ideas?
def load_twitter_dicts_from_file(filename, emoticons_to_ids, ids_to_emoticons):
    …
    emoticons_to_ids = {}
    ids_to_emoticons = {}
These two lines replace whatever you pass to the function. So if you passed two dictionaries to the function, those dictionaries are never touched. Instead, you create two new dictionaries which are never passed to the outside.
If you want to mutate the dictionaries you pass to the function, then remove those two lines and create the dictionaries first.
Alternatively, you could also return those two dictionaries from the function at the end:
return emoticons_to_ids, ids_to_emoticons
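A minimal illustration of the difference between mutating and rebinding (toy functions, not the original code):

```python
def fill_in_place(d):
    d['a'] = 1   # mutates the dictionary the caller passed in

def rebind(d):
    d = {}       # rebinds the local name only; the caller's dict is untouched
    d['a'] = 1

mutated = {}
fill_in_place(mutated)   # mutated == {'a': 1}

untouched = {}
rebind(untouched)        # untouched == {}
```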