This has taken me over a day of trial and error. I am trying to keep a dictionary of queries and their respective matches in a search. My problem is that there can be one or more matches. My current solution is:
match5[query_site] will already have the first match but if it finds another match it will append it using the code below.
temp5 = []  # temporary variable to create array
if isinstance(match5[query_site], list):  # check if already a list
    temp5.extend(match5[query_site])
    temp5.append(match_site)
else:
    temp5.append(match5[query_site])
match5[query_site] = temp5  # add new location
That if statement is literally to prevent extend converting my str element into an array of letters. If I try to initialize the first match as a single element array I get None if I try to directly append. I feel like there should be a more pythonic method to achieve this without a temporary variable and conditional statement.
Update: Here is an example of my output when it works
5'flank: ['8_73793824', '6_133347883', '4_167491131', '18_535703', '14_48370386']
3'flank: X_11731384
There are 5 matches for my "5'flank" and only 1 match for my "3'flank".
So what about this:
if query_site not in match5:  # here for the first time
    match5[query_site] = [match_site]
elif isinstance(match5[query_site], str):  # was already here, a single occurrence
    match5[query_site] = [match5[query_site], match_site]  # make it a list of strings
else:  # already a list, so just append
    match5[query_site].append(match_site)
I like using setdefault() for cases like this.
temp5 = match5.setdefault(query_site, [])
temp5.append(match_site)
It's sort of like get() in that it returns an existing value if the key exists but you can provide a default value. The difference is that if the key doesn't exist already setdefault inserts the default value into the dict.
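To make the difference concrete, here is a small sketch comparing the two methods. It reuses the sample values from the question; the variable names found and found_again are only illustrative.

```python
match5 = {}

# get() hands back the default but leaves the dict untouched
found = match5.get("5'flank", [])
found.append("8_73793824")
print(match5)   # {}

# setdefault() stores the default under the key before returning it,
# so appending to the returned list changes what the dict holds
found = match5.setdefault("5'flank", [])
found.append("8_73793824")
found_again = match5.setdefault("5'flank", [])   # key exists now, default ignored
found_again.append("6_133347883")
print(match5)   # {"5'flank": ['8_73793824', '6_133347883']}
```

With get() the appended value is lost because the temporary list was never stored in the dict.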
This is all you need to do
if query_site not in match5:
    match5[query_site] = []
temp5 = match5[query_site]
temp5.append(match_site)
You could also do
temp5 = match5.setdefault(query_site, [])
temp5.append(match_site)
Assuming match5 is a dictionary, what about this:
if query_site not in match5:  # first match ever
    match5[query_site] = [match_site]
else:  # entry already there, just append
    match5[query_site].append(match_site)
Make every entry of the dictionary a list from the start, and just append to it.
I have the list below and would like to split it so that all strings starting with the same prefix go into their own new list.
For example:
list1 = ["glibc-2.11.3/include/sys/file.h", "glibc-2.11.3/include/sys/ioctl.h", "glibc-2.11.3/lib/crtn.o", "linux-libc-headers-2.6.32/asm-generic/bitsperlong.h" , "linux-libc-headers-2.6.32/asm-generic/bitsperlong.h", "test-3.7.10/asm/posix_types.h", "test-3.7.10/dsm/posix_types.h"]
Here is my attempt:
list1 = ["glibc-2.11.3/include/sys/file.h", "glibc-2.11.3/include/sys/ioctl.h", "glibc-2.11.3/lib/crtn.o", "linux-libc-headers-2.6.32/asm-generic/bitsperlong.h" , "linux-libc-headers-2.6.32/asm-generic/bitsperlong.h"]
element = list1[0].split("/")[0]
newlist = []
for i in list1:
    if i.startswith(element):
        newlist.append(i)
print newlist
Output: ['glibc-2.11.3/include/sys/file.h', 'glibc-2.11.3/include/sys/ioctl.h', 'glibc-2.11.3/lib/crtn.o']
I get only the first set of paths that start with the same string. I need to loop over the remaining sets as well.
Basically, on the first iteration I expect to get all paths that start with glibc-2.11.3, on the second iteration all paths that start with linux-libc-headers-2.6.32, and so on. I then need to perform some checks on each set of paths that share a prefix. Please help!
Use a dictionary to keep track of your filepaths
list1 = ["glibc-2.11.3/include/sys/file.h", "glibc-2.11.3/include/sys/ioctl.h", "glibc-2.11.3/lib/crtn.o", "linux-libc-headers-2.6.32/asm-generic/bitsperlong.h" , "linux-libc-headers-2.6.32/asm-generic/bitsperlong.h", "test-3.7.10/asm/posix_types.h", "test-3.7.10/dsm/posix_types.h"]
directories = {}
for filepath in list1:
    key = filepath.split("/")[0]
    directories.setdefault(key, []).append(filepath)
print(directories)
Outputs:
{'glibc-2.11.3': ['glibc-2.11.3/include/sys/file.h',
'glibc-2.11.3/include/sys/ioctl.h',
'glibc-2.11.3/lib/crtn.o'],
'linux-libc-headers-2.6.32': ['linux-libc-headers-2.6.32/asm-generic/bitsperlong.h',
'linux-libc-headers-2.6.32/asm-generic/bitsperlong.h'],
'test-3.7.10': ['test-3.7.10/asm/posix_types.h',
'test-3.7.10/dsm/posix_types.h']}
list(directories.values()) would give you the list of lists you were trying to create, but instead of doing that you can just iterate over directories.values() the same way you would iterate over a list of lists. Use directories.items() if you also need the prefix each group belongs to.
dictionary.setdefault(key, []) is a quirky way of saying "give me the list at this dictionary key, or if there is not already a list there, create a new list, save it in the dictionary under this key, and then give me that." See the documentation for dict.setdefault.
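Since the goal was to run some check on each set of paths, here is a rough sketch of what that loop could look like. It swaps setdefault for collections.defaultdict(list), which behaves the same way for this purpose. The list is shortened, and the "check" (whether a group contains the same path twice) is only a placeholder assumption, as is the has_repeats name.

```python
from collections import defaultdict

list1 = [
    "glibc-2.11.3/include/sys/file.h",
    "glibc-2.11.3/lib/crtn.o",
    "linux-libc-headers-2.6.32/asm-generic/bitsperlong.h",
    "linux-libc-headers-2.6.32/asm-generic/bitsperlong.h",
    "test-3.7.10/asm/posix_types.h",
]

# Group the paths by their first path component
directories = defaultdict(list)
for filepath in list1:
    prefix = filepath.split("/", 1)[0]
    directories[prefix].append(filepath)

# Placeholder check (an assumption): does a group list the same path twice?
has_repeats = {}
for prefix, paths in directories.items():
    has_repeats[prefix] = len(paths) != len(set(paths))
    print(prefix, len(paths), has_repeats[prefix])
```

Each pass of the second loop gets one prefix together with all of its paths, which is the "one set per iteration" behaviour asked for.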
I am writing a python script that reads a large JSON file containing data from an API, and iterates through all the objects. I want to extract all objects that have a specific matching/duplicate "key:value", and save it to a separate JSON file.
Currently, I have it almost doing this, however the one flaw in my code that I cannot fix is that it skips the first occurrence of the duplicate object, and does not add it to my dupObjects list. I have an OrderedDict keeping track of unique objects, and a regular list for duplicate objects. I know this means that when I add the second occurrence, I must add the first (unique) object, but how would I create a conditional statement that only does this once per unique object?
This is my code at the moment:
from collections import OrderedDict
import json
with open('input.json') as data:
    data = json.load(data)
uniqueObjects = OrderedDict()
dupObjects = list()
for d in data:
    value = d["key"]
    if value in uniqueObjects:
        # dupObjects.append(uniqueObjects[value])
        dupObjects.append(d)
    if value not in uniqueObjects:
        uniqueObjects[value] = d
with open('duplicates.json', 'w') as g:
    json.dump(dupObjects, g, indent=4)
The commented-out line is where I tried to add the object from the OrderedDict to my list, but that adds it as many times as there are duplicates. I only want it added once.
Edit:
There are several unique objects that have duplicates. I'm looking for some conditional statement that can add the first occurrence of an object that has duplicates, once per unique object.
You could group by key.
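Using itertools:
import itertools
def by_key(element):
    return element["key"]
# groupby only groups consecutive items, so sort by the same key first
grouped_by_key = itertools.groupby(sorted(data, key=by_key), key=by_key)
Then it is just a matter of finding the groups that have more than one element.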
For details check: https://docs.python.org/3/howto/functional.html#grouping-elements
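A minimal sketch of that last step, assuming data is a list of dicts that each have a "key" entry. The sample records and the "id" field are made up for illustration.

```python
import itertools

# Hypothetical sample standing in for the parsed JSON
data = [
    {"key": "a", "id": 1},
    {"key": "b", "id": 2},
    {"key": "a", "id": 3},
    {"key": "c", "id": 4},
    {"key": "b", "id": 5},
]

def by_key(element):
    return element["key"]

dup_objects = []
# groupby only merges adjacent items, so the data is sorted by the same key first
for _, group in itertools.groupby(sorted(data, key=by_key), key=by_key):
    members = list(group)
    if len(members) > 1:          # keep every member of a group with duplicates
        dup_objects.extend(members)

print([d["id"] for d in dup_objects])   # [1, 3, 2, 5]
```

Note that sorting reorders the objects by key, so the result no longer follows the order of the input file.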
In this line you forgot .keys(), so you skip the values you need:
if value in uniqueObjects.keys():
And in this line:
if value not in uniqueObjects.keys():
Edit #1
My mistake :) Testing value in uniqueObjects already checks the keys, so .keys() is not needed.
You need to add the first occurrence from uniqueObjects inside the first if, and mark it so it is only added once:
if value in uniqueObjects:
    if uniqueObjects[value] != -1:
        dupObjects.append(uniqueObjects[value])
        uniqueObjects[value] = -1
    dupObjects.append(d)
Edit #2
Try this option; it writes only the first occurrence of each duplicated object to the duplicates list:
if value in uniqueObjects:
    if uniqueObjects[value] != -1:
        dupObjects.append(uniqueObjects[value])
        uniqueObjects[value] = -1
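If the -1 marker feels awkward, a different technique (not the one from the answer above) is to count the keys in a first pass and filter in a second. This is only a sketch under the assumption that the whole file fits in memory; the sample records and the "id" field are made up.

```python
from collections import Counter

# Hypothetical sample standing in for the parsed JSON
data = [
    {"key": "a", "id": 1},
    {"key": "b", "id": 2},
    {"key": "a", "id": 3},
    {"key": "c", "id": 4},
    {"key": "b", "id": 5},
]

# First pass: how often does each key value occur?
counts = Counter(d["key"] for d in data)

# Second pass: keep every object whose key occurs more than once,
# including the first occurrence, in the original order
dup_objects = [d for d in data if counts[d["key"]] > 1]

print([d["id"] for d in dup_objects])   # [1, 2, 3, 5]
```

Each first occurrence is included exactly once because every object is visited only once in the second pass.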
I have a list like the one below.
test = ['firstvalue', 'thirdvalue']
I want to insert some values into the list:
secondvalue at index 1 and fourthvalue at index 3,
so that the resulting list looks like this:
test = ['firstvalue', 'secondvalue', 'thirdvalue', 'fourthvalue']
I tried the below way but it doesn't work for me
print test.insert(1, "secondvalue")
Is there any alternate way to do this?
The insert method does not return a value (it returns None); it modifies the list it is called on in place. Here's an example:
test = ['firstvalue', 'thirdvalue']
test.insert(1, "secondvalue")
print test
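For completeness, a short Python 3 sketch that shows the None return value and also inserts the fourth value. The name result is only there to make the return value visible.

```python
test = ['firstvalue', 'thirdvalue']

result = test.insert(1, 'secondvalue')   # modifies test in place
print(result)                            # None

test.insert(3, 'fourthvalue')            # index 3 is the end of the list here
print(test)   # ['firstvalue', 'secondvalue', 'thirdvalue', 'fourthvalue']
```

This is why printing the result of insert() directly shows None: the list has changed, but the call itself returns nothing.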
I've run into the following issue, with my code below. Basically I have a list of objects with an id and a corresponding weight, and I have another list of id's. I want to use only the weights of the objects matching the id's in the second list.
d_weights = [{'d_id':'foo', 'weight': -0.7427}, ...]
d_ids = ['foo', ...]
for dtc_id in d_ids:
    d_weight = next((d['weight'] for d in d_weights if d['d_id'] == dtc_id), "")
    print str(d_weight)
    if str(d_weight) != "":
        print "not empty string! "+str(d_weight)
The output for this is:
-0.7427
0.0789
-0.0039
-0.2436
-0.0417
not empty string! -0.0417
Why is only the last one not empty when I can print them fine and they are obviously not equal to an empty string? How do I check that the next() actually returned something before using it?
Your algorithm is not correct.
The expression d_weight = next((d['weight'] for d in d_weights if d['d_id'] == dtc_id), "") advances the generator only once, so it returns just the first matching weight, or "" if nothing matches.
On every cycle of for weight_dict in d_weights: you get only the first dict of d_weights.
Without more data, I can't reproduce your output.
In my case it works fine:
-0.7427
not empty string! -0.7427
-0.327
not empty string! -0.327
You can find the correct code in DhiaTN's answer.
Just iterate over the list of keys and get the values from each dict:
for weight_dict in d_weights:
    for key in d_ids:
        print weight_dict.get(key, "")
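On the second part of the question, how to check that next() actually returned something: one option is to pass None as the default and test for it inside the loop. This is a sketch in Python 3; the 'bar' and 'missing' ids are made up, and found is just an illustrative name.

```python
d_weights = [
    {'d_id': 'foo', 'weight': -0.7427},
    {'d_id': 'bar', 'weight': 0.0789},    # hypothetical second entry
]
d_ids = ['foo', 'missing', 'bar']          # 'missing' has no matching dict

found = {}
for dtc_id in d_ids:
    # None is the default when no dict matches this id
    d_weight = next((d['weight'] for d in d_weights if d['d_id'] == dtc_id), None)
    if d_weight is not None:               # the check sits inside the loop
        found[dtc_id] = d_weight
        print("match:", dtc_id, d_weight)
    else:
        print("no match:", dtc_id)
```

Comparing with is not None also keeps a legitimate weight of 0.0 from being treated as "nothing found". A check placed after the loop instead of inside it would only ever see the last value.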
I've been programming for less than four weeks and have run into a problem that I cannot figure out. I'm trying to append a string value to an existing key that already has a string stored in it, but if any value already exists under the key I get "'str' object has no attribute 'append'".
I've tried turning the value to list but this also does not work. I need to use the .append() attribute because update simply replaces the value in clientKey instead of appending to whatever value is already stored. After doing some more research, I understand now that I need to somehow split the value stored in clientKey.
Any help would be greatly appreciated.
from time import gmtime, strftime
data = {}
while True:
    clientKey = input().upper()
    refDate = strftime("%Y%m%d%H%M%S", gmtime())
    refDate = refDate[2:]
    ref = clientKey + refDate
    if clientKey not in data:
        data[clientKey] = ref
    elif ref in data[clientKey]:
        print("That invoice already exists")
    else:
        data[clientKey].append(ref)
    break
You can't .append() to a string because a string is not mutable. If you want your dictionary value to be able to contain multiple items, it should be a container type such as a list. The easiest way to do this is just to add the single item as a list in the first place.
if clientKey not in data:
    data[clientKey] = [ref]  # single-item list
Now you can data[clientKey].append() all day long.
A simpler approach for this problem is to use collections.defaultdict. This automatically creates the item when it's not there, making your code much simpler.
from collections import defaultdict
data = defaultdict(list)
# ... same as before up to your if
if clientKey in data and ref in data[clientKey]:
    print("That invoice already exists")
else:
    data[clientKey].append(ref)
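Put together as something you can run on its own, it might look like the sketch below. The add_invoice helper and the "ACME" client key are hypothetical, introduced only to show the behaviour without typing input.

```python
from collections import defaultdict
from time import gmtime, strftime

data = defaultdict(list)

def add_invoice(client_key, ref):
    """Store ref under client_key; return False if it is already there."""
    if ref in data[client_key]:        # a missing key yields an empty list
        return False
    data[client_key].append(ref)
    return True

# Same reference format as in the question: client key + yymmddHHMMSS
ref = "ACME" + strftime("%Y%m%d%H%M%S", gmtime())[2:]

print(add_invoice("ACME", ref))   # True
print(add_invoice("ACME", ref))   # False
print(len(data["ACME"]))          # 1
```

Because every value is a list from the start, the "'str' object has no attribute 'append'" error cannot occur here.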
You started with a string value, and you cannot call .append() on a string. Start with a list value instead:
if clientKey not in data:
    data[clientKey] = [ref]
Now data[clientKey] references a list object with one string in it. List objects do have an append() method.
If you would rather keep a single string and append to it, you can use data[clientKey] += ref, which concatenates the new reference onto the end of the existing string.