"ValueError" when writing the rows to CSV file - python

writer = csv.DictWriter(result, fieldnames=Fnames)
for val in Fnames:
    for row in List:
        if str(row[1]) == str(val):
            dic = {str(val): row[2]}
            print dic.items()
            writer.writerows(dic)
I am writing the dictionary values to a CSV file, but I'm getting:
Error: ValueError: dict contains fields not in fieldnames: S, c, h, o, o, l
I have tried different methods but with no success. What do I have to do to write the rows to the CSV?

Fnames needs to be a list of field names to use. It looks like you have passed it the string "School" and it is iterating over each letter individually.
Check here for some documentation / examples.
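A minimal illustration of the difference:
Fnames = "School"    # a string: iterating yields 'S', 'c', 'h', 'o', 'o', 'l'
Fnames = ["School"]  # a one-element list of field names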

The error message states that your dictionary contains keys that don't have a corresponding entry in your fieldnames parameter. Assuming these are just extra fields, you can ignore them by using the extrasaction parameter when constructing your DictWriter object:
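writer = csv.DictWriter(result, fieldnames=Fnames, extrasaction='ignore')
With extrasaction='ignore', keys in the row dict that are missing from fieldnames are silently dropped instead of raising ValueError.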

The specific error you're getting is because you're calling writer.writerows with a single dictionary as its argument. writerows expects an iterable of dictionaries, not just one, so it misinterprets your dict: iterating over it yields its keys (strings), which writerows treats as the row dictionaries. When it then iterates over one of those strings to check that the "row" has the expected fields, it sees the individual letters of the key, hence S, c, h, o, o, l.
I'm not really sure what your logic is supposed to be, though. You probably shouldn't be iterating over Fnames at the top level, since every dictionary you write out needs to contain every field (unless you pass a non-default value for DictWriter's extrasaction parameter). Perhaps loop first over the List items, then over the Fnames fields, adding a new key and value to the dictionary each time, as in the sketch below.
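A rough sketch of that restructure (assuming each entry in List is a sequence whose second element is a field name and whose third element is the value):
for row in List:
    record = {}
    for val in Fnames:
        if str(row[1]) == str(val):
            record[str(val)] = row[2]
    writer.writerow(record)
Note the singular writerow here: it takes one dictionary, while writerows takes an iterable of dictionaries.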

Related

Accessing Unicode Values in a Python Dictionary

I have a dictionary full of unicode keys/values due to importing JSON through json.loads().
dictionaryName = {u'keyName' : u'valueName'}
I'm trying to access values inside the dictionary as follows:
accessValueName = dictionaryName.get('keyName')
This returns None, presumably because it is looking for the str 'keyName' while the dictionary is full of unicode keys. I tried sticking a u in front of the key name when making the call, but it still returns None.
accessValueName = dictionaryName.get(u'keyName')
I also found several seemingly outdated methods to convert an entire dictionary from unicode to string values, but they did not work, and I am not sure I need the whole thing converted anyway.
How can I either convert the entire dictionary from Unicode to String or just access the values using the keyname?
EDIT:
I just realized that I was trying to access a value from a nested dictionary that I did not notice was nested.
The solution is indeed:
accessValueName = dictionaryName.get('keyName')
Dictionaries store values in a hash table, using the hash value of the key object.
print(hash(u"example"))
print(hash("example"))
Both yield the same result. Therefore the same dictionary value should be accessible with either key.
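For example, in Python 2 (where str and unicode are distinct types):
dictionaryName = {u'keyName': u'valueName'}
print(dictionaryName.get('keyName'))   # u'valueName'
print(dictionaryName.get(u'keyName'))  # u'valueName'
Both lookups succeed because ASCII str and unicode keys hash and compare equal.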

How to facilitate dict record filtering by dict key value?

I want to interface with RocksDB from my Python application and store arbitrary dicts in it. I gather I can use something like pickle for serialisation. But I need to be able to filter the records based on the values of their keys. What's the proper approach here?
Let's say you have a list of keys named dict_keys and a dict named big_dict, and you want to pull out only the values for those keys. You can write a dict comprehension that iterates through the list, grabbing each value from the dict (None when a key is missing):
new_dict = {key: big_dict.get(key) for key in dict_keys}
RocksDB is a key-value store, and both keys and values are binary strings.
If you want to filter by exact keys, just use the Get interface to look them up in the DB.
If you want to filter by key patterns, you have to use the Iterator interface to iterate over the whole DB and keep the records whose keys match the pattern.
If you want to filter by values or value patterns, you still need to iterate over the whole DB. For each key-value pair, deserialize the value and check whether it equals the given value or matches the given pattern.
For cases 1 and 2 you don't need to deserialize all values, only those whose keys match; for case 3, however, you have to deserialize every value.
Both case 2 and case 3 are inefficient, since they have to iterate the whole key space.
You can configure RocksDB's keys to be ordered, and RocksDB has good support for prefix indexing, so you can efficiently do range and prefix queries by key. Check the documentation for details.
In order to filter or search by value efficiently, you have to maintain a value index alongside the data in RocksDB.
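A rough sketch of cases 1 and 3, assuming the python-rocksdb binding and pickle-serialised values (all names are illustrative):
import pickle
import rocksdb  # assumes the python-rocksdb binding

db = rocksdb.DB('records.db', rocksdb.Options(create_if_missing=True))

# Case 1: filter by exact keys (cheap point lookups via Get).
def get_records(keys):
    records = {}
    for key in keys:
        raw = db.get(key)  # keys and values are bytes
        if raw is not None:
            records[key] = pickle.loads(raw)
    return records

# Case 3: filter by a value predicate (full scan, deserialise everything).
def filter_by_value(predicate):
    it = db.iteritems()
    it.seek_to_first()
    matches = {}
    for key, raw in it:
        value = pickle.loads(raw)
        if predicate(value):
            matches[key] = value
    return matches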

Storing a nested dictionary of tuples in python

I have a dictionary of dictionaries that uses tuples as its keys and values. I would like to write this dictionary to a file and have tried json and pickle, but neither seems to work. Is there a better alternative?
https://github.com/jgv7/markov-generator/blob/master/sentence-generator.py
JSON expects the keys of key/value pairs to be strings (or numbers that can be properly converted to strings). Bottom line: you can't json.dumps a dict with tuples as keys.
pickle should work unless the dictionary object is not properly serialized.
From your code:
with open(filename, 'rb') as df:
    pickle.load(df)
print mapping
You don't bind the result of the load() call to a name, so that line has no effect (other than consuming processor time and moving the file pointer). That should read:
with open(filename, 'rb') as df:
    mapping = pickle.load(df)
print mapping
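For completeness, a minimal round trip showing that pickle handles tuple keys fine (illustrative data):
import pickle

mapping = {('a', 'b'): {('c', 'd'): 1}}
with open('mapping.pkl', 'wb') as f:
    pickle.dump(mapping, f)
with open('mapping.pkl', 'rb') as f:
    restored = pickle.load(f)
assert restored == mapping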

Ordered Dictionary with list as values

I want to create an ordered dictionary with a List as the value type.
I tried to call this method:
ordered = collections.OrderedDict(list)
but I get the error:
TypeError: 'type' object is not iterable
Is there any other data structure I can use for an ordered dictionary?
Later in the program I need to get the first key/value pair that was inserted; that's why I need the dictionary ordered. After that point the order does not matter.
Python is dynamically typed. You don't need to specify the value type in advance, you just insert objects of that type when needed.
For example:
ordered = collections.OrderedDict()
ordered[123] = [1,2,3]
You can get the first inserted key/value pair with next(ordered.iteritems()) (Python 2) or next(iter(ordered.items())) (Python 3; items() returns a view, so wrap it in iter() before calling next()).
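To illustrate (Python 3 syntax):
import collections

ordered = collections.OrderedDict()
ordered['first'] = [1, 2, 3]
ordered['second'] = [4, 5]

first_key, first_value = next(iter(ordered.items()))
print(first_key, first_value)  # first [1, 2, 3]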

Unpacking an OrderedDict of data frames into many data frames in python

I want to read and prepare data from an Excel spreadsheet containing many sheets of data.
I first read the data from the excel file using pd.read_excel with sheetname=None so that all the sheets can be written into the price_data object.
price_data = pd.read_excel('price_data.xlsx', sheetname=None)
This gives me an OrderedDict object with 5 dataframes.
Afterwards I need to obtain the different dataframes that make up the price_data object. I thought of using a for loop for this, which also gives me the opportunity to do other needed per-sheet operations, such as setting the index of the dataframes.
This is the approach I tried
for key, df in price_data.items():
    df.set_index('DeliveryStart', inplace=True)
    key = df
With this code I would expect each dataframe to be written into an object named after the key iterator, so that at the end I would have as many dataframes as there are inside my original price_data object. However, I end up with two identical dataframes, one named key and one named df.
Suggestions?
Reason for current behaviour:
In your example, the variables key and df are created (if not already existing) and overwritten in each iteration of the loop. In each iteration you rebind key to the object that df refers to (the object also remains bound to df, since Python allows multiple names for the same object). However, key is overwritten again at the start of the next iteration, when the loop assigns it the next dictionary key. When the loop ends, both variables keep whatever they held in the last iteration.
To illustrate:
from collections import OrderedDict

od = OrderedDict()
od["first"] = "foo"
od["second"] = "bar"

# I've added an extra layer of `enumerate` just to display the loop progress.
# This isn't required in your actual code.
for loop, (key, val) in enumerate(od.items()):
    print("Iteration: {}".format(loop))
    print(key, val)
    key = val
    print(key, val)

print("Final output:", key, val)
Output:
Iteration: 0
first foo
foo foo
Iteration: 1
second bar
bar bar
Final output: bar bar
Solution:
It looks like you want to dynamically set the variables to be named the same as the value of key, which isn't considered a good idea (even though it can be done). See Dynamically set local variable for more discussion.
It sounds like a dict, or OrderedDict is actually a good format for you to store the DataFrames alongside the name of the sheet it originated from. Essentially, you have a container with the named attributes you want to use. You can then iterate over the items to do work like concatenation, filtering or similar.
If there's a different reason you wanted the DataFrames to be in standalone objects, leave a comment and I will try and make a follow-up suggestion.
If you are happy to set the index of the DataFrames in place, you could try this:
for key in price_data:
    price_data[key].set_index('DeliveryStart', inplace=True)
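If you later need everything in one frame, pd.concat accepts the dict directly and uses the sheet names as the outer level of the resulting index (a sketch, assuming the sheets share a common column layout):
import pandas as pd

combined = pd.concat(price_data)  # outer index level = sheet name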
