I have a dictionary like this:
dict = {'x':[2,6,4],'y':[56,5,1]}
I would like to pass one of these lists into the query method:
new_df = df.query('col3 == @dict["x"]')
But I get an UndefinedVariableError. Is there any way to do what I want without the roundabout step of setting a new variable and then using "@" with that one?
new_v = dict['x']
new_df = df.query('col3 == @new_v')
Do you definitely need the query function?
Otherwise following the example in the query docs you could try something like:
df.loc[df['col3'].isin(dict['x'])]
Note that isin may be required, since your dictionary lookup returns a list.
So you want something like ?
df.query('col3=={}'.format(dict['x']))
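Both suggestions can be sketched as follows. This is a minimal example under assumptions: the DataFrame contents are made up (only the column name col3 comes from the question), and the dictionary is renamed to lookup so that it does not shadow the built-in dict. In query, local Python variables are referenced with an @ prefix, which is why the list is bound to a name first.

```python
import pandas as pd

# Hypothetical sample data; only the column name col3 comes from the question.
df = pd.DataFrame({"col3": [1, 2, 3, 4, 6, 7]})
lookup = {"x": [2, 6, 4], "y": [56, 5, 1]}

# Option 1: boolean indexing with isin, no query string needed.
via_isin = df.loc[df["col3"].isin(lookup["x"])]

# Option 2: bind the list to a local name and reference it with @.
wanted = lookup["x"]
via_query = df.query("col3 in @wanted")

print(via_isin["col3"].tolist())
print(via_query["col3"].tolist())
```

Option 1 avoids the extra variable entirely, which seems closest to what the question asks for.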
I have a knot in my head. I wasn't even sure what to google for. (Or how to formulate my title)
I want to do the following: I want to write a function that takes a term that occurs in the name of a .csv, but at the same time I want a df to be named after it.
Like so:
def read_data_into_df(name):
    df_{name} = pd.read_csv(f"file_{name}.csv")
Of course the df_{name} part is not working. But I hope you get the idea.
Is this possible without hard coding?
Thanks!
IIUC, you can use globals :
def read_data_into_df(name):
    globals()[f"df_{name}"] = pd.read_csv(f"file_{name}.csv")
If I were you, I would create a dictionary and store each value under a key built from the name:
dictionary[f"df_{name}"] = whatever_you_want
If there are only a couple of dataframes, just accept the minimal code repetition:
def read_data_into_df(name):
    return pd.read_csv(f"file_{name}.csv")
...
df_ham = read_data_into_df('ham')
df_spam = read_data_into_df('spam')
df_bacon = read_data_into_df('bacon')
...
# Use df_ham, df_spam and df_bacon
If there's a lot of them, or the exact data frames are generated, I would use a dictionary to keep track of the dataframes:
dataframes = {}
def read_data_into_df(name):
    return pd.read_csv(f"file_{name}.csv")
...
for name in ['ham', 'spam', 'bacon']:
    # Pass the loop variable itself, not the literal string 'name'
    dataframes[name] = read_data_into_df(name)
...
# Use dataframes['ham'], dataframes['spam'] and dataframes['bacon']
# Or iterate over dataframes.values() or dataframes.items()!
I successfully imported from the web this json file, which looks like:
[{"h_mag":"19.7","i_deg":"9.65","moid_au":"0.035"},{"h_mag":"20.5","i_deg":"14.52","moid_au":"0.028"},
etc ...
I want to extract the values of the key moid_au and later compare them with the values of the key h_mag.
This works: print(data[1]['moid_au']), but it fails when I ask for all the elements of the list; I tried print(data[:]['moid_au']).
I tried iterators and a lambda function, but neither has worked yet, mostly because I'm new to data manipulation. It works when I have one dictionary, but not with a list of dictionaries.
Thanks in advance for other tips. Some links were confusing.
Sounds like you are using lambda wrong because you need map as well:
c = [{"h_mag":"19.7","i_deg":"9.65","moid_au":"0.035"},{"h_mag":"20.5","i_deg":"14.52","moid_au":"0.028"}]
list(map(lambda rec: rec.get('moid_au'), c))
['0.035', '0.028']
The lambda receives one record from your list at a time, and map applies it to every record.
data[:] is just a copy of the whole list, so print(data[:]['moid_au']) is equivalent to print(data['moid_au']). That cannot work, because a list is indexed by integers or slices, not by a key such as 'moid_au'.
Try working with a loop:
for item in data:
    print(item['moid_au'])
Sticking with your index-based approach, you can iterate over the whole list to collect every value of a key:
a = [data[i]['moid_au'] for i in range(len(data))]
print(a)
In which exact way do you want to compare them?
Would it be useful getting the values in a way like this?
list_of_dicts = [{"h_mag":"19.7","i_deg":"9.65","moid_au":"0.035"}, {"h_mag":"20.5","i_deg":"14.52","moid_au":"0.028"}]
mod_au_values = [d["moid_au"] for d in list_of_dicts]
h_mag_values = [d["h_mag"] for d in list_of_dicts]
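Since the JSON values are strings, a numeric comparison needs a conversion first. The sketch below assumes the two sample records from the question; the names records, moid_au_values and h_mag_values are illustrative, and the less-than test is only a placeholder for whatever comparison is actually wanted.

```python
records = [
    {"h_mag": "19.7", "i_deg": "9.65", "moid_au": "0.035"},
    {"h_mag": "20.5", "i_deg": "14.52", "moid_au": "0.028"},
]

# The JSON values are strings, so convert to float before comparing.
moid_au_values = [float(r["moid_au"]) for r in records]
h_mag_values = [float(r["h_mag"]) for r in records]

# Pair the two lists record by record. The "<" test is only an
# illustrative comparison, not something the question specified.
for moid, h_mag in zip(moid_au_values, h_mag_values):
    print(moid, h_mag, moid < h_mag)
```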
# my_list must be a single dictionary here (e.g. data[0]), not the whole list
for key, value in my_list.items():
    print(key)
    print(value)
for value in my_list.values():
    print(value)
for key in my_list.keys():
    print(key)
This is probably a very basic question but I haven't been able to figure this out.
I'm currently using the following to append values to an empty list
shoes = {'groups':['running','walking']}
df_shoes_group_names = pd.DataFrame(shoes)
shoes_group_name=[]
for type in df_shoes_group_names['groups']:
    shoes_group_name.append(type)
shoes_group_name
['running', 'walking']
I'm trying to accomplish the same using a function; however, when I call it, the list comes back empty:
shoes_group_name=[]
def list_builder(dataframe_name):
    if 'shoes' in dataframe_name:
        for type in df_shoes_group_names['groups']:
            shoes_group_name.append(type)
list_builder(df_shoes_group_names)
shoes_group_name
[]
The reason for the function is that eventually I'll have multiple DataFrames for different products, so I'd like to just have if statements within the function to handle the creation of each list.
For example, future versions could look like this:
df_shoes_group_names
df_boots_group_names
df_sandals_group_names
shoes_group_name=[]
boots_group_name=[]
sandals_group_name=[]
def list_builder(dataframe_name):
    if 'shoes' in dataframe_name:
        for type in df_shoes_group_names['groups']:
            shoes_group_name.append(type)
    elif 'boots' in dataframe_name:
        for type in df_boots_group_names['groups']:
            boots_group_name.append(type)
    elif 'sandals' in dataframe_name:
        for type in df_sandals_group_names['groups']:
            sandals_group_name.append(type)
list_builder(df_shoes_group_names)
list_builder(df_boots_group_names)
list_builder(df_sandals_group_names)
Not sure if I'm approaching this the right way so any advice would be appreciated.
Best,
You should never call or search a variable name as if it were a string.
Instead, use a dictionary to store a variable number of variables.
Bad practice
# dataframes
df_shoes_group_names = pd.DataFrame(...)
df_boots_group_names = pd.DataFrame(...)
df_sandals_group_names = pd.DataFrame(...)
def foo(x):
    if shoes in df_shoes_group_names:  # <-- THIS WILL NOT WORK
        pass  # do something with x
Good practice
# dataframes
df_shoes_group_names = pd.DataFrame(...)
df_boots_group_names = pd.DataFrame(...)
df_sandals_group_names = pd.DataFrame(...)
dfs = {'shoes': df_shoes_group_names,
'boots': df_boots_group_names,
'sandals': df_sandals_group_names}
def foo(key):
    if 'shoes' in key:  # <-- THIS WILL WORK
        pass  # do something with dfs[key]
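The reason the original list stayed empty is that `'shoes' in dataframe_name` tests the DataFrame's column labels (here only 'groups'), not the name of the variable. Building on the dictionary idea, here is a minimal sketch: the sample frames and the boots values are made up, and returning the list from list_builder (rather than appending to a global) is a design choice of this sketch, not something the question requires.

```python
import pandas as pd

# Hypothetical sample frames; only the 'groups' column name comes from the question.
dfs = {
    "shoes": pd.DataFrame({"groups": ["running", "walking"]}),
    "boots": pd.DataFrame({"groups": ["hiking", "rain"]}),
}

def list_builder(key):
    """Return the group names for the product stored under key."""
    return dfs[key]["groups"].tolist()

# One list per product, keyed by the same names as the DataFrames.
group_names = {key: list_builder(key) for key in dfs}
print(group_names["shoes"])
```

With this layout, adding a new product only means adding one entry to dfs; no extra if/elif branch is needed.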
What I want to achieve
value = 'a.b.c.d.e'
new_value = value.split('.')
new_value[-1] = 'F'
''.join(new_value)
Now I would like to achieve this in one line, something like the following:
''.join(value[-1] = F in value.split('.'))
The expression above throws an error because it is not valid Python. Is there a way to achieve this?
value should be a string such as "1.2.3.4.5", and join does not accept integers, so the replacement must be a string as well, e.g. new_value[-1] = '10'.
''.join(value.split('.')[:-1]+['10'])
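A small sketch of the one-liner with the question's sample string, assuming the result should stay dot-separated (hence '.'.join rather than ''.join) and that the replacement is the string 'F'. The rsplit variant is an alternative that avoids building the intermediate list.

```python
value = "a.b.c.d.e"

# Split, drop the last part, append the replacement, and re-join with dots.
joined = ".".join(value.split(".")[:-1] + ["F"])

# Alternative: split once from the right and keep only the leading part.
head = value.rsplit(".", 1)[0]
alternative = f"{head}.F"

print(joined)
print(alternative)
```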
Hi. I have a function in my views that builds a variable which I then use in a template.
imagesj[str(i)]=str(j[0])
from which I get a value similar to this
{'Adele-1-Fuchsia-9': 'product/adele_1_fuchsia_1.jpg',
'Jealyn-37-Brown-10': 'product/jealyn_37_brown_1.jpg'}
I need to get only product/adele_1_fuchsia_1.jpg and product/jealyn_37_brown_1.jpg. These are dynamic values that change with each product. In other words, I need to drop the part that comes before product in each entry. How can I do this?
I don't know what your original data structure looks like, but if you have a dictionary like the one below:
dic = {'Adele-1-Fuchsia-9': 'product/adele_1_fuchsia_1.jpg', 'Jealyn-37-Brown-10': 'product/jealyn_37_brown_1.jpg'}
you can get the product elements by iterating over the values:
print([x for x in dic.values() if 'product' in x])
This prints:
['product/adele_1_fuchsia_1.jpg', 'product/jealyn_37_brown_1.jpg']
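If every value is already a product path, the filter is optional and the values can be taken directly. A minimal sketch, assuming the dictionary from the answer above; the startswith check is an extra safeguard added here, not part of the original answer.

```python
dic = {
    "Adele-1-Fuchsia-9": "product/adele_1_fuchsia_1.jpg",
    "Jealyn-37-Brown-10": "product/jealyn_37_brown_1.jpg",
}

# Take the values as-is; dictionaries keep insertion order in Python 3.7+.
paths = list(dic.values())

# Stricter variant: keep only values that actually start with 'product/'.
product_paths = [p for p in dic.values() if p.startswith("product/")]

print(paths)
```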