Change part of csv string panda

I would like to replace the word consolidation (it appears twice in the following statement) with another value taken from a variable (e.g. breakout/outofconsolidation/inside).
Can you help me achieve this, please?
dfconsolidationcsv.to_csv(r'symbols\stocks_consolidation_sp500.csv', index = False)
a = ('breakout')
df{a}csv.to_csv(r'symbols\stocks_{a}_sp500.csv', index = False)

Unless there is a justifiable reason to be creating dynamic variable assignments, I would avoid doing so. In this case, defining your DataFrame variables in a dict is probably sufficient:
# store df in a dict instead of separate variables
df_dict = {}
df_dict['consolidation'] = dfconsolidationcsv
df_dict['breakout'] = dfbreakoutcsv
...
# invoke command for a specific variable
a = 'breakout'
df_dict[a].to_csv(r'symbols\stocks_%s_sp500.csv' % a, index = False)
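The lookup-by-name idea can be sketched without pandas at all, since the only parts that change are the dict key and the file name. This is a minimal sketch, assuming a hypothetical helper `csv_path_for` (not part of the original answer) and using `pathlib` to build the path instead of a hard-coded backslash. The string values are stand-ins for the real DataFrames.

```python
from pathlib import Path


def csv_path_for(name, folder="symbols"):
    """Build the output path for one dataset name (hypothetical helper)."""
    return Path(folder) / f"stocks_{name}_sp500.csv"


# Stand-ins for the DataFrames; in real code these would be pandas objects.
df_dict = {
    "consolidation": "<consolidation DataFrame>",
    "breakout": "<breakout DataFrame>",
}

for name in df_dict:
    target = csv_path_for(name)
    # With real DataFrames: df_dict[name].to_csv(target, index=False)
    print(name, "->", target.as_posix())
```

Because the name is an ordinary string key, adding another dataset only means adding another entry to the dict.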
Now, if there is an overwhelming reason why you HAVE to use pre-existing variable names that need to be changed dynamically, I think you can do something like this:
a = 'breakout'
exec(r"df%scsv.to_csv(r'symbols\stocks_%s_sp500.csv', index=False)" % (a, a))

Related

Doing calculations while creating a List (in Python)

I'm getting data from an API and storing it in a Python dictionary (and then in a list of dictionaries).
I need to do calculations (max, sum, divisions...) on the dictionary data to create extra data to add to the same dictionary/list.
My current code looks like this:
stream = whatever (whatever, whatever)
keywords = []
for batch in stream:
    for row in batch.results:
        max_clicks = max(data_keywords["keywords_clicks"])
        weighted_clicks = sum(data_keywords["keywords_weighted"])/sum(data_keywords["keywords_clicks"])
        data_keywords = {}
        data_keywords["keywords_text"] = row.ad_group_criterion.keyword.text
        data_keywords["keywords_clicks"] = row.metrics.clicks
        data_keywords["keywords_conversion_rate"] = row.metrics.conversions_from_interactions_rate
        data_keywords["keywords_weighted"] = row.metrics.clicks * row.metrics.conversions_from_interactions_rate
        data_keywords["etv"] = (data_keywords["keywords_clicks"]/max_clicks*data_keywords["keywords_conversion_rate"])+((1-data_keywords["keywords_clicks"]/max_clicks)*weighted_clicks)
        keywords.append(data_keywords)
This doesn't work, it gives UnboundLocalError (local variable 'data_keywords' referenced before assignment). I've tried different options and got different errors.
data_keywords["etv"] is what I want to calculate ("max_clicks", "weighted_clicks" and data_keywords["keywords_weighted"] are intermediate calculations for that)
The main problem is that I need to calculate max and sum for all values inside the dictionary, then do a calculation using that max and sum for each value and then store the results in the dictionary itself.
So I don't know where to put the code to do the calculations (before the dictionary, inside the dictionary, after the dictionary or a mix)
I guess it should be possible, but I'm a Python/programming newbie and can't figure this out.
It's probably not relevant, but in case you are wondering, I'm trying to create a weighted sort (https://moz.com/blog/build-your-own-weighted-sort). And I can't use models/database to store data.
Thanks!
EDIT: Some extra info, in case it helps understand better what I need: The results that the keywords list gives without the calculations is something like this:
[{'keywords_text': 'whatever', 'keywords_clicks': 5, 'keywords_conversion_rate': 6.3}, {'keywords_text': 'whatever2', 'keywords_clicks': 50, 'keywords_conversion_rate': 2.3}, {'keywords_text': 'whatever3', 'keywords_clicks': 20, 'keywords_conversion_rate': 2.0}]
I want basically to add to this keywords list a new key/value of 'etv': 8.5 or whatever for each keyword. That etv should come from the formula that I put on my code (data_keywords["etv"] = ...) but maybe it needs changes to work in Python.
The info from this "original" keywords list comes directly from the API (I don't have that data stored anywhere) and it works perfectly if I just request the info and store it in that list. But the problems come when I introduce the calculations (especially using sum and max inside a loop, I guess).

The UnboundLocalError is because you are trying to access data_keywords["keywords_clicks"] before you have declared data_keywords or set the value for "keywords_clicks".
Also, I think you need to be clearer about what data structure you are trying to create. You mention "a list of dictionaries" which I don't see. Maybe you are trying to create a dictionary of lists, but it looks like you overwrite the dictionary values each time you go through your loop.

Adding my response as an answer, as I do not have enough reputation to comment.
To get rid of the assignment error, move the line data_keywords = {} above max_clicks = max(data_keywords["keywords_clicks"]).
In the original code you access a local variable before it has been assigned. Note that this move only removes the UnboundLocalError: the dictionary is still empty at that point, so the "keywords_clicks" lookup on the next line will then raise a KeyError.
stream = whatever (whatever, whatever)
keywords = []
for batch in stream:
    for row in batch.results:
        data_keywords = {}
        max_clicks = max(data_keywords["keywords_clicks"])
        weighted_clicks = sum(data_keywords["keywords_weighted"])/sum(data_keywords["keywords_clicks"])
        data_keywords["keywords_text"] = row.ad_group_criterion.keyword.text
        data_keywords["keywords_clicks"] = row.metrics.clicks
        data_keywords["keywords_conversion_rate"] = row.metrics.conversions_from_interactions_rate
        data_keywords["keywords_weighted"] = row.metrics.clicks * row.metrics.conversions_from_interactions_rate
        data_keywords["etv"] = (data_keywords["keywords_clicks"]/max_clicks*data_keywords["keywords_conversion_rate"])+((1-data_keywords["keywords_clicks"]/max_clicks)*weighted_clicks)
        keywords.append(data_keywords)

You can't refer to elements of the dictionary before you create it. Move those variable assignments down to after you assign the dictionary elements.
for batch in stream:
    for row in batch.results:
        data_keywords = {}
        data_keywords["keywords_text"] = row.ad_group_criterion.keyword.text
        data_keywords["keywords_clicks"] = row.metrics.clicks
        data_keywords["keywords_conversion_rate"] = row.metrics.conversions_from_interactions_rate
        data_keywords["keywords_weighted"] = row.metrics.clicks * row.metrics.conversions_from_interactions_rate
        max_clicks = max(data_keywords["keywords_clicks"])
        weighted_clicks = sum(data_keywords["keywords_weighted"])/sum(data_keywords["keywords_clicks"])
        data_keywords["etv"] = (data_keywords["keywords_clicks"]/max_clicks*data_keywords["keywords_conversion_rate"])+((1-data_keywords["keywords_clicks"]/max_clicks)*weighted_clicks)
        keywords.append(data_keywords)
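One thing to keep in mind is that max() and sum() expect an iterable of values from all rows, so calling them on a single row's number cannot give the totals the formula needs. One possible structure is two passes: collect every row first, then compute the aggregates, then fill in "etv". The sketch below is only an illustration under assumptions. `make_row` and `build_keywords` are hypothetical names, and the fake rows built with `SimpleNamespace` stand in for the real API objects, reusing the sample numbers from the question's EDIT.

```python
from types import SimpleNamespace


def make_row(text, clicks, rate):
    """Fake API row for illustration only (hypothetical helper)."""
    return SimpleNamespace(
        ad_group_criterion=SimpleNamespace(keyword=SimpleNamespace(text=text)),
        metrics=SimpleNamespace(clicks=clicks, conversions_from_interactions_rate=rate),
    )


def build_keywords(stream):
    # Pass 1: collect the raw values of every row.
    keywords = []
    for batch in stream:
        for row in batch.results:
            clicks = row.metrics.clicks
            rate = row.metrics.conversions_from_interactions_rate
            keywords.append({
                "keywords_text": row.ad_group_criterion.keyword.text,
                "keywords_clicks": clicks,
                "keywords_conversion_rate": rate,
                "keywords_weighted": clicks * rate,
            })

    # Aggregates need the complete list, so they are computed after the loop.
    total_clicks = sum(k["keywords_clicks"] for k in keywords)
    if total_clicks == 0:
        return keywords  # nothing to weight; "etv" is left out
    max_clicks = max(k["keywords_clicks"] for k in keywords)
    weighted_clicks = sum(k["keywords_weighted"] for k in keywords) / total_clicks

    # Pass 2: apply the formula from the question to each keyword.
    for k in keywords:
        share = k["keywords_clicks"] / max_clicks
        k["etv"] = share * k["keywords_conversion_rate"] + (1 - share) * weighted_clicks
    return keywords


stream = [SimpleNamespace(results=[
    make_row("whatever", 5, 6.3),
    make_row("whatever2", 50, 2.3),
    make_row("whatever3", 20, 2.0),
])]
result = build_keywords(stream)
for item in result:
    print(item["keywords_text"], round(item["etv"], 3))
```

The keyword with the most clicks keeps its own conversion rate, while keywords with few clicks are pulled towards the click-weighted average.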

Shortest way to write split function and assign variables?

I'm looking for a shorter but still clean and flexible way to write what I have below.
Variable to work with (length varying)
drpfile_exportname = '1911_CocaCola_XMasNow_TVC30sec_03_Roughcut_Tv10_PV01_Ov01_200319_prev_for_approval_H264'
Long way of doing it but clean
# Split string by "_"
drpfile_exportname_list = drpfile_exportname.split("_")
# Set variables
ul_date = drpfile_exportname_list[0]
up_client = drpfile_exportname_list[1]
up_cprojname = drpfile_exportname_list[2]
# Join variables to create desired name
upload_projname = "_".join((ul_date, up_client, up_cprojname))
Alternative one-liner, not so flexible because no variables are assigned and, in my opinion, not a beautiful way to solve it
upload_projname = ("_".join(drpfile_exportname.split('_')[0:3]))
Thought something like this would work but always had problems with it
ul_date, up_client, up_cprojname = drpfile_exportname.split('_', 2)
Print:
print("\nProject name: {}".format(upload_projname))
Result that should be stored in a variable:
Project name: 1911_CocaCola_XMasNow

You can slice the result of split.
ul_date, up_client, up_cprojname = drpfile_exportname.split('_')[:3]
Or you can assign a dummy variable to the part you want to ignore
ul_date, up_client, up_cprojname, *_ = drpfile_exportname.split('_')
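As a small illustration of why the `split('_', 2)` attempt in the question gives an unexpected result, the sketch below compares it with a `maxsplit` of 3 followed by a slice. The variable names `first`, `second` and `rest` are introduced here only for the demonstration.

```python
drpfile_exportname = (
    "1911_CocaCola_XMasNow_TVC30sec_03_Roughcut_Tv10_PV01_Ov01_"
    "200319_prev_for_approval_H264"
)

# maxsplit=2 yields exactly three pieces, but the third keeps the whole remainder.
first, second, rest = drpfile_exportname.split("_", 2)
print(rest)

# maxsplit=3 stops after the third underscore; the slice drops the remainder.
ul_date, up_client, up_cprojname = drpfile_exportname.split("_", 3)[:3]
upload_projname = "_".join((ul_date, up_client, up_cprojname))
print("Project name: {}".format(upload_projname))
```

Limiting the split this way avoids cutting the rest of a long name into pieces that are thrown away immediately.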

Python: Running function to append values to an empty list returns no values

This is probably a very basic question but I haven't been able to figure this out.
I'm currently using the following to append values to an empty list
shoes = {'groups':['running','walking']}
df_shoes_group_names = pd.DataFrame(shoes)
shoes_group_name=[]
for type in df_shoes_group_names['groups']:
    shoes_group_name.append(type)
shoes_group_name
['running', 'walking']
I'm trying to accomplish the same using a function; however, when I call it the list comes back empty
shoes_group_name=[]
def list_builder(dataframe_name):
    if 'shoes' in dataframe_name:
        for type in df_shoes_group_names['groups']:
            shoes_group_name.append(type)
list_builder(df_shoes_group_names)
shoes_group_name
[]
The reason for the function is that eventually I'll have multiple DataFrames for different products, so I'd like to have if statements within the function to handle the creation of each list.
So future examples could look like this:
df_shoes_group_names
df_boots_group_names
df_sandals_group_names
shoes_group_name=[]
boots_group_name=[]
sandals_group_name=[]
def list_builder(dataframe_name):
    if 'shoes' in dataframe_name:
        for type in df_shoes_group_names['groups']:
            shoes_group_name.append(type)
    elif 'boots' in dataframe_name:
        for type in df_boots_group_names['groups']:
            boots_group_name.append(type)
    elif 'sandals' in dataframe_name:
        for type in df_sandals_group_names['groups']:
            sandals_group_name.append(type)
list_builder(df_shoes_group_names)
list_builder(df_boots_group_names)
list_builder(df_sandals_group_names)
Not sure if I'm approaching this the right way so any advice would be appreciated.
Best,

You should never call or search a variable name as if it were a string.
Instead, use a dictionary to store a variable number of variables.
Bad practice
# dataframes
df_shoes_group_names = pd.DataFrame(...)
df_boots_group_names = pd.DataFrame(...)
df_sandals_group_names = pd.DataFrame(...)
def foo(x):
    if shoes in df_shoes_group_names: # <-- THIS WILL NOT WORK
        # do something with x
        pass
Good practice
# dataframes
df_shoes_group_names = pd.DataFrame(...)
df_boots_group_names = pd.DataFrame(...)
df_sandals_group_names = pd.DataFrame(...)
dfs = {'shoes': df_shoes_group_names,
       'boots': df_boots_group_names,
       'sandals': df_sandals_group_names}
def foo(key):
    if 'shoes' in key: # <-- THIS WILL WORK
        # do something with dfs[key]
        pass
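For context on the empty list in the question: when `dataframe_name` is a DataFrame, `'shoes' in dataframe_name` checks the column labels, not the name of the variable, so none of the branches run. A rough sketch of the dictionary approach follows. Plain dicts stand in for the DataFrames so that it runs without pandas (the `['groups']` lookup reads the same way), and the boots and sandals values are made-up sample data.

```python
# Plain dicts stand in for the DataFrames in this sketch.
dfs = {
    "shoes": {"groups": ["running", "walking"]},
    "boots": {"groups": ["hiking", "rain"]},    # made-up sample data
    "sandals": {"groups": ["beach"]},           # made-up sample data
}


def list_builder(product):
    """Return the group names for one product key."""
    return list(dfs[product]["groups"])


# One result list per product, keyed by the same name.
group_names = {product: list_builder(product) for product in dfs}
print(group_names["shoes"])

# The membership test looks at the keys (column labels), not the variable name.
print("shoes" in dfs["shoes"], "groups" in dfs["shoes"])
```

Returning the list instead of appending to a module-level variable also removes the need for one if branch per product.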

Python references to references in python

I have a function that takes given initial conditions for a set of variables and puts the result into another global variable. For example, let's say two of these variables are x and y. Note that x and y must be global variables (because it is too messy/inconvenient to be passing large amounts of references between many functions).
x = 1
y = 2
def myFunction():
    global x,y,solution
    print(x)
    < some code that evaluates using a while loop >
    solution = <the result from many iterations of the while loop>
I want to see how the result changes given a change in the initial condition of x and y (and other variables). For flexibility and scalability, I want to do something like this:
varSet = {'genericName0':x, 'genericName1':y} # Dict contains all variables that I wish to alter initial conditions for
R = list(range(10))
for r in R:
    varSet['genericName0'] = r #This doesn't work the way I want...
    myFunction()
Such that the 'print' line in 'myFunction' outputs the values 0,1,2,...,9 on successive calls.
So basically I'm asking how do you map a key to a value, where the value isn't a standard data type (like an int) but is instead a reference to another value? And having done that, how do you reference that value?
If it's not possible to do it the way I intend: What is the best way to change the value of any given variable by changing the name (of the variable that you wish to set) only?
I'm using Python 3.4, so would prefer a solution that works for Python 3.
EDIT: Fixed up minor syntax problems.
EDIT2: I think maybe a clearer way to ask my question is this:
Consider that you have two dictionaries, one which contains round objects and the other contains fruit. Members of one dictionary can also belong to the other (apples are fruit and round). Now consider that you have the key 'apple' in both dictionaries, and the value refers to the number of apples. When updating the number of apples in one set, you want this number to also transfer to the round objects dictionary, under the key 'apple' without manually updating the dictionary yourself. What's the most pythonic way to handle this?

Instead of making x and y global variables with a separate dictionary to refer to them, make the dictionary directly contain "x" and "y" as keys.
varSet = {'x': 1, 'y': 2}
Then, in your code, whenever you want to refer to these parameters, use varSet['x'] and varSet['y']. When you want to update them use varSet['x'] = newValue and so on. This way the dictionary will always be "up to date" and you don't need to store references to anything.
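A minimal sketch of this suggestion, assuming a placeholder computation in place of the original while loop (the names `var_set`, `my_function` and `seen` are illustrative only):

```python
var_set = {"x": 1, "y": 2}
solution = None


def my_function():
    """Read the parameters from var_set every time the function is called."""
    global solution
    # Placeholder computation standing in for the original while loop.
    solution = var_set["x"] + var_set["y"]
    return var_set["x"]


seen = []
for r in range(10):
    var_set["x"] = r          # update the parameter in place
    seen.append(my_function())
print(seen)
```

Since the function looks the value up in the dictionary on each call, every update made in the loop is visible without any extra reference handling.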

We are going to take the fruit example from your second edit:
def set_round_val(fruit_dict,round_dict):
    fruit_set = set(fruit_dict)
    round_set = set(round_dict)
    common_set = fruit_set.intersection(round_set) # get common keys
    for key in common_set:
        round_dict[key] = fruit_dict[key] # set modified value in round_dict
    return round_dict
fruit_dict = {'apple':34,'orange':30,'mango':20}
round_dict = {'bamboo':10,'apple':34,'orange':20} # values can even be same as fruit_dict
for r in range(1,10):
    fruit_dict['apple'] = r
    round_dict = set_round_val(fruit_dict,round_dict)
    print(round_dict)
Hope this helps.

From what I've gathered from the responses from @BrenBarn and @ebarr, this is the best way to go about the problem (and it directly answers EDIT2).
Create a class which encapsulates the common variable:
class Count:
    def __init__(self,value):
        self.value = value
Create instances of that class (this assumes the class above is saved in a module named Count.py):
import Count
no_of_apples = Count.Count(1)
no_of_tennis_balls = Count.Count(5)
no_of_bananas = Count.Count(7)
Create dictionaries with the common variable in both of them:
round = {'tennis_ball':no_of_tennis_balls,'apple':no_of_apples}
fruit = {'banana':no_of_bananas,'apple':no_of_apples}
print(round['apple'].value) #prints 1
fruit['apple'].value = 2
print(round['apple'].value) #prints 2
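The sharing only holds as long as the common object is mutated rather than replaced. The single-file sketch below illustrates the difference. It defines the class inline instead of importing it, and uses the name `round_objects` (an assumption made here) to avoid shadowing the built-in round().

```python
class Count:
    def __init__(self, value):
        self.value = value


no_of_apples = Count(1)
round_objects = {"apple": no_of_apples}
fruit = {"apple": no_of_apples}

# Mutating the shared object is visible through both dictionaries.
fruit["apple"].value = 2
print(round_objects["apple"].value)

# Rebinding the key puts a new object in one dictionary only,
# so the other dictionary still holds the old object.
fruit["apple"] = Count(99)
print(round_objects["apple"].value)
print(fruit["apple"] is round_objects["apple"])
```

In other words, updates should go through `.value`; assigning a new object to the key breaks the link between the two dictionaries.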

how to increase variable name in Python?

I'm making a simulation program.
I manually write some initial conditions of particles with python list before starting program, such as
var1 = [mass_a, velocity_a, velocity_a]
var2 = [mass_b, velocity_b, velocity_b]
...
Then how do I change the number in the variable name inside a for loop? Something I tried was
for i in range(2):
    print(var+str(i))
but it doesn't work.

Always remember
If you ever have to name variables suffixed by numbers as in your example, you should consider a sequential indexable data structure such as an array or list. In Python, to create a list we do
var = [[mass_a, velocity_a, velocity_a],
       [mass_b, velocity_b, velocity_b]]
If you ever have to name variables with varying suffixes like
var_A=[mass_a, velocity_a, velocity_a]
var_B=[mass_b, velocity_b, velocity_b]
you should consider a non-sequential indexable data structure such as a hash map or dictionary. The keys of this dictionary should be the varying suffixes and the values should be the values assigned to the respective variables. In Python, to create a dictionary we do
var = {'A':[mass_a, velocity_a, velocity_a],
       'B':[mass_b, velocity_b, velocity_b]}
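To show how such structures replace the numbered variables in a loop, here is a short sketch. The numbers are invented for illustration, and the names `particles`, `named` and `total_mass` are assumptions rather than part of the original question.

```python
# Each particle is a three-element list as in the question; values are made up.
particles = [
    [1.0, 0.5, -0.2],
    [2.0, -0.1, 0.3],
]

# The list index plays the role of the number in var1, var2, ...
for i, particle in enumerate(particles):
    print(i, particle)

# A dictionary does the same job when the suffixes are not numbers.
named = {"A": particles[0], "B": particles[1]}
for suffix, particle in named.items():
    print(suffix, particle)

# The first element of each particle is its mass.
total_mass = sum(p[0] for p in particles)
print(total_mass)
```

Adding a third particle then means appending one more entry rather than inventing a new variable name.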

Just to be the devil's advocate here, you can make this approach work as below.
for i in range(2):
    print( globals()["var"+str(i+1)] )

You can put your variables in a list and iterate over it like this:
var_list = [var_a, var_b, ...]
for var in var_list:
    print(var)
Alternatively, you can put your variables in a dictionary like this:
var_dict = {"var_a": var_a, "var_b": var_b, ...}
for var in var_dict:
    print(var_dict[var])

Could you put your variables var1, var2, ... into a list and iterate through the list, instead of relying upon numbered variable names?
Example:
vars = [var1, var2]
for var in vars:
    do_something(var)
