Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 12 months ago.
It seems like a simple question, but I'm new to Python.
I have 10 variables (named A to J), all of them float32 NumPy arrays. I want to apply the following command:
variable = variable*mask[0,:,:]; variable[variable==0] = np.nan
I want to do this for all of the variables at once rather than writing 10 separate lines, while keeping the variable names the same.
Pseudocode example:
FOR all variables A-J
    variable = variable*mask[0,:,:]; variable[variable==0] = np.nan
ENDFOR
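You can do something like this:
variables = [a, b, c]
for i in range(len(variables)):
    # apply the mask first, then set the zeros of the masked result to NaN
    x = variables[i] * mask[0, :, :]
    x[x == 0] = np.nan
    variables[i] = x
Note: this only updates the items in the list; the names a, b and c still refer to the original arrays.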
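If the variable names themselves need to keep pointing at the updated data, one option is to modify each array in place instead of creating a new one. The following is a minimal sketch, assuming the arrays are float32 and share the shape of mask[0,:,:]; the small example arrays and the names A and B are made up for illustration.

```python
import numpy as np

# Hypothetical example data: a mask with one leading axis and two float32 arrays
mask = np.array([[[1, 0], [1, 1]]], dtype=np.float32)
A = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)
B = np.array([[5.0, 6.0], [0.0, 8.0]], dtype=np.float32)

for arr in (A, B):
    # "*=" writes into the existing array, so A and B see the change
    arr *= mask[0, :, :]
    # boolean-index assignment is also in place
    arr[arr == 0] = np.nan
```

Because nothing is rebound, A and B keep their names and now hold the masked values. Note that values that were already 0 before masking also become NaN, exactly as in the original one-liner.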
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 1 year ago.
a = [5,2,7]
If 2 is the most recently added element of a, then return 2.
You can use this syntax:
>>> list_ = [0, 1, 2]
>>> list_[-1]
2
There is no direct way of knowing how a list was modified. Python does not keep track of this information. This means you would have to keep a copy of the list before updating it and run something like
a = [5,2,7]
old_a = a.copy()
a[1] = 0
[old_a[i] for i,v in enumerate(a) if old_a[i]!=v]
However, if you are able to keep track of this, you are certainly able to keep track of the added value, and to run the tests before adding it to the new list. In summary, the design of what you are doing should probably be reconsidered.
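As a rough illustration of keeping track of the added value yourself, here is a minimal sketch of a list subclass that records the last value written to it. The class name TrackedList and the attribute last_added are hypothetical, and only append and single-index assignment are covered.

```python
class TrackedList(list):
    """Hypothetical list subclass that remembers the most recently written value."""

    def __init__(self, *args):
        super().__init__(*args)
        self.last_added = None  # nothing has been added after construction

    def append(self, value):
        super().append(value)
        self.last_added = value

    def __setitem__(self, index, value):
        # Only meant for single-index assignment; slices are not handled specially
        super().__setitem__(index, value)
        self.last_added = value


a = TrackedList([5, 0, 7])
a[1] = 2          # a is now [5, 2, 7]
latest = a.last_added
```

Other mutating methods (insert, extend, and so on) would need the same treatment, which is part of why reconsidering the design is usually the simpler route.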
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 months ago.
Problem Statement
I have the CSV data shown in the image. From this, I need to keep only RegionName, State and the quarterly mean values from 2000 to 2016. I also want to use a multi-index of [State, RegionName].
I am working on the CSV file with pandas in Python, as shown in the screenshot.
Thank you in advance.
Right before the troublesome for year in range(...) loop, you did:
house_data.columns = pd.to_datetime(house_data.columns).to_period('M')
That means your columns are no longer strings. So inside the for loop:
house_data[str(year)+'q2'] = house_data[[str(year)+'-04',...]].mean(axis=1)
would fail and raise that error, since no column has a string name anymore. To fix this, do this instead:
house_data.columns = pd.to_datetime(house_data.columns).to_period('M').strftime('%Y-%m')
However, you are better off doing:
house_data.columns = pd.to_datetime(house_data.columns).to_period('Q')
house_data.groupby(level=0, axis=1).mean()
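To make the quarterly grouping concrete, here is a small sketch on made-up data. It assumes the columns are month strings such as '2000-01' (the real CSV layout is not shown here). Recent pandas releases deprecate the axis=1 argument of groupby, so this variant transposes, groups on the index, and transposes back.

```python
import pandas as pd

# Hypothetical stand-in for house_data: two regions, four monthly columns
house_data = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]],
    columns=["2000-01", "2000-02", "2000-03", "2000-04"],
)

# Convert the month labels to quarterly periods (2000Q1, 2000Q1, 2000Q1, 2000Q2)
house_data.columns = pd.to_datetime(house_data.columns, format="%Y-%m").to_period("Q")

# Group columns that fall in the same quarter and average them
quarterly = house_data.T.groupby(level=0).mean().T
```

Each row of quarterly then holds one mean per quarter, with the quarter periods as column labels.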
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 2 years ago.
At every step, should I introduce a new variable name, or can I keep reusing the same name? Kindly advise what the best practice is and why.
df1 = df.withColumn('last_insert_timestamp', lit(datetime.now()))
df2 = df1.withColumn('process_date', lit(rundate))
Versus
df = df.withColumn('last_insert_timestamp', lit(datetime.now()))
df = df.withColumn('process_date', lit(rundate))
There is no best practice for that. It depends on what you want to do.
In Python, variables are just labels assigned to an object. So if you need your original DF object to be modified through your code then change the assignment to the newly generated DF.
Now, if you need to keep the first DF for other processing later in the code, then you may assign a new variable name.
You might find more explanations here: Reassigning Variables in Python
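The "labels" idea can be seen with plain Python objects. This is only an analogy, not PySpark code: sorted() stands in for a call like withColumn that returns a new object and leaves the original untouched, and the variable names are made up.

```python
original = [3, 1, 2]
alias = original             # a second label for the same list object

result = sorted(original)    # returns a NEW list; the original is unchanged
original = result            # rebinds the label "original" to the new list

still_same_object = alias    # the old object is still reachable through "alias"
```

After the rebinding, original refers to the sorted list, while alias still refers to the untouched first list. Reusing one name simply drops your handle on the earlier object; keeping a separate name preserves it.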
You can also chain the calls like this:
df = df.withColumn('last_insert_timestamp', lit(datetime.now())) \
    .withColumn('process_date', lit(rundate))
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 3 years ago.
Please help me reduce the time complexity of this nested loop in Python.
df is a dataframe with, say, 3 columns, for example name, city and date.
The rep dataframe holds the means grouped by 2 columns of df, name and city. I need to attach the mean from rep back onto df.
for i in range(0,len(rep)):
    for j in range(k,len(df)):
        if df["X"][j] == rep["X"][i]:
            df["Mean"][j] = rep["Mean"][i]
        else:
            k=j
            break
What you want is something like:
df.set_index('X').join(rep.set_index('X'))
Setting the join keys as the index makes the join much faster. After the join, you can remove the old mean column (with the DataFrame drop method) and filter out any values that you don't want.
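A closely related alternative to join is merge on the key columns. Below is a minimal sketch with made-up data, assuming the keys are name and city as described in the question; the value column and the sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical data: "value" stands in for whatever column is being averaged
df = pd.DataFrame({
    "name": ["a", "a", "b"],
    "city": ["x", "x", "y"],
    "value": [1.0, 3.0, 5.0],
})

# rep holds one mean per (name, city) pair
rep = (
    df.groupby(["name", "city"], as_index=False)["value"]
    .mean()
    .rename(columns={"value": "Mean"})
)

# A left merge attaches the matching mean to every row of df, with no Python loop
merged = df.merge(rep, on=["name", "city"], how="left")
```

The left merge keeps every row of df in its original order and replaces the nested loop with a single vectorized operation.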
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
I am looking for a function in Python that randomly chooses one of several predefined objects.
I have 4 locations defined:
w_lewo = Location(345,400)
w_prawo = Location(1570,400)
w_gore = Location(945,900)
w_dol = Location(945,870)
And I need a function that randomly clicks one of the locations above.
As you have mentioned yourself, you can store your Location objects in a list:
objectsList = [w_lewo, w_prawo, w_gore, w_dol]
Then you can use the randint function from the random library, as was suggested here, to pick a random index between 0 and the length of your list minus one (so 0 to 3 in our case).
randomListElement = objectsList[random.randint(0,len(objectsList)-1)]
Then you can do whatever you want with this element, click for example.
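The standard library also offers random.choice, which picks an element directly without computing an index. In this minimal sketch, Location is replaced by a hypothetical namedtuple stand-in so the example runs on its own; in the real script the existing Location objects would be used instead.

```python
import random
from collections import namedtuple

# Hypothetical stand-in for the Location class used in the question
Location = namedtuple("Location", ["x", "y"])

w_lewo = Location(345, 400)
w_prawo = Location(1570, 400)
w_gore = Location(945, 900)
w_dol = Location(945, 870)

objects_list = [w_lewo, w_prawo, w_gore, w_dol]

# random.choice returns one element of a non-empty sequence, chosen uniformly
random_location = random.choice(objects_list)
```

The chosen random_location can then be passed to whatever click routine the script already uses.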