Average data by each hour in python [closed] - python

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I have data like below:
Data Columns:
DateTime,Data1,Data2,Data3,Month,Date,Year,Hour,Minutes
1/1/2017 0:00,1.1,2.2,3.3,1,1,2017,0,00
1/1/2017 0:00,1.1,2.2,3.3,1,1,2017,0,15
1/1/2017 0:00,1.1,2.2,3.3,1,1,2017,0,30
1/1/2017 0:00,1.1,2.2,3.3,1,1,2017,1,45
I need to average the data columns ('WS', 'VWS' .... 'SR') by hour. The DateTime column is reported every 15 minutes.

I have an answer to my own question. Posting it here so that others can benefit:
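import pandas as pd
df = pd.read_csv("MetData.csv")
# Parse the timestamps and use them as the index so that resample() can bin by hour
df['NewDateTime'] = pd.to_datetime(df['DateTime'])
df.index = df['NewDateTime']
# Average every numeric column within each hour ('h' is the hourly alias)
df_p = df.resample('h').mean(numeric_only=True)
# Rebuild the calendar columns from the resampled hourly index
df_p['Month'] = df_p.index.month
df_p['Year'] = df_p.index.year
df_p['Date'] = df_p.index.day
df_p['Hour'] = df_p.index.hour
# Write the hourly averages to Excel (requires the xlsxwriter package);
# the context manager saves and closes the file on exit
with pd.ExcelWriter('MetData_Orig1.xlsx', engine='xlsxwriter') as writer:
    df_p.to_excel(writer, sheet_name='Sheet1')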
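A minimal, self-contained sketch of the same hourly-averaging idea, using a small in-memory table instead of MetData.csv. The readings and the single Data1 column are made up for illustration; only the resample-then-average step mirrors the answer above.

```python
import pandas as pd

# Hypothetical 15-minute readings; the values are illustrative only.
df = pd.DataFrame({
    "DateTime": ["1/1/2017 0:00", "1/1/2017 0:15", "1/1/2017 0:30",
                 "1/1/2017 0:45", "1/1/2017 1:00", "1/1/2017 1:15"],
    "Data1": [1.0, 2.0, 3.0, 4.0, 10.0, 20.0],
})

# A DatetimeIndex is what lets resample() group rows into hourly bins.
df.index = pd.to_datetime(df["DateTime"])

# Select the numeric column(s) explicitly, then average within each hour.
hourly = df[["Data1"]].resample("h").mean()

# Derive the calendar field from the hourly index itself.
hourly["Hour"] = hourly.index.hour

print(hourly)
```

Taking the hour from the resampled index avoids aligning the hourly frame against the original 15-minute rows.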


How to delete punctuation and number from element in the list? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 10 months ago.
I have a list with many column names that I work with for my project.
My list looks like this:
list_colunm = ['solar [W].1', 'Wind [W].02', 'Caz [W].33']
(and other elements; it's a long list).
Can you help me with some methods to delete the .1, .02 and .33 suffixes?
Standard Python:
list_column = ['solar [W].1', 'Wind [W].02', 'Caz [W].33']
list_shortnames = [x.rsplit('.', 1)[0] for x in list_column]
Output:
['solar [W]', 'Wind [W]', 'Caz [W]']
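As a hedged variation on the list approach, the sketch below strips the suffix only when it really is a dot followed by digits, so names that contain other dots, or no dot at all, are left alone. The helper name strip_numeric_suffix and the last two sample names are hypothetical, added only to show those edge cases.

```python
def strip_numeric_suffix(name):
    """Drop a trailing '.<digits>' suffix, leaving any other dots alone."""
    head, dot, tail = name.rpartition(".")
    # Strip only when a dot was found and everything after it is digits.
    return head if dot and tail.isdigit() else name


names = ["solar [W].1", "Wind [W].02", "Caz [W].33", "v1.2 temp [C].7", "plain"]
cleaned = [strip_numeric_suffix(n) for n in names]
print(cleaned)
```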
Pandas:
The simplest way is to use rename() with a dict as the mapping.
import pandas as pd
mapping = {"solar [W].1": "solar [W]", "Wind [W].02": "Wind [W]", "Caz [W].33": "Caz [W]"}
df.rename(mapping, inplace=True, axis='columns')
More flexible alternative (@mozway):
df.rename(columns=lambda x: x.rsplit('.', maxsplit=1)[0])
Output:
   solar [W]  Wind [W]  Caz [W]
0          1         2        3
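The callable form of rename() can be tried on a toy frame as follows. The one-row DataFrame is an assumption made to match the output shown above; note that str.rsplit takes maxsplit, not n.

```python
import pandas as pd

# Toy frame whose column names carry the numeric suffixes from the question.
df = pd.DataFrame([[1, 2, 3]],
                  columns=["solar [W].1", "Wind [W].02", "Caz [W].33"])

# rename() returns a new frame; the callable is applied to every column label.
renamed = df.rename(columns=lambda c: c.rsplit(".", maxsplit=1)[0])

print(list(renamed.columns))
```

Because no inplace=True is passed, df keeps its original labels and renamed holds the cleaned ones.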

How to sort the values within a column based on the values assigned to it? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 1 year ago.
I have a dataframe as follows. The values in each data column are separated by ';', and each item is assigned a numeric value. I want to sort the items based on that numeric value.
import pandas as pd
ab = {
    'Category': ['AD', 'AG'],
    'data1': ['a, b=4; b=3; dk=1; kc/d2=8', 'km=4; df,md=8; lko=10; cog=12'],
    'data2': ['a=9; kd=1; mn=1; fg=3', 'kl=6; md=1; jhk=3, b &j=4; ghg=7']
}
df1 = pd.DataFrame(ab)
df1
   Category                          data1                             data2
0        AD     a, b=4; b=3; dk=1; kc/d2=8             a=9; kd=1; mn=1; fg=3
1        AG  km=4; df,md=8; lko=10; cog=12  kl=6; md=1; jhk=3, b &j=4; ghg=7
I want to sort the items in each column based on the value assigned to them.
The expected output is:
   Category                          data1                             data2
0        AD     kc/d2=8; a, b=4; b=3; dk=1            a=9; fg=3; kd=1; mn=1;
1        AG  cog=12; lko=10; df,md=8; km=4  ghg=7; kl=6; b &j=4; jhk=3; md=1
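You can try the following, which sorts the items by their last character and therefore works only for single-digit values:
# applymap is deprecated since pandas 2.1; DataFrame.map is its replacement there
data_cols = df1.filter(like='data').columns
df1[data_cols] = df1[data_cols].applymap(
    lambda s: '; '.join(sorted(s.split('; '), key=lambda x: x[-1], reverse=True)))
If there is a possibility that you have numbers > 9, sort on the integer after the last '=' instead:
data_cols = df1.filter(like='data').columns
df1[data_cols] = df1[data_cols].applymap(
    lambda s: '; '.join(sorted(s.split('; '), key=lambda x: int(x.rsplit('=', 1)[-1]), reverse=True)))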
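The integer-based sort can also be written as a named helper applied column by column, which sidesteps the applymap/map naming difference between pandas versions. This is a sketch under the assumption that every item ends in '=<integer>'; the helper name sort_items_desc is hypothetical.

```python
import pandas as pd


def sort_items_desc(cell):
    """Sort '; '-separated 'name=value' items by value, largest first."""
    items = cell.split("; ")
    # The value is the integer after the last '=' of each item.
    return "; ".join(
        sorted(items, key=lambda item: int(item.rsplit("=", 1)[-1]), reverse=True)
    )


df1 = pd.DataFrame({
    "Category": ["AD", "AG"],
    "data1": ["a, b=4; b=3; dk=1; kc/d2=8", "km=4; df,md=8; lko=10; cog=12"],
    "data2": ["a=9; kd=1; mn=1; fg=3", "kl=6; md=1; jhk=3, b &j=4; ghg=7"],
})

# Apply the helper to every column whose name starts with 'data'.
for col in [c for c in df1.columns if c.startswith("data")]:
    df1[col] = df1[col].map(sort_items_desc)

print(df1.to_string())
```

Python's sort is stable, so items with equal values (kd=1 and mn=1) keep their original relative order.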

How to create a nested dictionary in python [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
Hello all, I am just learning dictionaries in Python. I have some data; please let me know how to create a nested dictionary from it. The data, which contains duplicate values, is in an Excel file and is shown below. Please explain using a for loop.
Name Account Dept
John AC Lab1
Dev AC Lab1
Dilip AC Lab1,Lab2
Sat AC Lab1,Lab2
Dina AC Lab3
Surez AC Lab4
I need the result in below format:
{
    'AC': {
        'Lab1': ['John', 'Dev', 'Dilip', 'Sat'],
        'Lab2': ['Dilip', 'Sat'],
        'Lab3': ['Dina'],
        'Lab4': ['Surez']
    }
}
Something like this should get you closer to an answer, but I'd need your input file to optimize it:
import xlrd  # note: xlrd 2.0 and later read only .xls files
from collections import defaultdict
wb = xlrd.open_workbook("<your filename>")
sheet_names = wb.sheet_names()
sheet = wb.sheet_by_name(sheet_names[0])
# Outer key: account; inner key: lab; value: list of names.
# defaultdict needs a callable, hence the lambda.
d = defaultdict(lambda: defaultdict(list))
# Start at row 1, assuming row 0 holds the Name/Account/Dept header
for row_idx in range(1, sheet.nrows):
    name = sheet.cell(row_idx, 0).value
    account = sheet.cell(row_idx, 1).value
    depts = sheet.cell(row_idx, 2).value
    for lab in depts.split(","):
        d[account][lab.strip()].append(name)
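The grouping loop itself can be checked without an Excel file. The sketch below hard-codes the rows from the question as tuples (an assumption standing in for the spreadsheet) and builds the nested dictionary with plain for loops.

```python
from collections import defaultdict

# Rows copied from the question: (Name, Account, Dept).
rows = [
    ("John", "AC", "Lab1"),
    ("Dev", "AC", "Lab1"),
    ("Dilip", "AC", "Lab1,Lab2"),
    ("Sat", "AC", "Lab1,Lab2"),
    ("Dina", "AC", "Lab3"),
    ("Surez", "AC", "Lab4"),
]

# Missing accounts and labs are created on first access.
nested = defaultdict(lambda: defaultdict(list))
for name, account, depts in rows:
    # A person may belong to several labs, separated by commas.
    for lab in depts.split(","):
        nested[account][lab.strip()].append(name)

# Convert to plain dicts for printing or comparison.
result = {account: dict(labs) for account, labs in nested.items()}
print(result)
```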

Get stock data problems [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I was writing a script to get stock data, and the output was just:
In Progress
[]
What's the problem?
import quandl
from datetime import datetime as dt

def get_stock_data(stock_ticker):
    print("In Progress")
    start_date = dt(2019, 1, 1)
    end_date = dt.now()
    quandl_api_key = "tJDGptkdfqwjYi123RVV"
    quandl.ApiConfig.api_key = quandl_api_key
    source = "WIKI/" + stock_ticker
    data = quandl.get(source, start_date=str(start_date), end_date=str(end_date))
    data = data[["Open", "High", "Low", "Volume", "Close"]].values
    print(data)
    return data

get_stock_data("AAPL")
There's nothing wrong with your code. However, recent stock data is a Premium product from Quandl, and I presume you are on the free subscription, which is why your dataframe comes back empty. If you change the dates to 2017, you will get some results, but that seems to be as far as the free subscription goes.
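Since an empty result prints as a bare [], one way to make the situation visible is to check for an empty dataframe before taking .values. The sketch below does not call Quandl at all; the helper name to_price_array and the two sample frames are hypothetical stand-ins for whatever the API returns.

```python
import pandas as pd

COLUMNS = ["Open", "High", "Low", "Volume", "Close"]


def to_price_array(data, columns=COLUMNS):
    """Return the selected columns as an array, or None when no rows came back."""
    if data.empty:
        print("No rows returned for the requested date range")
        return None
    return data[columns].values


# Stand-ins for an empty response and a response with one row.
empty = pd.DataFrame(columns=COLUMNS)
filled = pd.DataFrame([[1.0, 2.0, 0.5, 100, 1.5]], columns=COLUMNS)

print(to_price_array(empty))
print(to_price_array(filled))
```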

Extract data in R or Python from data file with no column headers [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I have a txt file with several columns. See the sample data below.
25  180701  1  12
25  180701  2  15
25  180701  3  11
25  180702  1  11
25  180702  2  14
25  180722  2  14
14  180701  1  11
14  180701  2  13
There are no column headers. Column 1 is ID, column 2 is date, column 3 is hour, and column 4 is value. I am trying to look up the number 25 in column 1 and extract the values for all hours during the period 180701 to, say, 180705. The result would be a new text file with the following data:
25  180701  1  12
25  180701  2  15
25  180701  3  11
25  180702  1  11
25  180702  2  14
Any help in R or Python is appreciated. Thanks!
When reading the file with read.csv/read.table, set header = FALSE and supply the column names with col.names:
df1 <- read.csv("file.csv", header = FALSE,
                col.names = c("ID", "date", "Hour", "value"))
Then subset the values:
subset(df1, ID == 25 & (date %in% 180701:180705), select = 1:4)
In R, readr::read_delim() has a col_names parameter that you can set to FALSE (F):
> readr::read_delim('hi;1;T\nbye;2;F', delim = ';', col_names = F)
# A tibble: 2 x 3
  X1       X2 X3
  <chr> <int> <lgl>
1 hi        1 TRUE
2 bye       2 FALSE
In Python, try this:
import pandas as pd
# Read a CSV file that has no header row; header=None makes that explicit
# (for a whitespace-separated file like the sample, also pass sep=r'\s+')
df = pd.read_csv('test.csv', header=None)
df
# Then rename the auto-generated integer column labels
df2 = df.rename({0: 'ID', 1: 'Date', 2: 'Hours', 3: 'Value'}, axis='columns')
df2
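To finish the task from the question, the rows still need to be filtered by ID and date range. Here is a minimal sketch that reads the sample data from an in-memory string (standing in for the txt file), assuming whitespace-separated columns and integer dates.

```python
import io

import pandas as pd

# Sample rows from the question; io.StringIO stands in for the real file.
sample = """25  180701  1  12
25  180701  2  15
25  180701  3  11
25  180702  1  11
25  180702  2  14
25  180722  2  14
14  180701  1  11
14  180701  2  13
"""

# sep=r"\s+" splits on runs of whitespace; names= supplies the missing header.
df = pd.read_csv(io.StringIO(sample), sep=r"\s+", header=None,
                 names=["ID", "Date", "Hour", "Value"])

# Keep ID 25 within the date range; between() includes both endpoints.
selected = df[(df["ID"] == 25) & df["Date"].between(180701, 180705)]

# With no path given, to_csv() returns the text instead of writing a file.
out = selected.to_csv(sep=" ", header=False, index=False)
print(out)
```

Passing a file name as the first argument of to_csv() would write the same text to disk.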
