I have CSV data that is already in a datetime format. I need to analyze the data by comparing days and finding the most similar ones. The data is organized in 10-minute intervals, which means 144 rows per day need to be clustered into one day. Ideally, every day would be copied into an array that could be accessed with something like print(array_26.08.2022).
[CSV Screenshot](https://i.stack.imgur.com/bZEAR.png)
I searched online but couldn't find a solution.
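One way to sketch this (since Python variable names can't contain dots like array_26.08.2022) is a dictionary keyed by date string. The column names "timestamp" and "value" below are placeholders for whatever the CSV actually contains, and the synthetic data just stands in for two days of 10-minute readings:

```python
import pandas as pd

# Placeholder data: two days of 10-minute intervals (144 rows each).
df = pd.DataFrame({
    "timestamp": pd.date_range("2022-08-26", periods=288, freq="10min"),
    "value": range(288),
})

# Group each calendar day's 144 rows into one array, keyed "DD.MM.YYYY".
days = {
    day.strftime("%d.%m.%Y"): grp["value"].to_numpy()
    for day, grp in df.groupby(df["timestamp"].dt.date)
}

print(days["26.08.2022"])       # all 144 values for that day
print(len(days["26.08.2022"]))  # 144
```

With the days in a dict of arrays, comparing two days is then e.g. a distance between days["26.08.2022"] and days["27.08.2022"].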
I'm currently struggling to find good information on how to calculate differences, percentages, etc. across several columns and rows of a pandas DataFrame, and on how to show the output in a nice table using Python.
Short example of what I'm going for:
I'm working with NBA data and have gathered a bunch of match statistics for home and away teams during the 2019/20 season (the season finishes later this month). The first row shows the free-throw percentage; "Regular" means regular matches with audience members and "Bubble" denotes the matches without audience members.
A short view of my Pandas dataframe:
How do I automate the calculations using Python code? Feel free to give me examples!
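Since the actual DataFrame isn't shown, here is a minimal sketch assuming stats as rows and "Regular"/"Bubble" as columns; the row labels and numbers are made up:

```python
import pandas as pd

# Placeholder stats: free-throw % and field-goal %, Regular vs Bubble.
df = pd.DataFrame(
    {"Regular": [76.9, 45.8], "Bubble": [79.1, 46.9]},
    index=["FT%", "FG%"],
)

# Absolute difference and percentage change from Regular to Bubble.
df["Diff"] = df["Bubble"] - df["Regular"]
df["Pct change"] = (df["Diff"] / df["Regular"] * 100).round(2)

print(df.to_string())  # plain-text table output
```

to_string() gives a readable console table; for nicer output the same frame can go straight to df.to_excel(...) or df.to_html(...).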
I've got a dataset with multiple time values as below.
Area,Year,Month,Day of Week,Time of Day,Hour of Day
x,2016,1,6.0,108,1.0
z,2016,1,6.0,140,1.0
n,2016,1,6.0,113,1.0
p,2016,1,6.0,150,1.0
r,2016,1,6.0,158,1.0
I have been trying to transform these into a single datetime object to simplify the dataset and enable proper time series analysis against it.
For some reason I have been unable to get the right outcome using Python's datetime library. Would anyone be able to point me in the right direction?
Update: an example of the stats is here.
https://data.pa.gov/Public-Safety/Crash-Incident-Details-CY-1997-Current-Annual-Coun/dc5b-gebx/data
I don't think there is a week column. Hmm. I wonder if I've missed something?
Any suggestions would be great. Really just looking to simplify this dataset. Maybe even create another table/sheet for the causes of crashes, as there are a lot of superfluous columns that take up a lot of space and could be labeled with simple ints.
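A sketch of the assembly step, under two assumptions that would need checking against the data dictionary: "Time of Day" looks like military time (108 → 01:08, consistent with "Hour of Day" = 1), and since there is no day-of-month column, Year/Month/Day-of-Week can't pin down a full date, so the first of the month is used as a placeholder day:

```python
from io import StringIO

import pandas as pd

csv = """Area,Year,Month,Day of Week,Time of Day,Hour of Day
x,2016,1,6.0,108,1.0
z,2016,1,6.0,140,1.0
n,2016,1,6.0,113,1.0"""
df = pd.read_csv(StringIO(csv))

# Split military time into hour and minute, then let pandas assemble
# a datetime from the component columns (day=1 is a placeholder).
t = df["Time of Day"].astype(int)
df["datetime"] = pd.to_datetime(
    dict(year=df["Year"], month=df["Month"], day=1,
         hour=t // 100, minute=t % 100)
)
print(df["datetime"])
```

pd.to_datetime accepts a dict (or DataFrame) of year/month/day/hour/minute columns, which avoids looping with the stdlib datetime module row by row.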
In my quest to retrieve the daily stock prices, over a 10-year period, of the 600 companies in the EUROSTOXX 600 index, I'm facing some difficulties.
First question: does retrieving all of this with one piece of code seem feasible to you?
(I'm also considering adding main financial indicators like ROI, ROE, EBIT, EPS, annual performance... and exporting all of this to one Excel sheet.)
I collected all 600 ISINs. The question is: can I use them to retrieve the data from Yahoo Finance (or anywhere else), or should I find a way to get the 600 real tickers defined by Yahoo?
If so, does anyone have a tip for that? I've been looking for lists, but this index doesn't seem to be very popular.
Thank you for reading !
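Yahoo's endpoints are keyed by Yahoo tickers (e.g. "ASML.AS" for an Amsterdam listing), not ISINs, so the 600 ISINs would need mapping to tickers first; the OpenFIGI API offers such ISIN-to-identifier mapping. A structural sketch of the download-and-export loop, with placeholder tickers and dummy frames standing in for the real yfinance calls:

```python
import pandas as pd
# import yfinance as yf  # pip install yfinance

tickers = ["ASML.AS", "SAP.DE"]  # placeholders for the 600 mapped tickers

frames = {}
for t in tickers:
    # Real call would be: yf.download(t, period="10y", interval="1d")
    frames[t] = pd.DataFrame({"Close": [1.0, 2.0]})  # dummy data

# Stack everything into one table keyed by ticker, ready for Excel.
combined = pd.concat(frames, names=["Ticker", "Row"])
# combined.to_excel("stoxx600.xlsx")  # uncomment; needs openpyxl
print(combined)
```

Stacking with pd.concat keeps everything on one sheet as requested; an alternative is one sheet per ticker via pd.ExcelWriter.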
I am currently looking for a way to automate a search for cells containing text in Excel using Python, then print the matches to a new Excel sheet.
My background in coding is very limited, but I did something similar in Python some years ago: finding text matching one cell and printing it to another sheet. However, this task requires finding information from several cells at once in a large dataset, and with my limited skill set I am unable to tell whether this is possible.
pandas.read_excel can do this; check the official pandas documentation.
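A sketch of the filter step; in practice the frame would come from pd.read_excel("input.xlsx"), and the column name "Notes" is a placeholder for whichever column holds the text:

```python
import pandas as pd

# Placeholder data standing in for pd.read_excel("input.xlsx").
df = pd.DataFrame({"Notes": ["ship delayed", "ok", "delayed again"],
                   "ID": [1, 2, 3]})

# Keep only rows whose text cell contains the search term.
matches = df[df["Notes"].str.contains("delayed", case=False, na=False)]

# matches.to_excel("output.xlsx", index=False)  # write to a new sheet
print(matches)
```

na=False keeps empty cells from raising errors, and case=False makes the match case-insensitive.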