I have a web page which I can access from my server. The contents of the web page are as below.
xys.server.com - /xys/reports/
[To Parent Directory]
3/4/2021 6:09 AM <dir> All_Master
3/4/2021 6:09 AM <dir> Hartland
3/4/2021 6:09 AM <dir> Hauppauge
3/4/2021 6:09 AM <dir> Hazelwood
2/15/2019 7:41 AM 58224 NetBackup Retention and Full Backup Occupancy.xlsx
1/1/2022 11:00 AM 23959 OpsCenter_All_Master_Server_Backup_Report_01_01_2022_10_00_45_259_AM_49.zip
2/1/2022 11:00 AM 18989 OpsCenter_All_Master_Server_Backup_Report_01_02_2022_10_00_04_813_AM_4.zip
3/1/2022 11:00 AM 18969 OpsCenter_All_Master_Server_Backup_Report_01_03_2022_10_00_24_664_AM_17.zip
4/1/2021 10:00 AM 21709 OpsCenter_All_Master_Server_Backup_Report_01_04_2021_10_00_02_266_AM_31.zip
5/1/2021 10:00 AM 27491 OpsCenter_All_Master_Server_Backup_Report_01_05_2021_10_00_27_655_AM_11.zip
6/1/2021 10:00 AM 21260 OpsCenter_All_Master_Server_Backup_Report_01_06_2021_10_00_54_053_AM_19.zip
7/1/2021 10:00 AM 19898 OpsCenter_All_Master_Server_Backup_Report_01_07_2021_10_00_12_544_AM_42.zip
8/1/2021 10:00 AM 22642 OpsCenter_All_Master_Server_Backup_Report_01_08_2021_10_00_28_384_AM_25.zip
9/1/2021 10:00 AM 19426 OpsCenter_All_Master_Server_Backup_Report_01_09_2021_10_00_43_851_AM_70.zip
10/1/2021 10:01 AM 19149 OpsCenter_All_Master_Server_Backup_Report_01_10_2021_10_01_00_422_AM_7.zip
11/1/2021 10:00 AM 19638 OpsCenter_All_Master_Server_Backup_Report_01_11_2021_10_00_15_326_AM_20.zip
12/1/2021 11:00 AM 19375 OpsCenter_All_Master_Server_Backup_Report_01_12_2021_10_00_29_943_AM_13.zip
1/2/2022 11:00 AM 22281 OpsCenter_All_Master_Server_Backup_Report_02_01_2022_10_00_45_803_AM_37.zip
2/2/2022 11:00 AM 19435 OpsCenter_All_Master_Server_Backup_Report_02_02_2022_10_00_05_577_AM_71.zip
3/2/2022 11:00 AM 19380 OpsCenter_All_Master_Server_Backup_Report_02_03_2022_10_00_24_973_AM_90.zip
4/2/2021 10:00 AM 21411 OpsCenter_All_Master_Server_Backup_Report_02_04_2021_10_00_03_069_AM_56.zip
Now I need to get the contents of this page in a structured format. I am using the requests module, but the returned data is highly unstructured and difficult to parse. The code is below.
import requests

# url is the address of the directory listing shown above
req = requests.get(url)
print(req.content.decode('utf-8'))
The output looks like this:
<pre>[To Parent Directory]<br><br> 3/4/2021 6:09 AM <dir> All_Master<br> 3/4/2021 6:09 AM <dir> Hartland<br> 3/4/2021 6:09 AM <dir> Hauppauge<br> 3/4/2021 6:09 AM <dir> Hazelwood<br> 2/15/2019 7:41 AM 58224 NetBackup Retention and Full Backup Occupancy.xlsx<br> 1/1/2022 11:00 AM 23959 OpsCenter_All_Master_Server_Backup_Report_01_01_2022_10_00_45_259_AM_49.zip<br> 2/1/2022 11:00 AM 18989 OpsCenter_All_Master_Server_Backup_Report_01_02_2022_10_00_04_813_AM_4.zip<br> 3/1/2022 11:00 AM 18969 OpsCenter_All_Master_Server_Backup_Report_01_03_2022_10_00_24_664_AM_17.zip<br> 4/1/2021 10:00 AM 21709 OpsCenter_All_Master_Server_Backup_Report_01_04_2021_10_00_02_266_AM_31.zip<br> 5/1/2021 10:00 AM 27491 OpsCenter_All_Master_Server_Backup_Report_01_05_2021_10_00_27_655_AM_11.zip<br> 6/1/2021 10:00 AM 21260 OpsCenter_All_Master_Server_Backup_Report_01_06_2021_10_00_54_053_AM_19.zip<br> 7/1/2021 10:00 AM 19898 OpsCenter_All_Master_Server_Backup_Report_01_07_2021_10_00_12_544_AM_42.zip<br> 8/1/2021 10:00 AM 22642 OpsCenter_All_Master_Server_Backup_Report_01_08_2021_10_00_28_384_AM_25.zip<br> 9/1/2021 10:00 AM 19426 OpsCenter_All_Master_Server_Backup_Report_01_09_2021_10_00_43_851_AM_70.zip<br> 10/1/2021 10:01 AM 19149 OpsCenter_All_Master_Server_Backup_Report_01_10_2021_10_01_00_422_AM_7.zip<br> 11/1/2021 10:00 AM 19638 OpsCenter_All_Master_Server_Backup_Report_01_11_2021_10_00_15_326_AM_20.zip<br> 12/1/2021 11:00 AM 19375 OpsCenter_All_Master_Server_Backup_Report_01_12_2021_10_00_29_943_AM_13.zip<br> 1/2/2022 11:00 AM 22281 OpsCenter_All_Master_Server_Backup_Report_02_01_2022_10_00_45_803_AM_37.zip<br> 2/2/2022 11:00 AM 19435 OpsCenter_All_Master_Server_Backup_Report_02_02_2022_10_00_05_577_AM_71.zip<br> 3/2/2022 11:00 AM 19380 OpsCenter_All_Master_Server_Backup_Report_02_03_2022_10_00_24_973_AM_90.zip<br> 4/2/2021 10:00 AM 21411 OpsCenter_All_Master_Server_Backup_Report_02_04_2021_10_00_03_069_AM_56.zip<br> 5/2/2021 10:00 AM 24191 
OpsCenter_All_Master_Server_Backup_Report_02_05_2021_10_00_28_556_AM_14.zip<br> 6/2/2021 10:00 AM 21675 OpsCenter_All_Master_Server_Backup_Report_02_06_2021_10_00_54_962_AM_73.zip<br> 7/2/2021 10:00 AM 19954 OpsCenter_All_Master_Server_Backup_Report_02_07_2021_10_00_13_058_AM_31.zip<br> 8/2/2021 10:00 AM 21085 OpsCenter_All_Master_Server_Backup_Report_02_08_2021_10_00_28_778_AM_79.zip<br> 9/2/2021 10:00 AM 19691 OpsCenter_All_Master_Server_Backup_Report_02_09_2021_10_00_44_294_AM_5.zip<br> 10/2/2021 10:01 AM 23477 OpsCenter_All_Master_Server_Backup_Report_02_10_2021_10_01_00_793_AM_9.zip<br> 11/2/2021 10:00 AM 2
This is very unstructured.
Please suggest a way to turn this content into something more readable that is easy to parse.
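One possible approach, sketched under the assumption that the response text looks exactly like the output shown above (entries separated by `<br>`, each one made of a date, a time, either `<dir>` or a size, and a name). The `parse_listing` helper and the `ENTRY` pattern are hypothetical names, and the sample string is a shortened copy of the output in the question. If the real markup wraps names in anchor tags or escapes `<dir>`, the pattern would need adjusting.

```python
import re
from datetime import datetime

# Shortened sample copied from the output shown in the question.
sample = (
    "<pre>[To Parent Directory]<br><br> 3/4/2021 6:09 AM <dir> All_Master<br>"
    " 2/15/2019 7:41 AM 58224 NetBackup Retention and Full Backup Occupancy.xlsx<br>"
    " 1/1/2022 11:00 AM 23959 OpsCenter_All_Master_Server_Backup_Report_"
    "01_01_2022_10_00_45_259_AM_49.zip<br>"
)

# One listing entry: date, time, "<dir>" or a byte size, then the name.
ENTRY = re.compile(
    r"^(?P<date>\d{1,2}/\d{1,2}/\d{4})\s+"
    r"(?P<time>\d{1,2}:\d{2})\s+(?P<ampm>[AP]M)\s+"
    r"(?P<size><dir>|\d+)\s+"
    r"(?P<name>.+)$"
)


def parse_listing(text):
    """Return one dict per directory or file entry found in the listing."""
    entries = []
    for chunk in text.split("<br>"):
        match = ENTRY.match(chunk.strip())
        if match is None:
            continue  # header line, empty chunk, or truncated entry
        stamp = f"{match['date']} {match['time']} {match['ampm']}"
        is_dir = match["size"] == "<dir>"
        entries.append({
            # The listing uses month/day/year (see 2/15/2019).
            "modified": datetime.strptime(stamp, "%m/%d/%Y %I:%M %p"),
            "is_dir": is_dir,
            "size": None if is_dir else int(match["size"]),
            "name": match["name"],
        })
    return entries


entries = parse_listing(sample)
for entry in entries:
    print(entry)
```

The resulting list of dicts can be passed to `pandas.DataFrame(entries)` if a tabular view is more convenient.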
I have a time period from 9:00 till 22:00 and I need to list all possible durations with a step of 30 minutes within this period. E.g.
9:00 - 9:30
9:00 - 10:00
9:00 - 10:30
...
21:00 - 22:00
21:30 - 22:00
I've googled and found itertools.combinations() for numbers, but nothing comparable for dates.
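A minimal sketch of one way to do this: `itertools.combinations()` works on any sequence, so it can be applied to a list of `datetime` objects directly. The helper names `time_points` and `label`, and the arbitrary date used to carry the times, are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from itertools import combinations


def time_points(start, end, step):
    """Return every point from start to end (inclusive) spaced by step."""
    points = []
    current = start
    while current <= end:
        points.append(current)
        current += step
    return points


def label(moment):
    """Format a time as in the question, e.g. 9:00 rather than 09:00."""
    return f"{moment.hour}:{moment.minute:02d}"


day = datetime(2000, 1, 1)  # arbitrary date, only the time of day matters
points = time_points(day.replace(hour=9), day.replace(hour=22),
                     timedelta(minutes=30))

# Every pair (a, b) with a earlier than b is one possible duration.
durations = [f"{label(a)} - {label(b)}" for a, b in combinations(points, 2)]

for line in durations[:3]:
    print(line)
print("...")
print(durations[-1])
```

Because the list of points is already sorted, each pair comes out with the earlier time first, so no extra filtering is needed.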
I created two different DataFrame tables and integrated them into a tkinter GUI.
The first table looks like this:
Entry  Start             Finish            Total Time (Hour)  Status       Reason for Stoppage
1      23.05.2020 07:30  23.05.2020 08:30  01:00              MANUFACTURE
2      23.05.2020 08:30  23.05.2020 12:00  03:30              MANUFACTURE
3      23.05.2020 12:00  23.05.2020 13:00  01:00              STOPPAGE     MALFUNCTION
4      23.05.2020 13:00  23.05.2020 13:45  00:45              MANUFACTURE
5      23.05.2020 13:45  23.05.2020 17:30  03:45              MANUFACTURE
And the second table looks like this:
Start  Finish  Reason for Stoppage
10:00  10:15   Coffee Break
12:00  12:30   Lunch Break
15:00  15:15   Coffee Break
The main task is to combine these tables into a third one. The rows should be arranged chronologically, and the program has to create new rows itself so that every start and finish time from both tables appears in the result. I could not achieve this by concatenating or merging the tables.
The third table has to look like this:
Entry  Start             Finish            Total Time (Hour)  Status       Reason for Stoppage
1      23.05.2020 07:30  23.05.2020 08:30  01:00              MANUFACTURE
2      23.05.2020 08:30  23.05.2020 10:00  01:30              MANUFACTURE
3      23.05.2020 10:00  23.05.2020 10:15  00:15              STOPPAGE     Coffee Break
4      23.05.2020 10:15  23.05.2020 12:00  01:45              MANUFACTURE
5      23.05.2020 12:00  23.05.2020 12:30  00:30              STOPPAGE     Lunch Break
6      23.05.2020 12:30  23.05.2020 13:00  00:30              MANUFACTURE
7      23.05.2020 13:00  23.05.2020 13:45  00:45              STOPPAGE     MALFUNCTION
8      23.05.2020 13:45  23.05.2020 15:00  01:15              MANUFACTURE
9      23.05.2020 15:00  23.05.2020 15:15  00:15              STOPPAGE     Coffee Break
10     23.05.2020 15:15  23.05.2020 17:30  02:15              MANUFACTURE
I hope I explained the problem clearly. Thanks in advance.
import tkinter as tk

import pandas as pd
from pandastable import Table

root = tk.Tk()
root.title("Çalışma Ve Mola Saatleri")  # Turkish for "Working and Break Hours"
root.geometry("1800x1600")

# First table: the work log
work = {"Entry": ["1", "2", "3", "4", "5"],
        "Start": ["23.05.2020 07:30", "23.05.2020 08:30",
                  "23.05.2020 12:00", "23.05.2020 13:00", "23.05.2020 13:45"],
        "Finish": ["23.05.2020 08:30", "23.05.2020 12:00",
                   "23.05.2020 13:00", "23.05.2020 13:45", "23.05.2020 17:30"],
        "Total Time (Hour)": ["01:00", "03:30", "01:00", "00:45", "03:45"],
        "Status": ["MANUFACTURE", "MANUFACTURE", "STOPPAGE", "MANUFACTURE", "MANUFACTURE"],
        "Reason For Stoppage": [" ", " ", "MALFUNCTION", " ", " "]}
graph1 = pd.DataFrame(work)
# pack() alone positions the frame; the earlier place() call had no effect
frame = tk.Frame(root)
frame.pack(anchor=tk.W, padx=100, pady=50, ipadx=120, ipady=30)
pt = Table(frame, dataframe=graph1)
pt.show()

# Second table: the scheduled breaks
Break = {"Start": ["10:00", "12:00", "15:00"],
         "Finish": ["10:15", "12:30", "15:15"],
         "Reason For Stoppage": ["Coffee Break", "Lunch Break", "Coffee Break"]}
graph2 = pd.DataFrame(Break)
frame2 = tk.Frame(root)
frame2.pack(anchor=tk.NE, padx=150, ipadx=20, ipady=10)
pt2 = Table(frame2, dataframe=graph2)
pt2.show()

# Attempted combination: concat only stacks the rows of both tables,
# it does not split the work intervals at the break times
graph3 = pd.concat([graph1, graph2])
frame3 = tk.Frame(root)
frame3.pack(anchor=tk.SW, padx=100, ipadx=120, ipady=500)
pt3 = Table(frame3, dataframe=graph3)
pt3.show()

root.mainloop()
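A rough sketch of how the splitting could be done with plain pandas, leaving the tkinter part aside. The helper `split_by_breaks` is a hypothetical name, and the sketch assumes that all rows fall on a single day and that the breaks are sorted and do not overlap each other. Each work row is cut at the break boundaries. The break part becomes a STOPPAGE row carrying the break's reason, and the remaining parts keep the status of the original row. The intervals and durations match the desired table; the status of the rows between 12:30 and 13:45 follows the first table, which differs from the desired table in the question.

```python
import pandas as pd

FMT = "%d.%m.%Y %H:%M"

work = pd.DataFrame({
    "Start": ["23.05.2020 07:30", "23.05.2020 08:30", "23.05.2020 12:00",
              "23.05.2020 13:00", "23.05.2020 13:45"],
    "Finish": ["23.05.2020 08:30", "23.05.2020 12:00", "23.05.2020 13:00",
               "23.05.2020 13:45", "23.05.2020 17:30"],
    "Status": ["MANUFACTURE", "MANUFACTURE", "STOPPAGE",
               "MANUFACTURE", "MANUFACTURE"],
    "Reason For Stoppage": ["", "", "MALFUNCTION", "", ""],
})
breaks = pd.DataFrame({
    "Start": ["10:00", "12:00", "15:00"],
    "Finish": ["10:15", "12:30", "15:15"],
    "Reason For Stoppage": ["Coffee Break", "Lunch Break", "Coffee Break"],
})


def split_by_breaks(work, breaks, fmt=FMT):
    """Cut every work interval at the break times that overlap it."""
    rows = []
    for w in work.to_dict("records"):
        start = pd.to_datetime(w["Start"], format=fmt)
        finish = pd.to_datetime(w["Finish"], format=fmt)
        day = start.normalize()  # midnight of the row's date
        cursor = start
        for b in breaks.to_dict("records"):
            # Clip the break to the current work interval.
            b_start = max(day + pd.to_timedelta(b["Start"] + ":00"), start)
            b_end = min(day + pd.to_timedelta(b["Finish"] + ":00"), finish)
            if b_start >= b_end:
                continue  # the break does not overlap this row
            if cursor < b_start:
                rows.append((cursor, b_start, w["Status"], w["Reason For Stoppage"]))
            rows.append((b_start, b_end, "STOPPAGE", b["Reason For Stoppage"]))
            cursor = b_end
        if cursor < finish:
            rows.append((cursor, finish, w["Status"], w["Reason For Stoppage"]))

    out = pd.DataFrame(rows, columns=["Start", "Finish", "Status", "Reason For Stoppage"])
    minutes = (out["Finish"] - out["Start"]).dt.total_seconds() // 60
    out.insert(2, "Total Time (Hour)",
               [f"{int(m) // 60:02d}:{int(m) % 60:02d}" for m in minutes])
    out["Start"] = out["Start"].dt.strftime(fmt)
    out["Finish"] = out["Finish"].dt.strftime(fmt)
    out.insert(0, "Entry", range(1, len(out) + 1))
    return out


combined = split_by_breaks(work, breaks)
print(combined.to_string(index=False))
```

The resulting DataFrame can be passed to `Table(frame3, dataframe=combined)` in place of the concatenated one.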
I have the following pandas dataframe that was converted to string with to_string().
It was printed like this:
S T Q U X A D
02:36 06:00 06:00 06:00 06:30 09:46 07:56
02:37 06:10 06:15 06:15 06:40 09:48 08:00
12:00 11:00 12:00 12:00 07:43 12:00 18:03
13:15 13:00 13:15 13:15 07:50 13:15 18:08
14:00 14:00 14:00 14:00 14:00 19:00
15:15 15:00 14:15 15:15 15:15 19:05
16:15 16:00 15:15 16:15 16:15 20:15
17:15 17:00 17:15 17:15 17:15 20:17
18:15 21:22 21:19 19:55 18:15 20:18
19:15 21:24 21:21 19:58 19:15 20:19
The gaps are due to empty values in the dataframe. I would like to keep the column alignment, perhaps by replacing the empty values with tabs. I would also like to center-align the header line.
This wasn't printed in a terminal; it was sent over Telegram with a requests POST call. I think, though, that it is just a print formatting problem, independent of Telegram or the requests library.
The desired output would be like this:
S T Q U X A D
02:36 06:00 06:00 06:00 06:30 09:46 07:56
02:37 06:10 06:15 06:15 06:40 09:48 08:00
12:00 11:00 12:00 12:00 07:43 12:00 18:03
13:15 13:00 13:15 13:15 07:50 13:15 18:08
14:00 14:00 14:00 14:00 14:00 19:00
15:15 15:00 14:15 15:15 15:15 19:05
16:15 16:00 15:15 16:15 16:15 20:15
17:15 17:00 17:15 17:15 17:15 20:17
18:15 21:22 21:19 19:55 18:15 20:18
19:15 21:24 21:21 19:58 19:15 20:19
You can use the DataFrame's style.set_properties to set some of these options, for example:
df.style.set_properties(**{'text-align': 'center'})
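Note that this styles the rendered (HTML) output rather than the plain text produced by to_string(). Read more here: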
https://pandas.pydata.org/docs/reference/api/pandas.io.formats.style.Styler.set_properties.html
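For the plain-text case, a minimal sketch that stays with `to_string()`: missing values are replaced by empty strings so the column widths are preserved, and `justify="center"` centers the header labels. The small frame below is hypothetical (the question does not say which cells were empty), and the alignment only holds when the text is displayed in a monospaced font, which for Telegram presumably means sending it as preformatted text.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in column X.
df = pd.DataFrame({
    "S": ["02:36", "12:00", "14:00"],
    "T": ["06:00", "11:00", "14:00"],
    "X": ["06:30", "07:43", np.nan],
})

# Empty strings are still padded to the column width by to_string().
text = df.fillna("").to_string(index=False, justify="center")
print(text)
```

Padding with spaces is generally more predictable than tabs, since the width of a tab depends on the client that displays the message.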
I would like to use some daily data in one dataframe as a qualifier to run some code on another dataframe. Both dataframes contain ['Date', 'Time', 'Ticker', 'Open', 'High', 'Low', 'Close']. One dataframe has only daily information; the other contains 5-minute data for the same fields. Here are some examples.
print(df)
Date Time Ticker Open High Low Close
0 01/02/18 3:00 PM ES 2687.00 2696.00 2681.75 2695.75
1 01/03/18 3:00 PM ES 2697.25 2714.25 2697.00 2712.50
2 01/04/18 3:00 PM ES 2719.25 2729.00 2718.25 2724.00
3 01/05/18 3:00 PM ES 2732.25 2743.00 2726.50 2741.25
4 01/08/18 3:00 PM ES 2740.25 2748.50 2737.00 2746.50
5 01/09/18 3:00 PM ES 2751.00 2760.00 2748.00 2753.00
6 01/10/18 3:00 PM ES 2744.00 2751.75 2736.50 2748.75
7 01/11/18 3:00 PM ES 2754.25 2768.50 2752.75 2768.00
8 01/12/18 3:00 PM ES 2771.25 2788.75 2770.00 2786.50
9 01/15/18 3:00 PM ES 2793.75 2796.00 2792.50 2794.50
print(df_tick)
Date Time Ticker Open High Low Close
0 01/02/18 8:45 AM ES 2687.00 2687.25 2681.75 2685.75
1 01/02/18 9:00 AM ES 2686.00 2687.75 2683.50 2687.50
2 01/02/18 9:15 AM ES 2687.50 2690.50 2687.25 2689.25
3 01/02/18 9:30 AM ES 2689.50 2692.00 2689.25 2692.00
4 01/02/18 9:45 AM ES 2692.00 2692.25 2687.25 2690.00
5 01/02/18 10:00 AM ES 2690.00 2691.00 2689.75 2690.75
6 01/02/18 10:15 AM ES 2690.50 2691.25 2690.25 2691.00
7 01/02/18 10:30 AM ES 2691.00 2692.00 2689.00 2689.50
8 01/02/18 10:45 AM ES 2689.50 2689.75 2687.75 2688.25
9 01/02/18 11:00 AM ES 2688.25 2689.50 2687.75 2689.25
10 01/02/18 11:15 AM ES 2689.25 2690.75 2689.25 2690.00
11 01/02/18 11:30 AM ES 2690.00 2690.75 2689.25 2690.00
12 01/02/18 11:45 AM ES 2690.25 2690.50 2688.50 2688.75
13 01/02/18 12:00 PM ES 2689.00 2689.25 2688.50 2689.25
14 01/02/18 12:15 PM ES 2689.25 2691.00 2689.00 2690.50
15 01/02/18 12:30 PM ES 2690.75 2691.00 2689.75 2690.50
16 01/02/18 12:45 PM ES 2690.75 2691.25 2690.25 2691.00
17 01/02/18 1:00 PM ES 2691.25 2691.25 2689.50 2690.75
18 01/02/18 1:15 PM ES 2690.50 2691.50 2690.25 2690.50
19 01/02/18 1:30 PM ES 2690.50 2691.00 2689.75 2690.75
20 01/02/18 1:45 PM ES 2690.75 2691.50 2690.25 2690.75
21 01/02/18 2:00 PM ES 2690.75 2691.25 2690.75 2691.00
22 01/02/18 2:15 PM ES 2691.25 2691.75 2690.50 2691.50
23 01/02/18 2:30 PM ES 2691.50 2693.00 2691.50 2692.75
24 01/02/18 2:45 PM ES 2693.00 2693.75 2691.00 2693.75
25 01/02/18 3:00 PM ES 2693.75 2696.00 2693.25 2695.75
26 01/03/18 8:45 AM ES 2697.25 2702.25 2697.00 2700.75
27 01/03/18 9:00 AM ES 2701.00 2703.75 2700.50 2703.25
28 01/03/18 9:15 AM ES 2703.25 2706.00 2703.00 2705.00
29 01/03/18 9:30 AM ES 2705.00 2707.25 2704.00 2706.50
Code for calculating the gap percentage
# Calculating gap percentage
df['Gap %'] = (df['Open'].sub(df['Close'].shift()).div(df['Close'] - 1).fillna(0)) * 100
I have code that computes the percentage change between the previous Close and the Open in df, and I would like to use this as a qualifier for running some code on df_tick.
For example, if df['Gap %'] > 0.2, I want to use that date in df_tick and ignore (or drop) the rest of the data.
#drop rows not meeting certain percentage
df.drop(df[df['Gap %'] < .2].index, inplace=True)
print(df)
Date Time Ticker Open High Low Close Gap Gap %
2 01/04/18 3:00 PM ES 2719.25 2729.0 2718.25 2724.00 6.75 0.247888
3 01/05/18 3:00 PM ES 2732.25 2743.0 2726.50 2741.25 8.25 0.301067
9 01/15/18 3:00 PM ES 2793.75 2796.0 2792.50 2794.50 7.25 0.259531
Now I'd like to use df['Date'] to find the matching dates in df_tick['Date'] for some code I've already written. I tried to simply drop all rows where the dates differ, but received an error.
#drop rows in df_tick not matching dates in df
df_tick.drop(df_tick[df_tick['Date'] != df['Date']].index, inplace=True)
ValueError: Can only compare identically-labeled Series objects
You may be able to reset the index of both dataframes and get away with what you are trying to do, but I would try this:
df_tick = df_tick[df_tick.Date.isin(df.Date.unique())]
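To illustrate the difference: `!=` between two Series compares them label by label, so it fails when their indexes differ, whereas `isin()` only asks whether each value appears in a set of values. A small sketch with made-up rows (the frames below are shortened stand-ins for `df` and `df_tick`, not the real data):

```python
import pandas as pd

# Daily rows that passed the gap filter (shortened stand-in for df).
df = pd.DataFrame({
    "Date": ["01/04/18", "01/05/18"],
    "Gap %": [0.247888, 0.301067],
}, index=[2, 3])

# Intraday rows (shortened stand-in for df_tick).
df_tick = pd.DataFrame({
    "Date": ["01/02/18", "01/04/18", "01/04/18", "01/05/18", "01/08/18"],
    "Time": ["8:45 AM", "8:45 AM", "9:00 AM", "8:45 AM", "8:45 AM"],
})

# Keep only the intraday rows whose date appears in the daily frame.
filtered = df_tick[df_tick["Date"].isin(df["Date"].unique())]
print(filtered)
```

The boolean mask returned by `isin()` has the same index as `df_tick`, which is why it can be used to select rows without any index alignment problems.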
I'm working on a Django application and I'm just trying to push data up to the front-end to display.
In my views.py here's what I have:
def index(request):
    ...
    context = RequestContext(request)
    rooms = dict(db.studybug.find_one())
    timeRange = [room.encode('utf-8') for room in rooms['timeRange']]
    return render_to_response('studybug/index.html', timeRange, context)
Here, timeRange is a list that contains the following:
timeRange = ['Room 203A 10:00 AM \xc2\xa0', 'Room 203A 10:30 AM \xc2\xa0', 'Room 203A 11:00 AM \xc2\xa0', 'Room 203A 11:30 AM \xc2\xa0', 'Room 203A 12:00 PM \xc2\xa0', 'Room 203A 12:30 PM \xc2\xa0', 'Room 203A 3:00 PM \xc2\xa0', 'Room 203A 3:30 PM \xc2\xa0', 'Room 203A 4:00 PM \xc2\xa0', 'Room 203A 4:30 PM \xc2\xa0', 'Room 203A 5:00 PM \xc2\xa0', 'Room 203A 5:30 PM \xc2\xa0', 'Room 203A 6:00 PM \xc2\xa0', 'Room 203A 6:30 PM \xc2\xa0', 'Room 203A 7:00 PM \xc2\xa0', 'Room 203A 7:30 PM \xc2\xa0', 'Room 203A 8:00 PM \xc2\xa0', 'Room 203A 8:30 PM \xc2\xa0', 'Room 203A 9:00 PM \xc2\xa0', 'Room 203A 9:30 PM \xc2\xa0', 'Room 203A 10:00 PM \xc2\xa0', 'Room 203A 10:30 PM \xc2\xa0', 'Room 203A 11:00 PM \xc2\xa0', 'Room 203A 11:30 PM \xc2\xa0']
And then in my template (index.html), I have the following loop:
<div class="row">
    ...
    <ul>
        {% for item in timeRange %}
        <li>{{ item }}</li>
        {% endfor %}
    </ul>
</div>
However, despite the list being generated in the backend, nothing is being displayed on the webpage. I know the list exists, but Django's rendering engine won't display it.
Am I missing something obvious here?
Thanks,
G
The second parameter of render_to_response should be a dict containing your data; you are passing a list.
Your render_to_response call should look like this:
return render_to_response('studybug/index.html', {'timeRange':timeRange}, context)