How can I combine geometry with country names? - python
I need a file containing the names of European and Asian countries together with their "geometry" data.
Part of Russia was jumping to the other side of the chart, so I had to correct the data to keep the map of Russia in one piece.
I found code that does this, but unfortunately when I run it, it keeps only the geometry data, which cannot easily be linked back to the country names I need.
Initial map: [image]
Code:
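import geopandas as gpd
import pandas as pd
import matplotlib.pyplot as plt
from shapely.affinity import translate
from shapely.geometry import LineString
from shapely.ops import split

# geopandas.datasets was removed in GeoPandas 1.0, so this line needs an older release
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
asia = world[world["continent"] == "Asia"]
europe = world[world["continent"] == "Europe"]
euroasia = pd.concat([asia, europe])
name = euroasia["name"]

def shift_geom(shift, gdataframe, plotQ=False):
    # longitude of the vertical cut line
    shift -= 180
    moved_geom = []
    splitted_geom = []
    border = LineString([(shift,90),(shift,-90)])
    # cut every geometry along the border line
    for row in gdataframe["geometry"]:
        splitted_geom.append(split(row, border))
    for element in splitted_geom:
        # .geoms is needed on Shapely 2.x, where collections are not iterable
        items = list(element.geoms)
        for item in items:
            minx, miny, maxx, maxy = item.bounds
            # translate each piece according to the side of the border it lies on
            if minx >= shift:
                moved_geom.append(translate(item, xoff=-180-shift))
            else:
                moved_geom.append(translate(item, xoff=180-shift))
    # only the geometries are collected here, so the country names are lost
    moved_geom_gdf = gpd.GeoDataFrame({"geometry": moved_geom})
    if plotQ:
        fig1, ax1 = plt.subplots(figsize=[20,30])
        moved_geom_gdf.plot(ax=ax1)
        plt.show()
    return moved_geom_gdf

new = shift_geom(90, euroasia, False)
n_ = shift_geom(-90, new, True)
new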
Results (good map, but only geometry data): [image]
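One way to keep the names is to carry each row's name along while its pieces are shifted, instead of collecting the pieces in a bare list. The sketch below shows only that bookkeeping in plain Python: `unwrap_parts`, the `(name, parts)` record layout, and the sample coordinates are hypothetical stand-ins, and it assumes the polygons are already cut at the antimeridian, so a part lying entirely west of the cut can be moved by 360 degrees without splitting it. With geopandas the same idea would mean iterating over the name and geometry columns together and building the result with both columns.

```python
# Hypothetical record layout: (country_name, [ring, ring, ...]),
# where each ring is a list of (lon, lat) tuples.
def unwrap_parts(records, cut=-90.0):
    """Shift every part lying entirely west of `cut` by +360 degrees, keeping the name."""
    result = []
    for name, parts in records:
        moved = []
        for ring in parts:
            # A part whose easternmost point is west of the cut sits on the "wrong" side.
            if max(lon for lon, _ in ring) < cut:
                ring = [(lon + 360.0, lat) for lon, lat in ring]
            moved.append(ring)
        # The name travels with its parts, so it is never lost.
        result.append((name, moved))
    return result


# Made-up coordinates, for illustration only.
sample = [
    ("Russia", [[(100.0, 60.0), (140.0, 60.0), (140.0, 70.0)],
                [(-180.0, 65.0), (-170.0, 65.0), (-170.0, 70.0)]]),
    ("France", [[(0.0, 45.0), (5.0, 45.0), (5.0, 50.0)]]),
]
shifted = unwrap_parts(sample)
```

Here the second part of the "Russia" record ends up east of 180 degrees, next to the first part, and every record still carries its name.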
Related
iterate several polygons - 'Polygon object error'
I am constructing a graph to represent sales by country.
# to download the file [file](https://drive.google.com/file/d/16Uw_rJgzJdqhhxYjK3tJGkTeXyOLENMR/view?usp=share_link)
store_sales = pd.read_pickle('store_sales.venta')
store_sales
p1_sales_by_country = store_sales.groupby(['country']).p1_sales.sum()
p1_sales_by_country
use a graph to represent sales values by country.
plt.figure(figsize=(16,6))
ax = plt.axes(projection=crs.PlateCarree())
shpfile = shapereader.natural_earth(resolution='110m', category='cultural', name='admin_0_countries')
reader = shapereader.Reader(shpfile)
countries = reader.records()
max_sales = p1_sales_by_country.max()
for country in countries:
    country_name = country.attributes['ADM0_A3']
    if country_name in p1_sales_by_country:
        ax.add_geometries(country.geometry, crs.PlateCarree(), facecolor=plt.cm.Greens(p1_sales_by_country[country_name] /max_sales), edgecolor='k')
    else:
        ax.add_geometries(country.geometry, crs.PlateCarree(), facecolor='w', edgecolor='k')
I have problems with iterating a Polygon
TypeError: 'Polygon' object is not iterable
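The error is consistent with `add_geometries` being handed a single geometry where it expects an iterable of geometries, so wrapping the geometry in a list is a common fix. A small sketch of such a wrapper follows; `as_geometry_list` is a hypothetical helper, and the two stand-in classes exist only so the example runs without shapely installed.

```python
def as_geometry_list(geom):
    """Return the parts of a multi-part geometry, or a one-item list for a single one."""
    parts = getattr(geom, "geoms", None)  # shapely multi-part geometries expose .geoms
    return list(parts) if parts is not None else [geom]


# Stand-ins for shapely's Polygon and MultiPolygon (assumption: only .geoms matters here).
class FakePolygon:
    pass


class FakeMultiPolygon:
    def __init__(self, polygons):
        self.geoms = polygons


single = FakePolygon()
multi = FakeMultiPolygon([FakePolygon(), FakePolygon()])
```

In the loop from the question this would read `ax.add_geometries(as_geometry_list(country.geometry), crs.PlateCarree(), ...)`.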
Embedding python Matplotlib Graph in html using Pyscript
I've written a python program that takes some inputs and turns them into a matplotlib graph. Specifically, it displays wealth distributions by percentile for a country of the user's choosing. However, these inputs are currently given by changing variables in the program.
I want to put this code on a website, allowing users to choose any country and see the wealth distribution for that country, as well as how they compare. Essentially, I am trying to recreate this: https://wid.world/income-comparator/
The code in python is all done but I am struggling to incorporate it into an HTML file. I was trying to use pyscript but it currently loads forever and displays nothing. Would rather not rewrite it in javascript (mainly because I don't know js). My thoughts are that it has something to do with the code importing csv files from my device?
import csv
from typing import List
import matplotlib.pyplot as plt
import collections
import math
from forex_python.converter import CurrencyRates

# ---------------- #
# whether or not the graph includes the top 1 percent in the graph (makes the rest of the graph visible!)
one_percent = False  # True or False
# pick which country(ies) you want to view
country = 'China'  # String
# what currency should the graph use
currency_used = 'Canada'  # String
# if you want to compare an income
compare_income = True  # True or False
# what income do you want to compare
income = 100000  # Int
# ---------------- #

codes = {}
# get dictionary of monetary country codes
monetary_codes = {}
with open('codes-all.csv') as csv_file:
    list = csv.reader(csv_file, delimiter=',')
    for row in list:
        if row[5] == "":
            monetary_codes[row[0]] = (row[2], row[1])

# get dictionary of country names and codes for WID
with open('WID_countries.csv') as csv_file:
    WID_codes = csv.reader(csv_file, delimiter=',')
    next(WID_codes)
    for row in WID_codes:
        if len(row[0]) == 2:
            if row[2] != "":
                monetary_code = monetary_codes[row[1].upper()][0]
                currency_name = monetary_codes[row[1].upper()][1]
                codes[row[1].upper()] = (row[0], monetary_code, currency_name)
            elif row[2] == "":
                codes[row[1].upper()] = (row[0], 'USD', 'United States Dollar')
        elif row[0][0] == 'U' and row[0][1] == 'S':
            codes[row[1].upper()] = (row[0], 'USD', 'United States Dollar')

# converts user input to upper case
country = country.upper()
currency_used = currency_used.upper()

# gets conversion rate
c = CurrencyRates()
conversion_rate = c.get_rate(codes[country][1], codes[currency_used][1])

# convert money into correct currency
def convert_money(conversion_rate, value):
    return float(value) * conversion_rate

# get and clean data
def get_data(country):
    aptinc = {}
    # cleaning the data
    with open(f'country_data/WID_data_{codes[country][0]}.csv') as csv_file:
        data = csv.reader(csv_file, delimiter=';')
        for row in data:
            # I only care about the year 2021 and the variable 'aptinc'
            if 'aptinc992' in row[1] and row[3] == '2021':
                # translates percentile string into a numerical value
                index = 0
                for i in row[2]:
                    # index 0 is always 'p', so we get rid of that
                    if index == 0:
                        row[2] = row[2][1:]
                    # each string has a p in the middle of the numbers we care about. I also only
                    # care about the rows which measure a single percentile
                    # (upper bound - lower bound <= 1)
                    elif i == 'p':
                        lb = float(row[2][:index - 1])
                        ub = float(row[2][index:])
                        # if the top one percent is being filtered out adds another requirement
                        if not one_percent:
                            if ub - lb <= 1 and ub <= 99:
                                row[2] = ub
                            else:
                                row[2] = 0
                        else:
                            if ub - lb <= 1:
                                row[2] = ub
                            else:
                                row[2] = 0
                    index += 1
                # adds wanted, cleaned data to a dictionary. Also converts all values to one currency
                if row[2] != 0:
                    aptinc[row[2]] = convert_money(conversion_rate, row[4])
    return aptinc

# find the closest percentile to an income
def closest_percentile(income, data):
    closest = math.inf
    percentile = float()
    for i in data:
        difference = income - data[i]
        if abs(difference) < closest:
            closest = difference
            percentile = i
    return percentile

# ---------------- #
unsorted_data = {}
percentiles = []
average_income = []

# gets data for the country
data = get_data(country)
for i in data:
    unsorted_data[i] = data[i]

# sorts the data
sorted = collections.OrderedDict(sorted(unsorted_data.items()))
for i in sorted:
    percentiles.append(i)
    average_income.append(data[i])

# makes countries pretty for printing
country = country.lower()
country = country.capitalize()

# calculates where the income places against incomes from country(ies)
blurb = ""
if compare_income:
    percentile = closest_percentile(income, sorted)
    blurb = f"You are richer than {round(percentile)} percent of {country}'s population"

# plot this data!
plt.plot(percentiles,average_income)
plt.title(f'{country} Average Annual Income by Percentile')
plt.xlabel(f'Percentile\n{blurb}')
plt.ylabel(f'Average Annual Income of {country}({codes[currency_used][1]})')
plt.axvline(x = 99, color = 'r', label = '99th percentile', linestyle=':')
if compare_income:
    plt.axvline(x = percentile, color = 'g', label = f'{income} {codes[currency_used][2]}')
plt.legend(bbox_to_anchor = (0, 1), loc = 'upper left')
plt.show()
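Separately from the PyScript loading problem, `closest_percentile` as posted stores the signed `difference` in `closest`, so once a negative difference is stored no later percentile can compare as closer. A minimal sketch of one way to write the lookup, using the same argument names as the question and made-up sample data:

```python
def closest_percentile(income, data):
    """Return the key of `data` whose value is nearest to `income`."""
    return min(data, key=lambda percentile: abs(income - data[percentile]))


# Made-up percentile -> average income values, for illustration only.
sample = {10.0: 5_000.0, 50.0: 20_000.0, 90.0: 80_000.0}
nearest = closest_percentile(30_000, sample)
```

Comparing absolute differences throughout keeps the result independent of whether the income lies above or below a given percentile's value.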
Why are the indicators on my chart delayed by at least 1 day, making them not flush on the blue line? Is it because the time frame is too wide?
Why are the up triangles, when the program is supposed to buy, not on the line when it crosses under, or in the other scenario, the down triangle, when the program is supposed to sell, not on the line when it crosses on top? The blue line is the price and the red line is the EMA, tracking the price.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import requests

plt.style.use("fivethirtyeight")
df = pd.read_csv("TSLA.csv")
df = df.set_index(pd.DatetimeIndex(df["Date"].values))
ShortEMA = df.Close.ewm(span=5, adjust = False).mean()
MiddleEMA = df.Close.ewm(span = 21, adjust = False).mean()
LongEMA = df.Close.ewm(span = 53, adjust = False).mean()
df['Short'] = ShortEMA
df['Middle'] = MiddleEMA
df['Long'] = LongEMA

def MyStrat(data):
    bought_list = []
    sold_list = []
    In = False
    Out = True
    for i in range(0, len(data)):
        if data["Close"][i] > data["Short"][i] and In == False and Out == True:
            bought_list.append(data["Close"][i])
            sold_list.append(np.nan)
            In = True
            Out = False
        elif data["Close"][i] < data["Short"][i] and In == True and Out == False:
            sold_list.append(data["Close"][i])
            bought_list.append(np.nan)
            In = False
            Out = True
        else:
            bought_list.append(np.nan)
            sold_list.append(np.nan)
    return(bought_list,sold_list)

df["Bought"] = MyStrat(df)[0]
df["Sold"] = MyStrat(df)[1]
print(df)
plt.figure(figsize=(16, 5))
plt.title('Buy and Sell', fontsize = 18)
plt.plot(df['Close'], label = 'Close Price', color = 'blue', alpha = 0.35)
plt.plot(ShortEMA, label = 'Short', color = 'red', alpha = 0.35)
plt.scatter(df.index, df["Bought"], color = "purple", marker = "^", alpha = 1)
plt.scatter(df.index, df["Sold"], color = "blue", marker = "v", alpha = 1)
plt.xlabel("Date", fontsize = 18)
plt.ylabel("Close", fontsize = 18)
plt.show()
You can use this data for reference:
Date,Open,High,Low,Close,Adj Close,Volume
2022-01-06,1077.000000,1088.000000,1020.500000,1064.699951,1064.699951,30112200
2022-01-07,1080.369995,1080.930054,1010.000000,1026.959961,1026.959961,28054900
2022-01-10,1000.000000,1059.099976,980.000000,1058.119995,1058.119995,30605000
2022-01-11,1053.670044,1075.849976,1038.819946,1064.400024,1064.400024,22021100
2022-01-12,1078.849976,1114.839966,1072.589966,1106.219971,1106.219971,27913000
2022-01-13,1109.069946,1115.599976,1026.540039,1031.560059,1031.560059,32403300
2022-01-14,1019.880005,1052.000000,1013.380005,1049.609985,1049.609985,24308100
2022-01-18,1026.609985,1070.790039,1016.059998,1030.510010,1030.510010,22247800
2022-01-19,1041.709961,1054.670044,995.000000,995.650024,995.650024,25147500
2022-01-20,1009.729980,1041.660034,994.000000,996.270020,996.270020,23496200
2022-01-21,996.340027,1004.549988,940.500000,943.900024,943.900024,34472000
2022-01-24,904.760010,933.510010,851.469971,930.000000,930.000000,50521900
2022-01-25,914.200012,951.260010,903.210022,918.400024,918.400024,28865300
2022-01-26,952.429993,987.690002,906.000000,937.409973,937.409973,34955800
2022-01-27,933.359985,935.390015,829.000000,829.099976,829.099976,49036500
2022-01-28,831.559998,857.500000,792.010010,846.349976,846.349976,44929700
2022-01-31,872.710022,937.989990,862.049988,936.719971,936.719971,34812000
2022-02-01,935.210022,943.700012,905.000000,931.250000,931.250000,24379400
2022-02-02,928.179993,931.500000,889.409973,905.659973,905.659973,22264300
2022-02-03,882.000000,937.000000,880.520020,891.140015,891.140015,26285200
2022-02-04,897.219971,936.500000,881.169983,923.320007,923.320007,24541800
2022-02-07,923.789978,947.770020,902.710022,907.340027,907.340027,20331500
2022-02-08,905.530029,926.289978,894.799988,922.000000,922.000000,16909700
2022-02-09,935.000000,946.270020,920.000000,932.000000,932.000000,17419800
2022-02-10,908.369995,943.809998,896.700012,904.549988,904.549988,22042300
2022-02-11,909.630005,915.960022,850.700012,860.000000,860.000000,26548600
2022-02-14,861.570007,898.880005,853.150024,875.760010,875.760010,22585500
2022-02-15,900.000000,923.000000,893.380005,922.429993,922.429993,19095400
2022-02-16,914.049988,926.429993,901.210022,923.390015,923.390015,17098100
2022-02-17,913.260010,918.500000,874.099976,876.349976,876.349976,18392800
2022-02-18,886.000000,886.869995,837.609985,856.979980,856.979980,22833900
2022-02-22,834.130005,856.729980,801.099976,821.530029,821.530029,27762700
2022-02-23,830.429993,835.299988,760.559998,764.039978,764.039978,31752300
2022-02-24,700.390015,802.479980,700.000000,800.770020,800.770020,45107400
2022-02-25,809.229980,819.500000,782.400024,809.869995,809.869995,25355900
2022-02-28,815.010010,876.859985,814.710022,870.429993,870.429993,33002300
2022-03-01,869.679993,889.880005,853.780029,864.369995,864.369995,24922300
2022-03-02,872.130005,886.479980,844.270020,879.890015,879.890015,24881100
2022-03-03,878.770020,886.440002,832.599976,839.289978,839.289978,20541200
2022-03-04,849.099976,855.650024,825.159973,838.289978,838.289978,22333200
2022-03-07,856.299988,866.140015,804.570007,804.580017,804.580017,24164700
2022-03-08,795.530029,849.989990,782.169983,824.400024,824.400024,26799700
2022-03-09,839.479980,860.559998,832.010010,858.969971,858.969971,19728000
2022-03-10,851.450012,854.450012,810.359985,838.299988,838.299988,19549500
2022-03-11,840.200012,843.799988,793.770020,795.349976,795.349976,22272800
2022-03-14,780.609985,800.700012,756.039978,766.369995,766.369995,23717400
2022-03-15,775.270020,805.570007,756.570007,801.890015,801.890015,22280400
2022-03-16,809.000000,842.000000,802.260010,840.229980,840.229980,28009600
2022-03-17,830.989990,875.000000,825.719971,871.599976,871.599976,22194300
2022-03-18,874.489990,907.849976,867.390015,905.390015,905.390015,33408500
2022-03-21,914.979980,942.849976,907.090027,921.159973,921.159973,27327200
2022-03-22,930.000000,997.859985,921.750000,993.979980,993.979980,35289500
2022-03-23,979.940002,1040.699951,976.400024,999.109985,999.109985,40225400
2022-03-24,1009.729980,1024.489990,988.799988,1013.919983,1013.919983,22973600
2022-03-25,1008.000000,1021.799988,997.320007,1010.640015,1010.640015,20642900
2022-03-28,1065.099976,1097.880005,1053.599976,1091.839966,1091.839966,34168700
2022-03-29,1107.989990,1114.770020,1073.109985,1099.569946,1099.569946,24538300
2022-03-30,1091.170044,1113.949951,1084.000000,1093.989990,1093.989990,19955000
2022-03-31,1094.569946,1103.140015,1076.640015,1077.599976,1077.599976,16265600
2022-03-31,1094.569946,1103.139893,1076.640991,1077.599976,1077.599976,16330919
The problem is that the point of intersection occurs between days, not on a specific day. Because the data is not continuous, with just one point per business day, the arrow cannot be placed on the intersection itself. I have enlarged a portion of the graph here so you can see what I mean. The change occurs between the 9th and 10th. Data exists only for the 9th and for the 10th, so the arrow is plotted, and the buy occurs, on the 10th. The buy/sell lands on the next available day, which causes the misalignment of the arrows.
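If an estimate of where the lines cross between the two days is wanted, linear interpolation between the neighbouring points is one option. The sketch below is an illustration, not part of the original strategy: `crossing_fraction` is a hypothetical helper that returns how far between the earlier and the later sample the price line meets the EMA line, assuming both move in a straight line between samples.

```python
def crossing_fraction(close_prev, ema_prev, close_next, ema_next):
    """Fraction (0..1) of the way from the earlier to the later sample at which
    the price meets the EMA, or None if the two lines do not cross in between."""
    d_prev = close_prev - ema_prev
    d_next = close_next - ema_next
    # Same sign on both days (or an unchanged gap) means no crossing in this interval.
    if d_prev == d_next or d_prev * d_next > 0:
        return None
    return d_prev / (d_prev - d_next)


# Made-up values: the price goes from 1 below the EMA to 1 above it.
frac = crossing_fraction(9.0, 10.0, 11.0, 10.0)
crossing_price = 9.0 + frac * (11.0 - 9.0)
```

The fraction could be turned into a timestamp between the two dates for plotting, but the trade itself can still only happen on a day for which data exists.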
Slight challenge. Need to optimize code. Original one takes too long :(
So i'm trying to read 3 different csv files and plotting the information in an img with four graphs.
One with the average delay by airline, expressed in minutes
Another with the ratio of delayed flights, by airline
Another with the average delay by destination airport, expressed in minutes
And finally another with the ratio of flights delayed to arrival, by destination airport
All the information is correct and i got it from the files. The problem is that the program below takes too long to produce the graphs and that they're all being separately plotted and not all together in one image. Is there a way to optimize my code to run faster? And how do i use subplots without changing everything?
import pandas as pd
import matplotlib.pyplot as plt

path_main = '850566403_T_ONTIME.csv'
path_airline = 'L_AIRLINE_ID.csv'
path_airport = 'L_AIRPORT_ID.csv'
df1 = pd.read_csv(path_main)
al = pd.read_csv(path_airline)
ap = pd.read_csv(path_airport)

#remove columns and rows with nan
df1.dropna(axis=1, how='all', inplace=True)
df = df1.dropna(subset=['ARR_DELAY_NEW'])

#------------------------------------------------------------------------------
#Airlines:
#dict with {ID: Name}
d_al = {}
for i in range(len(al)):
    d_al[al['Code'][i]] = al['Description'][i]

# array with ID's of airlines and delays
arr_al = df.loc[:, ('AIRLINE_ID', 'ARR_DELAY_NEW')].to_numpy()

# list with ID's of airlines
list_al = []
for i in arr_al:
    if i[0] not in list_al:
        list_al.append(int(i[0]))

def airline_avg_ratio(ID):
    '''
    function that requires an airline ID and returns a tuple with name
    (first 10 characters), average delay and delayed flight ratios
    '''
    nr_voos = 0
    soma = 0
    nr_atrasos = 0
    for i in arr_al:
        if i[0] == ID:
            soma += i[1]
            nr_voos += 1
            if i[1] != 0:
                nr_atrasos += 1
    for k,v in d_al.items():
        if k == ID:
            nome = v[0:10]
    media = round((soma / nr_atrasos), 3)
    racio = round((nr_atrasos / nr_voos), 3)
    return (nome, media, racio)

dados_al = []
for i in list_al:
    dados_al.append(airline_avg_ratio(i))
df_al = pd.DataFrame(dados_al, columns=['Airlines', 'Average', 'Ratio'])

graph1 = df_al.drop(columns='Ratio').sort_values(by='Average').iloc[-10:]
graph1.plot(x='Airlines', y='Average', kind='bar')
plt.title("Atraso Médio por Companhia (top 10)")
plt.xlabel('Companhia Aérea', fontsize=12)
plt.ylabel('Minutos', fontsize=12)
plt.show()

graph2 = df_al.drop(columns='Average').sort_values(by='Ratio').iloc[-10:]
graph2.plot(x='Airlines', y='Ratio', kind='bar')
plt.title("Vôos Atrasados por Companhia (top 10)")
plt.xlabel('Companhia Aérea', fontsize=12)
plt.ylabel('Rácio', fontsize=12)
plt.show()

#------------------------------------------------------------------------------
# Airports:
#dict with {ID: Name}
d_ap = {}
for i in range(len(ap)):
    d_ap[ap['Code'][i]] = ap['Description'][i]

#array with ID's of Airports and delays
arr_ap = df.loc[:, ('DEST_AIRPORT_ID', 'ARR_DELAY_NEW')].to_numpy()

#list with ID's of Airports
list_ap = []
for i in arr_ap:
    if i[0] not in list_ap:
        list_ap.append(int(i[0]))

def airport_avg_ratio(ID):
    '''
    function that requires an airport and returns a tuple with name
    (first 10 characters), average delay and delayed flight ratios
    '''
    nr_chegadas = 0
    soma = 0
    nr_atrasos = 0
    for i in arr_ap:
        if i[0] == ID:
            soma += i[1]
            nr_chegadas += 1
            if i[1] != 0:
                nr_atrasos += 1
    for k,v in d_ap.items():
        if k == ID:
            nome = v[0:10]
    media = round((soma / nr_atrasos), 3)
    racio = round((nr_atrasos / nr_chegadas), 3)
    return (nome, media, racio)

dados_ap = []
for i in list_ap:
    dados_ap.append(airport_avg_ratio(i))
df_ap = pd.DataFrame(dados_ap, columns=['Airports', 'Average', 'Ratio'])

graph3 = df_ap.drop(columns='Ratio').sort_values(by='Average').iloc[-10:]
graph3.plot(x='Airports', y='Average', kind='bar')
plt.title("Atraso Médio por Aeroporto (top 10)")
plt.xlabel('Aeroporto', fontsize=12)
plt.ylabel('Minutos', fontsize=12)
plt.show()

graph4 = df_ap.drop(columns='Average').sort_values(by='Ratio').iloc[-10:]
graph4.plot(x='Airports', y='Ratio', kind='bar')
plt.title("Vôos Atrasados por Aeroporto (top 10)")
plt.xlabel('Aeroporto', fontsize=12)
plt.ylabel('Rácio', fontsize=12)
plt.show()
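Much of the running time likely comes from `airline_avg_ratio` and `airport_avg_ratio` scanning the whole array once per ID. A single pass that accumulates totals per key avoids that. The sketch below uses only the standard library, with `delay_stats` and the sample rows as hypothetical names and data. It keeps the question's definitions: the average divides the summed delay by the number of delayed flights, and the ratio is delayed flights over all flights.

```python
from collections import defaultdict


def delay_stats(rows):
    """rows: iterable of (key, delay_minutes). Returns {key: (average, ratio)}."""
    total = defaultdict(float)   # summed delay per key
    flights = defaultdict(int)   # number of flights per key
    delayed = defaultdict(int)   # number of flights with a non-zero delay per key
    for key, delay in rows:      # one pass over the data, whatever the number of keys
        total[key] += delay
        flights[key] += 1
        if delay != 0:
            delayed[key] += 1
    stats = {}
    for key in flights:
        # Guard against keys with no delayed flights (a division by zero otherwise).
        average = round(total[key] / delayed[key], 3) if delayed[key] else 0.0
        ratio = round(delayed[key] / flights[key], 3)
        stats[key] = (average, ratio)
    return stats


# Made-up (airline_id, delay) rows, for illustration only.
sample_rows = [(1, 10.0), (1, 0.0), (1, 20.0), (2, 0.0)]
stats = delay_stats(sample_rows)
```

The same function can serve both airlines and airports, since only the key column changes. For the four charts in one image, `plt.subplots(2, 2)` returns a figure and a grid of axes, and each `DataFrame.plot` call accepts an `ax=` argument to draw into one of them.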
Adding a Title or Text to a Folium Map
I'm wondering if there's a way to add a title or text on a folium map in python? I have 8 maps to show and I want the user to know which map they're looking at without having to click on a marker. I attempted to add an image of the map, but couldn't because I don't have high enough reputation score.
My code:
#marker cluster
corpus_chris_loc = [27.783889, -97.510556]
harvey_avg_losses_map = folium.Map(location = corpus_chris_loc, zoom_start = 5)
marker_cluster = MarkerCluster().add_to(harvey_avg_losses_map)

#inside the loop add each marker to the cluster
for row_index, row_values in harvey_losses.iterrows():
    loc = [row_values['lat'], row_values['lng']]
    pop = ("zip code: " + str(row_values["loss_location_zip"]) + "\nzip_avg: " + "$" + str(row_values['zip_avg'])) #show the zip and it's avg
    icon = folium.Icon(color='red')
    marker = folium.Marker(
        title = "Harvey: " + "$" + str(row_values['harvey_avg']),
        location = loc,
        popup = pop,
        icon=icon)
    marker.add_to(marker_cluster)

#save an interactive HTML map by calling .save()
harvey_avg_losses_map.save('../data/harveylossclustermap.html')
harvey_avg_losses_map
[map of hurricane harvey insurance claims]
Of course you can add a title to a Folium map. For example:
import folium

loc = 'Corpus Christi'
title_html = '''
             <h3 align="center" style="font-size:16px"><b>{}</b></h3>
             '''.format(loc)

m = folium.Map(location=[27.783889, -97.510556], zoom_start=12)
m.get_root().html.add_child(folium.Element(title_html))
m.save('map-with-title.html')
m
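With eight maps to label, the heading markup could be produced by one small function instead of being repeated. A minimal sketch using only the standard library follows; `make_title_html`, its `font_px` parameter, and the sample titles are hypothetical, and the markup mirrors the `<h3>` element from the answer.

```python
from html import escape


def make_title_html(title, font_px=16):
    """Build a centered heading for one map; the title text is HTML-escaped."""
    return (
        f'<h3 align="center" style="font-size:{font_px}px">'
        f"<b>{escape(title)}</b></h3>"
    )


# Hypothetical map names, one per map to be saved.
titles = ["Harvey: average losses", "Harvey: claims < $10k"]
title_blocks = [make_title_html(t) for t in titles]
```

Each string can then be attached the same way as in the answer: `m.get_root().html.add_child(folium.Element(title_html))`.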