Python df lat long in for loop


I want to rewrite my code as a for-loop so that I can set a style for each point.
The code below works fine without a for-loop:
import simplekml
import pandas as pd
excel_file = 'sample.xlsx'
df=pd.read_excel(excel_file)
kml = simplekml.Kml()
df.apply(lambda X: kml.newpoint( coords=[( X["Long"],X["Lat"])]) ,axis=1)
kml.save(path = "data.kml")
I want to do this in a for-loop so that I can apply a style to each point, but my for-loop is not working:
import simplekml
import pandas as pd
kml = simplekml.Kml()
style = simplekml.Style()
excel_file = 'sample1.xlsx'
df=pd.read_excel(excel_file)
y=df.Long
x=df.Lat
MinLat=int(df.Lat.min())
MaxLat=int(df.Lat.max())
MinLong=int(df.Long.min())
MaxLong=int(df.Long.max())
multipnt =kml.newmultigeometry()
for long in range(MinLong,MaxLong): # Generate longitude values
    for lat in range(MaxLat,MinLat): # Generate latitude values
        multipnt.newpoint(coords=[(y,x)])
        #kml.newpoint(coords=[(y,x)])
kml.save("Point Shared Style.kml")

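In the loop from the question, range(MaxLat, MinLat) is empty whenever MaxLat is greater than MinLat, so the inner body never runs, and coords is given the whole Long and Lat columns rather than one pair of values. If you want to iterate over a collection of points in an Excel file and add them to a single Placemark as a MultiGeometry using a for-loop, then try this.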
import simplekml
import pandas as pd
kml = simplekml.Kml()
style = simplekml.Style()
excel_file = 'sample1.xlsx'
df = pd.read_excel(excel_file)
multipnt = kml.newmultigeometry()
for row in df.itertuples(index=False):
    # simplekml expects each coordinate as (longitude, latitude)
    multipnt.newpoint(coords=[(row.Long, row.Lat)])
kml.save("PointSharedStyle.kml")
If you want to generate a grid of points at every whole degree within the bounding box of the points, then you would try the following:
import simplekml
import pandas as pd
kml = simplekml.Kml()
style = simplekml.Style()
excel_file = 'sample1.xlsx'
df = pd.read_excel(excel_file)
MinLat = int(df.Lat.min())
MaxLat = int(df.Lat.max())
MinLong = int(df.Long.min())
MaxLong = int(df.Long.max())
multipnt = kml.newmultigeometry()  # single placemark that holds all grid points
for long in range(MinLong, MaxLong+1): # Generate longitude values
    for lat in range(MinLat, MaxLat+1): # Generate latitude values
        multipnt.newpoint(coords=[(long, lat)])
        #kml.newpoint(coords=[(long,lat)])
kml.save("PointSharedStyle.kml")
Note that the Style is assigned to the placemark, not the geometry, so a MultiGeometry can only be assigned a single Style for all of its points. If you want a different style for each point, then you need to create one placemark per point and assign each its own Style.
For help setting styles, see https://simplekml.readthedocs.io/en/latest/styles.html

Related

Get Excel from a database and use it in the Python script

Given the code below, what I want to do is get the Excel file from a database where anyone can upload an Excel or CSV file, and then run this code on that file. As you can see, in the first line after importing the libraries I read a local Excel file, but I want to load it from an online database (maybe MongoDB or Oracle, I'm not sure).
Please help me find the best method to achieve this.
from geopy.distance import geodesic as GD
import pandas as pd
import xlsxwriter
import sys
path_excel = r"D:\INTERNSHIP RELATED FILES\New Village details of Gazipur.xlsx"
df = pd.read_excel(path_excel)
radius = float(input("Enter the radius "))
Vle_coordinates = []
for i in range(2):
    Vle_coordinates.append(float(input("Enter the Latitude and Longitude")))
Village_Name = list(df["Village Name"])
Lats = list(df["Latitude"])
Longs = list(df["Longitude"])
Population = list(df['Village Population'])
temp = list(zip(Lats,Longs))
villages= dict((key,value) for key,value in zip(Village_Name,temp))
distance =[]
for key,values in villages.items():
    d = (GD(Vle_coordinates,values).km)
    distance.append(round(d,2))
Vle_details = list(zip(Village_Name,distance,Population))
s = sorted(Vle_details, key = lambda x: (x[1], -x[2]))
for items in s:
    if (items[1]<=radius):
        print(items[0])

Approach to merge a template with header and items with data for each entry

I'm trying to learn Python and find a solution for my business.
I work with SAP and I need to merge data into a template.
The merge works in Excel VBA, but filling a file with 10,000 entries takes a very long time.
My template is available here:
https://docs.google.com/spreadsheets/d/1FXc-4zUYx0fjGRvPf0FgMjeTm9nXVfSt/edit?usp=sharing&ouid=113964169462465283497&rtpof=true&sd=true
A sample of the data is here:
https://drive.google.com/file/d/105FP8ti0xKbXCFeA2o5HU7d2l3Qi-JqJ/view?usp=sharing
For each record in my data file, I need to fill in the Excel template, which has a header and 2 lines (it's an FI posting, so I need to fill in the debit and the credit).
In VBA, I proceeded like this:
Fix the cell.
Copy data from the template with the function activecell.offset(x,y) ...
Fill in the different records from my data file based on the technical name.
Now I'm trying to do the same in Python.
Using pandas or openpyxl I can open the file, but I can't see how to proceed to merge the header data (which must be copied for each posting I have to book) with the data.
from tkinter import *
from tkinter import filedialog  # submodule, not provided by the wildcard import
import pandas as pd
import datetime
from openpyxl import load_workbook
import numpy as np
def sap_line_item(ligne):
    ledger = ligne
    print(ligne)
    return
# Constants
c_dir = '/Users/sapfinance/PycharmProjects/SAP'
C_FILE_SEP = ';'
root = Tk()
root.withdraw()
# folder_selected = filedialog.askdirectory(initialdir=c_dir)
fiori_selected = filedialog.askopenfile(initialdir=c_dir)
data_selected = filedialog.askopenfile(initialdir=c_dir)
# read data
pd.options.display.float_format = '{:,.2f}'.format
fichier_cible = str(data_selected.name)
target_filename = fichier_cible + '_' + datetime.datetime.now().strftime("%Y%m%d-%H%M%S") + '.xlsx'
# target = pd.ExcelWriter(target_filename, engine='xlsxwriter')
df_full_data = pd.read_csv(data_selected.name, sep=C_FILE_SEP, encoding='unicode_escape', dtype='unicode')
nb_ligne_data = int(len(df_full_data))
print(nb_ligne_data)
#df_fiori = pd.read_excel(fiori_selected.name)
print(fiori_selected.name)
df_fiori = load_workbook(fiori_selected.name)
df_fiori_data = df_fiori.active
Any tips on how to approach this and find a solution would be appreciated.
Have a great day
Philippe

Find closest lat & lon from large csv

I need help using Python and pandas. I want to find the ship nearest to Lat = 45.82019 and Lon = -129.73671.
Here is how I import the data:
import os
import pandas as pd
frames = []
for file in os.listdir(os.getcwd()):  # combine all CSV files
    if file.endswith('.csv'):
        frames.append(pd.read_csv(file))
# DataFrame.append is gone as of pandas 2.0; pd.concat combines the frames instead
master_df = pd.concat(frames, ignore_index=True)
master_df.to_csv('Combined File.CSV', index = False)
df = pd.read_csv('Combined File.CSV')
print(df)
my data output:
After finding the ship nearest to the given lat and lon, I would like to get all the information about that ship.
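One possible way to approach this, as a sketch only: compute the great-circle (haversine) distance from the target position to every row and take the row with the smallest value. The data output is not shown in the question, so the column names "ship", "LAT" and "LON" and the sample rows below are assumptions.

```python
import numpy as np
import pandas as pd

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; lat2/lon2 may be scalars or Series."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * np.arcsin(np.sqrt(a))

# Toy stand-in for the combined CSV; column names and values are assumptions.
df = pd.DataFrame({
    "ship": ["A", "B", "C"],
    "LAT": [45.9, 44.0, 45.82],
    "LON": [-129.7, -130.5, -129.0],
})

target_lat, target_lon = 45.82019, -129.73671
df["dist_km"] = haversine_km(target_lat, target_lon, df["LAT"], df["LON"])

nearest = df.loc[df["dist_km"].idxmin()]       # the single closest record
ship_info = df[df["ship"] == nearest["ship"]]  # every row for that ship
```

The calculation is vectorised over the whole column, so it should scale to a large CSV without an explicit Python loop.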

How to drop a categorical value from a data frame column in Python?

I am working with a data frame called price_df, and I would like to drop the rows that contain '4wd' in the column drive-wheels. I have tried price_df2 = price_df.drop(index='4wd', axis=0) and a few other variations after reading the pandas docs, but I keep getting errors. Could anyone point me to the correct way to drop the rows that contain the value 4wd in that column? Below is the code I ran before trying to drop the values:
# Cleaned up Dataset location
fileName = "https://library.startlearninglabs.uw.edu/DATASCI410/Datasets/Automobile%20price%20data%20_Raw_.csv"
# Import libraries
from scipy.stats import norm
import numpy as np
import pandas as pd
import math
import numpy.random as nr
price_df = pd.read_csv(fileName)
round(price_df.head(),2) #getting an overview of that data
price_df.loc[:,'drive-wheels'].value_counts()
price_df2 = price_df.drop(index='4wd', axis=0)

You can use pd.DataFrame.query, with backticks around the column name because it contains a hyphen:
price_df.query('`drive-wheels` != "4wd"')

Try this
price_df = pd.read_csv(fileName)
mask = price_df["drive-wheels"] =="4wd"
price_df = price_df[~mask]

Get a subset of your data with this one-liner:
price_df2 = price_df[price_df['drive-wheels'] != '4wd']
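For context on why the original attempt fails: drop(index='4wd') looks for a row whose index label is '4wd', not for a value inside a column, so it raises a KeyError. The sketch below uses a small made-up data frame (only the column name drive-wheels comes from the question) to show drop() working once it is given the index labels of the matching rows, alongside the boolean filter for comparison.

```python
import pandas as pd

# Toy stand-in for price_df; the values are made up for illustration.
price_df = pd.DataFrame({
    "drive-wheels": ["fwd", "4wd", "rwd", "4wd"],
    "price": [13495, 17450, 16500, 7603],
})

# drop() removes rows by index label, so look up the labels of the 4wd rows first.
labels_4wd = price_df.index[price_df["drive-wheels"] == "4wd"]
price_df2 = price_df.drop(index=labels_4wd)

# Boolean filtering keeps the same rows without calling drop().
filtered = price_df[price_df["drive-wheels"] != "4wd"]
```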

Extract multiple polygon coordinates of csv file

I want to extract the (multiple) polygon coordinates from an .xlsx file into a pandas DataFrame in Python.
The .xlsx file is available on Google Docs.
This is what I do now:
import pandas as pd
gemeenten2019 = pd.read_excel('document.xlsx', index=False, skiprows=0 )
gemeenten2019['KML'] = str(gemeenten2019['KML'])
for index, row in gemeenten2019.iterrows():
    removepart = str(row['KML'])
    row['KML'] = removepart.replace('<MultiGeometry><Polygon><coordinates>', '')
gemeentenamen = []
gemeentePolygon = []
for gemeentenaam in gemeenten2019['NAAM']:
    gemeentenamen.append(str(gemeentenaam))
for value in gemeenten2019['KML']:
    gemeentePolygon.append(str(value))
df_gemeenteCoordinaten = pd.DataFrame({'Gemeente':gemeentenamen, 'KML': gemeentePolygon})
df_gemeenteCoordinaten
But the result is that every row of the "KML" column has the same value.
I only want the coordinates that belong to each specific row, not the coordinates of all the rows.
The dataframe must look like:
Does anyone know how to extract the multiple coordinates for each row?

This would give you each pair of values on its own line:
import pandas as pd
gemeenten2019 = pd.read_excel('Gemeenten 2019.xlsx', skiprows=0)
gemeenten2019['KML'] = gemeenten2019['KML'].str.strip('<>/abcdefghijklmnopqrstuvwxyzGMP').str.replace(' ', '\n')
For example:
            NAAM                                                KML
0    Aa en Hunze  6.81394482119469,53.070971596018\n6.8612875225...
1       Aalsmeer  4.79469736599488,52.2606817589009\n4.795085405...
2         Aalten  6.63891586106867,51.9625470164657\n6.639463741...
3  Achtkarspelen  6.23217311778447,53.2567474241222\n6.235100748...
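The identical values in the question come from str(gemeenten2019['KML']), which turns the whole column into one string and assigns it to every row. If numeric coordinates per row are wanted rather than a cleaned string, one option is to parse each cell on its own. This is a sketch under assumptions: the toy frame below mimics the NAAM and KML columns with shortened, made-up cell contents, and parse_coordinates is a hypothetical helper name.

```python
import re
import pandas as pd

# Toy stand-in for the spreadsheet; real KML cells are much longer.
gemeenten2019 = pd.DataFrame({
    "NAAM": ["Aa en Hunze", "Aalsmeer"],
    "KML": [
        "<MultiGeometry><Polygon><coordinates>6.81,53.07 6.86,53.04"
        "</coordinates></Polygon></MultiGeometry>",
        "<MultiGeometry><Polygon><coordinates>4.79,52.26 4.80,52.27"
        "</coordinates></Polygon></MultiGeometry>",
    ],
})

def parse_coordinates(kml_fragment):
    """Return (lon, lat) float tuples from every <coordinates> element in one cell."""
    points = []
    for block in re.findall(r"<coordinates>(.*?)</coordinates>", kml_fragment, flags=re.S):
        for pair in block.split():
            lon, lat = pair.split(",")[:2]  # ignore an optional altitude value
            points.append((float(lon), float(lat)))
    return points

# apply() runs once per cell, so each row keeps only its own coordinates.
gemeenten2019["coords"] = gemeenten2019["KML"].apply(parse_coordinates)
```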
