I am inserting data from one table into another, but for some reason I get an "unrecognized token" error. This is the code:
cur.execute("INSERT INTO db.{table} SELECT distinct latitude, longitude, port FROM MessageType1 WHERE latitude>={minlat} AND latitude<={maxlat} AND longitude>= {minlong} AND longitude<= {maxlong}".format(minlat = bottomlat, maxlat = toplat, minlong = bottomlong, maxlong = toplong, table=tablename))
This translates to the following, with values:
INSERT INTO db.Vardo SELECT distinct latitude, longitude, port FROM MessageType1 WHERE latitude>=69.41 AND latitude<=70.948 AND longitude>= 27.72 AND longitude<= 28.416
The error is the following:
sqlite3.OperationalError: unrecognized token: "70.948 AND"
Is the problem that there are three decimal places?
This is the create statement for the table:
cur.execute("CREATE TABLE {site} (latitude, longitude, port)".format(site = site))
Don't build your SQL queries via string formatting; use the driver's ability to prepare SQL queries and pass parameters into the query. This avoids SQL injection and makes handling parameters of different types transparent. Only values can be passed as parameters, though - the table name is an identifier, so it still goes in via format():
query = """
INSERT INTO
db.{table}
SELECT DISTINCT
latitude, longitude, port
FROM
MessageType1
WHERE
latitude >= ? AND
latitude <= ? AND
longitude >= ? AND
longitude <= ?
""".format(table=tablename)
cur.execute(query, (bottomlat, toplat, bottomlong, toplong))
Try using ? for your parameters. Note that a ? placeholder can only stand in for a value, not an identifier such as the table name, so the table name still has to be formatted into the string, and the parameters must be listed in the same order as the placeholders:
cur.execute("INSERT INTO db.{0} SELECT DISTINCT latitude, longitude, port FROM MessageType1 WHERE latitude >= ? AND latitude <= ? AND longitude >= ? AND longitude <= ?".format(tablename),
            (bottomlat, toplat, bottomlong, toplong))
I'm trying to insert latitude & longitude that are stored as python variables into a table in PostgreSQL via the INSERT query. Any suggestions on how to cast Point other than what I've tried?
I tried the insert query first as shown -
This is the table:
cur.execute('''CREATE TABLE AccidentList (
accidentId SERIAL PRIMARY KEY,
cameraGeoLocation POINT,
accidentTimeStamp TIMESTAMPTZ);''')
Try 1:
cur.execute("INSERT INTO AccidentList (cameraGeoLocation, accidentTimeStamp) "
            "VALUES {}".format((lat, lon), ts))
Error:
psycopg2.ProgrammingError: column "camerageolocation" is of type point but expression is of type numeric
LINE 1: ...ist (cameraGeoLocation,accidentTimeStamp) VALUES (13.0843, 8...
^
HINT: You will need to rewrite or cast the expression.
Try 2:
query = ("INSERT INTO AccidentList (cameraGeoLocation, accidentTimeStamp) "
         "VALUES (cameraGeoLocation::POINT, accidentTimeStamp::TIMESTAMPTZ);")
data = ((lat,lon),ts)
cur.execute(query,data)
Error:
LINE 1: ...List (cameraGeoLocation,accidentTimeStamp) VALUES(cameraGeoL...
^
HINT: There is a column named "camerageolocation" in table "accidentlist", but it cannot be referenced from this part of the query.
Try 3:
query = "INSERT INTO AccidentList (camerageolocation ,accidenttimestamp) VALUES(%s::POINT, %s);"
data = (POINT(lat,lon),ts)
cur.execute(query,data)
Error:
cur.execute(query,data)
psycopg2.ProgrammingError: cannot cast type record to point
LINE 1: ...tion ,accidenttimestamp) VALUES((13.0843, 80.2805)::POINT, '...
Single-quote the point value in your third attempt.
This works: SELECT '(13.0843, 80.2805)'::POINT
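Applied to psycopg2, that means passing the point as a quoted string and letting the cast do the work - a minimal sketch, assuming lat, lon, ts and cur exist as in the question:
query = "INSERT INTO AccidentList (cameraGeoLocation, accidentTimeStamp) VALUES (%s::POINT, %s);"
# psycopg2 quotes the Python string, so the server sees '(13.0843, 80.2805)'::POINT
cur.execute(query, ("({0}, {1})".format(lat, lon), ts))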
I had a similar problem trying to insert data of type point into Postgres.
Using quotes around the tuple (making it a string) worked for me.
import psycopg2

conn = psycopg2.connect(...)
cursor = conn.cursor()
conn.autocommit = True
sql = 'insert into cities (name,location) values (%s,%s);'
values = ('City A','(10.,20.)')
cursor.execute(sql,values)
cursor.close()
conn.close()
My environment:
PostgreSQL 12.4,
Python 3.7.2,
psycopg2-binary 2.8.5
I was loading my data from individual csv files into a dataframe using
import pandas as pd

col_names = ['created_date', 'latitude', 'longitude']
df = pd.read_csv('data.csv', names=col_names, sep=',', skiprows=1)
This separates my data nicely into named columns and skips the first row, which contains the column headers.
However, I wanted to automate the process using a for loop that runs the same query for every user. My function is:
sql = "select distinct mobile_user_id from score where speed_range_id > 1"
distance_query = """SELECT created_date, latitude, longitude FROM score s where s.mobile_user_id = %(mobile_user_id)s and speed_range_id > 1 group by latitude, longitude order by id asc"""
cursor1.execute(sql)
result = cursor1.fetchall()
for rowdict in result:
    distance = cursor3.execute(distance_query, rowdict)
    distance_result = cursor3.fetchall()
    df = pd.read_sql_query(distance_query, rdsConn, params={rowdict})
As you can see, the result variable holds the list of users, and I want to iterate through all the users to generate a dataset for every user.
I've been trying to use pd.read_sql_query, but I've been unable to pass the mobile user parameter, which is rowdict, to the query.
How would I go about passing that variable using pandas? And how can I organize my data the way I had it before?
sample of the data.csv:
created_date, latitude, longitude
"2018-05-24 17:46:25", 20.61844841, -100.40813424
"2018-05-24 21:03:02", 20.58469452, -100.39204018
"2018-05-25 10:29:57", 20.61180308, -100.40826959
"2018-05-25 21:02:43", 20.59868518, -100.37825344
Any help is appreciated.
Consider running pure SQL combining both queries by adding a WHERE clause to your aggregate query.
Currently, you are attempting a WHERE clause that compares one value per row to many values: where mobile_user_id = %(mobile_user_id)s will never match that way. Also, your prepared statement does not have the same number of placeholders as parameter values. Possibly you meant where mobile_user_id IN (?, ?, ?, ?, ?, ...), which involves dynamically building the placeholders.
Nonetheless, simply run an aggregate query and then import the result set into pandas. Specifically, add mobile_user_id as a grouping in the query:
sql = """select mobile_user_id, created_date, latitude, longitude
from score
where speed_range_id > 1
group by mobile_user_id, created_date, latitude, longitude
order by id asc
"""
df = pd.read_sql_query(sql, rdsConn)
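If you still need a separate dataset per user afterwards, one option is to split the combined DataFrame with groupby - a sketch, using the columns from the query above:
# one DataFrame per mobile_user_id, keyed by the user id
per_user = {user_id: group.drop(columns="mobile_user_id")
            for user_id, group in df.groupby("mobile_user_id")}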
I am using the sqlalchemy package to run queries against my PostGIS database, which is filled with .osm data of a city. I want to retrieve the longitude and latitude values from, let's say, the planet_osm_point table.
The SQL query I issue looks like this:
SELECT st_y(st_asewkt(st_transform(way, 4326))) as lat,
       st_x(st_asewkt(st_transform(way, 4326))) as lon,
       "addr:housenumber" AS husenumber,
       "addr:street" AS street,
       "addr:postcode" AS postcode
FROM planet_osm_point
SQLAlchemy throws this error:
sqlalchemy.exc.InternalError: (psycopg2.InternalError) FEHLER: Argument to ST_Y() must be a point
The problem seems to be only with the ST_Y() and ST_X() functions.
ST_X/ST_Y return floats. You can either use the floats directly or cast them to text.
Using ST_AsEWKT is problematic here: it returns text, not a geometry, so wrapping it inside ST_Y/ST_X is what triggers the "Argument to ST_Y() must be a point" error.
Use the floats you get:
SELECT st_y(st_transform(way, 4326)) AS lat,
       st_x(st_transform(way, 4326)) AS lon,
       "addr:housenumber" AS husenumber,
       "addr:street" AS street,
       "addr:postcode" AS postcode
FROM planet_osm_point
Or cast to text:
SELECT cast(st_y(st_transform(way, 4326)) as text) AS lat,
       cast(st_x(st_transform(way, 4326)) as text) AS lon,
       "addr:housenumber" AS husenumber,
       "addr:street" AS street,
       "addr:postcode" AS postcode
FROM planet_osm_point
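For completeness, here is a minimal sketch of running the corrected query through SQLAlchemy; the connection URL is a placeholder and the aliases follow the query above:
from sqlalchemy import create_engine, text

# placeholder connection string - adjust to your own database
engine = create_engine("postgresql+psycopg2://user:password@localhost/osm")
query = text("""
    SELECT st_y(st_transform(way, 4326)) AS lat,
           st_x(st_transform(way, 4326)) AS lon,
           "addr:housenumber" AS husenumber,
           "addr:street" AS street,
           "addr:postcode" AS postcode
    FROM planet_osm_point
""")
with engine.connect() as conn:
    for row in conn.execute(query):
        print(row.lat, row.lon)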
I have a location table with following structure:
CREATE TABLE location
(
id BIGINT,
location GEOMETRY,
CONSTRAINT location_pkey PRIMARY KEY (id, location),
CONSTRAINT enforce_dims_geom CHECK (st_ndims(location) = 2),
CONSTRAINT enforce_geotype_geom CHECK (geometrytype(location) = 'POINT'::TEXT OR location IS NULL),
CONSTRAINT enforce_srid_geom CHECK (st_srid(location) = 4326)
)
WITH (
OIDS=FALSE
);
CREATE INDEX location_geom_gist ON location
USING GIST (location);
I run the following query to insert data:
def insert_location_data(msisdn, lat, lon):
    if not (lat and lon):
        return
    query = "INSERT INTO location (id, location) VALUES ('%s', ST_GeomFromText('POINT(%s %s)', 4326))" % (str(id), str(lat), str(lon))
    try:
        cur = get_cursor()
        cur.execute(query)
        conn.commit()
    except:
        tb = traceback.format_exc()
        Logger.get_logger().error("Error while inserting location in sql: %s", str(tb))
        return False
    return True
I run this block of code 10,000,000 times in a loop, but somewhere after 1 million inserts the insert speed drops drastically. The speed returns to normal when I restart the script, but it drops again around a million inserts and the same pattern continues. I cannot figure out why.
Any help is appreciated.
Here are a few tips.
Watch out for str(id), which will always return the string '<built-in function id>', since id is not defined as a variable in the question and refers to the built-in id() function; you presumably meant msisdn.
The correct axis order for PostGIS is (X Y) or (lon lat).
There are more efficient ways to insert points.
Don't build the SQL string with string formatting; pass query parameters instead.
This is how to insert one point:
cur.execute(
"INSERT INTO location (id, location) "
"VALUES (%s, ST_SetSRID(ST_MakePoint(%s, %s), 4326))",
(msisdn, lon, lat))
And see executemany if you want to insert more records at a time, where you would prepare a list of parameters to insert (i.e. [(msisdn, lon, lat), (msisdn, lon, lat), ..., (msisdn, lon, lat)]).
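For example, a batch insert could look roughly like this (a sketch: get_cursor() and conn are assumed to exist as in the question, rows holds (msisdn, lon, lat) tuples you have collected, and the values shown are made up):
rows = [
    ("4670000001", 27.72, 69.41),
    ("4670000002", 28.41, 70.94),
]
cur = get_cursor()
# executemany runs the same parameterized statement once per tuple in rows
cur.executemany(
    "INSERT INTO location (id, location) "
    "VALUES (%s, ST_SetSRID(ST_MakePoint(%s, %s), 4326))",
    rows)
conn.commit()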
Brand new to Python and loving it, and I imagine this might be a simple one.
I am currently inserting points into SQL Server 2008 via a Python script with the help of pymssql.
var1 = "hi"
lat = "55.92"
lon = "-3.29"
cursor.execute("INSERT INTO table (field1, x, y) VALUES(%s, %s, %s)",
(var1 , lat, lon))
This all works fine.
I need to also insert those coordinates into a GEOGRAPHY type field (called geog).
geog_type = "geography::STGeomFromText('POINT(%s %s)',4326))" % (lat, lon)
cursor.execute("INSERT INTO table (field1, x, y, geog) VALUES(%s, %s, %s, %s)",
(var1 , lat, lon, geog_type))
This throws the following exception:
The label geography::STGeomFro in the input well-known text (WKT) is
not valid. Valid labels are POINT, LINESTRING, POLYGON, MULTIPOINT,
MULTILINESTRING, MULTIPOLYGON, GEOMETRYCOLLECTION, CIRCULARSTRING,
COMPOUNDCURVE, CURVEPOLYGON and FULLGLOBE (geography Data Type only).
From SSMS I can run an insert statement on the table to insert a point fine.
USE [nosde]
INSERT INTO tweets (geog)
VALUES(
geography::STGeomFromText(
'POINT(55.9271035250276 -3.29431266523898)',4326))
Let me know in the comments if you need more details.
Some of my workings are on pastebin.
Several issues - firstly, you're supplying the coordinates in the wrong order - the STPointFromText() method expects longitude first, then latitude.
Secondly, it may be easier to use the Point() method rather than the STPointFromText() method, which doesn't require any string manipulation - just supply the two numeric coordinate parameters directly. http://technet.microsoft.com/en-us/library/bb933811.aspx
But, from the error message, it appears that the value you're sending is attempting to be parsed as a WKT string. If this is the case, you don't want the extra geography::STGeomFromText and the SRID at the end anyway - these are assumed. So try just supplying:
geog_type = "'POINT(%s %s)'" % (lon, lat)
cursor.execute("INSERT INTO table (field1, x, y, geog) VALUES(%s, %s, %s, %s)",
(var1 , lat, lon, geog_type))
I'm not sure if you need the extra single quotes in the first line or not, but don't have a system to test on at the moment.
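If you go the Point() route instead, a sketch could look like this (it assumes the same table, cursor and variables as the question; geography::Point takes latitude, longitude, then an SRID, and the string coordinates are converted to float first):
cursor.execute(
    "INSERT INTO table (field1, x, y, geog) "
    "VALUES (%s, %s, %s, geography::Point(%s, %s, 4326))",
    (var1, lat, lon, float(lat), float(lon)))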