Insert values from a dictionary into an SQLite database - Python

I cannot get my head around it.
I want to insert the values of a dictionary into an SQLite database.
url = "https://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=5f...1b&per_page=250&accuracy=1&has_geo=1&extras=geo,tags,views,description"
soup = BeautifulSoup(urlopen(url))  # soup it up
for data in soup.find_all('photo'):  # parsing the data
    dict = {  # filter the data; find_all creates KEY:VALUE pairs
        "id_p": data.get('id'),
        "title_p": data.get('title'),
        "tags_p": data.get('tags'),
        "latitude_p": data.get('latitude'),
        "longitude_p": data.get('longitude'),
    }
    #print(dict)
    connector.execute("insert into DATAGERMANY values (?,?,?,?,?)", );
    connector.commit()
connector.close
My keys are id_p, title_p, etc., and the values are what I retrieve through data.get.
However, I cannot insert them.
When I try to write id, title, tags, latitude, longitude after ...DATAGERMANY values (?,?,?,?,?)", ); I get
NameError: name 'title' is not defined.
I tried it with dict.values and dict, but then it says table DATAGERMANY has 6 columns but 5 values were supplied.
Adding another ? gives me (with dict.values): ValueError: parameters are of unsupported type
This is how I created the db and table.
# creating SQLite database and table
connector = sqlite3.connect("GERMANY.db")  # create database and table; check whether NOT NULL is a good idea
connector.execute('''CREATE TABLE DATAGERMANY
    (id_db INTEGER PRIMARY KEY AUTOINCREMENT,
     id_photo INTEGER NOT NULL,
     title TEXT,
     tags TEXT,
     latitude NUMERIC NOT NULL,
     longitude NUMERIC NOT NULL);''')
The method should work even if there is no value to fill into the database... that can happen as well.

You can use named parameters and insert all rows at once using executemany().
As a bonus, you get a clean separation between the HTML-parsing and the data-pipelining logic:
data = [{"id_p": photo.get('id'),
         "title_p": photo.get('title'),
         "tags_p": photo.get('tags'),
         "latitude_p": photo.get('latitude'),
         "longitude_p": photo.get('longitude')} for photo in soup.find_all('photo')]
connector.executemany("""
    INSERT INTO
        DATAGERMANY
        (id_photo, title, tags, latitude, longitude)
    VALUES
        (:id_p, :title_p, :tags_p, :latitude_p, :longitude_p)""", data)
Also, don't forget to actually call the close() method:
connector.close()
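As a side note, the connection object also works as a context manager that commits on success and rolls back on error; a small sketch of that pattern (note that close() is still needed, since "with" does not close the connection):

with connector:
    # the transaction is committed automatically when this block succeeds
    connector.executemany("""
        INSERT INTO DATAGERMANY (id_photo, title, tags, latitude, longitude)
        VALUES (:id_p, :title_p, :tags_p, :latitude_p, :longitude_p)""", data)
connector.close()  # "with" commits, but it does not close the connection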
FYI, the complete code:
import sqlite3
from urllib2 import urlopen
from bs4 import BeautifulSoup

url = "https://api.flickr.com/services/rest/?method=flickr.photos.search&api_key=5f...1b&per_page=250&accuracy=1&has_geo=1&extras=geo,tags,views,description"
soup = BeautifulSoup(urlopen(url))

connector = sqlite3.connect(":memory:")
cursor = connector.cursor()
cursor.execute('''CREATE TABLE DATAGERMANY
    (id_db INTEGER PRIMARY KEY AUTOINCREMENT,
     id_photo INTEGER NOT NULL,
     title TEXT,
     tags TEXT,
     latitude NUMERIC NOT NULL,
     longitude NUMERIC NOT NULL);''')

data = [{"id_p": photo.get('id'),
         "title_p": photo.get('title'),
         "tags_p": photo.get('tags'),
         "latitude_p": photo.get('latitude'),
         "longitude_p": photo.get('longitude')} for photo in soup.find_all('photo')]
cursor.executemany("""
    INSERT INTO
        DATAGERMANY
        (id_photo, title, tags, latitude, longitude)
    VALUES
        (:id_p, :title_p, :tags_p, :latitude_p, :longitude_p)""", data)
connector.commit()
cursor.close()
connector.close()

As written, your connector.execute() statement is missing the parameters argument.
It should be used like this:
connector.execute("insert into some_time values (?, ?)", ["question_mark_1", "question_mark_2"])
Unless you need the dictionary for later, I would actually use a list or tuple instead:
row = [
    data.get('id'),
    data.get('title'),
    data.get('tags'),
    data.get('latitude'),
    data.get('longitude'),
]
Then your insert statement becomes:
connector.execute("insert into DATAGERMANY values (NULL,?,?,?,?,?)", row)
Why these changes?
The NULL in values (NULL, ...) is there so the auto-incrementing primary key will work.
The list instead of the dictionary because order matters here, and dictionaries (before Python 3.7) don't preserve insertion order.
The row variable is passed as a single parameter sequence, so its five elements bind to the five ? placeholders.
Lastly, you shouldn't use dict as a variable name, since that shadows the built-in dict type.
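Putting those pieces together, a minimal sketch of the corrected loop (reusing the soup object from the question) might look like this:

for data in soup.find_all('photo'):
    row = [
        data.get('id'),
        data.get('title'),
        data.get('tags'),
        data.get('latitude'),
        data.get('longitude'),
    ]
    # NULL lets SQLite assign the auto-incrementing id_db column;
    # row is one parameter sequence binding the five ? marks
    connector.execute("insert into DATAGERMANY values (NULL,?,?,?,?,?)", row)
connector.commit()
connector.close()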

If you're using Python 3.6 or above, you can do this for dicts:
dict_data = {
    'filename': 'test.txt',
    'size': '200',
}
table_name = 'test_table'
attrib_names = ", ".join(dict_data.keys())
attrib_values = ", ".join("?" * len(dict_data))
sql = f"INSERT INTO {table_name} ({attrib_names}) VALUES ({attrib_values})"
cursor.execute(sql, list(dict_data.values()))
Note that the table name and the dictionary keys are interpolated directly into the SQL string, so they must come from a trusted source; only the values go through the ? placeholders.

Related

How do I get a SQLite query to return the id of a table when I have two tables that have an attribute named id?

I can't seem to find anything on how to access the id attribute from the table I want. I have 4 tables that I have joined: users, workouts, exercises, and sets. They all have primary keys with the attribute name id.
My query:
query = """SELECT users.firstName, workouts.dateandtime, workouts.id, sets.*, exercises.name
           FROM users
           JOIN workouts ON users.id = workouts.userID
           JOIN sets ON workouts.id = sets.workoutID
           JOIN exercises ON sets.exerciseID = exercises.id
           WHERE users.id = ? ORDER BY sets.id DESC"""
I'm only grabbing workouts.id and sets.id, because users.id is already known when the user logs in, and exercises.id is shared amongst all users and isn't important in this step.
Trying to access the sets.id like this does not work:
posts_unsorted = cur.execute(query, userID).fetchall()
for e in posts_unsorted:
    print(e['id'])       # prints workouts.id, I'm assuming, because it's the first id I grab in the query
    print(e['sets.id'])  # error, because sets.id does not exist
Is there a way to name the sets.id when making the query so that I can actually use it? Should I be setting up my database differently to grab the sets.id? I don't know what direction I should be going.
The post How do you avoid column name conflicts? shows that you can give your tables aliases. This makes it easier to refer to your tables in queries, and it also lets you control what each returned column is named.
If you have two tables that both have an attribute called id, you will need to give the columns aliases to be able to access both attributes.
An example:
.schema sets
CREATE TABLE "sets"(
    id INTEGER PRIMARY KEY,
    interval INTEGER NOT NULL,
    workoutID INTEGER NOT NULL,
    FOREIGN KEY (workoutID) REFERENCES workouts(id)
);
.schema workouts
CREATE TABLE "workouts"(
    id INTEGER PRIMARY KEY,
    date SMALLDATETIME NOT NULL
);
Fill the database:
INSERT INTO workouts (date) VALUES ('2022-03-14'), ('2022-02-13');
INSERT INTO sets (interval, workoutID) VALUES (5, 1), (4, 1), (3, 2), (2, 2);
Both tables have a primary key labeled id. If you must access both ids, you will need to add aliases in your query.
database = sqlite3.connect("name.db")
database.row_factory = sqlite3.Row
cur = database.cursor()
query = """SELECT sets.id AS s_id, workouts.date AS w_date, workouts.id AS w_id
           FROM sets JOIN workouts ON sets.workoutID = workouts.id"""
posts = cur.execute(query).fetchall()
This will return sqlite3.Row objects, which let you retrieve each value by name. The data will look like this:
[{'s_id':1, 'w_date':'2022-03-14', 'w_id':1},
{'s_id':2, 'w_date':'2022-03-14', 'w_id':1},
{'s_id':3, 'w_date':'2022-02-13', 'w_id':2},
{'s_id':4, 'w_date':'2022-02-13', 'w_id':2}]
With this set of data you will be able to access everything by name instead of index.
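For example, with sqlite3.Row set as the row factory, each returned row can be indexed by the alias names from the SELECT; a small usage sketch:

for row in posts:
    # sqlite3.Row supports access by column name (the aliases defined above)
    print(row['s_id'], row['w_date'], row['w_id'])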

Store big integer values correctly in sqlite db

I'm trying to store very large int values in an sqlite3 db. The values are 100-115 digits long.
I've tried every possible combination - sending the input as string or integer, and storing it as INT/TEXT/BLOB - but the result is always the same: the value 7239589231682139...97853 becomes 7.239589231682139e+113 in the db.
My db schema is:
conn.execute('''CREATE TABLE DATA
    (RESULT TEXT NOT NULL)''')
and the query is:
def insert(result):
    conn.execute(f'INSERT INTO DATA (RESULT) VALUES ({result})')
    conn.commit()
I wrote a simple function to test the above case:
DB_NAME = 'test.db'
conn = sqlite3.connect(DB_NAME)
conn.execute('''CREATE TABLE TEST_TABLE
    (TYPE_INT INT,
     TYPE_REAL REAL,
     TYPE_TEXT TEXT,
     TYPE_BLOB BLOB);
''')

value1 = '123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890'
value2 = 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890

conn.execute(f'INSERT INTO TEST_TABLE (TYPE_INT, TYPE_REAL, TYPE_TEXT, TYPE_BLOB) VALUES ({value1}, {value1}, {value1}, {value1})')
conn.execute(f'INSERT INTO TEST_TABLE (TYPE_INT, TYPE_REAL, TYPE_TEXT, TYPE_BLOB) VALUES ({value2}, {value2}, {value2}, {value2})')
conn.commit()

cursor = conn.execute('SELECT * from TEST_TABLE')
for col in cursor:
    print(f'{col[0]}, {col[1]}, {col[2]}, {col[3]}')
    print('--------------')
conn.close()
As you can see, I try all the possibilities, and the output is:
1.2345678901234568e+119, 1.2345678901234568e+119, 1.23456789012346e+119, 1.2345678901234568e+119
1.2345678901234568e+119, 1.2345678901234568e+119, 1.23456789012346e+119, 1.2345678901234568e+119
You are passing the value without single quotes, so it is treated as numeric.
Pass it as a string, like this:
value1 = "123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890"
conn.execute("INSERT INTO TEST_TABLE (TYPE_TEXT) VALUES (?)", (value1,))
The ? placeholder will be replaced by:
'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890'
because the value's type is str, so it is stored properly as TEXT, which should be the column's data type.
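To round-trip such a value, convert it to str on the way in and back to int on the way out. A minimal sketch, reusing the DATA table from the question (the exponent here is just a stand-in to produce a 100+ digit number):

import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE DATA (RESULT TEXT NOT NULL)')

big = 7239589231682139 ** 7  # about 112 digits
# bind the value as a string so every digit is stored exactly, as TEXT
conn.execute('INSERT INTO DATA (RESULT) VALUES (?)', (str(big),))

stored = conn.execute('SELECT RESULT FROM DATA').fetchone()[0]
assert int(stored) == big  # TEXT preserves all digits; int() restores the value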

How to save dictionaries of different lengths to the same table in a database?

What would be the most elegant way to save multiple dictionaries - most of them following the same structure, but some having more/less keys - to the same SQL database table?
The steps I can think of are the following:
Determine which dictionary has the most keys, and create a table whose columns follow that dictionary's key order.
Sort every dictionary to match this column order.
Insert each dictionary's values into the table, inserting nothing (is that possible?) for a table column that has no corresponding key in the dictionary.
Some draft code I have:
man1dict = {
    'name': 'bartek',
    'surname': 'wroblewski',
    'age': 32,
}
man2dict = {
    'name': 'bartek',
    'surname': 'wroblewski',
    'city': 'wroclaw',
    'age': 32,
}
with sqlite3.connect('man.db') as conn:
    cursor = conn.cursor()
    # create table - how do I create it automatically from the man2dict (the longer one) dictionary, also assigning the data types?
    cursor.execute('CREATE TABLE IF NOT EXISTS People(name TEXT, surname TEXT, city TEXT, age INT)')
    # show table
    cursor.execute('SELECT * FROM People')
    print(cursor.fetchall())
    # insert into table - this will give a 'no such table' error if the dict does not follow the table column order
    cursor.execute('INSERT INTO People VALUES(' + str(man1dict.values()) + ')', conn)
Use a NoSQL database such as MongoDB for this purpose; it handles variable document shapes natively. Using a relational database for data that is not relational is an anti-pattern: it will break your code, degrade your application's scalability, and make it more cumbersome to change the table structure later.
It might be easiest to save the dict as a pickle and then unpickle it later, i.e.:
import pickle, sqlite3

# SAVING
my_pickle = pickle.dumps({"name": "Bob", "age": 24})
conn = sqlite3.connect("test.db")
c = conn.cursor()
c.execute("CREATE TABLE test (dict BLOB)")
conn.commit()
c.execute("insert into test values (?)", (my_pickle,))
conn.commit()

# RETRIEVING
b = [n[0] for n in c.execute("select dict from test")]
dicts = []
for d in b:
    dicts.append(pickle.loads(d))
print(dicts)
This outputs:
[{'name': 'Bob', 'age': 24}]
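Alternatively, if the data should stay queryable as ordinary columns, named placeholders make the varying keys manageable. A sketch along these lines (using the People table from the question) fills any missing key with None, which SQLite stores as NULL:

import sqlite3

COLUMNS = ('name', 'surname', 'city', 'age')

def insert_person(conn, d):
    # default absent keys to None so every named placeholder gets a value
    row = {col: d.get(col) for col in COLUMNS}
    conn.execute('INSERT INTO People (name, surname, city, age) '
                 'VALUES (:name, :surname, :city, :age)', row)

with sqlite3.connect(':memory:') as conn:
    conn.execute('CREATE TABLE People(name TEXT, surname TEXT, city TEXT, age INT)')
    insert_person(conn, {'name': 'bartek', 'surname': 'wroblewski', 'age': 32})
    insert_person(conn, {'name': 'bartek', 'surname': 'wroblewski',
                         'city': 'wroclaw', 'age': 32})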

SQLite with Python "Table has X columns but Y were supplied"

I have a Python script that executes some simple SQL.
c.execute("CREATE TABLE IF NOT EXISTS simpletable (id integer PRIMARY KEY, post_body text, post_id text, comment_id text, url text);")
command = "INSERT OR IGNORE INTO simpletable VALUES ('%s', '%s', '%s', '%s')" % (comments[-1].post_body, comments[-1].post_id,
                                                                                 comments[-1].comment_id, comments[-1].url)
c.execute(command)
c.commit()
But when I execute it, I get an error
sqlite3.OperationalError: table simpletable has 5 columns but 4 values were supplied
Why is it not automatically filling in the id key?
In Python 3.6 I did as shown below and the data was inserted successfully.
I used None for the autoincrementing ID, since Python has no NULL literal.
conn.execute("INSERT INTO CAMPAIGNS VALUES (?, ?, ?, ?)", (None, campaign_name, campaign_username, campaign_password))
The ID structure is as follows.
ID INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL
If you don't specify the target columns, VALUES is expected to provide values for all columns, and that is not what you did.
INSERT OR IGNORE INTO simpletable
    (post_body,
     post_id,
     comment_id,
     url)
VALUES ('%s',
        '%s',
        '%s',
        '%s');
Specifying the target columns is advisable in any case: the query won't break if, for any reason, the order of the columns in the table changes.
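Applied to the question's code, a sketch of the insert with explicit columns and ? placeholders (instead of % string formatting, which breaks on quotes and invites SQL injection) might look like this, assuming c is the cursor and conn its connection:

c.execute('INSERT OR IGNORE INTO simpletable (post_body, post_id, comment_id, url) '
          'VALUES (?, ?, ?, ?)',
          (comments[-1].post_body, comments[-1].post_id,
           comments[-1].comment_id, comments[-1].url))
conn.commit()  # commit() must be called on the connection, not the cursor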
Try specifying the column names, to ensure that the destination of the values doesn't depend on their order.
For example:
INTO simpletable
    (post_body,
     post_id,
     comment_id,
     url)
And if you want the id column to be automatically incremented, make sure to add an identity property or your DBMS's equivalent auto-increment. For example (this Identity(1,1) syntax is SQL Server's; in SQLite it would be INTEGER PRIMARY KEY AUTOINCREMENT):
CREATE TABLE IF NOT EXISTS simpletable (id integer PRIMARY KEY Identity(1,1),
And remember: your script only handles table creation; it is not prepared to alter the structure of an existing table.
If your code is correct, delete your SQL file (name.db) and run the code again; sometimes that solves the problem.
Imagine this is your code:
cursor.execute('''CREATE TABLE IF NOT EXISTS food(name TEXT , price TEXT)''')
cursor.execute('INSERT INTO food VALUES ("burger" , "20")')
connection.commit()
and you see an error like this:
table has 1 column but 2 values were supplied
This happens when, for example, you create the file with a one-column table and later modify your code to use two columns but keep the same file name: CREATE TABLE IF NOT EXISTS does not overwrite the table, because it already exists.
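A less destructive alternative to deleting the whole database file is to drop just the stale table before recreating it; a small sketch using the food example above:

cursor.execute('DROP TABLE IF EXISTS food')  # discard the old one-column table
cursor.execute('CREATE TABLE food(name TEXT, price TEXT)')
cursor.execute('INSERT INTO food VALUES ("burger", "20")')
connection.commit()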

Using Python function output to update individual PostgreSQL rows

I'm working on a project that requires a column in PostgreSQL to be updated by the Mapbox geocoding API, which converts an address into lon,lat coordinates. I created a FOR loop to read in the address from each row. I'd like to then save the resulting lon,lat coordinates into the "coordinates" column.
However, the code I've written updates the entire "coordinates" column with the first row's lon,lat coordinates, rather than iterating and updating each row's "coordinates" column individually.
Where did I go wrong? Any help would be greatly appreciated.
Main Code
import psycopg2
import json
from psycopg2.extras import RealDictCursor
import sys
from mapbox import Geocoder
from mapboxgeocode import getCoord
import numpy as np

con = None
try:
    con = psycopg2.connect(database='database', user='username')
    cur = con.cursor()
    cur.execute("DROP TABLE IF EXISTS permits")
    cur.execute("""CREATE TABLE permits(issued_date DATE, address VARCHAR(200),
        workdesc VARCHAR(600), permit_type VARCHAR(100), permit_sub_type VARCHAR(100),
        anc VARCHAR(4), applicant VARCHAR(100), owner_name VARCHAR(200))""")
    cur.execute(""" COPY permits FROM '/path/to/csv/file'
        WITH DELIMITER ',' CSV HEADER """)
    cur.execute("""ALTER TABLE permits ADD COLUMN id SERIAL PRIMARY KEY;
        UPDATE permits SET id = DEFAULT;""")
    cur.execute("""ALTER TABLE permits ADD COLUMN coordinates VARCHAR(80);
        UPDATE permits SET coordinates = 4;""")
    cur.execute("""ALTER TABLE permits ADD COLUMN city VARCHAR(80);
        UPDATE permits SET city = 'Washington,DC';
        ALTER TABLE permits ALTER COLUMN city SET NOT NULL;""")
    cur.execute("UPDATE permits SET address = address || ' ' || city;")
    cur.execute("SELECT * FROM permits;")
    for row in cur.fetchall():
        test = row[1]
        help = getCoord(test)
        cur.execute("UPDATE permits SET coordinates = %s;", (help,))
        print(test)
    con.commit()
except psycopg2.DatabaseError, e:
    print 'Error %s' % e
    sys.exit(1)
finally:
    if con:
        cur.close()
        con.commit()
        con.close()
Geocode Function
from mapbox import Geocoder
import numpy as np

def getCoord(address):
    geocoder = Geocoder(access_token='xxxxxxxxxxxxxxxx')
    response = geocoder.forward(address)
    first = response.geojson()['features'][0]
    row = first['geometry']['coordinates']
    return row
You need to add a WHERE condition to your UPDATE statement. Without WHERE, SQL simply assumes you want to update the coordinates column of every row. A proper WHERE condition tells it which specific row to modify.
You'll probably want to use your primary key, as it's a unique identifier. Perhaps a statement along the lines of:
cur.execute("UPDATE permits SET coordinates = %s WHERE id = %s;", (help, row[index of the id column]))
I think the row index you need would be row[8], but you'll have to confirm that in your code. I hope that gets it working.
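Putting that together, a sketch of the corrected loop (assuming getCoord returns a [lon, lat] list as in the question, and selecting the columns explicitly so no index guessing is needed):

cur.execute("SELECT id, address FROM permits;")
for permit_id, address in cur.fetchall():
    coords = getCoord(address)
    # WHERE id = %s pins the update to this one row
    cur.execute("UPDATE permits SET coordinates = %s WHERE id = %s;",
                (str(coords), permit_id))
con.commit()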
