I'm using Python and SQLite to manipulate a string on Android.
I have an SQLite table that looks like this:
| ID | Country
+----+----------------------
| 1  | USA, Germany, Mexico
| 2  | Brazil, Canada
| 3  | Peru
I would like to split the comma-delimited values of the Country column and insert them into another table, Countries, so that the Countries table looks like this:
| ID | Country
+----+---------
| 1  | USA
| 1  | Germany
| 1  | Mexico
| 2  | Brazil
| 2  | Canada
| 3  | Peru
How do I split the values from the Country column in one table and insert them into the Country column of another table?
There is no split function in SQLite.
There is of course the substr function, but it's not suitable for your needs, since every row could contain more than one comma.
If you were an expert in SQLite, I guess you could create a recursive statement using substr to split each row.
If you're not, use Python to read the data, split each row, and write it back to the db.
You can use a recursive common table expression to split the comma-delimited column by extracting substrings of the Country column recursively:
CREATE TABLE country_split AS
WITH RECURSIVE split(id, value, rest) AS (
    SELECT ID, '', Country || ',' FROM country
    UNION ALL
    SELECT id,
           substr(rest, 0, instr(rest, ',')),
           substr(rest, instr(rest, ',') + 1)
    FROM split
    WHERE rest != ''
)
SELECT id, value
FROM split
WHERE value != '';
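If you'd rather run this from Python than from the sqlite3 shell, here is a minimal sketch (assuming, as above, that the source table is named country; the file name mydb.db is borrowed from the solution further down):

import sqlite3

con = sqlite3.connect('mydb.db')

# Run the recursive CTE directly and fetch the split rows,
# without materializing a country_split table first.
rows = con.execute("""
    WITH RECURSIVE split(id, value, rest) AS (
        SELECT ID, '', Country || ',' FROM country
        UNION ALL
        SELECT id,
               substr(rest, 0, instr(rest, ',')),
               substr(rest, instr(rest, ',') + 1)
        FROM split WHERE rest != ''
    )
    SELECT id, value FROM split WHERE value != ''
""").fetchall()

for row_id, value in rows:
    print(row_id, value.strip())  # strip the space left after each comma

con.close()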
I solved it. I'm using Python:
import sqlite3

db = sqlite3.connect('mydb.db')
cursor = db.cursor()

# Read all rows from the source table
cursor.execute("SELECT * FROM Countries")
all_data = cursor.fetchall()

# Create the destination table
cursor.execute("""CREATE TABLE IF NOT EXISTS Countriess
                  (ID TEXT,
                   Country TEXT)""")

# Split each comma-delimited Country value and insert one row per country;
# strip() drops the space left after each comma
for single_data in all_data:
    countriess = single_data[1].split(",")
    for single_country in countriess:
        cursor.execute("INSERT INTO Countriess VALUES (:id, :name)",
                       {"id": single_data[0], "name": single_country.strip()})
db.commit()
Afterwards you can use the SQLite db in another project. :)
I am trying to create a new table that combines columns from two different tables.
Let's imagine then that I have a database named db.db that includes two tables named table1 and table2.
table1 looks like this:
id | item | price
---+------+------
1  | book | 20
2  | copy | 30
3  | pen  | 10
and table2 like this (note that it has duplicate rows):
id | item | color
---+------+------
1  | book | blue
2  | copy | red
3  | pen  | red
1  | book | blue
2  | copy | red
3  | pen  | red
Now I'm trying to create a new table named new_table that combines both the price and color columns over the same ids, and also without duplicates. My code is the following (it obviously does not work because of my poor SQL skills):
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE new_table (id varchar, item integer, price integer, color integer)")
cur.execute("ATTACH DATABASE 'db.db' AS other;")
cur.execute("INSERT INTO new_table (id, item, price) SELECT * FROM other.table1")
cur.execute("UPDATE new_table SET color = (SELECT color FROM other.table2 WHERE distinct(id))")
con.commit()
I know there are multiple errors in the last line of code but I can't get my head around it. What would be your approach to this problem? Thanks!
Something like
CREATE TABLE new_table(id INTEGER, item TEXT, price INTEGER, color TEXT);
INSERT INTO new_table(id, item, price, color)
SELECT DISTINCT t1.id, t1.item, t1.price, t2.color
FROM table1 AS t1
JOIN table2 AS t2 ON t1.id = t2.id;
Note the fixed column types; yours were all sorts of strange. item and color as integers?
If each id value is unique in the new table (only one row will ever have an id of 1, only one an id of 2, and so on), that column should probably be an INTEGER PRIMARY KEY, too.
EDIT: Also, since you're creating this table in an in-memory database from tables from an attached file-based database... maybe you want a temporary table instead? Or a view might be more appropriate? Not sure what your goal is.
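For completeness, a minimal sketch of the same flow from Python's sqlite3 (assuming db.db sits in the working directory):

import sqlite3

con = sqlite3.connect(':memory:')
cur = con.cursor()

# Attach the file-based database so its tables are visible as other.*
cur.execute("ATTACH DATABASE 'db.db' AS other")

cur.execute("CREATE TABLE new_table (id INTEGER, item TEXT, price INTEGER, color TEXT)")
cur.execute("""
    INSERT INTO new_table (id, item, price, color)
    SELECT DISTINCT t1.id, t1.item, t1.price, t2.color
    FROM other.table1 AS t1
    JOIN other.table2 AS t2 ON t1.id = t2.id
""")
con.commit()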
So I have a table whose first column, id, is set to autoincrement.
Now, suppose I have data in the table with ids 1, 2, 3.
And I also have some data in the csv that starts with ids 1, 2, 3.
This is the code that I am trying to use:
cur.execute("CREATE TABLE IF NOT EXISTS sub_features (id INTEGER PRIMARY KEY AUTOINCREMENT,featureId INTEGER, name TEXT, FOREIGN KEY(featureId) REFERENCES features(id))")
df = pd.read_csv(csv_location+'/sub_features_table.csv')
df.to_sql("sub_features", con, if_exists='append', index=False)
I am getting this error:
sqlite3.IntegrityError: UNIQUE constraint failed: sub_features.id
How do I make sure that the data gets appended with the id set as required, and that a row is ignored when the entire row is a duplicate?
To explain further, say I have a table:
id | Name
---+--------
1  | Abhisek
2  | Amit
And I am trying to import this csv to the same table:
id | Name
---+--------
1  | Abhisek
2  | Rahul
Then my resultant table should be:
id | Name
---+--------
1  | Abhisek
2  | Amit
3  | Rahul
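One possible approach (not from the original thread, just a sketch): drop the csv's id column so that AUTOINCREMENT assigns fresh ids, and filter out rows that already exist before appending:

import pandas as pd

df = pd.read_csv(csv_location + '/sub_features_table.csv')

# Let AUTOINCREMENT assign fresh ids instead of importing the csv's ids
df = df.drop(columns=['id'])

# Skip rows that already exist in the table (matching on name here, per the
# example above; widen the key if whole-row duplicates are what matters)
existing = pd.read_sql_query("SELECT name FROM sub_features", con)
df = df[~df['name'].isin(existing['name'])]

df.to_sql('sub_features', con, if_exists='append', index=False)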
I'm using petl and trying to figure out how to insert a value into a specific row.
I have a table that looks like this:
+----------------+---------+------------+
| Cambridge Data | IRR | Price List |
+================+=========+============+
| '3/31/1989' | '4.37%' | |
+----------------+---------+------------+
| '4/30/1989' | '5.35%' | |
+----------------+---------+------------+
I want to set the price list to 100 on the row where Cambridge Data is 4/30/1989. This is what I have so far:
def insert_initial_price(self, table):
    import petl as etl
    initial_price_row = etl.select(table, 'Cambridge Data', lambda v: v == '3/31/1989')
That selects the row I need to insert 100 into, but I'm unsure how to insert it. petl doesn't seem to have an "insert value" function.
I would advise against using select here.
To update the value of a field, use convert.
See the docs with many examples: https://petl.readthedocs.io/en/stable/transform.html#petl.transform.conversions.convert
I have not tested it, but this should solve it:
import petl as etl

table2 = etl.convert(
    table,
    'Price List',
    100,
    where=lambda rec: rec["Cambridge Data"] == '4/30/1989',
)
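You can then eyeball the result with petl's look utility (a small usage sketch; it prints the first few rows):

print(etl.look(table2))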
When querying with the SQLAlchemy ORM against PostgreSQL (v9.5), how can I prevent a column that is used only for sorting from being selected automatically? The sorted column should not appear in the SELECT list.
Hopefully the sample code below makes this more clear.
Example code
A table with an integer 'id', an integer 'object_id' and a string 'text':
id | object_id | text
---+-----------+------
1  | 1         | house
2  | 2         | tree
3  | 1         | dog
The following query should return the distinct object_id as its own id with the most recent text:
query = session.query(
    MyTable.object_id.label('id'),
    MyTable.text
).\
    distinct(MyTable.object_id).\
    order_by(MyTable.object_id, MyTable.id.desc())
So far so good; but when I compile the query:
print(query.statement.compile(dialect=postgresql.dialect()))
mytable.id and mytable.object_id are selected as well, so the id column effectively appears twice:
SELECT DISTINCT ON (mytable.object_id)
       mytable.object_id AS id,
       mytable.text,
       mytable.object_id,
       mytable.id
FROM mytable
ORDER BY mytable.object_id, mytable.id DESC
You can try this; it should work:
query = session.query(
    MyTable.object_id.distinct().label('id'),
    MyTable.text
).order_by(MyTable.object_id, MyTable.id.desc())
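If the ORDER BY columns still leak into the SELECT list, another option (a sketch, not from the original answer) is to wrap the DISTINCT ON query in a subquery and select only the columns you want from it:

inner = (
    session.query(
        MyTable.object_id.label('id'),
        MyTable.text,
    )
    .distinct(MyTable.object_id)
    .order_by(MyTable.object_id, MyTable.id.desc())
    .subquery()
)

# The outer query exposes only id and text, whatever the inner SELECT carries.
result = session.query(inner.c.id, inner.c.text).all()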
I have a csv file with address information: zip, city, state, country, street, house_no (the last one is the house number). It is being imported through OpenERP's import interface. You can import related data by providing one of three things: the name, the database id, or the external id. The simplest is providing the name.
For example, for a city I don't need to specifically provide its id (and change the column from street to street_id and then supply that street's id), but just its real name, like Some city. If such a city name exists in the city table, then everything imports without problems.
But problems arise when there is more than one city with the same name. To resolve the name clashes I need to provide those cities' ids explicitly. The problem is that there are so many addresses that looking them up and changing the names to ids manually is nearly impossible.
So I'm wondering whether it's possible to write some script, or pass that csv file to PostgreSQL (or to OpenERP using the ORM) as a condition, so that it returns the list of ids matching the conditions from the csv file.
All the needed streets are already imported into the street table of my database, and the cities into the city table.
city table has this structure (with example data):
id | name  | state_id
---+-------+---------
1  | City1 | 1
2  | City1 | 2
3  | City2 | 2
state table example:
id | name
---+-------
1  | State1
2  | State2
So as you can see, identical names can be distinguished by their id, or by state_id (or the state name, if you look in the state table).
And here is an example of the addresses csv file (the database also has a table to import that information into):
zip | city  | state_id | country  | street  | house_no
----+-------+----------+----------+---------+---------
123 | City1 | 1        | Country1 | Street1 | 25a
124 | City1 | 2        | Country1 | Street2 | 34
125 | City2 | 2        |          |         |
If I validate such a csv file through the OpenERP interface, I get a warning that there are two cities with the same name. If I proceed anyway, it picks whichever city was imported first, so some addresses end up assigned a city in the wrong state (keep in mind that the city column is also used for various villages etc., which is why the same names appear in different states).
So I need to change the city names to their ids, but as I said, there are hundreds of thousands of lines, and doing that manually is nearly impossible and would take a lot of time.
Finally, what I need is to somehow pass all that information from the addresses csv file to the database, specifically to the city table, and get back the list of matching ids.
For example, if I were to input (as a condition for the city table):
name  | state_id
------+---------
City1 | 1
City1 | 2
City2 | 2
City1 | 1
it should output this:
1
2
3
1
Could someone suggest how to get such a result?
I was able to solve this problem by writing this script:
#!/usr/bin/python
# -*- encoding: utf-8 -*-
import psycopg2
import csv

# Connect to the database
conn = psycopg2.connect(database="db_name", user="user",
                        password="password", host="127.0.0.1", port="5432")
cur = conn.cursor()

# Get all city ids and names for a specific state
cur.execute("SELECT id, name FROM res_country_state_city WHERE state_id = 53")
rows = cur.fetchall()

# Build a name -> id dict from the data
rows_dict = {}
for row in rows:
    rows_dict[row[1]] = row[0]

# Check which names from cities-names.csv match a name in the database
# (a match yields that city's id)
with open('cities-names.csv') as csvfile:
    with open('cities-ids.csv', 'wb') as csvfile2:
        reader = csv.reader(csvfile)
        writer = csv.writer(csvfile2)
        # create the ids csv file and write the ids that were matched
        for row in reader:
            if rows_dict.get(row[0]):
                writer.writerow([rows_dict.get(row[0])])
conn.close()
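Note that the lookup is keyed on name alone and hardcodes state_id = 53, so it still can't tell apart two cities with the same name in different states. A small variant of the lookup portion (just a sketch, assuming the input csv carries name,state_id pairs and this runs before conn.close()) keys the dict on both name and state:

# Fetch every city with its state, then key the lookup on (name, state_id)
cur.execute("SELECT id, name, state_id FROM res_country_state_city")
ids_by_key = {(name, str(state_id)): city_id
              for city_id, name, state_id in cur.fetchall()}

with open('cities-names.csv') as src:
    with open('cities-ids.csv', 'wb') as dst:
        writer = csv.writer(dst)
        for name, state_id in csv.reader(src):
            match = ids_by_key.get((name, state_id))
            if match:
                writer.writerow([match])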