Add dataframe column WITH VARYING VALUES to MySQL table? - python

Pretty simple question, but not sure if it’s possible from what I’ve seen so far online.
To keep it simple, let’s say I have a MySQL table with 1 column and 5 rows made already. If I have a pandas dataframe with 1 column and 5 rows, how can I add that dataframe column (with its values) to the database table?
The guides I’ve read so far only show you how to simply create a new column with either null values or 1 constant value, which doesn’t help much. The same question was asked here but the answer provided didn’t answer the question, so I’m asking it again here.
As an example:
MySQL table:

index_   col_1
1        1
2        2
3        3
4        4
5        5

Pandas DataFrame:

index_   col_2
1        0.1
2        0.2
3        0.3
4        0.4
5        0.5

Desired MySQL table:

index_   col_1   col_2
1        1       0.1
2        2       0.2
3        3       0.3
4        4       0.4
5        5       0.5
Then for kicks, let's say we have a string column to add as well:
Desired MySQL output:
Safe to assume the index column will always match in the DF and the MySQL table.

You can use INSERT ... ON DUPLICATE KEY UPDATE.
You have the following table:
create table tbl (
    index_ int,
    col_1 int,
    primary key (index_)
);

insert into tbl values (1,1), (2,2), (3,3), (4,4), (5,5);
And you want to add the following data as a new column on the same table:
(1,0.1),(2,0.2),(3,0.3),(4,0.4),(5,0.5)
First you need to add the column with an ALTER TABLE command:
alter table tbl add column col_2 decimal(5,2) ;
Then use an INSERT ... ON DUPLICATE KEY UPDATE statement:
INSERT INTO tbl (index_,col_2)
VALUES
(1,0.1),
(2,0.2),
(3,0.3),
(4,0.4),
(5,0.5)
ON DUPLICATE KEY UPDATE col_2=VALUES(col_2);
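Since the question starts from a pandas dataframe, here is a minimal Python sketch of the same approach, assuming the PyMySQL driver; the connection credentials are placeholders:

import pandas as pd
import pymysql

df = pd.DataFrame({'index_': [1, 2, 3, 4, 5],
                   'col_2': [0.1, 0.2, 0.3, 0.4, 0.5]})

# Connection details are placeholders.
conn = pymysql.connect(host='localhost', user='user',
                       password='secret', database='mydb')
with conn.cursor() as cur:
    cur.execute('ALTER TABLE tbl ADD COLUMN col_2 DECIMAL(5,2)')
    # One INSERT ... ON DUPLICATE KEY UPDATE per dataframe row,
    # batched by executemany.
    cur.executemany(
        'INSERT INTO tbl (index_, col_2) VALUES (%s, %s) '
        'ON DUPLICATE KEY UPDATE col_2 = VALUES(col_2)',
        list(df[['index_', 'col_2']].itertuples(index=False, name=None)),
    )
conn.commit()
conn.close()

Because index_ is the primary key, existing rows are updated in place rather than duplicated.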

Related

Copy data from one table to another table in Postgres which contains many Columns

I want to copy data from Table A to Table B in Postgres. Table A contains 40 columns and Table B contains 20 columns. It's like Table B is the subset of Table A, which means - Table B contains only some columns which are in Table A.
I have found the answer https://stackoverflow.com/a/7483174/12556735 for copying the data when there are only a few columns.
Since there are many columns, is there any way to copy the data without mentioning the column names?
Choice 1:
If you notice, the question/answer you referred to already answers this, as the approach has no limitation on the number of columns. You do need to know the original column definitions and the required column list, but it works fine for many columns.
Choice 2:
If the columns are unknown, you can try:
create table newtable as (select * from existingtable);
Choice 3:
If you want to create a new table with only selected columns (which means you do need to find out what the columns are), you can first look them up:
select column_name, data_type, is_nullable from information_schema.columns where table_schema = 'public' and table_name = 'yourtable';
From the resulting list, column_name, data_type, is_nullable, etc. can be used in your script.
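As a sketch of choice 3 in Python, assuming psycopg2 and hypothetical table names table_a (source, 40 columns) and table_b (target, 20 columns), where table_b's columns are a subset of table_a's:

import psycopg2

# Connection details are placeholders.
conn = psycopg2.connect(host='***', dbname='***', user='***', password='***')
cur = conn.cursor()

# Look up table_b's column names instead of typing all 20 of them.
cur.execute(
    "SELECT column_name FROM information_schema.columns "
    "WHERE table_schema = 'public' AND table_name = %s "
    "ORDER BY ordinal_position",
    ('table_b',),
)
cols = ', '.join(row[0] for row in cur.fetchall())

# Copy only those columns from table_a.
cur.execute('INSERT INTO table_b ({c}) SELECT {c} FROM table_a'.format(c=cols))
conn.commit()
conn.close()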

How can I append a column in sqlite3 and fill its values with pandas Series

I'd like to append a column into the table existing in sqlite3 database, using values stored in a pandas Series.
My original DataFrame df looks like:
a b
0 1 2
1 3 4
And this is stored as a table in sqlite3 also.
If I add a column to df as:
df['c'] = df.a + df.b
then df will be:
a b c
0 1 2 3
1 3 4 7
whereas the table in the sqlite3 db is not changed yet.
What I want to do is to append a column ('c') into the table in sqlite3 and fill its values with df['c'].
What I tried is:
con = sqlite3.connect('data/a.db')
cur = con.cursor()
cur.execute('alter table temptable add column c integer')
con.commit()
cur.execute('update temptable set c=?', df.c)
con.commit()
con.close()
However, it is not working. Is there a possible way to perform bulk update for the new column 'c' in sqlite3? The number of rows is usually around 100,000,000.
Assuming ALTER TABLE worked and you were able to add the new column, try using sqlite3's executemany method to fill in the new column's values. The accepted answer to this SO question shows you how to do it (note that you'll need a primary key on your table); a sketch follows below.
As an alternative, this link shows you how to use DataFrame.to_sql to update the entire table from the dataframe without writing the SQL yourself.
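A minimal sketch of the executemany route, assuming the table has an integer primary key column id whose values match the dataframe index (the column name id is an assumption):

import sqlite3
import pandas as pd

df = pd.DataFrame({'a': [1, 3], 'b': [2, 4]})
df['c'] = df.a + df.b

con = sqlite3.connect('data/a.db')
cur = con.cursor()
cur.execute('ALTER TABLE temptable ADD COLUMN c INTEGER')
# Pair each new value with the primary key of its row; executemany
# runs one UPDATE per pair inside a single transaction.
cur.executemany(
    'UPDATE temptable SET c = ? WHERE id = ?',
    [(int(c), int(i)) for i, c in df.c.items()],
)
con.commit()
con.close()

With ~100,000,000 rows you may want to feed the parameters in chunks rather than materializing the whole list at once.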

Update multiple rows in mysql table with single call

I have a table in MySql DB. I need to update a particular column for all the rows in the table.
I need to do this in a single call without deleting the rows, only updating the column values.
I have tried using df.to_sql(if_exists='replace') but this deletes the rows and re-inserts them. Doing so drops the rows from other table which are linked by the foreign key.
merge_df = merge_df[['id', 'deal_name', 'fund', 'deal_value']]
for index, row in merge_df.iterrows():
    ma_deal_obj = MA_Deals.objects.get(id=row['id'])
    ma_deal_obj.deal_value = row['deal_value']
    ma_deal_obj.save()
merge_df has other columns as well. I only need to update the 'deal_value' column for all rows.
One solution I have (the loop above) is iterating over the dataframe rows and using the Django ORM to save each value, but this is quite slow for a large number of rows.
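A faster sketch, assuming Django >= 2.2 so that QuerySet.bulk_update is available; it fetches all affected objects in one query and writes the changed column back in batches instead of saving row by row:

# Fetch all affected objects keyed by primary key in a single query.
deal_map = MA_Deals.objects.in_bulk(merge_df['id'].tolist())
for _, row in merge_df.iterrows():
    deal_map[row['id']].deal_value = row['deal_value']
# Update only the deal_value column, 1000 rows per UPDATE statement.
MA_Deals.objects.bulk_update(deal_map.values(), ['deal_value'], batch_size=1000)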

How to add a new column (Python list) to a Postgresql table?

I have a Python list newcol that I want to add to an existing Postgresql table. I have used the following code:
conn = psycopg2.connect(host='***', database='***', user='***', password='***')
cur = conn.cursor()
cur.execute('ALTER TABLE %s ADD COLUMN %s text' % ('mytable', 'newcol'))
conn.commit()
This added a column named newcol to my table, however the new column has no values in it. In Python, when I print the list, it is fully populated.
Also, the number of rows in the table and in the list I want to add are the same. I'm a little confused.
Thanks in advance for the help.
ALTER TABLE only changes table schema -- in your case it will create the new column and initialize it with empty (NULL) values.
To add the list of values to this column you can run
UPDATE <table> SET ... in a loop, as sketched below.
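A minimal sketch of that loop, assuming the table has a serial primary key id (1..N) whose order lines up with newcol's list positions (the id column is an assumption):

import psycopg2

newcol = ['foo', 'bar', 'baz']  # example values

# Connection details are placeholders.
conn = psycopg2.connect(host='***', database='***', user='***', password='***')
cur = conn.cursor()
cur.execute('ALTER TABLE mytable ADD COLUMN newcol text')
for i, value in enumerate(newcol, start=1):
    # Values are passed as parameters so psycopg2 handles the quoting.
    cur.execute('UPDATE mytable SET newcol = %s WHERE id = %s', (value, i))
conn.commit()
conn.close()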

How create valueless column In Cassandra By CQLEngine

I have a question about how I can create a valueless column in Cassandra with CQLEngine. I mean, I want to store information in the column name instead of the column value, for a particular purpose.
But in CQLEngine you have to define your column names in the model file before you run your project.
Thanks for any help.
What you're asking about would traditionally be accomplished using the Thrift API and wide rows, defining your own column names. This is not how CQL3 works. If you want to work this way, you can use pycassa, or you can create columns with compound keys.
For example, with CQL3:
create table ratings (
    user_id uuid,
    item_id uuid,
    rating int,
    primary key (user_id, item_id)
);
You'd actually create a wide row, with user_id as the key; the columns would be named by item_id, and the rating would be the value. With compound primary keys, rows are transposed into columns.
You might want to read this: http://brianoneill.blogspot.com/2012/09/composite-keys-connecting-dots-between.html
Ok, first I need to define my column family like this:
create table test (
    user text,
    itemLiked text,
    primary key (user, itemLiked)
);
So user is the row key and itemLiked is the clustering key.
If I insert values into this column family, Cassandra doesn't (internally) create a new row for each one; instead, the itemLiked values become the names of the columns:
insert into test (user, itemLiked) values ('user1', 'Item1');
insert into test (user, itemLiked) values ('user1', 'Item2');
So Cassandra will create two columns named Item1 and Item2 for row key user1.
These columns have no values, as described in this article:
http://cassandra.apache.org/doc/cql3/CQL.html
in the line that says:
Moreover, a table must define at least one column that is not part of the PRIMARY KEY as a row exists in Cassandra only if it contains at least one value for one such column.
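For completeness, a minimal sketch of the same pattern using the cqlengine that ships with the DataStax Python driver (the model and names are illustrative; keyspace and connection setup are omitted):

from cassandra.cqlengine import columns
from cassandra.cqlengine.models import Model

class Liked(Model):
    # The first primary_key column is the partition (row) key ...
    user = columns.Text(primary_key=True)
    # ... and subsequent ones are clustering keys, i.e. the column names.
    item_liked = columns.Text(primary_key=True)

# Each create() adds one "column" to user1's row; no non-key value is stored.
# Liked.create(user='user1', item_liked='Item1')
# Liked.create(user='user1', item_liked='Item2')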
