How to create a valueless column in Cassandra with CQLEngine - Python

I have a question about how I can create a valueless column in Cassandra with CQLEngine. I mean I want to store information in the column name instead of the column value, for some purposes.
But in CQLEngine you must define your column names in the model file before you run your project.
Thanks for any help.

What you're asking about would traditionally be accomplished using the Thrift API and wide rows, defining your own column names. This is not how CQL3 works. If you want to work that way, you can use pycassa, or you can create columns with compound keys.
For example, with CQL3:
create table ratings (
    user_id uuid,
    item_id uuid,
    rating int,
    primary key (user_id, item_id)
);
You'd actually be creating a wide row: user_id is the row key, the column names contain the item_id, and the rating is the column value. With compound primary keys, rows are transposed into columns.
You might want to read this: http://brianoneill.blogspot.com/2012/09/composite-keys-connecting-dots-between.html
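For reference, a minimal cqlengine sketch of that ratings table (assuming the standalone cqlengine package; the class and field names are illustrative):
from cqlengine import columns
from cqlengine.models import Model

class Rating(Model):
    # compound primary key: user_id is the partition key,
    # item_id the clustering key
    user_id = columns.UUID(primary_key=True)
    item_id = columns.UUID(primary_key=True)
    rating = columns.Integer()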

OK, so first I need to define my column family like this:
create table test (
    user text,
    itemLiked text,
    primary key (user, itemLiked)
);
so user is the row key and itemLiked is the clustering key.
If I insert values into this column family, Cassandra doesn't (internally) create rows for them; instead, the itemLiked values become the column names:
insert into test (user, itemLiked) values ('user1', 'Item1');
insert into test (user, itemLiked) values ('user1', 'Item2');
So Cassandra will create two columns named Item1 and Item2 for the row key user1. These columns have no values, as described in this article:
http://cassandra.apache.org/doc/cql3/CQL.html
in the line:
Moreover, a table must define at least one column that is not part of the PRIMARY KEY as a row exists in Cassandra only if it contains at least one value for one such column.
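A rough cqlengine model for that table might look like this (again a sketch assuming the standalone cqlengine package; db_field is used only to preserve the mixed-case column name):
from cqlengine import columns
from cqlengine.models import Model

class Test(Model):
    # partition (row) key
    user = columns.Text(primary_key=True)
    # clustering key; its values become the column names on disk
    item_liked = columns.Text(primary_key=True, db_field='itemLiked')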

Related

Copying a distinct sqlite table grouped by single column results in IntegrityError: Column is not unique (Python)

I have a relatively small sqlite3 database (~2.6GB) with 820k rows and 26 columns (single table). Within this database there is a table named old_table; when I created this table it had no primary key, so adding new rows was prone to introducing duplicates.
In terms of efficiency, I created the same database again, but this time with the column Ref set as primary key: 'Ref VARCHAR(50) PRIMARY KEY,'. According to many resources, we should be able to select only the unique rows based on a single column with the query SELECT * FROM old_table GROUP BY Ref. I want to keep the unique values, so I insert them into a new table with INSERT INTO new_table. Afterwards I would like to drop the old table with DROP TABLE old_table. Finally, new_table should be renamed to old_table with ALTER TABLE new_table RENAME TO old_table.
Why does my SQL state that column Ref is not unique?
#Transferring old database to new one, with Ref as unique primary key
#And deleting old_table
conn = connect_to_db("/mnt/wwn-0x5002538e00000000-part1/DATABASE/database.db")
c = conn.cursor()
c.executescript("""
INSERT INTO new_table SELECT * from old_table GROUP BY Ref;
DROP TABLE old_table;
RENAME TO new_table
""")
conn.close()
---------------------------------------------------------------------------
IntegrityError: column Ref is not unique
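For reference, the script as written is also missing the full ALTER TABLE statement; a minimal sketch of the intended sequence, assuming the same connect_to_db helper and table names (INSERT OR IGNORE is used here to skip any rows whose Ref would collide with an already-inserted one):
conn = connect_to_db("/mnt/wwn-0x5002538e00000000-part1/DATABASE/database.db")
c = conn.cursor()
c.executescript("""
    INSERT OR IGNORE INTO new_table SELECT * FROM old_table GROUP BY Ref;
    DROP TABLE old_table;
    ALTER TABLE new_table RENAME TO old_table;
""")
conn.commit()
conn.close()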

Add dataframe column WITH VARYING VALUES to MySQL table?

Pretty simple question, but not sure if it’s possible from what I’ve seen so far online.
To keep it simple, let’s say I have a MySQL table with 1 column and 5 rows made already. If I have a pandas dataframe with 1 column and 5 rows, how can I add that dataframe column (with its values) to the database table?
The guides I’ve read so far only show you how to simply create a new column with either null values or 1 constant value, which doesn’t help much. The same question was asked here but the answer provided didn’t answer the question, so I’m asking it again here.
As an example, imagine a MySQL table with an index column and one value column, and a pandas DataFrame holding the new column's values for the same indexes; the desired result is the MySQL table with that column appended. Then for kicks, let's say we have a string column to add as well. (Example tables omitted.)
Safe to assume the index column will always match in the DF and the MySQL table.
You can use INSERT ... ON DUPLICATE KEY UPDATE.
You have the following table:
create table tbl (
    index_ int,
    col_1 int,
    primary key index_ (`index_`)
);
insert into tbl values (1,1), (2,2), (3,3), (4,4), (5,5);
And you want to add the following data as a new column on the same table:
(1,0.1), (2,0.2), (3,0.3), (4,0.4), (5,0.5)
First you need to add the column with an ALTER statement:
alter table tbl add column col_2 decimal(5,2);
Then use an INSERT ... ON DUPLICATE KEY UPDATE statement:
INSERT INTO tbl (index_,col_2)
VALUES
(1,0.1),
(2,0.2),
(3,0.3),
(4,0.4),
(5,0.5)
ON DUPLICATE KEY UPDATE col_2=VALUES(col_2);
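If you'd rather drive that upsert from the pandas side, here is a minimal sketch using SQLAlchemy (the connection URL is a placeholder; table and column names follow the example above):
import pandas as pd
from sqlalchemy import create_engine, text

df = pd.DataFrame({"index_": [1, 2, 3, 4, 5],
                   "col_2": [0.1, 0.2, 0.3, 0.4, 0.5]})

# placeholder connection URL
engine = create_engine("mysql+pymysql://user:password@localhost/mydb")
stmt = text(
    "INSERT INTO tbl (index_, col_2) VALUES (:index_, :col_2) "
    "ON DUPLICATE KEY UPDATE col_2 = VALUES(col_2)"
)

with engine.begin() as conn:
    # one parameter dict per DataFrame row -> executemany
    conn.execute(stmt, df.to_dict(orient="records"))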

How to get all the values in a column from a database in python (SQLAlchemy)

I have a table called draws. All I want to do is get the 'user_id' column as an array, ordered by the 'id' column (top to bottom).
The ID column just numbers each row.
I've written this:
user_id = Draw.query.order_by(desc('id')).all()
This query gets every column though, right? I just want the information in the 'user_id' column.
user_id = Draw.query.with_entities(Draw.user_id).order_by(desc(Draw.id)).all()
See the with_entities explanation in the docs.
Alternatively you can just do a list comprehension on the query you already have -- something like:
user_ids = [d.user_id for d in user_id]
Try this!
from sqlalchemy import desc
db.session.query(Draw.user_id).order_by(desc(Draw.id)).all()
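Note that both queries return one-element row tuples rather than bare values; to flatten them (same Draw model and session as above):
user_ids = [row[0] for row in
            db.session.query(Draw.user_id).order_by(desc(Draw.id)).all()]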

Add another column to existing list

I'm starting to learn Python and I'm trying to do an exercise where I have to save some stock data coming from a SQL query in a "rows" variable, like this:
rows = db.execute("SELECT * FROM user_quote WHERE user_id=:userid", userid=session["user_id"])
This will return 4 columns (id, user_id, symbol, name).
Then, for every row the query returns, I'll get the last known price of that stock from an API, and I want to add that information as another column in my rows variable. Is there a way to do this? Should I use another approach?
Thanks for your time!
I'm not sure what type the rows variable is, but you can just add an additional column in the SELECT:
rows = db.execute("SELECT *, 0 NewCol FROM user_quote WHERE user_id=:userid", userid=session["user_id"])
Assuming rows is mutable, this will provide a placeholder for the new value.
Convert each row tuple to a list, then you can use append() to add the price:
for i in range(len(rows)):
    rows[i] = list(rows[i])
    rows[i].append(price)
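If rows is instead a list of dicts (as some SQL wrappers, e.g. CS50's library, return), each row can be extended by key; lookup_price below is a hypothetical stand-in for the API call:
for row in rows:
    # lookup_price is a placeholder for whatever API fetches the quote
    row["price"] = lookup_price(row["symbol"])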

Pymssql insert multiple from list of dictionaries with dynamic column names

I am using Python 2.7 to perform CRUD operations on a MS SQL 2012 DB.
I have data stored in a list of dictionaries, NewComputers (each dictionary is a row in the database).
This is working properly. However, the source data column names and the destination column names are both hard-coded (note that the column names differ).
Questions: Instead of hard-coding the data source column names, how can I loop over the dictionary to determine the column names dynamically? Also, instead of hard-coding the destination (database table) column names, how can I loop over a list of column names dynamically?
I would like to make this function re-usable for different data source columns and destination columns.
In other words:
"INSERT INTO Computer (PARAMETERIZED LIST OF COLUMN NAMES) VALUES (PARAMETERIZED LIST OF VALUES)"
Here is the function:
def insertSR(NewComputers):
    conn = pymssql.connect(mssql_server, mssql_user, mssql_pwd, "Computers")
    cursor = conn.cursor(as_dict=True)
    try:
        # How to make the column names dynamic?
        cursor.executemany(
            "INSERT INTO Computer (ComputerID, HostName, Type) "
            "VALUES (%(computer_id)s, %(host_name)s, %(type)s)",
            NewComputers)
    except:
        conn.rollback()
        print("ERROR: Database Insert failed.")
    conn.commit()
    print("Inserted {} rows successfully".format(cursor.rowcount))
    conn.close()
You can't do what you'd like to.
Basically, your multiple-row INSERT SQL query will get translated to:
insert into table (column1, column2, column3) values (a1,a2,a3), (b1,b2,b3)
So as you can see, you'll need at least one distinct query per group of destination columns.
Then on the data source side ((a1,a2,a3), (b1,b2,b3) in my example), you don't have to specify the column names, so you can have different data sources for a given destination.
For that part, I'd do something like this:
First build a correspondence dict, where each key is the destination field name and the values are the other names used for that field in the data source tables:
source_correspondence = {
    'ComputerID': ['id_computer', 'computer_Id'],
    'HostName': ['host', 'ip', 'host_name'],
    'Type': ['type', 'type']
}
Then iterate over your data source and replace each column name with the corresponding key from your correspondence dict.
Then you can finally build your queries (one executemany per destination 'type').
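A minimal sketch of that approach (the helper names are mine; the column names interpolated into the SQL must come from the trusted correspondence dict, never from user input, since identifiers can't be parameterized):
from itertools import groupby

def remap_rows(rows, correspondence):
    # Invert {destination: [source aliases]} into {source alias: destination}.
    alias_to_dest = {alias: dest
                     for dest, aliases in correspondence.items()
                     for alias in aliases}
    return [{alias_to_dest[key]: value for key, value in row.items()}
            for row in rows]

def insert_rows(cursor, table, rows):
    # One executemany per distinct column set, since each set needs its own SQL.
    by_columns = lambda row: tuple(sorted(row))
    for cols, group in groupby(sorted(rows, key=by_columns), key=by_columns):
        sql = "INSERT INTO {} ({}) VALUES ({})".format(
            table,
            ", ".join(cols),
            ", ".join("%({})s".format(col) for col in cols))
        cursor.executemany(sql, list(group))
Usage would then be, e.g., insert_rows(cursor, "Computer", remap_rows(NewComputers, source_correspondence)).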
