Gather rows under the same key in Cassandra/Python

I was working through Pattern 1 and was wondering if there is a way, in Python or otherwise, to "gather" all the rows with the same id into a single row with a dictionary.
CREATE TABLE temperature (
    weatherstation_id text,
    event_time timestamp,
    temperature text,
    PRIMARY KEY (weatherstation_id, event_time)
);
Insert some data
INSERT INTO temperature(weatherstation_id,event_time,temperature)
VALUES ('1234ABCD','2013-04-03 07:01:00','72F');
INSERT INTO temperature(weatherstation_id,event_time,temperature)
VALUES ('1234ABCD','2013-04-03 07:02:00','73F');
INSERT INTO temperature(weatherstation_id,event_time,temperature)
VALUES ('1234ABCD','2013-04-03 07:03:00','73F');
INSERT INTO temperature(weatherstation_id,event_time,temperature)
VALUES ('1234ABCD','2013-04-03 07:04:00','74F');
Query the database.
SELECT weatherstation_id,event_time,temperature
FROM temperature
WHERE weatherstation_id='1234ABCD';
Result:
 weatherstation_id | event_time               | temperature
-------------------+--------------------------+-------------
          1234ABCD | 2013-04-03 06:01:00+0000 |         72F
          1234ABCD | 2013-04-03 06:02:00+0000 |         73F
          1234ABCD | 2013-04-03 06:03:00+0000 |         73F
          1234ABCD | 2013-04-03 06:04:00+0000 |         74F
This works, but I was wondering if I can turn it into one row per weatherstation_id.
E.g.
{
    "weatherstationid": "1234ABCD",
    "2013-04-03 06:01:00+0000": "72F",
    "2013-04-03 06:02:00+0000": "73F",
    "2013-04-03 06:03:00+0000": "73F",
    "2013-04-03 06:04:00+0000": "74F"
}
Is there some parameter in the Cassandra driver that can be specified to gather rows by a certain id (weatherstation_id) and turn everything else into a dictionary? Or does there need to be some Python magic to turn the list of rows into a single row per id (or set of ids)?

Alex, you will have to do some post-execution data processing to get this format. The driver returns rows one by one, no matter what row_factory you use.
One of the reasons the driver cannot produce the format you suggest is that pagination is involved internally (the default fetch_size is 5000), so results gathered that way could be partial or incomplete. This is easily done in Python once query execution is finished and you are sure that all required results have been fetched.
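For example, once execution has completed, a plain dictionary build does the job. A minimal sketch, assuming the session object from the question (iterating the driver's ResultSet transparently fetches any remaining pages):

from collections import defaultdict

rows = session.execute(
    "SELECT weatherstation_id, event_time, temperature "
    "FROM temperature WHERE weatherstation_id='1234ABCD'"
)

# Iterating the ResultSet drives paging under the hood, so all pages
# are consumed before the dictionary is treated as complete.
gathered = defaultdict(dict)
for row in rows:
    gathered[row.weatherstation_id][str(row.event_time)] = row.temperature

# gathered['1234ABCD'] now maps each event_time to its temperature.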

Related

How do I get only the value of a column identified by a unique userid in MySQL? (Please refer to the question below for a more detailed explanation.)

I'm currently writing an RPG game in Python that uses a MySQL database to store info on players. However, I've come across a problem.
Sample Code of How Database has been Set Up:
playerinfo table:
userID | money | xp
-------+-------+----
     1 |   200 | 20
     2 |   100 | 10
I'm trying to select only the money value itself. My select query right now is
SELECT money FROM playerinfo WHERE id = 1
The full code/function for selecting the info is
def get_money_stats(user_id):
    global monresult
    remove_characters = ["{", "}", "'"]
    try:
        with connection.cursor() as cursor:
            monsql = "SELECT money FROM players WHERE userid = %s"
            value = user_id
            cursor.execute(monsql, value)
            monresult = str(cursor.fetchone())
    except Exception as e:
        print(f"An error Occurred> {e}")
CURRENT OUTPUT:
{'money': 200}
DESIRED OUTPUT:
200
Basically, all I want to select is the INT value from the player's row (identified by the unique userid). How do I do that? The only solution I have is to replace the characters with something else, but I don't really want to do that as it's incredibly inconvenient and messy. Is there another way to reformat/select the data?
It seems that fetching one row gives you a dictionary of the selected columns and their values, which seems like the correct approach to me. You should simply access the dictionary with the column that you want to retrieve:
monresult = cursor.fetchone()['money']
If you don't want to specify the column again (which you should), you could get the values of the dictionary as a list and retrieve the first one:
monresult = list(cursor.fetchone().values())[0]
I do not recommend the last approach because it depends heavily on the current shape of the query and may have to change if the query changes.
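Putting that together, the function from the question could shrink to something like this (a sketch, assuming the same pymysql-style connection with a dictionary cursor, which the {'money': 200} output suggests):

def get_money_stats(user_id):
    try:
        with connection.cursor() as cursor:
            # Parameters go in as a tuple; the driver handles quoting.
            cursor.execute("SELECT money FROM players WHERE userid = %s", (user_id,))
            row = cursor.fetchone()
            return row['money'] if row else None  # e.g. 200 for user_id 1
    except Exception as e:
        print(f"An error occurred: {e}")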

How to create a filter in SQLite database across multiple tables?

I am looking for a way to create a number of filters across a few tables in my SQL database. The 2 tables I require the data from are Order and OrderDetails.
The Order table is like this:
------------------------------------
| OrderID | CustomerID | OrderDate |
------------------------------------
The OrderDetails table is like this:
----------------------------------
| OrderID | ProductID | Quantity |
----------------------------------
I want to make it so that it counts the number of instances a particular OrderID pops up in a single day. For example, it will choose an OrderID in Order and then match it to the OrderIDs in OrderDetails, counting the number of times it pops up in OrderDetails.
-----------------------------------------------------------
| OrderID | CustomerID | OrderDate | ProductID | Quantity |
-----------------------------------------------------------
The code I used is below here:
# Execute SQL Query (number of orders made on a particular day entered by a user)
cursor.execute("""
    SELECT 'order.*', count('orderdetails.orderid') as 'NumberOfOrders'
    from 'order'
    left join 'order'
        on ('order.orderid' = 'orderdetais.orderid')
    group by
        'order.orderid'
""")
print(cursor.fetchall())
Also, the current output that I get is this when I should get 3:
[('order.*', 830)]
Your immediate problem is that you are abusing single quotes. If you need to quote an identifier (a table name, column name and the like), you should use double quotes in SQLite (this is actually the SQL standard). And an expression such as order.* should not be quoted at all. You are also self-joining the order table, while you probably want to bring in orderdetails.
You seem to want:
select
    o.orderID,
    o.customerID,
    o.orderDate,
    count(od.orderID) as number_of_orders
from "order" o
left join orderdetails od on od.orderid = o.orderid
group by o.orderID, o.customerID, o.orderDate
order is a language keyword, so I did quote it; that table would be better named orders, to avoid the conflicting name. Other identifiers do not need to be quoted here. Note that count(od.orderID) rather than count(*) makes orders with no matching details report 0 instead of 1.
Since all you want from orderdetails is the count, you could also use a subquery instead of aggregation:
select
    o.*,
    (select count(*) from orderdetails od where od.orderid = o.orderid) as number_of_orders
from "order" o

SQL - Possible to Auto Increment Number but with leading zeros?

Our IDs look something like "CS0000001", which stands for customer with ID 1. Is this possible to do with SQL and auto-increment, or do I need to do it in my GUI?
I need the leading zeroes, but with auto-incrementing to prevent double usage if I construct the ID in Python and insert it into the DB.
Is that possible?
You have a few choices:
1. Construct the CustomerID in your code which inserts the data into the Customer table (application side, requires a change in your code; see the sketch below).
2. Create a view on top of the Customer table that contains the logic, and use that when you need the CustomerID (database side, requires a change in your code).
3. Use a procedure to do the inserts and construct the CustomerID in the procedure (database side, requires a change in your code).
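A sketch of the first option, assuming the table keeps a surrogate AUTO_INCREMENT column id next to a customer_id column (both column names are illustrative), with a pymysql-style connection:

with connection.cursor() as cursor:
    cursor.execute("INSERT INTO Customer (name) VALUES (%s)", ('Alice',))
    new_id = cursor.lastrowid              # AUTO_INCREMENT value just assigned
    customer_id = f"CS{new_id:07d}"        # 1 -> 'CS0000001'
    cursor.execute("UPDATE Customer SET customer_id = %s WHERE id = %s",
                   (customer_id, new_id))
connection.commit()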
A possible trigger-based realization:
Create data table
CREATE TABLE data (id CHAR(9) NOT NULL DEFAULT '',
val TEXT,
PRIMARY KEY (id));
Create a service table:
CREATE TABLE ids (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY);
Create a trigger which generates the id value:
CREATE TRIGGER tr_bi_data
BEFORE INSERT ON data
FOR EACH ROW
BEGIN
    INSERT INTO ids () VALUES ();
    SET NEW.id = CONCAT('CS', LPAD(LAST_INSERT_ID(), 7, '0'));
    DELETE FROM ids;
END
Create a trigger which prohibits id value changes:
CREATE TRIGGER tr_bu_data
BEFORE UPDATE ON data
FOR EACH ROW
BEGIN
    SET NEW.id = OLD.id;
END
Insert some data and check the result:
INSERT INTO data (val) VALUES ('data-1'), ('data-2');
SELECT * FROM data;
id | val
:-------- | :-----
CS0000001 | data-1
CS0000002 | data-2
Try to update and confirm the id change is prohibited:
UPDATE data SET id = 'CS0000100' WHERE val = 'data-1';
SELECT * FROM data;
id | val
:-------- | :-----
CS0000001 | data-1
CS0000002 | data-2
Insert more data and confirm the enumeration continues:
INSERT INTO data (val) VALUES ('data-3'), ('data-4');
SELECT * FROM data;
id | val
:-------- | :-----
CS0000001 | data-1
CS0000002 | data-2
CS0000003 | data-3
CS0000004 | data-4
Check that the service table is successfully cleared:
SELECT COUNT(*) FROM ids;
| COUNT(*) |
| -------: |
| 0 |
Disadvantages:
An additional table is needed.
Editing a generated id value is disabled (copy and delete the old record instead; a custom value cannot be set).
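For completeness, exercising the trigger from Python could look like this (a sketch; the connection object and driver are assumptions, the table matches the one above):

with connection.cursor() as cursor:
    cursor.execute("INSERT INTO data (val) VALUES (%s)", ('data-5',))
    connection.commit()
    # The BEFORE INSERT trigger has already filled in the padded id.
    cursor.execute("SELECT id, val FROM data WHERE val = %s", ('data-5',))
    print(cursor.fetchone())  # e.g. ('CS0000005', 'data-5')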

How can I get Zapier MySQL "update row" to update a value in a row with a Null value?

I have created a Zapier Zap to populate data from a SmartSheet to a MySQL database. I have it branching so if the row does not already exist in MySQL a new row is created. This part works fine.
In my second branch, if the row already exists, the data in the row is updated with new data from the SmartSheet row. When existing data is replaced with new data, the Zap works fine. For example, given this existing MySQL row:
+--------+---------------+--------------------+
| row_id | email_comment | smartsheet_orig_id |
+--------+---------------+--------------------+
| 895 | easy | 6876364645150921 |
+--------+---------------+--------------------+
If the user replaces the comment with another in the SmartSheet, the MySQL data is updated successfully, e.g.:
+--------+---------------+--------------------+
| row_id | email_comment | smartsheet_orig_id |
+--------+---------------+--------------------+
| 895 | difficult | 6876364645150921 |
+--------+---------------+--------------------+
But if the user has deleted the comment in SmartSheet and not replaced it with another, leaving the comment empty, the data is not removed from the corresponding MySQL record, e.g.:
+--------+---------------+--------------------+
| row_id | email_comment | smartsheet_orig_id |
+--------+---------------+--------------------+
| 895 | difficult | 6876364645150921 |
+--------+---------------+--------------------+
What I need the MySQL record to look like in this case would be:
+--------+---------------+--------------------+
| row_id | email_comment | smartsheet_orig_id |
+--------+---------------+--------------------+
| 895 | | 6876364645150921 |
+--------+---------------+--------------------+
After quite a lot of testing, and a conversation with Zapier support, it appears the problem is that Null values are removed from the output of the Zapier Code step. So, in the above case, this is a summary of what I'm expecting to happen:
Zapier Code step: email_comment = Null --> MySQL Update Row step: email_comment = Null
But at the output of the Code step my Null value for email_comment is stripped, so the MySQL Update Row step interprets the record as not needing to be updated (there is no change) and leaves the old value there.
I have tried, in my code, passing an empty string " " instead of a Null, but I get the exact same result. The only way around it I can see is to pass some placeholder character and then replace it with a Null in the Update Row step, but I can't see a way of doing that in Zapier.
I have searched Google and here for others wrestling with this issue but have drawn a complete blank. The search strings I have been using are [Zapier] delete data, [Zapier] remove data and [Zapier] Null. None of the results of those searches seem to deal with the issue I am having.
This is the Python code I'm using to gather the inputs from SmartSheet:
#for a non existent input store an empty value
def gather_vals(inp):
    return input_data.get(inp, emptyInput)

def pull_inputs(inputs, vinputs):
    for key, value in zip(vinputs, inputs):
        v = gather_vals(value)
        d_inputs.update({key: v})

x_vinputs = ['input_equipment', 'input_from', 'input_to', 'input_description', 'input_contractor', 'input_booked', 'input_confirmed', 'input_job_no', 'input_complete', 'input_est_val', 'input_inv_val', 'input_inv_no', 'input_book', 'input_update', 'input_comments_email']
x_inputs = ['equipment', 'from', 'to', 'description', 'contractor', 'booked', 'confirmed', 'job_no', 'complete', 'est_val', 'inv_val', 'inv_no', 'book', 'update', 'comments_email']

# Gather rest of inputs
emptyInput = None
d_inputs = {}

#gather pick-up/delivery date/time input data
pull_inputs(x_inputs, x_vinputs)
results.update(d_inputs)
return results
It appears that the code works, it returns no errors and when there is an updated actual value in SmartSheet it is updated in MySQL but when the comment is deleted the old value is left in MySQL.
I'm hoping someone may have a suggestion for me to follow.
This is the Zap flow (screenshots omitted): Zapier support tells me the problem is that Nulls are being stripped from the output of the Python code step. The Nulls need to flow through to the Update Row step.
Manually entering NULL, Null or null in the Update Row step results in that literal string of characters being sent to MySQL, as the output from MySQL Workbench for that record shows. Sending an empty string results in a string with quotation marks being sent to MySQL.
It appears this Zapier step will only send strings to MySQL so I guess it is a moot point that the code step strips NULLs from the output.

Insert big list into Cassandra using python

I have a problem inserting a big list into Cassandra using Python. I have a list of 3200 strings that I want to save in Cassandra:
CREATE TABLE IF NOT EXISTS my_test (
    id bigint PRIMARY KEY,
    list_strings list<text>
);
When I reduce my list, I have no problem; it works.
prepared_statement = session.prepare("INSERT INTO my_test (id, list_strings) VALUES (?, ?)")
session.execute(prepared_statement, [id, strings[:5]])
But if I keep the whole list, I get an error:
Error from server: code=1500 [Replica(s) failed to execute write] message="Operation failed - received 0 responses and 1 failures" info={'required_responses': 1, 'consistency': 'LOCAL_ONE', 'received_responses': 0, 'failures': 1}
How can I insert a big list into Cassandra?
A DB array type is not supposed to hold that amount of data. Using a separate table row to store each string would be better:
id        | time       | strings
----------+------------+---------
bigint    | timestamp  | string
partition | clustering |
Using id as the clustering key would be a bad solution: requesting all the strings for a user id would then require reads across multiple nodes, whereas using it as the partition key requires reading from only one node per user.
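A minimal sketch of that layout, assuming the same session and the id and strings variables from the question; the table name is illustrative, and a timeuuid clustering column is used so that strings inserted within the same millisecond do not collide:

import uuid

session.execute("""
    CREATE TABLE IF NOT EXISTS my_test_rows (
        id bigint,
        inserted_at timeuuid,
        string text,
        PRIMARY KEY (id, inserted_at)
    )
""")

prepared = session.prepare(
    "INSERT INTO my_test_rows (id, inserted_at, string) VALUES (?, ?, ?)"
)
for s in strings:
    # uuid1() is a time-based (type 1) UUID, which Cassandra accepts for timeuuid.
    session.execute(prepared, (id, uuid.uuid1(), s))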
