Why is COUNT returning the wrong number?

I'm very new to programming and trying to figure out what I'm doing wrong. I have a database with two tables. One is called "addresses" and the other is called "tablePlayers". I'm trying to count the number of times a specific person's name appears in the "winner" column and then store that count in the "W" column of the table "tablePlayers", on the row with that same person's name.
Here's the code I'm using:
c.execute("UPDATE tablePlayers SET W = COUNT(winner) FROM addresses WHERE winner ='Mika'")
Here's what the tables look like in DB Browser for SQLite. As you can see, "Mika" only appears once under the "winner" column. But the count says 6 in the other table, and it is only written on one row, which is not the one with the matching name.
[Screenshots: addresses table, tablePlayers table]

I can't tell you why exactly it is going wrong, but I would recommend that you use parentheses and a SELECT statement to build a nested query. For example, your query could rather look like this:
UPDATE tablePlayers SET W = (SELECT COUNT(winner) FROM addresses WHERE winner ='Mika') WHERE Name='Mika'
You could also do the more general case and do it for all names at once:
UPDATE tablePlayers SET W = (SELECT COUNT(winner) FROM addresses WHERE winner=tablePlayers.Name)
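A minimal sketch of the general query run from Python with sqlite3. The two-column schema and the sample names other than "Mika" are assumptions for illustration; the real tables have more columns.

```python
import sqlite3

# In-memory database with a simplified, assumed schema.
conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE addresses (winner TEXT)")
c.execute("CREATE TABLE tablePlayers (Name TEXT, W INTEGER DEFAULT 0)")

# Hypothetical sample data: Mika wins once, Anna twice, Tom never.
c.executemany("INSERT INTO addresses (winner) VALUES (?)",
              [("Mika",), ("Anna",), ("Anna",)])
c.executemany("INSERT INTO tablePlayers (Name) VALUES (?)",
              [("Mika",), ("Anna",), ("Tom",)])

# The correlated subquery is evaluated once per tablePlayers row,
# so every player gets their own count.
c.execute(
    "UPDATE tablePlayers "
    "SET W = (SELECT COUNT(winner) FROM addresses "
    "WHERE winner = tablePlayers.Name)"
)
conn.commit()

wins = dict(c.execute("SELECT Name, W FROM tablePlayers"))
print(wins)
```

Players with no wins get 0 rather than NULL, because COUNT over an empty set returns 0.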

Related

pyodbc join tables with equal named columns ("upsert") [duplicate]

I need to write an SQL query for MS-Access 2000 so that a row is updated if it exists, but inserted if it does not. (I believe this is called an "upsert")
i.e.
If row exists...
UPDATE Table1 SET (...) WHERE Column1='SomeValue'
If it does not exist...
INSERT INTO Table1 VALUES (...)
Can this be done in one query?

You can simulate an upsert in Access by using an UPDATE query with a LEFT JOIN.
update b
left join a on b.id=a.id
set a.f1=b.f1
, a.f2=b.f2
, a.f3=b.f3

Assuming a unique index on Column1, you can use a DCount expression to determine whether you have zero or one row with Column1 = 'SomeValue'. Then INSERT or UPDATE based on that count.
If DCount("*", "Table1", "Column1 = 'SomeValue'") = 0 Then
Debug.Print "do INSERT"
Else
Debug.Print "do UPDATE"
End If
I prefer this approach to first attempting an INSERT, trapping the 3022 key violation error, and doing an UPDATE in response to the error. However I can't claim huge benefits from my approach. If your table includes an autonumber field, avoiding a failed INSERT would stop you from expending the next autonumber value needlessly. I can also avoid building an INSERT string when it's not needed. The Access Cookbook told me string concatenation is a moderately expensive operation in VBA, so I look for opportunities to avoid building strings unless they're actually needed. This approach will also avoid creating a lock for an unneeded INSERT.
However, none of those reasons may be very compelling for you. And in all honesty I think my preference in this case may be about what "feels right" to me. I agree with this comment by @David-W-Fenton to a previous Stack Overflow question: "It's better to write your SQL so you don't attempt to append values that already exist -- i.e., prevent the error from happening in the first place rather than depending on the database engine to save you from yourself."

An "upsert" is possible, if the tables have a unique key.
This old tip from Smart Access is one of my favourites:
Update and Append Records with One Query
By Alan Biggs
Did you know that you can use an update query in Access to both update
and add records at the same time? This is useful if you have two
versions of a table, tblOld and tblNew, and you want to integrate the
changes from tblNew into tblOld.
Follow these steps:
1. Create an update query and add the two tables. Join the two tables by dragging the key field of tblNew onto the matching field of tblOld.
2. Double-click on the relationship and choose the join option that includes all records from tblNew and only those that match from tblOld.
3. Select all the fields from tblOld and drag them onto the QBE grid.
4. For each field, in the Update To cell type in tblNew.FieldName, where FieldName matches the field name of tblOld.
5. Select Query Properties from the View menu and change Unique Records to False. (This switches off the DISTINCTROW option in the SQL view. If you leave this on you'll get only one blank record in your results, but you want one blank record for each new record to be added to tblOld.)
6. Run the query and you'll see the changes to tblNew are now in tblOld.
This will only add records to tblOld that have been added to tblNew. Records in tblOld that aren't present in tblNew will still remain in tblOld.

I usually run the insert statement first and then I check to see if error 3022 occurred, which indicates the row already exists. So something like this:
On Error Resume Next
CurrentDb.Execute "INSERT INTO Table1 (Fields) VALUES (Data)", dbFailOnError
If Err.Number = 3022 Then
Err.Clear
CurrentDb.Execute "UPDATE Table1 SET (Fields = Values) WHERE Column1 = 'SomeValue'", dbFailOnError
ElseIf Err.Number <> 0 Then
'Handle the error here
Err.Clear
End If
Edit1:
I want to mention that what I've posted here is a very common solution but you should be aware that planning on errors and using them as part of the normal flow of your program is generally considered a bad idea, especially if there are other ways of achieving the same results. Thanks to RolandTumble for pointing this out.

You don't need to catch the error. Instead, just run the INSERT statement and then check
CurrentDb.RecordsAffected
It will either be 1 or 0, depending.
Note: It's not good practice to execute against CurrentDB. Better to capture the database to a local variable:
Dim db As DAO.Database
Set db = CurrentDb
db.Execute(INSERT...)
If db.RecordsAffected = 0 Then
db.Execute(UPDATE...)
End If
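For readers coming from Python, the same "insert first, then check the affected-row count" idea can be sketched with the standard sqlite3 module. This is an analogue and not Access code: SQLite stands in for Jet, INSERT OR IGNORE plays the role of the silently failing INSERT, cursor.rowcount plays the role of RecordsAffected, and Column2 is a hypothetical payload column.

```python
import sqlite3

def upsert(conn, key, value):
    """Insert the row; if the key already exists, update it instead."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO Table1 (Column1, Column2) VALUES (?, ?)",
        (key, value),
    )
    if cur.rowcount == 0:
        # Nothing was inserted, so the key already exists: update it.
        conn.execute(
            "UPDATE Table1 SET Column2 = ? WHERE Column1 = ?",
            (value, key),
        )
        return "updated"
    return "inserted"

conn = sqlite3.connect(":memory:")
# The unique key on Column1 is what makes the upsert well defined.
conn.execute("CREATE TABLE Table1 (Column1 TEXT PRIMARY KEY, Column2 TEXT)")

first = upsert(conn, "SomeValue", "a")
second = upsert(conn, "SomeValue", "b")
stored = conn.execute(
    "SELECT Column2 FROM Table1 WHERE Column1 = ?", ("SomeValue",)
).fetchone()[0]
print(first, second, stored)
```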

As others have mentioned, you can UPSERT with an UPDATE LEFT JOIN using the new table as the left-hand side. This will add all missing records and update matching records, leaving deleted records intact.
If we follow the "Create and run an update query" article, we will end up with SQL that looks like this:
UPDATE Table1
INNER JOIN NewTable1 ON Table1.ID = NewTable1.ID
SET Table1.FirstName = [NewTable1].[FirstName]
But an inner join will only update matching records; it won't add new records. So let's put the new table on the left-hand side and change that INNER to a LEFT:
UPDATE NewTable1
LEFT JOIN Table1 ON NewTable1.ID = Table1.ID
SET Table1.FirstName = [NewTable1].[FirstName]
Now save a copy of the DB. Run a test on the copy before you run this on your primary DB.

Row sorting and selection logic in Python on Sqlite db

Hello, and thanks for taking the time to go through my question. I work in the budget space for a small city, and during these precarious times I am learning some Python that may help me with financial data modelling in the future. We currently use SAP, but I also wanted to learn a new language.
I need some pointers on where to look for certain answers.
For example, I made a database with a few million records, sorted by date and time. I was able to strip off the data I did not need and now have a clean database to work on.
At a high level, I want to know whether, based on the first record in a day, there is another entry on the same day whose value is at least double that of the first record.
Date|time|dept|Value1
01/01/2019|11:00|BUD|51.00
01/01/2019|11:30|CSD|101.00
01/01/2019|11:50|BUD|102.00
01/02/2019|10:00|BUD|200.00
01/02/2019|10:31|BUD|201.00
01/02/2019|11:51|POL|400.00
01/03/2019|11:00|BUD|100.00
01/03/2019|11:30|PWD|101.00
01/03/2019|11:50|BUD|110.00
Based on the data above and the requirement, I want to get an output of:
Date|time|dept|Value|Start Value
01/01/2019|11:50|BUD|102.00|51.00
01/02/2019|11:51|POL|400.00|200.00
01/03/2019|NONE|NONE|NONE|100.00
On day 3, there were no values that were at least double, so we have NONE or null.
What I have done so far:
1. I have been able to connect to the database [Python]
2. I was able to strip off the unnecessary information and depts from the database [SQLite]
3. I have been able to create new tables for the result [Python]
Questions / best Practices
How do I get the first line per day? Do I start off with a variable before the loop that is assigned to Jan 1, 2019, then pick the row number and store it in another table, or what other options do we have here?
Once the first row per day is stored in another table or an array, how do I get the first occurrence of a value that is at least twice the value of the first line?
ex? begin meta code***********
Start from Line 1 to end
table2.date[] Should be equal to 01/01/2019
table2.value[] Should be equal to 51.00
look through each line if date = table2.date and value >= 2* (table2.value[])
*if successful, get record line number and department and value and store in new table
else
goto next line
Then increase table2.date and table2.value by 1 and do the loop again.
end meta code*****************
Is this the right approach? I feel like going through millions of records for each date change is not very optimized.
I can probably add a condition to exit if the date is not equal to table2.date[1], but I am still not sure if this is the right way to approach this problem. This will be run only once or twice a year, so system performance is not that important, but I would still like to approach it the right way.
Should I export the final data to Excel for analysis, or are there good analysis and modelling tools in Python? What would the professionals recommend?

You could use exists to check whether another record exists on the same day with a value that is at least twice as large, and a window function to keep only the first record per day:
select *
from (
    select
        t.*,
        row_number() over(partition by date order by time) rn
    from mytable t
) t
where rn = 1
  and exists (
    select 1 from mytable t1 where t1.date = t.date and t1.value >= 2 * t.value
  )
In versions of SQLite where row_number() is not available (before 3.25), another option is to filter with a correlated subquery:
select t.*
from mytable t
where
    exists(select 1 from mytable t1 where t1.date = t.date and t1.value >= 2 * t.value)
    and t.time = (select min(t1.time) from mytable t1 where t1.date = t.date)
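The queries above return the first record of each day that has a match. To also return the matching row itself, and NONE where there is none, as in the desired output of the question, one possible sketch is a self LEFT JOIN run from Python. It assumes the column names date, time, dept and value, that times are zero-padded HH:MM strings, and that no two rows on the same day share a time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (date TEXT, time TEXT, dept TEXT, value REAL)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        ("01/01/2019", "11:00", "BUD", 51.00),
        ("01/01/2019", "11:30", "CSD", 101.00),
        ("01/01/2019", "11:50", "BUD", 102.00),
        ("01/02/2019", "10:00", "BUD", 200.00),
        ("01/02/2019", "10:31", "BUD", 201.00),
        ("01/02/2019", "11:51", "POL", 400.00),
        ("01/03/2019", "11:00", "BUD", 100.00),
        ("01/03/2019", "11:30", "PWD", 101.00),
        ("01/03/2019", "11:50", "BUD", 110.00),
    ],
)

# f = first record of each day; m = earliest same-day record whose value
# is at least double f.value (all NULL columns when no such record exists).
query = """
SELECT f.date, m.time, m.dept, m.value, f.value AS start_value
FROM mytable AS f
LEFT JOIN mytable AS m
       ON m.date = f.date
      AND m.value >= 2 * f.value
      AND m.time = (SELECT MIN(x.time) FROM mytable AS x
                    WHERE x.date = f.date AND x.value >= 2 * f.value)
WHERE f.time = (SELECT MIN(y.time) FROM mytable AS y WHERE y.date = f.date)
ORDER BY f.date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With a few million rows, an index on (date, time) would likely help the correlated subqueries, though that has not been measured here.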

You could do it that way, but you're correct, it would take a long time. I don't know if SQLite has the capabilities to do what you want effectively, but I know Python does. It sounds like you might want to use the Python Data Analysis Library, Pandas. You can find out how to get your SQLite into Pandas here:
How to open and convert sqlite database to pandas dataframe
Once you have it in a Pandas Dataframe, there are tons of functions to get the first occurrence of something, find duplicates, find unique values, and even generate other dataframes with only unique values.

Delete first row from SQLITE table in python

It's a simple question: how can I delete just the first row from a table without having to give a search criterion?
Normally it is:
c.execute('DELETE FROM name_table WHERE tada=?', (tadida,))
I just want to delete the first row, without the WHERE part. The reason is that I want to create a FIFO table (or stack): add at the bottom and delete from the top.
I can do this by keeping track of the time and date, or by giving the rows an ID. But I would prefer the described method.
Thanks.

I just want to delete first row
SQL tables have no inherent ordering, so there is no defined concept of first row, unless a column (or a set of columns) is specified for ordering.
Assuming that you do have an ordering column, say id, you can use limit to restrict which row should be deleted:
delete from mytable order by id limit 1
This removes the record that has the smallest id from the table.

Unless you use a custom version of sqlite, you can't use ORDER BY or LIMIT with DELETE.
If your version of sqlite wasn't built with that option (some OS-distributed ones are, some aren't), and building and installing a copy with it is beyond your comfort level, here is an alternative. It assumes a column named id is used for ordering, with the smallest value of id being the oldest record:
DELETE FROM yourtable WHERE id = (SELECT min(id) FROM yourtable);
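A minimal sketch of the FIFO table from the question, built on the same smallest-id idea. The payload column and the push and pop helper names are hypothetical. Here pop reads the oldest row first so it can return its contents, then deletes it by id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# An INTEGER PRIMARY KEY gets an increasing id on insert,
# which gives the table a defined "first row".
conn.execute("CREATE TABLE yourtable (id INTEGER PRIMARY KEY, payload TEXT)")

def push(conn, payload):
    """Add a row at the 'bottom' of the queue."""
    conn.execute("INSERT INTO yourtable (payload) VALUES (?)", (payload,))

def pop(conn):
    """Remove and return the oldest payload, or None if the queue is empty."""
    row = conn.execute(
        "SELECT id, payload FROM yourtable ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("DELETE FROM yourtable WHERE id = ?", (row[0],))
    return row[1]

for item in ("a", "b", "c"):
    push(conn, item)

popped = [pop(conn), pop(conn)]
remaining = [r[0] for r in conn.execute("SELECT payload FROM yourtable ORDER BY id")]
print(popped, remaining)
```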

Postgres: autogenerate primary key in postgres using python

cursor.execute('UPDATE emp SET name = %(name)s',{"name": name} where ?)
I don't understand how to get the primary key of a particular record.
I have N records in the DB. I want to access those records and manipulate them.
Through a SELECT query I got all the records, but I want to update each of them accordingly.
Can someone lend a helping hand?
Thanks in advance!
Table structure:
ID CustomerName ContactName
1 Alfreds Futterkiste
2 Ana Trujillo
Here ID is auto-generated by the system in Postgres.
I am accessing the CustomerName of two records and updating it. When I update those records, the last update also overwrites the first record.
I want to set a condition so that the update query only affects the intended record.
Table structure after the update:
ID CustomerName ContactName
1 xyz Futterkiste
2 xyz Trujillo
Here I want to set the first record to 'abc' and the 2nd record to 'xyz'.
Note: It will be done using the PK, but I don't know how to get that PK.

You mean you want to use the UPDATE SQL command with a WHERE clause:
cursor.execute("UPDATE emp SET CustomerName='abc' WHERE ID=1")
cursor.execute("UPDATE emp SET CustomerName='xyz' WHERE ID=2")
This way you will UPDATE rows with specific IDs.
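If the IDs are not known in advance, they can be read with a SELECT and passed back as parameters. The sketch below uses the standard sqlite3 module as a stand-in so that it runs without a server; with psycopg2 against Postgres the placeholder would be %s instead of ?. The new_names mapping is a hypothetical example of per-row values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (ID INTEGER PRIMARY KEY, CustomerName TEXT, ContactName TEXT)"
)
conn.executemany(
    "INSERT INTO emp (CustomerName, ContactName) VALUES (?, ?)",
    [("Alfreds", "Futterkiste"), ("Ana", "Trujillo")],
)

# Hypothetical mapping from an existing column to the new name.
new_names = {"Futterkiste": "abc", "Trujillo": "xyz"}

# Fetch the primary key together with the data, then update each row
# by its own ID so that no other row is touched.
for emp_id, contact in conn.execute("SELECT ID, ContactName FROM emp").fetchall():
    conn.execute(
        "UPDATE emp SET CustomerName = ? WHERE ID = ?",
        (new_names[contact], emp_id),
    )
conn.commit()

rows = conn.execute(
    "SELECT ID, CustomerName, ContactName FROM emp ORDER BY ID"
).fetchall()
print(rows)
```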

Maybe you won't like this, but you should not use autogenerated keys in general. The only exception is when you want to insert some rows and do not do anything else with them. The proper solution is this:
1. Create a sequence for your table. http://www.postgresql.org/docs/9.4/static/sql-createsequence.html
2. Whenever you need to insert a new row, get the next value from the generator (using select nextval('generator_name')). This way you will know the ID before you create the row.
3. Then insert your row by specifying the id value explicitly.
For the updates:
You can create unique constraints (or unique indexes) on sets of columns that are known to be unique.
But you should identify the rows with the identifiers internally.
When referring to records in other tables, use the identifiers, and create foreign key constraints. (Not always, but usually this is good practice.)
Now, when you need to update a row (for example: a customer), you should already know which customer needs to be modified. Because all records are identified by the primary key id, you should already know the id for that row. If you don't know it, but you have a unique index on a set of fields, then you can try to get the id. For example:
select id from emp where CustomerName='abc' -- but only if you have a unique constraint on CustomerName!
In general, if you want to update a single row, then you should NEVER update this way:
update emp set CustomerName='newname' where CustomerName='abc'
even if you have a unique constraint on CustomerName. The explanation is not easy, and won't fit here. But think about this: you may be sending changes in a transaction block, and there can be many open transactions at the same time...
Of course, it is fine to update rows this way if your intention is to update all rows that satisfy your condition.

sqlite tracking IDs - find missing integers in a seq

First I am not even sure whether I am asking the right question, sorry for that. SQL is new to me. I have a table I create in SQLITE like this:
CREATE TABLE ENTRIES (ID INTEGER PRIMARY KEY AUTOINCREMENT, DATA BLOB NOT NULL)
Which is all fine if I have only additions for entries. If I create entries, they increment. Let us say I added 7 entries. Now I delete 3 entries:
DELETE FROM NODES WHERE ID = 3
DELETE FROM NODES WHERE ID = 4
DELETE FROM NODES WHERE ID = 5
Entries I now have are:
1,2,6,7.
The next time I add an entry it will have ID=8.
So, my question is:
How do I get the next 3 entries to receive the IDs 3, 4 and 5, so that only the 4th new entry gets 8? I realize this is similar to "SQL: find missing IDs in a table", and it is maybe also a general programming (not just SQL) problem. So, I would be happy to see some Python and SQLite solutions.
Thanks,
Oz

I don't think that's the way auto-incrementing fields work. SQLite keeps a counter of the last used integer. It will never 'fill in' the deleted values. If you want to get the next 3 rows after an id, you could:
SELECT * FROM NODES WHERE ID > 2 LIMIT 3;
This will give you the next three rows with an id greater than 2
Additionally you could just create a deleted flag or something? so the rows are never actually removed from your database.

You can't. SQLite will never re-use deleted IDs, for database integrity reasons. Let's assume you have a second table which has a foreign key which references the first table. If, for some reason, a corresponding row is removed without removing the rows which reference it (using the primary ID) as well, it will point to the wrong row.
Example: If you remove a person record without removing the purchases as well, the purchase records will point to the wrong person once you re-assign the old ID.
─────────────────────   ────────────────────
 Table 1 – customers     Table 2 – purchase
─────────────────────   ────────────────────
 *ID  <───────┐          *ID
  Name        │           Item
  Address     └───────    Customer
  Phone                   Price
This is why pretty much any database engine out there assigns primary IDs in strictly increasing order. They are database internals, and you usually shouldn't touch them. If you need to assign your own IDs, just add a separate column (think twice before doing so).
If you want to keep track of the number of rows, you can query it like this: SELECT Count(*) FROM table_name.
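If gap filling is still wanted despite the caveats above, SQLite does accept an explicit ID on INSERT, so the lowest free ID can be computed and supplied by hand. The following is only a sketch of that idea: lowest_free_id is a hypothetical helper, the table name NODES is taken from the DELETE statements in the question, and it is only safe when nothing else still references the deleted IDs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE NODES (ID INTEGER PRIMARY KEY AUTOINCREMENT, DATA BLOB NOT NULL)"
)
conn.executemany("INSERT INTO NODES (DATA) VALUES (?)", [(b"x",)] * 7)
conn.execute("DELETE FROM NODES WHERE ID IN (3, 4, 5)")  # leaves 1, 2, 6, 7

def lowest_free_id(conn):
    """Return the smallest positive ID not currently used in NODES."""
    # Candidates are 1 and every existing ID + 1; the smallest candidate
    # that is not itself an existing ID is the first gap (or max + 1).
    return conn.execute(
        """
        SELECT MIN(candidate) FROM (
            SELECT 1 AS candidate
            UNION ALL
            SELECT ID + 1 FROM NODES
        )
        WHERE candidate NOT IN (SELECT ID FROM NODES)
        """
    ).fetchone()[0]

new_ids = []
for _ in range(4):
    free_id = lowest_free_id(conn)
    # Supplying ID explicitly bypasses the automatic numbering.
    conn.execute("INSERT INTO NODES (ID, DATA) VALUES (?, ?)", (free_id, b"y"))
    new_ids.append(free_id)
print(new_ids)
```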
