How can I get the result of the following SQL query using pandas?
SELECT site_id, count(issue) FROM [Randall]
where site_id >3
group by site_id
LIMIT 10
My current attempt is below; however, when executed it has two 'issue' columns, one for the actual issue and one for the count, and the issues repeat. What I want is to count the issues per site.
w_alarms.groupby(['site_id', 'issue']).size()
Something like:
w_alarms[w_alarms.site_id > 3].groupby('site_id')['issue'].count()
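A quick self-contained check of that pattern, with made-up data (the real w_alarms comes from elsewhere; the values here are invented for illustration):

```python
import pandas as pd

# Hypothetical alarm data; column names are from the question, values are made up
w_alarms = pd.DataFrame({
    "site_id": [1, 4, 4, 5, 5, 5],
    "issue":   ["a", "b", "c", "a", "a", "b"],
})

# Filter first, then count issues per site
# (SQL: WHERE site_id > 3 GROUP BY site_id)
counts = w_alarms[w_alarms.site_id > 3].groupby("site_id")["issue"].count()
print(counts)  # site_id 4 -> 2, site_id 5 -> 3
```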
Try
w_alarms.site_id[w_alarms.site_id > 3].value_counts().head(10)
You didn't provide an example of the output you desire, but you might want this:
w_alarms.issue.groupby(w_alarms.site_id[w_alarms.site_id > 3]).count()
I have a Django class getting some order data from a PostgreSQL database. I need to get the number of rows found in a query.
I'm trying to get the row count for the following query using COUNT(*):
When I print the result from the query above, I get a lot of data:
I only want to get a single number, the count of the total rows found and loaded by the select query above. How do I achieve this?
Keep in mind that I'm pretty new to SQL, so I might be missing something obvious to you.
Thanks!
COUNT(expr) will return a count of the number of non-null values of expr in the rows that are retrieved by the SELECT.
In your case, you're grouping rows together, so it returns a count for each grouped result row.
You'll probably get the result you're looking for by wrapping it in a subselect.
For example:
SELECT COUNT(*)
FROM (SELECT o.* FROM your_table o ....) AS sub
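To illustrate the wrap-in-a-subselect idea end to end, here is a small sketch using an in-memory SQLite database (the question uses PostgreSQL, but the pattern is the same; the table and data are invented):

```python
import sqlite3

# Illustration only: in-memory SQLite stands in for the real PostgreSQL DB
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, "open"), (2, "open"), (3, "closed")])

# Wrap the original grouped query in a subselect and count its rows,
# yielding a single number instead of one count per group
(n,) = conn.execute(
    "SELECT COUNT(*) FROM (SELECT status FROM orders GROUP BY status) AS sub"
).fetchone()
print(n)  # 2 groups
```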
I have a Python program I am trying to convert from CSV to SQLite. I have managed to do everything apart from removing duplicates when counting entries. My database is JOINed. I'm reading the database like this:
df = pd.read_sql_query("SELECT d.id AS is, mac.add AS mac etc etc
I have tried df.drop_duplicates('tablename1','tablename2')
and
df.drop_duplicates('row[1],row[3]')
but it doesn't seem to work.
The code below is what I used with the CSV version, and I would like to replicate it for the Python SQLite script.
for row in reader:
    key = (row[1], row[2])
    if key not in entries:
        writer.writerow(row)
        entries.add(key)
del writer
Have you tried running SELECT DISTINCT col1, col2 FROM table first?
In your case it might be as simple as placing the DISTINCT keyword prior to your column names.
You need to use the subset parameter
df.drop_duplicates(subset=['tablename1','tablename2'])
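A minimal sketch of the subset parameter in action, with a small made-up frame (in the question the columns come from the SQL JOIN):

```python
import pandas as pd

# Invented data standing in for the JOINed query result
df = pd.DataFrame({
    "tablename1": [1, 1, 2],
    "tablename2": ["x", "x", "y"],
    "other":      [10, 20, 30],
})

# Keep only the first row for each (tablename1, tablename2) pair
deduped = df.drop_duplicates(subset=["tablename1", "tablename2"])
print(len(deduped))  # 2 rows remain
```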
Thank you piRSquared, the missing subset parameter is all I needed.
Will also look into SELECT DISTINCT but for now, subset works.
This seems like it should be obvious but I can't get it to work.
I'm using django to query a simple table into which a bunch of data gets inserted periodically. I'd like to write a view that only pulls data with the latest timestamp, something like
select * from mytable
where event_time = (
select max(event_time) from mytable);
What's the proper syntax for that?
You can try the latest() method. Here is the documentation:
MyModel.objects.latest('event_time')
Assuming event_time holds a date value, this will return the latest object (the one with the max date).
You can try:
Assuming your model is named EventTime:
EventTime.objects.all().order_by('-timestamp')[0]
[0] at the end will give you the first result.
Remove it and you will have all entries ordered by timestamp in descending order.
EDIT: A better approach suggested by @Gocht would be:
EventTime.objects.all().order_by('-timestamp').first()
This will handle the scenario when there is no object present in the database.
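If you want to sanity-check the raw SQL from the question itself, here is a minimal sketch with an in-memory SQLite table (table and column names are from the question; the data is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (event_time TEXT, payload TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", [
    ("2024-01-01", "old"),
    ("2024-01-03", "latest"),
    ("2024-01-02", "middle"),
])

# Only the row(s) matching the maximum event_time come back
rows = conn.execute(
    "SELECT * FROM mytable "
    "WHERE event_time = (SELECT MAX(event_time) FROM mytable)"
).fetchall()
print(rows)  # [('2024-01-03', 'latest')]
```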
How would I do the following query?
SELECT * FROM title WHERE id LIKE '12345%'
What I currently have is:
Title.objects.get(id='12345')
Which obviously doesn't do the LIKE '12345%' match (and icontains matches anywhere in the string). What would be the correct query here?
Title.objects.filter(id__startswith='12345')
https://docs.djangoproject.com/en/dev/ref/models/querysets/
You can do it like this, where code is the prefix you want to filter the table by:
code = '12345'
Title.objects.extra(where=["id LIKE %s || '%%'"], params=[code])
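The id__startswith lookup maps to a SQL prefix match with LIKE; a quick check of that SQL with an in-memory SQLite table (table and data invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE title (id TEXT, name TEXT)")
conn.executemany("INSERT INTO title VALUES (?, ?)",
                 [("123456", "a"), ("12345", "b"), ("99999", "c")])

# Parameterized prefix match: id LIKE '12345%'
code = "12345"
rows = conn.execute("SELECT id FROM title WHERE id LIKE ? || '%'",
                    (code,)).fetchall()
print(rows)  # ids '123456' and '12345' match; '99999' does not
```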
I need to get the last X rows from a table, but in order of the ID. How could I achieve this?
query = users.select().order_by(users.c.id.desc()).limit(5)
print(list(reversed(conn.execute(query).fetchall())))
something like that anyway
This worked for me...
c=session.query(a).order_by(a.id.desc()).limit(2)
c=c[::-1]
This solution is 10 times faster than the python slicing solution proposed by BrendanSimon.
I believe you can prefix the order_by parameter with a '-' to get reverse order.
query = users.select().order_by(-users.c.id).limit(5)
Also, I believe you can use python slices as an alternative to limit.
query = users.select().order_by(users.c.id)[-5:]
query = users.select().order_by(-users.c.id)[:5]
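If you'd rather do the reordering in SQL instead of Python, the usual pattern is a subquery: take the last N rows by descending id, then re-sort them ascending. A sketch with plain sqlite3 (table name and data are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)",
                 [("u%d" % i,) for i in range(1, 9)])  # ids 1..8

# Inner query grabs the last 5 ids; outer query restores ascending order
rows = conn.execute(
    "SELECT id, name FROM "
    "(SELECT id, name FROM users ORDER BY id DESC LIMIT 5) "
    "ORDER BY id ASC"
).fetchall()
print([r[0] for r in rows])  # [4, 5, 6, 7, 8]
```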