I have this query
SELECT *, COUNT(app.id) AS totalApps FROM users JOIN app ON app.id = users.id
GROUP BY app.id ORDER BY app.time DESC LIMIT ?
which is supposed to get all results from "users" ordered by another column (time) in a related table (the id from the app tables references the id from the users table).
The issue I have is that the grouping is done before the ordering by date, so I get very old results. But I need the grouping in order to get distinct users, because each user can have multiple 'apps'... Is there a different way to achieve this?
Table users:
id TEXT PRIMARY KEY
Table app:
id TEXT
time DATETIME
FOREIGN KEY(id) REFERENCES users(id)
in my SELECT query I want to get a list of users, ordered by the app.time column. But because one user can have multiple app records associated, I could get duplicate users, that's why I used GROUP BY. But then the order is messed up
The underlying issue is that the SELECT is an aggregate query as it contains a GROUP BY clause :-
There are two types of simple SELECT statement - aggregate and
non-aggregate queries. A simple SELECT statement is an aggregate query
if it contains either a GROUP BY clause or one or more aggregate
functions in the result-set.
SQL As Understood By SQLite - SELECT
And thus that the column's value for that group, will be an arbitrary value the column of that group (first according to scan/search, I suspect, hence the lower values) :-
If the SELECT statement is an aggregate query without a GROUP BY
clause, then each aggregate expression in the result-set is evaluated
once across the entire dataset. Each non-aggregate expression in the
result-set is evaluated once for an arbitrarily selected row of the
dataset. The same arbitrarily selected row is used for each
non-aggregate expression. Or, if the dataset contains zero rows, then
each non-aggregate expression is evaluated against a row consisting
entirely of NULL values.
So in short you cannot rely upon the column values that aren't part of the group/aggregation, when it's an aggregate query.
Therefore have have to retrieve the required values using an aggregate expression, such as max(app.time). However, you can't ORDER by this value (not sure exactly why by it's probably inherrent in the efficiency aspect)
HOWEVER
What you can do is use the query to build a CTE and then sort without aggregates involved.
Consider the following, which I think mimics your problem:-
DROP TABLE IF EXISTS users;
DROP TABLE If EXISTS app;
CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, username TEXT);
INSERT INTO users (username) VALUES ('a'),('b'),('c'),('d');
CREATE TABLE app (the_id INTEGER PRIMARY KEY, id INTEGER, appname TEXT, time TEXT);
INSERT INTO app (id,appname,time) VALUES
(4,'app9',721),(4,'app10',7654),(4,'app11',11),
(3,'app1',1000),(3,'app2',7),
(2,'app3',10),(2,'app4',101),(2,'app5',1),
(1,'app6',15),(1,'app7',7),(1,'app8',212),
(4,'app9',721),(4,'app10',7654),(4,'app11',11),
(3,'app1',1000),(3,'app2',7),
(2,'app3',10),(2,'app4',101),(2,'app5',1),
(1,'app6',15),(1,'app7',7),(1,'app8',212)
;
SELECT * FROM users;
SELECT * FROM app;
SELECT username
,count(app.id)
, max(app.time) AS latest_time
, min(app.time) AS earliest_time
FROM users JOIN app ON users.id = app.id
GROUP BY users.id
ORDER BY max(app.time)
;
This results in :-
Where although the latest time for each group has been extracted the final result hasn't been sorted as you would think.
Wrapping it into a CTE can fix that e.g. :-
WITH cte1 AS
(
SELECT username
,count(app.id)
, max(app.time) AS latest_time
, min(app.time) AS earliest_time
FROM users JOIN app ON users.id = app.id
GROUP BY users.id
)
SELECT * FROM cte1 ORDER BY cast(latest_time AS INTEGER) DESC;
and now :-
Note simple integers have been used instead of real times for my convenience.
Since you need the newest date in every group, you could just MAX them:
SELECT
*,
COUNT(app.id) AS totalApps,
MAX(app.time) AS latestDate
FROM users
JOIN app ON app.id = users.id
GROUP BY app.id
ORDER BY latestDate DESC
LIMIT ?
You could use windowed COUNT:
SELECT *, COUNT(app.id) OVER(PARTITION BY app.id) AS totalApps
FROM users
JOIN app
ON app.id = users.id
ORDER BY app.time DESC
LIMIT ?
Maybe you could use?
SELECT DISTINCT
Read more here: https://www.w3schools.com/sql/sql_distinct.asp
Try to grouping by id and time and then order by time.
select ...
group by app.id desc, app.time
I assume that id is unique in app table.
and how you assign ID to? maybe you have enough to order by id desc
I am working with my lab database and close to complete it. But i am stuck in a query and a few similar queries which all give back the similar results.
Here is the Query in design mode
and this is what it gives out
This query is counting the number of ID values in table PatientTestIDs whereas I want to count the number of unique PatientID values grouped by each department
I have even tried Unique Values and Unique Records properties but all the times it gives the same result.
What you want requires two queries.
Query1:
SELECT DISTINCT PatientID, DepartmentID FROM PatientTestIDs;
Query2:
SELECT Count(*) AS PatientsPerDept, DepartmentID FROM Query1 GROUP BY DepartmentID;
Nested all in one:
SELECT Count(*) AS PatientsPerDept, DepartmentID FROM (SELECT DISTINCT PatientID, DepartmentID FROM PatientTestIDs) AS Query1 GROUP BY DepartmentID;
You can include the Departments table in query 2 (or the nested version) to pull in descriptive fields but will have to include those additional fields in the GROUP BY.
We have data as follows in system
User data
Experience
Education
Job Application
This data will be used across application and there are few logic also attached to these data.
Just to make sure that this data are consistent across application, i thought to create View for the same and get count of these data then use this view at different places.
Now question is, as detail tables does not have relation with each other, how should i create view
Create different view for each table and then use group by
Create one view and write sub query to get these data
From performance perspective, which one is the best approach?
For e.g.
SELECT
UserId,
COUNT(*) AS ExperienceCount,
0 AS EducationCount
FROM User
INNER JOIN Experience ON user_id = User_Id
GROUP BY
UserId
UNION ALL
SELECT
UserId,
0,
COUNT(*)
FROM User
INNER JOIN Education ON user_id = user_id
GROUP BY
UserId
And then group by this to get summary of all these data in one row per user.
One way to write the query that you have specified would probably be:
SELECT UserId, SUM(ExperienceCount), SUM(EducationCount
FROM ((SELECT UserId, COUNT(*) as ExperienceCount, 0 AS EducationCount
FROM Experience
GROUP BY UserId
) UNION ALL
(SELECT UserId, 0, COUNT(*)
GROUP BY UserId
)
) u
GROUP BY UserId;
This can also be written as a FULL JOIN, LEFT JOIN, and using correlated subqueries. Each of these can be appropriate in different circumstances, depending on your data.
I have an SQL table with duplicate records on FacilityID. FacilityKey is unique. I need to select rows that have duplicates on FacilityID but I only want to show one record for each and I want to choose the one with the most recent (highest) FacilityKey. Can anybody help me figure out how to write my query? I've tried everything I could think of and searched the internet for something similar to no avail. All I can find are examples of identifying the duplicate records.
Something like this should work:
select FacilityID, max(FacilityKey)
from Facilities
group by FacilityID
having count(FacilityID)>1
And then if you want to get all of the fields, something like this:
select *
from facilities
inner join (
select FacilityID, max(FacilityKey) as maxkey
from Facilities
group by FacilityID
having count(FacilityID)>1
) t on t.FacilityID = facilities.FacilityID and t.maxkey=facilities.FacilityKey
try this:
select
FacilityKey , MAX(FacilityID) AS FacilityID
From YourTable
GROUP BY FacilityKey
HAVING COUNT(*)>1
Using SQL Server, I have a table that looks like this:
What I need to do is write a query to identify scenarios where the Name and Permissions field are equal so that I can give give them a unique Set ID.
For instance, rows 2 and 4 would be a set I can give a SetID as well as rows 6 and 7 are a set that I can give another SetID. But rows 2 and 3 are NOT a set.
So far I have tried using DENSE_RANK () Over(Order by Name) which helps to add an id based on like Names but doesn't take into account matching permissions. And have tried joining the table on itself but with millions of rows of data I end up with unwanted duplicates.
The logic I am following is this:
If (Name and Permissions) of one row = (Name and Permissions) of another row give them a SetID to share.
Please help I have been banging my head against the wall with this one. Ideally a SQL query would accomplish this but am open to anything.
Thank you!
You could do it for example like this:
select
Name,
Permission,
row_number() over (order by Name, Permission) as RN
from (
select distinct
Name,
Permission
from
permissions
) TMP
order by Name, Permission
The inner select gets the distinct combinations, and the outer one assigns the numbers.
SQL Fiddle: http://sqlfiddle.com/#!6/c8319/3
This will probably do something similar to what you want.
SELECT
name,
permissions,
accountname,
ROW_NUMBER() OVER (PARTITION BY name,permissions ORDER By name,permissions) as SetID
FROM table;