SQLite3. Select only highest revision using FKs, MAX() and GROUP BY - sql

What you need to know about schema and data:
SELECT * FROM 'income'; -- Returns all 309 rows.
SELECT * FROM 'income' WHERE businessday_revision = 0; -- 308 rows
SELECT * FROM 'income' WHERE businessday_revision = 1; -- 1 row
The businessday table has:
id INTEGER,
revision INTEGER,
....
PRIMARY KEY(id, revision)
The income table has:
id -- integer primary key, quite unimportant I think
businessday_id -- FK
businessday_revision -- FK, when a day is edited, a new revision is created
The foreign key looks like this:
FOREIGN KEY(businessday_id, businessday_revision) REFERENCES businessday(id, revision) ON DELETE CASCADE,
The problem
I want to select incomes only from the latest revision on each day. Which should be 308 rows.
But sadly I'm too dense to figure it out. I've found that I can get all the latest businessday revisions using this:
SELECT id, MAX(revision)
FROM businessday
GROUP BY id;
Is there some way I can use this data to select my incomes? Something along the lines of:
-- Pseudo-code:
SELECT *
FROM income i
WHERE i.businessday_id = businessday.id THAT EXISTS IN
(SELECT id, MAX(revision)
FROM businessday
GROUP BY id);
I obviously have no clue here, please point me in the right direction!

This should work:
SELECT i.*
FROM Income i
INNER JOIN (
SELECT id, MAX(revision) maxrevision
FROM businessDay
GROUP BY id
) t ON i.businessday_id = t.id AND i.businessday_revision = t.maxrevision

How about using join?
SELECT i.*
FROM income i
INNER JOIN
(
SELECT id, MAX(revision) revision
FROM businessday
GROUP BY id
) s ON i.businessday_id = s.id AND
i.businessday_revision = s.revision
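Both answers use the same pattern: join income to a derived table holding the highest revision per business day. Below is a minimal sketch of that pattern using Python's built-in sqlite3 module and an in-memory database. The sample rows and the amount column are invented for illustration and do not come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE businessday (
        id INTEGER,
        revision INTEGER,
        PRIMARY KEY (id, revision)
    );
    CREATE TABLE income (
        id INTEGER PRIMARY KEY,
        businessday_id INTEGER,
        businessday_revision INTEGER,
        amount REAL,  -- hypothetical column, not in the question
        FOREIGN KEY (businessday_id, businessday_revision)
            REFERENCES businessday (id, revision) ON DELETE CASCADE
    );
    -- Day 1 was edited once (revisions 0 and 1); day 2 was never edited.
    INSERT INTO businessday VALUES (1, 0), (1, 1), (2, 0);
    INSERT INTO income VALUES
        (1, 1, 0, 10.0),  -- belongs to the superseded revision 0
        (2, 1, 1, 12.0),
        (3, 2, 0, 20.0);
""")

# Join each income to the highest revision of its business day.
latest_incomes = conn.execute("""
    SELECT i.id, i.businessday_id, i.businessday_revision, i.amount
    FROM income i
    INNER JOIN (
        SELECT id, MAX(revision) AS maxrevision
        FROM businessday
        GROUP BY id
    ) t ON i.businessday_id = t.id
       AND i.businessday_revision = t.maxrevision
    ORDER BY i.id
""").fetchall()

for row in latest_incomes:
    print(row)
```

With this sample data, only the incomes attached to revision 1 of day 1 and revision 0 of day 2 are returned.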

Related

Select a NON-DISTINCT column in a query that return distincts rows

The following query returns the results that I need, but I have to add the ID of the row so that I can then update it. If I add the ID directly to the SELECT statement, it returns more results than I need: each ID is unique, so DISTINCT sees every line as unique.
SELECT DISTINCT ucpse.MemberID, ucpse.ProductID, ucpse.UserID
FROM UserCustomerProductSalaryExceptions as ucpse
WHERE EXISTS (SELECT NULL
FROM UserCustomerProductSalaryExceptions as upcse2
WHERE ucpse.userid = upcse2.userid AND ucpse.MemberID = upcse2.MemberID AND ucpse.ProductID = upcse2.ProductID
GROUP BY upcse2.UserID, upcse2.memberid, upcse2.productid
HAVING COUNT(UserID) >= 2
)
So basically I need to add ucpse.ID in the Select statement while keeping DISTINCT values for MemberID,ProductID and UserID.
Any Ideas ?
Thank you
According to your comment:
If the data has been duplicated 67 times for a given employee with a given product and a given client, I need to keep only one of those records. It's not important which one, so this is why I use DISTINCT to obtain unique combinations of a given employee with a given product and a given client.
You can use MIN() or MAX() and GROUP BY instead of DISTINCT
SELECT MAX(ucpse.ID) AS ID, ucpse.MemberID, ucpse.ProductID, ucpse.UserID
FROM UserCustomerProductSalaryExceptions as ucpse
WHERE EXISTS (SELECT NULL
FROM UserCustomerProductSalaryExceptions as upcse2
WHERE ucpse.userid = upcse2.userid AND ucpse.MemberID = upcse2.MemberID AND ucpse.ProductID = upcse2.ProductID
GROUP BY upcse2.UserID, upcse2.memberid, upcse2.productid
HAVING COUNT(UserID) >= 2
)
GROUP BY ucpse.MemberID, ucpse.ProductID, ucpse.UserID
UPDATE:
From your comments, I think the query below is what you need. It keeps the row with the highest ID in each group and deletes the rest:
DELETE FROM UserCustomerProductSalaryExceptions
WHERE ID NOT IN ( SELECT MAX(ucpse.ID) AS ID
FROM UserCustomerProductSalaryExceptions AS ucpse
GROUP BY ucpse.MemberID, ucpse.ProductID, ucpse.UserID
)
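A sketch of the keep-one-row-per-group delete, run against SQLite through Python's sqlite3 module rather than SQL Server. The sample rows are made up. The subquery deliberately has no HAVING clause: if it were restricted to groups with two or more rows, combinations that occur only once would be missing from the keep list and would be deleted as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE UserCustomerProductSalaryExceptions (
        ID INTEGER PRIMARY KEY,
        MemberID INTEGER,
        ProductID INTEGER,
        UserID INTEGER
    );
    INSERT INTO UserCustomerProductSalaryExceptions VALUES
        (1, 10, 100, 7),
        (2, 10, 100, 7),  -- duplicate of row 1
        (3, 10, 100, 7),  -- duplicate of row 1
        (4, 11, 100, 7);  -- unique combination, must survive
""")

# Keep the highest ID in every (MemberID, ProductID, UserID) group.
conn.execute("""
    DELETE FROM UserCustomerProductSalaryExceptions
    WHERE ID NOT IN (
        SELECT MAX(ID)
        FROM UserCustomerProductSalaryExceptions
        GROUP BY MemberID, ProductID, UserID
    )
""")

remaining_ids = [row[0] for row in conn.execute(
    "SELECT ID FROM UserCustomerProductSalaryExceptions ORDER BY ID")]
print(remaining_ids)
```

Rows 1 and 2 are removed, row 3 survives as the representative of its group, and row 4 is left alone.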
If all you want is to delete the duplicates, this will do it:
WITH X AS
(SELECT ID,
ROW_NUMBER() OVER (PARTITION BY MemberID, ProductID, UserID ORDER BY ID) AS DupRowNum
FROM UserCustomerProductSalaryExceptions
)
DELETE X WHERE DupRowNum > 1
ID's not necessary - try:
UPDATE uu SET
<your settings here>
FROM UserCustomerProductSalaryExceptions uu
JOIN ( <paste your entire query above here>
) uc ON uc.MemberID=uu.MemberId AND uc.ProductID=uu.ProductId AND uc.UserID=uu.UserId
From the sound of your data structure (which I would STRONGLY advise normalizing as soon as possible), it sounds like you should be updating all the records. It sounds as if each duplicate is important because it contains some information about an employee's relation to a customer or product.
I would probably update all the records. Try this:
UPDATE UCPSE
SET
--Do your updates here
FROM UserCustomerProductSalaryExceptions as ucpse
JOIN
(
SELECT UserID, MemberID, ProductID
FROM UserCustomerProductSalaryExceptions
GROUP BY UserID, MemberID, ProductID
HAVING COUNT(UserID) >= 2
) T
ON ucpse.UserID = T.UserID AND ucpse.MemberID = T.MemberID AND ucpse.ProductID = T.ProductID

Find last row in group by query-SQL Server

I have a table in SQL Server and want to find the last row in each group. I tried the following query, but it does not return the exact result I need. The ID column is the PK and the other columns are set to NOT NULL.
select ID, Name FROM
(select ID, Name, max(ID) over (partition by Name) as MAX_ID
from Customer) x where ID= MAX_ID
To be more clear, I have two queries. The first:
ALTER PROCEDURE [dbo].[Ramiz_Musterija_RowNum]
@Datum DATE,
@BrojKamiona INT
AS SET NOCOUNT ON
SELECT Ime, MusterijaID, RowNum = ROW_NUMBER() OVER (ORDER BY Ime) FROM Musterije
WHERE Datum = @Datum AND BrojKamiona = @BrojKamiona GROUP BY Ime, MusterijaID
And the second one:
ALTER PROCEDURE [dbo].[Ramiz_Musterija_FindLast]
@Datum DATE,
@BrojKamiona INT
AS SET NOCOUNT ON
SELECT a.* from Musterije a
JOIN (SELECT Ime, MAX(MusterijaID) AS MAXID FROM Musterije GROUP BY Ime) AS b
ON a.MusterijaID = b.MAXID AND a.Datum = @Datum AND a.BrojKamiona = @BrojKamiona
Then LINQ query:
var rowNumList = from f in customerFindLastList
join r in customerRowNumList
on f.MusterijaID equals r.MusterijaID
select new { r.RowNum };
I am trying to find the last row in each group, then match these two queries on the MusterijaID column.
Any help regarding this would be appreciated.
This is the output for one group. The problem is that the two queries are matched on MusterijaID "4250", but I need them to match on "4229".
Ime MusterijaID
100//1 4246
100//1 4247
100//1 4248
100//1 4249
100//1 4250
100//1 4229
select ID, Name
FROM (select ID, Name, -- add other columns here
ROW_NUMBER() over (partition by Name ORDER BY ID DESC) as MAX_ID
from Customer) x
WHERE MAX_ID = 1
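A small runnable sketch of the ROW_NUMBER() approach, using Python's sqlite3 module instead of SQL Server. It assumes the bundled SQLite is version 3.25 or later (the first release with window functions), and the sample names are invented. The row number column is called rn here because it holds a rank within the group, not a maximum ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    INSERT INTO Customer VALUES
        (1, 'Ann'), (2, 'Bob'), (3, 'Ann'), (4, 'Bob'), (5, 'Ann');
""")

# Number the rows of each Name from the highest ID down, then keep number 1.
last_rows = conn.execute("""
    SELECT ID, Name
    FROM (
        SELECT ID, Name,
               ROW_NUMBER() OVER (PARTITION BY Name ORDER BY ID DESC) AS rn
        FROM Customer
    ) x
    WHERE rn = 1
    ORDER BY Name
""").fetchall()
print(last_rows)
```

"Last" here means the highest ID within each Name. If the last row should be decided by another column, such as an insertion timestamp, that column belongs in the ORDER BY of the window instead.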

Select a Column in SQL not in Group By

I have been trying to find out how to select a non-aggregate column that is not contained in the GROUP BY clause in SQL, but nothing I've found so far answers my question. I want three columns from a table: a creation date, an ID that groups the records by a particular Claim ID, and the PK. I want to find the record that has the max creation date in each group of Claim IDs. I am selecting MAX(creation date) and the Claim ID (cpe.fmgcms_cpeclaimid), and grouping by the Claim ID. But I also need the PK from these records (cpe.fmgcms_claimid), and if I try to add it to my SELECT clause, I get an error. I can't add it to my GROUP BY clause either, because that would throw off my intended grouping. Does anyone know a workaround for this? Here is a sample of my code:
Select MAX(cpe.createdon) As MaxDate, cpe.fmgcms_cpeclaimid
from Filteredfmgcms_claimpaymentestimate cpe
where cpe.createdon < 'reportstartdate'
group by cpe.fmgcms_cpeclaimid
This is the result I'd like to get:
Select MAX(cpe.createdon) As MaxDate, cpe.fmgcms_cpeclaimid, cpe.fmgcms_claimid
from Filteredfmgcms_claimpaymentestimate cpe
where cpe.createdon < 'reportstartdate'
group by cpe.fmgcms_cpeclaimid
The columns in the result set of a select query with group by clause must be:
an expression used as one of the group by criteria , or ...
an aggregate function , or ...
a literal value
So, you can't do what you want to do in a single, simple query. The first thing to do is state your problem statement in a clear way, something like:
I want to find the individual claim row bearing the most recent
creation date within each group in my claims table
Given
create table dbo.claims_table
(
claim_id int not null ,
group_id int not null ,
date_created datetime not null ,
constraint some_table_PK primary key ( claim_id ) ,
constraint some_table_AK01 unique ( group_id , claim_id ) ,
constraint some_Table_AK02 unique ( group_id , date_created ) ,
)
The first thing to do is identify the most recent creation date for each group:
select group_id ,
date_created = max( date_created )
from dbo.claims_table
group by group_id
That gives you the selection criteria you need (1 row per group, with 2 columns: group_id and the high-water created date) to fulfill the first part of the requirement (selecting the individual row from each group). That needs to be a virtual table in your final select query:
select *
from dbo.claims_table t
join ( select group_id ,
date_created = max( date_created )
from dbo.claims_table
group by group_id
) x on x.group_id = t.group_id
and x.date_created = t.date_created
If the table is not unique by date_created within group_id (AK02), you can get duplicate rows for a given group.
You can do this with PARTITION and RANK:
select * from
(
select MyPK, fmgcms_cpeclaimid, createdon,
Rank() over (Partition BY fmgcms_cpeclaimid order by createdon DESC) as Rank
from Filteredfmgcms_claimpaymentestimate
where createdon < 'reportstartdate'
) tmp
where Rank = 1
The direct answer is that you can't. You must select either an aggregate or something that you are grouping by.
So, you need an alternative approach.
1). Take your current query and join the base data back onto it
SELECT
cpe.*
FROM
Filteredfmgcms_claimpaymentestimate cpe
INNER JOIN
(yourQuery) AS lookup
ON lookup.MaxDate = cpe.createdOn
AND lookup.fmgcms_cpeclaimid = cpe.fmgcms_cpeclaimid
2). Use a CTE to do it all in one go...
WITH
sequenced_data AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY fmgcms_cpeclaimid ORDER BY CreatedOn DESC) AS sequence_id
FROM
Filteredfmgcms_claimpaymentestimate
WHERE
createdon < 'reportstartdate'
)
SELECT
*
FROM
sequenced_data
WHERE
sequence_id = 1
NOTE: Using ROW_NUMBER() will ensure just one record per fmgcms_cpeclaimid, even if multiple records are tied with the exact same createdon value. If you can have ties, and want all records with the same createdon value, use RANK() instead.
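The difference between ROW_NUMBER() and RANK() on ties can be seen in a small sketch. It runs on SQLite through Python's sqlite3 module (assuming SQLite 3.25 or later for window functions). The claims table, its columns, and its rows are simplified stand-ins for the view in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE claims (
        claim_id INTEGER PRIMARY KEY,
        group_id INTEGER NOT NULL,
        created_on TEXT NOT NULL
    );
    INSERT INTO claims VALUES
        (1, 100, '2024-01-01'),
        (2, 100, '2024-03-01'),
        (3, 100, '2024-03-01'),  -- ties with claim 2 for the latest date
        (4, 200, '2024-02-01');
""")

def latest_per_group(ranking_function):
    # ranking_function is only ever one of the fixed strings passed below.
    query = f"""
        SELECT claim_id, group_id
        FROM (
            SELECT claim_id, group_id,
                   {ranking_function}() OVER (
                       PARTITION BY group_id
                       ORDER BY created_on DESC
                   ) AS seq
            FROM claims
        ) ranked
        WHERE seq = 1
        ORDER BY claim_id
    """
    return conn.execute(query).fetchall()

with_row_number = latest_per_group("ROW_NUMBER")  # one row per group
with_rank = latest_per_group("RANK")              # all tied rows per group
print(len(with_row_number), with_rank)
```

ROW_NUMBER() returns exactly one row for group 100, but which of the two tied claims it picks is not defined unless the ORDER BY includes a tie-breaker such as claim_id. RANK() returns both.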
You can join the table on itself to get the PK:
Select cpe1.PK, cpe2.MaxDate, cpe1.fmgcms_cpeclaimid
from Filteredfmgcms_claimpaymentestimate cpe1
INNER JOIN
(
select MAX(createdon) As MaxDate, fmgcms_cpeclaimid
from Filteredfmgcms_claimpaymentestimate
-- filter inside the subquery so MaxDate is the latest date before the report start
where createdon < 'reportstartdate'
group by fmgcms_cpeclaimid
) cpe2
on cpe1.fmgcms_cpeclaimid = cpe2.fmgcms_cpeclaimid
and cpe1.createdon = cpe2.MaxDate
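One thing I like to do is wrap the additional columns in an aggregate function, like MAX().
It works well when every row in a group has the same value in the wrapped column; otherwise each aggregate is computed independently, so the returned values may come from different rows.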
Select MAX(cpe.createdon) As MaxDate, cpe.fmgcms_cpeclaimid, MAX(cpe.fmgcms_claimid) As fmgcms_claimid
from Filteredfmgcms_claimpaymentestimate cpe
where cpe.createdon < 'reportstartdate'
group by cpe.fmgcms_cpeclaimid
What you are asking for is covered by RedFilter's answer.
The following answer also helps explain why GROUP BY is, in a sense, a simpler version of PARTITION BY with OVER:
SQL Server: Difference between PARTITION BY and GROUP BY
PARTITION BY changes the way the returned value is calculated without collapsing the rows, so you can return columns that GROUP BY cannot.
You can use a query like the one below:
Select X.a, X.sum_b, Y.c from (
Select a, sum(b) as sum_b from name_table
group by a) X
left join name_table Y on Y.a = X.a
Example:
CREATE TABLE #products (
product_name VARCHAR(MAX),
code varchar(3),
list_price [numeric](8, 2) NOT NULL
);
INSERT INTO #products VALUES ('paku', 'ACE', 2000)
INSERT INTO #products VALUES ('paku', 'ACE', 2000)
INSERT INTO #products VALUES ('Dinding', 'ADE', 2000)
INSERT INTO #products VALUES ('Kaca', 'AKB', 2000)
INSERT INTO #products VALUES ('paku', 'ACE', 2000)
--SELECT * FROM #products
SELECT distinct x.code, x.SUM_PRICE, product_name FROM (SELECT code, SUM(list_price) as SUM_PRICE From #products
group by code)x
left join #products y on y.code=x.code
DROP TABLE #products

Calculate sum of column for selected Ids in SQL

These are my tables:
Member: Id, Points
CartRegister : Id, Member_Id, CartId, RegisterDate, Point
SelectedMembers: Id, Member_Id
Members can register a cart in CartRegister, and Member.Points must hold the total of all points a member has earned. So I need to calculate the total points for each member in SelectedMembers and update the Member table, but I don't know how to implement it.
The following script is in my head:
UPDATE [Member]
SET [Points]=
(
SELECT SUM([CR].[Point]) AS [AllPoints]
FROM [CartRegister] AS [CR]
WHERE [CR].[Member_Id] = --???
)
WHERE [Members].[Member].[Id] IN ( SELECT Member_Id From SelectedMembers )
So I am confused about what the WHERE clause in the SELECT SUM(Point) should be. If I use
WHERE [CR].[Member_Id] IN ( Select Member_Id From SelectedMembers )
then every member gets the same value: the sum of the points of all selected members. Maybe I need something like a foreach? What is your suggestion?
You could use a CTE (Common Table Expression) to first calculate the points for each member, and then use that information to update the Member table:
-- this CTE takes all selected members, and calculates their total of points
;WITH MemberPoints AS
(
SELECT
Member_ID,
AllPoints = SUM(Point)
FROM
CartRegister
WHERE
Member_ID IN (SELECT Member_ID FROM SelectedMembers)
GROUP BY
Member_ID
)
UPDATE dbo.Member
SET Points = mp.AllPoints
FROM MemberPoints mp
WHERE dbo.Member.Id = mp.Member_ID
A variation on @marc_s's solution, which is basically the same, only uses a slightly different syntax:
WITH aggregated AS (
SELECT
m.Points,
AllPoints = SUM(cr.Point) OVER (PARTITION BY cr.Member_Id)
FROM Member m
INNER JOIN CartRegister cr ON cr.Member_Id = m.Id
WHERE m.Id IN (SELECT Member_Id FROM SelectedMembers)
)
UPDATE aggregated
SET Points = AllPoints
Check this:
UPDATE [Member]
SET [Points]=
(
SELECT SUM([CR].[Point]) AS [AllPoints]
FROM [CartRegister] AS [CR]
WHERE [CR].[Member_Id] = [Member].[Id]
)
WHERE [Member].[Id] IN ( SELECT Member_Id From SelectedMembers )
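The correlated-subquery form of the update can be tried out with a small sketch. It uses SQLite through Python's sqlite3 module rather than SQL Server. The tables are reduced to the columns that matter (CartId and RegisterDate are left out) and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Member (Id INTEGER PRIMARY KEY, Points INTEGER);
    CREATE TABLE CartRegister (
        Id INTEGER PRIMARY KEY, Member_Id INTEGER, Point INTEGER
    );
    CREATE TABLE SelectedMembers (Id INTEGER PRIMARY KEY, Member_Id INTEGER);

    INSERT INTO Member VALUES (1, 0), (2, 0), (3, 0);
    INSERT INTO CartRegister VALUES
        (1, 1, 5), (2, 1, 7), (3, 2, 4), (4, 3, 9);
    INSERT INTO SelectedMembers VALUES (1, 1), (2, 2);  -- member 3 not selected
""")

conn.execute("""
    UPDATE Member
    SET Points = (
        SELECT SUM(cr.Point)
        FROM CartRegister AS cr
        WHERE cr.Member_Id = Member.Id   -- correlated with the row being updated
    )
    WHERE Member.Id IN (SELECT Member_Id FROM SelectedMembers)
""")

points = conn.execute("SELECT Id, Points FROM Member ORDER BY Id").fetchall()
print(points)
```

Each selected member gets the sum of only their own points, and member 3 is untouched. SUM over zero rows yields NULL, so a selected member with no CartRegister rows would end up with NULL points; wrapping the subquery in COALESCE(..., 0) is one way to avoid that.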

selecting subsequent records arbitrarily with limit

I want to do a query to retrieve the record immediately after a record for any given record, in a result set ordered by list. I do not understand how to make use of the limit keyword in sql syntax to do this.
I can use WHERE primarykey = number, but how will limiting the result help when I will only have one result?
How would I obtain the next record with an arbitrary primary key number?
I have an arbitrary primary key, and want to select the next one ordered by date.
This will emulate the LEAD() analytic function (i. e. select the next value for each row from the table)
SELECT mo.id, mo.date,
mi.id AS next_id, mi.date AS next_date
FROM (
SELECT mn.id, mn.date,
(
SELECT id
FROM mytable mp
WHERE (mp.date, mp.id) > (mn.date, mn.id)
ORDER BY
mp.date, mp.id
LIMIT 1
) AS nid
FROM mytable mn
ORDER BY
date
) mo,
mytable mi
WHERE mi.id = mo.nid
If you just want to select next row for a given ID, you may use:
SELECT *
FROM mytable
WHERE (date, id) >
(
SELECT date, id
FROM mytable
WHERE id = @myid
)
ORDER BY
date, id
LIMIT 1
This will work most efficiently if you have an index on (date, id)
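A sketch of the "next row for a given ID" query, run on SQLite through Python's sqlite3 module. It assumes SQLite 3.15 or later, which supports the (date, id) row-value comparison. The sample rows, the index name, and the next_row helper are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, date TEXT NOT NULL);
    CREATE INDEX mytable_date_id ON mytable (date, id);
    INSERT INTO mytable VALUES
        (34, '2024-01-10'),
        (12, '2024-01-10'),  -- same date as 34, lower id
        (50, '2024-01-10'),  -- same date as 34, higher id
        (7,  '2024-01-15');
""")

def next_row(current_id):
    """Return the row that follows current_id in (date, id) order, or None."""
    return conn.execute("""
        SELECT id, date
        FROM mytable
        WHERE (date, id) > (SELECT date, id FROM mytable WHERE id = ?)
        ORDER BY date, id
        LIMIT 1
    """, (current_id,)).fetchone()

print(next_row(12), next_row(34), next_row(50), next_row(7))
```

Comparing on the pair (date, id) rather than on date alone means rows that share a date are still visited one at a time, in id order, and none of them is skipped.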
How about something like this, if you're looking for the one after 34
SELECT * FROM mytable WHERE primaryKey > 34 ORDER BY primaryKey LIMIT 1
Might be as simple as:
select *
from mytable
where datecolumn > (select datecolumn from mytable where id = @id)
order by datecolumn
limit 1
(Edited after comments)