How to select a row after group by unioned tables? - sql

I need to select the newest row per user from two tables that share the same schema.
Table A and Table B look like this:
Table A :
user_id, time_stamp, order_id
1,20190101,100
2,20190103,201
3,20190102,300
5,20180209,99
Table B:
user_id, time_stamp, order_id
1,20190102,101
2,20190101,200
3,20190103,305
4,20190303,900
I want to union A and B, then keep only the newest row for each user (the one with the highest time_stamp).
The output should be:
1,20190102,101
2,20190103,201
3,20190103,305
4,20190303,900
5,20180209,99
How to write this SQL?

You can write it as in the following sample query. Note that DISTINCT ON is PostgreSQL-specific:
with unionedTable as (
select * from tableA
union
select * from tableB)
,newerUsersTable as (
select distinct on (u.user_id) u.*
from unionedTable u
order by u.user_id, u.time_stamp desc
)
select * from newerUsersTable

The main idea is to FULL OUTER JOIN the two tables and then use UNION [ALL] to return the combined data set. Consider the following SELECT statement with a WITH clause:
with a( user_id, time_stamp, order_id ) as
(
select 1,20190101,100 union all
select 2,20190103,201 union all
select 3,20190102,300 union all
select 5,20180209,99
), b( user_id, time_stamp, order_id ) as
(
select 1,20190102,101 union all
select 2,20190101,200 union all
select 3,20190103,305 union all
select 4,20190303,900
), c as
(
select a.user_id as user_id_a, a.time_stamp as time_stamp_a, a.order_id as order_id_a,
b.user_id as user_id_b, b.time_stamp as time_stamp_b, b.order_id as order_id_b
from a full outer join b
on a.user_id = b.user_id
), d as
(
select user_id_a, time_stamp_a, order_id_a
from c
where coalesce(time_stamp_b,time_stamp_a) <= time_stamp_a
union all
select user_id_b, time_stamp_b, order_id_b
from c
where time_stamp_b >= coalesce(time_stamp_a,time_stamp_b)
)
select user_id_a as user_id, time_stamp_a as time_stamp, order_id_a as order_id
from d
order by user_id_a;
user_id time_stamp order_id
1 20190102 101
2 20190103 201
3 20190103 305
4 20190303 900
5 20180209 99

Use GROUP BY user_id to get one row per user_id.
Use MAX(time_stamp) to find each user's newest row, then join back to the unioned rows:
SELECT aa.* from (select * from a union SELECT * from b ) aa
JOIN
(select user_id,max(time_stamp) as new_time
from (select * from a union SELECT * from b ) u
group by u.user_id) bb
on bb.new_time=aa.time_stamp and bb.user_id=aa.user_id
order by aa.user_id;

I would simply do:
select user_id, time_stamp, order_id
from (select ab.*,
row_number() over (partition by user_id order by time_stamp desc) as seqnum
from (select a.* from a union all
select b.* from b
) ab
) ab
where seqnum = 1;
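
To try the ROW_NUMBER() approach without a database server, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module. SQLite is a stand-in chosen only for convenience, and the sketch assumes the bundled SQLite is version 3.25 or later, since that is when window functions were added.

```python
import sqlite3

# In-memory SQLite database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (user_id INTEGER, time_stamp INTEGER, order_id INTEGER);
    CREATE TABLE b (user_id INTEGER, time_stamp INTEGER, order_id INTEGER);
    INSERT INTO a VALUES (1,20190101,100),(2,20190103,201),(3,20190102,300),(5,20180209,99);
    INSERT INTO b VALUES (1,20190102,101),(2,20190101,200),(3,20190103,305),(4,20190303,900);
""")

# Number each user's rows from newest to oldest across both tables,
# then keep only the first row of every partition.
NEWEST_PER_USER = """
    SELECT user_id, time_stamp, order_id
    FROM (
        SELECT ab.*,
               ROW_NUMBER() OVER (PARTITION BY user_id
                                  ORDER BY time_stamp DESC) AS seqnum
        FROM (SELECT * FROM a UNION ALL SELECT * FROM b) AS ab
    ) AS ranked
    WHERE seqnum = 1
    ORDER BY user_id
"""

newest_rows = conn.execute(NEWEST_PER_USER).fetchall()
for row in newest_rows:
    print(row)
```

One difference from the MAX(time_stamp) join is worth keeping in mind: if a user has two rows with the same newest time_stamp, ROW_NUMBER() keeps exactly one of them (which one is arbitrary), whereas the join returns both.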

Related

Fetch rows with same id and different prod_id

I have two tables: tbltest1 and tbltest2
I want all the distinct rows from both tables, excluding rows with a NULL prod_id unless no row in either table has the same id with a non-NULL prod_id.
I tried to build a set with all the values, applied DISTINCT to keep only the unique ones, and then used ROW_NUMBER() OVER ():
with p as(
select t.*
from tbltest1 as t
union all
select d.*
from tbltest2 as d
),
s as (
select distinct colb, num,
ROW_NUMBER() OVER (PARTITION BY num ORDER BY colb DESC) as rnk
from p
)select *
from s
where rnk = 1
How can I achieve that? Is there also any other more efficient way to do it instead of this logic?
Use UNION for the 2 tables to remove the duplicates (if any) and then NOT EXISTS:
WITH cte AS (
SELECT prod_id, dn FROM tbltest2
UNION
SELECT prod_id1, dn1 FROM tbltest1
)
SELECT c1.*
FROM cte c1
WHERE c1.prod_id IS NOT NULL
OR NOT EXISTS (SELECT 1 FROM cte c2 WHERE c2.dn = c1.dn AND c2.prod_id IS NOT NULL)
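
Since the original tables are not shown, the following is only a sketch of how the UNION plus NOT EXISTS filter behaves. The column types and every row are hypothetical, invented for illustration, and SQLite (via Python's sqlite3) stands in for whatever database the question targets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema and rows; the question's real data is not available.
    CREATE TABLE tbltest1 (prod_id1 TEXT, dn1 INTEGER);
    CREATE TABLE tbltest2 (prod_id TEXT, dn INTEGER);
    INSERT INTO tbltest1 VALUES ('P1', 1), (NULL, 1), (NULL, 2);
    INSERT INTO tbltest2 VALUES ('P1', 1), ('P2', 3), (NULL, 2), (NULL, 3);
""")

# UNION removes exact duplicates across the two tables. A NULL prod_id row
# survives only when no row with the same dn has a non-NULL prod_id.
FILTERED = """
    WITH cte AS (
        SELECT prod_id, dn FROM tbltest2
        UNION
        SELECT prod_id1, dn1 FROM tbltest1
    )
    SELECT c1.prod_id, c1.dn
    FROM cte c1
    WHERE c1.prod_id IS NOT NULL
       OR NOT EXISTS (SELECT 1 FROM cte c2
                      WHERE c2.dn = c1.dn AND c2.prod_id IS NOT NULL)
    ORDER BY c1.dn
"""

kept_rows = conn.execute(FILTERED).fetchall()
for row in kept_rows:
    print(row)
```

In this made-up data, dn 1 and dn 3 each have a non-NULL prod_id, so their NULL rows are dropped, while dn 2 has only NULL rows and keeps a single one.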

Getting MAX datetime event from multiple tables, and outputing a simple list of most recent events by ID

I have a table:
and multiple other tables - consider them purchases, in this example:
And would like an output table to show the most recent purchase (NB that there may be multiple instances of a purchase within each table), by id from the main table:
The id can be a customer number, for example.
I've tried using OUTER APPLY on each purchase table, getting the TOP 1 by datetime desc, then getting the max value from the OUTER APPLY tables, but I would not get the table name - eg. Apples, just the datetime.
Another idea was to UNION all of the purchase tables together in a join with the main table (by id), and pick out the top 1 datetime and a table name, but I don't think this would be very efficient for a lot of rows:
SELECT MT.id, MT.gender, MT.age,
b.Name as LastPurchase, b.dt as LastPurchaseDateTime
FROM MainTable MT
LEFT JOIN (
SELECT id, Name, MAX(dt) FROM
(
SELECT id, 'Apples' as Name, ApplesDateTime as dt FROM ApplesTable
UNION
SELECT id, 'Pears' as Name, PearsDateTime as dt FROM PearsTable
UNION
SELECT id, 'Bananas' as Name, BananasDateTime as dt FROM BananasTable
)a
GROUP BY etc
)b
Does anyone have a more sensible idea?
Many thanks in advance.
I would go for a lateral join:
select m.*, x.*
from maintable m
outer apply (
select top (1) x.*
from (
select id, 'apples' as name, applesdatetime as dt from applestable
union all select id, 'pears', pearsdatetime from pearstable
union all select id, 'bananas', bananasdatetime from bananastable
) x
where x.id = m.id
order by dt desc
) x
I would suggest apply:
SELECT MT.id, mt.gender, mt.age, p.*
FROM MainTable MT OUTER APPLY
(SELECT TOP (1) p.name, p.dt
FROM (SELECT id, 'Apples' as Name, ApplesDateTime as dt FROM ApplesTable
UNION ALL
SELECT id, 'Pears' as Name, PearsDateTime as dt FROM PearsTable
UNION ALL
SELECT id, 'Bananas' as Name, BananasDateTime as dt FROM BananasTable
) p
WHERE p.id = mt.id
ORDER BY dt DESC
) p
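
OUTER APPLY is specific to SQL Server, so it cannot be run everywhere. As a rough, portable sketch of the same "latest purchase per id" idea, the example below swaps in a different technique: ROW_NUMBER() over the unioned purchases, LEFT JOINed to the main table. It runs on SQLite 3.25+ through Python's sqlite3, and all rows are hypothetical because the question's tables are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical rows; table and column names follow the question.
    CREATE TABLE MainTable (id INTEGER PRIMARY KEY, gender TEXT, age INTEGER);
    CREATE TABLE ApplesTable (id INTEGER, ApplesDateTime TEXT);
    CREATE TABLE PearsTable (id INTEGER, PearsDateTime TEXT);
    CREATE TABLE BananasTable (id INTEGER, BananasDateTime TEXT);
    INSERT INTO MainTable VALUES (1,'F',30),(2,'M',41),(3,'F',25);
    INSERT INTO ApplesTable VALUES (1,'2020-01-05 10:00'),(1,'2020-03-01 09:30'),
                                   (2,'2020-02-10 12:00');
    INSERT INTO PearsTable VALUES (1,'2020-02-20 08:00'),(2,'2020-04-01 17:45');
    INSERT INTO BananasTable VALUES (2,'2020-01-15 11:15');
""")

# Rank every purchase per id from newest to oldest, keep rank 1, and
# LEFT JOIN so ids without any purchase still appear (with NULLs).
LAST_PURCHASE = """
    SELECT m.id, m.gender, m.age,
           p.name AS LastPurchase, p.dt AS LastPurchaseDateTime
    FROM MainTable AS m
    LEFT JOIN (
        SELECT u.id, u.name, u.dt,
               ROW_NUMBER() OVER (PARTITION BY u.id ORDER BY u.dt DESC) AS rn
        FROM (
            SELECT id, 'Apples' AS name, ApplesDateTime AS dt FROM ApplesTable
            UNION ALL
            SELECT id, 'Pears', PearsDateTime FROM PearsTable
            UNION ALL
            SELECT id, 'Bananas', BananasDateTime FROM BananasTable
        ) AS u
    ) AS p ON p.id = m.id AND p.rn = 1
    ORDER BY m.id
"""

last_purchases = conn.execute(LAST_PURCHASE).fetchall()
for row in last_purchases:
    print(row)
```

Whether this or OUTER APPLY with TOP (1) performs better on SQL Server likely depends on indexes and row counts, so it is worth comparing the plans on real data.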

Select number of IDs in more than one table (from three tables)

I need the count of this:
select distinct ID
from (
select ID from A
union all
select ID from B
union all
select ID from C
) ids
GROUP BY ID HAVING COUNT(*) > 1;
but I have no idea how to do it.
Use a subquery:
select count(*)
from (select ID
from (select ID from A
union all
select ID from B
union all
select ID from C
) ids
group by ID
having count(*) > 1
) i;
SELECT DISTINCT is almost never needed with GROUP BY and definitely not in this case.
You want to count the IDs that appear two or more times across tables A, B, and C:
select count(1) from (
select
id,
count(1) as cnt
from
(
select ID from A
union all
select ID from B
union all
select ID from C
) ids
group by id having count(1) > 1
) tmp
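
The counting queries above implicitly assume that an ID occurs at most once per table. The sketch below, run on SQLite through Python's sqlite3 with hypothetical rows, shows what happens when that assumption fails, and one possible variant (tagging each row with a made-up `src` column) that counts distinct source tables instead of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: ID 1 appears twice, but only in table A.
    CREATE TABLE A (ID INTEGER);
    CREATE TABLE B (ID INTEGER);
    CREATE TABLE C (ID INTEGER);
    INSERT INTO A VALUES (1),(1),(2),(3);
    INSERT INTO B VALUES (2),(3),(4);
    INSERT INTO C VALUES (3),(5);
""")

# Row-based count, as in the answers: any ID with more than one row overall.
BY_ROWS = """
    SELECT COUNT(*) FROM (
        SELECT ID
        FROM (SELECT ID FROM A UNION ALL
              SELECT ID FROM B UNION ALL
              SELECT ID FROM C) AS ids
        GROUP BY ID
        HAVING COUNT(*) > 1
    ) AS i
"""

# Table-based count: tag each row with its source and count distinct sources.
BY_TABLES = """
    SELECT COUNT(*) FROM (
        SELECT ID
        FROM (SELECT ID, 'A' AS src FROM A UNION ALL
              SELECT ID, 'B' FROM B UNION ALL
              SELECT ID, 'C' FROM C) AS ids
        GROUP BY ID
        HAVING COUNT(DISTINCT src) > 1
    ) AS i
"""

count_by_rows = conn.execute(BY_ROWS).fetchone()[0]
count_by_tables = conn.execute(BY_TABLES).fetchone()[0]
print(count_by_rows, count_by_tables)
```

With this data the row-based query also counts ID 1, which lives in a single table, while the table-based variant counts only IDs 2 and 3. If ID is unique within each table, the two queries agree.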

Select unique field

I have this table:
TableA
----------------
ID (pk) Name
1 A
2 B
3 C
4 A
5 D
6 A
7 B
8 A
9 D
10 C
....
I need to randomly extract with a SELECT TOP 5 ID, Name FROM TableA
with Name that must be unique within the 5 records.
I'm trying :
;WITH [group]
AS
(
SELECT ID, Name,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY NewId()) rn
FROM TableA
)
SELECT ID, Name
FROM [group]
WHERE rn = 1
but every time I have quite the same results.
I need to select between all the values for ID at random, assuring that Name will always be different for each record.
I hope the problem is understandable. Any ideas?
Found a solution. It seems to work!
;WITH [group]
AS (
SELECT ID, Name,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY NewId()) rn
FROM TableA
)
SELECT top 5 ID, Name, NewId() [NewId]
FROM [group]
WHERE rn = 1
ORDER BY [NewId]
Perhaps the problem is that although newid() is random, it may tend to be sequential. Does this fix the problem?
WITH g as (
SELECT ID, Name,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY RAND(CHECKSUM(NewId()))) as rn
FROM TableA
)
SELECT ID, Name
FROM g
WHERE rn = 1;
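
The pattern in these answers is to pick one random row per Name and then, if needed, take five of those at random. Below is a small sketch of that pattern on SQLite via Python's sqlite3, using the question's ten rows. Note the substitutions: SQLite has no NEWID() or TOP, so `random()` and `LIMIT` are swapped in, and SQLite 3.25+ is assumed for the window function.

```python
import sqlite3

# The ten sample rows listed in the question.
source_rows = [(1, "A"), (2, "B"), (3, "C"), (4, "A"), (5, "D"),
               (6, "A"), (7, "B"), (8, "A"), (9, "D"), (10, "C")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (ID INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany("INSERT INTO TableA VALUES (?, ?)", source_rows)

# Step 1: shuffle rows within each Name and keep the first one (rn = 1).
# Step 2: shuffle the surviving rows and keep at most five of them.
RANDOM_UNIQUE = """
    SELECT ID, Name
    FROM (
        SELECT ID, Name,
               ROW_NUMBER() OVER (PARTITION BY Name ORDER BY random()) AS rn
        FROM TableA
    ) AS g
    WHERE rn = 1
    ORDER BY random()
    LIMIT 5
"""

picked = conn.execute(RANDOM_UNIQUE).fetchall()
for row in picked:
    print(row)
```

The sample data has only four distinct names, so at most four rows can come back here regardless of the limit of five; the IDs vary from run to run.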
CREATE TABLE #test (ID INT, Name VARCHAR(1));
INSERT INTO #test (ID, Name)
SELECT 1,'A' UNION ALL SELECT 2,'B' UNION ALL SELECT 3,'C' UNION ALL
SELECT 4,'A' UNION ALL SELECT 5,'D' UNION ALL SELECT 6,'A' UNION ALL
SELECT 7,'B' UNION ALL SELECT 8,'A' UNION ALL SELECT 9,'D' UNION ALL
SELECT 10,'C';
SELECT T1.ID, T1.Name
FROM #test T1
JOIN (SELECT TOP 5 Name FROM #test T2 ORDER BY NEWID()) A
ON T1.Name = A.Name
ORDER BY A.Name;

A simple way to sum a result from UNION in MySQL

I have a union of three tables (t1, t2, t3).
Each returns exactly the same number of records; the first column is id, the second is amount:
1 10
2 20
3 20
1 30
2 30
3 10
1 20
2 40
3 50
Is there a simple way in SQL to sum it up, i.e. to only get:
1 60
2 90
3 80
select id, sum(amount) from (
select id,amount from table_1 union all
select id,amount from table_2 union all
select id,amount from table_3
) x group by id
SELECT id, SUM(amount) FROM
(
SELECT id, SUM(amount) AS `amount` FROM t1 GROUP BY id
UNION ALL
SELECT id, SUM(amount) AS `amount` FROM t2 GROUP BY id
UNION ALL
SELECT id, SUM(amount) AS `amount` FROM t3 GROUP BY id
) `x`
GROUP BY `id`
I grouped each table before the union because I think it might be faster, but you should try both solutions.
Subquery:
SELECT id, SUM(amount)
FROM ( SELECT * FROM t1
UNION ALL SELECT * FROM t2
UNION ALL SELECT * FROM t3
)
GROUP BY id
MySQL supports common table expressions only from version 8.0 onward; in Postgres I would do this:
WITH total AS(
SELECT id,amount AS amount FROM table_1 UNION ALL
SELECT id,amount AS amount FROM table_2 UNION ALL
SELECT id,amount AS amount FROM table_3
)
SELECT id, sum(amount)
FROM total
GROUP BY id
I think that should do the trick as well.
Since it's not very clear from the previous answers: remember to give the derived table an alias (on MySQL/MariaDB), or you'll get the error:
Every derived table must have its own alias
select id, sum(amount) from (
select id,amount from table_1 union all
select id,amount from table_2 union all
select id,amount from table_3
) AS aliasWhichIsNeeded
group by id
Yes, that works, thanks!
My final code:
SELECT SUM(total)
FROM (
(SELECT 1 as id, SUM(e.valor) AS total FROM entrada AS e)
UNION ALL
(SELECT 1 as id, SUM(d.valor) AS total FROM despesa AS d)
UNION ALL
(SELECT 1 as id, SUM(r.valor) AS total FROM recibo AS r WHERE r.status = 'Pago')
) x group by id
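
One thing to watch in queries of this shape: plain UNION removes duplicate rows, so two subtotals that happen to be equal collapse into a single row before the outer SUM runs. The sketch below illustrates this on SQLite through Python's sqlite3. The table contents are hypothetical, and the parentheses around each SELECT are dropped because SQLite does not accept them in a compound query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical rows: entrada and despesa both happen to total 100.
    CREATE TABLE entrada (valor INTEGER);
    CREATE TABLE despesa (valor INTEGER);
    CREATE TABLE recibo (valor INTEGER, status TEXT);
    INSERT INTO entrada VALUES (60),(40);
    INSERT INTO despesa VALUES (100);
    INSERT INTO recibo VALUES (50,'Pago'),(999,'Pendente');
""")

# Same query shape as above; only the set operator is swapped in.
TEMPLATE = """
    SELECT SUM(total) FROM (
        SELECT 1 AS id, SUM(valor) AS total FROM entrada
        {op}
        SELECT 1 AS id, SUM(valor) AS total FROM despesa
        {op}
        SELECT 1 AS id, SUM(valor) AS total FROM recibo WHERE status = 'Pago'
    ) AS x
    GROUP BY id
"""

total_union = conn.execute(TEMPLATE.format(op="UNION")).fetchone()[0]
total_union_all = conn.execute(TEMPLATE.format(op="UNION ALL")).fetchone()[0]
print(total_union, total_union_all)
```

Here the two identical (1, 100) rows are merged by UNION, so the first total is 150 instead of the intended 250 that UNION ALL produces.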
SELECT BANKEMPNAME, workStation, SUM (CALCULATEDAMOUNT) FROM(
SELECT BANKEMPNAME, workStation, SUM(CALCULATEDAMOUNT) AS CALCULATEDAMOUNT,SALARYMONTH
FROM dbo.vw_salaryStatement
WHERE (ITEMCODE LIKE 'A%')
GROUP BY BANKEMPNAME,workStation, SALARYMONTH
union all
SELECT BANKEMPNAME, workStation, SUM(CALCULATEDAMOUNT) AS CALCULATEDAMOUNT,SALARYMONTH
FROM dbo.vw_salaryStatement
WHERE (ITEMCODE NOT LIKE 'A%')
GROUP BY BANKEMPNAME, workStation, SALARYMONTH) as t1
WHERE SALARYMONTH BETWEEN '20220101' AND '20220131'
group by BANKEMPNAME, workStation
order by BANKEMPNAME asc
In MSSQL you can write it this way, but when using UNION ALL the columns must be the same in both queries.
I have given this example so that you can understand the process.