I am using PostgreSQL. I created two subqueries that return results as follows:
firm_id type_1 fee_1
1 2 100
2 4 300
5 1 100
firm_id type_2 fee_2
1 3 200
2 3 200
3 2 150
4 5 300
I would like to yield a result as:
firm_id type_1 type_2 total_fee
1 2 3 300
2 4 3 500
3 0 2 150
4 0 5 300
5 1 0 100
Any help is appreciated!
Use FULL JOIN and coalesce():
with q1(firm_id, type_1, fee_1) as (
values
(1, 2, 100),
(2, 4, 300),
(5, 1, 100)),
q2 (firm_id, type_2, fee_2) as (
values
(1, 3, 200),
(2, 3, 200),
(3, 2, 150),
(4, 5, 300))
select
firm_id,
coalesce(type_1, 0) type_1,
coalesce(type_2, 0) type_2,
coalesce(fee_1, 0)+ coalesce(fee_2, 0) total_fee
from q1
full join q2
using (firm_id);
firm_id | type_1 | type_2 | total_fee
---------+--------+--------+-----------
1 | 2 | 3 | 300
2 | 4 | 3 | 500
3 | 0 | 2 | 150
4 | 0 | 5 | 300
5 | 1 | 0 | 100
(5 rows)
SELECT firm_id
,coalesce(t.type_1, 0) type_1
,coalesce(b.type_2, 0) type_2
,coalesce(t.fee_1, 0) + coalesce(b.fee_2, 0) total_fee
FROM (
SELECT * --Your first select query
FROM tablea
) t
FULL JOIN (
SELECT * --Your second select query
FROM tableb
) b using (firm_id)
FULL JOIN: combines the results of both left and right outer joins.
The joined table will contain all records from both tables, and fill in NULLs for missing matches on either side.
The COALESCE function returns the first of its arguments that is not null. Null is returned only if all arguments are null. It is often used to substitute a default value for null values when data is retrieved for display.
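To make the FULL JOIN plus COALESCE behaviour concrete, here is a minimal sketch that emulates it in plain Python rather than PostgreSQL. The helper names `coalesce` and `full_join_total` are made up for illustration, and the sketch assumes each firm_id appears at most once per subquery, as in the question's sample data.

```python
# Rows from the question's two subqueries: (firm_id, type, fee).
q1 = [(1, 2, 100), (2, 4, 300), (5, 1, 100)]
q2 = [(1, 3, 200), (2, 3, 200), (3, 2, 150), (4, 5, 300)]


def coalesce(*args):
    """Return the first argument that is not None, like SQL COALESCE."""
    for arg in args:
        if arg is not None:
            return arg
    return None


def full_join_total(left, right):
    """Emulate FULL JOIN ... USING (firm_id); assumes firm_id is unique per side."""
    left_by_id = {firm_id: (typ, fee) for firm_id, typ, fee in left}
    right_by_id = {firm_id: (typ, fee) for firm_id, typ, fee in right}
    result = []
    # A full join keeps every key that occurs on either side.
    for firm_id in sorted(left_by_id.keys() | right_by_id.keys()):
        # A missing side contributes NULLs (None here).
        type_1, fee_1 = left_by_id.get(firm_id, (None, None))
        type_2, fee_2 = right_by_id.get(firm_id, (None, None))
        result.append((firm_id,
                       coalesce(type_1, 0),
                       coalesce(type_2, 0),
                       coalesce(fee_1, 0) + coalesce(fee_2, 0)))
    return result


rows = full_join_total(q1, q2)
for row in rows:
    print(row)
```

Without the coalesce calls, firms 3, 4 and 5 would end up with a None type and a failing addition, which mirrors how NULL + number yields NULL in SQL.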
SELECT coalesce( t1."firm_id", t2."firm_id" ) as firm_id,
coalesce( t1."type_1", 0 ) as type_1,
coalesce( t2."type_2", 0 ) as type_2,
coalesce( t1."fee_1", 0 )
+
coalesce( t2."fee_2", 0 ) as total_fee
FROM table1 t1
FULL JOIN table2 t2
ON t1."firm_id" = t2."firm_id"
where table1 and table2 must be replaced by your subqueries
see a demo: http://sqlfiddle.com/#!15/6d391/2
select coalesce(sq1.firm_id, sq2.firm_id) as firm_id, coalesce(type_1, 0) as type_1, coalesce(type_2, 0) as type_2, coalesce(fee_1, 0)+coalesce(fee_2, 0) as total_fee
from <subquery1> as sq1
full outer join <subquery2> as sq2
on sq1.firm_id=sq2.firm_id
If you have two subqueries, say a and b, that give you set A and set B, what you're looking for is a full join A x B where at least one firm_id in the full join is not null, with a calculated column total_fee = fee_1 + fee_2.
I'm not that familiar with PostgreSQL syntax, but it should be something like
select
coalesce(a.firm_id, b.firm_id) as firm_id,
coalesce(a.type_1, 0) as type_1,
coalesce(b.type_2, 0) as type_2,
-- coalesce keeps the sum from becoming NULL when one side has no row
coalesce(a.fee_1, 0) + coalesce(b.fee_2, 0) as total_fee
from ( subquery_1 here ) as a
full join ( subquery_2 here) as b
on
b.firm_id = a.firm_id
where
a.firm_id is not null or b.firm_id is not null
Related
I have a table A:
entity_id name
------------------
1 Test1
2 Test2
3 Test3
4 Test4
5 Test5
6 Test6
I have a table B:
entity_id value1 value2
-----------------------------
1 10 20
1 15 30
2 10 25
1 9 45
3 null 1
2 45 50
3 20 null
I need to write a single query to select the entity_id and name from Table A and count the total occurrences for an entity_id of columns value1 and value2 from Table B and then the total of those column counts (null doesn't count).
So my output table would be:
entity_id name value1_count value2_count total_count
----------------------------------------------------------------------
1 Test1 3 3 6
2 Test2 1 2 3
3 Test3 1 1 2
4 Test4 0 0 0
5 Test5 0 0 0
6 Test6 0 0 0
I am having trouble summing the count of value1 and the count of value2 and outputting that value as total_count per unique entity_id.
This is the query I have so far:
SELECT DISTINCT a.entity_id, a.name
, count(b.value1) AS value1_count, count(b.value2) AS value2_count, sum(2) AS total_count
FROM a
LEFT JOIN b ON a.entity_id = b.entity_id
GROUP BY a.entity_id, a.name
I know that the sum(2) as total_count is incorrect and doesn't get me what I want.
SELECT entity_id, a.name
, COALESCE(b.v1_ct, 0) AS value1_count
, COALESCE(b.v2_ct, 0) AS value2_count
, COALESCE(b.v1_ct + b.v2_ct, 0) AS total_count
FROM a
LEFT JOIN (
SELECT entity_id, count(value1) AS v1_ct, count(value2) AS v2_ct
FROM b
GROUP BY 1
) b USING (entity_id);
Aggregate first, join later. That's simpler and faster. See:
Query with LEFT JOIN not returning rows for count of 0
count() never produces NULL. Only the LEFT JOIN can introduce NULL values for counts in this query, so v1_ct and v2_ct are either both NULL or both NOT NULL. Hence COALESCE(v1_ct + v2_ct, 0) is ok. (Else, one NULL would nullify the other summand in the addition.)
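The "aggregate first, join later" query above can be tried out with the question's sample rows. The following is a minimal sketch using Python's built-in sqlite3 module instead of PostgreSQL (an assumption made only so the example is self-contained); the alias `agg` is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (entity_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE b (entity_id INTEGER, value1 INTEGER, value2 INTEGER);
    INSERT INTO a VALUES (1,'Test1'),(2,'Test2'),(3,'Test3'),
                         (4,'Test4'),(5,'Test5'),(6,'Test6');
    INSERT INTO b VALUES (1,10,20),(1,15,30),(2,10,25),(1,9,45),
                         (3,NULL,1),(2,45,50),(3,20,NULL);
""")

rows = conn.execute("""
    SELECT a.entity_id, a.name,
           COALESCE(agg.v1_ct, 0)            AS value1_count,
           COALESCE(agg.v2_ct, 0)            AS value2_count,
           COALESCE(agg.v1_ct + agg.v2_ct, 0) AS total_count
    FROM a
    LEFT JOIN (
        -- count(col) skips NULLs, so NULL values are not counted
        SELECT entity_id, count(value1) AS v1_ct, count(value2) AS v2_ct
        FROM b
        GROUP BY entity_id
    ) agg USING (entity_id)
    ORDER BY a.entity_id
""").fetchall()

for row in rows:
    print(row)
```

With the sample rows exactly as posted, entity 2 comes out as 2 / 2 / 4, because both of its rows in table B have non-null value1 and value2; the 1 / 2 / 3 in the question's expected table does not match the posted data.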
Try this:
WITH list AS
(
SELECT b.entity_id
, count(*) FILTER (WHERE b.value1 IS NOT NULL) AS value1_count
, count(*) FILTER (WHERE b.value2 IS NOT NULL) AS value2_count
FROM Table_B AS b
GROUP BY b.entity_id
)
SELECT a.entity_id, a.name
, COALESCE(l.value1_count, 0) AS value1_count
, COALESCE(l.value2_count, 0) AS value2_count
, COALESCE(l.value1_count + l.value2_count, 0) AS total_count
FROM Table_A AS a
LEFT JOIN list AS l
ON a.entity_id = l.entity_id
I have a big data table that looks something like this
ID Marker Value1 Value2
================================
1 A 10 11
1 B 12 13
1 C 14 15
2 A 10 11
2 B 13 12
2 C
3 A 10 11
3 C 12 13
I want to search this data by the following data, which is user input and not stored in a table:
Marker Value1 Value2
==========================
A 10 11
B 12 13
C 14 14
The result should be something like this:
ID Marker Value1 Value2 Match?
==========================================
1 A 10 11 true
1 B 12 13 true
1 C 14 15 false
2 A 10 11 true
2 B 13 12 true
2 C false
3 A 10 11 true
3 C 12 13 false
And ultimately this (the above table is not necessary, it should demonstrate how these values came to be):
ID Matches Percent
========================
1 2 66%
2 2 66%
3 1 33%
I'm searching for the most promising approach to get this to work in SQL (PostgreSQL to be exact).
My ideas:
Create a temporary table, join it with the above one and group the result
Use CASE WHEN or a temporary PROCEDURE to only use a single (probably bloated) query
I'm not satisfied with either approach, hence the question. How can I compare two tables like these efficiently?
The user input can be supplied using a VALUES clause in a common table expression and that can then be used in a left join with the actual table.
with user_input (marker, value1, value2) as (
values
('A', 10, 11),
('B', 12, 13),
('C', 14, 14)
)
select d.id,
count(*) filter (where (d.marker, d.value1, d.value2) is not distinct from (u.marker, u.value1, u.value2)),
100 * count(*) filter (where (d.marker, d.value1, d.value2) is not distinct from (u.marker, u.value1, u.value2)) / cast(count(*) as numeric) as pct
from data d
left join user_input u on (d.marker, d.value1, d.value2) = (u.marker, u.value1, u.value2)
group by d.id
order by d.id;
Returns:
id | count | pct
---+-------+------
1 | 2 | 66.67
2 | 2 | 66.67
3 | 1 | 50.00
Online example: https://rextester.com/OBOOD9042
Edit
If the order of the values isn't relevant (so (12,13) is considered the same as (13,12)), then the comparison gets a bit more complicated.
with user_input (marker, value1, value2) as (
values
('A', 10, 11),
('B', 12, 13),
('C', 14, 14)
)
select d.id,
count(*) filter (where (d.marker, least(d.value1, d.value2), greatest(d.value1, d.value2)) is not distinct from (u.marker, least(u.value1, u.value2), greatest(u.value1, u.value2)))
from data d
left join user_input u on (d.marker, least(d.value1, d.value2), greatest(d.value1, d.value2)) = (u.marker, least(u.value1, u.value2), greatest(u.value1, u.value2))
group by d.id
order by d.id;
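The order-insensitive comparison can be checked against the question's sample data. This is a rough sketch that runs on SQLite through Python's sqlite3 module rather than on PostgreSQL, so it departs from the query above in two ways that are assumptions of this example: SQLite's two-argument min()/max() stand in for least()/greatest(), and count(u.marker) replaces the FILTER clause (after a LEFT JOIN, u.marker is NULL exactly for the rows that did not match).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (id INTEGER, marker TEXT, value1 INTEGER, value2 INTEGER);
    INSERT INTO data VALUES
        (1,'A',10,11),(1,'B',12,13),(1,'C',14,15),
        (2,'A',10,11),(2,'B',13,12),(2,'C',NULL,NULL),
        (3,'A',10,11),(3,'C',12,13);
""")

rows = conn.execute("""
    WITH user_input (marker, value1, value2) AS (
        VALUES ('A', 10, 11), ('B', 12, 13), ('C', 14, 14)
    )
    SELECT d.id,
           count(u.marker) AS matches,
           round(100.0 * count(u.marker) / count(*), 2) AS pct
    FROM data d
    LEFT JOIN user_input u
           ON d.marker = u.marker
          -- compare the smaller and the larger value separately,
          -- so (13,12) matches (12,13)
          AND min(d.value1, d.value2) = min(u.value1, u.value2)
          AND max(d.value1, d.value2) = max(u.value1, u.value2)
    GROUP BY d.id
    ORDER BY d.id
""").fetchall()

for row in rows:
    print(row)
```

The percentage here is relative to the number of rows stored for each id, which is why id 3 shows 50 rather than the 33% from the question; dividing by the number of user-input rows instead would give the question's figure.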
You can use a CTE to pre-compute the matches. Then a simple aggregation will do the trick. Assuming your parameters are:
Marker Value1 Value2
==========================
m1 x1 y1
m2 x2 y2
m3 x3 y3
You can do:
with x as (
select
id,
case when
marker = :m1 and (value1 = :x1 and value2 = :y1 or value1 = :y1 and value2 = :x1)
or marker = :m2 and (value1 = :x2 and value2 = :y2 or value1 = :y2 and value2 = :x2)
or marker = :m3 and (value1 = :x3 and value2 = :y3 or value1 = :y3 and value2 = :x3)
then 1 else 0 end as matches
from t
)
select
id,
sum(matches) as matches,
100.0 * sum(matches) / count(*) as percent
from x
group by id
Try this:
CREATE TABLE #Temp
(
Marker nvarchar(50),
Value1 nvarchar(50),
Value2 nvarchar(50)
)
INSERT INTO #Temp Values ('A', '10', '11')
INSERT INTO #Temp Values ('B', '12', '13')
INSERT INTO #Temp Values ('C', '14', '14')
SELECT m.Id, m.Marker, m.Value1, m.Value2,
(Select
CASE
WHEN COUNT(*) = 0 THEN 'False'
WHEN COUNT(*) <> 0 THEN 'True'
END
FROM #Temp t
WHERE t.Marker = m.Marker and t.Value1 = m.Value1 and t.Value2 = m.Value2) as Matches
FROM [Test].[dbo].[Markers] m
ORDER BY Matches DESC
Drop TABLE #Temp
If this is what you want, I will try to solve the second part of it.
I have these 3 tables mar_tb, sel_tb, cust_tb.
mar_tb
mar_id (int) PK
mar_name (nvarchar(50)) not null
sel_tb
sel_id (int) pk
mar_id (int) (not null) FK
cust_tb
cust_id (int) pk
cust_active (bit) not null
mar_id (int) (not null) FK
I have this data in each table
mar_tb
mar_id | mar_name
-----------------
1 mar_one
2 mar_two
3 mar_three
sel_tb
sel_id | mar_id
----------------------------
1 1
2 1
cust_tb
cust_id | cust_active | mar_id
----------------------------
1 1 1
2 1 1
3 1 1
4 1 1
5 1 1
6 1 1
7 1 1
8 1 1
9 1 1
10 1 1
11 1 1
12 1 1
13 1 2
14 1 2
15 1 2
16 1 2
All I need is to get a result like this
mar_name | cus_num | sel_num
--------------------------------
mar_one 12 2
mar_three 0 0
mar_two 4 0
I tried to write simple code like this
select
mar_tb.mar_name,
count(cust_tb.cust_id) as 'cus_num',
count(sel_tb.sel_id) over (PARTITION by mar_tb.mar_name ) as 'sel_num'
from
mar_tb
left join
cust_tb on cust_tb.mar_id = mar_tb.mar_id
left join
sel_tb on sel_tb.mar_id = mar_tb.mar_id
group by
mar_tb.mar_name, sel_tb.sel_id
and I got this result
mar_name | cus_num | sel_num
--------------------------------
mar_one 12 2
mar_one 12 2
mar_three 0 0
mar_two 4 0
Then I solved this issue by using subquery like this
select
mar_name, cus_num, sel_num
from
(select
mar_tb.mar_name,
count(cust_tb.cust_id) as 'cus_num',
count(sel_tb.sel_id)over (PARTITION by mar_tb.mar_name ) as 'sel_num'
from
mar_tb
left join
cust_tb on cust_tb.mar_id = mar_tb.mar_id
left join
sel_tb on sel_tb.mar_id = mar_tb.mar_id
group by
mar_tb.mar_name, sel_tb.sel_id) a
group by
mar_name, sel_num, cus_num
Finally I got what I need
mar_name | cus_num | sel_num
--------------------------------
mar_one 12 2
mar_three 0 0
mar_two 4 0
The question is: is there any way to get what I need without using subquery or (distinct) clause?
If I understand correctly, I would use union all and group by:
select m.mar_name, sum(cus_num) as cus_num, sum(sel_num) as sel_num
from mar_tb m left join
((select c.mar_id, 1 as cus_num, 0 as sel_num
from cust_tb c
) union all
(select s.mar_id, 0, 1
from sel_tb s
)
) cs
on m.mar_id = cs.mar_id
group by m.mar_name;
An alternative approach uses apply or correlated subqueries:
select m.mar_name, c.cus_num, s.sel_num
from mar_tb m outer apply
(select count(*) as cus_num
from cust_tb c
where c.mar_id = m.mar_id
) c outer apply
(select count(*) as sel_num
from sel_tb s
where s.mar_id = m.mar_id
) s;
Both of these use subqueries, but they are much more direct calculations of the values you want. I strongly encourage you to be more concerned about using distinct than about using subqueries.
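The union all / group by variant can be verified with the question's data. Below is a minimal sketch using Python's sqlite3 module instead of SQL Server (an assumption made so the example runs on its own). The COALESCE around each SUM is an addition of this sketch: without it, mar_three, which has no customer or seller rows, would show NULL instead of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mar_tb  (mar_id INTEGER PRIMARY KEY, mar_name TEXT NOT NULL);
    CREATE TABLE sel_tb  (sel_id INTEGER PRIMARY KEY, mar_id INTEGER NOT NULL);
    CREATE TABLE cust_tb (cust_id INTEGER PRIMARY KEY,
                          cust_active INTEGER NOT NULL,
                          mar_id INTEGER NOT NULL);
    INSERT INTO mar_tb VALUES (1,'mar_one'),(2,'mar_two'),(3,'mar_three');
    INSERT INTO sel_tb VALUES (1,1),(2,1);
""")
# Customers 1..12 belong to mar_id 1, customers 13..16 to mar_id 2.
conn.executemany(
    "INSERT INTO cust_tb VALUES (?, 1, ?)",
    [(i, 1 if i <= 12 else 2) for i in range(1, 17)],
)

rows = conn.execute("""
    SELECT m.mar_name,
           COALESCE(SUM(cs.cus_num), 0) AS cus_num,
           COALESCE(SUM(cs.sel_num), 0) AS sel_num
    FROM mar_tb m
    LEFT JOIN (
        -- one row per customer and one row per seller, tagged with 0/1 flags
        SELECT mar_id, 1 AS cus_num, 0 AS sel_num FROM cust_tb
        UNION ALL
        SELECT mar_id, 0, 1 FROM sel_tb
    ) cs ON m.mar_id = cs.mar_id
    GROUP BY m.mar_name
    ORDER BY m.mar_name
""").fetchall()

for row in rows:
    print(row)
```

Because customers and sellers are stacked rather than joined to each other, the 12 x 2 row multiplication that caused the duplicate mar_one line in the question never happens.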
If I understand your question correctly, you can try the two approaches shown below.
DECLARE @Mar_tb AS TABLE (mar_Id INT, mar_Name VARCHAR(100))
DECLARE @Sel_tb AS TABLE (sel_Id INT,Mar_Id INT)
DECLARE @cust_tb AS TABLE
(
cust_id int,
cust_active bit,
mar_id int)
INSERT INTO @mar_tb (mar_id ,mar_name)
VALUES
(1,'mar_one'),
(2,'mar_two'),
(3,'mar_three')
INSERT INTO @sel_tb(
sel_id ,mar_id)
VALUES(1 , 1),
(2 , 1),
(3 , 2)
INSERT INTO @cust_tb
(cust_id , cust_active , mar_id)
VALUES
( 1 , 1 , 1),
( 2 , 1 , 1),
( 3 , 1 , 1),
( 4 , 1 , 1),
( 5 , 1 , 1),
( 6 , 1 , 1),
( 7 , 1 , 1),
( 8 , 1 , 1),
( 9 , 1 , 1),
(10 , 1 , 1),
(11 , 1 , 1),
(12 , 1 , 1),
(13 , 1 , 2),
(14 , 1 , 2),
(15 , 1 , 2),
(16 , 1 , 2)
/*
mar_name | cus_num | sel_num
--------------------------------
mar_one 12 2
mar_three 0 0
mar_two 4 0 */
/************** Approach 1****/
SELECT m.mar_name,COUNT(DISTINCT s.sel_id) as sal, Count(DISTINCT c.cust_id) as customer
FROM @mar_tb m
LEFT OUTER JOIN @sel_tb s ON s.mar_id = m.mar_id
LEFT OUTER JOIN @cust_tb c ON c.mar_id = m.mar_id
GROUP BY m.mar_name
/************** Approach 2****/
SELECT m.mar_Name,COUNT(c.cust_Id) AS CustCount, tmp.saleCount
FROM @mar_tb m
LEFT OUTER JOIN @cust_tb c ON c.mar_id = m.mar_id
CROSS APPLY (SELECT COUNT(1) As saleCount
FROM @Sel_tb s
WHERE s.Mar_Id = m.mar_Id )tmp
GROUP BY m.mar_name,tmp.saleCount
I'm stuck on this simple select and don't know what to do.
I have this:
ID | Group
===========
1 | NULL
2 | 100
3 | 100
4 | 100
5 | 200
6 | 200
7 | 100
8 | NULL
and want this:
ID | Group
===========
1 | NULL
2 | 100
3 | 100
4 | 100
7 | 100
5 | 200
6 | 200
8 | NULL
All members of a group should be kept together, and everything else should be ordered by ID.
I cannot write this query because of the NULL records. NULL means that the record does not belong to any group.
First you want to order your rows by the minimum ID of their group, or by their own ID in case they belong to no group. Then you want to order by ID. That is:
order by min(id) over (partition by case when grp is null then id else grp end), id
If IDs and groups can overlap (i.e. the same number can be used for an ID and for a group, e.g. add a record for ID 9 / group 1 to your sample data) you should change the partition clause to something like
order by min(id) over (partition by case when grp is null
then 'ID' + cast(id as varchar)
else 'GRP' + cast(grp as varchar) end),
id;
Rextester demo: http://rextester.com/GPHBW5600
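The windowed ordering can be checked against the question's sample rows. The sketch below uses Python's sqlite3 module rather than SQL Server, which is an assumption of this example and requires an SQLite build with window functions (3.25 or later). The window function is computed in a derived table under the made-up alias first_id and then used for sorting, which is equivalent to placing it directly in ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER PRIMARY KEY, grp INTEGER);
    INSERT INTO t VALUES (1,NULL),(2,100),(3,100),(4,100),
                         (5,200),(6,200),(7,100),(8,NULL);
""")

rows = conn.execute("""
    SELECT id, grp
    FROM (
        SELECT id, grp,
               -- rows without a group form their own one-row partition
               min(id) OVER (
                   PARTITION BY CASE WHEN grp IS NULL THEN id ELSE grp END
               ) AS first_id
        FROM t
    )
    ORDER BY first_id, id
""").fetchall()

ordered_ids = [row[0] for row in rows]
print(ordered_ids)
```

Group 100 sorts by its smallest ID (2), so ID 7 is pulled up next to IDs 2 to 4, while the ungrouped rows 1 and 8 keep their natural positions.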
What about data after a null? In a comment you said don't sort the null.
declare @T table (ID int primary key, grp int);
insert into @T values
(1, NULL)
, (3, 100)
, (5, 200)
, (6, 200)
, (7, 100)
, (8, NULL)
, (9, 200)
, (10, 100)
, (11, NULL)
, (12, 150);
select ttt.*
from ( select tt.*
, sum(ff) over (order by tt.ID) as sGrp
from ( select t.*
, iif(grp is null or lag(grp) over (order by id) is null, 1, 0) as ff
from @T t
) tt
) ttt
order by ttt.sGrp, ttt.grp, ttt.id
ID grp ff sGrp
----------- ----------- ----------- -----------
1 NULL 1 1
3 100 1 2
7 100 0 2
5 200 0 2
6 200 0 2
8 NULL 1 3
10 100 0 4
9 200 1 4
11 NULL 1 5
12 150 1 6
I have a tricky grouping problem that comes from our business rules. I have a table with values like this:
----------------------------
| NAME | TYPE | VALUE |
----------------------------
| N1 | T1 | V1 |
| N1 | T2 | V2 |
| N1 | NULL | V3 |
| N2 | T2 | V4 |
| N2 | NULL | V5 |
| N3 | NULL | V6 |
-----------------------------
I need to group it in a way that,
The first level grouping will be by name.
At the second level,
When the available types are T1, T2 and NULL, group T1 and NULL together and have T2 grouped separately.
When the available types are T2 and NULL, group NULL with T2.
When NULL is the only available type, just have it as it is.
The expected O/P for the above table is,
----------------------------
| N1 | T1 | V1+V3 |
| N1 | T2 | V2 |
| N2 | T2 | V4+V5 |
| N3 | NULL | V6 |
-----------------------------
How can I achieve this in Snowflake SQL? An answer for any other database would also help, so that I can find an equivalent in Snowflake.
The following query should work:
SELECT t1.NAME, COALESCE(TYPE, MIN_TYPE), SUM(VALUE)
FROM mytable AS t1
JOIN (
SELECT NAME, MIN(TYPE) AS MIN_TYPE
FROM mytable
GROUP BY NAME
) AS t2 ON t1.NAME = t2.NAME
GROUP BY t1.NAME, COALESCE(TYPE, MIN_TYPE)
The query uses a derived table in order to extract the MIN(TYPE) value per NAME. Using COALESCE we can then convert NULL to either T1 or T2.
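The MIN(TYPE) plus COALESCE query can be exercised on the question's table. This is a minimal sketch run on SQLite through Python's sqlite3 module rather than on Snowflake; the numeric values 1 to 6 standing in for V1 to V6 are an assumption made so that the sums are visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (name TEXT, type TEXT, value INTEGER);
    -- V1..V6 from the question are replaced by 1..6 for illustration
    INSERT INTO mytable VALUES
        ('N1','T1',1),('N1','T2',2),('N1',NULL,3),
        ('N2','T2',4),('N2',NULL,5),
        ('N3',NULL,6);
""")

rows = conn.execute("""
    SELECT t1.name,
           COALESCE(t1.type, t2.min_type) AS grouped_type,
           SUM(t1.value)                  AS total
    FROM mytable AS t1
    JOIN (
        -- MIN ignores NULLs, so this is T1 if present, else T2, else NULL
        SELECT name, MIN(type) AS min_type
        FROM mytable
        GROUP BY name
    ) AS t2 ON t1.name = t2.name
    GROUP BY t1.name, COALESCE(t1.type, t2.min_type)
    ORDER BY 1, 2
""").fetchall()

for row in rows:
    print(row)
```

The approach relies on 'T1' sorting before 'T2'; with other type names the derived table would need an explicit priority instead of MIN.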
Edit:
You can create a pivoted version of the expected result set using the following query:
SELECT NAME,
CASE
WHEN T1SUM IS NULL THEN 0
ELSE COALESCE(T1SUM, 0) + COALESCE(NULLSUM,0)
END AS T1SUM,
CASE
WHEN T1SUM IS NULL AND T2SUM IS NOT NULL
THEN COALESCE(T2SUM, 0) + COALESCE(NULLSUM,0)
ELSE COALESCE(T2SUM, 0)
END AS T2SUM,
CASE
WHEN T1SUM IS NULL AND T2SUM IS NULL THEN COALESCE(NULLSUM,0)
ELSE 0
END AS NULLSUM
FROM (
SELECT NAME,
SUM(CASE WHEN TYPE = 'T1' THEN VALUE END) AS T1SUM,
SUM(CASE WHEN TYPE = 'T2' THEN VALUE END) AS T2SUM,
SUM(CASE WHEN TYPE IS NULL THEN VALUE END) AS NULLSUM
FROM mytable
GROUP BY NAME) AS t
In Giorgos's answer the totals are given in pivoted form (a single row per name) rather than as one row per case. This can be written more simply:
with this data:
WITH data_table(name, type, value) AS (
SELECT * FROM VALUES
(10, 1, 100 ),
(10, 2, 200 ),
(10, null, 400 ),
(11, 2, 100 ),
(11, null, 200 ),
(12, null, 100 )
)
and this SQL
SELECT name
,SUM(IFF(type=1, value, null)) as t1_val
,SUM(IFF(type=2, value, null)) as t2_val
,SUM(IFF(type is null, value, null)) as tnull_val
,IFF(t1_val is not null, t1_val + zeroifnull(tnull_val), null) as c1_sum
,IFF(t1_val is not null, t2_val, t2_val + zeroifnull(tnull_val)) as c2_sum
,IFF(t1_val is null AND t2_val is null, tnull_val, null) as c3_sum
FROM data_table
GROUP BY 1;
we get:
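NAME | T1_VAL | T2_VAL | TNULL_VAL | C1_SUM | C2_SUM | C3_SUM
-----+--------+--------+-----------+--------+--------+-------
10   | 100    | 200    | 400       | 500    | 200    | null
11   | null   | 100    | 200       | null   | 300    | null
12   | null   | null   | 100       | null   | null   | 100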
This shows that for name 10 the null sum is added to the type 1 sum, for name 11 it is added to the type 2 sum, and for name 12 the null sum stands by itself.
We can unpivot these values if we wish, by joining to a mini table with 3 rows like so:
SELECT d.name,
p.c2 as type,
case p.c1
WHEN 1 then d.c1_sum
WHEN 2 then d.c2_sum
ELSE d.c3_sum
end as value
FROM (
SELECT name
,SUM(IFF(type=1, value, null)) as t1_val
,SUM(IFF(type=2, value, null)) as t2_val
,SUM(IFF(type is null, value, null)) as tnull_val
,IFF(t1_val is not null, t1_val + zeroifnull(tnull_val), null) as c1_sum
,IFF(t1_val is not null, t2_val, t2_val + zeroifnull(tnull_val)) as c2_sum
,IFF(t1_val is null AND t2_val is null, tnull_val, null) as c3_sum
FROM data_table
GROUP BY 1
) AS d
JOIN (
SELECT column1 as c1, column2 as c2
FROM VALUES (1,'T1'),(2,'T2'),(null,'null')
) AS p
ON ((d.c1_sum is not null AND p.c1 = 1)
OR (d.c2_sum is not null AND p.c1 = 2)
OR (d.c3_sum is not null AND p.c1 is null))
ORDER BY 1,2;
which gives the original requested output:
NAME | TYPE | VALUE
-----+------+------
10   | T1   | 500
10   | T2   | 200
11   | T2   | 300
12   | null | 100