join instances of same row - sql

I have a table h containing data like this (OK, not really, it's just an example):
subj_id q1 q2 q3 q4 q5 q6 num
1 1 0 0 1 0 0 1
1 0 0 0 1 0 0 2
2 1 1 1 1 0 1 1
2 1 0 0 1 0 0 2
2 1 1 1 0 0 1 3
3 0 1 0 0 1 1 1
I would like to sum up the q's for each subj_id, resulting in an output like this:
subj_id num1 num2 num3
1 2 1 null
2 5 2 4
3 3 null null
but instead I get the following:
subj_id num1 num2 num3
1 2 1 null
1 2 1 null
2 5 2 4
2 5 2 4
2 5 2 4
3 3 null null
where the summed rows are repeated as many times as the subj_id appears in the table.
My query (postgres) looks like this:
select h.subj_id, n1.sum as num1, n2.sum as num2, n3.sum as num3 from ((( h
left join (select subj_id, q1+q2+q3+q4+q5+q6 as sum from h where num=1) as n1 on h.subj_id=n1.subj_id)
left join (select subj_id, q1+q2+q3+q4+q5+q6 as sum from h where num=2) as n2 on h.subj_id=n2.subj_id)
left join (select subj_id, q1+q2+q3+q4+q5+q6 as sum from h where num=3) as n3 on h.subj_id=n3.subj_id) order by h.subj_id
Left join is obviously not the trick to use here, but what can I do to avoid the repeated rows?
Thanks in advance!

Your query could be easily modified to this:
with cte as (
select subj_id, q1 + q2 + q3 + q4 + q5 + q6 as q, num
from h
)
select
subj_id,
sum(case when num = 1 then q end) as num1,
sum(case when num = 2 then q end) as num2,
sum(case when num = 3 then q end) as num3
from cte
group by subj_id
order by subj_id
I think the query plan would be much better: no joins at all.
=> sql fiddle demo
Brief explanation why your query is not working and how you can improve it:
You receive more rows than you want because your query selects each row from table h and then joins to it the sums from table h with num = 1, 2, 3. You have 6 rows in your initial table, so it's logical that you get 6 rows in your result;
If you ever write a query like this, I strongly suggest you use aliases for the tables in the inner queries. It'll help you to understand the queries. In some cases it'll also help you to avoid incorrect results - see my answer in this topic - SQL IN query produces strange result.
select
h.subj_id, n1.sum as num1, n2.sum as num2, n3.sum as num3
from h
left join (
select h1.subj_id, h1.q1+h1.q2+h1.q3+h1.q4+h1.q5+h1.q6 as sum
from h as h1
where h1.num=1
) as n1 on h.subj_id=n1.subj_id
left join (
select h2.subj_id, h2.q1+h2.q2+h2.q3+h2.q4+h2.q5+h2.q6 as sum
from h as h2
where h2.num=2
) as n2 on h.subj_id=n2.subj_id
left join (
select h3.subj_id, h3.q1+h3.q2+h3.q3+h3.q4+h3.q5+h3.q6 as sum
from h as h3
where h3.num=3
) as n3 on h.subj_id=n3.subj_id
order by h.subj_id
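If you would rather keep the join form, one way to stop the repetition is to join against the distinct subj_id values instead of against every row of h. The following is a minimal sketch, not the answer's original code: it runs the query through Python's sqlite3 module as a stand-in for Postgres, and the derived table alias s and the column name total are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for Postgres (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table h (subj_id int, q1 int, q2 int, q3 int, "
    "q4 int, q5 int, q6 int, num int)"
)
conn.executemany(
    "insert into h values (?,?,?,?,?,?,?,?)",
    [
        (1, 1, 0, 0, 1, 0, 0, 1),
        (1, 0, 0, 0, 1, 0, 0, 2),
        (2, 1, 1, 1, 1, 0, 1, 1),
        (2, 1, 0, 0, 1, 0, 0, 2),
        (2, 1, 1, 1, 0, 0, 1, 3),
        (3, 0, 1, 0, 0, 1, 1, 1),
    ],
)

# Start from one row per subj_id, then attach the sum for each num.
query = """
select s.subj_id, n1.total as num1, n2.total as num2, n3.total as num3
from (select distinct subj_id from h) as s
left join (
    select h1.subj_id, h1.q1+h1.q2+h1.q3+h1.q4+h1.q5+h1.q6 as total
    from h as h1 where h1.num = 1
) as n1 on s.subj_id = n1.subj_id
left join (
    select h2.subj_id, h2.q1+h2.q2+h2.q3+h2.q4+h2.q5+h2.q6 as total
    from h as h2 where h2.num = 2
) as n2 on s.subj_id = n2.subj_id
left join (
    select h3.subj_id, h3.q1+h3.q2+h3.q3+h3.q4+h3.q5+h3.q6 as total
    from h as h3 where h3.num = 3
) as n3 on s.subj_id = n3.subj_id
order by s.subj_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample data this gives one row per subj_id. The conditional aggregation version is still likely to be the cheaper plan, since it reads h only once.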

Related

How to select data with group by and subquery calculations?

I have two tables:
list_table:
id name
1 a
2 b
3 c
vt_table:
id list_id amount direction_id
1 1 20 1
2 1 12 2
3 1 15 1
4 2 23 1
5 1 20 1
6 1 20 2
7 1 18 1
I need this result:
amount (dir_id = 1 - dir_id = 2), list_id
amount list_id
41 1
23 2
0 3
Amount is the sum of all amount fields in vt_table where direction_id = 1 minus the sum of all amount fields in vt_table where direction_id = 2.
I need to group these calculations by list_id, and if the table has no rows for some list_id (3, for example), the amount must be 0.
I'm trying to do it with this query:
SELECT vt.list_id
, ((SELECT COALESCE(SUM(vt.amount), 0)
FROM table_name vt
WHERE vt.direction_id = 1)
-
(SELECT COALESCE(SUM(vt.amount), 0)
FROM table_name vt
WHERE direction_id = 2)) AS result
FROM table_name vt
GROUP BY vt.list_id
But I don't know how to group it correctly and make it so that if there were no entries for some list_id, then the amount was 0 for this list_id.
I use PostgreSQL 12.
Here are the examples
You can try using an OUTER JOIN with conditional aggregation and the COALESCE function.
Query 1:
SELECT l.id,
SUM(COALESCE(CASE WHEN vt.direction_id = 1 THEN vt.amount END,0)) -
SUM(COALESCE(CASE WHEN vt.direction_id = 2 THEN vt.amount END,0)) AS result
FROM table_name vt
RIGHT JOIN list l ON vt.list_id = l.id
GROUP BY l.id
ORDER BY l.id
Results:
| id | result |
|----|--------|
| 1 | 41 |
| 2 | 23 |
| 3 | 0 |
Try something like this, as a start:
SELECT vt.list_id
, COALESCE(SUM(CASE WHEN direction_id = 1 THEN amount END), 0)
- COALESCE(SUM(CASE WHEN direction_id = 2 THEN amount END), 0) AS result
FROM table_name vt
GROUP BY vt.list_id
;
Result using your fiddle:
list_id result
1 41
2 23
This just misses the cases where there are no vt rows for some list.
Use an outer join to address those cases.
SELECT SUM(CASE WHEN vt.direction_id = 1 THEN vt.amount ELSE 0 END) - SUM(CASE WHEN vt.direction_id = 2 THEN vt.amount ELSE 0 END) as amount,
lt.id as list_id
FROM list_table lt
LEFT OUTER JOIN vt_table vt
ON lt.id = vt.list_id
GROUP BY lt.id
ORDER BY lt.id
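To see the difference the outer join makes, here is a small runnable sketch using Python's sqlite3 module as a stand-in for PostgreSQL 12 (an assumption; the SQL used here is plain enough to behave the same way). It loads the sample rows from the question and runs the aggregation both without and with the join to list_table. The variable names are made up for the demo.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("create table list_table (id int, name text)")
conn.execute(
    "create table vt_table (id int, list_id int, amount int, direction_id int)"
)
conn.executemany("insert into list_table values (?,?)",
                 [(1, "a"), (2, "b"), (3, "c")])
conn.executemany(
    "insert into vt_table values (?,?,?,?)",
    [
        (1, 1, 20, 1), (2, 1, 12, 2), (3, 1, 15, 1), (4, 2, 23, 1),
        (5, 1, 20, 1), (6, 1, 20, 2), (7, 1, 18, 1),
    ],
)

# Aggregating vt_table alone: lists without any vt rows never appear.
without_join = conn.execute("""
    SELECT vt.list_id,
           SUM(CASE WHEN vt.direction_id = 1 THEN vt.amount ELSE 0 END)
         - SUM(CASE WHEN vt.direction_id = 2 THEN vt.amount ELSE 0 END)
    FROM vt_table vt
    GROUP BY vt.list_id
    ORDER BY vt.list_id
""").fetchall()

# Driving the query from list_table keeps every list; for list 3 the
# CASE expressions fall through to ELSE 0, so the result is 0.
with_join = conn.execute("""
    SELECT lt.id,
           SUM(CASE WHEN vt.direction_id = 1 THEN vt.amount ELSE 0 END)
         - SUM(CASE WHEN vt.direction_id = 2 THEN vt.amount ELSE 0 END)
    FROM list_table lt
    LEFT OUTER JOIN vt_table vt ON lt.id = vt.list_id
    GROUP BY lt.id
    ORDER BY lt.id
""").fetchall()

print(without_join)
print(with_join)
```

The ELSE 0 branch is what turns the all-NULL row produced by the outer join into a 0 rather than a NULL, which is why no extra COALESCE is needed in this form.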

SQL Server - group and number matching contiguous values

I have a list of stock transactions and I am using Over(Partition By) to calculate the running totals (positions) by security. Over time a holding in a particular security can be long, short or flat. I am trying to find an efficient way to extract only the transactions relating to the current position for each security.
I have created a simplified sqlfiddle to show what I have so far. The cte query generates the running total for each security (code_id) and identifies when the holdings are long (L), short (S) or flat (F). What I need is to group and number matching contiguous values of L, S or F for each code_id.
What I have so far is this:
; WITH RunningTotals as
(
SELECT
*,
RunningTotal = sum(qty) OVER (Partition By code_id Order By id)
FROM
TradeData
), LongShortFlat as
(
SELECT
*,
LSF = CASE
WHEN RunningTotal > 0 THEN 'L'
WHEN RunningTotal < 0 THEN 'S'
ELSE 'F'
END
FROM
RunningTotals
)
SELECT
*
FROM
LongShortFlat r
I think what I need to do is create a GroupNum column by applying a row_number for each group of L, S and F within each code_id so the results look like this:
id code_id qty RunningTotal LSF GroupNum
1 1 5 5 L 1
2 1 2 7 L 1
3 1 7 14 L 1
4 1 -3 11 L 1
5 1 -5 6 L 1
6 1 -6 0 F 2
7 1 5 5 L 3
8 1 5 10 L 3
9 1 -2 8 L 3
10 1 -4 4 L 3
11 2 5 5 L 1
12 2 3 8 L 1
13 2 -4 4 L 1
14 2 -2 2 L 1
15 2 -2 0 F 2
16 2 6 6 L 3
17 2 -5 1 L 3
18 2 -5 -4 S 4
19 2 2 -2 S 4
20 2 4 2 L 5
21 2 -5 -3 S 6
22 2 -2 -5 S 6
23 3 5 5 L 1
24 3 2 7 L 1
25 3 1 8 L 1
I am struggling to generate the GroupNum column.
Thanks in advance for your help.
[Revised]
Sorry about that, I read your question too quickly. I came up with a solution using a recursive common table expression (below), then saw that you've worked out a solution using LAG. I'll post my revised query anyway, for posterity. Either way, the resulting query is (imho) pretty ugly.
;WITH cteBaseAgg
as (
-- Build the "sum increases over time" data
SELECT
row_number() over (partition by td.code_id order by td.code_id, td.Id) RecurseKey
,td.code_id
,td.id
,td.qty
,sum(tdPrior.qty) RunningTotal
,case
when sum(tdPrior.qty) > 0 then 'L'
when sum(tdPrior.qty) < 0 then 'S'
else 'F'
end LSF
from dbo.TradeData td
inner join dbo.TradeData tdPrior
on tdPrior.code_id = td.code_id -- All for this code_id
and tdPrior.id <= td.Id -- For this and any prior Ids
group by
td.code_id
,td.id
,td.qty
)
,cteRecurse
as (
-- "Set" the first row for each code_id
SELECT
RecurseKey
,code_id
,id
,qty
,RunningTotal
,LSF
,1 GroupNum
from cteBaseAgg
where RecurseKey = 1
-- For each successive row in each set, check whether GroupNum needs to be incremented
UNION ALL SELECT
agg.RecurseKey
,agg.code_id
,agg.id
,agg.qty
,agg.RunningTotal
,agg.LSF
,rec.GroupNum + case when rec.LSF = agg.LSF then 0 else 1 end
from cteBaseAgg agg
inner join cteRecurse rec
on rec.code_id = agg.code_id
and agg.RecurseKey - 1 = rec.RecurseKey
)
-- Show results
SELECT
id
,code_id
,qty
,RunningTotal
,LSF
,GroupNum
from cteRecurse
order by
code_id
,id
Sorry for making this question a bit more complicated than it needed to be but for the sake of closure I have found a solution using the lag function.
In order to achieve what I wanted I continued my cte above with the following:
, a as
(
SELECT
*,
Lag(LSF, 1, LSF) OVER(Partition By code_id ORDER BY id) AS prev_LSF,
Lag(code_id, 1, code_id) OVER(Partition By code_id ORDER BY id) AS prev_code
FROM
LongShortFlat
), b as
(
SELECT
id,
LSF,
code_id,
Sum(CASE
WHEN LSF <> prev_LSF AND code_id = prev_code
THEN 1
ELSE 0
END) OVER(Partition By code_id ORDER BY id) AS grp
FROM
a
)
select * from b order by id
Here is the updated sqlfiddle.
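The LAG approach boils down to two steps: flag each row whose LSF differs from the previous row in the same code_id, then take a running sum of those flags. Below is a minimal sketch of that idea, run through Python's sqlite3 module rather than SQL Server (an assumption; it needs a SQLite build with window functions, 3.25 or newer). The table name trade_data, the CTE names and the changed column are made up for the demo, and only the rows for code_id 2 and 3 from the question are loaded.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("create table trade_data (id int, code_id int, qty int)")
conn.executemany(
    "insert into trade_data values (?,?,?)",
    [
        (11, 2, 5), (12, 2, 3), (13, 2, -4), (14, 2, -2), (15, 2, -2),
        (16, 2, 6), (17, 2, -5), (18, 2, -5), (19, 2, 2), (20, 2, 4),
        (21, 2, -5), (22, 2, -2), (23, 3, 5), (24, 3, 2), (25, 3, 1),
    ],
)

query = """
WITH running AS (
    SELECT id, code_id, qty,
           SUM(qty) OVER (PARTITION BY code_id ORDER BY id) AS running_total
    FROM trade_data
), classified AS (
    SELECT *,
           CASE WHEN running_total > 0 THEN 'L'
                WHEN running_total < 0 THEN 'S'
                ELSE 'F' END AS lsf
    FROM running
), flagged AS (
    -- 1 when the state differs from the previous row of the same code_id.
    -- For the first row LAG returns NULL, the comparison is not true,
    -- and the flag falls through to 0.
    SELECT *,
           CASE WHEN lsf <> LAG(lsf) OVER (PARTITION BY code_id ORDER BY id)
                THEN 1 ELSE 0 END AS changed
    FROM classified
)
SELECT id, code_id, running_total, lsf,
       1 + SUM(changed) OVER (PARTITION BY code_id ORDER BY id) AS group_num
FROM flagged
ORDER BY id
"""
rows = conn.execute(query).fetchall()
group_nums = [r[4] for r in rows]
for row in rows:
    print(row)
```

Adding 1 to the running sum makes the numbering start at 1 for each code_id, matching the GroupNum column in the question.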

Complicated SQL query request

I have a table
Number Letter KeyLetter
1 a 1
1 b 0
1 c 0
1 d 0
2 e 0
2 f 0
2 g 0
3 h 1
3 i 1
3 j 0
From it I want this:
Number Letter KeyLetter
1 a 1
2 e 0
2 f 0
2 g 0
3 h 1
3 i 1
For each set of numbers, if a letter is a KeyLetter, I want to ignore any non KeyLetters.
If a set of numbers doesn't have an entry where the letter is a KeyLetter, then show all of the entries in that set of numbers.
What SQL query would be able to do this?
Simple answer, return the rows with KeyLetter = 1, and also those with a Number not having a KeyLetter = 1.
select *
from tablename t1
where t1.KeyLetter = 1
or not exists (select * from tablename t2
where t1.Number = t2.Number
and t2.KeyLetter = 1)
Alternatively:
select t1.*
from tablename t1
join (select Number, max(KeyLetter) maxKeyLetter
from tablename
group by Number) t2
on t1.Number = t2.Number and t1.KeyLetter = t2.maxKeyLetter
Or...
select *
from tablename
where (Number, KeyLetter) in
(select Number, max(KeyLetter)
from tablename
group by Number)
The first two are Core ANSI SQL compliant. The last one uses extension F641, "Row and table constructors".
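As a quick check, the first two queries can be run against the sample data. This is only a sketch: it uses Python's sqlite3 module as the engine (an assumption, since the question does not name a database) and adds an ORDER BY so the two results can be compared directly.

```python
import sqlite3

# In-memory SQLite database; the engine choice is an assumption for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("create table tablename (Number int, Letter text, KeyLetter int)")
conn.executemany(
    "insert into tablename values (?,?,?)",
    [
        (1, "a", 1), (1, "b", 0), (1, "c", 0), (1, "d", 0),
        (2, "e", 0), (2, "f", 0), (2, "g", 0),
        (3, "h", 1), (3, "i", 1), (3, "j", 0),
    ],
)

# Keep key letters, plus every row of a Number that has no key letter.
not_exists_rows = conn.execute("""
    select *
    from tablename t1
    where t1.KeyLetter = 1
       or not exists (select * from tablename t2
                      where t1.Number = t2.Number
                        and t2.KeyLetter = 1)
    order by t1.Number, t1.Letter
""").fetchall()

# Keep the rows whose KeyLetter equals the maximum for their Number.
max_join_rows = conn.execute("""
    select t1.*
    from tablename t1
    join (select Number, max(KeyLetter) maxKeyLetter
          from tablename
          group by Number) t2
      on t1.Number = t2.Number and t1.KeyLetter = t2.maxKeyLetter
    order by t1.Number, t1.Letter
""").fetchall()

for row in not_exists_rows:
    print(row)
```

Both forms rely on KeyLetter being only 0 or 1, as in the sample data.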

SQL Rows to Separate Columns

I realise this may be similar to other questions, but I am stuck!
I am having trouble organising some data into an appropriate format to export to another tool. Basically I have an ID column and then 2 response columns. I would like to separate the ID and then list the responses under each. See the example below for clarification.
I have played around with Pivot and UnPivot but can't get it quite right.
Here is how the data looks now.
ID X1 X2
1 2 Y
1 5 Y
1 3 N
1 7 N
1 6 Y
2 5 N
2 4 Y
2 8 Y
2 3 N
3 5 Y
3 1 N
3 9 N
Here is how I would like the data to look
ID1_X1 ID1_X2 ID2_X1 ID2_X2 ID3_X1 ID3_X2
2 Y 5 N 5 Y
5 Y 4 Y 1 N
3 N 8 Y 9 N
7 N 3 N null null
6 Y null null null null
Here is the code to create/populate the table.
create table #test (ID int, X1 int, X2 varchar(1))
insert into #test values
('1','2','Y'),('1','5','Y'),('1','3','N'),('1','7','N'),
('1','6','Y'),('2','5','N'),('2','4','Y'),('2','8','Y'),
('2','3','N'),('3','5','Y'),('3','1','N'),('3','9','N')
You can do this using aggregation and row_number() . . . assuming you know the ids in advance:
select max(case when id = 1 then x1 end) as x1_1,
max(case when id = 1 then x2 end) as x2_1,
max(case when id = 2 then x1 end) as x1_2,
max(case when id = 2 then x2 end) as x2_2,
max(case when id = 3 then x1 end) as x1_3,
max(case when id = 3 then x2 end) as x2_3
from (select t.*,
row_number() over (partition by id order by (select null)) as seqnum
from #test t
) t
group by seqnum;
I should note that SQL tables represent unordered sets. Your original data doesn't have an indication of the ordering, so this is not guaranteed to put the values in the same order as the original data (actually, there is no such order, so that statement is a tautology). If you have another column with the ordering, then you can use that.
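For illustration, here is the same aggregation with an explicit ordering column. It is a sketch under assumptions: the ord column does not exist in the question's table and is added here only to make the order deterministic, a regular table named test replaces the #test temp table, and the query is run through Python's sqlite3 module (SQLite 3.25 or newer for row_number) instead of SQL Server.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for the demo).
conn = sqlite3.connect(":memory:")
# ord is a hypothetical ordering column, not part of the original table.
conn.execute(
    "create table test (ord integer primary key, ID int, X1 int, X2 text)"
)
conn.executemany(
    "insert into test (ID, X1, X2) values (?,?,?)",
    [
        (1, 2, "Y"), (1, 5, "Y"), (1, 3, "N"), (1, 7, "N"),
        (1, 6, "Y"), (2, 5, "N"), (2, 4, "Y"), (2, 8, "Y"),
        (2, 3, "N"), (3, 5, "Y"), (3, 1, "N"), (3, 9, "N"),
    ],
)

query = """
select max(case when ID = 1 then X1 end) as ID1_X1,
       max(case when ID = 1 then X2 end) as ID1_X2,
       max(case when ID = 2 then X1 end) as ID2_X1,
       max(case when ID = 2 then X2 end) as ID2_X2,
       max(case when ID = 3 then X1 end) as ID3_X1,
       max(case when ID = 3 then X2 end) as ID3_X2
from (select t.*,
             row_number() over (partition by ID order by ord) as seqnum
      from test t
     ) t
group by seqnum
order by seqnum
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each seqnum group holds at most one row per ID, so max simply picks that row's value and returns NULL where an ID has run out of rows.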
Here is an alternative approach to Gordan's good answer, using OUTER JOINs.
It assumes that there is an identity column in your table to define the order of X1 within each ID, and a fixed number of IDs.
;WITH FST
AS (SELECT ROW_NUMBER()OVER(ORDER BY IDENTITY_COL) RN,X1 AS ID1_X1,X2 AS ID1_X2
FROM #TEST A
WHERE ID = 1),
SCD
AS (SELECT ROW_NUMBER()OVER(ORDER BY IDENTITY_COL) RN,X1 AS ID2_X1,X2 AS ID2_X2
FROM #TEST A
WHERE ID = 2),
TRD
AS (SELECT ROW_NUMBER()OVER(ORDER BY IDENTITY_COL) RN,X1 AS ID3_X1,X2 AS ID3_X2
FROM #TEST A
WHERE ID = 3)
SELECT ID1_X1,ID1_X2,ID2_X1,ID2_X2,ID3_X1,ID3_X2
FROM FST A
FULL OUTER JOIN SCD B
ON A.RN = B.RN
FULL OUTER JOIN TRD C
ON C.RN = COALESCE(B.RN, A.RN)

Query to pickup certain criteria records

I have following table data
ID Type Code Opt Line Status
26985444 1 100 1 1 S0
26987422 1 25 1 1 S0
26987422 1 25 2 1 S1
26987422 1 25 2 2 S2
26987422 4 25 2 3 S0
26987422 2 30 1 1 S1
26987422 2 30 1 2 S2
26987422 2 30 1 3 S0
26987422 3 35 1 1 S0
26985333 1 75 1 1 S0
26985000 1 55 1 1 S0
26985000 1 65 1 1 S0
Out of above I need to select ONLY following records
26985444 1 100 1 1 S0
26985333 1 75 1 1 S0
How can I write a SQL query to do this?
Thanks
I am interpreting your question as finding IDs that only appear once in the data.
You can do this with aggregation, looking for the singletons:
select ID, min(Type) as Type, min(Code) as Code, min(Opt) as Opt,
min(Line) as Line, min(Status) as Status
from t
group by id
having count(*) = 1;
If your version of Sybase supports window functions:
select ID, Type, Code, Opt, Line, Status
from (select t.*,
count(*) over (partition by id) as cnt
from t
) t
where cnt = 1
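Under the "IDs that appear only once" interpretation, the aggregation query can be checked against the sample data. This is a sketch only: Python's sqlite3 module stands in for Sybase (an assumption), and an ORDER BY is added so the output order is predictable.

```python
import sqlite3

# In-memory SQLite database standing in for Sybase (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (ID int, Type int, Code int, Opt int, Line int, Status text)"
)
conn.executemany(
    "insert into t values (?,?,?,?,?,?)",
    [
        (26985444, 1, 100, 1, 1, "S0"),
        (26987422, 1, 25, 1, 1, "S0"),
        (26987422, 1, 25, 2, 1, "S1"),
        (26987422, 1, 25, 2, 2, "S2"),
        (26987422, 4, 25, 2, 3, "S0"),
        (26987422, 2, 30, 1, 1, "S1"),
        (26987422, 2, 30, 1, 2, "S2"),
        (26987422, 2, 30, 1, 3, "S0"),
        (26987422, 3, 35, 1, 1, "S0"),
        (26985333, 1, 75, 1, 1, "S0"),
        (26985000, 1, 55, 1, 1, "S0"),
        (26985000, 1, 65, 1, 1, "S0"),
    ],
)

# A group with exactly one row has only one value per column,
# so min() just returns that row's value.
rows = conn.execute("""
    select ID, min(Type) as Type, min(Code) as Code, min(Opt) as Opt,
           min(Line) as Line, min(Status) as Status
    from t
    group by ID
    having count(*) = 1
    order by ID
""").fetchall()
for row in rows:
    print(row)
```

ID 26985000 is excluded because it appears twice, which matches the two records the question asks for.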
The only sorting I see in your above table is if you were to sort by the Code field while it's converted to a char(). Here's a guess at what you might be looking for using max() and min():
select y.*
from yourtable y
join (
select max(cast(Code as char(3))) maxCode,
min(cast(Code as char(3))) minCode
from yourtable
) t on y.code = t.maxCode or y.code = t.minCode
Condensed Fiddle Example