Datediff with CASE and Group By - hive

I am trying to collapse a table into a single row per id, but I am having trouble including a DATEDIFF function with the GROUP BY and CASE statements:
SELECT
o.id1
,o.id2
,count(case when o.type = 'TEST' and DATEDIFF(o.dte, m.dte) < 30 then id3 end) as win_30
FROM table1 m
LEFT JOIN table2 o
ON (m.id = o.id2)
WHERE o.load_dt BETWEEN '20181001' AND '20181010'
GROUP BY 1,2;
I keep getting an 'Expression not in GROUP BY' error when I run this code, and the problem seems to be the DATEDIFF: when I take out 'and DATEDIFF(o.dte, m.dte) < 30', the query runs just fine. Do I need to include the DATEDIFF in the GROUP BY somehow?
Any help is appreciated. Thanks!

I do not get any error with a similar query.
hive> select * from test_d1;
OK
1 2 10
3 4 20
5 6 30
hive> select * from test_d2;
OK
1 5
3 10
Query - hive> select t1.id1, t1.id2, count(case when t2.id3=1 and nvl(t1.dte,t2.dte) < 10 then 1 else 0 end) as col3 from test_d1 t1 left outer join test_d2 t2 on t1.id1=t2.id3 group by 1,2;
Output -
OK
1 2 1
3 4 1
5 6 1
I tried with positions in the GROUP BY instead of column names (you have to run set hive.groupby.orderby.position.alias = true first):
hive> select t1.id1, t1.id2, count(case when t2.id3=1 and nvl(t1.dte,t2.dte) < 10 then 1 else 0 end) as col3 from test_d1 t1 left outer join test_d2 t2 on t1.id1=t2.id3 group by 1,2;
OK
1 2 1
3 4 1
5 6 1
One more observation: why use a left outer join when the columns in the select list come from the right-hand table?
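A side note on the test query above: COUNT counts every non-NULL value, so a CASE with "else 0" counts all rows in the group, while a CASE without an ELSE counts only the rows that satisfy the condition. Here is a minimal sketch of the difference, run through Python's sqlite3 module rather than Hive. The tables m and o and their contents are hypothetical stand-ins for table1 and table2, and julianday() subtraction stands in for Hive's DATEDIFF.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for table1 (m) and table2 (o)
    CREATE TABLE m (id INTEGER, dte TEXT);
    CREATE TABLE o (id2 INTEGER, id3 INTEGER, type TEXT, dte TEXT);
    INSERT INTO m VALUES (1, '2018-10-01');
    INSERT INTO o VALUES (1, 100, 'TEST',  '2018-10-05'),  -- 4 days later: inside the window
                         (1, 101, 'TEST',  '2018-12-01'),  -- 61 days later: outside
                         (1, 102, 'OTHER', '2018-10-02');  -- wrong type
""")

query = """
SELECT o.id2,
       -- No ELSE: non-matching rows yield NULL and are not counted
       COUNT(CASE WHEN o.type = 'TEST'
                   AND julianday(o.dte) - julianday(m.dte) < 30
                  THEN o.id3 END) AS win_30,
       -- ELSE 0: every row yields a non-NULL value, so all rows are counted
       COUNT(CASE WHEN o.type = 'TEST'
                   AND julianday(o.dte) - julianday(m.dte) < 30
                  THEN 1 ELSE 0 END) AS with_else_0
FROM m
LEFT JOIN o ON m.id = o.id2
GROUP BY o.id2
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 1, 3)]
```

Only one of the three rows is a 'TEST' row within 30 days, so win_30 is 1, while the "else 0" variant reports 3. If the intent is a conditional count, drop the ELSE or use SUM instead of COUNT.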

Related

SQL: How to join two columns in a specific way?

I am working with an Oracle Database and I am new to SQL in general.
I have a table with data and month columns. After filtering the data I have just a few rows left. But I want to get two columns: the first with all 12 months listed (1,2,3,4,5,6,7,8,9,10,11,12) and the second with the values from the original data (where they exist) or zeroes.
For example, original data:
MONTH VALUE
9 96
What I want:
MONTH VALUE
1 0
2 0
3 0
4 0
5 0
6 0
7 0
8 0
9 96
10 0
11 0
12 0
I have already tried JOIN and UNION ALL, but it didn't work out.

First generate the sequence of month numbers 1 to 12, then left join the original table to it:
select monthNo, coalesce(Value,0) as value from
(
SELECT LEVEL MonthNo
FROM dual
CONNECT BY LEVEL <= 12
)A left join originaltable b on A.monthNo=b.month

Is this what you are looking for?
WITH tab AS(SELECT LEVEL AS m , null as value FROM DUAL CONNECT BY LEVEL <= 12)
, tab2 AS(SELECT 9 as m, 96 as VALUE FROM DUAL)
select t1.m
,coalesce(t2.value,0) as value
from tab t1
left join tab2 t2 on t1.m = t2.m
order by 1

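Another option is a right join against an inline list of months (the VALUES table constructor works in SQL Server and PostgreSQL):
select months.month, coalesce(original_data.VALUE, 0) as VALUE
from original_data
Right JOIN (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) months(month) on
months.month = original_data.MONTH
order by months.month --optional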
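The pattern in all three answers is the same: build the full list of months, then outer join the sparse data onto it and replace NULL with 0. Here is a minimal sketch of that pattern, run through Python's sqlite3 module. Note that this swaps in a recursive CTE for Oracle's CONNECT BY, since SQLite has no CONNECT BY; the table name original_data is taken from the last answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE original_data (month INTEGER, value INTEGER);
    INSERT INTO original_data VALUES (9, 96);
""")

query = """
WITH RECURSIVE months(m) AS (
    SELECT 1                                   -- first month
    UNION ALL
    SELECT m + 1 FROM months WHERE m < 12      -- stop after month 12
)
SELECT months.m, COALESCE(d.value, 0) AS value -- 0 where no data exists
FROM months
LEFT JOIN original_data d ON d.month = months.m
ORDER BY months.m
"""
rows = conn.execute(query).fetchall()
for month, value in rows:
    print(month, value)
```

The generated month list sits on the preserved side of the outer join, which is what guarantees 12 output rows regardless of how many rows survive the filter.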

Running total (COUNT) SQL Server

I currently have this result
ID Code
1 AAA12
2 F5
3 GOFK568
4 G77
5 JLKJ4
6 FOG0
What I want to do is create a third column that keeps a running total of the codes that are more than 4 characters long.
I have this code, which gives me the total number of codes longer than 4 characters.
SELECT * ,
SUM(CASE WHEN LENGTH(CODE) > 4 THEN 1 ELSE 0 END) AS [Count]
FROM Table1;
But this gives me this result
ID Code Count
1 AAA12 3
I am looking for a result like this
ID Code Running_Total
1 AAA12 1
2 F5 1
3 GOFK568 2
4 G77 2
5 JLKJ4 3
6 FOG0 3
I was working on something similar to this
SELECT * ,
CASE WHEN LENGTH(CODE) > 4 THEN (SUM(Code) OVER (PARTITION BY ID)) ELSE END
AS [Count]
FROM Table1;
But it still doesn't give me a running total.
I have an SQL Fiddle page
http://sqlfiddle.com/#!9/2746c/18
Any help would be great

Put the case in the sum:
SELECT Table1.* ,
SUM(case when len(Code) > 4 then 1 else 0 end) OVER (order BY ID) as counted
FROM Table1;

In SQL Server 2012+ you can use the SUM() OVER (ORDER BY ...) window function:
SELECT Sum(CASE WHEN Len(code) > 4 THEN 1 ELSE 0 END)
OVER(ORDER BY id)
FROM Yourtable
For older versions:
SELECT *
FROM Yourtable a
CROSS apply (SELECT Count(*)
FROM Yourtable b
WHERE a.ID >= b.ID
AND Len(code) > 4) cs (runn)
ANSI SQL method:
SELECT ID,Code,
(SELECT count(*)
FROM Yourtable b
WHERE a.ID >= b.ID and char_length(code) > 4) AS runn
FROM Yourtable a

There are some good and efficient answers here.
But in case you want to try a different approach, try the following query:
SELECT
t1.*,
(Select sum(r.cnt) from
(SELECT COUNT(t2.code) as cnt FROM table1 AS t2
WHERE t2.Id <= t1.Id
group by t2.code
having len(t2.code) > 4) r
) AS Count
FROM table1 AS t1;
Hope it helps!
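To check the expected output from the question, here is a minimal sketch of the correlated-subquery ("ANSI SQL") approach, run through Python's sqlite3 module. It assumes SQLite, where the string length function is length() rather than SQL Server's len(); the data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, code TEXT);
    INSERT INTO table1 VALUES
        (1, 'AAA12'), (2, 'F5'), (3, 'GOFK568'),
        (4, 'G77'),   (5, 'JLKJ4'), (6, 'FOG0');
""")

query = """
SELECT a.id, a.code,
       -- Count qualifying codes among all rows up to and including this one
       (SELECT COUNT(*)
        FROM table1 b
        WHERE b.id <= a.id
          AND length(b.code) > 4) AS running_total
FROM table1 a
ORDER BY a.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)

running = [r[2] for r in rows]  # [1, 1, 2, 2, 3, 3]
```

The subquery rescans the table for every row, so on large tables the windowed SUM(...) OVER (ORDER BY id) form is usually the better choice where it is available.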

SQL query count (recursive)

I have the following table on my database which contains some transactions for which I need to calc points and rewards.
Every time a TxType A occurs I should record 10 points.
Then I have to subtract from these points the value of the PP column every time a TxType B occurs.
When the calculation goes to zero a reward is reached.
ID TxType PP
1 A 0
2 B 2
3 B 1
4 B 1
5 B 1
6 B 3
7 B 1
8 B 1
9 A 0
10 B 4
11 B 3
12 B 2
13 B 1
14 A 0
15 B 2
I have created the SQL query to calculate the points as follows:
SELECT SUM(
CASE
WHEN TxType = 'A' THEN 10
WHEN TxType = 'B' THEN (PP * -1)
END)
FROM myTable
This query returns 8, which is exactly the number of points based on the sample data.
How do I calculate the number of rewards reached (2 in the given example)?
Thanks for helping.

One way to do the calculation (in SQL Server 2008) using a correlated subquery:
select t.*,
(select sum(case when TxType = 'A' then 10
when TxType = 'B' then PP * -1
end)
from mytable t2
where t2.id <= t.id
) as TheSum
from mytable t;
You can then apply the logic of what happens when the value is 0. In SQL Server 2012, you could just use a cumulative sum.

To complete Gordon Linoff's answer, you just need to count the records where TheSum is 0 to get how many rewards occurred:
SELECT COUNT(1)
FROM (
SELECT ID,
TxType,
PP,
( SELECT SUM(CASE TxType WHEN 'A' THEN 10 WHEN 'B' THEN -PP END)
FROM #myTable t2
WHERE t2.id <= t1.id
) AS TheSum
FROM #myTable t1
) Result
WHERE TheSum = 0
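Here is a minimal sketch that runs both steps on the sample data from the question, using Python's sqlite3 module (SQLite rather than SQL Server, so a regular table named myTable replaces the #myTable temp table).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (ID INTEGER PRIMARY KEY, TxType TEXT, PP INTEGER)")
sample = [
    (1, 'A', 0), (2, 'B', 2), (3, 'B', 1), (4, 'B', 1), (5, 'B', 1),
    (6, 'B', 3), (7, 'B', 1), (8, 'B', 1), (9, 'A', 0), (10, 'B', 4),
    (11, 'B', 3), (12, 'B', 2), (13, 'B', 1), (14, 'A', 0), (15, 'B', 2),
]
conn.executemany("INSERT INTO myTable VALUES (?, ?, ?)", sample)

# Running balance per row: +10 for each A, -PP for each B, over all rows so far
running_sql = """
SELECT t1.ID,
       (SELECT SUM(CASE TxType WHEN 'A' THEN 10 WHEN 'B' THEN -PP END)
        FROM myTable t2
        WHERE t2.ID <= t1.ID) AS TheSum
FROM myTable t1
ORDER BY t1.ID
"""
balances = conn.execute(running_sql).fetchall()

# A reward is reached whenever the running balance returns to zero
rewards = conn.execute(
    f"SELECT COUNT(*) FROM ({running_sql}) WHERE TheSum = 0"
).fetchone()[0]

print(balances[-1][1])  # final balance: 8
print(rewards)          # rewards reached: 2
```

The balance hits zero at IDs 8 and 13, which gives the 2 rewards the question expects. This relies on the B deductions landing exactly on zero; if a balance could skip past zero and go negative, the counting rule would need to change.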

SQL: Outputting Multiple Rows When Joining From Same Table

My question is this: Is it possible to output multiple rows when joining from the same table?
With this code, for example, I would like to get 2 rows, one for each table alias. Instead, it gives me 1 row with all of the data.
SELECT t1.*, t2.*
FROM table t1
JOIN table t2
ON t2.id = t1.oldId
WHERE t1.id = '1'
UPDATE
Well the problem that I have with the UNION/UNION ALL is this: I don't know what the t1.oldId value is equal to. All I know is the id for t1. I am trying to avoid using 2 queries so is there a way I could do something like this:
SELECT t1.*
FROM table t1
WHERE t1.id = '1'
UNION
SELECT t2.*
FROM table t2
WHERE t2.id = t1.oldId
SAMPLE DATA
messages_users
id message_id user_id box thread_id latest_id
--------------------------------------------------------
8 1 1 1 NULL NULL
9 2 1 2 NULL 16
10 2 65 1 NULL 15
11 3 65 2 2 NULL
12 3 1 1 2 NULL
13 4 1 2 2 NULL
14 4 65 1 2 NULL
15 5 65 2 2 NULL
16 6 1 1 2 NULL
Query:
SELECT mu.id FROM messages_users mu
JOIN messages_users mu2 ON mu2.latest_id IS NOT NULL
WHERE mu.user_id = '1' AND mu2.user_id = '1' AND ((mu.box = '1'
AND mu.thread_id IS NULL AND mu.latest_id IS NULL) OR mu.id = mu2.latest_id)
This query fixes my problem. But it seems the answer to my question is to not use a JOIN but a UNION.

You mean one row for t1 and one row from t2?
You're looking for UNION, not JOIN.
select * from table where id = 1
union
select * from table where oldid = 1

If you are trying to multiply rows in a table, you need UNION ALL (not UNION):
select *
from ((select * from t) union all
(select * from t)
) t
I also sometimes use a cross join to do this:
select *
from t cross join
(select 1 as seqnum union all select 2) vals
The cross join explicitly multiplies the number of rows, in this case with a sequence number attached.

Well, since it's the same table, you could do:
SELECT t2.*
FROM table t1
JOIN table t2
ON t2.id = t1.oldId
OR t2.id = t1.id
WHERE t1.id = '1'
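Here is a minimal sketch of this last approach, run through Python's sqlite3 module. The table name items and its rows are hypothetical (table itself is a reserved word); only the id and oldId columns come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table: row 1 points at its predecessor, row 5, through oldId
    CREATE TABLE items (id INTEGER PRIMARY KEY, oldId INTEGER);
    INSERT INTO items VALUES (1, 5), (5, NULL), (7, 1);
""")

query = """
SELECT t2.*
FROM items t1
JOIN items t2
  ON t2.id = t1.oldId   -- the row that t1 points at
  OR t2.id = t1.id      -- t1 itself
WHERE t1.id = 1
ORDER BY t2.id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 5), (5, None)]
```

Because the join condition matches both the starting row and the row it references, one query returns two rows without needing to know the oldId value in advance.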

Get next minimum, greater than or equal to a given value for each group

Given the following Table1:
RefID intVal SomeVal
----------------------
1 10 val01
1 20 val02
1 30 val03
1 40 val04
1 50 val05
2 10 val06
2 20 val07
2 30 val08
2 40 val09
2 50 val10
3 12 val11
3 14 val12
4 10 val13
5 100 val14
5 150 val15
5 1000 val16
and Table2 containing some RefIDs and intVals like
RefID intVal
-------------
1 11
1 28
2 9
2 50
2 51
4 11
5 1
5 150
5 151
I need an SQL statement that, for each row of Table2, gets the smallest Table1 intVal with the same RefID that is greater than or equal to the Table2 intVal, or NULL if there is none.
The following is the expected result:
RefID intVal nextGt SomeVal
------------------------------
1 11 20 val02
1 28 30 val03
2 9 10 val06
2 50 50 val10
2 51 NULL NULL
4 11 NULL NULL
5 1 100 val14
5 150 150 val15
5 151 1000 val16
help would be appreciated !

Derived table a retrieves minimal values from table1 given refid and intVal from table2; outer query retrieves someValue only.
select a.refid, a.intVal, a.nextGt, table1.SomeVal
from
(
select table2.refid, table2.intval, min (table1.intVal) nextGt
from table2
left join table1
on table2.refid = table1.refid
and table2.intVal <= table1.intVal
group by table2.refid, table2.intval
) a
-- table1 is joined again to retrieve SomeVal
left join table1
on a.refid = table1.refid
and a.nextGt = table1.intVal

You can solve this using the ROW_NUMBER() function:
SELECT
RefID,
intVal,
NextGt,
SomeVal
FROM
(
SELECT
t2.RefID,
t2.intVal,
t1.intVal AS NextGt,
t1.SomeVal,
ROW_NUMBER() OVER (PARTITION BY t2.RefID, t2.intVal ORDER BY t1.intVal) AS rn
FROM
dbo.Table2 AS t2
LEFT JOIN dbo.Table1 AS t1 ON t1.RefID = t2.RefID AND t1.intVal >= t2.intVal
) s
WHERE
rn = 1
;
The derived table matches each Table2 row with all Table1 rows that have the same RefID and an intVal that is greater than or equal to Table2.intVal. Each subset of matches is ranked and the first row is returned by the main query.
The nested query uses an outer join, so that those Table2 rows that have no Table1 matches are still returned (with nulls substituted for the Table1 columns).
Alternatively you can use OUTER APPLY:
SELECT
t2.RefID,
t2.intVal,
t1.intVal AS NextGt,
t1.SomeVal
FROM
dbo.Table2 AS t2
OUTER APPLY
(
SELECT TOP (1)
t1.intVal,
t1.SomeVal
FROM
dbo.Table1 AS t1
WHERE
t1.RefID = t2.RefID
AND t1.intVal >= t2.intVal
ORDER BY
t1.intVal ASC
) AS t1
;
This method is arguably more straightforward: for each Table2 row, get all matches from Table1 based on the same set of conditions, sort the matches in the ascending order of Table1.intVal and take the topmost intVal.

This can be done with a join, a group by, a case expression, and a trick:
select t2.refid, t2.intval,
min(case when t1.intval >= t2.intval then t1.intval end) as min_greater_than_ref,
substring(min(case when t1.intval >= t2.intval
then right('00000000' + cast(t1.intval as varchar(255)), 8) + t1.SomeVal
end), 9, 1000) as SomeVal
from table2 t2 left join
table1 t1
on t1.refid = t2.refid
group by t2.refid, t2.intval
So, the trick is to prepend the integer value to SomeVal, zero-padding the integer value (in this case to 8 characters). You get something like "00000020val02". The minimum of this column follows the minimum of the integer. The final step is to extract the value with substring.
For this example, I used SQL Server syntax for the concatenation. In other databases you might use CONCAT() or ||.
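To check the approaches against the expected result, here is a minimal sketch of the derived-table method from the first answer (MIN over a left join, then a second join to fetch SomeVal), run through Python's sqlite3 module on the sample data. It assumes SQLite rather than SQL Server, and that (RefID, intVal) is unique in Table1, so the second join returns at most one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (RefID INTEGER, intVal INTEGER, SomeVal TEXT)")
conn.execute("CREATE TABLE Table2 (RefID INTEGER, intVal INTEGER)")
conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?)", [
    (1, 10, 'val01'), (1, 20, 'val02'), (1, 30, 'val03'), (1, 40, 'val04'),
    (1, 50, 'val05'), (2, 10, 'val06'), (2, 20, 'val07'), (2, 30, 'val08'),
    (2, 40, 'val09'), (2, 50, 'val10'), (3, 12, 'val11'), (3, 14, 'val12'),
    (4, 10, 'val13'), (5, 100, 'val14'), (5, 150, 'val15'), (5, 1000, 'val16'),
])
conn.executemany("INSERT INTO Table2 VALUES (?, ?)", [
    (1, 11), (1, 28), (2, 9), (2, 50), (2, 51),
    (4, 11), (5, 1), (5, 150), (5, 151),
])

query = """
SELECT a.RefID, a.intVal, a.nextGt, t1.SomeVal
FROM (
    -- Smallest Table1 value >= the Table2 value, per Table2 row
    SELECT t2.RefID, t2.intVal, MIN(t1.intVal) AS nextGt
    FROM Table2 t2
    LEFT JOIN Table1 t1
      ON t1.RefID = t2.RefID AND t1.intVal >= t2.intVal
    GROUP BY t2.RefID, t2.intVal
) a
-- Join Table1 again to pick up SomeVal for the chosen value
LEFT JOIN Table1 t1
  ON t1.RefID = a.RefID AND t1.intVal = a.nextGt
ORDER BY a.RefID, a.intVal
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Rows with no qualifying Table1 value, such as (2, 51) and (4, 11), come back with NULL (None in Python) for both nextGt and SomeVal, because both joins are outer joins.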
