SQL: case when statement with over (partition by) - sql

I'm quite new here and have tried various hints from Stack Overflow posts on SQL, but haven't been able to solve this one.
I have a table that is the result of joining several tables. It looks like this:
Table A
cust_id prod_type
001 A
001 A
002 A
002 B
003 A
003 C
I need to apply this logic: if a cust_id has at least one row where prod_type is B or C, return that prod_type value. If all prod_type values for a cust_id are A, return A.
The final output I am trying to get is
Table B
cust_id prod_type
001 A
002 B
003 C
I have tried using
SELECT
A.cust_id
,CASE WHEN prod_type in ('B', 'C') THEN prod_type OVER (PARTITION BY A.cust_id)
ELSE 'A' OVER (PARTITION BY A.cust_id) END AS product
FROM ([Joined Tables]) AS A
and it seems that Teradata does not allow an OVER clause inside a CASE expression: it expects an 'END' keyword between prod_type and the OVER keyword.

You want to return only one row per customer with the best matching product_type?
If there are additional columns:
SELECT
A.cust_id
,prod_type
,...
FROM ([Joined Tables]) AS A
QUALIFY
ROW_NUMBER()
OVER (PARTITION BY CUST_ID
ORDER BY CASE WHEN prod_type in ('B', 'C') -- best match first
THEN 1
ELSE 2
END,
prod_type) = 1
Otherwise @Frisbee's MAX will work, but I assume that A/B/C are not your actual product names:
SELECT
A.cust_id
,COALESCE(MAX(CASE WHEN prod_type in ('B', 'C') THEN prod_type END)
,MAX(CASE WHEN prod_type not in ('B', 'C') THEN prod_type END))
FROM ([Joined Tables]) AS A
GROUP BY cust_id
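On the error itself: OVER has to follow a window or aggregate function call, not a column or a literal, so the CASE goes inside the function rather than around it. Below is a minimal sketch of that shape. The inline sample rows stand in for your joined tables, the alias A and the column name product follow the question, and I have not run this against Teradata.

```sql
SELECT DISTINCT
    A.cust_id
    -- the CASE yields NULL for 'A' rows, MAX ignores NULLs,
    -- and COALESCE falls back to 'A' when no B/C row exists
    ,COALESCE(
        MAX(CASE WHEN A.prod_type IN ('B', 'C') THEN A.prod_type END)
            OVER (PARTITION BY A.cust_id)
        ,'A') AS product
FROM (
    /* sample rows from the question; replace with the real joined tables */
    select * from (SELECT '001' cust_id, 'A' prod_type) as "DUAL" UNION all
    select * from (SELECT '001' cust_id, 'A' prod_type) as "DUAL" UNION all
    select * from (SELECT '002' cust_id, 'A' prod_type) as "DUAL" UNION all
    select * from (SELECT '002' cust_id, 'B' prod_type) as "DUAL" UNION all
    select * from (SELECT '003' cust_id, 'A' prod_type) as "DUAL" UNION all
    select * from (SELECT '003' cust_id, 'C' prod_type) as "DUAL"
) AS A
ORDER BY A.cust_id;
```

Note that if a customer has both B and C rows, MAX picks C. The DISTINCT is only there to collapse the repeated rows per customer.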

For completeness, here is an equivalent using the first_value function and the reverse alphabetical order of your products.
Similarly to @Amgalan Bilegjav's answer, 'b' is the sample table and 'a' is the table with an extra column (here, the first product per customer).
The code was tested with Teradata version 16.20.53.55.
SELECT
a.CUST_ID
, a.PROD_TYPE
FROM (
SELECT
b.CUST_ID
, first_value(b.PROD_TYPE) over (partition by b.CUST_ID order by b.PROD_TYPE desc) as PROD_TYPE
FROM (
/* recreate the example table */
select * from (SELECT '001' CUST_ID, 'A' PROD_TYPE) as "DUAL" UNION all
select * from (SELECT '001' CUST_ID, 'A' PROD_TYPE) as "DUAL" UNION all
select * from (SELECT '002' CUST_ID, 'A' PROD_TYPE) as "DUAL" UNION all
select * from (SELECT '002' CUST_ID, 'B' PROD_TYPE) as "DUAL" UNION all
select * from (SELECT '003' CUST_ID, 'A' PROD_TYPE) as "DUAL" UNION all
select * from (SELECT '003' CUST_ID, 'C' PROD_TYPE) as "DUAL"
) b
) a
group by a.CUST_ID, a.PROD_TYPE
order by a.CUST_ID
;

This query recreates your sample data (table b), adds row numbers based on a specific product order (C, then B, then A, in table a), and then filters on the first row to get the desired table.
SELECT
a.CUST_ID
, a.PROD_TYPE
FROM (
SELECT
b.CUST_ID
, b.PROD_TYPE
, ROW_NUMBER() OVER (PARTITION BY CUST_ID ORDER BY DECODE(PROD_TYPE, 'A',3,'B',2,'C',1)) as RN
FROM (
/* recreate the example table */
select * from (SELECT '001' CUST_ID, 'A' PROD_TYPE) as "DUAL" UNION all
select * from (SELECT '001' CUST_ID, 'A' PROD_TYPE) as "DUAL" UNION all
select * from (SELECT '002' CUST_ID, 'A' PROD_TYPE) as "DUAL" UNION all
select * from (SELECT '002' CUST_ID, 'B' PROD_TYPE) as "DUAL" UNION all
select * from (SELECT '003' CUST_ID, 'A' PROD_TYPE) as "DUAL" UNION all
select * from (SELECT '003' CUST_ID, 'C' PROD_TYPE) as "DUAL"
) b
) a
WHERE RN = 1
order by a.CUST_ID
;

Related

How to duplicate each row in table and add new column with 2 different values

I have a table that looks like this:
I want to turn it to this:
I would recommend a cross join with a fixed list of values:
select t.*, t.product || v.val pk
from mytable t
cross join (
select '20' val from dual
union all select '50' from dual
) v
You can just use the union all like so:
select t.product, t.qty, t.product || '20' pk from the_table t
union all
select t.product, t.qty, t.product || '50' pk from the_table t;

how to make calculation based on the information from other tables without common fields?

I'm fairly new to the world of GBQ, and I'm not sure how to best explain my situation but here are the sample of the 3 tables I'm currently working with:
I'm trying to add a new column to the "product_type" table with a count of all products ordered by the customers from the "delivery_1" table, and not too sure how to do that since there isn't any common fields.
Here is a visualization of my result:
Here are the queries to create the sample tables:
WITH customers_orders AS (
SELECT '00001' customer_no, 'yes' product_a, 'no' product_b, 'yes' product_c UNION ALL
SELECT '00002' customer_no, 'yes' product_a, 'yes' product_b, 'no' product_c UNION ALL
SELECT '00003' customer_no, 'no' product_a, 'no' product_b, 'no' product_c UNION ALL
SELECT '00004' customer_no, 'yes' product_a, 'yes' product_b, 'no' product_c UNION ALL
SELECT '00005' customer_no, 'yes' product_a, 'yes' product_b, 'yes' product_c
)
WITH product_type AS (
SELECT 'product_a' product, 'export' type UNION ALL
SELECT 'product_b' product, 'import' type UNION ALL
SELECT 'product_c' product, 'import' type
)
WITH delivery_1 AS (
SELECT '00001' customer_no, 'delivery_1' delivery UNION ALL
SELECT '00002' customer_no, 'delivery_1' delivery UNION ALL
SELECT '00005' customer_no, 'delivery_1' delivery
)
Any tips or helps are greatly appreciated!
Below is for BigQuery Standard SQL
#standardSQL
SELECT product, type, delivery_1_total_ordered
FROM `project.dataset.product_type`
LEFT JOIN (
SELECT TRIM(SPLIT(kv, ':')[OFFSET(0)], '"') product,
COUNT(1) delivery_1_total_ordered
FROM `project.dataset.customers_orders`
JOIN `project.dataset.delivery_1`
USING(customer_no)
CROSS JOIN UNNEST(SPLIT(TRIM(TO_JSON_STRING(STRUCT(product_a, product_b, product_c)), '{}'))) kv
WHERE SPLIT(kv, ':')[OFFSET(1)] = '"yes"'
GROUP BY product
)
USING(product)
Applied to the sample data from your question, the output is:
Row product type delivery_1_total_ordered
1 product_a export 3
2 product_b import 2
3 product_c import 2
Is there a way to select all product types instead of manually typing them out?
Sure. See the slightly adjusted query below:
#standardSQL
SELECT product, type, delivery_1_total_ordered
FROM `project.dataset.product_type`
LEFT JOIN (
SELECT TRIM(SPLIT(kv, ':')[OFFSET(0)], '"') product,
COUNT(1) delivery_1_total_ordered
FROM `project.dataset.customers_orders` t /* added alias */
JOIN `project.dataset.delivery_1`
USING(customer_no)
CROSS JOIN UNNEST(SPLIT(TRIM(TO_JSON_STRING(t), '{}'))) kv /* used alias instead of explicit list of products */
WHERE SPLIT(kv, ':')[OFFSET(1)] = '"yes"'
GROUP BY product
)
USING(product)
The approach is to unpivot the results from the first table and join them to the second. Then, just aggregate to get the counts.
I would approach this as:
select pt.*,
(select count(*)
from customers_orders co join
delivery_1 d
using (customer_no) cross join
unnest(array[struct('product_a' as product, product_a as flag),
struct('product_b', product_b),
struct('product_c', product_c)
]
) u
where pt.product = u.product and flag = 'yes'
) as delivery_1_ordered
from product_type pt;

How to select a row after group by unioned tables?

I need to select the newest row per user from two tables.
Table A and Table B have the same schema, like this:
Table A :
user_id, time_stamp, order_id
1,20190101,100
2,20190103,201
3,20190102,300
5,20180209,99
Table B:
user_id, time_stamp, order_id
1,20190102,101
2,20190101,200
3,20190103,305
4,20190303,900
I want to take A union B and then, for each user, select the newer row (by time_stamp).
The output should be:
1,20190102,101
2,20190103,201
3,20190103,305
4,20190303,900
5,20180209,99
How to write this SQL?
You can write it like the following sample query (DISTINCT ON is PostgreSQL-specific):
with unionedTable as (
select * from tableA
union
select * from tableB)
,newerUsersTable as (
select distinct on (u.user_id) u.*
from unionedTable u
order by u.user_id, u.time_stamp desc
)
select * from newerUsersTable
The main idea is to use a FULL OUTER JOIN between the two tables and then UNION [ALL] to return the data set. Consider the following SELECT statement with a WITH clause:
with a( user_id, time_stamp, order_id ) as
(
select 1,20190101,100 union all
select 2,20190103,201 union all
select 3,20190102,300 union all
select 5,20180209,99
), b( user_id, time_stamp, order_id ) as
(
select 1,20190102,101 union all
select 2,20190101,200 union all
select 3,20190103,305 union all
select 4,20190303,900
), c as
(
select a.user_id as user_id_a, a.time_stamp as time_stamp_a, a.order_id as order_id_a,
b.user_id as user_id_b, b.time_stamp as time_stamp_b, b.order_id as order_id_b
from a full outer join b
on a.user_id = b.user_id
), d as
(
select user_id_a, time_stamp_a, order_id_a
from c
where coalesce(time_stamp_b,time_stamp_a) <= time_stamp_a
union all
select user_id_b, time_stamp_b, order_id_b
from c
where time_stamp_b >= coalesce(time_stamp_a,time_stamp_b)
)
select user_id_a as user_id, time_stamp_a as time_stamp, order_id_a as order_id
from d
order by user_id_a;
user_id time_stamp order_id
1 20190102 101
2 20190103 201
3 20190103 305
4 20190303 900
5 20180209 99
Use GROUP BY user_id to get one row per user_id, and MAX(time_stamp) to find each user's newest row:
SELECT aa.* from (select * from a union SELECT * from b ) aa
JOIN
(select user_id,max(time_stamp) as new_time
from (select * from a union SELECT * from b ) u
group by u.user_id) bb
on bb.new_time=aa.time_stamp and bb.user_id=aa.user_id
order by aa.user_id;
I would simply do:
select user_id, time_stamp, order_id
from (select ab.*,
row_number() over (partition by user_id order by time_stamp desc) as seqnum
from (select a.* from a union all
select b.* from b
) ab
) ab
where seqnum = 1;

How to use union if i need to "order by" all selects

I have 3 separate select statements that I need to union, but each of them needs to be ordered by a different column.
I tried doing this:
select * from(
select * from (select columns from table1 order by column1 ) A
UNION
select * from (select columns from table2 order by column2 ) B
UNION
select * from (select columns from table3 order by column3 ) C
) Table
but this doesn't work.
Does anyone have any experience with this?
You can do something like this:
select *
from((select columns, 'table1' as which from table1 )
UNION ALL
(select columns, 'table2' from table2 )
UNION ALL
(select columns, 'table3' from table3 )
) t
order by which,
(case when which = 'table1' then column1
when which = 'table2' then column2
when which = 'table3' then column3
end);
This assumes that the columns used for ordering are all of the same type.
Note that this query uses union all instead of union. I see no reason why you would want to eliminate duplicates if you want the results from the three subqueries ordered independently.
EDIT:
You can also express the order by separately for each table:
order by which,
(case when which = 'table1' then column1 end) ASC,
(case when which = 'table2' then column2 end) DESC,
(case when which = 'table3' then column3 end)
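Here is a small self-contained sketch of that per-table ordering, with two made-up branches and a hypothetical column val standing in for column1/column2. Each CASE is NULL for rows from the other branch, so it only affects the order within its own branch.

```sql
SELECT which, val
FROM (
    -- inline sample rows; 'table1' / 'table2' are just labels
    SELECT 'table1' AS which, 3 AS val UNION ALL
    SELECT 'table1', 1 UNION ALL
    SELECT 'table2', 10 UNION ALL
    SELECT 'table2', 20
) t
ORDER BY which,
    CASE WHEN which = 'table1' THEN val END ASC,   -- table1 rows ascending
    CASE WHEN which = 'table2' THEN val END DESC;  -- table2 rows descending
```

With these rows the result comes back as table1/1, table1/3, table2/20, table2/10.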
You should project these columns into one common column and then order by it:
SELECT * FROM
(
SELECT A.*,columnA as ORDER_COL FROM A
UNION ALL
SELECT B.*,columnB as ORDER_COL FROM B
UNION ALL
SELECT C.*,columnC as ORDER_COL FROM C
) as T1
ORDER BY ORDER_COL
You have to order it AFTER the UNIONs.
You can "trick it" like this:
select Artificial, a,b,c from(
select 1 as Artificial, a,b,c from (select columns from table1 ) A
UNION
select 2 as Artificial,a,b,c from (select columns from table2 ) B
UNION
select 3 as Artificial,a,b,c from (select columns from table3 ) C
) derivedTable
order by Artificial, c,b,a

Select query select based on a priority

Someone please change my title to better reflect what I am trying to ask.
I have a table like
Table (id, value, value_type, data)
ID is NOT unique. There is no unique key.
value_type has two possible values, let's say A and B.
Type B is better than A, but often not available.
For each id if any records with value_type B exists, I want all the records with that id and value_type B.
If no record for that id with value_Type B exists I want all records with that id and value_type A.
Notice that if B exists for that id I don't want records with type A.
I currently do this with a series of temp tables. Is there a single select statement (sub queries OK) that can do the job?
Thanks so much!
Additional details:
SQL Server 2005
RANK, rather than ROW_NUMBER, because you want ties (those with the same B value) to have the same rank value:
WITH summary AS (
SELECT t.*,
RANK() OVER (PARTITION BY t.id
ORDER BY t.value_type DESC) AS rank
FROM TABLE t
WHERE t.value_type IN ('A', 'B'))
SELECT s.id,
s.value,
s.value_type,
s.data
FROM summary s
WHERE s.rank = 1
Non CTE version:
SELECT s.id,
s.value,
s.value_type,
s.data
FROM (SELECT t.*,
RANK() OVER (PARTITION BY t.id
ORDER BY t.value_type DESC) AS rank
FROM TABLE t
WHERE t.value_type IN ('A', 'B')) s
WHERE s.rank = 1
WITH test AS (
SELECT 1 AS id, 'B' AS value_type
UNION ALL
SELECT 1, 'B'
UNION ALL
SELECT 1, 'A'
UNION ALL
SELECT 2, 'A'
UNION ALL
SELECT 2, 'A'),
summary AS (
SELECT t.*,
RANK() OVER (PARTITION BY t.id
ORDER BY t.value_type DESC) AS rank
FROM test t)
SELECT *
FROM summary
WHERE rank = 1
I get:
id value_type rank
----------------------
1 B 1
1 B 1
2 A 1
2 A 1
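To see why RANK fits here and ROW_NUMBER does not, it may help to put both side by side on the same test rows. A small sketch (the aliases rnk and rn are just illustrative names):

```sql
WITH test AS (
    SELECT 1 AS id, 'B' AS value_type
    UNION ALL SELECT 1, 'B'
    UNION ALL SELECT 1, 'A'
    UNION ALL SELECT 2, 'A'
    UNION ALL SELECT 2, 'A')
SELECT t.id,
       t.value_type,
       -- tied rows share the same rank
       RANK()       OVER (PARTITION BY t.id ORDER BY t.value_type DESC) AS rnk,
       -- tied rows still get distinct row numbers
       ROW_NUMBER() OVER (PARTITION BY t.id ORDER BY t.value_type DESC) AS rn
FROM test t
ORDER BY t.id, rn;
```

Both B rows for id 1 get rnk = 1 but rn = 1 and 2, so filtering on rn = 1 would drop one of them, while filtering on rnk = 1 keeps every tied row.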
SELECT *
FROM table
WHERE value_type = 'B'
UNION ALL
SELECT *
FROM table
WHERE ID not in (SELECT distinct id
FROM table
WHERE value_type = 'B')
The shortest query to do the job I can think of:
SELECT TOP 1 WITH TIES *
FROM #test
ORDER BY Rank() OVER (PARTITION BY id ORDER BY value_type DESC)
This uses about 50% more CPU than OMG Ponies' and Christoperous 5000's solutions, but the same number of reads. The extra sort is what makes it take more CPU.
The best-performing original query I've come up with so far is:
SELECT *
FROM #test
WHERE value_type = 'B'
UNION ALL
SELECT *
FROM #test T1
WHERE NOT EXISTS (
SELECT *
FROM #test T2
WHERE
T1.id = T2.id
AND T2.value_type = 'B'
)
This consistently beats all the others presented on CPU by about 1/3rd (the others are about 50% more) but has 3x the number of reads. The duration on this query is often 2/3rds the time of all the others. I consider it a good contender.
Indexes and data types could change everything.
declare @test as table(
id int , value [nvarchar](255),value_type [nvarchar](255),data int)
INSERT INTO @test
SELECT 1, 'X', 'A',1 UNION
SELECT 1, 'X', 'A',2 UNION
SELECT 1, 'X', 'A',3 UNION
SELECT 1, 'X', 'A',4 UNION
SELECT 2, 'X', 'A',5 UNION
SELECT 2, 'X', 'B',6 UNION
SELECT 2, 'X', 'B',7 UNION
SELECT 2, 'X', 'A',8 UNION
SELECT 2, 'X', 'A',9
SELECT * FROM @test x
INNER JOIN
(SELECT id, MAX(value_type) as value_type FROM
@test GROUP BY id) as y
ON x.id = y.id AND x.value_type = y.value_type
Try this (MSSQL).
Select id, value_typeB, null
from myTable
where value_typeB is not null
Union All
Select id, null, value_typeA
from myTable
where value_typeB is null and value_typeA is not null
Perhaps something like this:
select * from mytable
where value_type = 'B'
union all
select * from mytable
where id in (select distinct id from mytable where value_type = 'A'
and id not in (select distinct id from mytable where value_type = 'B'))
This uses a union, combining all records of value B with all records that have only A values:
SELECT *
FROM mainTable
WHERE value_type = 'B'
UNION ALL
SELECT *
FROM mainTable
WHERE value_type = 'A'
AND id NOT IN(SELECT id
FROM mainTable
WHERE value_type = 'B');