Returning several values within a CASE expression in subquery and separate columns again in main query - sql

My test table looks like this:
# select * from a;
source | target | id
--------+--------+----
1 | 2 | 1
2 | 3 | 2
3 | 0 | 3
My query is this one:
SELECT *
FROM (
SELECT
CASE
WHEN id<>1
THEN source
ELSE 0
END
AS source,
CASE
WHEN id<>1
THEN target
ELSE 0
END
AS target
FROM a
) x;
The query seems a bit odd because the CASE expression with the same criteria is repeated for every column. I would like to simplify this and tried the following, but it doesn't work as expected.
SELECT *
FROM (
SELECT
CASE
WHEN id<>1
THEN (source, target)
ELSE (0, 0)
END
AS r
FROM a
) x;
It yields one column with a row value, but I would rather get the two original columns. Separating them with a (r).* or similar doesn't work, because the "record type has not been registered".
I found several questions here with solutions regarding functions returning RECORD values, but none regarding this example with a sub-select.
Actually, there is a quite long list of columns, so repeating the same CASE expression many times makes the whole query quite unreadable.
Since the real problem - as opposed to this simplified case - consists of several CASE expressions and several column groups, a solution with a UNION won't help, because the number of UNIONs would be large and make it unreadable as well as several CASEs.
My actual question is: How can I get the original columns from the row value?

This answers the original question.
If I understood your needs, you want 0 and 0 for source and target when id = 1:
SELECT
0 AS source,
0 AS target
FROM tablename
WHERE id = 1
UNION ALL
SELECT
source,
target
FROM tablename
WHERE id <> 1

Revised answer: You can make your query work (fixing the record type has not been registered issue) by creating a TYPE:
CREATE TYPE stpair AS (source int, target int);
And cast the composite value column to that type:
SELECT id, (cv).source, (cv).target
FROM (
SELECT id, CASE
WHEN id <> 1 THEN (source, target)::stpair
ELSE (0, 0)::stpair
END AS cv
FROM t
) AS x
Having said that, it should be far more convenient to use arrays:
SELECT id, av[1] AS source, av[2] AS target
FROM (
SELECT id, CASE
WHEN id <> 1 THEN ARRAY[source, target]
ELSE ARRAY[0, 0]
END AS av
FROM t
) AS x
Demo on db<>fiddle

Will this work for you?
select source,target,id from a where id <>1 union all select 0 as source,0 as target,id from a where id=1 order by id
I have used union all to included cases where multiple records may have ID=1

Related

SQL Server 'AS' alias unexpected syntax

I've come across following T-SQL today:
select c from (select 1 union all select 1) as d(c)
that yields following result:
c
-----------
1
1
The part that got me confused was d(c)
While trying to understand what's going on I've modified T-SQL into:
select c, b from (select 1, 2 union all select 3, 4) m(c, b)
which yields following result:
c b
----------- -----------
1 2
3 4
It was clear that d & m are table reference while letters in brackets c & b are reference to columns.
I wasn't able to find relevant documentation on msdn, but curious if
You're aware of such syntax?
What would be useful use case scenario?
select c from (select 1 union all select 1) as d(c)
is the same as
select c from (select 1 as c union all select 1) as d
In the first query you did not name the column(s) in your subquery, but named them outside the subquery,
In the second query you name the column(s) inside the subquery
If you try it like this (without naming the column(s) in the subquery)
select c from (select 1 union all select 1) as d
You will get following error
No column name was specified for column 1 of 'd'
This is also in the Documentation
As for the usage, some like to write it the first method, some in the second, whatever you prefer. It's all the same
An observation: Using the table constructor values gives you no way of naming the columns, which makes it neccessary to use column naming after the table alias:
select * from
(values
(1,2) -- can't give a column name here
,(3,4)
) as tableName(column1,column2) -- gotta do it here
You've already had comments that point you to the documentation of how derived tables work, but not to answer you question regarding useful use cases for this functionality.
Personally I find this functionality to be useful whenever I want to create a set of addressable values that will be used extensively in your statement, or when I want to duplicate rows for whatever reason.
An example of addressable values would be a much more compelx version of the following, in which the calculated values in the v derived table can be used many times over via more sensible names, rather than repeated calculations that will be hard to follow:
select p.ProductName
,p.PackPricePlusVAT - v.PackCost as GrossRevenue
,etc
from dbo.Products as p
cross apply(values(p.UnitsPerPack * p.UnitCost
,p.UnitPrice * p.UnitsPerPack * 1.2
,etc
)
) as v(PackCost
,PackPricePlusVAT
,etc
)
and an example of being able to duplicate rows could be in creating an exception report for use in validating data, which will output one row for every DataError condition that the dbo.Product row satisfies:
select p.ProductName
,e.DataError
from dbo.Products as p
cross apply(values('Missing Units Per Pack'
,case when p.SoldInPacks = 1 and isnull(p.UnitsPerPack,0) < 1 then 1 end
)
,('Unusual Price'
,case when p.Price > (p.UnitsPerPack * p.UnitCost) * 2 then 1 end
)
,(etc)
) as e(DataError
,ErrorFlag
)
where e.ErrorFlag = 1
If you can understand what these two scripts are doing, you should find numerous examples of where being able to generate additional values or additional rows of data would be very helpful.

Oracle SQL - Multiple return from case

I may be trying it wrong. I am looking for any approach which is best.
Requirement:
My Query joins 4-5 tables based on few fields.
I have a column called product id. In my table there are 1.5 million rows. Out of those only 10% rows has product ids with the following attribute
A300X-%
A500Y-%
300,500, 700 are valid model numbers. X and Y are classifications. My query picks all the systems.
I have a check as follows
CASE
WHEN PID LIKE 'A300X%'
THEN 'A300'
...
END AS MODEL
Similarly
CASE
WHEN PID LIKE 'A300X%'
THEN 'X'
...
END AS GENRE
I am looking for the best option from the below
How do I Combine both case statement and add another[third] case which will have these two cases. i.e
CASE
WHEN desc in ('AAA')
First Case
Second Case
ELSE
don't do anything for other systems
END
Is there any regex way of doing this? Before first - take the string. Look for X, Y and also 300,500,700.
Is there any other way of doing this? Or doing via code is the best way?
Any suggestions?
EDIT:
Sample desc:
AAA,
SoftwARE,
sw-app
My query picks all the desc. But the case should be running for AAA alone.
And Valid models are
A300X-2x-P
A500Y-5x-p
A700X-2x-p
A50CE-2x-P
I have to consider only 300,500,700. And the above two cases.
Expected result:
MODEL GENRE
A300 X
A500 Y
A300 Y
Q: How do I Combine both CASE statement expressions
Each CASE expression will return a single value. If the requirement is to return two separate columns in the resultset, that will require two separate expressions in the SELECT list.
For example:
DESC PID model_number genre
---- ---------- ------------ ------
AAA A300X-2x-P 300 X
AAA A500Y-5x-p 500 Y
AAA A700X-2x-p 700 X
AAA A50CE-2x-P (NULL) (NULL)
FOO A300X-2x-P (NULL) (NULL)
There will need to be an expression to return the model_number column, and a separate expression to return the genre column.
It's not possible for a single expression to return two separate columns.
Q: and add another[third] case which will have these two cases.
A CASE expression returns a value; we can use a CASE expression almost anywhere in a SQL statement where we can use a value, including within another CASE expression.
We can also combine multiple conditions in a WHEN test with AND and OR
As an example of combining conditions and nesting CASE expressions ditions...
CASE
WHEN ( ( t.PID LIKE '_300%' OR t.PID LIKE '_500%' OR t.PID LIKE '_700%' )
AND ( t.DESC = 'AAA' )
)
THEN CASE
WHEN ( t.PID LIKE '____X%' )
THEN 'X'
WHEN ( t.PID LIKE '____Y%' )
THEN 'Y'
ELSE NULL
END
ELSE NULL
END AS `genre`
There are other expressions that will return an equivalent result; the example shown here isn't necessarily the best expression. It just serves as a demonstration of combining conditions and nesting CASE expressions.
Note that to return another column model we would need to include another expression in the SELECT list. Similar conditions will need to be repeated; it's not possible to reference the WHEN conditions in another CASE expression.
Based on your sample data, logic such as this would work:
(CASE WHEN REGEXP_LIKE(PID, '^A[0-9]{3}[A-Z]-')
THEN SUBSTR(PID, 1, 4)
ELSE PID
END) AS MODEL
(CASE WHEN REGEXP_LIKE(PID, '^A[0-9]{3}[A-Z]-')
THEN SUBSTR(PID, 5, 1)
ELSE PID
END) AS GENRE
This assumes that the "model number" always starts with "A" and is followed by three digits (as in your example data). If the model number is more complicated, you may need regexp_substr() to extract the values you want.

SQL Rows to Columns if column values are unknown

I have a table that has demographic information about a set of users which looks like this:
User_id Category IsMember
1 College 1
1 Married 0
1 Employed 1
1 Has_Kids 1
2 College 0
2 Married 1
2 Employed 1
3 College 0
3 Employed 0
The result set I want is a table that looks like this:
User_Id|College|Married|Employed|Has_Kids
1 1 0 1 1
2 0 1 1 0
3 0 0 0 0
In other words, the table indicates the presence or absence of a category for each user. Sometimes the user will have a category where the value if false, sometimes the user will have no row for a category, in which case IsMember is assumed to be false.
Also, from time to time additional categories will be added to the data set, and I'm wondering if its possible to do this query without knowing up front all the possible category names, in other words, I won't be able to specify all the column names I want to count in the result. (Note only user 1 has category "has_kids" and user 3 is missing a row for category "married"
(using Postgres)
Thanks.
You can use jsonb funcions.
with titles as (
select jsonb_object_agg(Category, Category) as titles,
jsonb_object_agg(Category, -1) as defaults
from demog
),
the_rows as (
select null::bigint as id, titles as data
from titles
union
select User_id, defaults || jsonb_object_agg(Category, IsMember)
from demog, titles
group by User_id, defaults
)
select id, string_agg(value, '|' order by key)
from (
select id, key, value
from the_rows, jsonb_each_text(data)
) x
group by id
order by id nulls first
You can see a running example in http://rextester.com/QEGT70842
You can replace -1 with 0 for the default value and '|' with ',' for the separator.
You can install tablefunc module and use the crosstab function.
https://www.postgresql.org/docs/9.1/static/tablefunc.html
I found a Postgres function script called colpivot here which does the trick. Ran the script to create the function, then created the table in one statement:
select colpivot ('_pivoted', 'select * from user_categories', array['user_id'],
array ['category'], '#.is_member', null);

Select subquery inside then of case when statement, sqlite bug?

I'm finding that when I try to select a column in a SQL case statement, it doesn't work unless I wrap it in a numeric function. The max(price) seems to select the value in the column, while just price, or price always returns blank.
I think its a bug.
This doesn't work:
SELECT
auction,
CASE WHEN auction='1'
THEN (select max(bid.amount) from bid where bid.auction_id = auction.id)
ELSE price
END as price_string
FROM product
This works:
SELECT
auction,
CASE WHEN auction='1'
THEN (select max(bid.amount) from bid where bid.auction_id = auction.id)
ELSE max(price)
END as price_string
FROM product
Edit: fixed comma.
This is not a bug.
In the query that doesn't work, the two halves of your CASE statement are incompatible. You have an aggregate function (MAX(bid.amount)) that returns a single value for one part of your CASE statement, and the name of a column, which will return a set, for the other part of your CASE statement. You cannot mix aggregates and sets like this.
The query that works does so because both halves of the CASE statement are returning aggregate values and are therefore compatible.
Take a simple table:
test_table
col1 | col2
1 7
5 14
8 3
3 9
If I query like this:
SELECT col1 FROM test_table
I'll get this result, a set:
col1
1
5
8
3
But if I query like this:
SELECT MAX(col1) FROM test_table
I'll get a single value:
8

Counting non-null columns in a rather strange way

I have a table which has 32 columns in an Oracle table.
Two of these columns are identity columns
the rest are values
I would like to get the average of all the value columns, which is complicated by the null (identity) columns. Below is the pseudocode for what I am trying to achieve:
SELECT
((nvl(val0, 0) + nvl(val1, 0) + ... nvl(valn, 0))
/ nonZero_Column_Count_In_This_Row)
Such that: nonZero_Column_Count_In_This_Row = (ifNullThenZeroElse1(val0) + ifNullThenZeroElse1(val1) ... ifNullThenZeroElse(valn))
The difficulty here is of course in getting 1 for any non-null column. It seems I need a function similar to NVL, but with an else clause. Something that will return 0 if the value is null, but 1 if not, rather than the value itself.
How should I go about about getting the value for the denominator?
PS: I feel I must explain some motivation behind this design. Ideally this table would have been organized as the identity columns and one value per row with some identifier for the row itself. This would have made it more normalized and the solution to this problem would have been pretty simple. The reasons for it not to be done like this are throughput, and saving space. This is a huge DB where we insert 10 million values per minute into. Making each of these values one row would mean 10M rows per minute, which is definitely not attainable. Packing 30 of them into a single row reduces the number of rows inserted to something we can do with a single DB, and the overhead data amount (the identity data) much less.
(Case When col is null then 0 else 1 end)
You could use NVL2(val0, 1, 0) + NVL2(val1, 1, 0) + ... since you are using Oracle.
Another option is to use the AVG function, which ignores NULLs:
SELECT AVG(v) FROM (
WITH q AS (SELECT val0, val1, val2, val3 FROM mytable)
SELECT val0 AS v FROM q
UNION ALL SELECT val1 FROM q
UNION ALL SELECT val2 FROM q
UNION ALL SELECT val3 FROM q
);
If you're using Oracle11g you can use the UNPIVOT syntax to make it even simpler.
I see this is a pretty old question, but I don't see a sufficient answer. I had a similar problem, and below is how I solved it. It's pretty clear a case statement is needed. This solution is a workaround for such cases where
SELECT COUNT(column) WHERE column {IS | IS NOT} NULL
does not work for whatever reason, or, you need to do several
SELECT COUNT ( * )
FROM A_TABLE
WHERE COL1 IS NOT NULL;
SELECT COUNT ( * )
FROM A_TABLE
WHERE COL2 IS NOT NULL;
queries but want it as a data set when you run the script. See below; I use this for analysis and it's been working great for me so far.
SUM(CASE NVL(valn, 'X')
WHEN 'X'
THEN 0
ELSE 1
END) as COLUMN_NAME
FROM YOUR_TABLE;
Cheers!
Doug
Generically, you can do something like this:
SELECT (
(COALESCE(val0, 0) + COALESCE(val1, 0) + ...... COALESCE(valn, 0))
/
(SIGN(ABS(COALESCE(val0, 0))) + SIGN(ABS(COALESCE(val1, 0))) + .... )
) AS MyAverage
The top line will return the sum of values (omitting NULL values) whereas the bottom line will return the number of non-null values.
FYI - it's SQL Server syntax, but COALESCE is just like ISNULL for the most part. SIGN just returns -1 for a negative number, 0 for zero, and 1 for a positive number. ABS is "absolute value".