Dynamically selecting the column to select from the row itself in SQL

I have a SQL Server table with data as follows. The number of P columns is fixed, but there are many of them. There will also be other groups of columns in the same fashion, such as S1, S2, etc.
Id  SelectedP  P1  P2  P3  P4  P5
1   P2         3   8   4   15  7
2   P1         0   2   6   0   3
3   P3         1   15  2   1   11
4   P4         3   4   6   2   4
I need to write a SQL statement that produces the result below. The column to select from each row depends on the SelectedP value in that same row: SelectedP holds the name of the column to read.
Id  SelectedP  Selected-P-Value
1   P2         8
2   P1         0
3   P3         2
4   P4         2
Thanks in advance.

You just need a CASE expression...
SELECT
id,
SelectedP,
CASE SelectedP
WHEN 'P1' THEN P1
WHEN 'P2' THEN P2
WHEN 'P3' THEN P3
WHEN 'P4' THEN P4
WHEN 'P5' THEN P5
END
AS SelectedPValue
FROM
yourTable
This will return NULL for anything not mentioned in the CASE expression.
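To make the NULL fallback concrete, here is a minimal sketch that runs the same CASE expression against the question's sample data. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the example is runnable), and the fifth row with SelectedP = 'P9' is a hypothetical addition that is not in the question.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable "
    "(Id INTEGER, SelectedP TEXT, P1 INT, P2 INT, P3 INT, P4 INT, P5 INT)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "P2", 3, 8, 4, 15, 7),
        (2, "P1", 0, 2, 6, 0, 3),
        (3, "P3", 1, 15, 2, 1, 11),
        (4, "P4", 3, 4, 6, 2, 4),
        (5, "P9", 1, 1, 1, 1, 1),  # hypothetical row naming a column that does not exist
    ],
)

case_rows = conn.execute(
    """
    SELECT Id,
           SelectedP,
           CASE SelectedP
               WHEN 'P1' THEN P1
               WHEN 'P2' THEN P2
               WHEN 'P3' THEN P3
               WHEN 'P4' THEN P4
               WHEN 'P5' THEN P5
           END AS SelectedPValue
    FROM yourTable
    ORDER BY Id
    """
).fetchall()

for row in case_rows:
    print(row)
```

The first four rows match the expected result in the question, and the hypothetical 'P9' row comes back with None (SQL NULL) because no WHEN branch matches it.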
EDIT:
An option with just a little less typing...
SELECT
id, SelectedP, val
FROM
yourTable AS pvt
UNPIVOT
(
val FOR P IN
(
P1,
P2,
P3,
P4,
P5
)
)
AS unpvt
WHERE
SelectedP = P
NOTE: If the value of SelectedP doesn't exist in the UNPIVOT, then the row will not appear at all (unlike the CASE expression which will return a NULL)
Demo: https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=b693738aac0b594cf37410ee5cb15cf5
EDIT 2:
I don't know if this will perform much worse than the 2nd option, but this preserves the NULL behaviour.
(The preferred option is still to fix your data-structure.)
SELECT
id, SelectedP, MAX(CASE WHEN SelectedP = P THEN val END) AS val
FROM
yourTable AS pvt
UNPIVOT
(
val FOR P IN
(
P1,
P2,
P3,
P4,
P5
)
)
AS unpvt
GROUP BY
id, SelectedP
Demo : https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=f3f64d2fb6e11fd24d1addbe1e50f020
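The difference between the two UNPIVOT variants can be sketched as follows. SQLite has no UNPIVOT, so this example emulates it with a UNION ALL over the five columns; that substitution, the use of Python's sqlite3 module, and the extra row with SelectedP = 'P9' are all assumptions for illustration, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable "
    "(Id INTEGER, SelectedP TEXT, P1 INT, P2 INT, P3 INT, P4 INT, P5 INT)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "P2", 3, 8, 4, 15, 7),
        (2, "P1", 0, 2, 6, 0, 3),
        (3, "P3", 1, 15, 2, 1, 11),
        (4, "P4", 3, 4, 6, 2, 4),
        (5, "P9", 1, 1, 1, 1, 1),  # hypothetical row with an unknown column name
    ],
)

# UNION ALL stands in for UNPIVOT: one (Id, SelectedP, P, val) row per P column.
unpvt = " UNION ALL ".join(
    f"SELECT Id, SelectedP, '{col}' AS P, {col} AS val FROM yourTable"
    for col in ("P1", "P2", "P3", "P4", "P5")
)

# Variant 1: filter on SelectedP = P. Rows with no matching column vanish.
filtered = conn.execute(
    f"SELECT Id, SelectedP, val FROM ({unpvt}) WHERE SelectedP = P ORDER BY Id"
).fetchall()

# Variant 2: aggregate per row. Rows with no matching column keep a NULL value.
grouped = conn.execute(
    f"SELECT Id, SelectedP, MAX(CASE WHEN SelectedP = P THEN val END) AS val "
    f"FROM ({unpvt}) GROUP BY Id, SelectedP ORDER BY Id"
).fetchall()

print(filtered)
print(grouped)
```

With this data, `filtered` contains only Ids 1 to 4, while `grouped` also contains Id 5 with a None value, which mirrors the behaviour described for the CASE expression.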

Related

Create dynamic binary columns in sql query

I'm using presto db
I have two tables, one looks like:
table 1:
item count
p1 20
p2 10
p3 5
p4 4
p5 2
and table 2:
person lic
c1 p2
c1 p1
c2 p3
c2 p4
c2 p2
c3 p1
c4 p2
I want to return a table that looks like:
person p1 p2 p3 p4 p5
c1 1 1 0 0 0
c2 0 1 1 1 0
c3 1 0 0 0 0
c4 0 1 0 0 0
c5 0 0 0 0 0
It looks like a pivot would do, but I'm not sure how to account for missing values in the column and get them to be '0' in the final table.
The output schema for a SQL query must be fixed. Thus, if you want a column p1 to appear in the output, it has to be listed explicitly in the query.
I'm not sure how table1 is related to the output, but you can do a pivot like this:
SELECT person
, count_if(lic = 'p1') p1
, count_if(lic = 'p2') p2
...
FROM table2
GROUP BY person
The query needs to list each p column. Depending on your application, you might be able to generate the query programmatically by first running a query to get the unique values of p.
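One way to generate the query programmatically is sketched below. It assumes Python with the built-in sqlite3 module as a stand-in for a Presto client, and it uses `SUM(lic = ?)` where Presto would use `count_if(lic = 'p1')`, because SQLite has no `count_if`. Taking the column list from table1 is also an assumption; it is what makes the all-zero p5 column appear.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (item TEXT, "count" INTEGER);
    CREATE TABLE table2 (person TEXT, lic TEXT);
    INSERT INTO table1 VALUES ('p1', 20), ('p2', 10), ('p3', 5), ('p4', 4), ('p5', 2);
    INSERT INTO table2 VALUES
        ('c1', 'p2'), ('c1', 'p1'), ('c2', 'p3'), ('c2', 'p4'),
        ('c2', 'p2'), ('c3', 'p1'), ('c4', 'p2');
    """
)

# Step 1: ask the database which output columns are needed.
items = [row[0] for row in conn.execute("SELECT DISTINCT item FROM table1 ORDER BY item")]

# The item names become column aliases, so reject anything that is not a plain identifier.
for item in items:
    if not item.isalnum():
        raise ValueError(f"unsafe column name: {item!r}")

# Step 2: build one aggregate per item; the compared values are bound as parameters.
select_list = ", ".join(f'SUM(lic = ?) AS "{item}"' for item in items)
pivot_sql = f"SELECT person, {select_list} FROM table2 GROUP BY person ORDER BY person"

pivot_rows = conn.execute(pivot_sql, items).fetchall()
for row in pivot_rows:
    print(row)
```

Missing combinations come out as 0 without any special handling, because each aggregate simply counts zero matching rows. A person such as c5, who has no row in table2, cannot appear unless there is a separate table of persons to join from.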

Eliminate duplicate rows by outer joining two table in Oracle 11i

I have following two table with sample Data.
PLACED_PERSON_INFO
*PLACED_PERSON_INFO_GUID CPR*
P1 0201026157
P2 0309929493
P3 0002170000
P4 0000011037
P5 1201006694
P6 1201009887
P7 1110007144
P8 0309906353
P9 0101002420
PLACED_PERSON_PLACES
*PP_ID PLACEMENT_DATE PLACEMENT_STOP PLACED_PERSON_INFO_GUID*
1 01-01-2014 31-12-2014 P1
2 01-01-2014 31-12-2014 P1
3 01-01-2013 31-12-2013 P2
4 01-06-2014 30-10-2014 P3
5 01-02-2014 30-10-2014 P3
6 01-01-2013 01-01-2015 P4
7 01-01-2013 30-05-2013 P4
8 01-01-2012 30-03-2013 P5
I have written the following SQL Query to get the result combining these two tables.
SQL Query :
SELECT
PPI.PLACED_PERSON_INFO_GUID, PPI.CPR
FROM PLACED_PERSON_PLACES PPP, PLACED_PERSON_INFO PPI
WHERE (PPP.PLACEMENT_DATE <= SYSDATE OR PPP.PLACEMENT_DATE IS NULL)
AND (PPP.PLACEMENT_STOP >= SYSDATE OR PPP.PLACEMENT_STOP IS NULL)
AND PPP.PLACED_PERSON_INFO_GUID (+) = PPI.PLACED_PERSON_INFO_GUID
ORDER BY PPI.CPR;
Query Result:
PLACED_PERSON_INFO_GUID CPR
P1 0201026157
P1 0201026157
P3 0002170000
P3 0002170000
P4 0000011037
P6 1201009887
P7 1110007144
P8 0309906353
P9 0101002420
But I want the following result where duplicate rows will not be shown. I do not want to use DISTINCT keyword. Can anyone help me in this result? I am using Oracle 11i.
Expected Result:
PLACED_PERSON_INFO_GUID CPR
P1 0201026157
P3 0002170000
P4 0000011037
P6 1201009887
P7 1110007144
P8 0309906353
P9 0101002420
First, you should write your query using explicit join syntax:
SELECT PPI.PLACED_PERSON_INFO_GUID, PPI.CPR
FROM PLACED_PERSON_INFO PPI LEFT JOIN
PLACED_PERSON_PLACES PPP
ON PPP.PLACEMENT_DATE <= SYSDATE AND
PPP.PLACEMENT_STOP >= SYSDATE AND
PPP.PLACED_PERSON_INFO_GUID = PPI.PLACED_PERSON_INFO_GUID
ORDER BY PPI.CPR;
If you only want one row, then you can use row_number():
SELECT PLACED_PERSON_INFO_GUID, CPR
FROM (SELECT PPI.PLACED_PERSON_INFO_GUID, PPI.CPR,
ROW_NUMBER() OVER (PARTITION BY PPI.PLACED_PERSON_INFO_GUID, PPI.CPR ORDER BY PPI.CPR) as seqnum
FROM PLACED_PERSON_INFO PPI LEFT JOIN
PLACED_PERSON_PLACES PPP
ON PPP.PLACEMENT_DATE <= SYSDATE AND
PPP.PLACEMENT_STOP >= SYSDATE AND
PPP.PLACED_PERSON_INFO_GUID = PPI.PLACED_PERSON_INFO_GUID
) p
WHERE seqnum = 1
ORDER BY CPR;
You can add additional columns and still only get one row per pair.
Solution is :
SELECT PLACED_PERSON_INFO_GUID, CPR
FROM (SELECT PPI.PLACED_PERSON_INFO_GUID, PPI.CPR,
ROW_NUMBER() OVER (PARTITION BY PPI.PLACED_PERSON_INFO_GUID, PPI.CPR ORDER BY PPI.CPR) AS SEQNUM
FROM PLACED_PERSON_INFO PPI LEFT JOIN PLACED_PERSON_PLACES PPP
ON PPP.PLACED_PERSON_INFO_GUID = PPI.PLACED_PERSON_INFO_GUID
WHERE (PPP.PLACEMENT_DATE <= SYSDATE OR PPP.PLACEMENT_DATE IS NULL)
AND (PPP.PLACEMENT_STOP >= SYSDATE OR PPP.PLACEMENT_STOP IS NULL)
) P
WHERE SEQNUM = 1
ORDER BY CPR
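The two ROW_NUMBER() queries above differ in where the date conditions sit, and that changes which persons are returned. The sketch below reproduces both against the question's sample data. It assumes Python's sqlite3 module (SQLite 3.25 or later for window functions) in place of Oracle, dates stored as ISO text, and a fixed, hypothetical reference date of 2014-07-01 in place of SYSDATE so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE PLACED_PERSON_INFO (PLACED_PERSON_INFO_GUID TEXT, CPR TEXT);
    CREATE TABLE PLACED_PERSON_PLACES (
        PP_ID INTEGER, PLACEMENT_DATE TEXT, PLACEMENT_STOP TEXT,
        PLACED_PERSON_INFO_GUID TEXT);
    INSERT INTO PLACED_PERSON_INFO VALUES
        ('P1', '0201026157'), ('P2', '0309929493'), ('P3', '0002170000'),
        ('P4', '0000011037'), ('P5', '1201006694'), ('P6', '1201009887'),
        ('P7', '1110007144'), ('P8', '0309906353'), ('P9', '0101002420');
    INSERT INTO PLACED_PERSON_PLACES VALUES
        (1, '2014-01-01', '2014-12-31', 'P1'), (2, '2014-01-01', '2014-12-31', 'P1'),
        (3, '2013-01-01', '2013-12-31', 'P2'), (4, '2014-06-01', '2014-10-30', 'P3'),
        (5, '2014-02-01', '2014-10-30', 'P3'), (6, '2013-01-01', '2015-01-01', 'P4'),
        (7, '2013-01-01', '2013-05-30', 'P4'), (8, '2012-01-01', '2013-03-30', 'P5');
    """
)

AS_OF = {"d": "2014-07-01"}  # hypothetical stand-in for SYSDATE

TEMPLATE = """
SELECT PLACED_PERSON_INFO_GUID
FROM (SELECT PPI.PLACED_PERSON_INFO_GUID, PPI.CPR,
             ROW_NUMBER() OVER (PARTITION BY PPI.PLACED_PERSON_INFO_GUID, PPI.CPR
                                ORDER BY PPI.CPR) AS SEQNUM
      FROM PLACED_PERSON_INFO PPI
      LEFT JOIN PLACED_PERSON_PLACES PPP
        ON PPP.PLACED_PERSON_INFO_GUID = PPI.PLACED_PERSON_INFO_GUID
      {date_filter}) P
WHERE SEQNUM = 1
ORDER BY PLACED_PERSON_INFO_GUID
"""

# Date conditions in the ON clause: persons without a current placement are kept.
on_filter = TEMPLATE.format(
    date_filter="AND PPP.PLACEMENT_DATE <= :d AND PPP.PLACEMENT_STOP >= :d"
)
# Date conditions in the WHERE clause: persons whose placements are all
# outside the date range are removed after the join.
where_filter = TEMPLATE.format(
    date_filter="WHERE (PPP.PLACEMENT_DATE <= :d OR PPP.PLACEMENT_DATE IS NULL) "
                "AND (PPP.PLACEMENT_STOP >= :d OR PPP.PLACEMENT_STOP IS NULL)"
)

on_guids = [r[0] for r in conn.execute(on_filter, AS_OF)]
where_guids = [r[0] for r in conn.execute(where_filter, AS_OF)]
print(on_guids)
print(where_guids)
```

Under these assumptions the ON-clause version returns all nine persons once each, while the WHERE-clause version drops P2 and P5 (their only placements have ended) and so matches the expected result in the question.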

Take the max value of a column in a sql table

I have this query:
SELECT DISTINCT S.PRODOTTO, D.CODPROD, D.IDPROD
FROM D_PROD D, APP_SALES S
WHERE D.CODPROD = S.PRODOTTO
The result is:
PRODOTTO CODPROD IDPROD
P2 P2 2
P1 P1 1
P3 P3 4
P3 P3 3
Now I would like the result to be:
PRODOTTO CODPROD IDPROD
P2 P2 2
P1 P1 1
P3 P3 4
with product P3 taking the max idprod it has encountered.
How can I tell the query to take the max value when a product has more than one row?
I want the max idprod.
SELECT S.PRODOTTO, D.CODPROD, MAX(D.IDPROD) AS IDPROD
FROM D_PROD D, APP_SALES S
WHERE D.CODPROD = S.PRODOTTO
GROUP BY S.PRODOTTO, D.CODPROD
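As a quick check of the GROUP BY approach, here is a minimal sketch using Python's sqlite3 module. The table contents are hypothetical, chosen only to reproduce the result set shown in the question, and the explicit JOIN ... ON form is used in place of the comma join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE D_PROD (CODPROD TEXT, IDPROD INTEGER);
    CREATE TABLE APP_SALES (PRODOTTO TEXT);
    -- Hypothetical data: product P3 has two ids, 3 and 4.
    INSERT INTO D_PROD VALUES ('P1', 1), ('P2', 2), ('P3', 3), ('P3', 4);
    INSERT INTO APP_SALES VALUES ('P1'), ('P2'), ('P3'), ('P3');
    """
)

max_rows = conn.execute(
    """
    SELECT S.PRODOTTO, D.CODPROD, MAX(D.IDPROD) AS IDPROD
    FROM D_PROD D
    JOIN APP_SALES S ON D.CODPROD = S.PRODOTTO
    GROUP BY S.PRODOTTO, D.CODPROD
    ORDER BY S.PRODOTTO
    """
).fetchall()
print(max_rows)
```

Grouping already yields one row per product, so DISTINCT adds nothing once MAX and GROUP BY are in place.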

Row value inconsistency

Scenario -
We have pack items, each defined as a composite of one or more items. A complex pack is one that has more than one component item. Each component item of a complex pack should be linked to an equal number of locations.
For example:
Pack P1 has component C1, C2, and C3. Each item C1,C2 and C3 is ranged to 10 locations 1,2....10, such that C1-1,C1-2,...,C1-10,C2-1,C2-2,...,C2-10,and C3-1,C3-2,...,C3-10 exists. In such case the pack item P1 also gets associated to locations 1 through 10, as P1-1,P1-2,...,P1-10.
The table PACK_BREAKOUT contains the Pack component mapping and the table ITEM_LOCATION contains the items to location association. Both Pack and Component are considered as "items" and would exist in ITEM_LOCATION.
Ideally, for a scenario like the one above, the record set below would be valid:
PACK_NO ITEM NO_OF_LOC
-------- ------ -------------
P1 C1 10
P1 C2 10
P1 C3 10
I have the query below that returns result like above for all such pack items.
select c.pack_no,c.item,count(a.loc )
from item_location a, pack_breakout c
where c.item=a.item
group by c.pack_no,c.item
order by 1,2;
However, there are some discrepant results like pack no. P2 , P4, and P5 below where the components are not associated with equal number of locations.
PACK_NO ITEM NO_OF_LOC
-------- ------ -------------
P1 C1 10
P1 C2 10
P1 C3 10
P2 C1 11
P2 C2 5
P2 C3 9
P2 C4 11
P3 C1 21
P3 C2 21
P3 C3 21
P3 C4 21
P3 C5 21
P4 C1 10
P4 C2 15
P5 C1 10
P5 C2 9
P5 C3 10
P5 C4 10
Note that a pack can have n-number of components (as you can see P1, P2, P3, P4, and P5 have different number of components).
I would like to get only the packs whose component locations are not all consistent. So the desired result set would be-
PACK_NO ITEM NO_OF_LOC
-------- ------ -------------
P2 C1 11
P2 C2 5
P2 C3 9
P2 C4 11
P4 C1 10
P4 C2 15
P5 C1 10
P5 C2 9
P5 C3 10
P5 C4 10
Note that even if one component does not match no. of locations as the other components within the pack, the entire pack must be considered inconsistent (like P5).
You want to use another group by with a having clause:
select pack_no
from (select c.pack_no, c.item, count(a.loc ) as numlocs
from item_location a join
pack_breakout c
on c.item=a.item
group by c.pack_no, c.item
) p
group by pack_no
having MIN(numlocs) <> MAX(numlocs)
This returns the packs.
If you want the details of the numbers, then use the analytic functions for the calculation:
select pi.*
from (select pi.*, min(numlocs) over (partition by pack_no) as minnumlocs,
max(numlocs) over (partition by pack_no) as maxnumlocs
from (select c.pack_no, c.item, count(a.loc ) as numlocs
from item_location a join
pack_breakout c
on c.item=a.item
group by c.pack_no, c.item
) pi
) pi
where minnumlocs <> maxnumlocs
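Both queries can be tried on a small data set. The sketch below assumes Python's sqlite3 module (SQLite 3.25 or later for window functions) instead of Oracle, and uses hypothetical packs and components: P1 is consistent, P2 is not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE pack_breakout (pack_no TEXT, item TEXT);
    CREATE TABLE item_location (item TEXT, loc INTEGER);
    INSERT INTO pack_breakout VALUES
        ('P1', 'A1'), ('P1', 'A2'),
        ('P2', 'B1'), ('P2', 'B2'), ('P2', 'B3');
    """
)
# Hypothetical number of locations per component item.
loc_counts = {"A1": 2, "A2": 2, "B1": 3, "B2": 1, "B3": 3}
conn.executemany(
    "INSERT INTO item_location VALUES (?, ?)",
    [(item, loc) for item, n in loc_counts.items() for loc in range(1, n + 1)],
)

# Shared inner query: locations per (pack, component).
COUNTS = """
    SELECT c.pack_no, c.item, COUNT(a.loc) AS numlocs
    FROM item_location a
    JOIN pack_breakout c ON c.item = a.item
    GROUP BY c.pack_no, c.item
"""

# Pack numbers only: a pack is inconsistent when its min and max counts differ.
bad_packs = conn.execute(
    f"SELECT pack_no FROM ({COUNTS}) p "
    "GROUP BY pack_no HAVING MIN(numlocs) <> MAX(numlocs)"
).fetchall()

# Full detail: window functions attach the per-pack min and max to every row.
bad_details = conn.execute(
    f"""
    SELECT pack_no, item, numlocs
    FROM (SELECT pi.*,
                 MIN(numlocs) OVER (PARTITION BY pack_no) AS minnumlocs,
                 MAX(numlocs) OVER (PARTITION BY pack_no) AS maxnumlocs
          FROM ({COUNTS}) pi) x
    WHERE minnumlocs <> maxnumlocs
    ORDER BY pack_no, item
    """
).fetchall()

print(bad_packs)
print(bad_details)
```

Here only P2 is reported, and the detail query returns all three of its components, including the two whose counts agree with each other.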

Recursive Insert using connect by clause

I have hierarchical data (right) in a table, in the following form, which creates the hierarchy shown on the left. The tables are kept in Oracle 11g.
TREE Hierarchy        Tree Table
--------------        Element  Parent
                      -------  ------
P0                    P0
  P1                  P1       P0
    P11               P2       P0
      C111            P11      P1
      C112            P12      P1
    P12               P21      P2
      C121            P22      P2
      C122            C111     P11
  P2                  C112     P11
    P21               C121     P12
      C211            C122     P12
      C212            C211     P21
    P22               C212     P21
      C221            C221     P22
      C222            C222     P22
My data table has values as follows. It contains values for all leaf nodes.
Data Table
Element Value
C111 3
C112 3
C121 3
C122 3
C211 3
C212 3
C221 3
C222 3
P11 6
I need to generate an insert statement, preferably a single one, which will insert rows into the data table based on the sum of the values of the children.
Please note we need to calculate sum for only those parents whose value is not present in data table.
Data Table (Expected After Insert)
Element Value
C111 3
C112 3
C121 3
C122 3
C211 3
C212 3
C221 3
C222 3
P11 6
-- Rows to insert
P12 6
P21 6
P22 6
P1 12
P2 12
P0 24
If all leaf nodes are at the same height (here lvl=4), you can write a simple CONNECT BY query with a ROLLUP:
SELECT lvl0,
       regexp_substr(path, '[^/]+', 1, 2) lvl1,
       regexp_substr(path, '[^/]+', 1, 3) lvl2,
       SUM(VALUE) sum_value
  FROM (SELECT sys_connect_by_path(t.element, '/') path,
               connect_by_root(t.element) lvl0,
               t.element, d.VALUE, LEVEL lvl
          FROM tree t
          LEFT JOIN DATA d ON d.element = t.element
         START WITH t.PARENT IS NULL
       CONNECT BY t.PARENT = PRIOR t.element)
 WHERE VALUE IS NOT NULL
   AND lvl = 4
 GROUP BY lvl0, ROLLUP(regexp_substr(path, '[^/]+', 1, 2),
                       regexp_substr(path, '[^/]+', 1, 3));
LVL0 LVL1 LVL2 SUM_VALUE
---- ----- ----- ----------
P0 P1 P11 6
P0 P1 P12 6
P0 P1 12
P0 P2 P21 6
P0 P2 P22 6
P0 P2 12
P0 24
The insert would look like:
INSERT INTO data (element, value)
(SELECT coalesce(lvl2, lvl1, lvl0), sum_value
FROM <query> d_out
WHERE NOT EXISTS (SELECT NULL
FROM data d_in
WHERE d_in.element = coalesce(lvl2, lvl1, lvl0)));
If the height of the leaf nodes is unknown/unbounded this gets more hairy. The above approach wouldn't work since ROLLUP needs to know exactly how many columns are to be considered.
In that case, you could use the tree structure in a self-join :
WITH HIERARCHY AS (
   SELECT t.element, path, VALUE
     FROM (SELECT sys_connect_by_path(t.element, '/') path,
                  connect_by_isleaf is_leaf, ELEMENT
             FROM tree t
            START WITH t.PARENT IS NULL
          CONNECT BY t.PARENT = PRIOR t.element) t
     LEFT JOIN DATA d ON d.element = t.element
                     AND t.is_leaf = 1
)
SELECT h.element, SUM(elements.value)
  FROM HIERARCHY h
  JOIN HIERARCHY elements ON elements.path LIKE h.path||'/%'
 WHERE h.VALUE IS NULL
 GROUP BY h.element
 ORDER BY 1;
ELEMENT SUM(ELEMENTS.VALUE)
------- -------------------
P0 24
P1 12
P11 6
P12 6
P2 12
P21 6
P22 6
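The path self-join can be combined with the NOT EXISTS insert from earlier in this answer. The sketch below does that using Python's sqlite3 module rather than Oracle, so CONNECT BY and sys_connect_by_path are replaced by a recursive common table expression that builds the same '/'-separated paths. That substitution and the way leaves are detected (no row names the element as its parent) are assumptions of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tree (element TEXT PRIMARY KEY, parent TEXT);
    CREATE TABLE data (element TEXT PRIMARY KEY, value INTEGER);
    INSERT INTO tree VALUES
        ('P0', NULL), ('P1', 'P0'), ('P2', 'P0'),
        ('P11', 'P1'), ('P12', 'P1'), ('P21', 'P2'), ('P22', 'P2'),
        ('C111', 'P11'), ('C112', 'P11'), ('C121', 'P12'), ('C122', 'P12'),
        ('C211', 'P21'), ('C212', 'P21'), ('C221', 'P22'), ('C222', 'P22');
    INSERT INTO data VALUES
        ('C111', 3), ('C112', 3), ('C121', 3), ('C122', 3),
        ('C211', 3), ('C212', 3), ('C221', 3), ('C222', 3), ('P11', 6);
    """
)

conn.execute(
    """
    WITH RECURSIVE walk(element, path) AS (
        -- Root rows, then each child appended to its parent's path.
        SELECT element, '/' || element FROM tree WHERE parent IS NULL
        UNION ALL
        SELECT t.element, w.path || '/' || t.element
        FROM tree t JOIN walk w ON t.parent = w.element
    ),
    hierarchy AS (
        -- Only leaf elements carry a value; every parent gets NULL.
        SELECT w.element, w.path,
               CASE WHEN NOT EXISTS (SELECT 1 FROM tree c WHERE c.parent = w.element)
                    THEN (SELECT d.value FROM data d WHERE d.element = w.element)
               END AS value
        FROM walk w
    )
    INSERT INTO data (element, value)
    SELECT h.element, SUM(leaf.value)
    FROM hierarchy h
    JOIN hierarchy leaf ON leaf.path LIKE h.path || '/%'
    WHERE h.value IS NULL
      AND NOT EXISTS (SELECT 1 FROM data d_in WHERE d_in.element = h.element)
    GROUP BY h.element
    """
)

final_data = dict(conn.execute("SELECT element, value FROM data"))
print(sorted(final_data.items()))
```

With the sample data this inserts P12, P21, P22, P1, P2 and P0 and leaves the existing P11 row untouched. Because the match is a prefix test on the path, it does not depend on the depth of the leaves.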
Here is another option using the SQL MODEL clause. I've taken some hints from what Vincent has done in his answer (use of regexp_substr) to simplify my code.
The first part, within the WITH clause just rejigs the data and extracts out the hierarchy at each level.
The model clause, at the end of the query, brings the data up from the lowest levels. This will need additional columns added if there are more than four levels but should work no matter at what level the values are held.
I'm not entirely sure that this will work in all circumstances since I'm not that experienced with the MODEL clause but it does at least seem to work in this case.
with my_hierarchy_data as (
select
element,
value,
path,
parent,
lvl0,
regexp_substr(path, '[^/]+', 1, 2) as lvl1,
regexp_substr(path, '[^/]+', 1, 3) as lvl2,
regexp_substr(path, '[^/]+', 1, 4) as lvl3
from (
select
element,
value,
parent,
sys_connect_by_path(element, '/') as path,
connect_by_root element as lvl0
from
tree
left outer join data using (element)
start with parent is null
connect by prior element = parent
order siblings by element
)
)
select
element,
value,
path,
parent,
new_value,
lvl0,
lvl1,
lvl2,
lvl3
from my_hierarchy_data
model
return all rows
partition by (lvl0)
dimension by (lvl1, lvl2, lvl3)
measures(element, parent, value, value as new_value, path)
rules sequential order (
new_value[lvl1, lvl2, null] = sum(value)[cv(lvl1), cv(lvl2), lvl3 is not null],
new_value[lvl1, null, null] = sum(new_value)[cv(lvl1), lvl2 is not null, null],
new_value[null, null, null] = sum(new_value)[lvl1 is not null, null, null]
)
The insert statement you can use is
INSERT INTO data (element, value)
select element, new_value
from <the_query>
where value is null;