Oracle PIVOT, twice?

I have been trying to move away from using DECODE to pivot rows in Oracle 11g, where there is a handy PIVOT function. But I may have found a limitation:
I'm trying to return 2 columns for each value in the base table. Something like:
SELECT somethingId, splitId1, splitName1, splitId2, splitName2
FROM (SELECT somethingId, splitId
FROM SOMETHING JOIN SPLIT ON ... )
PIVOT ( MAX(splitId) FOR displayOrder IN (1 AS splitId1, 2 AS splitId2),
MAX(splitName) FOR displayOrder IN (1 AS splitName1, 2 as splitName2)
)
I can do this with DECODE, but I can't wrestle the syntax to let me do it with PIVOT. Is this even possible? Seems like it wouldn't be too hard for the function to handle.
Edit: is StackOverflow maybe not the right Overflow for SQL questions?
Edit: anyone out there?

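From oracle-developer.net it would appear that it can be done by listing both aggregates before a single FOR clause. The inner query has to expose displayOrder and splitName as well, and each output column is named from the IN alias followed by the aggregate alias (split1_id, split1_name, and so on):
SELECT somethingId, split1_id, split1_name, split2_id, split2_name
FROM (SELECT somethingId, displayOrder, splitId, splitName
FROM SOMETHING JOIN SPLIT ON ... )
PIVOT ( MAX(splitId) AS id,
MAX(splitName) AS name
FOR displayOrder IN (1 AS split1, 2 AS split2)
)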

I'm not sure from what you provided what the data looks like or exactly what you would like. Perhaps if you posted the DECODE version of the query that returns the data you are looking for, and/or the definition of the source data, we could better answer your question. Something like this would be helpful:
create table something (somethingId Number(3), displayOrder Number(3)
, splitID Number(3));
insert into something values (1, 1, 10);
insert into something values (2, 1, 11);
insert into something values (3, 1, 12);
insert into something values (4, 1, 13);
insert into something values (5, 2, 14);
insert into something values (6, 2, 15);
insert into something values (7, 2, 16);
create table split (SplitID Number(3), SplitName Varchar2(30));
insert into split values (10, 'Bob');
insert into split values (11, 'Carrie');
insert into split values (12, 'Alice');
insert into split values (13, 'Timothy');
insert into split values (14, 'Sue');
insert into split values (15, 'Peter');
insert into split values (16, 'Adam');
SELECT *
FROM (
SELECT somethingID, displayOrder, so.SplitID, sp.splitname
FROM SOMETHING so JOIN SPLIT sp ON so.splitID = sp.SplitID
)
PIVOT ( MAX(splitId) id, MAX(splitName) name
FOR (displayOrder, displayOrder) IN ((1, 1) AS split, (2, 2) as splitname)
);
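For comparison, here is a minimal sketch of the DECODE-style pivot the question mentions, written as conditional aggregation with CASE. It is wrapped in Python's sqlite3 module only so that it can be run without an Oracle instance; the three-row subset of the sample data and the output aliases (splitId1, splitName1, ...) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE something (somethingId INTEGER, displayOrder INTEGER, splitId INTEGER);
CREATE TABLE split (splitId INTEGER, splitName TEXT);
-- Subset of the sample rows above (assumption: enough to show both display orders).
INSERT INTO something VALUES (1, 1, 10), (2, 1, 11), (5, 2, 14);
INSERT INTO split VALUES (10, 'Bob'), (11, 'Carrie'), (14, 'Sue');
""")

# One MAX(CASE ...) per output column: two columns for each displayOrder value.
pivot_sql = """
SELECT so.somethingId,
       MAX(CASE WHEN so.displayOrder = 1 THEN so.splitId   END) AS splitId1,
       MAX(CASE WHEN so.displayOrder = 1 THEN sp.splitName END) AS splitName1,
       MAX(CASE WHEN so.displayOrder = 2 THEN so.splitId   END) AS splitId2,
       MAX(CASE WHEN so.displayOrder = 2 THEN sp.splitName END) AS splitName2
FROM something so
JOIN split sp ON so.splitId = sp.splitId
GROUP BY so.somethingId
ORDER BY so.somethingId
"""
pivoted = conn.execute(pivot_sql).fetchall()
for row in pivoted:
    print(row)
```

Because each somethingId in the sample data has a single displayOrder, one pair of columns is NULL on every row; the PIVOT version groups by somethingId in the same way.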

Related

Sample observations per group without replacement in SQL

Using the provided table I would like to sample let's say 2 users per day so that users assigned to the two days are different. Of course the problem I have is more sophisticated, but this simple example gives the idea.
drop table if exists test;
create table test (
user_id int,
day_of_week int);
insert into test values (1, 1);
insert into test values (1, 2);
insert into test values (2, 1);
insert into test values (2, 2);
insert into test values (3, 1);
insert into test values (3, 2);
insert into test values (4, 1);
insert into test values (4, 2);
insert into test values (5, 1);
insert into test values (5, 2);
insert into test values (6, 1);
insert into test values (6, 2);
The expected results would look like this:
create table results (
user_id int,
day_of_week int);
insert into results values (1, 1);
insert into results values (2, 1);
insert into results values (3, 2);
insert into results values (6, 2);
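You can use window functions. Here is an example, although the details depend on your database (functions for random numbers vary by database). Note that it samples each day independently, so by itself it does not prevent the same user from being picked for both days: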
select t.*
from (select t.*, row_number() over (partition by day_of_week order by random()) as seqnum
from test t
) t
where seqnum <= 2;
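To keep the sampled users distinct across days, one option is to shuffle the users once and hand out consecutive pairs of ranks to consecutive days. The sketch below is not the answer's method; it runs through Python's sqlite3 module (SQLite 3.25 or later for window functions) and assumes, as in the sample data, that every user has a row for every day. The CTE names shuffled, assigned and days are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (user_id INT, day_of_week INT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [(u, d) for u in range(1, 7) for d in (1, 2)],
)

sample_sql = """
WITH shuffled AS (            -- one random rank per distinct user
    SELECT user_id, ROW_NUMBER() OVER (ORDER BY random()) AS rn
    FROM (SELECT DISTINCT user_id FROM test)
),
assigned AS (                 -- ranks 1-2 -> slot 1, ranks 3-4 -> slot 2, ...
    SELECT user_id, (rn + 1) / 2 AS slot
    FROM shuffled
),
days AS (                     -- number the days 1, 2, ... to match the slots
    SELECT day_of_week, ROW_NUMBER() OVER (ORDER BY day_of_week) AS slot
    FROM (SELECT DISTINCT day_of_week FROM test)
)
SELECT t.user_id, t.day_of_week
FROM test t
JOIN assigned a ON a.user_id = t.user_id
JOIN days d ON d.day_of_week = t.day_of_week AND d.slot = a.slot
ORDER BY t.day_of_week, t.user_id
"""
sample = conn.execute(sample_sql).fetchall()
print(sample)
```

Each user receives exactly one rank and therefore at most one slot, so no user can appear on two days. Users whose slot has no matching day are simply left out of the sample.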

PL/SQL update all records except with max value

Please help with SQL query. I've got a table:
CREATE TABLE PCDEVUSER.tabletest
(
id INT PRIMARY KEY NOT NULL,
name VARCHAR2(64),
pattern INT DEFAULT 1 NOT NULL,
tempval INT
);
Let's pretend it was filled with values:
INSERT INTO TABLETEST (ID, NAME, PATTERN, TEMPVAL) VALUES (1, 'A', 1, 10);
INSERT INTO TABLETEST (ID, NAME, PATTERN, TEMPVAL) VALUES (2, 'A', 1, 20);
INSERT INTO TABLETEST (ID, NAME, PATTERN, TEMPVAL) VALUES (3, 'A', 2, 10);
INSERT INTO TABLETEST (ID, NAME, PATTERN, TEMPVAL) VALUES (5, 'A', 2, 20);
INSERT INTO TABLETEST (ID, NAME, PATTERN, TEMPVAL) VALUES (4, 'A', 2, 30);
And I need to update all records (grouped by pattern) that do NOT have the max TEMPVAL value. So as a result I have to update the records with IDs (1, 3, 5). The records with IDs (2, 4) have the max values in their PATTERN groups.
HELP PLZ
This select statement will help you get the IDs you need:
SELECT
*
FROM
(SELECT
id
,name
,pattern
,tempval
,MAX(tempval) OVER (PARTITION BY pattern) max_tempval
FROM
tabletest
)
WHERE 1=1
AND tempval != max_tempval
;
You should be able to build an update statement around that easily enough.
Something like this:
update tabletest t
set ????
where t.tempval < (select max(tempval) from tabletest tt where tt.pattern = t.pattern);
It is unclear what values you want to set. The ???? is for the code that sets the values.
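As a runnable check of the correlated-subquery UPDATE, here is a minimal sketch using Python's sqlite3 module rather than Oracle. The SET clause is a placeholder assumption (it overwrites name with the hypothetical marker 'updated'), since the question does not say what should be written; the schema prefix from the question is dropped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tabletest (
    id INTEGER PRIMARY KEY NOT NULL,
    name TEXT,
    pattern INTEGER DEFAULT 1 NOT NULL,
    tempval INTEGER
);
INSERT INTO tabletest (id, name, pattern, tempval) VALUES
    (1, 'A', 1, 10), (2, 'A', 1, 20),
    (3, 'A', 2, 10), (5, 'A', 2, 20), (4, 'A', 2, 30);
""")

# Update every row whose tempval is below the maximum of its pattern group.
# 'updated' is a hypothetical marker value standing in for the real SET clause.
conn.execute("""
UPDATE tabletest
SET name = 'updated'
WHERE tempval < (SELECT MAX(tt.tempval)
                 FROM tabletest tt
                 WHERE tt.pattern = tabletest.pattern)
""")

updated_ids = [row[0] for row in conn.execute(
    "SELECT id FROM tabletest WHERE name = 'updated' ORDER BY id")]
print(updated_ids)
```

If several rows tie for the maximum within a pattern, none of them is updated, which matches the SELECT with tempval != max_tempval shown earlier.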

SQL merge statement with multiple conditions

I have a requirement with some business rules to implement in SQL (within a PL/SQL block): I need to evaluate these rules and, according to the result, perform the corresponding update, delete or insert on a target table.
My database model contains a "staging" and a "real" table. The real table stores records inserted in the past and the staging one contains "fresh" data coming from somewhere that needs to be merged into the real one.
Basically these are my business rules:
Delta between staging MINUS real --> Insert rows into the real table
Delta between real MINUS staging --> Delete rows from the real table
Rows whose PK is the same but where any other field differs --> Update
(Those "MINUS" operations compare ALL the fields to test equality and to distinguish the 3rd case.)
I haven't figured out how to accomplish this with a merge statement without the rules overlapping. Any suggestions for the merge structure? Is it possible to do it all within the same merge?
Thank you!
If I understand your task correctly, the following code should do the job:
--drop table real;
--drop table stag;
create table real (
id NUMBER,
col1 NUMBER,
col2 VARCHAR(10)
);
create table stag (
id NUMBER,
col1 NUMBER,
col2 VARCHAR(10)
);
insert into real values (1, 1, 'a');
insert into real values (2, 2, 'b');
insert into real values (3, 3, 'c');
insert into real values (4, 4, 'd');
insert into real values (5, 5, 'e');
insert into real values (6, 6, 'f');
insert into real values (7, 6, 'g'); -- PK the same but at least one column different
insert into real values (8, 7, 'h'); -- PK the same but at least one column different
insert into real values (9, 9, 'i');
insert into real values (10, 10, 'j'); -- in real but not in stag
insert into stag values (1, 1, 'a');
insert into stag values (2, 2, 'b');
insert into stag values (3, 3, 'c');
insert into stag values (4, 4, 'd');
insert into stag values (5, 5, 'e');
insert into stag values (6, 6, 'f');
insert into stag values (7, 7, 'g'); -- PK the same but at least one column different
insert into stag values (8, 8, 'g'); -- PK the same but at least one column different
insert into stag values (9, 9, 'i');
insert into stag values (11, 11, 'k'); -- in stag but not in real
merge into real
using (WITH w_to_change AS (
select *
from (select stag.*, 'I' as action from stag
minus
select real.*, 'I' as action from real
)
union (select real.*, 'D' as action from real
minus
select stag.*, 'D' as action from stag
)
)
, w_group AS (
select id, max(action) as max_action
from w_to_change
group by id
)
select w_to_change.*
from w_to_change
join w_group
on w_to_change.id = w_group.id
and w_to_change.action = w_group.max_action
) tmp
on (real.id = tmp.id)
when matched then
update set real.col1 = tmp.col1, real.col2 = tmp.col2
delete where tmp.action = 'D'
when not matched then
insert (id, col1, col2) values (tmp.id, tmp.col1, tmp.col2);
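To check that the three rules do not overlap, the deltas can be classified before any merge runs. The sketch below is only an illustration: it uses Python's sqlite3 module, where EXCEPT plays the role of Oracle's MINUS, and a trimmed-down data set with one row per case. The CTE names changed and stale are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE real (id INTEGER, col1 INTEGER, col2 TEXT);
CREATE TABLE stag (id INTEGER, col1 INTEGER, col2 TEXT);
-- Trimmed-down data (assumption): one unchanged, one changed,
-- one only in real, one only in stag.
INSERT INTO real VALUES (1, 1, 'a'), (7, 6, 'g'), (10, 10, 'j');
INSERT INTO stag VALUES (1, 1, 'a'), (7, 7, 'g'), (11, 11, 'k');
""")

classify_sql = """
WITH changed AS (   -- stag MINUS real: new rows and new versions of rows
    SELECT * FROM stag EXCEPT SELECT * FROM real
),
stale AS (          -- real MINUS stag: removed rows and old versions of rows
    SELECT * FROM real EXCEPT SELECT * FROM stag
)
SELECT c.id, CASE WHEN s.id IS NULL THEN 'I' ELSE 'U' END AS action
FROM changed c
LEFT JOIN stale s ON s.id = c.id
UNION ALL
SELECT s.id, 'D'
FROM stale s
WHERE s.id NOT IN (SELECT id FROM changed)
ORDER BY 1
"""
actions = conn.execute(classify_sql).fetchall()
print(actions)
```

An id present in both deltas is an update, an id only in stag MINUS real is an insert, and an id only in real MINUS stag is a delete, so every id gets exactly one action.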

SQL Expression Assistance

I am pulling data using the Telerik Standalone Reporting application. I have a table called "Product", and one of its columns is called "ProductStatus". ProductStatus is an int value ranging from 1 to 12. Breakdown below:
"1"=Active
"2"=Retired
"3"=Processing
"5"=Archived
"6"=Active-Empty
"7"=Available
"8"=Resigned
"9"=Terminated
"10"=Legal Freeze
"11"=Admin Hold
"12"=Reserved
My question is: how can I write an expression that looks at "ProductStatus" and returns "Active" if it equals 1, "Retired" if it equals 2, and so on?
You have two ways to do that:
Use CASE and code the translation table in the query:
SELECT CASE ProductStatus WHEN 1 THEN 'Active' ... END FROM ...;
Or put the translation table in a database table and use a join:
CREATE TABLE product_status(id INTEGER, status VARCHAR(32));
INSERT INTO product_status(id, status) VALUES(1, 'Active');
...
SELECT ps.status, ... FROM product_status ps, ... WHERE ps.id = ..., ... ;
You can create a table with all status texts like this:
CREATE TABLE statusText
([id] int, [text] varchar(12))
;
INSERT INTO statusText
([id], [text])
VALUES
(1, 'Active'),
(2, 'Retired'),
(3, 'Processing'),
(5, 'Archived'),
(6, 'Active-Empty'),
(7, 'Available'),
(8, 'Resigned'),
(9, 'Terminated'),
(10, 'Legal Freeze'),
(11, 'Admin Hold'),
(12, 'Reserved')
;
And then join the Product table like this:
SELECT a.*, b.Text
FROM Product a INNER JOIN statusText b
ON a.ProductStatus=b.id
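Both approaches can be compared side by side in a small sketch. It runs through Python's sqlite3 module purely for convenience; the ProductId column, the three sample products and the 'Unknown' fallback are assumptions, and only three of the status codes are included to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Product (ProductId INTEGER, ProductStatus INTEGER);
CREATE TABLE statusText (id INTEGER, text TEXT);
-- Hypothetical products, one per status code used below.
INSERT INTO Product VALUES (100, 1), (101, 2), (102, 10);
INSERT INTO statusText VALUES (1, 'Active'), (2, 'Retired'), (10, 'Legal Freeze');
""")

# Approach 1: the translation lives in the query as a simple CASE expression.
case_rows = conn.execute("""
SELECT ProductId,
       CASE ProductStatus
            WHEN 1  THEN 'Active'
            WHEN 2  THEN 'Retired'
            WHEN 10 THEN 'Legal Freeze'
            ELSE 'Unknown'
       END AS StatusText
FROM Product
ORDER BY ProductId
""").fetchall()

# Approach 2: the translation lives in a lookup table that is joined in.
join_rows = conn.execute("""
SELECT p.ProductId, s.text
FROM Product p
INNER JOIN statusText s ON p.ProductStatus = s.id
ORDER BY p.ProductId
""").fetchall()

print(case_rows)
print(join_rows)
```

The lookup table keeps the status names in one place, while the CASE version needs no extra table. Note that the inner join drops products whose status has no row in the lookup table, whereas the CASE falls through to its ELSE branch.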

How many times are the results of this common table expression evaluated?

I am trying to work out a bug we've found during our last iteration of testing. It involves a query which uses a common table expression. The main theme of the query is that it simulates a 'first' aggregate operation (get the first row for this grouping).
The problem is that the query seems to choose rows completely arbitrarily in some circumstances - multiple rows from the same group get returned, some groups simply get eliminated altogether. However, it always picks the correct number of rows.
I have created a minimal example to post here. There are clients and addresses, and a table which defines the relationships between them. This is a much simplified version of the actual query I'm looking at, but I believe it should have the same characteristics, and it is a good example to use to explain what I think is going wrong.
CREATE TABLE [Client] (ClientID int, Name varchar(20))
CREATE TABLE [Address] (AddressID int, Street varchar(20))
CREATE TABLE [ClientAddress] (ClientID int, AddressID int)
INSERT [Client] VALUES (1, 'Adam')
INSERT [Client] VALUES (2, 'Brian')
INSERT [Client] VALUES (3, 'Charles')
INSERT [Client] VALUES (4, 'Dean')
INSERT [Client] VALUES (5, 'Edward')
INSERT [Client] VALUES (6, 'Frank')
INSERT [Client] VALUES (7, 'Gene')
INSERT [Client] VALUES (8, 'Harry')
INSERT [Address] VALUES (1, 'Acorn Street')
INSERT [Address] VALUES (2, 'Birch Road')
INSERT [Address] VALUES (3, 'Cork Avenue')
INSERT [Address] VALUES (4, 'Derby Grove')
INSERT [Address] VALUES (5, 'Evergreen Drive')
INSERT [Address] VALUES (6, 'Fern Close')
INSERT [ClientAddress] VALUES (1, 1)
INSERT [ClientAddress] VALUES (1, 3)
INSERT [ClientAddress] VALUES (2, 2)
INSERT [ClientAddress] VALUES (2, 4)
INSERT [ClientAddress] VALUES (2, 6)
INSERT [ClientAddress] VALUES (3, 3)
INSERT [ClientAddress] VALUES (3, 5)
INSERT [ClientAddress] VALUES (3, 1)
INSERT [ClientAddress] VALUES (4, 4)
INSERT [ClientAddress] VALUES (4, 6)
INSERT [ClientAddress] VALUES (5, 1)
INSERT [ClientAddress] VALUES (6, 3)
INSERT [ClientAddress] VALUES (7, 2)
INSERT [ClientAddress] VALUES (8, 4)
INSERT [ClientAddress] VALUES (5, 6)
INSERT [ClientAddress] VALUES (6, 3)
INSERT [ClientAddress] VALUES (7, 5)
INSERT [ClientAddress] VALUES (8, 1)
INSERT [ClientAddress] VALUES (5, 4)
INSERT [ClientAddress] VALUES (6, 6)
;WITH [Stuff] ([ClientID], [Name], [Street], [RowNo]) AS
(
SELECT
[C].[ClientID],
[C].[Name],
[A].[Street],
ROW_NUMBER() OVER (ORDER BY [A].[AddressID]) AS [RowNo]
FROM
[Client] [C] INNER JOIN
[ClientAddress] [CA] ON
[C].[ClientID] = [CA].[ClientID] INNER JOIN
[Address] [A] ON
[CA].[AddressID] = [A].[AddressID]
)
SELECT
[CTE].[ClientID],
[CTE].[Name],
[CTE].[Street],
[CTE].[RowNo]
FROM
[Stuff] [CTE]
WHERE
[CTE].[RowNo] IN (SELECT MIN([CTE2].[RowNo]) FROM [Stuff] [CTE2] GROUP BY [CTE2].[ClientID])
ORDER BY
[CTE].[Name] ASC,
[CTE].[Street] ASC
DROP TABLE [ClientAddress]
DROP TABLE [Address]
DROP TABLE [Client]
The query is designed to get all clients and their first address (the address with the lowest ID). It appears to me that it should work.
I have a theory about why it sometimes will not work. The statement that follows the CTE refers to the CTE in two places. If the CTE is non-deterministic, and it gets run more than once, the result of the CTE may be different in the two places it's referenced.
In my example, the CTE's RowNo column uses ROW_NUMBER() with an order by clause that will potentially result in different orderings when run multiple times (we're ordering by address, the clients can be in any order depending on how the query is executed).
Because of this, is it possible that CTE and CTE2 contain different results? Or is the CTE only executed once, so that I need to look elsewhere for the problem?
It is not guaranteed in any way.
SQL Server is free to evaluate the CTE each time it's accessed or to cache the results, depending on the plan.
You may want to read this article:
Generating XML in subqueries
If your CTE is not deterministic, you will have to store its result in a temporary table or a table variable and use it instead of the CTE.
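PostgreSQL, on the other hand, always evaluated CTEs only once before version 12, materializing their results. From version 12 on, a CTE that is referenced more than once is still materialized by default, and AS MATERIALIZED can be used to force that behavior.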
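Another way to sidestep the problem, besides a temporary table, is to make the numbering deterministic per client and reference the CTE only once. The sketch below is an alternative formulation, not the original query: it partitions ROW_NUMBER() by client and keeps row 1. It runs through Python's sqlite3 module (SQLite 3.25 or later) on a subset of the sample data; the CTE name Ranked is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Client (ClientID INTEGER, Name TEXT);
CREATE TABLE Address (AddressID INTEGER, Street TEXT);
CREATE TABLE ClientAddress (ClientID INTEGER, AddressID INTEGER);
-- Subset of the sample data above (first three clients only).
INSERT INTO Client VALUES (1, 'Adam'), (2, 'Brian'), (3, 'Charles');
INSERT INTO Address VALUES (1, 'Acorn Street'), (2, 'Birch Road'), (3, 'Cork Avenue'),
                           (4, 'Derby Grove'), (5, 'Evergreen Drive'), (6, 'Fern Close');
INSERT INTO ClientAddress VALUES (1, 1), (1, 3), (2, 2), (2, 4), (2, 6),
                                 (3, 3), (3, 5), (3, 1);
""")

first_address_sql = """
WITH Ranked AS (
    SELECT C.ClientID, C.Name, A.Street,
           -- Numbering restarts for each client, ordered by address ID,
           -- so row 1 is that client's lowest-ID address.
           ROW_NUMBER() OVER (PARTITION BY C.ClientID
                              ORDER BY A.AddressID) AS RowNo
    FROM Client C
    INNER JOIN ClientAddress CA ON C.ClientID = CA.ClientID
    INNER JOIN Address A ON CA.AddressID = A.AddressID
)
SELECT ClientID, Name, Street
FROM Ranked
WHERE RowNo = 1
ORDER BY Name, Street
"""
first_address = conn.execute(first_address_sql).fetchall()
for row in first_address:
    print(row)
```

Since the CTE is referenced a single time and the filter is RowNo = 1, the result no longer depends on two evaluations of the CTE agreeing with each other. The same PARTITION BY form is valid T-SQL.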