Optimized paging query in SQL Server - sql-server-2012

I am trying to optimize a paging query that also returns the total count of records, inside a stored procedure. Please suggest an optimized paging query that fetches 25 records per page from millions of records.
DDL Commands
create table pdf_details
(
prodid nvarchar(100),
prodname nvarchar(100),
lang nvarchar(100),
fmt nvarchar(5),
type varchar(2),
constraint pk_pdf Primary Key (prodid, lang, fmt)
)
create table html_details
(
prodid nvarchar(100),
prodname nvarchar(100),
lang nvarchar(100),
fmt nvarchar(5),
type varchar(2),
constraint pk_html Primary Key(prodid, lang, fmt)
)
create index ix_pdf_details on pdf_details(prodname)
Sample records
insert into pdf_details
values ('A100', 'X', 'EN', 'HM', 'PDF'),
('A100', 'X', 'JP', 'GM', 'PDF'),
('A100', 'X', 'EN', 'HM', 'PDF'),
('B101', 'Y', 'EN', 'HM', 'PDF');
insert into html_details
values ('B100', 'X', 'EN', 'HM', 'HTML'),
('B100', 'X', 'JP', 'GM', 'HTML'),
('B100', 'X', 'EN', 'HM', 'HTML'),
('C101', 'Y', 'EN', 'GH', 'HTML');
In reality, these tables contain millions of rows.
Original query
SELECT DISTINCT
TP.PRODID AS ID,
TP.PRODNAME AS NAME,
TP.LANG AS LANG,
TP.FMT,
TP.TYPE
FROM
PDF_DETAILS TP
WHERE
TP.PRODID = @PRODID
AND (@PRODNAME IS NULL OR
REPLACE(REPLACE(REPLACE(REPLACE(TP.PRODNAME, '™', '|TM'), '®', '|TS'), '©', '|CP'), '°', '|DEG')
LIKE REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(@PRODNAME, '[', '\['), '_', '\_'), '™', '|TM'), '®', '|TS'), '©', '|CP'), '°', '|DEG') ESCAPE '\')
UNION ALL
SELECT DISTINCT
TP.PRODID AS ID,
TP.PRODNAME AS NAME,
TP.LANG AS LANG,
TP.FMT,
TP.TYPE
FROM
HTML_DETAILS TP
WHERE
TP.PRODID = @PRODID
AND (@PRODNAME IS NULL OR
REPLACE(REPLACE(REPLACE(REPLACE(TP.PRODNAME,'™','|TM'),'®','|TS'),'©','|CP'),'°','|DEG')
LIKE REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(@PRODNAME,'[','\['),'_','\_'),'™','|TM'),'®','|TS'),'©','|CP'),'°','|DEG') ESCAPE '\')

As of SQL Server 2012, you can use the OFFSET ... FETCH approach to paging. Search for it and you will find plenty of good articles about it.
Basically, you have to do something like this:
SELECT (list-of-columns)
FROM YourTable
(optionally add JOINs here)
WHERE (conditions)
ORDER BY (some column)
OFFSET n ROWS
FETCH NEXT y ROWS ONLY
You must have an ORDER BY, since offsetting (skipping rows) only makes sense when you know how your data is ordered. The OFFSET clause (with a fixed number or a SQL Server variable @offset) defines how many rows, in that ordering, to skip, and the FETCH NEXT clause (again with a fixed number or a SQL Server variable @numrows) defines how many rows will be returned.
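As a rough, self-contained illustration of the pattern (not the asker's schema), the sketch below uses Python's built-in sqlite3 module as a stand-in for SQL Server. SQLite has no OFFSET ... FETCH, so the query uses LIMIT ? OFFSET ?, which plays the same role. The table items, the function fetch_page, and the separate COUNT(*) query for the total are all assumptions made for illustration.

```python
import sqlite3

def fetch_page(conn, page, page_size=25):
    """Return (total_rows, rows) for a 1-based page number."""
    offset = (page - 1) * page_size
    # Total count, fetched separately from the page itself.
    total = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    # A deterministic ORDER BY is required for paging to be stable.
    rows = conn.execute(
        "SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()
    return total, rows

# Hypothetical sample data: 100 rows in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(1, 101)],
)

total, page3 = fetch_page(conn, page=3)
print(total, len(page3), page3[0], page3[-1])
```

In T-SQL the page query would end with ORDER BY id OFFSET @offset ROWS FETCH NEXT @numrows ROWS ONLY, with @offset computed as (page - 1) * page size in the same way.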

Related

Replacing a 3 level deep nested cursors in SQL Server

I have three SQL Server tables that I need to loop through and update. I did it successfully with a cursor, but it is so slow that it is pretty pointless, since the main table with all the data to loop through is over 1,000 rows long.
The tables are (with some sample data):
-- The PK is InvoiceId and the IsMajorPart is '0' or '1'.
-- The MajorPartId and SubPartId1 to 4 are "technically" FKs for PartId but aren't hooked up and will not be ever due to some external issues outside of scope.
-- The part Id's can be NULL or empty.
-- This table exists elsewhere and is loaded with Id's being varchars but in transferring they will be going in as int's which is the proper way.
CREATE TABLE dbo.Invoices(
InvoiceId varchar(50),
PartName varchar(255),
IsMajorPart varchar(1),
MajorPartId varchar(50),
SubPartId1 varchar(50),
SubPartId2 varchar(50),
SubPartId3 varchar(50),
SubPartId4 varchar(50));
-- Sample inserts
INSERT INTO dbo.Invoices VALUES ('1', 'A Part', '0', '', '100', '105', '' ,'');
INSERT INTO dbo.Invoices VALUES ('5', 'E Part', '1', '101', '110', '', '' ,'');
INSERT INTO dbo.Invoices VALUES ('11', 'Z Part', '1', '201', '100', '115', '' ,'');
-- Essentially the old table above is being moved into the normalized, correct tables below.
-- The PK is the PartId
CREATE TABLE dbo.Parts(
PartsId int,
PartName varchar(255));
-- Sample inserts (that will be updated or inserted by looping through the first table)
INSERT INTO dbo.Parts VALUES (100,'A Part');
INSERT INTO dbo.Parts VALUES (110,'B Part');
INSERT INTO dbo.Parts VALUES (201,'C Part');
-- The PK is the combination of InvoiceId and PartId
CREATE TABLE dbo.InvoiceToParts(
InvoiceId int,
PartsId int,
IsMajorPart bit);
-- Sample inserts (that will be inserted from the first table but conflicts might occur if an InvoiceId from the first table has 2 PartId's that are the same)
INSERT INTO dbo.InvoiceToParts VALUES (1, 100, 0);
INSERT INTO dbo.InvoiceToParts VALUES (5, 100, 1);
INSERT INTO dbo.InvoiceToParts VALUES (17, 201, 0);
The sample INSERTs above are just samples of the data for seeing what is in the tables.
The rules to move Invoices (I don't care what happens to this table) into the correct tables of Parts and InvoiceToParts are below (and these last two tables are the only ones that I care about).
Loop through Invoices and get all the data.
First, find out if IsMajorPart is '1' and then get the MajorPartId.
Push the MajorPartId with PartName in Parts table if it DOESN'T already exist.
Next check InvoiceToParts to see if the PK of InvoiceId and PartId exist.
If they do, update IsMajorPart to '1'.
If they don't exist, INSERT it.
Next do the same process for all SubPartId1 to SubPartId4.
I have a nested 3-level cursor which performance-wise ran for over 30min before I stopped it as it wasn't even close to finishing and was sucking up all the resources. I am trying to look for a faster way to do this. The Invoices table can have up to about 5,000 rows in it.
You need to unpivot your data and then just do what is called an UPSERT, which has two steps:
If exists, update record(s)
If not exists, insert record(s)
Searching online for UPSERT turns up plenty of examples.
Table Setup
DROP TABLE IF EXISTS #Invoice
DROP TABLE IF EXISTS #Unpivot
DROP TABLE IF EXISTS #InvoiceToParts
DROP TABLE IF EXISTS #Parts
CREATE TABLE #Parts(
PartsId int,
PartName varchar(255)
)
CREATE TABLE #InvoiceToParts(
InvoiceId int,
PartsId int,
IsMajorPart bit
);
CREATE TABLE #Invoice(
InvoiceId varchar(50),
PartName varchar(255),
IsMajorPart varchar(1),
MajorPartsID varchar(50),
SubPartsID1 varchar(50),
SubPartsID2 varchar(50),
SubPartsID3 varchar(50),
SubPartsID4 varchar(50)
);
INSERT INTO #Invoice
VALUES ('1', 'A Part', '0', '', '100', '105', '' ,'')
,('5', 'E Part', '1', '101', '110', '', '' ,'')
,('11', 'Z Part', '1', '201', '100', '115', '' ,'')
SQL to Process Data
This first unpivots the data, then loads the Parts table so the IDs can be referenced before inserting into the junction table InvoiceToParts.
SELECT A.InvoiceId
,B.*
INTO #Unpivot
FROM #Invoice AS A
CROSS APPLY (
VALUES
(NULLIF(MajorPartsID,''),PartName,IsMajorPart)
,(NULLIF(SubPartsID1,''),NULL,0)
,(NULLIF(SubPartsID2,''),NULL,0)
,(NULLIF(SubPartsID3,''),NULL,0)
,(NULLIF(SubPartsID4,''),NULL,0)
) AS B(PartsID,PartName,IsMajorPart)
WHERE B.PartsID IS NOT NULL /*If no data, filter out*/
/*INSERT into table Parts if not exists*/
INSERT INTO #Parts
SELECT PartsID,PartName
FROM #Unpivot AS A
WHERE A.IsMajorPart = 1
AND NOT EXISTS (
SELECT *
FROM #Parts AS DTA
WHERE A.PartsID = DTA.PartsID
)
GROUP BY PartsID,PartName
/*UPSERT into table dbo.InvoiceToParts*/
UPDATE #InvoiceToParts
SET IsMajorPart = B.IsMajorPart
FROM #InvoiceToParts AS A
INNER JOIN #Unpivot AS B
ON A.InvoiceId = B.InvoiceId
AND A.PartsId = B.PartsID
INSERT INTO #InvoiceToParts(InvoiceId,PartsId,IsMajorPart)
SELECT InvoiceId
,PartsId
,IsMajorPart
FROM #Unpivot AS A
WHERE NOT EXISTS (
SELECT *
FROM #InvoiceToParts AS DTA
WHERE A.InvoiceId = DTA.InvoiceID
AND A.PartsID = DTA.PartsID
)
SELECT *
FROM #InvoiceToParts
SELECT *
FROM #Parts
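The same unpivot-then-upsert flow can be sketched end to end in a runnable form. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module instead of SQL Server, does the unpivot step in Python rather than with CROSS APPLY, and the names unpivot, unpivoted, parts and invoice_to_parts are hypothetical. One pre-existing junction row is seeded so that the update branch is exercised.

```python
import sqlite3

# (InvoiceId, PartName, IsMajorPart, MajorPartsID, SubPartsID1..4), as in the sample data.
invoices = [
    ("1", "A Part", "0", "", "100", "105", "", ""),
    ("5", "E Part", "1", "101", "110", "", "", ""),
    ("11", "Z Part", "1", "201", "100", "115", "", ""),
]

def unpivot(rows):
    """Yield (invoice_id, parts_id, part_name, is_major_part) per non-empty part id."""
    for inv_id, name, is_major, major_id, *sub_ids in rows:
        if major_id:
            yield int(inv_id), int(major_id), name, int(is_major)
        for sub_id in sub_ids:
            if sub_id:
                yield int(inv_id), int(sub_id), None, 0

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (parts_id INTEGER PRIMARY KEY, part_name TEXT)")
conn.execute(
    "CREATE TABLE invoice_to_parts (invoice_id INTEGER, parts_id INTEGER, "
    "is_major_part INTEGER, PRIMARY KEY (invoice_id, parts_id))"
)
conn.execute(
    "CREATE TABLE unpivoted (invoice_id INTEGER, parts_id INTEGER, "
    "part_name TEXT, is_major_part INTEGER)"
)
# Seed one existing junction row so the UPDATE step has something to change.
conn.execute("INSERT INTO invoice_to_parts VALUES (5, 101, 0)")
conn.executemany("INSERT INTO unpivoted VALUES (?, ?, ?, ?)", unpivot(invoices))

# Step 1: insert major parts that do not exist in parts yet.
conn.execute("""
    INSERT INTO parts (parts_id, part_name)
    SELECT DISTINCT u.parts_id, u.part_name
    FROM unpivoted AS u
    WHERE u.is_major_part = 1
      AND NOT EXISTS (SELECT 1 FROM parts AS p WHERE p.parts_id = u.parts_id)
""")
# Step 2a: update junction rows that already exist.
conn.execute("""
    UPDATE invoice_to_parts
    SET is_major_part = (SELECT u.is_major_part FROM unpivoted AS u
                         WHERE u.invoice_id = invoice_to_parts.invoice_id
                           AND u.parts_id = invoice_to_parts.parts_id)
    WHERE EXISTS (SELECT 1 FROM unpivoted AS u
                  WHERE u.invoice_id = invoice_to_parts.invoice_id
                    AND u.parts_id = invoice_to_parts.parts_id)
""")
# Step 2b: insert junction rows that do not exist yet.
conn.execute("""
    INSERT INTO invoice_to_parts (invoice_id, parts_id, is_major_part)
    SELECT u.invoice_id, u.parts_id, u.is_major_part
    FROM unpivoted AS u
    WHERE NOT EXISTS (SELECT 1 FROM invoice_to_parts AS t
                      WHERE t.invoice_id = u.invoice_id AND t.parts_id = u.parts_id)
""")

parts_rows = conn.execute("SELECT * FROM parts ORDER BY parts_id").fetchall()
junction_rows = conn.execute(
    "SELECT * FROM invoice_to_parts ORDER BY invoice_id, parts_id"
).fetchall()
print(parts_rows)
print(junction_rows)
```

If one invoice lists the same part id twice, the unpivoted set would need de-duplicating before step 2b, otherwise the composite primary key rejects the insert.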

CREATE FUNCTION for attribute to at most 3 people?

I am currently building a hotel booking application on SQL Server 2018, and am trying to write a constraint for the RoomNo attribute of my SQL Server table. Essentially, I want each RoomNo to hold at most 3 people, but I ran into an error when trying to do the CREATE FUNCTION.
This is my current code:
CREATE TABLE Passenger
(
ID smallint ,
Name varchar (50) NOT NULL,
Email varchar (319) NULL,
DOB smalldatetime NOT NULL,
Gender char (1) NOT NULL CHECK (Gender IN ('M', 'F')),
RoomNo tinyint NOT NULL,
CONSTRAINT PK_Passenger PRIMARY KEY NONCLUSTERED (ID),
CONSTRAINT CHK_Passenger_Gender CHECK (Gender IN ('M', 'F'))
)
CREATE FUNCTION CalculateRoomNo
(
@value tinyint
)
RETURNS bit
AS
BEGIN
IF (SELECT COUNT(RoomNo) FROM Passenger GROUP BY RoomNo) <= 3
RETURN 0
RETURN 1
END
GO
ALTER TABLE Passenger
ADD CONSTRAINT CHK_RoomNoPax CHECK (dbo.CalculateRoomNo(RoomNo) = 0)
GO
When I add a passenger into the table, if it is formatted like this:
INSERT INTO Passenger
VALUES (1, 'Rob', 'Rob@gmail.com', '2017-10-04', 'M', 12)
INSERT INTO Passenger
VALUES (2, 'Darren', 'Darren@yahoo,com', '1976-12-21', 'F', 12)
INSERT INTO Passenger
VALUES (3, 'Peggy', '', '2006-03-15', 'F', 12)
INSERT INTO Passenger
VALUES (4, 'Carlos', '', '1981-04-06', 'F', 12)
It will stop at
INSERT INTO Passenger VALUES (3, 'Peggy', '', '2006-03-15', 'F', 12)
since RoomNo '12' has reached its maximum capacity.
But, if I added the values like such where the room numbers are different from each other:
INSERT INTO Passenger
VALUES (1, 'Rob', 'Rob@gmail.com', '2017-10-04', 'M', 69)
INSERT INTO Passenger
VALUES (2, 'Darren', 'Darren@yahoo,com', '1976-12-21', 'F', 74)
INSERT INTO Passenger
VALUES (3, 'Peggy', '', '2006-03-15', 'F', 45)
INSERT INTO Passenger
VALUES (4, 'Carlos', '', '1981-04-06', 'F', 72)
INSERT INTO Passenger
VALUES (5, 'John', 'johnny@hotmail.com', '1988-05-06', 'M', 69)
It will return an error:
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
Is there any way I can properly run this SQL?
The query with the GROUP BY can return more than one row if there is more than one RoomNo.
If you add a WHERE clause for the RoomNo, it can only return a single COUNT:
CREATE FUNCTION CalculateRoomNo
(
@RoomNo tinyint
)
RETURNS bit
AS
BEGIN
IF (SELECT COUNT(*) FROM Passenger WHERE RoomNo = @RoomNo) <= 3
RETURN 0
RETURN 1
END
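To see the difference between the grouped and the filtered count concretely, here is a small sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The simplified passenger table is an assumption, and the room numbers are the ones from the question's second batch of inserts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE passenger (id INTEGER PRIMARY KEY, room_no INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO passenger VALUES (?, ?)",
    [(1, 69), (2, 74), (3, 45), (4, 72), (5, 69)],
)

# One row per distinct room: this is what breaks a scalar subquery.
grouped = conn.execute(
    "SELECT COUNT(room_no) FROM passenger GROUP BY room_no"
).fetchall()

# Filtering on a single room always yields exactly one row.
filtered = conn.execute(
    "SELECT COUNT(*) FROM passenger WHERE room_no = ?", (69,)
).fetchall()

print(len(grouped), filtered)
```

The grouped query returns one count per room, which is why SQL Server rejects it where a single value is expected, while the filtered query returns a single count for the room being checked.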
As mentioned by @LukStorms, your query has no filter on RoomNo, so it can return multiple rows. A scalar subquery must return at most one row.
But the most reliable way to achieve what you are trying to do is not to use this function at all. Instead, you can add another column and create a unique constraint across that column and RoomNo:
ALTER TABLE Passenger
ADD RoomNoPax tinyint NOT NULL
CONSTRAINT CHK_RoomNoPax CHECK (RoomNoPax >= 1 AND RoomNoPax <= 3);
ALTER TABLE Passenger
ADD CONSTRAINT UQ_RoomNo_RoomNoPax UNIQUE (RoomNo, RoomNoPax);
You now have an extra column which must have the value 1, 2 or 3. And there is a unique constraint over every pair of that value and the RoomNo, so you can no longer put more than 3 passengers in each RoomNo.
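A minimal runnable sketch of this slot-column idea, assuming Python's built-in sqlite3 module as a stand-in for SQL Server. The reduced passenger table and the helper add_passenger, which simply tries slots 1 to 3 in turn, are hypothetical and only there to show the constraints doing the work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE passenger (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        room_no INTEGER NOT NULL,
        room_no_pax INTEGER NOT NULL CHECK (room_no_pax BETWEEN 1 AND 3),
        UNIQUE (room_no, room_no_pax)
    )
""")

def add_passenger(conn, pid, name, room_no):
    """Try slots 1..3 for the room; return the slot used, or None if the room is full.

    Assumes pid is not already taken, so any IntegrityError here means
    the (room_no, slot) pair is occupied.
    """
    for slot in (1, 2, 3):
        try:
            conn.execute(
                "INSERT INTO passenger VALUES (?, ?, ?, ?)",
                (pid, name, room_no, slot),
            )
            return slot
        except sqlite3.IntegrityError:
            continue
    return None

slots = [
    add_passenger(conn, 1, "Rob", 12),
    add_passenger(conn, 2, "Darren", 12),
    add_passenger(conn, 3, "Peggy", 12),
    add_passenger(conn, 4, "Carlos", 12),  # room 12 is already full
]
print(slots)
```

The limit is enforced declaratively by the CHECK and UNIQUE constraints, so it holds no matter which code path performs the insert.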
You need to change the logic in the function.
You should be checking the number of records for a given RoomNo, not the number of RoomNo values in the entire table.
The subquery should return only one scalar value, because you are using it in a comparison.
i.e.
if exists (select 1 from Passenger group by RoomNo having count(1) > 3)
begin
return 1
end
else
begin
return 0
end
In the above query, we check the number of persons assigned to each room number, and if any room holds more than 3 the function returns 1. Because EXISTS only tests whether any row comes back, it cannot raise the "more than 1 value" error.
Please modify the return value as per your requirement.

SQL Server pivot values with different data types

I am trying to pivot values of different types in MSSQL 2016. I could not find a way to pivot different data types.
The first table is the initial form/structure. The second table is the desired shape.
I was trying the following SQL code to pivot my values
SELECT
[id] AS [id],
FIRSTNAME,
LASTNAME,
BIRTHDATE,
ADDRESS,
FLAG,
NUMBER
FROM (
SELECT
[cm].[key] AS [id],
[cm].[column] AS [column],
[cm].[string] AS [string],
[cm].[bit] AS [bit],
[cm].[xml] AS [xml],
[cm].[number] AS [number],
[cm].[date] AS [date]
FROM [cmaster] AS [cm]
) AS [t]
PIVOT (
MAX([string]) --!?!?
FOR [column] IN (
FIRSTNAME,
LASTNAME,
BIRTHDATE,
ADDRESS,
FLAG,
NUMBER
)
) AS [p]
I think your best bet is to use conditional aggregation, e.g.
SELECT cm.id,
FIRSTNAME = MAX(CASE WHEN cm.[property] = 'firstname' THEN cm.[string] END),
LASTNAME = MAX(CASE WHEN cm.[property] = 'lastname' THEN cm.[string] END),
BIRTHDATE = MAX(CASE WHEN cm.[property] = 'birthdate' THEN cm.[date] END),
FLAG = CONVERT(BIT, MAX(CASE WHEN cm.[property] = 'flag' THEN CONVERT(TINYINT, cm.[boolean]) END)),
NUMBER = MAX(CASE WHEN cm.[property] = 'number' THEN cm.[integer] END)
FROM cmaster AS cm
GROUP BY cm.id;
Although, as you can see, your query becomes very tightly coupled to your EAV model, which is why EAV is considered an SQL antipattern. Your alternative is to create a single column in your subquery and pivot on that, but you have to convert to a single data type and lose a bit of type safety:
SELECT id, FIRSTNAME, LASTNAME, BIRTHDATE, ADDRESS, FLAG, NUMBER
FROM (
SELECT id = cm.[key],
[column] = cm.[column],
Value = CASE cm.type
WHEN 'NVARCHAR' THEN cm.string
WHEN 'DATETIME' THEN CONVERT(NVARCHAR(MAX), cm.date, 112)
WHEN 'XML' THEN CONVERT(NVARCHAR(MAX), cm.xml)
WHEN 'BIT' THEN CONVERT(NVARCHAR(MAX), cm.boolean)
WHEN 'INT' THEN CONVERT(NVARCHAR(MAX), cm.integer)
END
FROM cmaster AS cm
) AS t
PIVOT
(
MAX(Value)
FOR [column] IN (FIRSTNAME, LASTNAME, BIRTHDATE, ADDRESS, FLAG, NUMBER)
) AS p;
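To make the conditional-aggregation idea from this answer concrete, here is a small runnable sketch. It uses Python's built-in sqlite3 module rather than SQL Server, and the column names (string_value, integer_value, date_value, boolean_value) and the sample rows are assumptions chosen for illustration. The point it shows is that each output column keeps the type of the source column it reads from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE cmaster (
        id INTEGER,
        property TEXT,
        string_value TEXT,
        integer_value INTEGER,
        date_value TEXT,
        boolean_value INTEGER
    )
""")
conn.executemany(
    "INSERT INTO cmaster VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "firstname", "JOHN", None, None, None),
        (1, "lastname", "DOE", None, None, None),
        (1, "birthdate", None, None, "1985-02-25", None),
        (1, "flag", None, None, None, 1),
        (1, "number", None, 20, None, None),
    ],
)

# One MAX(CASE ...) per output column; MAX ignores the NULLs from non-matching rows.
row = conn.execute("""
    SELECT id,
           MAX(CASE WHEN property = 'firstname' THEN string_value END)  AS firstname,
           MAX(CASE WHEN property = 'lastname'  THEN string_value END)  AS lastname,
           MAX(CASE WHEN property = 'birthdate' THEN date_value END)    AS birthdate,
           MAX(CASE WHEN property = 'flag'      THEN boolean_value END) AS flag,
           MAX(CASE WHEN property = 'number'    THEN integer_value END) AS number
    FROM cmaster
    GROUP BY id
""").fetchone()
print(row)
```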
To get the result you asked for, we first need to bring the data into one format that is compatible with all the data types. VARCHAR is ideal for that. Then prepare the base table using a simple SELECT query and PIVOT the result.
In the last projection, if you want, you can convert the data back into the original types.
This query can also be written dynamically so that the result adapts as records are added. Here I provide the static answer according to your data. If you need a more generic dynamic answer, let me know and I can post it here.
--data insert scripts I used:
CREATE TABLE First_Table
(
[id] int,
[column] VARCHAR(10),
[string] VARCHAR(20),
[bit] BIT,
[xml] [xml],
[number] INT,
[date] DATE
)
SELECT GETDATE()
INSERT INTO First_Table VALUES(1, 'FIRST NAME', 'JOHN' , NULL, NULL, NULL, NULL)
INSERT INTO First_Table VALUES(1, 'LAST NAME', 'DOE' , NULL, NULL, NULL, NULL)
INSERT INTO First_Table VALUES(1, 'BIRTH DATE', NULL , NULL, NULL, NULL, '1985-02-25')
INSERT INTO First_Table VALUES(1, 'ADDRESS', NULL , NULL, 'SDFJDGJOKGDGKPDGKPDKGPDKGGKGKG', NULL, NULL)
INSERT INTO First_Table VALUES(1, 'FLAG', NULL , 1, NULL, NULL, NULL)
INSERT INTO First_Table VALUES(1, 'NUMBER', NULL , NULL, NULL, 20, NULL)
SELECT
PIVOTED.* FROM
(
--MAKING THE BASE TABLE FOR PIVOT
SELECT
[id]
,[column] AS [COLUMN]
, CASE WHEN [column] = 'FIRST NAME' then [string]
WHEN [column] = 'LAST NAME' then [string]
WHEN [column] = 'BIRTH DATE' then CAST([date] AS VARCHAR(100))
WHEN [column] = 'ADDRESS' then CAST([xml] AS VARCHAR(100))
WHEN [column] = 'FLAG' then CAST([bit] AS VARCHAR(100))
else CAST([number] AS VARCHAR(100)) END AS [VALUE]
FROM First_Table
) AS [P]
PIVOT
(
MIN ([P].[VALUE])
FOR [column] in ([FIRST NAME],[LAST NAME],[BIRTH DATE],[ADDRESS],[FLAG],[NUMBER])
) AS PIVOTED
RESULT:
SQL:
SELECT
            ID,
            FIRSTNAME,
            ...,
            FLAG = CAST (FLAG AS INT),
            ...
FROM
            (
            SELECT
                        *
            FROM
                        (
                        SELECT
                                    f.ID,
                                    f.PROPERTY,
                                    f.STRING + f."INTEGER" + f.DATETIME + f.BOLLEAN + f.XML AS COLS
                        FROM
                                    FIRSTTBL f)
            PIVOT(
                        min(COLS) FOR PROPERTY IN
                                    (
                                    'firstname' AS firstname,
                                    'lastname' AS lastname,
                                    'birthdate' AS birthdate,
                                    'address' AS address,
                                    'flag' AS flag,
                                    'number' AS "NUMBER"
                                    )
                        )
            )
According to the original table, there is one and only one non-null value among the STRING, INTEGER, DATETIME, BOLLEAN and XML columns for any row, so we just need to get the first non-null value and assign it to the corresponding new column. It is not difficult to perform the transposition using the PIVOT function, except that we need to handle different data types according to the SQL rule that each column must have a consistent type. For this task, we first convert the combined column values into a string, perform the row-to-column transposition, and then convert the strings back to the proper types. When there are a lot of columns, the SQL statement can be tricky, and dynamic requirements are even harder to achieve.
Yet it is easy to write the code using the open-source esProc SPL:
 
    A
1   =connect("MSSQL")
2   =A1.query@x("SELECT * FROM FIRSTTBL")
3   =A2.pivot(ID;PROPERTY,~.array().m(4:).ifn();"firstname":"FIRSTNAME", "lastname":"LASTANME","birthdate":"BIRTHDAY","address":"ADDRESS","flag":"FLAG","number":"NUMBER")
SPL does not require that data in the same column have consistent type. It is easy for it to maintain the original data types while performing the transposition.

How to convert a varchar value to datatype int in SQL Server 2008 with inner join

I have two lookup tables MyProviders and MyGroups. In my stored procedure, I have a temp table (replaced with an actual table for this example) with data. One column EntityId refers to either provider or a group. EntityTypeId tells me in that temp table if the entity is 1 = Provider or 2 = Group. EntityId can either have numeric GroupId or alphanumeric ExternalProviderId.
I want to check if there is any record in my temp table that has an invalid combination of clientOid + entityid from myprovider and mygroup table.
create table MyProviders
(
id int,
clientoid varchar(20),
externalproviderid varchar(20),
name varchar(25)
)
create table MyGroups
(
id int,
clientoid varchar(20),
name varchar(25)
)
create table MyJobDetails
(
clientoid varchar(20),
entityid varchar(20),
entitytypeid int,
entityname varchar(30)
)
insert into MyJobDetails values ('M.OID', 'MONYE', 1, 'Mark')
insert into MyJobDetails values ('M.OID', 2, 1, 'Lori')
insert into MyJobDetails values ('M.OID', 2, 2, 'Group 1')
insert into MyJobDetails values ('M.OID', 44444, 2, 'Group 2')
insert into MyProviders values (1, 'M.OID', 'MONY', 'Richard')
insert into MyProviders values (2, 'M.OID', '2', 'Mike')
insert into MyProviders values (3, 'M.OID', '3', 'Lori')
insert into MyGroups values (1, 'M.OID', 'Group 1')
insert into MyGroups values (2, 'M.OID', 'Group 2')
I tried the following query to determine if there is an invalid entity or not.
select
COUNT(*)
from
MyJobDetails as jd
where
not exists (select 1
from MyProviders as p
where p.ClientOID = jd.ClientOID
and p.ExternalProviderID = CAST(jd.EntityId as varchar(20))
and jd.EntityTypeId = 1)
and not exists (select 1
from MyGroups as g
where g.ClientOID = jd.ClientOID
and g.Id = jd.EntityId
and jd.EntityTypeId = 2)
This works as expected until I get alphanumeric data in my temp table that doesn't exist in the provider table. I get the following error message:
Conversion failed when converting the varchar value 'MONYE' to data type int.
I have tried to adapt the solutions mentioned in other threads that use IsNumeric, but that didn't work either. In this example, I need to return 1 for the one invalid entry, MONYE, which exists in neither the MyProviders nor the MyGroups table.
Also, can I optimize the query to achieve this in a better way?
This is a really bad design in my opinion.
Since you're referencing one out of two tables, you cannot enforce referential integrity.
And having different datatypes for your keys makes things even more horrible.
I would use
two separate foreign keys in MyJobDetails - one to MyProvider (varchar(20)) and another one to MyGroup (int)
make them both nullable
establish a proper foreign key relationship to the referenced table for each of those two
This way, both can be the correct datatype for each referenced table, and you won't need the EntityTypeId column anymore.
As a side note: whenever you use varchar in SQL Server, whether you're defining a parameter, a variable, or using it in a CAST statement, I recommend always explicitly defining a length for that varchar.
Or do you know what length this varchar in your conversion here is going to be?
CAST(jd.EntityId as varchar)
Use an explicit length - always - it's just a good, safe practice to employ:
CAST(jd.EntityId as varchar(15))
In the second AND NOT EXISTS section you compare g.Id, an int, with jd.EntityId, a varchar. Cast the g.Id as a varchar.
and not exists (select 1
from #MyGroups as g
where g.ClientOID = jd.ClientOID
and CAST(g.Id AS VARCHAR(20)) = jd.EntityId
and jd.EntityTypeId = 2)
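The reason the original comparison fails is data type precedence: when SQL Server compares an int with a varchar, it converts the varchar side to int, and 'MONYE' cannot be converted. The following Python sketch is only an analogy for that behaviour, not SQL Server code, and the function names int_side_match and text_side_match are made up for illustration.

```python
def int_side_match(group_id: int, entity_id: str) -> bool:
    """Mimics g.Id = jd.EntityId: the text side is converted to int first."""
    return group_id == int(entity_id)  # raises ValueError for non-numeric text

def text_side_match(group_id: int, entity_id: str) -> bool:
    """Mimics CAST(g.Id AS VARCHAR(20)) = jd.EntityId: compare as text."""
    return str(group_id) == entity_id

for entity_id in ("2", "MONYE"):
    try:
        print(entity_id, "int comparison:", int_side_match(2, entity_id))
    except ValueError as exc:
        print(entity_id, "int comparison failed:", exc)
    print(entity_id, "text comparison:", text_side_match(2, entity_id))
```

Comparing as text never fails, but it is stricter than a numeric comparison: a value such as '02' no longer matches the id 2.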
Try this:
select count(*)
from (
select clientoid, entityid from #MyJobDetails where entitytypeid = 1
except
select p.ClientOID, convert(varchar(200), p.ExternalProviderID)
from #MyProviders p
inner join #MyJobDetails jd
on p.ClientOID = jd.ClientOID
and p.ExternalProviderID = CAST(jd.EntityId as varchar(20))
where jd.EntityTypeId = 1
except
select g.ClientOID, convert(varchar(200), g.Id)
from #MyGroups g
inner join #MyJobDetails jd
on g.ClientOID = jd.ClientOID
and convert(varchar(200), g.Id) = jd.EntityId
where jd.EntityTypeId = 2
) a

Is the following query possible with SQL Pivot?

Let's say I have the following tables:
create table student(
id number not null,
name varchar2(80),
primary key(id)
);
create table class(
id number not null,
subject varchar2(80),
primary key(id)
);
create table class_meeting(
id number not null,
class_id number not null,
meeting_sequence number,
primary key(id),
foreign key(class_id) references class(id)
);
create table meeting_attendance(
id number not null,
student_id number not null,
meeting_id number not null,
present number not null,
primary key(id),
foreign key(student_id) references student(id),
foreign key(meeting_id) references class_meeting(id),
constraint meeting_attendance_uq unique(student_id, meeting_id),
constraint present_ck check(present in(0,1))
);
I want a query for each class, which has a column for the student name, one column for every class_meeting for this class and for every class meeting the cells would show the present attribute, which should be 1 if the student was present at that meeting and 0 if the student was absent in that meeting. Here is a picture from excel for reference:
Is it possible to make an apex report like that?
From googling I figured I must use Pivot, however I'm having a hard time understanding how it could be used here. Here is the query I have so far:
select * from(
select s.name, m.present
from student s, meeting_attendance m
where s.id = m.student_id
)
pivot(
present
for class_meeting in ( select a.meeting_sequence
from class_meeting a, class b
where b.id = a.class_id )
)
However I'm sure it's way off. Is it even possible to do this with one query, or should I use pl sql htp and htf packages to create an html table?
Pretty inexperienced oracle developer here, so any help is very appreciated.
It took a while to answer, but I had to write this all up and test it!
Data I've worked with:
begin
insert into student(id, name) values (1, 'Tom');
insert into student(id, name) values (2, 'Odysseas');
insert into class(id, subject) values (1, 'Programming');
insert into class(id, subject) values (2, 'Databases');
insert into class_meeting (id, class_id, meeting_sequence) values (1, 1, 10);
insert into class_meeting (id, class_id, meeting_sequence) values (2, 1, 20);
insert into class_meeting (id, class_id, meeting_sequence) values (3, 2, 10);
insert into class_meeting (id, class_id, meeting_sequence) values (4, 2, 20);
insert into meeting_attendance (id, student_id, meeting_id, present) values (1, 1, 1, 1); -- Tom was at meeting 10 about programming
insert into meeting_attendance (id, student_id, meeting_id, present) values (2, 1, 2, 1); -- Tom was at meeting 20 about programming
insert into meeting_attendance (id, student_id, meeting_id, present) values (3, 1, 3, 0); -- Tom was NOT at meeting 10 about databases
insert into meeting_attendance (id, student_id, meeting_id, present) values (4, 1, 4, 0); -- Tom was NOT at meeting 20 about databases
insert into meeting_attendance (id, student_id, meeting_id, present) values (5, 2, 1, 0); -- Odysseas was NOT at meeting 10 about programming
insert into meeting_attendance (id, student_id, meeting_id, present) values (6, 2, 2, 1); -- Odysseas was at meeting 20 about programming
insert into meeting_attendance (id, student_id, meeting_id, present) values (7, 2, 3, 0); -- Odysseas was NOT at meeting 10 about databases
insert into meeting_attendance (id, student_id, meeting_id, present) values (8, 2, 4, 1); -- Odysseas was at meeting 20 about databases
end;
PIVOT, as it stands right now, does not allow a dynamic number of columns in a simple way. It only allows this with the XML keyword, resulting in an xmltype column.
Here are some excellent docs. http://www.oracle-base.com/articles/11g/pivot-and-unpivot-operators-11gr1.php
It always pays off to read those first.
How to, then?
You'll literally find tons of questions about the same thing once you start searching.
Dynamic SQL
https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:4471013000346257238
Dynamically pivoting a table Oracle
Dynamic Oracle Pivot_In_Clause
A classic report can take a function body returning a SQL statement as its source. An interactive report cannot. As it stands, an IR is out of the question as it is too metadata dependent.
For example, with these queries/plsql in a classic report region source:
static pivot
select *
from (
select s.name as student_name, m.present present, cm.meeting_sequence||'-'|| c.subject meeting
from student s
join meeting_attendance m
on s.id = m.student_id
join class_meeting cm
on cm.id = m.meeting_id
join class c
on c.id = cm.class_id
)
pivot ( max(present) for meeting in ('10-Databases' as "10-DB", '20-Databases' as "20-DB", '10-Programming' as "10-PRM", '20-Programming' as "20-PRM") );
-- Results
STUDENT_NAME   10-DB   20-DB   10-PRM   20-PRM
Tom            0       0       1        1
Odysseas       0       1       0        1
function body returning statement
DECLARE
l_pivot_cols VARCHAR2(4000);
l_pivot_qry VARCHAR2(4000);
BEGIN
SELECT ''''||listagg(cm.meeting_sequence||'-'||c.subject, ''',''') within group(order by 1)||''''
INTO l_pivot_cols
FROM class_meeting cm
JOIN "CLASS" c
ON c.id = cm.class_id;
l_pivot_qry :=
'select * from ( '
|| 'select s.name as student_name, m.present present, cm.meeting_sequence||''-''||c.subject meeting '
|| 'from student s '
|| 'join meeting_attendance m '
|| 'on s.id = m.student_id '
|| 'join class_meeting cm '
|| 'on cm.id = m.meeting_id '
|| 'join class c '
|| 'on c.id = cm.class_id '
|| ') '
|| 'pivot ( max(present) for meeting in ('||l_pivot_cols||') )' ;
RETURN l_pivot_qry;
END;
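What the listagg expression assembles is simply a comma-separated list of quoted labels that is spliced into the pivot clause. A language-neutral sketch of that string building, written here in Python, might look as follows. The helper build_pivot_in_list is hypothetical, the labels are sorted for a stable result, and doubling embedded single quotes is an extra precaution that the PL/SQL above does not take.

```python
def build_pivot_in_list(labels):
    """Return a comma-separated list of single-quoted SQL string literals."""
    # Double any embedded single quote so each label stays one valid literal.
    quoted = ["'" + label.replace("'", "''") + "'" for label in sorted(labels)]
    return ",".join(quoted)

labels = ["10-Programming", "20-Programming", "10-Databases", "20-Databases"]
pivot_sql = (
    "pivot ( max(present) for meeting in ("
    + build_pivot_in_list(labels)
    + ") )"
)
print(pivot_sql)
```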
Take note however of the settings in the region source.
Use Query-Specific Column Names and Validate Query
This is the standard setting. It will parse your query and then store the columns found in the query in the report metadata. If you go ahead and create a report with the above plsql code, you can see that apex has parsed the query and has assigned the correct columns. What is wrong with this approach is that this metadata is static. The report's metadata is not refreshed every time the report is run.
This can be proven quite simply by adding another class to the data.
begin
insert into class(id, subject) values (3, 'Watch YouTube');
insert into class_meeting (id, class_id, meeting_sequence) values (5, 3, 10);
insert into meeting_attendance (id, student_id, meeting_id, present) values (10, 1, 5, 1); -- Tom was at meeting 10 about watching youtube
end;
Run the page without editing the report! Editing and saving will regenerate the metadata, which is clearly not a viable method. The data will change anyway, and you cannot go in and save the report metadata every time.
--cleanup
begin
delete from meeting_attendance where id = 10;
delete from class_meeting where id = 5;
delete from class where id = 3;
end;
Use Generic Column Names (parse query at runtime only)
Setting the source to this type will allow you to use a more dynamic approach. By changing the settings of the report to this type of parsing, apex will just generate an amount of columns in its metadata without being directly associated with the actual query. There'll just be columns with 'COL1', 'COL2', 'COL3',...
Run the report. Works fine. Now insert some data again.
begin
insert into class(id, subject) values (3, 'Watch YouTube');
insert into class_meeting (id, class_id, meeting_sequence) values (5, 3, 10);
insert into meeting_attendance (id, student_id, meeting_id, present) values (10, 1, 5, 1); -- Tom was at meeting 10 about watching youtube
end;
Run the report. Works fine.
However, the kink here is the column names. They're not really all that dynamic, with their ugly names. You can edit the columns, surely, but they're not dynamic. There is no class being displayed or anything, nor can you reliably set their headers to one. Again this makes sense: the metadata is there, but it is static. It could work for you if you're happy with this approach.
You can however deal with this. In the "Report Attributes" of the report, you can select a "Headings Type". They're all static, except for "PL/SQL" of course! Here you can write a function body (or just call a function) which'll return the column headers!
DECLARE
l_return VARCHAR2(400);
BEGIN
SELECT listagg(cm.meeting_sequence||'-'||c.subject, ':') within group(order by 1)
INTO l_return
FROM class_meeting cm
JOIN "CLASS" c
ON c.id = cm.class_id;
RETURN l_return;
END;
Third party solution
https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:4843682300346852395#5394721000346803830
https://stackoverflow.com/a/16702401/814048
http://technology.amis.nl/2006/05/24/dynamic-sql-pivoting-stealing-antons-thunder/
In APEX: though the dynamic pivot is more straightforward after installing, the setup in apex remains the same as if you'd want to use dynamic SQL. Use a classic report with generic column names.
I'm not going to go into much detail here. I don't have this package installed atm. It's nice to have, but in this scenario it may not be that helpful. It purely allows you to write a dynamic pivot in a more concise way, but doesn't help much on the apex side of things. As I've demonstrated above, the dynamic columns and the static metadata of the apex reports are the limiting factor here.
Use XML
I myself have opted to use the XML keyword before. I use pivot to make sure I have values for all rows and columns, then read it out again with XMLTABLE, and then creating one XMLTYPE column, serializing it to a CLOB.
This may be a bit advanced, but it's a technique I've used a couple of times so far, with good results. It's fast, provided the base data is not too big, and it's just one sql call, so not a lot of context switches. I've used it with CUBE'd data as well, and it works great.
(note: the classes I've added on the elements correspond with classes used on classic reports in theme 1, simple red)
DECLARE
l_return CLOB;
BEGIN
-- Subqueries:
-- SRC
-- source data query
-- SRC_PIVOT
-- pivoted source data with XML clause to allow variable columns.
-- Mainly used for convenience because pivot fills in 'gaps' in the data.
-- an example would be that 'Odysseas' does not have a relevant record for the 'Watch Youtube' class
-- PIVOT_HTML
-- Pulls the data from the pivot xml into columns again, and collates the data
-- together with xmlelements.
-- HTML_HEADERS
-- Creates a row with just header elements based on the source data
-- HTML_SRC
-- Creates row elements with the student name and the collated data from pivot_html
-- Finally:
-- serializes the xmltype column for easier-on-the-eye markup
WITH src AS (
SELECT s.name as student_name, m.present present, cm.meeting_sequence||'-'||c.subject meeting
FROM student s
JOIN meeting_attendance m
ON s.id = m.student_id
JOIN class_meeting cm
ON cm.id = m.meeting_id
JOIN class c
ON c.id = cm.class_id
),
src_pivot AS (
SELECT student_name, meeting_xml
FROM src pivot xml(MAX(NVL(present, 0)) AS is_present_max for (meeting) IN (SELECT distinct meeting FROM src) )
),
pivot_html AS (
SELECT student_name
, xmlagg(
xmlelement("td", xmlattributes('data' as "class"), is_present_max)
ORDER BY meeting
) is_present_html
FROM src_pivot
, xmltable('PivotSet/item'
passing meeting_xml
COLUMNS "MEETING" VARCHAR2(400) PATH 'column[@name="MEETING"]'
, "IS_PRESENT_MAX" NUMBER PATH 'column[@name="IS_PRESENT_MAX"]')
GROUP BY (student_name)
),
html_headers AS (
SELECT xmlelement("tr",
xmlelement("th", xmlattributes('header' as "class"), 'Student Name')
, xmlagg(xmlelement("th", xmlattributes('header' as "class"), meeting) order by meeting)
) headers
FROM (SELECT DISTINCT meeting FROM src)
),
html_src as (
SELECT
xmlagg(
xmlelement("tr",
xmlelement("td", xmlattributes('data' as "class"), student_name)
, ah.is_present_html
)
) data
FROM pivot_html ah
)
SELECT
xmlserialize( content
xmlelement("table"
, xmlattributes('report-standard' as "class", '0' as "cellpadding", '0' as "cellspacing", '0' as "border")
, xmlelement("thead", headers )
, xmlelement("tbody", data )
)
AS CLOB INDENT SIZE = 2
)
INTO l_return
FROM html_headers, html_src ;
htp.prn(l_return);
END;
In APEX: well, since the HTML has been constructed, this can only be a PLSQL region which calls the package function and prints it using HTP.PRN.
(edit) There's also this post on the OTN forum which does the same in a large part, but does not generate headings etc, rather using the apex functionalities:
OTN: Matrix report
PLSQL
Alternatively, you can just opt to go the good ol' plsql route. You could take the body from the dynamic sql above, loop over it, and put out a table structure by using htp.prn calls. Put out headers, and put out whatever else you want. For good effect, add classes on the elements which correspond with the theme you're using.
Disclaimer: I don't know apex specifically.
Here's a correct pivot query, assuming the class you want has an ID = 1, and that the meeting_id's for that class are 1,2,3.
select * from(
select s.name, a.present,m.id meeting_id
from student s, meeting_attendance a, class_meeting m, class c
where s.id = a.student_id
and m.id = a.meeting_id
and c.id = m.class_id
and c.id = 1
)
pivot(
sum(present)
for meeting_id in(1,2,3)
);
I don't believe you can use a sub-query to return the values for the "for in" of the pivot.