I have two tables, a comments table and a responses table, in Azure Data Warehouse.
Comments table
commenid   comment
--------   -------
1          Hi aaa
2          Hi xxx
3          Hi yyy
Responses table
Responseid   response       linkid                                    createddate
----------   ------------   ---------------------------------------   -----------
123          open ticket    1                                         10-25-2021
124          Activate       123 (this is my previous responseid)      10-26-2021
3452         Close          124                                       10-30-2021
532          reply to xxx   2                                         10-25-2021
3214         closed         532                                       10-29-2021
654          hold           3                                         11-14-2021
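The comments table and the responses table are related through a comment and its first response: the linkid of a first response holds the comment id.
The responses table also holds the follow-up responses: the linkid of each later response holds the previous responseid.
Now I need a query in Azure SQL that returns my responses table together with the comment id each response belongs to, as shown below.
I tried joins, but they do not produce these results: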
Responseid   response       linkid   comments
----------   ------------   ------   --------
123          open ticket    1        1
124          Activate       123      1
3452         Close          124      1
532          reply to xxx   2        2
3214         closed         532      2
654          hold           3        3
Please help me. Thank you in advance.
With the following tables :
CREATE TABLE T_COMMENTS
(commenid INT, comment VARCHAR(256));
INSERT INTO T_COMMENTS VALUES
(1, 'Hi aaa'),
(2, 'Hi xxx'),
(3, 'Hi yyy');
CREATE TABLE T_RESPONSES
(Responseid int, response VARCHAR(256), linkid int, createddate DATE)
INSERT INTO T_RESPONSES VALUES
(123, 'open ticket', 1, '10-25-2021'),
(124, 'Activate', 123, '10-26-2021'),
(3452, 'Close', 124, '10-30-2021'),
(532, 'reply to xxx', 2, '10-25-2021'),
(3214, 'closed', 532, '10-29-2021'),
(654, 'hold', 3, '11-14-2021');
The following query does the job:
WITH
T AS
(
SELECT Responseid, linkid, linkid AS link_comment
FROM T_RESPONSES
WHERE linkid NOT IN (SELECT Responseid
FROM T_RESPONSES)
UNION ALL
SELECT R.Responseid, R.linkid, link_comment
FROM T_RESPONSES AS R
JOIN T ON T.Responseid = R.linkid
)
SELECT R.Responseid, response, R.linkid, T.link_comment, comment
FROM T
JOIN T_RESPONSES AS R ON T.Responseid = R.Responseid
JOIN T_COMMENTS AS C ON T.link_comment = C.commenid;
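To try the recursion locally, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. SQLite is only a stand-in chosen for illustration (it is not the Azure engine): it needs the RECURSIVE keyword, createddate is stored as plain text, and the depth column is an addition used only to get a stable ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T_COMMENTS (commenid INT, comment VARCHAR(256));
INSERT INTO T_COMMENTS VALUES (1, 'Hi aaa'), (2, 'Hi xxx'), (3, 'Hi yyy');
CREATE TABLE T_RESPONSES (Responseid INT, response VARCHAR(256), linkid INT, createddate TEXT);
INSERT INTO T_RESPONSES VALUES
  (123,  'open ticket',  1,   '10-25-2021'),
  (124,  'Activate',     123, '10-26-2021'),
  (3452, 'Close',        124, '10-30-2021'),
  (532,  'reply to xxx', 2,   '10-25-2021'),
  (3214, 'closed',       532, '10-29-2021'),
  (654,  'hold',         3,   '11-14-2021');
""")

QUERY = """
WITH RECURSIVE T AS (
    -- Anchor: first responses, whose linkid is a comment id, not a Responseid
    SELECT Responseid, linkid, linkid AS link_comment, 1 AS depth
    FROM T_RESPONSES
    WHERE linkid NOT IN (SELECT Responseid FROM T_RESPONSES)
    UNION ALL
    -- Recursive step: follow-ups inherit the comment id of the response they link to
    SELECT R.Responseid, R.linkid, T.link_comment, T.depth + 1
    FROM T_RESPONSES AS R
    JOIN T ON T.Responseid = R.linkid
)
SELECT R.Responseid, R.response, R.linkid, T.link_comment
FROM T
JOIN T_RESPONSES AS R ON T.Responseid = R.Responseid
ORDER BY T.link_comment, T.depth
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The anchor member only works because, in the sample data, comment ids never collide with response ids. If they can collide in the real tables, an explicit flag for first responses would be needed.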
I am working on a SQL query in Azure Databricks Environment, where considering the following dataset:
CREATE OR REPLACE TABLE tb_user_info
(
clientid INT,
visitid STRING,
channel STRING,
conversion INT,
index INT,
value STRING
);
INSERT INTO tb_user_info VALUES
(123, 'abc123', 'google', 1, 11, '1250'),
(123, 'abc123', 'google', 1, 22, '25000'),
(123, 'abc123', 'google', 1, 33, '1K and 3K'),
(456, 'def456', 'facebook', 3, 11, '2860'),
(456, 'def456', 'facebook', 3, 22, '78000'),
(456, 'def456', 'facebook', 3, 33, '3K and 5K');
SELECT * FROM tb_user_info ORDER BY clientid, index
clientid   visitid   channel    conversion   index   value
--------   -------   --------   ----------   -----   ---------
123        abc123    google     1            11      1250
123        abc123    google     1            22      25000
123        abc123    google     1            33      1K and 3K
456        def456    facebook   3            11      2860
456        def456    facebook   3            22      78000
456        def456    facebook   3            33      3K and 5K
I want to get the following output:
clientid   visitid   channel    conversion   salary (index=11)   savings (index=22)   salary range (index=33)
--------   -------   --------   ----------   -----------------   ------------------   -----------------------
123        abc123    google     1            1250                25000                1K and 3K
456        def456    facebook   3            2860                78000                3K and 5K
where the columns clientid, visitid, channel and conversion are grouped and the columns index and value are the columns that are pivoted.
I've tried using the Pivot function and I read this Documentation but I haven't been successful.
Could you help me with how can I solve this task?
I am not sure what actual problem you have encountered. I wrote the following query, and it seems to work normally:
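-- Spark SQL (Databricks) PIVOT: the IN list takes plain literals, optionally aliased;
-- the [11], [22], [33] bracket form is T-SQL syntax.
SELECT * FROM (
    SELECT clientid, visitid, channel, conversion, index, value
    FROM tb_user_info
)
PIVOT (
    max(value)
    FOR index IN (
        11 AS salary, 22 AS savings, 33 AS salary_range
    )
)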
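If PIVOT is not available or keeps failing, the same reshaping can usually be written as conditional aggregation, which is portable across most engines. The sketch below is an illustration only: it uses an in-memory SQLite database from Python as a stand-in for Databricks, quotes "index" because it is a keyword in SQLite, and the output column names salary, savings and salary_range are chosen here for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tb_user_info (
    clientid INT, visitid TEXT, channel TEXT,
    conversion INT, "index" INT, value TEXT
);
INSERT INTO tb_user_info VALUES
  (123, 'abc123', 'google',   1, 11, '1250'),
  (123, 'abc123', 'google',   1, 22, '25000'),
  (123, 'abc123', 'google',   1, 33, '1K and 3K'),
  (456, 'def456', 'facebook', 3, 11, '2860'),
  (456, 'def456', 'facebook', 3, 22, '78000'),
  (456, 'def456', 'facebook', 3, 33, '3K and 5K');
""")

# One MAX(CASE ...) per pivoted value; rows that do not match yield NULL,
# which MAX ignores, so each group keeps exactly the matching value.
PIVOT_QUERY = """
SELECT clientid, visitid, channel, conversion,
       MAX(CASE WHEN "index" = 11 THEN value END) AS salary,
       MAX(CASE WHEN "index" = 22 THEN value END) AS savings,
       MAX(CASE WHEN "index" = 33 THEN value END) AS salary_range
FROM tb_user_info
GROUP BY clientid, visitid, channel, conversion
ORDER BY clientid
"""

pivoted = conn.execute(PIVOT_QUERY).fetchall()
for row in pivoted:
    print(row)
```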
I've been struggling to get my hand around the following scenario where I can't find a similar question with a similar scenario asked, so here we go:
Let's say I have two tables:
Messages:
P_ID   Message_ID   Message_sent_date
----   ----------   -----------------
123    ABCD         2020/03/01
123    BCDE         2020/07/01
234    CDEF         2020/01/01
234    DEFG         2020/05/01
People:
P_ID   P_Achievement   Achievement_date
----   -------------   ----------------
123    Level 1         2019/09/01
123    Level 4         2020/06/01
234    Level 2         2019/12/01
234    Level 3         2020/04/01
I want to join people on messages with P_ID, BUT with the condition that only the most recent achievement relevant to the message_sent_date is displayed from P_ID. So it should look like this:
P_ID   Message_ID   Message_sent_date   P_Achievement   Achievement_date
----   ----------   -----------------   -------------   ----------------
123    ABCD         2020/03/01          Level 1         2019/09/01
123    BCDE         2020/07/01          Level 4         2020/06/01
234    CDEF         2020/01/01          Level 2         2019/12/01
234    DEFG         2020/05/01          Level 3         2020/04/01
My current code actually works, however, the problem is that the messages and people data tables in my real-life problem are incredibly large so the query takes more than an hour to run (I actually never fully finished running it because it takes too long, but tried with a specific P_ID example where it works). Filtering the two original tables doesn't help much.
I know that the subquery in the where is causing the long run time so I was wondering if anyone knows how to solve this type of problem in a more efficient way?
Thanks already in advance!
SELECT *
FROM messages m
LEFT JOIN people p
ON m.P_ID = p.P_ID
WHERE Achievement_date = (SELECT MAX(Achievement_date)
FROM people
WHERE Message_sent_date >= Achievement_date
)
Tested in PostgreSQL v14, should be the same for Presto.
CREATE TABLE messages (
"p_id" INTEGER,
"message_id" VARCHAR(4),
"message_sent_date" TIMESTAMP
);
INSERT INTO messages
("p_id", "message_id", "message_sent_date")
VALUES
('123', 'ABCD', '2020/03/01'),
('123', 'BCDE', '2020/07/01'),
('234', 'CDEF', '2020/01/01'),
('234', 'DEFG', '2020/05/01');
CREATE TABLE people (
"p_id" INTEGER,
"p_achievement" VARCHAR(7),
"achievement_date" TIMESTAMP
);
INSERT INTO people
("p_id", "p_achievement", "achievement_date")
VALUES
('123', 'Level 1', '2019/09/01'),
('123', 'Level 4', '2020/06/01'),
('234', 'Level 2', '2019/12/01'),
('234', 'Level 3', '2020/04/01');
Query: Don't do the ORDER BY if you're looking for speed, that was just to get it to match your result.
SELECT p_id, message_id, Message_sent_date, p_achievement, achievement_date
FROM (
SELECT p_id, message_id, MIN(Message_sent_date) Message_sent_date, MAX(achievement_date) achievement_date
FROM messages m
LEFT JOIN people p
USING(p_id)
WHERE message_sent_date >= achievement_date
GROUP BY p_id, message_id
) t
JOIN people
USING(p_id, achievement_date)
ORDER BY p_id, message_id
p_id   message_id   message_sent_date   p_achievement   achievement_date
----   ----------   -----------------   -------------   ----------------
123    ABCD         2020-03-01          Level 1         2019-09-01
123    BCDE         2020-07-01          Level 4         2020-06-01
234    CDEF         2020-01-01          Level 2         2019-12-01
234    DEFG         2020-05-01          Level 3         2020-04-01
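For anyone without a PostgreSQL or Presto instance at hand, the aggregate-then-join-back idea can be checked with a small Python sketch against an in-memory SQLite database. This is an illustration under assumptions: SQLite is a stand-in engine, dates are stored as ISO text so string comparison matches date order, and the join conditions are written out with ON instead of USING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE messages (p_id INTEGER, message_id TEXT, message_sent_date TEXT);
INSERT INTO messages VALUES
  (123, 'ABCD', '2020-03-01'), (123, 'BCDE', '2020-07-01'),
  (234, 'CDEF', '2020-01-01'), (234, 'DEFG', '2020-05-01');
CREATE TABLE people (p_id INTEGER, p_achievement TEXT, achievement_date TEXT);
INSERT INTO people VALUES
  (123, 'Level 1', '2019-09-01'), (123, 'Level 4', '2020-06-01'),
  (234, 'Level 2', '2019-12-01'), (234, 'Level 3', '2020-04-01');
""")

QUERY = """
SELECT t.p_id, t.message_id, t.message_sent_date,
       p.p_achievement, t.achievement_date
FROM (
    -- Step 1: per message, the latest achievement date not after the send date
    SELECT m.p_id, m.message_id, m.message_sent_date,
           MAX(p.achievement_date) AS achievement_date
    FROM messages m
    JOIN people p
      ON p.p_id = m.p_id
     AND p.achievement_date <= m.message_sent_date
    GROUP BY m.p_id, m.message_id, m.message_sent_date
) t
-- Step 2: join back to people to fetch the achievement for that date
JOIN people p
  ON p.p_id = t.p_id
 AND p.achievement_date = t.achievement_date
ORDER BY t.p_id, t.message_id
"""

latest = conn.execute(QUERY).fetchall()
for row in latest:
    print(row)
```

Unlike the original subquery in the WHERE clause, the date filter here is restricted to rows with the same p_id, so each message is only compared against its own person's achievements.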
Googling SQL PIVOT brings up answers to more complex situations than I need with aggregations, and although I did find this simple SQL Pivot Query , it's pivoting on a single table, whereas I have two, it's doing a rank partition which I don't know is necessary, I can't actually get it to work, plus it's 5 years old and I'm hoping there's an easier way.
I am sure this is a duplicate question so if someone can find it then please do!
People table:
PersonID
========
1
2
3
Device table:
DeviceID | PersonID
===================
1111 1
2222 1
3333 1
123 2
456 2
9999 3
I do a join like this:
SELECT p.PersonID, d.DeviceID FROM People p
LEFT JOIN Device d on d.PersonID = p.PersonID
Which gives me:
PersonID | DeviceID
===================
1 1111
1 2222
1 3333
2 123
2 456
3 9999
I know what you're thinking: it's just the Device table. But this is a minimal version of the query and tables; there's much more going on in the real ones.
I want to be able to inject a join on the People table to the Device table and get three columns:
Must I use PIVOT to get the results like this? (there will always be a max of three devices per person)
PersonID | 1 | 2 | 3
===============================================
1 1111 2222 3333
2 123 456
3 9999
(Where the blanks would be NULL)
I'm trying:
SELECT PersonID, [1], [2], [3]
FROM (
SELECT p.PersonID, d.DeviceID FROM People p
LEFT JOIN Device d on d.PersonID = p.PersonID) AS r
PIVOT
(
MAX(DeviceID)
FOR DeviceID IN([1], [2], [3])
) AS p;
But it's giving me NULL for all three columns.
The value list defined in the pivot clause must contain actual values from your table. [1], [2], [3] are values from your PersonId, not for DeviceId. So the part for DeviceId in [1], [2], [3] is not producing any results, hence all the null values.
Here is my solution. I constructed a new key_ column to pivot around.
Sample data with added person names
-- Table variables are declared with @ in T-SQL
declare @person table
(
    personid int,
    personname nvarchar(100)
);
insert into @person (personid, personname) values
(1, 'Ann'),
(2, 'Britt'),
(3, 'Cedric');
declare @device table
(
    personid int,
    deviceid int
);
insert into @device (personid, deviceid) values
(1, 1111),
(1, 2222),
(1, 3333),
(2, 123),
(2, 456),
(3, 9999);
Solution
Run the CTE part on its own to see the intermediate result table. The key_ column contains values like DEVICE_* which are the same values used in the for key_ in part of the pivot clause.
with base as
(
    select p.personname,
           d.deviceid,
           -- number each person's devices 1, 2, 3 and turn that into the pivot key
           'DEVICE_' + convert(char, ROW_NUMBER() over(partition by p.personname order by d.deviceid)) as 'key_'
    from @person p
    join @device d
        on d.personid = p.personid
)
select piv.personname, piv.DEVICE_1, piv.DEVICE_2, piv.DEVICE_3
from base
pivot( max(deviceid) for key_ in ([DEVICE_1], [DEVICE_2], [DEVICE_3]) ) piv;
Result
The intermediate CTE result table
personname deviceid key_
---------- ----------- ----------
Ann 1111 DEVICE_1
Ann 2222 DEVICE_2
Ann 3333 DEVICE_3
Britt 123 DEVICE_1
Britt 456 DEVICE_2
Cedric 9999 DEVICE_1
The final result
personname DEVICE_1 DEVICE_2 DEVICE_3
---------- ----------- ----------- -----------
Ann 1111 2222 3333
Britt 123 456 NULL
Cedric 9999 NULL NULL
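The same "number the rows, then spread them into columns" idea can also be written without PIVOT. The following Python sketch is an illustration only: it uses an in-memory SQLite database as a stand-in for SQL Server (window functions need SQLite 3.25 or later), plain tables instead of table variables, and a numeric rn column in place of the DEVICE_* key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (personid INT, personname TEXT);
INSERT INTO person VALUES (1, 'Ann'), (2, 'Britt'), (3, 'Cedric');
CREATE TABLE device (personid INT, deviceid INT);
INSERT INTO device VALUES
  (1, 1111), (1, 2222), (1, 3333), (2, 123), (2, 456), (3, 9999);
""")

QUERY = """
WITH base AS (
    -- rn plays the role of the key_ column: 1, 2, 3 per person
    SELECT p.personname, d.deviceid,
           ROW_NUMBER() OVER (PARTITION BY p.personid ORDER BY d.deviceid) AS rn
    FROM person p
    JOIN device d ON d.personid = p.personid
)
SELECT personname,
       MAX(CASE WHEN rn = 1 THEN deviceid END) AS device_1,
       MAX(CASE WHEN rn = 2 THEN deviceid END) AS device_2,
       MAX(CASE WHEN rn = 3 THEN deviceid END) AS device_3
FROM base
GROUP BY personname
ORDER BY personname
"""

devices = conn.execute(QUERY).fetchall()
for row in devices:
    print(row)
```

Missing devices come back as NULL (None in Python), matching the blanks in the desired output.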
I have a table named "customer" that looks like this:
ID ALPHA BRAVO CHARLIE DATE
-------------------------------------------------
1 111 222 333 02/02/2019
2 333 444 555 11/11/2019
3 666 555 777 12/12/2019
4 777 888 999 05/05/2020
5 100 101 110 12/25/2020
and I need to get the following output:
ID ALPHA BRAVO CHARLIE DATE NEW_COL ROW_NUM
-----------------------------------------------------------------------
1 111 222 333 02/02/2019 333 4
2 333 444 555 11/11/2019 333 3
3 666 555 777 12/12/2019 333 2
4 777 888 999 05/05/2020 333 1
 5     100     101      110     12/25/2020    110       1
The ALPHA, BRAVO, and CHARLIE columns represent customer IDs. A given customer can have multiple IDs in the system. Records 1-4 represent IDs belonging to the same customer, let's say John. As per the table, John has 12 IDs, and his latest ID is 999. Record 5 represents another customer, let's say Jane. Jane has three IDs, and her last ID is 110.
The purpose of the ROW_NUM column is to get the last CUSTOMER.CHARLIE value. The idea is to use the first CHARLIE value as the partition. Basically, the goal is to get one parent:many children mapping. In this case, the ID 333 should be tied to 555, 777, and 999.
Here is the DDL/DML:
CREATE TABLE CUSTOMER
(ID NUMBER(20) NOT NULL,
ALPHA NUMBER(20) NOT NULL,
BRAVO NUMBER(20) NOT NULL,
CHARLIE NUMBER(20) NOT NULL,
CREATEDDATE DATE
);
INSERT INTO CUSTOMER
VALUES
(1, 111, 222, 333, to_date('02-FEB-19','DD-MON-RR'));
INSERT INTO CUSTOMER
VALUES
(2, 333, 444, 555, to_date('11-NOV-19','DD-MON-RR'));
INSERT INTO CUSTOMER
VALUES
(3, 666, 555, 777, to_date('12-DEC-19','DD-MON-RR'));
INSERT INTO CUSTOMER
VALUES
(4, 777, 888, 999, to_date('05-MAY-20','DD-MON-RR'));
INSERT INTO CUSTOMER
VALUES
(5, 100, 101, 110, to_date('25-DEC-20','DD-MON-RR'));
COMMIT;
I have tried the following query, but it fails to populate the partition column correctly:
WITH
charlies
AS
(SELECT DISTINCT charlie
FROM customer),
mult_customers
AS
(SELECT c.*, c.charlie AS NEW_COL
FROM customer c
UNION
SELECT c.*,
CASE WHEN c.alpha = e.charlie THEN c.alpha ELSE c.bravo END AS NEW_COL
FROM customer c
JOIN charlies e ON e.charlie = c.alpha OR e.charlie = c.bravo),
ranked
AS
(SELECT mc.*,
ROW_NUMBER ()
OVER (PARTITION BY NEW_COL ORDER BY createddate DESC) AS row_num
FROM mult_customers mc)
SELECT *
FROM ranked
ORDER BY ID;
Thanks for any help provided.
Your task is known as connected components. About 7-8 years ago I wrote a solution for this, and even a PL/SQL package: http://orasql.org/2017/09/29/connected-components/
This PL/SQL solution is much more efficient than pure SQL solutions: http://orasql.org/2014/02/28/straight-sql-vs-sql-and-plsql/
Let me know if you need help adapting it to your task.
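To make the connected-components idea concrete, here is a small Python sketch using a union-find structure on the sample rows. It is not the linked PL/SQL package, only an illustration of the technique. The assumptions are labelled in the comments: all IDs on one row belong to the same customer, NEW_COL is the CHARLIE value of the customer's oldest row, and ROW_NUM ranks rows from newest to oldest.

```python
from collections import defaultdict

# (ID, ALPHA, BRAVO, CHARLIE, CREATEDDATE as ISO text) from the sample data
rows = [
    (1, 111, 222, 333, "2019-02-02"),
    (2, 333, 444, 555, "2019-11-11"),
    (3, 666, 555, 777, "2019-12-12"),
    (4, 777, 888, 999, "2020-05-05"),
    (5, 100, 101, 110, "2020-12-25"),
]

parent = {}

def find(x):
    """Return the representative of x's component (with path halving)."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def union(a, b):
    """Merge the components containing a and b."""
    ra, rb = find(a), find(b)
    if ra != rb:
        parent[rb] = ra

# Assumption: the three IDs on a row belong to one customer, so link them.
for _id, alpha, bravo, charlie, _date in rows:
    union(alpha, bravo)
    union(alpha, charlie)

# Group rows by the component of any of their IDs.
groups = defaultdict(list)
for row in rows:
    groups[find(row[1])].append(row)

# Assumption: NEW_COL = CHARLIE of the oldest row; ROW_NUM = 1 for the newest row.
result = {}
for members in groups.values():
    members.sort(key=lambda r: r[4])
    new_col = members[0][3]
    for rank, row in enumerate(reversed(members), start=1):
        result[row[0]] = (new_col, rank)

for row_id in sorted(result):
    print(row_id, *result[row_id])
```

Rows 1 to 4 end up in one component because they share 333, 555 and 777 between them, while row 5 shares no ID with any other row.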
Hospital_Visit
hid pid HospitalName DoctorId
41 1 abc 1
42 2 xyx 2
Patient_Master
pid PatientName
1 jill
2 rosy
Doctor_Master
DoctorID DoctorName
1 John
2 Jack
Hospital_Study
sid hid exam status
1 41 jjj sfvn
2 41 fks jdjd
select Hospital_Visit.Pid,PatientName,DoctorName from Patient_Master
inner join Hospital_Visit on Hospital_Visit.pid=Patient_Master.pid
inner join Doctor_Master on Doctor_Master.DoctorID= Hospital_Visit.DoctorID
inner join Hospital_Study on Hospital_Study.hid=Hospital_Visit.hid
Pid PatientName DoctorName exam status
1 Jill John jjj sfvn
2 rosy John fks jdjd
Correct output I want:
Pid   PatientName   DoctorName  exam  status
1      Jill           John     jjj   sfvn
2      rosy           Jack     fks   jdjd
I am getting a wrong result: the doctor name is repeated because of the inner join on hid between Hospital_Visit and Hospital_Study.
How can I tackle this problem?
(DTU Edit - Current sample data in usable form):
create table Hospital_Visit(hid int,pid int,HospitalName char(3), DoctorId int)
insert into Hospital_Visit(hid, pid, HospitalName, DoctorId) values
(41, 1, 'abc', 1),
(42, 2, 'xyx', 2)
create table Patient_Master(pid int, PatientName char(4))
insert into Patient_Master(pid, PatientName) values
(1, 'jill'),
(2, 'rosy')
create table Doctor_Master(DoctorID int, DoctorName char(4))
insert into Doctor_Master(DoctorID, DoctorName) values
(1, 'John'),
(2, 'Jack')
create table Hospital_Study(sid int, hid int, exam char(3), status char(4))
insert into Hospital_Study(sid, hid, exam, status) values
(1, 41, 'jjj' ,'sfvn'),
(2, 41, 'fks' ,'jdjd')
With the sample data given right now (revision 4), it's impossible to get the output you want.
Right now, your query returns this:
Pid PatientName DoctorName
1 jill John
1 jill John
What you want is this:
//Correct output i want
Pid PatientName DoctorName exam status
1 Jill John jjj sfvn
2 rosy Jack fks jdjd
...but the data in the Hospital_Study table doesn't match this, because both lines have hid = 41:
Hospital_Study
sid hid exam status
1 41 jjj sfvn
2 41 fks jdjd
So they both reference the first row from the Hospital_Visit table, which belongs to the patient named "Jill".
--> with this data, it's impossible to select the patient named "rosy", because there is no row in the Hospital_Study table that refers to rosy's visit (hid = 42).
To get the desired output, the data in the Hospital_Study would need to look like this:
Hospital_Study
sid hid exam status
1 41 jjj sfvn
2 42 fks jdjd
/\
||
this is different
With this data, and the exact query from your question, you get this result:
Pid PatientName DoctorName
1 jill John
2 rosy Jack
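The effect of that single hid value can be reproduced with a short Python sketch that runs the query from the question against an in-memory SQLite database, once with the original Hospital_Study rows and once with the corrected ones. SQLite and the helper name run_query are assumptions made for illustration; the ORDER BY is added only to make the row order deterministic.

```python
import sqlite3

def run_query(study_rows):
    """Load the sample data with the given Hospital_Study rows and run the question's query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE Hospital_Visit (hid INT, pid INT, HospitalName CHAR(3), DoctorId INT);
    INSERT INTO Hospital_Visit VALUES (41, 1, 'abc', 1), (42, 2, 'xyx', 2);
    CREATE TABLE Patient_Master (pid INT, PatientName CHAR(4));
    INSERT INTO Patient_Master VALUES (1, 'jill'), (2, 'rosy');
    CREATE TABLE Doctor_Master (DoctorID INT, DoctorName CHAR(4));
    INSERT INTO Doctor_Master VALUES (1, 'John'), (2, 'Jack');
    CREATE TABLE Hospital_Study (sid INT, hid INT, exam CHAR(3), status CHAR(4));
    """)
    conn.executemany("INSERT INTO Hospital_Study VALUES (?, ?, ?, ?)", study_rows)
    return conn.execute("""
        SELECT Hospital_Visit.pid, PatientName, DoctorName
        FROM Patient_Master
        INNER JOIN Hospital_Visit ON Hospital_Visit.pid = Patient_Master.pid
        INNER JOIN Doctor_Master ON Doctor_Master.DoctorID = Hospital_Visit.DoctorId
        INNER JOIN Hospital_Study ON Hospital_Study.hid = Hospital_Visit.hid
        ORDER BY Hospital_Study.sid
    """).fetchall()

# Both studies point at visit 41 (jill), so rosy never appears.
original = run_query([(1, 41, 'jjj', 'sfvn'), (2, 41, 'fks', 'jdjd')])
# Second study points at visit 42 (rosy).
corrected = run_query([(1, 41, 'jjj', 'sfvn'), (2, 42, 'fks', 'jdjd')])

print(original)
print(corrected)
```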
I have a doubt about your join:
inner join Hospital_Study on Hospital_Study.hid=Hospital_Visit.hid
Hospital_Study.hid is a foreign key, that's right, but is Hospital_Visit.hid a primary key or a foreign key?
If Hospital_Visit.hid is a foreign key, then you have to add one more inner join on your hospital master table (Hospital_Master).