Create Rows to Column in (RAPID) SQL without PIVOT

I am using the query below to pull up a list of accounts and the optional codes that go with them. There are 95 codes for each account, not just the 2 I am showing in the results below.
SELECT DISTINCT Ref1.ACCOUNT_ID AS Acct_Numb,
       Current_Date AS DATA_DATE,
       Cat.OPTIONAL_CTGRY_CD AS Code,
       Cat.OPTIONAL_CTGRY_CD || ' - ' || Cat.OPTIONAL_CTGRY_NM AS Code_Combo,
       Class.OPTIONAL_CLASS_CD AS Code_Answer,
       Class.OPTIONAL_CLASS_NM AS Code_Answer_Desc
FROM xxxxx.zzzzzz_OPT_REF Ref1
-- Once a table has an alias, the join conditions must use the alias
LEFT JOIN xxxxx.zzzzzz_OPT_CATEGORY Cat
       ON Ref1.OPTIONAL_CTGRY_CD = Cat.OPTIONAL_CTGRY_CD
LEFT JOIN xxxxx.zzzzzz_OPT_CLASS Class
       ON Ref1.OPTIONAL_CLASS_CD = Class.OPTIONAL_CLASS_CD
      AND Cat.OPTIONAL_CTGRY_CD = Class.OPTIONAL_CTGRY_CD
LEFT JOIN xxxxx.HRTVACT_PCS Acct
       ON Ref1.ACCOUNT_ID = Acct.ACCOUNTID
-- IN keeps the status filter applied to both accounts (AND binds tighter than OR)
WHERE Acct.ACCOUNTSTATUS = 'OPEN'
  AND Acct.ACCOUNTID IN ('123456', '654321')
ORDER BY Acct_Numb ASC, Code ASC;
Here are the results:
|DATA_DATE|ACCT_NUMB|CODE|CODE_COMBO   |CODE_ANSWER|CODE_ANSWER_DESC|
|---------|---------|----|-------------|-----------|----------------|
|11/8/2016|123456   |1   |1 - Reporting|0          |NOT APPLICABLE  |
|11/8/2016|123456   |2   |2 - System   |4          |SYSTEM 2        |
|11/8/2016|654321   |1   |1 - Reporting|3          |APPLIED         |
|11/8/2016|654321   |2   |2 - System   |3          |N/A             |
I need to create the results as a pivot table that looks like the table below. The column headers come from CODE and CODE_COMBO, and the cells hold CODE_ANSWER and CODE_ANSWER_DESC.
|DATA_DATE|ACCT_NUMB|1|1 - Reporting |2|2 - System|
|---------|---------|-|--------------|-|----------|
|11/8/2016|123456   |0|NOT APPLICABLE|4|SYSTEM 2  |
|11/8/2016|654321   |3|APPLIED       |3|N/A       |
I have not tried this before and I am stumped.

I have accomplished this in the past by using table aliases to join a table to itself, once for each output column. I have only done it with 4 or 5 columns, not 95, so there may be a better way that I am not aware of.
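As a rough illustration of the self-join idea, here is a minimal sketch run through Python's built-in sqlite3 module rather than the asker's database. The table opt_answers and its column names are hypothetical stand-ins for the output of the query in the question, and only the two sample codes are pivoted.

```python
import sqlite3

# Hypothetical flattened table standing in for the result of the asker's query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE opt_answers (
        acct_numb TEXT, code INTEGER, code_answer INTEGER, code_answer_desc TEXT
    );
    INSERT INTO opt_answers VALUES
        ('123456', 1, 0, 'NOT APPLICABLE'),
        ('123456', 2, 4, 'SYSTEM 2'),
        ('654321', 1, 3, 'APPLIED'),
        ('654321', 2, 3, 'N/A');
""")

# One aliased copy of the table per code: c1 supplies code 1, c2 supplies code 2.
pivot_sql = """
    SELECT a.acct_numb,
           c1.code_answer, c1.code_answer_desc,
           c2.code_answer, c2.code_answer_desc
    FROM (SELECT DISTINCT acct_numb FROM opt_answers) AS a
    LEFT JOIN opt_answers AS c1
           ON c1.acct_numb = a.acct_numb AND c1.code = 1
    LEFT JOIN opt_answers AS c2
           ON c2.acct_numb = a.acct_numb AND c2.code = 2
    ORDER BY a.acct_numb
"""
rows = conn.execute(pivot_sql).fetchall()
for row in rows:
    print(row)
```

Each additional code needs one more aliased join, which is why this approach becomes unwieldy at 95 codes.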

Related

How to bring results from rows into columns with respective values in SQL

I currently have the below results table:
|Company ID|External ID|Attribute|Int Value|
|----------|-----------|---------|---------|
|1         |101        |Calls    |3        |
|1         |101        |Emails   |14       |
|1         |101        |Accounts |4        |
|2         |102        |Calls    |2        |
|2         |102        |Emails   |17       |
|2         |102        |Accounts |5        |
And I would like to transform my query results to show as below:
|Company ID|External ID|Calls|Emails|Accounts|
|----------|-----------|-----|------|--------|
|1         |101        |3    |14    |4       |
|2         |102        |2    |17    |5       |
Is this possible and if so, how would I do this? I'm a new SQL user and can't seem to figure this out :)
This is my current query to get my results:
SELECT
ic.company_id,
ic.external_id,
icca.attribute,
icca.int_value
FROM
intercom_companies AS ic
LEFT JOIN intercom_companies_custom_attributes AS icca
ON ic.company_id = icca.company_id
A pivot query should work here:
SELECT
ic.company_id,
ic.external_id,
MAX(CASE WHEN icca.attribute = 'Calls'
THEN icca.int_value END) AS Calls,
MAX(CASE WHEN icca.attribute = 'Emails'
THEN icca.int_value END) AS Emails,
MAX(CASE WHEN icca.attribute = 'Accounts'
THEN icca.int_value END) AS Accounts
FROM intercom_companies AS ic
LEFT JOIN intercom_companies_custom_attributes AS icca
ON ic.company_id = icca.company_id
GROUP BY
ic.company_id,
ic.external_id;
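To try the conditional-aggregation pivot end to end, the sketch below loads the sample rows into an in-memory SQLite database using Python's sqlite3 module. The column types are assumptions (the question does not give a schema), and generating the MAX(CASE ...) columns from a Python list is only an illustration of how the pattern scales to more attributes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema; the question does not show the real column types.
    CREATE TABLE intercom_companies (company_id INTEGER, external_id INTEGER);
    CREATE TABLE intercom_companies_custom_attributes (
        company_id INTEGER, attribute TEXT, int_value INTEGER
    );
    INSERT INTO intercom_companies VALUES (1, 101), (2, 102);
    INSERT INTO intercom_companies_custom_attributes VALUES
        (1, 'Calls', 3), (1, 'Emails', 14), (1, 'Accounts', 4),
        (2, 'Calls', 2), (2, 'Emails', 17), (2, 'Accounts', 5);
""")

# Fixed, trusted list of attribute names; do not build SQL from user input this way.
attributes = ["Calls", "Emails", "Accounts"]
pivot_columns = ",\n       ".join(
    f"MAX(CASE WHEN icca.attribute = '{name}' THEN icca.int_value END) AS {name}"
    for name in attributes
)

query = f"""
SELECT ic.company_id,
       ic.external_id,
       {pivot_columns}
FROM intercom_companies AS ic
LEFT JOIN intercom_companies_custom_attributes AS icca
       ON ic.company_id = icca.company_id
GROUP BY ic.company_id, ic.external_id
ORDER BY ic.company_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

A company with no row for a given attribute gets NULL in that column, because the CASE expression yields NULL for every non-matching row and MAX ignores NULLs.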

Join three tables and retrieve the expected result

I have 3 tables. User Accounts, IncomingSentences and AnnotatedSentences. Annotators annotate the incoming sentences and tag an intent to it. Then, admin reviews those taggings and makes the corrections on the tagged intent.
DB-Fiddle Playground link: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=00a770173fa0568cce2c482643de1d79
Assuming myself as the admin, I want to pull the error report per annotator.
My tables are as follows:
User Accounts table:
|userId|userEmail      |userRole|
|------|---------------|--------|
|1     |user1#gmail.com|editor  |
|2     |user2#gmail.com|editor  |
|3     |user3#gmail.com|editor  |
|4     |user4#gmail.com|admin   |
|5     |user5#gmail.com|admin   |
Incoming Sentences Table
|sentenceId|sentence  |createdAt |
|----------|----------|----------|
|1         |sentence1 |2021-01-01|
|2         |sentence2 |2021-01-01|
|3         |sentence3 |2021-01-02|
|4         |sentence4 |2021-01-02|
|5         |sentence5 |2021-01-03|
|6         |sentence6 |2021-01-03|
|7         |sentence7 |2021-02-01|
|8         |sentence8 |2021-02-01|
|9         |sentence9 |2021-02-02|
|10        |sentence10|2021-02-02|
|11        |sentence11|2021-02-03|
|12        |sentence12|2021-02-03|
Annotated Sentences Table
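|id|annotatorId|sentenceId|annotatedIntent|
|--|-----------|----------|---------------|
|1 |1          |1         |intent1        |
|2 |4          |1         |intent2        |
|3 |2          |2         |intent4        |
|4 |3          |4         |intent4        |
|5 |1          |5         |intent2        |
|6 |3          |3         |intent3        |
|7 |5          |3         |intent2        |
|8 |1          |6         |intent4        |
|9 |4          |6         |intent1        |
|10|1          |7         |intent1        |
|11|4          |7         |intent3        |
|12|3          |9         |intent3        |
|13|2          |10        |intent3        |
|14|5          |10        |intent1        |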
Expected Output:
I want a table showing, for each editor, the total number of sentences they annotated and how many of those sentences were then corrected by an admin. I don't want the admins' own tagging counts in this table; if admin rows do appear, their totalAdminCorrected should be 0.
|userEmail |totalTagged|totalAdminCorrected|
|---------------|------------|---------------------|
|user1#gmail.com| 4 | 3 |
|user2#gmail.com| 2 | 1 |
|user3#gmail.com| 3 | 1 |
Query I wrote: I've tried my best. You can see that in the DB-Fiddle
My query is not resulting in the expected output. Requesting your help to achieve this.
My proposal...
SELECT UserEmail, SUM(EDICount), SUM(ADMCount)
FROM (SELECT UserAccounts.UserEmail, AnnotatedSentences.SentenceID, COUNT(*) AS EDICount
      FROM AnnotatedSentences
      LEFT JOIN UserAccounts ON UserAccounts.UserID=AnnotatedSentences.AnnotatorID
      WHERE UserRole='editor'
      GROUP BY UserAccounts.UserEmail, AnnotatedSentences.SentenceID) AS EDI
LEFT JOIN (SELECT AnnotatedSentences.SentenceID, COUNT(*) AS ADMCount
           FROM AnnotatedSentences
           LEFT JOIN UserAccounts ON UserAccounts.UserID=AnnotatedSentences.AnnotatorID
           WHERE UserRole='admin'
           GROUP BY AnnotatedSentences.SentenceID) AS ADM ON EDI.SentenceID=ADM.SentenceID
GROUP BY UserEmail
Because a sentence may be tagged by users with different roles, you can use a subquery (an INNER JOIN between user_accounts and annotated_sentences) with window functions and conditional aggregation to get the counts per sentence.
If you don't want to see the admins' counts, filter those rows out in the WHERE clause.
SELECT user_email,
       count(Total_Tagged) Total_Tagged,
       SUM(totalAdmin) totalAdmin
FROM (
    SELECT ist.sentence_id,
           user_email,
           user_role,
           count(CASE WHEN a.user_role = 'editor' THEN 1 END) over(partition by ist.sentence_id)
             + count(CASE WHEN a.user_role = 'admin' THEN 1 END) over(partition by ist.sentence_id) Total_Tagged,
           count(CASE WHEN a.user_role = 'admin' THEN 1 END) over(partition by ist.sentence_id) totalAdmin
    FROM user_accounts a
    INNER JOIN annotated_sentences ats
            ON a.user_id = ats.annotator_id
    INNER JOIN incoming_sentences ist
            ON ist.sentence_id = ats.sentence_id
) t1
WHERE user_role = 'editor'
GROUP BY user_email
ORDER BY user_email
Okay, I really rushed this, so there might still be an error in the code, but try something like this:
SELECT
    a.user_email,
    count(ist) Total_Tagged,
    sum(innerTable.edits)
FROM
    incoming_sentences ist
    JOIN annotated_sentences ats
      ON ist.sentence_id = ats.sentence_id
    JOIN user_accounts a
      ON a.user_id = ats.annotator_id
    LEFT JOIN ( SELECT ics.sentence_id, count(anno.id) AS edits
                FROM annotated_sentences anno
                LEFT JOIN user_accounts ua
                       ON ua.user_id = anno.annotator_id
                LEFT JOIN incoming_sentences AS ics
                       ON ics.sentence_id = anno.sentence_id
                WHERE user_role LIKE 'admin'
                GROUP BY ics.sentence_id ) AS innerTable
      ON innerTable.sentence_id = ist.sentence_id
GROUP BY a.user_email
The inner select should count how many admin edits there are per sentence; the outer one then sums up that number over every sentence a user annotated.
If it is guaranteed that one sentence can only be annotated once and only be reviewed once, then you can simply group by sentence and get the editor and admin. Then you group by editor and count.
select
  editor,
  count(*) as total_tagged,
  count(admin) as total_admin_corrected
from
(
  select
    max(ua.user_email) filter (where ua.user_role = 'editor') as editor,
    max(ua.user_email) filter (where ua.user_role = 'admin') as admin
  from annotated_sentences ans
  join user_accounts ua on ua.user_id = ans.annotator_id
  group by ans.sentence_id
) with_editor_and_admin
group by editor
order by editor;
Demo: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=e409ec49af25ac8329a99b02161832fb

How to 'create' NULL data in Teradata SQL for non existing relations

I have 2 tables, one lists features with a feature value that an account might or might not have (TBL_Feat), the other lists the accounts (TBL_Acct).
I'm looking for a query to give me all features for every account, and if the feature doesn't exist for this account, a line with the feature but with NULL as value. My list of features is fixed, so that's no concern.
Tbl_Feat
FEATURE_ID FEATURE_VALUE ACCOUNT_NBR
1 3 100
1 4 101
1 6 102
2 4 102
Tbl_Acct
Account_nbr
100
101
102
103
What I'm expecting to see is a result like this:
Account_nbr FEATURE_ID FEATURE_VALUE
100 1 3
100 2 null
101 1 4
101 2 null
102 1 6
102 2 4
103 1 null
103 2 null
One additional question: would anything change in your answer if there is a feature that does not appear in the Tbl_Feat table at all? E.g. FEATURE_ID = 3 in my example here.
Use a cross join to generate the rows and left join to bring in the values:
select a.account_nbr, f.feature_id, tf.feature_value
from tbl_acct a cross join
     (select distinct feature_id from tbl_feat) f left join
     tbl_feat tf
     on tf.account_nbr = a.account_nbr and
        tf.feature_id = f.feature_id
order by a.account_nbr, f.feature_id;
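On the follow-up about a feature that never appears in Tbl_Feat: the derived table above can only list features that occur at least once, so FEATURE_ID = 3 would be missing. One possible way around that, sketched here in SQLite through Python's sqlite3 module rather than Teradata, is to cross join against a separate list of features. The feature_list table is a hypothetical helper that is not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_feat (feature_id INTEGER, feature_value INTEGER, account_nbr INTEGER);
    CREATE TABLE tbl_acct (account_nbr INTEGER);
    -- Hypothetical helper table holding the fixed list of features.
    CREATE TABLE feature_list (feature_id INTEGER);

    INSERT INTO tbl_feat VALUES (1, 3, 100), (1, 4, 101), (1, 6, 102), (2, 4, 102);
    INSERT INTO tbl_acct VALUES (100), (101), (102), (103);
    INSERT INTO feature_list VALUES (1), (2), (3);
""")

# Cross join builds every (account, feature) pair; the left join fills in
# the value where one exists and leaves NULL (None in Python) otherwise.
rows = conn.execute("""
    SELECT a.account_nbr, f.feature_id, tf.feature_value
    FROM tbl_acct AS a
    CROSS JOIN feature_list AS f
    LEFT JOIN tbl_feat AS tf
           ON tf.account_nbr = a.account_nbr
          AND tf.feature_id = f.feature_id
    ORDER BY a.account_nbr, f.feature_id
""").fetchall()

for row in rows:
    print(row)
```

With four accounts and three listed features this yields twelve rows, and every row for feature 3 has a NULL value.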

SQL Remove row when other rows are subset, keep row when no subset

I have a data set of 300,000 rows, looking at harvested acreage in the United States. Some, but not all of my data is double counted and I am trying to remove the double counting. The data looks like this:
Year | State | Crop | Practice | Acres Harvested | Acres
-------------------------------------------------------------
2008 1 1 1 1000 or more 40
2008 1 1 1 1000 to 1999 10
2008 1 1 1 2000 to 2999 30
2008 2 1 1 1000 or more 87
2008 3 2 2 1.0 to 14.9 15
2008 3 2 2 1.0 to 4.9 5
2008 3 2 2 5.0 to 14.9 10
Some of the rows are subsets for other rows in the [Acres Harvested] column (rows 2 and 3 are a subset of row 1 and rows 6 and 7 are a subset of row 5). In situations where I have more detailed information for [Acres Harvested] (rows 2 and 3 provide more detail than row 1), I would like to keep the detailed information (row 2 and 3) and omit the general information (row 1). In other scenarios, I only have the general information (row 4), so that is what I will keep.
I am having trouble writing the code to omit the general information when the detailed information is present, but to keep the general information when the more detailed information does not exist.
I've been trying to write an "inner join" to join my table back with itself, but am unsure of how to omit rows when certain conditions are met. What I have:
SELECT *
FROM A
INNER JOIN (SELECT *
FROM A
GROUP BY [YEAR], [STATE], [CROP], [PRACTICE]
HAVING COUNT (*) > 1) AS B
ON A.Year = B.Year
AND A.State = B.State
AND A.Crop = B.Crop
AND A.Practice = B.Practice
And now I'm stuck...
Results should look like:
Year | State | Crop | Practice | Acres Harvested | Acres
-------------------------------------------------------------
2008 1 1 1 1000 to 1999 10
2008 1 1 1 2000 to 2999 30
2008 2 1 1 1000 or more 87
2008 3 2 2 1.0 to 4.9 5
2008 3 2 2 5.0 to 14.9 10
Appreciate any help!
Your question is a bit vague. This will return the result set you've specified for the input data you've specified:
select a.*
from a
where a.acres_harvested not like '% or more'
   or not exists (select 1
                  from a a2
                  where a2.year = a.year
                    and a2.state = a.state
                    and a2.crop = a.crop
                    and a2.acres_harvested like '[0-9]%to%[0-9]'
                 );
Assuming your criterion for "more detailed information" is, as I guessed in my comment, rows in a matched set that don't end in "or more", you can get your desired output this way. Handle the sets with only one row and the sets with multiple rows separately and UNION them, instead of trying to do it all in one SELECT.
-- Sets with a single row: keep that row as-is
SELECT A.*
FROM A
INNER JOIN
    (SELECT [YEAR], [STATE], [CROP], [PRACTICE]
     FROM A
     GROUP BY [YEAR], [STATE], [CROP], [PRACTICE]
     HAVING COUNT (*) = 1
    ) AS S
    ON  A.[Year] = S.[Year]
    AND A.[State] = S.[State]
    AND A.[Crop] = S.[Crop]
    AND A.[Practice] = S.[Practice]
UNION
-- Sets with several rows: drop the general "or more" row
SELECT A.*
FROM A
INNER JOIN
    (SELECT [YEAR], [STATE], [CROP], [PRACTICE]
     FROM A
     GROUP BY [YEAR], [STATE], [CROP], [PRACTICE]
     HAVING COUNT (*) > 1
    ) AS B
    ON  A.[Year] = B.[Year]
    AND A.[State] = B.[State]
    AND A.[Crop] = B.[Crop]
    AND A.[Practice] = B.[Practice]
WHERE A.[Acres Harvested] NOT LIKE '%or more'
If your criteria aren't what I guessed, just change the WHERE clause.
Given your updated sample data, you're also going to have to check for overlapping number ranges. This question has some options on how to do that: "Discard existing dates that are included in the result, SQL Server". You'll also need to split your "X to Y" values into two numeric fields.
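To make the range idea concrete, here is a small sketch in plain Python rather than SQL. It splits each label into a numeric low and high bound and drops any row whose range fully contains another row's range in the same Year/State/Crop/Practice group. The function names are hypothetical, and treating "N or more" as an open-ended range is an assumption about the data.

```python
import math
from collections import defaultdict

def parse_range(label):
    """Split '1000 to 1999' into (1000.0, 1999.0); 'N or more' gets an infinite upper bound."""
    if label.endswith(" or more"):
        return float(label[: -len(" or more")]), math.inf
    low, high = label.split(" to ")
    return float(low), float(high)

def drop_general_rows(rows):
    """Keep a row unless its range fully contains a different range in the same group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[:4]].append(row)  # key: (year, state, crop, practice)

    kept = []
    for members in groups.values():
        ranges = [parse_range(row[4]) for row in members]
        for row, (low, high) in zip(members, ranges):
            contains_other = any(
                low <= other_low and other_high <= high
                and (low, high) != (other_low, other_high)
                for other_low, other_high in ranges
            )
            if not contains_other:
                kept.append(row)
    return kept

# Sample data from the question: (year, state, crop, practice, acres harvested, acres)
data = [
    (2008, 1, 1, 1, "1000 or more", 40),
    (2008, 1, 1, 1, "1000 to 1999", 10),
    (2008, 1, 1, 1, "2000 to 2999", 30),
    (2008, 2, 1, 1, "1000 or more", 87),
    (2008, 3, 2, 2, "1.0 to 14.9", 15),
    (2008, 3, 2, 2, "1.0 to 4.9", 5),
    (2008, 3, 2, 2, "5.0 to 14.9", 10),
]

result = drop_general_rows(data)
for row in result:
    print(row)
```

On the sample data this keeps the five rows listed in the expected result. The same containment test could be written in SQL once the two bounds are stored as numeric columns.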

Query to find Cumulative while subtracting other counts

Here is my table structure
Id INT
RecId INT
Dated DATETIME
Status INT
and here is my data.
Status table (contains different statuses)
Id Status
1 Created
2 Assigned
Log table (contains logs for the different statuses that a record went through (RecId))
Id RecId Dated Status
1 1 2013-12-09 14:16:31.930 1
2 7 2013-12-09 14:27:26.620 1
3 1 2013-12-09 14:27:26.620 2
3 8 2013-12-10 11:14:13.747 1
3 9 2013-12-10 11:14:13.747 1
3 8 2013-12-10 11:14:13.747 2
I need to generate a report from this data in the following format.
Dated Created Assigned
2013-12-09 2 1
2013-12-10 3 1
Each row is calculated per date. Created is calculated as (previous date's Created count - previous date's Assigned count) + today's Created count.
For example, on 2013-12-10 three entries were made to the log table, of which two have the status Created and one has the status Assigned. In the view I want to build for the report, Created for 2013-12-10 is therefore 2 + 1 = 3, where 2 is the number of newly inserted Created records and 1 is the count remaining from the previous day (Created - Assigned = 2 - 1).
I hope the scenario is clear. Please ask me if further information is required.
Please help me with the sql to construct the above view.
This matches the expected result for the provided sample, but may require more testing.
with CTE as (
    select
        *
      , row_number() over(order by dt ASC) as rn
    from (
        select
            cast(created.dated as date) as dt
          , count(created.status) as Created
          , count(Assigned.status) as Assigned
          , count(created.status)
            - count(Assigned.status) as Delta
        from LogTable created
        left join LogTable assigned
               on created.RecId = assigned.RecId
              and created.status = 1
              and assigned.Status = 2
              and created.Dated <= assigned.Dated
        where created.status = 1
        group by
            cast(created.dated as date)
    ) x
)
select
    dt.dt
  , dt.created + coalesce(nxt.delta,0) as created
  , dt.assigned
from CTE dt
-- despite its name, nxt is the row for the preceding date (rn - 1)
left join CTE nxt on dt.rn = nxt.rn+1
;
Result:
| DT | CREATED | ASSIGNED |
|------------|---------|----------|
| 2013-12-09 | 2 | 1 |
| 2013-12-10 | 3 | 1 |
See this SQLFiddle demo
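The rule stated in the question can also be checked outside the database. The following is a minimal Python sketch, assuming that the unassigned balance (Created - Assigned) carries forward from one date to the next and that Assigned counts the status-2 log entries dated on that day. The function name build_report is hypothetical. Note that it differs from the query above, which looks back only one date and attributes an assignment to the date the record was created.

```python
from collections import defaultdict

CREATED, ASSIGNED = 1, 2

# Sample log rows from the question: (RecId, Dated, Status)
log_entries = [
    (1, "2013-12-09 14:16:31.930", 1),
    (7, "2013-12-09 14:27:26.620", 1),
    (1, "2013-12-09 14:27:26.620", 2),
    (8, "2013-12-10 11:14:13.747", 1),
    (9, "2013-12-10 11:14:13.747", 1),
    (8, "2013-12-10 11:14:13.747", 2),
]

def build_report(entries):
    """Return (date, created, assigned) rows, carrying the unassigned balance forward."""
    per_day = defaultdict(lambda: {CREATED: 0, ASSIGNED: 0})
    for _rec_id, dated, status in entries:
        if status in (CREATED, ASSIGNED):
            per_day[dated[:10]][status] += 1  # first 10 characters are the date part

    report = []
    carry = 0  # Created - Assigned left over from the previous date
    for day in sorted(per_day):
        created = carry + per_day[day][CREATED]
        assigned = per_day[day][ASSIGNED]
        report.append((day, created, assigned))
        carry = created - assigned
    return report

report = build_report(log_entries)
for row in report:
    print(row)
```

For 2013-12-10 this gives 2 new records plus a carried balance of 2 - 1 = 1, i.e. 3, which agrees with the worked example in the question.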