How to get data based on Case condition and MAX Date - sql

I have some data:
CREATE TABLE #table (RID VARCHAR(10),
CommType INT,
CommunicationType INT,
VALUE VARCHAR(20),
lastDate Datetime)
INSERT INTO #table (RID, CommType, CommunicationType, VALUE, lastDate)
VALUES
('00WAAS', 3, 0, 'mohan#gmail', '2012-06-15 15:23:49.653'),
('00WAAS', 3, 1, 'manasa#gmail', '2015-08-15 15:23:49.653'),
('00WAAS', 3, 2, 'mother#gmail', '2014-09-15 15:23:49.653'),
('00WAAS', 3, 2, 'father#gmail', '2016-01-15 15:23:49.653'),
('00WAAS', 3, 0, 'hello#gmail', '2013-01-15 15:23:49.653')
My query:
SELECT
TT.RID,
COALESCE(Homemail, BusinessMail, OtherMail) Mail
FROM
(SELECT
RID, MAX(Homemail) Homemail,
MAX(BusinessMail) BusinessMail,
MAX(OtherMail) OtherMail
FROM
(SELECT
RID,
CASE
WHEN CommType = 3 AND CommunicationType = 0 THEN VALUE
END AS Homemail,
CASE
WHEN CommType = 3 AND CommunicationType = 1 THEN VALUE
END AS BusinessMail,
CASE
WHEN CommType = 3 AND CommunicationType = 2 THEN VALUE
END AS OtherMail,
lastDate
FROM
#table) T
GROUP BY RID) TT
What I'm expecting
For CommType = 3, I need the value with the latest date for CommunicationType = 0. If there is no data for CommType = 3 and CommunicationType = 0, I need the latest value for CommunicationType = 1, and if there is no data for CommunicationType = 1 either, the latest value for CommunicationType = 2.
I have tried CASE, MAX and COALESCE. In short:
If data is present for CommunicationType = 0, get the CommunicationType = 0 value with the latest date.
If no data is present for CommunicationType = 0, get the CommunicationType = 1 value with the latest date.
If no data is present for CommunicationType = 1 either, get the CommunicationType = 2 value with the latest date.

I'm not entirely sure I've understood the requirement. But I think you want:
One record returned for each RID.
The returned record should have a CommType of 3.
If there is more than one record with a CommType 3 you want the record with the lowest CommunicationType.
If there is still more than one record you want the one with the most recent lastDate.
This query uses the window function ROW_NUMBER to rank the available records within a subquery. PARTITION BY ensures each RID is ranked separately. The outer query returns all records with a rank of 1.
Query
SELECT
r.*
FROM
(
/* For each RID We want the lowest communication type with
* the most recent last date.
*/
SELECT
ROW_NUMBER() OVER (PARTITION BY RID ORDER BY CommunicationType, lastDate DESC) AS rn,
*
FROM
#table
WHERE
CommType = 3
) AS r
WHERE
r.rn = 1
;
Next Steps
This query is ok but could be better. For example what would happen if two records had a matching CommType, CommunicationType and lastDate? Reading up on the differences between ROW_NUMBER, RANK, DENSE_RANK and NTILE will help you figure out your options here.
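To make the tie question concrete, here is a minimal sketch, assuming SQL Server. The table variable @demo and its rows are hypothetical; two rows deliberately share the same CommunicationType and lastDate. Using VALUE as the extra sort key is just one arbitrary but deterministic choice of tiebreaker.

```sql
-- Hypothetical sample: the first two rows tie on CommunicationType and lastDate.
DECLARE @demo TABLE (RID VARCHAR(10), CommType INT, CommunicationType INT,
                     VALUE VARCHAR(20), lastDate DATETIME);
INSERT INTO @demo VALUES
    ('00WAAS', 3, 0, 'first@example',  '2013-01-15T15:23:49'),
    ('00WAAS', 3, 0, 'second@example', '2013-01-15T15:23:49'),
    ('00WAAS', 3, 1, 'other@example',  '2015-08-15T15:23:49');

SELECT r.RID, r.VALUE, r.rn, r.rk
FROM (
    SELECT d.*,
           -- ROW_NUMBER needs an extra sort key to break the tie predictably.
           ROW_NUMBER() OVER (PARTITION BY d.RID
                              ORDER BY d.CommunicationType, d.lastDate DESC, d.VALUE) AS rn,
           -- RANK gives tied rows the same number, so both can be returned.
           RANK() OVER (PARTITION BY d.RID
                        ORDER BY d.CommunicationType, d.lastDate DESC) AS rk
    FROM @demo AS d
    WHERE d.CommType = 3
) AS r
WHERE r.rk = 1;   -- use r.rn = 1 instead to keep exactly one row per RID
```

Filtering on rk = 1 returns both tied rows; filtering on rn = 1 returns only one of them, chosen by the extra sort key.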

If I understood you correctly, use ROW_NUMBER() :
SELECT tt.RID,COALESCE(tt.Homemail,tt.businessMail,tt.OtherMail)
FROM(
select s.RID,
MAX(CASE WHEN s.CommType = 3 AND s.CommunicationType = 0 THEN s.VALUE END) AS Homemail,
MAX(CASE WHEN s.CommType = 3 AND s.CommunicationType = 1 THEN s.VALUE END) AS BusinessMail,
MAX(CASE WHEN s.CommType = 3 AND s.CommunicationType = 2 THEN s.VALUE END) AS OtherMail
from (SELECT t.*,ROW_NUMBER() OVER(PARTITION BY t.rid,t.communicationType ORDER BY t.lastDate DESC) AS rnk
FROM #table t
WHERE t.commType = 3) s
WHERE s.rnk = 1
GROUP BY s.rid) tt

Related

SQL to query historical table that the count of the number of times in the column is 1

I'm not even sure what to call this type of query and that's why the title might be misleading. Here's what I want to do. We have a history table that goes like this
id, mod_date, is_active
1, 2022-06-22:12:00:00, 1
1, 2022-06-22:13:00:00, 0
2, 2022-06-22:12:00:00, 0
3, 2022-07-07:00:00:00, 1
is_active means that the record was made active. For example, row 1 was made active at 2022-06-22:12:00:00 and then was made inactive at 13:00:00.
What I want is to get only the row that was made inactive on a specific day and not made active again on that day. I came up with this query
select distinct(id)
from history
where is_active = 0
and cast(mod_date as date) = '2022-06-22'
It would return 1 and 2. But I only want 2, because 1 was toggled between states. So I want to find all ids that were made inactive on a specific day and never made active on that day, excluding any that were toggled the same day.
You may phrase this using exists logic:
SELECT *
FROM history h1
WHERE is_active = 0 AND mod_date::date = '2022-06-22' AND
NOT EXISTS (SELECT 1
FROM history h2
WHERE h2.mod_date::date = '2022-06-22' AND
h2.id = h1.id AND h2.is_active = 1);
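A variation on the same NOT EXISTS idea, as a sketch for PostgreSQL. The temp table below is a hypothetical copy of the question's rows, with the timestamps written in a standard format. Comparing mod_date against a half-open range instead of casting it lets a plain index on mod_date be used, if one exists.

```sql
-- Hypothetical setup mirroring the question's rows.
CREATE TEMP TABLE history (id int, mod_date timestamp, is_active int);
INSERT INTO history VALUES
    (1, '2022-06-22 12:00:00', 1),
    (1, '2022-06-22 13:00:00', 0),
    (2, '2022-06-22 12:00:00', 0),
    (3, '2022-07-07 00:00:00', 1);

-- Deactivated on the day, with no activation on the same day.
SELECT h1.id
FROM history h1
WHERE h1.is_active = 0
  AND h1.mod_date >= DATE '2022-06-22'
  AND h1.mod_date <  DATE '2022-06-23'
  AND NOT EXISTS (SELECT 1
                  FROM history h2
                  WHERE h2.id = h1.id
                    AND h2.is_active = 1
                    AND h2.mod_date >= DATE '2022-06-22'
                    AND h2.mod_date <  DATE '2022-06-23');
```

On these rows the query returns only id 2.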
Count how many times an id has been activated and deactivated in a day. From the result select the ones that have been deactivated once and activated zero times.
with the_historical_table(id, mod_date, is_active) as
(
values
(1, '2022-06-22:12:00:00', 1),
(1, '2022-06-22:13:00:00', 0),
(2, '2022-06-22:12:00:00', 0),
(3, '2022-07-07:00:00:00', 1)
)
select id, mod_date from
(
select id, mod_date::date,
count(*) filter (where is_active = 1) activated,
count(*) filter (where is_active = 0) deactivated
from the_historical_table
group by id, mod_date::date
) t
where activated = 0 and deactivated = 1;
Result:
id | mod_date
2  | 2022-06-22
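If several deactivations on the same day should also qualify, the counting can be folded into a HAVING clause. This is a sketch for PostgreSQL, not a drop-in replacement: unlike the deactivated = 1 filter, it accepts any number of deactivations as long as there is no activation that day. The CTE name history and the column alias mod_day are assumptions for illustration.

```sql
WITH history(id, mod_date, is_active) AS (
    VALUES
        (1, timestamp '2022-06-22 12:00:00', 1),
        (1, timestamp '2022-06-22 13:00:00', 0),
        (2, timestamp '2022-06-22 12:00:00', 0),
        (3, timestamp '2022-07-07 00:00:00', 1)
)
SELECT id, mod_date::date AS mod_day
FROM history
GROUP BY id, mod_date::date
HAVING bool_and(is_active = 0);   -- true only if every event that day is a deactivation
```

With the sample rows this also returns just id 2 for 2022-06-22.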
What I want is to get only the row that was made inactive on a
specific day and not made active again on that day
Partition the rows with partition by id, mod_date::date order by id, mod_date.
In an ordered set 1 0 1, the middle row (0) has both lead and lag equal to 1. You don't want this situation in the partition.
Consider 3 cases:
The partition has only one row with is_active = 0, which means both lead and lag are NULL.
The partition has multiple rows.
The partition has multiple rows, and the ordered set is multiple 1s followed by multiple 0s.
The following code computes each of these 3 cases and combines them with UNION ALL.
WITH cte AS (
SELECT
*,
lag(is_active, 1) OVER w,
lead(is_active, 1) OVER w,
first_value(is_active) OVER (PARTITION BY id,
mod_date::date ORDER BY id,
mod_date DESC)
FROM test1
WINDOW w AS (PARTITION BY id,
mod_date::date ORDER BY id,
mod_date)) (
SELECT
id,
mod_date,
is_active
FROM
cte
WHERE (lead = 0
OR lead IS NULL)
AND (lag = 1)
AND is_active = 0
ORDER BY
id,
mod_date)
UNION ALL (
SELECT
id,
mod_date,
is_active
FROM
cte
WHERE
lead IS NULL
AND lag IS NULL
AND is_active = 0)
UNION ALL (
SELECT
id,
mod_date,
is_active
FROM
cte
WHERE
lead = 0
AND lag IS NULL
AND is_active = 0
AND first_value != 1)
ORDER BY
id,
mod_date;
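To see what lag and lead return for the 1 0 1 sequence described above, here is a small PostgreSQL sketch. The events CTE and the prev_state / next_state aliases are hypothetical names used only for this illustration.

```sql
WITH events(id, mod_date, is_active) AS (
    VALUES
        (1, timestamp '2022-06-22 12:00:00', 1),
        (1, timestamp '2022-06-22 13:00:00', 0),
        (1, timestamp '2022-06-22 14:00:00', 1)
)
SELECT id, mod_date, is_active,
       lag(is_active)  OVER w AS prev_state,  -- NULL on the first row of the day
       lead(is_active) OVER w AS next_state   -- NULL on the last row of the day
FROM events
WINDOW w AS (PARTITION BY id, mod_date::date ORDER BY mod_date)
ORDER BY id, mod_date;
```

The middle row (is_active = 0) comes back with prev_state = 1 and next_state = 1, which is the toggled situation the answer filters out.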

How to merge two or more rows of same table into columns base on some conditions in SQL?

I want to get the result as merge of two or more rows in the same table based on some conditions.
Table Rows
Here I want to merge the rows with the same Trade Head into a single row, ignoring the Interview Location, as in the Expected Result Merge Format (image below). The result table's column count will depend on the number of interview locations.
Expected Result Merge Format
Explanation On Merge Table
There are nine columns because the Welder trade has three interview locations, and each location's selection counts are appended at the end.
I have found some solutions, such as using CASE, but in my case the data in Trade Head and Interview Location is not fixed. Tomorrow someone may add ASP.Net Programmer or Software Tester as required.
The following script works for a maximum of 3 rows per trade head. For more rows, you need to add more logic to the query. Note that without an ID column there is nothing to determine which row of a Trade Head is older or newer: the values will still land in 9 different columns, but the order of the SelCnt columns and values will not be consistent. If the table has an ID or any auto-increment column, everything will work like a charm.
WITH CTE([Trade Head], [Select Count On Date 1], [Select Count On Date 2], [Select Count On Date 3],C1,C2,C3)
AS
(
SELECT [Trade Head], [Select Count On Date 1], [Select Count On Date 2],[Select Count On Date 3],
CASE WHEN RN = 1 THEN RN+0 WHEN RN = 2 THEN RN+2 WHEN RN = 3 THEN RN+4 END C1,
CASE WHEN RN = 1 THEN RN+1 WHEN RN = 2 THEN RN+3 WHEN RN = 3 THEN RN+5 END C2,
CASE WHEN RN = 1 THEN RN+2 WHEN RN = 2 THEN RN+4 WHEN RN = 3 THEN RN+6 END C3
FROM
(
SELECT *,
ROW_NUMBER() OVER (
PARTITION BY [Trade Head]
ORDER BY [ID]
-- use ORDER BY [Trade Head] if no ID column exists
) RN
FROM your_table
) A
)
SELECT [Trade Head],
SUM(CASE WHEN C1 = 1 THEN [Select Count On Date 1] ELSE NULL END) SelCnt1,
SUM(CASE WHEN C2 = 2 THEN [Select Count On Date 2] ELSE NULL END) SelCnt2,
SUM(CASE WHEN C3 = 3 THEN [Select Count On Date 3] ELSE NULL END) SelCnt3,
SUM(CASE WHEN C1 = 4 THEN [Select Count On Date 1] ELSE NULL END) SelCnt4,
SUM(CASE WHEN C2 = 5 THEN [Select Count On Date 2] ELSE NULL END) SelCnt5,
SUM(CASE WHEN C3 = 6 THEN [Select Count On Date 3] ELSE NULL END) SelCnt6,
SUM(CASE WHEN C1 = 7 THEN [Select Count On Date 1] ELSE NULL END) SelCnt7,
SUM(CASE WHEN C2 = 8 THEN [Select Count On Date 2] ELSE NULL END) SelCnt8,
SUM(CASE WHEN C3 = 9 THEN [Select Count On Date 3] ELSE NULL END) SelCnt9
FROM CTE
GROUP BY [Trade Head]
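The three CASE expressions in the CTE just assign each value a slot number from 1 to 9. As a sketch (SQL Server syntax, with hypothetical names RN, k and slot), the same numbers follow from one arithmetic expression, which avoids adding a WHEN branch for every extra row:

```sql
-- RN = row number within a trade head, k = position of the count column (1..3).
SELECT v.RN, v.k,
       (v.RN - 1) * 3 + v.k AS slot   -- RN=1 -> 1..3, RN=2 -> 4..6, RN=3 -> 7..9
FROM (VALUES (1,1),(1,2),(1,3),
             (2,1),(2,2),(2,3),
             (3,1),(3,2),(3,3)) AS v(RN, k)
ORDER BY slot;
```

The output columns (SelCnt1 to SelCnt9) still have to be listed explicitly in the final SELECT, so this only shortens the CTE.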
use aggregation
select Thead,max(Selcnt1),max(selcnt2),....max(selcnN) from table
group by Thead

Extracting rows from fact table which have missing data, using metadata table

I have a situation where I have:
A fact table with an id column which is NOT unique but is never null. This fact table also has many other dimensions (columns), which may hold the default value -1 (which logically means null).
Example:
id | Dimension1 | Dimension2 | Dimension3
1  | Value      | -1         | Value
1  | -1         | -1         | Value
2  | -1         | Value      | Value
A metadata table that has the same dimensions as the fact table. Each row in this table represents a unique id from the fact table. The rest of the columns are populated with either null or 1, where 1 means that this dimension is a required dimension in the fact table for this id.
Example:
id | Dimension1 | Dimension2 | Dimension3
1  | 1          | NULL       | 1
2  | NULL       | 1          | 1
My goal is to get ONLY the rows from the fact table that are missing required information according to the metadata table. So from the examples above I would get only the row with id = 1 where Dimension1 = -1, since metadata table says for id = 1 dimensions 1 and 3 are required.
Is there an easy way of doing this?
I have made a very complicated query where there is join between these two tables and a case checks between all dimensions (I have more than 100 of them). Then these checks assign a -1 if dimension is missing in fact but is required, and there is an outer query that would sum these for all rows and only pick up rows with negative sum.
It does not work 100% and I think it's way too complicated to run on a really big fact table, so I'm open to ideas.
edit: Dynamic SQL is not allowed :(
I would suggest using a cte and an except query... in this solution, you will have to add the cases as well, but the join seems far more simple to me and you don't need to sum up any dummy values...
CREATE TABLE #t (
id int, Dimension1 int, Dimension2 int, Dimension3 int
)
CREATE TABLE #tMeta (
id int, Dimension1 int, Dimension2 int, Dimension3 int
)
INSERT INTO #t VALUES (1, 123, -1, 345), (1, -1, -1, 246), (2, -1, 567, 987)
INSERT INTO #tMeta VALUES (1, 1, NULL, 1), (2, NULL, 1, 1)
;WITH cte AS(
SELECT id,
CASE WHEN Dimension1 = -1 THEN NULL ELSE 1 END Dimension1,
CASE WHEN Dimension2 = -1 THEN NULL ELSE 1 END Dimension2,
CASE WHEN Dimension3 = -1 THEN NULL ELSE 1 END Dimension3
FROM #t
EXCEPT
SELECT *
FROM #tMeta
EXCEPT
SELECT id, ISNULL(Dimension1,1), ISNULL(Dimension2,1), ISNULL(Dimension3,1)
FROM #tMeta
)
SELECT t.*
FROM #t t
JOIN cte c ON t.id = c.id
AND CASE WHEN t.Dimension1 = -1 THEN -1 ELSE 1 END = ISNULL(c.Dimension1, -1)
AND CASE WHEN t.Dimension2 = -1 THEN -1 ELSE 1 END = ISNULL(c.Dimension2, -1)
AND CASE WHEN t.Dimension3 = -1 THEN -1 ELSE 1 END = ISNULL(c.Dimension3, -1)
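For comparison, the requirement can also be written as a plain join with one OR branch per dimension. This is a minimal sketch assuming SQL Server; the table variables @fact and @meta are hypothetical stand-ins for the two tables, loaded with the same sample rows. It needs no dynamic SQL, but with more than 100 dimensions the WHERE clause gets long.

```sql
DECLARE @fact TABLE (id INT, Dimension1 INT, Dimension2 INT, Dimension3 INT);
DECLARE @meta TABLE (id INT, Dimension1 INT, Dimension2 INT, Dimension3 INT);
INSERT INTO @fact VALUES (1, 123, -1, 345), (1, -1, -1, 246), (2, -1, 567, 987);
INSERT INTO @meta VALUES (1, 1, NULL, 1), (2, NULL, 1, 1);

-- A fact row is incomplete if any required dimension holds the -1 default.
SELECT f.*
FROM @fact AS f
JOIN @meta AS m ON m.id = f.id
WHERE (m.Dimension1 = 1 AND f.Dimension1 = -1)
   OR (m.Dimension2 = 1 AND f.Dimension2 = -1)
   OR (m.Dimension3 = 1 AND f.Dimension3 = -1);
```

On the sample data this returns the single row (1, -1, -1, 246).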
You can use UNPIVOT to simplify the query. Your fact table has no row ID, so the first CTE uses ROW_NUMBER() to create one. Then we unpivot both tables (fact and template) and join them:
WITH TFBase AS
(
SELECT TF.*, ROW_NUMBER() OVER (ORDER BY ID) as TableRowID FROM TF
),
TFU AS
(
select id,TableRowID,dim,val
from TFBase
unpivot
(
val for dim in (Dimension1, Dimension2, Dimension3)
) u
WHERE U.Val <>-1
)
,
TFT AS
(
select id,dim,val
from TTemplate
unpivot
(
val for dim in (Dimension1, Dimension2, Dimension3)
) u
WHERE Val is NOT NULL
)
SELECT * FROM TFBase WHERE
TableRowID IN
(
SELECT TableRowID FROM TFU
LEFT JOIN TFT ON
(TFU.id=TFT.id) AND (TFU.dim = TFT.dim)
GROUP BY TableRowID, TFU.ID
HAVING COUNT(TFT.Val) <> (SELECT COUNT(*) FROM TFT WHERE ID = TFU.ID)
)
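Another way to unpivot without the UNPIVOT operator is a correlated VALUES list, which pairs each fact value with its metadata flag row by row. A sketch assuming SQL Server, with hypothetical table variables @fact and @meta and hypothetical column names fact_val and required:

```sql
DECLARE @fact TABLE (id INT, Dimension1 INT, Dimension2 INT, Dimension3 INT);
DECLARE @meta TABLE (id INT, Dimension1 INT, Dimension2 INT, Dimension3 INT);
INSERT INTO @fact VALUES (1, 123, -1, 345), (1, -1, -1, 246), (2, -1, 567, 987);
INSERT INTO @meta VALUES (1, 1, NULL, 1), (2, NULL, 1, 1);

SELECT f.*
FROM @fact AS f
JOIN @meta AS m ON m.id = f.id
WHERE EXISTS (
    -- One (value, flag) pair per dimension for the current fact row.
    SELECT 1
    FROM (VALUES (f.Dimension1, m.Dimension1),
                 (f.Dimension2, m.Dimension2),
                 (f.Dimension3, m.Dimension3)) AS d(fact_val, required)
    WHERE d.required = 1 AND d.fact_val = -1
);
```

Because the check is an EXISTS per fact row, no generated row ID is needed, and a row with several missing dimensions is still returned only once.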

Calculation of occurrence of strings

I have a table with 3 columns: id, name and vote. It is populated with many records. I need to return the record with the best balance of votes. The vote types are 'yes' and 'no'.
Yes -> Plus 1
No -> Minus 1
This column vote is a string column. I am using SQL SERVER.
Example:
It must return Ann for me
Use conditional aggregation to tally the votes, as Kannan suggests in his answer.
If you really only want 1 record then you can do it like so:
SELECT TOP 1
name
,SUM(CASE WHEN vote = 'yes' THEN 1 ELSE -1 END) AS VoteTotal
FROM
#Table
GROUP BY
name
ORDER BY
VoteTotal DESC
This will not show ties. To handle them, you can rank the totals instead: filter on RowNum to get exactly one result, or on RankNum to include ties.
;WITH cteVoteTotals AS (
SELECT
name
,SUM(CASE WHEN vote = 'yes' THEN 1 ELSE -1 END) AS VoteTotal
,ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY SUM(CASE WHEN vote = 'yes' THEN 1 ELSE -1 END) DESC) as RowNum
,DENSE_RANK() OVER (PARTITION BY 1 ORDER BY SUM(CASE WHEN vote = 'yes' THEN 1 ELSE -1 END) DESC) as RankNum
FROM
#Table
GROUP BY
name
)
SELECT name, VoteTotal
FROM
cteVoteTotals
WHERE
RowNum = 1
--RankNum = 1 --if you want with ties use this line instead
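If ties are wanted, SQL Server's TOP ... WITH TIES is a shorter alternative to the ranking CTE. A minimal sketch; the table variable @votes and its rows are hypothetical and chosen so that two names tie for first place.

```sql
DECLARE @votes TABLE (id INT, name VARCHAR(25), vote VARCHAR(4));
INSERT INTO @votes (id, name, vote) VALUES
    (1, 'John', 'no'), (2, 'John', 'no'),  (3, 'John', 'yes'),
    (4, 'Ann',  'no'), (5, 'Ann',  'yes'), (6, 'Ann',  'yes'),
    (7, 'Matt', 'no'), (8, 'Matt', 'yes'), (9, 'Matt', 'yes');

-- WITH TIES keeps every row whose ORDER BY value equals that of the last row returned.
SELECT TOP (1) WITH TIES
       name,
       SUM(CASE WHEN vote = 'yes' THEN 1 ELSE -1 END) AS VoteTotal
FROM @votes
GROUP BY name
ORDER BY VoteTotal DESC;
```

Here both Ann and Matt come back with a VoteTotal of 1.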
Here is the test data used. In the future, do NOT just post an image of your test data; spend the 2 minutes to make a temp table or a table variable so that the people you are asking for help do not have to!
CREATE TABLE #Table (id INT, name VARCHAR(25), vote VARCHAR(4))
INSERT INTO #Table (id, name, vote)
VALUES (1, 'John','no'),(2, 'John','no'),(3, 'John','yes')
,(4, 'Ann','no'),(5, 'Ann','yes'),(6, 'Ann','yes')
,(9, 'Marie','no'),(8, 'Marie','no'),(7, 'Marie','yes')
,(10, 'Matt','no'),(11, 'Matt','yes'),(12, 'Matt','yes')
Use this code,
;with cte as (
select id, name, case when vote = 'yes' then 1 else -1 end as votenum from register
) select name, sum(votenum) from cte group by name
You can get the maximum or minimum from this result.
This one gives the 'yes' rate for each person:
SELECT Name, 1.0 * SUM(CASE WHEN Vote = 'Yes' THEN 1 ELSE 0 END) / COUNT(*) AS Rate
FROM My_Table
GROUP BY Name
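One detail matters for a rate like this in SQL Server: dividing one integer by another truncates the result, so at least one operand has to be a decimal. A small sketch with a hypothetical table variable @votes that shows both behaviours side by side:

```sql
DECLARE @votes TABLE (name VARCHAR(25), vote VARCHAR(4));
INSERT INTO @votes VALUES ('Ann', 'no'), ('Ann', 'yes'), ('Ann', 'yes');

SELECT name,
       -- Integer division: 2 / 3 is truncated to 0.
       SUM(CASE WHEN vote = 'yes' THEN 1 ELSE 0 END) / COUNT(*) AS int_rate,
       -- Averaging decimal 1.0 / 0.0 flags keeps the fraction (about 0.67 here).
       AVG(CASE WHEN vote = 'yes' THEN 1.0 ELSE 0.0 END) AS dec_rate
FROM @votes
GROUP BY name;
```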

Count of Type using if then else

I have the following table, in which I need to count type = D as 1, but if that ID ends with R then it should count as 0.
ID 123 always starts with D; it can then be R, and then D or A.
ID  | Decision Dt | Type/Status
123 | 1/15/2014   | D
123 | 1/20/2014   | A
123 | 1/15/2014   | R
I have written the SQL as sum(if(type=d)then 1 else 0 end). I get the right count until the type/status is R. This is the only ID in the DB that ends with status R and has not moved back to D. I need help writing the SQL.
Thanks for the help in advance.
You are interested in all records with type D and R. You also need to mark the last record somehow. This can be done with RANK. Rank 1 for the latest entry per ID.
Then group by ID and see if there is a last record with type R for that ID. If so the result is 0, else count type D.
select
id,
case when max(case when last_is_one = 1 and type = 'R' then 1 end) = 1 then
0
else
count(case when type = 'D' then 1 end)
end as d_count
from
(
select
id,
type,
rank() over (partition by id order by decision_dt desc) as last_is_one
from mytable
where type in ('D','R')
) d_and_r
group by id;
SQL fiddle: http://www.sqlfiddle.com/#!3/48c820/3.
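To check both branches of the CASE, the query can be run against a small sample. This is a sketch assuming SQL Server 2008 or later; the table variable @mytable and its rows are hypothetical, with one ID that ends on R and one that moves back to D.

```sql
DECLARE @mytable TABLE (id INT, decision_dt DATE, type CHAR(1));
INSERT INTO @mytable VALUES
    (123, '2014-01-15', 'D'), (123, '2014-01-18', 'R'),   -- ends with R
    (456, '2014-01-15', 'D'), (456, '2014-01-18', 'R'),
    (456, '2014-01-20', 'D');                              -- moved back to D

SELECT id,
       CASE WHEN MAX(CASE WHEN last_is_one = 1 AND type = 'R' THEN 1 END) = 1
            THEN 0                                         -- latest D/R record is an R
            ELSE COUNT(CASE WHEN type = 'D' THEN 1 END)    -- otherwise count the D records
       END AS d_count
FROM (
    SELECT id, type,
           RANK() OVER (PARTITION BY id ORDER BY decision_dt DESC) AS last_is_one
    FROM @mytable
    WHERE type IN ('D', 'R')
) AS d_and_r
GROUP BY id;
```

ID 123 gets a d_count of 0 and ID 456 gets 2. Note that RANK gives records with the same decision date the same rank, so a D and an R on the same latest date both count as "last" and the result is 0.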