SQL Query for repeat advertisers - sql

I want to identify first-time and repeat advertisers.
This is the code I have written,
but I am getting duplicate rows: the advertiser's first order is flagged as a new customer and the subsequent ones as repeat.
SELECT DISTINCT (order_id )
, advertiser_id
, advertiser_name
, order_start_date
, source_type
, MIN(order_start_date) OVER (PARTITION BY advertiser_id) AS firstorderdate
, CASE WHEN (order_start_date) = (firstorderdate) THEN 1
ELSE 0
END AS isNewCustomer
FROM advertising;
Current output
Advertiser isnewcustomer
A 0
A 1
B 0
C 0
D 1
D 1
Expected output
Advertiser isnewcustomer
A 1
B 0
C 0
D 1

You can try this. It counts the orders per advertiser and flags the advertisers that have more than one:
select subqry.advertiser_id,
case when subqry.adv_count > 1 Then 1
ELSE 0
END as isNewCustomer FROM
( select advertiser_id, count(advertiser_id) as adv_count
from advertising
group by advertiser_id ) subqry;
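As a minimal sketch of the same idea without the subquery: the table layout below is assumed from the question, while the sample rows and the `first_order_date` and `order_count` column names are made up for illustration. Grouping by advertiser returns one row per advertiser, which is what removes the duplicates. As in the expected output, the flag is 1 for advertisers with more than one order.

```sql
-- Hypothetical sample data; only the columns needed for the flag are included.
CREATE TABLE advertising (
    order_id         INT,
    advertiser_id    VARCHAR(10),
    order_start_date DATE
);

INSERT INTO advertising (order_id, advertiser_id, order_start_date) VALUES
    (1, 'A', '2021-01-05'),
    (2, 'A', '2021-03-10'),
    (3, 'B', '2021-02-01'),
    (4, 'C', '2021-02-15'),
    (5, 'D', '2021-01-20'),
    (6, 'D', '2021-04-02');

-- One row per advertiser: first order date, number of orders, and the flag.
SELECT advertiser_id,
       MIN(order_start_date)                    AS first_order_date,
       COUNT(*)                                 AS order_count,
       CASE WHEN COUNT(*) > 1 THEN 1 ELSE 0 END AS isNewCustomer
FROM advertising
GROUP BY advertiser_id
ORDER BY advertiser_id;

-- Flag per advertiser for the sample rows above:
-- A 1
-- B 0
-- C 0
-- D 1
```

Date literals written as plain strings are accepted by many engines but not all; adjust them to your dialect if needed.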

Related

Select table adding columns with data depending on duplicates in other column

Imagine this data.
Id Type
1 A
1 B
1 B
2 A
3 B
I want to select from the table and add two columns, turning it into this. How can I do it? (In Teradata)
Id  Type  Id with both A+B  Id with only A
1   A     1                 0
1   B     1                 0
1   B     1                 0
2   A     0                 1
3   B     0                 0
I'm not familiar with Teradata, but in standard SQL the following query should work:
SELECT
T.*,
CASE WHEN Cnt = 2 THEN 1 ELSE 0 END AS BOTH_TYPES_PRESENT,
CASE WHEN Cnt = 1 AND Type = 'A' THEN 1 ELSE 0 END AS ONLY_A_PRESENT
FROM T
LEFT JOIN (
SELECT Id, COUNT(DISTINCT Type) Cnt FROM T WHERE Type IN ('A', 'B') GROUP BY Id
) CNT ON T.Id = CNT.Id;
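A possible alternative that avoids the join is to compute per-Id presence flags with window functions. This is only a sketch in generic SQL and has not been checked on Teradata; the table name `id_types` and the column names `has_a`, `has_b`, `both_a_b` and `only_a` are made up for illustration, and `Type` may need quoting in some dialects.

```sql
-- Sample data from the question, in a hypothetical table.
CREATE TABLE id_types (
    Id   INT,
    Type VARCHAR(1)
);

INSERT INTO id_types (Id, Type) VALUES
    (1, 'A'),
    (1, 'B'),
    (1, 'B'),
    (2, 'A'),
    (3, 'B');

-- has_a / has_b are 1 when the Id has at least one row of that type.
SELECT Id,
       Type,
       CASE WHEN has_a = 1 AND has_b = 1 THEN 1 ELSE 0 END AS both_a_b,
       CASE WHEN has_a = 1 AND has_b = 0 THEN 1 ELSE 0 END AS only_a
FROM (
    SELECT Id,
           Type,
           MAX(CASE WHEN Type = 'A' THEN 1 ELSE 0 END) OVER (PARTITION BY Id) AS has_a,
           MAX(CASE WHEN Type = 'B' THEN 1 ELSE 0 END) OVER (PARTITION BY Id) AS has_b
    FROM id_types
) flagged
ORDER BY Id, Type;

-- Rows for the sample data (Id, Type, both_a_b, only_a):
-- 1 A 1 0
-- 1 B 1 0
-- 1 B 1 0
-- 2 A 0 1
-- 3 B 0 0
```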

Adding a dummy identifier to data that varies by position and value

I am working on a project in SQL Server with diagnosis codes. A patient can have up to 4 codes (but may have only 1), and a patient cannot have the same code more than once. However, codes can occur in any order. My goal is to count how many times a diagnosis code appears in total, as well as how often it appears in each position.
My data currently resembles the following:
PtKey  Order #  Order Date  Diagnosis1  Diagnosis2  Diagnosis3  Diagnosis 4
345    1527     7/12/20     J44.9       R26.2       NULL        NULL
367    1679     7/12/20     R26.2       H27.2       G47.34      NULL
325    1700     7/12/20     G47.34      NULL        NULL        NULL
327    1710     7/12/20     I26.2       J44.9       G47.34      NULL
I think the best approach would be to create a dummy column that matches each diagnosis with its position: for example, Diagnosis 1 with A, Diagnosis 2 with B, etc.
My current plan is to roll up the diagnoses using an unpivot:
UNPIVOT ( Diag for ColumnALL IN (Diagnosis1, Diagnosis2, Diagnosis3, Diagnosis4)) as unpvt
However, this still doesn’t provide a way to count the diagnoses by position on a sales order.
I want it to look like this:
Diagnosis  Total Count  Diag1 Count  Diag2 Count  Diag3 Count  Diag4 Count
J44.9      2            1            1            0            0
R26.2      1            1            0            0            0
H27.2      1            0            1            0            0
I26.2      1            1            0            0            0
G47.34     3            1            0            2            0
You can unpivot using apply and aggregate:
select v.diagnosis, count(*) as cnt,
sum(case when pos = 1 then 1 else 0 end) as pos_1,
sum(case when pos = 2 then 1 else 0 end) as pos_2,
sum(case when pos = 3 then 1 else 0 end) as pos_3,
sum(case when pos = 4 then 1 else 0 end) as pos_4
from data d cross apply
(values (diagnosis1, 1),
(diagnosis2, 2),
(diagnosis3, 3),
(diagnosis4, 4)
) v(diagnosis, pos)
where v.diagnosis is not null
group by v.diagnosis;
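To try the apply-based unpivot end to end, here is a small self-contained sketch for SQL Server. The temp table name `#orders` and the column names `OrderNo`, `OrderDate` and `total_count` are assumptions made for illustration, and the date 7/12/20 is assumed to mean 12 July 2020.

```sql
-- Hypothetical temp table holding the sample rows from the question.
CREATE TABLE #orders (
    PtKey      INT,
    OrderNo    INT,
    OrderDate  DATE,
    Diagnosis1 VARCHAR(10),
    Diagnosis2 VARCHAR(10),
    Diagnosis3 VARCHAR(10),
    Diagnosis4 VARCHAR(10)
);

INSERT INTO #orders VALUES
    (345, 1527, '2020-07-12', 'J44.9',  'R26.2', NULL,     NULL),
    (367, 1679, '2020-07-12', 'R26.2',  'H27.2', 'G47.34', NULL),
    (325, 1700, '2020-07-12', 'G47.34', NULL,    NULL,     NULL),
    (327, 1710, '2020-07-12', 'I26.2',  'J44.9', 'G47.34', NULL);

-- Turn each order into up to four (diagnosis, position) rows,
-- then count per diagnosis overall and per position.
SELECT v.diagnosis,
       COUNT(*)                                   AS total_count,
       SUM(CASE WHEN v.pos = 1 THEN 1 ELSE 0 END) AS pos_1,
       SUM(CASE WHEN v.pos = 2 THEN 1 ELSE 0 END) AS pos_2,
       SUM(CASE WHEN v.pos = 3 THEN 1 ELSE 0 END) AS pos_3,
       SUM(CASE WHEN v.pos = 4 THEN 1 ELSE 0 END) AS pos_4
FROM #orders AS o
CROSS APPLY (VALUES (o.Diagnosis1, 1),
                    (o.Diagnosis2, 2),
                    (o.Diagnosis3, 3),
                    (o.Diagnosis4, 4)
            ) AS v(diagnosis, pos)
WHERE v.diagnosis IS NOT NULL   -- skip unused diagnosis slots
GROUP BY v.diagnosis
ORDER BY v.diagnosis;
```

The position is carried as a literal next to each column in the VALUES list, so no separate dummy column has to be stored in the table.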
Another way is to use UNPIVOT to transform the columns into groupable entities:
SELECT Diagnosis, [Total Count] = COUNT(*),
[Diag1 Count] = SUM(CASE WHEN DiagGroup = N'Diagnosis1' THEN 1 ELSE 0 END),
[Diag2 Count] = SUM(CASE WHEN DiagGroup = N'Diagnosis2' THEN 1 ELSE 0 END),
[Diag3 Count] = SUM(CASE WHEN DiagGroup = N'Diagnosis3' THEN 1 ELSE 0 END),
[Diag4 Count] = SUM(CASE WHEN DiagGroup = N'Diagnosis4' THEN 1 ELSE 0 END)
FROM
(
SELECT * FROM #x UNPIVOT (Diagnosis FOR DiagGroup IN
([Diagnosis1],[Diagnosis2],[Diagnosis3],[Diagnosis4])) up
) AS x GROUP BY Diagnosis;
Example db<>fiddle
You can also manually unpivot via UNION before doing the conditional aggregation:
SELECT Diagnosis, COUNT(*) As [Total Count]
, SUM(CASE WHEN Position = 1 THEN 1 ELSE 0 END) As [Diag1 Count]
, SUM(CASE WHEN Position = 2 THEN 1 ELSE 0 END) As [Diag2 Count]
, SUM(CASE WHEN Position = 3 THEN 1 ELSE 0 END) As [Diag3 Count]
, SUM(CASE WHEN Position = 4 THEN 1 ELSE 0 END) As [Diag4 Count]
FROM
(
SELECT PtKey, Diagnosis1 As Diagnosis, 1 As Position
FROM [MyTable]
UNION ALL
SELECT PtKey, Diagnosis2 As Diagnosis, 2 As Position
FROM [MyTable]
WHERE Diagnosis2 IS NOT NULL
UNION ALL
SELECT PtKey, Diagnosis3 As Diagnosis, 3 As Position
FROM [MyTable]
WHERE Diagnosis3 IS NOT NULL
UNION ALL
SELECT PtKey, Diagnosis4 As Diagnosis, 4 As Position
FROM [MyTable]
WHERE Diagnosis4 IS NOT NULL
) d
GROUP BY Diagnosis
Borrowing Aaron's fiddle to avoid rebuilding the schema from scratch, we get this:
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=d1f7f525e175f0f066dd1749c49cc46d

Oracle query with group

I have a scenario where I need to fetch all the records within an ID for the same source. Given below is my input set of records
ID SOURCE CURR_FLAG TYPE
1 IBM Y P
1 IBM Y OF
1 IBM Y P
2 IBM Y P
2 TCS Y P
3 IBM NULL P
3 IBM NULL P
3 IBM NULL P
4 IBM NULL OF
4 IBM NULL OF
4 IBM Y ON
From the above data, I need to select all the records with source IBM within the same ID group. If there is at least one record in the ID group with a source other than IBM, then I don't want any record from that ID group. Also, we need to fetch only those ID groups where at least one record has curr_fl='Y'.
In the above scenario, even though ID=3 has IBM as its source, there is no record with CURR_FL='Y', so my query should not fetch it. In the case of ID=4, it can fetch all the records with ID=4, as one of the records has the value 'Y'.
Also, within a group that satisfies the above conditions, I need one more condition on source_type: if there are records with source_type='P', then I need to fetch only those records. If there are no records with P, then I will look for source_type='OF', and otherwise source_type='ON'.
I have written the query given below, but it runs for a long time and does not fetch any results. Is there a better way to write this query?
select
ID,
SOURCE,
CURR_FL,
TYPE
from TABLE a
where
not exists(select 1 from TABLE B where a.ID = B.ID and source <> 'IBM')
and exists(select 1 from TABLE C where a.ID = C.ID and CURR_FL = 'Y') and
(TYPE, ID) IN (
select case type when 1 then 'P' when 2 then 'OF' else 'ON' END TYPE,ID from
(select ID,
max(priority) keep (dense_rank first order by priority asc) as type
from ( select ID,TYPE,
case TYPE
when 'P' then 1
when 'OF' then 2
when 'ON' then 3
end as priority
from TABLE where ID
in(select ID from TABLE where CURR_FL='Y') AND SOURCE='IBM')
group by ID))
I think you can just do a single aggregation over your table by ID and check for the yes flag as well as assert that no non IBM source appears. I do this in a CTE below, and then join back to your original table to return full matching records.
WITH cte AS (
SELECT
ID,
CASE WHEN SUM(CASE WHEN TYPE = 'P' THEN 1 ELSE 0 END) > 0
THEN 1
WHEN SUM(CASE WHEN TYPE = 'OF' THEN 1 ELSE 0 END) > 0
THEN 2
WHEN SUM(CASE WHEN TYPE = 'ON' THEN 1 ELSE 0 END) > 0
THEN 3 ELSE 4 END AS p_type
FROM yourTable
GROUP BY ID
HAVING
SUM(CASE WHEN CURR_FLAG = 'Y' THEN 1 ELSE 0 END) > 0 AND
SUM(CASE WHEN SOURCE <> 'IBM' THEN 1 ELSE 0 END) = 0
)
SELECT t1.*
FROM yourTable t1
INNER JOIN cte t2
ON t1.ID = t2.ID
WHERE
t2.p_type = 1 AND t1.TYPE = 'P' OR
t2.p_type = 2 AND t1.TYPE = 'OF' OR
t2.p_type = 3 AND t1.TYPE = 'ON';
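The same rules can also be expressed in one pass with window functions instead of a join back to the table. This is a sketch, not a tuned solution: the table name `src_records` and the derived column names `non_ibm_cnt`, `y_cnt`, `prio` and `best_prio` are hypothetical, and the rows are the sample data from the question.

```sql
-- Hypothetical table with the sample rows from the question.
CREATE TABLE src_records (
    ID        INT,
    SOURCE    VARCHAR(10),
    CURR_FLAG VARCHAR(1),
    TYPE      VARCHAR(2)
);

INSERT INTO src_records VALUES (1, 'IBM', 'Y',  'P');
INSERT INTO src_records VALUES (1, 'IBM', 'Y',  'OF');
INSERT INTO src_records VALUES (1, 'IBM', 'Y',  'P');
INSERT INTO src_records VALUES (2, 'IBM', 'Y',  'P');
INSERT INTO src_records VALUES (2, 'TCS', 'Y',  'P');
INSERT INTO src_records VALUES (3, 'IBM', NULL, 'P');
INSERT INTO src_records VALUES (3, 'IBM', NULL, 'P');
INSERT INTO src_records VALUES (3, 'IBM', NULL, 'P');
INSERT INTO src_records VALUES (4, 'IBM', NULL, 'OF');
INSERT INTO src_records VALUES (4, 'IBM', NULL, 'OF');
INSERT INTO src_records VALUES (4, 'IBM', 'Y',  'ON');

-- Per ID: count non-IBM rows, count 'Y' rows, and find the best (lowest)
-- type priority; then keep only rows that carry that best priority.
SELECT ID, SOURCE, CURR_FLAG, TYPE
FROM (
    SELECT t.ID, t.SOURCE, t.CURR_FLAG, t.TYPE,
           SUM(CASE WHEN t.SOURCE <> 'IBM' THEN 1 ELSE 0 END)
               OVER (PARTITION BY t.ID) AS non_ibm_cnt,
           SUM(CASE WHEN t.CURR_FLAG = 'Y' THEN 1 ELSE 0 END)
               OVER (PARTITION BY t.ID) AS y_cnt,
           CASE t.TYPE WHEN 'P' THEN 1 WHEN 'OF' THEN 2 WHEN 'ON' THEN 3 END AS prio,
           MIN(CASE t.TYPE WHEN 'P' THEN 1 WHEN 'OF' THEN 2 WHEN 'ON' THEN 3 END)
               OVER (PARTITION BY t.ID) AS best_prio
    FROM src_records t
) x
WHERE non_ibm_cnt = 0
  AND y_cnt > 0
  AND prio = best_prio
ORDER BY ID;

-- Rows kept for the sample data:
-- the two 'P' rows of ID 1 and the two 'OF' rows of ID 4.
```

ID 2 drops out because it contains a TCS row, and ID 3 drops out because none of its rows has CURR_FLAG = 'Y'.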

Checking if the row has the max value in a group

I'm trying to find out if a row has the max value in a group. Here's a really simple example:
Data
VoteCount LocationId UserId
3 1 1
4 1 2
3 2 2
4 2 1
Pseudo-query
select
LocationId,
sum(case
when UserId = 1 /* and has max vote count*/
then 1 else 0
end) as IsUser1Winner,
sum(case
when UserId = 2 /* and has max vote count*/
then 1 else 0
end) as IsUser2Winner
from LocationVote
group by LocationID
It should return:
LocationId IsUser1Winner IsUser2Winner
1 0 1
2 1 1
I also couldn't find a way to generate dynamic column names here. What would be the simplest way to write this query?
You could also do this using a Case statement
WITH CTE as
(SELECT
MAX(VoteCount) max_votes
, LocationId
FROM LocationVote
group by LocationId
)
SELECT
A.LocationId
, Case When UserId=1
THEN 1
ELSE 0
END IsUser1Winner
, Case when UserId=2
THEN 1
ELSE 0
END IsUser2Winner
from LocationVote A
inner join
CTE B
on A.VoteCount = B.max_votes
and A.LocationId = B.LocationId;
Try this:
select t.*
from LocationVote t
cross apply (
select max(ref.VoteCount) max_value
from LocationVote ref
where ref.LocationId = t.LocationId
) votes
where votes.max_value = t.VoteCount;
However, if your table is huge and has no appropriate indexes, performance may be poor.
Another way is to get the max values per group into a table variable or temp table and then join it to the original table.
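To get the one-row-per-location shape from the pseudo-query, one option is to attach the group maximum to every row with a window function and then aggregate. The sketch below assumes the `LocationVote` table from the question and a database with window function support; `max_votes` is a made-up column name.

```sql
-- Sample data from the question.
CREATE TABLE LocationVote (
    VoteCount  INT,
    LocationId INT,
    UserId     INT
);

INSERT INTO LocationVote (VoteCount, LocationId, UserId) VALUES
    (3, 1, 1),
    (4, 1, 2),
    (3, 2, 2),
    (4, 2, 1);

-- max_votes repeats the highest VoteCount of the location on every row,
-- so "has max vote count" becomes a plain comparison inside the CASE.
SELECT LocationId,
       SUM(CASE WHEN UserId = 1 AND VoteCount = max_votes THEN 1 ELSE 0 END) AS IsUser1Winner,
       SUM(CASE WHEN UserId = 2 AND VoteCount = max_votes THEN 1 ELSE 0 END) AS IsUser2Winner
FROM (
    SELECT LocationId,
           UserId,
           VoteCount,
           MAX(VoteCount) OVER (PARTITION BY LocationId) AS max_votes
    FROM LocationVote
) v
GROUP BY LocationId
ORDER BY LocationId;
```

If two users tie for the maximum in a location, both flags are 1 for that location. The column list is still fixed per user; generating one column per user dynamically would need dynamic SQL, which this sketch does not cover.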

SQL: Comparing count of 2 fields with specific value

I have 2 tables: one (Jobs) contains the list of jobs, and the second (Records) contains the details of the records in each job.
Jobs
JobID Count
A 2
B 3
Records
JobID RecordID ToBeProcessed IsProcessed
A A1 1 1
A A2 1 1
B B1 1 1
B B2 1 0
B B3 1 0
How can I create a query that lists all the jobs where the count of records with ToBeProcessed = 1 equals the count of records with IsProcessed = 1? Thanks in advance. Any help is greatly appreciated.
Start with the calculation of the number of items with ToBeProcessed set to 1 or IsProcessed set to one:
SELECT
JobID
, SUM(CASE WHEN ToBeProcessed=1 THEN 1 ELSE 0 END) ToBeProcessedIsOne
, SUM(CASE WHEN IsProcessed=1 THEN 1 ELSE 0 END) IsProcessedIsOne
FROM Records
GROUP BY JobID
This gives you all counts, not only ones where ToBeProcessedIsOne is equal to IsProcessedIsOne. To make sure that you get only the records where the two are the same, use either a HAVING clause, or a nested subquery:
-- HAVING clause
SELECT
JobID
, SUM(CASE WHEN ToBeProcessed=1 THEN 1 ELSE 0 END) ToBeProcessedIsOne
, SUM(CASE WHEN IsProcessed=1 THEN 1 ELSE 0 END) IsProcessedIsOne
FROM Records
GROUP BY JobID
HAVING SUM(CASE WHEN ToBeProcessed=1 THEN 1 ELSE 0 END)=SUM(CASE WHEN IsProcessed=1 THEN 1 ELSE 0 END)
-- Nested subquery with a condition
SELECT * FROM (
SELECT
JobID
, SUM(CASE WHEN ToBeProcessed=1 THEN 1 ELSE 0 END) ToBeProcessedIsOne
, SUM(CASE WHEN IsProcessed=1 THEN 1 ELSE 0 END) IsProcessedIsOne
FROM Records
GROUP BY JobID
) t WHERE ToBeProcessedIsOne = IsProcessedIsOne
Note: if ToBeProcessed and IsProcessed are of a type that does not allow values other than zero or one, you can replace the CASE expression with the name of the column, for example:
SELECT
JobID
, SUM(ToBeProcessed) ToBeProcessedIsOne
, SUM(IsProcessed) IsProcessedIsOne
FROM Records
GROUP BY JobID
HAVING SUM(ToBeProcessed)=SUM(IsProcessed)
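For a quick check of the HAVING approach against the sample data, here is a self-contained sketch. The column types are assumptions, since the question does not state them.

```sql
-- Sample data from the question; INT flags are an assumption.
CREATE TABLE Records (
    JobID         VARCHAR(10),
    RecordID      VARCHAR(10),
    ToBeProcessed INT,
    IsProcessed   INT
);

INSERT INTO Records (JobID, RecordID, ToBeProcessed, IsProcessed) VALUES
    ('A', 'A1', 1, 1),
    ('A', 'A2', 1, 1),
    ('B', 'B1', 1, 1),
    ('B', 'B2', 1, 0),
    ('B', 'B3', 1, 0);

-- Keep only jobs where both per-job counts match.
SELECT JobID
FROM Records
GROUP BY JobID
HAVING SUM(CASE WHEN ToBeProcessed = 1 THEN 1 ELSE 0 END)
     = SUM(CASE WHEN IsProcessed = 1 THEN 1 ELSE 0 END);

-- For the sample rows this returns only job A:
-- A has 2 to-be-processed and 2 processed records,
-- B has 3 to-be-processed but only 1 processed record.
```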
If I'm not misunderstanding your question, it looks like you just need a WHERE clause in your statement to see when they are both equal to 1.
SELECT
r.JobID AS j_id,
r.RecordID as r_id,
r.ToBeProcessed AS tbp,
r.IsProcessed AS ip
FROM Records AS r
WHERE r.ToBeProcessed = 1 AND r.IsProcessed = 1;
Let me know if this is not what you are asking for.
If it's a count from a different table, then just do a count of the tbp and ip rows grouped by JobID, and the WHERE should still do the trick.