Is it possible to get several COUNT in one single SQL request? - sql

I need help to write a simple procedure. Let me explain what I'm trying to do.
I have 3 tables
tJobOffer
tApplication
tApplicationStatus
I would like to create a procedure that return me a list of tJobOffer with the statistics of different status of this tJobOffer. tApplicationStatus is linked to tApplication that is linked to tJobOffer. An application can be CANDIDATE / ACCEPTED / REFUSED / IGNORED / ...
I created this query :
SELECT
[T].[JobOfferId],
[T].[JobOfferTitle],
COUNT([A].[ApplicationId]) AS [CandidateCount]
FROM [tJobOffer] AS [T]
LEFT JOIN [tApplication] AS [A]
INNER JOIN [tApplicationStatus] AS [S]
ON [S].[ApplicationStatusId] = [A].[ApplicationStatusId]
AND [S].[ApplicationStatusTechnicalName] = 'CANDIDATE'
ON [A].[JobOfferId] = [T].[JobOfferId]
GROUP BY
[T].[JobOfferId],
[T].[JobOfferTitle]
ORDER BY [T].[JobOfferTitle] ;
The result is
> 52ED7C67-21E1-49BB-A1F8-0601E6EED1EA Announce a 0
> F26B228D-0C81-4DA8-A287-F8F997CC1F9C Announce b 0
> 9DA60B23-F113-4C7F-9707-2B90C1556D5D Announce c 2
> 258E11A7-79C1-47B6-8C61-413AA54E2360 Announce d 0
> DA582383-5DF4-4E1D-837C-382371BDEF57 Announce e 1
The result is correct. I get my tJoboffers with statistic on status candidate. I have 2 candidates for Announce c and 1 candidate for announce e. If I change my string 'CANDIDATE' to 'ACCEPTED' or 'REFUSED' I can get the statistic on these status. Is it possible to get everything in one request?
Something like
> 52ED7C67-21E1-49BB-A1F8-0601E6EED1EA Announce a 0 0 2
> F26B228D-0C81-4DA8-A287-F8F997CC1F9C Announce b 0 0 1
> 9DA60B23-F113-4C7F-9707-2B90C1556D5D Announce c 2 0 0
> 258E11A7-79C1-47B6-8C61-413AA54E2360 Announce d 0 0 0
> DA582383-5DF4-4E1D-837C-382371BDEF57 Announce e 1 1 0

use SUM and CASE
SELECT
[T].[JobOfferId],
[T].[JobOfferTitle],
SUM(CASE WHEN [S].[ApplicationStatusTechnicalName] = 'CANDIDATE' THEN 1 ELSE 0 END) AS [CandidateCount],
SUM(CASE WHEN [S].[ApplicationStatusTechnicalName] = 'ACCEPTED' THEN 1 ELSE 0 END) AS [ACCEPTEDCount],
SUM(CASE WHEN [S].[ApplicationStatusTechnicalName] = 'REFUSED' THEN 1 ELSE 0 END) AS [REFUSEDCount]
FROM [tJobOffer] AS [T]
LEFT JOIN [tApplication] AS [A]
ON [A].[JobOfferId] = [T].[JobOfferId]
LEFT JOIN [tApplicationStatus] AS [S]
ON [S].[ApplicationStatusId] = [A].[ApplicationStatusId]
GROUP BY
[T].[JobOfferId],
[T].[JobOfferTitle]
ORDER BY [T].[JobOfferTitle] ;

Yes, it is. One way to do that is to use the PIVOT function. The other way to do this would be to use LEFT OUTER JOIN each time you need a count of items, something like that:
SELECT a.JobID, COUNT(b.JobID), COUNT(c.JobID)
FROM AllVacancies as a
LEFT OUTER JOIN
(SELECT JobID from AllVacancies WHERE ApplicationStatus = 'CANDIDATE') as b
ON a.JobID = b.JobID
LEFT OUTER JOIN
(SELECT JobID FROM AllVacancies WHERE ApplicationStatus = 'ACCEPTED') as c
ON a.JobID = cJobID
as many times as the categories that you need.

Yes you can carry as many counts as you want
try this
SELECT COUNT(1),COUNT(2) FROM demoTable;
this will give you the count of no of rows in column 1 and column two
usually this will result the same count unless you have any null values allowesd and existing in any of the column.
If any column has any null value then its count may differ , so basically the idea is to apply count on the primary Key column .
Select count(*) from demoTable ;
this line also results in count values but it applies for the complete table , so performance wise applying count on any particular column is better .
again on the accuracy issue this must be applied on the column with primary key or not null constraint .
moving further , you need not to restrain to a single table
SELECT COUNT(1),COUNT(2) FROM ( joins or any selection from any no of table);
just be aware of the no of columns existing in the selection set

Related

use distinct within case statement

I have a query that uses multiple left joins and trying to get a SUM of values from one of the joined columns.
SELECT
SUM( case when session.usersessionrun =1 then 1 else 0 end) new_unique_session_user_count
FROM session
LEFT JOIN appuser ON appuser.appid = '6279df3bd2d3352aed591583'
AND appuser.userid = session.userid
LEFT JOIN userdevice ON userdevice.appid = '6279df3bd2d3352aed591583'
AND userdevice.userid = appuser.userid
WHERE session.appid = '6279df3bd2d3352aed591583'
AND (session.uploadedon BETWEEN '2022-04-18 08:31:26' AND '2022-05-18 08:31:26')
But this obviously gives a redundant session.usersessionrun=1 counts since it's a joined resultset.
Here the logic was to mark the user as new if the sessionrun for that record is 1.
I grouped by userid and usersessionrun and it shows that the records are repeated.
userid. sessionrun. count
628212 1 2
627a01 1 4
So what I was trying to do was something like
SUM(CASE distinct(session.userid) AND WHEN session.usersessionrun = 1 THEN 1 ELSE 0 END) new_unique_session_user_count
i.e. for every unique user count, session.usersessionrun = 1 should only be done once.
As you have discovered, JOIN operations can generate combinatorial explosions of data.
You need a subquery to count your sessions by userid. Then you can treat the subquery as a virtual table and JOIN it to the other tables to get the information you need in your result set.
The subquery (nothing in my answer is debugged):
SELECT COUNT(*) new_unique_session_user_count,
session.userid
FROM session
WHERE session.appid = '6279df3bd2d3352aed591583'
AND session.uploadedon BETWEEN '2022-04-18 08:31:26'
AND '2022-05-18 08:31:26'
AND session.usersessionrun = 1
AND session.appid = '6279df3bd2d3352aed591583'
GROUP BY userid
This subquery summarizes your session table and has one row per userid. The trick to avoiding JOIN-created combinatorial explosions is using subqueries that generate results with only one row per data item mentioned in a JOIN's ON-clause.
Then, you join it with the other tables like this
SELECT summary.new_unique_session_user_count
FROM (
SELECT COUNT(*) new_unique_session_user_count,
session.userid
FROM session
WHERE session.appid = '6279df3bd2d3352aed591583'
AND session.uploadedon BETWEEN '2022-04-18 08:31:26'
AND '2022-05-18 08:31:26'
AND session.usersessionrun = 1
AND session.appid = '6279df3bd2d3352aed591583'
GROUP BY userid
) summary
JOIN appuser ON appuser.appid = '6279df3bd2d3352aed591583'
AND appuser.userid = summary.userid
JOIN userdevice ON userdevice.appid = '6279df3bd2d3352aed591583'
AND userdevice.userid = appuser.userid
There may be better ways to structure this query, but it's hard to guess at them without more information about your table definitions and business rules.

Looking for a way to not show duped rows using a SQL query

SELECT
AEC.gwd_people.id_people,
AEC.gwd_people.uid_people,
AEC.gwd_people.cod_people,
AEC.gwd_people.name_people,
AEC.gwd_people.surname_people,
AEC.gwd_people.email,
AEC.gwd_people.people_status,
AEC.gwd_people.people_type,
AEC.gwd_people.facility_reference,
AEC.gwd_people.sc_id_sap,
AEC.gwd_people.c_id_sap,
AEC.gwd_people.descr_people,
AEC.gwd_people.cod_sector,
AEC.gwd_people.descr_sector,
AEC.gwd_people.cod_org_sector,
AEC.gwd_people.descr_org_sector,
AEC.gwd_people.cod_company,
AEC.gwd_people.descr_company,
AEC.gwd_people.cod_company_sap,
AEC.gwd_people.cod_department,
AEC.gwd_department.descr_department,
AEC.gwd_people.cod_subdepartment,
AEC.gwd_people.descr_subdepartment,
AEC.gwd_people.cod_cdc,
AEC.gwd_cost_center.descr_cdc,
AEC.gwd_people.cod_category_job,
AEC.gwd_people.descr_category_job,
AEC.gwd_people.cod_people_job,
AEC.gwd_people.descr_people_job,
AEC.gwd_people.cod_position,
AEC.gwd_people.descr_position,
AEC.gwd_people.uohr,
AEC.gwd_people.qual_contract,
AEC.gwd_people.level_position,
AEC.gwd_people.cod_manager,
AEC.gwd_people.cod_validator,
AEC.gwd_people.cod_country,
AEC.gwd_people.descr_country,
AEC.gwd_people.cod_region_area,
AEC.gwd_people.descr_region_area,
AEC.gwd_people.descr_city,
AEC.gwd_people.descr_site,
AEC.gwd_people.address_1,
AEC.gwd_people.address_2,
AEC.gwd_people.descr_building,
AEC.gwd_people.descr_room,
AEC.gwd_people.validity_date,
AEC.aec_workstation.cod_workstation,
AEC.aec_workstation.geometry,
AEC.aec_workstation.drawing,
AEC.gwd_people.tax_code,
AEC.gwd_people.phone_1,
AEC.gwd_people.phone_2,
AEC.gwd_people.phone_3,
AEC.gwd_people.phone_4,
AEC.gwd_people.ext_email_1,
AEC.gwd_people.flagvip,
AEC.gwd_people.hiring_date,
AEC.gwd_people.cease_date,
AEC.gwd_people.cid_resp_liv_1,
AEC.gwd_people.cid_resp_liv_2,
AEC.gwd_people.id_resp,
AEC.gwd_people.descr_resp,
AEC.gwd_people.id_ref,
AEC.gwd_people.descr_ref,
AEC.gwd_people.descr_ext_people,
AEC.gwd_people.ext_email_2,
AEC.gwd_people.descr_sede,
(CASE WHEN AEC.aec_r_workstation_people.cod_people IS NULL
THEN AEC.gwd_people.idplan
ELSE NULL
END) AS idplan,
(CASE WHEN AEC.aec_r_workstation_people.cod_people IS NOT NULL
THEN SUBSTRING(AEC.aec_workstation.cod_workstation, 5, 7)
ELSE NULL
END) AS idplan_wrkst,
(CASE WHEN AEC.aec_r_workstation_people.cod_people IS NULL
THEN AEC.view_iam_r_unitp_building.IDEDIFICIO
ELSE NULL
END) AS cod_building,
(CASE WHEN AEC.aec_r_workstation_people.cod_people IS NOT NULL
THEN SUBSTRING(AEC.aec_workstation.cod_workstation, 5, 3)
ELSE NULL
END) AS cod_building_wrkst,
(CASE WHEN AEC.aec_r_workstation_people.cod_people IS NOT NULL
THEN AEC.aec_workstation.id_room
ELSE NULL
END) AS id_room_wrkst,
(CASE WHEN AEC.aec_r_workstation_people.cod_people IS NOT NULL
THEN AEC.aec_workstation.id_room
ELSE NULL
END) AS id_room_wrkst2
FROM AEC.gwd_people
LEFT OUTER JOIN AEC.view_iam_r_unitp_building ON
AEC.view_iam_r_unitp_building.IDUNITPROD = AEC.gwd_people.cod_sector
LEFT OUTER JOIN AEC.aec_r_workstation_people ON AEC.gwd_people.cod_people =
AEC.aec_r_workstation_people.cod_people
LEFT OUTER JOIN AEC.aec_workstation ON AEC.aec_workstation.cod_workstation
= AEC.aec_r_workstation_people.cod_workstation
LEFT OUTER JOIN AEC.gwd_department ON AEC.gwd_department.cod_department =
AEC.gwd_people.cod_department
LEFT OUTER JOIN AEC.gwd_cost_center ON AEC.gwd_cost_center.cod_cost_center
= AEC.gwd_people.cod_cdc
This is my query and I'm using SQL Server 13, it returns 6752 rows, 44 of them are duped. I've tried everything I know to avoid showing those duped entries but I'm out of ideas, so I'm looking for some helpful tips :-) One of the biggest problem is taht all fields are necessary, so I can't get rid of "AEC.aec_workstation.geometry" that causes problems with SELECT DISTINCT.
Find a PK value from your first table that's returning a duplicate row and start with the following query:
SELECT
COUNT(1)
FROM
AEC.gwd_people
WHERE
AEC.gwd_people.PrimaryKeyColumn = 'SomeValue'
Now start adding joins one by one, checking the result of the COUNT(1) each time:
SELECT
COUNT(1)
FROM
AEC.gwd_people
LEFT OUTER JOIN AEC.view_iam_r_unitp_building ON AEC.view_iam_r_unitp_building.IDUNITPROD = AEC.gwd_people.cod_sector
WHERE
AEC.gwd_people.PrimaryKeyColumn = 'SomeValue'
And then...
SELECT
COUNT(1)
FROM
AEC.gwd_people
LEFT OUTER JOIN AEC.view_iam_r_unitp_building ON AEC.view_iam_r_unitp_building.IDUNITPROD = AEC.gwd_people.cod_sector
LEFT OUTER JOIN AEC.aec_r_workstation_people ON AEC.gwd_people.cod_people = AEC.aec_r_workstation_people.cod_people
WHERE
AEC.gwd_people.PrimaryKeyColumn = 'SomeValue'
Until you see the amount of rows jump up when you don't expect it to. You are most likely:
Not considering that duplicate rows can be expected.
Missing another join column on a table.
Having duplicate rows on a table.
... or combination of these.
Your table design makes it a bit hard to understand their relations. This is what it looks like to me:
gwd_department {1:n} gwd_people
gwd_people {m:n} aec_workstation
gwd_people {m:n} view_iam_r_unitp_building
gwd_people {?:n} gwd_cost_center
So for a person linked to 3 aec_workstations and 4 view_iam_r_unitp_buildings, you'd produce 3 x 4 = 12 result rows. Is there no further relation between an aec_workstation and a view_iam_r_unitp_building? If not, then why do you combine them in your query?
I don't know whether cod_cdc is supposed to be short for cod_cost_center or something different. If this is an m:n relation, too, you are doing the same thing again with gwd_cost_center related to aec_workstation and view_iam_r_unitp_building.
Having said this: Either add the missing criteria or ask yourself what you want to select after all.

SQL - How to find entries where a value is missing between two rows

We are using Presto SQL at my job. I have spent hours trying to search for the answer to this question but can't find an answer and it's quite difficult to search for. Solving this issue opens the door to fixing a lot of problems.
I need to write a query that tries to find all entries where REQUEST_CANCEL & CHARGED exist but CANCEL_ACCOUNT is missing.
CHARGED & CANCEL_ACCOUNT should always come after REQUEST_CANCEL.
Table Name: CUSTOMER_INFO
|DATE_TIME|CUST_ID |ACTION |
|20180726 |1234 |CHARGED |
|20180726 |1234 |CANCEL_ACCOUNT|
|20180726 |1234 |REQUEST_CANCEL|
All these values exist in the same table. Here's what I have so far.
SELECT *
FROM
(SELECT *
FROM CUSTOMER_INFO
WHERE
DATE_TIME = 20180726
AND ACTION = REQUEST_CANCEL) as a
JOIN
(SELECT *
FROM CUSTOMER_INFO
WHERE
DATE_TIME = 20180726
AND ACTION = CHARGED) as b
ON a.CUST_ID = b.CUST_ID
WHERE
a.TIME < b.TIME
Let me explain it in a way that makes sense.
A = REQUEST_CANCEL
B = CANCEL_ACCOUNT
C = CHARGED
How do you query for when A and C exist but B is missing. The sequence needs to be exact A > B > C. It's essentially querying for something that doesn't exist between two values that do exist. In my current query, B can be returned between the two values and that's NOT what I want.
I think you're searching for NOT EXISTS and a corellated subquery.
SELECT *
FROM (SELECT *
FROM customer_info
WHERE action = 'REQUEST_CANCEL') rc
INNER JOIN (SELECT *
FROM customer_info
WHERE action = 'CHARGED') c
ON c.cust_id = rc.cust_id
AND c.date_time >= rc.date_time
WHERE NOT EXISTS (SELECT *
FROM customer_info ca
WHERE ca.cust_id = rc.cust_id
AND ca.action = 'CANCEL_ACCOUNT'
AND ca.date_time >= rc.date_time
AND ca.date_time <= c.date_time);
Use group by and having:
select cust_id
from customer_info ci
where date_time = 20180726 and
action in ('REQUEST_CANCEL', 'CHARGED', 'CANCEL_ACCOUNT')
group by cust_id
having sum(case when action = 'REQUEST_CANCEL' then 1 else 0 end) > 0 and
sum(case when action = 'CHARGED' then 1 else 0 end) > 0 and
sum(case when action = 'CANCEL_ACCOUNT' then 1 else 0 end) = 0 ;
Each sum() counts the number of matching records for the customer with that action. The > 0 says that one exists. The = 0 says that none exist.
The database doesn't matter for this logic. Here is a SQL Fiddle using MySQL.

Multiple values in SQL CASE's THEN Statement

I am wondering if it is possible to specify multiple values in the then part of a case statement in T-SQL?
I have attached a chunk of code where I am using this to join in some tables in a query. I have included a comment in the snippet.
LEFT JOIN Business B ON v.BusID = B.BusID
LEFT JOIN BusinessTypeKey T ON B.BusinessTypeID = T.BusTypeID
LEFT JOIN Location L ON L.BusID = B.BusID
AND L.HeadQuarters = CASE
WHEN (SELECT COUNT(1) from Location L2
WHERE L2.BusID = B.BusID) = 1
THEN 1,0 -- Would like to specify either 1 or 0 here. I suppose I could also make it euqal to -> L.HeadQuarters but would like a better way to impose it
ELSE 1
END
This is a little ugly, but assuming HeadQuarters is not a decimal/numeric type and only integer values,
AND L.HeadQuarters BETWEEN CASE WHEN (SELECT COUNT...) = 1 THEN 0 ELSE 1 END AND 1;
You mean...?
LEFT JOIN whatever
ON...
CASE...WHEN...THEN...END = 1
OR
CASE...WHEN...THEN...END = 0

how to write this query using joins?

i have a table campaign which has details of campaign mails sent.
campaign_table: campaign_id campaign_name flag
1 test1 1
2 test2 1
3 test3 0
another table campaign activity which has details of campaign activities.
campaign_activity: campaign_id is_clicked is_opened
1 0 1
1 1 0
2 0 1
2 1 0
I want to get all campaigns with flag value 3 and the number of is_clicked columns with value 1 and number of columns with is_opened value 1 in a single query.
ie. campaign_id campaign_name numberofclicks numberofopens
1 test1 1 1
2 test2 1 1
I did this using sub-query with the query:
select c.campaign_id,c.campaign_name,
(SELECT count(campaign_id) from campaign_activity WHERE campaign_id=c.id AND is_clicked=1) as numberofclicks,
(SELECT count(campaign_id) from campaign_activity WHERE campaign_id=c.id AND is_clicked=1) as numberofopens
FROM
campaign c
WHERE c.flag=1
But people say that using sub-queries are not a good coding convention and you have to use join instead of sub-queries. But i don't know how to get the same result using join. I consulted with some of my colleagues and they are saying that its not possible to use join in this situation. Is it possible to get the same result using joins? if yes, please tell me how.
This should do the trick. Substitute INNER JOIN for LEFT OUTER JOIN if you want to include campaigns which have no activity.
SELECT
c.Campaign_ID
, c.Campaign_Name
, SUM(CASE WHEN a.Is_Clicked = 1 THEN 1 ELSE 0 END) AS NumberOfClicks
, SUM(CASE WHEN a.Is_Opened = 1 THEN 1 ELSE 0 END) AS NumberOfOpens
FROM
dbo.Campaign c
INNER JOIN
dbo.Campaign_Activity a
ON a.Campaign_ID = c.Campaign_ID
GROUP BY
c.Campaign_ID
, c.Campaign_Name
Assuming is_clicked and is_opened are only ever 1 or 0, this should work:
select c.campaign_id, c.campaign_name, sum(d.is_clicked), sum(d.is_opened)
from campaign c inner join campaign_activity d
on c.campaign_id = d.campaign_id
where c.flag = 1
group by c.campaign_id, c.campaign_name
No sub-queries.
Hmm. Is what you want as simple as this? I'm not sure I'm reading the question right...
SELECT
campaign_table.campaign_id, SUM(is_clicked), SUM(is_opened)
FROM
campaign_table
INNER JOIN campaign_activity ON campaign_table.campaign_id = campaign_activity.campaign_id
WHERE
campaign_table.flag = 1
GROUP BY
campaign_table.campaign_id
Note that with an INNER JOIN here, you won't see campaigns where there's nothing corresponding in the campaign_activity table. In that circumstance, you should use a LEFT JOIN, and convert NULL to 0 in the SUM, e.g. SUM(IFNULL(is_clicked, 0)).
I suppose this should do it :
select * from campaign_table inner join campaign_activity on campaign_table.id = campaign_activity.id where campaign_table.flag = 3 and campaign_activity.is_clicked = 1 and campaign_activity.is_opened = 1
Attn : this is not tested in a live situation
The SQL in it's simplest form and most robust form is this: (formatted for readability)
SELECT
campaign_table.campaign_ID, campaign_table.campaign_name, Sum(campaign_activity.is_clicked) AS numberofclicks, Sum(campaign_activity.is_open) AS numberofopens
FROM
campaign_table INNER JOIN campaign_activity ON campaign_table.campaign_ID = campaign_activity.campaign_ID
GROUP BY
campaign_table.campaign_ID, campaign_table.campaign_name, campaign_table.flag
HAVING
campaign_table.flag=1;