Original Query:
select StudyID, count(CompletedDate), count(Removed), count(RemovalReason)
from Study a
full outer join Households b
on a.HouseholdID = b.HouseholdID
where StudyID = '123456'
and Removed = 1
and RemovalReason = 5
group by StudyID
How do I write this query so that each condition (Removed = 1, RemovalReason = 5) applies only to the count of its own column, rather than restricting the whole result? As written, the query does not show the total count for CompletedDate, because the WHERE clause limits every column to rows that meet both conditions. Is there a way to write the condition directly inside each count?
Table/Columns - Study:
HouseholdID (primary key),
StudyID,
CompletedDate
Table/Columns - Households:
HouseholdID (primary key),
Removed,
RemovalReason
I think you are looking for something like this, but your question is a little loose with details:
select StudyID
, count(CompletedDate)
, sum(case when Removed = 1 then 1 else 0 end)
, sum(case when RemovalReason = 5 then 1 else 0 end)
from Study a
join Households b
on a.HouseholdID = b.HouseholdID
where StudyID = '123456'
group by StudyID
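To see the conditional-aggregation pattern from the answer in action, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names come from the question; the sample rows and the column aliases (completed, removed, reason_5) are invented for illustration.

```python
import sqlite3

# In-memory database with made-up sample rows (assumption: not the asker's data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Study (HouseholdID INTEGER PRIMARY KEY, StudyID TEXT, CompletedDate TEXT);
CREATE TABLE Households (HouseholdID INTEGER PRIMARY KEY, Removed INTEGER, RemovalReason INTEGER);
INSERT INTO Study VALUES
    (1, '123456', '2020-01-01'),
    (2, '123456', NULL),
    (3, '123456', '2020-02-01'),
    (4, '999999', '2020-03-01');
INSERT INTO Households VALUES
    (1, 1, 5),
    (2, 0, NULL),
    (3, 1, 2),
    (4, 1, 5);
""")

# Each CASE expression carries its own condition, so the WHERE clause
# only has to pick the study; it no longer filters on Removed or RemovalReason.
row = conn.execute("""
SELECT a.StudyID,
       COUNT(a.CompletedDate)                               AS completed,
       SUM(CASE WHEN b.Removed = 1 THEN 1 ELSE 0 END)       AS removed,
       SUM(CASE WHEN b.RemovalReason = 5 THEN 1 ELSE 0 END) AS reason_5
FROM Study a
JOIN Households b ON a.HouseholdID = b.HouseholdID
WHERE a.StudyID = '123456'
GROUP BY a.StudyID
""").fetchone()

print(row)
```

COUNT(CompletedDate) skips NULLs, which is why it needs no CASE expression of its own.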
I have a table with 5 columns like this table below:
I want to fill the approval column based on conditions such as:
APPROVAL = "Y" IN CASE
(1) CUSTOMER_NEW <> CUSTOMER (as in the case CUSTOMER = 12467)
(2) CUSTOMER_NEW = CUSTOMER AND SALE_SHORT_ID COLUMN HAS DIFFERENT VALUE (as in the case CUSTOMER = 13579)
(3) CUSTOMER_NEW = CUSTOMER AND SALE_SHORT_ID HAS THE SAME VALUE THEN MAX(LEN(SALE_ID) (as in the case CUSTOMER = 65465)
ELSE "N"
My expected result looks like this table:
Thanks for your support.
I am not sure about all your conditions, but this could do it:
UPDATE a
SET approval =
CASE WHEN
a.customer <> a.customer_new OR
a.customer = a.customer_new AND b.sale_short_eq = 0 OR
a.customer = a.customer_new AND a.sale_id = b.max_len_sale_id
THEN 'Y'
ELSE 'N'
END
FROM
thetable a
INNER JOIN (
SELECT
customer,
CASE WHEN MIN(sale_short_id) = MAX(sale_short_id) THEN 1 ELSE 0 END AS sale_short_eq,
( SELECT TOP 1 sale_id
FROM thetable
WHERE customer = c.customer
ORDER BY LEN(sale_id) DESC
) AS max_len_sale_id
FROM thetable c
GROUP BY customer
) b
ON a.customer = b.customer
Note that the columns b.sale_short_eq and b.max_len_sale_id are returned by a sub-query joined to the main query.
In particular, "SALE_SHORT_ID HAS THE SAME VALUE THEN MAX(LEN(SALE_ID)" is not clear. Did you mean that SALE_ID must be the one with the maximum length? SALE_SHORT_ID cannot be the same as any SALE_ID. And what happens if two different SALE_IDs have the same maximum length?
See: https://dbfiddle.uk/7Ct87-Mk
I have a query that uses multiple left joins and trying to get a SUM of values from one of the joined columns.
SELECT
SUM( case when session.usersessionrun =1 then 1 else 0 end) new_unique_session_user_count
FROM session
LEFT JOIN appuser ON appuser.appid = '6279df3bd2d3352aed591583'
AND appuser.userid = session.userid
LEFT JOIN userdevice ON userdevice.appid = '6279df3bd2d3352aed591583'
AND userdevice.userid = appuser.userid
WHERE session.appid = '6279df3bd2d3352aed591583'
AND (session.uploadedon BETWEEN '2022-04-18 08:31:26' AND '2022-05-18 08:31:26')
But this obviously gives a redundant session.usersessionrun=1 counts since it's a joined resultset.
Here the logic was to mark the user as new if the sessionrun for that record is 1.
I grouped by userid and usersessionrun and it shows that the records are repeated.
userid  sessionrun  count
628212  1           2
627a01  1           4
So what I was trying to do was something like
SUM(CASE distinct(session.userid) AND WHEN session.usersessionrun = 1 THEN 1 ELSE 0 END) new_unique_session_user_count
i.e. for every unique user count, session.usersessionrun = 1 should only be done once.
As you have discovered, JOIN operations can generate combinatorial explosions of data.
You need a subquery to count your sessions by userid. Then you can treat the subquery as a virtual table and JOIN it to the other tables to get the information you need in your result set.
The subquery (nothing in my answer is debugged):
SELECT COUNT(*) new_unique_session_user_count,
session.userid
FROM session
WHERE session.appid = '6279df3bd2d3352aed591583'
AND session.uploadedon BETWEEN '2022-04-18 08:31:26'
AND '2022-05-18 08:31:26'
AND session.usersessionrun = 1
GROUP BY userid
This subquery summarizes your session table and has one row per userid. The trick to avoiding JOIN-created combinatorial explosions is using subqueries that generate results with only one row per data item mentioned in a JOIN's ON-clause.
Then, you join it with the other tables like this
SELECT summary.new_unique_session_user_count
FROM (
SELECT COUNT(*) new_unique_session_user_count,
session.userid
FROM session
WHERE session.appid = '6279df3bd2d3352aed591583'
AND session.uploadedon BETWEEN '2022-04-18 08:31:26'
AND '2022-05-18 08:31:26'
AND session.usersessionrun = 1
GROUP BY userid
) summary
JOIN appuser ON appuser.appid = '6279df3bd2d3352aed591583'
AND appuser.userid = summary.userid
JOIN userdevice ON userdevice.appid = '6279df3bd2d3352aed591583'
AND userdevice.userid = appuser.userid
There may be better ways to structure this query, but it's hard to guess at them without more information about your table definitions and business rules.
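If the goal is a single number (how many distinct users have at least one usersessionrun = 1 row), one option that keeps the original joins is COUNT(DISTINCT CASE ... END), which is close to what the question was reaching for. The sketch below uses Python's sqlite3 module; the tables are pared down to the columns that matter, the date filter is omitted, and the 'app1' ID, the deviceid column, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE session (userid TEXT, appid TEXT, usersessionrun INTEGER);
CREATE TABLE appuser (userid TEXT, appid TEXT);
CREATE TABLE userdevice (userid TEXT, appid TEXT, deviceid TEXT);
INSERT INTO session VALUES
    ('u1', 'app1', 1),
    ('u2', 'app1', 1),
    ('u2', 'app1', 2),
    ('u3', 'app1', 2);
INSERT INTO appuser VALUES ('u1', 'app1'), ('u2', 'app1'), ('u3', 'app1');
-- u1 has two devices, so every u1 session row is duplicated by the join.
INSERT INTO userdevice VALUES
    ('u1', 'app1', 'd1'),
    ('u1', 'app1', 'd2'),
    ('u2', 'app1', 'd3'),
    ('u3', 'app1', 'd4');
""")

# First column: the inflated SUM from the question.
# Second column: distinct users having a usersessionrun = 1 row; the CASE
# yields NULL for other rows, and COUNT(DISTINCT ...) ignores NULLs.
inflated, unique_new = conn.execute("""
SELECT SUM(CASE WHEN session.usersessionrun = 1 THEN 1 ELSE 0 END),
       COUNT(DISTINCT CASE WHEN session.usersessionrun = 1
                           THEN session.userid END)
FROM session
LEFT JOIN appuser    ON appuser.appid = 'app1'
                    AND appuser.userid = session.userid
LEFT JOIN userdevice ON userdevice.appid = 'app1'
                    AND userdevice.userid = appuser.userid
WHERE session.appid = 'app1'
""").fetchone()

print(inflated, unique_new)
```

With this data the SUM is inflated by u1's second device, while the DISTINCT count is not affected by how many rows the joins produce.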
With the sample data below, a collection_ref will carry 3 or 4 movements, but the key two are the purchase and the sale. Supply Chain Approved will be 1 once the movement is complete.
I want a CASE WHEN so that, when the movement is a Sale and Supply Chain Approved is 1, a new column shows 'Completed' for all lines that have the same collection ref, and 'Outstanding' otherwise. Any ideas?
Thanks in advance
SELECT DISTINCT a.bulk_type_code,
a.bulk_number,
a.supplier_contract_ref,
a.supplier_consignment_ref,
a.supplier_org_code,
a.supplier_org_name,
a.collection_ref,
a.raw_weight_tons,
a.financial_net_weight_tons,
a.purchase_weight_tons,
a.delivery_date,
b.delivery_term_code,
b.delivery_term_description,
c.week_number
FROM bi.bulk_subcontainer_pricing_group_summary a
LEFT JOIN bi.contracts b ON a.supplier_contract_number = b.contract_number
OR a.purchaser_contract_number = b.contract_number
INNER JOIN bi.weeks c ON a.delivery_date BETWEEN c.start_date AND c.end_date
WHERE a.delivery_date BETWEEN #DFrom AND #DTo
AND a.bulk_type_code IN (#BulkType)
AND a.business_unit_code = 'OLLIMP'
AND a.pricing_type_group_code = #pricing_hidden
ORDER BY a.collection_ref
It would be helpful if you could share the schema instead of writing out the column / key names.
It sounds like what you want is something like this
Select Case when Bulk_Type_Code = 'Sale' and Supply_Chain_Approved = 1 then 'Completed'
when Bulk_Type_Code = 'Sale' and Supply_Chain_Approved = 0 then 'Outstanding'
else NULL end as CalculatedColumn
If the Bulk_Type_Code is not 'Sale', then this column will be NULL.
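The question also asks for the status to appear on every line that shares a collection ref, not only on the Sale line. One way to sketch that is a correlated EXISTS subquery, shown here with Python's sqlite3 module. The movements table, its sample rows, and the assumption that bulk_type_code holds the movement type (as in the answer above) are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, pared-down stand-in for the real movement data.
CREATE TABLE movements (
    collection_ref TEXT,
    bulk_type_code TEXT,
    supply_chain_approved INTEGER
);
INSERT INTO movements VALUES
    ('C1', 'Purchase', 1),
    ('C1', 'Sale',     1),
    ('C2', 'Purchase', 1),
    ('C2', 'Sale',     0);
""")

# For each row, look for an approved Sale with the same collection_ref.
rows = conn.execute("""
SELECT m.collection_ref,
       m.bulk_type_code,
       CASE WHEN EXISTS (SELECT 1
                         FROM movements s
                         WHERE s.collection_ref = m.collection_ref
                           AND s.bulk_type_code = 'Sale'
                           AND s.supply_chain_approved = 1)
            THEN 'Completed'
            ELSE 'Outstanding'
       END AS status
FROM movements m
ORDER BY m.collection_ref, m.bulk_type_code
""").fetchall()

for r in rows:
    print(r)
```

A window function such as MAX(...) OVER (PARTITION BY collection_ref) could serve the same purpose on engines that support it.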
I'm trying to create a query with PostgreSQL. Unfortunately, the attributes in the attribute_table are listed as rows instead of columns, which makes them harder to pull. I want to pull a count based on the three attributes listed below (1000 = gender, where 2 = female; 1001 = age group, where 5 = 55-64; 1002 = household size, where 1 = 1). How do I adjust this query so that it gives me one row instead of three rows for the same personal_ID? Also, this query doesn't return any values as written, but it works if I put in only one attribute.
select sa.country_id ,count(distinct sa.personal_id )
from study_table sa ,attribute_table a
where sa.country_id =a.country_id and sa.personal_id =a.personal_id
and to_char(sa.mailing_date,'yyyy-MM')='2021-01'
and attribute_id =1000 and a.attribute_number =2
and attribute_id =1001 and a.attribute_number =5
and attribute_id =1002 and a.attribute_number =1
and study_type ='Wave'
and status not in ('NEW','EXCLUDED','ERROR')
group by sa.country_id
First, use proper, explicit, standard JOIN syntax.
Second, your WHERE conditions are contradictory. You need ORs . . . and then a HAVING for the final filtering:
select sa.country_id ,count(distinct sa.personal_id )
from study_table sa join
attribute_table a
on sa.country_id = a.country_id and
sa.personal_id = a.personal_id
where to_char(sa.mailing_date,'yyyy-MM') = '2021-01' and
( (attribute_id = 1000 and a.attribute_number = 2) or
(attribute_id = 1001 and a.attribute_number = 5) or
(attribute_id = 1002 and a.attribute_number = 1)
) and
study_type ='Wave' and
status not in ('NEW','EXCLUDED','ERROR')
group by sa.country_id
having count(distinct attribute_id) = 3;
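One caveat with the query above: because it groups only by country_id, the HAVING clause checks that the three attributes appear somewhere within the country, not on the same person. A hedged sketch of a per-person variant, using Python's sqlite3 module, groups by personal_id first and then counts per country. The tables are reduced to the relevant columns and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE study_table (country_id TEXT, personal_id TEXT);
CREATE TABLE attribute_table (
    country_id TEXT, personal_id TEXT,
    attribute_id INTEGER, attribute_number INTEGER
);
INSERT INTO study_table VALUES ('US', 'p1'), ('US', 'p2'), ('US', 'p3');
INSERT INTO attribute_table VALUES
    ('US', 'p1', 1000, 2), ('US', 'p1', 1001, 5), ('US', 'p1', 1002, 1),
    ('US', 'p2', 1000, 2), ('US', 'p2', 1001, 5), ('US', 'p2', 1002, 3),
    ('US', 'p3', 1002, 1);
""")

FILTER = """
    (a.attribute_id = 1000 AND a.attribute_number = 2) OR
    (a.attribute_id = 1001 AND a.attribute_number = 5) OR
    (a.attribute_id = 1002 AND a.attribute_number = 1)
"""
JOIN = """
    FROM study_table sa
    JOIN attribute_table a
      ON sa.country_id = a.country_id AND sa.personal_id = a.personal_id
"""

# Country-level HAVING: counts anyone with at least one matching attribute.
per_country = conn.execute(f"""
SELECT sa.country_id, COUNT(DISTINCT sa.personal_id)
{JOIN}
WHERE {FILTER}
GROUP BY sa.country_id
HAVING COUNT(DISTINCT a.attribute_id) = 3
""").fetchall()

# Person-level HAVING: keep only people who match all three, then count.
per_person = conn.execute(f"""
SELECT country_id, COUNT(*)
FROM (
    SELECT sa.country_id, sa.personal_id
    {JOIN}
    WHERE {FILTER}
    GROUP BY sa.country_id, sa.personal_id
    HAVING COUNT(DISTINCT a.attribute_id) = 3
) matched
GROUP BY country_id
""").fetchall()

print(per_country, per_person)
```

Only p1 carries all three attribute values, so the person-level version reports 1 while the country-level version reports 3.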
It seems your attribute_table follows an Entity-Attribute-Value (EAV) model. (IMHO that is an extremely bad plan, but if that is what you've got, that is what you deal with.) So to get/validate/set 3 attributes you need to reference that table 3 times in the query, once for each attribute.
select sa.country_id
, count(distinct sa.personal_id )
from study_table sa
join ( select g.personal_id
from attribute_table g
join attribute_table ag on ag.personal_id = g.personal_id
join attribute_table hs on hs.personal_id = g.personal_id
where 1=1
and (g.attribute_id =1000 and g.attribute_number =2)
and (ag.attribute_id =1001 and ag.attribute_number =5)
and (hs.attribute_id =1002 and hs.attribute_number =1)
) sf
on sf.personal_id = sa.personal_id
where 1=1
and to_char(sa.mailing_date,'yyyy-mm')='2021-01'
and sa.study_type ='Wave'
and sa.status not in ('NEW','EXCLUDED','ERROR')
group by sa.country_id;
The above of course assumes the column names study_type and status originate in the study table - seems likely.
But if not you will need 2 more references to the attribute table.
Hint: Always use table aliases on ALL column references.
I am using TOAD for Oracle. While writing some SQL queries I encountered this problem:
My select query uses a few tables that each have approx. 10M rows. Two of the tables have over 70M rows.
Let's say I have:
a TRANSACTION table (prim. key: SQ_TRANSACTION_ID)
a TRANSACTION_DETAIL table (foreign keys: RF_TRANSACTION_ID,
RF_PRODUCT_ID)
a PRODUCT table (prim. key: SQ_PRODUCT_ID)
My select query is like;
SELECT TR.TRANSACTION_ID,
SUM(CASE WHEN PR.CD_PRODCUT_TYPE = 'A'
THEN TRD.CS_INVOICE_PRICE ELSE 0 END) A_PRODUCT_TOTAL,
SUM(CASE WHEN PR.CD_PRODCUT_TYPE <> 'A'
THEN TRD.CS_INVOICE_PRICE ELSE 0 END) B_PRODUCT_TOTAL
FROM TRANSACTION TR,
TRANSACTION_DETAIL TRD,
PRODUCT PR
WHERE TR.SQ_TRANSACTION_ID = TRD.RF_TRANSACTION_ID
AND TRD.RF_PRODUCT_ID = PR.SQ_PRODUCT_ID
GROUP BY TR.TRANSACTION_ID,
CASE WHEN PR.CD_PRODCUT_TYPE = 'A' THEN TRD.CS_INVOICE_PRICE ELSE 0 END,
CASE WHEN PR.CD_PRODCUT_TYPE <> 'A' THEN TRD.CS_INVOICE_PRICE ELSE 0 END
Is there a way to split this query into two or more parts that reference each other through their foreign/primary keys? I mean splitting it into two parts, where the first part fetches A_PRODUCT_TOTAL and the second part fetches B_PRODUCT_TOTAL. The transaction IDs of each part should match in the result data.
A direct translation of your query would be:
SELECT TR.TRANSACTION_ID, SUM(TRD.CS_INVOICE_PRICE) A_PRODUCT_TOTAL
FROM TRANSACTION TR join
TRANSACTION_DETAIL TRD
on TR.SQ_TRANSACTION_ID = TRD.RF_TRANSACTION_ID join
PRODUCT PR
on TRD.RF_PRODUCT_ID = PR.SQ_PRODUCT_ID
WHERE PR.CD_PRODCUT_TYPE = 'A'
GROUP BY TR.TRANSACTION_ID,
CASE WHEN PR.CD_PRODCUT_TYPE = 'A' THEN TRD.CS_INVOICE_PRICE ELSE 0 END
However, I suspect that you don't want the second clause in the group by, because each transaction would be split into rows where the invoice price is the same:
SELECT TR.TRANSACTION_ID, SUM(TRD.CS_INVOICE_PRICE) A_PRODUCT_TOTAL
FROM TRANSACTION TR join
TRANSACTION_DETAIL TRD
on TR.SQ_TRANSACTION_ID = TRD.RF_TRANSACTION_ID join
PRODUCT PR
on TRD.RF_PRODUCT_ID = PR.SQ_PRODUCT_ID
WHERE PR.CD_PRODCUT_TYPE = 'A'
GROUP BY TR.TRANSACTION_ID;
The query for 'B' would be similar.
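If the two totals are wanted side by side, a single pass may also work once the CASE expressions are dropped from the GROUP BY, so that each transaction collapses to one row. The sketch below uses Python's sqlite3 module with invented rows. The table is named TXN here only to avoid clashing with SQLite's TRANSACTION keyword, and it selects SQ_TRANSACTION_ID as the key; the misspelled CD_PRODCUT_TYPE column is kept as it appears in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- TXN stands in for the TRANSACTION table (hypothetical name for this sketch).
CREATE TABLE TXN (SQ_TRANSACTION_ID INTEGER PRIMARY KEY);
CREATE TABLE PRODUCT (SQ_PRODUCT_ID INTEGER PRIMARY KEY, CD_PRODCUT_TYPE TEXT);
CREATE TABLE TRANSACTION_DETAIL (
    RF_TRANSACTION_ID INTEGER,
    RF_PRODUCT_ID INTEGER,
    CS_INVOICE_PRICE INTEGER
);
INSERT INTO TXN VALUES (1), (2);
INSERT INTO PRODUCT VALUES (10, 'A'), (11, 'B');
INSERT INTO TRANSACTION_DETAIL VALUES
    (1, 10, 100),
    (1, 10, 50),
    (1, 11, 30),
    (2, 11, 20);
""")

# Grouping only by the transaction key gives one row per transaction,
# with each SUM picking up just the detail rows its CASE lets through.
totals = conn.execute("""
SELECT TR.SQ_TRANSACTION_ID,
       SUM(CASE WHEN PR.CD_PRODCUT_TYPE = 'A'
                THEN TRD.CS_INVOICE_PRICE ELSE 0 END)  AS A_PRODUCT_TOTAL,
       SUM(CASE WHEN PR.CD_PRODCUT_TYPE <> 'A'
                THEN TRD.CS_INVOICE_PRICE ELSE 0 END)  AS B_PRODUCT_TOTAL
FROM TXN TR
JOIN TRANSACTION_DETAIL TRD ON TR.SQ_TRANSACTION_ID = TRD.RF_TRANSACTION_ID
JOIN PRODUCT PR             ON TRD.RF_PRODUCT_ID = PR.SQ_PRODUCT_ID
GROUP BY TR.SQ_TRANSACTION_ID
ORDER BY TR.SQ_TRANSACTION_ID
""").fetchall()

print(totals)
```

Whether this beats two separate queries on tables of 10M to 70M rows depends on indexes and the execution plan, so it is worth comparing both on the real data.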