ORDER BY clause not showing expected result - sql

When I run the following query, where I need to use the TRIM function on the date,
the output is not ordered properly:
select trim(man_date_created)as createddate,count(*) recordcount
from man
where man_date_created>sysdate-15
group by trim(man_date_created) ORDER BY createddate;
This is the output I am getting from this query:
01-APR-16
02-APR-16
03-APR-16
04-APR-16
05-APR-16
06-APR-16
07-APR-16
08-APR-16
09-APR-16
10-APR-16
11-APR-16
27-MAR-16
28-MAR-16
29-MAR-16
30-MAR-16
31-MAR-16
You can see that after 11 April it shows the entries for March.
Is there any solution for this, so that I can also get the count of all statuses?

You should convert your string to a date, for example:
SELECT TO_DATE('12-4-2016','DD-MM-YYYY') FROM dual;
Assuming man_date_created is a DATE column, TRIM() implicitly converts it to a string, so the rows are sorted as text. TRUNC() drops the time part but keeps the DATE type:
select trunc(man_date_created) as createddate,count(*) recordcount
from man
where man_date_created>sysdate-15
group by trunc(man_date_created) ORDER BY createddate;
In your case, try this:
select trunc(mandate) createddate, count(*) recordcount,
count(case when man_status = 'A' then 1 end) as a,
count(case when man_status = 'S' then 1 end) as s,
count(case when man_status = 'C' then 1 end) as c,
count(case when man_status = 'R' then 1 end) as r
from man
where man_status IN ('A','S','C','R') and mandate>sysdate-15
group by trunc(mandate) ORDER BY createddate;

You have to convert the string back to a date in the ORDER BY clause (the format mask must match the displayed format, here DD-MON-RR):
select trim(man_date_created) as createddate,count(*) recordcount
from man
where man_date_created>sysdate-15
group by trim(man_date_created) ORDER BY TO_DATE(trim(man_date_created), 'DD-MON-RR');
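The effect described in the answers above can be reproduced outside the database. The following is a minimal sketch in Python, not the original query: the sample labels are a subset of the output in the question, and the helper parse_dd_mon_yy and the MONTHS table are hypothetical names introduced only for illustration. It shows that sorting 'DD-MON-YY' strings as text puts March after April, while sorting by the parsed date gives the expected order.

```python
from datetime import date

# Hypothetical lookup table; avoids depending on the system locale.
MONTHS = {"JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
          "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12}


def parse_dd_mon_yy(text):
    """Parse 'DD-MON-YY' (e.g. '27-MAR-16') into a date, assuming years 20YY."""
    day, mon, yy = text.split("-")
    return date(2000 + int(yy), MONTHS[mon.upper()], int(day))


# A few of the labels from the question's output, in arbitrary order.
labels = ["01-APR-16", "11-APR-16", "27-MAR-16", "31-MAR-16", "02-APR-16"]

# Text sort: compares character by character, so '27-MAR' lands after '11-APR'.
as_text = sorted(labels)

# Date sort: compares the parsed calendar dates instead.
as_dates = sorted(labels, key=parse_dd_mon_yy)

print(as_text)
print(as_dates)
```

This is the same distinction as ordering by the TRIM() result (text) versus ordering by a DATE value in the query.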

Related

Combine 2 queries together

I am struggling to combine two queries into one that gives me 3 columns: Month, total_sold_products and drinks_sold_products.
Query 1:
Select month(date), count(id) as total_sold_products
from Products
where date between '2022-01-01' and '2022-12-31'
Query 2
Select month(date), count(id) as drinks_sold_products
from Products where type = 'drinks' and date between '2022-01-01' and '2022-12-31'
I tried UNION, but it counted count(id) twice and gave me only 2 columns.
Many thanks!
UNION stacks result sets on top of each other. You need conditional aggregation or a join. See below.
SELECT MONTH(date),
COUNT(*) AS total_sold_products,
-- no ELSE branch: COUNT skips the NULLs produced for non-drinks rows
COUNT(CASE WHEN type = 'drinks' THEN 1 END) AS drinks_sold_products,
FORMAT((CASE
WHEN COUNT(*) > 0 THEN
-- 1.0 * avoids integer division
1.0 * COUNT(CASE WHEN type = 'drinks' THEN 1 END)/COUNT(*)
ELSE 0 END),
'P') AS Percentage
FROM Products
WHERE date BETWEEN '2022-01-01' AND '2022-12-31'
GROUP BY MONTH(date)
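To see the conditional-aggregation idea run end to end, here is a small sketch using Python's built-in sqlite3 module. SQLite is an assumption made only so the example is self-contained; it has no MONTH() or FORMAT(), so strftime is used for the month and the percentage column is left out. The function name monthly_counts, the alias sale_month and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data: (id, type, date). The last row falls outside 2022.
SAMPLE_ROWS = [
    (1, "drinks", "2022-01-05"),
    (2, "food", "2022-01-09"),
    (3, "drinks", "2022-02-11"),
    (4, "drinks", "2022-02-20"),
    (5, "food", "2022-02-21"),
    (6, "drinks", "2021-12-31"),
]


def monthly_counts(rows):
    """Return (month, total_sold_products, drinks_sold_products) per month."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Products (id INTEGER, type TEXT, date TEXT)")
    conn.executemany("INSERT INTO Products VALUES (?, ?, ?)", rows)
    query = """
        SELECT CAST(strftime('%m', date) AS INTEGER) AS sale_month,
               COUNT(*) AS total_sold_products,
               -- CASE without ELSE yields NULL for non-drinks; COUNT skips NULLs
               COUNT(CASE WHEN type = 'drinks' THEN 1 END) AS drinks_sold_products
        FROM Products
        WHERE date BETWEEN '2022-01-01' AND '2022-12-31'
        GROUP BY sale_month
        ORDER BY sale_month
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


print(monthly_counts(SAMPLE_ROWS))
```

Both counts come from a single pass over the table, which is why no UNION or join is needed for this shape of result.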

How to exclude 0 from count() in SQL?

I have the query below, with which I want to count the number of first purchases in a given period. My sales table has a column is_first_purchase, which is 0 if the buyer is not a first-time buyer.
For example:
buyer_id = 456391 is an existing buyer who made purchases on 2 different dates.
Hence the is_first_purchase column shows 0 for this buyer, as per below.
If I do a count() on is_first_purchase for buyer_id = 456391, it should return 0 instead of 2.
My query is as follows:
with first_purchases as
(select *,
case when is_first_purchase = 1 then 'Yes' else 'No' end as first_purchase
from sales)
select
count(case when first_purchase = 'Yes' then 1 else 0 end) as no_of_first_purchases
from first_purchases
where buyer_id = 456391
and date_id between '2021-02-01' and '2021-03-01'
order by 1 desc;
It returned the result below, which is not the intended output.
I would appreciate it if someone could explain how to exclude is_first_purchase = 0 from the count. Thanks.
COUNT counts every value that is not NULL, and that includes 0. If you don't want a row to be counted, the CASE WHEN has to return NULL for it.
There are two ways to get the count you expect: use SUM, or use COUNT and remove the ELSE 0 part.
SUM(case when first_purchase = 'Yes' then 1 else 0 end) as no_of_first_purchases
COUNT(case when first_purchase = 'Yes' then 1 end) as no_of_first_purchases
Based on your question, I would combine the CTE and the main query as below:
select
COUNT(case when is_first_purchase = 1 then 1 end) as no_of_first_purchases
from sales
where buyer_id = 456391
and date_id between '2021-02-01' and '2021-03-01'
order by 1 desc;
I think that you are using COUNT() when you want SUM().
with first_purchases as
(select *,
case when is_first_purchase = 1 then 'Yes' else 'No' end as first_purchase
from sales)
select
SUM(case when first_purchase = 'Yes' then 1 else 0 end) as no_of_first_purchases
from first_purchases
where buyer_id = 456391
and date_id between '2021-02-01' and '2021-03-01'
order by 1 desc;
You could simplify your query as:
SELECT COUNT(*) AS no_of_first_purchases
FROM sales
WHERE is_first_purchase = 1
AND buyer_id = 456391
AND date_id BETWEEN '2021-02-01' AND '2021-03-01'
ORDER BY 1 DESC;
It is better to avoid the use of functions like IF and CASE when it can be done with WHERE.
The simplest approach for Trino (f.k.a. Presto SQL) is to use an aggregate with a filter:
count(*) FILTER (WHERE first_purchase = 'Yes') AS no_of_first_purchases
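The difference between the variants discussed above can be checked on a tiny table. This is a sketch only, run against SQLite through Python's sqlite3 module (an assumption; the question does not name its engine at this point), with made-up rows and a hypothetical helper compare_counts. It evaluates COUNT with ELSE 0, COUNT without ELSE, and SUM side by side for one buyer.

```python
import sqlite3

# Hypothetical rows: buyer 456391 has two purchases, neither a first purchase.
SALES = [(456391, 0), (456391, 0), (111, 1)]


def compare_counts(buyer_id):
    """Return (count_with_else_0, count_without_else, sum_of_flags) for a buyer."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (buyer_id INTEGER, is_first_purchase INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", SALES)
    row = conn.execute(
        """
        SELECT
            -- 0 is not NULL, so every row is counted
            COUNT(CASE WHEN is_first_purchase = 1 THEN 1 ELSE 0 END),
            -- missing ELSE gives NULL, which COUNT ignores
            COUNT(CASE WHEN is_first_purchase = 1 THEN 1 END),
            -- SUM adds the zeros, so they do not change the total
            SUM(CASE WHEN is_first_purchase = 1 THEN 1 ELSE 0 END)
        FROM sales
        WHERE buyer_id = ?
        """,
        (buyer_id,),
    ).fetchone()
    conn.close()
    return row


print(compare_counts(456391))
```

Only the first expression reports 2 for the existing buyer; the other two report 0, which is the behaviour the question asks for.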

Sum of distinct values after grouping explodes a metric

I am using
with t1 as
(
SELECT
DATE_TRUNC(PARSE_DATE("%Y%m%d", date), MONTH) as month,
fullVisitorId,
product.productSKU,
product.v2ProductName,
case when hits.ecommerceaction.action_type = '2' then 1 else 0 end as pdp_visitor,
count(case when hits.ecommerceaction.action_type = '2' then fullvisitorid else null end) AS views_pdp,
count(case when hits.ecommerceaction.action_type = '3' then fullvisitorid else null end) AS add_cart,
count(case when hits.ecommerceaction.action_type = '6' then hits.transaction.transactionid else null end) AS conversions,
count(distinct(hits.transaction.transactionId)) as transaction_id_cnt,
FROM `table` AS nr,
UNNEST(hits) hits,
UNNEST(product) product
GROUP BY 1,2,3,4,5
)
select
month,
sum(views_pdp) as pdp
,sum(add_cart) as add_cart
,sum(conversions) as conversions
,sum(transaction_id_cnt)
from t1
group by 1
order by 1 desc;
Which returns
month       pdp  add_cart  conversions  f0_
2021-02-01  500  100       20           10
2021-01-01  600  200       30           20
I know that f0_ ( count(distinct(hits.transaction.transactionId)) ) is bad here because of product.productSKU and product.v2ProductName grouping.
In general, when user makes an order with 3 items in his basket, I want to count this as one order, whereas now it is counted as 3.
This count(distinct(hits.transaction.transactionId)) as transaction_id_cnt results in the correct output if I comment out product.productSKU and product.v2ProductName.
Running this query:
with t1 as
(
SELECT
DATE_TRUNC(PARSE_DATE("%Y%m%d", date), MONTH) as month,
fullVisitorId,
-- product.productSKU, # commented out
-- product.v2ProductName, # commented out
case when hits.ecommerceaction.action_type = '2' then 1 else 0 end as pdp_visitor,
count(case when hits.ecommerceaction.action_type = '2' then fullvisitorid else null end) AS views_pdp,
count(case when hits.ecommerceaction.action_type = '3' then fullvisitorid else null end) AS add_cart,
count(case when hits.ecommerceaction.action_type = '6' then hits.transaction.transactionid else null end) AS conversions,
count(distinct(hits.transaction.transactionId)) as transaction_id_cnt,
FROM `table` AS nr,
UNNEST(hits) hits,
UNNEST(product) product
GROUP BY 1,2,3
)
select
month,
sum(views_pdp) as pdp
,sum(add_cart) as add_cart
,sum(conversions) as conversions
,sum(transaction_id_cnt)
from t1
group by 1
order by 1 desc;
This returns what is expected, but now I don't have productSKU and v2ProductName, which I need. I suspect the problem is that each order is a new row in Google BigQuery, and when I select it by product name and SKU, I count the unique values per product and then sum them.
How can I achieve the correct summation of count(distinct(hits.transaction.transactionId)) without losing the grouping by product.productSKU and product.v2ProductName which explodes this metric?
In the GROUP BY query you could collect them as arrays (so you don't group by them):
ARRAY_AGG(DISTINCT product.productSKU IGNORE NULLS) AS productSKU_list,
ARRAY_AGG(DISTINCT product.v2ProductName IGNORE NULLS) AS productName_list,
Update per your comment below: if you want to use them in a further GROUP BY, save them as strings instead of arrays.
STRING_AGG(DISTINCT product.productSKU, ',') AS productSKU_list,
STRING_AGG(DISTINCT product.v2ProductName, ',') AS productName_list,
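Why the metric "explodes" can be shown with a few rows. The sketch below is not the BigQuery query from the question; it uses SQLite through Python's sqlite3 module as a stand-in, with a hypothetical flat table hits(month, transaction_id, sku) and a hypothetical helper distinct_counts. It compares summing per-SKU distinct counts with counting distinct transactions at the month level.

```python
import sqlite3

# Hypothetical data: order T1 contains three products, order T2 contains one.
HITS = [
    ("2021-02-01", "T1", "A"),
    ("2021-02-01", "T1", "B"),
    ("2021-02-01", "T1", "C"),
    ("2021-02-01", "T2", "A"),
]


def distinct_counts():
    """Return (summed_per_sku, counted_per_month) result lists."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hits (month TEXT, transaction_id TEXT, sku TEXT)")
    conn.executemany("INSERT INTO hits VALUES (?, ?, ?)", HITS)

    # Distinct count per (month, sku), then summed: T1 is counted once per product.
    summed_per_sku = conn.execute(
        """
        SELECT month, SUM(cnt)
        FROM (SELECT month, sku, COUNT(DISTINCT transaction_id) AS cnt
              FROM hits
              GROUP BY month, sku)
        GROUP BY month
        """
    ).fetchall()

    # Distinct count taken directly at the month level: each order counted once.
    counted_per_month = conn.execute(
        "SELECT month, COUNT(DISTINCT transaction_id) FROM hits GROUP BY month"
    ).fetchall()

    conn.close()
    return summed_per_sku, counted_per_month


print(distinct_counts())
```

Distinct counts are not additive across groups that share the same transaction, so the order count has to be taken at the grain where it should be unique (here, the month) rather than summed up from the product grain.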

Why did my CASE WHEN give me an aggregation error message?

I'm trying to build a promo grouping from a single promo_code field for one month. A single customer_ID may have more than one transaction, and those transactions could have two different promo codes.
SELECT customer_id AS buyer,
CASE
WHEN COUNT(DISTINCT flag_promo) = 2 THEN 'Mixed'
WHEN COUNT(DISTINCT flag_promo) = 1 AND flag_promo = 1 THEN 'Promo'
WHEN COUNT(DISTINCT flag_promo) = 1 AND flag_promo = 0 THEN 'Organic'
END AS promo_group
FROM TABLE
WHERE DATE BETWEEN '2019-04-01' AND '2019-04-30'
GROUP BY 1
ORDER BY 2
It gave me an error message:
SELECT list expression references column flag_promo which is neither grouped nor aggregated at [4:41]
Below is for BigQuery Standard SQL
#standardSQL
SELECT customer_id AS buyer,
CASE
WHEN COUNT(DISTINCT flag_promo) > 1 THEN 'Mixed'
WHEN ANY_VALUE(flag_promo) = 1 THEN 'Promo'
WHEN ANY_VALUE(flag_promo) = 0 THEN 'Organic'
END AS promo_group
FROM `project.dataset.table`
WHERE DATE BETWEEN '2019-04-01' AND '2019-04-30'
GROUP BY 1
ORDER BY 2
This is the query I think you intended to do:
SELECT
customer_id AS buyer,
CASE WHEN COUNT(DISTINCT flag_promo) = 2 THEN 'Mixed'
WHEN COUNT(DISTINCT flag_promo) = 1 AND MIN(flag_promo) = 1 THEN 'Promo'
WHEN COUNT(DISTINCT flag_promo) = 1 AND MIN(flag_promo) = 2 THEN 'Organic'
END AS promo_group
FROM TABLE
WHERE
DATE BETWEEN '2019-04-01' AND '2019-04-30'
GROUP BY 1
ORDER BY 2;
This assumes that a flag_promo value of 1 means Promo and a value of 2 means Organic. If not, then we can easily edit the above query.
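The rule behind the error is that, inside a grouped query, every column in the CASE has to be wrapped in an aggregate. As a rough sketch of that pattern, the example below runs a MIN()-based version against SQLite through Python's sqlite3 module. This is an assumption for the sake of a runnable example, not BigQuery itself. It uses the 1 = Promo, 0 = Organic coding from the question; the table name tx, the helper promo_groups and the rows are hypothetical.

```python
import sqlite3

# Hypothetical transactions: (customer_id, flag_promo)
TRANSACTIONS = [(1, 1), (1, 0), (2, 1), (2, 1), (3, 0)]


def promo_groups():
    """Classify each customer as 'Mixed', 'Promo' or 'Organic'."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tx (customer_id INTEGER, flag_promo INTEGER)")
    conn.executemany("INSERT INTO tx VALUES (?, ?)", TRANSACTIONS)
    result = conn.execute(
        """
        SELECT customer_id AS buyer,
               CASE
                   -- more than one distinct flag: the customer used both kinds
                   WHEN COUNT(DISTINCT flag_promo) > 1 THEN 'Mixed'
                   -- only one distinct flag left, so MIN() is that flag
                   WHEN MIN(flag_promo) = 1 THEN 'Promo'
                   WHEN MIN(flag_promo) = 0 THEN 'Organic'
               END AS promo_group
        FROM tx
        GROUP BY customer_id
        ORDER BY customer_id
        """
    ).fetchall()
    conn.close()
    return result


print(promo_groups())
```

Once the 'Mixed' branch has been ruled out, all rows of the group share one flag value, so MIN(), MAX() or BigQuery's ANY_VALUE() all return the same thing.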

How to get count of a particular row

I have a table that contains Id, Date and Status (i.e. open/close).
I just want a SQL result that contains the month-wise open, close and total counts of Ids,
e.g. in Jan: open count 15, close count 5 and total count 20.
Use ROLLUP() and GROUP BY as below:
;WITH T AS
(
SELECT
Id,
DATENAME(MONTH,[Date]) AS [MonthName],
Status
FROM #tblTest
)
SELECT
[MonthName],
[Status],
StatusCount
FROM
(
SELECT
MonthName,
CASE ISNULL(Status,'') WHEN '' THEN 'Total' ELSE Status END AS Status,
Count(Status) AS StatusCount
FROM T
GROUP BY ROLLUP([MonthName],[Status])
)X
WHERE X.MonthName IS NOT NULL
ORDER BY X.[MonthName],X.[Status]
Note: if the data is required in a single row per month, apply PIVOT.
select year(date), month(date),
sum(case when status = 'open' then 1 else 0 end) as open_count,
sum(case when status = 'closed' then 1 else 0 end) as closed_count,
count(*) as total_count
from your_table
group by year(date), month(date)
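For a quick check of the single-row-per-month form, the following sketch runs the same conditional-aggregation pattern against SQLite through Python's sqlite3 module. SQLite is an assumption made for a self-contained example, so strftime replaces year() and month(). The table name tickets, the helper monthly_status_counts, the sample rows and the status value 'close' (taken from the wording of the question) are hypothetical.

```python
import sqlite3

# Hypothetical rows: (id, date, status)
TICKETS = [
    (1, "2023-01-03", "open"),
    (2, "2023-01-10", "open"),
    (3, "2023-01-15", "close"),
    (4, "2023-02-01", "close"),
]


def monthly_status_counts():
    """Return (year, month, open_count, close_count, total_count) per month."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tickets (id INTEGER, date TEXT, status TEXT)")
    conn.executemany("INSERT INTO tickets VALUES (?, ?, ?)", TICKETS)
    result = conn.execute(
        """
        SELECT CAST(strftime('%Y', date) AS INTEGER) AS yr,
               CAST(strftime('%m', date) AS INTEGER) AS mon,
               SUM(CASE WHEN status = 'open' THEN 1 ELSE 0 END) AS open_count,
               SUM(CASE WHEN status = 'close' THEN 1 ELSE 0 END) AS close_count,
               COUNT(*) AS total_count
        FROM tickets
        GROUP BY yr, mon
        ORDER BY yr, mon
        """
    ).fetchall()
    conn.close()
    return result


print(monthly_status_counts())
```

Grouping by year as well as month keeps January of different years apart, which grouping by month name alone would not.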