SQL - Trying to find customers by their shopping channel

Hello, I am trying to find the customers who shop only online, the customers who shop only in store, and the customers who shop both online and in store (crossover customers). When I add the three groups together, they should equal my total number of customers.
I also need to split each group into new and returning customers. I need a SQL query that gives me all the new and returning customers who have shopped only in store, then, in a separate table, the new and returning customers who have shopped only online, and finally those who have shopped both online and in store. Added together, the three tables should equal my total customers in each category (new and returning).
It should look like below:
(image: how the data should look)
I have created a sample database as well. I am also trying to break the customers down by new and returning, and later by their age range.
https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=96a7b85c8ca0da7f7c40f20205964d9b
These are some of the queries I have tried. The one below shows the new and returning customers who have only bought online:
SELECT
DECODE(is_new, 1, 'New Customers', 'Returning Customers') type_of_customer,
COUNT(distinct individual_id) count_of_customers,
SUM(count_of_transactions) count_of_transactions,
SUM(sum_of_quantity) sum_of_quantity
FROM (
SELECT
individual_id,
SUM(dollar_value_us),
sum(quantity) sum_of_quantity,
count(distinct transaction_number) count_of_transactions,
CASE WHEN MIN(txn_date) = min_txn_date THEN 1 ELSE 0 END is_new
FROM (
SELECT
a.individual_id,
a.dollar_value_us,
a.txn_date,
a.quantity,
a.transaction_number,
b.gender,
b.age,
MIN(a.txn_date) OVER(PARTITION BY a.individual_id) min_txn_date,
A.TRANTYPE
FROM transaction_detail_mv a
join gender_details b on a.individual_id = b.individual_id
WHERE
a.brand_org_code = 'BRAND'
AND a.is_merch = 1
AND a.currency_code = 'USD'
AND a.line_item_amt_type_cd = 'S'
AND a.individual_id not in (select individual_id from transaction_detail_mv where trantype = 'POS' )
)
WHERE
txn_date >= TO_DATE('10-02-2019', 'DD-MM-YYYY')
AND txn_date < TO_DATE('17-02-2019', 'DD-MM-YYYY')
GROUP BY
individual_id,
min_txn_date
)
GROUP BY is_new
And the one below finds the new and returning customers who have only bought from POS:
SELECT
DECODE(is_new, 1, 'New Customers', 'Returning Customers') type_of_customer,
COUNT(distinct individual_id) count_of_customers,
SUM(count_of_transactions) count_of_transactions,
SUM(sum_of_quantity) sum_of_quantity
FROM (
SELECT
individual_id,
SUM(dollar_value_us),
sum(quantity) sum_of_quantity,
count(distinct transaction_number) count_of_transactions,
CASE WHEN MIN(txn_date) = min_txn_date THEN 1 ELSE 0 END is_new
FROM (
SELECT
a.individual_id,
a.dollar_value_us,
a.txn_date,
a.quantity,
a.transaction_number,
b.gender,
b.age,
MIN(a.txn_date) OVER(PARTITION BY a.individual_id) min_txn_date,
A.TRANTYPE
FROM transaction_detail_mv a
join gender_details b on a.individual_id = b.individual_id
WHERE
a.brand_org_code = 'BRAND'
AND a.is_merch = 1
AND a.currency_code = 'USD'
AND a.line_item_amt_type_cd = 'S'
AND a.individual_id not in (select individual_id from transaction_detail_mv where trantype = 'ONLINE' )
)
WHERE
txn_date >= TO_DATE('10-02-2019', 'DD-MM-YYYY')
AND txn_date < TO_DATE('17-02-2019', 'DD-MM-YYYY')
GROUP BY
individual_id,
min_txn_date
)
GROUP BY is_new
What I still need is the new and returning customers who have shopped both online and in POS. Please help!

You are almost there. Try this:
SELECT
DECODE(is_new, 1, 'New Customers', 'Returning Customers') type_of_customer,
COUNT(distinct individual_id) count_of_customers,
SUM(count_of_transactions) count_of_transactions,
SUM(sum_of_quantity) sum_of_quantity
FROM (
SELECT
individual_id,
SUM(dollar_value_us),
sum(quantity) sum_of_quantity,
count(distinct transaction_number) count_of_transactions,
CASE WHEN MIN(txn_date) = min_txn_date THEN 1 ELSE 0 END is_new
FROM (
SELECT
a.individual_id,
a.dollar_value_us,
a.txn_date,
a.quantity,
a.transaction_number,
b.gender,
b.age,
MIN(a.txn_date) OVER(PARTITION BY a.individual_id) min_txn_date,
A.TRANTYPE
FROM transaction_detail_mv a
join gender_details b on a.individual_id = b.individual_id
WHERE
a.brand_org_code = 'BRAND'
AND a.is_merch = 1
AND a.currency_code = 'USD'
AND a.line_item_amt_type_cd = 'S'
AND a.individual_id in (select individual_id from transaction_detail_mv where trantype = 'ONLINE')
AND a.individual_id in (select individual_id from transaction_detail_mv where trantype = 'POS')
)
WHERE
txn_date >= TO_DATE('10-02-2019', 'DD-MM-YYYY')
AND txn_date < TO_DATE('17-02-2019', 'DD-MM-YYYY')
GROUP BY
individual_id,
min_txn_date
)
GROUP BY is_new
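As a cross-check that the three groups add up to the total, the channel of each customer can also be derived in one pass with conditional aggregation. The following is only a minimal sketch run against SQLite from Python: the table `txn` and its sample rows are hypothetical stand-ins for `transaction_detail_mv`, and it assumes `trantype` only ever holds 'ONLINE' or 'POS'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-in for transaction_detail_mv.
conn.executescript("""
CREATE TABLE txn (individual_id INTEGER, trantype TEXT);
INSERT INTO txn VALUES
  (1, 'ONLINE'), (1, 'ONLINE'),
  (2, 'POS'),
  (3, 'ONLINE'), (3, 'POS'),
  (4, 'POS'), (4, 'POS');
""")

# Inner query: one row per customer with a channel label.
# Outer query: number of customers per label.
query = """
SELECT channel, COUNT(*) AS customers
FROM (
    SELECT individual_id,
           CASE
               WHEN MAX(CASE WHEN trantype = 'ONLINE' THEN 1 ELSE 0 END) = 1
                AND MAX(CASE WHEN trantype = 'POS' THEN 1 ELSE 0 END) = 1
                   THEN 'BOTH'
               WHEN MAX(CASE WHEN trantype = 'ONLINE' THEN 1 ELSE 0 END) = 1
                   THEN 'ONLINE_ONLY'
               ELSE 'POS_ONLY'  -- assumption: only two channels exist
           END AS channel
    FROM txn
    GROUP BY individual_id
) AS per_customer
GROUP BY channel
ORDER BY channel
"""
channel_counts = dict(conn.execute(query).fetchall())
total_customers = conn.execute(
    "SELECT COUNT(DISTINCT individual_id) FROM txn"
).fetchone()[0]

print(channel_counts)   # {'BOTH': 1, 'ONLINE_ONLY': 1, 'POS_ONLY': 2}
print(total_customers)  # 4
```

Because every customer receives exactly one label, the three counts always sum to the distinct customer count. The new/returning flag from the queries above could be added as a second grouping column in the same way.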

Related

UNION ALL IS NOT WORKING WITH GROUP BY IN SQL

I have to create a query that shows, for each employee, the hours recorded on the time card by type (overtime, regular pay, double time) and whether they have taken any "Vacation" leave.
The query I have so far is:
SELECT person_number,
Sum(
CASE
WHEN element = 'Overtime' THEN measure_hours
END) AS overtime_measure_hours,
Sum(
CASE
WHEN element LIKE 'Regular Pay' THEN measure_hours
END) AS regular_measure_hours,
Sum(
CASE
WHEN element IN ( 'Double_Time' ) THEN measure_hours
END) AS Other_amount,
Max(
CASE
WHEN element IN ( 'Double_Time' ) THEN 'Double'
END) AS Other_code
FROM (
SELECT papf.person_number,
atrb.attribute_category element,
atrb.measure measure_hours,
rec.start_date,
rec.end_date
FROM per_all_people_f papf,
hwm_tm_rec rec,
fusion.hwm_tm_rep_atrbs atrb,
fusion.hwm_tm_rep_atrb_usages ausage,
hwm_tm_statuses status
WHERE 1=1
AND atrb.tm_rep_atrb_id = ausage.tm_rep_atrb_id
AND ausage.usages_source_id = rec.tm_rec_id
AND ausage.usages_source_version = rec.tm_rec_version
AND status.tm_bldg_blk_id = rec.tm_rec_id
AND status.tm_bldg_blk_version = rec.tm_rec_version
AND rec.tm_rec_type IN ( 'RANGE',
'MEASURE' )
AND papf.person_number = '101928'
AND trunc (status.date_to) = to_date ('31/12/4712', 'DD/MM/YYYY')
--and atrb.attribute_category in( 'Overtime','Regular Pay', 'Double_Time')
AND trunc (sh21.start_time) BETWEEN trunc(:P_From_Date) AND trunc(:P_To_Date) )
GROUP BY person_number
UNION ALL
SELECT person_number,
sum(
CASE
WHEN element = 'Overtime' THEN measure_hours
END) AS overtime_measure_hours,
sum(
CASE
WHEN element LIKE 'Regular Pay' THEN measure_hours
END) AS regular_measure_hours,
sum(
CASE
WHEN absence_name IN ( 'Vacation' ) THEN abs_duration
END) AS Other_amount,
max(
CASE
WHEN absence_name IN ( 'Vacation' ) THEN absence_name
END) AS Other_code
FROM (
SELECT papf.person_number,
atrb.attribute_category element,
atrb.measure measure_hours,
rec.start_date,
rec.end_date,
abs.NAME absence_name,
abs.duration abs_duration
FROM per_all_people_f papf,
hwm_tm_rec rec,
fusion.hwm_tm_rep_atrbs atrb,
fusion.hwm_tm_rep_atrb_usages ausage,
hwm_tm_statuses status,
anc_absence_types_vl abs
WHERE 1=1
AND atrb.tm_rep_atrb_id = ausage.tm_rep_atrb_id
AND ausage.usages_source_id = rec.tm_rec_id
AND ausage.usages_source_version = rec.tm_rec_version
AND status.tm_bldg_blk_id = rec.tm_rec_id
AND status.tm_bldg_blk_version = rec.tm_rec_version
AND rec.tm_rec_type IN ( 'RANGE',
'MEASURE' )
AND papf.person_number = '101928'
AND trunc (status.date_to) = to_date ('31/12/4712', 'DD/MM/YYYY')
--and atrb.attribute_category in( 'Overtime','Regular Pay', 'Double_Time')
AND trunc (sh21.start_time) BETWEEN trunc(:P_From_Date) AND trunc(:P_To_Date)
AND abs.absence_type_id = atrb.absence_type_id )
GROUP BY person_number
Although these queries work separately, they do not work once I combine them with UNION ALL and keep the GROUP BY on each subquery.
If I merge the two queries into one, the issue I am having is with Other_code and Other_amount: these columns take their values from two different tables.
How can I solve this?
For p_from_date = 01-Jan-2021 and p_to_date = 31-Jul-2021 the output should be:
Person_Number  Overtime_measure_hours  Regular_Measure_hours  Other_code   Other_amount
101928         18                      15                     Double_Time  34
101928         18                      15                     Vacation     1
i.e. Overtime_measure_hours, Regular_Measure_hours and Other_amount should each hold the sum of the corresponding values.
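One possible shape for such a query, offered only as a sketch under simplifying assumptions: aggregate the overtime and regular hours once per person, build the "other" rows as a UNION ALL of two grouped branches (one per source table), and join the two results. The tables `timecard` and `absence` and their rows are hypothetical stand-ins for the Fusion tables above, and the example runs on SQLite from Python rather than on Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-ins for the time card and absence tables.
conn.executescript("""
CREATE TABLE timecard (person_number TEXT, element TEXT, measure_hours INTEGER);
CREATE TABLE absence  (person_number TEXT, absence_name TEXT, abs_duration INTEGER);
INSERT INTO timecard VALUES
  ('101928', 'Overtime', 10), ('101928', 'Overtime', 8),
  ('101928', 'Regular Pay', 15),
  ('101928', 'Double_Time', 20), ('101928', 'Double_Time', 14);
INSERT INTO absence VALUES ('101928', 'Vacation', 1);
""")

query = """
SELECT t.person_number, t.overtime_hours, t.regular_hours,
       o.other_code, o.other_amount
FROM (
    -- one row per person with the shared totals
    SELECT person_number,
           SUM(CASE WHEN element = 'Overtime'    THEN measure_hours END) AS overtime_hours,
           SUM(CASE WHEN element = 'Regular Pay' THEN measure_hours END) AS regular_hours
    FROM timecard
    GROUP BY person_number
) AS t
JOIN (
    -- one row per person and "other" code; each branch has its own GROUP BY
    SELECT person_number, 'Double_Time' AS other_code, SUM(measure_hours) AS other_amount
    FROM timecard
    WHERE element = 'Double_Time'
    GROUP BY person_number
    UNION ALL
    SELECT person_number, absence_name, SUM(abs_duration)
    FROM absence
    WHERE absence_name = 'Vacation'
    GROUP BY person_number, absence_name
) AS o ON o.person_number = t.person_number
ORDER BY o.other_code
"""
leave_rows = conn.execute(query).fetchall()
for row in leave_rows:
    print(row)
```

With the sample rows this yields one line for Double_Time (34) and one for Vacation (1), each carrying the same overtime (18) and regular (15) totals, which matches the layout requested above.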

How to count distinct in different time window in AWS redshift

SELECT count(DISTINCT(c.visitid)) as count1,
t.prod
FROM x.t1 c
JOIN y.t1 t
ON c.headingid = t.prod_heading_id
WHERE
c.eventtimestamp BETWEEN '2021-01-01' AND '2021-04-02'
AND c.evaluation > 0
GROUP BY t.prod
ORDER BY count1 DESC
LIMIT 100
I have another time window, from '2020-01-01' to '2020-04-02', and I want to do the same count per group as count2.
You can use conditional aggregation:
SELECT count(DISTINCT c.visitid) filter (where c.eventtimestamp BETWEEN '2021-01-01' AND '2021-04-02') as cnt1,
count(DISTINCT c.visitid) filter (where c.eventtimestamp BETWEEN '2020-01-01' AND '2020-04-02') as cnt2,
t.prod
FROM x.t1 c JOIN
y.t1 t
ON c.headingid = t.prod_heading_id
WHERE c.evaluation > 0 AND
c.eventtimestamp BETWEEN '2020-01-01' AND '2021-04-02'
GROUP BY t.prod
ORDER BY cnt1 DESC;
In Redshift (and many other databases), the equivalent uses CASE inside the aggregate:
SELECT count(DISTINCT case when c.eventtimestamp BETWEEN '2021-01-01' AND '2021-04-02' then c.visitid end) as cnt1,
count(DISTINCT case when c.eventtimestamp BETWEEN '2020-01-01' AND '2020-04-02' then c.visitid end) as cnt2,
t.prod
FROM x.t1 c JOIN
y.t1 t
ON c.headingid = t.prod_heading_id
WHERE c.evaluation > 0 AND
c.eventtimestamp BETWEEN '2020-01-01' AND '2021-04-02'
GROUP BY t.prod
ORDER BY cnt1 DESC;
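The CASE form can be tried on a small data set. The sketch below runs it on SQLite from Python; the single table `visits` and its rows are hypothetical (the join to the product table is left out), and the dates are stored as ISO-8601 strings so that BETWEEN compares them correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, flattened stand-in for the joined x.t1 / y.t1 tables.
conn.executescript("""
CREATE TABLE visits (visitid INTEGER, prod TEXT, eventtimestamp TEXT);
INSERT INTO visits VALUES
  (1, 'A', '2021-01-15'),
  (1, 'A', '2021-02-01'),
  (2, 'A', '2021-03-10'),
  (3, 'A', '2020-02-20'),
  (4, 'B', '2020-01-05'),
  (4, 'B', '2020-03-01'),
  (5, 'B', '2019-06-01');
""")

# CASE returns NULL outside the window, and COUNT(DISTINCT ...) ignores NULLs,
# so each column counts only the visit ids that fall in its own window.
query = """
SELECT prod,
       COUNT(DISTINCT CASE WHEN eventtimestamp BETWEEN '2021-01-01' AND '2021-04-02'
                           THEN visitid END) AS cnt1,
       COUNT(DISTINCT CASE WHEN eventtimestamp BETWEEN '2020-01-01' AND '2020-04-02'
                           THEN visitid END) AS cnt2
FROM visits
GROUP BY prod
ORDER BY prod
"""
window_counts = conn.execute(query).fetchall()
print(window_counts)  # [('A', 2, 1), ('B', 0, 1)]
```

Visit 1 appears twice in the 2021 window but is counted once, and visit 5 falls outside both windows, so it contributes to neither column.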

Transfer table from vertical to horizontal in sql

I want to transform a table from a vertical layout like this:
to a horizontal one that looks like this:
I am not sure how to approach it.
Code I had is this
FROM TABLE A,
(SELECT PART_NO, SUM(QUANTITY) AS Type_1
FROM TABLE
WHERE STYPE = 'Type_1'
GROUP BY PART_NO, STYPE) B,
(SELECT PART_NO, SUM(QUANTITY) AS Type_2
FROM TABLE
WHERE STYPE = 'Type_2'
GROUP BY PART_NO, STYPE) C,
(SELECT PART_NO, SUM(QUANTITY) AS Type_3
FROM TABLE
WHERE STYPE = 'Type_3'
GROUP BY PART_NO, STYPE) D,
(SELECT PART_NO, SUM(QUANTITY) AS Type_4
FROM TABLE
WHERE STYPE = 'Type_4'
GROUP BY PART_NO, STYPE) E
WHERE A.PART_NO = B.PART_NO
AND A.PART_NO = C.PART_NO
AND A.PART_NO = D.PART_NO
AND A.PART_NO = E.PART_NO
GROUP BY A.PART_NO, B.Type_1, C.Type_2, D.Type_3, E.Type_4
But it removes the rows that have NaN (missing) values. I am not sure what I did wrong.
This is a type of pivoting that you can produce using conditional aggregation.
For example, you can do:
select
part_no,
sum(case when stype = 'Type_1' then quantity end) as type_1,
sum(case when stype = 'Type_2' then quantity end) as type_2,
sum(case when stype = 'Type_3' then quantity end) as type_3,
sum(case when stype = 'Type_4' then quantity end) as type_4
from my_table
group by part_no
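The rows disappear in the original attempt because every subquery is inner-joined on PART_NO, so a part that lacks even one type drops out. Conditional aggregation keeps such parts and leaves the missing cell NULL. A minimal sketch on SQLite from Python, with hypothetical sample rows and the `stype` values assumed to be spelled as in the question ('Type_1', 'Type_2', ...):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: P1 has no Type_3, P2 has only Type_3.
conn.executescript("""
CREATE TABLE my_table (part_no TEXT, stype TEXT, quantity INTEGER);
INSERT INTO my_table VALUES
  ('P1', 'Type_1', 5),
  ('P1', 'Type_1', 3),
  ('P1', 'Type_2', 2),
  ('P2', 'Type_3', 7);
""")

query = """
SELECT part_no,
       SUM(CASE WHEN stype = 'Type_1' THEN quantity END) AS type_1,
       SUM(CASE WHEN stype = 'Type_2' THEN quantity END) AS type_2,
       SUM(CASE WHEN stype = 'Type_3' THEN quantity END) AS type_3
FROM my_table
GROUP BY part_no
ORDER BY part_no
"""
pivoted = conn.execute(query).fetchall()
print(pivoted)  # [('P1', 8, 2, None), ('P2', None, None, 7)]
```

Both parts survive even though neither has all types. Adding `ELSE 0` inside each CASE would turn the NULL cells into zeros if that is preferred.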

Oracle SQL: SUM of 'amount_sold' rows for each client

I need to display, for each client, the total (sum) of amount_sold as a total_amount field, and to keep only the clients whose total_amount is greater than or equal to cust_credit_limit:
total_amount >= cust_credit_limit
My current query:
SELECT CONCAT (CONCAT(cust_first_name,' '),cust_last_name) AS customer_name,
amount_sold,
(CASE WHEN cust_credit_limit<=1500 THEN 'Low limit'
ELSE 'High limit'
END) AS credit_limit_level,
cust_valid
FROM sh.customers JOIN sh.sales
ON customers.cust_id = sales.cust_id
ORDER BY customer_name ASC;
The results now look like this:
But I need only one row for each client with the sum of all amount_sold for this client AS total_amount
EDIT: I tried what was recommended in the comments and it worked. But I have another requirement: to order the results by upper_income_level. When I add
lpad( substr(cust_income_level, instr( cust_income_level, '-') + 2 ), 9, '0') AS upper_income_level,
I get the error "not a GROUP BY expression":
SELECT CONCAT (CONCAT(cust_first_name,' '),cust_last_name) AS customer_name,
lpad( substr(cust_income_level, instr( cust_income_level, '-') + 2 ), 9, '0') AS upper_income_level,
SUM(amount_sold) as total_sold,
(CASE WHEN cust_credit_limit<=1500 THEN 'Low limit'
ELSE 'High limit'
END) AS credit_limit_level,
cust_valid
FROM sh.customers JOIN sh.sales
ON customers.cust_id = sales.cust_id
WHERE cust_valid = 'A' AND cust_income_level LIKE '%-%'
GROUP BY
CONCAT (CONCAT(cust_first_name,' '),cust_last_name),
cust_valid,
cust_credit_limit
HAVING SUM(amount_sold) >= 50*cust_credit_limit
ORDER BY upper_income_level DESC, customer_name ASC;
Your query deals with customers and their total sales. So select from the customers table and join the aggregated total sales:
select
c.cust_first_name || ' ' || c.cust_last_name as customer_name,
to_number
(
regexp_substr(c.cust_income_level , '[0123456789,]+$'),
'999999999D999',
'nls_numeric_characters = '',.'''
) as upper_income_level,
s.total_sale,
case when c.cust_credit_limit <= 1500
then 'Low limit'
else 'High limit'
end as credit_limit_level,
c.cust_valid
from sh.customers c
join
(
select cust_id, sum(amount_sold) as total_sale
from sh.sales
group by cust_id
) s on s.cust_id = c.cust_id
and s.total_sale >= c.cust_credit_limit
where c.cust_valid = 'A'
and c.cust_income_level like '%-%'
order by upper_income_level desc, customer_name;
Add a SUM and a GROUP BY:
SELECT CONCAT (CONCAT(cust_first_name,' '),cust_last_name) AS customer_name,
SUM(amount_sold) as total_sold,
(CASE WHEN cust_credit_limit<=1500 THEN 'Low limit'
ELSE 'High limit'
END) AS credit_limit_level,
cust_valid
FROM sh.customers JOIN sh.sales
ON customers.cust_id = sales.cust_id
GROUP BY
CONCAT (CONCAT(cust_first_name,' '),cust_last_name),
cust_valid,
cust_credit_limit
HAVING SUM(amount_sold) >= cust_credit_limit
ORDER BY customer_name ASC;
Tips:
You might find that your CONCAT accepts multiple parameters and concatenates all of them, e.g. CONCAT(first_name, ' ', last_name).
Rules of GROUP BY: anything in your SELECT that is not inside SUM, AVG, or a similar aggregate function must be in the GROUP BY. At the time GROUP BY is evaluated, the aliases in the select list do not exist yet, so you must instead use the expression that produces the result (or some child part of it, such that the whole expression can be computed from the grouped values).
HAVING is like a WHERE clause that is applied after the GROUP BY, whereas WHERE is applied before it. HAVING can therefore reference grouped and aggregated expressions, but everything it references must be either grouped or aggregated.
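The difference between WHERE and HAVING can be seen on a few rows. This is a minimal sketch on SQLite from Python, not the Oracle `sh` schema: the two tables, the single `cust_name` column, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, cut-down versions of sh.customers and sh.sales.
conn.executescript("""
CREATE TABLE customers (cust_id INTEGER, cust_name TEXT,
                        cust_valid TEXT, cust_credit_limit INTEGER);
CREATE TABLE sales (cust_id INTEGER, amount_sold INTEGER);
INSERT INTO customers VALUES
  (1, 'Ann Lee', 'A', 1000),
  (2, 'Bob Ray', 'A', 5000),
  (3, 'Cy Dow',  'I', 100);
INSERT INTO sales VALUES (1, 600), (1, 700), (2, 900), (3, 500);
""")

query = """
SELECT c.cust_name, SUM(s.amount_sold) AS total_sold
FROM customers c
JOIN sales s ON s.cust_id = c.cust_id
WHERE c.cust_valid = 'A'                            -- row filter, before grouping
GROUP BY c.cust_id, c.cust_name, c.cust_credit_limit
HAVING SUM(s.amount_sold) >= c.cust_credit_limit    -- group filter, after grouping
ORDER BY c.cust_name
"""
over_limit = conn.execute(query).fetchall()
print(over_limit)  # [('Ann Lee', 1300)]
```

Cy Dow is removed by WHERE before any grouping happens, Bob Ray is removed by HAVING because his total of 900 is below his limit, and `cust_credit_limit` can appear in HAVING only because it is listed in the GROUP BY.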
Try to use sum(), group by and having:
SELECT CONCAT (CONCAT(cust_first_name,' '),cust_last_name) AS customer_name,
sum(amount_sold) as total_amount,
(CASE WHEN cust_credit_limit<=1500 THEN 'Low limit'
ELSE 'High limit'
END) AS credit_limit_level,
cust_valid
FROM sh.customers JOIN sh.sales
ON customers.cust_id = sales.cust_id
GROUP BY CONCAT (CONCAT(cust_first_name,' '),cust_last_name),
(CASE WHEN cust_credit_limit<=1500 THEN 'Low limit'
ELSE 'High limit'
END),
cust_valid
HAVING sum(amount_sold)>=max(cust_credit_limit)
ORDER BY customer_name ASC;
If you have two customers with the same name, cust_id must be in the GROUP BY, otherwise their sales are added together:
SELECT customers.cust_id,
CONCAT (CONCAT(cust_first_name,' '),cust_last_name) AS customer_name,
sum(nvl(amount_sold,0)) amount_sold
FROM sh.customers JOIN sh.sales
ON customers.cust_id = sales.cust_id
group by customers.cust_id,CONCAT (CONCAT(cust_first_name,' '),cust_last_name)
having sum(nvl(amount_sold,0)) >= max(cust_credit_limit)
ORDER BY customer_name ASC;
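The effect of leaving cust_id out of the GROUP BY is easy to reproduce. The sketch below uses SQLite from Python with hypothetical tables and rows, including two different customers who are both called Ann Lee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: customers 1 and 2 are different people with the same name.
conn.executescript("""
CREATE TABLE customers (cust_id INTEGER, cust_first_name TEXT, cust_last_name TEXT);
CREATE TABLE sales (cust_id INTEGER, amount_sold INTEGER);
INSERT INTO customers VALUES (1, 'Ann', 'Lee'), (2, 'Ann', 'Lee'), (3, 'Bob', 'Ray');
INSERT INTO sales VALUES (1, 100), (2, 250), (3, 40);
""")

# Grouping by the name only merges the two Ann Lees into one row.
by_name = conn.execute("""
SELECT c.cust_first_name || ' ' || c.cust_last_name AS customer_name,
       SUM(s.amount_sold) AS total_amount
FROM customers c JOIN sales s ON s.cust_id = c.cust_id
GROUP BY c.cust_first_name || ' ' || c.cust_last_name
ORDER BY customer_name
""").fetchall()

# Adding cust_id to the GROUP BY keeps them apart.
by_id = conn.execute("""
SELECT c.cust_id,
       c.cust_first_name || ' ' || c.cust_last_name AS customer_name,
       SUM(s.amount_sold) AS total_amount
FROM customers c JOIN sales s ON s.cust_id = c.cust_id
GROUP BY c.cust_id, c.cust_first_name || ' ' || c.cust_last_name
ORDER BY c.cust_id
""").fetchall()

print(by_name)  # [('Ann Lee', 350), ('Bob Ray', 40)]
print(by_id)    # [(1, 'Ann Lee', 100), (2, 'Ann Lee', 250), (3, 'Bob Ray', 40)]
```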

How should I combine these SQL queries: self-join or use temporary tables?

I'm creating a stored procedure that pulls aggregate sums from a few different tables. Taken separately, the queries are simple; they differ only in their filters.
The queries need to be joined together and are as follows:
select distinct(bus_name), sum(act) as 'totrev', sum(budget) as 'budget rev'
from finance
where year = '2011'
and type_desc = 'rev'
group by bus_code, bus_name
order by bus_name asc
select distinct(bus_name), sum(act) as 'totalexp', sum(budget) as 'budget exp'
from finance
where year = '2011'
and type_desc = 'exp'
group by bus_code, bus_name
order by bus_name asc
select distinct(bus_name), sum(end_balance) as 'total assets'
from Balance
where year = '2011'
and type_desc = 'assets'
group by bus_code, bus_name
order by bus_name asc
select distinct(bus_name), sum(end_balance) as 'Cash'
from Balance
where year = '2011'
and type_desc = 'equity'
group by bus_code, bus_name
order by bus_name asc
select bus_code, bus_name, count(bus_code) as '#of bldgs'
from building
group by bus_code, bus_name
order by bus_name asc
I'm looking to merge/join all the columns to be viewed essentially in one table.
finance_table
columns = bus_code, bus_name, # of bldgs, tot_rev, budget_rev, totalexp, budget exp, total assets, cash
Try something like this by using nested queries:
SELECT T5.bus_code, T5.bus_name, T5.[# of bldgs], T1.totrev, T1.[budget rev], T2.totalexp, T2.[budget exp], T3.[total assets], T4.Cash
FROM
(
select distinct(bus_name), sum(act) as 'totrev', sum(budget) as 'budget rev'
from finance
where year = '2011'
and type_desc = 'rev'
group by bus_code, bus_name
) T1 INNER JOIN
(
select distinct(bus_name), sum(act) as 'totalexp', sum(budget) as 'budget exp'
from finance
where year = '2011'
and type_desc = 'exp'
group by bus_code, bus_name
) T2 ON T1.bus_name = T2.bus_name
INNER JOIN
(
select distinct(bus_name), sum(end_balance) as 'total assets'
from Balance
where year = '2011'
and type_desc = 'assets'
group by bus_code, bus_name
) T3 ON T2.bus_name = T3.bus_name
INNER JOIN
(
select distinct(bus_name), sum(end_balance) as 'Cash'
from Balance
where year = '2011'
and type_desc = 'equity'
group by bus_code, bus_name
) T4 ON T3.bus_name = T4.bus_name
INNER JOIN
(
select bus_code, bus_name, count(bus_code) as '# of bldgs'
from building
group by bus_code, bus_name
) T5 ON T4.bus_name = T5.bus_name
ORDER BY T5.bus_name asc
I assumed inner joins, but you may need outer joins if some of these subqueries have no row for a particular business. The general technique would be the same.
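To illustrate the inner versus outer join point, here is a minimal sketch on SQLite from Python. The `finance` table is cut down to hypothetical columns and rows, and the business 'Beta' deliberately has revenue but no expense rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced finance table: Beta has no 'exp' rows.
conn.executescript("""
CREATE TABLE finance (bus_code TEXT, bus_name TEXT, type_desc TEXT, act INTEGER);
INSERT INTO finance VALUES
  ('B1', 'Alpha', 'rev', 100),
  ('B1', 'Alpha', 'exp', 40),
  ('B2', 'Beta',  'rev', 70);
""")

# The same two aggregated subqueries, joined with either an inner or a left join.
template = """
SELECT r.bus_name, r.totrev, e.totalexp
FROM (SELECT bus_code, bus_name, SUM(act) AS totrev
      FROM finance WHERE type_desc = 'rev'
      GROUP BY bus_code, bus_name) AS r
{join_type} JOIN
     (SELECT bus_code, SUM(act) AS totalexp
      FROM finance WHERE type_desc = 'exp'
      GROUP BY bus_code) AS e
  ON e.bus_code = r.bus_code
ORDER BY r.bus_name
"""
inner_rows = conn.execute(template.format(join_type="INNER")).fetchall()
left_rows = conn.execute(template.format(join_type="LEFT")).fetchall()

print(inner_rows)  # [('Alpha', 100, 40)]
print(left_rows)   # [('Alpha', 100, 40), ('Beta', 70, None)]
```

The inner join silently drops Beta, while the left join keeps it with a NULL expense total. The sketch joins on bus_code rather than bus_name, on the assumption that the code is the more reliable key.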
If your SQL supports CASE expressions, you can use them to create "virtual" fields for each type, and then sum these up.
select bus_code, bus_name
,sum(case when type_desc = 'rev' then act else 0 end) as 'totrev'
,sum(case when type_desc = 'rev' then budget else 0 end) as 'budget rev'
,sum(case when type_desc = 'exp' then act else 0 end) as 'totexp'
,sum(case when type_desc = 'exp' then budget else 0 end) as 'budget exp'
... ... etc.
from finance
where year = '2011'
group by bus_code, bus_name
order by bus_name asc
The last (building) table can simply be joined to this one on bus_code.
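One caveat when joining the building table: if it is joined row by row before the sums are taken, each finance row repeats once per building and the totals are inflated. A sketch of one way to avoid that, on SQLite from Python, is to count the buildings in their own subquery first. The reduced tables, the `bldg_name` column, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, reduced tables: business B1 has two buildings.
conn.executescript("""
CREATE TABLE finance (bus_code TEXT, bus_name TEXT, year TEXT,
                      type_desc TEXT, act INTEGER);
CREATE TABLE building (bus_code TEXT, bldg_name TEXT);
INSERT INTO finance VALUES
  ('B1', 'Alpha', '2011', 'rev', 100),
  ('B1', 'Alpha', '2011', 'exp', 40),
  ('B2', 'Beta',  '2011', 'rev', 70);
INSERT INTO building VALUES ('B1', 'HQ'), ('B1', 'Annex'), ('B2', 'Depot');
""")

# Aggregate each table separately, then join one row per business to one row per business.
combined = conn.execute("""
SELECT f.bus_code, f.bus_name, b.num_bldgs, f.totrev, f.totexp
FROM (SELECT bus_code, bus_name,
             SUM(CASE WHEN type_desc = 'rev' THEN act ELSE 0 END) AS totrev,
             SUM(CASE WHEN type_desc = 'exp' THEN act ELSE 0 END) AS totexp
      FROM finance
      WHERE year = '2011'
      GROUP BY bus_code, bus_name) AS f
JOIN (SELECT bus_code, COUNT(*) AS num_bldgs
      FROM building
      GROUP BY bus_code) AS b
  ON b.bus_code = f.bus_code
ORDER BY f.bus_code
""").fetchall()

# For contrast: joining building before aggregating doubles B1's revenue.
naive_b1_rev = conn.execute("""
SELECT SUM(CASE WHEN f.type_desc = 'rev' THEN f.act ELSE 0 END)
FROM finance f JOIN building b ON b.bus_code = f.bus_code
WHERE f.bus_code = 'B1'
""").fetchone()[0]

print(combined)      # [('B1', 'Alpha', 2, 100, 40), ('B2', 'Beta', 1, 70, 0)]
print(naive_b1_rev)  # 200
```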