SQL query when joining two tables - sql

I need to join two tables in SQL and count how many customer IDs in table A also appear in table B, that is, how many customers in A also purchased in B, by year. My query is as follows:
SELECT
a.year, count(distinct(a.id),
count (distinct(b.id)
FROM
purchase as A,
purchase2 as B
WHERE
(a.id=b.id)
AND
a.year>2010
GROUP BY a.year
Is this correct? Do I need to include count(distinct(b.id)) in the SELECT list? Do I also need to group by b.year?
Thanks in advance for any assistance.

Change your DISTINCT syntax, and use an explicit INNER JOIN to be sure:
SELECT
A.year,
count(DISTINCT A.id),
count(DISTINCT B.id)
FROM
purchase as A
INNER JOIN purchase2 B ON B.id = A.id
WHERE
A.year>2010
GROUP BY
A.year,
B.year

I think union all and aggregation is the best approach. Start with the information per id/year:
select id, year, max(in_a) as in_a, max(in_b) as in_b
from ((select distinct id, year, 1 as in_a, 0 as in_b
       from purchase
      ) union all
      (select distinct id, year, 0 as in_a, 1 as in_b
       from purchase2
      )
     ) ab
group by id, year;
Then aggregate this by year:
select year,
       sum(in_a) as total_a,
       sum(in_b) as total_b,
       sum(in_a * in_b) as in_both
from (select id, year, max(in_a) as in_a, max(in_b) as in_b
      from ((select distinct id, year, 1 as in_a, 0 as in_b
             from purchase
            ) union all
            (select distinct id, year, 0 as in_a, 1 as in_b
             from purchase2
            )
           ) ab
      group by id, year
     ) iy
group by year;
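To sanity-check the two-step query, here is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module. The sample rows are invented for illustration. The inner query aliases max(in_a) and max(in_b) so the outer query can refer to them, and the parentheses around the UNION ALL branches are dropped because SQLite does not accept them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchase  (id INTEGER, year INTEGER);
    CREATE TABLE purchase2 (id INTEGER, year INTEGER);
    -- Invented sample rows: in 2011 customer 1 appears in both tables,
    -- customer 2 only in purchase, customer 3 only in purchase2.
    INSERT INTO purchase  VALUES (1, 2011), (1, 2011), (2, 2011), (1, 2012);
    INSERT INTO purchase2 VALUES (1, 2011), (3, 2011), (3, 2012);
""")

query = """
    SELECT year,
           SUM(in_a)        AS total_a,
           SUM(in_b)        AS total_b,
           SUM(in_a * in_b) AS in_both
    FROM (SELECT id, year, MAX(in_a) AS in_a, MAX(in_b) AS in_b
          FROM (SELECT DISTINCT id, year, 1 AS in_a, 0 AS in_b FROM purchase
                UNION ALL
                SELECT DISTINCT id, year, 0 AS in_a, 1 AS in_b FROM purchase2
               ) ab
          GROUP BY id, year
         ) iy
    GROUP BY year
    ORDER BY year
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # (year, total_a, total_b, in_both)
```

Because each id/year pair is collapsed to a single row before the outer aggregation, repeated purchases by the same customer in the same year are counted once.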

Related

How to assign 0 to the sum of a field if no entry exists for the GROUP BY field in a Teradata SQL query result

I have the query below, which sums an amount column. As the attached screenshot also shows, the result has no NATURAL PERSON row for Corporates, because the table has no NATURAL PERSON rows with CUST_TYPE = 'Corporates'. Please suggest how to also get a NATURAL PERSON row for Corporates, with 0 assigned to it. I searched for similar questions, but the suggestions there did not give the desired result.
SELECT CUST_TYPE,FINAL_SME_CATEGORY, SUM(CUST_COMPENSATABLE_AMT) AS TOTAL_SUM FROM ddewd10s.FSCS_LIMIT_UTIL_SCV WHERE FINAL_SME_CATEGORY IN ('SMALL','NATURAL PERSON') GROUP BY 1,2 ORDER BY 1,2;
I tried a few queries with ZEROIFNULL, NVL, and COALESCE, but all of them gave the same result. I even tried a CASE expression and still didn't get the desired result.
SELECT CUST_TYPE,FINAL_SME_CATEGORY, COALESCE(SUM(CUST_COMPENSATABLE_AMT), 0) AS TOTAL_SUM FROM ddewd10s.FSCS_LIMIT_UTIL_SCV WHERE FINAL_SME_CATEGORY IN ('SMALL','NATURAL PERSON') GROUP BY 1,2 ORDER BY 1,2;
SELECT CUST_TYPE,FINAL_SME_CATEGORY, ZEROIFNULL(SUM(CUST_COMPENSATABLE_AMT)) AS TOTAL_SUM FROM ddewd10s.FSCS_LIMIT_UTIL_SCV WHERE FINAL_SME_CATEGORY IN ('SMALL','NATURAL PERSON') GROUP BY 1,2 ORDER BY 1,2;
SELECT CUST_TYPE,FINAL_SME_CATEGORY, NVL(SUM(CUST_COMPENSATABLE_AMT),0) AS TOTAL_SUM FROM ddewd10s.FSCS_LIMIT_UTIL_SCV WHERE FINAL_SME_CATEGORY IN ('SMALL','NATURAL PERSON') GROUP BY 1,2 ORDER BY 1,2;
SELECT CUST_TYPE, FINAL_SME_CATEGORY, CASE WHEN SUM(CUST_COMPENSATABLE_AMT)=0 THEN 0 ELSE SUM(CUST_COMPENSATABLE_AMT) END AS TOTAL_SUM FROM ddewd10s.FSCS_LIMIT_UTIL_SCV WHERE FINAL_SME_CATEGORY IN ('SMALL','NATURAL PERSON') GROUP BY 1,2 ORDER BY 1,2;
First, cross join the distinct cust_type and final_sme_category values:
select c.cust_type, s.final_sme_category
from (select distinct cust_type from table) c
cross join (select distinct final_sme_category from table) s
Then left join that result to your GROUP BY query on cust_type and final_sme_category, and use the NVL function to display total_sum, or 0 when there is no matching row.
Sample query (table and value are placeholders for the real table and amount column):
with cross_table as (
    select c.cust_type, s.final_sme_category
    from (select distinct cust_type from table) c
    cross join (select distinct final_sme_category from table) s
),
group_by_result as (
    select cust_type, final_sme_category, sum(value) as total
    from table
    group by cust_type, final_sme_category
)
select cte.cust_type, cte.final_sme_category, nvl(r.total, 0) as total
from cross_table cte
left join group_by_result r
    on cte.cust_type = r.cust_type
    and cte.final_sme_category = r.final_sme_category;
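The cross join plus left join idea can be tried out on a small scale. The sketch below uses Python's sqlite3 module rather than Teradata, so it swaps in COALESCE for NVL. The table name limit_util, the column amt, and the rows are hypothetical stand-ins for the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the real table, with invented rows.
    -- Note there is no ('Corporates', 'NATURAL PERSON') row.
    CREATE TABLE limit_util (cust_type TEXT, final_sme_category TEXT, amt REAL);
    INSERT INTO limit_util VALUES
        ('Corporates', 'SMALL', 100.0),
        ('Corporates', 'SMALL', 50.0),
        ('Retail', 'SMALL', 10.0),
        ('Retail', 'NATURAL PERSON', 25.0);
""")

query = """
    WITH combos AS (
        -- Every cust_type paired with every final_sme_category.
        SELECT c.cust_type, s.final_sme_category
        FROM (SELECT DISTINCT cust_type FROM limit_util) c
        CROSS JOIN (SELECT DISTINCT final_sme_category FROM limit_util) s
    ),
    totals AS (
        SELECT cust_type, final_sme_category, SUM(amt) AS total
        FROM limit_util
        GROUP BY cust_type, final_sme_category
    )
    SELECT combos.cust_type, combos.final_sme_category,
           COALESCE(totals.total, 0) AS total_sum
    FROM combos
    LEFT JOIN totals
           ON totals.cust_type = combos.cust_type
          AND totals.final_sme_category = combos.final_sme_category
    ORDER BY 1, 2
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The earlier attempts with COALESCE or ZEROIFNULL around SUM could not help because there was no group to apply them to. The missing combination only gets a row once the cross join generates it.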

How to retrieve the most frequent value of a column for a specific ID in a table

I'm trying to fetch the most frequent value from a SQLite 3 database table for each specific ID (which is the ID of a company). I have tried with GROUP BY and ORDER BY as well as with COUNT() function.
SELECT company_id, max(car)
FROM car_orders
GROUP by company_id
ORDER by max(car)
For a specific company_id (9) I expect 'Audi' in the result, but I get 'Volkswagen' instead, which is wrong.
Similar to your attempts, consider joining two aggregates: one that calculates the COUNT per company and car, and one that takes the MAX of that count per company. The query below uses CTEs, which were introduced in SQLite version 3.8.3, released in February 2014.
WITH cnt AS (
SELECT company_id, car, COUNT(*) AS car_count
FROM car_orders
GROUP by company_id, car
),
max_cnt AS (
SELECT cnt.company_id, MAX(cnt.car_count) as max_count
FROM cnt
GROUP BY cnt.company_id
)
SELECT cnt.company_id, cnt.car
FROM cnt
INNER JOIN max_cnt
ON cnt.company_id = max_cnt.company_id
AND cnt.car_count = max_cnt.max_count
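As a quick check of the two-aggregate join, here is a minimal sketch using Python's sqlite3 module. The rows are invented: company 9 orders Audi twice and Volkswagen once, and company 7 has a tie, which shows that this approach returns every car that shares the top count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE car_orders (company_id INTEGER, car TEXT);
    -- Invented sample rows.
    INSERT INTO car_orders VALUES
        (9, 'Audi'), (9, 'Audi'), (9, 'Volkswagen'),
        (7, 'BMW'), (7, 'Fiat');
""")

query = """
    WITH cnt AS (
        SELECT company_id, car, COUNT(*) AS car_count
        FROM car_orders
        GROUP BY company_id, car
    ),
    max_cnt AS (
        SELECT company_id, MAX(car_count) AS max_count
        FROM cnt
        GROUP BY company_id
    )
    SELECT cnt.company_id, cnt.car
    FROM cnt
    INNER JOIN max_cnt
            ON cnt.company_id = max_cnt.company_id
           AND cnt.car_count = max_cnt.max_count
    ORDER BY cnt.company_id, cnt.car
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The original attempt returned 'Volkswagen' because max(car) picks the alphabetically greatest car name, not the most frequent one.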
In more recent versions of SQLite (3.25.0 and later), you can use window functions:
SELECT cc.*
FROM (SELECT company_id, car, COUNT(*) as cnt,
ROW_NUMBER() OVER (PARTITION BY company_id ORDER BY COUNT(*) DESC) as seqnum
FROM car_orders
GROUP by company_id, car
) cc
WHERE seqnum = 1;
In earlier versions, it is a little more complicated:
WITH cc as (
SELECT company_id, car, COUNT(*) as cnt
FROM car_orders
GROUP by company_id, car
)
SELECT cc.*
FROM cc
WHERE cc.cnt = (SELECT MAX(cc2.cnt)
FROM cc cc2
WHERE cc2.company_id = cc.company_id
);

SQL Select Group By Min() - but select other

I want to select the ID of the Table Products with the lowest Price Grouped By Product.
ID Product Price
1 123 10
2 123 11
3 234 20
4 234 21
Which by logic would look like this:
SELECT
ID,
Min(Price)
FROM
Products
GROUP BY
Product
But I don't want to select the Price itself, just the ID.
Resulting in
1
3
EDIT: The DBMSes used are Firebird and Filemaker
You didn't specify your DBMS, so this is ANSI standard SQL:
select id
from (
    select id,
           row_number() over (partition by product order by price) as rn
    from products
) t
where rn = 1
order by id;
If your DBMS doesn't support window functions, you can do that by joining against a derived table:
select o.id
from products o
join (
    select product,
           min(price) as min_price
    from products
    group by product
) t on t.product = o.product and t.min_price = o.price;
Note that this will return a slightly different result than the first solution: if the minimum price for a product occurs more than once, all those IDs will be returned. The first solution will only return one of them. If you don't want that, you need to group again in the outer query:
select min(o.id)
from products o
join (
    select product,
           min(price) as min_price
    from products
    group by product
) t on t.product = o.product and t.min_price = o.price
group by o.product;
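The difference between the two join variants is easiest to see with a tie. The sketch below runs both through Python's sqlite3 module on the sample data from the question, plus one invented row (ID 5) that ties for the lowest price of product 234.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (ID INTEGER, Product INTEGER, Price INTEGER);
    -- Rows 1-4 come from the question; row 5 is invented to create a tie.
    INSERT INTO Products VALUES
        (1, 123, 10), (2, 123, 11), (3, 234, 20), (4, 234, 21), (5, 234, 20);
""")

# Variant 1: every ID whose price equals the minimum for its product.
all_min_ids = [r[0] for r in conn.execute("""
    SELECT o.ID
    FROM Products o
    JOIN (SELECT Product, MIN(Price) AS min_price
          FROM Products
          GROUP BY Product) t
      ON t.Product = o.Product AND t.min_price = o.Price
    ORDER BY o.ID
""")]

# Variant 2: group again so each product yields exactly one ID.
one_id_per_product = [r[0] for r in conn.execute("""
    SELECT MIN(o.ID)
    FROM Products o
    JOIN (SELECT Product, MIN(Price) AS min_price
          FROM Products
          GROUP BY Product) t
      ON t.Product = o.Product AND t.min_price = o.Price
    GROUP BY o.Product
    ORDER BY 1
""")]

print(all_min_ids)
print(one_id_per_product)
```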
SELECT ID
FROM Products as A
where price = ( select Min(Price)
from Products as B
where B.Product = A.Product )
GROUP BY id
This returns the ID of the lowest-priced row for each product, which for the sample data is 1 and 3.

How to query specific values for some columns and sum of values in others SQL

I'm trying to query some data from SQL such that it sums some columns, gets the max of another column and the corresponding row for a third column. For example,
|dataset|
|shares| |date| |price|
100 05/13/16 20.4
200 05/15/16 21.2
300 06/12/16 19.3
400 02/22/16 20.0
I want my output to be:
|shares| |date| |price|
1000 06/12/16 19.3
The shares have been summed up, the date is max(date), and the price is the price at max(date).
So far, I have:
select sum(shares), max(date), max(price)
but that gives me an incorrect price.
EDIT:
I realize I was unclear in my original post: all the other relevant data is in one table, and the price is in another. My full code is:
select id, stock, side, exchange, max(startdate), max(enddate),
sum(shares), sum(execution_price*shares)/sum(shares), max(limitprice), max(price)
from table1 t1
INNER JOIN table2 t2 on t2.id = t1.id
where location = 'CHICAGO' and startdate > '1/1/2016' and order_type = 'limit'
group by id, stock, side, exchange
You can do this with window functions and aggregation. Here is an example:
select sum(shares), max(date), max(case when seqnum = 1 then price end) as price
from (select t.*, row_number() over (order by date desc) as seqnum
      from t
     ) t;
EDIT:
If the results that you are looking at are in fact the result of a query, you can do:
with t as (<your query here>)
select sum(shares), max(date), max(case when seqnum = 1 then price end) as price
from (select t.*, row_number() over (order by date desc) as seqnum
      from t
     ) t;
Here's one way to do it. The join would obviously also include the ticker symbol for the share:
select
    a.sum_share,
    a.max_date,
    b.price
FROM
(
    select ticker, sum(shares) sum_share, max(date) max_date from table where ticker = 'MSFT' group by ticker
) a
inner join table b on a.max_date = b.date and a.ticker = b.ticker
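The aggregate-then-join idea can be checked on the sample data from the question. This is a minimal sketch with Python's sqlite3 module. The table name trades and the column trade_date are hypothetical, the ticker is left out for brevity, and the dates are rewritten in ISO format (an assumption made here) so that MAX picks the latest date when comparing text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table; values are the question's sample data with
    -- dates converted to ISO format so they sort chronologically.
    CREATE TABLE trades (shares INTEGER, trade_date TEXT, price REAL);
    INSERT INTO trades VALUES
        (100, '2016-05-13', 20.4),
        (200, '2016-05-15', 21.2),
        (300, '2016-06-12', 19.3),
        (400, '2016-02-22', 20.0);
""")

row = conn.execute("""
    SELECT a.sum_shares, a.max_date, b.price
    FROM (SELECT SUM(shares) AS sum_shares, MAX(trade_date) AS max_date
          FROM trades) a
    JOIN trades b ON b.trade_date = a.max_date
""").fetchone()
print(row)
```

If several rows share the latest date, this join returns one output row per such row, so a tie-breaking rule would be needed in that case.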

ORACLE SQL Return only duplicated values (not the original)

I have a database with the following info
Customer_id, plan_id, plan_start_dte,
Since some customers switch plans, some customer_ids appear in several rows, each with a different plan_start_dte. I'm trying to count how many times per day members switch to the premium plan (plan_id = 'premium') from any other plan.
That is, I'm trying to do roughly this: return all rows with duplicate customer_id, except for the original plan (min(plan_start_dte)), where plan_id = 'premium', and group them by plan_start_dte.
I'm able to get all duplicate records with their count:
with plan_counts as (
select c.*, count(*) over (partition by CUSTOMER_ID) ct
from CUSTOMERS c
)
select *
from plan_counts
where ct > 1
The other steps have me stuck. First I tried to select everything except the original plan:
SELECT CUSTOMERS c
where START_DTE not in (
select min(PLAN_START_DTE)
from CUSTOMERS i
where c.CUSTOMER_ID = i.CUSTOMER_ID
)
But this failed. If I can solve this, I believe all I have to add is a condition where c.PLAN_ID = 'premium', then group by date and do a count. Does anyone have any ideas?
I think you want lag():
select c.*
from (select c.*,
             lag(plan_id) over (partition by customer_id order by plan_start_dte) as prev_plan_id
      from customers c
     ) c
where prev_plan_id <> 'premium' and plan_id = 'premium';
I'm not sure what output you want. For the number of times this occurs per day:
select plan_start_dte, count(*)
from (select c.*, lag(plan_id) over (partition by customer_id order by plan_start_dte) as prev_plan_id
      from customers c
     ) c
where prev_plan_id <> 'premium' and plan_id = 'premium'
group by plan_start_dte
order by plan_start_dte;
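To see how the LAG filter treats a customer's first plan, here is a minimal sketch that runs the per-day count through Python's sqlite3 module instead of Oracle. It assumes the bundled SQLite is version 3.25.0 or later, since older versions lack window functions. The rows are invented: customer 2 starts on premium, which should not count as a switch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, plan_id TEXT, plan_start_dte TEXT);
    -- Invented sample rows.
    INSERT INTO customers VALUES
        (1, 'basic',   '2020-01-01'),
        (1, 'premium', '2020-02-01'),
        (2, 'premium', '2020-01-05'),
        (2, 'basic',   '2020-01-20'),
        (2, 'premium', '2020-02-01'),
        (3, 'basic',   '2020-01-10'),
        (3, 'gold',    '2020-02-03'),
        (3, 'premium', '2020-02-10');
""")

rows = conn.execute("""
    SELECT plan_start_dte, COUNT(*) AS switches
    FROM (SELECT c.*,
                 LAG(plan_id) OVER (PARTITION BY customer_id
                                    ORDER BY plan_start_dte) AS prev_plan_id
          FROM customers c) c
    WHERE prev_plan_id <> 'premium' AND plan_id = 'premium'
    GROUP BY plan_start_dte
    ORDER BY plan_start_dte
""").fetchall()
for row in rows:
    print(row)
```

For each customer's first row LAG returns NULL, so the comparison prev_plan_id <> 'premium' is not true and the original plan is excluded without a separate MIN(plan_start_dte) check.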