How to combine two SQL queries into a single query

Table C(id, type) lists all unique client ids, with and without transactions. Every id is unique and has a single type.
Table T(date, id, type, money) is the transaction table; the id is not unique here.
Table C has more unique ids than T, because not all clients make transactions.
The unique ids in T are a subset of the ids in C.
SQL for AVG(money) and STD(money) per type for T table:
SELECT
type,
AVG(money) AS avg_for_active_clients,
STDEV(money) AS stdev_for_active_clients,
COUNT(DISTINCT id) as cnt_active_clients
FROM (
SELECT id , type, sum(money) as money
FROM T
GROUP BY id, type
) A
GROUP BY type
SQL for AVG(money) and STD(money) per type for C table:
SELECT
type,
AVG(money) AS avg_for_all_clients,
STDEV(money) stdev_for_all_clients,
COUNT(DISTINCT id) as cnt_all_clients
FROM (
SELECT C.id, C.type , COALESCE(A.money, 0) as money FROM C
LEFT JOIN (
SELECT id , sum(money) as money
FROM T
GROUP BY id
) A
ON C.id = A.id
) B
GROUP BY type
Is it possible to combine the two queries above into a single query?
My database is Redshift.

You can combine your select #1 with your select #2 vertically or horizontally.
To combine them vertically you can use UNION ALL. For example:
select #1
union all
select #2
To combine them horizontally you can use FULL JOIN. For example:
select *
from (
select #1
) x
full join (
select #2
) y on y.type = x.type
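
As a hedged alternative to joining the two finished queries, both sets of figures can also come out of one pass over C, because aggregates skip NULLs: AVG over the raw joined sum covers only active clients, while AVG over the COALESCE'd value covers everyone. The sketch below runs on SQLite through Python only so that it is self-contained. The sample rows are made up, it assumes each client's type in T matches the type in C, and it leaves out the standard deviation because SQLite has no built-in function for it (the same NULL-skipping pattern would apply).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE C (id INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE T (date TEXT, id INTEGER, type TEXT, money REAL);
    -- made-up sample data: client 2 has no transactions
    INSERT INTO C VALUES (1, 'a'), (2, 'a'), (3, 'b');
    INSERT INTO T VALUES ('2020-01-01', 1, 'a', 10),
                         ('2020-01-02', 1, 'a', 20),
                         ('2020-01-03', 3, 'b', 5);
""")

query = """
SELECT C.type,
       AVG(A.money)              AS avg_for_active_clients,  -- NULLs (inactive clients) are skipped
       AVG(COALESCE(A.money, 0)) AS avg_for_all_clients,     -- inactive clients count as 0
       COUNT(A.id)               AS cnt_active_clients,      -- counts only matched clients
       COUNT(*)                  AS cnt_all_clients
FROM C
LEFT JOIN (SELECT id, SUM(money) AS money FROM T GROUP BY id) A
       ON C.id = A.id
GROUP BY C.type
ORDER BY C.type
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this sample data, type 'a' has one active client out of two, so the two averages differ, while for type 'b' they are the same.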

Related

How to aggregate different CTEs in outer query SQL

I am trying to join two CTEs to get the difference in performance across countries, grouped by campaign id. Here is my example.
Every campaign can run in several countries, so how can I group at the end to get one row per campaign id?
CTE 1: (planned)
select
country
, campaign_id
, sum(sales) as planned_sales
from table x
group by 1,2
CTE 2: (Actual)
select
country
, campaign_id
, sum(sales) as actual_sales
from table y
group by 1,2
outer select
select
country,
planned_sales,
actual_sales
planned - actual as diff
from cte1
join cte2
on campaign_id = campaign_id
This should do it:
select
cte1.campaign_id,
sum(cte1.planned_sales),
sum(cte2.actual_sales),
sum(cte1.planned_sales) - sum(cte2.actual_sales) as diff
from cte1
join cte2
on cte1.campaign_id = cte2.campaign_id and cte1.country = cte2.country
group by 1
I would suggest using a full join, so that rows from both CTEs are included, not just those that have a match on the other side. Your query is basically correct, but it needs a group by.
select campaign_id,
sum(cte1.planned_sales) as planned_sales,
sum(cte2.actual_sales) as actual_sales,
(coalesce(sum(cte1.planned_sales), 0) -
coalesce(sum(cte2.actual_sales), 0)
) as diff
from cte1 full join
cte2
using (campaign_id, country)
group by campaign_id;
That said, there is no reason why the CTEs should aggregate by both campaign and country. They could just aggregate by campaign id -- simplifying the query and improving performance.
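
A minimal sketch of the "aggregate by campaign only" idea, written so it can be run on SQLite through Python. Instead of a full join it stacks the two sources with UNION ALL and sums once per campaign, which is a swapped-in technique rather than the query from the answer. One difference to note: a campaign missing from one side shows 0 here, not NULL. The tables x and y stand in for the question's "table x" and "table y", and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (country TEXT, campaign_id INTEGER, sales INTEGER);  -- planned
    CREATE TABLE y (country TEXT, campaign_id INTEGER, sales INTEGER);  -- actual
    -- invented sample data: campaign 2 has no actuals, campaign 3 has no plan
    INSERT INTO x VALUES ('US', 1, 100), ('DE', 1, 50), ('US', 2, 70);
    INSERT INTO y VALUES ('US', 1, 80), ('FR', 3, 40);
""")

query = """
SELECT campaign_id,
       SUM(planned)               AS planned_sales,
       SUM(actual)                AS actual_sales,
       SUM(planned) - SUM(actual) AS diff
FROM (
    -- each source contributes to exactly one of the two measures
    SELECT campaign_id, sales AS planned, 0 AS actual FROM x
    UNION ALL
    SELECT campaign_id, 0 AS planned, sales AS actual FROM y
) s
GROUP BY campaign_id
ORDER BY campaign_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```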

using a select to find info from 2 tables that have similar columns

I have two very similar tables that partially share column names (and data types). Instead of querying them one by one, I want a single result in which the shared columns are merged, so that four columns with the same name in both tables show up once each rather than as eight columns.
SELECT c.accountId,c.characterId,c.name,i.itemId,i.maxUpgrade,i.amount FROM characters c
JOIN items i ON i.characterId=c.characterId
WHERE i.itemId=18011
SELECT c.accountId,c.characterId,c.name,b.itemId,b.maxUpgrade,b.amount FROM characters c
JOIN bankItems b ON b.accountId=c.accountId
WHERE b.itemId=18011
These are the two queries I currently run to get the same information from both tables. I have to run two separate requests and would like to merge them into one.
table 1 (characters):
characterId accountId name
table 2 (items):
characterId itemId maxUpgrade amount
table 3 (bankItems) :
accountId itemId maxUpgrade amount
And in result :
accountId characterId name itemId maxUpgrade amount
but all in 1 request, so no need to type the WHERE c.name= twice
You could do a union of items and bankItems tables within a CTE and then join the characters table on the CTE for example with either the accountId or characterId:
;WITH CTE AS (
SELECT itemId, NULL AS characterId, accountId, maxUpgrade, amount
FROM bankItems
UNION
SELECT itemId, characterId, NULL AS accountId, maxUpgrade, amount
FROM items
)
SELECT
c.accountId,
c.characterId,
c.name,
b.itemId,
b.maxUpgrade,
b.amount
FROM characters c
JOIN CTE b ON
b.accountId = c.accountId
OR b.characterId = c.characterId
WHERE b.itemId = 18011;
Considering the table structure this solution with optional fields should work.

SQL Select one row over a matching row from two tables

I have two tables with the same fields, but a final value that is calculated slightly differently. I need to combine the data from these two tables into one but need to prioritise one record over another when there is a match. Do you know how this might be possible?
Below is a mock up of two matching records:
ID Balance Type CCY Payment Final_Balance
28 1068376.037 F - CC GBP 78124 990252.0367
28 1068376.037 F - DD GBP 982905 85470.08293
Apologies if the format comes out poorly, I'm unsure how to format table data.
I have thousands of records in these two tables but for a handful of records I have the same information in both tables. Essentially what I'm trying to get to is where there is a match I want it to select F-CC over F-DD so I end up with unique records in my final table.
Thanks
I personally use ROW_NUMBER() for things like this, but there may be a better solution.
You can re-run this SQL to show how the final answer is slowly built up:
declare @t1 table (id int)
declare @t2 table (id int, txt varchar(2))
insert into @t1
select 1 union
select 2
insert into @t2
select 1, 'FC' union
select 1, 'FD' union
select 2, 'FC' union
select 2, 'FD'
-- step 1: number the rows within each id, ordered by txt
select *, row_number() over (partition by id order by txt) as we_want_the_ones
from @t2
-- step 2: keep only the first row per id
select * from (
select id, txt, row_number() over (partition by id order by txt) as we_want_the_ones
from @t2
) z
where we_want_the_ones = 1
-- step 3: join the chosen rows back to @t1
select *
from @t1 a
join (
select * from (
select id, txt, row_number() over (partition by id order by txt) as we_want_the_ones
from @t2
) z
where we_want_the_ones = 1
) b on a.id = b.id
My understanding of the question is that you have two tables (A and B) which have the exact same columns. You want to UNION these tables into one dataset, but sometimes you have rows in the two tables which "match" each other. In this case you only take one of the rows based on some priority.
From your example it seems that..
Match: Occurs when the ID is the same.
Priority: Is based on the Type column, prioritized by lower alphabetical order.
Also I'm assuming SQL Server, since that's what I prefer and you didn't say.
Hopefully all that is correct.. Now, here is how I would approach it.
I would start by performing the UNION of the two tables. Taking all records and not worrying about matching yet, putting them in a temp table to use later.
SELECT ID, Balance, Type, CCY, Payment, Final_Balance
INTO #AllRecords
FROM A
UNION
SELECT ID, Balance, Type, CCY, Payment, Final_Balance
FROM B
Next, I would GROUP BY the fields which determine a match, then use MIN or MAX to get the correct value for priority columns. By my understanding of your problem that means..
SELECT ID, MIN(Type) AS Type
FROM #AllRecords
GROUP BY ID
With that query you now have the natural key for all the records you want to display in your final result. All that is left to do is look up the rest of the columns using those keys, we can do this by using that query as a subquery.
SELECT r.ID, r.Balance, r.Type, r.CCY, r.Payment, r.Final_Balance
FROM #AllRecords r
INNER JOIN (
SELECT ID, MIN(Type) AS Type
FROM #AllRecords
GROUP BY ID ) final ON r.ID = final.ID AND r.Type = final.Type
So all together the resulting query is..
SELECT ID, Balance, Type, CCY, Payment, Final_Balance
INTO #AllRecords
FROM A
UNION
SELECT ID, Balance, Type, CCY, Payment, Final_Balance
FROM B
SELECT r.ID, r.Balance, r.Type, r.CCY, r.Payment, r.Final_Balance
FROM #AllRecords r
INNER JOIN (
SELECT ID, MIN(Type) AS Type
FROM #AllRecords
GROUP BY ID ) final ON r.ID = final.ID AND r.Type = final.Type
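
For anyone who wants to try the MIN(Type) approach without a SQL Server instance, here is a small sketch that runs on SQLite through Python. It assumes the same match and priority rules as above, uses a CTE in place of the temp table, and trims the columns to ID, Type and Final_Balance. The row with ID 29 is made up to show an unmatched record passing through untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER, Type TEXT, Final_Balance REAL);
    CREATE TABLE B (ID INTEGER, Type TEXT, Final_Balance REAL);
    INSERT INTO A VALUES (28, 'F - CC', 990252.0367);
    -- ID 29 is a hypothetical record that exists in only one table
    INSERT INTO B VALUES (28, 'F - DD', 85470.08293), (29, 'F - DD', 10.0);
""")

query = """
WITH AllRecords AS (
    SELECT ID, Type, Final_Balance FROM A
    UNION
    SELECT ID, Type, Final_Balance FROM B
)
SELECT r.ID, r.Type, r.Final_Balance
FROM AllRecords r
INNER JOIN (
    -- the alphabetically lowest Type per ID is the one with priority
    SELECT ID, MIN(Type) AS Type
    FROM AllRecords
    GROUP BY ID
) best ON r.ID = best.ID AND r.Type = best.Type
ORDER BY r.ID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

ID 28 comes back once, with the 'F - CC' row winning over 'F - DD'.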

Avg used on count

I have two tables with the following attributes:
DOCTORS: D_ID, Name, Specialiation
OPERATIONS: DATE, TYPE, DOCTORS_D_ID, PACIENTS_PACIENT_ID
I want to return the name and ID of doctors who performed more than the average number of operations per doctor.
I have created following SQL command
SELECT Name D_ID,COUNT(*) FROM DOCTORS
JOIN OPERATION
ON D_ID = DOCTORS_D_ID
GROUP BY D_ID,Name
HAVING COUNT(*) > ( SELECT AVG(COUNT(DOCTORs_D_ID))
FROM OPERATIONS GROUP by DOCTORS_D_ID )
This results in the following table:
D_ID COUNTS(*)
Dr. Martin 3
The D_ID column contains the name instead of the ID, so only one of the two attributes is returned. How can I return both the name and D_ID from this command?
I am not a fan of nested aggregation functions. I would just do this by calculating the average directly:
SELECT Name, D_ID, COUNT(*)
FROM DOCTORS JOIN
OPERATIONS
ON D_ID = DOCTORS_D_ID
GROUP BY D_ID, Name
HAVING COUNT(*) > (SELECT COUNT(*) / COUNT(DISTINCT DOCTORS_D_ID)
FROM OPERATIONS
);
One issue is that doctors who perform no operations are not counted in the average. An average computed from the operations table alone (or from an inner join with it) will therefore be higher than the true average, which is the number of rows in the operations table divided by the number of rows in the doctors table.
To compensate for this you can do:
SELECT Name,
D_ID,
num_operations
FROM ( SELECT Name,
D_ID,
COUNT( 1 ) OVER () AS num_doctors
FROM doctors ) d
LEFT OUTER JOIN
( SELECT DISTINCT
DOCTORS_D_ID,
COUNT( 1 ) OVER ( PARTITION BY DOCTORS_D_ID ) AS num_operations,
COUNT( 1 ) OVER () AS total_operations
FROM operations ) o
ON ( d.d_id = o.doctors_d_id )
WHERE num_operations > total_operations / num_doctors;
It has the added bonus of using analytic functions to calculate the counts rather than performing a third table scan.
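
To see how much the choice of denominator matters, here is a small sketch on SQLite through Python that runs the same filter against both averages. It uses plain subqueries rather than the analytic functions above, so it is an illustration of the effect and not a replacement for that query. The doctors other than Dr. Martin and all the operation rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DOCTORS (D_ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE OPERATIONS (DOCTORS_D_ID INTEGER);
    -- invented data: doctor 3 has performed no operations
    INSERT INTO DOCTORS VALUES (1, 'Dr. Martin'), (2, 'Dr. Lee'), (3, 'Dr. Novak');
    INSERT INTO OPERATIONS VALUES (1), (1), (1), (2), (2);
""")

query = """
SELECT d.D_ID, d.Name, COUNT(o.DOCTORS_D_ID) AS num_operations
FROM DOCTORS d
LEFT JOIN OPERATIONS o ON o.DOCTORS_D_ID = d.D_ID
GROUP BY d.D_ID, d.Name
HAVING COUNT(o.DOCTORS_D_ID) > {average}
ORDER BY d.D_ID
"""
# average over doctors that appear in OPERATIONS only (5 / 2)
avg_active = "(SELECT COUNT(*) * 1.0 / COUNT(DISTINCT DOCTORS_D_ID) FROM OPERATIONS)"
# average over every doctor in DOCTORS (5 / 3)
avg_all = "((SELECT COUNT(*) FROM OPERATIONS) * 1.0 / (SELECT COUNT(*) FROM DOCTORS))"

above_active_avg = conn.execute(query.format(average=avg_active)).fetchall()
above_overall_avg = conn.execute(query.format(average=avg_all)).fetchall()
print(above_active_avg)
print(above_overall_avg)
```

With this data the second doctor is above the overall average but below the average of active doctors, so the two result sets differ.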
with num_operations as (
select doctors_d_id, count(*) as operations from operations
group by doctors_d_id
having count(*) >
(select avg(count(doctors_d_id)) from operations group by doctors_d_id)
)
select a.doctors_d_id, a.operations, b.name from num_operations a, doctors b
where a.doctors_d_id = b.d_id

Use column defined in a subquery

Sorry if the title is not clear, I'm a beginner and I didn't know exactly how to phrase it...
I have this query working with Oracle :
SELECT
( SELECT COUNT(*)
FROM CATEGORY
) AS NBCATEGORIES,
( SELECT ROUND(AVG(FINANCIALOPERATIONBYPERSON),2)
FROM
(
SELECT SUM(AMOUNT) AS FINANCIALOPERATIONBYPERSON
FROM FINANCIALOPERATION
WHERE PERSONID IS NOT NULL
GROUP BY PERSONID
)
) AS AVERAGELOADAMOUNTBYPERSON
FROM DUAL
I'm looking for the equivalent for Sql Server...
The goal is to have multiple queries in a single query.
So I removed the "FROM DUAL" but I get an error on "FINANCIALOPERATIONBYPERSON" (Invalid column name), certainly because it's defined in the subquery...
How can I modify the query for SQL-Server ?
SQL Server requires aliases for subqueries. So, you can rewrite this as:
SELECT (SELECT COUNT(*)
FROM CATEGORY
) AS NBCATEGORIES,
(SELECT ROUND(AVG(FINANCIALOPERATIONBYPERSON),2)
FROM (SELECT SUM(AMOUNT) AS FINANCIALOPERATIONBYPERSON
FROM FINANCIALOPERATION
WHERE PERSONID IS NOT NULL
GROUP BY PERSONID
) t
) AS AVERAGELOADAMOUNTBYPERSON;
In both databases, though, I would be inclined to write this as:
SELECT c.NBCATEGORIES, ROUND(fo.AVERAGELOADAMOUNTBYPERSON, 2) AS AVERAGELOADAMOUNTBYPERSON
FROM (SELECT COUNT(*) as NBCATEGORIES
FROM CATEGORY c
) c CROSS JOIN
(SELECT SUM(AMOUNT) / COUNT(DISTINCT PERSONID) AS AVERAGELOADAMOUNTBYPERSON
FROM FINANCIALOPERATION fo
WHERE PERSONID IS NOT NULL
) fo;
One note for both these forms: SQL Server does integer arithmetic on integers. So, if AMOUNT is an integer, then you should convert it to an appropriate floating or fixed point numeric type.
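
The integer-arithmetic point is easy to reproduce. The sketch below uses SQLite through Python as a stand-in, since it also truncates when both operands are integers; this is an assumption about convenience, not a claim that the two engines behave identically in every case. In SQL Server the conversion would more likely be a CAST to a DECIMAL or FLOAT type. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE FINANCIALOPERATION (PERSONID INTEGER, AMOUNT INTEGER);
    -- invented data: total 7 across 2 persons, plus one row without a person
    INSERT INTO FINANCIALOPERATION VALUES (1, 3), (1, 2), (2, 2), (NULL, 100);
""")

int_avg, real_avg = conn.execute("""
    SELECT SUM(AMOUNT) / COUNT(DISTINCT PERSONID),                -- integer / integer
           CAST(SUM(AMOUNT) AS REAL) / COUNT(DISTINCT PERSONID)   -- converted first
    FROM FINANCIALOPERATION
    WHERE PERSONID IS NOT NULL
""").fetchone()
print(int_avg, real_avg)
```

The first value loses the fractional part, while the second keeps it.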
You need to add a table alias for the subquery.
SELECT
( SELECT COUNT(*)
FROM CATEGORY
) AS NBCATEGORIES,
( SELECT ROUND(AVG(RESULTS.FINANCIALOPERATIONBYPERSON),2)
FROM
(
SELECT SUM(AMOUNT) AS FINANCIALOPERATIONBYPERSON
FROM FINANCIALOPERATION
WHERE PERSONID IS NOT NULL
GROUP BY PERSONID
) RESULTS
) AS AVERAGELOADAMOUNTBYPERSON