Best way to use CTEs to join two large tables? - sql

I have 2 tables like this:
table1
user_id   region
-------   ----------
th54d5d   South West
table2
user_id   date
-------   ----------
th54d5d   South West
The tables are too big to join directly, so I'm trying to use CTEs to query them. This is what I have tried:
'''
WITH a AS (
SELECT
DISTINCT region,
COUNT(DISTINCT user_id) AS users
FROM table1
GROUP BY 1
),
b AS (
SELECT
user_id AS users,
date
FROM table2
WHERE date BETWEEN '20220418' AND '20220821'
GROUP BY 1,2
)
SELECT
a.region,
a.users
FROM a
RIGHT JOIN b
ON a.users = b.users
WHERE b.datet BETWEEN '20220418' AND '20220821'
GROUP BY 1,2
ORDER BY 2
;
'''
This just returns two blank columns. I'm not that great at CTEs, so maybe someone can advise on the correct or a better way of going about this? (Amazon Redshift) Thanks!
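A likely reason for the blank columns: CTE a has already collapsed table1 to one row per region, so a.users is a count, while b.users is a user_id. The join condition compares a count with an ID, so it never matches, and the RIGHT JOIN then fills a's columns with NULLs. Below is a minimal sketch of one alternative, assuming the goal is the number of distinct users per region who appear in table2 within the date range: filter table2 first, join on user_id, and aggregate afterwards. It runs on SQLite through Python's sqlite3 module as a stand-in for Redshift; the sample rows and the CTE name active are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (user_id TEXT, region TEXT);
CREATE TABLE table2 (user_id TEXT, date TEXT);
-- Hypothetical sample rows, not taken from the question.
INSERT INTO table1 VALUES ('u1', 'South West'), ('u2', 'South West'),
                          ('u3', 'North'), ('u4', 'South West');
INSERT INTO table2 VALUES ('u1', '20220501'), ('u1', '20220502'),
                          ('u2', '20220101'), ('u3', '20220601'),
                          ('u4', '20220720');
""")

query = """
WITH active AS (
    -- Reduce table2 to one row per user inside the date range.
    SELECT DISTINCT user_id
    FROM table2
    WHERE date BETWEEN '20220418' AND '20220821'
)
SELECT t1.region, COUNT(DISTINCT t1.user_id) AS users
FROM table1 AS t1
JOIN active ON active.user_id = t1.user_id   -- join on the key, not on the count
GROUP BY t1.region
ORDER BY users, t1.region
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The point of the sketch is the order of operations: the join happens on user_id before anything is counted. Whether this performs acceptably on the real tables depends on their distribution and sort keys, which the question does not show.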

Related

SQL query to sum by group and inner join two tables

I have two tables like so:
Each row in both tables is uniquely identified by the columns week and city.
I want to create one table with 5 columns (week, value_a, value_b, value1, value2) and 3 rows (1 row for each week, with the value columns being summed across each city). The final table should look exactly like this:
sum_a is the sum of value_a for each week across all cities, sum_b is the sum of value_b across all cities, and so on.
Here is my SQL query:
SELECT *
FROM table1
INNER JOIN table2
ON table1.week = table2.week AND
table1.city = table2.city
If you need to sum columns from both sides of a join, aggregate each table first so the join does not repeat rows.
Keep in mind that with an inner join, a week that exists in table1 but not in table2 will not appear in the result.
SELECT
A1.week,
A1.city,
A1.value1,
A1.value2,
A2.value_a,
A2.value_b
FROM (
SELECT
week,
city,
sum(value1) AS value1,
sum(value2) AS value2
FROM table1
GROUP BY week, city
) A1
INNER JOIN (
SELECT
week,
city,
sum(value_a) AS value_a,
sum(value_b) AS value_b
FROM table2
GROUP BY week, city
) A2
ON A1.week = A2.week AND A1.city = A2.city
The query below gives the expected output:
SELECT table1.week, sum(value_a) as sum_a, sum(value_b) as sum_b, sum(value1) as sum_1, sum(value2) as sum_2
FROM table1
INNER JOIN table2 ON table1.week = table2.week AND table1.city = table2.city
group by table1.week
The query can be validated in the linked db<>fiddle example.
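To compare the two answers, here is a small sketch that aggregates each table per week before joining and checks it against the join-then-sum query. It runs on SQLite through Python's sqlite3 module. The sample values are made up, and placing value1/value2 in table1 and value_a/value_b in table2 is an assumption, because the original tables are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (week INTEGER, city TEXT, value1 INTEGER, value2 INTEGER);
CREATE TABLE table2 (week INTEGER, city TEXT, value_a INTEGER, value_b INTEGER);
-- Hypothetical rows; (week, city) is unique in each table, as in the question.
INSERT INTO table1 VALUES (1, 'X', 1, 2), (1, 'Y', 3, 4), (2, 'X', 5, 6);
INSERT INTO table2 VALUES (1, 'X', 10, 20), (1, 'Y', 30, 40), (2, 'X', 50, 60);
""")

# Aggregate each table to one row per week, then join the two small results.
pre_aggregated = conn.execute("""
    SELECT a2.week, a2.sum_a, a2.sum_b, a1.sum_1, a1.sum_2
    FROM (SELECT week, SUM(value1) AS sum_1, SUM(value2) AS sum_2
          FROM table1 GROUP BY week) AS a1
    JOIN (SELECT week, SUM(value_a) AS sum_a, SUM(value_b) AS sum_b
          FROM table2 GROUP BY week) AS a2
      ON a1.week = a2.week
    ORDER BY a2.week
""").fetchall()

# Join row by row on (week, city), then aggregate per week.
joined_first = conn.execute("""
    SELECT table1.week, SUM(value_a), SUM(value_b), SUM(value1), SUM(value2)
    FROM table1
    INNER JOIN table2
      ON table1.week = table2.week AND table1.city = table2.city
    GROUP BY table1.week
    ORDER BY table1.week
""").fetchall()

print(pre_aggregated)
print(joined_first)
```

Both queries agree here because every (week, city) pair exists exactly once in each table. If a pair were missing from one table, the join-then-sum version would drop that row's values, while the pre-aggregated version would still count them as long as the week itself exists in both tables.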

COUNT of GROUP of two fields in SQL Query -- Postgres

I have a table in postgres with 2 fields: they are columns of ids of users who have looked at some data, under two conditions:
viewee viewer
------ ------
93024 66994
93156 93151
93163 113671
137340 93161
92992 93161
93161 93135
93156 93024
And I want to group them by both the viewee and viewer fields, count the number of occurrences, and return that count from high to low:
id count
------ -----
93161 3
93156 2
93024 2
137340 1
66994 1
92992 1
93135 1
93151 1
93163 1
I have been running two queries, one for each column, and then combining the results in my JavaScript application code. My query for one field is...
SELECT "viewer",
COUNT("viewer")
FROM "public"."friend_currentfriend"
GROUP BY "viewer"
ORDER BY count DESC;
How would I rewrite this query to handle both fields at once?
You can combine the two columns from the table into a single one by using union all, then use group by as below:
select id ,count(*) Count from (
select viewee id from vv
union all
select viewer id from vv) t
group by id
order by count(*) desc
This is a good place to use a lateral join:
select v.viewx, count(*)
from t cross join lateral
(values (t.viewee), (t.viewer)) v(viewx)
group by v.viewx
order by count(*) desc;
You can try this:
SELECT a.ID,
SUM(a.Total) as Total
FROM (SELECT t.Viewee AS ID,
COUNT(t.Viewee) AS Total
FROM #Temp t
GROUP BY t.Viewee
UNION ALL -- plain UNION would drop a duplicate (ID, Total) row and undercount
SELECT t.Viewer AS ID,
COUNT(t.Viewer) AS Total
FROM #Temp t
GROUP BY t.Viewer
) a
GROUP BY a.ID
ORDER BY SUM(a.Total) DESC
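Here is a quick check of the UNION ALL approach against the sample rows from the question, run on SQLite through Python's sqlite3 module. The table name views and the helper summed are made up for the sketch. It also shows why the set operator matters when per-column counts are combined: plain UNION removes duplicate rows, so an id with the same count in both columns gets counted only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE views (viewee INTEGER, viewer INTEGER)")  # hypothetical name
conn.executemany("INSERT INTO views VALUES (?, ?)", [
    (93024, 66994), (93156, 93151), (93163, 113671), (137340, 93161),
    (92992, 93161), (93161, 93135), (93156, 93024),
])

# Stack both columns into one, then count every appearance of each id.
counts = conn.execute("""
    SELECT id, COUNT(*) AS cnt
    FROM (SELECT viewee AS id FROM views
          UNION ALL
          SELECT viewer AS id FROM views) AS t
    GROUP BY id
    ORDER BY cnt DESC, id
""").fetchall()

def summed(set_op):
    """Sum per-column counts, combined with the given set operator."""
    sql = f"""
        SELECT id, SUM(total) FROM (
            SELECT viewee AS id, COUNT(*) AS total FROM views GROUP BY viewee
            {set_op}
            SELECT viewer AS id, COUNT(*) AS total FROM views GROUP BY viewer
        ) AS a
        GROUP BY id
    """
    return dict(conn.execute(sql).fetchall())

with_union_all = summed("UNION ALL")
with_union = summed("UNION")

print(counts[:3])
print(with_union_all[93024], with_union[93024])
```

Id 93024 appears once as viewee and once as viewer, so both subqueries produce the row (93024, 1). UNION keeps only one copy of it, which is why the UNION total for that id comes out lower than the UNION ALL total.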

How to aggregate different CTEs in outer query SQL

I am trying to join two CTEs to get the difference in performance across countries and group on campaign id. Here is my example.
Every campaign can run in different countries, so how can I group at the end to have 1 row per campaign id?
CTE 1: (planned)
select
country
, campaign_id
, sum(sales) as planned_sales
from table x
group by 1,2
CTE 2: (Actual)
select
country
, campaign_id
, sum(sales) as actual_sales
from table y
group by 1,2
outer select
select
country,
planned_sales,
actual_sales
planned - actual as diff
from cte1
join cte2
on campaign_id = campaign_id
This should do it:
select
cte1.campaign_id,
sum(cte1.planned_sales),
sum(cte2.actual_sales),
sum(cte1.planned_sales) - sum(cte2.actual_sales) as diff
from cte1
join cte2
on cte1.campaign_id = cte2.campaign_id and cte1.country = cte2.country
group by 1
I would suggest using a full join, so that rows from both tables are included, not just the rows that have a match on the other side. Your query is basically correct but it needs a group by.
select campaign_id,
sum(cte1.planned_sales) as planned_sales,
sum(cte2.actual_sales) as actual_sales,
(coalesce(sum(cte1.planned_sales), 0) -
coalesce(sum(cte2.actual_sales), 0)
) as diff
from cte1 full join
cte2
using (campaign_id, country)
group by campaign_id;
That said, there is no reason why the CTEs should aggregate by both campaign and country. They could just aggregate by campaign id -- simplifying the query and improving performance.
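A minimal sketch of that simplification: each CTE aggregates by campaign_id only, and the outer query keeps campaigns that exist on either side. It runs on SQLite through Python's sqlite3 module. Because older SQLite versions lack FULL JOIN, the full join is emulated here with a list of all campaign ids and two left joins, which is a swapped-in technique and not what the answer above uses. The table names x and y follow the question; the sample rows and the CTE names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE x (country TEXT, campaign_id INTEGER, sales INTEGER);  -- planned
CREATE TABLE y (country TEXT, campaign_id INTEGER, sales INTEGER);  -- actual
-- Hypothetical rows: campaign 2 is only planned, campaign 3 only has actuals.
INSERT INTO x VALUES ('DE', 1, 100), ('FR', 1, 50), ('DE', 2, 70);
INSERT INTO y VALUES ('DE', 1, 80), ('FR', 1, 60), ('FR', 3, 40);
""")

query = """
WITH planned AS (
    SELECT campaign_id, SUM(sales) AS planned_sales
    FROM x GROUP BY campaign_id
),
actual AS (
    SELECT campaign_id, SUM(sales) AS actual_sales
    FROM y GROUP BY campaign_id
),
ids AS (
    -- Every campaign that appears on either side (full join emulation).
    SELECT campaign_id FROM planned
    UNION
    SELECT campaign_id FROM actual
)
SELECT ids.campaign_id,
       COALESCE(p.planned_sales, 0) AS planned_sales,
       COALESCE(a.actual_sales, 0) AS actual_sales,
       COALESCE(p.planned_sales, 0) - COALESCE(a.actual_sales, 0) AS diff
FROM ids
LEFT JOIN planned AS p ON p.campaign_id = ids.campaign_id
LEFT JOIN actual AS a ON a.campaign_id = ids.campaign_id
ORDER BY ids.campaign_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Since each CTE already returns one row per campaign, the outer query needs no group by. On a database with FULL JOIN support, the ids CTE and the two left joins can be replaced by a single full join using (campaign_id).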

Get Records By Most Recent Date From two tables

I have two SQL tables. Each has an ID with other columns and a Date.
Is there a way that I can get the result from these two tables in one query sorted by the date? For example, as a result, I may have one record from table 1 followed by two records from table 2 and then another record from table one and so on. I have tried the code below but I think that I am not on the right track.
I would appreciate your help.
SELECT
app.ID as 'AppraisalID',
app.CityName,
app.CountryName,
app.Street,
app.DateCreated,
subApp.ID as 'SubAppraisalID',
subApp.Message,
subApp.DateCreated
From
(
SELECT TOP 10
dbo.Appraisal.ID,
dbo.Appraisal.Street,
dbo.Country.Name as 'CountryName',
dbo.City.Name as 'CityName',
dbo.Appraisal.DateCreated
FROM dbo.Appraisal
INNER JOIN dbo.Country ON dbo.Appraisal.CountryID = dbo.Country.ID
INNER JOIN dbo.City ON dbo.Appraisal.CityID = dbo.City.ID
Order by dbo.Appraisal.DateCreated DESC
) app
Cross Join
(
SELECT TOP 10
dbo.Sub_Appraisal.ID,
dbo.Sub_Appraisal.Message,
dbo.Sub_Appraisal.DateCreated
FROM dbo.Sub_Appraisal
Order by dbo.Sub_Appraisal.DateCreated DESC
) subApp
Order By
app.DateCreated DESC,
subApp.DateCreated DESC
Thanks guys.
What you want to use is the UNION operator, although the column lists for each table (or at least the ones that you are selecting) must match up. You'll want to make sure that you do the ordering after the UNION.
A simplified example:
SELECT
col1,
col2,
some_date
FROM
(
SELECT
col1,
col2,
some_date
FROM
Table1
UNION ALL
SELECT
col1,
col2,
some_date
FROM
Table2
) AS SQ
ORDER BY
some_date
Look at union all. You'll need to make sure that your result columns are the same data type.
select a.id "id", null "message", a.cityname "city", a.countryname "country", a.street "street", a.datecreated "dt"
from dbo.appraisal a
union all
select s.id, s.message, null, null, null, s.datecreated
from dbo.sub_appraisal s
order by 6
However, I suspect that your sub_appraisal table is missing an ID linking it to the appraisal table. Ideally you would join the two tables on that ID, which lets you retrieve the data accurately and in the correct order. Sorting by date alone cannot guarantee that, because sub_appraisal records are not necessarily created directly after their appraisal record and before the next appraisal record is created. If that happens, your query would give you results you're possibly not expecting.
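Here is a small sketch of the UNION ALL approach from the answers above, run on SQLite through Python's sqlite3 module. The simplified table layouts, the sample rows, and the extra source column are made up for illustration. Columns that exist in only one table are padded with NULL so both SELECT lists line up, and the ordering is applied once to the combined result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified versions of the two tables.
CREATE TABLE appraisal (id INTEGER, street TEXT, date_created TEXT);
CREATE TABLE sub_appraisal (id INTEGER, message TEXT, date_created TEXT);
INSERT INTO appraisal VALUES (1, 'Main St', '2020-01-01'), (2, 'High St', '2020-01-04');
INSERT INTO sub_appraisal VALUES (10, 'first note', '2020-01-02'),
                                 (11, 'second note', '2020-01-03');
""")

query = """
SELECT 'appraisal' AS source, id, street, NULL AS message, date_created
FROM appraisal
UNION ALL
SELECT 'sub_appraisal', id, NULL, message, date_created
FROM sub_appraisal
ORDER BY date_created DESC   -- sorts the combined rows, not each table separately
"""
rows = conn.execute(query).fetchall()
sources = [row[0] for row in rows]
print(sources)
```

With these rows the output interleaves the tables: one appraisal, two sub-appraisals, then another appraisal, which is the shape the question describes. The source column is optional but makes it easy to tell which table each row came from.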

MS Access - return values by max date

I'm lost on this one, I'm a bit of a newcomer to Access and SQL, I have scoured the site and Google for the answer to this one.
I have a table with 3 columns containing IDs to other tables and then a date.
Column 1 (RoleID) Column 2 (ActionID) Column 3 (SettingID) Column 4 (Date)
I need to group by Column 1 and Column 2 (so the unique combinations of these). There may be multiple instances with different SettingID, differentiated by a date.
I think a Totals select query does the job, with Group by for Column1 and 2, then using Max for the date column. However I just want the value of Column 3, not a total.
Is there a simple way to do this that I'm missing?
select t1.roleid, t1.actionid, t1.settingid
from your_table t1
inner join
(
select roleid, actionid, max([date]) as mdate
from your_table
group by roleid, actionid
) t2 on t1.roleid = t2.roleid
and t1.actionid = t2.actionid
and t1.[date] = t2.mdate
If this is a really old version of Access, it won't support subqueries very well.
You can work around this by creating a separate query:
select roleid, actionid, max([date]) as mdate
from your_table
group by roleid, actionid
Save it as MaxDateQuery or something similar.
Then you can use that saved Access query in a subsequent query to get what you want:
select
your_table.roleid,
your_table.actionid,
your_table.settingid
from your_table
inner join MaxDateQuery
on your_table.roleid = MaxDateQuery.roleid
and your_table.actionid = MaxDateQuery.actionid
and your_table.[date] = MaxDateQuery.mdate
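Here is a quick way to see the max-date join in action, run on SQLite through Python's sqlite3 module as a stand-in for Access. The table name settings and the sample rows are made up. The derived table finds the latest date per (roleid, actionid), and the join back to the base table picks up the settingid that belongs to that date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table and rows; column names follow the question.
CREATE TABLE settings (roleid INTEGER, actionid INTEGER, settingid INTEGER, date TEXT);
INSERT INTO settings VALUES
    (1, 1, 100, '2020-01-01'),
    (1, 1, 101, '2020-02-01'),
    (1, 2, 200, '2020-01-15'),
    (2, 1, 300, '2020-03-01'),
    (2, 1, 301, '2020-01-01');
""")

query = """
SELECT t1.roleid, t1.actionid, t1.settingid
FROM settings AS t1
JOIN (
    -- Latest date for each (roleid, actionid) combination.
    SELECT roleid, actionid, MAX(date) AS mdate
    FROM settings
    GROUP BY roleid, actionid
) AS t2
  ON t1.roleid = t2.roleid
 AND t1.actionid = t2.actionid
 AND t1.date = t2.mdate
ORDER BY t1.roleid, t1.actionid
"""
rows = conn.execute(query).fetchall()
print(rows)
```

If two rows in the same group share the same latest date, this pattern returns both of them, so a tie-breaker would be needed when exactly one row per group is required.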