I have a table that contains weekly data, with a column that identifies when in the week the data was loaded.
This is the sample data:
====================
Load_Date |Code
====================
1 Oct 2018 |875465
8 Oct 2018 |875465
15 Oct 2018 |875465
Additionally, this table has data for all 52 weeks, so if I do a DISTINCT Load_Date, it gives me 52 rows with each week's date.
I am trying to join the table with itself such that I can identify the weeks where Code IS NULL.
The query that I tried forming is
SELECT * FROM
(--Query to get all distinct weeks
SELECT DISTINCT LOAD_DATE FROM ztable
) dates
FULL OUTER JOIN
(
SELECT DISTINCT CODE, LOAD_DATE
FROM ztable
) z
ON z.LOAD_DATE = dates.LOAD_DATE
I have tried using Left Join too but the number of records against each code is the number of rows that the code has in the table, not 52.
For example, if the code has 3 rows then I'll see only 3 rows with this join.
What am I missing?
I think you are trying to find code/load_date combinations that don't exist. Going by your description (not the sample code), I think this is:
select d.load_date, c.code
from (select distinct load_date from t) d cross join
(select distinct code from t) c left join
t
on d.load_date = t.load_date and c.code = t.code
where t.code is null;
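To try the idea out, here is a minimal sketch that runs the same cross join plus left join against an in-memory SQLite database. The table name `t` follows the answer above; the second code (111111) and its rows are made-up sample data, not from the question.

```python
import sqlite3

# Hypothetical sample data: code 875465 is loaded every week,
# code 111111 (invented for illustration) is missing on 8 Oct.
rows = [
    ("2018-10-01", 875465),
    ("2018-10-08", 875465),
    ("2018-10-15", 875465),
    ("2018-10-01", 111111),
    ("2018-10-15", 111111),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (load_date TEXT, code INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

# Build every (load_date, code) pair, then keep the pairs with no matching row.
missing = conn.execute(
    """
    SELECT d.load_date, c.code
    FROM (SELECT DISTINCT load_date FROM t) d
    CROSS JOIN (SELECT DISTINCT code FROM t) c
    LEFT JOIN t ON d.load_date = t.load_date AND c.code = t.code
    WHERE t.code IS NULL
    ORDER BY d.load_date, c.code
    """
).fetchall()

print(missing)  # [('2018-10-08', 111111)]
```

Dropping the WHERE clause would instead return all date/code pairs, with NULLs from `t` where the code was not loaded that week.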
Related
I have two tables, materials_students and components_students. Both of them have a finished_at column. materials_students also has a component_student_id column.
I need to count the number of components_students and materials_students (where finished_at is not NULL), extract the month and year from finished_at, group the result by month and year, and show it all in a single table, like this:
| Materials | Components | Month | Year |
-------------------------------------------
| 45        | 3          | 1     | 2019 |
| 37        | 6          | 2     | 2019 |
| 63        | 8          | 3     | 2019 |
I know how to do this for one table only, but don't know how to combine the results into a single table.
Here is how I did it for one table:
FROM materials_students
LEFT JOIN students ON materials_students.student_id = students.id
LEFT JOIN company_profiles ON students.company_profile_id = company_profiles.id
LEFT JOIN companies ON company_profiles.company_id = companies.id
WHERE materials_students.finished_at IS NOT NULL
GROUP BY YEAR, MONTH
ORDER BY YEAR, MONTH
Thanks!
The best approach is to build a subquery for each table, then join them.
select
ISNULL(M.yy, C.yy) [yy],
ISNULL(M.mm, C.mm) [mm],
ISNULL(number_material_students, 0) [number_material_students],
ISNULL(number_components_students, 0) [number_component_students]
from (
SELECT
year(materials_students.finished_at) yy,
month(materials_students.finished_at) mm,
count(*) number_material_students
FROM materials_students
LEFT JOIN students ON materials_students.student_id = students.id
LEFT JOIN company_profiles ON students.company_profile_id = company_profiles.id
LEFT JOIN companies ON company_profiles.company_id = companies.id
WHERE materials_students.finished_at IS NOT NULL
GROUP BY year(materials_students.finished_at), month(materials_students.finished_at)
) M
full outer join (
SELECT
year(components_students.finished_at) yy,
month(components_students.finished_at) mm,
count(*) number_components_students
FROM components_students
LEFT JOIN students ON components_students.student_id = students.id
LEFT JOIN company_profiles ON students.company_profile_id = company_profiles.id
LEFT JOIN companies ON company_profiles.company_id = companies.id
WHERE components_students.finished_at IS NOT NULL
GROUP BY year(components_students.finished_at), month(components_students.finished_at)
) C
ON C.yy = M.yy AND C.mm = M.mm
ORDER BY 1, 2
I used a FULL OUTER JOIN between the subqueries because some year/month combinations may appear only in materials and not in components, and vice versa.
To retrieve the year I use the ISNULL() function, so if the year is not filled from the materials subquery, it is taken from the components subquery. The same reasoning applies to all the other result columns.
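If your database has no FULL OUTER JOIN (older SQLite versions, for instance), a different route to the same table is to UNION ALL the two sets with a flag column and aggregate once. This is a swapped-in technique, not the query above, and only a rough sketch: the sample rows are invented, the joins to students/companies are left out, and strftime stands in for year()/month() because I'm assuming SQLite here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE materials_students (id INTEGER, finished_at TEXT);
    CREATE TABLE components_students (id INTEGER, finished_at TEXT);
    -- Hypothetical rows; the NULL one must not be counted.
    INSERT INTO materials_students VALUES
        (1, '2019-01-10'), (2, '2019-01-20'), (3, '2019-02-05'), (4, NULL);
    INSERT INTO components_students VALUES
        (1, '2019-01-15'), (2, '2019-03-01');
    """
)

# Tag each row with its source, stack both sets, then sum the flags per month.
result = conn.execute(
    """
    SELECT yy, mm,
           SUM(is_material)  AS number_material_students,
           SUM(is_component) AS number_components_students
    FROM (
        SELECT CAST(strftime('%Y', finished_at) AS INTEGER) AS yy,
               CAST(strftime('%m', finished_at) AS INTEGER) AS mm,
               1 AS is_material, 0 AS is_component
        FROM materials_students
        WHERE finished_at IS NOT NULL
        UNION ALL
        SELECT CAST(strftime('%Y', finished_at) AS INTEGER),
               CAST(strftime('%m', finished_at) AS INTEGER),
               0, 1
        FROM components_students
        WHERE finished_at IS NOT NULL
    ) AS combined
    GROUP BY yy, mm
    ORDER BY yy, mm
    """
).fetchall()

for row in result:
    print(row)
# (2019, 1, 2, 1)
# (2019, 2, 1, 0)
# (2019, 3, 0, 1)
```

Months that exist on only one side still show up, with 0 for the other count, which is what the ISNULL() calls achieve in the joined version.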
I'm a beginner with SQL and have a small problem. Let's assume I have a table that looks like this:
ID Month
1 Jan
2 Feb
2 June
3 Dec
Now I want every ID value to have every month, i.e.
ID Month
1 Jan
1 Feb
. .
.
.
2 Jan
2 Feb
.
.
and so on.
I tried creating another table that includes all months and joining it with a Left Join. But this only includes all months once for the whole table, not for every ID separately.
I tried:
Create Table merged as
select ID, Month
from Data
Left outer join months On data.month=months.month;
You would typically do this with a cross join. If all months are in the table, then you can do:
select i.id, m.month
from (select distinct id from t) i cross join
(select distinct month from t) m;
You may have another source for the lists of ids and months.
EDIT:
Michael, normally when you are asking a new question, you should do it as another question. This question clearly does not have an outcome column; changing the question would invalidate this answer and hence draw downvotes -- so that is rude.
But, this is an easy change to the query:
select i.id, m.month, t.outcome
from (select distinct id from t) i cross join
(select distinct month from t) m left join
t
on t.id = i.id and t.month = m.month;
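For anyone who wants to see the result shape, here is a small sketch of that last query run in an in-memory SQLite database. The rows and the `outcome` values are invented for illustration; only the table and column names come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, month TEXT, outcome TEXT)")
# Hypothetical data: three rows, two ids, three distinct months.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "Jan", "a"), (2, "Feb", "b"), (2, "Jun", "c")],
)

# Every id paired with every month; outcome is NULL where no row exists.
grid = conn.execute(
    """
    SELECT i.id, m.month, t.outcome
    FROM (SELECT DISTINCT id FROM t) i
    CROSS JOIN (SELECT DISTINCT month FROM t) m
    LEFT JOIN t ON t.id = i.id AND t.month = m.month
    ORDER BY i.id, m.month
    """
).fetchall()

for row in grid:
    print(row)
# (1, 'Feb', None)
# (1, 'Jan', 'a')
# (1, 'Jun', None)
# (2, 'Feb', 'b')
# (2, 'Jan', None)
# (2, 'Jun', 'c')
```

Note the months sort alphabetically here because they are stored as text; a separate months table with a numeric sort column would fix that.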
SELECT T1.ID, A._Month
FROM #Table T1
CROSS JOIN (SELECT ID, _Month FROM #Table) A
ORDER BY ID
I have information about accounts in two tables (A, B).
The records in A are all unique at the account level (account_id), but in table B, accounts are identified by account_id and month_start_dt, so each account may exist in zero or more months.
The trouble is, when I left outer join A to B so the joined table contains all records from A with the records from B (by account, by month) any account that does not exist in table B for a given month does not have a record for that month.
Desired outcome: If an account does not exist in table B for a given month, create a record for that account in the joined table with month_start_dt and 0 for all variables being selected from B.
As it stands, I can get the join to work where all accounts not appearing in B (not appearing at all, in any month) have 0 values for all variables being selected from B (using nvl(variable, 0) ) but, these accounts only have a single record. They should have one for each month.
Create a temp table with the number of records you want for the non-existent rows, and right join it to the result of the first query.
select tbl.* from ( select * from A left join B on a.col1 = b.col2) tbl join tmpTable on tbl.col2 = tmpTable.zerocol
try this.
I don't see why you need an outer join. This uses Standard SQL's EXCEPT (MINUS in Oracle):
SELECT account_id, month_start_dt, all_variables
FROM B
UNION
(
SELECT account_id, month_start_dt, 0 AS all_variables
FROM A
CROSS JOIN (
SELECT DISTINCT month_start_dt
FROM B
) AS DT1
EXCEPT
SELECT account_id, month_start_dt, 0 AS all_variables
FROM B
);
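Here is a rough, runnable version of the UNION/EXCEPT idea using SQLite. Assumptions on my side: the single column `amount` stands in for "all variables", the rows are invented, and the EXCEPT is wrapped in a derived table because, as far as I know, SQLite does not accept a parenthesized compound query as one side of a UNION.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE A (account_id INTEGER PRIMARY KEY);
    CREATE TABLE B (account_id INTEGER, month_start_dt TEXT, amount INTEGER);
    -- Hypothetical data: account 3 never appears in B, account 2 misses February.
    INSERT INTO A VALUES (1), (2), (3);
    INSERT INTO B VALUES
        (1, '2020-01-01', 10), (1, '2020-02-01', 20), (2, '2020-01-01', 5);
    """
)

# Real rows from B, plus a zero row for every account/month pair missing from B.
filled = conn.execute(
    """
    SELECT account_id, month_start_dt, amount
    FROM B
    UNION
    SELECT account_id, month_start_dt, 0 AS amount
    FROM (
        SELECT A.account_id AS account_id, m.month_start_dt AS month_start_dt
        FROM A
        CROSS JOIN (SELECT DISTINCT month_start_dt FROM B) AS m
        EXCEPT
        SELECT account_id, month_start_dt FROM B
    ) AS missing
    ORDER BY account_id, month_start_dt
    """
).fetchall()

for row in filled:
    print(row)
# (1, '2020-01-01', 10)
# (1, '2020-02-01', 20)
# (2, '2020-01-01', 5)
# (2, '2020-02-01', 0)
# (3, '2020-01-01', 0)
# (3, '2020-02-01', 0)
```

Every account ends up with one row per month that exists anywhere in B, which matches the desired outcome described in the question.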
You could use a tally Calendar table, with months (of several years). See this similar question: How to create a Calender table for 100 years in Sql
And then have:
FROM
A
CROSS JOIN
( SELECT y
, m
FROM Calendar
WHERE ( y = @start_year
AND m >= @start_month
)
OR ( y > @start_year
AND y < @end_year
)
OR ( y = @end_year
AND m <= @end_month
)
) AS C
LEFT JOIN
B
ON B.account_id = A.account_id
AND YEAR(B.start_date) = C.y
AND MONTH(B.start_date) = C.m
I have a table from which I need to get the count grouped on two columns.
The table has two columns: a datetime column and a success value column (-1, 1, 0).
What I am looking for is something like this:
Count of success value for each month:
month |success |count
11    |-1      |50
11    |1       |50
11    |0       |50
12    |-1      |50
12    |1       |50
12    |0       |50
If a success value has no rows for a month, then the count should be null or zero.
I have tried a left outer join as well, but it was no use: it gives incorrect counts.
You need to cross join all the available months, against the 3 success values to build a virtual matrix, which can then be left joined to the actual data
select m.month, s.success, COUNT(t.month)
from (select distinct MONTH from tbl) m
cross join (select -1 success union all select 1 union all select 0) s
left join tbl t on t.month = m.month and t.success = s.success
group by m.month, s.success
If you need missing months as well, that can be done, just slightly more complicated by changing the subquery "m" above.
#updated
COUNT(*) will always return at least 1 for left joins, because the unmatched row is still counted. To get a correct count, use COUNT(colname) on a column from the right side of the left join.
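The difference between the two counts is easy to check. Below is a small sketch in SQLite that runs the matrix query with both COUNT(*) and COUNT(t.month) side by side; the rows in `tbl` are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (month INTEGER, success INTEGER)")
# Hypothetical data: month 11 has two -1 rows and one 1 row, month 12 has one 0 row.
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [(11, -1), (11, -1), (11, 1), (12, 0)],
)

counts = conn.execute(
    """
    SELECT m.month, s.success,
           COUNT(*)       AS count_star,  -- counts the NULL-extended row too
           COUNT(t.month) AS count_col    -- ignores NULLs, so missing pairs give 0
    FROM (SELECT DISTINCT month FROM tbl) m
    CROSS JOIN (SELECT -1 AS success UNION ALL SELECT 1 UNION ALL SELECT 0) s
    LEFT JOIN tbl t ON t.month = m.month AND t.success = s.success
    GROUP BY m.month, s.success
    ORDER BY m.month, s.success
    """
).fetchall()

for row in counts:
    print(row)
# (11, -1, 2, 2)
# (11, 0, 1, 0)
# (11, 1, 1, 1)
# (12, -1, 1, 0)
# (12, 0, 1, 1)
# (12, 1, 1, 0)
```

Wherever a month/success pair has no data, count_star wrongly reports 1 while count_col reports 0.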
You probably need a table with just the values 1-12 to join with, so you can get a zero count.
I have 3 tables
bl_main (bl_id UNIQUE, bl_area)
bl_details (bl_id UNIQUE, name)
bl_data(bl_id, month, paper_tons, bottles_tons)
bl_id is not unique in the last table. There will be multiple rows of same bl_id.
I am trying to retrieve data in the following way
bl_id | name | bl_area | sum(paper_tons) | sum (bottles_tons) | paper_tons | bottles_tons
sum(paper_tons) should return the sum of all the paper tons for the same bl_id like Jan to December.
Using the query below I am able to retrieve all the data correctly, except that the result contains multiple occurrences of each bl_id (from the bl_data table).
SELECT bl_main.bl_id,name,bl_area,sums.SummedPaper, sums.SummedBottles,paper_tons,bottles_tons
FROM bl_main
JOIN bl_details ON
bl_main.bl_id= bl_details.bl_id
left outer JOIN bl_data ON
bl_data.bl_id= bl_main.bl_id
left outer JOIN (
SELECT bl_id, SUM(Paper_tons) As SummedPaper, SUM(bottle_tons) As SummedBottles
FROM bl_data
GROUP by bl_id) sums ON sums.bl_id = bl_main.bl_id
I want to retrieve each bl_id only once, without repetition, and the row returned should be the one with the max month, not all the months for the same bl_id.
For ex:
INCORRECT
**0601** University Hall 75.76 17051 1356 4040 1154 **11**
**0601** University Hall 75.76 17051 1356 9190 101 **12**
**0605** UIC Student 22.86 3331 14799 0 356 **8**
CORRECT
**0601** University Hall 75.76 17051 1356 9190 101 **12**
**0605** UIC Student 22.86 3331 14799 0 356 **8**
I know I can get the max value using
WHERE Month = (SELECT MAX(Month)
but where exactly should I add this in the query, and should I change the join definition?
Any help is highly appreciated, as I am new to SQL. Thanks in advance.
You have two tables that probably should be combined into one (bl_main and bl_details). But putting that aside, what you need is a self-join subquery to select the row with the max month. Something like the following (untested):
SELECT bl_main.bl_id, bl_details.name, bl_main.bl_area, sums.sum_paper_tons,
sums.sum_bottles_tons, maxmonth.paper_tons, maxmonth.bottles_tons
FROM bl_main
INNER JOIN bl_details ON bl_main.bl_id = bl_details.bl_id
LEFT OUTER JOIN (SELECT bl_id, SUM(paper_tons) AS sum_paper_tons,
SUM(bottles_tons) AS sum_bottles_tons
FROM bl_data
GROUP BY bl_id) sums ON bl_main.bl_id = sums.bl_id
LEFT OUTER JOIN (SELECT data2.bl_id, data2.paper_tons, data2.bottles_tons
FROM bl_data data2
INNER JOIN (SELECT bl_id, MAX(month) AS max_month
FROM bl_data
GROUP BY bl_id) m
ON m.bl_id = data2.bl_id
AND m.max_month = data2.month) maxmonth
ON bl_main.bl_id = maxmonth.bl_id
You can join the table containing the month against itself, using a subquery of the form:
Select *
From mytable m
Inner Join (Select max(Month) as Month, myId
From mytable
Group By myId) mnth
On mnth.myId = m.myId and mnth.Month = m.Month
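A quick way to convince yourself this works is to run the same pattern on a few rows. The sketch below uses SQLite with the bl_data columns from the question; the three rows are taken from the question's example output, and the column types are my assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bl_data (bl_id TEXT, month INTEGER, "
    "paper_tons INTEGER, bottles_tons INTEGER)"
)
conn.executemany(
    "INSERT INTO bl_data VALUES (?, ?, ?, ?)",
    [
        ("0601", 11, 4040, 1154),
        ("0601", 12, 9190, 101),
        ("0605", 8, 0, 356),
    ],
)

# Join each row to its group's max month; only the latest row per bl_id survives.
latest = conn.execute(
    """
    SELECT d.bl_id, d.month, d.paper_tons, d.bottles_tons
    FROM bl_data d
    INNER JOIN (SELECT bl_id, MAX(month) AS max_month
                FROM bl_data
                GROUP BY bl_id) mx
        ON mx.bl_id = d.bl_id AND mx.max_month = d.month
    ORDER BY d.bl_id
    """
).fetchall()

print(latest)  # [('0601', 12, 9190, 101), ('0605', 8, 0, 356)]
```

If two rows of the same bl_id share the max month, both come back, so this assumes (bl_id, month) is unique.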
Your JOIN clause
left outer JOIN bl_data ON
bl_data.bl_id= bl_main.bl_id
does not specify which month to select for the data you are displaying with paper_tons and bottles_tons.
You could update that JOIN to only contain the max month, and this should limit the entries, like so:
left outer JOIN (SELECT bl_id, MAX(Month) as Month from bl_data GROUP BY bl_id) as Month
ON Month.bl_id = bl_main.bl_id
left outer JOIN bl_data ON
bl_data.bl_id = bl_main.bl_id AND bl_data.Month = Month.Month
I think this query is what you are looking for
SELECT bl_main.bl_id,name, bl_area, sums.SummedPaper, sums.SummedBottles, paper_tons, bottles_tons
FROM bl_main
JOIN bl_details ON bl_main.bl_id= bl_details.bl_id
left outer JOIN bl_data ON bl_data.bl_id= bl_main.bl_id
left outer JOIN
(
SELECT bl_id, month, SUM(Paper_tons) As SummedPaper, SUM(bottle_tons) As SummedBottles
FROM bl_data
WHERE month in
(SELECT MAX(month) FROM bl_data GROUP BY bl_id)
GROUP BY bl_id, month
) sums ON sums.bl_id = bl_main.bl_id
I wanted to just add a comment to the answer lc gave, but I don't have 50 reputation points yet. Here is a link to an article that I believe covers this question and explains why the solution lc gave is correct.
http://www.sqlteam.com/article/how-to-use-group-by-with-distinct-aggregates-and-derived-tables