A single query to count the number of distinct rows in one table and the highest value of a column from another table

I have two SQL tables. Table 1 is as follows:
SALESREF
1 | 40303020
2 | 40303021
3 | 40303021
4 | 40303021
5 | 41210028
6 | 4120302701
7 | 41210030
8 | 4112700803
9 | 4112700803
10 | 41215030
11 | 41215026
12 | 41215026
13 | 41215026
14 | 41215026
15 | 41215026
16 | 41215026
17 | 41215026
18 | 41215027
19 | 41215027
20 | 41215027
Table 2 ("LEDGER") is as follows:
SALESREF SALEDATE
0 | 4081200201 | 20140804
1 | 40303020 | 20141015
2 | 40303021 | 20141017
3 | 40303021 | 20141017
4 | 40303021 | 20141017
5 | 41210028 | 20121214
6 | 4120302701 | 20130926
7 | 41210030 | 20130926
8 | 4112700803 | 20131107
9 | 4112700803 | 20131107
10 | 41215030 | 20120720
What I am looking for is a single line that outputs the following:
TotalDistinctSalesRefsInTable1 HighestSaleDateValueInTable2 (that has a matching value in table 1)
9 20141017
That is, the total number of distinct SALESREF values in table 1 and the latest SALEDATE value from table 2.
I've tried nesting a select within a query but quickly hit the limit of my knowledge. I do know that I can get the latest overall sale date with:
SELECT MAX(LEDGER.SALEDATE) AS LAST_DATE FROM LEDGER
I just need help piecing the whole thing together.

You can use a LEFT JOIN together with COUNT(DISTINCT ...) and MAX to get the desired result. IFNULL (MySQL syntax) returns 0 when no row of table 1 has a match in LEDGER:
select count(distinct t1.salesref) as TotalDistinctSalesRefsInTable1,
ifnull(max(l.saledate),0) as HighestSaleDateValueInTable2
from table1 t1
left join ledger l
on t1.salesref = l.salesref
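
To check the query against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database and the names table1, ledger and salesref are assumptions made for illustration; SQLite happens to accept IFNULL as well.

```python
import sqlite3

# Sample data from the question; table and column names are assumed.
TABLE1 = ([40303020, 40303021, 40303021, 40303021, 41210028, 4120302701,
           41210030, 4112700803, 4112700803, 41215030]
          + [41215026] * 7 + [41215027] * 3)
LEDGER = [(4081200201, 20140804), (40303020, 20141015), (40303021, 20141017),
          (40303021, 20141017), (40303021, 20141017), (41210028, 20121214),
          (4120302701, 20130926), (41210030, 20130926), (4112700803, 20131107),
          (4112700803, 20131107), (41215030, 20120720)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (salesref INTEGER)")
conn.execute("CREATE TABLE ledger (salesref INTEGER, saledate INTEGER)")
conn.executemany("INSERT INTO table1 VALUES (?)", [(ref,) for ref in TABLE1])
conn.executemany("INSERT INTO ledger VALUES (?, ?)", LEDGER)

# The LEFT JOIN keeps every table1 row, so the distinct count covers all of
# table 1, while MAX only sees sale dates of references that exist in table 1.
result = conn.execute("""
    SELECT COUNT(DISTINCT t1.salesref) AS TotalDistinctSalesRefsInTable1,
           IFNULL(MAX(l.saledate), 0)  AS HighestSaleDateValueInTable2
    FROM table1 t1
    LEFT JOIN ledger l ON t1.salesref = l.salesref
""").fetchone()
print(result)
```

With the sample rows this yields the single line the question asks for: 9 distinct references and 20141017 as the latest matching sale date.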

Related

full outer join in redshift

I have 2 tables A and B with columns, containing some details of students (all columns are integer):
A:
st_id,
st_subject_id,
B:
st_id,
st_subject_id,
st_count1,
st_count2
st_id means student id, st_subject_id is subject id.
For student id 15, there are following entries:
A:
15 | 1
15 | 2
15 | 3
B:
15 | 1 | 31 | 11
15 | 2 | 30 | 14
15 | 4 | 21 | 6
15 | 5 | 26 | 9
There are 3 subjects in table A and 4 subjects in table B (2 matching table A and 2 extra).
I want to display the final result as:
15 | 1 | 31 | 11
15 | 2 | 30 | 14
15 | 3 | null | null
15 | 4 | 21 | 6
15 | 5 | 26 | 9
Can this be done using full outer join in SQL, or by another method?
Something like this should work, though I can't test it right now.
COALESCE returns its first non-null argument, so st_id and st_subject_id are taken from whichever table has the row.
select
coalesce(A.st_id, B.st_id) st_id,
coalesce(A.st_subject_id, B.st_subject_id) st_subject_id,
B.st_count1,
B.st_count2
from A
full outer join B
on A.st_id = B.st_id and A.st_subject_id = B.st_subject_id
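
As for "another method": where FULL OUTER JOIN is unavailable, the same result can be sketched as a LEFT JOIN from A plus the rows of B that have no partner in A. The example below is an illustration only, run against an in-memory SQLite database through Python's sqlite3 module rather than Redshift; the data is the student 15 sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (st_id INTEGER, st_subject_id INTEGER);
    CREATE TABLE B (st_id INTEGER, st_subject_id INTEGER,
                    st_count1 INTEGER, st_count2 INTEGER);
    INSERT INTO A VALUES (15, 1), (15, 2), (15, 3);
    INSERT INTO B VALUES (15, 1, 31, 11), (15, 2, 30, 14),
                         (15, 4, 21, 6), (15, 5, 26, 9);
""")

rows = conn.execute("""
    -- Every row of A, with counts from B where a match exists
    SELECT A.st_id, A.st_subject_id, B.st_count1, B.st_count2
    FROM A
    LEFT JOIN B ON A.st_id = B.st_id AND A.st_subject_id = B.st_subject_id
    UNION ALL
    -- Rows of B that have no partner in A
    SELECT B.st_id, B.st_subject_id, B.st_count1, B.st_count2
    FROM B
    WHERE NOT EXISTS (SELECT 1 FROM A
                      WHERE A.st_id = B.st_id
                        AND A.st_subject_id = B.st_subject_id)
    ORDER BY 1, 2
""").fetchall()

for row in rows:
    print(row)
```

Subject 3 comes out with NULL counts (None in Python), and subjects 4 and 5 are carried over from B, matching the desired result.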

How to return the same period last year data with SQL?

I am trying to create a view in PostgreSQL with the following requirement:
The view needs to show, for every record, the value from the same period last year.
Sample data:
date_sk | location_sk | division_sk | employee_type_sk | value
20180202 | 6 | 8 | 4 | 1
20180202 | 7 | 2 | 4 | 2
20190202 | 6 | 8 | 4 | 1
20190202 | 7 | 2 | 4 | 1
20200202 | 6 | 8 | 4 | 1
20200202 | 7 | 2 | 4 | 3
In the table, date_sk, location_sk, division_sk and employee_type_sk together form a composite key that uniquely identifies a record.
You can check the required output as below:
date_sk | location_sk | division_sk | employee_type_sk | value | value_last_year
20180202 | 6 | 8 | 4 | 1 | NULL
20180203 | 7 | 2 | 4 | 2 | NULL
20190202 | 6 | 8 | 4 | 1 | 1
20190203 | 7 | 3 | 4 | 1 | NULL
20200202 | 6 | 8 | 4 | 1 | 1
20200203 | 7 | 3 | 4 | 3 | 1
The records start on 20180202, so the data for the same period last year is unavailable for the first rows. At the 4th record, division_sk differs from the same period last year, hence value_last_year is NULL.
My current solution is to create a view from the sample data with an additional column, same_date_last_year, and then LEFT JOIN it to the same table. The SQL queries are below:
CREATE VIEW test_view AS
SELECT *,
CONCAT(LEFT(date_sk, 4) - 1, RIGHT(date_sk, 4)) AS same_date_last_year
FROM test_table;
SELECT
test_view.date_sk,
test_view.location_sk,
test_view.division_sk,
test_view.employee_type_sk,
test_view.value,
test_table.value AS value_last_year
FROM test_view
LEFT JOIN test_table ON (test_view.same_date_last_year = test_table.date_sk)
We have a lot of data in the table, and the performance of my solution above is unacceptable.
Is there a different query that yields the same result and might perform better?
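You could use a correlated subquery here; with an index covering the four key columns it is likely to perform well. The query below assumes date_sk is a date column. If date_sk is stored as an integer in YYYYMMDD form, replace the interval arithmetic with t.date_sk - 10000:
select *,
(
select value from t t2
where t2.date_sk=t.date_sk - interval '1' year and
t2.location_sk=t.location_sk and
t2.division_sk=t.division_sk and
t2.employee_type_sk=t.employee_type_sk
) as value_last_year
from t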
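
A minimal sketch of the correlated subquery for an integer YYYYMMDD key, run in an in-memory SQLite database through Python's sqlite3 module. The table name t, the index name t_key and the use of SQLite instead of PostgreSQL are assumptions for illustration; the rows are the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (date_sk INTEGER, location_sk INTEGER, division_sk INTEGER,
                    employee_type_sk INTEGER, value INTEGER);
    INSERT INTO t VALUES
        (20180202, 6, 8, 4, 1), (20180202, 7, 2, 4, 2),
        (20190202, 6, 8, 4, 1), (20190202, 7, 2, 4, 1),
        (20200202, 6, 8, 4, 1), (20200202, 7, 2, 4, 3);
    -- An index on the key columns lets each subquery run as a lookup
    CREATE INDEX t_key ON t (date_sk, location_sk, division_sk, employee_type_sk);
""")

rows = conn.execute("""
    SELECT t.*,
           (SELECT t2.value
            FROM t AS t2
            WHERE t2.date_sk = t.date_sk - 10000   -- same month and day, one year earlier
              AND t2.location_sk = t.location_sk
              AND t2.division_sk = t.division_sk
              AND t2.employee_type_sk = t.employee_type_sk) AS value_last_year
    FROM t
    ORDER BY t.date_sk, t.location_sk
""").fetchall()

for row in rows:
    print(row)
```

Subtracting 10000 from a YYYYMMDD integer keeps the month and day and moves the year back by one. A row dated 29 February then has no counterpart in the previous year and simply gets NULL.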
Alternatively, LAG can fetch the previous row for the same keys. I assume DATE_SK is a date column or can be cast to one. Note that LAG returns the preceding row in the partition, which is last year's value only when every key combination has exactly one row per year and no year is missing.
WITH CTE (DATE_SK, LOCATION_SK, DIVISION_SK, EMPLOYEE_TYPE_SK, VALUE) AS
(
SELECT CAST('20180202' AS DATE), 6, 8, 4, 1 UNION ALL
SELECT CAST('20180203' AS DATE), 7, 2, 4, 2 UNION ALL
SELECT CAST('20190202' AS DATE), 6, 8, 4, 1 UNION ALL
SELECT CAST('20190203' AS DATE), 7, 2, 4, 1 UNION ALL
SELECT CAST('20200202' AS DATE), 6, 8, 4, 1 UNION ALL
SELECT CAST('20200203' AS DATE), 7, 2, 4, 3
)
SELECT C.DATE_SK, C.LOCATION_SK, C.DIVISION_SK, C.EMPLOYEE_TYPE_SK, C.VALUE,
LAG(C.VALUE) OVER (PARTITION BY C.LOCATION_SK, C.DIVISION_SK, C.EMPLOYEE_TYPE_SK ORDER BY C.DATE_SK ASC) AS VALUE_LAST_YEAR
FROM CTE AS C
ORDER BY C.DATE_SK ASC;
Please check whether this works for you.
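
A guard that compares the lagged date with the current date can keep LAG from picking up a row from a different year. The sketch below is an illustration under several assumptions: it uses SQLite 3.25 or later (for window functions) through Python's sqlite3 module, an integer YYYYMMDD date_sk, a table named t, and hypothetical rows in which the year 2020 is missing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (date_sk INTEGER, location_sk INTEGER, division_sk INTEGER,
                    employee_type_sk INTEGER, value INTEGER);
    -- Hypothetical rows: the same key in 2018, 2019 and 2021 (2020 is missing)
    INSERT INTO t VALUES
        (20180202, 6, 8, 4, 1), (20190202, 6, 8, 4, 2), (20210202, 6, 8, 4, 3);
""")

# One window definition, reused for both LAG calls
WINDOW = ("OVER (PARTITION BY location_sk, division_sk, employee_type_sk "
          "ORDER BY date_sk)")

rows = conn.execute(f"""
    SELECT date_sk,
           value,
           LAG(value) {WINDOW} AS previous_value,
           -- Keep the lagged value only if it is exactly one year older
           CASE WHEN LAG(date_sk) {WINDOW} = date_sk - 10000
                THEN LAG(value) {WINDOW}
           END AS value_last_year
    FROM t
    ORDER BY date_sk
""").fetchall()

for row in rows:
    print(row)
```

For the 2021 row, previous_value is the 2019 value, while the guarded value_last_year column stays NULL because no 2020 row exists.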

SQL - summing up minutes in the table for all the rows with the same month as their date and store it in a column for each row

I have a table as follow:
id |minutes |sumOfMinutes|Date
_______________________________________
1 | 5 | | 20141106
1 | 7 | | 20141106
2 | 1 | | 20141106
2 | 9 | | 20141106
3 | 8 | | 20141106
How can I store the sum of minutes in the third column for all rows in the same month, so that I have:
id |minutes |sumOfMinutes| Date
_____________________________________
1 | 5 | 12 | 20141106
1 | 7 | 12 | 20141112
2 | 1 | 18 | 20141006
2 | 9 | 18 | 20141007
3 | 8 | 18 | 20141009
Use SUM() and GROUP BY with a self-join on the month:
SELECT table1.id, table1.minutes, SUM(monthTot.minutes) AS sumOfMinutes, table1.Date
FROM table1
JOIN table1 AS monthTot ON
MONTH(monthTot.date) = MONTH(table1.date)
GROUP BY table1.id, table1.minutes, table1.Date
SUM with a PARTITION BY window can also be used to achieve this:
select id, [minutes],
sum([minutes]) over ( partition by month([date]) ) as sumOfMinutes,
[Date]
from Table1
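
If the total really has to be stored in the sumOfMinutes column rather than computed on the fly, an UPDATE with a correlated subquery is one option. This is a minimal sketch under assumptions: it runs in SQLite through Python's sqlite3 module, the Date column is an integer in YYYYMMDD form, and the rows are the ones from the desired output in the question. Dividing the date by 100 gives a year-plus-month key, so the same month in different years is not merged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER, minutes INTEGER,
                         sumOfMinutes INTEGER, Date INTEGER);
    INSERT INTO Table1 (id, minutes, Date) VALUES
        (1, 5, 20141106), (1, 7, 20141112),
        (2, 1, 20141006), (2, 9, 20141007), (3, 8, 20141009);
""")

# Integer division: 20141106 / 100 = 201411, i.e. year and month
conn.execute("""
    UPDATE Table1
    SET sumOfMinutes = (SELECT SUM(m.minutes)
                        FROM Table1 AS m
                        WHERE m.Date / 100 = Table1.Date / 100)
""")

rows = conn.execute(
    "SELECT id, minutes, sumOfMinutes, Date FROM Table1 ORDER BY id, Date"
).fetchall()
for row in rows:
    print(row)
```

A stored total goes stale as soon as rows are inserted or changed, so the window-function query is usually the safer choice unless the column is recomputed after each change.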

SQL query to get the same set of results

This should be a simple one, but say I have a table with data like this:
| ID | Date | Value |
| 1 | 01/01/2013 | 40 |
| 2 | 03/01/2013 | 20 |
| 3 | 10/01/2013 | 30 |
| 4 | 14/02/2013 | 60 |
| 5 | 15/03/2013 | 10 |
| 6 | 27/03/2013 | 70 |
| 7 | 01/04/2013 | 60 |
| 8 | 01/06/2013 | 20 |
What I want is the sum of values per week of the year, showing ALL weeks (for use in an Excel graph).
My query only gives me the weeks that actually occur in the data.
A query can only return rows that come from some row source, so weeks without data will not appear on their own. To get the effect you want, you could create a table called WeeksInYear with a single Int field, WeekNumber, and populate it with all the week numbers. Then JOIN that table to this one.
The query would then look something like the following:
SELECT w.WeekNumber, COALESCE(SUM(m.Value), 0) AS Total
FROM MyTable AS m
RIGHT OUTER JOIN WeeksInYear AS w
ON DATEPART(wk, m.date) = w.WeekNumber
GROUP BY w.WeekNumber
Weeks with no rows in MyTable still appear in the result. SUM returns NULL for them, which COALESCE turns into 0.
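
The week-table approach can be tried out with a small sketch in SQLite via Python's sqlite3 module. Several things here are assumptions for illustration: the dates are rewritten in ISO form (the question shows dd/mm/yyyy), strftime('%W') stands in for DATEPART(wk, ...) and numbers weeks from 00 to 53 with week 01 starting on the first Monday, and the join is written as a LEFT JOIN from WeeksInYear, which is equivalent to the RIGHT OUTER JOIN form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER, Date TEXT, Value INTEGER)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", [
    (1, "2013-01-01", 40), (2, "2013-01-03", 20), (3, "2013-01-10", 30),
    (4, "2013-02-14", 60), (5, "2013-03-15", 10), (6, "2013-03-27", 70),
    (7, "2013-04-01", 60), (8, "2013-06-01", 20),
])

# Helper table holding every week number that strftime('%W') can produce
conn.execute("CREATE TABLE WeeksInYear (WeekNumber INTEGER)")
conn.executemany("INSERT INTO WeeksInYear VALUES (?)",
                 [(week,) for week in range(54)])

rows = conn.execute("""
    SELECT w.WeekNumber, COALESCE(SUM(m.Value), 0) AS Total
    FROM WeeksInYear AS w
    LEFT JOIN MyTable AS m
      ON CAST(strftime('%W', m.Date) AS INTEGER) = w.WeekNumber
    GROUP BY w.WeekNumber
    ORDER BY w.WeekNumber
""").fetchall()

totals = dict(rows)
print(len(rows), totals[0], totals[2])
```

Every week number appears exactly once, and weeks without any rows in MyTable report 0, which is what a graph over all weeks needs.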

Simple Sum & Group

I have a table that has 2 simple fields: RoomNumber & RoomEarned
I would like to group rooms that have multiple RoomEarned values and sum them, so that each room appears only once.
Basically, turning this table...
RoomNumber | RoomEarned
1 | 13.23
2 | 23.79
3 | 50.75
4 | 32.90
10 | 11.31
11 | 31.83
12 | 13.92
12 | 18.82
13 | 41.87
14 | 87.74
15 | 100.83
into this...
RoomNumber | RoomEarned
1 | 13.23
2 | 23.79
3 | 50.75
4 | 32.90
10 | 11.31
11 | 31.83
12 | 32.74
13 | 41.87
14 | 87.74
15 | 100.83
Obviously it's a grouping function, but my abilities fall terribly short.
Any ideas?
select RoomNumber, SUM(RoomEarned) as RoomEarned from MyTable group by RoomNumber
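
A quick way to confirm the grouping on the sample rows, sketched with Python's sqlite3 module and an in-memory database. The table name MyTable comes from the answer; the REAL column type and the ROUND to two decimals (to hide floating-point noise) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (RoomNumber INTEGER, RoomEarned REAL)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?)", [
    (1, 13.23), (2, 23.79), (3, 50.75), (4, 32.90), (10, 11.31), (11, 31.83),
    (12, 13.92), (12, 18.82), (13, 41.87), (14, 87.74), (15, 100.83),
])

# One output row per room; the two rows for room 12 collapse into one sum
rows = conn.execute("""
    SELECT RoomNumber, ROUND(SUM(RoomEarned), 2) AS RoomEarned
    FROM MyTable
    GROUP BY RoomNumber
    ORDER BY RoomNumber
""").fetchall()

earned = dict(rows)
print(earned[12])
```

Rooms with a single row keep their value unchanged, since the sum of one value is that value.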