TSQL query to return most recent record based on another column's value - sql

I have a table that contains a list of expiration dates for various companies. The table looks like the following:
ID CompanyID Expiration
--- ---------- ----------
1 1 2016-01-01
2 1 2015-01-01
3 2 2016-04-02
4 2 2015-04-02
5 3 2014-01-03
6 4 2015-04-09
7 5 2015-07-20
8 5 2016-05-01
I am trying to build a TSQL query that will return just the most recent record for every company (i.e. CompanyID), such as:
ID CompanyID Expiration
--- ---------- ----------
1 1 2016-01-01
3 2 2016-04-02
5 3 2014-01-03
6 4 2015-04-09
8 5 2016-05-01

It looks like there is an exact correlation between ID and Expiration. If that is true, i.e. the later the Expiration the higher the ID, then you could simply pull Max(ID) and Max(Expiration), which are 1:1, and group by CompanyID:
Select max(ID), CompanyID, max(Expiration) from Table group by CompanyID
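The shortcut above depends on ID and Expiration rising together. In the sample rows that does not hold for every company (for CompanyID 1 the later Expiration sits on ID 1, not ID 2), so max(ID) and max(Expiration) can come from different rows. A minimal sketch that does not rely on the correlation: find each company's latest Expiration, then join back to pick up the matching row. It runs on Python's built-in sqlite3 only so it can be tried as-is; the table name Expirations is made up, and the SELECT itself is plain SQL that should carry over to T-SQL.

```python
import sqlite3

# Sample rows copied from the question; the table name "Expirations" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Expirations (ID INTEGER PRIMARY KEY, CompanyID INTEGER, Expiration TEXT)"
)
conn.executemany(
    "INSERT INTO Expirations VALUES (?, ?, ?)",
    [
        (1, 1, "2016-01-01"), (2, 1, "2015-01-01"),
        (3, 2, "2016-04-02"), (4, 2, "2015-04-02"),
        (5, 3, "2014-01-03"), (6, 4, "2015-04-09"),
        (7, 5, "2015-07-20"), (8, 5, "2016-05-01"),
    ],
)

# Inner query: latest Expiration per company.
# Outer join: fetch the full row that carries that date.
latest = conn.execute(
    """
    SELECT e.ID, e.CompanyID, e.Expiration
    FROM Expirations AS e
    JOIN (SELECT CompanyID, MAX(Expiration) AS MaxExpiration
          FROM Expirations
          GROUP BY CompanyID) AS m
      ON m.CompanyID = e.CompanyID
     AND m.MaxExpiration = e.Expiration
    ORDER BY e.CompanyID
    """
).fetchall()

for row in latest:
    print(row)
```

If two rows of one company share the same latest Expiration, this returns both of them, so a tie-breaker would be needed if exactly one row per company is required.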

Related

I need to count the average of each day's records and size in MB for each file created in a day. For a whole year

I ask for your help after several unsuccessful attempts.
I am learning with PL SQL. I am using Oracle SQL developer v.20
I have this situation. My data set looks like this:
id_file size_byte created_at
_________ _________ ____________________________
1 45323 17-FEB-22 17:21:13,726874000
2 41232 17-FEB-22 17:21:13,740587004
3 1234456 20-FEB-22 17:25:13,368874058
4 233545488 20-FEB-22 17:21:18,400049000
5 233545488 21-FEB-22 18:11:18,058746868
So my desired output would be something like this for year 2022:
TOT_records AVG_file_created_for_day TOT_size_files AVG_size_files_created_each_day
___________ ________________________ ______________ _______________________________
9.999.999 10.000 999.999.999 5 MB (default is byte)
ID is type NUMBER, SIZE_BYTE is type NUMBER, CREATED_AT is TIMESTAMP(6)
My table is partitioned for each year, PARTITION_DATE is type DATE
There's some ambiguity on things like "average file size per day"... That could be:
sum all file sizes / total number of days, or
average of files size per day, then take average of that average
Anyway, here's some stuff to get you going (I'm assuming the latter above)
SQL> create table t as
  select
    rownum id_file,
    dbms_random.value(1000,20000000) bytes,
    date '2021-01-01' + dbms_random.value(1,700) created_at
  from dual
  connect by level <= 5000;
Table created.
SQL> select * from t
  where rownum <= 20;
ID_FILE BYTES CREATED_A
---------- ---------- ---------
1 19305636.7 02-SEP-22
2 6305773.83 10-OCT-21
3 11939117.8 04-NOV-21
4 11039507.9 01-SEP-21
5 15555516.8 02-NOV-22
6 2809048.47 13-SEP-22
7 2070381.41 18-DEC-21
8 11116786.1 11-MAR-22
9 17519679.8 21-DEC-21
10 6728222.84 02-APR-22
11 7569442.31 07-AUG-22
12 16949454.2 06-JUL-21
13 8019443.02 03-JUN-21
14 13147674.9 31-AUG-21
15 14590702.5 16-JUL-22
16 13028609.7 11-MAY-21
17 5466477.07 06-APR-22
18 4469902.12 08-MAY-21
19 14511096 31-MAY-22
20 5245726.03 12-JUL-21
20 rows selected.
SQL> select
  count(*) total_records,
  avg(daily_size_avg)/1024/1024 avg_size_files_per_day_mb,
  sum(bytes)/1024/1024/1024 tot_bytes_gb,
  avg(files_per_day) avg_files_per_day
from
(
  select
    bytes,
    avg(bytes) over ( partition by trunc(created_at) ) daily_size_avg,
    count(*) over ( partition by trunc(created_at) ) files_per_day
  from t
);
TOTAL_RECORDS AVG_SIZE_FILES_PER_DAY_MB TOT_BYTES_GB AVG_FILES_PER_DAY
------------- ------------------------- ------------ -----------------
5000 9.5313187 46.5396421 8.092
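One thing to watch with the analytic version: the inner query still returns one row per file, so the outer avg() counts a day once for every file created on it, and busy days weigh more. If each day should count exactly once, one option is to group by day first and then aggregate the per-day rows. Below is a minimal sketch of that reading on three made-up rows, run through Python's sqlite3 so it can be tried directly. The table name files and the tiny sizes are assumptions; in Oracle the day bucket would be trunc(created_at), as in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (id_file INTEGER, size_byte INTEGER, created_at TEXT)")
# Made-up data: two files on one day, one file on another.
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?)",
    [
        (1, 10, "2022-02-17 17:21:13"),
        (2, 20, "2022-02-17 17:25:40"),
        (3, 60, "2022-02-20 09:00:00"),
    ],
)

# Step 1 (inner query): collapse to one row per calendar day.
# Step 2 (outer query): aggregate those per-day rows, so every day weighs the same.
tot_records, avg_files_per_day, tot_bytes, bytes_per_day, avg_of_daily_avg = conn.execute(
    """
    SELECT SUM(files_in_day)                  AS tot_records,
           AVG(files_in_day)                  AS avg_files_per_day,
           SUM(bytes_in_day)                  AS tot_bytes,
           SUM(bytes_in_day) * 1.0 / COUNT(*) AS bytes_per_day,
           AVG(avg_size_in_day)               AS avg_of_daily_avg
    FROM (SELECT date(created_at) AS day,
                 COUNT(*)         AS files_in_day,
                 SUM(size_byte)   AS bytes_in_day,
                 AVG(size_byte)   AS avg_size_in_day
          FROM files
          GROUP BY date(created_at)) AS d
    """
).fetchone()

print(tot_records, avg_files_per_day, tot_bytes, bytes_per_day, avg_of_daily_avg)
```

Here bytes_per_day is the first interpretation (all bytes divided by the number of days) and avg_of_daily_avg is the second (average of the per-day averages). Only days that have at least one file are counted.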

Creating a new calculated column in SQL

Is there a way to get the result below? June 24 appears 2 times, so that day has 2 UDs, and the rest are single days.
I am showing the expected output here:
Primary key UD Date
-------------------------------------------
1 123 2015-06-24 00:00:00.000
6 456 2015-06-24 00:00:00.000
2 123 2015-06-25 00:00:00.000
3 658 2015-06-26 00:00:00.000
4 598 2015-06-27 00:00:00.000
5 156 2015-06-28 00:00:00.000
No of times Number of days
-----------------------------
4 1
2 2
The logic is that 4 users used the application on 1 day and 2 users used the application on 2 days.
You can use two levels of aggregation:
select cnt, count(*)
from (select date, count(*) as cnt
from t
group by date
) d
group by cnt
order by cnt desc;
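To see what the two levels produce, here is the same query run against the sample rows from the question, using Python's sqlite3 as a stand-in engine. The column names pk and ud are made up for the sketch; t and date come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# pk and ud are hypothetical column names for "Primary key" and "UD".
conn.execute("CREATE TABLE t (pk INTEGER PRIMARY KEY, ud INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, 123, "2015-06-24"), (6, 456, "2015-06-24"),
        (2, 123, "2015-06-25"), (3, 658, "2015-06-26"),
        (4, 598, "2015-06-27"), (5, 156, "2015-06-28"),
    ],
)

# Level 1 (inner): rows per date.
# Level 2 (outer): how many dates share each row count.
result = conn.execute(
    """
    SELECT cnt, COUNT(*)
    FROM (SELECT date, COUNT(*) AS cnt
          FROM t
          GROUP BY date) AS d
    GROUP BY cnt
    ORDER BY cnt DESC
    """
).fetchall()

print(result)  # [(2, 1), (1, 4)]
```

The first value in each pair is the number of rows on a date and the second is how many dates have that count. If the count should be per user instead (how many distinct days each UD appears on), the inner query would group by ud rather than date.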

How to UNION ALL tables with dynamic column headers

I am trying to UNION ALL a bunch of tables. Most of them have the exact same structure and column headings and data types which works fine. The tables have some column headings which dynamically change each month. Most tables look like this:
Table 1:
Type 2018-08 2018-09 2018-10 2018-11 2018-12
------ --------- --------- --------- --------- ---------
1 10 16 8 4 11
2 17 21 6 9 14
3 12 12 10 5 10
The month columns change every month. The new month is added and the oldest month removed. The number of columns doesn't change.
The problem is when I try to UNION ALL tables which have an extra column like so:
Table 2:
Type Category 2018-08 2018-09 2018-10 2018-11 2018-12
------ ---------- --------- --------- --------- --------- ---------
1 A 10 16 8 4 11
2 B 17 21 6 9 14
3 A 12 12 10 5 10
Normally I would just:
SELECT [Type], '' AS Category, [2018-08], [2018-09], [2018-10], [2018-11], [2018-12]
FROM Table1
UNION ALL
SELECT [Type], Category, [2018-08], [2018-09], [2018-10], [2018-11], [2018-12]
FROM Table2
The problem with this is that I would have to update the month column names manually every month.
I also have a table with a different extra column like so:
Table 3:
Type Organisation 2018-08 2018-09 2018-10 2018-11 2018-12
------ -------------- --------- --------- --------- --------- ---------
11 South 15 12 6 8 18
13 West 14 9 9 11 16
21 North 10 15 13 14 16
I tried to:
SELECT '' AS Category, '' AS Organisation, *
FROM Table1
UNION ALL
SELECT Category, '' AS Organisation, *
FROM Table2
UNION ALL
SELECT '' AS Category, Organisation, *
FROM Table3
But this also didn't work as it was still including all columns which weren't matching up.
Is it possible to UNION ALL these tables without specifying the column names?
Appreciate any help.
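One possible direction, sketched here under assumptions rather than offered as a definitive answer: read the column names from the database's metadata and generate the UNION ALL text, so the month names never have to be typed by hand. The sketch uses Python's sqlite3 and its PRAGMA table_info so it is runnable; build_union_all, quote and column_names are made-up helper names, and it assumes every table carries the same month columns at any given time. On SQL Server the same idea would read the names from INFORMATION_SCHEMA.COLUMNS and run the generated string with sp_executesql.

```python
import sqlite3

def quote(name):
    """Quote an identifier so names like 2018-12 are valid column references."""
    return '"' + name.replace('"', '""') + '"'

def column_names(conn, table):
    # PRAGMA table_info returns one row per column; index 1 holds the column name.
    return [r[1] for r in conn.execute("PRAGMA table_info(" + quote(table) + ")")]

def build_union_all(conn, tables, key_col="Type", extra_cols=("Category", "Organisation")):
    # Month columns: whatever the first table has besides the key and the optional extras.
    months = [c for c in column_names(conn, tables[0])
              if c != key_col and c not in extra_cols]
    selects = []
    for table in tables:
        present = set(column_names(conn, table))
        parts = [quote(key_col)]
        for col in extra_cols:
            # Pad with an empty string when this table lacks the optional column.
            parts.append(quote(col) if col in present else "'' AS " + quote(col))
        parts.extend(quote(m) for m in months)
        selects.append("SELECT " + ", ".join(parts) + " FROM " + quote(table))
    return "\nUNION ALL\n".join(selects)

# Cut-down versions of the three tables from the question (two month columns only).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Table1 ("Type" INTEGER, "2018-11" INTEGER, "2018-12" INTEGER)')
conn.execute('CREATE TABLE Table2 ("Type" INTEGER, Category TEXT, "2018-11" INTEGER, "2018-12" INTEGER)')
conn.execute('CREATE TABLE Table3 ("Type" INTEGER, Organisation TEXT, "2018-11" INTEGER, "2018-12" INTEGER)')
conn.execute("INSERT INTO Table1 VALUES (1, 4, 11)")
conn.execute("INSERT INTO Table2 VALUES (1, 'A', 4, 11)")
conn.execute("INSERT INTO Table3 VALUES (11, 'South', 8, 18)")

sql = build_union_all(conn, ["Table1", "Table2", "Table3"])
print(sql)
rows = conn.execute(sql).fetchall()
print(rows)
```

Because the month list is read at run time, the generated statement follows the tables when the oldest month drops off and a new one is added.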

Insert multiple rows from result of Average by date and id

I have a table with 1 result per day like this :
id | item_id | date | amount
-------------------------------------
1 1 2019-01-01 1
2 1 2019-01-02 2
3 1 2019-01-03 3
4 1 2019-01-04 4
5 1 2019-01-05 5
6 2 2019-01-01 1
7 2 2019-01-01 2
8 2 2019-01-01 3
9 2 2019-01-01 4
10 2 2019-01-01 5
11 3 2019-01-01 1
12 3 2019-01-01 2
13 3 2019-01-01 3
14 3 2019-01-01 4
15 3 2019-01-01 5
First I was trying to average the column amount for each day.
SELECT
x.item_id AS id,avg(x.amount) AS result
FROM
(SELECT
il.item_id, il.amount,
ROW_NUMBER() OVER (PARTITION BY il.item_id ORDER BY il.date DESC) rn
FROM
item_prices il) x
WHERE
x.rn BETWEEN 1 AND 50
GROUP BY
x.item_id
The result is going to be the following if calculated on 2019-01-05
item_id | average
1 3
2 3
3 3
or, if calculated 2019-01-04
item_id | average
1 2.5
2 2.5
3 2.5
My goal is to run the average query every day, so that it updates the average automatically and inserts it into a 5th column, "average":
id | item_id | date | amount | average
5 1 2019-01-05 5 3
10 2 2019-01-05 5 3
15 3 2019-01-05 5 3
The issue is that every example I can find of INSERT with SELECT only updates one row and reads from another table. There is also the problem of picking the most recent date.
Can someone point me in the right direction?
Perhaps you want to see a running average for every day. Storing the value in a separate column is bound to cause problems: whenever rows are updated or deleted, the column also needs to be updated, which would require complex triggers.
Simply create a view and query it whenever you want to check the average.
CREATE OR REPLACE VIEW v_item_prices AS
SELECT t.*,avg(t.amount) OVER ( PARTITION BY item_id order by date)
AS average FROM item_prices t
order by item_id,date
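A quick way to check what such a view returns is to run it on the item 1 rows from the question. The sketch below uses Python's sqlite3 (window functions need SQLite 3.25 or newer) and drops the ORDER BY from the view, sorting in the outer query instead; that is a choice made for this sketch, not something the answer requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_prices (id INTEGER PRIMARY KEY, item_id INTEGER, date TEXT, amount INTEGER)"
)
# Item 1 from the question: amounts 1..5 on 2019-01-01 .. 2019-01-05.
conn.executemany(
    "INSERT INTO item_prices VALUES (?, ?, ?, ?)",
    [(i, 1, "2019-01-0%d" % i, i) for i in range(1, 6)],
)

# Running average per item: each row averages all amounts up to and including its date.
conn.execute(
    """
    CREATE VIEW v_item_prices AS
    SELECT t.*,
           AVG(t.amount) OVER (PARTITION BY t.item_id ORDER BY t.date) AS average
    FROM item_prices AS t
    """
)

running = conn.execute(
    "SELECT date, average FROM v_item_prices ORDER BY item_id, date"
).fetchall()

for row in running:
    print(row)
```

The value for 2019-01-04 is 2.5 and for 2019-01-05 is 3.0, which matches the averages the question expects. Rows that share the same item_id and date are treated as peers by the default window frame, so they all receive the same average.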

Max date among records and across tables - SQL Server

I tried my best to provide this in table format, but it does not render well on Stack Overflow, so I am attaching a snapshot of the 2 tables. Apologies for the formatting.
SQL Server 2012
MS Table
mId tdId name dueDate
1 1 forecastedDate 1/1/2015
2 1 hypercareDate 11/30/2016
3 1 LOE 1 7/4/2016
4 1 LOE 2 7/4/2016
5 1 demo for yy test 10/15/2016
6 1 Implementation – testing 7/4/2016
7 1 Phased Rollout – final 7/4/2016
8 2 forecastedDate 1/7/2016
9 2 hypercareDate 11/12/2016
10 2 domain - Forte NULL
11 2 Fortis completion 1/1/2016
12 2 Certification NULL
13 2 Implementation 7/4/2016
-----------------------------------------------
MSRevised
mId revisedDate
1 1/5/2015
1 1/8/2015
3 3/25/2017
2 2/1/2016
2 12/30/2016
3 4/28/2016
4 4/28/2016
5 10/1/2016
6 7/28/2016
7 7/28/2016
8 4/28/2016
9 8/4/2016
9 5/28/2016
11 10/4/2016
11 10/5/2016
13 11/1/2016
----------------------------------------
The required output is
1. I will be passing in the 'tId' number, for instance 1; let's call it tId (1).
2. I want to compare all of tId (1)'s milestones (except hypercareDate) with tId (1)'s forecastedDate milestone.
3. Return whether any milestone date (other than hypercareDate) is greater than the forecastedDate.
The above 3 steps are simple, but I first have to compare each milestone's date with its corresponding revised dates, if any, from the revised table, and pick the max date among all of them. That max is what needs to be compared with the forecastedDate.
I managed to solve this. Posting the answer in the hope that it helps somebody.
-- Insert the result into a temp table
INSERT INTO #mstab
SELECT [mId]
     , [tId]
     , [msDate]
FROM [dbo].[MS]
WHERE ([msName] NOT LIKE 'forecastedDate' AND [msName] NOT LIKE 'hypercareDate')
-- This scalar function gets the max date between the forecasted due date and the forecasted revised date
SELECT @maxForecastedDate = [dbo].[fnGetMaxDate]('forecastedDate');
-- This gets the max date from the temp table, to be compared with the forecasted date
SET @maxmilestoneDate = (SELECT MAX(maxDate)
    FROM ( SELECT ms.msDueDate AS dueDate
                , mr.msRevisedDate AS revDate
           FROM #mstab AS ms
           LEFT JOIN [MSRev] AS mr ON ms.msId = mr.msId
         ) maxDate
    UNPIVOT (maxDate FOR DateCols IN (dueDate, revDate)) up );
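For anyone who wants to try the idea without the temp table and the scalar function, here is a rough sketch of the same comparison on the tdId 1 rows from the question. It runs on Python's sqlite3 and stacks due dates and revised dates with UNION ALL, since SQLite has no UNPIVOT; that swap, the helper name max_date, and the ISO date strings are choices made for this sketch and are not part of the code above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MS (mId INTEGER PRIMARY KEY, tdId INTEGER, name TEXT, dueDate TEXT)")
conn.execute("CREATE TABLE MSRevised (mId INTEGER, revisedDate TEXT)")
# tdId 1 rows from the question, with dates rewritten as ISO strings so MAX() sorts correctly.
conn.executemany("INSERT INTO MS VALUES (?, ?, ?, ?)", [
    (1, 1, "forecastedDate", "2015-01-01"),
    (2, 1, "hypercareDate", "2016-11-30"),
    (3, 1, "LOE 1", "2016-07-04"),
    (4, 1, "LOE 2", "2016-07-04"),
    (5, 1, "demo for yy test", "2016-10-15"),
    (6, 1, "Implementation – testing", "2016-07-04"),
    (7, 1, "Phased Rollout – final", "2016-07-04"),
])
conn.executemany("INSERT INTO MSRevised VALUES (?, ?)", [
    (1, "2015-01-05"), (1, "2015-01-08"), (3, "2017-03-25"),
    (2, "2016-02-01"), (2, "2016-12-30"), (3, "2016-04-28"),
    (4, "2016-04-28"), (5, "2016-10-01"), (6, "2016-07-28"), (7, "2016-07-28"),
])

def max_date(conn, td_id, forecast):
    """Latest due or revised date: for the forecastedDate row when forecast is True,
    otherwise for all other milestones except hypercareDate."""
    if forecast:
        name_filter = "ms.name = 'forecastedDate'"
    else:
        name_filter = "ms.name NOT IN ('forecastedDate', 'hypercareDate')"
    sql = (
        "SELECT MAX(d) FROM ("
        " SELECT ms.dueDate AS d FROM MS AS ms"
        " WHERE ms.tdId = ? AND " + name_filter +
        " UNION ALL"
        " SELECT r.revisedDate FROM MS AS ms"
        " JOIN MSRevised AS r ON r.mId = ms.mId"
        " WHERE ms.tdId = ? AND " + name_filter +
        ") AS stacked"
    )
    # MAX() skips NULL due dates, so milestones without a date do not affect the result.
    return conn.execute(sql, (td_id, td_id)).fetchone()[0]

max_forecast = max_date(conn, 1, forecast=True)
max_milestone = max_date(conn, 1, forecast=False)
is_late = max_milestone > max_forecast
print(max_forecast, max_milestone, is_late)
```

For tdId 1 the latest milestone date comes from a revised date (mId 3), which is later than the latest forecasted date, so the check reports that a milestone has passed the forecast.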