I'm asking for your help after several unsuccessful attempts.
I am learning PL/SQL and I am using Oracle SQL Developer v20.
Here is my situation. My data set looks like this:
id_file   size_byte   created_at
_______   _________   ____________________________
      1       45323   17-FEB-22 17:21:13,726874000
      2       41232   17-FEB-22 17:21:13,740587004
      3     1234456   20-FEB-22 17:25:13,368874058
      4   233545488   20-FEB-22 17:21:18,400049000
      5   233545488   21-FEB-22 18:11:18,058746868
So my desired output would be something like this for year 2022:
TOT_records   AVG_file_created_for_day   TOT_size_files   AVG_size_files_created_each_day
___________   ________________________   ______________   _______________________________
  9.999.999                     10.000      999.999.999   5 MB (default is byte)
ID is of type NUMBER, SIZE_BYTE is of type NUMBER, and CREATED_AT is TIMESTAMP(6).
My table is partitioned by year; the partition key PARTITION_DATE is of type DATE.
There's some ambiguity in phrases like "average file size per day". It could mean either:
the sum of all file sizes divided by the total number of days, or
the average file size within each day, and then the average of those daily averages.
Anyway, here's something to get you going (I'm assuming the latter interpretation; a sketch for the former follows the query results below).
SQL> create table t as
2 select
3 rownum id_file,
4 dbms_random.value(1000,20000000) bytes,
5 date '2021-01-01' + dbms_random.value(1,700) created_at
6 from dual
7 connect by level <= 5000;
Table created.
SQL>
SQL> select * from t
2 where rownum <= 20;
ID_FILE BYTES CREATED_A
---------- ---------- ---------
1 19305636.7 02-SEP-22
2 6305773.83 10-OCT-21
3 11939117.8 04-NOV-21
4 11039507.9 01-SEP-21
5 15555516.8 02-NOV-22
6 2809048.47 13-SEP-22
7 2070381.41 18-DEC-21
8 11116786.1 11-MAR-22
9 17519679.8 21-DEC-21
10 6728222.84 02-APR-22
11 7569442.31 07-AUG-22
12 16949454.2 06-JUL-21
13 8019443.02 03-JUN-21
14 13147674.9 31-AUG-21
15 14590702.5 16-JUL-22
16 13028609.7 11-MAY-21
17 5466477.07 06-APR-22
18 4469902.12 08-MAY-21
19 14511096 31-MAY-22
20 5245726.03 12-JUL-21
20 rows selected.
SQL> select
2 count(*) total_records,
3 avg(daily_size_avg)/1024/1024 avg_size_files_per_day_mb,
4 sum(bytes)/1024/1024/1024 tot_bytes_gb,
5 avg(files_per_day) avg_files_per_day
6 from
7 (
8 select
9 bytes,
10 avg(bytes) over ( partition by trunc(created_at) ) daily_size_avg,
11 count(*) over ( partition by trunc(created_at) ) files_per_day
12 from t
13 );
TOTAL_RECORDS AVG_SIZE_FILES_PER_DAY_MB TOT_BYTES_GB AVG_FILES_PER_DAY
------------- ------------------------- ------------ -----------------
5000 9.5313187 46.5396421 8.092
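If you wanted the former interpretation instead (total bytes divided by the number of distinct days on which files were created), restricted to one year, a minimal sketch against the same test table t might look like the following. The output column names are just illustrative; if your real table is partitioned by PARTITION_DATE you may prefer to filter on that column instead:

select
    count(*)                                                       total_records,
    sum(bytes)/1024/1024/1024                                      tot_bytes_gb,
    count(*) / count(distinct trunc(created_at))                   avg_files_per_day,
    sum(bytes) / count(distinct trunc(created_at)) / 1024 / 1024   avg_size_per_day_mb
from t
where created_at >= date '2022-01-01'
  and created_at <  date '2023-01-01';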
I'm relatively inexperienced in SQL and could use some help beyond the usual SELECT and JOIN.
The Problem
Suppose you have two tables you wish to join in Microsoft SQL Server, but they are missing a unique identifier, so duplicate entries are incorrectly generated. I've created an example SQL Fiddle with a small subset of the full database schema to demonstrate: http://sqlfiddle.com/#!18/df3fc.
One table holds a list of measurement steps taken for two systems, identified by their serial. Each measurement step can produce multiple pieces of data, which are stored in the second table. This would not normally be an issue, but, as in the SQL Fiddle example for serial=1004, sometimes the same data is retaken as part of a rework. When I then query, each piece of rework data gets joined to each step, duplicating data. For example, for serial 1004 and measurementid 8 the steps table has two rows (ids 1 and 11) and the data table has ten rows, so the join below returns 2 × 10 = 20 rows instead of the 10 I want. The select query:
SELECT my_measurement_steps.id AS steps_id,
       my_measurement_steps.serial,
       my_measurement_data.id AS data_id,
       my_measurement_data.my_data,
       my_measurement_data.measurementid,
       my_measurement_steps.date
FROM my_measurement_steps
INNER JOIN my_measurement_data
    ON my_measurement_steps.serial = my_measurement_data.serial
   AND my_measurement_steps.measurementid = my_measurement_data.measurementid
Desired Output
steps_id   serial   data_id   my_data         measurementid   date
--------   ------   -------   -------------   -------------   -----------------------
15         1004     36        0.9496555       33              2021-10-12 07:55:58.100
14         1004     35        -0.03252285     11              2021-10-07 07:56:31.530
14         1004     34        -0.0003081787   11              2021-10-07 07:56:31.530
13         1004     33        -0.01728721     10              2021-10-07 07:56:31.530
13         1004     32        -0.1996608      10              2021-10-07 07:56:31.530
12         1004     31        0.003044653     9               2021-10-07 07:24:49.500
12         1004     30        0.002392432     9               2021-10-07 07:24:49.500
11         1004     29        1.012242        8               2021-10-07 07:24:30.720
11         1004     28        1.003897        8               2021-10-07 07:24:30.720
11         1004     27        0.9917302       8               2021-10-07 07:24:30.720
11         1004     26        -0.002975781    8               2021-10-07 07:24:30.720
11         1004     25        -0.002746948    8               2021-10-07 07:24:30.720
10         1004     24        0.9695401       33              2021-10-05 11:37:51.430
9          1005     23        0.9731983       33              2021-10-05 08:00:10.490
8          1005     22        0.01013499      11              2021-10-01 07:12:07.470
8          1005     21        -0.007311231    11              2021-10-01 07:12:07.470
7          1005     20        -0.0003634033   10              2021-10-01 07:12:07.470
7          1005     19        -0.2021408      10              2021-10-01 07:12:07.470
6          1005     18        -0.002507007    9               2021-09-30 13:00:57.260
6          1005     17        0.001181299     9               2021-09-30 13:00:57.260
5          1005     16        1.007857        8               2021-09-30 12:39:50.280
5          1005     15        1.000333        8               2021-09-30 12:39:50.280
5          1005     14        0.9913442       8               2021-09-30 12:39:50.280
5          1005     13        0.002449243     8               2021-09-30 12:39:50.280
5          1005     12        -0.002550488    8               2021-09-30 12:39:50.280
4          1004     11        -0.02970417     11              2021-09-30 06:57:33.160
4          1004     10        -0.0007542603   11              2021-09-30 06:57:33.160
3          1004     9         -0.005267761    10              2021-09-30 06:57:33.160
3          1004     8         -0.2038888      10              2021-09-30 06:57:33.160
2          1004     7         -0.007525305    9               2021-09-30 06:56:59.060
2          1004     6         -0.004998779    9               2021-09-30 06:56:59.060
1          1004     5         0.9935537       8               2021-09-29 12:34:08.090
1          1004     4         0.9952038       8               2021-09-29 12:34:08.090
1          1004     3         0.9978707       8               2021-09-29 12:34:08.090
1          1004     2         -0.0006630127   8               2021-09-29 12:34:08.090
1          1004     1         0.0002386719    8               2021-09-29 12:34:08.090
I'm unsure how to achieve the desired output given the repeating data. Also, for some serials there can be more than one repeat, as shown in the example.
Happy to provide any extra information required.
Many Thanks.
Code to Generate Tables
create table my_measurement_steps(id int, serial int, measurementid int, date datetime);
create table my_measurement_data(id int, serial int, my_data float(7), measurementid int);
insert into my_measurement_steps values
(1,1004,8,'2021-09-29 12:34:08.090'),
(2,1004,9,'2021-09-30 06:56:59.060'),
(3,1004,10,'2021-09-30 06:57:33.160'),
(4,1004,11,'2021-09-30 06:57:33.160'),
(5,1005,8,'2021-09-30 12:39:50.280'),
(6,1005,9,'2021-09-30 13:00:57.260'),
(7,1005,10,'2021-10-01 07:12:07.470'),
(8,1005,11,'2021-10-01 07:12:07.470'),
(9,1004,33,'2021-10-05 08:00:10.490'),
(10,1005,33,'2021-10-05 11:37:51.430'),
(11,1004,8,'2021-10-07 07:24:30.720'),
(12,1004,9,'2021-10-07 07:24:49.500'),
(13,1004,10,'2021-10-07 07:56:31.530'),
(14,1004,11,'2021-10-07 07:56:31.530'),
(15,1004,33,'2021-10-12 07:55:58.100');
insert into my_measurement_data values
(1,1004,0.0002386719,8),
(2,1004,-0.0006630127,8),
(3,1004,0.9978707,8),
(4,1004,0.9952038,8),
(5,1004,0.9935537,8),
(6,1004,-0.004998779,9),
(7,1004,-0.007525305,9),
(8,1004,-0.2038888,10),
(9,1004,-0.005267761,10),
(10,1004,-0.0007542603,11),
(11,1004,-0.02970417,11),
(12,1005,-0.002550488,8),
(13,1005,0.002449243,8),
(14,1005,0.9913442,8),
(15,1005,1.000333,8),
(16,1005,1.007857,8),
(17,1005,0.001181299,9),
(18,1005,-0.002507007,9),
(19,1005,-0.2021408,10),
(20,1005,-0.0003634033,10),
(21,1005,-0.007311231,11),
(22,1005,0.01013499,11),
(23,1004,0.9695401,33),
(24,1005,0.9731983,33),
(25,1004,-0.002746948,8),
(26,1004,-0.002975781,8),
(27,1004,0.9917302,8),
(28,1004,1.003897,8),
(29,1004,1.012242,8),
(30,1004,0.002392432,9),
(31,1004,0.003044653,9),
(32,1004,-0.1996608,10),
(33,1004,-0.01728721,10),
(34,1004,-0.0003081787,11),
(35,1004,-0.03252285,11),
(36,1004,0.9496555,33);
Edits
Added a datestamp to the measurement steps table - the SQL Fiddle wasn't working, so I couldn't update it.
All tables and the SQL Fiddle are now updated.
Removed a section and added the desired output.
You want to detect blocks of rows belonging together.
When sorting my_measurement_steps we see, for instance, that serial/measurementid 1004/8 occurs twice: once in row #1 and again in row #11.
When sorting my_measurement_data we see much the same thing: serial/measurementid 1004/8 occurs in two blocks, once in rows #1-5 and again in rows #25-29.
You want to join the serial/measurementid's nth occurrence in my_measurement_steps with its nth occurrence in my_measurement_data.
Detecting such blocks is a gaps-and-islands problem, which can be solved with two concurrent row counts: the difference between an overall row number and a per-serial/measurementid row number stays constant within a block and changes when a new block starts (for 1004/8 it is 0 for data rows #1-5 and 19 for rows #25-29).
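If you want to inspect the island detection on its own before building the full query, a quick sketch like this (the column aliases are just illustrative) shows that the difference of the two row counts is constant within each block:

select
    id, serial, measurementid,
    row_number() over (order by id) as rn_all,
    row_number() over (partition by serial, measurementid order by id) as rn_per_group,
    row_number() over (order by id) -
    row_number() over (partition by serial, measurementid order by id) as grp
from my_measurement_data
order by serial, measurementid, id;

The full query then looks like this: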
with data_groups_found as
(
  select
    my_measurement_data.*,
    -- difference of the two row counts: constant within an island,
    -- changes whenever a new island (rework block) starts
    row_number() over (order by id) -
    row_number() over (partition by serial, measurementid order by id) as grp
  from my_measurement_data
)
, data_groups_numbered as
(
  select
    data_groups_found.*,
    -- 1 for the first occurrence of a serial/measurementid block, 2 for the rework, ...
    dense_rank() over (partition by serial, measurementid order by grp) as grp_id
  from data_groups_found
)
, steps_numbered as
(
  select
    my_measurement_steps.*,
    -- nth step row for the same serial/measurementid
    row_number() over (partition by serial, measurementid order by id) as grp_id
  from my_measurement_steps
)
select *
from steps_numbered s
left join data_groups_numbered d
  on  d.serial        = s.serial
  and d.measurementid = s.measurementid
  and d.grp_id        = s.grp_id
order by s.id, d.id;
Demo: http://sqlfiddle.com/#!18/df3fc/6
I have a table with three columns named cid, orderdate, and priororderdate among others.
Here is how the table looks:
cid   orderdate          priororderdate     position
---   ----------------   ----------------   --------
12    NULL               NULL               1
12    NULL               NULL               2
12    NULL               NULL               3
12    2014-08-08 23:25   NULL               1
12    2014-08-08 23:25   NULL               2
12    2014-08-08 23:25   NULL               3
12    2014-08-08 23:25   NULL               4
12    2014-09-06 17:19   2014-08-08 23:25   1
12    2014-09-06 17:19   2014-08-08 23:25   2
12    2014-09-06 17:19   2014-08-08 23:25   3
13    NULL               NULL               1
13    NULL               NULL               2
13    NULL               NULL               3
The combination of the columns cid, orderdate, and priororderdate defines a unique fpid (a new column I want to create). Hence, the final result would be:
cid   orderdate          priororderdate     position   fpid
---   ----------------   ----------------   --------   ----
12    NULL               NULL               1          1
12    NULL               NULL               2          1
12    NULL               NULL               3          1
12    2014-08-08 23:25   NULL               1          2
12    2014-08-08 23:25   NULL               2          2
12    2014-08-08 23:25   NULL               3          2
12    2014-08-08 23:25   NULL               4          2
12    2014-09-06 17:19   2014-08-08 23:25   1          3
12    2014-09-06 17:19   2014-08-08 23:25   2          3
12    2014-09-06 17:19   2014-08-08 23:25   3          3
13    NULL               NULL               1          4
13    NULL               NULL               2          4
13    NULL               NULL               3          4
How can I create the fpid column?
You can do this using dense_rank() in a select query:
select t.*,
dense_rank() over (order by cid, orderdate, priororderdate) as fpid
from table t;
If you have the column fpid already in the table and want to update it:
with toupdate as (
select t.*,
dense_rank() over (order by cid, orderdate, priororderdate) as new_fpid
from table t
)
update toupdate
set fpid = new_fpid;
(If you want to add it, you can use an alter table statement.)
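For example, a minimal sketch, assuming the table is simply called t (adjust the name to your schema; GO is the SSMS/sqlcmd batch separator, used here so the new column exists before the update is compiled):

-- hypothetical table name "t"; replace with your actual table
alter table t add fpid int;
go
with toupdate as (
    select t.*,
           dense_rank() over (order by cid, orderdate, priororderdate) as new_fpid
    from t
)
update toupdate
set fpid = new_fpid;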
It's a little bit confusing that you say that fpid is unique, but looking at your desired output, it looks like you want to use ROW_NUMBER().
UPDATE t SET fpid =
    (SELECT ROW_NUMBER() OVER (ORDER BY cid)
     FROM tab2
     WHERE t.cid = cid
       AND t.orderdate = orderdate
       AND t.priororderdate = priororderdate
     GROUP BY cid, orderdate, priororderdate)
FROM tab2 t