I have a table that contains multiple rows for the same ID, each with a different time stamp (at 6-minute intervals) in one column, along with the temperatures recorded at each time stamp.
ID   Time_Stamp           Temperature_1  Temperature_2
101  18-09-2020 17:05:40  98.50          87.63
101  18-09-2020 17:11:40  96.60          46.3
101  18-09-2020 17:17:40  80.50          65.30
101  18-09-2020 17:23:40  65.30          77.21
101  18-09-2020 17:29:40  36.20          63.30
101  18-09-2020 17:35:40  69.30          54.70
..... up to 614 rows
Output should be:
ID   Time_Stamp           Avg_Temperature_1  Avg_Temperature_2
101  18-09-2020 17:29:40  98.50              87.63
101  18-09-2020 18:29:40  96.60              46.3
101  18-09-2020 19:29:40  80.50              65.30
..... up to 61 rows
Elaboration:
Let's assume the table has 614 rows for the ID.
First I have to fetch all the data in ascending order of time_stamp (e.g. select * from table where id=101 order by time_stamp asc).
Now I have 614 rows, but I have to consider only the largest multiple of 10 that fits. E.g. with 614 rows I consider only the first 610. Similarly, with 219 rows I consider only 210 (with 220, all 220), with 155 only 150, with 314 only 310, and so on.
After keeping 610 rows I split them into sets of 10, so the final output has only 61 rows (one per set of 10).
Each set of 10 also represents one hour: the data arrives every 6 minutes, so 10 consecutive rows cover 6 * 10 = 60 minutes, and each output row shows the average for one hour.
So finally I have to take each set of 10 rows, find the average of each temperature column, and represent it as a single row.
In the time_stamp column we can take any middle value of the set of 10: the 4th, 5th, or 6th one.
The temperature columns should hold the average of the 10 rows.
How to write a SQL query for this?
What I have tried so far is below. I thought of writing a stored procedure for this:
Step 1: First I fetch all the data and the FLOOR(COUNT(id) / 10) value:
SELECT * FROM table WHERE id = 1 ORDER BY Time_Stamp ASC
and then
SELECT FLOOR(COUNT(id) / 10) FROM table_name WHERE id = 1 (to decide how many times the loop should execute)
This returns 61.
Step 2: Loop n times (here 61 times).
Within each iteration I limit the rows to 10 and take the average of the temperature columns, grouped by id (but I am unable to include the time stamp).
I use the query below to find the averages for the first 10 rows of an id:
select id, avg(Temperature_1) as TempAVG1, avg(Temperature_2) as TempAVG2
from table_name
where Time_stamp >= TO_CHAR('18-09-2020 17:05')
and Time_stamp <= TO_CHAR('18-09-2020 18:05:40')
and id = 101
group by id
Here I'm unable to include the time stamp (the 4th, 5th, or 6th one of the set of 10).
So I tried to write another query that finds only the time stamp, intending to UNION it with the first query, but I cannot union the two queries because the avg columns and the time column have different data types (and the column lists are not the same).
I also cannot think of how to leave out the last leftover rows (e.g. if only 1 to 9 rows remain at the end).
Please suggest a more efficient way to write this query if possible, or help me write this stored procedure.
A mix of SQL and C# code (e.g. using a DataTable) is also welcome.
Technologies I am using: C# and an Oracle database
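The grouping described in the elaboration can be sketched with a row number per ID: bucket = (row number - 1) / 10, keep only buckets that contain exactly 10 rows, and pick the 5th row's time stamp. The sketch below is only an illustration under stated assumptions: it runs against an in-memory SQLite database through Python's sqlite3 module rather than Oracle, the table name readings and the 24 synthetic rows are made up, and time stamps are stored as ISO text so that sorting them as text is chronological. In Oracle the integer division and % would presumably become FLOOR(rn / 10) and MOD(rn, 10).

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings ("
    "id INTEGER, time_stamp TEXT, temperature_1 REAL, temperature_2 REAL)"
)

# Synthetic data (not from the question): 24 readings, 6 minutes apart.
start = datetime(2020, 9, 18, 17, 5, 40)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?)",
    [
        (101, (start + timedelta(minutes=6 * i)).strftime("%Y-%m-%d %H:%M:%S"),
         float(i), float(2 * i))
        for i in range(24)
    ],
)

query = """
WITH numbered AS (
    -- rn is a zero-based position within the ID, ordered by time
    SELECT id, time_stamp, temperature_1, temperature_2,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY time_stamp) - 1 AS rn
    FROM readings
    WHERE id = ?
)
SELECT id,
       -- time stamp of the 5th row in each set of 10
       MAX(CASE WHEN rn % 10 = 4 THEN time_stamp END) AS time_stamp,
       AVG(temperature_1) AS avg_temperature_1,
       AVG(temperature_2) AS avg_temperature_2
FROM numbered
GROUP BY id, rn / 10          -- integer division: one group per 10 rows
HAVING COUNT(*) = 10          -- drops the trailing 1 to 9 leftover rows
ORDER BY MIN(rn)
"""
rows = conn.execute(query, (101,)).fetchall()
for row in rows:
    print(row)
```

With 24 input rows this yields two output rows and silently drops the last four, which is the "614 becomes 610" rule from the question without any loop or stored procedure.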
Convert a row into multiple rows in BigQuery SQL.
The number of rows depends on a particular column value (in this case, the value of delta_unit/60):
Source table:
ID time delta_unit
101 2019-06-18 01:00:00 60
102 2019-06-18 01:01:00 60
103 2019-06-18 01:03:00 120
ID 102 was recorded at 01:01:00 and the next record (ID 103) was at 01:03:00.
So we are missing a record that should have been at 01:02:00 with delta_unit = 60.
Expected table:
ID time delta_unit
101 2019-06-18 01:00:00 60
102 2019-06-18 01:01:00 60
104 2019-06-18 01:02:00 60
103 2019-06-18 01:03:00 60
A new row is created based on the delta_unit. The number of rows that need to be created will depend on the value delta_unit/60 (in this case, 120/60 = 2)
I have found a solution to your problem. I have done the following, first run
SELECT max(delta/60) as max_a FROM `<projectid>.<dataset>.<table>`
to compute the maximum number of steps. Then run the following loop
DECLARE a INT64 DEFAULT 1;
WHILE a <= 2 DO --2=max_a (change accordingly)
  INSERT INTO `<projectid>.<dataset>.<table>` (id,time,delta)
  SELECT id+1,TIMESTAMP_ADD(time, INTERVAL a MINUTE),delta-60*a
  FROM
  (SELECT id,time,delta
  FROM `<projectid>.<dataset>.<table>`
  )
  WHERE delta > 60*a;
  SET a = a + 1;
END WHILE;
Of course this is not very efficient, but it gets the job done. The IDs and deltas do not end up at the right values yet, but that should not matter: the deltas would all end up at 60 (so the column can be dropped) and the IDs can be regenerated by ordering on the timestamp.
You might try a conditional expression here to avoid the loop and go through the table only once.
I have tried
INSERT INTO `<projectid>.<dataset>.<table>` (id,time,delta)
SELECT id+1, CASE
WHEN delta>80 THEN TIMESTAMP_ADD(time, INTERVAL 1 MINUTE)
WHEN delta>150 THEN TIMESTAMP_ADD(time, INTERVAL 2 MINUTE)
END
,60
FROM
(SELECT id,time,delta
FROM `<projectid>.<dataset>.<table>`
)
WHERE delta > 60;
but it fails because CASE only returns the first branch whose WHEN condition is true, so each source row still produces a single new row. So I am not sure whether it is possible to do it all at once. If your tables are small, I would stick with the first approach, which works fine.
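For what it's worth, the expansion itself does not need a loop over the table: each source row can simply be repeated delta / 60 times, with the k-th copy shifted by k minutes. The sketch below shows that logic in plain Python on the three rows from the question. It assumes, as the question's expected table suggests, that delta_unit measures the gap back to the previous record, so copies are shifted backwards in time; the function name expand is made up, and IDs are left as the source ID because they can be regenerated from the timestamp order afterwards. In BigQuery the same idea could presumably be written by joining each row with UNNEST(GENERATE_ARRAY(0, DIV(delta, 60) - 1)), but that is untested here.

```python
from datetime import datetime, timedelta

STEP = 60  # seconds between two consecutive expected records

# The three source rows from the question: (id, time, delta_unit)
source = [
    (101, datetime(2019, 6, 18, 1, 0, 0), 60),
    (102, datetime(2019, 6, 18, 1, 1, 0), 60),
    (103, datetime(2019, 6, 18, 1, 3, 0), 120),
]


def expand(rows, step=STEP):
    """Repeat each row delta // step times, one copy per missing step."""
    out = []
    for row_id, ts, delta in rows:
        copies = max(1, delta // step)  # always keep the original row
        # k counts steps back from the recorded time; emit oldest first
        for k in range(copies - 1, -1, -1):
            out.append((row_id, ts - timedelta(seconds=k * step), step))
    return out


expanded = expand(source)
for row in expanded:
    print(row[0], row[1].strftime("%Y-%m-%d %H:%M:%S"), row[2])
```

Every output row ends with a delta of 60, and the row for 01:02:00 appears between 01:01:00 and 01:03:00.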
I have a list of unique ID's in one table that has a date column. Example:
TABLE1
ID Date
0 2018-01-01
1 2018-01-05
2 2018-01-15
3 2018-01-06
4 2018-01-09
5 2018-01-12
6 2018-01-15
7 2018-01-02
8 2018-01-04
9 2018-02-25
Then in another table I have a list of different values that appear multiple times for each ID with various dates.
TABLE 2
ID Value Date
0 18 2017-11-28
0 24 2017-12-29
0 28 2018-01-06
1 455 2018-01-03
1 468 2018-01-16
2 55 2018-01-03
3 100 2017-12-27
3 110 2018-01-04
3 119 2018-01-10
3 128 2018-01-30
4 223 2018-01-01
4 250 2018-01-09
4 258 2018-01-11
etc
I want to find the value in table 2 that is closest to the unique date in table 1.
Sometimes table 2 does contain a value that matches the date exactly and I have had no problem in pulling through those values. But I can't work out the code to pull through the value closest to the date requested from table 1.
My desired result based on the examples above would be
ID Value Date
0 24 2017-12-29
1 455 2018-01-03
2 55 2018-01-03
3 110 2018-01-04
4 250 2018-01-09
Since I can easily find the ID's with an exact match, one thing I have tried is taking the ID's that don't have an exact date match and placing them with their corresponding values into a temporary table. Then trying to find the values where I need the closest possible match, but it's here that I'm not sure where to begin on the coding of that.
Apologies if I'm missing a basic function or clause for this, I'm still learning!
The below would be one method:
WITH Table1 AS(
    SELECT ID, CONVERT(date, datecolumn) DateColumn
    FROM (VALUES (0,'20180101'),
                 (1,'20180105'),
                 (2,'20180115'),
                 (3,'20180106'),
                 (4,'20180109'),
                 (5,'20180112'),
                 (6,'20180115'),
                 (7,'20180102'),
                 (8,'20180104'),
                 (9,'20180225')) V(ID, DateColumn)),
Table2 AS(
    SELECT ID, [value], CONVERT(date, datecolumn) DateColumn
    FROM (VALUES (0,18 ,'2017-11-28'),
                 (0,24 ,'2017-12-29'),
                 (0,28 ,'2018-01-06'),
                 (1,455,'2018-01-03'),
                 (1,468,'2018-01-16'),
                 (2,55 ,'2018-01-03'),
                 (3,100,'2017-12-27'),
                 (3,110,'2018-01-04'),
                 (3,119,'2018-01-10'),
                 (3,128,'2018-01-30'),
                 (4,223,'2018-01-01'),
                 (4,250,'2018-01-09'),
                 (4,258,'2018-01-11')) V(ID, [Value],DateColumn))
SELECT T1.ID,
       T2.[Value],
       T2.DateColumn
FROM Table1 T1
     CROSS APPLY (SELECT TOP 1 *
                  FROM Table2 ca
                  WHERE T1.ID = ca.ID
                  ORDER BY ABS(DATEDIFF(DAY, ca.DateColumn, T1.DateColumn))) T2;
Note that if the difference in days is the same, the row returned will be arbitrary (and could differ each time the query is run). For example, if Table1 had the date 20180804 and Table2 had the dates 20180803 and 20180805, they would both have the value 1 for ABS(DATEDIFF(DAY, ca.DateColumn, T1.DateColumn)). You might therefore need to include additional logic in your ORDER BY to ensure consistent results.
dude.
I'll say a couple of things here for you to consider, since SQL Server is not my comfort zone, while SQL itself is.
First of all, I'd join TABLE1 with TABLE2 on ID. That way, I can specify the following columns in my SELECT clause:
SELECT T1.ID, T2.Value, DateDiff(d, T1.Date, T2.Date) qt_diff_days
Obviously, depending on the precision of the dates kept there, i.e. whether they include times or not, you can change the date part passed to the DateDiff function.
Going forward, I'd also make this date difference an absolute number (to resolve positive / negative differences and consider only the elapsed time).
After that, and that's where it gets tricky because I don't know the SQL Server version you're using, I'd use the ROW_NUMBER window function to rank all rows by that difference. Something like the following:
SELECT
T1.ID, T2.Value, Abs(DateDiff(d, T1.Date, T2.Date)) qt_diff_days,
ROW_NUMBER() OVER(PARTITION BY T1.ID ORDER BY Abs(DateDiff(d, T1.Date, T2.Date)) ASC) nu_row
FROM TABLE1 T1
JOIN TABLE2 T2 ON T2.ID = T1.ID
ROW_NUMBER (Transact-SQL)
Numbers the output of a result set. More specifically, returns the sequential number of a row within a partition of a result set, starting at 1 for the first row in each partition.
If you can run ROW_NUMBER properly, you should see that the query ranks its data per ID, starting at 1 and increasing with the difference between the two dates, and resetting the rank to 1 when the ID changes.
After that, all you need to do is select only the rows where nu_row equals 1. I'd use a CTE for that.
WITH common_table_expression (Transact-SQL)
Specifies a temporary named result set, known as a common table expression (CTE).
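Putting the ROW_NUMBER and CTE steps together might look like the sketch below. It is not T-SQL: it runs on SQLite through Python's sqlite3 module so it can be executed anywhere, and it assumes lowercase table names and a column called event_date (both made up here). The date difference uses SQLite's julianday() where SQL Server would use DATEDIFF, and the extra event_date in the ORDER BY is one possible tie-break (earlier date wins) so that equal differences always give the same row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, event_date TEXT);
CREATE TABLE table2 (id INTEGER, value INTEGER, event_date TEXT);
INSERT INTO table1 VALUES
  (0,'2018-01-01'),(1,'2018-01-05'),(2,'2018-01-15'),
  (3,'2018-01-06'),(4,'2018-01-09');
INSERT INTO table2 VALUES
  (0,18,'2017-11-28'),(0,24,'2017-12-29'),(0,28,'2018-01-06'),
  (1,455,'2018-01-03'),(1,468,'2018-01-16'),
  (2,55,'2018-01-03'),
  (3,100,'2017-12-27'),(3,110,'2018-01-04'),(3,119,'2018-01-10'),
  (3,128,'2018-01-30'),
  (4,223,'2018-01-01'),(4,250,'2018-01-09'),(4,258,'2018-01-11');
""")

query = """
WITH ranked AS (
    SELECT t1.id, t2.value, t2.event_date,
           ROW_NUMBER() OVER (
               PARTITION BY t1.id
               -- smallest absolute difference in days first;
               -- on a tie, the earlier date wins (an arbitrary choice)
               ORDER BY ABS(julianday(t2.event_date) - julianday(t1.event_date)),
                        t2.event_date
           ) AS nu_row
    FROM table1 t1
    JOIN table2 t2 ON t2.id = t1.id
)
SELECT id, value, event_date
FROM ranked
WHERE nu_row = 1
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the sample data this reproduces the desired result from the question for IDs 0 to 4.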
I started working in BI and was given a brain teaser, since I come from C# and not SQL/Cognos.
I get a number. It can be between 0 and a very large number. When it's below 1,000 everything is dandy. But if it's greater than or equal to 1,000, I should use 1,000 instead.
I am not allowed to use conditions. It needs to be pure math, or if that's not possible, I should use efficient methods.
I thought it would be easy and I could just use Min(), but apparently that works differently in Cognos and SQL.
Use the LEAST() function:
Oracle Setup:
CREATE TABLE data ( value ) AS
SELECT 1 FROM DUAL UNION ALL
SELECT 999 FROM DUAL UNION ALL
SELECT 1000 FROM DUAL UNION ALL
SELECT 1001 FROM DUAL;
Query:
SELECT value, LEAST( value, 1000 ) AS output FROM data
Output:
VALUE OUTPUT
----- ------
1 1
999 999
1000 1000
1001 1000
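If a built-in function like LEAST() does not count as "pure math", the minimum of two numbers can also be written with only addition, subtraction, and an absolute value: min(a, b) = (a + b - |a - b|) / 2. The sketch below checks that identity in Python on the same four values; the function name cap is made up for illustration. In SQL the equivalent expression would presumably be (value + 1000 - ABS(value - 1000)) / 2.

```python
def cap(value, limit=1000):
    """Return min(value, limit) using arithmetic only, no comparison.

    value + limit - |value - limit| is always twice the smaller of the
    two numbers, so for integers the division by 2 is exact.
    """
    return (value + limit - abs(value - limit)) // 2


results = [(v, cap(v)) for v in (1, 999, 1000, 1001)]
for value, output in results:
    print(value, output)
```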
I have searched but not found an answer for my question.
I have a table orders that consists of
id (primary key autonumber)
client_id : identifies each client (unique)
date: order dates for each client
I want to retrieve the last N order dates for each client in a single view
Of course I could use SELECT TOP N date FROM orders WHERE client_id = 'xx' ORDER BY date DESC and then use UNION for the different values of client_id. The problem is that the statement would need revising whenever the client base changes, and that a UNION is impractical with a large client base.
As an additional requirement this needs to work in Access SQL.
Step 1: Create a query that yields a rank order by date per client for every row. Since Access SQL does not have ROW_NUMBER() OVER (...) like SQL Server, you can simulate this by using the technique described in the following question:
Access query producing results like ROW_NUMBER() in T-SQL
If you have done step 1 correctly, your result should be as follows:
id   client_id   date         rank
----------------------------------
     1           2014-12-01   7
     1           2014-12-02   6
     1           2014-12-05   5
     1           2014-12-07   4
     1           2014-12-11   3
     1           2014-12-14   2
     1           2014-12-15   1
     2           2014-12-01   2
     2           2014-12-02   1
...
Step 2: Use the result from step 1 as a subquery and filter the result such that only records with rank <= N are returned.
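The two steps can be sketched without any window function by computing the rank with a correlated COUNT(*) subquery, which is the usual way to simulate ROW_NUMBER() where it is missing. The sketch below is an assumption-laden illustration, not Access SQL: it runs on SQLite through Python's sqlite3 module, the column is called order_date instead of date, and N is fixed at 2. Ties on the date are broken by id so that every row gets a distinct rank.

```python
import sqlite3

N = 2  # how many of the latest orders to keep per client

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    client_id INTEGER,
    order_date TEXT
);
INSERT INTO orders (client_id, order_date) VALUES
  (1,'2014-12-01'),(1,'2014-12-02'),(1,'2014-12-05'),(1,'2014-12-07'),
  (1,'2014-12-11'),(1,'2014-12-14'),(1,'2014-12-15'),
  (2,'2014-12-01'),(2,'2014-12-02');
""")

query = """
SELECT client_id, order_date, rnk
FROM (
    -- Step 1: rank = number of orders of the same client that are
    -- at least as recent as this one (the row counts itself, so the
    -- latest order gets rank 1); id breaks ties on equal dates
    SELECT o.client_id, o.order_date,
           (SELECT COUNT(*)
            FROM orders o2
            WHERE o2.client_id = o.client_id
              AND (o2.order_date > o.order_date
                   OR (o2.order_date = o.order_date AND o2.id >= o.id))
           ) AS rnk
    FROM orders o
) AS ranked
WHERE rnk <= ?                 -- Step 2: keep only the top N per client
ORDER BY client_id, rnk
"""
rows = conn.execute(query, (N,)).fetchall()
for row in rows:
    print(row)
```

Because the rank is computed per client inside the query, nothing has to change when clients are added or removed.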
I think the following will work in MS Access:
select t.*
from orders as t
where t.date in (select top N t2.date
                 from orders as t2
                 where t2.client_id = t.client_id
                 order by t2.date desc
                );
One problem with MS Access is that top N will retrieve more than N records if there are ties. If you want exactly N, you can use order by date, id in the subquery.