How do I get results of several Insert Into Select statements into one row - sql-server-2012

I have seen a lot of similar questions, but none seem to have an answer applicable to what I am trying to do, so here goes:
I have a table with data spread over 4 rows. Certain values in those rows need to populate another table, all in one row, mapped to specific columns (this is to enable a PLC to pick the data up).
I created the following query, which as you might expect gives me output in multiple rows (and doesn't actually seem to work when I execute more than two parts in one go).
How would I go about merging these into one row? I have had a go with UNION and looked at nested queries, but can't work it out. Help gratefully received.
--////////
--To change format so one row contains all recipe data,
INSERT INTO dbo.Rep_Criteria (FILE_NAME, BAC, Rev, Tol_Lo1, Tol_Hi1, Eng_Tol1, Min, Max, [Min TCs])
SELECT File_Name, BAC, Rev, Low, High, Eng_Tol, Min, Max, Min_TC
FROM dbo.Recipe
WHERE ID = 1 ;
INSERT INTO dbo.Rep_Criteria (Tol_Lo2, Tol_Hi2, Eng_Tol2)
SELECT Low, High, Eng_Tol
FROM dbo.Recipe
WHERE ID = 2 ;
INSERT INTO dbo.Rep_Criteria (Tol_Lo3, Tol_Hi3, Eng_Tol3)
SELECT Low, High, Eng_Tol
FROM dbo.Recipe
WHERE ID = 3 ;
INSERT INTO dbo.Rep_Criteria (Tol_Lo4, Tol_Hi4, Eng_Tol4)
SELECT Low, High, Eng_Tol
FROM dbo.Recipe
WHERE ID = 4 ;
--////////
This might help (sample data):
Table 1 (input):
File_Name  BAC   Rev  Low  High  Eng_Tol  Min   Max   Min_TC
Zeppelin   5636  F    0    6     4.0      1550  1725  1
Zeppelin   0     0    6    15    5.0      0     0     0
Zeppelin   0     0    15   100   6.0      0     0     0
Zeppelin   0     0    100  600   15.0     0     0     0
Table 2 (output), one row shown in two halves:
File_Name  BAC   Rev  Tol_Lo1  Tol_Lo2  Tol_Lo3  Tol_Lo4  Tol_Hi1  Tol_Hi2  Tol_Hi3
Zeppelin   5636  F    0        6        15       100      6        15       100
Tol_Hi4  Eng_Tol1  Eng_Tol2  Eng_Tol3  Eng_Tol4  Min   Max   Min# T/C's
600      4.0       5.0       6.0       15.0      1550  1725  1
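For reference, one way to collapse the four Recipe rows into a single row (and so a single INSERT) is conditional aggregation. This is only a sketch, using the column names from the statements above; adjust it to the real table definitions:
-- Sketch: pivot the four Recipe rows into one Rep_Criteria row using MAX(CASE ...).
INSERT INTO dbo.Rep_Criteria
    (FILE_NAME, BAC, Rev,
     Tol_Lo1, Tol_Hi1, Eng_Tol1,
     Tol_Lo2, Tol_Hi2, Eng_Tol2,
     Tol_Lo3, Tol_Hi3, Eng_Tol3,
     Tol_Lo4, Tol_Hi4, Eng_Tol4,
     Min, Max, [Min TCs])
SELECT
    MAX(CASE WHEN ID = 1 THEN File_Name END),  -- header values come from row 1
    MAX(CASE WHEN ID = 1 THEN BAC END),
    MAX(CASE WHEN ID = 1 THEN Rev END),
    MAX(CASE WHEN ID = 1 THEN Low END), MAX(CASE WHEN ID = 1 THEN High END), MAX(CASE WHEN ID = 1 THEN Eng_Tol END),
    MAX(CASE WHEN ID = 2 THEN Low END), MAX(CASE WHEN ID = 2 THEN High END), MAX(CASE WHEN ID = 2 THEN Eng_Tol END),
    MAX(CASE WHEN ID = 3 THEN Low END), MAX(CASE WHEN ID = 3 THEN High END), MAX(CASE WHEN ID = 3 THEN Eng_Tol END),
    MAX(CASE WHEN ID = 4 THEN Low END), MAX(CASE WHEN ID = 4 THEN High END), MAX(CASE WHEN ID = 4 THEN Eng_Tol END),
    MAX(CASE WHEN ID = 1 THEN Min END), MAX(CASE WHEN ID = 1 THEN Max END), MAX(CASE WHEN ID = 1 THEN Min_TC END)
FROM dbo.Recipe
WHERE ID IN (1, 2, 3, 4);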


A follow up question on Gaps and Islands solution

This is a continuation of my previous question, A question again on cursors in SQL Server.
To reiterate: I get values from a sensor as 0 (off) or 1 (on) every 10 seconds. I need to log the on-times, i.e. the periods when the sensor value is 1, in another table.
I process the data once every minute (which means I have 6 rows of data per run). I needed a way to do this without using cursors, and @Charlieface answered with the query below.
WITH cte1 AS (
    SELECT *,
        PrevValue = LAG(t.Value) OVER (PARTITION BY t.SlaveID, t.Register ORDER BY t.Timestamp)
    FROM YourTable t
),
cte2 AS (
    SELECT *,
        NextTime = LEAD(t.Timestamp) OVER (PARTITION BY t.SlaveID, t.Register ORDER BY t.Timestamp)
    FROM cte1 t
    WHERE (t.Value <> t.PrevValue OR t.PrevValue IS NULL)
)
SELECT
    t.SlaveID,
    t.Register,
    StartTime = t.Timestamp,
    Endtime = t.NextTime
FROM cte2 t
WHERE t.Value = 1;
db<>fiddle
The raw data set and desired outcome are below. Here register 250 represents the sensor, Value represents the reading (0 or 1), and Timestamp is the time the reading was taken.
SlaveID  Register  Value  Timestamp  ProcessTime
3        250       0      13:30:10   NULL
3        250       0      13:30:20   NULL
3        250       1      13:30:30   NULL
3        250       1      13:30:40   NULL
3        250       1      13:30:50   NULL
3        250       1      13:31:00   NULL
3        250       0      13:31:10   NULL
3        250       0      13:31:20   NULL
3        250       0      13:32:30   NULL
3        250       0      13:32:40   NULL
3        250       1      13:32:50   NULL
The required entries in the logging table are:
SlaveID  Register  StartTime  Endtime
3        250       13:30:30   13:31:10
3        250       13:32:50   NULL      -- value is still 1
The solution given works fine, but when the next set of data is processed, the existing open entry (end time is NULL) has to be taken into account.
If the next set of values is all 1s (i.e. every value is 1), then no entry is to be made in the log table, since the value was 1 in the previous set of data and continues to be 1. When the value changes to 0 in one of the sets, the end time of the open entry should be updated with that time. A fresh row is to be inserted into the log table when the value becomes 1 again.
I solved the issue by using a 'hybrid'. I get 250 rows (the values of 250 sensors polled) every 10 seconds and process the data once every 180 seconds, so each run handles about 4,500 records, which I process using the CTE. That gives a result set of around 250 records (a few more than 250 if some signals have changed state). I insert this into a #temp table (mirroring the table being processed) and use a cursor on that #temp table to check against and insert into the log table. Since the number of rows is only around 250, the cursor runs without issue.
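For what it's worth, the cursor step could probably be replaced with two set-based statements. This is only a sketch: #NewIntervals stands for the temp table holding the CTE's transition rows for the latest batch (both the 0-to-1 and 1-to-0 changes, i.e. the cte2 output above), and dbo.SensorLog is an assumed name for the logging table.
-- 1) Close the open log entry when the new batch shows the value dropping back to 0.
--    (If one batch can contain several 0-transitions, take the earliest Timestamp.)
UPDATE l
SET    Endtime = n.Timestamp
FROM   dbo.SensorLog  AS l
JOIN   #NewIntervals  AS n
       ON  n.SlaveID  = l.SlaveID
       AND n.Register = l.Register
WHERE  l.Endtime IS NULL
  AND  n.Value = 0;
-- 2) Open a fresh log entry for each 0-to-1 transition that has no open entry yet.
INSERT INTO dbo.SensorLog (SlaveID, Register, StartTime, Endtime)
SELECT n.SlaveID, n.Register, n.Timestamp, NULL
FROM   #NewIntervals AS n
WHERE  n.Value = 1
  AND  NOT EXISTS (SELECT 1
                   FROM  dbo.SensorLog AS l
                   WHERE l.SlaveID  = n.SlaveID
                     AND l.Register = n.Register
                     AND l.Endtime IS NULL);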
Thanks to @Charlieface for the original answer.

Separating an Oracle query with 1.8 million rows into 40,000-row blocks

I have a project where I am taking Documents from one system and importing them into another.
The first system has the documents and associated keywords stored. I have a query that will return the results which will then be used as the index file to import them into the new system. There are about 1.8 million documents involved so this means 1.8 million rows (One per document).
I need to divide the returned results into blocks of 40,000 so the documents can be imported in batches of 40,000 at a time, rather than as one long import.
I have the query that returns the results I need; I just need to know how to break it up for easier import. My apologies if I have included too little information. This is my first time here asking for help.
Use the built-in function ORA_HASH to divide the rows into 45 buckets of roughly the same number of rows. For example:
select * from some_table where ora_hash(id, 44) = 0;
select * from some_table where ora_hash(id, 44) = 1;
...
select * from some_table where ora_hash(id, 44) = 44;
The function is deterministic and will always return the same result for the same input. The resulting number starts with 0 - which is normal for a hash, but unusual for Oracle, so the query may look off-by-one at first. The hash works better with more distinct values, so pass in the primary key or another unique value if possible. Don't use a low-cardinality column, like a status column, or the buckets will be lopsided.
This process is in some ways inefficient, since you're re-reading the same table 45 times. But since you're dealing with documents, I assume the table scanning won't be the bottleneck here.
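If the batches are driven from the database side, the 45 bucket queries can be looped over in an anonymous PL/SQL block. This is only a sketch: export_staging is a hypothetical staging table, while some_table and the 45 buckets come from the example above.
BEGIN
  FOR b IN 0 .. 44 LOOP
    -- replace this INSERT with whatever export/import step each batch actually needs
    INSERT INTO export_staging
    SELECT * FROM some_table WHERE ora_hash(id, 44) = b;
    COMMIT;  -- commit one roughly-40,000-row batch at a time
  END LOOP;
END;
/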
A preferred way of bucketing the IDs is to use the NTILE analytic function.
I'll demonstrate this on a simplified example: a table with 18 rows that should be divided into four chunks.
select listagg(id,',') within group (order by id) from tab;
1,2,3,7,8,9,10,15,16,17,18,19,20,21,23,24,25,26
Note that the IDs are not consecutive, so simple arithmetic cannot be used. NTILE takes the requested number of buckets (4) as its parameter and calculates the chunk_id:
select id,
ntile(4) over (order by ID) as chunk_id
from tab
order by id;
ID CHUNK_ID
---------- ----------
1 1
2 1
3 1
7 1
8 1
9 2
10 2
15 2
16 2
17 2
18 3
19 3
20 3
21 3
23 4
24 4
25 4
26 4
18 rows selected.
All buckets except the last are the same size; the last one can be smaller.
If you want to calculate the ranges, use a simple aggregation:
with chunk as (
select id,
ntile(4) over (order by ID) as chunk_id
from tab)
select chunk_id, min(id) ID_from, max(id) id_to
from chunk
group by chunk_id
order by 1;
CHUNK_ID ID_FROM ID_TO
---------- ---------- ----------
1 1 8
2 9 17
3 18 21
4 23 26
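Each chunk can then be fetched with a plain range predicate; here :id_from and :id_to are bind variables filled from the ranges calculated above:
select *
from tab
where id between :id_from and :id_to
order by id;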

Misleading count of 1 on JOIN in Postgres 11.7

I've run into a subtlety around count(*) and JOIN, and am hoping to get some confirmation that I've figured out what's going on correctly. For background, we commonly convert continuous timeline data into discrete bins, such as hours, and since we don't want gaps for bins with no content, we use generate_series to synthesize the buckets we want values for. If there's no entry for, say, 10 AM, fine, we still get a result. However, I noticed that I'm sometimes getting 1 instead of 0. Here's what I'm trying to confirm:
The count is 1 if you count the "grid" series, and 0 if you count the data table.
This only has to do with count, and no other aggregate.
The code below sets up some sample data to show what I'm talking about:
DROP TABLE IF EXISTS analytics.measurement_table CASCADE;
CREATE TABLE IF NOT EXISTS analytics.measurement_table (
hour smallint NOT NULL DEFAULT NULL,
measurement smallint NOT NULL DEFAULT NULL
);
INSERT INTO measurement_table (hour, measurement)
VALUES ( 0, 1),
( 1, 1), ( 1, 1),
(10, 2), (10, 3), (10, 5);
Here are the goal results for the query. I'm using 12 hours to keep the example results shorter.
Hour Count sum
0 1 1
1 2 2
2 0 0
3 0 0
4 0 0
5 0 0
6 0 0
7 0 0
8 0 0
9 0 0
10 3 10
11 0 0
12 0 0
This works correctly:
WITH hour_series AS (
select * from generate_series (0,12) AS hour
)
SELECT hour_series.hour,
count(measurement_table.hour) AS frequency,
COALESCE(sum(measurement_table.measurement), 0) AS total
FROM hour_series
LEFT JOIN measurement_table ON (measurement_table.hour = hour_series.hour)
GROUP BY 1
ORDER BY 1
This returns misleading 1's on the match:
WITH hour_series AS (
select * from generate_series (0,12) AS hour
)
SELECT hour_series.hour,
count(*) AS frequency,
COALESCE(sum(measurement_table.measurement), 0) AS total
FROM hour_series
LEFT JOIN measurement_table ON (hour_series.hour = measurement_table.hour)
GROUP BY 1
ORDER BY 1
0 1 1
1 2 2
2 1 0
3 1 0
4 1 0
5 1 0
6 1 0
7 1 0
8 1 0
9 1 0
10 3 10
11 1 0
12 1 0
The only difference between these two examples is the count term:
count(*) -- a result of 1 on no match, and a correct count otherwise.
count(joined-table column) -- 0 on no match, and a correct count otherwise.
That seems to be it: you have to make it explicit that you're counting the data table, otherwise you get a count of 1 because the series row matches once. Is this a nuance of joining, or a nuance of count in Postgres?
Does this impact any other aggregate? It seems like it shouldn't.
P.S. generate_series is just about the best thing ever.
You figured out the problem correctly: count() behaves differently depending on the argument it is given.
count(*) counts how many rows belong to the group. This just cannot be 0 since there is always at least one row in a group (otherwise, there would be no group).
On the other hand, when given a column name or expression as an argument, count() takes into account only non-null values and ignores nulls. For your query, this lets you distinguish groups that have no match in the left-joined table from groups that do. Other aggregates ignore nulls in the same way; over a group with no match, sum() returns NULL rather than 0, which is why your COALESCE is needed.
Note that this behavior is not Postgres-specific; it is part of the ANSI SQL standard (all databases that I know of conform to it).
Bottom line:
in general cases, use count(*); it is more efficient, since the database does not need to check for nulls (and it makes clear to the reader of the query that you just want to know how many rows belong to the group)
in specific cases such as yours, put the relevant column or expression inside count()
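A minimal standalone illustration of the difference (no real tables needed; the inline VALUES rows are made up for the demo):
SELECT count(*)     AS count_star,   -- 1: the result still has one (all-null) row
       count(x.val) AS count_column  -- 0: the joined column is null, so it is not counted
FROM (VALUES (1)) AS g(id)
LEFT JOIN (VALUES (2, 10)) AS x(id, val) ON x.id = g.id;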

Cartodb SQL: ST_MakeLine Conversion

I am using CartoDB to map some GPS points, and I would like to connect the points sharing a unique ID with lines. I was using this site as a reference when writing my SQL. The SQL executes without any errors, but my map is not generating.
Here is my CSV dataset I'm running the SQL on:
X Y track_fid track_seg_point_id time
-87.5999 41.7083 0 0 2/17/2018 16:10
-87.74214 41.91581 0 0 2/17/2018 16:11
-87.6005 41.7081 0 0 2/17/2018 16:14
-87.6584 41.8265 0 1 2/17/2018 16:41
-87.63029 41.85842 0 1 2/17/2018 16:59
-87.7308 41.8893 0 1 2/17/2018 17:07
-87.59857 41.708393 0 2 2/17/2018 17:08
-87.5995 41.7081 0 2 2/17/2018 17:15
-87.68106 41.799088 0 2 2/17/2018 17:47
Here is my SQL:
SELECT
    ST_MakeLine(the_geom_webmercator ORDER BY time ASC) AS the_geom_webmercator,
    extract(hour from time) AS hour,
    track_seg_point_id AS cartodb_id
FROM snow_plow_data
GROUP BY
    track_seg_point_id,
    hour
Here is the resulting table from my SQL:
Hour cartodb_id
16 0
16 1
17 1
17 2
Any ideas or suggestions as to why my map points are not being displayed as lines would be great.
If you are using the BUILDER UI, you can add a Create Lines from Points analysis, ordering by time and grouping your lines by the track_fid field (or track_seg_point_id, whichever field you want to use).
On the other hand, if you want to do it from the SQL console: CARTO BUILDER now needs not only the cartodb_id and the_geom_webmercator columns, but also the_geom, so you need to add this last field to your query. Something like this should work:
WITH lines AS (
    SELECT
        ST_MakeLine(the_geom_webmercator ORDER BY time ASC) AS the_geom_webmercator,
        ROW_NUMBER() OVER () AS cartodb_id
    FROM tracks
    GROUP BY track_seg_point_id
)
SELECT
    ST_Transform(the_geom_webmercator, 4326) AS the_geom,
    *
FROM lines
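If you also want to keep the hourly split from your original query, the same pattern applies; a sketch using the table and column names from the question:
WITH lines AS (
    SELECT
        ST_MakeLine(the_geom_webmercator ORDER BY time ASC) AS the_geom_webmercator,
        extract(hour from time) AS hour,
        ROW_NUMBER() OVER () AS cartodb_id
    FROM snow_plow_data
    GROUP BY track_seg_point_id, hour
)
SELECT
    ST_Transform(the_geom_webmercator, 4326) AS the_geom,
    *
FROM lines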

Conditional SELECT depending on a set of rules

I need to get data from different columns depending on a set of rules and I don't see how to do it. Let me illustrate this with an example. I have a table:
ID ELEM_01 ELEM_02 ELEM_03
---------------------------------
1 0.12 0 100
2 0.14 5 200
3 0.16 10 300
4 0.18 15 400
5 0.20 20 500
And I have a set of rules which look something like this:
P1Z: ID=2 and ELEM_01
P2Z: ID=4 and ELEM_03
P3Z: ID=4 and ELEM_02
P4Z: ID=3 and ELEM_03
I'm trying to output the following:
P1Z P2Z P3Z P4Z
------------------------
0.14 400 15 300
I'm used to much simpler queries and this is a bit above my level. I'm getting mixed up by this problem and I don't see a straightforward solution. Any pointers would be appreciated.
EDIT - Logic behind the rules: the table contains data about different aspects of a piece of equipment. Each ID/ELEM_** combination represents the value of one aspect. The table holds the values of all aspects, but we want a single row containing only a specific subset of them, so that we can output in one table the values of that subset of aspects for every piece of equipment.
Assuming that each column is numeric and ID is unique, you could do:
SELECT
SUM(CASE WHEN ID = 2 THEN ELEM_01 END) AS P1Z,
SUM(CASE WHEN ID = 4 THEN ELEM_03 END) AS P2Z,
SUM(CASE WHEN ID = 4 THEN ELEM_02 END) AS P3Z,
SUM(CASE WHEN ID = 3 THEN ELEM_03 END) AS P4Z
...
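Written out end to end, with a hypothetical source table name (the question doesn't give one), it looks like this; MAX() would work equally well, since each CASE yields at most one non-null value per rule:
SELECT
    SUM(CASE WHEN ID = 2 THEN ELEM_01 END) AS P1Z,
    SUM(CASE WHEN ID = 4 THEN ELEM_03 END) AS P2Z,
    SUM(CASE WHEN ID = 4 THEN ELEM_02 END) AS P3Z,
    SUM(CASE WHEN ID = 3 THEN ELEM_03 END) AS P4Z
FROM Equipment_Data;  -- hypothetical table name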