Advice on a complex SQL query for a BIRT dataset

I have the following (simplified) PostgreSQL database table containing info about maintenance done on a certain device:
id bigint NOT NULL,
"time" timestamp(0) with time zone,
action_name text NOT NULL,
action_info text NOT NULL DEFAULT ''::text,
The action_name field can have four values of interest:
MAINTENANCE_START
DEVICE_DEFECT
DEVICE_REPAIRED
MAINTENANCE_STOP
<other (irrelevant) values>
I have to build a BIRT report using the information from this table. The report table should have one entry each time a MAINTENANCE_STOP action is encountered. If a DEVICE_DEFECT or DEVICE_REPAIRED action occurs between this MAINTENANCE_STOP action and its corresponding MAINTENANCE_START action (the MAINTENANCE_START action with the largest "time" value smaller than that of the MAINTENANCE_STOP action), I should write the string "Device not available" in a table cell; otherwise I should write "Device available".
Also, I should compute the duration of the maintenance as the time difference between the MAINTENANCE_STOP action and the MAINTENANCE_START action.
I first attempted to do this in the SQL query, but now I'm not sure it's possible. What approach do you recommend?

My working snippet:
CREATE TABLE "log"
(
id bigint NOT NULL,
time timestamp(0) with time zone,
action_name text NOT NULL,
action_info text NOT NULL DEFAULT ''::text
);
insert into log(id,time,action_name,action_info) values ( 1, '2011-01-01', 'MAINTENANCE_START', 'maintenance01start');
insert into log(id,time,action_name,action_info) values ( 2, '2011-02-01', 'MAINTENANCE_START', 'maintenance02start');
insert into log(id,time,action_name,action_info) values ( 3, '2011-03-01', 'MAINTENANCE_START', 'maintenance03start');
insert into log(id,time,action_name,action_info) values ( 4, '2011-04-01', 'MAINTENANCE_START', 'maintenance04start');
insert into log(id,time,action_name,action_info) values ( 5, '2011-01-10', 'MAINTENANCE_STOP', 'maintenance01stop');
insert into log(id,time,action_name,action_info) values ( 6, '2011-02-10', 'MAINTENANCE_STOP', 'maintenance02stop');
insert into log(id,time,action_name,action_info) values ( 7, '2011-03-10', 'MAINTENANCE_STOP', 'maintenance03stop');
--insert into log(id,time,action_name,action_info) values ( 8, '2011-04-10', 'MAINTENANCE_STOP', 'maintenance04stop');
insert into log(id,time,action_name,action_info) values ( 9, '2011-02-05', 'DEVICE_DEFECT', 'maintenance02defect');
insert into log(id,time,action_name,action_info) values ( 10, '2011-03-05', 'DEVICE_REPAIRED', 'maintenance03repaired');
select
maintenance.start as start
, maintenance.stop as stop
, count (device_action.*) as device_actions
from (select
l_start.time as start
, (select time
from log l_stop
where l_stop.time > l_start.time
and l_stop.action_name = 'MAINTENANCE_STOP'
order by time asc limit 1) as stop
from log l_start
where l_start.action_name='MAINTENANCE_START' order by l_start.time asc) maintenance
left join log device_action
on device_action.time > maintenance.start
and device_action.time < maintenance.stop
and device_action.action_name like 'DEVICE_%'
group by maintenance.start
, maintenance.stop
order by maintenance.start asc
;
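The query above counts device actions per maintenance window, starting from each MAINTENANCE_START. A possible variant that starts from each MAINTENANCE_STOP, as the question describes, and returns the availability text and the duration directly is sketched below. It assumes the log table and sample rows from the snippet; the aliases start_time, stop_time, duration and availability are illustrative names, not part of the original schema.

```sql
-- Sketch: one row per MAINTENANCE_STOP, paired with the latest earlier MAINTENANCE_START.
SELECT
    m.start_time,
    m.stop_time,
    m.stop_time - m.start_time AS duration,      -- interval between start and stop
    CASE
        WHEN EXISTS (
            SELECT 1
            FROM log d
            WHERE d.action_name IN ('DEVICE_DEFECT', 'DEVICE_REPAIRED')
              AND d."time" > m.start_time
              AND d."time" < m.stop_time
        ) THEN 'Device not available'
        ELSE 'Device available'
    END AS availability
FROM (
    SELECT
        l_stop."time" AS stop_time,
        -- latest MAINTENANCE_START before this stop (NULL if none exists)
        (SELECT max(l_start."time")
         FROM log l_start
         WHERE l_start.action_name = 'MAINTENANCE_START'
           AND l_start."time" < l_stop."time") AS start_time
    FROM log l_stop
    WHERE l_stop.action_name = 'MAINTENANCE_STOP'
) m
ORDER BY m.stop_time;
```

Because the outer rows come from MAINTENANCE_STOP entries, a maintenance that has started but not yet stopped (id 4 in the sample data) simply does not appear in the result.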
Be careful with performance. If Postgres does not optimize the nested query, it will take O(n^2) time.
If you can:
Change the structure. E.g. one table DEVICE_MAINTENANCES with a maintenance ID and a second table DEVICE_MAINTENANCE_ACTIONS with a foreign key to DEVICE_MAINTENANCES.ID. Queries will be simpler and faster.
If not, make time the primary key (which creates an implicit index).
If not, create an index on the time column.
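The first and last options could look roughly like this. This is only a sketch: the column names in the two new tables and the index name log_time_idx are assumptions made for illustration, not something taken from the original schema.

```sql
-- Hypothetical restructured schema: one row per maintenance, actions linked by foreign key.
CREATE TABLE device_maintenances (
    id         bigint PRIMARY KEY,
    start_time timestamp(0) with time zone NOT NULL,
    stop_time  timestamp(0) with time zone           -- NULL while the maintenance is still running
);

CREATE TABLE device_maintenance_actions (
    id             bigint PRIMARY KEY,
    maintenance_id bigint NOT NULL REFERENCES device_maintenances (id),
    "time"         timestamp(0) with time zone NOT NULL,
    action_name    text NOT NULL
);

-- If the existing log table has to stay as it is, an index on its time column:
CREATE INDEX log_time_idx ON log ("time");
```

With the restructured tables, finding the device actions of a maintenance becomes a plain join on maintenance_id instead of a range comparison on timestamps.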

Related

Inserting missing date into non nullable Datetime field

Here we have an existing database and I'm building a new system with a new database.
There I need to transfer some data from the old database table to the new database table.
I wrote this query in the SQL
INSERT INTO [Mondo-UAT].[dbo].[Countries] (
[Country_Name]
,[Country_Code]
,[Note]
)
SELECT [Code1]
,[Code3]
,[Name]
FROM [MondoErp-UAT].[dbo].[Nations]
The issue is that the [Mondo-UAT].[dbo].[Countries] table has other columns: Note is a string, Status is a bool, CreateBy is an int, and CreateDate is a DateTime. So when I run the query it fails with an error:
Msg 515, Level 16, State 2, Line 8 Cannot insert the value NULL into column 'CreatedDate', table 'Mondo-UAT.dbo.Countries'; column does not allow nulls. INSERT fails. The statement has been terminated
So I want to know how to supply values for CreateDate, CreateBy, and Notes in the script I wrote above.
If the target table has non-nullable columns without defaults, you have to specify values for those fields when inserting data.
The easiest solution would be to modify the target table to add DEFAULT values for those fields, eg SYSDATETIME() for Created, SYSTEM_USER for CreatedBy and '' for Notes. For Status you'll have to decide what a good default status is. 0 may or may not be meaningful.
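Otherwise, these values will have to be specified in the query (adjust the column names to match the actual table; the error message, for example, names the date column CreatedDate):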
INSERT INTO [Mondo-UAT].[dbo].[Countries] (
[Country_Name]
,[Country_Code]
,[Note]
, Created
, CreatedBy
, Status
)
SELECT [Code1]
,[Code3]
,[Name]
, SYSDATETIME()
, SYSTEM_USER
, 0
FROM [MondoErp-UAT].[dbo].[Nations]
SYSTEM_USER returns the login name of the current user, whether it's a Windows or SQL Server login. This makes it a good default for user columns.
SYSDATETIME returns the current time as a DATETIME2(7) value. If Created is a DATETIMEOFFSET, SYSDATETIMEOFFSET should be used.
You can set a default value for the CreateDate, CreateBy, and Notes columns in the [Mondo-UAT].[dbo].[Countries] table if they are NOT NULL columns. Then, when you insert data into the table without supplying a value for these columns, the default value will be inserted.
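A minimal sketch of adding such defaults in SQL Server follows. The column name CreatedDate is taken from the error message and Status from the question; the constraint names are made up for illustration, and the default of 0 for Status is only a placeholder that should be replaced by whatever status is meaningful for the application.

```sql
-- Assumed column names: CreatedDate (from the error message) and Status (from the question).
-- Constraint names DF_Countries_* are hypothetical.
ALTER TABLE [Mondo-UAT].[dbo].[Countries]
    ADD CONSTRAINT DF_Countries_CreatedDate DEFAULT SYSDATETIME() FOR [CreatedDate];

ALTER TABLE [Mondo-UAT].[dbo].[Countries]
    ADD CONSTRAINT DF_Countries_Status DEFAULT 0 FOR [Status];
```

Once a default exists, the column can be left out of the INSERT column list and SQL Server fills it in.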

Not able to insert a row in a table which has auto incremented primary key

I have a table reportFilters which has the following column names:
The reportFilterId is auto increment. I want to insert a row in the table with the script below:
IF OBJECT_ID(N'ReportFilters', N'U') IS NOT NULL
BEGIN
IF NOT EXISTS (SELECT * FROM [ReportFilters]
WHERE ReportId IN (SELECT ReportId FROM [Reports] WHERE ReportType = 'Operational Insights Command Staff Dashboard') )
BEGIN
INSERT INTO [ReportFilters] Values(1, 'SelectView', 'Select Views', 13, 'Views','Views', 'SelectView', 'a', 'b', 'c' );
END
END
GO
But I am getting the following error:
Column name or number of supplied values does not match table definition.
Can I please get help on this ? Thanks in advance.
I think the problem is that the supplied values do not line up with the table's columns. Without a column list, the values are matched against the table definition in column order, which starts with ReportFilterId rather than ReportId.
As a result, there are 11 columns in your table but your statement only provides 10 values.
I would specify the inserted columns explicitly (starting from ReportId and leaving out your PK column ReportFilterId):
INSERT INTO [ReportFilters] (ReportId,ReportFilterName,ReportFilterTitle....)
Values (1, 'SelectView', 'Select Views', 13, 'Views','Views', 'SelectView', 'a', 'b', 'c' );

Leveraging CHECKSUM in MERGE but unable to get all rows to merge

I am having trouble getting MERGE statements to work properly, and I have recently started to try to use checksums.
In the toy example below, I cannot get this row to insert (1, 'ANDREW', 334.3) that is sitting in the staging table.
DROP TABLE TEMP1
DROP TABLE TEMP1_STAGE
-- create table
CREATE TABLE TEMP1
(
[ID] INT,
[NAME] VARCHAR(55),
[SALARY] FLOAT,
[SCD] INT
)
-- create stage
CREATE TABLE TEMP1_STAGE
(
[ID] INT,
[NAME] VARCHAR(55),
[SALARY] FLOAT,
[SCD] INT
)
-- insert vals into stage
INSERT INTO TEMP1_STAGE (ID, NAME, SALARY)
VALUES
(1, 'ANDREW', 333.3),
(2, 'JOHN', 555.3),
(3, 'SARAH', 444.3)
-- insert stage table into main table
INSERT INTO TEMP1
SELECT *
FROM TEMP1_STAGE;
-- clean up stage table
TRUNCATE TABLE TEMP1_STAGE;
-- put some new values in the stage table
INSERT INTO TEMP1_STAGE (ID, NAME, SALARY)
VALUES
(1, 'ANDREW', 334.3),
(4, 'CARL', NULL)
-- CHECKSUMS
update TEMP1_STAGE
set SCD = binary_checksum(ID, NAME, SALARY);
update TEMP1
set SCD = binary_checksum(ID, NAME, SALARY);
-- run merge
MERGE TEMP1 AS TARGET
USING TEMP1_STAGE AS SOURCE
-- match
ON (SOURCE.[ID] = TARGET.[ID])
WHEN NOT MATCHED BY TARGET
THEN INSERT (
[ID], [NAME], [SALARY], [SCD]) VALUES (
SOURCE.[ID], SOURCE.[NAME], SOURCE.[SALARY], SOURCE.[SCD]);
-- the value: (1, 'ANDREW', 334.3) is not merged in
SELECT * FROM TEMP1;
How can I use the checksum to my advantage in the MERGE?
Your issue is that the NOT MATCHED condition is only considering the ID values specified in the ON condition.
If you want duplicate, but distinct records, add SCD to the ON condition.
If (more likely) your intent is that record ID = 1 be updated with the new SALARY, you will need to add a WHEN MATCHED AND SOURCE.SCD <> TARGET.SCD THEN UPDATE ... clause.
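As a sketch of that clause, assuming the TEMP1 and TEMP1_STAGE tables from the question with SCD already populated on both sides by the two UPDATE statements:

```sql
MERGE TEMP1 AS TARGET
USING TEMP1_STAGE AS SOURCE
    ON (SOURCE.[ID] = TARGET.[ID])
-- same ID but a different checksum: overwrite the existing row
WHEN MATCHED AND SOURCE.[SCD] <> TARGET.[SCD]
    THEN UPDATE SET
        TARGET.[NAME]   = SOURCE.[NAME],
        TARGET.[SALARY] = SOURCE.[SALARY],
        TARGET.[SCD]    = SOURCE.[SCD]
-- ID not present in the target yet: insert it
WHEN NOT MATCHED BY TARGET
    THEN INSERT ([ID], [NAME], [SALARY], [SCD])
    VALUES (SOURCE.[ID], SOURCE.[NAME], SOURCE.[SALARY], SOURCE.[SCD]);
```

With the sample data, the staged row for ID 1 matches on ID, so it can only reach TEMP1 through the WHEN MATCHED branch; the row for ID 4 goes through the insert branch.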
That said, the 32-bit int value returned by the binary_checksum() function is not sufficiently distinct to avoid collisions and unwanted missed updates. Take a look at HASHBYTES instead. See Binary_Checksum Vs HashBytes function.
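A rough illustration of computing such a hash per row is shown below. The alias ROW_HASH and the '|' separator are assumptions for this sketch; note that the existing SCD column is an INT and could not store the result, which for SHA2_256 is a 32-byte VARBINARY.

```sql
-- Sketch: hash the compared columns of each staged row.
-- CONCAT converts its arguments to strings and treats NULL as an empty string;
-- the '|' separator keeps adjacent values from running together.
SELECT
    [ID],
    HASHBYTES('SHA2_256', CONCAT([ID], '|', [NAME], '|', [SALARY])) AS ROW_HASH
FROM TEMP1_STAGE;
```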
Even that may not yield your intended performance gain. Assuming that you have to calculate the hash for all records in the staging table for each update cycle, you may find that it is simpler to just compare each potentially different field before the update. Something like:
WHEN MATCHED AND (SOURCE.NAME <> TARGET.NAME OR SOURCE.SALARY <> TARGET.SALARY)
THEN UPDATE ...
Even then, you need to be careful of potential NULL values and COLLATION. Both NULL <> 50000.00 and 'Andrew' <> 'ANDREW' may not give you the results you expect. It might be easiest and most reliable to just code WHEN MATCHED THEN UPDATE ....
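If the column-by-column comparison is kept, one NULL-aware way to write it is the EXISTS ... EXCEPT idiom, because EXCEPT treats two NULLs as equal where <> does not. The following is a sketch against the tables from the question, leaving SCD out entirely; it does not change collation behaviour, so under a case-insensitive collation 'Andrew' and 'ANDREW' still count as the same value.

```sql
MERGE TEMP1 AS TARGET
USING TEMP1_STAGE AS SOURCE
    ON (SOURCE.[ID] = TARGET.[ID])
-- EXCEPT returns a row only when at least one compared column differs,
-- including the case where exactly one side is NULL
WHEN MATCHED AND EXISTS (
        SELECT SOURCE.[NAME], SOURCE.[SALARY]
        EXCEPT
        SELECT TARGET.[NAME], TARGET.[SALARY]
    )
    THEN UPDATE SET
        TARGET.[NAME]   = SOURCE.[NAME],
        TARGET.[SALARY] = SOURCE.[SALARY]
WHEN NOT MATCHED BY TARGET
    THEN INSERT ([ID], [NAME], [SALARY])
    VALUES (SOURCE.[ID], SOURCE.[NAME], SOURCE.[SALARY]);
```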
Lastly, I suggest using DECIMAL instead of FLOAT for Salary.

Type Decimal doesn't work in Amazon Redshift

I created a table in Redshift:
CREATE TABLE test_table1 (
id bigint NOT NULL,
longitude decimal(6,4)
);
and inserted:
INSERT INTO test_table1 VALUES (1, 65.3695);
yet querying the database shows (1, 65.37)
Why isn't the result: 1, 65.3695?
This is a display parameter in SQL Workbench.
Tools > Options... > Data Formatting > Decimal digits
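One way to check that only the display is affected, and not the stored value, is to also select the column as text, since a client's numeric formatting setting should not apply to a character column. A small sketch (the alias longitude_text is illustrative):

```sql
-- Compare the client-formatted numeric column with its text representation.
SELECT id,
       longitude,
       CAST(longitude AS VARCHAR(20)) AS longitude_text
FROM test_table1;
```

If longitude_text shows all four decimal places while longitude appears rounded, the rounding is happening in the client.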

Use Military-time value as Time in PostgreSQL?

I have created a table
CREATE TABLE mytable
(food VARCHAR(20) references myfood(BIN)
,name NUMERIC(2)
,servingsize VARCHAR(15)
,time TIMESTAMP
,PRIMARY KEY(food, time)
);
and I want to insert data into this table using:
INSERT INTO dinnertime (food, name, servingsize, time) VALUES
('earl', 3, 'one cup', '1300'),
('phebeo', 2, 'two cup', '1100'),
('apollo', 1, 'one Cup', '0700'),
('oscar', 4, 'one cup', '2200');
But PostgreSQL does not let me do this. The problem is with the time. What changes do I need to make to my table for it to accept this time format?
Use time instead of timestamp (timestamp is date + time, you only have time), then convert to one of the recognized formats for time:
http://www.postgresql.org/docs/8.0/static/datatype-datetime.html
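A minimal sketch of that change follows. It assumes dinnertime is the intended table name (the question creates mytable but inserts into dinnertime) and leaves out the foreign key to myfood so that the example stands on its own.

```sql
-- Sketch: store only the time of day.
CREATE TABLE dinnertime (
    food        VARCHAR(20),
    name        NUMERIC(2),
    servingsize VARCHAR(15),
    "time"      TIME,
    PRIMARY KEY (food, "time")
);

-- Values written in a recognized time format (HH:MI).
INSERT INTO dinnertime (food, name, servingsize, "time") VALUES
    ('earl',   3, 'one cup', '13:00'),
    ('phebeo', 2, 'two cup', '11:00'),
    ('apollo', 1, 'one Cup', '07:00'),
    ('oscar',  4, 'one cup', '22:00');

-- If the source data arrives as four-digit strings, one option is to
-- parse them explicitly with a format pattern and cast the result to time.
SELECT to_timestamp('1300', 'HH24MI')::time;
```

Parsing with an explicit 'HH24MI' pattern avoids relying on how the server interprets an unpunctuated digit string.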