Using a query to update existing records. - sql

Basically, I have a table of data with plenty of records (10,000+).
They all have 4 fields in common which must have data entered. The unique data is TIME.
I have already done a group sort query which has identified the groups of data based on these 4 fields, and then calculated an average time for each group.
I now need to insert the average time alongside the real time in the table so each individual record's time can be evaluated against the average of its type.
For instance, one group from the query would have the result
Process1, Week2, Operator3, Shift4, AvgOfTIME = 120.70
It would then need to re-insert that average time into all records that match those criteria, but do it for every group result and record.
Is this even possible?

You need to update the table by joining it to a subquery:
update t1 set t1.timefield = t2.AvgOfTIME
from yourtable t1
inner join
(
-- your query for calculating the average time
)
as t2
on t1.Process1 = t2.Process1 and t1.Week2 = t2.Week2 and t1.Operator3 = t2.Operator3 and t1.Shift4 = t2.Shift4
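The same idea can be tried out end to end with a small sketch. It uses Python's built-in sqlite3 module and a correlated subquery instead of the UPDATE ... FROM join above, because join-style updates are not available in every engine. The table name records and all column names (process_name, week_no, operator_name, shift_no, time_taken, avg_time) are assumptions made up for illustration, and the average is written to a separate avg_time column so the real time is kept.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (process_name TEXT, week_no INTEGER, "
    "operator_name TEXT, shift_no INTEGER, time_taken REAL, avg_time REAL)"
)
conn.executemany(
    "INSERT INTO records (process_name, week_no, operator_name, shift_no, time_taken) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("P1", 2, "Op3", 4, 100.0),
        ("P1", 2, "Op3", 4, 140.0),
        ("P2", 2, "Op3", 4, 50.0),
    ],
)


def fill_group_averages(conn):
    """Store each group's average time on every row that belongs to the group."""
    conn.execute(
        """
        UPDATE records
        SET avg_time = (
            SELECT AVG(r2.time_taken)
            FROM records AS r2
            WHERE r2.process_name = records.process_name
              AND r2.week_no = records.week_no
              AND r2.operator_name = records.operator_name
              AND r2.shift_no = records.shift_no
        )
        """
    )


fill_group_averages(conn)
rows = conn.execute(
    "SELECT process_name, time_taken, avg_time FROM records ORDER BY rowid"
).fetchall()
print(rows)
```

Each row keeps its own time_taken and gains the average of its group, so the two can be compared directly in a later query.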

Related

INSERT INTO two columns from a SELECT query

I have a table called VIEWS with Id, Day, Month, name of video, name of browser... but I'm interested only in Id, Day and Month.
The ID can be duplicate because the user (ID) can watch a video multiple days in multiple months.
This is the query for the minimum date and the maximum date.
SELECT ID, CONCAT(MIN(DAY), '/', MIN(MONTH)) AS MIN_DATE,
CONCAT(MAX(DAY), '/', MAX(MONTH)) AS MAX_DATE
FROM Views
GROUP BY ID
I want to insert the two columns from this select (MIN_DATE and MAX_DATE) into two new columns with INSERT INTO.
What should the INSERT INTO query look like?
To do what you are trying to do (there are some issues with your solution, please read my comments below), first you need to add the new columns to the table.
ALTER TABLE Views ADD MIN_DATE VARCHAR(10)
ALTER TABLE Views ADD MAX_DATE VARCHAR(10)
Then you need to UPDATE your new columns (not INSERT, because you don't want new rows). Determine the min/max for each ID, then join the result back to the table to be able to update each row. You can't update directly from a GROUP BY as rows are grouped and lose their original row.
;WITH MinMax AS
(
SELECT
ID,
CONCAT(MIN(V.DAY), '/', MIN(V.MONTH)) AS MIN_DATE,
CONCAT(MAX(V.DAY), '/', MAX(V.MONTH)) AS MAX_DATE
FROM
Views AS V
GROUP BY
ID
)
UPDATE V SET
MIN_DATE = M.MIN_DATE,
MAX_DATE = M.MAX_DATE
FROM
MinMax AS M
INNER JOIN Views AS V ON M.ID = V.ID
The problems that I see with this design are:
Storing aggregated columns: you usually want to do this only for performance reasons (which I believe is not the case here), as querying the aggregated (grouped) rows is faster because there are fewer rows to read. The problem is that you will have to update the grouped values each time one of the original rows is updated, which adds extra processing time. Another option would be to update the aggregated values periodically, but you will have to accept that for a period of time the grouped values do not really represent the tracking table.
Keeping aggregated columns on the same table as the data they are aggregating: this is a normalization problem. Updating or inserting a row will trigger updating all rows with the same ID, as the min/max values might have changed. Also, the min/max values will always be repeated on all rows that belong to the same ID, which wastes space. If you have to save aggregated data, save it in a different table, which still causes the problems I listed in the previous point.
Using a text data type to store dates: you always want to work with dates using a proper DATETIME data type. This will not only let you use date functions like DATEADD or DATEDIFF, but also save space (varchars that store dates need more bytes than DATETIME). I don't see the year part in your query; it should be considered when computing a min/max (this might depend on what you are storing in this table).
Computing the min/max incorrectly: If you have the following rows:
ID DAY MONTH
1 5 1
1 3 2
The current result of your query would be 3/1 as MIN_DATE and 5/2 as MAX_DATE, which I believe is not what you are trying to find. The lowest here should be the 5th of January and the highest the 3rd of February. This is a consequence of storing date parts as independent values and not the whole date as a DATETIME.
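The two-row example above can be reproduced with a minimal sketch using Python's built-in sqlite3 module. The combined key MONTH * 100 + DAY is only an illustrative workaround chosen here, not part of the original answer; a real date column would be the cleaner fix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Views (ID INTEGER, DAY INTEGER, MONTH INTEGER)")
conn.executemany("INSERT INTO Views VALUES (?, ?, ?)", [(1, 5, 1), (1, 3, 2)])

# Separate aggregates mix the day of one row with the month of another.
separate = conn.execute(
    "SELECT MIN(DAY) || '/' || MIN(MONTH), MAX(DAY) || '/' || MAX(MONTH) "
    "FROM Views GROUP BY ID"
).fetchone()

# Folding month and day into one sortable number keeps each pair together.
combined = conn.execute(
    "SELECT MIN(MONTH * 100 + DAY), MAX(MONTH * 100 + DAY) FROM Views GROUP BY ID"
).fetchone()


def as_day_month(key):
    """Turn the sortable key (month * 100 + day) back into 'day/month'."""
    return f"{key % 100}/{key // 100}"


correct = (as_day_month(combined[0]), as_day_month(combined[1]))
print(separate, correct)
```

Here separate holds the mixed-up pair (3/1 and 5/2), while correct holds the 5th of January and the 3rd of February.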
What you usually want to do for this scenario is to group directly on the query that needs the data grouped, so you will do the GROUP BY on the SELECT that needs the min/max. Having an index by ID would make the grouping very fast. Thus, you save the storage space you would use to keep the aggregated values and also the result is always the real grouped result at the time that you are querying.
It would be something like the following:
;WITH MinMax AS
(
SELECT
ID,
CONCAT(MIN(V.DAY), '/', MIN(V.MONTH)) AS MIN_DATE, -- Date problem (varchar + min/max computed separately)
CONCAT(MAX(V.DAY), '/', MAX(V.MONTH)) AS MAX_DATE -- Date problem (varchar + min/max computed separately)
FROM
Views AS V
GROUP BY
ID
)
SELECT
V.*,
M.MIN_DATE,
M.MAX_DATE
FROM
MinMax AS M
INNER JOIN Views AS V ON M.ID = V.ID

How to compare value in where clause with max value for an ID?

I am trying to make a selection from a lot of different tables, and for one of the constraints the primary key cannot have an item associated with it that has a date within the last 30 days. However, it is possible for the primary key to have multiple items associated with it.
The issue that I currently face is that when there are multiple items associated, one within the date range and one outside it, the row is still returned. I want it to be excluded if any of the associated items' dates are within the past 30 days.
How can I make the query check all the items instead of just one at a time?
Thanks!
Using NOT IN
select *
from table
where id not in (select id from table where datefield > dateadd(day,-30,getdate()))
This only returns records from the table where the id doesn't have any record whose datefield is within the last 30 days. You likely need a join too, since you reference a lot of different tables. Something like...
select *
from table
where id not in (select table.id
from table
inner join table2 on table2.refid = table.id
where table2.datefield > dateadd(day,-30,getdate()))
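To check the exclusion behaviour on a small data set, here is a sketch using Python's built-in sqlite3 module. The table names parent and item are hypothetical stand-ins for the placeholder tables above, and SQLite's date('now', '-30 days') replaces dateadd/getdate, which are SQL Server functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE item (refid INTEGER, datefield TEXT);
    INSERT INTO parent VALUES (1), (2), (3);
    -- id 1: one recent and one old item; id 2: only an old item; id 3: no items
    INSERT INTO item VALUES
        (1, date('now', '-5 days')),
        (1, date('now', '-90 days')),
        (2, date('now', '-90 days'));
    """
)

# Keep only the ids that have no item dated within the last 30 days.
kept = [
    row[0]
    for row in conn.execute(
        """
        SELECT id FROM parent
        WHERE id NOT IN (
            SELECT refid FROM item
            WHERE datefield > date('now', '-30 days')
        )
        ORDER BY id
        """
    )
]
print(kept)
```

Id 1 is dropped even though it also has an old item, because a single recent item is enough to put it in the subquery result. One thing to watch: if the subquery can return NULL, NOT IN matches nothing, so a NOT EXISTS form may be safer in that case.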

Suppress Nonadjacent Duplicates in Report

Medical records in my Crystal Report are sorted in this order:
...
Group 1: Score [Level of Risk]
Group 2: Patient Name
...
Because patients are sorted by Score before Name, the report pulls in multiple entries per patient with varying scores - and since duplicate entries are not always adjacent, I can't use Previous or Next to suppress them. To fix this, I'd like to only display the latest entry for each patient based on the Assessment Date field - while maintaining the above order.
I'm convinced this behavior can be implemented with a custom SQL command to only pull in the latest entry per patient, but have had no success creating that behavior myself. How can I accomplish this compound sort?
Current SQL Statement in use:
SELECT "EpisodeSummary"."PatientID",
"EpisodeSummary"."Patient_Name",
"EpisodeSummary"."Program_Value"
"RiskRating"."Rating_Period",
"RiskRating"."Assessment_Date",
"RiskRating"."Episode_Number",
"RiskRating"."PatientID",
"Facility"."Provider_Name",
FROM (
"SYSTEM"."EpisodeSummary"
"EpisodeSummary"
LEFT OUTER JOIN "FOOBARSYSTEM"."RiskAssessment" "RiskRating"
ON (
("EpisodeSummary"."Episode_Number"="RiskRating"."Episode_Number")
AND
("EpisodeSummary"."FacilityID"="RiskRating"."FacilityID")
)
AND
("EpisodeSummary"."PatientID"="RiskRating"."PatientID")
), "SYSTEM"."Facility" "Facility"
WHERE (
"EpisodeSummary"."FacilityID"="Facility"."FacilityID"
)
AND "RiskRating"."PatientID" IS NOT NULL
ORDER BY "EpisodeSummary"."Program_Value"
The SQL code below may not be exactly correct, depending on the structure of your tables. The code below assumes the 'duplicate risk scores' were coming from the RiskAssessment table. If this is not correct, the code may need to be altered.
Essentially, we create a derived table and create a row_number for each record, based on the patientID and ordered by the assessment date - The most recent date will have the lowest number (1). Then, on the join, we restrict the resultset to only select record #1 (each patient has its own rank #1).
If this doesn't work, let me know and provide some table details. Should the Facility table be the starting point? Are there multiple entries in EpisodeSummary per patient? Thanks!
SELECT es.PatientID
,es.Patient_Name
,es.Program_Value
,rrd.Rating_Period
,rrd.Assessment_Date
,rrd.Episode_Number
,rrd.PatientID
,f.Provider_Name
FROM SYSTEM.EpisodeSummary es
LEFT JOIN (
--Derived table retrieving the most recent assessment for each patient
SELECT PatientID
,Assessment_Date
,Episode_Number
,FacilityID
,Rating_Period
,ROW_NUMBER() OVER (
PARTITION BY PatientID ORDER BY Assessment_Date DESC
) AS RN -- This code generates a row number for each record. The count is restarted for every patientID and the count starts at the most recent date.
FROM RiskAssessment
) rrd
ON es.patientID = rrd.patientid
AND es.episode_number = rrd.episode_number
AND es.facilityid = rrd.facilityid
AND rrd.RN = 1 --This only retrieves one record per patient (the most recent date) from the riskassessment table
INNER JOIN SYSTEM.Facility f
ON es.facilityid = f.facilityid
WHERE rrd.PatientID IS NOT NULL
ORDER BY es.Program_Value
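The row-number technique can be tried in isolation with a small sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). Only the RiskAssessment table is modelled, and the Score column and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE RiskAssessment (PatientID INTEGER, Assessment_Date TEXT, Score INTEGER);
    INSERT INTO RiskAssessment VALUES
        (1, '2020-01-10', 3),
        (1, '2020-03-05', 1),
        (2, '2020-02-01', 2);
    """
)

# Number each patient's assessments from newest to oldest, then keep number 1.
latest = conn.execute(
    """
    SELECT PatientID, Assessment_Date, Score
    FROM (
        SELECT PatientID, Assessment_Date, Score,
               ROW_NUMBER() OVER (
                   PARTITION BY PatientID ORDER BY Assessment_Date DESC
               ) AS RN
        FROM RiskAssessment
    ) AS ranked
    WHERE RN = 1
    ORDER BY Score, PatientID
    """
).fetchall()
print(latest)
```

Each patient appears once, with the most recent assessment, and the outer ORDER BY can still sort by score before patient as the report requires.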

SQL Delete based on max value

I have a table that has a composite key of 3 columns
st_id, sj_id, order
and want to delete a row based on a specific st_id and sj_id and by taking the max(order)
Could you please help?
As far as I know, you'll need a subquery for this (this is from memory, so it may not compile first time; order is a reserved word in SQL, so it will probably need quoting):
DELETE
FROM table
WHERE st_id = my_st_id
AND sj_id = my_sj_id
AND order IN (
SELECT MAX(order)
FROM table
WHERE st_id = my_st_id
AND sj_id = my_sj_id)
What this does is perform the inner (SELECT) query first, returning the maximum order. Those results then get passed to the outer query which does the delete.
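A runnable sketch of the same delete, using Python's built-in sqlite3 module. The table name enrolment and the sample rows are hypothetical, and the order column is double-quoted because it is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE enrolment (st_id INTEGER, sj_id INTEGER, "order" INTEGER, '
    'PRIMARY KEY (st_id, sj_id, "order"))'
)
conn.executemany(
    "INSERT INTO enrolment VALUES (?, ?, ?)",
    [(1, 10, 1), (1, 10, 2), (1, 10, 3), (2, 10, 7)],
)


def delete_highest_order(conn, st_id, sj_id):
    """Delete the row with the highest "order" for one (st_id, sj_id) pair."""
    conn.execute(
        'DELETE FROM enrolment WHERE st_id = ? AND sj_id = ? AND "order" = ('
        'SELECT MAX("order") FROM enrolment WHERE st_id = ? AND sj_id = ?)',
        (st_id, sj_id, st_id, sj_id),
    )


delete_highest_order(conn, 1, 10)
remaining = conn.execute(
    'SELECT st_id, sj_id, "order" FROM enrolment ORDER BY st_id, "order"'
).fetchall()
print(remaining)
```

Only the row with order 3 for st_id 1 and sj_id 10 is removed; the row for st_id 2 is untouched even though its order value is higher, because the subquery is filtered by the same two keys.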

SQL table update from selection/join with multiple columns, only one column data needed

Firstly, I'm rather new to SQL and I've run into a roadblock. I'm using the Mimer SQL system.
I have three tables: "Transactions", roughly equivalent to a receipt total, which I want to update with data from a selection; "item", containing prices; and "sale", containing the number of items and the item IDs for a given transaction. I have to join "item" and "sale" to get the data to update Transaction with.
SELECT sale.T_ID, SUM(sale.quantity * item.price) as total
FROM sale
INNER JOIN item
ON sale.I_ID = item.I_ID
GROUP BY T_ID
Gives me the desired data selection with the transaction IDs and the sum total for that transaction:
T_ID Amount
1 100
2 150
etc...
I want to update the Transaction table, which contains columns "T_ID" and "Total". I want to match the T_IDs and update the Total with the data from the corresponding Amount in the selection. The query:
UPDATE transaction SET total = (
SELECT total FROM (
SELECT sale.T_ID, SUM(sale.quantity * item.price) as total
FROM sale
INNER JOIN item
ON sale.I_ID = item.I_ID
GROUP BY T_ID)
WHERE transaction.T_ID=T_ID);
I can sense that the above statement is faulty, but I am unable to discern the problem. How should I construct the query?
Only select the SUM(sale.quantity * item.price) in your subquery and remove the select total one.
update Transaction
set total = (
select SUM(sale.quantity * Item.price)
from Sale
inner join Item on Sale.I_ID = Item.I_ID
where Sale.T_ID = Transaction.T_ID
)
That is what I would try out in SQL Server.
I would have preferred to have some sample data so that I could test it against my local SQL Server database and verify that this statement does what is expected. Though you're not using SQL Server, the idea is still the same.
EDIT #1
Though I do not fully understand why it does not overwrite values in the transaction.total column when a transaction is composed of several different items, your suggested query works.
The behaviour of the SUM function is to sum all targeted records, resulting in only one scalar value for the selected Sale and Item rows. There can be only one sum, can there not?
That said, it multiplies each row that satisfies the constraint, that is, each row for a particular transaction.
It selects both Sale and Item rows for this very transaction;
It then iterates through each of the resulting rows and multiplies Sale.Quantity and Item.Price together, which gives a value for each row;
Once a row is multiplied, its result is added to a running total which is stored somewhere in memory;
Once it has processed all the rows for that transaction, it comes out with a scalar value, which is the sum of all the rows in Sale and Item for that given transaction;
This SUM ends up being this transaction's total.
In other words, the subquery "knows" which transaction to sum for, because the rows are filtered in the subquery itself. The Transaction table is accessible in the subquery as part of the main SQL statement. So, putting it in the where clause filters the rows from Sale and Item that will later be multiplied and added up for the total.
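The behaviour described above can be checked with a small sketch using Python's built-in sqlite3 module. The sample prices and quantities are made up, and the table is named transactions here only because TRANSACTION is a keyword in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE item (I_ID INTEGER PRIMARY KEY, price REAL);
    CREATE TABLE sale (T_ID INTEGER, I_ID INTEGER, quantity INTEGER);
    CREATE TABLE transactions (T_ID INTEGER PRIMARY KEY, total REAL);
    INSERT INTO item VALUES (1, 10.0), (2, 25.0);
    INSERT INTO sale VALUES (1, 1, 5), (1, 2, 2), (2, 2, 6);
    INSERT INTO transactions (T_ID) VALUES (1), (2), (3);
    """
)

# The subquery runs once per transactions row, filtered by that row's T_ID.
conn.execute(
    """
    UPDATE transactions
    SET total = (
        SELECT SUM(sale.quantity * item.price)
        FROM sale
        INNER JOIN item ON sale.I_ID = item.I_ID
        WHERE sale.T_ID = transactions.T_ID
    )
    """
)
totals = conn.execute("SELECT T_ID, total FROM transactions ORDER BY T_ID").fetchall()
print(totals)
```

Transaction 1 has two sale rows that are summed into a single total. Transaction 3 has no sale rows, so its subquery returns NULL and the total is set to NULL; wrapping the SUM in COALESCE(..., 0) is one way to get zero instead.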
Does this help you better understand this update statement?
Please feel free to ask your questions. =)