I'm trying to do this update but for some reason I cannot quite master SQL sub queries.
My table structure is as follows:
id  fk  date      activeFlg
--  --  --------  ---------
1   1   04/10/11  0
2   1   02/05/99  0
3   2   09/10/11  0
4   3   11/28/11  0
5   3   12/25/98  0
Ideally I would like to set activeFlg to 1 for the row with the most recent date within each distinct foreign key. For instance, after running my query, ids 1, 3 and 4 will have the active flag set to 1.
The closest thing I came up with was a query returning all of the max dates for each distinct fk:
SELECT MAX(date)
FROM table
GROUP BY fk
But since I can't even come up with the subquery, there is no way I can proceed :/
Can somebody please give me some insight on this? I'm trying to really learn more about subqueries, so an explanation would be greatly appreciated.
Thank you!
You need to select the fk as well and then restrict by that, so change
SELECT fk,MAX(date)
FROM table
GROUP BY fk
to
WITH Ones2update AS
(
SELECT fk, MAX(date) AS maxDate
FROM table
GROUP BY fk
)
UPDATE t
SET activeFlg = 1
FROM table t
JOIN Ones2update u ON t.fk = u.fk AND t.date = u.maxDate
Also, I would test first, so run this query beforehand:
WITH Ones2update AS
(
SELECT fk, MAX(date) AS maxDate
FROM table
GROUP BY fk
)
SELECT t.fk, t.date, t.activeFlg
FROM table t
JOIN Ones2update u ON t.fk = u.fk AND t.date = u.maxDate
to make sure you are getting what you expect and I did not make any typos.
Additional note: I use a join instead of a sub-query -- they are logically the same but I always find joins to be clearer (once I got used to using joins). Depending on the optimizer they can be faster.
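Since the question asks specifically about subqueries, here is a minimal sketch of the same update written with a correlated subquery instead of a join. It runs against an in-memory SQLite database from Python only to be self-contained. The table name t and the ISO-formatted date strings are assumptions made for the example (text dates like 04/10/11 would not sort correctly), and the exact syntax on other databases may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table "t" standing in for the table in the question.
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, fk INTEGER, date TEXT, activeFlg INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2011-04-10", 0),
        (2, 1, "1999-02-05", 0),
        (3, 2, "2011-09-10", 0),
        (4, 3, "2011-11-28", 0),
        (5, 3, "1998-12-25", 0),
    ],
)

# For every row, the subquery looks up the latest date among rows that share
# the same fk; the row is updated only when its own date equals that maximum.
conn.execute(
    """
    UPDATE t
    SET activeFlg = 1
    WHERE date = (SELECT MAX(t2.date) FROM t AS t2 WHERE t2.fk = t.fk)
    """
)

active_ids = [row[0] for row in conn.execute(
    "SELECT id FROM t WHERE activeFlg = 1 ORDER BY id"
)]
print(active_ids)  # [1, 3, 4]
```

The subquery is "correlated" because it refers to t.fk from the outer statement, so it is conceptually re-evaluated for each row being considered. If two rows of the same fk tie on the latest date, both get flagged.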
This is the general idea. You can flesh out the details.
update t
set activeFlg = 1
from yourTable t
join (
select fk, max([date]) as maxdate
from yourTable
group by fk
) sq on t.fk = sq.fk and t.[date] = sq.maxdate
Related
I have the following table named 'flt'
You can see the duplicates are identified by 3 columns only (flight, fltno, stad)... I don't care about what is in col1 and col2, but I should be able to show them in the query.
So.. you can see ids 8, 3 and 10 are duplicates.
I want to write a pure SQL query... that can do the following:
1) the duplicate count column.. which basically counts how many records are there that matches the flight, fltno, stad of the currently selected row.
2) the "duplicate rank" column which orders the duplicates.. 1 means first record, 2 means this is the 2nd record and 3 means this is the 3rd record. You can see ba 104 has 2 records in total... and it is ranked 1 and 2.
3) from the resulting (possibly editable) query.. I should be able to filter out (using where) all the duplicate ranks that are > 1... then able to delete those records.
So.. ids 8, 3 and 10 have a rank > 1.. and I should be able to delete them within this query... by clicking on the row and pressing the delete key.
If the condition 3 is not entirely achievable.. please give me the best way possible. Thanks.
This SQL will get you the results as per your question. However, it won't work as part of a DELETE query; I suggest selecting from this query into a temporary table and then running a DELETE query from that : )
SELECT A.id, A.flight, A.fltno, A.stad, A.col1, A.col2,
B.concount AS [duplicate count],
(SELECT Count(C.id) FROM tblfit AS C WHERE C.flight & C.fltno & C.stad = A.concat AND C.id <= A.id) AS [duplicate rank]
FROM (SELECT tblfit.*, [flight] & [fltno] & [stad] AS concat
FROM tblfit) AS A,
(SELECT [flight] & [fltno] & [stad] AS concat, Count([concat]) AS concount
FROM tblfit
GROUP BY [flight] & [fltno] & [stad]) AS B
WHERE A.concat = B.concat;
First, add a column called countValue to the table, where the value is always 1.
Then the first query for duplicate count is
SELECT tableA.flight, tableA.fltno, tableA.stad, Sum(tableA.countValue) AS duplicateCount
FROM tableA
GROUP BY tableA.flight, tableA.fltno, tableA.stad;
Then the second query for duplicate rank (ranked by id number) is
SELECT (SELECT Count(*)+1 FROM tableA WHERE id < temp.id AND stad = temp.stad AND flight = temp.flight AND fltno = temp.fltno) AS flightRank, temp.id, temp.flight, temp.fltno, temp.stad
FROM tableA AS temp;
Then you can join them
SELECT tableA.id, tableA.flight, tableA.fltno, tableA.stad, tableA.col1, tableA.col2, queryCounts.duplicateCount, queryRanking.flightRank
FROM (tableA INNER JOIN queryRanking ON tableA.id = queryRanking.id) INNER JOIN queryCounts ON (tableA.stad = queryCounts.stad) AND (tableA.fltno = queryCounts.fltno) AND (tableA.flight = queryCounts.flight);
Then, regarding the delete query, read this thread, since you need to delete using joins:
How to delete in MS Access when using JOIN's?
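As a rough, self-contained illustration of the count / rank / delete idea, here is a sketch using an in-memory SQLite database from Python. The sample rows are hypothetical (the original flt data is not shown here), the rank is by ascending id as in the queries above, and the delete keeps the lowest id in each (flight, fltno, stad) group. Access syntax will differ, so treat this as a sketch of the logic rather than something to paste into Access.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flt (id INTEGER PRIMARY KEY, flight TEXT, fltno INTEGER, "
    "stad TEXT, col1 TEXT, col2 TEXT)"
)
# Hypothetical sample rows; only flight, fltno and stad define a duplicate.
conn.executemany(
    "INSERT INTO flt VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "ba", 104, "2011-01-01", "a", "b"),
        (2, "ba", 104, "2011-01-01", "c", "d"),
        (3, "qf", 9, "2011-01-02", "e", "f"),
        (4, "qf", 9, "2011-01-02", "g", "h"),
        (5, "qf", 9, "2011-01-02", "i", "j"),
        (6, "ua", 1, "2011-01-03", "k", "l"),
    ],
)

# dup_count: how many rows share this row's key.
# dup_rank: how many of those rows have an id <= this row's id.
RANK_SQL = """
SELECT a.id,
       (SELECT COUNT(*) FROM flt AS c
         WHERE c.flight = a.flight AND c.fltno = a.fltno AND c.stad = a.stad
       ) AS dup_count,
       (SELECT COUNT(*) FROM flt AS c
         WHERE c.flight = a.flight AND c.fltno = a.fltno AND c.stad = a.stad
           AND c.id <= a.id
       ) AS dup_rank
FROM flt AS a
ORDER BY a.id
"""
ranked = conn.execute(RANK_SQL).fetchall()

# Rows with dup_rank > 1 are exactly the rows that are not the lowest id
# in their group, so they can be deleted without the ranking query.
conn.execute(
    "DELETE FROM flt WHERE id NOT IN "
    "(SELECT MIN(id) FROM flt GROUP BY flight, fltno, stad)"
)
remaining_ids = [row[0] for row in conn.execute("SELECT id FROM flt ORDER BY id")]
print(ranked)
print(remaining_ids)
```

Which row of each group survives is a policy decision; this sketch arbitrarily keeps the lowest id.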
This will solve question 1. I do not believe all 3 questions can be answered in the same query.
Select Flight, FltNo, Stad, Sum(1) as DupCnt
from FLT
Group By Flight, FltNo, Stad
order by Sum(1) DESC
How do you know you want to delete 8 and 3, and not 4 and 10?
I have been trying to get this to work for 12 hrs now and I cannot :-( Can someone please show me how I can get the ssnumber to group and get the total for each ssnumber.
Here is what I have now. In Table number 1 I have this code
SELECT
UNIT_NO, SUM(RATEB) AS TOTALRTE
FROM TABLE1
WHERE
TRUCK_PAID = 1
AND PICK_UP_DATE >= '(fromdate)'
AND PICK_UP_DATE <= '(todate)'
GROUP BY
UNIT_NO
ORDER BY
UNIT_NO
But table number 2 is where the ssnumber column is. What I'm trying to do is sum rateB over all of the loads for each unit_no in table number 1, then use table number 2 to match each unit number to its ssnumber, and get the rateB total for each ssnumber.
Something like this (see below) but it's not working :-(
SELECT
UNIT_NO, SUM(RATEB)
FROM
TABLE1
WHERE
TRUCK_PAID = 1
AND PICK_UP_DATE >= '(fromdate)'
AND PICK_UP_DATE <= '(todate)'
GROUP BY
UNIT_NO
JOIN
TABLE TABLE1.UNIT_NO = TABLE2.UNIT_NO GROUP BY TABLE2.SS_NUM
or
SELECT
UNIT_NO, SUM(RATEB) AS TOTALRATE
FROM
TABLE1
GROUP BY
UNIT_NO
JOIN
TRUCKS ON (TABLE1.UNIT_NO = TABLE2.UNIT_NO)
GROUP BY
TABLE2.SSNUMBER
Thank you guys so much for any help...
As requested, it is hard to really understand what you are trying to accomplish without more info about table2 and maybe an example of what you are expecting. However, what I got from your description is that you are trying to accomplish something like this?
SELECT UNIT_NO, TOTALRTE, TOTALLDSRTE
FROM
(
SELECT UNIT_NO,SUM(RATEB) AS TOTALRTE
FROM LOADS
GROUP BY UNIT_NO
) AS tbl1
JOIN
(
SELECT SS_NUM, SUM(RATEB) AS TOTALLDSRTE
FROM LOADS
GROUP BY SS_NUM
) AS tbl2
ON tbl1.UNIT_NO = tbl2.SS_NUM
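Another reading of the question is that the ssnumber only exists in table number 2 and each unit belongs to one ssnumber. Under that assumption the join has to come before the GROUP BY, not after it. A minimal sketch with an in-memory SQLite database driven from Python follows; the table layouts, column types and sample rows are all hypothetical, and the dates are passed as parameters in place of the '(fromdate)' and '(todate)' placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layouts: table1 holds the loads, table2 maps a unit to an SS number.
conn.execute(
    "CREATE TABLE table1 (unit_no INTEGER, rateb INTEGER, "
    "truck_paid INTEGER, pick_up_date TEXT)"
)
conn.execute("CREATE TABLE table2 (unit_no INTEGER PRIMARY KEY, ss_num TEXT)")
conn.executemany("INSERT INTO table2 VALUES (?, ?)", [(10, "A"), (11, "A"), (12, "B")])
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [
        (10, 100, 1, "2011-01-05"),
        (10, 50, 1, "2011-01-10"),
        (11, 25, 1, "2011-01-07"),
        (12, 70, 1, "2011-01-08"),
        (12, 30, 0, "2011-01-09"),   # not paid, filtered out
        (11, 500, 1, "2011-03-01"),  # outside the date range, filtered out
    ],
)

# Join first so every load row carries its ss_num, filter, then group once.
TOTALS_SQL = """
SELECT t2.ss_num, SUM(t1.rateb) AS totalrate
FROM table1 AS t1
JOIN table2 AS t2 ON t1.unit_no = t2.unit_no
WHERE t1.truck_paid = 1
  AND t1.pick_up_date >= ?
  AND t1.pick_up_date <= ?
GROUP BY t2.ss_num
ORDER BY t2.ss_num
"""
totals = conn.execute(TOTALS_SQL, ("2011-01-01", "2011-01-31")).fetchall()
print(totals)  # [('A', 175), ('B', 70)]
```

The clause order matters: FROM and JOIN, then WHERE, then GROUP BY, then ORDER BY. Putting JOIN after GROUP BY, as in the attempts in the question, is a syntax error.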
Instead of getting data from two select queries in one select query, I would suggest fetching them as separate queries. This can save a lot of time. That, or you can create a table for the result and write the result of each query into that table.
Referring to the diagram below, the Records table has unique records. Each record is updated, via comments, through an Updates table. When I join the two I get lots of duplicates.
How do I remove the duplicates? GROUP BY does not work for me, as I have more than 10 fields in the select query and some of them are functions.
I think I need a subquery which pulls the last update in the Updates table for each record that is updated in a particular month. Joining with this subquery should solve my problem.
Thanks!
Edit
Table structure that is of interest is
create table Records(
recordID int,
90more_fields various
)
create table Updates(
update_id int,
record_id int,
comment text,
byUser varchar(25),
datecreate datetime
)
Here's one way.
SELECT * /*But list columns explicitly*/
FROM Orange o
CROSS APPLY (SELECT TOP 1 *
FROM Blue b
WHERE b.datecreate >= '20110901'
AND b.datecreate < '20111001'
AND o.RecordID = b.Record_ID2
ORDER BY b.datecreate DESC) b
Based on the limited information available...
WITH cteLastUpdate AS (
SELECT Record_ID2, UpdateDateTime,
ROW_NUMBER() OVER(PARTITION BY Record_ID2 ORDER BY UpdateDateTime DESC) AS RowNUM
FROM BlueTable
/* Add WHERE clause if needed to restrict date range */
)
SELECT *
FROM cteLastUpdate lu
INNER JOIN OrangeTable o
ON lu.Record_ID2 = o.RecordID
WHERE lu.RowNum = 1
Last updates per record and month:
SELECT *
FROM UPDATES outerUpd
WHERE exists
(
-- Magic part
SELECT 1
FROM UPDATES innerUpd
WHERE innerUpd.RecordId = outerUpd.RecordId
GROUP BY RecordId
, date_part('year', innerUpd.datecolumn)
, date_part('month', innerUpd.datecolumn)
HAVING max(innerUpd.datecolumn) = outerUpd.datecolumn
)
(Works on PostgreSQL, date_part is different in other RDBMS)
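For comparison, here is a small self-contained sketch of the "latest update per record within one month" idea, using a correlated MAX subquery against an in-memory SQLite database from Python. The column names follow the Records/Updates definitions in the question, but the title column, the sample rows and the text-based timestamps are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down, hypothetical versions of the two tables from the question.
conn.execute("CREATE TABLE Records (recordID INTEGER PRIMARY KEY, title TEXT)")
conn.execute(
    "CREATE TABLE Updates (update_id INTEGER PRIMARY KEY, record_id INTEGER, "
    "comment TEXT, datecreate TEXT)"
)
conn.executemany("INSERT INTO Records VALUES (?, ?)", [(1, "r1"), (2, "r2"), (3, "r3")])
conn.executemany(
    "INSERT INTO Updates VALUES (?, ?, ?, ?)",
    [
        (1, 1, "first", "2011-09-02 10:00:00"),
        (2, 1, "second", "2011-09-20 09:00:00"),
        (3, 2, "only", "2011-09-15 12:00:00"),
        (4, 1, "october", "2011-10-01 08:00:00"),
        (5, 3, "august", "2011-08-30 08:00:00"),
    ],
)

# The subquery finds, per record, the newest datecreate inside the half-open
# range [month_start, next_month_start); the join keeps only that update row.
LATEST_SQL = """
SELECT r.recordID, u.comment
FROM Records AS r
JOIN Updates AS u ON u.record_id = r.recordID
WHERE u.datecreate = (
    SELECT MAX(u2.datecreate)
    FROM Updates AS u2
    WHERE u2.record_id = r.recordID
      AND u2.datecreate >= ?
      AND u2.datecreate < ?
)
ORDER BY r.recordID
"""
latest = conn.execute(LATEST_SQL, ("2011-09-01", "2011-10-01")).fetchall()
print(latest)  # [(1, 'second'), (2, 'only')]
```

Records with no update in the chosen month drop out of the result, because the subquery returns NULL for them. If two updates for one record share the same latest timestamp, both rows come back.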
I think this is a pretty basic question and I have looked around on the site but I am not sure what to search on to find the answer.
I have an SQL table that looks like:
studentId period class
1 1 math
1 2 english
2 1 math
2 2 history
I am looking for a SELECT statement that finds the studentId that is taking math 1st period and english 2nd period. I have tried something like SELECT studentID WHERE ( period = 1 AND class= "math" ) AND ( period = 2 AND class = "english" ) but that has not worked.
I have also thought about changing my table to be:
studentId period1 period2 period3 period4 period5 etc
But I think I want to be adding things besides classes like after school activities and wanted to be able to expand easily without constantly having to add columns.
Thanks for any help you can give me.
try something like:
select studentid
from table
where ( period = 1 AND class = "math" ) or ( period = 2 AND class = "english" )
group by studentid
having count(*) >= 2
the idea is to select all who meet the first criteria or the second criteria, group it by person and see where all are met by checking the number of rows grouped
You can use subqueries to do each individually and get only results where both subqueries match.
Select StudentId FROM table WHERE
StudentId IN
(SELECT studentID FROM table WHERE ( period = 1 AND class= "math" ) )
AND
StudentId IN
(SELECT studentID FROM table WHERE ( period = 2 AND class= "english" ) )
Edit - added
I have not tested this myself, but I was curious about performance considerations, so I looked it up. I found this quote:
"Many Transact-SQL statements that include subqueries can be alternatively formulated as joins. Other questions can be posed only with subqueries. In Transact-SQL, there is usually no performance difference between a statement that includes a subquery and a semantically equivalent version that does not. However, in some cases where existence must be checked, a join yields better performance. Otherwise, the nested query must be processed for each result of the outer query to ensure elimination of duplicates. In such cases, a join approach would yield better results. The following is an example showing both a subquery SELECT and a join SELECT that return the same result set:"
here: http://technet.microsoft.com/en-us/library/ms189575.aspx
You could also do a self join
SELECT t1.studentID
FROM table t1
JOIN table t2 ON t1.studentID = t2.studentID
WHERE ( t1.period = 1 AND t1.class= "math" )
AND ( t2.period = 2 AND t2.class = "english" )
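To check that the GROUP BY / HAVING approach and the self join agree, here is a small runnable sketch against an in-memory SQLite database from Python, using the four rows from the question. The table name schedule is an assumption (table itself is a reserved word), and string literals use single quotes, which is the standard SQL form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "schedule" is a hypothetical name for the table in the question.
conn.execute("CREATE TABLE schedule (studentId INTEGER, period INTEGER, class TEXT)")
conn.executemany(
    "INSERT INTO schedule VALUES (?, ?, ?)",
    [(1, 1, "math"), (1, 2, "english"), (2, 1, "math"), (2, 2, "history")],
)

# Approach 1: keep rows matching either condition, then require that a
# student has both of them (assumes one class per student per period).
HAVING_SQL = """
SELECT studentId
FROM schedule
WHERE (period = 1 AND class = 'math')
   OR (period = 2 AND class = 'english')
GROUP BY studentId
HAVING COUNT(*) = 2
ORDER BY studentId
"""

# Approach 2: pair each student's rows with each other and test one
# condition on each side of the pair.
SELF_JOIN_SQL = """
SELECT t1.studentId
FROM schedule AS t1
JOIN schedule AS t2 ON t1.studentId = t2.studentId
WHERE t1.period = 1 AND t1.class = 'math'
  AND t2.period = 2 AND t2.class = 'english'
ORDER BY t1.studentId
"""

by_having = [row[0] for row in conn.execute(HAVING_SQL)]
by_self_join = [row[0] for row in conn.execute(SELF_JOIN_SQL)]
print(by_having, by_self_join)  # [1] [1]
```

The original attempt returns nothing because a single row can never have period = 1 and period = 2 at the same time; both approaches here compare across rows instead.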
SpousesTable
SpouseID
SpousePreviousAddressesTable
PreviousAddressID, SpouseID, FromDate, AddressTypeID
What I have now finds the single most recent address in the whole table, regardless of SpouseID, and sets AddressTypeID = 1 on it.
I want to set AddressTypeID = 1 on the most recent SpousePreviousAddresses row
for each unique SpouseID in the SpousePreviousAddresses table.
UPDATE spa
SET spa.AddressTypeID = 1
FROM SpousePreviousAddresses AS spa INNER JOIN Spouses ON spa.SpouseID = Spouses.SpouseID,
(SELECT TOP 1 SpousePreviousAddresses.* FROM SpousePreviousAddresses
INNER JOIN Spouses AS s ON SpousePreviousAddresses.SpouseID = s.SpouseID
WHERE SpousePreviousAddresses.CountryID = 181 ORDER BY SpousePreviousAddresses.FromDate DESC) as us
WHERE spa.PreviousAddressID = us.PreviousAddressID
I think I need a group by but my sql isn't all that hot. Thanks.
Update that is Working
I was wrong about having found a solution to this earlier. Below is the solution I am going with
WITH result AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY SpouseID ORDER BY FromDate DESC) AS rowNumber, *
FROM SpousePreviousAddresses
WHERE CountryID = 181
)
UPDATE result
SET AddressTypeID = 1
FROM result WHERE rowNumber = 1
Presuming you are using SQL Server 2005 (based on the error message you got from the previous attempt), probably the most straightforward way to do this would be to use the ROW_NUMBER() function coupled with a Common Table Expression. I think this might do what you are looking for:
WITH result AS
(
SELECT
ROW_NUMBER() OVER (PARTITION BY SpouseID ORDER BY FromDate DESC) as rowNumber,
*
FROM
SpousePreviousAddresses
)
UPDATE spa
SET
AddressTypeID = 1
FROM
SpousePreviousAddresses spa
INNER JOIN result r ON spa.SpouseId = r.SpouseId
WHERE r.rowNumber = 1
AND spa.PreviousAddressID = r.PreviousAddressID
AND spa.CountryID = 181
In SQL Server 2005 the ROW_NUMBER() function is one of the most powerful around. It is very useful in lots of situations. The time spent learning about it will be repaid many times over.
The CTE is used to simplify the code a bit, as it removes the need for a temporary table of some kind to store the intermediate result.
The resulting query should be fast and efficient. I know the select in the CTE uses *, which is a bit of overkill as we don't need all the columns, but it may help to show what is happening if anyone wants to see what is happening inside the query.
Here's one way to do it:
UPDATE spa1
SET spa1.AddressTypeID = 1
FROM SpousePreviousAddresses AS spa1
LEFT OUTER JOIN SpousePreviousAddresses AS spa2
ON (spa1.SpouseID = spa2.SpouseID AND spa1.FromDate < spa2.FromDate)
WHERE spa1.CountryID = 181 AND spa2.SpouseID IS NULL;
In other words, update the row spa1 for which no other row spa2 exists with the same spouse and a greater (more recent) date.
There's exactly one row for each value of SpouseID that has the greatest date compared to all other rows (if any) with the same SpouseID.
There's no need to use a GROUP BY, because there's kind of an implicit grouping done by the join.
update: I think you misunderstand the purpose of the OUTER JOIN. If there is no row spa2 that matches all the join conditions, then all columns of spa2.* are returned as NULL. That's how outer joins work. So you can search for the cases where spa1 has no matching row spa2 by testing that spa2.SpouseID IS NULL.
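The "no later row exists" idea can also be written with NOT EXISTS, which expresses the same anti-join without an outer join. Below is a minimal sketch against an in-memory SQLite database from Python. The sample rows are hypothetical, and unlike the query above it restricts both the updated row and the comparison rows to CountryID = 181, which is an assumption about what the asker wants.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SpousePreviousAddresses (PreviousAddressID INTEGER PRIMARY KEY, "
    "SpouseID INTEGER, FromDate TEXT, AddressTypeID INTEGER, CountryID INTEGER)"
)
# Hypothetical sample data: three spouses, not all addresses in country 181.
conn.executemany(
    "INSERT INTO SpousePreviousAddresses VALUES (?, ?, ?, ?, ?)",
    [
        (1, 100, "2001-01-01", 0, 181),
        (2, 100, "2005-06-01", 0, 181),
        (3, 200, "2003-03-03", 0, 181),
        (4, 200, "2010-10-10", 0, 50),
        (5, 300, "1999-09-09", 0, 50),
    ],
)

# Update a row only if no other row for the same spouse (within country 181)
# has a later FromDate, i.e. the row is that spouse's most recent address.
conn.execute(
    """
    UPDATE SpousePreviousAddresses
    SET AddressTypeID = 1
    WHERE CountryID = 181
      AND NOT EXISTS (
          SELECT 1
          FROM SpousePreviousAddresses AS later
          WHERE later.SpouseID = SpousePreviousAddresses.SpouseID
            AND later.CountryID = 181
            AND later.FromDate > SpousePreviousAddresses.FromDate
      )
    """
)

updated_ids = [row[0] for row in conn.execute(
    "SELECT PreviousAddressID FROM SpousePreviousAddresses "
    "WHERE AddressTypeID = 1 ORDER BY PreviousAddressID"
)]
print(updated_ids)  # [2, 3]
```

Spouse 300 has no address in country 181, so nothing is updated for that spouse. If the country filter on the comparison rows were dropped, spouse 200 would also get no update, because the newest address (id 4) is in a different country.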
UPDATE spa SET spa.AddressTypeID = 1
WHERE spa.SpouseID IN (
SELECT DISTINCT s1.SpouseID FROM Spa S1, SpousePreviousAddresses S2
WHERE s1.SpouseID = s2.SpouseID
AND s2.CountryID = 181
AND s1.PreviousAddressId = s2.PreviousAddressId
ORDER BY S2.FromDate DESC)
Just a guess.