I am having trouble figuring out how to get the results that I need for this query.
I am looking for the last record for each dog, but only when that record has a status of Adopted. If a dog's last record is Returned, I don't want any record for that dog.
If my table contains these rows:
ID  NAME  DATE       STATUS    WANT THIS ONE?
14  Fido  7/1/2014   Adopted   Yes - last record for Fido that is Adopted
13  Elle  6/15/2014  Returned  No - last record for Elle but not Adopted
12  Elle  6/1/2014   Adopted   No - not the last record for Elle
11  Spot  5/30/2014  Adopted   Yes - last record for Spot that is Adopted
10  Spot  5/15/2014  Returned  No - not Adopted
9   Spot  5/1/2014   Adopted   No - not the last record for Spot
select * from (
select * ,
row_number() over (partition by name order by date desc) rn
from tbl
) t1 where t1.rn = 1
and status = 'Adopted'
or
select * from tbl t1
where status = 'Adopted'
and not exists (
select 1 from tbl t2
where t2.Name = t1.Name
and t2.Date > t1.Date
)
If you want to return the latest record for each dog, only where the latest status is 'Adopted':
select *
from tbl t
where date = (select max(x.date) from tbl x where x.name = t.name)
and status = 'Adopted'
Fiddle: http://sqlfiddle.com/#!6/e2cae/1/0
In this query if the latest record for a dog is anything other than 'Adopted', the dog will not be returned. This matches your desired output, based on the comments you've placed beside the table.
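As a quick check of that behaviour, here is a minimal sketch that runs the same query against the sample rows from the question. It uses Python's built-in sqlite3 module rather than SQL Server, and it assumes the dates are stored as ISO strings so that MAX compares them correctly; both choices are assumptions of this sketch, not part of the original answer.

```python
import sqlite3

# Sample rows from the question; dates rewritten as ISO strings (assumption)
# so that plain string comparison orders them chronologically.
rows = [
    (14, "Fido", "2014-07-01", "Adopted"),
    (13, "Elle", "2014-06-15", "Returned"),
    (12, "Elle", "2014-06-01", "Adopted"),
    (11, "Spot", "2014-05-30", "Adopted"),
    (10, "Spot", "2014-05-15", "Returned"),
    (9, "Spot", "2014-05-01", "Adopted"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (id INTEGER PRIMARY KEY, name TEXT, date TEXT, status TEXT)"
)
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?)", rows)

# Keep a dog's latest row, and only if that latest row is Adopted.
latest_if_adopted = conn.execute(
    """
    SELECT id, name
    FROM tbl t
    WHERE date = (SELECT MAX(x.date) FROM tbl x WHERE x.name = t.name)
      AND status = 'Adopted'
    ORDER BY id
    """
).fetchall()

print(latest_if_adopted)  # [(11, 'Spot'), (14, 'Fido')]
```

Elle is absent because her latest record is Returned, which matches the "WANT THIS ONE?" column in the question.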
If you want to return the latest 'Adopted' record for each dog (if any):
select *
from tbl t
where date = (select max(x.date)
from tbl x
where x.name = t.name
and x.status = 'Adopted')
However, both queries will mix up two dogs that share the same name. Ideally you would have a separate table that uniquely identifies each dog, and a DOG_ID column on this table that references it.
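A minimal sketch of that design, again using Python's sqlite3 module. The table names `dogs` and `dog_status` and the column `dog_id` are hypothetical names chosen for illustration; the point is only that the correlated subquery matches on the identifier instead of the name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schema: one row per dog, one row per status change.
    CREATE TABLE dogs (dog_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE dog_status (
        id     INTEGER PRIMARY KEY,
        dog_id INTEGER NOT NULL REFERENCES dogs (dog_id),
        date   TEXT NOT NULL,   -- ISO dates so string comparison sorts correctly
        status TEXT NOT NULL
    );
    -- Two different dogs share the name Spot.
    INSERT INTO dogs VALUES (1, 'Spot'), (2, 'Spot');
    INSERT INTO dog_status VALUES
        (1, 1, '2014-05-01', 'Adopted'),
        (2, 1, '2014-05-15', 'Returned'),
        (3, 2, '2014-05-30', 'Adopted');
    """
)

# Latest record per dog_id, kept only when that record is Adopted.
adopted_now = conn.execute(
    """
    SELECT d.dog_id, d.name, s.date
    FROM dog_status s
    JOIN dogs d ON d.dog_id = s.dog_id
    WHERE s.status = 'Adopted'
      AND s.date = (SELECT MAX(x.date) FROM dog_status x WHERE x.dog_id = s.dog_id)
    """
).fetchall()

print(adopted_now)  # [(2, 'Spot', '2014-05-30')]
```

Matching on name alone would treat the three rows as one dog whose latest status is Adopted, hiding the fact that the first Spot was returned.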
For the data you showed in the question, it is pretty tricky. The idea is to compare each dog's latest Adopted date with its latest Returned date, treating a dog that was never returned as if it had been returned on a sentinel date earlier than any real record. Something like this should work:
select dog, maxAdoptedDate
from (
select adopted.name dog
, isnull(max(returned.date), '19000101') maxReturnedDate -- sentinel for dogs that were never returned
, max(adopted.date) maxAdoptedDate
from yourTable adopted left join yourTable returned
on adopted.name = returned.name
and returned.status = 'Returned'
where adopted.status = 'Adopted' -- must be in the where clause; in the left join's on clause it would not filter anything
and whatever
group by adopted.name) temp
where maxAdoptedDate > maxReturnedDate
and whatever
The two whatevers stand for any extra filters you need, and they should be the same. As mentioned in another answer, if two dogs have the same name, you are in trouble.
I have a table called "main" which has 4 columns: ID, name, DateID and Sign.
I want to create a query that deletes entries from this table if the same ID appears twice within a certain DateID range.
I have a where clause that searches the previous 3 weeks:
where DateID =((SELECT MAX( DateID)
WHERE DateID < ( SELECT MAX( DateID )-3))
An example of the dataset I'm working with:
id     name   DateID  sign
12345  Paul   1915    Up
23658  Danny  1915    Down
37868  Jake   1916    Up
37542  Elle   1917    Up
12345  Paul   1917    Down
87456  John   1918    Up
78563  Luke   1919    Up
23658  Danny  1920    Up
In the case above, both entries for ID 12345 would need to be removed.
However, the entries for ID 23658 would need to be kept, because their DateIDs are more than 3 apart.
How would this be possible?
You can use window functions for this.
It's not quite clear, but it seems LAG and conditional COUNT should fit what you need.
DELETE t
FROM (
SELECT *,
CountWithinDate = COUNT(CASE WHEN t.PrevDate >= t.DateId - 3 THEN 1 END) OVER (PARTITION BY t.id)
FROM (
SELECT *,
PrevDate = LAG(t.DateID) OVER (PARTITION BY t.id ORDER BY t.DateID)
FROM YourTable t
) t
) t
WHERE CountWithinDate > 0;
Note that you do not need to re-join the table, you can delete directly from the t derived table.
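To see the LAG idea at work on the question's data, here is a small sketch using Python's sqlite3 module. It is not the T-SQL above: SQLite cannot delete through a derived table, so this version selects the affected ids with LAG and deletes by id instead. The table name `main_tbl` is a stand-in chosen for this sketch, and window functions require SQLite 3.25 or later.

```python
import sqlite3

rows = [
    (12345, "Paul", 1915, "Up"), (23658, "Danny", 1915, "Down"),
    (37868, "Jake", 1916, "Up"), (37542, "Elle", 1917, "Up"),
    (12345, "Paul", 1917, "Down"), (87456, "John", 1918, "Up"),
    (78563, "Luke", 1919, "Up"), (23658, "Danny", 1920, "Up"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE main_tbl (id INTEGER, name TEXT, DateID INTEGER, sign TEXT)")
conn.executemany("INSERT INTO main_tbl VALUES (?, ?, ?, ?)", rows)

# An id qualifies when any of its rows has a previous row (same id)
# whose DateID is at most 3 lower; all rows for that id are then deleted.
conn.execute(
    """
    DELETE FROM main_tbl
    WHERE id IN (
        SELECT id
        FROM (
            SELECT id, DateID,
                   LAG(DateID) OVER (PARTITION BY id ORDER BY DateID) AS PrevDate
            FROM main_tbl
        )
        WHERE PrevDate >= DateID - 3
    )
    """
)

remaining_ids = [r[0] for r in conn.execute("SELECT id FROM main_tbl ORDER BY DateID, id")]
print(remaining_ids)  # [23658, 37868, 37542, 87456, 78563, 23658]
```

Both rows for 12345 are gone (1915 and 1917 are 2 apart), while both rows for 23658 survive (1915 and 1920 are 5 apart).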
Hope this works:
DELETE FROM test_tbl
WHERE id IN (
SELECT T1.id
FROM test_tbl T1
WHERE EXISTS (SELECT 1 FROM test_tbl T2 WHERE T1.id = T2.id AND ABS(T2.dateid - T1.dateid) < 3 AND T1.dateid <> T2.dateid)
)
If you need more logic for the data processing, I would suggest using a stored procedure.
How can I delete rows where dateupdated was least updated ?
My table is
Name Dateupdated ID status
john 1/02/17 JHN1 A
john 1/03/17 JHN2 A
sally 1/02/17 SLLY1 A
sally 1/03/17 SLLY2 A
Mike 1/03/17 MK1 A
Mike 1/04/17 MK2 A
I want to be left with the following after the data removal:
Name Date ID status
john 1/03/17 JHN2 A
sally 1/03/17 SLLY2 A
Mike 1/04/17 MK2 A
If you really want to "delete rows where dateupdated was least updated" then a simple single-row subquery should do the trick.
DELETE MyTable
WHERE Date = (SELECT MIN(Date) From MyTable)
If on the other hand you just want to delete the row with the earliest Date per person (as identified by their ID) you could use:
DELETE MyTable
FROM MyTable a
JOIN (SELECT ID, MIN(Date) MinDate FROM MyTable GROUP BY ID) b
ON a.ID = b.ID AND a.Date = b.MinDate
The idea here is you create an aggregate query that returns rows containing the columns that would match the rows you want deleted, then join to it. Because it's an inner join, rows that do not match the criteria will be excluded.
If people are uniquely identified by something else (e.g. Name), then you can just substitute that for the ID in my example above.
I am thinking though that you don't want either of these. I think you want to delete everything except for each person's latest row. If that is the case, try this:
DELETE MyTable
WHERE EXISTS (SELECT 0 FROM MyTable b WHERE b.ID = MyTable.ID AND b.Date > MyTable.Date)
The idea here is you check for existence of another data row with the same ID and a later date. If there is a later record, delete this one.
The nice thing about the last example is you can run it over and over and every person will still be left with exactly one row. The other two queries, if run over and over, will nibble away at the table until it is empty.
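The difference in repeatability can be checked with a short sketch in Python's sqlite3 module. It assumes people are identified by Name (the sample IDs are unique per row) and uses the column name Dateupdated from the question, with dates as ISO strings; the helper `row_counts_after_repeated_runs` is a hypothetical name for this illustration.

```python
import sqlite3

ROWS = [
    ("john", "2017-01-02", "JHN1"), ("john", "2017-01-03", "JHN2"),
    ("sally", "2017-01-02", "SLLY1"), ("sally", "2017-01-03", "SLLY2"),
    ("Mike", "2017-01-03", "MK1"), ("Mike", "2017-01-04", "MK2"),
]

# Delete every row that has a later row for the same person.
KEEP_LATEST = """
    DELETE FROM MyTable
    WHERE EXISTS (SELECT 1 FROM MyTable b
                  WHERE b.Name = MyTable.Name
                    AND b.Dateupdated > MyTable.Dateupdated)
"""

# Delete the rows carrying the single earliest date in the whole table.
DELETE_EARLIEST = """
    DELETE FROM MyTable
    WHERE Dateupdated = (SELECT MIN(Dateupdated) FROM MyTable)
"""


def row_counts_after_repeated_runs(delete_sql, runs=3):
    """Run delete_sql several times on fresh data; return the row count after each run."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MyTable (Name TEXT, Dateupdated TEXT, ID TEXT)")
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", ROWS)
    counts = []
    for _ in range(runs):
        conn.execute(delete_sql)
        counts.append(conn.execute("SELECT COUNT(*) FROM MyTable").fetchone()[0])
    return counts


print(row_counts_after_repeated_runs(KEEP_LATEST))      # [3, 3, 3]
print(row_counts_after_repeated_runs(DELETE_EARLIEST))  # [4, 1, 0]
```

The first statement settles at one row per person, while the second keeps removing whatever the current earliest date is until nothing is left.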
P.S. As these are significantly different solutions, I suggest you spend some effort learning how to articulate unambiguous requirements. This is an extremely important skill for any developer.
This deletes rows where the name is a duplicate, and deletes all but the latest row for each name. This is different from your stated question.
Using a common table expression (cte) and row_number():
;with cte as (
select *
, rn = row_number() over (
partition by Name
order by Dateupdated desc
)
from t
)
/* ------------------------------------------------
-- Remove duplicates by deleting rows
-- where the row number (rn) is greater than 1
-- leaving the first row for each partition
------------------------------------------------ */
delete
from cte
where cte.rn > 1
select * from t
rextester: http://rextester.com/HZBQ50469
returns:
+-------+-------------+-------+--------+
| Name | Dateupdated | ID | status |
+-------+-------------+-------+--------+
| john | 2017-01-03 | JHN2 | A |
| sally | 2017-01-03 | SLLY2 | A |
| Mike | 2017-01-04 | MK2 | A |
+-------+-------------+-------+--------+
Without using the cte it can be written as:
delete d
from (
select *
, rn = row_number() over (
partition by Name
order by Dateupdated desc
)
from t
) as d
where d.rn > 1
This should do the trick:
delete a
from MyTable a
where not exists (
select top 1 1
from MyTable b
where b.name = a.name
and b.DateUpdated < a.DateUpdated
)
i.e. remove every row for which no earlier row exists for the same name, which means the earliest row for each name. Note that a name with only one row loses that row too.
Your Name column contains both Mike and Mik2, which are two different values.
So, if that is not a mistake, the column to group by has to be the ID column without its last digit.
I think the following is more accurate, assuming the data is as intended.
delete a
from MyTable a
inner join
(select substring(ID, 1, len(ID) - 1) as ID, min(Dateupdated) as MinDate
from MyTable
group by substring(ID, 1, len(ID) - 1)
) b
on substring(a.ID, 1, len(a.ID) - 1) = b.ID and a.Dateupdated = b.MinDate
You can test it at SQLFiddle: http://sqlfiddle.com/#!6/9c440/1
Each time a user searches for a text on the website, the search text gets recorded to search_table. The sub-searches are also recorded. They are recorded with an asterisk.
The goal is to find the most complete search texts that the user searched for.
The ideal way would be:
Group the ids = 1,4,6 and obtain id=6
Group the ids = 2,5,7 and obtain id = 7
Group the ids = 3 and obtain id = 3
Group the ids 8, 9 and obtain id = 9
SEARCH_TABLE
id user search_text
--------------------
1 user1 data manag*
2 user1 confer*
3 user1 incomplete sear*
4 user1 data managem*
5 user1 conference c*
6 user1 data management
7 user1 conference call
8 user1 status in*
9 user1 status information
Output should be
user search_text
---------------------
user1 data management
user1 conference call
user1 incomplete sear*
user1 status information
Can you help please?
Something like the following should do the job:
SELECT * FROM
SEARCH_TABLE st
WHERE
NOT EXISTS (
SELECT 1 FROM
SEARCH_TABLE st2
-- remove the asterisk and add %
WHERE st2.search_Text LIKE replace(st.search_text,'*','')||'%'
-- do not compare a row with itself, otherwise every row is filtered out
AND st2.id <> st.id
)
This filters out every search whose text is a prefix of another search.
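Here is a runnable sketch of that prefix filter against the question's sample rows, using Python's sqlite3 module. It compares rows only within the same user and never with themselves; both conditions are assumptions added for this sketch. Any % or _ typed by a user would also act as a LIKE wildcard, which the sketch does not handle.

```python
import sqlite3

rows = [
    (1, "user1", "data manag*"), (2, "user1", "confer*"),
    (3, "user1", "incomplete sear*"), (4, "user1", "data managem*"),
    (5, "user1", "conference c*"), (6, "user1", "data management"),
    (7, "user1", "conference call"), (8, "user1", "status in*"),
    (9, "user1", "status information"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE search_table (id INTEGER PRIMARY KEY, "user" TEXT, search_text TEXT)'
)
conn.executemany("INSERT INTO search_table VALUES (?, ?, ?)", rows)

# Keep a search only if no other row of the same user starts with its text.
most_complete = [
    r[0]
    for r in conn.execute(
        """
        SELECT st.search_text
        FROM search_table st
        WHERE NOT EXISTS (
            SELECT 1
            FROM search_table st2
            WHERE st2.id <> st.id
              AND st2."user" = st."user"
              AND st2.search_text LIKE replace(st.search_text, '*', '') || '%'
        )
        ORDER BY st.id
        """
    )
]

print(most_complete)
# ['incomplete sear*', 'data management', 'conference call', 'status information']
```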
This is probably not the most elegant way, but here's a go at it:
alter table your_table
add group_id int
select [user], left(search_text, 5) as Group_Text, IDENTITY(int, 1,1) as Group_ID
into #group_id_table
from your_table
group by [user], left(search_text, 5)
order by [user], left(search_text, 5)
update a
set a.group_id = b.group_id
from your_table as a
join #group_id_table as b
on left(a.search_text, 5) = b.group_text
and a.[user] = b.[user]
select [user], max(search_text), group_id
from your_table
group by [user], group_id
order by [user], group_id
This achieved the desired results when I ran it, but because the group_ids are based on a hard-coded prefix length (5 characters here), unrelated searches that share their first five characters will end up in the same group. I hope this does the job for you.
Give this a shot. I separated out the completed texts (and their shorter partials), and then found the longest partial for each record. Tested in Oracle as I don't have access to a PostgreSQL right now, but I didn't use anything exotic so it should work.
with
--Contains all completed searches
COMPLETE as (select * from SEARCH_TABLE where SEARCH_TEXT not like '%*'),
--Contains all searches that are incomplete and dont have a completed match
INCOMPLETE as (
select S.*
from SEARCH_TABLE S
left join COMPLETE C
on S.USR = C.USR
and C.SEARCH_TEXT like replace(S.SEARCH_TEXT, '*', '%')
where C.ID is null
),
--chains each incomplete search with any matching pattern shorter than it.
--the left join keeps incomplete searches that have no shorter partial at all.
CHAINED_INC as (
select LONGER.USR, LONGER.ID, LONGER.SEARCH_TEXT, SHORTER.SEARCH_TEXT SEARCH_TEXT_SHORT
from INCOMPLETE LONGER
left join INCOMPLETE SHORTER
on LONGER.USR = SHORTER.USR
and LONGER.SEARCH_TEXT like replace(SHORTER.SEARCH_TEXT, '*', '%')
and LONGER.ID <> SHORTER.ID
)
--if a text is not the shorter text for a different record, that means it's the longest text for that pattern.
select distinct T1.USR, T1.SEARCH_TEXT
from CHAINED_INC T1
left join CHAINED_INC T2
on T1.USR = T2.USR
and T1.SEARCH_TEXT = T2.SEARCH_TEXT_SHORT
where T2.SEARCH_TEXT_SHORT is null
--finally, union back to the completed texts.
union all
select USR, SEARCH_TEXT from COMPLETE
;
Edit: removed ID from select
I have a table that looks like the following but also has more columns that are not needed for this instance.
ID DATE Random
-- -------- ---------
1 4/12/2015 2
2 4/15/2015 2
3 3/12/2015 2
4 9/16/2015 3
5 1/12/2015 3
6 2/12/2015 3
ID is the primary key.
Random is a foreign key, but I am not actually using the table it points to.
I am trying to design a query that groups the results by Random, selects the MAX Date within each group, and then gives me the associated ID.
If I run the following query:
select top 100 ID, Random, MAX(Date) from DateBase group by Random, Date, ID
I get duplicate Randoms, since ID is the primary key and will always be unique.
The results I need would look something like this:
ID DATE Random
-- -------- ---------
2 4/15/2015 2
4 9/16/2015 3
Another question: there could be times when several rows share the same date. What will MAX do in that case?
You can use NOT EXISTS() :
SELECT * FROM YourTable t
WHERE NOT EXISTS(SELECT 1 FROM YourTable s
WHERE s.random = t.random
AND s.date > t.date)
This selects only the rows for which no row with a later date exists for the same random value.
It can also be done using IN() with a row value (supported by MySQL, PostgreSQL, Oracle and SQLite, but not by SQL Server):
SELECT * FROM YourTable t
WHERE (t.random,t.date) in (SELECT s.random,max(s.date)
FROM YourTable s
GROUP BY s.random)
Or with a join:
SELECT t.* FROM YourTable t
INNER JOIN (SELECT s.random,max(s.date) as max_date
FROM YourTable s
GROUP BY s.random) tt
ON(t.date = tt.max_date and tt.random = t.random)
In SQL Server you could do something like the following:
select a.* from DateBase a inner join
(select Random,
MAX(dt) as dt from DateBase group by Random) as x
on a.dt =x.dt and a.random = x.random
This method will work in all versions of SQL as there are no vendor specifics (you'll need to format the dates using your vendor specific syntax)
You can do this in two stages:
The first step is to work out the max date for each random:
SELECT MAX(DateField) AS MaxDateField, Random
FROM Example
GROUP BY Random
Now you can join back onto your table to get the max ID for each combination:
SELECT MAX(e.ID) AS ID
,e.DateField AS DateField
,e.Random
FROM Example AS e
INNER JOIN (
SELECT MAX(DateField) AS MaxDateField, Random
FROM Example
GROUP BY Random
) data
ON data.MaxDateField = e.DateField
AND data.Random = e.Random
GROUP BY e.DateField, e.Random
To answer your second question:
If there are multiples of the same date, the MAX(e.ID) will simply choose the highest number. If you want the lowest, you can use MIN(e.ID) instead.
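The tie-breaking behaviour can be checked with a small sketch in Python's sqlite3 module. It loads the question's rows with ISO-formatted dates and adds one hypothetical extra row (ID 7) that shares the latest date for Random 3.

```python
import sqlite3

rows = [
    (1, "2015-04-12", 2), (2, "2015-04-15", 2), (3, "2015-03-12", 2),
    (4, "2015-09-16", 3), (5, "2015-01-12", 3), (6, "2015-02-12", 3),
    (7, "2015-09-16", 3),  # hypothetical extra row that ties with ID 4
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Example (ID INTEGER PRIMARY KEY, DateField TEXT, Random INTEGER)"
)
conn.executemany("INSERT INTO Example VALUES (?, ?, ?)", rows)

# Stage 1 (inner query): max date per Random.
# Stage 2 (outer query): join back and pick the highest ID among tied rows.
result = conn.execute(
    """
    SELECT MAX(e.ID) AS ID, e.DateField, e.Random
    FROM Example AS e
    INNER JOIN (
        SELECT MAX(DateField) AS MaxDateField, Random
        FROM Example
        GROUP BY Random
    ) AS data
        ON data.MaxDateField = e.DateField
       AND data.Random = e.Random
    GROUP BY e.DateField, e.Random
    ORDER BY e.Random
    """
).fetchall()

print(result)  # [(2, '2015-04-15', 2), (7, '2015-09-16', 3)]
```

With the tie in place, Random 3 resolves to ID 7 rather than ID 4; swapping MAX(e.ID) for MIN(e.ID) would return ID 4.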
Note: The Data schema can not be changed. I'm stuck with it.
Database: SQLite
I have a simple tree structure, without parent keys, that is only 1 level deep. I have simplified the data for clarity:
ID Content Title
1 Null Canada
2 25 Toronto
3 33 Vancouver
4 Null USA
5 45 New York
6 56 Dallas
The structure is ordinal as well, so all Canadian cities have an ID greater than Canada's ID of 1 and less than the USA's ID of 4.
Question: How do I select all a nation's Cities when I do not know how many there are?
My query returns every country together with each of its cities, which may be more than you want, but:
http://sqlfiddle.com/#!5/94d63/3
SELECT *
FROM (
SELECT
place.Title AS country_name,
place.ID AS id,
(SELECT MIN(ID)
FROM place AS next_place
WHERE next_place.ID > place.ID
AND next_place.Content IS NULL
) AS next_id
FROM place
WHERE place.Content IS NULL
) AS country
INNER JOIN place
ON place.ID > country.id
AND CASE WHEN country.next_id IS NOT NULL
THEN place.ID < country.next_id
ELSE 1 END
select * from tbl
where id > 1
and id < (select min(id) from tbl where content is null and id > 1)
EDIT
I just realized the above does not work if there is no country with a greater ID. This should fix it.
select * from tbl a
where id > 4
and id < coalesce((select min(b.id) from tbl b where b.content is null and b.id > a.id), a.id + 1)
Edit 2 - Also made the subquery fully correlated, so you only have to change the country id in one place.
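A runnable sketch of the same range idea, using Python's sqlite3 module against the question's sample rows. The helper `cities_of` and the choice of "one past the highest ID" as the upper bound for the last country are assumptions of this sketch.

```python
import sqlite3

rows = [
    (1, None, "Canada"), (2, 25, "Toronto"), (3, 33, "Vancouver"),
    (4, None, "USA"), (5, 45, "New York"), (6, 56, "Dallas"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (ID INTEGER PRIMARY KEY, Content INTEGER, Title TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)


def cities_of(country_id):
    """Return the titles of the rows between country_id and the next country row."""
    sql = """
        SELECT Title
        FROM tbl
        WHERE ID > :cid
          AND ID < COALESCE(
                -- ID of the next country, if there is one
                (SELECT MIN(ID) FROM tbl WHERE Content IS NULL AND ID > :cid),
                -- otherwise an upper bound just past the last row
                (SELECT MAX(ID) + 1 FROM tbl)
              )
        ORDER BY ID
    """
    return [r[0] for r in conn.execute(sql, {"cid": country_id})]


print(cities_of(1))  # ['Toronto', 'Vancouver']
print(cities_of(4))  # ['New York', 'Dallas']
```

The COALESCE wraps the whole subquery because a scalar subquery that finds no rows yields NULL, and comparing an ID with NULL would filter out every city of the last country.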
There are a couple of things to consider here, mainly whether or not your data is going to change.
If your data will always be organized exactly as shown in your example, you can simply skip the first three rows, i.e.
SELECT * FROM CITIES WHERE ID NOT IN (SELECT ID FROM CITIES ORDER BY ID LIMIT 3)
Otherwise, you can create another table where you specify which city belongs to which parent, and build the hierarchy yourself.
I recommend the second option.