Can't aggregate on max function - sql

I have a table
CREATE TABLE `messages` ( `uid` BIGINT NOT NULL ,
`mid` BIGINT , `date` BIGINT NOT NULL , PRIMARY KEY (`mid`));
I want to select max(date) grouped by uid, i.e. for every uid (read: user) I want to find the latest message (the one with the maximum date).
I tried this:
select messages.mid, max(messages.date), messages.uid, messages.body
from messages
where messages.chat_id is NULL
group by messages.uid
but the query returns the wrong results.

A subquery can give you the date you need in order to retrieve the newest message for each user:
SELECT messages.uid, messages.mid, messages.date, messages.body
FROM messages
WHERE messages.chat_id IS NULL
AND messages.date IN
( SELECT MAX(m2.date) FROM messages m2
WHERE m2.uid = messages.uid AND m2.chat_id IS NULL
)
;
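To try the idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. The chat_id and body columns are assumptions (the queries use them, but the posted CREATE TABLE does not show them), and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE messages (
    uid     INTEGER NOT NULL,
    mid     INTEGER PRIMARY KEY,
    date    INTEGER NOT NULL,
    chat_id INTEGER,   -- assumed column, not in the posted DDL
    body    TEXT       -- assumed column, not in the posted DDL
);
INSERT INTO messages (uid, mid, date, chat_id, body) VALUES
    (1, 10, 100, NULL, 'old'),
    (1, 11, 200, NULL, 'new'),
    (2, 20, 150, NULL, 'only'),
    (2, 21, 300, 7,    'in a chat');
""")

# Correlated subquery: for each outer row, compute the newest date of the
# same user and keep the row only if it carries that date.
latest = conn.execute("""
    SELECT m.uid, m.mid, m.date, m.body
    FROM messages m
    WHERE m.chat_id IS NULL
      AND m.date = (SELECT MAX(m2.date)
                    FROM messages m2
                    WHERE m2.uid = m.uid AND m2.chat_id IS NULL)
    ORDER BY m.uid
""").fetchall()

print(latest)
```

Because the filter on chat_id is repeated inside the subquery, the row with mid 21 does not count as the newest message of user 2.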

You need to group by all the non-aggregated fields when using aggregate functions. Using a subquery sorts out the problem:
SELECT messages.date, messages.uid, messages.mid, messages.body
FROM messages
WHERE messages.chat_id IS NULL
AND messages.date IN
( SELECT MAX(msg.date) FROM messages msg
WHERE msg.chat_id IS NULL AND msg.uid = messages.uid
)
Alternatively, it can also be done using the HAVING clause.

Related

Find groups that do not contain any NULL value

I have a many to many table called dbo.ObjectOwner having following columns:
ObjectId | OwnerId | StartDate |EndDate
The pair ObjectId, OwnerId is not a primary key, and StartDate and EndDate give the period during which the object is owned by the owner.
The query I'm trying to write should return all ObjectIds that have no associated record where EndDate is null, i.e. all objects that currently have no owner.
So something like:
foreach(objectId in dbo.ObjectOwner)
if (
doesnotexist (records where ObjectId = objectid and EndDate is null)
)
{
add this objectid to the select table
}
I had a look at GROUP BY and HAVING, but the following script returns all records:
SELECT oo.ObjectId
FROM dbo.ObjectOwner oo
GROUP BY oo.ObjectId
HAVING NOT EXISTS (
SELECT 1
FROM dbo.ObjectOwner
WHERE dbo.ObjectOwner.EndDate = null
)
Thanks in advance
You can use GROUP BY and HAVING. The following works because NULL values are not COUNTed:
SELECT ObjectId
FROM ObjectOwner
GROUP BY ObjectId
HAVING COUNT(*) = COUNT(EndDate)
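It can't work if you write <...> = NULL, because a comparison with NULL is never true (its result is unknown); use IS NULL instead. The subquery also has to be correlated with the outer ObjectId, otherwise it checks the whole table rather than the current group:
SELECT oo.ObjectId
FROM dbo.ObjectOwner oo
GROUP BY oo.ObjectId
HAVING NOT EXISTS (
SELECT 1
FROM dbo.ObjectOwner o2
WHERE o2.ObjectId = oo.ObjectId
AND o2.EndDate IS NULL
)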
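Both points (NULL never compares equal, and COUNT skips NULLs) can be checked with a small sketch. It uses Python's sqlite3 module rather than SQL Server, so it is only an illustration under that assumption, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ObjectOwner (
    ObjectId  INTEGER,
    OwnerId   INTEGER,
    StartDate TEXT,
    EndDate   TEXT
);
INSERT INTO ObjectOwner VALUES
    (1, 100, '2020-01-01', '2020-06-01'),
    (1, 101, '2020-06-01', NULL),          -- object 1 is still owned
    (2, 100, '2019-01-01', '2019-12-31'),
    (2, 102, '2020-01-01', '2020-03-01');  -- object 2 has no open ownership
""")

# "= NULL" matches nothing, "IS NULL" finds the open ownership row.
eq_null = conn.execute(
    "SELECT COUNT(*) FROM ObjectOwner WHERE EndDate = NULL").fetchone()[0]
is_null = conn.execute(
    "SELECT COUNT(*) FROM ObjectOwner WHERE EndDate IS NULL").fetchone()[0]

# COUNT(EndDate) ignores NULLs, so the counts match only for groups
# in which every row has an EndDate.
unowned = [row[0] for row in conn.execute("""
    SELECT ObjectId
    FROM ObjectOwner
    GROUP BY ObjectId
    HAVING COUNT(*) = COUNT(EndDate)
    ORDER BY ObjectId
""")]

print(eq_null, is_null, unowned)
```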

Ambiguous column name SQL

I get the following error when I want to execute a SQL query:
"Msg 209, Level 16, State 1, Line 9
Ambiguous column name 'i_id'."
This is the SQL query I want to execute:
SELECT DISTINCT x.*
FROM items x LEFT JOIN items y
ON y.i_id = x.i_id
AND x.last_seen < y.last_seen
WHERE x.last_seen > '4-4-2017 10:54:11'
AND x.spot = 'spot773'
AND (x.technology = 'Bluetooth LE' OR x.technology = 'EPC Gen2')
AND y.id IS NULL
GROUP BY i_id
This is what my table looks like:
CREATE TABLE [dbo].[items] (
[id] INT IDENTITY (1, 1) NOT NULL,
[i_id] VARCHAR (100) NOT NULL,
[last_seen] DATETIME2 (0) NOT NULL,
[location] VARCHAR (200) NOT NULL,
[code_hex] VARCHAR (100) NOT NULL,
[technology] VARCHAR (100) NOT NULL,
[url] VARCHAR (100) NOT NULL,
[spot] VARCHAR (200) NOT NULL,
PRIMARY KEY CLUSTERED ([id] ASC));
I've tried a couple of things but I'm not an SQL expert:)
Any help would be appreciated
EDIT:
I do get duplicate rows when I remove the GROUP BY line as you can see:
I'm adding another answer in order to show how you'd typically select the latest record per group without getting duplicates. You'd use ROW_NUMBER for this, marking the last record per i_id with row number 1.
SELECT *
FROM
(
SELECT
i.*,
ROW_NUMBER() over (PARTITION BY i_id ORDER BY last_seen DESC) as rn
FROM items i
WHERE last_seen > '2017-04-04 10:54:11'
AND spot = 'spot773'
AND technology IN ('Bluetooth LE', 'EPC Gen2')
) ranked
WHERE rn = 1;
(You'd use RANK or DENSE_RANK instead of ROW_NUMBER if you wanted to keep ties, i.e. several rows sharing the latest last_seen.)
You forgot the table alias in GROUP BY i_id.
Anyway, why are you writing an anti-join query where you are trying to get rid of duplicates with both DISTINCT and GROUP BY? Did you have issues with a straightforward NOT EXISTS query? You are making things more complicated than they need to be.
SELECT *
FROM items i
WHERE last_seen > '2017-04-04 10:54:11'
AND spot = 'spot773'
AND technology IN ('Bluetooth LE', 'EPC Gen2')
AND NOT EXISTS
(
SELECT *
FROM items other
WHERE i.i_id = other.i_id
AND i.last_seen < other.last_seen
);
(There are other techniques of course to get the last seen record per i_id. This is one; another is to compare with MAX(last_seen); another is to use ROW_NUMBER.)
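A reduced sketch of both points, run against SQLite through Python's sqlite3 module instead of SQL Server (so the error text differs from Msg 209). The table is trimmed to three columns and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (
    id        INTEGER PRIMARY KEY,
    i_id      TEXT NOT NULL,
    last_seen TEXT NOT NULL   -- ISO-8601 text, so string order is time order
);
INSERT INTO items (id, i_id, last_seen) VALUES
    (1, 'tagA', '2017-04-04 11:00:00'),
    (2, 'tagA', '2017-04-04 12:00:00'),
    (3, 'tagB', '2017-04-04 11:30:00');
""")

# In a self-join an unqualified column name could come from either side,
# so the engine refuses to guess.
try:
    conn.execute("SELECT i_id FROM items x JOIN items y ON y.id = x.id")
    ambiguous = False
except sqlite3.OperationalError:
    ambiguous = True

# Anti-join: keep a row only if no later row exists for the same i_id.
newest = conn.execute("""
    SELECT i.id, i.i_id, i.last_seen
    FROM items i
    WHERE NOT EXISTS (
        SELECT 1
        FROM items other
        WHERE other.i_id = i.i_id
          AND other.last_seen > i.last_seen
    )
    ORDER BY i.i_id
""").fetchall()

print(ambiguous, newest)
```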

Postgres union query returns strange result

I have the following tables:
CREATE TABLE geodat(
vessel UUID NOT NULL,
trip UUID NOT NULL,
geom geometry(LineString,4326),
PRIMARY KEY(vessel,trip)
);
CREATE TABLE areas(
gid SERIAL NOT NULL,
/* --other columns of little interest-- */
geom geometry(MultiPolygon,3035),
PRIMARY KEY(gid)
);
The following query is supposed to return the area that has been crossed the least, as well as how many times it was crossed and by which vessels.
SELECT vessel,MIN(cnt) as min_crossing,gid
FROM (
SELECT vessel,COUNT(*) as cnt, gid
FROM (
SELECT vessel, null as geo1, geom as geo2, null as gid
FROM geodat
UNION ALL
SELECT null,geom,null,gid FROM areas ) as P
WHERE ST_Crosses(geo1,geo2) AND geo1 IS NOT NULL AND geo2 IS NOT NULL
GROUP BY gid,vessel) as P1
GROUP BY gid,vessel
Theoretically, this query should answer the question above. The problem is that I am getting (0 rows) as a result, although I have been assured that crossings exist. I discovered it has something to do with the null values the UNION produces, but I don't have a clue how to fix this.
Any ideas?
NOTE: The two tables have 31822 rows and 27308 rows respectively which makes a JOIN impractical.
You have the condition
WHERE ST_Crosses(geo1,geo2) AND geo1 IS NOT NULL AND geo2 IS NOT NULL
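However, each branch of the UNION ALL explicitly sets either geo1 or geo2 to NULL, so no row ever has both geometries non-NULL and the condition can never be true. Hence the query returns 0 rows.
Changing the WHERE condition to use OR, as below, would return rows. Note, though, that ST_Crosses still never receives two geometries from the same row; to test actual crossings, the rows of the two tables have to be paired up, for example with a join on ST_Crosses.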
WHERE ST_Crosses(geo1,geo2) AND geo1 IS NOT NULL OR geo2 IS NOT NULL
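The effect of the NULL padding can be reproduced without PostGIS. In this sketch (Python's sqlite3, hypothetical tables left_t and right_t, and a plain a < b comparison standing in for ST_Crosses) the padded UNION ALL never produces a row with both values, while a join pairs the rows so the predicate has something to compare.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE left_t  (a INTEGER);   -- stands in for geodat
CREATE TABLE right_t (b INTEGER);   -- stands in for areas
INSERT INTO left_t  VALUES (1), (5);
INSERT INTO right_t VALUES (3), (4);
""")

# Each UNION ALL branch fills one column and pads the other with NULL,
# so "a IS NOT NULL AND b IS NOT NULL" can never hold.
padded = conn.execute("""
    SELECT a, b
    FROM (SELECT a, NULL AS b FROM left_t
          UNION ALL
          SELECT NULL, b FROM right_t) AS p
    WHERE a < b AND a IS NOT NULL AND b IS NOT NULL
""").fetchall()

# A join puts one row of each table side by side before testing them.
joined = conn.execute("""
    SELECT a, b
    FROM left_t
    JOIN right_t ON a < b
    ORDER BY a, b
""").fetchall()

print(padded, joined)
```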

SQL Query giving inconsistent results

I have the following table which logs chat messages
CREATE TABLE message_log
(
id serial NOT NULL,
message text,
from_id character varying(500),
to_id character varying(500),
match_id character varying(500),
unix_timestamp bigint,
own_account boolean,
reply_batch boolean DEFAULT false,
CONSTRAINT message_log_pkey PRIMARY KEY (id)
)
A chat conversation will have the same match_id
I want a query that returns the list of match_ids for which the last message of the chat conversation is from the non-account holder (own_account = false).
I came up with the following query, but it is giving inconsistent results which I don't understand.
select * from message_log
where from_id <> ?
and to_id = ?
and unix_timestamp in ( select distinct max(unix_timestamp)
from message_log group by match_id )
The question mark in the SQL query represents the account holder's user id
It would seem you need to tie the match_id back to the base query as well, otherwise you could be matching a unix_timestamp that is the maximum of a different conversation:
select m.*
from message_log m
where m.from_id <> ?
and m.to_id = ?
and m.unix_timestamp = ( select max(unix_timestamp)
from message_log
where match_id = m.match_id
group by match_id )
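I would suggest using DISTINCT ON. This is specific to Postgres, but designed for this situation. Note that the WHERE clause is applied before DISTINCT ON picks a row, so this returns the latest message per match_id among those that pass the filter: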
select distinct on (match_id) ml.*
from message_log ml
where from_id <> ? and to_id = ?
order by match_id, unix_timestamp desc;
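To see why the uncorrelated IN gives inconsistent results, here is a small sketch in Python's sqlite3 (not Postgres, so DISTINCT ON is not shown). The table is reduced to the columns the queries use, and the user ids 'me' and 'other' and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE message_log (
    id             INTEGER PRIMARY KEY,
    from_id        TEXT,
    to_id          TEXT,
    match_id       TEXT,
    unix_timestamp INTEGER
);
INSERT INTO message_log (from_id, to_id, match_id, unix_timestamp) VALUES
    ('other', 'me',    'm1', 10),
    ('me',    'other', 'm1', 20),   -- m1 ends with the account holder
    ('me',    'other', 'm2', 5),
    ('other', 'me',    'm2', 10);   -- m2 ends with the other party
""")
me = "me"

# Uncorrelated: timestamp 10 is the maximum of m2, so the m1 message
# with timestamp 10 slips through as well.
uncorrelated = [r[0] for r in conn.execute("""
    SELECT match_id FROM message_log
    WHERE from_id <> ? AND to_id = ?
      AND unix_timestamp IN (SELECT MAX(unix_timestamp)
                             FROM message_log GROUP BY match_id)
    ORDER BY match_id
""", (me, me))]

# Correlated: the maximum is taken within the same conversation only.
correlated = [r[0] for r in conn.execute("""
    SELECT m.match_id FROM message_log m
    WHERE m.from_id <> ? AND m.to_id = ?
      AND m.unix_timestamp = (SELECT MAX(unix_timestamp)
                              FROM message_log
                              WHERE match_id = m.match_id)
    ORDER BY m.match_id
""", (me, me))]

print(uncorrelated, correlated)
```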

SQLite Update Syntax for Correlated Subquery with Condition

I'm working with GTFS data in a SQLite database. I have a StopTimes table with columns trip_id, stop_sequence, and departure_time (among others). I want to null the last departure_time (in the tuple with the largest stop_sequence) for each trip.
The most obvious way I could think of to do this was this query:
UPDATE StopTimes AS A
SET departure_time = NULL
WHERE NOT EXISTS
(
SELECT * FROM StopTimes B
WHERE B.stop_sequence > A.stop_sequence
)
Unfortunately, it looks like I can't use an alias for a table in an UPDATE statement in SQLite. The query fails with (in Python) "sqlite3.OperationalError: near "AS": syntax error", and this other style isn't allowed in SQLite either:
UPDATE A
SET departure_time = NULL
FROM StopTimes AS A
WHERE NOT EXISTS
(
SELECT * FROM StopTimes B
WHERE B.stop_sequence > A.stop_sequence
)
I've tried a couple other variants, such as using ">= ALL (SELECT stop_sequence... WHERE trip_id = ? ...)", but I can't fill in that question mark.
I've also tried this one, but it doesn't look like this is valid SQL:
UPDATE StopTimes
SET departure_time = NULL
WHERE (trip_id, stop_sequence) IN
(
SELECT trip_id, MAX(stop_sequence)
FROM StopTimes
GROUP BY trip_id
)
How can I reference an outer table's attributes in a subquery in an UPDATE query, with syntax that SQLite will accept? Is there some way I can reformulate my query to get around this limitation?
You cannot put an alias on the table updated in an UPDATE statement.
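However, you can rename any other table in a subquery, so the table names will still be unambiguous. To clear the last departure_time of each trip (rather than only the rows with the overall largest stop_sequence), the subquery also has to compare trip_id:
UPDATE StopTimes
SET departure_time = NULL
WHERE NOT EXISTS
(
SELECT 1 FROM StopTimes AS B
WHERE B.trip_id = StopTimes.trip_id
AND B.stop_sequence > StopTimes.stop_sequence
)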
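Since the question runs SQLite from Python, here is a minimal sketch of this approach with the sqlite3 module. The table is cut down to the three columns mentioned and the rows are invented; the subquery is correlated on trip_id so that the last stop of every trip is cleared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE StopTimes (
    trip_id        TEXT,
    stop_sequence  INTEGER,
    departure_time TEXT
);
INSERT INTO StopTimes VALUES
    ('t1', 1, '08:00:00'),
    ('t1', 2, '08:10:00'),
    ('t1', 3, '08:20:00'),
    ('t2', 1, '09:00:00'),
    ('t2', 2, '09:15:00');
""")

# The updated table keeps its real name; only the inner copy is aliased.
conn.execute("""
    UPDATE StopTimes
    SET departure_time = NULL
    WHERE NOT EXISTS (
        SELECT 1
        FROM StopTimes AS B
        WHERE B.trip_id = StopTimes.trip_id
          AND B.stop_sequence > StopTimes.stop_sequence
    )
""")

cleared = conn.execute("""
    SELECT trip_id, stop_sequence
    FROM StopTimes
    WHERE departure_time IS NULL
    ORDER BY trip_id, stop_sequence
""").fetchall()

print(cleared)
```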