Select where join match list of value - sql

I'm not sure how to ask this question. I have the following schema:
message_id | message_content
-----------+----------------
1          | Hello World
2          | EHLO
message_id | concerned_user
-----------+---------------
1          | laura
1          | vick
1          | john
2          | laura
2          | vick
How do I select the message_id values that concern laura and vick (and only laura and vick)? The expected result is [2].
I'm sure it is basic SQL but I can't find it.
As asked in one of the answers: I use PostgreSQL.
Thanks!

Build a string of the concerned users and see if that matches what you are looking for. In PostgreSQL the string concatenation group function is STRING_AGG:
select message_id
from mytable
group by message_id
having string_agg(concerned_user, ',' order by concerned_user) = 'laura,vick';
If there can be duplicates in the table (two or more rows for the same message_id and concerned_user), you must add DISTINCT: string_agg(distinct concerned_user ...).

If only 2 specific users have the same message, then that message will have only 2 unique users.
And a count for each user won't be zero.
SELECT message_id
FROM your_table
GROUP BY message_id
HAVING COUNT(DISTINCT concerned_user) = 2
AND COUNT(CASE WHEN concerned_user = 'laura' THEN 1 END) > 0
AND COUNT(CASE WHEN concerned_user = 'vick' THEN 1 END) > 0;
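The HAVING approach above can be tried out as a minimal sketch. It runs on SQLite through Python's sqlite3 module only so that it is self-contained (the question targets PostgreSQL, where the same SQL should also work). The table name message_users and the helper messages_for_exactly are assumptions for illustration, and the helper generalizes the two hard-coded names to any list of users.

```python
import sqlite3

# Hypothetical table name; the rows are the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE message_users (message_id INTEGER, concerned_user TEXT)")
conn.executemany(
    "INSERT INTO message_users VALUES (?, ?)",
    [(1, "laura"), (1, "vick"), (1, "john"), (2, "laura"), (2, "vick")],
)


def messages_for_exactly(conn, users):
    """Return ids of messages whose set of concerned users equals `users`."""
    users = sorted(set(users))
    placeholders = ", ".join("?" for _ in users)
    sql = f"""
        SELECT message_id
        FROM message_users
        GROUP BY message_id
        -- no user outside the list: total distinct users equals the list size
        HAVING COUNT(DISTINCT concerned_user) = ?
        -- every listed user is present: distinct listed users equals the list size
           AND COUNT(DISTINCT CASE WHEN concerned_user IN ({placeholders})
                                   THEN concerned_user END) = ?
        ORDER BY message_id
    """
    params = [len(users), *users, len(users)]
    return [row[0] for row in conn.execute(sql, params)]


print(messages_for_exactly(conn, ["laura", "vick"]))  # [2]
```

Message 1 is rejected because it has three distinct users, which is the "and only laura and vick" part of the question.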

In very basic SQL this can be done with WHERE EXISTS / NOT EXISTS,
like this (in pseudo code; the real SQL statement follows):
select messages-from-laura
where exists(message-from-vick-with-same-id)
and not exists(message-from-someone-else-with-same-id)
where same-id refers to the id of laura's message.
This is standard SQL, it works in postgresql, mssql (tested with both) and probably with others.
select *
from messages laura_messages
where concerned_user = 'laura'
and (exists (select 1 from messages vick_messages
where vick_messages.message_id = laura_messages.message_id
and vick_messages.concerned_user = 'vick'))
and not exists (select 1 from messages other_messages
where other_messages.message_id = laura_messages.message_id
and other_messages.concerned_user not in ('laura','vick'))
Note the use of aliases for the main table and the subquery tables
(in SQL, you can add an alias name after the table name in the FROM TableName clause)
and the reference to the outer message id inside the EXISTS subqueries.
Note also that the subqueries don't need to return anything in particular, only that some row exists, so SELECT 1 is fine.
You could write the equivalent query with joins, but the optimizer is very good at rewriting queries so the plan would probably be the same; this version is simpler imho.
Or you could use GROUP BY or something more sophisticated. The nice thing with SQL is that you often have several ways to write the same query :)
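As a rough check of the EXISTS / NOT EXISTS idea, here is a small sketch on SQLite via Python's sqlite3 module (chosen only to keep it runnable; the answer itself was tested on PostgreSQL and MSSQL). The subquery for vick filters on concerned_user = 'vick' explicitly. Message 3, which concerns laura alone, is a made-up extra row added to show why that filter matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (message_id INTEGER, concerned_user TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [
        (1, "laura"), (1, "vick"), (1, "john"),
        (2, "laura"), (2, "vick"),
        (3, "laura"),  # hypothetical row: laura only, must not match
    ],
)

sql = """
SELECT laura_messages.message_id
FROM messages laura_messages
WHERE laura_messages.concerned_user = 'laura'
  -- vick is attached to the same message
  AND EXISTS (SELECT 1 FROM messages vick_messages
              WHERE vick_messages.message_id = laura_messages.message_id
                AND vick_messages.concerned_user = 'vick')
  -- nobody else is attached to the same message
  AND NOT EXISTS (SELECT 1 FROM messages other_messages
                  WHERE other_messages.message_id = laura_messages.message_id
                    AND other_messages.concerned_user NOT IN ('laura', 'vick'))
ORDER BY laura_messages.message_id
"""
result = [row[0] for row in conn.execute(sql)]
print(result)  # [2]
```

Without the concerned_user = 'vick' condition, the first EXISTS would be satisfied by laura's own row and message 3 would be returned as well.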

Related

Duplicate ID in the database

I noticed that some users in my database have the same ID number (it seems to be a bug: the code didn't check whether the ID number was already taken by a deleted user).
There are hundreds of pairs of users with the same ID number.
Through SQL I would like to update (by appending a 0 to the ID) all those users who have a duplicate ID and are deleted.
I'm very familiar with the SQL language.
I found all the duplicate ID users using this query, but I am not sure how I should proceed.
SELECT ID, COUNT(*) As Num
FROM Users
GROUP BY ID
HAVING COUNT(ID) >= 2
If I understand correctly, you have some sort of "isdeleted" flag. Although I'm not sure that "adding a zero" is the best solution to your problem, the standard SQL for this would, based on your description, look something like this:
update t
set id = id || '0'
where isdeleted = 1 and
exists (select 1 from t t2 where t2.id = t.id and t2.isdeleted = 0);
This assumes that isdeleted is a number, with 0 for false and 1 for true. || is the standard SQL operator for string concatenation. Some databases have other mechanisms for string concatenation.
This query is for Oracle; I'm not sure which database you are using:
update users set id = id||0 where rowid not in
(select max(rowid ) from users group by id)
--and flag = 'Deleted Flag' -- uncomment this line if the table has a deleted flag; if not, use the query as it is
;
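To see what the EXISTS-based update does, here is a small sketch on SQLite via Python's sqlite3 module. The schema is an assumption: a users table with the id stored as text and an isdeleted column holding 0 or 1, as the first answer supposes. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: id kept as TEXT so that a '0' can be appended.
conn.execute("CREATE TABLE users (id TEXT, name TEXT, isdeleted INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        ("10", "ann", 0),
        ("10", "old_ann", 1),  # deleted user sharing an id with a live user
        ("11", "bob", 0),
        ("12", "gone", 1),     # deleted, but the id is not duplicated
    ],
)

# Append '0' only to deleted rows whose id is also used by a non-deleted row.
conn.execute("""
    UPDATE users
    SET id = id || '0'
    WHERE isdeleted = 1
      AND EXISTS (SELECT 1 FROM users u2
                  WHERE u2.id = users.id AND u2.isdeleted = 0)
""")

rows = sorted(conn.execute("SELECT id, name FROM users").fetchall())
print(rows)
# [('10', 'ann'), ('100', 'old_ann'), ('11', 'bob'), ('12', 'gone')]
```

Note that the new value ('100' here) could itself collide with an existing id, so it may be worth re-running the duplicate check from the question after the update.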

Microsoft Access Query

I have a settings table that stores different setting keys for an app built using Microsoft Access.
One of the settings key drives how many records should be seen in a dropdown list
The query behind the list is similar to the one below:
Select Top 3 id, name FROM tblRegular
Now, I want to achieve something like this:
Select Top (Select keyValue FROM tblSettings WHERE key="rowNumber") id, name FROM tblRegular
But using it like this does not work; it raises errors.
Could someone tell me if it can be done?
EDIT: The table structure looks similar to the one below:
tblRegular:
id | name
1 'A'
2 'B'
3 'C'
tblSettings:
id | key | keyValue
1 'rowNumber' 2
Thank you!
Consider the pure SQL solution using a correlated subquery to calculate a rowCount that is then used in outer query to filter by number of rows:
SELECT main.id, main.[name]
FROM
(SELECT t.id, t.[name],
(SELECT Count(*) FROM tblRegular sub
WHERE sub.[name] <= t.[name]) AS rowCount
FROM tblRegular t) AS main
WHERE main.rowCount <= (SELECT Max(s.keyValue) FROM tblSettings s
WHERE s.key = 'rowNumber')
Alternatively with the domain aggregate, DMax():
SELECT main.id, main.[name]
FROM
(SELECT t.id, t.[name],
(SELECT Count(*) FROM tblRegular sub
WHERE sub.[name] <= t.[name]) AS rowCount
FROM tblRegular t) AS main
WHERE main.rowCount <= DMax("keyValue", "tblSettings", "key = 'rowNumber'")
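The ranking idea behind both queries above (count how many names sort at or before the current one, then keep the rows whose count is within the setting) can be checked with a small sketch. It runs on SQLite through Python's sqlite3 module rather than Access, so DMax() is not available and only the subquery variant is shown. The data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblRegular (id INTEGER, name TEXT)")
# "key" is quoted because it is a keyword in many SQL dialects.
conn.execute('CREATE TABLE tblSettings (id INTEGER, "key" TEXT, keyValue INTEGER)')
conn.executemany(
    "INSERT INTO tblRegular VALUES (?, ?)", [(1, "A"), (2, "B"), (3, "C")]
)
conn.execute("INSERT INTO tblSettings VALUES (1, 'rowNumber', 2)")

sql = """
SELECT main.id, main.name
FROM (SELECT t.id, t.name,
             -- rank of the row: number of names sorting at or before it
             (SELECT COUNT(*) FROM tblRegular sub
              WHERE sub.name <= t.name) AS row_count
      FROM tblRegular t) AS main
WHERE main.row_count <= (SELECT MAX(s.keyValue) FROM tblSettings s
                         WHERE s."key" = 'rowNumber')
ORDER BY main.name
"""
top_rows = conn.execute(sql).fetchall()
print(top_rows)  # [(1, 'A'), (2, 'B')]
```

If two rows share the same name they receive the same count, so the ranking column should be unique for the row limit to be exact.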
This syntax does fail in Access SQL, barking at "Select" with the (localised) message like:
The SELECT sentence contains a reserved word or argument, that is
misspelled or missing, or the punctuation is not correct.
Ok, so, just found out that using the Select statement as it was in the question would have triggered the errors mentioned. So, the approach that worked is to alter the .RowSource of the dropdown on form load and the query placed in the rowsource should look like:
Select Top (" & rowNr & ") id, name FROM tblRegular
where rowNr is fetched using another SQL query or even a DAO/ADO function to retrieve the value from the database

PostgreSQL: How to select on non-aggregating column?

Seems like a simple question but I'm having trouble accomplishing it. What I want to do is return all names that have duplicate ids. The view looks as such:
id | name | other_col
---+--------+----------
1 | James | x
2 | John | x
2 | David | x
3 | Emily | x
4 | Cameron| x
4 | Thomas | x
And so in this case, I'd just want the result:
name
-------
John
David
Cameron
Thomas
The following query works but it seems like an overkill to have two separate selects:
select name
from view where id = ANY(select id from view
WHERE other_col='x'
group by id
having count(id) > 1)
and other_col='x';
I believe it should be possible to do something under the lines of:
select name from view WHERE other_col='x' group by id, name having count(id) > 1;
But this returns nothing at all! What is the 'proper' query?
Do I just have to do it like my first working suggestion, or is there a better way?
You state you want to avoid two "queries", which isn't really possible. There are plenty of solutions available, but I would use a CTE like so:
WITH cte AS
(
SELECT
id,
name,
other_col,
COUNT(name) OVER(PARTITION BY id) AS id_count
FROM
table
)
SELECT name FROM cte WHERE id_count > 1;
You can reuse the CTE, so you don't have to duplicate logic and I personally find it easier to read and understand what it is doing.
SELECT name FROM Table
WHERE id IN (SELECT id FROM Table GROUP BY id HAVING COUNT(*) > 1)
Use the EXISTS operator:
SELECT * FROM table t1
WHERE EXISTS(
SELECT null FROM table t2
WHERE t1.id = t2.id
AND t1.name <> t2.name
)
Use a join:
select distinct v1.name
from view v1
join view v2 on v1.id = v2.id
and v1.name != v2.name
The use of distinct is there in case there are more than 2 rows sharing the same id. If that's not possible, you can omit distinct.
A note: Naming a column id when it's not unique will likely cause confusion, because it's the industry standard for the unique identifier column. If there isn't a unique column at all, it will cause coding difficulties.
Do not use a CTE. That's typically more expensive because Postgres has to materialize the intermediary result.
An EXISTS semi-join is typically fastest for this. Just make sure to repeat predicates (or match the values):
SELECT name
FROM view v
WHERE other_col = 'x'
AND EXISTS (
SELECT 1 FROM view
WHERE other_col = 'x' -- or: other_col = v.other_col
AND id = v.id -- another row with the same id
AND name <> v.name -- exclude join to self
);
That's a single query, even if you see the keyword SELECT twice here. An EXISTS expression does not produce a derived table, it will be resolved to simple index look-ups.
Speaking of which: a multicolumn index on (other_col, id) should help. Depending on data distribution and access patterns, appending the payload column name to enable index-only scans might help: (other_col, id, name). Or even a partial index, if other_col = 'x' is a constant predicate:
CREATE INDEX ON view (id) WHERE other_col = 'x';
Related: "PostgreSQL does not use a partial index"
The upcoming Postgres 9.6 would even allow an index-only scan on the partial index:
CREATE INDEX ON view (id, name) WHERE other_col = 'x';
You will love this improvement (quoting the /devel manual):
Allow using an index-only scan with a partial index when the index's
predicate involves column(s) not stored in the index (Tomas Vondra,
Kyotaro Horiguchi)
An index-only scan is now allowed if the query mentions such columns
only in WHERE clauses that match the index predicate
Verify performance with EXPLAIN (ANALYZE, TIMING OFF) SELECT ...
Run a couple of times to rule out caching effects.
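A minimal sketch of the EXISTS semi-join on the sample data from the question, matching rows that share an id but differ in name. It uses SQLite through Python's sqlite3 module so that it runs anywhere; the table name people is an assumption, standing in for the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table standing in for the view from the question.
conn.execute("CREATE TABLE people (id INTEGER, name TEXT, other_col TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        (1, "James", "x"), (2, "John", "x"), (2, "David", "x"),
        (3, "Emily", "x"), (4, "Cameron", "x"), (4, "Thomas", "x"),
    ],
)

sql = """
SELECT p.name
FROM people p
WHERE p.other_col = 'x'
  AND EXISTS (SELECT 1 FROM people p2
              WHERE p2.other_col = 'x'
                AND p2.id = p.id        -- same id ...
                AND p2.name <> p.name)  -- ... on a different row
ORDER BY p.id, p.name
"""
names = [row[0] for row in conn.execute(sql)]
print(names)  # ['David', 'John', 'Cameron', 'Thomas']
```

James and Emily are dropped because no other row shares their id.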

SQLite: detect if a rowid exists

What's the best/right/fastest/most appropriate way to detect if a row with a given rowid exists?
Or by extension, how to detect if at least one row matching a given condition exists?
I'm firing quite some of these requests. I am currently using
SELECT 1 FROM table WHERE condition LIMIT 1
This looks a bit weird to me, but it seems like "the least work" for the db. My SQL knowledge is spotty, though.
I would probably do it something like this:
SELECT
CASE
WHEN EXISTS(SELECT NULL FROM table1 WHERE ID=someid)
THEN 1
ELSE 0
END
Counting the rows is not that efficient.
Checking whether something exists is in most cases more efficient.
Since it's sqlite, you need to use the column name "rowid" to access that id column. Using Craig Ringer's sql, the sqlite version would look like this:
SELECT EXISTS(SELECT 1 FROM table WHERE rowid = insert_number)
Use EXISTS, it sounds perfect for what you are after. e.g.
SELECT *
FROM T1
WHERE EXISTS (SELECT 1 FROM T2 WHERE T2.X = T1.X AND T2.Y = 1)
It is effectively the same as LIMIT 1 but is generally optimised better.
One could test
select true from table where id = id1
You can for example use
SELECT COUNT(*) FROM table WHERE ID = whatever
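For completeness, here is how the SELECT EXISTS(...) form could be wrapped from application code, sketched with Python's sqlite3 module. The table name items and the helper row_exists are assumptions for illustration.

```python
import sqlite3


def row_exists(conn, rowid):
    """Return True if the hypothetical table `items` has a row with this rowid."""
    cur = conn.execute(
        "SELECT EXISTS(SELECT 1 FROM items WHERE rowid = ?)", (rowid,)
    )
    # EXISTS always yields exactly one row holding 0 or 1.
    return bool(cur.fetchone()[0])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (label TEXT)")
# SQLite assigns rowid 1 and 2 to these two rows.
conn.executemany("INSERT INTO items (label) VALUES (?)", [("a",), ("b",)])

print(row_exists(conn, 1))   # True
print(row_exists(conn, 99))  # False
```

Because the query always returns a single row, the caller does not have to distinguish "no rows" from "a row containing 0", which is the main practical difference from SELECT 1 ... LIMIT 1.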

How to give the output of the first query(which has two values) as the input to the second?

I get 2 names as the output of the first query,
e.g.: paul, peter.
This should now be the input for the second query,
which has to display paul's and peter's email ids.
For nested queries I would strongly recommend the WITH clause. It makes long, complex queries an order of magnitude easier to understand / construct / modify:
WITH
w_users AS( -- you can name it whatever you want
SELECT id
FROM users
WHERE < long condition here >
),
w_other_subquery AS(
...
)
SELECT email_id
FROM ...
WHERE user_id IN (SELECT id FROM w_users)
You can do it like this:
SELECT USER_ID,EMAIL_ID FROM USERS where user_id IN
(SELECT PRODUCT_MEMBERS FROM PRODUCT WHERE PRODUCT_NAME='ICP/RAA');
Just use the IN clause; '=' is used for matching a single result.
You can use the IN operator to get the result,
e.g.:
SELECT email FROM tableName WHERE (Name IN ('paul', 'peter'))
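Putting the answers together, the first query can be nested directly inside IN so that its names never have to be copied by hand. The sketch below uses SQLite through Python's sqlite3 module; the tables members and contacts, their columns, and the sample rows are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: the first query reads members, the second reads contacts.
conn.execute("CREATE TABLE members (name TEXT, team TEXT)")
conn.execute("CREATE TABLE contacts (name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO members VALUES (?, ?)",
    [("paul", "core"), ("peter", "core"), ("mary", "docs")],
)
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?)",
    [
        ("paul", "paul@example.com"),
        ("peter", "peter@example.com"),
        ("mary", "mary@example.com"),
    ],
)

sql = """
SELECT email
FROM contacts
-- the subquery is the "first query"; it returns paul and peter
WHERE name IN (SELECT name FROM members WHERE team = 'core')
ORDER BY email
"""
emails = [row[0] for row in conn.execute(sql)]
print(emails)  # ['paul@example.com', 'peter@example.com']
```

The subquery must return a single column for IN to accept it.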