MySQL conditional SELECT statement - sql

If there are records that have a field containing "X", return them, else return a random record.
How the heck do you do this?

This is best done with 2 queries. The first returns the records where field='x'. If that's empty, then do a query for a random record with field!='x'. Getting a random record can be very inefficient as you'll see from the number of "get random record" questions on SO. Because of this, you really only want to do it if you absolutely have to.
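A minimal sketch of that two-query approach, using Python with an in-memory SQLite table (the table and column names are placeholders; in MySQL the random pick would be ORDER BY RAND() LIMIT 1):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, field TEXT)")
conn.executemany("INSERT INTO t (field) VALUES (?)", [("a",), ("b",), ("c",)])

def matching_or_random(conn, value):
    # First query: every record where field matches.
    rows = conn.execute("SELECT id, field FROM t WHERE field = ?", (value,)).fetchall()
    if rows:
        return rows
    # Fallback: one random record. ORDER BY RANDOM() is fine on small
    # tables, but it is exactly the expensive step the answer warns about.
    return conn.execute(
        "SELECT id, field FROM t WHERE field != ? ORDER BY RANDOM() LIMIT 1",
        (value,)).fetchall()

print(matching_or_random(conn, "b"))       # the matching record: [(2, 'b')]
print(len(matching_or_random(conn, "x")))  # no match, so 1 random record
```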

Selecting a random record on its own would be difficult and highly inefficient on large tables in MySQL. On this website you can find a script to do it; it should be trivial to add your condition for 'x' and get the functionality you need.

Well, here is my example based on the mysql.user table.
First, with no matching records:
mysql> SELECT * FROM (
           SELECT user, 1 AS q FROM user WHERE user LIKE '%z'
           UNION ALL (SELECT user, 0 FROM user LIMIT 1)
       ) b
       WHERE q = (SELECT CASE WHEN EXISTS (
           SELECT user, 1 AS q FROM user WHERE user LIKE '%z'
       ) THEN 1 ELSE 0 END);
+--------+---+
| user   | q |
+--------+---+
| drupal | 0 |
+--------+---+
1 row in set (0.00 sec)
Then, with matching records:
mysql> SELECT * FROM (
           SELECT user, 1 AS q FROM user WHERE user LIKE '%t'
           UNION ALL (SELECT user, 0 FROM user LIMIT 1)
       ) b
       WHERE q = (SELECT CASE WHEN EXISTS (
           SELECT user, 1 AS q FROM user WHERE user LIKE '%t'
       ) THEN 1 ELSE 0 END);
+------------------+---+
| user             | q |
+------------------+---+
| root             | 1 |
| root             | 1 |
| debian-sys-maint | 1 |
| root             | 1 |
+------------------+---+
4 rows in set (0.00 sec)
Maybe it will be useful, or maybe someone will be able to rewrite it in a better way.

Related

Update column in one table for a user based on count of records in another table for same user without using cursor

I have 2 tables, A and B. I need to update a column in table A for all userIDs, based on the count of records each userID has in table B and some defined rules. If the count of records in the other table is 3 and 3 are required for that userID, then mark IsCorrect as 1, else 0; if the count is 2 and 5 are required, then IsCorrect is 0. For example, below is what I am trying to achieve:
Table A
UserID | Required | IsCorrect
----------------------------------
1 | SO;GO;PE | 1
2 | SO;GO;PE;PR | 0
3 | SO;GO;PE | 1
Table B
UserID | PPName
-----------------------
1 | SO
1 | GO
1 | PE
2 | SO
2 | GO
3 | SO
3 | GO
3 | PE
I tried using an UPDATE joining the other table, but could not come up with one. Also, I do not want to use cursors because of their overhead. I know I will have to create a stored procedure for the rules, but how to pass the userIDs to it without a cursor is what I am looking for.
Thanks for the help. Apologies for not formatting the table correctly :)
update A
set IsCorrect = case
                    when Required <= (select count(*) from B where b.UserID = A.UserID)
                    then 'Y' -- or 0, or whatever sense is appropriate
                    else 'N'
                end
THIS ANSWERS THE ORIGINAL QUESTION.
Hmmm, you can use a correlated subquery and some case logic:
update a
set iscorrect = (case when required <=
                          (select count(*) from b where b.userid = a.userid)
                      then 1 else 0
                 end);
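Note that both versions compare Required, a text column holding a semicolon-delimited list, directly against a count. If the intent is "the user has every required item", the comparison should be against the number of items in that list. A sketch of that idea in Python with SQLite (on SQL Server the item count could come from STRING_SPLIT instead of the length trick):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (UserID INTEGER, Required TEXT, IsCorrect INTEGER);
CREATE TABLE B (UserID INTEGER, PPName TEXT);
INSERT INTO A VALUES (1,'SO;GO;PE',NULL),(2,'SO;GO;PE;PR',NULL),(3,'SO;GO;PE',NULL);
INSERT INTO B VALUES (1,'SO'),(1,'GO'),(1,'PE'),(2,'SO'),(2,'GO'),
                     (3,'SO'),(3,'GO'),(3,'PE');
""")
# Number of required items = number of ';' separators + 1.
conn.execute("""
UPDATE A
SET IsCorrect = CASE
    WHEN length(Required) - length(replace(Required, ';', '')) + 1 =
         (SELECT count(*) FROM B WHERE B.UserID = A.UserID)
    THEN 1 ELSE 0
END
""")
print(conn.execute("SELECT UserID, IsCorrect FROM A ORDER BY UserID").fetchall())
# → [(1, 1), (2, 0), (3, 1)]
```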

Get last messages for user inbox using sql

I am working on a user inbox. I have to retrieve the last message to show in the inbox, whether it was from me or from my friend. I have tried every possible solution but I am not getting any result, which is why I am asking here. I am still learning SQL. Here is my DB picture:
database table
The solution I want, for example:
id | sent_by | sent_to | descp
42 | 3 | 7 | fdssdf
30 | 3 | 6 | sdas
I'm not quite sure what you mean, but usually when you want to select the latest of something, you use SELECT TOP 1 (that's SQL Server syntax; in MySQL it's LIMIT 1 at the end) and ORDER BY x DESC:
SELECT TOP 1 *
FROM your_table_name
WHERE sent_by = 3
ORDER BY id DESC
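If the goal is the last message per conversation partner (as the sample output suggests), one way is to key each message by the "other side" of the conversation and keep the row with the highest id. A sketch with SQLite from Python, using the column names from the sample output:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, "
             "sent_by INTEGER, sent_to INTEGER, descp TEXT)")
conn.executemany("INSERT INTO messages VALUES (?,?,?,?)", [
    (28, 6, 3, 'hello'),
    (30, 3, 6, 'sdas'),
    (40, 7, 3, 'hi'),
    (42, 3, 7, 'fdssdf'),
])
me = 3
# For each message involving me, the subquery finds the highest id among
# messages with the same conversation partner; keep only that row.
rows = conn.execute("""
    SELECT id, sent_by, sent_to, descp
    FROM messages m
    WHERE (m.sent_by = ? OR m.sent_to = ?)
      AND m.id = (
        SELECT MAX(m2.id)
        FROM messages m2
        WHERE (m2.sent_by = ? OR m2.sent_to = ?)
          AND (CASE WHEN m2.sent_by = ? THEN m2.sent_to ELSE m2.sent_by END) =
              (CASE WHEN m.sent_by = ? THEN m.sent_to ELSE m.sent_by END)
      )
    ORDER BY m.id DESC
""", (me,) * 6).fetchall()
print(rows)  # → [(42, 3, 7, 'fdssdf'), (30, 3, 6, 'sdas')]
```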

Find Duplicate Values in a column based on specific criteria

I have a table that holds actions against specific accounts. The actions are grouped into numbered sets, and within each set every action gets a unique, sequential number. We ran into an issue where somehow one of those unique numbers had been duplicated, and we would like to check for more examples where this might have happened. The table looks a little like this:
Account | Action Set | Action No | Action Code
--------|------------|-----------|------------
001 | 1 | 1 | GEN
001 | 1 | 2 | PHO
001 | 1 | 3 | RAN
001 | 1 | 3 | GEN
002 | 1 | 1 | GEN
002 | 1 | 2 | PHO
002 | 1 | 3 | RAN
I have tried various things I've found through searches on here but can't find anything that looks like it fits my specific circumstances.
For any given account number, I would like to find where within one Action SET the same Action Number is used more than once. I also need to return the full row, not just a count of how many there are.
From the example above, I would expect to see these results (same account, same action set, same action number):
Account | Action Set | Action No | Action Code
--------|------------|-----------|------------
001 | 1 | 3 | RAN
001 | 1 | 3 | GEN
I would post what I have tried so far but honestly the extent of the code I have written so far is:
SELECT
TIA
Mark
Based on your description, you can use exists:
select t.*
from t
where exists (select 1
              from t t2
              where t2.account = t.account and
                    t2.actionset = t.actionset and
                    t2.actionno = t.actionno and
                    t2.actioncode <> t.actioncode
             );
EDIT:
The above assumes the duplicated rows differ in their action codes. Otherwise you can use:
select t.*
from t
where (select count(*)
       from t t2
       where t2.account = t.account and
             t2.actionset = t.actionset and
             t2.actionno = t.actionno
      ) >= 2;
try this one
select account, actionset, actioncode, actionno
from table
where (account, actionset, actionno)
in
(
    select account, actionset, actionno
    from table
    group by account, actionset, actionno
    having count(*) > 1
)
Please find my solution for getting the duplicate records from the table:
SELECT [Account],[ActionSet],[ActionCode],[ActionNo]
FROM
(
    SELECT *, ROW_NUMBER() OVER (PARTITION BY [Account],[ActionSet],[ActionNo]
                                 ORDER BY [ActionNo]) AS rnk
    FROM [dbo].[ActionAccount]
) t
WHERE t.rnk > 1
Note that this returns only the second and later copies of each duplicate, not the first row.
Thanks.
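A variation that returns all copies of each duplicate row, not just the second and later ones, tags every row with COUNT(*) OVER the same partition. Sketched here with SQLite from Python against the sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actions (account TEXT, actionset INTEGER, "
             "actionno INTEGER, actioncode TEXT)")
conn.executemany("INSERT INTO actions VALUES (?,?,?,?)", [
    ('001', 1, 1, 'GEN'), ('001', 1, 2, 'PHO'),
    ('001', 1, 3, 'RAN'), ('001', 1, 3, 'GEN'),
    ('002', 1, 1, 'GEN'), ('002', 1, 2, 'PHO'), ('002', 1, 3, 'RAN'),
])
# COUNT(*) OVER the partition tags each row with how many rows share its
# account/set/number; keeping count > 1 returns every copy of a duplicate.
rows = conn.execute("""
    SELECT account, actionset, actionno, actioncode
    FROM (
        SELECT *, COUNT(*) OVER (PARTITION BY account, actionset, actionno) AS cnt
        FROM actions
    ) t
    WHERE t.cnt > 1
""").fetchall()
print(sorted(rows))  # → [('001', 1, 3, 'GEN'), ('001', 1, 3, 'RAN')]
```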

SQL optimisation - Word count in string - Postgresql

I am trying to update a large table (about 1M rows) with the count of words in a field on Postgresql.
This query works, and sets the token_count field counting the words (tokens) in longtext in table my_table:
UPDATE my_table mt SET token_count =
(select count(token) from
(select unnest(regexp_matches(t.longtext, E'\\w+','g')) as token
from my_table as t where mt.myid = t.myid)
as tokens);
myid is the primary key of the table.
\\w+ is necessary because I want to count words, ignoring special characters.
For example, A test . ; ) would return 5 with space-based count, while 2 is the right value.
The issue is that it's horribly slow, and 2 days are not enough to complete it on 1M rows.
What would you do to optimise it? Are there ways to avoid the join?
How can I split the batch into blocks, using for example limit and offset?
Thanks for any tips,
Mulone
UPDATE: I measured the performance of the array-split approach, and the update is going to be slow anyway. So maybe a solution would consist of parallelising it. If I run different queries from psql, only one query works and the others wait for it to finish. How can I parallelise an update?
Have you tried using array_length?
UPDATE my_table mt
SET token_count = array_length(regexp_split_to_array(trim(longtext), E'\\W+'), 1)
http://www.postgresql.org/docs/current/static/functions-array.html
# select array_length(regexp_split_to_array(trim(' some long text '), E'\\W+'), 1);
array_length
--------------
3
(1 row)
UPDATE my_table
SET token_count = array_length(regexp_split_to_array(longtext, E'\\s+'), 1)
Or your original query without a correlation
UPDATE my_table
SET token_count = (
select count(*)
from (select unnest(regexp_matches(longtext, E'\\w+','g'))) s
);
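As a quick sanity check of what the \\w+ counting is supposed to produce, the same logic outside the database in Python:

```python
import re

def token_count(text):
    # Count runs of word characters, ignoring punctuation, mirroring
    # regexp_matches(longtext, E'\\w+', 'g') on the SQL side.
    return len(re.findall(r"\w+", text))

print(token_count("A test . ; )"))  # → 2, not the 5 a space-based split gives
```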
Using tsvector and ts_stat
get statistics of a tsvector column
SELECT *
FROM ts_stat($$
SELECT to_tsvector(t.longtext)
FROM my_table AS t
$$);
No sample data to try it but it should work.
Sample Data
CREATE TEMP TABLE my_table
AS
SELECT $$A paragraph (from the Ancient Greek παράγραφος paragraphos, "to write beside" or "written beside") is a self-contained unit of a discourse in writing dealing with a particular point or idea. A paragraph consists of one or more sentences.$$::text AS longtext;
SELECT *
FROM ts_stat($$
SELECT to_tsvector(t.longtext)
FROM my_table AS t
$$);
word | ndoc | nentry
--------------+------+--------
παράγραφος | 1 | 1
written | 1 | 1
write | 1 | 2
unit | 1 | 1
sentenc | 1 | 1
self-contain | 1 | 1
self | 1 | 1
point | 1 | 1
particular | 1 | 1
paragrapho | 1 | 1
paragraph | 1 | 2
one | 1 | 1
idea | 1 | 1
greek | 1 | 1
discours | 1 | 1
deal | 1 | 1
contain | 1 | 1
consist | 1 | 1
besid | 1 | 2
ancient | 1 | 1
(20 rows)
Make sure myid is indexed, being the first field in the index.
Consider doing this outside the DB in the first place. It's hard to say without benchmarking, but the counting may be more costly than the select+update, so it may be worth it:
use COPY command (BCP equivalent for Postgres) to copy out the table data efficiently in bulk to a file
run a simple Perl script to count. 1 million rows should take a couple of minutes to 1 hour for Perl, depending on how slow your IO is.
use COPY to copy the table back into the DB (possibly into a temp table, then update from that temp table; or better yet truncate the main table and COPY straight into it if you can afford the downtime).
For both your approach, AND the last step of my approach #2, update the token_count in batches of 5000 rows (e.g. set rowcount to 5000, and loop the updates, adding where token_count IS NULL to the query).
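The batching idea in that last point can be sketched as a loop over primary-key ranges, each committed on its own (shown with SQLite from Python; on PostgreSQL each UPDATE ... WHERE myid > x AND myid <= x + n would run in its own transaction, and the count would come from the regexp expression rather than a custom function):

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.create_function("count_tokens", 1, lambda s: len(re.findall(r"\w+", s)))
conn.execute("CREATE TABLE my_table (myid INTEGER PRIMARY KEY, "
             "longtext TEXT, token_count INTEGER)")
conn.executemany("INSERT INTO my_table (longtext) VALUES (?)",
                 [("A test . ; )",), ("one two three",), ("just-one",)] * 4)

BATCH = 5
last_id = 0
while True:
    # One key range per iteration, committed on its own, so a huge table is
    # never updated in one long transaction. (Stopping on an empty batch
    # assumes ids have no gap wider than BATCH.)
    cur = conn.execute(
        "UPDATE my_table SET token_count = count_tokens(longtext) "
        "WHERE myid > ? AND myid <= ?", (last_id, last_id + BATCH))
    conn.commit()
    if cur.rowcount == 0:
        break
    last_id += BATCH

print(conn.execute("SELECT myid, token_count FROM my_table LIMIT 3").fetchall())
# → [(1, 2), (2, 3), (3, 2)]
```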

How can I do access control via an SQL table?

I'm trying to create an access control system.
Here's a stripped down example of what the table I'm trying to control access to looks like:
things table:
id group_id name
1 1 thing 1
2 1 thing 2
3 1 thing 3
4 1 thing 4
5 2 thing 5
And the access control table looks like this:
access table:
user_id type object_id access
1 group 1 50
1 thing 1 10
1 thing 2 100
Access can be granted either by specifying the id of the 'thing' directly, or granted for an entire group of things by specifying a group id. In the above example, user 1 has been granted an access level of 50 to group 1, which should apply unless there are any other rules granting more specific access to an individual thing.
I need a query that returns a list of things (ids only is okay) along with the access level for a specific user. So using the example above I'd want something like this for user id 1:
desired result:
thing_id access
1 10
2 100
3 50 (things 3 and 4 have no specific access rule,
4 50 so this '50' is from the group rule)
5 (thing 5 has no rules at all, so although I
still want it in the output, there's no access
level for it)
The closest I can come up with is this:
SELECT *
FROM things
LEFT JOIN access ON
user_id = 1
AND (
(access.type = 'group' AND access.object_id = things.group_id)
OR (access.type = 'thing' AND access.object_id = things.id)
)
But that returns multiple rows, when I only want one for each row in the 'things' table. I'm not sure how to get down to a single row for each 'thing', or how to prioritise 'thing' rules over 'group' rules.
If it helps, the database I'm using is PostgreSQL.
Please feel free to leave a comment if there's any information I've missed out.
Thanks in advance!
I don't know the Postgres SQL dialect, but maybe something like:
select thing.*, coalesce ( ( select access
from access
where userid = 1
and type = 'thing'
and object_id = thing.id
),
( select access
from access
where userid = 1
and type = 'group'
and object_id = thing.group_id
)
)
from things
Incidentally, I don't like the design. I would prefer the access table to be split into two:
thing_access (user_id, thing_id, access)
group_access (user_id, group_id, access)
My query then becomes:
select thing.*, coalesce ( ( select access
from thing_access
where userid = 1
and thing_id = thing.id
),
( select access
from group_access
where userid = 1
and group_id = thing.group_id
)
)
from things
I prefer this because foreign keys can now be used in the access tables.
I just read a paper last night on this. It has some ideas on how to do this. If you can't use the link on the title try using Google Scholar on Limiting Disclosure in Hippocratic Databases.
While there are several good answers, the most efficient would probably be something like this:
SELECT things.id, things.group_id, things.name, max(access)
FROM things
LEFT JOIN access ON
user_id = 1
AND (
(access.type = 'group' AND access.object_id = things.group_id)
OR (access.type = 'thing' AND access.object_id = things.id)
)
group by things.id, things.group_id, things.name
Which simply uses summarization added to your query to get what you're looking for.
Tony:
Not a bad solution, I like it, seems to work. Here's your query after minor tweaking:
SELECT
things.*,
coalesce (
( SELECT access
FROM access
WHERE user_id = 1
AND type = 'thing'
AND object_id = things.id
),
( SELECT access
FROM access
WHERE user_id = 1
AND type = 'group'
AND object_id = things.group_id
)
) AS access
FROM things;
And the results look correct:
id | group_id | name | access
----+----------+---------+--------
1 | 1 | thing 1 | 10
2 | 1 | thing 2 | 100
3 | 1 | thing 3 | 50
4 | 1 | thing 4 | 50
5 | 2 | thing 5 |
I do completely take the point about it not being an ideal schema. However, I am stuck with it to some extent.
Josef:
Your solution is very similar to the stuff I was playing with, and my instincts (such as they are) tell me that it should be possible to do it that way. Unfortunately it doesn't produce completely correct results:
id | group_id | name | max
----+----------+---------+-----
1 | 1 | thing 1 | 50
2 | 1 | thing 2 | 100
3 | 1 | thing 3 | 50
4 | 1 | thing 4 | 50
5 | 2 | thing 5 |
The access level for 'thing 1' has taken the higher 'group' access value, rather than the more specific 'thing' access value of 10, which is what I'm after. I don't think there's a way to fix that within a GROUP BY, but if anyone has any suggestions I'm more than happy to be proven incorrect on that point.
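For what it's worth, it can be done within a GROUP BY: instead of one MAX over all matching rules, aggregate the 'thing' and 'group' rules separately and let COALESCE prefer the more specific one. A sketch against the sample data (SQLite from Python; the same SQL works in PostgreSQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE things (id INTEGER, group_id INTEGER, name TEXT);
INSERT INTO things VALUES (1,1,'thing 1'),(2,1,'thing 2'),(3,1,'thing 3'),
                          (4,1,'thing 4'),(5,2,'thing 5');
CREATE TABLE access (user_id INTEGER, type TEXT, object_id INTEGER, access INTEGER);
INSERT INTO access VALUES (1,'group',1,50),(1,'thing',1,10),(1,'thing',2,100);
""")
rows = conn.execute("""
    SELECT things.id,
           -- prefer the most specific rule: a 'thing' rule wins over a
           -- 'group' rule regardless of which access value is larger
           coalesce(max(CASE WHEN access.type = 'thing' THEN access.access END),
                    max(CASE WHEN access.type = 'group' THEN access.access END)) AS access
    FROM things
    LEFT JOIN access ON access.user_id = 1
       AND ((access.type = 'group' AND access.object_id = things.group_id)
         OR (access.type = 'thing' AND access.object_id = things.id))
    GROUP BY things.id
    ORDER BY things.id
""").fetchall()
print(rows)  # → [(1, 10), (2, 100), (3, 50), (4, 50), (5, None)]
```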