I have a PostgreSQL table containing users, a PostgreSQL table containing documents, and a table mapping the users to the documents they've read. Like so:
Table: users
------------
| oid |
| username |
| ... |
------------
Table: documents
------------
| oid |
| title |
| ... |
------------
Table: users_documents
-----------------
| oid |
| user_oid |
| document_oid |
-----------------
Whenever a user reads a document, a record is added to users_documents with their user id and the document id. This is all working fine.
What I want to do though is select a random unread document for a given user. I feel like I should be able to do this with a fairly simple JOIN, but I can't get my head around exactly what the query should look like.
Can someone help please? Thanks.
For most lookups you can go straight to the users_documents table. For instance, a query for the documents read by user_oid = ? would be
SELECT document_oid
FROM users_documents
WHERE user_oid = ?
However, you want the documents for which no such record exists.
You can do an outer join and keep only the rows where the join found no match, i.e. where the users_documents side is NULL:
SELECT documents.oid
FROM users_documents
RIGHT OUTER JOIN documents
  ON users_documents.document_oid = documents.oid
 AND users_documents.user_oid = ?
WHERE users_documents.document_oid IS NULL
ORDER BY random()
LIMIT 1;
It is important that the "users_documents.user_oid = ?" condition sits in the join's ON clause rather than in the WHERE clause: in the WHERE clause it would discard the NULL-extended rows and turn the outer join back into an inner join. The ORDER BY random() LIMIT 1 then picks one unread document at random.
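An equivalent formulation uses NOT EXISTS, which avoids the outer-join-plus-NULL-check idiom entirely (a sketch, using the same ? placeholder for the user id):
SELECT d.oid
FROM documents d
WHERE NOT EXISTS (
    SELECT 1
    FROM users_documents ud
    WHERE ud.document_oid = d.oid
      AND ud.user_oid = ?
)
ORDER BY random()
LIMIT 1;
Both versions scan all documents; ORDER BY random() is fine to start with, but for very large tables you may eventually want a cheaper way to pick a random row.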
Table data sample:
--------------------------
| key | domain | value   |
--------------------------
| a   | en     | English |
| a   | de     | Germany |
--------------------------
Query which returns the result I need:
select * from
(
  select t1.key,
    (select value from TABLE where t1.key = key AND domain = 'en') en,
    (select value from TABLE where t1.key = key AND domain = 'de') de
  from TABLE t1
) as t2
Data returned from query:
---------------------------
| key | en      | de      |
---------------------------
| a   | English | Germany |
---------------------------
I do not want to list all available domains with:
(select value from TABLE where t1.key = key AND domain = '*') *
Is it possible to make this query more dynamic in Postgres: automatically add all domain columns that exist in the table?
For more than a few domains use crosstab() to make the query shorter and faster.
PostgreSQL Crosstab Query
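A minimal crosstab() sketch for the two domains above, assuming the table is actually named translations and all three columns are text (crosstab() is provided by the additional module tablefunc):
CREATE EXTENSION IF NOT EXISTS tablefunc;

SELECT *
FROM crosstab(
   $$SELECT key, domain, value FROM translations ORDER BY 1, 2$$,
   $$VALUES ('en'), ('de')$$
) AS ct (key text, en text, de text);
Any key/domain combination missing from the source simply comes back as NULL.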
A completely dynamic query, returning a dynamic number of columns based on data in your table is not possible, because SQL is strictly typed. Whatever you try, you'll end up needing two steps. Step 1: generate the query, step 2: execute it.
Execute a dynamic crosstab query
Or you can return something more flexible instead of table columns, like an array or a document type such as json. Details:
Dynamic alternative to pivot with CASE and GROUP BY
Refactor a PL/pgSQL function to return the output of various SELECT queries
The architecture of my DB involves records in a Tags table. Each record in the Tags table has a Name string and a foreign key referencing the PrimaryID of a record in a separate Worker table.
Records in the Worker table have tags. Every time we create a Tag for a worker, we add a new row to the Tags table with the inputted Name and a foreign key to the worker's PrimaryID. Therefore, we can have multiple Tags with different names for the same worker.
Worker Table
ID | Worker Name | Other Information
__________________________________________________________________
1 | Worker1 | ..........................
2 | Worker2 | ..........................
3 | Worker3 | ..........................
4 | Worker4 | ..........................
Tags Table
ID |Foreign Key(WorkerID) | Name
__________________________________________________________________
1 | 1 | foo
2 | 1 | bar
3 | 2 | foo
5 | 3 | foo
6 | 3 | bar
7 | 3 | baz
8 | 1 | qux
My goal is to filter WorkerIDs based on an inputted table of strings. I want to get the set of WorkerIDs that have all of the inputted tags. For example, if the inputted strings are foo and bar, I would like to return WorkerIDs 1 and 3 (worker 3's extra baz tag doesn't matter). Any idea how to do this? I was thinking something to do with GROUP BY or joining tables. I am new to SQL and can't seem to figure it out.
This is a variant of relational division. Here's one attempt:
select workerid
from tags
where name in ('foo', 'bar')
group by workerid
having count(distinct name) = 2
You can use the following:
select WorkerID
from tags
where name in ('foo', 'bar')
group by WorkerID
having count(*) = 2
and this will retrieve your desired result. (Note that count(*) = 2 assumes a worker can't have the same tag name twice; if duplicates are possible, use count(distinct name) as in the previous answer.)
Regards.
This article is an excellent resource on the subject.
While the answer from @Lennart works fine in Query Analyzer, you're not going to be able to duplicate that in a stored procedure or from a consuming application without opening yourself up to SQL injection attacks. To extend the solution, you'll want to look into passing your list of tags as a table-valued parameter, since T-SQL doesn't support arrays.
Essentially, you create a custom type in the database that mimics a table with only one column:
CREATE TYPE list_of_tags AS TABLE (t varchar(50) NOT NULL PRIMARY KEY)
Then you populate an instance of that type in memory:
DECLARE @mylist list_of_tags;
INSERT @mylist (t) VALUES ('foo'), ('bar');
Then you can select against that as a join using the GROUP BY/HAVING described in the previous answers:
select workerid
from tags inner join @mylist on name = t
group by workerid
having count(distinct name) = (select count(*) from @mylist)
*Note: I'm not at a computer where I can test the query. If someone sees a flaw in my query, please let me know and I'll happily correct it and thank them.
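In PostgreSQL the usual equivalent is an array parameter rather than a table type; a minimal sketch, with $1 standing for a hypothetical text[] parameter such as ARRAY['foo','bar']:
SELECT workerid
FROM tags
WHERE name = ANY ($1)
GROUP BY workerid
HAVING count(DISTINCT name) = cardinality($1);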
I have two tables like:
ID | TRAFFIC
fd56756 | 4398
645effa | 567899
894fac6 | 611900
894fac6 | 567899
and
USER | ID | TRAFFIC
andrew | fd56756 | 0
peter | 645effa | 0
john | 894fac6 | 0
I need to compute SUM("TRAFFIC") per ID from the first table and write the result into the TRAFFIC column of the second table where the first table's ID equals the second table's ID. IDs in the first table are not unique and can be duplicated.
How can I do this?
Table names are taken from your later comment. Chances are, you are reporting table and column names incorrectly.
UPDATE users u
SET "TRAFFIC" = sub.sum_traffic
FROM (
SELECT "ID", sum("TRAFFIC") AS sum_traffic
FROM stats.traffic
GROUP BY 1
) sub
WHERE u."ID" = sub."ID";
Aside: It's unwise to use mixed-case identifiers in Postgres. Use legal, lower-case identifiers, which do not need to be double-quoted, to make your life easier. Start by reading the manual here.
Something like this?
UPDATE users t2
SET traffic = t1.sum_traffic
FROM (SELECT id, sum(traffic) AS sum_traffic
      FROM stats.traffic GROUP BY id) t1
WHERE t1.id = t2.id;
I'm working on a user management system and I need to copy a user to a "backup" table before the user is deleted. How can I copy the user's id into the userid column of the backup table while the id column on both tables stays unique?
users
+----+------+-------------+
| id | lang | email       |
+----+------+-------------+
| 20 | en   | test@ya.hoo |
+----+------+-------------+
delusers
+----+--------+------+-------------+
| id | userid | lang | email |
+----+--------+------+-------------+
| 1  | 20     | en   | test@ya.hoo |
+----+--------+------+-------------+
First, if a user cannot be deleted twice, delusers.id could simply be the PRIMARY KEY of the delusers table, and you could use the id of the user itself as its value. There is no need for both id and userid in the delusers table.
Then, you can just INSERT on delusers and DELETE on users (inside the same transaction, of course):
BEGIN;
INSERT INTO delusers(id,lang,email)
SELECT id,lang,email FROM users WHERE id = 20;
DELETE FROM users WHERE id = 20;
COMMIT;
You could also do it all in a single command (using a CTE):
WITH deleted AS (
DELETE FROM users WHERE id = 20 RETURNING id, lang, email
)
INSERT INTO delusers(id,lang,email)
SELECT id,lang,email FROM deleted;
The last approach is good for several reasons:
You don't need to explicitly open a transaction if this is all you are going to do (though opening one would do no harm);
You can delete lots of users at the same time, as sketched below;
PostgreSQL does not need to look the users up in the users table more than once (once for the SELECT and again for the DELETE). The same could be achieved with cursors, though.
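For example, backing up and deleting several users in one statement (the id list is illustrative):
WITH deleted AS (
    DELETE FROM users
    WHERE id IN (20, 21, 22)
    RETURNING id, lang, email
)
INSERT INTO delusers (id, lang, email)
SELECT id, lang, email FROM deleted;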
I have the following query:
SELECT `masters_tp`.*, `masters_cp`.`cp` as cp, `masters_cp`.`punti` as punti
FROM (`masters_tp`)
LEFT JOIN `masters_cp` ON `masters_cp`.`nickname` = `masters_tp`.`nickname`
WHERE `masters_tp`.`stake` = 'report_A'
AND `masters_cp`.`stake` = 'report_A'
ORDER BY `masters_tp`.`tp` DESC, `masters_cp`.`punti` DESC
LIMIT 400;
Is there something wrong with this query that could affect the server memory?
Here is the output of EXPLAIN
+----+-------------+------------+------+---------------+------+---------+------+-------+-----------------------------------------------+
| id | select_type | table      | type | possible_keys | key  | key_len | ref  | rows  | Extra                                         |
+----+-------------+------------+------+---------------+------+---------+------+-------+-----------------------------------------------+
|  1 | SIMPLE      | masters_cp | ALL  | NULL          | NULL | NULL    | NULL | 8943  | Using where; Using temporary; Using filesort  |
|  1 | SIMPLE      | masters_tp | ALL  | NULL          | NULL | NULL    | NULL | 12693 | Using where                                   |
+----+-------------+------------+------+---------------+------+---------+------+-------+-----------------------------------------------+
Run the same query prefixed with EXPLAIN and add the output to your question - this will show what indexes you are using and the number of rows being analyzed.
You can see from your EXPLAIN output that no indexes are being used, and it's having to look at thousands of rows to get your result. Try adding an index on the columns used to perform the join, e.g. nickname and stake:
ALTER TABLE masters_tp ADD INDEX(nickname),ADD INDEX(stake);
ALTER TABLE masters_cp ADD INDEX(nickname),ADD INDEX(stake);
(I've assumed the columns might have duplicated values; if not, use UNIQUE rather than INDEX.) See the MySQL manual for more information.
Replace the "masters_tp.* " bit by explicitly naming only the fields from that table you actually need. Even if you need them all, name them all.
There's actually no reason to do a LEFT JOIN here. Your WHERE condition on masters_cp.stake filters out the NULL-extended rows anyway, so the LEFT JOIN behaves exactly like an INNER JOIN. Try this:
SELECT
`masters_tp`.*,
`masters_cp`.`cp` as cp,
`masters_cp`.`punti` as punti
FROM
`masters_tp`
INNER JOIN `masters_cp` ON
`masters_tp`.`stake` = `masters_cp`.`stake`
and `masters_tp`.`nickname` = `masters_cp`.`nickname`
WHERE
`masters_tp`.`stake` = 'report_A'
ORDER BY
`masters_tp`.`tp` DESC,
`masters_cp`.`punti` DESC
LIMIT 400;
Inner joins tend to be faster than left joins. The query can limit the number of rows that have to be joined using the predicates (i.e. the WHERE clause). This means the database is potentially handling far fewer rows, which obviously speeds things up.
Additionally, make sure you have a non-clustered index on stake and nickname (in that order).
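For instance (the index names are illustrative):
ALTER TABLE masters_tp ADD INDEX idx_stake_nickname (stake, nickname);
ALTER TABLE masters_cp ADD INDEX idx_stake_nickname (stake, nickname);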
It is a simple query. I think everything is OK with it. You can try adding indexes on the 'stake' fields or lowering the limit.