SQL - get from table and join with same table

I have two tables, ChatRoom and ChatRoomMap. I want to get a list of the chatrooms a user belongs to, along with all the other users in each chatroom.
-- this maps users to chatrooms, listing which user is in which room
CREATE TABLE ChatRoomMap
(
user_id bigint NOT NULL,
chatroom_id text NOT NULL,
CONSTRAINT uniq UNIQUE (user_id, chatroom_id)
)
// sample values
==========================
| user_id | chatroom_id |
| 1 | 7 |
| 1 | blue |
| 7 | red |
==========================
And
CREATE TABLE ChatRoom
(
id text NOT NULL,
admin bigint,
name text,
created timestamp without time zone NOT NULL DEFAULT now(),
CONSTRAINT uniqid UNIQUE (id)
)
// sample values
======================================================
| id | admin | name | timestamp |
| blue | 7 | blue room | now() |
| red | 2 | red | now() |
| 7 | 11 | mine | now() |
======================================================
To get a list of rooms a user is in, I can do:
SELECT DISTINCT ON (id) id, user_id, name, admin
FROM ChatRoomMap, ChatRoom WHERE ChatRoomMap.user_id = $1 AND ChatRoomMap.chatroom_id = ChatRoom.id
This will get me a distinct list of chat rooms a user is in.
I would like to get the distinct list of rooms along with all the users in each room, concatenated into a separate column. How can this be done?
Example result:
=======================================================
| user_id | chatroom_id | name | admin | other_users |
| 10 | 7 | One | 1 | 1, 2, 3, 8 |
| 10 | 4 | AAA | 10 | 7, 11, 15 |
=======================================================

First up, use proper joins: the explicit join syntax was introduced in the SQL-92 standard, the major vendors had implemented it by the early 2000s, and it is the only standard way to express an outer join.
Try this:
SELECT DISTINCT id, crm2.user_id, name, admin
FROM ChatRoomMap crm1
JOIN ChatRoom ON crm1.chatroom_id = ChatRoom.id
LEFT JOIN ChatRoomMap crm2 ON crm2.chatroom_id = crm1.chatroom_id
AND crm2.user_id != crm1.user_id -- only other users
WHERE crm1.user_id = $1
The LEFT JOIN is needed so that a room with no other users is still listed (with a null for the other user's id).
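The query above returns one row per other user. To get the single concatenated other_users column the question asks for, the joined rows can be grouped per room and aggregated. Here is a minimal sketch, run through Python's sqlite3 module so it is self-contained. It assumes SQLite's group_concat as a stand-in for PostgreSQL's string_agg; the sample rows and the rooms_with_other_users helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ChatRoom (id TEXT NOT NULL UNIQUE, admin INTEGER, name TEXT);
CREATE TABLE ChatRoomMap (
    user_id INTEGER NOT NULL,
    chatroom_id TEXT NOT NULL,
    UNIQUE (user_id, chatroom_id)
);
-- Hypothetical sample data: user 10 shares room 'blue' with users 1 and 2
-- and sits alone in room 'red'.
INSERT INTO ChatRoom (id, admin, name) VALUES
    ('blue', 7, 'blue room'), ('red', 2, 'red');
INSERT INTO ChatRoomMap (user_id, chatroom_id) VALUES
    (10, 'blue'), (1, 'blue'), (2, 'blue'), (10, 'red');
""")

# Same joins as the answer, plus GROUP BY so each room collapses to one row.
# group_concat is SQLite-specific; it yields NULL when there are no other users.
QUERY = """
SELECT crm1.user_id, ChatRoom.id AS chatroom_id, ChatRoom.name, ChatRoom.admin,
       group_concat(crm2.user_id, ', ') AS other_users
FROM ChatRoomMap crm1
JOIN ChatRoom ON crm1.chatroom_id = ChatRoom.id
LEFT JOIN ChatRoomMap crm2 ON crm2.chatroom_id = crm1.chatroom_id
    AND crm2.user_id != crm1.user_id
WHERE crm1.user_id = ?
GROUP BY crm1.user_id, ChatRoom.id, ChatRoom.name, ChatRoom.admin
ORDER BY ChatRoom.id
"""

def rooms_with_other_users(user_id):
    """Return one row per room the user is in, with the other members joined."""
    return conn.execute(QUERY, (user_id,)).fetchall()

for row in rooms_with_other_users(10):
    print(row)
```

In PostgreSQL the aggregate would be string_agg(crm2.user_id::text, ', ') instead. The order of ids inside the concatenated string is not guaranteed unless you ask for one.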

Related

SQL - Get unique values by key selected by condition

I want to clean a dataset because there are repeated keys that should not be there. Although the key is repeated, other fields do change. On repetition, I want to keep those entries whose country field is not null. Let's see it with a simplified example:
| email | country |
| 1#x.com | null |
| 1#x.com | PT |
| 2#x.com | SP |
| 2#x.com | PT |
| 3#x.com | null |
| 3#x.com | null |
| 4#x.com | UK |
| 5#x.com | null |
Email acts as key, and country is the field which I want to filter by. On email repetition:
Retrieve the entry whose country is not null (case 1)
If there are several entries whose country is not null, retrieve one of them, the first occurrence for simplicity (case 2)
If all the entries' country is null, again, retrieve only one of them (case 3)
If the entry key is not repeated, just retrieve it no matter what its country is (case 4 and 5)
The expected output should be:
| email | country |
| 1#x.com | PT |
| 2#x.com | SP |
| 3#x.com | null |
| 4#x.com | UK |
| 5#x.com | null |
I have thought of doing a UNION or some type of JOIN to achieve this. One possibility could be querying:
SELECT
...
FROM (
SELECT *
FROM `myproject.mydataset.mytable`
WHERE country IS NOT NULL
) AS a
...
and then match it with the full table to add those values which are missing, but I am not able to imagine the way since my experience with SQL is limited.
Also, I have read about the COALESCE function and I think it could be helpful for the task.
Consider below approach
select *
from `myproject.mydataset.mytable`
where true
qualify row_number() over(partition by email order by country nulls last) = 1
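To see what the ranking does, here is a small sketch of the same idea against the sample data, run through Python's sqlite3 module. SQLite has no QUALIFY clause, so the row_number() filter is moved into a subquery, and "country IS NULL" is used in the ORDER BY as a portable way of putting nulls last. The table name mytable is a stand-in for the BigQuery table, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (email TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO mytable (email, country) VALUES (?, ?)",
    [
        ("1#x.com", None), ("1#x.com", "PT"),
        ("2#x.com", "SP"), ("2#x.com", "PT"),
        ("3#x.com", None), ("3#x.com", None),
        ("4#x.com", "UK"), ("5#x.com", None),
    ],
)

# Rank rows within each email so that non-null countries come first,
# then keep only the top-ranked row per email.
DEDUP = """
SELECT email, country
FROM (
    SELECT email, country,
           row_number() OVER (
               PARTITION BY email
               ORDER BY country IS NULL, country
           ) AS rn
    FROM mytable
)
WHERE rn = 1
ORDER BY email
"""

deduped = conn.execute(DEDUP).fetchall()
for row in deduped:
    print(row)
```

Note that ordering by country picks the alphabetically first non-null value, so 2#x.com comes back with PT rather than SP. The question accepts any one of the non-null rows, so that still satisfies case 2.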

SQL design for notification of new registered users

I'm having great difficulty formulating the SQL for a notification module that fires when a new user registers.
I have a Notification table in which I configure the notifications to be sent. Examples:
Send a notification when a man with blue eyes registers;
Send a notification when a woman registers;
Send a notification when a blue-eyed, brown-haired woman who works at company Foo registers;
These rules show that many combinations are possible (so the table columns are optional).
Some details:
The table columns are defined as integers because they are FKs. I did not include the referenced tables because their structure is unnecessary here; the SQL only needs to relate User and Notification;
The Date field stores the registration date of the user and the creation date of the notification, so I can filter to notify only about users who registered after the notification was set up;
Table Structure
User:
+------------+----------+------+-----+---------+------------+
| Field | Type | Null | Key | Default | Extra |
+------------+----------+------+-----+---------+------------+
| Id | int(11) | NO | PRI | | auto_incre |
| Gender | int(11) | YES | | | |
| HairColor | int(11) | YES | | | |
| EyeColor | int(11) | YES | | | |
| Company | int(11) | YES | | | |
| Date | datetime | NO | | | |
| ... | | | | | |
+------------+----------+------+-----+---------+------------+
Notification:
+------------+----------+------+-----+---------+------------+
| Field | Type | Null | Key | Default | Extra |
+------------+----------+------+-----+---------+------------+
| Id | int(11) | NO | PRI | | auto_incre |
| Gender | int(11) | YES | | | |
| HairColor | int(11) | YES | | | |
| EyeColor | int(11) | YES | | | |
| Company | int(11) | YES | | | |
| Date | datetime | NO | | | |
+------------+----------+------+-----+---------+------------+
Initial idea
The initial idea I had was doing a select for each possibility and joining via union:
-- Selects new users by gender notification
SELECT *
FROM Notification
inner join User on (
User.Date >= Notification.Date and
Notification.Gender = User.Gender and
Notification.HairColor is null and
Notification.EyeColor is null and
Notification.Company is null
)
union all
-- Selects new users by gender and hair color notification
SELECT *
FROM Notification
inner join User on (
User.Date >= Notification.Date and
Notification.Gender = User.Gender and
Notification.HairColor = User.HairColor and
Notification.EyeColor is null and
Notification.Company is null
)
-- ... and so on, doing a select for each option, resulting in 16 selects (4 columns: gender, hair color, eye color and company)
My question is:
Is there an easier way to write SQL that covers all the notification possibilities?
With this structure of 4 columns we already have 16 selects. My real structure will have more columns, which makes this approach unfeasible to maintain.
Is there a better storage structure for the data that would make this functionality easier to implement?
SELECT *
FROM Notification
inner join User on (
User.Date >= Notification.Date and
(Notification.Gender is null or Notification.Gender = User.Gender) and
(Notification.HairColor is null or Notification.HairColor = User.HairColor) and
(Notification.EyeColor is null or Notification.EyeColor = User.EyeColor) and
(Notification.Company is null or Notification.Company = User.Company)
)
This way you get every user paired with each stored notification whose rules they match.
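As a quick check of the "null means don't care" join, here is a minimal sketch using Python's sqlite3 module. The integer codes (Gender 1 = man, 2 = woman; EyeColor 1 = blue), the ids, and the dates are made up for illustration, and dates are stored as ISO text so they compare correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User (Id INTEGER PRIMARY KEY, Gender INT, HairColor INT,
                   EyeColor INT, Company INT, Date TEXT NOT NULL);
CREATE TABLE Notification (Id INTEGER PRIMARY KEY, Gender INT, HairColor INT,
                           EyeColor INT, Company INT, Date TEXT NOT NULL);
-- Hypothetical codes: Gender 1 = man, 2 = woman; EyeColor 1 = blue.
INSERT INTO Notification (Id, Gender, HairColor, EyeColor, Company, Date) VALUES
    (1, 1, NULL, 1, NULL, '2020-01-01'),
    (2, 2, NULL, NULL, NULL, '2020-01-01');
INSERT INTO User (Id, Gender, HairColor, EyeColor, Company, Date) VALUES
    (100, 1, 3, 1, 9, '2020-02-01'),
    (101, 2, 3, 2, 9, '2020-02-01'),
    (102, 1, 3, 2, 9, '2020-02-01'),
    (103, 2, 3, 1, 9, '2019-12-01');
""")

# A NULL column in Notification matches any user value; a non-NULL column
# must equal the user's value. One query covers every combination of rules.
MATCHES = """
SELECT Notification.Id, User.Id
FROM Notification
INNER JOIN User ON (
    User.Date >= Notification.Date AND
    (Notification.Gender IS NULL OR Notification.Gender = User.Gender) AND
    (Notification.HairColor IS NULL OR Notification.HairColor = User.HairColor) AND
    (Notification.EyeColor IS NULL OR Notification.EyeColor = User.EyeColor) AND
    (Notification.Company IS NULL OR Notification.Company = User.Company)
)
ORDER BY Notification.Id, User.Id
"""

matches = conn.execute(MATCHES).fetchall()
print(matches)  # [(1, 100), (2, 101)]
```

User 102 (a man without blue eyes) matches neither rule, and user 103 is skipped because she registered before the notification was created.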
This is the way I would implement this user registration / notification functionality:
Three tables: Users, Notif_type, Notif_queue.
A trigger on insert on table Users which calls a stored procedure SendNotification(user_id).
The stored proc will hold the logic, which you can change over time without having to modify the schema/data. The logic will be:
to select the type of notification (from Notif_type) the new user should receive based on your rules;
to insert a row in Notif_queue which holds a FK to user_id and notif_type_id, so that the functionality notifying the user is completely de-coupled from the notification rules.
Why can't you just use the one table "user" and add an extra field/flag called [Notified], so that every time you want to send notifications you just refer to the flag?
I don't see the need for the notification table.

Postgres Array Prefix Matching

I have an array search in Postgres that matches at least one tag, like this:
SELECT * FROM users WHERE tags && ARRAY['fun'];
| id | tags |
| 1 | [fun,day] |
| 2 | [fun,sun] |
Is it possible to match on prefixes? Something like:
SELECT * FROM users WHERE tags LIKE 'f%';
| id | tags |
| 1 | [fun,day] |
| 2 | [fun,sun] |
| 3 | [far] |
| 4 | [fin] |
Try this:
create table users (id serial primary key, tags text[]);
insert into users (tags)
values
('{"fun", "day"}'),
('{"fun", "sun"}'),
('{"test"}'),
('{"fin"}');
select *
from users
where exists (select * from unnest(tags) as arr where arr like 'f%')
Here's a working example that should get you more or less what you're after. Note that I am not claiming that this approach will scale...
create table users (
id serial primary key,
tags text[] not null
);
insert into users (tags) values
('{"aaaa","bbbb","cccc"}'::text[]),
('{"badc","dddd","eeee"}'::text[]),
('{"gggg","ffbb","attt"}'::text[]);
select *
from (select id,unnest(tags) arr from users) u
where u.arr like 'a%';
id | arr
----+------
1 | aaaa
3 | attt

obtaining unique/distinct values from multiple unassociated columns

I have a table in a postgresql-9.1.x database which is defined as follows:
# \d cms
Table "public.cms"
Column | Type | Modifiers
-------------+-----------------------------+--------------------------------------------------
id | integer | not null default nextval('cms_id_seq'::regclass)
last_update | timestamp without time zone | not null default now()
system | text | not null
owner | text | not null
action | text | not null
notes | text
Here's a sample of the data in the table:
 id  |        last_update         |     system      |   owner   |               action                |      notes
-----+----------------------------+-----------------+-----------+-------------------------------------+-----------------
 584 | 2012-05-04 14:20:53.487282 | linux32-test5   | rfell     | replaced MoBo/CPU                   |
  34 | 2011-03-21 17:37:44.301984 | linux-gputest13 | apeyrovan | System deployed with production GPU |
 636 | 2012-05-23 12:51:39.313209 | mac64-cvs11     | kbhatt    | replaced HD                         |
 211 | 2011-09-12 16:58:16.166901 | linux64-test12  | rfell     | HD swap                             | drive too small
What I'd like to do is craft a SQL query that returns only the unique/distinct values from the system and owner columns (and filling in NULLs if the number of values in one column's results is less than the other column's results), while ignoring the association between them. So something like this:
system | owner
-----------------+------------------
linux32-test5 | apeyrovan
linux-gputest13 | kbhatt
linux64-test12 | rfell
mac64-cvs11 |
The only way that I can figure out to get this data is with two separate SQL queries:
SELECT system FROM cms GROUP BY system;
SELECT owner FROM cms GROUP BY owner;
Far be it from me to inquire why you would want to do such a thing. The following query does this by joining on a calculated column produced by the row_number() function:
select ts.system, town.owner
from (select system, row_number() over (order by system) as seqnum
      from (select distinct system
            from cms
           ) s
     ) ts full outer join
     (select owner, row_number() over (order by owner) as seqnum
      from (select distinct owner
            from cms
           ) o
     ) town
     on ts.seqnum = town.seqnum
The full outer join makes sure that the longer of the two lists is returned in full.
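If the pairing can happen in application code instead, the same shape of result can be had by running the two DISTINCT queries from the question and zipping them, padding the shorter list with nulls. This is a client-side alternative, not the row_number() join itself. A minimal sketch with Python's sqlite3 and itertools.zip_longest, using only the system and owner columns from the sample rows:

```python
import sqlite3
from itertools import zip_longest

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cms (system TEXT NOT NULL, owner TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO cms (system, owner) VALUES (?, ?)",
    [
        ("linux32-test5", "rfell"),
        ("linux-gputest13", "apeyrovan"),
        ("mac64-cvs11", "kbhatt"),
        ("linux64-test12", "rfell"),
    ],
)

# Two independent sorted lists of distinct values.
systems = [r[0] for r in conn.execute("SELECT DISTINCT system FROM cms ORDER BY system")]
owners = [r[0] for r in conn.execute("SELECT DISTINCT owner FROM cms ORDER BY owner")]

# Pair them positionally; zip_longest fills the shorter list with None,
# which plays the role of the NULLs produced by the full outer join.
pairs = list(zip_longest(systems, owners))
for system, owner in pairs:
    print(system, owner)
```

The sort order here follows SQLite's default binary collation, so it may differ from what a PostgreSQL locale would produce.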

How to sum values when joining tables?

<hyperbole>Whoever answers this question can claim credit for solving the world's most challenging SQL query, according to yours truly.</hyperbole>
Working with 3 tables: users, badges, awards.
Relationships: user has many awards; award belongs to user; badge has many awards; award belongs to badge. So badge_id and user_id are foreign keys in the awards table.
The business logic at work here is that every time a user wins a badge, he/she receives it as an award. A user can be awarded the same badge multiple times. Each badge is assigned a designated point value (point_value is a field in the badges table). For example, BadgeA can be worth 500 Points, BadgeB 1000 Points, and so on. As further example, let's say UserX won BadgeA 10 times and BadgeB 5 times. BadgeA being worth 500 Points, and BadgeB being worth 1000 Points, UserX has accumulated a total of 10,000 Points ((10 x 500) + (5 x 1000)).
The end game here is to return a list of top 50 users who have accumulated the most badge points.
Can you do it?
My sample tables are:
user:
+-------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------+
| id | int(11) | YES | | NULL | |
| name | varchar(200) | YES | | NULL | |
+-------+--------------+------+-----+---------+-------+
badge:
+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| id | int(11) | YES | | NULL | |
| score | int(11) | YES | | NULL | |
+-------+---------+------+-----+---------+-------+
award:
+----------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------+---------+------+-----+---------+-------+
| id | int(11) | YES | | NULL | |
| user_id | int(11) | YES | | NULL | |
| badge_id | int(11) | YES | | NULL | |
+----------+---------+------+-----+---------+-------+
Thus the query is:
SELECT user.name, SUM(score)
FROM badge JOIN award ON badge.id = award.badge_id
JOIN user ON user.id = award.user_id
GROUP BY user.name
ORDER BY 2 DESC
LIMIT 50
No, that's not the world's most challenging query. Something simple like this should do it:
select u.id, u.name, sum(b.point_value) as Points
from users u
inner join awards a on a.user_id = u.id
inner join badges b on b.id = a.badge_id
group by u.id, u.name
order by Points desc
limit 50
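To check the arithmetic from the question (UserX with 10 x BadgeA and 5 x BadgeB should total 10,000 points), here is a minimal sketch using Python's sqlite3 module. The table and column names follow the question's description (users, badges, awards, point_value); the second user, UserY, is made up so the ordering has something to sort.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE badges (id INTEGER PRIMARY KEY, point_value INTEGER);
CREATE TABLE awards (id INTEGER PRIMARY KEY, user_id INTEGER, badge_id INTEGER);
INSERT INTO users (id, name) VALUES (1, 'UserX'), (2, 'UserY');
INSERT INTO badges (id, point_value) VALUES (1, 500), (2, 1000);
""")

# UserX wins BadgeA (id 1) 10 times and BadgeB (id 2) 5 times;
# the hypothetical UserY wins BadgeB once.
awards = [(1, 1)] * 10 + [(1, 2)] * 5 + [(2, 2)]
conn.executemany("INSERT INTO awards (user_id, badge_id) VALUES (?, ?)", awards)

# Each award row contributes its badge's point_value; summing per user
# and sorting descending gives the leaderboard.
TOP_USERS = """
SELECT u.id, u.name, SUM(b.point_value) AS points
FROM users u
INNER JOIN awards a ON a.user_id = u.id
INNER JOIN badges b ON b.id = a.badge_id
GROUP BY u.id, u.name
ORDER BY points DESC
LIMIT 50
"""

leaderboard = conn.execute(TOP_USERS).fetchall()
print(leaderboard)  # [(1, 'UserX', 10000), (2, 'UserY', 1000)]
```

Because of the inner joins, users with no awards do not appear at all; a LEFT JOIN from users would be needed to list them with a null total.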