SQL Server - Get Distinct IDs from 2 columns and store in one column table - sql

I have the following table
user_one user_two
20151844 2016000
20151844 2017000
2018000 20151844
20151844 20151025
20151036 20151844
Generated by the following query
select * from [dbo].[Contact] C
where C.user_one=20151844 or C.user_two=20151844
I want to get the following result, excluding the current user id 20151844:
contact_Ids
2016000
2017000
2018000
20151025
20151036
What is the best optimized way to accomplish this, knowing that I want to join the IDs to get the contact name from the User table?
Here are my tables:
Contact
user_one (int FK User.user_id), user_two (int FK User.user_id), status, action_user (int)
User
user_id (int PK), name , ...

Another option: IIF
select iif(C.user_one=20151844,C.user_two,C.user_one) as contact_IDs
from [dbo].[Contact] C
where C.user_one=20151844 or C.user_two=20151844
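If you also need the contact name, the same IIF expression can be joined back to the User table. A minimal sketch along the same lines, assuming the Contact/User schema from the question:
-- reuse the IIF expression to pick the "other" user, then join it to [User] for the name
select iif(C.user_one = 20151844, C.user_two, C.user_one) as contact_Id,
u.[name]
from [dbo].[Contact] C
inner join [User] u
on u.[user_id] = iif(C.user_one = 20151844, C.user_two, C.user_one)
where C.user_one = 20151844 or C.user_two = 20151844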

Use UNION and INNER JOIN:
SELECT c.[contact_Ids],
u.[name]
FROM (SELECT [user_one] [contact_Ids]
FROM [Contact]
WHERE [user_one] <> 20151844
AND [user_two] = 20151844
UNION
SELECT [user_two] [contact_Ids]
FROM [Contact]
WHERE [user_two] <> 20151844
AND [user_one] = 20151844) c
INNER JOIN [User] u
ON u.[user_id] = c.[contact_Ids]
ORDER BY c.[contact_Ids];

Use APPLY:
select tt.contact_Ids
from [dbo].[Contact] t cross apply (
values (t.user_one), (t.user_two)
) tt (contact_Ids)
where t.user_one = 20151844 or t.user_two = 20151844
group by tt.contact_Ids
having count(*) = 1;

Set operations UNION and EXCEPT work best.
;with ids as (
select user_one id from contact where user_two = 20151844
union
select user_two from contact where user_one = 20151844
except
select 20151844
)
select u.*
from [user] u
inner join ids on u.user_id = ids.id

Related

How To use Where instead of Group by?

I wrote a query that gives me this output:
(This is just a sample; obviously the output table contains approximately 300,000 rows.)
And this is my query:
proc sql;
create Table Output as
select ID_User, Division_ID, sum(conta) as Tot_Items, max(Counts) as Max_Item
from (select c.ID_User , c.Div_ID as Division_ID, ro.code as Mat, count(*) as Counts
from Ods.R_Ordini o
inner join DMC.Cust_Dupl c
on User_ID = ID_User
inner join ods.R_Nlines ro
on ro.Orders_Id = o.Id_Orders AND RO.SERVICE = 0
inner join ods.R_Mat m
on ro.Mat_Id = Id_Mat and flag = 0
group by
ID_User,
C.Division_ID,
Ro.Code
Having Counts > 1
)
group by
Id_User,
Division_ID
Order by
Tot_Item DESC
;
quit;
So, what I want is to re-write this query, but instead of the GROUP BY I want to use the WHERE condition; (WHERE=(DIVISION_ID=3)) is the condition.
I tried several attempts: with some I got errors, and with others I did get an output, but the output was not like the original one.
Any help would be much appreciated, thank you.
The SAS data set option (where=(<where-expression>)) can only be coded adjacent to a data set name, so the option would have to be applied to the data set containing the column div_id that is the basis for the computed column division_id. That would be table alias c:
DMC.Cust_Dupl(where=(div_id=3)) as c
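In context, a sketch of just the FROM/JOIN portion with that option in place (everything else in the query stays the same):
from Ods.R_Ordini o
/* where= filters DMC.Cust_Dupl before it is joined, which is where div_id lives */
inner join DMC.Cust_Dupl(where=(div_id=3)) as c
on User_ID = ID_User
inner join ods.R_Nlines ro
on ro.Orders_Id = o.Id_Orders AND RO.SERVICE = 0
...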
Or just use a normal SQL where clause
…
)
where division_id=3
group by …
Just use WHERE DIVISION_ID=3 before group by.
select ID_User, Division_ID, sum(conta) as Tot_Items, max(Counts) as Max_Item
from (select c.ID_User, c.Div_ID as Division_ID, ro.code as Mat, count(*) as Counts
from Ods.R_Ordini o
inner join DMC.Cust_Dupl c
on User_ID = ID_User
inner join ods.R_Nlines ro
on ro.Orders_Id = o.Id_Orders AND RO.SERVICE = 0
inner join ods.R_Mat m
on ro.Mat_Id = Id_Mat and flag = 0
WHERE DIVISION_ID=3
group by ID_User, C.Division_ID, Ro.Code
Having Counts > 1
)
group by Id_User, Division_ID
Order by Tot_Item DESC

Unable to convert this legacy SQL into Standard SQL in Google BigQuery

I am not able to validate this legacy SQL as standard BigQuery SQL, as I don't know what else needs to change here (the query fails during validation if I choose standard SQL as the BigQuery dialect):
SELECT
lineitem.*,
proposal_lineitem.*,
porder.*,
company.*,
product.*,
proposal.*,
trafficker.name,
salesperson.name,
rate_card.*
FROM (
SELECT
*
FROM
dfp_data.dfp_order_lineitem
WHERE
DATE(end_datetime) >= DATE(DATE_ADD(CURRENT_TIMESTAMP(), -1, 'YEAR'))
OR end_datetime IS NULL ) lineitem
JOIN (
SELECT
*
FROM
dfp_data.dfp_order) porder
ON
lineitem.order_id = porder.id
LEFT JOIN (
SELECT
*
FROM
adpoint_data.dfp_proposal_lineitem) proposal_lineitem
ON
lineitem.id = proposal_lineitem.dfp_lineitem_id
JOIN (
SELECT
*
FROM
dfp_data.dfp_company) company
ON
porder.advertiser_id = company.id
LEFT JOIN (
SELECT
*
FROM
adpoint_data.dfp_product) product
ON
proposal_lineitem.product_id=product.id
LEFT JOIN (
SELECT
*
FROM
adpoint_data.dfp_proposal) proposal
ON
proposal_lineitem.proposal_id=proposal.id
LEFT JOIN (
SELECT
*
FROM
adpoint_data.dfp_rate_card) rate_card
ON
proposal_lineitem.ratecard_id=rate_card.id
LEFT JOIN (
SELECT
id,
name
FROM
dfp_data.dfp_user) trafficker
ON
porder.trafficker_id =trafficker.id
LEFT JOIN (
SELECT
id,
name
FROM
dfp_data.dfp_user) salesperson
ON
porder. salesperson_id =salesperson.id
Most likely the error you are getting is something like the one below:
Duplicate column names in the result are not supported. Found duplicate(s): name
Legacy SQL adjusts trafficker.name and salesperson.name in your SELECT statement to trafficker_name and salesperson_name respectively, thus effectively eliminating the column-name duplication.
Standard SQL behaves differently and treats both of those columns as being named name, which produces the duplication. To avoid it, you just need to provide aliases as in the example below:
SELECT
lineitem.*,
proposal_lineitem.*,
porder.*,
company.*,
product.*,
proposal.*,
trafficker.name AS trafficker_name,
salesperson.name AS salesperson_name,
rate_card.*
FROM ( ...
You can easily check the above using the simplified/dummy queries below:
#legacySQL
SELECT
porder.*,
trafficker.name,
salesperson.name
FROM (
SELECT 1 order_id, 'abc' order_name, 1 trafficker_id, 2 salesperson_id
) porder
LEFT JOIN (SELECT 1 id, 'trafficker' name) trafficker
ON porder.trafficker_id =trafficker.id
LEFT JOIN (SELECT 2 id, 'salesperson' name ) salesperson
ON porder. salesperson_id =salesperson.id
and
#standardSQL
SELECT
porder.*,
trafficker.name AS trafficker_name,
salesperson.name AS salesperson_name
FROM (
SELECT 1 order_id, 'abc' order_name, 1 trafficker_id, 2 salesperson_id
) porder
LEFT JOIN (SELECT 1 id, 'trafficker' name) trafficker
ON porder.trafficker_id =trafficker.id
LEFT JOIN (SELECT 2 id, 'salesperson' name ) salesperson
ON porder. salesperson_id =salesperson.id
Note: if you have more duplicate names, you need to alias all of them too.
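For instance, extending the dummy queries above: if both sides of a join expose an id column, each duplicate needs its own alias as well (the table and column names here are purely illustrative):
#standardSQL
SELECT
porder.id AS order_id, -- alias every duplicate explicitly
lineitem.id AS lineitem_id,
lineitem.name AS lineitem_name,
porder.name AS order_name
FROM (SELECT 1 id, 'line' name, 10 order_id) lineitem
JOIN (SELECT 10 id, 'abc' name) porder
ON lineitem.order_id = porder.id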

How to construct SQL query to select chat participants

I have 3 tables: Users(id, fullname), Chats(id, name) and Chats_Participants(chatId, userId). I need to select all the chats in which a user with a specified id participates. For example:
Chats:
1. 1, 'Test'
Users:
1. 1, 'Test user'
2. 2, 'Test user2'
Chat_Participants:
1. 1(chatId), 1(userId)
2. 1(chatId), 2(userId)
As a result, I need something like this:
1(chatId) 'Test'(chatName) participants(array of users in chat)
First I wrote this:
select chats.*, json_agg(users) as participants
from chats
inner join chats_participants c2 on chats.id = c2."chatId"
inner join users on c2."userId" = users.id
where users.id = $userId
group by chats.id;
but this query selects only one participant
You can try array_agg in PostgreSQL. The users table is not joined here; the user id from the chat_participants table is printed instead.
SELECT chats.chat_id,chats.name, array_agg(userid order by chats.chat_id)
FROM chats
INNER JOIN chat_participants ON chats.chat_id = chat_participants.chat_id
GROUP BY chats.chat_id,chats.name
You seem to want:
SELECT c.chat_id, c.name,
array_agg(cp.userid order by cp.userid) as users
FROM chats c INNER JOIN
chat_participants cp
ON c.chat_id = cp.chat_id
GROUP BY c.chat_id, c.name
HAVING SUM( (cp.userid = $userid)::int ) > 0;
This returns only chats that have your user of interest.
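If you also want to keep the original json_agg(users) output, i.e. every participant of each chat the given user is in rather than just their ids, a variant of the same idea is to move the user filter out of the join, for example into an EXISTS, so the aggregation still sees all participants. A sketch using the table and column names from the question:
SELECT c.id, c.name, json_agg(u) AS participants
FROM chats c
INNER JOIN chats_participants cp ON cp."chatId" = c.id
INNER JOIN users u ON u.id = cp."userId"
-- keep only chats the given user belongs to, without dropping the other participants
WHERE EXISTS (
SELECT 1
FROM chats_participants cp2
WHERE cp2."chatId" = c.id AND cp2."userId" = $userId
)
GROUP BY c.id, c.name;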

How can I get the latest record published by each singer

I have 3 tables
table 1 : songs
-songname varchar
-singerlabel varchar
-date date
-category varchar
table 2 : singer
-singerlabel varchar
-singer# varchar
table 3 : singerNote
-singer# varchar
-firstname varchar
-lastname varchar
table 1 is connected to table 2 using singerlabel.
table 2 is connected to table 3 using singer#.
With this query:
select singerlabel, max(date) maxdate
from songs
group by singerlabel
you get the max date of each singerlabel, and then join to the other 3 tables:
select sn.firstname, sn.lastname, songs.songname
from (
select singerlabel, max(date) maxdate
from songs
group by singerlabel
) s inner join singer
on singer.singerlabel = s.singerlabel
inner join singernote sn
on sn.singer# = singer.singer#
inner join songs
on songs.singerlabel = s.singerlabel and songs.date = s.maxdate
If your RDBMS supports window functions, this can be achieved with ROW_NUMBER():
SELECT x.*
FROM (
SELECT
si.*, sn.firstname, sn.lastname, so.songname, so.date, so.category,
ROW_NUMBER() OVER(PARTITION BY so.singerlabel ORDER BY so.date DESC) rn
FROM singer si
INNER JOIN singerNote sn ON sn.singer# = si.singer#
INNER JOIN songs so ON so.singerlabel = si.singerlabel
) x WHERE x.rn = 1
Without window function, you can use a correlated subquery with a NOT EXISTS condition to ensure that you are joining with the most recent song :
SELECT si.*, sn.firstname, sn.lastname, so.songname, so.date, so.category
FROM
singer si
INNER JOIN singerNote sn
ON sn.singer# = si.singer#
INNER JOIN songs so
ON so.singerlabel = si.singerlabel
AND NOT EXISTS (
SELECT 1
FROM songs so1
WHERE so1.singerlabel = si.singerlabel AND so1.date > so.date
)

Redshift subquery not accepted

I'm trying to execute the following query against my dataset stored in Redshift:
SELECT v_users.user_id AS user_id,
v_users.first_name AS first_name,
v_users.email AS email,
COALESCE(v_users.country, accounts.region) AS country_code,
profiles.language AS language,
v_users.mobilenum AS mobile_num,
NULL as mobile_verification_date,
COALESCE(v_users.registration_date, accounts.date_created) AS activation_date,
EXISTS (SELECT 1
FROM cds.user_session_201612 AS users_session,
cds.access_logs_summary_201612 AS access_logs_summary,
views_legacy AS views_legacy
WHERE users_session.userid = v_users.user_id
OR access_logs_summary.userid = v_users.user_id
OR views_legacy.user_id = v_users.user_id) AS has_viewed,
NULL as preferred_genre_1,
NULL as preferred_genre_2,
NULL as preferred_genre_3
FROM users AS v_users,
users_metadata AS v_users_metadata,
account.account AS accounts,
account.profile AS profiles
WHERE accounts.id = v_users.user_id
AND profiles.id = v_users.user_id
AND v_users_metadata.user_id = v_users.user_id
The problem which I get is the following:
ERROR: This type of correlated subquery pattern is not supported due to internal error
This is caused by the subquery, but how can I solve it? Can you provide me with some suggestions?
Redshift doesn't allow correlated subqueries in the SELECT clause, which I don't think is much of a limitation, as all the examples I've encountered can be expressed otherwise.
I've refactored the subquery as a CTE and used a left join with an IS NOT NULL check to mark users who have or have not viewed something.
This particular query below may not work, but any solution will likely take the following form:
WITH has_viewed AS (
SELECT
u.user_id
FROM users u
LEFT JOIN cds.user_session_201612 AS users_session
ON users_session.userid = u.user_id
LEFT JOIN cds.access_logs_summary_201612 AS access_logs_summary
ON access_logs_summary.userid = u.user_id
LEFT JOIN views_legacy
ON views_legacy.user_id = u.user_id
WHERE users_session.userid IS NOT NULL
OR access_logs_summary.userid IS NOT NULL
OR views_legacy.user_id IS NOT NULL
GROUP BY 1
)
SELECT
v_users.user_id AS user_id
, v_users.first_name AS first_name
, v_users.email AS email
, COALESCE(v_users.country, accounts.region) AS country_code
, profiles.language AS language
, v_users.mobilenum AS mobile_num
, NULL as mobile_verification_date
, COALESCE(v_users.registration_date, accounts.date_created) AS activation_date
, has_viewed.user_id IS NOT NULL AS has_viewed
, NULL as preferred_genre_1
, NULL as preferred_genre_2
, NULL as preferred_genre_3
FROM users AS v_users
JOIN users_metadata AS v_users_metadata
ON v_users_metadata.user_id = v_users.user_id
JOIN account.account AS accounts
ON accounts.id = v_users.user_id
JOIN account.profile AS profiles ON profiles.id = v_users.user_id
LEFT JOIN has_viewed
ON has_viewed.user_id = v_users.user_id
I have tried all possible combinations:
a SELECT subquery doesn't work, and
a CTE (Common Table Expression) as shown by Haleemur Ali doesn't work either.
What I needed was an alternative to GROUP BY, as Redshift didn't accept the GROUP BY approach here.
The solution I found is the OVER keyword.
So as a replacement for GROUP BY I used OVER with PARTITION BY, which goes like this:
SELECT *
FROM (
SELECT *,ROW_NUMBER()
OVER (PARTITION BY **VARIOUS COLUMNS** ORDER BY datetime DESC) rn
FROM schema.tableName
) derivedTable
WHERE derivedTable.rn = 1;
Maybe OVER might help you out. I am not sure though.