How to get an empty result table with FULL JOIN? - sql

I will show you the content of some tables.
Bolek=> SELECT id, description from "TOMBInput";
id | description
----+-------------------
1 | Virtual Input 111
2 | Virtual Input 112
3 | Virtual Input 113
4 | Virtual Input 114
(4 rows)
Bolek=> SELECT id, setup_id FROM "TRBTOMBConnection";
id | setup_id
----+----------
1 | 1
2 | 1
3 | 1
4 | 1
(4 rows)
Bolek=> SELECT id, setname FROM "Setup";
id | setname
----+-------------
1 | SETUP_00001
(1 row)
Bolek=> SELECT id, setup_id FROM "Run";
id | setup_id
----+----------
1 | 1
(1 row)
My query [1] is
SELECT
"TOMBInput".id AS tombinput_id,
"TRBTOMBConnection".id AS trbtombconnection_id,
"Setup".id AS setup_id,
"Run".id AS run_id
FROM "TOMBInput"
INNER JOIN "TRBTOMBConnection" ON "TOMBInput".id = "TRBTOMBConnection".tombinput_id
FULL JOIN "Setup" ON "TRBTOMBConnection".id = "Setup".id
FULL JOIN "Run" ON "Setup".id = "Run".id AND "Run".id = 1;
Result table
tombinput_id | trbtombconnection_id | setup_id | run_id
--------------+----------------------+----------+--------
1 | 1 | 1 | 1
2 | 2 | |
3 | 3 | |
4 | 4 | |
(4 rows)
The question is
I would like to have table like
tombinput_id | trbtombconnection_id | setup_id | run_id
--------------+----------------------+----------+--------
1 | 1 | 1 | 1
2 | 2 | 1 | 1
3 | 3 | 1 | 1
4 | 4 | 1 | 1
(4 rows)
because "TRBTOMBConnection" has got 4 rows with setup_id==1
and "Run" has got setup_id==1.
What is more, now when I change last line (in my query [1])
FULL JOIN "Run" ON "Setup".id = "Run".id AND "Run".id = 2;
(in "Run" table we haven`t got id==2) the result of query is
tombinput_id | trbtombconnection_id | setup_id | run_id
--------------+----------------------+----------+--------
1 | 1 | 1 |
2 | 2 | |
3 | 3 | |
4 | 4 | |
| | | 1
(5 rows)
And it`s ok, because I used FULL JOIN.
But in this case when I run my query [1]
I would like to have an empty result table because "Run" hasn't got id==2 and it hasn't got any sense to show table because everything is starting from Run.
How to change my query [1]?

You are confusing IDs:
SELECT
"TOMBInput".id AS tombinput_id,
"TRBTOMBConnection".id AS trbtombconnection_id,
"Setup".id AS setup_id,
"Run".id AS run_id
FROM "TOMBInput"
INNER JOIN "TRBTOMBConnection" ON "TOMBInput".id = "TRBTOMBConnection".tombinput_id
INNER JOIN "Setup" ON "TRBTOMBConnection".setup_id = "Setup".id
INNER JOIN "Run" ON "Setup".id = "Run".setup_id AND "Run".id = 1;
I see no reason for full outer joins here.

Related

Postgres - Unique values for id column using CTE, Joins alongside GROUP BY

I have a table referrals:
id | user_id_owner | firstname | is_active | user_type | referred_at
----+---------------+-----------+-----------+-----------+-------------
3 | 2 | c | t | agent | 3
5 | 3 | e | f | customer | 5
4 | 1 | d | t | agent | 4
2 | 1 | b | f | agent | 2
1 | 1 | a | t | agent | 1
And another table activations
id | user_id_owner | referral_id | amount_earned | activated_at | app_id
----+---------------+-------------+---------------+--------------+--------
2 | 2 | 3 | 3.0 | 3 | a
4 | 1 | 1 | 6.0 | 5 | b
5 | 4 | 4 | 3.0 | 6 | c
1 | 1 | 2 | 2.0 | 2 | b
3 | 1 | 2 | 5.0 | 4 | b
6 | 1 | 2 | 7.0 | 8 | a
I am trying to generate another table from the two tables that has only unique values for referrals.id and returns as one of the columns the count for each apps as best_selling_app_count.
Here is the query I ran:
with agents
as
(select
referrals.id,
referral_id,
amount_earned,
referred_at,
activated_at,
activations.app_id
from referrals
left outer join activations
on (referrals.id = activations.referral_id)
where referrals.user_id_owner = 1),
distinct_referrals_by_id
as
(select
id,
count(referral_id) as activations_count,
sum(coalesce(amount_earned, 0)) as amount_earned,
referred_at,
max(activated_at) as last_activated_at
from
agents
group by id, referred_at),
distinct_referrals_by_app_id
as
(select id, app_id as best_selling_app,
count(app_id) as best_selling_app_count
from agents
group by id, app_id )
select *, dense_rank() over (order by best_selling_app_count desc) best_selling_app_rank
from distinct_referrals_by_id
inner join distinct_referrals_by_app_id
on (distinct_referrals_by_id.id = distinct_referrals_by_app_id.id);
Here is the result I got:
id | activations_count | amount_earned | referred_at | last_activated_at | id | best_selling_app | best_selling_app_count | best_selling_app_rank
----+-------------------+---------------+-------------+-------------------+----+------------------+------------------------+-----------------------
2 | 3 | 14.0 | 2 | 8 | 2 | b | 2 | 1
1 | 1 | 6.0 | 1 | 5 | 1 | b | 1 | 2
2 | 3 | 14.0 | 2 | 8 | 2 | a | 1 | 2
4 | 1 | 3.0 | 4 | 6 | 4 | c | 1 | 2
The problem with this result is that the table has a duplicate id of 2. I only need unique values for the id column.
I tried a workaround by harnessing distinct that gave desired result but I fear the query results may not be reliable and consistent.
Here is the workaround query:
with agents
as
(select
referrals.id,
referral_id,
amount_earned,
referred_at,
activated_at,
activations.app_id
from referrals
left outer join activations
on (referrals.id = activations.referral_id)
where referrals.user_id_owner = 1),
distinct_referrals_by_id
as
(select
id,
count(referral_id) as activations_count,
sum(coalesce(amount_earned, 0)) as amount_earned,
referred_at,
max(activated_at) as last_activated_at
from
agents
group by id, referred_at),
distinct_referrals_by_app_id
as
(select
distinct on(id), app_id as best_selling_app,
count(app_id) as best_selling_app_count
from agents
group by id, app_id
order by id, best_selling_app_count desc)
select *, dense_rank() over (order by best_selling_app_count desc) best_selling_app_rank
from distinct_referrals_by_id
inner join distinct_referrals_by_app_id
on (distinct_referrals_by_id.id = distinct_referrals_by_app_id.id);
I need a recommendation on how best to achieve this.
I am trying to generate another table from the two tables that has only unique values for referrals.id and returns as one of the columns the count for each apps as best_selling_app_count.
Your question is really complicated with a very complicated SQL query. However, the above is what looks like the actual question. If so, you can use:
select r.*,
a.app_id as most_common_app_id,
a.cnt as most_common_app_id_count
from referrals r left join
(select distinct on (a.referral_id) a.referral_id, a.app_id, count(*) as cnt
from activations a
group by a.referral_id, a.app_id
order by a.referral_id, count(*) desc
) a
on a.referral_id = r.id;
You have not explained the other columns that are in your result set.

Check if data for update is same as before in SQL Server

I have a table Table1:
ID | RefID | Answer | Points |
----+-------+---------+--------+
1 | 1 | 1 | 5 |
2 | 1 | 2 | 0 |
3 | 1 | 3 | 3 |
4 | 2 | 1 | 4 |
I have a result set in temp table Temp1 with same structure and have update Table1 only if for refID answer and points have changed, otherwise there should be deletion for this record.
I tried:
update table1
set table1.answer = temp1.answer,
table1.points = temp1.points
from table1
join temp1 on table1.refid = temp1.refid
where table1.answer != temp1.answer or table1.points != temp1.points
Here is a fiddle http://sqlfiddle.com/#!18/60424/1/1
However this is not working and don't know how to add the delete condition.
Desired result should be if tables not the same ex. (second row answer 2 points3):
ID | RefID | Answer | Points |
----+-------+---------+--------+
1 | 1 | 1 | 5 |
2 | 1 | 2 | 3 |
3 | 1 | 3 | 3 |
4 | 2 | 1 | 4 |
if they are same all records with refID are deleted.
Explanation when temp1 has this data
ID | RefID | Answer | Points |
----+-------+---------+--------+
12 | 1 | 1 | 5 |
13 | 1 | 2 | 0 |
14 | 1 | 3 | 3 |
EDIT: adding another id column questionid solved the update by adding this also in join.
Table structure is now:
ID | RefID | Qid |Answer | Points |
----+-------+------+-------+--------+
1 | 1 | 10 | 1 | 5 |
2 | 1 | 11 | 2 | 0 |
3 | 1 | 12 | 3 | 3 |
4 | 2 | 11 | 1 | 4 |
SQL for update is: (fiddle http://sqlfiddle.com/#!18/00f87/1/1) :
update table1
set table1.answer = temp1.answer,
table1.points = temp1.points
from table1
join temp1 on table1.refid = temp1.refid and table1.qid = temp1.qid
where table1.answer != temp1.answer or table1.points != temp1.points;
SELECT ID, refid, answer, points
FROM table1
How can I make the deletion case, if data is same ?
You need to add one more condition in the join to exactly match the column.Try this one.
update table1
set table1.answer=temp1.answer,
table1.points=temp1.points
from
table1 join temp1 on table1.refid=temp1.refid and **table1.ID=temp1.ID**
where table1.answer!=temp1.answer or table1.points!=temp1.points
I would first do the delete, and only then the update.
The reason for this is that once you've deleted all the records where the three columns are the same, your update statement becomes simpler - you only need the join, and no where clause:
DELETE t1
FROM table1 AS t1
JOIN temp1 ON t1.refid = temp1.refid
AND t1.qid = temp1.qid
AND t1.answer=temp1.answer
AND t1.points=temp1.points
UPDATE t1
SET answer = temp1.answer,
points = temp1.points
FROM table1 AS t1
JOIN temp1 ON t1.refid=temp1.refid
AND t1.qid = temp1.qid
I think from what i understood that you need to use id instead of refid or both if id is unique

Efficient ROW_NUMBER increment when column matches value

I'm trying to find an efficient way to derive the column Expected below from only Id and State. What I want is for the number Expected to increase each time State is 0 (ordered by Id).
+----+-------+----------+
| Id | State | Expected |
+----+-------+----------+
| 1 | 0 | 1 |
| 2 | 1 | 1 |
| 3 | 0 | 2 |
| 4 | 1 | 2 |
| 5 | 4 | 2 |
| 6 | 2 | 2 |
| 7 | 3 | 2 |
| 8 | 0 | 3 |
| 9 | 5 | 3 |
| 10 | 3 | 3 |
| 11 | 1 | 3 |
+----+-------+----------+
I have managed to accomplish this with the following SQL, but the execution time is very poor when the data set is large:
WITH Groups AS
(
SELECT Id, ROW_NUMBER() OVER (ORDER BY Id) AS GroupId FROM tblState WHERE State=0
)
SELECT S.Id, S.[State], S.Expected, G.GroupId FROM tblState S
OUTER APPLY (SELECT TOP 1 GroupId FROM Groups WHERE Groups.Id <= S.Id ORDER BY Id DESC) G
Is there a simpler and more efficient way to produce this result? (In SQL Server 2012 or later)
Just use a cumulative sum:
select s.*,
sum(case when state = 0 then 1 else 0 end) over (order by id) as expected
from tblState s;
Other method uses subquery :
select *,
(select count(*)
from table t1
where t1.id < t.id and state = 0
) as expected
from table t;

select records where condition is true in one record

I need to select cid, project, and owner from rows in the table below where one or more rows for a cid/project combination has an owner of 1.
cid | project | phase | task | owner
-----------------------------------
1 | 1 | 1 | 1 | 1
1 | 1 | 1 | 2 | 2
1 | 1 | 1 | 3 | 2
2 | 1 | 1 | 1 | 1
2 | 1 | 1 | 2 | 1
3 | 1 | 1 | 3 | 2
My output table should look like the this:
cid | project | phase | task | owner
-----------------------------------
1 | 1 | 1 | 1 | 1
1 | 1 | 1 | 2 | 2
1 | 1 | 1 | 3 | 2
2 | 1 | 1 | 1 | 1
2 | 1 | 1 | 2 | 1
The below query is what I came up with. It does seem to test okay, but my confidence is low. Is the query an effective way to solve the problem?
select task1.cid, task1.project, task1.owner
from
(select cid, project, owner from table) task1
right join
(select distinct cid, project, owner from table where owner = 1) task2
on task1.cid = task2.cid and task1.project = task2.project
(I did not remove the phase and task columns from the sample output so that it would be easier to compare.)
You can simply use a IN clause
select cid, project, owner
from table
where cid in (select distinct id from table where owner = 1)
or a inner join with a subquery
select a.cid, a.project, a.owner
from table a
INNER JOIN ( select distinct cid , project
from table where owner = 1
) t on t.cid = a.cid and t.project = a.project

Left Join on Associative Table

I have three tables
Prospect -- holds prospect information
id
name
projectID
Sample data for Prospect
id | name | projectID
1 | p1 | 1
2 | p2 | 1
3 | p3 | 1
4 | p4 | 2
5 | p5 | 2
6 | p6 | 2
Conjoint -- holds conjoint information
id
title
projectID
Sample data
id | title | projectID
1 | color | 1
2 | size | 1
3 | qual | 1
4 | color | 2
5 | price | 2
6 | weight | 2
There is an associative table that holds the conjoint values for the prospects:
ConjointProspect
id
prospectID
conjointID
value
Sample Data
id | prospectID | conjointID | value
1 | 1 | 1 | 20
2 | 1 | 2 | 30
3 | 1 | 3 | 50
4 | 2 | 1 | 10
5 | 2 | 3 | 40
There are one or more prospects and one or more conjoints in their respective tables. A prospect may or may not have a value for each conjoint.
I'd like to have an SQL statement that will extract all conjoint values for each prospect of a given project, displaying NULL where there is no value for a value that is not present in the ConjointProspect table for a given conjoint and prospect.
Something along the lines of this for projectID = 1
prospectID | conjoint ID | value
1 | 1 | 20
1 | 2 | 30
1 | 3 | 50
2 | 1 | 10
2 | 2 | NULL
2 | 3 | 40
3 | 1 | NULL
3 | 2 | NULL
3 | 3 | NULL
I've tried using an inner join on the prospect and conjoint tables and then a left join on the ConjointProspect, but somewhere I'm getting a cartesian products for prospect/conjoint pairs that don't make any sense (to me)
SELECT p.id, p.name, c.id, c.title, cp.value
FROM prospect p
INNER JOIN conjoint c ON p.projectID = c.projectid
LEFT JOIN conjointProspect cp ON cp.prospectID = p.id
WHERE p.projectID = 2
ORDER BY p.id, c.id
prospectID | conjoint ID | value
1 | 1 | 20
1 | 2 | 30
1 | 3 | 50
1 | 1 | 20
1 | 2 | 30
1 | 3 | 50
1 | 1 | 20
1 | 2 | 30
1 | 3 | 50
2 | 1 | 10
2 | 2 | 40
2 | 1 | 10
2 | 2 | 40
2 | 1 | 10
2 | 2 | 40
3 | 1 | NULL
3 | 2 | NULL
3 | 3 | NULL
Guidance is very much appreciated!!
Then this will work for you... Prejoin a Cartesian against all prospects and elements within that project via a select as your first FROM table. Then, left join to the conjoinprospect. You can obviously change / eliminate certain columns from result, but at least all is there, in the join you want with exact results you are expecting...
SELECT
PJ.*,
CJP.Value
FROM
( SELECT
P.ID ProspectID,
P.Name,
P.ProjectID,
CJ.Title,
CJ.ID ConJointID
FROM
Prospect P,
ConJoint CJ
where
P.ProjectID = 1
AND P.ProjectID = CJ.ProjectID
ORDER BY
1, 4
) PJ
LEFT JOIN conjointProspect cjp
ON PJ.ProspectID = cjp.prospectID
AND PJ.ConjointID = cjp.conjointid
ORDER BY
PJ.ProspectID,
PJ.ConJointID
Your cartesian product is a result of joining by project Id - in your sample data there are 3 prospects with a project id of 1 and 3 conjoints with a project id of 1. Joining based on project id should then result in 9 rows of data, which is what you're getting. It looks like you really need to join via the conjointprospects table as that it what holds the mapping between prospects and conjoint.
What if you try something like:
SELECT p.id, p.name, c.id, c.title, cp.value
FROM prospect p
LEFT JOIN conjointProspect cp ON cp.prospectID = p.id
RIGHT JOIN conjoint c ON cp.conjointID = c.id
WHERE p.projectID = 2
ORDER BY p.id, c.id
Not sure if that will work, but it seems like conjointprospects needs to be at the center of your join in order to correctly map prospects to conjoints.