How can I get join results as JSON in a PostgreSQL query?

I have the query below that almost works. It returns 3 rows, one of which should have first_nation populated (the other two should be NULL), but they all get the same data for first_nation. What I need is for the person.id from the outer WHERE to be part of the WHERE in the inner query, but I don't think that's doable. Any help would be appreciated.
Put another way, I'd like the results of the JOIN to be JSON rather than appearing as additional columns.
SELECT person.id,
(
SELECT row_to_json(x)
FROM (
SELECT ref_first_nations_gov.id
FROM ref_first_nations_gov JOIN person ON person.first_nation_id = ref_first_nations_gov.id
WHERE person.application_id = 1 AND person.archived = false
) x
) AS first_nation
FROM person
WHERE application_id = 1 AND archived = false;
EDIT: Sample Data
SELECT id, application_id, first_nation_id FROM person WHERE application_id = 1;
id | application_id | first_nation_id
----+----------------+-----------------
4 | 1 |
1 | 1 |
2 | 1 |
3 | 1 | 1
What the query above gives me:
id | first_nation
----+--------------
4 | {"id":1}
1 | {"id":1}
3 | {"id":1}
What I want
id | first_nation
----+--------------
4 |
1 |
3 | {"id":1}

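I can't test this right now, but I don't think you need a subquery. Joining on first_nation_id ties each person to their own first nation. Note that row_to_json expects a row rather than a single column, so the JSON object is built with json_build_object (PostgreSQL 9.4+) instead.
Try something like this:
SELECT p.id,
CASE WHEN r.id IS NOT NULL THEN json_build_object('id', r.id) END AS first_nation
FROM person p
LEFT JOIN ref_first_nations_gov r ON r.id = p.first_nation_id
WHERE p.application_id = 1 AND p.archived = false;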
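The per-person join can be checked against the question's sample data. This is only a sketch: SQLite (through Python's sqlite3) stands in for PostgreSQL, the JSON text is built in Python rather than by the database, and person 2 is assumed to be archived because it is missing from the question's output.

```python
import json
import sqlite3

# SQLite stands in for PostgreSQL; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ref_first_nations_gov (id INTEGER PRIMARY KEY);
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        application_id INTEGER,
        first_nation_id INTEGER,
        archived INTEGER
    );
    INSERT INTO ref_first_nations_gov (id) VALUES (1);
    -- Assumption: person 2 is archived (it does not appear in the question's output).
    INSERT INTO person (id, application_id, first_nation_id, archived) VALUES
        (4, 1, NULL, 0), (1, 1, NULL, 0), (2, 1, NULL, 1), (3, 1, 1, 0);
""")

# The LEFT JOIN matches each person only to their own first nation.
rows = conn.execute("""
    SELECT p.id, r.id
    FROM person p
    LEFT JOIN ref_first_nations_gov r ON r.id = p.first_nation_id
    WHERE p.application_id = 1 AND p.archived = 0
    ORDER BY p.id
""").fetchall()

# Build the JSON text per person; unmatched persons stay None (NULL).
first_nation_by_person = {
    person_id: (json.dumps({"id": nation_id}) if nation_id is not None else None)
    for person_id, nation_id in rows
}
print(first_nation_by_person)
```

Only person 3 ends up with a JSON value; the other two stay NULL, which is the shape the question asks for.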

Related

Comparing different columns in SQL for each row

After some transformation I have the result of a cross join (of tables a and b) that I want to analyze. The table looks like this:
+-----+------+------+------+------+-----+------+------+------+------+
| id | 10_1 | 10_2 | 11_1 | 11_2 | id | 10_1 | 10_2 | 11_1 | 11_2 |
+-----+------+------+------+------+-----+------+------+------+------+
| 111 | 1 | 0 | 1 | 0 | 222 | 1 | 0 | 1 | 0 |
| 111 | 1 | 0 | 1 | 0 | 333 | 0 | 0 | 0 | 0 |
| 111 | 1 | 0 | 1 | 0 | 444 | 1 | 0 | 1 | 1 |
| 112 | 0 | 1 | 1 | 0 | 222 | 1 | 0 | 1 | 0 |
+-----+------+------+------+------+-----+------+------+------+------+
The ids in the first column are different from the ids in the sixth column.
In a row are always two different IDs that are matched with each other. The other columns always have either 0 or 1 as a value.
I am now trying to find out how many values two IDs have in common on average (meaning both have "1" in 10_1, 10_2, etc.), but I don't really know how to do so.
I was trying something like this as a start:
SELECT SUM(CASE WHEN a.10_1 = 1 AND b.10_1 = 1 then 1 end)
But this would obviously only count how often two ids have 10_1 in common. I could make something like this for example for different columns:
SELECT SUM(CASE WHEN (a.10_1 = 1 AND b.10_1 = 1)
OR (a.10_2 = 1 AND b.10_1 = 1) OR [...] then 1 end)
That would count how often two IDs have at least one thing in common, but it would of course also count pairs that have two or more things in common. Plus, I would also like to know how often two IDs have two things, three things, etc. in common.
One "problem" in my case is also that I have like ~30 columns I want to look at, so I can hardly write down for each case every possible combination.
Does anyone know how I can approach my problem in a better way?
Thanks in advance.
Edit:
A possible result could look like this:
+-----------+---------+
| in_common | count |
+-----------+---------+
| 0 | 100 |
| 1 | 500 |
| 2 | 1500 |
| 3 | 5000 |
| 4 | 3000 |
+-----------+---------+
With the codes as column names, you're going to have to write some code that explicitly references each column name. To keep that to a minimum, you could write those references in a single union statement that normalizes the data, such as:
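-- tbl is a placeholder for your source table
select id, '10_1' as code from tbl where "10_1" = 1
union
select id, '10_2' as code from tbl where "10_2" = 1
union
select id, '11_1' as code from tbl where "11_1" = 1
union
select id, '11_2' as code from tbl where "11_2" = 1;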
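With around 30 columns, the union does not have to be typed by hand. Here is a rough sketch that generates it from a list of column names, using SQLite through Python's sqlite3 as a stand-in database. The table name tbl and the choice of sample rows are assumptions for illustration; the column names and values come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl (id INTEGER PRIMARY KEY, "10_1" INTEGER, "10_2" INTEGER,
                      "11_1" INTEGER, "11_2" INTEGER);
    INSERT INTO tbl VALUES (111, 1, 0, 1, 0), (112, 0, 1, 1, 0), (333, 0, 0, 0, 0);
""")

# Fixed, trusted list of column names; never interpolate user input this way.
codes = ["10_1", "10_2", "11_1", "11_2"]

# One SELECT per column, glued together with UNION ALL.
unpivot_sql = "\nUNION ALL\n".join(
    f"SELECT id, '{code}' AS code FROM tbl WHERE \"{code}\" = 1" for code in codes
)

# Each (id, code) row means "this id has a 1 in that column".
normalized = sorted(conn.execute(unpivot_sql).fetchall())
print(normalized)
```

UNION ALL is enough here because each select reads a different column, so no duplicate (id, code) rows can occur.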
This needs to be modified to include whatever additional columns you need to link up different IDs. For the purpose of this illustration, I assume the following data model
create table p (
id integer not null primary key,
sex character(1) not null,
age integer not null
);
create table t1 (
id integer not null,
code character varying(4) not null,
constraint pk_t1 primary key (id, code)
);
Though your data evidently does not currently resemble this structure, normalizing your data into a form like this would allow you to apply the following solution to summarize your data in the desired form.
select
in_common,
count(*) as count
from (
select
count(*) as in_common
from (
select
a.id as a_id, a.code,
b.id as b_id, b.code
from
(select p.*, t1.code
from p left join t1 on p.id=t1.id
) as a
inner join (select p.*, t1.code
from p left join t1 on p.id=t1.id
) as b on b.sex <> a.sex and b.age between a.age-10 and a.age+10
where
a.id < b.id
and a.code = b.code
) as c
group by
a_id, b_id
) as summ
group by
in_common;
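To see the shape of the result, here is a cut-down sketch of the same idea on a normalized (id, code) table, run in SQLite through Python's sqlite3. It drops the p table and the sex/age conditions from the query above and simply pairs every id with every larger id; the rows are made up from the question's sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER NOT NULL, code TEXT NOT NULL, PRIMARY KEY (id, code));
    INSERT INTO t1 VALUES
        (111, '10_1'), (111, '11_1'),
        (112, '10_2'), (112, '11_1'),
        (222, '10_1'), (222, '11_1'),
        (444, '10_1'), (444, '11_1'), (444, '11_2');
""")

# Inner query: number of shared codes per pair of ids.
# Outer query: how many pairs share 1 code, 2 codes, and so on.
histogram = conn.execute("""
    SELECT in_common, COUNT(*) AS pair_count
    FROM (
        SELECT a.id AS a_id, b.id AS b_id, COUNT(*) AS in_common
        FROM t1 AS a
        JOIN t1 AS b ON a.code = b.code AND a.id < b.id
        GROUP BY a.id, b.id
    ) AS pairs
    GROUP BY in_common
    ORDER BY in_common
""").fetchall()
print(histogram)
```

Pairs with nothing in common never survive the join on code, so an in_common = 0 row would have to be derived separately, for example by subtracting from the total number of pairs.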
The proposed solution requires first to take one step back from the cross-join table, as the identical column names are super annoying. Instead, we take the ids from the two tables and put them in a temporary table. The following query gets the result wanted in the question. It assumes table_a and table_b from the question are the same and called tbl, but this assumption is not needed and tbl can be replaced by table_a and table_b in the two sub-SELECT queries. It looks complicated and uses the JSON trick to flatten the columns, but it works here:
WITH idtable AS (
SELECT a.id as id_1, b.id as id_2 FROM
-- put cross join of table a and table b here
)
SELECT in_common,
count(*)
FROM
(SELECT idtable.*,
sum(CASE
WHEN meltedR.value::text=meltedL.value::text THEN 1
ELSE 0
END) AS in_common
FROM idtable
JOIN
(SELECT tbl.id,
b.*
FROM tbl, -- change here to table_a
json_each(row_to_json(tbl)) b -- and here too
WHERE KEY<>'id' ) meltedL ON (idtable.id_1 = meltedL.id)
JOIN
(SELECT tbl.id,
b.*
FROM tbl, -- change here to table_b
json_each(row_to_json(tbl)) b -- and here too
WHERE KEY<>'id' ) meltedR ON (idtable.id_2 = meltedR.id
AND meltedL.key = meltedR.key)
GROUP BY idtable.id_1,
idtable.id_2) tt
GROUP BY in_common ORDER BY in_common;
The output here looks like this:
in_common | count
-----------+-------
2 | 2
3 | 1
4 | 1
(3 rows)

T-SQL Select Join 3 Tables

I'm currently working on a select query in T-SQL on SQL Server 2012. It's a complex query: I want to build a list from 3 tables. The result should look something like this:
Desired Output:
ProjectId | Title | Manager | Contact | StatusId
----------+-------------+-----------+-----------+-----------
1 | projectX | 1123 | 4453 | 1
2 | projectY | 2245 | 5567 | 1
3 | projectZ | 3335 | 8899 | 1
My 3 Tables:
1) Project: ProjectId, ProjectDataId, MemberVersionId
2) ProjectData: ProjectDataId, Title, StatusId
3) Members: MemberId, MemberVersionId, MemberTypeId, EmployeeId
The tricky part is implementing versioning. Over time the project Members can change, and it should always be possible to return to a previous version. That's why I use MemberVersionId as a foreign key between Project and Members. The tables Project and ProjectData are linked by ProjectDataId.
Hence, 1 Project has 1 ProjectData and 1 Project has N Members.
Some sample data:
Project
ProjectId | ProjectDataId | MemberVersionId |
----------+---------------+-----------------+
1 | 2 | 1 |
2 | 3 | 1 |
3 | 4 | 1 |
ProjectData
ProjectDataId | Title | StatusId
--------------+-------------+-----------
2 | projectX | 1
3 | projectY | 1
4 | projectZ | 1
Members: MemberTypeId 1 = Manager, MemberTypeId 2 = Contact, 3 = Other
MemberId | MemberVersionId | MemberTypeId | EmployeeId |
---------+-----------------+--------------+------------+
1 | 1 | 1 | 1123 |
2 | 1 | 2 | 4453 |
3 | 1 | 3 | 9999 |
4 | 2 | 1 | 2245 |
5 | 2 | 2 | 5567 |
6 | 2 | 3 | 9999 |
7 | 3 | 1 | 3335 |
8 | 3 | 2 | 8899 |
9 | 3 | 3 | 9999 |
My current query looks like this:
SELECT ProjectId, Title, EmployeeId AS Manager, EmployeeId AS Contact, StatusId
FROM [MySchema].[Project] a,
[MySchema].[ProjectData] b,
[MySchema].[Members] c
WHERE a.ProjectDataId = b.ProjectDataId
AND a.MemberVersionId = c.MemberVersionId
Unfortunately this doesn't work yet. Do you know how to solve this issue?
Thanks
Something like this?
SELECT
p.ProjectId,
pd.Title,
mm.EmployeeId AS Manager,
mc.EmployeeId AS Contact,
pd.StatusId
FROM
[MySchema].[Project] p
INNER JOIN [MySchema].[ProjectData] pd ON pd.ProjectDataId = p.ProjectDataId
INNER JOIN [MySchema].[Members] mm ON mm.MemberVersionId = p.MemberVersionId AND mm.MemberTypeId = 1
INNER JOIN [MySchema].[Members] mc ON mc.MemberVersionId = p.MemberVersionId AND mc.MemberTypeId = 2;
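The double join on Members can be tried out on the question's sample data. A small sketch using SQLite through Python's sqlite3 in place of SQL Server, without the schema prefix. One assumption: the projects point at member versions 1, 2 and 3, which is what the desired output implies (the question's Project sample shows 1 for all three).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Project (ProjectId INTEGER, ProjectDataId INTEGER, MemberVersionId INTEGER);
    CREATE TABLE ProjectData (ProjectDataId INTEGER, Title TEXT, StatusId INTEGER);
    CREATE TABLE Members (MemberId INTEGER, MemberVersionId INTEGER,
                          MemberTypeId INTEGER, EmployeeId INTEGER);
    -- Assumption: MemberVersionId 1, 2, 3 per project, as the desired output implies.
    INSERT INTO Project VALUES (1, 2, 1), (2, 3, 2), (3, 4, 3);
    INSERT INTO ProjectData VALUES (2, 'projectX', 1), (3, 'projectY', 1), (4, 'projectZ', 1);
    INSERT INTO Members VALUES
        (1, 1, 1, 1123), (2, 1, 2, 4453), (3, 1, 3, 9999),
        (4, 2, 1, 2245), (5, 2, 2, 5567), (6, 2, 3, 9999),
        (7, 3, 1, 3335), (8, 3, 2, 8899), (9, 3, 3, 9999);
""")

# Members is joined twice: once filtered to managers, once to contacts.
projects = conn.execute("""
    SELECT p.ProjectId, pd.Title, mm.EmployeeId AS Manager,
           mc.EmployeeId AS Contact, pd.StatusId
    FROM Project p
    INNER JOIN ProjectData pd ON pd.ProjectDataId = p.ProjectDataId
    INNER JOIN Members mm ON mm.MemberVersionId = p.MemberVersionId AND mm.MemberTypeId = 1
    INNER JOIN Members mc ON mc.MemberVersionId = p.MemberVersionId AND mc.MemberTypeId = 2
    ORDER BY p.ProjectId
""").fetchall()
for row in projects:
    print(row)
```

Each alias of Members contributes exactly one row per project, so the result stays at one row per project instead of one row per member.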
You can try this:
SELECT ProjectId, Title, C.EmployeeId AS Manager, d.EmployeeId AS Contact, StatusId
FROM [MySchema].[Project] a
INNER JOIN [MySchema].[ProjectData] b ON A.ProjectDataId=B.ProjectDataId
LEFT JOIN (SELECT * FROM [MySchema].[Members] WHERE MemberTypeID=1) c ON a.MemberVersionId=c.MemberVersionId
LEFT JOIN (SELECT * FROM [MySchema].[Members] WHERE MemberTypeID=2) d ON a.MemberVersionId=d.MemberVersionId
You must select members two times, one for the manager and another for contact:
SELECT ProjectId, Title, m.EmployeeId AS Manager, c.EmployeeId AS Contact, StatusId
FROM [MySchema].[Project] a,
[MySchema].[ProjectData] b,
[MySchema].[Members] m,
[MySchema].[Members] c
WHERE a.ProjectDataId = b.ProjectDataId
AND a.MemberVersionId = m.MemberVersionId AND m.MemberTypeId = 1
AND a.MemberVersionId = c.MemberVersionId AND c.MemberTypeId = 2
Try this:
SELECT ProjectId, Title, cmanager.EmployeeId AS Manager, ccon.EmployeeId AS Contact, StatusId
from [MySchema].[ProjectData] b
inner join [MySchema].[Project] a on b.ProjectDataId=a.ProjectDataId
left join [MySchema].[Members] cmanager on cmanager.MemberVersionId = a.MemberVersionId and cmanager.MemberTypeId=1
left join [MySchema].[Members] ccon on ccon.MemberVersionId = a.MemberVersionId and ccon.MemberTypeId=2
The simplest solution to your problem would be to add a field to the Project table. You could either call it LatestMemberVersion (int, holds the currently highest MemberVersionId), which would identify the most up-to-date version of the relationship, or add an even simpler IsLatestMemberVersion (bit, holds 1 if the record is the latest/active one). You can compute both of them using ROW_NUMBER() with an OVER clause.
Then, the query would change to:
SELECT ProjectId, Title, EmployeeId AS Manager, EmployeeId AS Contact, StatusId
FROM [MySchema].[Project] a
INNER JOIN [MySchema].[ProjectData] b ON a.ProjectDataId = b.ProjectDataId
INNER JOIN [MySchema].[Members] c ON a.MemberVersionId = c.MemberVersionId
WHERE
a.[IsLatestMemberVersion] = 1 -- alternative is a.[LatestMemberVersion] = a.[MemberVersionId]
Additionally, there are two more things you can try:
You might want to borrow ideas from data warehousing, namely a combination of Slowly Changing Dimension Type 1 and Type 2.
You can try SQL Server features such as Change Tracking or Change Data Capture. I have no experience with them, though, so it's possible this will lead nowhere.
And one last piece of advice, if you can, never write join conditions into the WHERE clause. It is not readable and can lead to problems when you suddenly change JOIN to LEFT JOIN. Microsoft itself recommends using ON instead of WHERE when applicable.

Need T-SQL query to get multiple choice answer if matches

Example:
Table Question_Answers:
+------+--------+
| q_id | ans_id |
+------+--------+
| 1 | 2 |
| 1 | 4 |
| 2 | 1 |
| 3 | 1 |
| 3 | 2 |
| 3 | 3 |
+------+--------+
User_Submited_Answers:
+------+------------+
| q_id | sub_ans_id |
+------+------------+
| 1 | 2 |
| 1 | 4 |
| 2 | 1 |
| 3 | 1 |
| 3 | 2 |
| 3 | 4 |
+------+------------+
I need a T-SQL query that returns 1 for a question if the submitted rows match the correct answers, and 0 otherwise.
SELECT
t1.q_id,
CASE WHEN COUNT(t2.sub_ans_id) = COUNT(*)
THEN 1
ELSE 0 END AS is_correct
FROM Question_Answers t1
LEFT JOIN User_Submited_Answers t2
ON t1.q_id = t2.q_id AND
t1.ans_id = t2.sub_ans_id
GROUP BY t1.q_id
Try the following code:
select qa.q_id, case when qa.ans_id = usa.sub_ans_id then 1 else 0 end as result
from Question_Answers qa
left join User_Submited_Answers usa
on qa.q_id = usa.q_id and qa.ans_id = usa.sub_ans_id
This should give you expected result for every question.
select q_id, min(Is_Correct)Is_Correct from (
select Q.q_id,case when count(A.sub_ans_id)=count(*) then 1 else 0 end as Is_Correct
from #Q Q left join #A A on Q.q_id=A.q_id and Q.ans_id=A.sub_ans_id
group by Q.q_id
UNION ALL
select A.q_id,case when count(Q.ans_id)=count(*) then 1 else 0 end as Is_Correct
from #Q Q right join #A A on Q.q_id=A.q_id and Q.ans_id=A.sub_ans_id
group by A.q_id ) I group by q_id
MySQL solution:
SELECT tmp.q_id, MIN(c) as correct
FROM (
SELECT qa.q_id, IF(qa.q_id = usa.q_id, 1, 0) as c
FROM question_answers qa
LEFT JOIN user_submited_answers usa
ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id
UNION
SELECT usa.q_id, IF(qa.q_id = usa.q_id, 1, 0) as c
FROM question_answers qa
RIGHT JOIN user_submited_answers usa
ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id
) tmp
GROUP BY tmp.q_id;
Now, step by step explanation:
In order to get the right output we will need to:
extract from question_answers table the answers which were not filled in by the user (in your example: q_id = 3 with ans_id = 3)
extract from user_submited_answers table the wrong answers which were filled in by the user (in your example: q_id = 3 with sub_ans_id = 4)
To do that we can use a full outer join (for mysql left join + right join):
SELECT *
FROM question_answers qa
LEFT JOIN user_submited_answers usa
ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id
UNION
SELECT *
FROM question_answers qa
RIGHT JOIN user_submited_answers usa
ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id;
In the previous query's results, the rows we are looking for (wrong answers) contain NULL values, either in the question_answers columns or in the user_submited_answers columns, depending on the case.
The next step is to mark those rows with 0 (wrong answer) using an IF or CASE statement: IF(qa.q_id = usa.q_id, 1, 0).
To get the final output we need to group by q_id and look for 0 values in the grouped rows. If there is at least one 0, the answer for that question is wrong and it should be marked as that.
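The steps above can be replayed on the question's data. This is a sketch in SQLite through Python's sqlite3, not MySQL or T-SQL: the RIGHT JOIN is written as a second LEFT JOIN with the tables swapped, and CASE replaces IF, so it should also run on older SQLite builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE question_answers (q_id INTEGER, ans_id INTEGER);
    CREATE TABLE user_submited_answers (q_id INTEGER, sub_ans_id INTEGER);
    INSERT INTO question_answers VALUES (1, 2), (1, 4), (2, 1), (3, 1), (3, 2), (3, 3);
    INSERT INTO user_submited_answers VALUES (1, 2), (1, 4), (2, 1), (3, 1), (3, 2), (3, 4);
""")

# Branch 1 flags correct answers the user missed (c = 0).
# Branch 2 flags submitted answers that are not correct (c = 0).
# MIN(c) per question is 1 only if no row in either branch was flagged.
results = conn.execute("""
    SELECT q_id, MIN(c) AS correct
    FROM (
        SELECT qa.q_id AS q_id,
               CASE WHEN usa.q_id IS NULL THEN 0 ELSE 1 END AS c
        FROM question_answers qa
        LEFT JOIN user_submited_answers usa
          ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id
        UNION ALL
        SELECT usa.q_id AS q_id,
               CASE WHEN qa.q_id IS NULL THEN 0 ELSE 1 END AS c
        FROM user_submited_answers usa
        LEFT JOIN question_answers qa
          ON qa.q_id = usa.q_id AND qa.ans_id = usa.sub_ans_id
    ) AS tmp
    GROUP BY q_id
    ORDER BY q_id
""").fetchall()
print(results)
```

Question 3 comes out as 0 because answer 3 was missed and answer 4 was submitted wrongly; either one alone would be enough to fail it.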

SQL left join two tables independently

If I have these tables:
Thing
id | name
---+---------
1 | thing 1
2 | thing 2
3 | thing 3
Photos
id | thing_id | src
---+----------+---------
1 | 1 | thing-i1.jpg
2 | 1 | thing-i2.jpg
3 | 2 | thing2.jpg
Ratings
id | thing_id | rating
---+----------+---------
1 | 1 | 6
2 | 2 | 3
3 | 2 | 4
How can I join them to produce
id | name | rating | photo
---+---------+--------+--------
1 | thing 1 | 6 | NULL
1 | thing 1 | NULL | thing-i1.jpg
1 | thing 1 | NULL | thing-i2.jpg
2 | thing 2 | 3 | NULL
2 | thing 2 | 4 | NULL
2 | thing 2 | NULL | thing2.jpg
3 | thing 3 | NULL | NULL
I.e., left join on each table simultaneously, rather than left joining on one and then the next?
This is the closest I can get:
SELECT Thing.*, Rating.rating, Photo.src
From Thing
Left Join Photo on Thing.id = Photo.thing_id
Left Join Rating on Thing.id = Rating.thing_id
You can get the results you want with a union, which seems the most obvious approach, since each row returns a field from either rating or photo.
Your additional case (a thing with neither) is handled by using left joins instead of inner joins. That produces a duplicate record with NULL, NULL in rating and photo. You could filter it out by moving the lot into a subquery and doing select distinct in the main query, but the more obvious solution is to use union instead of union all, which also filters out duplicates. Easier and more readable.
select
t.id,
t.name,
r.rating,
null as photo
from
Thing t
left join Rating r on r.thing_id = t.id
union
select
t.id,
t.name,
null,
p.src
from
Thing t
left join Photo p on p.thing_id = t.id
order by
id,
photo,
rating
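As a quick check, the union query can be run against the question's data. The sketch below uses SQLite through Python's sqlite3 and the table names Photo and Rating from the queries (the question's table listing calls them Photos and Ratings). Explicit column aliases are added in the first select so that the ORDER BY on the combined result is unambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Thing (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Photo (id INTEGER PRIMARY KEY, thing_id INTEGER, src TEXT);
    CREATE TABLE Rating (id INTEGER PRIMARY KEY, thing_id INTEGER, rating INTEGER);
    INSERT INTO Thing VALUES (1, 'thing 1'), (2, 'thing 2'), (3, 'thing 3');
    INSERT INTO Photo VALUES (1, 1, 'thing-i1.jpg'), (2, 1, 'thing-i2.jpg'), (3, 2, 'thing2.jpg');
    INSERT INTO Rating VALUES (1, 1, 6), (2, 2, 3), (3, 2, 4);
""")

# UNION (not UNION ALL) removes the duplicate all-NULL row for thing 3.
rows = conn.execute("""
    SELECT t.id AS id, t.name AS name, r.rating AS rating, NULL AS photo
    FROM Thing t
    LEFT JOIN Rating r ON r.thing_id = t.id
    UNION
    SELECT t.id, t.name, NULL, p.src
    FROM Thing t
    LEFT JOIN Photo p ON p.thing_id = t.id
    ORDER BY id, photo, rating
""").fetchall()
for row in rows:
    print(row)
```

SQLite sorts NULLs first in ascending order, which happens to reproduce the row order of the desired output; other databases may need an explicit NULLS FIRST or a CASE expression in the ORDER BY.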
Here's what I came up with:
SELECT
Thing.*,
rp.src,
rp.rating
FROM
Thing
LEFT JOIN (
(
SELECT
Photo.src,
Photo.thing_id AS ptid,
Rating.rating,
Rating.thing_id AS rtid
FROM
Photo
LEFT JOIN Rating
ON 1 = 0
)
UNION
(
SELECT
Photo.src,
Photo.thing_id AS ptid,
Rating.rating,
Rating.thing_id AS rtid
FROM
Rating
LEFT JOIN Photo
ON 1 = 0
)
) AS rp
ON Thing.id IN (rp.rtid, rp.ptid)
MySQL has no support for full outer joins so you have to hack around it using a UNION:
Here's the fiddle: http://sqlfiddle.com/#!2/d3d2f/13
SELECT *
FROM (
SELECT Thing.*,
Rating.rating,
NULL AS photo
FROM Thing
LEFT JOIN Rating ON Thing.id = Rating.thing_id
UNION -- plain UNION also drops the duplicate all-NULL row for things with neither rating nor photo
SELECT Thing.*,
NULL,
Photo.src
FROM Thing
LEFT JOIN Photo ON Thing.id = Photo.thing_id
) s
ORDER BY id, photo, rating

SQL return max value from child for each parent row

I have 2 tables - 1 with parent records, 1 with child records. For each parent record, I'm trying to return a single child record with the MAX(SalesPriceEach).
Additionally I'd like to only return a value when there is more than 1 child record.
parent - SalesTransactions table:
+-------------------+---------+
|SalesTransaction_ID| text |
+-------------------+---------+
| 1 | Blah |
| 2 | Blah2 |
| 3 | Blah3 |
+-------------------+---------+
child - SalesTransactionLines table
+--+-------------------+---------+--------------+
|id|SalesTransaction_ID|StockCode|SalesPriceEach|
+--+-------------------+---------+--------------+
| 1| 1 | 123 | 99 |
| 2| 1 | 35 | 50 |
| 3| 2 | 15 | 75 |
+--+-------------------+---------+--------------+
desired results
+-------------------+---------+--------------+
|SalesTransaction_ID|StockCode|SalesPriceEach|
+-------------------+---------+--------------+
| 1 | 123 | 99 |
| 2 | 15 | 75 |
+-------------------+---------+--------------+
I found a very similar question here, and based my query on the answer but am not seeing the results I expect.
WITH max_feature AS (
SELECT c.StockCode,
c.SalesTransaction_ID,
MAX(c.SalesPriceEach) as feature
FROM SalesTransactionLines c
GROUP BY c.StockCode, c.SalesTransaction_ID)
SELECT p.SalesTransaction_ID,
mf.StockCode,
mf.feature
FROM SalesTransactions p
LEFT JOIN max_feature mf ON mf.SalesTransaction_ID = p.SalesTransaction_ID
The results from this query are returning multiple rows for each parent, and not even the highest value first!
select stl.SalesTransaction_ID, stl.StockCode, ss.MaxSalesPriceEach
from SalesTransactionLines stl
inner join
(
select stl2.SalesTransaction_ID, max(stl2.SalesPriceEach) MaxSalesPriceEach
from SalesTransactionLines stl2
group by stl2.SalesTransaction_ID
having count(*) > 1
) ss on (ss.SalesTransaction_ID = stl.SalesTransaction_ID and
ss.MaxSalesPriceEach = stl.SalesPriceEach)
OR, alternatively:
SELECT stl1.*
FROM SalesTransactionLines AS stl1
LEFT OUTER JOIN SalesTransactionLines AS stl2
ON (stl1.SalesTransaction_ID = stl2.SalesTransaction_ID
AND stl1.SalesPriceEach < stl2.SalesPriceEach)
WHERE stl2.SalesPriceEach IS NULL;
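The self-join works as an anti-join: a line is kept only when no other line of the same transaction has a higher price. A small sketch on the question's sample lines, run in SQLite through Python's sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SalesTransactionLines (
        id INTEGER PRIMARY KEY,
        SalesTransaction_ID INTEGER,
        StockCode INTEGER,
        SalesPriceEach INTEGER
    );
    INSERT INTO SalesTransactionLines VALUES (1, 1, 123, 99), (2, 1, 35, 50), (3, 2, 15, 75);
""")

# stl2 finds a more expensive line in the same transaction, if one exists.
# Lines with no such match (stl2 columns are NULL) are the per-transaction maximum.
max_lines = conn.execute("""
    SELECT stl1.SalesTransaction_ID, stl1.StockCode, stl1.SalesPriceEach
    FROM SalesTransactionLines AS stl1
    LEFT OUTER JOIN SalesTransactionLines AS stl2
      ON stl1.SalesTransaction_ID = stl2.SalesTransaction_ID
     AND stl1.SalesPriceEach < stl2.SalesPriceEach
    WHERE stl2.SalesPriceEach IS NULL
    ORDER BY stl1.SalesTransaction_ID
""").fetchall()
print(max_lines)
```

Two caveats: if two lines tie for the maximum price, both are returned, and this variant does not apply the "more than 1 child record" condition, so single-line transactions are included.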
I know I'm a year late to this party but I always prefer using Row_Number in these situations. It solves the problem when there are two rows that meet your Max criteria and makes sure that only one row is returned:
with z as (
select
st.SalesTransaction_ID
,row=ROW_NUMBER() OVER(PARTITION BY st.SalesTransaction_ID ORDER BY stl.SalesPriceEach DESC)
,stl.StockCode
,stl.SalesPriceEach
from
SalesTransactions st
inner join SalesTransactionLines stl on stl.SalesTransaction_ID = st.SalesTransaction_ID
)
select * from z where row = 1
SELECT SalesTransactions.SalesTransaction_ID,
SalesTransactionLines.StockCode,
MAX(SalesTransactionLines.SalesPriceEach)
FROM SalesTransactions RIGHT JOIN SalesTransactionLines
ON SalesTransactions.SalesTransaction_ID = SalesTransactionLines.SalesTransaction_ID
GROUP BY SalesTransactions.SalesTransaction_ID, SalesTransactionLines.StockCode;
select a.SalesTransaction_ID, a.StockCode, a.SalesPriceEach
from SalesTransactionLines as a
inner join (select SalesTransaction_ID, MAX(SalesPriceEach) as SalesPriceEach
from SalesTransactionLines group by SalesTransaction_ID) as b
on a.SalesTransaction_ID = b.SalesTransaction_ID
and a.SalesPriceEach = b.SalesPriceEach
The subquery returns a table of transaction IDs and their maximum prices, so just join it back to the lines table on those two values.