Annotating rows if backwards relationship exists (Postgres) - sql

So I have 3 tables: Recommendation, Article and User.
Recommendation has 4 columns:
id | integer
article_id |integer
user_id |integer
submit_time |integer
Article has 3 columns:
id | integer
title
url
I need to obtain a list of all articles, while also annotating each row with a new recommended column, which is 1 if the user in question has recommended the article or 0 if not. There shouldn't be any duplicate Article in the result, and I need it ordered by the Recommendation's submit_time column.
This is on Postgres - 9.1.8.
SELECT DISTINCT ON(t.title) t.title,
t.id, t.url,
MAX(recommended) as recommended
FROM (
SELECT submitter_article.title as title,
submitter_article.id as id,
submitter_article.url as url,
1 as recommended
FROM submitter_article, submitter_recommendation
WHERE submitter_recommendation.user_id=?
AND submitter_recommendation.article_id=submitter_article.id
UNION ALL
SELECT submitter_article.title as title,
submitter_article.id as id,
submitter_article.url as url,
0 as recommended
FROM submitter_article
) as t
GROUP BY t.title, t.id, t.url, recommended
And I'm passing a user id into the ?
I've been trying to do this for a while but can't figure it out. The queries I come up with either return all recommended values as 0, or return duplicate Article rows (one with recommended=0 and the other with recommended=1).
Any ideas?

You don't need a subquery; a CASE expression will do. DISTINCT ON is pointless if you also use GROUP BY, and you should use explicit joins instead of implicit ones. This query should get you started:
SELECT DISTINCT ON (sa.title) sa.title, sa.id, sa.url,
(CASE
WHEN sr.id IS NULL THEN 0
ELSE 1
END) AS recommended
FROM submitter_article AS sa
LEFT JOIN submitter_recommendation AS sr ON sa.id=sr.article_id
AND sr.user_id=?
ORDER BY sa.title,sr.submit_time DESC;
But there are still some things I'm not sure about. Can you have two articles with the same title but different ids? In that case you can pick the one with the earlier or later recommendation submit_time, but what if there are no recommendations? You need to decide how to select the distinct rows and how to order the final result.
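As a rough sketch of one way to settle those open points, the version below keeps one row per article id (not per title), flags it with CASE, and puts recommended articles first, newest recommendation first. It runs on SQLite through Python's sqlite3 purely as a stand-in for Postgres; the sample rows, user ids and ordering rule are assumptions, not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE submitter_article (id INTEGER PRIMARY KEY, title TEXT, url TEXT);
CREATE TABLE submitter_recommendation (
    id INTEGER PRIMARY KEY, article_id INTEGER, user_id INTEGER, submit_time INTEGER);
-- Hypothetical sample data: user 7 recommended articles 1 and 3.
INSERT INTO submitter_article VALUES
    (1, 'A', 'http://a'), (2, 'B', 'http://b'), (3, 'C', 'http://c');
INSERT INTO submitter_recommendation VALUES
    (1, 1, 7, 100), (2, 3, 7, 200), (3, 2, 8, 300);
""")

QUERY = """
SELECT sa.id, sa.title, sa.url,
       CASE WHEN COUNT(sr.id) > 0 THEN 1 ELSE 0 END AS recommended
FROM submitter_article AS sa
LEFT JOIN submitter_recommendation AS sr
       ON sr.article_id = sa.id AND sr.user_id = ?
GROUP BY sa.id, sa.title, sa.url
-- Assumed ordering: recommended articles first, newest recommendation first.
ORDER BY MAX(sr.submit_time) IS NULL, MAX(sr.submit_time) DESC, sa.id
"""

rows = conn.execute(QUERY, (7,)).fetchall()
for row in rows:
    print(row)
```

Grouping by the article id sidesteps the duplicate-title question, because each article appears exactly once however many recommendation rows match.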

Related

SQL query to return results from one to many table

I'm having difficulties trying to return some data from a poorly structured one to many table.
I've been provided with a data export where everything from 'Section Codes' onwards (in cat_fullxPath) relates to a 'skillID' in my clients database.
The results previously returned on one line but I've used a split function to break these out (from the cat_fullXPath column). You can see the relevant 'skillID' from my clients DB in the far right column:
From here, there are thousands of records that may have a mixture of these skillIDs (and many others, I've just provided this one example). I want to be able to find the records that match all 4 (or however many match from another example) skillIDs and ONLY those.
For example (I just happen to know this ID gives me the results I want):
SELECT
id,
skillID
FROM table1
WHERE skillID IN ( 1004464, 1006543, 1004605, 1006740 )
AND id = 69580;
This returns me:
Note that these are the only columns in that table.
So this is an ID I'd want to return.
These are results I'd not want to return as one of the skillIDs are missing:
I've created a temp table with a count of all the skills for each ID but I'm not sure if I'm going down the right path at this point
I'm pretty sure that there's a simple solution to this, however I'm hitting my head against the wall. Hope someone can help!
EDIT
This might be a clearer example of when there are different groups of skillIds that I need to align. I've partitioned these off by cat_fullxpath to see if this makes things clearer:
In this screenshot, for example I want to find the ids for everything in table1 where skillID IN (1003914,1005354,1004701) then repeat for (1004659,1004492,1004493,1004701). etc
We know that you need exactly 4 skills, so just make a subquery:
select id from
(
SELECT
id,
count(skillID) countSkill
FROM table1
WHERE skillID IN ( 1004464, 1006543, 1004605, 1006740 )
group by id
) t
where countSkill = 4;
This could also work with sum instead of count: instead of filtering on 4, you would filter on 4022352, which is the sum of the four skillIDs.
You can also remove the subquery and use HAVING, although it may perform worse.
SELECT
id,
count(skillID) countSkill
FROM table1
WHERE skillID IN ( 1004464, 1006543, 1004605, 1006740 )
group by id
having count(skillID) = 4;
You haven't told us your DBMS. Here is a standard SQL approach:
select id
from table1
group by id
having count(case when skillid = 1004464 then 1 end) > 0
and count(case when skillid = 1006543 then 1 end) > 0
and count(case when skillid = 1004605 then 1 end) > 0
and count(case when skillid = 1006740 then 1 end) > 0
and count(case when skillid not in (1004464, 1006543, 1004605, 1006740) then 1 end) = 0;
Another option is to concatenate all skills and see if the resulting skill list matches the desired skill list. In SQL Server the string aggregation function is STRING_AGG.
select id
from table1
group by id
having string_agg(skillid, ',') within group (order by skillid) in
(
'1004464,1004605,1006543,1006740'
);
You can easily extend the IN clause with other combinations or even get the list from another table. Only make sure the skill IDs in the strings are sorted in order to make the strings comparable ('1004464,1004605,1006543,1006740' <> '1006740,1004464,1004605,1006543').
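The "all of these skills and ONLY those" requirement can also be sketched with two counts per id: the number of distinct skills overall and the number of distinct skills that fall inside the wanted set, both of which must equal the size of the set. The example below uses SQLite via Python's sqlite3 only as a stand-in for the unnamed DBMS; ids 1 and 2 and their rows are made-up test data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, skillID INTEGER);
INSERT INTO table1 VALUES
    -- id 69580 has exactly the four wanted skills
    (69580, 1004464), (69580, 1006543), (69580, 1004605), (69580, 1006740),
    -- hypothetical id 1 is missing one skill
    (1, 1004464), (1, 1006543), (1, 1004605),
    -- hypothetical id 2 has all four plus an extra one
    (2, 1004464), (2, 1006543), (2, 1004605), (2, 1006740), (2, 1003914);
""")

wanted = [1004464, 1006543, 1004605, 1006740]
placeholders = ", ".join("?" for _ in wanted)
query = f"""
SELECT id
FROM table1
GROUP BY id
HAVING COUNT(DISTINCT skillID) = ?
   AND COUNT(DISTINCT CASE WHEN skillID IN ({placeholders}) THEN skillID END) = ?
"""
# Parameters follow the order of the ? markers in the query text.
params = [len(wanted), *wanted, len(wanted)]
matching_ids = [row[0] for row in conn.execute(query, params)]
print(matching_ids)
```

Because the wanted list is passed as parameters, the same query can be rerun for each group of skillIDs without editing the SQL text.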

TypeORM & Postgres: Count only unique distinct values from multiple columns

I have various SQL queries, which return me unique / distinct value from DB, (or count them),
like:
SELECT buyer as counterparty
FROM public.order
UNION
SELECT seller as counterparty
FROM public.order
or
SELECT COUNT(*)
FROM (
SELECT DISTINCT p
FROM public.order
CROSS JOIN LATERAL (VALUES(seller),(buyer)) AS C(p)
) AS internalQuery
Example structure of my table:
id buyer seller
0 A B
1 B A
2 B D
3 D A
4 A D
Desired result:
3 or A,B,D
I'd like to rewrite them with the TypeORM query builder, but I can't figure out how to replace CROSS JOIN LATERAL (VALUES(seller),(buyer)) AS C(p) or UNION in my case. TypeORM is pretty poor on examples and doc coverage here.
Is there any way to do that?
I have seen various methods like .getCount and .distinct(true) which could help me and easily find the solution for one column.
So I understand that if I want the count rather than the documents themselves, I should use .getCount instead of .getMany.
But I can't work out how to select (and combine) values from multiple columns via TypeORM to get the distinct values across those columns.
I am working with PostgreSQL, so when I try:
const query = repository.createQueryBuilder('order')
.distinctOn(['buyer', 'seller'])
.limit(100)
.getMany()
I receive docs with each distinct value in each field, so instead of 3 I get 6 values (3 distinct by column1, and 3 by column2)
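For reference, here is a small check of what the UNION query from the question returns on the sample table. It runs on SQLite through Python's sqlite3 rather than through TypeORM or Postgres, so it only illustrates the expected result (A, B, D and a count of 3), not a query-builder solution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "order" (id INTEGER PRIMARY KEY, buyer TEXT, seller TEXT);
INSERT INTO "order" VALUES
    (0, 'A', 'B'), (1, 'B', 'A'), (2, 'B', 'D'), (3, 'D', 'A'), (4, 'A', 'D');
""")

# UNION (without ALL) removes duplicates across both columns.
UNION_QUERY = """
SELECT buyer AS counterparty FROM "order"
UNION
SELECT seller AS counterparty FROM "order"
"""

counterparties = sorted(row[0] for row in conn.execute(UNION_QUERY))
(count,) = conn.execute(f"SELECT COUNT(*) FROM ({UNION_QUERY}) AS u").fetchone()
print(counterparties, count)
```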

How do I generate a table of IDs which have only one attribute each?

I have a table that looks like this
id attribute
1 a
1 a
2 b
2 a
And I want to collect all of the IDs which have ONLY attribute a. So in the example case:
id
1
My initial thought was to use a where, but that would return:
id
1
1
2
Because 2 also has an "a" attribute in one instance.
P.S. I realize the phrasing of the title is ambiguous; maybe there's a better term than attribute to use in this case?
Oh, I just saw this is Hive, but this is pretty standard SQL, so give it a try.
SELECT
ID
FROM
TABLENAME
GROUP BY
ID
HAVING
COUNT(DISTINCT attribute) = 1
Having is like a where statement after the GROUP BY aggregation has occurred.
HiveQL equivalent of SQL using group by ,having and distinct
select id from (select id,count(distinct attribute) cnt from table_actual group by id having cnt='1') tableouter;
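One caveat worth checking: COUNT(DISTINCT attribute) = 1 says that an id has a single attribute value, not that the value is a. A possible variation, sketched below on SQLite via Python's sqlite3 as a stand-in for Hive, compares MIN and MAX against 'a'. The extra id 3, which has only b, is an assumption added to show the difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_actual (id INTEGER, attribute TEXT);
-- Rows for ids 1 and 2 come from the question; id 3 is hypothetical.
INSERT INTO table_actual VALUES (1, 'a'), (1, 'a'), (2, 'b'), (2, 'a'), (3, 'b');
""")

# Every attribute of the id equals 'a' when both the smallest and largest are 'a'.
only_a = [row[0] for row in conn.execute("""
    SELECT id FROM table_actual
    GROUP BY id
    HAVING MIN(attribute) = 'a' AND MAX(attribute) = 'a'
    ORDER BY id
""")]

# The count-only condition also accepts ids whose single attribute is not 'a'.
single_valued = [row[0] for row in conn.execute("""
    SELECT id FROM table_actual
    GROUP BY id
    HAVING COUNT(DISTINCT attribute) = 1
    ORDER BY id
""")]

print(only_a, single_valued)
```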

COUNT DISTINCT with CONDITIONS

I want to count the number of distinct items in a column subject to a certain condition, for example if the table is like this:
tag | entryID
----+---------
foo | 0
foo | 0
bar | 3
If I want to count the number of distinct tags as "tag count" and count the number of distinct tags with entry id > 0 as "positive tag count" in the same table, what should I do?
I'm now counting from two different tables where in the second table I've only selected those rows with entryID larger than zero. I think there should be a more compact way to solve this problem.
You can try this:
select
count(distinct tag) as tag_count,
count(distinct (case when entryId > 0 then tag end)) as positive_tag_count
from
your_table_name;
The first count(distinct...) is easy.
The second one looks somewhat complex, but it is actually the same as the first, except that it uses a case...when expression. The case...when keeps only rows with a positive entryId; for zero or negative values it evaluates to null, and nulls are not included in the count.
One thing to note is that this reads the table only once. When it seems that you have to read the same table twice or more, it can usually be done in a single pass. As a result, the task finishes a lot faster, with less I/O.
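To make the single-pass idea concrete, here is a minimal sketch that runs the same query against the three sample rows from the question. It uses SQLite through Python's sqlite3 as an assumed stand-in, since the question does not name a DBMS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE your_table_name (tag TEXT, entryID INTEGER);
INSERT INTO your_table_name VALUES ('foo', 0), ('foo', 0), ('bar', 3);
""")

# The CASE yields NULL when entryID <= 0, and COUNT ignores NULLs.
tag_count, positive_tag_count = conn.execute("""
    SELECT COUNT(DISTINCT tag),
           COUNT(DISTINCT CASE WHEN entryID > 0 THEN tag END)
    FROM your_table_name
""").fetchone()

print(tag_count, positive_tag_count)
```

Here foo and bar give a tag count of 2, while only bar has a positive entryID, so the positive tag count is 1.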
This may work:
SELECT Count(tag) AS 'Tag Count'
FROM Table
GROUP BY tag
and
SELECT Count(tag) AS 'Positive Tag Count'
FROM Table
WHERE entryID > 0
GROUP BY tag
Try the following statement:
select distinct A.[Tag],
count(A.[Tag]) as TAG_COUNT,
(SELECT count(*) FROM [TagTbl] AS B WHERE A.[Tag]=B.[Tag] AND B.[ID]>0)
from [TagTbl] AS A GROUP BY A.[Tag]
The first field will be the tag, the second the whole count, and the third the count of positive ones.
I agree with @ntalbs's solution.
If you want to count a column's data only when a condition on another column holds, you can do this:
select
count(distinct tag) as tag_count,
count(distinct tag, case when entryId > 0 then tag end) as positive_tag_count
from
your_table_name;
On line 3, I added the column name beside the distinct, so it will count the distinct tags when the entryId is greater than 0
Code counts the unique/distinct combination of Tag & Entry ID when [Entry Id]>0
select count(distinct(concat(tag,entryId)))
from customers
where id>0
In the output it will display the count of unique values
Hope this helps
This may also work:
SELECT
COUNT(DISTINCT T.tag) as DistinctTag,
COUNT(DISTINCT T2.tag) as DistinctPositiveTag
FROM Table T
LEFT JOIN Table T2 ON T.tag = T2.tag AND T.entryID = T2.entryID AND T2.entryID > 0
You need the entryID condition in the left join rather than in a where clause to make sure that any items that only have an entryID of 0 are still counted in the first DISTINCT.

How do I check if all posts from a joined table has the same value in a column?

I'm building a BI report for a client where there is a 1-n related join involved.
The joined table has a field for employee ID (EmplId).
The query that I've built for this report is supposed to give a 1 in its field "OneEmployee" if all the related posts have the same employee in the EmplId field, null if it's different employees, i.e:
TaskTrans
TaskTransHours > EmplId: 'John'
TaskTransHours > EmplId: 'John'
This should give a 1 in the said field in the query
TaskTrans
TaskTransHours > EmplId: 'John'
TaskTransHours > EmplId: 'George'
This should leave the said field blank
The idea is to create a field where a case expression checks this and returns the correct value. But my problem is whether there is a way to check for this in SQL.
select not count(*) from your_table
where employee_id = GIVEN_ID
and your_field not in ( select min(your_field)
from your_table
where employee_id = GIVEN_ID);
Note: my first idea was to use LIMIT 1 in the inner query, but MySQL didn't like it, so min it was. The point is to pick any value, but only one. Min should work, but the field should be indexed; then this query will execute rather fast, as only indexes are used (obviously employee_id should also be indexed).
Note 2: Don't be confused by the not in front of count(*). You want 1 when no row is different; I count the different ones and give you not count(*), which is 1 if the count is 0 and 0 otherwise.
Seems a job for a window COUNT():
SELECT
…,
CASE COUNT(DISTINCT TaskTransHours.EmplId) OVER () WHEN 1 THEN 1 END
AS OneEmployee
FROM …
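Some engines do not accept DISTINCT inside a window aggregate, so a more portable variation is to compare MIN and MAX of EmplId per task. The sketch below runs on SQLite via Python's sqlite3 as a stand-in; the TaskId column and the sample rows are assumptions, since the question does not show the join key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- TaskId is a hypothetical key linking hours rows to their TaskTrans row.
CREATE TABLE TaskTransHours (TaskId INTEGER, EmplId TEXT);
INSERT INTO TaskTransHours VALUES
    (1, 'John'), (1, 'John'),
    (2, 'John'), (2, 'George');
""")

# MIN = MAX holds only when every row in the group has the same EmplId;
# the CASE has no ELSE, so mixed groups get NULL.
rows = conn.execute("""
    SELECT TaskId,
           CASE WHEN MIN(EmplId) = MAX(EmplId) THEN 1 END AS OneEmployee
    FROM TaskTransHours
    GROUP BY TaskId
    ORDER BY TaskId
""").fetchall()

print(rows)
```

The grouped result can then be joined back to the parent table to fill the OneEmployee field in the report query.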