PostgreSQL: remove duplicates from a table during retrieval, based on criteria

My business logic needs to retrieve translations from the database. Translations can be overridden, and when an override exists it should be returned instead of the original.
Schema:

i18n
-----
id
slug // unique

i18nTranslations
--------------
id
i18nId // referencing i18n.id
langId
text
overriddenType // pageOverride / instanceOverride

i18nPageOverrides
-----------------
id
translationId // referencing i18nTranslations.id
instanceId
pageId

Example:

i18nTranslations
------------------------------------------------------
id | i18nId | langId | text | type | overrideType
------------------------------------------------------
1 | ABC | En | AAX | static |
2 | ABC | En | AAX Ovd | static | pageOverride
3 | ABC | Tr | TDF | static |

i18nPageOverride
--------------------------
transId | pageId | instanceId
--------------------------
2 | login | admin

Expected Output:
------------------------------------------------------
id | i18nId | langId | text | type | overrideType
------------------------------------------------------
2 | ABC | En | AAX Ovd | static | pageOverride // overridden data
3 | ABC | Tr | TDF | static |

In the expected output above, the row with the text "AAX" has been eliminated because an overriding row exists for the same language.
Is there any way to achieve this behavior just by using a query?

A DISTINCT ON expression combined with an ORDER BY fits this well.
Sort on i18nPageOverrides.id descending with nulls last, so that within each (i18nId, langId) group a row that has a page override comes first.
From the PostgreSQL documentation:
"DISTINCT ON ( expression [, ...] ) keeps only the first row of each
set of rows where the given expressions evaluate to equal. The
DISTINCT ON expressions are interpreted using the same rules as for
ORDER BY (see above)."
SELECT DISTINCT ON (tr.i18nId, tr.langId)
tr.id, tr.i18nId, tr.langId, tr.itText, tr.itType, tr.overrideType
FROM i18nTranslations tr
LEFT JOIN i18nPageOverrides po ON po.translationId = tr.id
ORDER BY tr.i18nId, tr.langId, po.id DESC NULLS LAST, tr.id;
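| id | i18nid | langid | ittext  | ittype | overridetype |
|----|--------|--------|---------|--------|--------------|
| 2  | ABC    | En     | AAX Ovd | static | pageOverride |
| 3  | ABC    | Tr     | TDF     | static | null         |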
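To see why the ORDER BY decides which row DISTINCT ON keeps, the "sort, then keep the first row of each group" behaviour can be mimicked in plain Python. This is only a sketch of the semantics, not of how PostgreSQL executes the query. The helper name `distinct_on`, the tuple layout, and the override id value 1 are assumptions made for illustration; the rows are the sample data from the question.

```python
from itertools import groupby

# Rows as they would look after the LEFT JOIN:
# (id, i18nId, langId, text, overrideType, page_override_id)
# page_override_id is None when no i18nPageOverrides row matches (assumed id 1 otherwise).
rows = [
    (1, "ABC", "En", "AAX", None, None),
    (2, "ABC", "En", "AAX Ovd", "pageOverride", 1),
    (3, "ABC", "Tr", "TDF", None, None),
]

def distinct_on(joined_rows):
    """Keep the first row per (i18nId, langId) after sorting overrides first."""
    def sort_key(r):
        po_id = r[5]
        # Emulates "po.id DESC NULLS LAST, tr.id": rows with an override id
        # come first (largest id first), then rows without one, ties by tr.id.
        return (r[1], r[2], po_id is None, -(po_id or 0), r[0])

    ordered = sorted(joined_rows, key=sort_key)
    # groupby yields consecutive runs with the same key; take the first of each.
    return [next(group) for _, group in groupby(ordered, key=lambda r: (r[1], r[2]))]

result = distinct_on(rows)
for r in result:
    print(r[:5])
```

Dropping the override part of the sort key would make row 1 the first row of the (ABC, En) group, which is exactly the duplicate the question wants to eliminate.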

You can use the ROW_NUMBER window function with PARTITION BY and ORDER BY to number the rows within each group of duplicates, then keep only the rows where rn = 1.
Query 1:
SELECT "iId",
       "itI18nId",
       "itLangId",
       "itText",
       "itType",
       "itOverrideType"
FROM (
    SELECT *,
           ROW_NUMBER() OVER(PARTITION BY "itI18nId","itLangId" ORDER BY "itOverrideType","iCreatedAt") rn
    FROM i18n i
    INNER JOIN i18n_translations t
        ON i."iId" = t."itI18nId"
    LEFT JOIN i18n_page_override o
        ON o."ipoTranslationId" = t."itId"
) t1
WHERE rn = 1
Results:
| iId | itI18nId | itLangId | itText | itType | itOverrideType |
|--------------------------------------|--------------------------------------|--------------------------------------|--------|--------------|----------------|
| b76481bc-1171-4fb3-8433-31302ae39a81 | b76481bc-1171-4fb3-8433-31302ae39a81 | 175376f6-9dc8-4bea-bbc0-bf93744999c9 | Adi | staticNormal | (null) |
| b76481bc-1171-4fb3-8433-31302ae39a81 | b76481bc-1171-4fb3-8433-31302ae39a81 | 875dbdbb-9cb2-4f1b-a8ca-096321a0cd36 | Fn Ovd | staticNormal | stPageOverride |
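The same ROW_NUMBER idea can be tried locally against the question's simplified sample data. The sketch below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (window functions need SQLite 3.25 or newer). The table layout follows the question's schema; the override row id 1 is an assumption. Because SQLite sorts NULLs first in ascending order, the ordering is written as an explicit `po.id IS NULL` test instead of relying on how NULLs in overrideType sort.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE i18nTranslations (
    id INTEGER PRIMARY KEY, i18nId TEXT, langId TEXT, text TEXT, overrideType TEXT
);
CREATE TABLE i18nPageOverrides (
    id INTEGER PRIMARY KEY, translationId INTEGER, pageId TEXT, instanceId TEXT
);
INSERT INTO i18nTranslations VALUES
    (1, 'ABC', 'En', 'AAX', NULL),
    (2, 'ABC', 'En', 'AAX Ovd', 'pageOverride'),
    (3, 'ABC', 'Tr', 'TDF', NULL);
INSERT INTO i18nPageOverrides VALUES (1, 2, 'login', 'admin');
""")

query = """
SELECT id, i18nId, langId, text, overrideType
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY t.i18nId, t.langId
               -- rows with a matching override (po.id present) sort first
               ORDER BY (po.id IS NULL), t.id
           ) AS rn
    FROM i18nTranslations t
    LEFT JOIN i18nPageOverrides po ON po.translationId = t.id
) ranked
WHERE rn = 1
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```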

Related

Oracle SQL query comparing multiple rows with same identifier

I'm honestly not sure how to title this - so apologies if it is unclear.
I have two tables I need to compare. One table contains tree names and nodes that belong to that tree. Each Tree_name/Tree_node combo will have its own line. For example:
Table: treenode
| TREE_NAME | TREE_NODE |
|-----------|-----------|
| 1 | A |
| 1 | B |
| 1 | C |
| 1 | D |
| 1 | E |
| 2 | A |
| 2 | B |
| 2 | D |
| 3 | C |
| 3 | D |
| 3 | E |
| 3 | F |
I have another table that contains names of queries and what tree_nodes they use. Example:
Table: queryrecord
| QUERY | TREE_NODE |
|---------|-----------|
| Alpha | A |
| Alpha | B |
| Alpha | D |
| BRAVO | A |
| BRAVO | B |
| BRAVO | D |
| CHARLIE | A |
| CHARLIE | B |
| CHARLIE | F |
I need to create an SQL where I input the QUERY name, and it returns any ‘TREE_NAME’ that includes all the nodes associated with the query. So if I input ‘ALPHA’, it would return TREE_NAME 1 & 2. If I ask it for CHARLIE, it would return nothing.
I only have read access, and don’t believe I can create temp tables, so I’m not sure if this is possible. Any advice would be amazing. Thank you!

You can use GROUP BY and HAVING as follows:
Select t.tree_name
From treenode t
join queryrecord q
on t.tree_node = q.tree_node
WHERE q.query = 'ALPHA'
Group by t.tree_name
Having count(distinct t.tree_node)
= (Select count(distinct q.tree_node) From queryrecord q WHERE q.query = 'ALPHA');
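The GROUP BY / HAVING comparison can be checked against the question's sample data. The following is a minimal sketch using Python's sqlite3 module instead of Oracle; the function name `trees_covering` is made up for this example, and the table names follow the question (treenode, queryrecord).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE treenode (tree_name INTEGER, tree_node TEXT);
CREATE TABLE queryrecord (query TEXT, tree_node TEXT);
INSERT INTO treenode VALUES
    (1,'A'),(1,'B'),(1,'C'),(1,'D'),(1,'E'),
    (2,'A'),(2,'B'),(2,'D'),
    (3,'C'),(3,'D'),(3,'E'),(3,'F');
INSERT INTO queryrecord VALUES
    ('Alpha','A'),('Alpha','B'),('Alpha','D'),
    ('BRAVO','A'),('BRAVO','B'),('BRAVO','D'),
    ('CHARLIE','A'),('CHARLIE','B'),('CHARLIE','F');
""")

SQL = """
SELECT t.tree_name
FROM treenode t
JOIN queryrecord q ON q.tree_node = t.tree_node
WHERE q.query = :q
GROUP BY t.tree_name
-- a tree qualifies only if it matched every node the query uses
HAVING COUNT(DISTINCT t.tree_node) =
       (SELECT COUNT(DISTINCT tree_node) FROM queryrecord WHERE query = :q)
ORDER BY t.tree_name
"""

def trees_covering(query_name):
    """Return the tree names that contain all nodes of the given query."""
    return [name for (name,) in conn.execute(SQL, {"q": query_name})]

print(trees_covering("Alpha"))
print(trees_covering("CHARLIE"))
```

Note that the sample data stores the value as 'Alpha', so a lookup for 'ALPHA' finds nothing because the string comparison is case sensitive.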

Using an IN condition (a semi-join, which saves time over a join):
with prep (tree_node) as (select tree_node from queryrecord where query = :q)
select tree_name
from treenode
where tree_node in (select tree_node from prep)
group by tree_name
having count(*) = (select count(*) from prep)
;
:q in the prep subquery (in the with clause) is the bind variable to which you will assign the various QUERY values at runtime.
EDIT
I don't generally set up the test case on online engines; but in a comment below this answer, the OP said the query didn't work for him. So, I set up the example on SQLFiddle, here:
http://sqlfiddle.com/#!4/b575e/2
A couple of notes: for some reason, SQLFiddle thinks table names should be at most eight characters, so I had to change the second table name to queryrec (instead of queryrecord). I changed the name in the query, too, of course. And, second, I don't know how I can give bind values on SQLFiddle; I hard-coded the name 'Alpha'. (Note also that in the OP's sample data, this query value is not capitalized, while the other two are; of course, text values in SQL are case sensitive, so one should pay attention when testing.)

You can do this with a join and aggregation. The trick is to count the number of nodes in query_record before joining:
select qr.query, t.tree_name
from (select qr.*,
             count(*) over (partition by query) as num_tree_node
      from queryrecord qr
     ) qr join
     treenode t
     on t.tree_node = qr.tree_node
where qr.query = 'ALPHA'
group by qr.query, t.tree_name, qr.num_tree_node
having count(*) = qr.num_tree_node;

TSQL Select max value associated with an ID from a table joined on a different ID?

I have 3 tables structured like this:
Shipment
+-------------+--------------------+
| Shipment_ID | Shipment_ID_Master |
+-------------+--------------------+
| 4767 | 4767 |
| 88359 | 28431 |
+-------------+--------------------+
Factory
+------------+-------------+
| Factory_ID | Shipment_ID |
+------------+-------------+
| 338161 | 4767 |
| 1178567 | 88359 |
| 1178568 | 88359 |
+------------+-------------+
Coverage
+------------+-----------+----------+
| Factory_ID | Public_ID | Revision |
+------------+-----------+----------+
| 338161 | 2354 | 2 |
| 1178567 | 32436 | 4 |
| 1178568 | 2354 | 3 |
+------------+-----------+----------+
I am trying to build a view that displays a row for only the max Public_ID associated with a Shipment_ID. The view should look like this:
+-------------+--------------------+------------+-----------+----------+
| Shipment_ID | Shipment_ID_Master | Factory_ID | Public_ID | Revision |
+-------------+--------------------+------------+-----------+----------+
| 4767 | 4767 | 338161 | 2354 | 2 |
| 88359 | 28431 | 1178567 | 32436 | 4 |
+-------------+--------------------+------------+-----------+----------+
I have a query that works to build this view, but it is too slow. When my application joins on this view the query is taking several minutes to finish execution. This is the query:
SELECT f.Shipment_ID,
s.Shipment_ID_Master,
f.Factory_ID,
c.Public_ID,
c.Revision
FROM Coverage c
JOIN Factory f ON c.Factory_ID = f.Factory_ID
JOIN Shipment s ON s.Shipment_ID = f.Shipment_ID
WHERE Public_ID = (
SELECT MAX(Public_ID)
FROM Coverage c2
JOIN Factory f2 ON c2.Factory_ID = f2.Factory_ID
WHERE f2.Shipment_ID = f.Shipment_ID
)
I think referencing this view is so slow because of the logic in the where clause. There must be a better and faster way to do this.
How can I select the maximum Public_ID associated with a Shipment_ID when the Shipment_ID is not stored on the same table as the Public_ID? Is it possible to do this without a where clause?

You can use ROW_NUMBER() to rank the rows of each shipment once, instead of running a correlated subquery for every row, which usually performs better.
Try the following:
WITH cte AS
(
    SELECT s.Shipment_ID, s.Shipment_ID_Master, f.Factory_ID, c.Public_ID, c.Revision,
           ROW_NUMBER() OVER (PARTITION BY s.Shipment_ID ORDER BY c.Public_ID DESC) AS rn
    FROM Shipment s
    JOIN Factory f ON f.Shipment_ID = s.Shipment_ID
    JOIN Coverage c ON c.Factory_ID = f.Factory_ID
)
SELECT c.Shipment_ID, c.Shipment_ID_Master, c.Factory_ID, c.Public_ID, c.Revision
FROM cte c WHERE rn = 1
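As a quick sanity check, the ROW_NUMBER version and the original correlated-subquery version can be run side by side on the question's sample rows. This sketch uses Python's sqlite3 module rather than SQL Server, so it only compares results, not performance; the variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Shipment (Shipment_ID INTEGER, Shipment_ID_Master INTEGER);
CREATE TABLE Factory (Factory_ID INTEGER, Shipment_ID INTEGER);
CREATE TABLE Coverage (Factory_ID INTEGER, Public_ID INTEGER, Revision INTEGER);
INSERT INTO Shipment VALUES (4767, 4767), (88359, 28431);
INSERT INTO Factory VALUES (338161, 4767), (1178567, 88359), (1178568, 88359);
INSERT INTO Coverage VALUES (338161, 2354, 2), (1178567, 32436, 4), (1178568, 2354, 3);
""")

# Rank the rows of each shipment by Public_ID and keep the top one.
ranked_sql = """
WITH ranked AS (
    SELECT s.Shipment_ID, s.Shipment_ID_Master, f.Factory_ID, c.Public_ID, c.Revision,
           ROW_NUMBER() OVER (PARTITION BY s.Shipment_ID
                              ORDER BY c.Public_ID DESC) AS rn
    FROM Shipment s
    JOIN Factory f ON f.Shipment_ID = s.Shipment_ID
    JOIN Coverage c ON c.Factory_ID = f.Factory_ID
)
SELECT Shipment_ID, Shipment_ID_Master, Factory_ID, Public_ID, Revision
FROM ranked
WHERE rn = 1
ORDER BY Shipment_ID
"""

# The question's original approach: correlated MAX() subquery per row.
correlated_sql = """
SELECT f.Shipment_ID, s.Shipment_ID_Master, f.Factory_ID, c.Public_ID, c.Revision
FROM Coverage c
JOIN Factory f ON c.Factory_ID = f.Factory_ID
JOIN Shipment s ON s.Shipment_ID = f.Shipment_ID
WHERE c.Public_ID = (
    SELECT MAX(c2.Public_ID)
    FROM Coverage c2
    JOIN Factory f2 ON c2.Factory_ID = f2.Factory_ID
    WHERE f2.Shipment_ID = f.Shipment_ID
)
ORDER BY f.Shipment_ID
"""

ranked_rows = conn.execute(ranked_sql).fetchall()
correlated_rows = conn.execute(correlated_sql).fetchall()
print(ranked_rows)
```

One behavioural difference to keep in mind: if two factories of the same shipment share the maximum Public_ID, the correlated version returns both rows, while ROW_NUMBER keeps exactly one of them.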

SQL conditional join with default values

I'm kinda struggling with this query. I have the following tables:
setting
-------
id | name | value | type
--------------------------------
1 | title | Hi | string
2 | color | #ff0000 | string
user_setting
-------
id | userId | settingId | value
--------------------------------
1 | 1 | 1 | Hello
user
-------
id | email
1 | foo@test.com
I want to run a query that will select all settings for user 1, but also include the default value, so ideally I get this:
id | default | value
-----------------------
title | Hi | Hello
color | #ff0000 | null
My current query is
SELECT setting.id, setting.name, setting.value, user_setting.value, user.id
FROM setting
RIGHT JOIN user_setting
ON setting.id = "user_setting"."settingId"
LEFT OUTER JOIN user
ON "user_setting"."userId" = user.id
WHERE user.id = 1
But this only gives me the values that the user has defined.
EDIT: Updated setting table

I think you want a left join. But I think your setting is missing a column for setting_id (or whatever it is called). So the table should really be:
setting
-------
id | name | value | type
--------------------------------
1 | title | Hi | string
2 | color | #ff0000 | string
Otherwise user_setting."settingId" doesn't refer to anything. With this column, you want:
select s.name, s.value as "default", us.value
from setting s left join
     user_setting us
     on us."settingId" = s.id and us."userId" = 1
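The reason the user filter belongs in the ON clause rather than in WHERE can be shown with the question's sample rows. The sketch below uses Python's sqlite3 module; the alias `default_value` and the variable names `in_on` and `in_where` are chosen for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE setting (id INTEGER PRIMARY KEY, name TEXT, value TEXT, type TEXT);
CREATE TABLE user_setting (
    id INTEGER PRIMARY KEY, userId INTEGER, settingId INTEGER, value TEXT
);
INSERT INTO setting VALUES
    (1, 'title', 'Hi', 'string'),
    (2, 'color', '#ff0000', 'string');
INSERT INTO user_setting VALUES (1, 1, 1, 'Hello');
""")

# User filter inside ON: settings without a row for this user are kept,
# with NULL in the user value column.
in_on = conn.execute("""
    SELECT s.name, s.value AS default_value, us.value
    FROM setting s
    LEFT JOIN user_setting us
           ON us.settingId = s.id AND us.userId = :uid
    ORDER BY s.id
""", {"uid": 1}).fetchall()

# Same filter moved to WHERE: unmatched rows have us.userId = NULL,
# so the comparison is not true and those settings disappear.
in_where = conn.execute("""
    SELECT s.name, s.value AS default_value, us.value
    FROM setting s
    LEFT JOIN user_setting us ON us.settingId = s.id
    WHERE us.userId = :uid
    ORDER BY s.id
""", {"uid": 1}).fetchall()

print(in_on)
print(in_where)
```

The second query reproduces the symptom described in the question: only the settings the user has defined are returned.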

ORDER BY FIELD LIST - Subquery returns more than 1 row

What I want to do is quite simple:
Write an SQL query that returns a set of records ordered by a list of ids supplied in the FIELD() list of the ORDER BY clause.
TABLE SAMPLE
lessons
+----+----------------------+
| id | name |
+----+----------------------+
| 9 | Greedy algorithms |
| 5 | Maya civilization |
| 3 | eFront Beginner |
| 2 | eFront Intermediate |
+----+----------------------+
mod_comp_rule
+----+-----------+---------+
| id | lesson_id | comp_id |
+----+-----------+---------+
| 1 | 3 | 1 |
| 2 | 2 | 1 |
| 3 | 9 | 2 |
+----+-----------+---------+
WHAT I WANT TO GET FROM MY QUERY
SELECT * FROM lessons ORDER BY FIELD(id,'3','2','9') ASC;
MY SQL
SELECT ls.id, ls.name
FROM lessons ls
ORDER BY FIELD(ls.id,
(SELECT mcr.lesson_id FROM mod_comp_rule mcr
INNER JOIN lessons ls ON ls.id = mcr.lesson_id))
My SQL query returned the following error:
MySQL said: #1242 - Subquery returns more than 1 row
So how can I make my SQL produce FIELD(id,'3','2','9') without triggering the "more than 1 row" error?

I don't see why FIELD() is needed for this. A correlated query will do what you want:
SELECT ls.id, ls.name
FROM lessons ls
ORDER BY (SELECT mcr.id FROM mod_comp_rule mcr WHERE ls.id = mcr.lesson_id);
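The correlated ORDER BY can be tried on the question's sample tables. This is a sketch using Python's sqlite3 module instead of MySQL; both engines place NULL first in an ascending sort, so the lesson without a rule (id 5) ends up at the top.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lessons (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE mod_comp_rule (id INTEGER PRIMARY KEY, lesson_id INTEGER, comp_id INTEGER);
INSERT INTO lessons VALUES
    (9, 'Greedy algorithms'), (5, 'Maya civilization'),
    (3, 'eFront Beginner'), (2, 'eFront Intermediate');
INSERT INTO mod_comp_rule VALUES (1, 3, 1), (2, 2, 1), (3, 9, 2);
""")

rows = conn.execute("""
    SELECT ls.id, ls.name
    FROM lessons ls
    -- the subquery runs once per lesson and yields that lesson's rule id
    ORDER BY (SELECT mcr.id FROM mod_comp_rule mcr WHERE mcr.lesson_id = ls.id)
""").fetchall()

ordered_ids = [lesson_id for lesson_id, _ in rows]
print(ordered_ids)
```

If a lesson could appear in mod_comp_rule more than once, the subquery would again return several rows in MySQL; wrapping the column in MIN(mcr.id) is one way to keep it scalar.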

Fetch SQL rows with priority

I have the following table structure:
forms
RID | MODULE
------------
1 | indiv
2 | indiv
3 | indiv
translations
RID | LANG | VALUE | MODULE | TAG |
-----------------------------------
1 | en |car | | |
1 | en |truck |indiv | |
1 | en |boat |indiv |C100 |
2 | en |hat | | |
3 | en |cat | | |
3 | en |dog |indiv | |
4 | en |light | | |
5 | en |dark | | |
I need to fetch only one row per RID from translations table, based on additional (but not mandatory) parameters for module and tag columns, i.e.:
RESULT without input parameters:
RID | LANG | VALUE | MODULE | TAG |
-----------------------------------
1 | en |car | | |
2 | en |hat | | |
3 | en |cat | | |
RESULT with one input parameter module='indiv':
RID | LANG | VALUE | MODULE | TAG |
-----------------------------------
1 | en |truck |indiv | |
2 | en |hat | | |
3 | en |dog |indiv | |
With two input parameters, the result should be:
RESULT with two parameters: module='indiv' AND tag='c100'
RID | LANG | VALUE | MODULE | TAG |
-----------------------------------
1 | en |boat |indiv |C100 |
2 | en |hat | | |
3 | en |dog |indiv | |
How can I achieve this with SQL only on an Oracle DB server? A query example for the last case with two parameters will be enough for me, as I believe the previous cases are subsets of the last one with NULL for these parameters. If you think these cases are too different and require different SQL statements, you are more than welcome to write them as well.
Thank you!

SELECT *
FROM (
SELECT t.*,
ROW_NUMBER() OVER (
PARTITION BY RID
ORDER BY CASE
WHEN module = LOWER( :mod ) AND tag = UPPER( :tag ) THEN 1
WHEN tag = UPPER( :tag ) THEN 2
WHEN module = LOWER( :mod ) AND tag IS NULL THEN 3
WHEN module IS NULL AND tag IS NULL THEN 4
ELSE 5
END
) AS rn
FROM translations t
WHERE ( module IS NULL OR module = LOWER( :mod ) )
OR ( tag IS NULL OR tag = UPPER( :tag ) )
)
WHERE rn = 1;
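The CASE-based priority can be exercised against the question's three scenarios. The sketch below uses Python's sqlite3 module instead of Oracle and departs from the query above in two labeled ways, both of which are assumptions about the intent: the candidate filter joins its two conditions with AND rather than OR, and rows are restricted to RIDs present in forms so the output matches the expected tables. The function name `pick` is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE forms (rid INTEGER, module TEXT);
CREATE TABLE translations (rid INTEGER, lang TEXT, value TEXT, module TEXT, tag TEXT);
INSERT INTO forms VALUES (1, 'indiv'), (2, 'indiv'), (3, 'indiv');
INSERT INTO translations VALUES
    (1, 'en', 'car',   NULL,    NULL),
    (1, 'en', 'truck', 'indiv', NULL),
    (1, 'en', 'boat',  'indiv', 'C100'),
    (2, 'en', 'hat',   NULL,    NULL),
    (3, 'en', 'cat',   NULL,    NULL),
    (3, 'en', 'dog',   'indiv', NULL),
    (4, 'en', 'light', NULL,    NULL),
    (5, 'en', 'dark',  NULL,    NULL);
""")

SQL = """
SELECT rid, value
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY t.rid
               ORDER BY CASE
                   WHEN t.module = LOWER(:mod) AND t.tag = UPPER(:tag) THEN 1
                   WHEN t.tag = UPPER(:tag) THEN 2
                   WHEN t.module = LOWER(:mod) AND t.tag IS NULL THEN 3
                   WHEN t.module IS NULL AND t.tag IS NULL THEN 4
                   ELSE 5
               END
           ) AS rn
    FROM translations t
    WHERE t.rid IN (SELECT rid FROM forms)                  -- assumption: only RIDs in forms
      AND (t.module IS NULL OR t.module = LOWER(:mod))      -- assumption: AND, not OR
      AND (t.tag IS NULL OR t.tag = UPPER(:tag))
) ranked
WHERE rn = 1
ORDER BY rid
"""

def pick(mod=None, tag=None):
    """Return one (rid, value) per RID, preferring the most specific match."""
    return conn.execute(SQL, {"mod": mod, "tag": tag}).fetchall()

print(pick())
print(pick("indiv"))
print(pick("indiv", "c100"))
```

An unset parameter is passed as NULL, so every comparison against it is unknown and only the rows with empty MODULE and TAG remain as candidates.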

I think this should do.
SELECT *
FROM (
SELECT F.RID,
T.LANG,
T.VALUE,
T.MODULE,
T.TAG,
RANK() OVER(PARTITION BY F.RID
ORDER BY DECODE(T.MODULE, :module, 1, 2),
DECODE(T.TAG, :tag, 1, 2)) RANK
FROM forms F INNER JOIN translations T
ON T.RID = F.RID)
WHERE RANK = 1
This ranks rows with MODULE = :module and/or TAG = :tag higher. You still need to decide how to handle ties: RANK leaves ties in place, ROW_NUMBER does not.
I put MODULE ahead of TAG because of your examples. You might need to change that if tags can be supplied without modules.
Also, DECODE treats NULL as equal to NULL, so if :module is not set, rows with NULL in MODULE will match.
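The tie behaviour and the NULL matching can both be observed in a small experiment. This sketch uses Python's sqlite3 module; SQLite has no DECODE, so the NULL-safe `IS` comparison inside a CASE stands in for DECODE(x, :p, 1, 2), which is an approximation for illustration only. The function name `ranks` and the tag value 'X999' are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE translations (rid INTEGER, value TEXT, module TEXT, tag TEXT);
INSERT INTO translations VALUES
    (1, 'car',   NULL,    NULL),
    (1, 'truck', 'indiv', NULL),
    (1, 'boat',  'indiv', 'C100');
""")

SQL = """
SELECT value, rnk, rn
FROM (
    SELECT t.value,
           RANK() OVER w AS rnk,
           ROW_NUMBER() OVER w AS rn
    FROM translations t
    WINDOW w AS (
        PARTITION BY t.rid
        -- "x IS :p" is true when both sides are NULL, like DECODE's comparison
        ORDER BY CASE WHEN t.module IS :module THEN 1 ELSE 2 END,
                 CASE WHEN t.tag IS :tag THEN 1 ELSE 2 END
    )
) ranked
ORDER BY value
"""

def ranks(module, tag):
    """Return (value, rank, row_number) for every translation of RID 1."""
    return conn.execute(SQL, {"module": module, "tag": tag}).fetchall()

# A tag that matches nothing: 'truck' and 'boat' tie on the module match.
tied = ranks("indiv", "X999")
print(tied)

# No parameters set: the row with NULL module and NULL tag matches both.
unset = ranks(None, None)
print(unset)
```

With the tie, RANK assigns 1 to both 'truck' and 'boat', so filtering on rank 1 returns two rows, while ROW_NUMBER gives the value 1 to only one of them (which one is not determined by the ORDER BY).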
