How can I have a duplicated row to show only one row with the most recent updating of the the table in PostgreSQL? - sql

I am new at PostgreSQL and I am building a new project where I need to work mostly on database. I have a table called products and I have filled it with some data. The problem here is that I need to find all the duplicated rows on the table. The table is like this:
id | name | created_date | updated_date |
-----|----------|---------------|---------------|
1 | hat | 01/05/2022 | 01/06/2022 |
2 | jeans | 01/05/2022 | 01/06/2022 |
3 | shoes | 01/05/2022 | 01/06/2022 |
4 | hat | 01/05/2022 | 01/06/2022 |
...
The duplicated rows are 1 and 4 with the name hat.
After finding the duplicated rows, I need to display the row with the most recent date and then update the table where the table is filled without any duplicates.
Any ideas?

You can use DISTINCT ON (keeps only the first row of each set of rows where the given expressions evaluate to equal):
SELECT DISTINCT ON (name) *
FROM products
ORDER BY name, updated_date DESC
In this query rows with same name sorted by updated_date and all rows except first (with most recent date) are eliminated.

Related

Remove duplicate rows in MS-Access

I am using Microsoft Access and in it, I have a table with data that is sometimes repeated. I'm not able to create an SQL query that removes duplicate data, leaving only distinct data in the table. Can someone help me?
My current table:
Date | Level | Name
---------+--------+--------
12/25/2021 | 2 | Jack
12/25/2021 | 2 | Jack
12/10/2021 | 3 | Ana
12/01/2021 | 1 | Lenon
12/01/2021 | 1 | Lenon
12/30/2021 | 3 | Ana
Expected result:
Date | Level | Name
---------+--------+--------
12/25/2021 | 2 | Jack
12/10/2021 | 3 | Ana
12/01/2021 | 1 | Lenon
12/30/2021 | 3 | Ana
PS: Ana appears twice in the expected result table because the dates of the two rows referring to Ana are different, so they are not duplicated values.
Just use select distinct:
select distinct t.*
from t;
I would add that tables should not have duplicate rows. Something is wrong with the table generation if you are getting duplicates -- either the query being used or the process for inserting rows into the table.
You can do a group by of the Date, Level and Name columns.
Use this query:
SELECT Date
,Level
,Name
FROM <TableName>
GROUP BY Date, Level, Name

SQL Server selecting data as array from two tables

I have a database in which there are two tables tableA, tableB. Now for each primary id in tableA there may be multiple rows in tableB.
Table A primary key (ServiceOrderId)
+----------------+-------+-------+-------------+
| ServiceOrderId | Tax | Total | OrderNumber |
+----------------+-------+-------+-------------+
| 12 | 45.00 | 347 | 1011 |
+----------------+-------+-------+-------------+
Table B foreign key (ServiceOrderId)
+----+-------------+---------------------+----------+-------+------+----------------+
| Id | ServiceName | ServiceDescription | Quantity | Price | Cost | ServiceOrderId |
+----+-------------+---------------------+----------+-------+------+----------------+
| 39 | MIN-C | Commercial Pretreat | NULL | 225 | 23 | 12 |
+----+-------------+---------------------+----------+-------+------+----------------+
| 40 | MIN-C | Commercial Pretreat | NULL | 225 | 25 | 12 |
+----+-------------+---------------------+----------+-------+------+----------------+
Is there a way in which I can fetch the values as an array of multiple rows of tableB with single row of tableA. Because when I am saving to database I am using temp table to save multiple rows of tableB with single row of tableA.
Query I am using
SELECT
ordr.*,
info.*
FROM
tblServiceOrder as ordr
JOIN
tblServiceOrderInfo as info ON ordr.ServiceOrderId = info.ServiceOrderId
But above query is giving two rows for each ServiceOrderId. I am using node api to fetch data. I want something like;
Object:{
objectA:{id:12,tax:45.00:total:347,ordernumber:1011},
objectB:[
{id:39,servicename:'MIN-C',description:'Commercial Pretreat',Quantity :NULL,Price:225,Cost:23,ServiceOrderId:12 },
{id:40,servicename:'MIN-C',description:'Commercial Pretreat',Quantity :NULL,Price:225,Cost:25,ServiceOrderId:12}
]
}
There are several solutions. The first one is to use your SELECT, but with adding ORDER BY ServiceOrderID and when data are converting to object, to use the first row only in the loop for new ServiceOrderId from ordr table and add every row for the data from info table.
Other possibility is to select data from ordr table only and for every row to make another select by ServiceOrderId from info table. This solution should not be used for huge tables.

SQL / Oracle to Tableau - How to combine to sort based on two fields?

I have tables below as follows:
tbl_tasks
+---------+-------------+
| Task_ID | Assigned_ID |
+---------+-------------+
| 1 | 8 |
| 2 | 12 |
| 3 | 31 |
+---------+-------------+
tbl_resources
+---------+-----------+
| Task_ID | Source_ID |
+---------+-----------+
| 1 | 4 |
| 1 | 10 |
| 2 | 42 |
| 4 | 8 |
+---------+-----------+
A task is assigned to at least one person (denoted by the "assigned_ID") and then any number of people can be assigned as a source (denoted by "source_ID"). The ID numbers are all linked to names in another table. Though the ID numbers are named differently, they all return to the same table.
Would there be any way for me to combine the two tables based on ID such that I could search based on someone's ID number? For example- if I decide to search on or do a WHERE User_ID = 8, in order to see what Tasks that 8 is involved in, I would get back Task 1 and Task 4.
Right now, by joining all the tables together, I can easily filter on "Assigned" but not "Source" due to all the multiple entries in the table.
Use union all:
select distinct task_id
from ((select task_id, assigned_id as id
from tbl_tasks
) union all
(select task_id, source_id
from tbl_resources
)
) ti
where id = ?;
Note that this uses select distinct in case someone is assigned to the same task in both tables. If not, remove the distinct.

Calculating Number of Columns that have no Null value

I want to make a table like following
| ID | Sibling1 | Sibling2 | Sibling 3 | Total_Siblings |
______________________________________________________________
| 1 | Tom | Lisa | Null | 2 |
______________________________________________________________
| 2 | Bart | Jason | Nelson | 3 |
______________________________________________________________
| 3 | George | Null | Null | 1 |
______________________________________________________________
| 4 | Null | Null | Null | 0 |
For Sibling1, Sibling2, Sibling3: they are all nvarchar(50) (can't change this as the requirement).
My concern is that how can I calculate the value for Total_Siblings so it will display the number of siblings like above, using SQL? i attempted to use (Sibling1 + Sibling 2) but it does not display the result I want.
Cheers
A query like this would do the trick.
SELECT ID,Sibling1,Sibling2,Sibling3
,COUNT(Sibling1)+Count(Sibling2)+Count(Sibling3) AS Total
FROM MyTable
GROUP BY ID
A little explanation is probably required here. Count with a field name will count the number of non-null values. Since you are grouping by ID, It will only ever return 0 or 1. Now, if you're using anything other than MySQL, you'll have to substitute
GROUP BY ID
FOR
GROUP BY ID,Sibling1,Sibling2,Sibling3
Because most other databases require that you specify all columns that don't contain an aggregate function in the GROUP BY section.
Also, as an aside, you may want to consider changing your database schema to store the siblings in another table, so that each person can have any number of siblings.
You can do this by adding up individual counts:
select id,sibling1,sibling2,sibling3
,count(sibling1)+count(sibling2)+count(sibling3) as total_siblings
from table
group by 1,2,3,4;
However, your table structure makes this scale crappily (what if an id can belong to, say, 50 siblings?). If you store your data into a table with columns of id and sibling, then this query would be as simple as:
select id,count(sibling)
from table
group by id;

Select multiple top 1

I have this table with data:
ID | Data
--------+---------
1 | ONE
2 | ONE
3 | ONE
5 | TWO
7 | TWO
10 | TWO
15 | THREE
14 | THREE
8 | THREE
and I want to get this result
ID | Data
--------+---------
1 | ONE
5 | TWO
15 | THREE
so I want to collect only first record of each value in Data. Values ONE, TWO, THREE may exist in second table so I can get them merged using join. How can I do this?
Try this -
Select Min(ID), Data
From YourTableAfterAllJoinsYouWant
Group By Data
You won't need to join if your Data table already has column Data in it. Or else, you can replace the YourTableAfterAllJoinsYouWant with joined tables.