I have a table in PostgreSQL as follows:
id | name | parent_id |
1 | morteza | null |
2 | ali | null |
3 | morteza2 | 1 |
4 | morteza3 | 1 |
The distinct records are those with id=1 and id=2; the record with id=1 has been modified twice. Now I want to select the most recently modified version of each record. The query result for the data above should be as follows:
id | name |
1 | morteza3 |
2 | ali |
What's the suitable query?
If I am following correctly, you can use distinct on and coalesce():
select distinct on (coalesce(parent_id, id)) coalesce(parent_id, id) as new_id, name
from mytable
order by coalesce(parent_id, id), id desc
Demo on DB Fiddle:
new_id | name
-----: | :-------
1 | morteza3
2 | ali
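For readers who want to try the grouping idea without a Postgres instance, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has no DISTINCT ON, so this swaps in a correlated MAX(id) subquery over the same coalesce(parent_id, id) key; it is not the query above, only an equivalent under the assumption that a larger id means a later modification. The function name latest_versions is made up for illustration.

```python
import sqlite3


def latest_versions():
    """Return (new_id, name) for the newest row of each logical record."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER)"
    )
    # Sample data copied from the question.
    conn.executemany(
        "INSERT INTO mytable (id, name, parent_id) VALUES (?, ?, ?)",
        [(1, "morteza", None), (2, "ali", None), (3, "morteza2", 1), (4, "morteza3", 1)],
    )
    rows = conn.execute(
        """
        SELECT COALESCE(t.parent_id, t.id) AS new_id, t.name
        FROM mytable AS t
        WHERE t.id = (
            -- Assumption: the highest id within a group is the latest version.
            SELECT MAX(t2.id)
            FROM mytable AS t2
            WHERE COALESCE(t2.parent_id, t2.id) = COALESCE(t.parent_id, t.id)
        )
        ORDER BY new_id
        """
    ).fetchall()
    conn.close()
    return rows


print(latest_versions())
```

On Postgres itself, DISTINCT ON is usually the shorter way to write this; the subquery form is mainly useful on engines that lack it.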
From your description it would seem that the latest version of each row has parent_id IS NULL. (And obsoleted row versions have parent_id IS NOT NULL.)
The query is simple then:
SELECT id, name
FROM tbl
WHERE parent_id IS NULL;
db<>fiddle here
If you have many updates (hence, many obsoleted row versions), a partial index will help performance a lot:
CREATE INDEX ON tbl(id) WHERE parent_id IS NULL;
The actual index column is mostly irrelevant (unless there are additional requirements). The WHERE clause is the point here, to exclude the many obsoleted rows from the index. See:
Postgres partial index on IS NULL not working
Slow PostgreSQL query in production - help me understand this explain analyze output
I have a table variable (#tableVar) with one column (tableCol) of unique values.
I have a target table with many existing rows that also has a column that is filled entirely with the NULL value.
What type of statement can I use to iterate through #tableVar and assign a different value from #tableVar.tableCol to the null field for each of the rows in my target table?
*Edit (to provide info)
My Target table has this structure:
+-------+------------+
| Name | CallNumber |
+-------+------------+
| James | NULL |
| Byron | NULL |
| Steve | NULL |
+-------+------------+
My table variable has this structure
+------------+
| CallNumber |
+------------+
| 6348 |
| 2675 |
| 9898 |
+------------+
I need to assign a different call number to each row in the target table, to achieve this kind of result.
+-------+------------+
| Name | CallNumber |
+-------+------------+
| James | 6348 |
| Byron | 2675 |
| Steve | 9898 |
+-------+------------+
Note: Each row does not need a specific CallNumber. The only requirement is that each row have a unique CallNumber. For example, "James" does not specifically need 6348; he can have any number as long as it's unique, and the unique number must come from the table variable. You can assume that the table variable will have enough CallNumbers to meet this requirement.
What type of query can I use for this result?
You can use an update with a sequence number:
with toupdate as (
      select t.*, row_number() over (order by (select null)) as seqnum
      from target t
     )
update toupdate
    set col = tv.tablecol
    from (select tv.*, row_number() over (order by (select null)) as seqnum
          from #tablevar tv
         ) tv
    where tv.seqnum = toupdate.seqnum;
This assumes that #tablevar has a sufficient number of rows to assign in target. If not, I would suggest that you ask a new question with sample data and desired results.
Here is a db<>fiddle.
I have two tables, drivers and drivers_names. For every driver I select from the first table, I want a random name from the second, but what I get is one name for all drivers in the result. The name is different on every run, but within a run it is the same for all rows. Here is my query; I'm using PostgreSQL.
SELECT
    drivers.driver_id AS drivers_driver_id,
    (
        SELECT
            drivers_names.name_en
        FROM
            drivers_names
        ORDER BY random() LIMIT 1
    ) AS driver_name
FROM
    drivers
Result:
11 Denis
13 Denis
7 Denis
Table structure:
drivers
+--------------+
| column_name |
+--------------+
| driver_id |
| property_1 |
| property_2 |
| property_3 |
+--------------+
drivers_names
+-------------+
| column_name |
+-------------+
| name_id |
| name_en |
+-------------+
Postgres probably evaluates the subselect only once because it does not depend on the outer row, so technically there's no reason to evaluate it for every row.
You could force per-row evaluation by referencing a column from the drivers table in the subselect, like this:
SELECT
    drivers.driver_id AS drivers_driver_id,
    (
        SELECT
            drivers_names.name_en
        FROM
            drivers_names
        ORDER BY random()+drivers.driver_id LIMIT 1
    ) AS driver_name
FROM
    drivers
I want to make a table like the following:
+----+----------+----------+----------+----------------+
| ID | Sibling1 | Sibling2 | Sibling3 | Total_Siblings |
+----+----------+----------+----------+----------------+
| 1  | Tom      | Lisa     | Null     | 2              |
| 2  | Bart     | Jason    | Nelson   | 3              |
| 3  | George   | Null     | Null     | 1              |
| 4  | Null     | Null     | Null     | 0              |
+----+----------+----------+----------+----------------+
Sibling1, Sibling2, and Sibling3 are all nvarchar(50) (this is a requirement and can't be changed).
My question is how to calculate Total_Siblings so that it shows the number of siblings as above, using SQL. I tried (Sibling1 + Sibling2), but it does not produce the result I want.
Cheers
A query like this would do the trick.
SELECT ID,Sibling1,Sibling2,Sibling3
,COUNT(Sibling1)+Count(Sibling2)+Count(Sibling3) AS Total
FROM MyTable
GROUP BY ID
A little explanation is probably required here. COUNT with a field name counts the number of non-null values. Since you are grouping by ID, each COUNT will only ever return 0 or 1. Now, if you're using anything other than MySQL, you'll have to replace
GROUP BY ID
with
GROUP BY ID,Sibling1,Sibling2,Sibling3
because most other databases require that every selected column not wrapped in an aggregate function appear in the GROUP BY clause.
Also, as an aside, you may want to consider changing your database schema to store the siblings in another table, so that each person can have any number of siblings.
You can do this by adding up individual counts:
select id,sibling1,sibling2,sibling3
,count(sibling1)+count(sibling2)+count(sibling3) as total_siblings
from table
group by 1,2,3,4;
However, your table structure makes this scale poorly (what if an id can have, say, 50 siblings?). If you store your data in a table with one row per sibling and columns id and sibling, then this query would be as simple as:
select id,count(sibling)
from table
group by id;
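Since each row is its own group, the same total can also be computed per row with no GROUP BY at all. The sketch below swaps COUNT for a CASE expression per column (a different technique from the answers above) and runs it through Python's sqlite3 module so it can be tried locally. It assumes the empty siblings are real SQL NULLs, not the string 'Null'; the function name total_siblings is made up for illustration.

```python
import sqlite3


def total_siblings():
    """Return (ID, Total_Siblings) computed per row with CASE expressions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable ("
        "ID INTEGER PRIMARY KEY, Sibling1 TEXT, Sibling2 TEXT, Sibling3 TEXT)"
    )
    # Sample data from the question; None is stored as SQL NULL.
    conn.executemany(
        "INSERT INTO MyTable VALUES (?, ?, ?, ?)",
        [
            (1, "Tom", "Lisa", None),
            (2, "Bart", "Jason", "Nelson"),
            (3, "George", None, None),
            (4, None, None, None),
        ],
    )
    rows = conn.execute(
        """
        SELECT ID,
               CASE WHEN Sibling1 IS NOT NULL THEN 1 ELSE 0 END
             + CASE WHEN Sibling2 IS NOT NULL THEN 1 ELSE 0 END
             + CASE WHEN Sibling3 IS NOT NULL THEN 1 ELSE 0 END AS Total_Siblings
        FROM MyTable
        ORDER BY ID
        """
    ).fetchall()
    conn.close()
    return rows


print(total_siblings())
```

The CASE form uses only standard SQL, so it should carry over to other engines with little change.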
Firstly, sorry for the wording of the question. I'm not too sure how to express it. Hopefully the example below is clear.
If I have a table
Id | Type | Order
0 | Test | null
1 | Test | null
2 | Blah | null
3 | Blah | null
I want to turn it into this
Id | Type | Order
0 | Test | 1
1 | Test | 2
2 | Blah | 1
3 | Blah | 2
So I'm grouping the table by 'type' and allocating a number to 'order' incrementally. It'll start at 1 per type.
How should I go about doing it?
The database I'm using is Sybase 15.
select
Id,
Type,
row_number() over(partition by Type order by Id) as [Order]
from YourTable
You should utilize the ROW_NUMBER function to get you what you're looking for.
ROW_NUMBER (Transact-SQL)
Returns the sequential number of a row within a partition of a result
set, starting at 1 for the first row in each partition.
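If ROW_NUMBER is not available on the engine in use, the same per-type numbering can be sketched with a correlated COUNT(*): each row's order is the number of rows of the same type whose Id is less than or equal to its own. The example below is an illustration using Python's sqlite3 module, not Sybase syntax, and it assumes Id is unique; the function name fill_order is hypothetical.

```python
import sqlite3


def fill_order():
    """Fill "Order" with a per-Type sequence starting at 1, ordered by Id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE YourTable (Id INTEGER PRIMARY KEY, Type TEXT, "Order" INTEGER)'
    )
    # Sample data from the question, with "Order" initially NULL.
    conn.executemany(
        "INSERT INTO YourTable VALUES (?, ?, ?)",
        [(0, "Test", None), (1, "Test", None), (2, "Blah", None), (3, "Blah", None)],
    )
    conn.execute(
        """
        UPDATE YourTable
        SET "Order" = (
            -- Rank within the type: how many same-type rows have Id <= this Id.
            SELECT COUNT(*)
            FROM YourTable AS t2
            WHERE t2.Type = YourTable.Type
              AND t2.Id <= YourTable.Id
        )
        """
    )
    rows = conn.execute(
        'SELECT Id, Type, "Order" FROM YourTable ORDER BY Id'
    ).fetchall()
    conn.close()
    return rows


print(fill_order())
```

The correlated count does quadratic work within each type, so on large tables a window function is likely the better choice where it exists.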
I just enabled the slow query log (including queries that don't use indexes), and I'm getting hundreds of entries for the same kind of query (only the user changes):
SELECT id
, name
FROM `all`
WHERE id NOT IN(SELECT id
FROM `picks`
WHERE user=999)
ORDER BY name ASC;
EXPLAIN gives:
+----+--------------------+-------------------+-------+------------------+--------+---------+------------+------+------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------------------+-------+------------------+--------+---------+------------+------+------------------------------------------+
| 1 | PRIMARY | all | index | NULL | name | 156 | NULL | 209 | Using where; Using index; Using filesort |
| 2 | DEPENDENT SUBQUERY | picks | ref | user,user_2,pick | user_2 | 8 | const,func | 1 | Using where; Using index |
+----+--------------------+-------------------+-------+------------------+--------+---------+------------+------+------------------------------------------+
Any idea about how to optimize this query? I've tried with a bunch of different indexes on different fields but nothing.
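I don't necessarily agree that 'not in' and 'exists' are ALWAYS bad performance choices; however, they could be in this situation.
You might be able to get your results using a simpler join, but note that it is not equivalent to the original query: it returns one row per pick made by any user other than 999, so it can repeat rows, include items that user 999 has also picked, and omit items that nobody has picked.
SELECT `all`.id
     , `all`.name
  FROM `all`
     , `picks`
 WHERE `all`.id = `picks`.id
   AND `picks`.user <> 999
 ORDER BY `all`.name ASC;
"NOT IN" and "EXISTS" are always bad choices for performance. A LEFT JOIN with a check for NULL may be better; try it.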
This is probably the best way to write the query. Select everything from all and try to find matching rows from picks that share the same id and user is 999. If such a row doesn't exist, picks.id will be NULL, because it's using a left outer join. Then you can filter the results to return only those rows.
SELECT `all`.id, `all`.name
FROM
`all`
LEFT JOIN `picks` ON `picks`.id=`all`.id AND `picks`.user=999
WHERE `picks`.id IS NULL
ORDER BY `all`.name ASC
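To check which rewrites really return the same rows, here is a small sketch that runs the NOT IN form, the LEFT JOIN ... IS NULL form, and the plain join with user <> 999 against the same sample data. It uses Python's sqlite3 module, not MySQL, so it says nothing about MySQL performance; the sample rows and the function name compare_queries are invented for illustration.

```python
import sqlite3


def compare_queries():
    """Return the results of the NOT IN, anti-join, and plain-join variants."""
    conn = sqlite3.connect(":memory:")
    # "all" is a reserved word, so it has to be quoted.
    conn.execute('CREATE TABLE "all" (id INTEGER PRIMARY KEY, name TEXT)')
    conn.execute("CREATE TABLE picks (id INTEGER, user INTEGER)")
    conn.executemany(
        'INSERT INTO "all" VALUES (?, ?)',
        [(1, "apple"), (2, "banana"), (3, "cherry"), (4, "date")],
    )
    # Hypothetical picks: user 999 picked items 1 and 3; others picked 1 and 2.
    conn.executemany(
        "INSERT INTO picks VALUES (?, ?)",
        [(1, 999), (1, 123), (2, 456), (3, 999)],
    )
    not_in = conn.execute(
        'SELECT id, name FROM "all" '
        "WHERE id NOT IN (SELECT id FROM picks WHERE user = 999) "
        "ORDER BY name"
    ).fetchall()
    anti_join = conn.execute(
        'SELECT a.id, a.name FROM "all" AS a '
        "LEFT JOIN picks AS p ON p.id = a.id AND p.user = 999 "
        "WHERE p.id IS NULL ORDER BY a.name"
    ).fetchall()
    plain_join = conn.execute(
        'SELECT a.id, a.name FROM "all" AS a, picks AS p '
        "WHERE a.id = p.id AND p.user <> 999 ORDER BY a.name"
    ).fetchall()
    conn.close()
    return not_in, anti_join, plain_join


not_in, anti_join, plain_join = compare_queries()
print(not_in)
print(anti_join)
print(plain_join)
```

With this data the first two queries agree, while the plain join returns an item that user 999 has picked and misses the item nobody picked.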