PostgreSQL: Random value for a column selected from second table - sql

I have two tables, drivers and drivers_names. For every driver I select from the first table, I want a random name from the second, but what I get is one name for all drivers in the result. The name is different on every run, but within a single result it is the same for every row. Here is my query; I'm using PostgreSQL.
SELECT
drivers.driver_id AS drivers_driver_id,
(
SELECT
drivers_names.name_en
FROM
drivers_names
ORDER BY random() LIMIT 1
) AS driver_name
FROM
drivers
Result:
11 Denis
13 Denis
7 Denis
Tables structure.
drivers
+--------------+
| column_name |
+--------------+
| driver_id |
| property_1 |
| property_2 |
| property_3 |
+--------------+
drivers_names
+-------------+
| column_name |
+-------------+
| name_id |
| name_en |
+-------------+

Postgres evaluates the subselect only once: it does not reference anything from the outer query, so technically there's no reason to evaluate it again for every row.
You can force per-row evaluation by referencing a column from the drivers table inside the subselect, like this:
SELECT
drivers.driver_id AS drivers_driver_id,
(
SELECT
drivers_names.name_en
FROM
drivers_names
ORDER BY random()+drivers.driver_id LIMIT 1
) AS driver_name
FROM
drivers

Related

SQL Join to the latest record in MS ACCESS

I want to join tables in MS Access in such a way that it fetches only the latest record from one of the tables. I've looked at the other solutions available on the site, but discovered that they only work for other versions of SQL. Here is a simplified version of my data:
PatientInfo Table:
+-----+------+
| ID | Name |
+-----+------+
| 1 | John |
| 2 | Tom |
| 3 | Anna |
+-----+------+
Appointments Table
+----+-----------+
| ID | Date |
+----+-----------+
| 1 | 5/5/2001 |
| 1 | 10/5/2012 |
| 1 | 4/20/2018 |
| 2 | 4/5/1999 |
| 2 | 8/8/2010 |
| 2 | 4/9/1982 |
| 3 | 7/3/1997 |
| 3 | 6/4/2015 |
| 3 | 3/4/2017 |
+----+-----------+
And here is a simplified version of the results that I need after the join:
+----+------+------------+
| ID | Name | Date |
+----+------+------------+
| 1 | John | 4/20/2018 |
| 2 | Tom | 8/8/2010 |
| 3 | Anna | 3/4/2017 |
+----+------+------------+
Thanks in advance for reading and for your help.
You can use aggregation and JOIN:
select pi.id, pi.name, max(a.date)
from appointments as a inner join
patientinfo as pi
on a.id = pi.id
group by pi.id, pi.name;
something like this:
select P.ID, P.name, max(A.Date) as Dt
from PatientInfo P inner join Appointments A
on P.ID=A.ID
group by P.ID, P.name
Both Bing's and Gordon's answers work if your summary only needs one field (the Max(Date)), but it gets trickier if you also want to report other fields from the joined table, since you would need to either aggregate them or group by them as well.
E.g., if you want your summary to also include the assessment each patient was given at their last appointment, GROUP BY is not the way to go.
A more versatile structure may be something like
SELECT Patient.ID, Patient.Name, Appointment.Date, Appointment.Assessment
FROM Patient INNER JOIN Appointment ON Patient.ID=Appointment.ID
WHERE Appointment.Date = (SELECT Max(Appointment.Date) FROM Appointment WHERE Appointment.ID = Patient.ID)
;
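
As a rough check of the correlated-subquery structure above, here is a small sketch that runs it against the sample data from the question. It uses Python's sqlite3 module rather than Access, so it only illustrates the query logic. The Assessment values are invented, and the dates are rewritten in ISO format so that text comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Patient (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Appointment (ID INTEGER, Date TEXT, Assessment TEXT);
    INSERT INTO Patient VALUES (1, 'John'), (2, 'Tom'), (3, 'Anna');
    -- Assessment values are hypothetical; dates are the question's, in ISO form.
    INSERT INTO Appointment VALUES
        (1, '2001-05-05', 'A1'), (1, '2012-10-05', 'A2'), (1, '2018-04-20', 'A3'),
        (2, '1999-04-05', 'B1'), (2, '2010-08-08', 'B2'), (2, '1982-04-09', 'B3'),
        (3, '1997-07-03', 'C1'), (3, '2015-06-04', 'C2'), (3, '2017-03-04', 'C3');
""")

# Keep only the appointment whose date equals that patient's latest date,
# which lets us return other columns (Assessment) from the same row.
rows = conn.execute("""
    SELECT p.ID, p.Name, a.Date, a.Assessment
      FROM Patient p
      JOIN Appointment a ON p.ID = a.ID
     WHERE a.Date = (SELECT MAX(a2.Date)
                       FROM Appointment a2
                      WHERE a2.ID = p.ID)
     ORDER BY p.ID
""").fetchall()

for row in rows:
    print(row)
```

Note that if a patient has two appointments on the same latest date, this form returns both rows.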
As an aside, you may want to think about whether you should use a field named 'ID' to refer to the ID of another table (in this case, the Appointment.ID field refers to the Patient.ID). You may make your db more readable if you leave the 'ID' field as an identifier specific to that table and refer to that field in other tables as OtherTableID or similar, i.e. PatientID in this case. Or go all the way and include the name of the actual table in its own ID field.
Edited after comment:
Not quite sure why it would crash. I just ran an equivalent query on 2 tables I have which are about 10,000 records each and it was pretty much instantaneous. Are your ID fields (i) unique numbers and (ii) indexed?
Another structure which should do the same thing (adapted for your field names and assuming that there is an ID field in Appointments which is unique) would be something like:
SELECT PatientInfo.UID, PatientInfo.Name, Appointments.StartDateTime, Appointments.Assessment
FROM PatientInfo INNER JOIN Appointments ON PatientInfo.UID = Appointments.PatientFID
WHERE Appointments.ID = (SELECT TOP 1 ID FROM Appointments WHERE Appointments.PatientFID = PatientInfo.UID ORDER BY StartDateTime DESC)
;
But that is starting to look a bit contrived. On my data they both produce the same result (as they should!) and are both almost instantaneous.
Always difficult to troubleshoot Access when it crashes - I guess you see no error codes or similar? Is this against a native .accdb database or another server?

How can you assign all the different values in a table variable to fields in existing rows of a table?

I have a table variable (#tableVar) with one column (tableCol) of unique values.
I have a target table with many existing rows that also has a column that is filled entirely with the NULL value.
What type of statement can I use to iterate through #tableVar and assign a different value from #tableVar.tableCol to the null field for each of the rows in my target table?
*Edit (to provide info)
My Target table has this structure:
+-------+------------+
| Name | CallNumber |
+-------+------------+
| James | NULL |
| Byron | NULL |
| Steve | NULL |
+-------+------------+
My table variable has this structure
+------------+
| CallNumber |
+------------+
| 6348 |
| 2675 |
| 9898 |
+------------+
I need to assign a different call number to each row in the target table, to achieve this kind of result.
+-------+------------+
| Name | CallNumber |
+-------+------------+
| James | 6348 |
| Byron | 2675 |
| Steve | 9898 |
+-------+------------+
Note: Each row does not need a specific CallNumber. The only requirement is that each row has a unique CallNumber. For example, "James" does not specifically need 6348; he can have any number as long as it's unique, and the unique number must come from the table variable. You can assume that the table variable will have enough CallNumbers to meet this requirement.
What type of query can I use for this result?
You can use an update with a sequence number:
with toupdate as (
select t.*, row_number() over (order by (select null)) as seqnum
from target t
)
update toupdate
set col = tv.tablecol
from (select tv.*, row_number() over (order by (select null)) as seqnum
from #tablevar tv
) tv
where tv.seqnum = toupdate.seqnum;
This assumes that #tablevar has a sufficient number of rows to assign in target. If not, I would suggest that you ask a new question with sample data and desired results.
Here is a db<>fiddle.
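
The sequence-number pairing can also be sketched outside SQL Server. The following uses Python's sqlite3 module (window functions need SQLite 3.25 or later) and is only an illustration of the idea: number the rows on both sides, join on that number, then write the pairs back. The table name table_var and the assumption that Name is unique in the target table are mine, not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE target (Name TEXT, CallNumber INTEGER);
    CREATE TABLE table_var (CallNumber INTEGER);  -- stands in for the table variable
    INSERT INTO target (Name) VALUES ('James'), ('Byron'), ('Steve');
    INSERT INTO table_var VALUES (6348), (2675), (9898);
""")

# Number the rows on each side and pair them up by that number.
# Any ordering works for the requirement; fixed orderings keep the demo repeatable.
pairs = conn.execute("""
    SELECT tv.CallNumber, t.Name
      FROM (SELECT Name,
                   row_number() OVER (ORDER BY Name) AS seqnum
              FROM target) t
      JOIN (SELECT CallNumber,
                   row_number() OVER (ORDER BY CallNumber) AS seqnum
              FROM table_var) tv
        ON tv.seqnum = t.seqnum
""").fetchall()

# Write each paired value back; this assumes Name identifies a target row.
conn.executemany("UPDATE target SET CallNumber = ? WHERE Name = ?", pairs)

rows = conn.execute(
    "SELECT Name, CallNumber FROM target ORDER BY Name"
).fetchall()
for row in rows:
    print(row)
```

Because each sequence number appears once on each side, every target row receives a distinct call number.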

How to select data from two columns in SQL?

I have a table in PostgreSQL as follows:
id | name | parent_id |
1 | morteza | null |
2 | ali | null |
3 | morteza2 | 1 |
4 | morteza3 | 1 |
My unique records are those with id=1 and id=2, and the record with id=1 has been modified twice. Now I want to select the last modified version of each record. The query result for the data above should be as follows:
id | name |
1 | morteza3 |
2 | ali |
What's the suitable query?
If I am following correctly, you can use distinct on and coalesce():
select distinct on (coalesce(parent_id, id)) coalesce(parent_id, id) as new_id, name
from mytable
order by coalesce(parent_id, id), id desc
Demo on DB Fiddle:
new_id | name
-----: | :-------
1 | morteza3
2 | ali
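
DISTINCT ON is specific to PostgreSQL. As a portable sketch of the same idea, the example below groups rows by coalesce(parent_id, id) and keeps the highest id in each group using row_number() instead. It runs on Python's sqlite3 module (SQLite 3.25 or later) with the sample rows from the question; the table name mytable follows the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER);
    INSERT INTO mytable VALUES
        (1, 'morteza',  NULL),
        (2, 'ali',      NULL),
        (3, 'morteza2', 1),
        (4, 'morteza3', 1);
""")

# coalesce(parent_id, id) is the id of the original record;
# within each such group the row with the largest id is the latest version.
rows = conn.execute("""
    SELECT new_id, name
      FROM (SELECT coalesce(parent_id, id) AS new_id,
                   name,
                   row_number() OVER (
                       PARTITION BY coalesce(parent_id, id)
                       ORDER BY id DESC
                   ) AS rn
              FROM mytable)
     WHERE rn = 1
     ORDER BY new_id
""").fetchall()

for row in rows:
    print(row)
```

This relies on later modifications always getting larger id values, which the sample data suggests but the question does not state outright.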
From your description it would seem that the latest version of each row has parent_id IS NULL. (And obsoleted row versions have parent_id IS NOT NULL.)
The query is simple then:
SELECT id, name
FROM tbl
WHERE parent_id IS NULL;
db<>fiddle here
If you have many updates (hence, many obsoleted row versions), a partial index will help performance a lot:
CREATE INDEX ON tbl(id) WHERE parent_id IS NULL;
The actual index column is mostly irrelevant (unless there are additional requirements). The WHERE clause is the point here, to exclude the many obsoleted rows from the index. See:
Postgres partial index on IS NULL not working
Slow PostgreSQL query in production - help me understand this explain analyze output

SQL / Oracle to Tableau - How to combine to sort based on two fields?

I have tables below as follows:
tbl_tasks
+---------+-------------+
| Task_ID | Assigned_ID |
+---------+-------------+
| 1 | 8 |
| 2 | 12 |
| 3 | 31 |
+---------+-------------+
tbl_resources
+---------+-----------+
| Task_ID | Source_ID |
+---------+-----------+
| 1 | 4 |
| 1 | 10 |
| 2 | 42 |
| 4 | 8 |
+---------+-----------+
A task is assigned to at least one person (denoted by the "assigned_ID") and then any number of people can be assigned as a source (denoted by "source_ID"). The ID numbers are all linked to names in another table. Though the ID numbers are named differently, they all return to the same table.
Would there be any way for me to combine the two tables based on ID such that I could search based on someone's ID number? For example, if I filter with WHERE User_ID = 8 to see which tasks person 8 is involved in, I would get back Task 1 and Task 4.
Right now, by joining all the tables together, I can easily filter on "Assigned" but not "Source" due to all the multiple entries in the table.
Use union all:
select distinct task_id
from ((select task_id, assigned_id as id
from tbl_tasks
) union all
(select task_id, source_id
from tbl_resources
)
) ti
where id = ?;
Note that this uses select distinct in case someone is assigned to the same task in both tables. If not, remove the distinct.
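
To see the UNION ALL approach against the sample tables from the question, here is a small sketch using Python's sqlite3 module in place of Oracle. Only the query logic is being illustrated; the ORDER BY is added so the output is repeatable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_tasks (task_id INTEGER, assigned_id INTEGER);
    CREATE TABLE tbl_resources (task_id INTEGER, source_id INTEGER);
    INSERT INTO tbl_tasks VALUES (1, 8), (2, 12), (3, 31);
    INSERT INTO tbl_resources VALUES (1, 4), (1, 10), (2, 42), (4, 8);
""")

# Stack both kinds of involvement into one (task_id, id) list,
# then filter that list by the person's id.
rows = conn.execute("""
    SELECT DISTINCT task_id
      FROM (SELECT task_id, assigned_id AS id FROM tbl_tasks
            UNION ALL
            SELECT task_id, source_id FROM tbl_resources) ti
     WHERE id = ?
     ORDER BY task_id
""", (8,)).fetchall()

task_ids = [r[0] for r in rows]
print(task_ids)
```

Person 8 is assigned to task 1 in tbl_tasks and is a source for task 4 in tbl_resources, so both tasks come back from a single filter.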

Adding column to table with value from next row

I have a table in PostgreSQL with a timestamp column, and I want to modify the table to have a second timestamp column and seed it with the value of the immediately successive timestamp. Is there a way to do this? The tables are fairly large, so a correlated subquery might kill the machine.
More concretely, I want to go from this to that:
+----+------+ +----+------+------+
| ts | data | | ts | te | data |
+----+------+ +----+------+------+
| T | ... | --> | T | U | ... |
| U | ... | | U | V | ... |
| V | ... | | V | null | ... |
+----+------+ +----+------+------+
Basically, I want to be able to handle point-in-time queries much better (i.e., give me the data for time X).
I think you could just compute the next timestamp at query time instead of storing it in the table, but if you are sure that storing it is what you need, then:
You need to add that column to your table:
ALTER TABLE tablename ADD COLUMN te timestamp;
Then perform an update that fills the new column using the LEAD window function:
UPDATE tablename t
SET te = x.te
FROM (
SELECT ts, lead(ts, 1) OVER (order by ts) AS te
FROM tablename t2
) x
WHERE t.ts = x.ts
Here's an example of how it works using sample integer data: SQL Fiddle.
It will perform exactly the same for timestamp data type values.
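
For the query-time alternative mentioned at the start of this answer, a minimal sketch looks like the following. It uses Python's sqlite3 module (SQLite 3.25 or later) instead of PostgreSQL, and the timestamps and data values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablename (ts TEXT, data TEXT);
    -- Illustrative rows; ISO-formatted text sorts in chronological order.
    INSERT INTO tablename VALUES
        ('2024-01-01 00:00:00', 'first'),
        ('2024-01-02 00:00:00', 'second'),
        ('2024-01-03 00:00:00', 'third');
""")

# lead(ts) looks one row ahead in ts order, so each row gets the next
# row's timestamp as its end time; the last row has no successor.
rows = conn.execute("""
    SELECT ts,
           lead(ts) OVER (ORDER BY ts) AS te,
           data
      FROM tablename
     ORDER BY ts
""").fetchall()

for row in rows:
    print(row)
```

The window is computed in one ordered pass over the table, which avoids the per-row lookups of a correlated subquery.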
select ts, lead(ts) over (order by ts) as te, data from table_name