SQL subselect queries with use of the table outside subselect

I'm trying to count the number of times a driver has beaten his teammate, based on DRIVERPOSITION. However, I keep getting an "invalid select-list in subselect" error. I guess this is because I reference both the a and b tables inside the subselect query?
Sample Data
RACEID CONSTRUCTORID DRIVERID DRIVERPOSITION
970 4 826 3
970 4 807 7
960 4 826 4
960 4 807 7
970 3 820 10
970 3 810 12
960 3 820 13
960 3 810 11
DESIRED RESULT
RACEID CONSTRUCTORID DRIVERID WINS
970 4 826 2
970 4 807 0
960 3 820 1
960 3 810 1
What I tried so far:
SELECT
(
SELECT COUNT(
CASE
WHEN b.DRIVERPOSITION > a.DRIVERPOSITION THEN 1
ELSE 0 END
)
FROM QUALIFYING b
WHERE RACEYEAR = to_char(NOW(), 'YYYY')
AND a.CONSTRUCTORID = b.CONSTRUCTORID
AND a.RACEID = b.RACEID
AND a.DRIVERID != b.DRIVERID
)
FROM QUALIFYING a
INNER JOIN RACES
ON a.RACEID = RACES.RACEID
INNER JOIN DRIVERS
ON a.DRIVERID = DRIVERS.DRIVERID
INNER JOIN CONSTRUCTORS
ON a.CONSTRUCTORID = CONSTRUCTORS.CONSTRUCTORID
WHERE RACEYEAR = to_char(NOW(), 'YYYY');

I think this does what you want:
select constructorid, driverid,
sum(case when seqnum = 1 then 1 else 0 end) as numwins
from (select d.*,
-- seqnum = 1 marks the better-placed driver of each team in each race
row_number() over (partition by raceid, constructorid order by driverposition) as seqnum
from data d
) d
-- raceid is left out of the grouping so that wins add up across races
group by constructorid, driverid;
However, I have no idea how this fits into your query. Your sample data refers to one table, while your query references several.
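As a rough, runnable sketch of the same idea, the self-join below counts per driver the races in which he placed ahead of his teammate. It assumes SQLite (through Python's sqlite3 module) as a stand-in for the original database and loads only the sample rows from the question; the lowercase table name `qualifying` and the variable names are illustrative.

```python
import sqlite3

# Assumption: SQLite stands in for the original database; the table holds
# only the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE qualifying ("
    "raceid INTEGER, constructorid INTEGER, driverid INTEGER, driverposition INTEGER)"
)
conn.executemany(
    "INSERT INTO qualifying VALUES (?, ?, ?, ?)",
    [
        (970, 4, 826, 3), (970, 4, 807, 7),
        (960, 4, 826, 4), (960, 4, 807, 7),
        (970, 3, 820, 10), (970, 3, 810, 12),
        (960, 3, 820, 13), (960, 3, 810, 11),
    ],
)

# Pair each driver (a) with his teammate (b) in the same race, then count
# the races where a's position number is lower (better) than b's.
WINS_SQL = """
SELECT a.constructorid, a.driverid,
       SUM(CASE WHEN a.driverposition < b.driverposition THEN 1 ELSE 0 END) AS wins
FROM qualifying a
JOIN qualifying b
  ON  a.raceid = b.raceid
  AND a.constructorid = b.constructorid
  AND a.driverid <> b.driverid
GROUP BY a.constructorid, a.driverid
ORDER BY a.constructorid DESC, a.driverid DESC
"""
wins = conn.execute(WINS_SQL).fetchall()
for row in wins:
    print(row)
```

SUM over a CASE is used here rather than COUNT, because COUNT counts every non-null value, including the 0 produced by the ELSE branch.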

Related

PostgreSQL fill in the blanks in an outer join

Outer Join 'fill-in-the blanks'
I have a pair of master-detail tables in a PostgreSQL database where master table 'samples' has some samples with a timestamp in each.
The detail table 'sample_values' has some values for some parameters at any given sample timestamp.
My Query
SELECT s.sample_id, s.sample_time, v.parameter_id, v.sample_value
FROM samples s LEFT OUTER JOIN sample_values v ON v.sample_id=s.sample_id
ORDER BY s.sample_id, v.parameter_id;
returns (as expected):
sample_id sample_time parameter_id sample_value
1 2023-01-13T01:00:00.000Z 1 1.23
1 2023-01-13T01:00:00.000Z 2 4.98
2 2023-01-13T01:01:00.000Z null null
3 2023-01-13T01:02:00.000Z null null
4 2023-01-13T01:03:00.000Z null null
5 2023-01-13T01:04:00.000Z 2 6.08
6 2023-01-13T01:05:00.000Z null null
7 2023-01-13T01:06:00.000Z 1 1.89
8 2023-01-13T01:07:00.000Z null null
9 2023-01-13T01:08:00.000Z null null
10 2023-01-13T01:09:00.000Z null null
11 2023-01-13T01:10:00.000Z null null
12 2023-01-13T01:11:00.000Z null null
13 2023-01-13T01:12:00.000Z null null
14 2023-01-13T01:13:00.000Z null null
15 2023-01-13T01:14:00.000Z 1 2.11
16 2023-01-13T01:15:00.000Z null null
17 2023-01-13T01:16:00.000Z null null
18 2023-01-13T01:17:00.000Z null null
19 2023-01-13T01:18:00.000Z 2 3.57
20 2023-01-13T01:19:00.000Z null null
21 2023-01-13T01:20:00.000Z null null
22 2023-01-13T01:21:00.000Z null null
23 2023-01-13T01:22:00.000Z 1 3.21
23 2023-01-13T01:22:00.000Z 2 5.31
How do I write a query that returns one row per timestamp per parameter, where sample_value is the 'latest known' sample_value for that parameter like this:
sample_id sample_time parameter_id sample_value
1 2023-01-13T01:00:00.000Z 1 1.23
1 2023-01-13T01:00:00.000Z 2 4.98
2 2023-01-13T01:01:00.000Z 1 1.23
2 2023-01-13T01:01:00.000Z 2 4.98
3 2023-01-13T01:02:00.000Z 1 1.23
3 2023-01-13T01:02:00.000Z 2 4.98
4 2023-01-13T01:03:00.000Z 1 1.23
4 2023-01-13T01:03:00.000Z 2 4.98
5 2023-01-13T01:04:00.000Z 1 1.23
5 2023-01-13T01:04:00.000Z 2 6.08
6 2023-01-13T01:05:00.000Z 1 1.23
6 2023-01-13T01:05:00.000Z 2 6.08
7 2023-01-13T01:06:00.000Z 1 1.89
7 2023-01-13T01:06:00.000Z 2 6.08
8 2023-01-13T01:07:00.000Z 1 1.89
8 2023-01-13T01:07:00.000Z 2 6.08
I cannot get my head around the LAST_VALUE function (if that is even the right tool for this?):
LAST_VALUE ( expression )
OVER (
[PARTITION BY partition_expression, ... ]
ORDER BY sort_expression [ASC | DESC], ...
)

First of all, you need one row per parameter for each of your sample IDs (two rows here). You can get that by cross joining your samples with the distinct parameter_ids, and by adding the parameter condition to the left join as well.
...
FROM samples s
CROSS JOIN (SELECT DISTINCT parameter_id FROM sample_values) p
LEFT JOIN sample_values v
ON v.sample_id = s.sample_id AND v.parameter_id = p.parameter_id
...
In addition to this, your intuition about the LAST_VALUE window function was right. The problem is that PostgreSQL's window functions cannot ignore null values (there is no IGNORE NULLS option, at least up to the current version). A common workaround is to build partitions from parameter_id and a running count of non-null sample_values, so that each partition contains one non-null value followed by the nulls after it, and then take the maximum value from each partition.
WITH cte AS (
SELECT s.sample_id, s.sample_time, p.parameter_id, v.sample_value,
COUNT(v.sample_value) OVER(
PARTITION BY p.parameter_id
ORDER BY s.sample_id
) AS partitions
FROM samples s
CROSS JOIN (SELECT DISTINCT parameter_id FROM sample_values) p
LEFT JOIN sample_values v
ON v.sample_id = s.sample_id AND v.parameter_id = p.parameter_id
)
SELECT sample_id, sample_time, parameter_id,
COALESCE(sample_value,
MAX(sample_value) OVER (PARTITION BY parameter_id, partitions)
) AS sample_value
FROM cte
ORDER BY sample_id, parameter_id
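To see the running-count trick at work, here is a small sketch that runs the same query against the first eight samples from the question. It assumes SQLite 3.25 or later (through Python's sqlite3 module) as a stand-in for PostgreSQL; the window functions used behave the same way in both, and the Python variable names are illustrative.

```python
import sqlite3

# Assumption: SQLite (3.25+, for window functions) stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (sample_id INTEGER, sample_time TEXT)")
conn.execute(
    "CREATE TABLE sample_values (sample_id INTEGER, parameter_id INTEGER, sample_value REAL)"
)
# First eight samples from the question, one per minute.
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [(i, f"2023-01-13T01:{i - 1:02d}:00.000Z") for i in range(1, 9)],
)
conn.executemany(
    "INSERT INTO sample_values VALUES (?, ?, ?)",
    [(1, 1, 1.23), (1, 2, 4.98), (5, 2, 6.08), (7, 1, 1.89)],
)

FILL_SQL = """
WITH cte AS (
    SELECT s.sample_id, s.sample_time, p.parameter_id, v.sample_value,
           -- running count of non-null values: it increases by one at every
           -- real measurement, so it labels each value plus the gap after it
           COUNT(v.sample_value) OVER (
               PARTITION BY p.parameter_id
               ORDER BY s.sample_id
           ) AS partitions
    FROM samples s
    CROSS JOIN (SELECT DISTINCT parameter_id FROM sample_values) p
    LEFT JOIN sample_values v
           ON v.sample_id = s.sample_id AND v.parameter_id = p.parameter_id
)
SELECT sample_id, parameter_id,
       COALESCE(sample_value,
                MAX(sample_value) OVER (PARTITION BY parameter_id, partitions)
       ) AS sample_value
FROM cte
ORDER BY sample_id, parameter_id
"""
filled = conn.execute(FILL_SQL).fetchall()
for row in filled:
    print(row)
```

Each (parameter_id, partitions) group holds exactly one non-null value, so MAX simply picks that value up for the null rows that follow it.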

SQL merge 3 tables

I have an SQL query involving two tables and am trying to add a third one.
These are the tables and their columns:
FreeBookPos: FreeBook_ID, ArticleNr, Amount
FreeBook: ID, BookNr, Date
FreeFields: FreeFieldType, Value, SQLPrimeKey
The first two are linked this way:
select FreeBookPos.ArticleNr, Format(FreeBook.Date, 'yyyy_MM') as dt,
SUM(CASE WHEN FreeBook.BookNr = 0 THEN FreeBookPos.Amount ELSE 0 END) as TotalEntryAmount,
SUM(CASE WHEN FreeBook.BookNr = 1 THEN FreeBookPos.Amount ELSE 0 END) as TotalLeftAmount
From FreeBookPos
INNER JOIN FreeBook on FreeBookPos.FreeBook_ID = FreeBook.ID
group by FORMAT ( FreeBook.Date, 'yyyy_MM'), FreeBookPos.ArticleNr
order by dt, ArticleNr
Now I need to add the third table. It is linked via SQLPrimeKey to the ID of the FreeBook table. I then need only the rows where FreeFields.Value is 2 or 4 and FreeFields.FreeFieldType = 54.
I tried various join options but never got the right result. Would I need to join tables 2 and 3 first and then join the result with table 1 in a separate step?
Table 1: FreeBookPos
FreeBook_ID ArticleNr Amount
1 145 12
2 145 6
3 143 4
4 145 1
5 145 42
Table 2: FreeBook
ID BookNr Date
1 1 2012-05-19
2 -1 2012-05-21
3 1 2012-05-22
4 -1 2012-05-24
5 -1 2012-06-25
Table 3: FreeFields
SQLPrimareyKey FreeFieldType Value
1 54 1
2 52 2
3 54 4
4 54 2
5 54 2
Result should be:
ArticleNr Dt TotalEntryAmount TotalLeftAmount
143 2012-05 4 0
145 2012-05 0 -1
145 2012-06 0 -42

Try the below:
select FreeBookPos.ArticleNr, Format(FreeBook.Date, 'yyyy_MM') as dt,
SUM(CASE WHEN FreeBook.BookNr = 0 THEN FreeBookPos.Amount ELSE 0 END) as TotalEntryAmount,
SUM(CASE WHEN FreeBook.BookNr = 1 THEN FreeBookPos.Amount ELSE 0 END) as TotalLeftAmount
From FreeBookPos
INNER JOIN FreeBook on FreeBookPos.FreeBook_ID = FreeBook.ID
inner join FreeFields on FreeBook.ID = FreeFields.SQLPrimareyKey
where FreeFields.Value in (2,4) and FreeFields.FreeFieldType = 54
group by FORMAT ( FreeBook.Date, 'yyyy_MM'), FreeBookPos.ArticleNr
order by dt, ArticleNr
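The sketch below checks only the join and the filter against the sample data, without the aggregation, to show which rows survive once FreeFields is joined in. It assumes SQLite (through Python's sqlite3 module) as a stand-in for the original database, so `strftime` replaces `Format`; the short aliases p, b and f are illustrative.

```python
import sqlite3

# Assumption: SQLite stands in for the original database, so strftime()
# is used where the original query calls Format().
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE FreeBookPos (FreeBook_ID INTEGER, ArticleNr INTEGER, Amount INTEGER)')
conn.execute('CREATE TABLE FreeBook (ID INTEGER, BookNr INTEGER, "Date" TEXT)')
conn.execute('CREATE TABLE FreeFields (SQLPrimareyKey INTEGER, FreeFieldType INTEGER, "Value" INTEGER)')
conn.executemany(
    "INSERT INTO FreeBookPos VALUES (?, ?, ?)",
    [(1, 145, 12), (2, 145, 6), (3, 143, 4), (4, 145, 1), (5, 145, 42)],
)
conn.executemany(
    "INSERT INTO FreeBook VALUES (?, ?, ?)",
    [(1, 1, "2012-05-19"), (2, -1, "2012-05-21"), (3, 1, "2012-05-22"),
     (4, -1, "2012-05-24"), (5, -1, "2012-06-25")],
)
conn.executemany(
    "INSERT INTO FreeFields VALUES (?, ?, ?)",
    [(1, 54, 1), (2, 52, 2), (3, 54, 4), (4, 54, 2), (5, 54, 2)],
)

# Join all three tables in one statement and keep only the bookings whose
# FreeFields row has type 54 and value 2 or 4.
JOIN_SQL = """
SELECT p.ArticleNr, strftime('%Y_%m', b."Date") AS dt, b.BookNr, p.Amount
FROM FreeBookPos p
INNER JOIN FreeBook b ON p.FreeBook_ID = b.ID
INNER JOIN FreeFields f ON f.SQLPrimareyKey = b.ID
WHERE f.FreeFieldType = 54 AND f."Value" IN (2, 4)
ORDER BY dt, p.ArticleNr
"""
kept = conn.execute(JOIN_SQL).fetchall()
for row in kept:
    print(row)
```

Only bookings 3, 4 and 5 pass the filter, so no separate join step is needed. Note that the sample BookNr values are 1 and -1, so the CASE conditions in the aggregate query would have to match the values actually stored before the totals come out as expected.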

Count distinct values of a Column based on Distinct values of First Column

I am dealing with a huge volume of traffic data. I want to identify the vehicles which have changed lanes. I'm using Microsoft Access with VB.Net.
Traffic Data:
Vehicle_ID Lane_ID Frame_ID Distance
1 2 12 100
1 2 13 103
1 2 14 105
2 1 16 130
2 1 17 135
2 2 18 136
3 1 19 140
3 2 20 141
I tried selecting the distinct Vehicle_ID values and then applying COUNT(DISTINCT Lane_ID).
I can list the distinct Vehicle_ID values, but the query counts all Lane_ID rows instead of the distinct Lane_ID values.
SELECT
Distinct Vehicle_ID, count(Lane_ID)
FROM Table1
GROUP BY Vehicle_ID
Shown Result:
Vehicle_ID Lane Count
1 3
2 3
3 2
Correct Result:
Vehicle_ID Lane Count
1 1
2 2
3 2
Beyond that, I would like to get all rows for each Vehicle_ID that has changed lanes (including the previous lane and the new lane). The output would look something like this:
Vehicle_ID Lane_ID Frame_ID Distance
2 1 17 135
2 2 18 136
3 1 19 140
3 2 20 141

Access does not support COUNT(DISTINCT columnname) so do this:
SELECT t.Vehicle_ID, COUNT(t.Lane_ID) AS [Lane Count]
FROM (
SELECT DISTINCT Vehicle_ID, Lane_ID FROM Table1
) AS t
GROUP BY t.Vehicle_ID
So, to identify the vehicles which have changed their lanes, you need to add to the above query:
HAVING COUNT(t.Lane_ID) > 1
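For a quick check against the sample data, the sketch below runs the DISTINCT subquery with the HAVING filter and then pulls every row for the vehicles it finds. It assumes SQLite (through Python's sqlite3 module) as a stand-in for Access; the SQL avoids COUNT(DISTINCT ...) so the same shape should carry over, and the Python variable names are illustrative.

```python
import sqlite3

# Assumption: SQLite stands in for Access; the SQL deliberately avoids
# COUNT(DISTINCT ...), which Access does not support.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table1 (Vehicle_ID INTEGER, Lane_ID INTEGER, Frame_ID INTEGER, Distance INTEGER)"
)
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?, ?)",
    [
        (1, 2, 12, 100), (1, 2, 13, 103), (1, 2, 14, 105),
        (2, 1, 16, 130), (2, 1, 17, 135), (2, 2, 18, 136),
        (3, 1, 19, 140), (3, 2, 20, 141),
    ],
)

# Vehicles seen in more than one lane.
CHANGED_SQL = """
SELECT t.Vehicle_ID, COUNT(t.Lane_ID) AS lane_count
FROM (SELECT DISTINCT Vehicle_ID, Lane_ID FROM Table1) AS t
GROUP BY t.Vehicle_ID
HAVING COUNT(t.Lane_ID) > 1
ORDER BY t.Vehicle_ID
"""
changed = conn.execute(CHANGED_SQL).fetchall()

# All rows for those vehicles.
DETAIL_SQL = """
SELECT Vehicle_ID, Lane_ID, Frame_ID, Distance
FROM Table1
WHERE Vehicle_ID IN (
    SELECT t.Vehicle_ID
    FROM (SELECT DISTINCT Vehicle_ID, Lane_ID FROM Table1) AS t
    GROUP BY t.Vehicle_ID
    HAVING COUNT(t.Lane_ID) > 1
)
ORDER BY Frame_ID
"""
detail = conn.execute(DETAIL_SQL).fetchall()
print(changed)
print(detail)
```

The detail query returns every frame for vehicles 2 and 3, including frame 16; keeping only the frames on either side of a lane change would need an extra comparison with the neighbouring frame.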

SELECT
Table1.Vehicle_ID,
cTable1.LANE_COUNT
FROM Table1
INNER JOIN (
SELECT Vehicle_ID, COUNT(*) AS LANE_COUNT FROM (
SELECT DISTINCT Vehicle_ID, Lane_ID FROM Table1
) AS dTable1 -- distinct vehicle and lane IDs
GROUP BY Vehicle_ID -- count the distinct lanes per vehicle
) AS cTable1 ON cTable1.Vehicle_ID = Table1.Vehicle_ID -- join the counts back to the table
(Access SQL does not accept comments, so remove the -- notes before running the query there.)
I think you should do it step by step:
Select the distinct Vehicle_ID and Lane_ID combinations.
Count the distinct combinations per vehicle.
Join the result back to the actual table.

If you want vehicles that have changed their lanes, then you can do:
SELECT Vehicle_ID,
IIF(MIN(Lane_ID) = MAX(Lane_ID), 0, 1) as change_lane_flag
FROM Table1
GROUP BY Vehicle_ID;
I think this is as good as counting the number of distinct lanes, because neither approach counts actual "lane changes". A distinct-lane count would return "2" for the data below even though the vehicle changes lanes multiple times:
2 1 16 130
2 1 17 135
2 2 18 136
2 1 16 140
2 1 17 145
2 2 18 146

Query to return all results except for the first record

I have an archive table that has records of transactions per locationId.
A location will have 0, 1 or many rows in this table.
I need a SELECT query that will return rows for any location that has more than 1 row, and to skip the first entry.
e.g.
Transactions table
transactionId locationId amount
1 11 2343
2 11 23434
3 25 342
4 32 234
5 77 234
6 11 38938
7 43 234
8 43 1235
So given the above, since locationIds 11 and 43 have multiple rows, I should get back all of their rows except the first one for each location (the lowest transactionId):
2 11 23434
6 11 38938
8 43 1235

You can use row_number to do this. This assumes there are no duplicate transactionid values.
select transactionid,locationid,amount
from
(select t.*, row_number() over(partition by locationid order by transactionid) as rn
from transactions t) t
where rn > 1
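Here is a minimal check of the row_number approach against the sample rows from the question. It assumes SQLite 3.25 or later (through Python's sqlite3 module) as a stand-in for the original database; the ORDER BY at the end and the Python variable names are added for illustration.

```python
import sqlite3

# Assumption: SQLite (3.25+, for window functions) stands in for the
# original database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (transactionid INTEGER, locationid INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        (1, 11, 2343), (2, 11, 23434), (3, 25, 342), (4, 32, 234),
        (5, 77, 234), (6, 11, 38938), (7, 43, 234), (8, 43, 1235),
    ],
)

# Number the rows within each location by transactionid and drop row 1.
# Locations with a single row have only rn = 1, so they disappear entirely.
SKIP_FIRST_SQL = """
SELECT transactionid, locationid, amount
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY locationid ORDER BY transactionid) AS rn
    FROM transactions t
) t
WHERE rn > 1
ORDER BY transactionid
"""
rest = conn.execute(SKIP_FIRST_SQL).fetchall()
for row in rest:
    print(row)
```

No separate "more than one row" condition is needed, because a location only has a row with rn > 1 when it has at least two rows.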

The other answer is fine. You could also write it this way, it might give you a little insight into grouping practices:
SELECT Transactions.TransactionID, Transactions.locationID, Transactions.amount
FROM Transactions INNER JOIN
(SELECT locationID, MIN(TransactionID) AS MinTransaction,
COUNT(TransactionID) AS CountTransaction
FROM Transactions
GROUP BY locationID) TableSum ON Transactions.locationID = TableSum.locationID
WHERE (Transactions.TransactionID <> TableSum.MinTransaction) AND
(TableSum.CountTransaction > 1)

SELECT clause with SUM condition

I have these tables:
//TEST
NUMBER TOTAL
----------------------------
1 158
2 355
3 455
//TEST1
NUMBER QUANTITY UNITPRICE
--------------------------------------------
1 3 5
1 3 6
1 3 4
2 4 8
3 5 4
I used following query:
SELECT t.NUMBER,sum(t.TOTAL),NVL(SUM(t2.quantity*t2.unitprice),0)
FROM test t INNER JOIN test1 t2 ON t.NUMBER=t2.NUMBER
GROUP BY t.NUMBER;
OUTPUT:
NUMBER SUM(TOTAL) SUM(t2.quantity*t2.unitprice)
-----------------------------------------------------------
1 474 45 <--- only this wrong
2 355 32
It seems the TOTAL for NUMBER 1 is counted three times, once per matching TEST1 row, so the sum becomes 158 * 3 = 474.
EXPECTED OUTPUT:
NUMBER SUM(TOTAL) SUM(t2.quantity*t2.unitprice)
-----------------------------------------------------------
1 158 45
2 355 32

You have to understand that the result of your join is something like this:
//TEST1
NUMBER QUANTITY UNITPRICE TOTAL
--------------------------------------------------------------
1 3 5 158
1 3 6 158
1 3 4 158
2 4 8 355
3 5 4 455
It means you don't need to apply a SUM on TOTAL
SELECT t.NUMBER,t.TOTAL,NVL(SUM(t2.quantity*t2.unitprice),0)
FROM test t INNER JOIN test1 t2 ON t.NUMBER=t2.NUMBER
GROUP BY t.NUMBER, t.TOTAL;

Something like this should work using a subquery separating the sums:
select t.num,
sum(t.total),
test1sum
from test t
join (
select num, sum(qty*unitprice) test1sum
from test1
group by num
) t2 on t.num = t2.num
group by t.num, test1sum
In regards to your sample data, you may not even need the additional group by on the test total field. If that table only contains distinct ids, then this would work the same:
select t.num,
t.total,
sum(qty*unitprice)
from test t
join test1 t2 on t.num = t2.num
group by t.num, t.total
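The sketch below reproduces the inflated total and then the pre-aggregated fix, side by side, on the sample data. It assumes SQLite (through Python's sqlite3 module) as a stand-in for the original database and uses the shortened column names num and qty from the answer above; the Python variable names are illustrative.

```python
import sqlite3

# Assumption: SQLite stands in for the original database; num and qty are
# the shortened column names used in the answer above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (num INTEGER, total INTEGER)")
conn.execute("CREATE TABLE test1 (num INTEGER, qty INTEGER, unitprice INTEGER)")
conn.executemany("INSERT INTO test VALUES (?, ?)", [(1, 158), (2, 355), (3, 455)])
conn.executemany(
    "INSERT INTO test1 VALUES (?, ?, ?)",
    [(1, 3, 5), (1, 3, 6), (1, 3, 4), (2, 4, 8), (3, 5, 4)],
)

# Naive version: the join repeats test.total once per matching test1 row,
# so SUM(total) is inflated for num = 1.
NAIVE_SQL = """
SELECT t.num, SUM(t.total), SUM(t2.qty * t2.unitprice)
FROM test t
JOIN test1 t2 ON t.num = t2.num
GROUP BY t.num
ORDER BY t.num
"""
naive = conn.execute(NAIVE_SQL).fetchall()

# Fixed version: aggregate test1 first, so each test row joins to one row.
FIXED_SQL = """
SELECT t.num, t.total, t2.test1sum
FROM test t
JOIN (
    SELECT num, SUM(qty * unitprice) AS test1sum
    FROM test1
    GROUP BY num
) t2 ON t.num = t2.num
ORDER BY t.num
"""
fixed = conn.execute(FIXED_SQL).fetchall()
print(naive)
print(fixed)
```

Aggregating the detail table before the join keeps the fix valid even if the parent table later needs its own aggregation.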
