PostgreSQL: how to update a column with the max value for each row? - sql

So I have a table, tablea, in which I need to update the column highestprog with values from a table called buffer.
If a point intersects 2 buffers, one with 80 progression and another with 90 progression, the column should be updated with 90.
So I thought the max aggregate should be used here. My query was as follows:
UPDATE test.tablea
SET highestprog = (SELECT max(b.progression) FROM test.tablea a, test.buffer b WHERE ST_Contains(b.geom, a.geom))
However, this just updates every single row in the entire table with 100 instead of the correct value for each row. How do I update each row with the value from the right buffer?

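If I understand your question correctly, the maximum should be taken per point. Your subquery is not correlated with the row being updated, so it returns the single overall maximum for every row. Assuming that tablea contains an id column idx, one might proceed as follows: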
WITH stat AS (
SELECT a.idx, MAX(b.progression) AS maxprog
FROM
test.tablea a, test.buffer b
WHERE ST_Contains(b.geom, a.geom)
GROUP BY a.idx
)
UPDATE test.tablea
SET highestprog = stat.maxprog
FROM stat
WHERE test.tablea.idx = stat.idx
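The key point is that the maximum has to be tied to the row being updated. A minimal sketch of that idea, run through Python's sqlite3 so it is self-contained: the hits table is a hypothetical stand-in for the ST_Contains(b.geom, a.geom) test (one row per point/buffer pair where the buffer contains the point), and the sample data is invented. It uses a correlated subquery rather than the CTE above; the per-point result is the same for points that have a match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tablea (idx INTEGER PRIMARY KEY, highestprog INTEGER);
    CREATE TABLE buffer (id INTEGER PRIMARY KEY, progression INTEGER);
    -- Hypothetical stand-in for ST_Contains(b.geom, a.geom):
    -- one row per (point, buffer) pair where the buffer contains the point.
    CREATE TABLE hits (idx INTEGER, buffer_id INTEGER);

    INSERT INTO tablea (idx) VALUES (1), (2), (3);
    INSERT INTO buffer VALUES (10, 80), (11, 90), (12, 100);
    INSERT INTO hits VALUES (1, 10), (1, 11), (2, 12);
""")

# The subquery references tablea.idx, so it is re-evaluated per updated row.
conn.execute("""
    UPDATE tablea
    SET highestprog = (
        SELECT MAX(b.progression)
        FROM hits h
        JOIN buffer b ON b.id = h.buffer_id
        WHERE h.idx = tablea.idx
    )
""")

result = dict(conn.execute("SELECT idx, highestprog FROM tablea"))
print(result)
```

One difference worth noting: with the correlated subquery, a point that lies in no buffer is set to NULL, whereas the UPDATE ... FROM form above leaves such rows untouched.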

Related

How do I do a sum per id?

SELECT distinct
A.PROPOLN, C.LIFCLNTNO, A.PROSASORG, sum (A.PROSASORG) as sum
FROM [FPRODUCTPF] A
join [FNBREQCPF] B on (B.IQCPLN=A.PROPOLN)
join [FLIFERATPF] C on (C.LIFPOLN=A.PROPOLN and C.LIFPRDCNT=A.PROPRDCNT and C.LIFBNFCNT=A.PROBNFCNT)
where C.LIFCLNTNO='2012042830507' and A.PROSASORG>0 and A.PROPRDSTS='10' and
A.PRORECSTS='1' and A.PROBNFLVL='M' and B.IQCODE='B10000' and B.IQAPDAT>20180101
group by C.LIFCLNTNO, A.PROPOLN, A.PROSASORG
This does not sum correctly, it returns two lines instead of one:
   PROPOLN    LIFCLNTNO      PROSASORG  sum
1  209814572  2012042830507  3881236    147486968
2  209814572  2012042830507  15461074   463832220
You are seeing two rows because A.PROSASORG has two different values for the "C.LIFCLNTNO, A.PROPOLN" grouping.
i.e.
C.LIFCLNTNO, A.PROPOLN, A.PROSASORG together give you two unique rows.
If you want a single row for C.LIFCLNTNO, A.PROPOLN, then you may want to use an aggregate on A.PROSASORG as well.
Your entire query is being filtered on your "C" table by the one LifClntNo,
so you can leave that out of your group by and just have it as a MAX() value
in your select since it will always be the same value.
As for summing the PROSASORG column, as mentioned in a comment on the other answer, just sum it. Your column names don't make their purpose clear, so I don't know if it's just a number, a quantity, or something else. You might want to pull that column out of your query completely if you want results based on a single product id.
For performance, I would suggest the following indexes:
Table       Index
FPRODUCTPF  ( PROPRDSTS, PRORECSTS, PROBNFLVL, PROPOLN )
FNBREQCPF   ( IQCODE, IQCPLN, IQAPDAT )
FLIFERATPF  ( LIFPOLN, LIFPRDCNT, LIFBNFCNT, LIFCLNTNO )
I have rewritten your query so that each table's conditions sit in that table's own JOIN clause rather than all in the WHERE clause.
SELECT
P.PROPOLN,
max( L.LIFCLNTNO ) LIFCLNTNO,
sum (P.PROSASORG) as sum
FROM
[FPRODUCTPF] P
join [FNBREQCPF] N
on N.IQCODE = 'B10000'
and P.PROPOLN = N.IQCPLN
and N.IQAPDAT > 20180101
join [FLIFERATPF] L
on L.LIFCLNTNO='2012042830507'
and P.PROPOLN = L.LIFPOLN
and P.PROPRDCNT = L.LIFPRDCNT
and P.PROBNFCNT = L.LIFBNFCNT
where
P.PROPRDSTS = '10'
and P.PRORECSTS = '1'
and P.PROBNFLVL = 'M'
and P.PROSASORG > 0
group by
P.PROPOLN
Now, one additional issue you will PROBABLY run into. You are doing a query with multiple joins, and it appears that there will be multiple records in EACH of your FNBREQCPF and FLIFERATPF tables for the same FPRODUCTPF entry. If so, you will get a Cartesian result, as the PROSASORG value will be counted once for each combination of matching rows in the two other tables.
Ex: FProductPF has ID = X with a Prosasorg value of 3
FNBreQCPF has matching records of Y1 and Y2
FLIFERATPF has matching records of Z1, Z2 and Z3.
So now your total will be equal to 3 times 6 = 18.
If you look at the combinations, Y1:Z1, Y1:Z2, Y1:Z3 and Y2:Z1, Y2:Z2, Y2:Z3 give you 6 qualifying entries, each carrying the original value of 3, which bloats your numbers, IF such multiple records may exist in each respective table. Now imagine your tables have 30 and 40 matching instances respectively: you have just inflated your totals 1200 times.
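The X / Y1-Y2 / Z1-Z3 example can be reproduced in a few lines. This is a sketch using Python's sqlite3 with simplified, hypothetical table and column names (product, req, liferat) standing in for the real ones. It also shows one possible way around the inflation, replacing the joins with EXISTS checks, which only fits when the other tables are used purely as filters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified stand-ins for FPRODUCTPF, FNBREQCPF and FLIFERATPF.
    CREATE TABLE product (id TEXT, prosasorg INTEGER);
    CREATE TABLE req     (product_id TEXT, name TEXT);
    CREATE TABLE liferat (product_id TEXT, name TEXT);

    INSERT INTO product VALUES ('X', 3);
    INSERT INTO req     VALUES ('X', 'Y1'), ('X', 'Y2');
    INSERT INTO liferat VALUES ('X', 'Z1'), ('X', 'Z2'), ('X', 'Z3');
""")

# Joining both tables yields 2 * 3 = 6 rows for X, so the value 3 is summed 6 times.
inflated = conn.execute("""
    SELECT SUM(p.prosasorg)
    FROM product p
    JOIN req r     ON r.product_id = p.id
    JOIN liferat l ON l.product_id = p.id
""").fetchone()[0]

# EXISTS only tests for a match, so each product row is counted once.
exact = conn.execute("""
    SELECT SUM(p.prosasorg)
    FROM product p
    WHERE EXISTS (SELECT 1 FROM req r     WHERE r.product_id = p.id)
      AND EXISTS (SELECT 1 FROM liferat l WHERE l.product_id = p.id)
""").fetchone()[0]

print(inflated, exact)
```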

How to create an additional column in a SQL query that contains the number of rows with a column value equal to a column value from the current row?

This is what I currently have (it doesn't work):
select MOCKSTEMS.WORD_ID,
MOCKSTEMS.STEM_ID,
MOCKSTEMS.LABSTEM,
MOCKSTEMS.LABSTEMCATEGORY,
MOCKLEMMAS.LEMMAFORM,
MOCKSTEMS.LEMMA_ID,
MOCKWORDS.ORIGINALWORD,
MOCKSTEMS.CONTAINEDIN,
COUNT(*) as SAMEVALUE from MOCKSTEMS where CONTAINEDIN=STEM_ID
from MOCKSTEMS
inner join MOCKWORDS on MOCKSTEMS.WORD_ID = MOCKWORDS.WORD_ID
inner join MOCKLEMMAS on MOCKSTEMS.LEMMA_ID = MOCKLEMMAS.LEMMA_ID
Basically, I wish to create a column called 'SAMEVALUE' that shows the number of rows in this query with 'CONTAINEDIN' values equal to the 'STEM_ID' value of each row. Is this possible, and if so, how can I do it with SQL?
EDITED:
This is what I get when I run the query without the 'COUNT(*) as SAMEVALUE from MOCKSTEMS where CONTAINEDIN=STEM_ID' row:
[Image: a few rows returned by the query]
For example, for the row with STEM_ID='stem-003' and LABSTEM='owotan okitz', I would like the SAMEVALUE column to have value 2, because there are 2 rows with CONTAINEDIN='stem-003', as circled in this image.
It would also be fine if the SAMEVALUE column just indicates true/false (or 0/1) depending on whether there are rows with CONTAINEDIN values equal to the STEM_ID of each row.
To get an overall count alongside the query results, you need an analytic (window) function. To count only the rows that satisfy some condition, put the condition in a CASE expression that returns a value when it is true and NULL otherwise; COUNT then ignores the NULLs.
select MOCKSTEMS.WORD_ID,
MOCKSTEMS.STEM_ID,
MOCKSTEMS.LABSTEM,
MOCKSTEMS.LABSTEMCATEGORY,
MOCKLEMMAS.LEMMAFORM,
MOCKSTEMS.LEMMA_ID,
MOCKWORDS.ORIGINALWORD,
MOCKSTEMS.CONTAINEDIN,
COUNT(
case
when CONTAINEDIN=STEM_ID
then 1
end
) over() as SAMEVALUE
/*Over is empty to consider all the result set as a single window*/
from MOCKSTEMS
inner join MOCKWORDS on MOCKSTEMS.WORD_ID = MOCKWORDS.WORD_ID
inner join MOCKLEMMAS on MOCKSTEMS.LEMMA_ID = MOCKLEMMAS.LEMMA_ID
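Note that the window query above compares CONTAINEDIN with STEM_ID within the same row and reports one overall total on every row. If what is wanted is the per-row count described in the question (how many rows have a CONTAINEDIN equal to this row's STEM_ID), a correlated subquery is one alternative. The sketch below runs it through Python's sqlite3 on a cut-down, hypothetical stems table; only the 'stem-003' case with an expected count of 2 comes from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Cut-down stand-in for MOCKSTEMS; the rows are invented sample data.
    CREATE TABLE stems (stem_id TEXT, containedin TEXT);
    INSERT INTO stems VALUES
        ('stem-001', NULL),
        ('stem-002', 'stem-003'),
        ('stem-003', NULL),
        ('stem-004', 'stem-003');
""")

# For each row s, count the rows c whose containedin points at s.stem_id.
samevalue = dict(conn.execute("""
    SELECT s.stem_id,
           (SELECT COUNT(*)
            FROM stems c
            WHERE c.containedin = s.stem_id) AS samevalue
    FROM stems s
    ORDER BY s.stem_id
"""))

print(samevalue)
```

For the true/false variant mentioned in the question, the same subquery could be wrapped in EXISTS instead of COUNT(*).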

SQL : Avoiding duplicates by comparing the consecutive rows

Let's say I have 10 rows of data with duplicates in a column named PS_DRIVER. I need to get only one row for each value of PS_DRIVER, the one with the highest value in PS_Completion_TS (a timestamp).
Note: Some PS_DRIVER values don't have duplicates, and in that case I need that row too.
Try this:
select td.PS_DRIVER
from tabDrivers td
where td.PS_Completion_TS in (select max(td2.PS_Completion_TS)
from tabDrivers td2
where td2.PS_DRIVER = td.PS_DRIVER )
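To see the pattern on data, here is a small sketch using Python's sqlite3. The table and column names follow the query above; the payload column and the sample rows are invented for illustration, and the timestamps are stored as ISO-8601 text so that MAX compares them correctly. It selects the whole row rather than only PS_DRIVER.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tabDrivers (PS_DRIVER TEXT, PS_Completion_TS TEXT, payload TEXT);
    INSERT INTO tabDrivers VALUES
        ('d1', '2020-01-01 10:00:00', 'old'),
        ('d1', '2020-01-02 10:00:00', 'new'),
        ('d2', '2020-01-01 09:00:00', 'only');
""")

# Keep a row only if its timestamp is the latest one for its driver.
rows = conn.execute("""
    SELECT td.PS_DRIVER, td.PS_Completion_TS, td.payload
    FROM tabDrivers td
    WHERE td.PS_Completion_TS = (SELECT MAX(td2.PS_Completion_TS)
                                 FROM tabDrivers td2
                                 WHERE td2.PS_DRIVER = td.PS_DRIVER)
    ORDER BY td.PS_DRIVER
""").fetchall()

print(rows)
```

Drivers without duplicates (d2 here) are kept as well, since their only row is trivially the latest. If two rows for one driver share the same maximum timestamp, both are returned.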

How do I return a value of an entity in a table that is less than but closest to the value in another table for each element in the last table in SQL?

I have two tables in MS Access and I am trying to add a field for one of those tables that tells which record from another table has a value that is less than the first field's value, but comes the closest? I have this query so far (just a select statement to test output and not alter existing tables), but it lists all values that are less than the querying value:
SELECT JavaClassFileList.ClassFile, ModuleList.Module
FROM JavaClassFileList, ModuleList
WHERE ModuleList.Order<JavaClassFileList.Order;
I tried things like SELECT JavaClassFileList.Classfile, MAX(ModuleList.Module) to display only the maximum module, combined with the select statement above, but it would say that it would only return one record.
Output desired: I have some records, a, b, and c, I shall call them, each storing various information, while a is storing a value of 732 in a column, and b is storing a value of 731 in the same column. c is storing a value of 720. In another table, d is storing a value of 730 and e is storing a value of 718. I want the output like this (they are ordered largest to smallest):
a 732 d 730
b 731 d 730
c 720 e 718
There can be duplicates on the right, but no duplicates on the left. How can I get this result?
I would approach this type of query using a correlated subquery. I think the following works in Access:
SELECT jc.ClassFile,
       (select top 1 ml.Module
        from ModuleList as ml
        where ml.[Order] < jc.[Order]
        order by ml.[Order] desc
       ) as Module
FROM JavaClassFileList as jc;
I'm assuming Order is unique for Module. If it isn't, JavaClassFileList records may show up multiple times in the result set.
If no module can be found for a JavaClassFile then it will not show up in the results. If you do want it to show up in cases like that (with a null module), replace INNER JOIN with LEFT OUTER JOIN.
SELECT j.ClassFile, m.Module
FROM JavaClassFileList j
INNER JOIN ModuleList m
ON m.[Order] =
   (SELECT MAX([Order])
    FROM ModuleList
    WHERE [Order] < j.[Order])
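The sample data from the question (a=732, b=731, c=720 on the left; d=730, e=718 on the right) can be used to check the MAX-based join. The sketch below runs it through Python's sqlite3 rather than Access, so the quoting of the Order column differs, and it uses the LEFT JOIN variant so a class file with no lower module would still appear.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- "Order" is a reserved word, so it is quoted here (SQLite syntax).
    CREATE TABLE JavaClassFileList (ClassFile TEXT, "Order" INTEGER);
    CREATE TABLE ModuleList        (Module TEXT,    "Order" INTEGER);

    INSERT INTO JavaClassFileList VALUES ('a', 732), ('b', 731), ('c', 720);
    INSERT INTO ModuleList        VALUES ('d', 730), ('e', 718);
""")

# For each class file, join the module whose Order is the largest value
# that is still below the class file's Order.
pairs = conn.execute("""
    SELECT j.ClassFile, j."Order", m.Module, m."Order"
    FROM JavaClassFileList j
    LEFT JOIN ModuleList m
      ON m."Order" = (SELECT MAX(m2."Order")
                      FROM ModuleList m2
                      WHERE m2."Order" < j."Order")
    ORDER BY j."Order" DESC
""").fetchall()

for row in pairs:
    print(*row)
```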

how do I pull out a random record from an SQL table?

I have an SQL table which has two integers. Let these integers be a and b.
I want to SELECT out a random record, such that the record is selected with probability proportional to C + a/b for some constant C which I will choose.
So for example, if C = 0 and there are two records with a=1,b=2 and a=2,b=3, then C+a/b = 1/2 for the first record and 2/3 for the second, and therefore I will choose the first record with probability 3/7 (about 0.43) and the second with probability 4/7 (about 0.57) from that SELECT query.
I know SQL well (I thought), but I am not even sure where to begin here. I thought of doing a select for "SUM(a/b)" first, and then selecting the first record for which the running sum of C+a/b up to it exceeds a random number between 0 and C*number_of_records + SUM(a/b). But I don't really know how to do that.
You could do something like sorting by a random number multiplied by your weight expression and selecting the top 1 from that query, something like:
SELECT TOP 1 (your column names)
FROM (your table)
ORDER BY Rand() * (your calculation)
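Two caveats apply to that ordering trick. Scaling a uniform random number by the weight does not give probabilities exactly proportional to the weight, and in some engines (SQL Server's RAND(), for instance) the random function is evaluated once per query rather than once per row. The cumulative-sum idea from the question does give exact proportions. Below is a minimal sketch of it in Python, done client-side after fetching the (a, b) pairs; the function name pick_weighted and its parameters are made up for illustration, and a/b is computed as floating-point division (in SQL, integer columns would need a cast first).

```python
import random


def pick_weighted(records, c, u):
    """Pick one (a, b) record with probability proportional to c + a/b.

    u is a uniform random number in [0, 1); passing it in explicitly
    keeps the function deterministic and easy to test.
    """
    weights = [c + a / b for a, b in records]
    target = u * sum(weights)  # a point somewhere along the total weight
    running = 0.0
    for record, weight in zip(records, weights):
        running += weight
        if running > target:  # first record whose running sum passes the point
            return record
    return records[-1]  # guard against floating-point rounding at the top end


records = [(1, 2), (2, 3)]  # the two example rows from the question
choice = pick_weighted(records, c=0, u=random.random())
print(choice)
```

With C = 0 the first record is returned whenever u falls below 3/7, matching the proportions in the question's example.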