How do I return a value of an entity in a table that is less than but closest to the value in another table for each element in the last table in SQL? - sql

I have two tables in MS Access and I am trying to add a field for one of those tables that tells which record from another table has a value that is less than the first field's value, but comes the closest? I have this query so far (just a select statement to test output and not alter existing tables), but it lists all values that are less than the querying value:
SELECT JavaClassFileList.ClassFile, ModuleList.Module
FROM JavaClassFileList, ModuleList
WHERE ModuleList.Order<JavaClassFileList.Order;`
I tried using things likeSELECT JavaClassFileList.Classfile, MAX(ModuleList.Module), which will only display the maximum module but combined it with the select statement above, but it would say that it would only return one record.
Output desired: I have some records, a, b, and c, I shall call them, each storing various information, while a is storing a value of 732 in a column, and b is storing a value of 731 in the same column. c is storing a value of 720. In another table, d is storing a value of 730 and e is storing a value of 718. I want the output like this (they are ordered largest to smallest):
a 732 d 730
b 731 d 730
c 720 e 718
There can be duplicates on the right, but no duplicates on the left. How can I get this result?

I would approach this type of query using a correlated subquery. I think the following words in Access:
SELECT jc.ClassFile,
(select top 1 ml.Module
from ModuleList as ml
where ml.[Order] < jc.[Order]
)
FROM JavaClassFileList as jc;

I'm assuming Order is unique for Module. If it isn't, JavaClassFileRecords may show up multiple times in the resultset.
If no module can be found for a JavaClassFile then it will not show up in the results. If you do want it to show up in cases like that (with a null module), replace INNER JOIN with LEFT OUTER JOIN.
SELECT j.ClassFile, m.Module
FROM JavaClassFileList j
INNER JOIN ModuleList m
ON m.Order =
(SELECT MAX(Order)
FROM ModuleList
WHERE Order < j.Order)

Related

How do I do a sum per id?

SELECT distinct
A.PROPOLN, C.LIFCLNTNO, A.PROSASORG, sum (A.PROSASORG) as sum
FROM [FPRODUCTPF] A
join [FNBREQCPF] B on (B.IQCPLN=A.PROPOLN)
join [FLIFERATPF] C on (C.LIFPOLN=A.PROPOLN and C.LIFPRDCNT=A.PROPRDCNT and C.LIFBNFCNT=A.PROBNFCNT)
where C.LIFCLNTNO='2012042830507' and A.PROSASORG>0 and A.PROPRDSTS='10' and
A.PRORECSTS='1' and A.PROBNFLVL='M' and B.IQCODE='B10000' and B.IQAPDAT>20180101
group by C.LIFCLNTNO, A.PROPOLN, A.PROSASORG
This does not sum correctly, it returns two lines instead of one:
PROPOLN LIFCLNTNO PROSASORG sum
1 209814572 2012042830507 3881236 147486968
2 209814572 2012042830507 15461074 463832220
You are seeing two rows because A.PROSASORG has two different values for the "C.LIFCLNTNO, A.PROPOLN" grouping.
i.e.
C.LIFCLNTNO, A.PROPOLN, A.PROSASORG together give you two unique rows.
If you want a single row for C.LIFCLNTNO, A.PROPOLN, then you may want to use an aggregate on A.PROSASORG as well.
Your entire query is being filtered on your "C" table by the one LifClntNo,
so you can leave that out of your group by and just have it as a MAX() value
in your select since it will always be the same value.
As for you summing the PROSASORG column via comment from other answer, just sum it. Hour column names are not evidently clear for purpose, so I dont know if its just a number, a quantity, or whatever. You might want to just pull that column out of your query completely if you want based on a single product id.
For performance, I would suggest the following indexes on
Table Index
FPRODUCTPF ( PROPRDSTS, PRORECSTS, PROBNFLVL, PROPOLN )
FNBREQCPF ( IQCODE, IQCPLN, IQAPDAT )
FLIFERATPF ( LIFPOLN, LIFPRDCNT, LIFBNFCNT, LIFCLNTNO )
I have rewritten your query to put the corresponding JOIN components to the same as the table they are based on vs all in the where clause.
SELECT
P.PROPOLN,
max( L.LIFCLNTNO ) LIFCLNTNO,
sum (P.PROSASORG) as sum
FROM
[FPRODUCTPF] P
join [FNBREQCPF] N
on N.IQCODE = 'B10000'
and P.PROPOLN = N.IQCPLN
and N.IQAPDAT > 20180101
join [FLIFERATPF] L
on L.LIFCLNTNO='2012042830507'
and P.PROPOLN = L.LIFPOLN
and P.PROPRDCNT = L.LIFPRDCNT
and P.PROBNFCNT = L.LIFBNFCNT
where
P.PROPRDSTS = '10'
and P.PRORECSTS = '1'
and P.PROBNFLVL = 'M'
and P.PROSASORG > 0
group by
P.PROPOLN
Now, one additional issue you will PROBABLY be running into. You are doing a query with multiple joins, and it appears that there will be multiple records in EACH of your FNBREQCPF and FLIFERATPF tables for the same FPRODUCTPF entry. If you, you will be getting a Cartesian result as the PROSASORG value will be counted for each instance combination in the two other tables.
Ex: FProductPF has ID = X with a Prosasorg value of 3
FNBreQCPF has matching records of Y1 and Y2
FLIFERATPF has matching records of Z1, Z2 and Z3.
So now your total will be equal to 3 times 6 = 18.
If you look at the combinations, Y1:Z1, Y1:Z2, Y1:Z3 AND Y2:Z1, Y2:Z2, Y2:Z3 giving your 6 entries that qualify, times the original value of 3, thus bloating your numbers -- IF such multiple records may exist in each respective table. Now, imagine if your tables have 30 and 40 matching instances respectively, you have just bloated your totals by 1200 times.

How to select multiple values with the same ids and put them in one row, while maintaining the id to value connection?

I have a processknowledgeentry table that has the following data:
pke_id prc_id knw_id
1 1 2
2 1 4
3 2 4
The column knw_id references another table called knowledge, which also has its own id column. I want to be able to select all knw_id values with the same prc_id, and have them retain its nature as an id (so that it remains referenceable to the knowledge table).
Desired result:
prc_id knw_ids
1 [2, 4]
My code is shown below. (It also selects a Process Name from another table called process by inner joining the prc_ids. That part works correctly at least.)
SELECT * FROM (
SELECT
p.prc_name,
(SELECT knw_id
FROM processknowledgeentry
GROUP BY knw_id
HAVING COUNT(*) > 1)
FROM processknowledgeentry pke
INNER JOIN process p
ON pke.prc_id=p.prc_id
WHERE pke.prc_id = %s) as temp
I get the error: "CardinalityViolation: more than one row returned by a subquery used as an expression", and I understand why the error exists, so I want to know how to work around it. I'm also not sure if my logic is correct.
Would appreciate any assistance, thank you!
Seems you need a STRING_AGG() function instead of GROUP_CONCAT(), which some other DBMS has, containing a string type parameter as the first argument along with HAVING clause which filters multiple prc_id values such as
SELECT p.prc_id, STRING_AGG(knw_id::TEXT,',') AS knw_ids
FROM processknowledgeentry pke
JOIN process p
ON pke.prc_id = p.prc_id
-- WHERE pke.prc_id = %s
GROUP BY p.prc_id
HAVING COUNT(pke.prc_id) > 1
Indeed this case, a WHERE clause won't be needed.
Demo

SQL concat value as column

I'm trying to convert one string to a valid column.
As you can see I need to get something like 'mo._olddb_uid_'+nu._olddb_name_db that should use the column mo._olddb_uid_001
How can I achieve this:
SELECT *
FROM [PI_CONSOLIDATION].[dbo].[new_unite] nu
LEFT JOIN [PI_CONSOLIDATION].[dbo].[Motif_Orientation] mo ON CONCAT('mo._olddb_uid_',nu._olddb_name_db) = CONCAT(nu._olddb_name_db,'_',nu.id_motif_orientation)
WHERE nu.nom_res = 'TEST' and nu.prenom_res = 'Foobar'
Thanks
EDITED:
I have 18 application each with his DB. Those applications are almost similar with different data but sometimes the data can be found on also on the other sources.
So [new_unite] has the id of patient, id_group and the source database
id_resident id_groupe_res _olddb_name_db
728 31 src1
629 21 src6
731 25 src9
934 12 src18
...
The other table has some parameters that have identical params but with different IDs depending on the DB of the source.
So the [Motif_Orientation] looks like :
So mainly this query is just only to test if the data is stored correctly on the final application where there is just one DB and all the data merged
id_motif_orientation label
1407 Famille
1410 Structures d'hébergement
1422 Etablissement d'Education Spéciale
What you could try to do is make a function with the output being the value you want to left join. Your output should be a single value doesn't matter the type. Then, call it as:
LEFT JOIN
(func_name(param1,param2,...) AS CONCAT(...) FROM DUAL)
ON 1=1
The result should be your value left joined to the current table under whatever column name you need. The only problem is that you will need to do this one by one for each row, so it is only really useful for smaller tables. Wish you gave more info so I could give a better answer but this worked for me when I was having a similar issue with renaming a column.

SAS: how to properly use intck() in proc sql

I have the following codes in SAS:
proc sql;
create table play2
as select a.anndats,a.amaskcd,count(b.amaskcd) as experience
from test1 as a, test1 as b
where a.amaskcd = b.amaskcd and intck('day', b.anndats, a.anndats)>0
group by a.amaskcd, a.ANNDATS;
quit;
The data test1 has 32 distinct obs, while this play2 only returns 22 obs. All I want to do is for each obs, count the number of appearance for the same amaskcd in history. What is the best way to solve this? Thanks.
The reason this would return 22 observations - which might not actually be 22 distinct from the 32 - is that this is a comma join, which in this case ends up being basically an inner join. For any given row a if there are no rows b which have a later anndats with the same amaskcd, then that a will not be returned.
What you want to do here is a left join, which returns all rows from a once.
create table play2
as select ...
from test1 a
left join test1 b
on a.amaskcd=b.amaskcd
where intck(...)>0
group by ...
;
I would actually write this differently, as I'm not sure the above will do exactly what you want.
create table play2
as select a.anndats, a.amaskcd,
(select count(1) from test1 b
where b.amaskcd=a.amaskcd
and b.anndats>a.anndats /* intck('day') is pointless, dates are stored as integer days */
) as experience
from test1 a
;
If your test1 isn't already grouped by amaskcd and anndats, you may need to rework this some. This kind of subquery is easier to write and more accurately reflects what you're trying to do, I suspect.
If both the anndats variables in each dataset are date type (not date time) then you can simple do an equals. Date variables in SAS are simply integers where 1 represents one day. You would not need to use the intck function to tell the days differnce, just use subtraction.
The second thing I noticed is your code looks for > 0 days returned. The intck function can return a negative value if the second value is less than the first.
I am still not sure I understand what your looking to produce in the query. It's joining two datasets using the amaskcd field as the key. Your then filtering based on anndats, only selecting records where b anndats value is less than a anndats or b.anndats < a.anndats.

Remove duplicate column after SQL query

I have this query but I'm getting two columns of houseid:
How do I only get one?
SELECT vehv2pub.houseid, vehv2pub.vehid, vehv2pub.epatmpg,
dayv2pub.houseid, dayv2pub.trpmiles
FROM vehv2pub, dayv2pub
WHERE vehv2pub.vehid >= 1
AND dayv2pub.trpmiles < 15
AND dayv2pub.houseid = vehv2pub.houseid;
And also, how do I get the average of the epatmpg? So the query would just return the value?
The most elegant way would be to use the USING clause in an explicit join condition:
SELECT houseid, v.vehid, v.epatmpg, d.houseid, d.trpmiles
FROM vehv2pub v
JOIN dayv2pub d USING (houseid)
WHERE v.vehid >= 1
AND d.trpmiles < 15;
This way, the column houseid is in the result only once, even if you use SELECT *.
Per documentation:
USING is a shorthand notation: it takes a comma-separated list of
column names, which the joined tables must have in common, and forms a
join condition specifying equality of each of these pairs of columns.
Furthermore, the output of JOIN USING has one column for each of the
equated pairs of input columns, followed by the remaining columns from each table.
To get the average epatmpg for the selected rows:
SELECT avg(v.epatmpg) AS avg_epatmpg
FROM vehv2pub v
JOIN dayv2pub d USING (houseid)
WHERE v.vehid >= 1
AND d.trpmiles < 15;
If there are multiple matches in dayv2pub, the derived table can hold multiple instances of each row in vehv2pub after the join. avg() is based on the derived table.
not 100% sure this works in postgres sql, but something like this gets the average in SQL server:
SELECT vehv2pub.houseid, avg(vehv2pub.epatmpg)
FROM vehv2pub, dayv2pub
WHERE vehv2pub.vehid >= 1
AND dayv2pub.trpmiles < 15
AND dayv2pub.houseid = vehv2pub.houseid
GROUP BY vehv2pub.houseid