Selecting elements that don't exist - sql

I am working on an application that has to assign numeric codes to elements. This codes are not consecutives and my idea is not to insert them in the data base until have the related element, but i would like to find, in a sql matter, the not assigned codes and i dont know how to do it.
Any ideas?
Thanks!!!
Edit 1
The table can be so simple:
code | element
-----------------
3 | three
7 | seven
2 | two
And I would like something like this: 1, 4, 5, 6. Without any other table.
Edit 2
Thanks for the feedback, your answers have been very helpful.

This will return NULL if a code is not assigned:
SELECT assigned_codes.code
FROM codes
LEFT JOIN
assigned_codes
ON assigned_codes.code = codes.code
WHERE codes.code = #code
This will return all non-assigned codes:
SELECT codes.code
FROM codes
LEFT JOIN
assigned_codes
ON assigned_codes.code = codes.code
WHERE assigned_codes.code IS NULL
There is no pure SQL way to do exactly the thing you want.
In Oracle, you can do the following:
SELECT lvl
FROM (
SELECT level AS lvl
FROM dual
CONNECT BY
level <=
(
SELECT MAX(code)
FROM elements
)
)
LEFT OUTER JOIN
elements
ON code = lvl
WHERE code IS NULL
In PostgreSQL, you can do the following:
SELECT lvl
FROM generate_series(
1,
(
SELECT MAX(code)
FROM elements
)) lvl
LEFT OUTER JOIN
elements
ON code = lvl
WHERE code IS NULL

Contrary to the assertion that this cannot be done using pure SQL, here is a counter example showing how it can be done. (Note that I didn't say it was easy - it is, however, possible.) Assume the table's name is value_list with columns code and value as shown in the edits (why does everyone forget to include the table name in the question?):
SELECT b.bottom, t.top
FROM (SELECT l1.code - 1 AS top
FROM value_list l1
WHERE NOT EXISTS (SELECT * FROM value_list l2
WHERE l2.code = l1.code - 1)) AS t,
(SELECT l1.code + 1 AS bottom
FROM value_list l1
WHERE NOT EXISTS (SELECT * FROM value_list l2
WHERE l2.code = l1.code + 1)) AS b
WHERE b.bottom <= t.top
AND NOT EXISTS (SELECT * FROM value_list l2
WHERE l2.code >= b.bottom AND l2.code <= t.top);
The two parallel queries in the from clause generate values that are respectively at the top and bottom of a gap in the range of values in the table. The cross-product of these two lists is then restricted so that the bottom is not greater than the top, and such that there is no value in the original list in between the bottom and top.
On the sample data, this produces the range 4-6. When I added an extra row (9, 'nine'), it also generated the range 8-8. Clearly, you also have two other possible ranges for a suitable definition of 'infinity':
-infinity .. MIN(code)-1
MAX(code)+1 .. +infinity
Note that:
If you are using this routinely, there will generally not be many gaps in your lists.
Gaps can only appear when you delete rows from the table (or you ignore the ranges returned by this query or its relatives when inserting data).
It is usually a bad idea to reuse identifiers, so in fact this effort is probably misguided.
However, if you want to do it, here is one way to do so.

This the same idea which Quassnoi has published.
I just linked all ideas together in T-SQL like code.
DECLARE
series #table(n int)
DECLARE
max_n int,
i int
SET i = 1
-- max value in elements table
SELECT
max_n = (SELECT MAX(code) FROM elements)
-- fill #series table with numbers from 1 to n
WHILE i < max_n BEGIN
INSERT INTO #series (n) VALUES (i)
SET i = i + 1
END
-- unassigned codes -- these without pair in elements table
SELECT
n
FROM
#series AS series
LEFT JOIN
elements
ON
elements.code = series.n
WHERE
elements.code IS NULL
EDIT:
This is, of course, not ideal solution. If you have a lot of elements or check for non-existing code often this could cause performance issues.

Related

How to extract a json value inside an array inside a json in SQL

I have a JSON being returned in my query called MetaDataJSON, inside which is an array of JSONs called Results. Inside each JSON in Results are two values, Chronic and Probability. There are a couple other tables that have been joined too. Is there a way to get Chronic in a column by itself? Right now I have gotten this far (table and variable names have been made generic):
SELECT DISTINCT
JSON_QUERY(mdj.value, '$.Results[0]') [Results]
FROM table1 t1
JOIN table2 t2 ON t2.parameter1 = t1.parameter1
AND t2.parameter2 = 'ASDF'
JOIN table3 t3 ON oad.parameter3 = oa.parameter3
AND t3.parameter4 = 11
CROSS APPLY OPENJSON(t3.MetaDataJSON) as mdj
This gets me a column called Results where each entry looks like:
{"Chronic": 0, "Probability": 0.0016}
Is there an efficient way to get Chronic in a column by itself? Thanks!
You can do it like this,
WITH jsons (x)
AS
(
-- replace this part with your query
Select a.x from (select '{"Chronic": 0, "Probability": 0.0016}' x ) as a, (SELECT 1 as y union select 2 as y ) as b
)
select
(SELECT value FROM OPENJSON(x,'$') where [key]='Chronic') as "Chronic",
(SELECT value FROM OPENJSON(x,'$') where [key]='Probability') as "Probability"
from jsons;
I think you can change the #json equation to use your query. But I cannot try since I don't have your tables...
BTW, I assume you are using MSSQL...

How to correctly order a recursive DB2 query

Hello friendly internet wizards.
I am attempting to extract a levelled bill of materials (BOM) from a dataset, running in DB2 on an AS400 server.
I have constructed most of the query (with a lot of help from online resources), and this is what I have so far;
#set item = '10984'
WITH BOM (origin, PMPRNO, PMMTNO, BOM_Level, BOM_Path, IsCycle, IsLeaf) AS
(SELECT CONNECT_BY_ROOT PMPRNO AS origin, PMPRNO, PMMTNO,
LEVEL AS BOM_Level,
SYS_CONNECT_BY_PATH(TRIM(PMMTNO), ' : ') BOM_Path,
CONNECT_BY_ISCYCLE IsCycle,
CONNECT_BY_ISLEAF IsLeaf
FROM MPDMAT
WHERE PMCONO = 405 AND PMFACI = 'M01' AND PMSTRT = 'STD'
START WITH PMPRNO = :item
CONNECT BY NOCYCLE PRIOR PMMTNO = PMPRNO)
SELECT 0 AS BOM_Level, '' AS BOM_Path, MMITNO AS Part_Number, MMITDS AS Part_Name,
IFSUNO AS Supplier_Number, IDSUNM AS Supplier_Name, IFSITE AS Supplier_Part_Number
FROM MITMAS
LEFT OUTER JOIN MITVEN ON MMCONO = IFCONO AND MMITNO = IFITNO AND IFSUNO <> 'ZGA'
LEFT OUTER JOIN CIDMAS ON MMCONO = IDCONO AND IDSUNO = IFSUNO
WHERE MMCONO = 405
AND MMITNO = :item
UNION ALL
SELECT BOM.BOM_Level, BOM_Path, BOM.PMMTNO AS Part_Number, MMITDS AS Part_Name,
IFSUNO AS Supplier_Number, IDSUNM AS Supplier_Name, IFSITE AS Supplier_Part_Number
FROM BOM
LEFT OUTER JOIN MITMAS ON MMCONO = 405 AND MMITNO = BOM.PMMTNO
LEFT OUTER JOIN MITVEN ON IFCONO = MMCONO AND IFITNO = MMITNO AND IFSUNO <> 'ZGA' AND MMMABU = '2'
LEFT OUTER JOIN CIDMAS ON MMCONO = IDCONO AND IDSUNO = IFSUNO
;
This is correctly extracting the components for a given item, as well as the sub-components (etc).
Current data looks like this (I have stripped out some columns that aren't relevant to the issue);
https://pastebin.com/LUnGKRqH
My issue is the order that the data is being presented in.
As you can see in the pastebin above, the first column is the 'level' of the component. This starts with the parent item at level 0, and can theoretically go down as far as 99 levels.
The path is also show there, so for example the second component 853021 tells us that it's a 2nd level component, the paths up to INST363 (shown later in the list as a 1st level), then up to the parent at level 0.
I would like for the output to show in path order (for lack of a better term).
Therefore, after level 0, it should be showing the first level 1 component, and then immediately be going into it's level 2 components and so on, until no further level is found. Then at that point, it returns back up the path to the next valid record.
I hope I have explained that adequately, but essentially the data should come out as;
Level
Path
Item
0
10984
1
: INST363
INST363
2
: INST363 : 853021
853021
1
: 21907
21907
Any help that can be provided would be very much appreciated!
Thanks,
This is an interesting query. Frankly I am surprised it works as well as it does since it is not structured the way I usually structure queries with a recursive CTE. The main issue is that while you have the Union in there, it does not appear to be within the CTE portion of the query.
When I write a recursive CTE, it is generally structured like this:
with cte as (
priming select
union all
secondary select)
select * from cte
So to get a BOM from an Item Master that looks something like:
CREATE TABLE item (
ItemNo Char(10) PRIMARY KEY,
Description Char(50));
INSERT INTO item
VALUES ('Item0', 'Root Item'),
('Item1a', 'Second Level Item'),
('Item1b', 'Another Second Level Item'),
('Item2a', 'Third Level Item');
and a linkage table like this:
CREATE TABLE linkage (
PItem Char(10),
CItem Char(10),
Quantity Dec(5,0),
PRIMARY KEY (PItem, CItem));
INSERT INTO linkage
VALUES ('Item0', 'Item1a', 2),
('Item0', 'Item1b', 3),
('Item1b', 'Item2a', 5)
The recursive CTE to list a BOM for 'Item0' looks like this:
WITH bom (Level, ItemNo, Description, Quantity)
AS (
-- Load BOM with root item
SELECT 0,
ItemNo,
Description,
1
FROM Item
WHERE ItemNo = 'Item0'
UNION ALL
-- Retrieve all child items
SELECT a.Level + 1,
b.CItem,
c.Description,
a.Quantity * b.Quantity
FROM bom a
join linkage b ON b.pitem = a.itemno
join item c ON c.itemno = b.citem)
-- Set the list order
SEARCH DEPTH FIRST BY itemno SET seq
-- List BOM
SELECT * FROM bom
ORDER BY seq
Here are my results:
LEVEL
ITEMNO
DESCRIPTION
QUANTITY
0
Item0
Root Item
1
1
Item1a
Second Level Item
2
1
Item1b
Another Second Level Item
3
2
Item2a
Third Level Item
15
Notice the search clause, that generates a column named seq which you can use to sort the output either depth first or breadth first. Depth first is what you want here.
NOTE: This isn't necessarily an optimum query since the description is in the CTE, and that increases the size of the CTE result set without really adding anything to it that couldn't be added in the final select. But it does make things a bit simpler since the 'priming query' retrieves the description.
Note also: the column list on the with clause following BOM. This is there to remove the confusion that DB2 had with the expected column list when the explicit column list was omitted. It is not always necessary, but if DB2 complains about an invalid column list, this will fix it.

Completely Unique Rows and Columns in SQL

I want to randomly pick 4 rows which are distinct and do not have any entry that matches with any of the 4 chosen columns.
Here is what I coded:
SELECT DISTINCT en,dialect,fr FROM words ORDER BY RANDOM() LIMIT 4
Here is some data:
**en** **dialect** **fr**
number SFA numero
number TRI numero
hotel CAI hotel
hotel SFA hotel
I want:
**en** **dialect** **fr**
number SFA numero
hotel CAI hotel
Some retrieved rows would have something similar with each other, like having the same en or the same fr, I would like to retrieved rows that do not share anything similar with each other, how do I do that?
I think I’d do this in the front end code rather the dB, here’s a pseudo code (don’t know what your node looks like):
var seenEn = “en not in (''“;
var seenFr = “fr not in (''“;
var rows =[];
while(rows.length < 4)
{
var newrow = sqlquery(“SELECT *
FROM table WHERE “ + seenEn + “) and ”
+ seenFr + “) ORDER BY random() LIMIT 1”);
if(!newrow)
break;
rows.push(newrow);
seenEn += “,‘“+ newrow.en + “‘“;
seenFr += “,‘“+ newrow.fr + “‘“;
}
The loop runs as many times as needed to retrieve 4 rows (or maybe make it a for loop that runs 4 times) unless the query returns null. Each time the query returns the values are added to a list of values we don’t want the query to return again. That list had to start out with some values (null) that are never in the data, to prevent a syntax error when concatenation a comma-value string onto the seenXX variable. Those syntax errors can be avoided in other ways like having a Boolean of “if it’s the first value don’t put the comma” but I chose to put dummy ineffective values into the sql to make the JS simpler. Same goes for the
As noted, it looks like JS to ease your understanding but this should be treated as pseudo code outlining a general algorithm - it’s never been compiled/run/tested and may have syntax errors or not at all work as JS if pasted into your file; take the idea and work it into your solution
Please note this was posted from an iphone and it may have done something stupid with all the apostrophes and quotes (turned them into the curly kind preferred by writers rather than the straight kind used by programmers)
You can use Rank or find first row for each group to achieve your result,
Check below , I hope this code will help you
SELECT 'number' AS Col1, 'SFA' AS Col2, 'numero' AS Col3 INTO #tbl
UNION ALL
SELECT 'number','TRI','numero'
UNION ALL
SELECT 'hotel','CAI' ,'hotel'
UNION ALL
SELECT 'hotel','SFA','hotel'
UNION ALL
SELECT 'Location','LocationA' ,'Location data'
UNION ALL
SELECT 'Location','LocationB','Location data'
;
WITH summary AS (
SELECT Col1,Col2,Col3,
ROW_NUMBER() OVER(PARTITION BY p.Col1 ORDER BY p.Col2 DESC) AS rk
FROM #tbl p)
SELECT s.Col1,s.Col2,s.Col3
FROM summary s
WHERE s.rk = 1
DROP TABLE #tbl

How to group by more than one row value?

I am working with POSTGRESQL and I can't find out how to solve a problem. I have a model called Foobar. Some of its attributes are:
FOOBAR
check_in:datetime
qr_code:string
city_id:integer
In this table there is a lot of redundancy (qr_code is not unique) but that is not my problem right now. What I am trying to get are the foobars that have same qr_code and have been in a well known group of cities, that have checked in at different moments.
I got this by querying:
SELECT * FROM foobar AS a
WHERE a.city_id = 1
AND EXISTS (
SELECT * FROM foobar AS b
WHERE a.check_in < b.check_in
AND a.qr_code = b.qr_code
AND b.city_id = 2
AND EXISTS (
SELECT * FROM foobar as c
WHERE b.check_in < c.check_in
AND c.qr_code = b.qr_code
AND c.city_id = 3
AND EXISTS(...)
)
)
where '...' represents more queries to get more persons with the same qr_code, different check_in date and those well known cities.
My problem is that I want to group this by qr_code, and I want to show the check_in fields of each qr_code like this:
2015-11-11 14:14:14 => [2015-11-11 14:14:14, 2015-11-11 16:16:16, 2015-11-11 17:18:20] (this for each different qr_code)
where the data at the left is the 'smaller' date for that qr_code, and the right part are all the other dates for that qr_code, including the first one.
Is this possible to do with a sql query only? I am asking this because I am actually doing this app with rails, and I know that I can make a different approach with array methods of ruby (a solution with this would be well received too)
You could solve that with a recursive CTE - if I interpret your question correctly:
Assuming you have a given list of cities that must be visited in order by the same qr_code. Your text doesn't say so, but your query indicates as much.
WITH RECURSIVE
c AS (SELECT '{1,2,3}'::int[] AS cities) -- your list of city_id's here
, route AS (
SELECT f.check_in, f.qr_code, 2 AS idx
FROM foobar f
JOIN c ON f.city_id = c.cities[1]
UNION ALL
SELECT f.check_in, f.qr_code, r.idx + 1
FROM route r
JOIN foobar f USING (qr_code)
JOIN c ON f.city_id = c.cities[r.idx]
WHERE r.check_in < f.check_in
)
SELECT qr_code, array_agg(check_in) AS check_in_list
FROM (
SELECT *
FROM route
ORDER BY qr_code, idx -- or check_in
) sub
HAVING count(*) = (SELECT array_length(cities) FROM c);
GROUP BY 1;
Provide the list as array in the first (non-recursive) CTE c.
In the recursive part start with any rows in the first city and travel along your array until the last element.
In the final SELECT aggregate your check_in column in order. Only return qr_code that have visited all cities of the array.
Similar:
Recursive query used for transitive closure

Help with a complex join query

Keep in mind I am using SQL 2000
I have two tables.
tblAutoPolicyList contains a field called PolicyIDList.
tblLossClaims contains two fields called LossPolicyID & PolicyReview.
I am writing a stored proc that will get the distinct PolicyID from PolicyIDList field, and loop through LossPolicyID field (if match is found, set PolicyReview to 'Y').
Sample table layout:
PolicyIDList LossPolicyID
9651XVB19 5021WWA85, 4421WWA20, 3314WWA31, 1121WAW11, 2221WLL99 Y
5021WWA85 3326WAC35, 1221AXA10, 9863AAA44, 5541RTY33, 9651XVB19 Y
0151ZVB19 4004WMN63, 1001WGA42, 8587ABA56, 8541RWW12, 9329KKB08 N
How would I go about writing the stored proc (looking for logic more than syntax)?
Keep in mind I am using SQL 2000.
Select LossPolicyID, * from tableName where charindex('PolicyID',LossPolicyID,1)>0
Basically, the idea is this:
'Unroll' tblLossClaims and return two columns: a tblLossClaims key (you didn't mention any, so I guess it's going to be LossPolicyID) and Item = a single item from LossPolicyID.
Find matches of unrolled.Item in tblAutoPolicyList.PolicyIDList.
Find matches of distinct matched.LossPolicyID in tblLossClaims.LossPolicyID.
Update tblLossClaims.PolicyReview accordingly.
The main UPDATE can look like this:
UPDATE claims
SET PolicyReview = 'Y'
FROM tblLossClaims claims
JOIN (
SELECT DISTINCT unrolled.LossPolicyID
FROM (
SELECT LossPolicyID, Item = itemof(LossPolicyID)
FROM unrolling_join
) unrolled
JOIN tblAutoPolicyList
ON unrolled.ID = tblAutoPolicyList.PolicyIDList
) matched
ON matched.LossPolicyID = claims.LossPolicyID
You can take advantage of the fixed item width and the fixed list format and thus easily split LossPolicyID without a UDF. I can see this done with the help of a number table and SUBSTRING(). unrolling_join in the above query is actually tblLossClaims joined with the number table.
Here's the definition of unrolled 'zoomed in':
...
(
SELECT LossPolicyID,
Item = SUBSTRING(LossPolicyID,
(v.number - 1) * #ItemLength + 1,
#ItemLength)
FROM tblLossClaims c
JOIN master..spt_values v ON v.type = 'P'
AND v.number BETWEEN 1 AND (LEN(c.LossPolicyID) + 2) / (#ItemLength + 2)
) unrolled
...
master..spt_values is a system table that is used here as the number table. Filter v.type = 'P' gives us a rowset with number values from 0 to 2047, which is narrowed down to the list of numbers from 1 to the number of items in LossPolicyID. Eventually v.number serves as an array index and is used to cut out single items.
#ItemLength is of course simply LEN(tblAutoPolicyList.PolicyIDList). I would probably also declared #ItemLength2 = #ItemLength + 2 so it wasn't calculated every time when applying the filter.
Basically, that's it, if I haven't missed anything.
If the PolicyIDList field is a delimited list, you have to first separate the individual policy IDs and create a temporary table with all of the results. Next up, use an update query on the tblLossClaims with 'where exists (select * from #temptable tt where tt.PolicyID = LossPolicyID).
Depending on the size of the table/data, you might wish to add an index to your temporary table.