kdb+ conditional insert: only insert when column value doesn't exist

What is the best way to insert a row into a table, but only if a given column value of that row doesn't already exist in the table?
E.g.:
q)table:([] col1:(); col2:(); col3:());
q)`table insert (1;2;3);
q)conditionalInsert:{if[(first where table.col1=x 0)~0N;`table insert x]};
Now when doing the following:
q)conditionalInsert[(1;2;3)];
q)conditionalInsert[(7;8;9)];
The result is:
q)table
col1 col2 col3
--------------
1 2 3
7 8 9
This can probably be accomplished more easily. My question: what is the easiest/best way?
To be clear: the column may be a non-keyed one.
In other words, the table is either keyed or non-keyed, and the target column is not a key (nor part of the compound key columns).

Use a keyed table?
q)table
col1| col2 col3
----| ---------
1 | 2 3
q)`table insert (1;2;4)
'insert
q)`table insert (2;2;4)
,1
q)table
col1| col2 col3
----| ---------
1 | 2 3
2 | 2 4
You can always use protected evaluation to silence the error.
q).[insert;(`table;(1;2;4));{`already_there}]
`already_there
q).[insert;(`table;(3;2;4));{`already_there}]
,2
q)table
col1| col2 col3
----| ---------
1 | 2 3
2 | 2 4
3 | 2 4

The first thing is to put the proper attribute (sorted `s# or grouped `g#) on the target column, which will make the function faster.
There are two scenarios I can think of:
a) The table is keyed and the target column is the key column: in this case a normal insert already behaves like your conditional insert (duplicate keys are rejected).
b) The table is either keyed or non-keyed and the target column is not a key (nor part of the compound key columns):
q)conditionalInsert: {if[not x[0] in table.col1;`table insert x]}
It's better to use 'exec' in place of 'table.col1', as dot notation doesn't work on a keyed table:
q)conditionalInsert: {if[not x[0] in exec col1 from table;`table insert x]}
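As a quick sanity check, here is a sketch of the exec version in action, assuming the non-keyed table built up in the question (col1 contains 1 and 7):
q)conditionalInsert[(1;2;3)]   / no-op: 1 is already present in col1
q)conditionalInsert[(4;5;6)]   / inserts: 4 is new
q)exec col1 from table
1 7 4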

Related

How to extract characters from a string stored as json data and place them in dynamic number of columns in SQL Server

I have a string column in SQL Server that stores JSON data with all the braces and colons included.
My problem is to extract all the key/value pairs and store them in separate columns, with the key as the column header. What makes this challenging is that every record has a different number of key/value pairs.
For example, taking 3 records: the first has 5 key/value pairs (EndUseCommunityMarket of 2, EndUseProvincialMarket of 0, and so on), the second has 1 key/value pair, and the third has 2 key/value pairs.
If I had to show how I want this in Excel, it would be one column per key, with one row per record.
I have seen some SQL code examples that do something similar, but for a fixed number of columns; here the number varies for every record.
I need a SQL statement that can achieve this, as I am working with thousands of records.
Below is the data copied from SQL Server:
catch_ext
{"NfdsFadMonitoring":{"EndUseEaten":1}}
{"NfdsFadMonitoring":{"EndUseCommunityMarket":3}}
{"NfdsFadMonitoring":{"SpeciesComment":"","EndUseCommunityMarket":2}}
{"NfdsFadMonitoring":{"SpeciesComment":"mix reef fis","EndUseEaten":31}}
{"NfdsFadMonitoring":{"SpeciesComment":"10 fish with a total of 18kg","EndUseCommunityMarket":0,"EndUseProvincialMarket":0,"EndUseUrbanMarket":8,"EndUseEaten":1,"EndUseGivenAway":1}}
{"NfdsFadMonitoring":{"SpeciesComment":"mix reef fis","EndUseEaten":18}}
I expect you don't want to dynamically create a table; instead you probably want to create a property mapping table. Here is a quick overview of the design.
Object table -- this stores the base information about your object
============
ID -- unique id field for every object.
Name
Property types table -- this stores all the property types
====================
Property_Type_ID -- unique type id
Description -- describes property
Object2Property -- stores the values for each property
===============
ObjectID -- the object
Property_Type_ID -- the property type
Value -- the value.
Using a model like this lets your properties be as dynamic as you wish, but you don't have to create columns dynamically -- something that is hard and error prone.
Using your specific example, the tables would look like this:
OBJECT
ID NAME
1 WAHOO
2 RED SNAPPER
3 KAWAKAWA
Property Types
ID DESC
1 EndUseCommunityMarket
2 EndUseProvincialMarket
3 EndUseUrbanMarket
4 EndUseEaten
5 EndUseGivenAway
6 Comment
Map
ObjID TypeID Value
1 1 2
1 2 0
1 3 0
1 4 0
1 5 0
2 2 50
3 3 8
3 5 1
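For reference, a minimal T-SQL sketch of the three tables (names, types and sizes are illustrative only):
CREATE TABLE Object_Table (
    ID   int IDENTITY PRIMARY KEY,  -- unique id field for every object
    Name varchar(100) NOT NULL
);
CREATE TABLE Property_Type (
    Property_Type_ID int IDENTITY PRIMARY KEY,  -- unique type id
    Description      varchar(200) NOT NULL      -- describes the property
);
CREATE TABLE Object2Property (
    ObjectID         int NOT NULL REFERENCES Object_Table(ID),
    Property_Type_ID int NOT NULL REFERENCES Property_Type(Property_Type_ID),
    Value            varchar(max),              -- the value
    PRIMARY KEY (ObjectID, Property_Type_ID)
);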
A. ROWS
Dynamic columns are a lot like rows.
You could use OPENJSON (Transact-SQL)
DECLARE @json2 NVARCHAR(4000) = N'{"NfdsFadMonitoring":{"SpeciesComment":"10 fish with a total of 18kg","EndUseCommunityMarket":0,"EndUseProvincialMarket":0,"EndUseUrbanMarket":8,"EndUseEaten":1,"EndUseGivenAway":1}}';
SELECT [key], value
FROM OPENJSON(@json2, 'lax $.NfdsFadMonitoring')
Output
key value
SpeciesComment 10 fish with a total of 18kg
EndUseCommunityMarket 0
EndUseProvincialMarket 0
EndUseUrbanMarket 8
EndUseEaten 1
EndUseGivenAway 1
Your inputs
CREATE TABLE ForEloga (Id int,Json nvarchar(max));
Insert into ForEloga Values
(1,'{"NfdsFadMonitoring":{"EndUseEaten":1}}'),
(2,'{"NfdsFadMonitoring":{"EndUseCommunityMarket":3}}'),
(3,'{"NfdsFadMonitoring":{"SpeciesComment":"","EndUseCommunityMarket":2}}'),
(4,'{"NfdsFadMonitoring":{"SpeciesComment":"mix reef fis","EndUseEaten":31}}'),
(5,'{"NfdsFadMonitoring":{"SpeciesComment":"10 fish with a total of 18kg","EndUseCommunityMarket":0,"EndUseProvincialMarket":0,"EndUseUrbanMarket":8,"EndUseEaten":1,"EndUseGivenAway":1}}'),
(6,'{"NfdsFadMonitoring":{"SpeciesComment":"mix reef fis","EndUseEaten":18}}');
SELECT Id, [key], value
FROM ForEloga CROSS APPLY OPENJSON(Json,'lax $.NfdsFadMonitoring')
Output
Id key value
1 EndUseEaten 1
2 EndUseCommunityMarket 3
3 SpeciesComment
3 EndUseCommunityMarket 2
4 SpeciesComment mix reef fis
4 EndUseEaten 31
5 SpeciesComment 10 fish with a total of 18kg
5 EndUseCommunityMarket 0
5 EndUseProvincialMarket 0
5 EndUseUrbanMarket 8
5 EndUseEaten 1
5 EndUseGivenAway 1
6 SpeciesComment mix reef fis
6 EndUseEaten 18
B. COLUMNS: CROSS APPLY with WITH
If you know all possible properties, then I recommend CROSS APPLY with WITH, as shown in Example 3 (Join rows with JSON data stored in table cells using CROSS APPLY) in OPENJSON (Transact-SQL).
SELECT store.title, location.street, location.lat, location.long
FROM store
CROSS APPLY OPENJSON(store.jsonCol, 'lax $.location')
WITH (
street varchar(500),
postcode varchar(500) '$.postcode',
lon int '$.geo.longitude',
lat int '$.geo.latitude'
) AS location
Try this:
Table Schema:
CREATE TABLE #JsonValue(sp_name VARCHAR(100),catch_ext VARCHAR(1000))
INSERT INTO #JsonValue VALUES ('WAHOO','{"NfdsFadMonitoring":{"EndUseEaten":1}}')
INSERT INTO #JsonValue VALUES ('RUBY SNAPPER','{"NfdsFadMonitoring":{"EndUseCommunityMarket":3}}')
INSERT INTO #JsonValue VALUES ('KAWAKAWA','{"NfdsFadMonitoring":{"SpeciesComment":"","EndUseCommunityMarket":2}}')
INSERT INTO #JsonValue VALUES ('XXXXXXXX','{"NfdsFadMonitoring":{"SpeciesComment":"mix reef fis","EndUseEaten":31}}')
INSERT INTO #JsonValue VALUES ('YYYYYYYY','{"NfdsFadMonitoring":{"SpeciesComment":"10 fish with a total of 18kg","EndUseCommunityMarket":0,"EndUseProvincialMarket":0,"EndUseUrbanMarket":8,"EndUseEaten":1,"EndUseGivenAway":1}}')
INSERT INTO #JsonValue VALUES ('ZZZZZZZZZZ','{"NfdsFadMonitoring":{"SpeciesComment":"mix reef fis","EndUseEaten":18}}')
Query:
SELECT sp_name
,ISNULL(MAX(CASE WHEN [Key]='EndUseCommunityMarket' THEN Value END),'')EndUseCommunityMarket
,ISNULL(MAX(CASE WHEN [Key]='EndUseProvincialMarket' THEN Value END),'')EndUseProvincialMarket
,ISNULL(MAX(CASE WHEN [Key]='EndUseUrbanMarket' THEN Value END),'')EndUseUrbanMarket
,ISNULL(MAX(CASE WHEN [Key]='EndUseEaten' THEN Value END),'')EndUseEaten
,ISNULL(MAX(CASE WHEN [Key]='EndUseGivenAway' THEN Value END),'')EndUseGivenAway
FROM(
SELECT sp_name, [key], value
FROM #JsonValue CROSS APPLY OPENJSON(catch_ext,'$.NfdsFadMonitoring')
)D
GROUP BY sp_name
Output:
sp_name EndUseCommunityMarket EndUseProvincialMarket EndUseUrbanMarket EndUseEaten EndUseGivenAway
------------- --------------------- ---------------------- ----------------- ----------- ---------------
KAWAKAWA 2
RUBY SNAPPER 3
WAHOO 1
XXXXXXXX 31
YYYYYYYY 0 0 8 1 1
ZZZZZZZZZZ 18
Hope this will help you.

How to update a Column from a Bunch of other columns

I have a table A with Column 1, Column 2, Column 3, Column 4 and Column 5.
Columns 1-4 already have data, and we need to update Column 5 based on that data and on priority.
Column 1 has priority 5, Column 2 has priority 4, Column 3 has priority 3 and Column 4 has priority 2.
So if a particular row has data in all the columns, it should pick Column 1, since it has the highest priority, and use it to update Column 5.
If a record has data only in Columns 3 and 4, it should pick Column 3 and put its value in Column 5, since Column 3 has higher priority than Column 4.
If there is no data in Columns 1-4, Column 5 should be null.
I have 24k records in my table and I need to run this for all rows.
Any pointers for this query would be highly appreciated.
I think you want coalesce() -- assuming that the columns with no values have NULL:
update t
set col5 = coalesce(col1, col2, col3, col4);
You can also put the coalesce() in a select, if you don't want to actually change the data.
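For instance, a read-only version of the same logic (same assumed table and column names) is just:
select col1, col2, col3, col4,
       coalesce(col1, col2, col3, col4) as col5
from t;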

Explode range of integers out for joining in SQL

I have one table that stores a range of integers in a field, sort of like a print range (e.g. "1-2,4-7,9-11"). This field could also contain a single number.
My goal is to join this table to a second one that has discrete values instead of ranges.
So if table one contains
1-2,5
9-15
7
And table two contains
1
2
3
4
5
6
7
8
9
10
The result of the join would be
1-2,5 1
1-2,5 2
1-2,5 5
7 7
9-15 9
9-15 10
Working in SQL Server 2008 R2.
Use a string split function of your choice to split on comma. Figure out the min/max values and join using between.
SQL Fiddle
MS SQL Server 2012 Schema Setup:
create table T1(Col1 varchar(10))
create table T2(Col2 int)
insert into T1 values
('1-2,5'),
('9-15'),
('7')
insert into T2 values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)
Query 1:
select T1.Col1,
T2.Col2
from T2
inner join (
select T1.Col1,
cast(left(S.Item, charindex('-', S.Item+'-')-1) as int) MinValue,
cast(stuff(S.Item, 1, charindex('-', S.Item), '') as int) MaxValue
from T1
cross apply dbo.Split(T1.Col1, ',') as S
) as T1
on T2.Col2 between T1.MinValue and T1.MaxValue
Results:
| COL1 | COL2 |
----------------
| 1-2,5 | 1 |
| 1-2,5 | 2 |
| 1-2,5 | 5 |
| 9-15 | 9 |
| 9-15 | 10 |
| 7 | 7 |
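Note that Query 1 relies on a user-defined dbo.Split function; SQL Server 2008 R2 has no built-in splitter. A minimal XML-based sketch of such a function (one common implementation; it does no XML escaping, which is fine for digit/dash/comma input like these ranges):
create function dbo.Split(@List varchar(max), @Delimiter char(1))
returns table
as
return
(
    -- wrap each element in <i> tags, parse as XML, then shred back into rows
    select Item = y.i.value('(./text())[1]', 'varchar(max)')
    from (
        select x = convert(xml, '<i>' + replace(@List, @Delimiter, '</i><i>') + '</i>').query('.')
    ) as a
    cross apply x.nodes('i') as y(i)
);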
Like everybody has said, this is a pain to do natively in SQL Server. If you must then I think this is the proper approach.
First determine your rules for parsing the string, then break down the process into well-defined and understood problems.
Based on your example, I think this is the process:
Separate comma separated values in the string into rows
If the data does not contain a dash, then it's finished (it's a standalone value)
If it does contain a dash, parse the left and right sides of the dash
Given the left and right sides (the range) determine all the values between them into rows
I would create a temp table to populate the parsing results into which needs two columns:
SourceRowID INT, ContainedValue INT
and another to use for intermediate processing:
SourceRowID INT, ContainedValues VARCHAR
Parse your comma-separated values into their own rows using a CTE like this. Step 1 is now a well-defined and understood problem to solve:
Turning a Comma Separated string into individual rows
So your result from the source
'1-2,5'
will be:
'1-2'
'5'
From there, SELECT from that processing table where the field does not contain a dash. Step 2 is now a well-defined and understood problem to solve. These are standalone numbers and can go straight into the results temp table. The results table should also get the ID reference to the original row.
Next would be to parse the values to the left and right of the dash using CHARINDEX to locate it, then the appropriate LEFT and RIGHT functions as needed. This will give you the starting and ending value.
Here is a relevant question for accomplishing this. Step 3 is now a well-defined and understood problem to solve:
T-SQL substring - separating first and last name
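A sketch of that parsing step (the temp table #Parsed and its Item column are hypothetical names for the intermediate processing table described above):
select cast(left(Item, charindex('-', Item) - 1) as int)          as StartValue,
       cast(right(Item, len(Item) - charindex('-', Item)) as int) as EndValue
from #Parsed
where Item like '%-%';  -- only the rows that still contain a dash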
Now you have separated the starting and ending values. Use another function which can explode this range. Step 4 is now a well-defined and understood problem to solve:
SQL: create sequential list of numbers from various starting points
SELECT all N between #min and #max
What is the best way to create and populate a numbers table?
and, also, insert it into the temp table.
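A sketch of that step with an ad-hoc numbers CTE (@min and @max stand for the parsed starting and ending values; a persisted numbers table works just as well):
declare @min int = 9, @max int = 15;
with Numbers as (
    select top (@max) n = row_number() over (order by (select null))
    from sys.all_objects
)
select n
from Numbers
where n between @min and @max;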
Now what you should have is a temp table with every value in the exploded range.
Simply JOIN that to the other table on the values now, then to your source table on the ID reference and you're there.
My suggestion is to add one more field and many more records to your ranges table. Specifically, the primary key would be the integer and the other field would be the range. Records would look like this:
number range
1 1-2,5
2 1-2,5
3 na
4 na
5 1-2,5
etc
Having said that, this is still rather limiting because a number can only have one range. If you want to be thorough, set up a many to many relationship between numbers and ranges.
As far as I can tell, your best option is something like the below:
Create a table-valued function that accepts your ranges and converts them to a collection of ints. So 1-3,5 would return:
1
2
3
5
Then use these results to join to other tables. I don't have an exact function to do this at hand, but this one seems like an excellent start.
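For what it's worth, here is a hedged sketch of such a function, composed from the dbo.Split sketch above plus a numbers table (dbo.Numbers(n) is assumed to exist):
create function dbo.ExplodeRanges(@Ranges varchar(max))
returns table
as
return
(
    select N.n
    from dbo.Split(@Ranges, ',') as S
    cross apply (
        -- '7' yields MinValue = MaxValue = 7; '1-3' yields 1 and 3
        select cast(left(S.Item, charindex('-', S.Item + '-') - 1) as int) as MinValue,
               cast(stuff(S.Item, 1, charindex('-', S.Item), '') as int)   as MaxValue
    ) as R
    join dbo.Numbers as N
      on N.n between R.MinValue and R.MaxValue
);
For example, select n from dbo.ExplodeRanges('1-3,5') should return 1, 2, 3 and 5.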

Will multiple columns concatenate in the same order if using STUFF and For Xml Path

Please see http://www.sqlfiddle.com/#!3/fb107/3 for an example schema and query I want to run.
I want to use the STUFF and FOR XML PATH('') solution to concatenate columns having grouped by another column.
If I use this method to concatenate multiple columns into a CSV list, am I guaranteed that the order will be the same in each concatenated string? So if the table was:
ID Col1 Col2 Col3
1 1 1 1
1 2 2 2
1 3 3 3
2 4 4 4
2 5 5 5
2 6 6 6
Am I certain that if Col1 is concatenated such that the result is:
ID Col1Concatenated
1 1,2,3
2 4,5,6
That Col2Concatenated will also be in the same order ("1,2,3", "4,5,6") as opposed to ("2,3,1", "5,6,4") for example?
This solution will only work for me if the index of each row's value is the same in each of the concatenated values. i.e. first row is first in each csv list, second row is second in each csv list etc.
You can add an ORDER BY clause in the query within your STUFF function:
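For example, against the fiddle's table (assumed here to be named t with columns ID, Col1, Col2), repeating the same ORDER BY key in every subquery keeps the lists aligned; as long as that key is unique within each ID, the order is deterministic:
SELECT d.ID,
       STUFF((SELECT ',' + CAST(x.Col1 AS varchar(10))
              FROM t x WHERE x.ID = d.ID
              ORDER BY x.Col1
              FOR XML PATH('')), 1, 1, '') AS Col1Concatenated,
       STUFF((SELECT ',' + CAST(x.Col2 AS varchar(10))
              FROM t x WHERE x.ID = d.ID
              ORDER BY x.Col1  -- same key as above, so positions match row for row
              FOR XML PATH('')), 1, 1, '') AS Col2Concatenated
FROM (SELECT DISTINCT ID FROM t) d;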

Oracle Partitioned Sequence

I'm trying to see if there exists something to create a sequence with partition logic.
I need a sequence number that depends on another primary key, e.g.:
id_person sequence id
1 | 1
1 | 2
2 | 1
3 | 1
1 | 3
So the sequence must depend on the id_person partition. Is there something like this in Oracle, or must I implement it myself at the application level?
Thank you.
I have created this PL/SQL package with one procedure and one function:
PROCEDURE INIT_SEQUENCE(NAME varchar2, pkColumnNameList PARTITIONED_SEQUENCE_PK_COLUMN);
FUNCTION GET_NEXT_SEQUENCE_VALUE(NAME varchar2, pkPartitionColValue PARTITIONED_SEQUENCE_COL_VALUE) RETURN NUMBER;
INIT_SEQUENCE - takes as input the name to associate with the sequence and a list of column names forming the fixed primary-key part that constrains the sequence, e.g. 'ID_PERSON'.
The job of this procedure is to create the table that will manage the sequence increments according to the pkColumnNameList columns.
GET_NEXT_SEQUENCE_VALUE - takes the name of the sequence to increment and the values of the pkColumnNameList primary key, then performs these steps:
1) Dynamically create the SQL to run
2) Call dbms_lock.allocate_unique() to lock the table
3) Check whether a record is present in the table for the input PK value
4) If a record is present, update it with max + 1 in the sequence column
5) If a record is not present, insert a new record with 1 in the sequence column
6) Return the new id
I would like to receive comments about this. Thanks in advance.
Is the actual requirement that the secondary sequence be gap free? If so, you've got a giant serialization/scalability issue.
If you need to present a gap-free sequence for human consumption, you could use an actual sequence (or a timestamp, for that matter) as Nick Pierpont suggests to preserve scalability, and derive the gap-free numbering with analytic functions.
Dataset (t1):
ID_PERSON SEQUENCE_ID
---------- -----------
1 1
2 2
3 3
1 4
1 5
1 6
2 7
3 8
1 9
SQL:
select *
from
(select id_person,
sequence_id as orig_sequence_id,
rank ()
over (partition by id_person
order by sequence_id)
as new_sequence_id
from t1
)
order by id_person, new_sequence_id;
Result:
ID_PERSON ORIG_SEQUENCE_ID NEW_SEQUENCE_ID
---------- ---------------- ---------------
1 1 1
1 4 2
1 5 3
1 6 4
1 9 5
2 2 1
2 7 2
3 3 1
3 8 2
I'm afraid you have to do it like this:
INSERT INTO t
(
id_person,
sequence_id
)
VALUES
(
<your_person_id>,
( SELECT 1 + NVL( MAX( sequence_id ), 0 )
FROM t
WHERE t.id_person = <your_person_id>
)
)
What you are looking for is not a sequence; as the Oracle documentation puts it, "The sequence generator provides a sequential series of numbers."
You are looking for a calculated field that depends on another one, in this case the primary key. As others suggested, you need to add the logic in your own code, either in a procedure or in the insert statement.