SQL Server 2014 equivalent to mysql's find_in_set() - sql

I'm working with a database that has a locations table such as:
locationID | locationHierarchy
1 | 0
2 | 1
3 | 1,2
4 | 1
5 | 1,4
6 | 1,4,5
which makes a tree like this
1
--2
----3
--4
----5
------6
where locationHierarchy is a csv string of the locationIDs of all its ancesters (think of a hierarchy tree). This makes it easy to determine the hierarchy when working toward the top of the tree given a starting locationID.
Now I need to write code to start with an ancestor and recursively find all descendants. MySQL has a function called 'find_in_set' which easily parses a csv string to look for a value. It's nice because I can just say "find in set the value 4" which would give all locations that are descendants of locationID of 4 (including 4 itself).
Unfortunately this is being developed on SQL Server 2014 and it has no such function. The CSV string is a variable length (virtually unlimited levels allowed) and I need a way to find all ancestors of a location.
A lot of what I've found on the internet to mimic the find_in_set function into SQL Server assumes a fixed depth of hierarchy such as 4 levels maximum) which wouldn't work for me.
Does anyone have a stored procedure or anything that I could integrate into a query? I'd really rather not have to pull all records from this table to use code to individually parse the CSV string.
I would imagine searching the locationHierarchy string for locationID% or %,{locationid},% would work but be pretty slow.

I think you want like -- in either database. Something like this:
select l.*
from locations l
where l.locationHierarchy like #LocationHierarchy + ',%';
If you want the original location included, then one method is:
select l.*
from locations l
where l.locationHierarchy + ',' like #LocationHierarchy + ',%';
I should also note that SQL Server has proper support for recursive queries, so it has other options for hierarchies apart from hierarchy trees (which are still a very reasonable solution).

Finally It worked for me..
SELECT * FROM locations WHERE locationHierarchy like CONCAT(#param,',%%') OR
o.unitnumber like CONCAT('%%,',#param,',%%') OR
o.unitnumber like CONCAT('%%,',#param)

Related

Access 2016 SQL: Find minimum absolute difference between two columns of different tables

I haven't been able to figure out exactly how to put together this SQL string. I'd really appreciate it if someone could help me out. I am using Access 2016, so please only provide answers that will work with Access. I have two queries that both have different fields except for one in common. I need to find the minimum absolute difference between the two similar columns. Then, I need to be able to pull the data from that corresponding record. For instance,
qry1.Col1 | qry1.Col2
-----------|-----------
10245.123 | Have
302044.31 | A
qry2.Col1 | qry2.Col2
----------------------
23451.321 | Great
345622.34 | Day
Find minimum absolute difference in a third query, qry3. For instance, Min(Abs(qry1!Col1 - qry2!Col1) I imagine it would produce one of these tables for each value in qry1.Col1. For the value 10245.123,
qry3.Col1
----------
13206.198
335377.217
Since 13206.198 is the minimum absolute difference, I want to pull the record corresponding to that from qry2 and associate it with the data from qry1 (I'm assuming this uses a JOIN). Resulting in a fourth query like this,
qry4.Col1 (qry1.Col1) | qry4.Col2 (qry1.Col2) | qry4.Col3 (qry2.Col2)
----------------------------------------------------------------------
10245.123 | Have | Great
302044.31 | A | Day
If this is all doable in one SQL string, that would be great. If a couple of steps are required, that's okay as well. I just would like to avoid having to time consumingly do this using loops and RecordSet.Findfirst in VBA.
You can use a correlated subquery:
select q1.*,
(select top 1 q2.col2
from qry2 as q2
order by abs(q2.col1 - q1.col1), q2.col2
) as qry2_col2
from qry1 as q1;

comma separated column in linq where clause

i have string of value like "4,3,8"
and i had comma separated column in table as below.
ID | PrdID | cntrlIDs
1 | 1 | 4,8
2 | 2 | 3
3 | 3 | 3,4
4 | 4 | 5,6
5 | 5 | 10,14,18
i want only those records from above table which match in above mention string
eg.
1,2,3 this records will need in output because its match with the passing string of "4,3,8"
Note : i need this in entity framework LINQ Query.
string[] arrSearchFilter = "4,3,8".Split(',');
var query = (from prdtbl in ProductTables
where prdtbl.cntrlIDs.Split(',').Any(x=> arrSearchFilter.Contains(x))
but its not working and i got below error
LINQ to Entities does not recognize the method 'System.String[] Split(Char[])' method, and this method cannot be translated into a store expression.
LINQ to Entities tries to convert query expressions to SQL. String.Split is not one of the supported methods. See http://msdn.microsoft.com/en-us/library/vstudio/bb738534(v=vs.100).aspx
Assuming you are unable to redesign the database structure, you have to bypass the SQL filter and obtain ALL records and then apply the filter. You can do this by using ProductTables.ToList() and then using this in second query with the string split, e.g.
string[] arrSearchFilter = "4,3,8".Split(',');
var products = ProductTables.ToList();
var query = (from prdtbl in products
where prdtbl.cntrlIDs.Split(',').Any(x=> arrSearchFilter.Contains(x))
This is not a good idea if the Product table is large, as you are losing a key benefit of SQL and loading ALL the data before filtering.
Redesign
If that is a problem and you can change the database structure, you should create a child table that replaces the comma-separated values with a proper normalised structure. Comma separated variables might look like a convenient shortcut but they are not a good design and as you have found, are not easy to work with in SQL.
SQL
If the design cannot be changed and the table is large, then your only other option is to hand-roll the SQL and execute this directly, but this would lose some of the benefits of having Linq.

user defined psuedocolumn oracle

I have a large dataset in an oracle database that is currently accessed from Java one item at a time. For example if a user is trying to do a bulk get of 50 items it will process them sequentially, calling a stored procedure for each one. I am now trying to implement a bulk get, but am having some difficulty due to the way the user can pass in a range query:
An example table:
prim_key | identifier | start | end
----------+--------------+---------+-------
1 | aaa | 1 | 3
2 | aaa | 3 | 7
3 | bbb | 1 | 5
The way it works is that if you have a query like (id='aaa' and pos=1) it will find prim_key = 1, but if you query (id='aaa' and pos=2) it won't find anything. If you do (id='aaa' and pos=-2) then it will again find prim_key=1 because the stored proc converts the -2 into a range scan equivalent to start<=2 and end>2.
(Extra context: the start/end are actually dates and this querying mechanism allows efficient "latest as of date" queries as opposed to doing something like select prim_key,
start from myTable
where start = (select max(start) from myTable where start <= 2))
This is all fine and works correctly for single gets, but now I'm trying to do bulk gets so that we can speed up the batch considerably. The first attempt was to multithread the individual calls, but it put too much stress on the database to be doing so many parallel queries on the same table. To solve this I've been trying to create a query like
select prim_key
from myTable
where (identifier='aaa' and start=3)
or (identifier='aaa' and start<=2 and end>2)
building this up from the list of input parameters ('aaa',3 ; 'bbb',-2), which works well and produces an explain plan using all of the indexes I would expect.
My Problem: I need to know what the input parameters were that retrieved that row in order to do further processing and return the relevant prim_key. I need to use something like a psuedocolumn that I can define myself:
select prim_key, PSUEDO
from myTable
where (identifier='aaa' and start=3 and PSUEDO='a3')
or (identifier='aaa' and start<=2 and end>2 and PSUEDO='a-2')
but I can't find any way to return a value from the where clause, and I think subqueries would lose the indexing efficiencies gained by doing it all in one select.
Try something like:
select
prim_key,
case when start = 3 then 'a3' else 'a-2' end pseudo
from
you_table
where
...

Locating all reachable nodes using SQL

Suppose a table with two columns: From and To. Example:
From To
1 2
2 3
2 4
4 5
I would like to know the most effective way to locate all nodes that are reachable from a node using a SQL Query. Example: given 1 it would return 2,3,4 and 5. It is possible to use several queries united by UNION clauses but it would limit the number of levels that can be reached. Perhaps a different data structure would make the problem more tractable but this is what is available.
I am using Firebird but I would like have a solution that only uses standard SQL.
You can use a recursive common table expression if you use most brands of database -- except for MySQL and SQLite and a few other obscure ones (sorry, I do consider Firebird obscure). This syntax is ANSI SQL standard, but Firebird doesn't support it yet.
Correction: Firebird 2.1 does support recursive CTE's, as #Hugues Van Landeghem comments.
Otherwise see my presentation Models for Hierarchical Data with SQL for several different approaches.
For example, you could store additional rows for every path in your tree, not just the immediate parent/child paths. I call this design Closure Table.
From To Length
1 1 0
1 2 1
1 3 2
1 4 2
1 5 3
2 2 0
2 3 1
2 4 1
3 3 0
4 4 0
4 5 1
5 5 0
Now you can query SELECT * FROM MyTable WHERE From = 1 and get all the descendants of that node.
PS: I'd avoid naming a column From, because that's an SQL reserved word.
Unfortunately there isn't a good generic solution to this that will work for all situations on all databases.
I recommend that you look at these resources for a MySQL solution:
Managing Hierarchical Data in MySQL
Models for hierarchical data - presentation by Bill Karwin which discusses this subject, demonstrates different solutions, and compares the adjacency list model you are using with other alternative models.
For PostgreSQL and SQL Server you should take a look at recursive CTEs.
If you are using Oracle you should look at CONNECT BY which is a proprietary extension to SQL that makes dealing with tree structures much easier.
With standard SQL the only way to store a tree with acceptable read performance is by using a hack such as path enumeration. Note that this is very heavy on writes.
ID PATH
1 1
2 1;2
3 1;2;3
4 1;2;4
SELECT * FROM tree WHERE path LIKE '%2;%'

SQL Query with multiple values in one column

I've been beating my head on the desk trying to figure this one out. I have a table that stores job information, and reasons for a job not being completed. The reasons are numeric,01,02,03,etc. You can have two reasons for a pending job. If you select two reasons, they are stored in the same column, separated by a comma. This is an example from the JOBID table:
Job_Number User_Assigned PendingInfo
1 user1 01,02
There is another table named Pending, that stores what those values actually represent. 01=Not enough info, 02=Not enough time, 03=Waiting Review. Example:
Pending_Num PendingWord
01 Not Enough Info
02 Not Enough Time
What I'm trying to do is query the database to give me all the job numbers, users, pendinginfo, and pending reason. I can break out the first value, but can't figure out how to do the second. What my limited skills have so far:
select Job_number,user_assigned,SUBSTRING(pendinginfo,0,3),pendingword
from jobid,pending
where
SUBSTRING(pendinginfo,0,3)=pending.pending_num and
pendinginfo!='00,00' and
pendinginfo!='NULL'
What I would like to see for this example would be:
Job_Number User_Assigned PendingInfo PendingWord PendingInfo PendingWord
1 User1 01 Not Enough Info 02 Not Enough Time
Thanks in advance
You really shouldn't store multiple items in one column if your SQL is ever going to want to process them individually. The "SQL gymnastics" you have to perform in those cases are both ugly hacks and performance degraders.
The ideal solution is to split the individual items into separate columns and, for 3NF, move those columns to a separate table as rows if you really want to do it properly (but baby steps are probably okay if you're sure there will never be more than two reasons in the short-medium term).
Then your queries will be both simpler and faster.
However, if that's not an option, you can use the afore-mentioned SQL gymnastics to do something like:
where find ( ',' |fld| ',', ',02,' ) > 0
assuming your SQL dialect has a string search function (find in this case, but I think charindex for SQLServer).
This will ensure all sub-columns begin and start with a comma (comma plus field plus comma) and look for a specific desired value (with the commas on either side to ensure it's a full sub-column match).
If you can't control what the application puts in that column, I would opt for the DBA solution - DBA solutions are defined as those a DBA has to do to work around the inadequacies of their users :-).
Create two new columns in that table and make an insert/update trigger which will populate them with the two reasons that a user puts into the original column.
Then query those two new columns for specific values rather than trying to split apart the old column.
This means that the cost of splitting is only on row insert/update, not on _every single select`, amortising that cost efficiently.
Still, my answer is to re-do the schema. That will be the best way in the long term in terms of speed, readable queries and maintainability.
I hope you are just maintaining the code and it's not a brand new implementation.
Please consider to use a different approach using a support table like this:
JOBS TABLE
jobID | userID
--------------
1 | user13
2 | user32
3 | user44
--------------
PENDING TABLE
pendingID | pendingText
---------------------------
01 | Not Enough Info
02 | Not Enough Time
---------------------------
JOB_PENDING TABLE
jobID | pendingID
-----------------
1 | 01
1 | 02
2 | 01
3 | 03
3 | 01
-----------------
You can easily query this tables using JOIN or subqueries.
If you need retro-compatibility on your software you can add a view to reach this goal.
I have a tables like:
Events
---------
eventId int
eventTypeIds nvarchar(50)
...
EventTypes
--------------
eventTypeId
Description
...
Each Event can have multiple eventtypes specified.
All I do is write 2 procedures in my site code, not SQL code
One procedure converts the table field (eventTypeIds) value like "3,4,15,6" into a ViewState array, so I can use it any where in code.
This procedure does the opposite it collects any options your checked and converts it in
If changing the schema is an option (which it probably should be) shouldn't you implement a many-to-many relationship here so that you have a bridging table between the two items? That way, you would store the number and its wording in one table, jobs in another, and "failure reasons for jobs" in the bridging table...
Have a look at a similar question I answered here
;WITH Numbers AS
(
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT 0)) AS N
FROM JobId
),
Split AS
(
SELECT JOB_NUMBER, USER_ASSIGNED, SUBSTRING(PENDING_INFO, Numbers.N, CHARINDEX(',', PENDING_INFO + ',', Numbers.N) - Numbers.N) AS PENDING_NUM
FROM JobId
JOIN Numbers ON Numbers.N <= DATALENGTH(PENDING_INFO) + 1
AND SUBSTRING(',' + PENDING_INFO, Numbers.N, 1) = ','
)
SELECT *
FROM Split JOIN Pending ON Split.PENDING_NUM = Pending.PENDING_NUM
The basic idea is that you have to multiply each row as many times as there are PENDING_NUMs. Then, extract the appropriate part of the string
While I agree with DBA perspective not to store multiple values in a single field it is doable, as bellow, practical for application logic and some performance issues. Let say you have 10000 user groups, each having average 1000 members. You may want to have a table user_groups with columns such as groupID and membersID. Your membersID column could be populated like this:
(',10,2001,20003,333,4520,') each number being a memberID, all separated with a comma. Add also a comma at the start and end of the data. Then your select would use like '%,someID,%'.
If you can not change your data ('01,02,03') or similar, let say you want rows containing 01 you still can use " select ... LIKE '01,%' OR '%,01' OR '%,01,%' " which will insure it match if at start, end or inside, while avoiding similar number (ie:101).