convert PL/SQL to SQL - Connect By Statement [duplicate] - sql

How can I get the functionality of CONNECT BY PRIOR of Oracle in SQL Server 2000/2005/2008?

The SQL standard way to implement recursive queries, as implemented e.g. by IBM DB2 and SQL Server, is the WITH clause. See this article for one example of translating a CONNECT BY into a WITH (technically a recursive CTE) -- the example is for DB2 but I believe it will work on SQL Server as well.
Edit: apparently the original querant requires a specific example, here's one from the IBM site whose URL I already gave. Given a table:
CREATE TABLE emp(empid INTEGER NOT NULL PRIMARY KEY,
name VARCHAR(10),
salary DECIMAL(9, 2),
mgrid INTEGER);
where mgrid references an employee's manager's empid, the task is, get the names of everybody who reports directly or indirectly to Joan. In Oracle, that's a simple CONNECT:
SELECT name
FROM emp
START WITH name = 'Joan'
CONNECT BY PRIOR empid = mgrid
In SQL Server, IBM DB2, or PostgreSQL 8.4 (as well as in the SQL standard, for what that's worth;-), the perfectly equivalent solution is instead a recursive query (more complex syntax, but, actually, even more power and flexibility):
WITH n(empid, name) AS
(SELECT empid, name
FROM emp
WHERE name = 'Joan'
UNION ALL
SELECT nplus1.empid, nplus1.name
FROM emp as nplus1, n
WHERE n.empid = nplus1.mgrid)
SELECT name FROM n
Oracle's START WITH clause becomes the first nested SELECT, the base case of the recursion, to be UNIONed with the recursive part which is just another SELECT.
SQL Server's specific flavor of WITH is of course documented on MSDN, which also gives guidelines and limitations for using this keyword, as well as several examples.

#Alex Martelli's answer is great!
But it work only for one element at time (WHERE name = 'Joan')
If you take out the WHERE clause, the query will return all the root rows together...
I changed a little bit for my situation, so it can show the entire tree for a table.
table definition:
CREATE TABLE [dbo].[mar_categories] (
[category] int IDENTITY(1,1) NOT NULL,
[name] varchar(50) NOT NULL,
[level] int NOT NULL,
[action] int NOT NULL,
[parent] int NULL,
CONSTRAINT [XPK_mar_categories] PRIMARY KEY([category])
)
(level is literally the level of a category 0: root, 1: first level after root, ...)
and the query:
WITH n(category, name, level, parent, concatenador) AS
(
SELECT category, name, level, parent, '('+CONVERT(VARCHAR (MAX), category)+' - '+CONVERT(VARCHAR (MAX), level)+')' as concatenador
FROM mar_categories
WHERE parent is null
UNION ALL
SELECT m.category, m.name, m.level, m.parent, n.concatenador+' * ('+CONVERT (VARCHAR (MAX), case when ISNULL(m.parent, 0) = 0 then 0 else m.category END)+' - '+CONVERT(VARCHAR (MAX), m.level)+')' as concatenador
FROM mar_categories as m, n
WHERE n.category = m.parent
)
SELECT distinct * FROM n ORDER BY concatenador asc
(You don't need to concatenate the level field, I did just to make more readable)
the answer for this query should be something like:
I hope it helps someone!
now, I'm wondering how to do this on MySQL... ^^

I haven't used connect by prior, but a quick search shows it's used for tree structures. In SQL Server, you use common table expressions to get similar functionality.

Related

Is there a better way to write this gross SQL?

So I'm creating a query for a report that could have several optional filters. I've only included client and station here to keep it simple. Each of these options could be an include or an exclude and could contain NULL, 1, or multiple values. So I split the varchar into a table before joining it to the query.
This test takes about 15 minutes to execute, which... just won't do :p Is there a better way? We have similar queries written with dynamic sql, and I was trying to avoid that, but maybe there's no way around it for this?
DECLARE
#ClientsInc VARCHAR(10) = 'ABCD, EFGH',
#ClientsExc VARCHAR(10) = NULL,
#StationsInc VARCHAR(10) = NULL,
#StationsExc VARCHAR(10) = 'SomeStation'
SELECT *
INTO #ClientsInc
FROM dbo.StringSplit(#ClientsInc, ',')
SELECT *
INTO #ClientsExc
FROM dbo.StringSplit(#ClientsExc, ',')
SELECT *
INTO #StationsInc
FROM dbo.StringSplit(#StationsInc, ',')
SELECT *
INTO #StationsExc
FROM dbo.StringSplit(#StationsExc, ',')
SELECT [some stuff]
FROM media_order mo
LEFT JOIN #ClientsInc cInc WITH(NOLOCK) ON cInc.Value = mo.client_code
LEFT JOIN #ClientsExc cExc WITH(NOLOCK) ON cExc.Value = mo.client_code
LEFT JOIN #StationsInc sInc WITH(NOLOCK) ON sInc.Value = mo.station_name
LEFT JOIN #StationsExc sExc WITH(NOLOCK) ON sExc.Value = mo.station_name
WHERE ((#ClientsInc IS NOT NULL AND cInc.Value IS NOT NULL)
OR (#ClientsExc IS NOT NULL AND cExc.Value IS NULL)
)
AND ((#StationsInc IS NOT NULL AND sInc.Value IS NOT NULL)
OR (#StationsExc IS NOT NULL AND sExc.Value IS NULL)
)
First of all, I always tend to mention Erland Sommarskog's Dynamic Search Conditions in such cases.
However, you already seem to be aware of the two options: one is dynamic SQL. The other is usually the old trick and (#var is null or #var=respective_column). This trick, however, works only for one value per variable.
Your solution indeed seems to work for multiple values. But in my opinion, you are trying too hard to avoid dynamic sql. Your requirements are complex enough to guarantee it. And remember, usually, dynamic sql is harder for you to code, but easier for the server in complex cases - and this one certainly is. Making a performance guess is always risky, but I would guess an improvement in this case.
I would use exists and not exists:
select ...
from media_order mo
where
(
#ClientsInc is null
or exists (
select 1
from string_split(#ClientsInc, ',')
where value = mo.client_code
)
)
and not exist (
select 1
from string_split(#ClientsExc, ',')
where value = mo.client_code
)
and (
#StationsInc is null
or exists (
select 1
from string_split(#StationsInc, ',')
where value = mo.station_name
)
)
and not exist (
select 1
from string_split(#StationsExc, ',')
where value = mo.station_name
)
Notes:
I used buil-in function string_split() rather than the custom splitter that you seem to be using. It is available in SQL Server 2016 and higher, and returns a single column called value. You can change that back to your customer function if you are running an earlier version
as I understand the logic you want, "include" parameters need to be checked for nullness before using exists, while it is unnecessary for "exclude" variables

SQL Server : join on uniqueidentifier

I have two tables Backup and Requests.
Below is the script for both the tables
Backup
CREATE TABLE UserBackup(
FileName varchar(70) NOT NULL,
)
File name is represented by a guid. Sometimes there is some additional information related to the file. Hence we have entries like guid_ADD entried in table.
Requests
CREATE TABLE Requests(
RequestId UNIQUEIDENTIFIER NOT NULL,
Status int Not null
)
Here are some sample rows :
UserBackup table:
FileName
15b993cc-e8be-405d-bb9f-0c58b66dcdfe
4cffe724-3f68-4710-b785-30afde5d52f8
4cffe724-3f68-4710-b785-30afde5d52f8_Add
7ad22838-ddee-4043-8d1f-6656d2953545
Requests table:
RequestId Status
15b993cc-e8be-405d-bb9f-0c58b66dcdfe 1
4cffe724-3f68-4710-b785-30afde5d52f8 1
7ad22838-ddee-4043-8d1f-6656d2953545 2
What I need is to return all the rows from userbackup table whose name (the guid) is matches RequestId in the Requests table and the status is 1. So here is the query I wrote
Select *
from UserBackup
inner join Requests on UserBackup.FileName = Requests.RequestId
where Requests.Status = 1
And this works fine. It returns me the following result
FileName RequestId Status
15b993cc-e8be-405d-bb9f-0c58b66dcdfe 15b993cc-e8be-405d-bb9f-0c58b66dcdfe 1
4cffe724-3f68-4710-b785-30afde5d52f8 4cffe724-3f68-4710-b785-30afde5d52f8 1
4cffe724-3f68-4710-b785-30afde5d52f8_Add 4cffe724-3f68-4710-b785-30afde5d52f8 1
This is exactly what I want. But what I don't understand is how it is working. If you notice the result is returning 4cffe724-3f68-4710-b785-30afde5d52f8_Add row as well. The inner join is on varchar and uniqueidentifier, and this join instead of working like "Equals to" comparison works like "contains" comparison. I want to know how this works so that I can be sure to use this code without any unexpected scenarios.
The values on both sides of a comparison have to be of the same data type. There's no such thing as, say, comparing a uniqueidentifier and a varchar.
uniqueidentifier has a higher precedence than varchar so the varchars will be converted to uniqueidentifiers before the comparison occurs.
Unfortunately, you get no error or warning if the string contains more characters than are needed:
select CONVERT(uniqueidentifier,'4cffe724-3f68-4710-b785-30afde5d52f8_Add')
Result:
4CFFE724-3F68-4710-B785-30AFDE5D52F8
If you want to force the comparison to occur between strings, you'll have to perform an explicit conversion:
Select *
from UserBackup
inner join Requests
on UserBackup.FileName = CONVERT(varchar(70),Requests.RequestId)
where Requests.Status = 1
When you compare two columns of different data types SQL Server will attempt to do implicit conversion on lower precedence.
The following comes from MSDN docs on uniqueidentifier
The following example demonstrates the truncation of data when the
value is too long for the data type being converted to. Because the
uniqueidentifier type is limited to 36 characters, the characters that
exceed that length are truncated.
DECLARE #ID nvarchar(max) = N'0E984725-C51C-4BF4-9960-E1C80E27ABA0wrong';
SELECT #ID, CONVERT(uniqueidentifier, #ID) AS TruncatedValue;
http://msdn.microsoft.com/en-us/library/ms187942.aspx
Documentation is clear that data is truncated
When ever you are unsure about your join operation you can verify Actual Execution Plan.
Here is test sample that you can run inside SSMS or SQL Sentry Plan Explorer
DECLARE #userbackup TABLE ( _FILENAME VARCHAR(70) )
INSERT INTO #userbackup
VALUES ( '15b993cc-e8be-405d-bb9f-0c58b66dcdfe' ),
( '4cffe724-3f68-4710-b785-30afde5d52f8' ),
( '4cffe724-3f68-4710-b785-30afde5d52f8_Add' )
, ( '7ad22838-ddee-4043-8d1f-6656d2953545' )
DECLARE #Requests TABLE
(
requestID UNIQUEIDENTIFIER
,_Status INT
)
INSERT INTO #Requests
VALUES ( '15b993cc-e8be-405d-bb9f-0c58b66dcdfe', 1 )
, ( '4cffe724-3f68-4710-b785-30afde5d52f8', 1 )
, ( '7ad22838-ddee-4043-8d1f-6656d2953545', 2 )
SELECT *
FROM #userbackup u
JOIN #Requests r
ON u.[_FILENAME] = r.requestID
WHERE r.[_Status] = 1
Instead of regular join operation SQL Server is doing HASH MATCH with EXPR 1006 in SSMS it is hard to see what is doing but if you open XML file you will find this
<ColumnReference Column="Expr1006" />
<ScalarOperator ScalarString="CONVERT_IMPLICIT(uniqueidentifier,#userbackup.[_FILENAME] as [u].[_FILENAME],0)">
When ever in doubt check execution plan and always make sure to match data types when comparing.
This is great blog Data Mismatch on WHERE Clause might Cause Serious Performance Problems from Microsoft engineer on exact problem.
What is happening here is the FileName is being converted from varchar to a UniqueIdentifier, and during that process it ignores anything after the first 36 characters.
You can see it in action here
Select convert(uniqueidentifier, UserBackup.FileName), FileName
from UserBackup
It works, but to reduce confusion for the next person to come along, you might want to store the RequestId associated with the UserBackup as a GUID in the UserBackup table and join on that.
At the very least put a comment in ;)

SQL conversion from varchar to uniqueidentifier fails in view

I'm stuck on the following scenario.
I have a database with a table with customer data and a table where I put records for monitoring what is happening on our B2B site.
The customer table is as follow:
ID, int, not null
GUID, uniqueidentfier, not null, primary key
Other stuff...
The monitoring table:
ID, int, not null
USERGUID, uniqueidentifier, null
PARAMETER2, varchar(50), null
Other stuff...
In PARAMETER1 are customer guids as wel as other data types stored.
Now the question came to order our customers according their last visit date, the most recent visited customers must come on the top of a grid.
I'm using Entity Framework and I had problems of comparing the string and the guid type, so I decided to make a view on top of my monitoring table:
SELECT
ID,
CONVERT(uniqueidentifier, parameter2) AS customerguid,
USERguid,
CreationDate
FROM
MONITORING
WHERE
(dbo.isuniqueidentifier(parameter2) = 1)
AND
(parameter1 LIKE 'Customers_%' OR parameter1 LIKE 'Customer_%')
I imported the view in EF and made my Linq query. It returned nothing, so I extracted the generated SQL query. When testing the query in SQL Management Studio I got the following error:
Conversion failed when converting from a character string to uniqueidentifier.
The problem lies in the following snippet (simplified for this question, but also gives an error:
SELECT *,
(
SELECT
[v_LastViewDateCustomer].[customerguid] AS [customerguid]
FROM [dbo].[v_LastViewDateCustomer] AS [v_LastViewDateCustomer]
WHERE c.GUID = [v_LastViewDateCustomer].[customerguid]
)
FROM CM_CUSTOMER c
But when I do a join, I get my results:
SELECT *
FROM CM_CUSTOMER c
LEFT JOIN
[v_LastViewDateCustomer] v
on c.GUID = v.customerguid
I tried to make a SQL fiddle, but it is working on that site. http://sqlfiddle.com/#!3/66d68/3
Anyone who can point me in the right direction?
Use
TRY_CONVERT(UNIQUEIDENTIFIER, parameter2) AS customerguid
instead of
CONVERT(UNIQUEIDENTIFIER, parameter2) AS customerguid
Views are inlined into the query and the CONVERT can run before the WHERE.
For some additional discussion see SQL Server should not raise illogical errors

SQL IN query produces strange result

Please see the table structure below:
CREATE TABLE Person (id int not null, PID INT NOT NULL, Name VARCHAR(50))
CREATE TABLE [Order] (OID INT NOT NULL, PID INT NOT NULL)
INSERT INTO Person VALUES (1,1,'Ian')
INSERT INTO Person VALUES (2,2,'Maria')
INSERT INTO [Order] values (1,1)
Why does the following query return two results:
select * from Person WHERE id IN (SELECT ID FROM [Order])
ID does not exist in Order. Why does the query above produce results? I would expect it to error because I'd does not exist in order.
This behavior, while unintuitive, is very well defined in Microsoft's Knowledge Base:
KB #298674 : PRB: Subquery Resolves Names of Column to Outer Tables
From that article:
To illustrate the behavior, use the following two table structures and query:
CREATE TABLE X1 (ColA INT, ColB INT)
CREATE TABLE X2 (ColC INT, ColD INT)
SELECT ColA FROM X1 WHERE ColA IN (Select ColB FROM X2)
The query returns a result where the column ColB is considered from table X1.
By qualifying the column name, the error message occurs as illustrated by the following query:
SELECT ColA FROM X1 WHERE ColA in (Select X2.ColB FROM X2)
Server: Msg 207, Level 16, State 3, Line 1
Invalid column name 'ColB'.
Folks have been complaining about this issue for years, but Microsoft isn't going to fix it. It is, after all, complying with the standard, which essentially states:
If you don't find column x in the current scope, traverse to the next outer scope, and so on, until you find a reference.
More information in the following Connect "bugs" along with multiple official confirmations that this behavior is by design and is not going to change (so you'll have to change yours - i.e. always use aliases):
Connect #338468 : CTE Column Name resolution in Sub Query is not validated
Connect #735178 : T-SQL subquery not working in some cases when IN operator used
Connect #302281 : Non-existent column causes subquery to be ignored
Connect #772612 : Alias error not being reported when within an IN operator
Connect #265772 : Bug using sub select
In your case, this "error" will probably be much less likely to occur if you use more meaningful names than ID, OID and PID. Does Order.PID point to Person.id or Person.PID? Design your tables so that people can figure out the relationships without having to ask you. A PersonID should always be a PersonID, no matter where in the schema it is; same with an OrderID. Saving a few characters of typing is not a good price to pay for a completely ambiguous schema.
You could write an EXISTS clause instead:
... FROM dbo.Person AS p WHERE EXISTS
(
SELECT 1 FROM dbo.[Order] AS o
WHERE o.PID = p.id -- or is it PID? See why it pays to be explicit?
);
The problem here is that you're not using Table.Column notation in your subquery, table Order doesn't have column ID and ID in subquery really means Person.ID, not [Order].ID. That's why I always insist on using aliases for tables in production code. Compare these two queries:
select * from Person WHERE id IN (SELECT ID FROM [Order]);
select * from Person as p WHERE p.id IN (SELECT o.ID FROM [Order] as o)
The first one will execute but will return incorrect results, and the second one will raise an error. It's because the outer query's columns may be referenced in a subquery, so in this case you can use Person columns inside the subquery.
Perhaps you wanted to use the query like this:
select * from Person WHERE pid IN (SELECT PID FROM [Order])
But you never know when the schema of the [Order] table changes, and if somebody drops the column PID from [Order] then your query will return all rows from the table Person. Therefore, use aliases:
select * from Person as P WHERE P.pid IN (SELECT O.PID FROM [Order] as O)
Just quick note - this is not SQL Server specific behaviour, it's standard SQL:
SQL Server demo
PostgreSQL demo
MySQL demo
Oracle demo
Order table doesnt have id column
Try these instead:
select * from Person WHERE id IN (SELECT OID FROM [Order])
OR
select * from Person WHERE pid IN (SELECT PID FROM [Order])

Sql query that numerates the returned result

How to write one SQL query that selects a column from a table but returns two columns where the additional one contains an index of the row (a new one, starting with 1 to n). It must be without using functions that do that (like row_number()).
Any ideas?
Edit: it must be a one-select query
You can do this on any database:
SELECT (SELECT COUNT (1) FROM field_company fc2
WHERE fc2.field_company_id <= fc.field_company_id) AS row_num,
fc.field_company_name
FROM field_company fc
SET NOCOUNT ON
DECLARE #item_table TABLE
(
row_num INT IDENTITY(1, 1) NOT NULL PRIMARY KEY, --THE IDENTITY STATEMENT IS IMPORTANT!
field_company_name VARCHAR(255)
)
INSERT INTO #item_table
SELECT field_company_name FROM field_company
SELECT * FROM #item_table
if you are using Oracle or a database that supports Sequence objects, make a new db sequence object for this purpose. Next create a view, and run this.
insert into the view as select column_name, sequence.next from table
In mysql you can :
SELECT Row,Column1
FROM (SELECT #row := #row + 1 AS Row, Column1 FROM table1 )
As derived1
I figured out a hackish way to do this that I'm a bit ashamed of. On Postgres 8.1:
SELECT generate_series, (SELECT username FROM users LIMIT 1 OFFSET generate_series) FROM generate_series(0,(SELECT count(*) - 1 FROM users));
I believe this technique will work even if your source table does not have unique ids or identifiers.
On SQL Server 2005 and higher, you can use OVER to accomplish this:
SELECT rank() over (order by company_id) as rownum
, company_name
FROM company