I have 2 tables with these columns:
CREATE TABLE #temp
(
Phone_number varchar(100) -- example data: "2022033456"
)
CREATE TABLE orders
(
Addons ntext -- example data: "Enter phone:2022033456<br>Thephoneisvalid"
)
I have to join these two tables using LIKE because the phone numbers are not in the same format. A little background: I am joining the #temp table on its phone number to the orders table on its Addons value. Then, in the WHERE condition, I am again trying to match them and get some results. Here is my code, but the results I am getting are not accurate; in fact it returns no data at all. I don't know what I am doing wrong. I am using SQL Server.
select
*
from
order_no as n
join
orders as o on n.order_no = o.order_no
join
#temp as t on t.phone_number like '%'+ cast(o.Addons as varchar(max))+'%'
where
t.phone_number = '%' + cast(o.Addons as varchar(max)) + '%'
You cannot use a LIKE statement in the JOIN condition like that. Please provide more information about your tables. You have to convert the format of one of the phone fields to match the other phone field's format in order to join.
I think your join condition is in the wrong order. Because your question explicitly mentions two tables, let's stick with those:
select *
from orders o JOIN
#temp t
on cast(o.Addons as varchar(max)) like '%' + t.phone_number + '%';
It has been so long since I dealt with the ntext data type in SQL Server that I don't remember whether the cast() is necessary or not.
Instead of trying to do everything in a single top-level query, you should apply a transformation projection to your orders table and use that as a subquery, which will make the query easier to understand.
Using the CHARINDEX function will make this a lot easier. However, CHARINDEX does not support ntext, so you will need to change your schema to use nvarchar(max) instead, which you should be doing anyway because ntext is deprecated. Fortunately, you can use CONVERT( nvarchar(max), someNTextValue ) in the meantime, though this reduces performance because no indexes on the ntext values can be used; this query will run slowly in any case.
SELECT
orders2.*,
CASE WHEN orders2.PhoneStart > 0 AND orders2.PhoneEnd > orders2.PhoneStart THEN
-- start 12 characters further along so the 'Enter phone:' label itself is skipped
SUBSTRING( orders2.Addons, orders2.PhoneStart + 12, orders2.PhoneEnd - ( orders2.PhoneStart + 12 ) )
ELSE
NULL
END AS ExtractedPhoneNumber
FROM
(
SELECT
orders.*, -- never use `*` in production, so replace this with the actual columns in your orders table
CHARINDEX('Enter phone:', CONVERT(nvarchar(max), Addons)) AS PhoneStart, -- CHARINDEX does not support ntext, so convert first
CHARINDEX('<br>Thephoneisvalid', CONVERT(nvarchar(max), Addons), CHARINDEX('Enter phone:', CONVERT(nvarchar(max), Addons)) ) AS PhoneEnd
FROM
orders
) AS orders2
I suggest converting the above into a VIEW or CTE so you can directly query it in your JOIN expression:
CREATE VIEW ordersWithPhoneNumbers AS
-- copy and paste the above query here, then execute the batch to create the view, you only need to do this once.
Then you can use it like so:
SELECT
* -- again, avoid the use of the star selector in production use
FROM
ordersWithPhoneNumbers AS o2 -- this is the above query as a VIEW
INNER JOIN order_no ON o2.order_no = order_no.order_no
INNER JOIN #temp AS t ON o2.ExtractedPhoneNumber = t.phone_number
Actually, I take back my previous remark about performance - if you add an index to the ExtractedPhoneNumber column of the ordersWithPhoneNumbers view then you'll get good performance.
I am consolidating a web service. I am replacing multiple calls to the service with one call that contains the data.
I have created a table:
CREATE TABLE InvResults
(
Invoices nvarchar(max),
InvoiceDetails nvarchar(max),
Products nvarchar(max)
);
I used (max) because I don't know how complex the json will get at this time.
I need to do some sort of selects like this (this is pseudocode, not actual SQL):
SELECT
(SELECT *
INTO InvResults for Column Invoices
FROM MyInvoiceTable
WHERE SomeColumns = 'someStuffvariable'
FOR JSON PATH, ROOT('invoices')) AS invoices;
SELECT
(SELECT *
INTO InvResults for Column InvoiceDetails
FROM MyInvoiceDetailsTable
WHERE SomeColumns = 'someStuffvariable'
FOR JSON PATH, ROOT('invoicedetails')) AS invoicedetails;
I don't know how to format this and my google skills are failing me at this point. I understand that I probably want to use an UPDATE statement, but I'm not sure how to do this in combination with the rest of my requirements. I'm exploring How do I UPDATE from a SELECT in SQL Server? but I am still at a halt.
The end result should be a table "InvResults" that has 3 columns containing one row with results from Select statements as JSON. The column names should be defined the same as the json root objects.
INSERT INTO InvResults (Invoices, InvoiceDetails)
SELECT
(SELECT *
FROM MyInvoiceTable
WHERE SomeColumns = 'someStuffvariable'
FOR JSON PATH, ROOT('invoices'))
,
(SELECT *
FROM MyInvoiceDetailsTable
WHERE SomeColumns = 'someStuffvariable'
FOR JSON PATH, ROOT('invoicedetails'))
;
This works because each SELECT ... FOR JSON above returns only one row.
The third column is just as easy to add, but that is left for you to do 😉
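For completeness, here is a sketch that fills all three columns in one INSERT; MyProductsTable is a hypothetical source table standing in for wherever your product data lives:
INSERT INTO InvResults (Invoices, InvoiceDetails, Products)
SELECT
-- each scalar subquery returns a single JSON string, so the INSERT adds exactly one row
(SELECT * FROM MyInvoiceTable        WHERE SomeColumns = 'someStuffvariable' FOR JSON PATH, ROOT('invoices')),
(SELECT * FROM MyInvoiceDetailsTable WHERE SomeColumns = 'someStuffvariable' FOR JSON PATH, ROOT('invoicedetails')),
(SELECT * FROM MyProductsTable       WHERE SomeColumns = 'someStuffvariable' FOR JSON PATH, ROOT('products'));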
I am struggling with trying to apply a date filter to my query. I keep getting this error message
Conversion failed when converting the varchar value 'Collect_Date' to
data type int
Here is my code:
SELECT
Location_ID,
CONVERT(Date,CONVERT(varchar(10),Collect_Month_Key,101)) as 'Collect_Date',
Calc_Gross_Totals, Loc_Country,
CONVERT(varchar(8),Collect_Month_Key)+'-'+Location_ID as 'Unique Key'
FROM
FT_GPM_NPM_CYCLES,
LU_Location,
LU_Loc_Country
WHERE
LU_Location.LU_Loc_Country_Key=LU_Loc_Country.LU_Loc_Country_Key
AND FT_GPM_NPM_CYCLES.Lu_Loc_Key= LU_Location.LU_Loc_Key
AND Collect_Month_Key<>-1
AND 'Collect_Date'>=2016-1-1
ORDER BY
Location_ID,
Collect_Date;
If someone could help, that would be appreciated. I am also getting a different error when I try to use Month(Collect_Date), so if anyone knows why, I would appreciate that too. I have attached a picture with the code and the results I am getting.
I see what's going on: you are trying to use the column alias from the select list in the WHERE clause. You can't do that. There are a few other issues that have been covered in the comments, but here is the immediate answer to the question:
Select Location_ID
, Convert(Date,CONVERT(varchar(10),Collect_Month_Key,101)) as Collect_Date
, Calc_Gross_Totals
, Loc_Country
, CONVERT(varchar(8),Collect_Month_Key)+'-'+Location_ID as [Unique Key]
From FT_GPM_NPM_CYCLES
, LU_Location
, LU_Loc_Country
Where LU_Location.LU_Loc_Country_Key=LU_Loc_Country.LU_Loc_Country_Key
and FT_GPM_NPM_CYCLES.Lu_Loc_Key= LU_Location.LU_Loc_Key
and Collect_Month_Key <> -1
and Convert(Date,CONVERT(varchar(10),Collect_Month_Key,101)) >= '2016-1-1'
Order By Location_ID, Collect_Date;
Here is an updated query with the following modifications:
As commented by Robert Sheahan, you cannot use a result set column alias in the WHERE clause.
As commented by Larnu, since you are storing dates as strings, you can simply do string comparison to filter records (and return string values). With this technique, you do not need the additional condition Collect_Month_Key <> -1, since the string '-1' is not greater than the string '20160101'.
Use explicit joins instead of implicit joins (comment by Gordon Linoff).
I added table aliases: they make the query easier to read (and make it possible to self-join a table...).
I would also recommend prefixing all columns used in the query with their table alias. This clearly indicates which table each column comes from and makes the query easier to understand and maintain. NB: if Collect_Month_Key belongs to a table other than FT_GPM_NPM_CYCLES, you will want to move the condition from the WHERE clause to the ON clause of the relevant JOIN.
Query:
SELECT
Location_ID,
Collect_Month_Key AS Collect_Date,
Calc_Gross_Totals,
Loc_Country,
CONVERT(varchar(8),Collect_Month_Key) + '-' + Location_ID AS Unique_Key
FROM
FT_GPM_NPM_CYCLES AS cyc
INNER JOIN LU_Location AS loc
ON cyc.Lu_Loc_Key = loc.LU_Loc_Key
INNER JOIN LU_Loc_Country AS cty
ON loc.LU_Loc_Country_Key = cty.LU_Loc_Country_Key
WHERE
Collect_Month_Key > '20160101'
ORDER BY
Location_ID,
Collect_Month_Key
To answer your comment, "So if I don't put Collect_Date in the WHERE, where should I put it for something like this in the future?", I suggest Common Table Expressions (CTEs). Functionally they are equivalent to defining a derived table in the FROM clause, but they move it "above" the main query so it feels more like "before", and I think they make it much easier to read. To convert GMB's excellent solution to use a CTE:
--Leading ; because CTEs require the previous command to be terminated explicitly
;WITH cteWithDates as ( --cteWithDates becomes a virtual temporary table
SELECT
cyc.* --Keep all the original columns of FT_GPM_NPM_CYCLES
, Collect_Month_Key AS Collect_Date --and add Collect_Date and Unique_Key
, CONVERT(varchar(8),Collect_Month_Key) + '-' + Location_ID AS Unique_Key
FROM FT_GPM_NPM_CYCLES AS cyc
) --you could add more CTEs with the following format,
--all become available at the end
--, cteMore as (SELECT ... FROM ...)
--the first line after the closing ) has access to all CTEs, but ONLY that line
SELECT Location_ID,
Collect_Date,
Calc_Gross_Totals,
Loc_Country,
Unique_Key
FROM
cteWithDates AS cyc --Use the CTE as you would your original table,
--but the added fields are now available EVERYWHERE in your query!
INNER JOIN LU_Location AS loc
ON cyc.Lu_Loc_Key = loc.LU_Loc_Key
INNER JOIN LU_Loc_Country AS cty
ON loc.LU_Loc_Country_Key = cty.LU_Loc_Country_Key
WHERE
Collect_Date > '20160101' --NOW you can use Collect_Date!
ORDER BY
Location_ID,
Collect_Date --And here too
Note that this is much more efficient than defining an actual temporary table with #TableName: the query optimizer can drop unused records from the CTE, but it has to put all of them into the #temporary table first. That is a huge performance difference if your table is large and the matching subset is small.
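For comparison only, a sketch of the temp-table version of the same idea; this is what materializes every row up front, which is exactly the cost described above:
SELECT cyc.*, -- every row of FT_GPM_NPM_CYCLES gets written to tempdb here
Collect_Month_Key AS Collect_Date,
CONVERT(varchar(8), Collect_Month_Key) + '-' + Location_ID AS Unique_Key
INTO #WithDates
FROM FT_GPM_NPM_CYCLES AS cyc;
-- ...then join #WithDates to LU_Location and LU_Loc_Country exactly as cteWithDates is joined above.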
In the code_list CTE in this query I have a row constructor that will eventually take any number of arguments. The column icd in the patient_codes CTE is a five digit identifier that is more descriptive than the three digit codes in the row constructor. The table icd_patient has 100 million rows, so for performance's sake I would like to filter the rows on this table before I do any further work. I have
;with code_list(code_list)
as
(
select x.code_list
from (values ('70700'),('25002')) as x(code_list)
),patient_codes
as
(
select distinct icd,pat_id,id
from icd_patient
where icd in (select icd from code_list)
)
select distinct pat_id from patient_codes
The problem, however, is that in the icd_patient table all of the icd values are five digits and more descriptive. If I look at the execution plan of this query it's pretty streamlined. If I do
;with code_list(code_list)
as
(
select x.code_list
from (values ('70700'),('25002')) as x(code_list)
),patient_codes
as
(
select substring(icd,1,3) as icd,pat_id
from icd_patient2
where substring(icd,1,3) in (select * from code_list)
)
select * from patient_codes
this of course has a large performance impact because of the substring expression in the WHERE clause. Does something akin to a LIKE IN exist so I can take advantage of my indexes?
Index on icd_patient
CREATE NONCLUSTERED INDEX [ix_icd_patient] ON [dbo].[icd_patient2]
(
[pat_id] ASC
)
INCLUDE ( [id],
This much simpler query should be better than (or, at worst, the same as) your existing query.
select pat_id
FROM dbo.icd_patient
where icd LIKE '707%'
OR icd LIKE '250%'
GROUP BY pat_id;
Note that sargability only matters if there is actually an index on this column.
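For example, an index along these lines (the name is illustrative) would let both LIKE predicates seek instead of scan:
CREATE NONCLUSTERED INDEX ix_icd_patient_icd
ON dbo.icd_patient (icd)
INCLUDE (pat_id); -- covers the query so no key lookups are needed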
An alternative (since OR can sometimes give the optimizer fits):
SELECT pat_id FROM
(
SELECT pat_id
FROM dbo.icd_patient
WHERE icd LIKE '707%'
UNION ALL
SELECT pat_id
FROM dbo.icd_patient
WHERE icd LIKE '250%'
) AS x
GROUP BY pat_id;
To make this extensible beyond a handful of OR conditions, I would use a table-valued parameter (TVP).
CREATE TYPE dbo.StringPatterns AS TABLE(s VARCHAR(3) PRIMARY KEY);
Then your stored procedure could say:
CREATE PROCEDURE dbo.whatever
@sp dbo.StringPatterns READONLY
AS
BEGIN
SET NOCOUNT ON;
SELECT p.pat_id
FROM dbo.icd_patient AS p
INNER JOIN @sp AS sp
ON p.icd LIKE sp.s + '%'
GROUP BY p.pat_id;
END
Then you can pass in your set of three-character substrings from a DataTable or other collection in C#. From T-SQL just as an example:
DECLARE @p dbo.StringPatterns;
INSERT @p VALUES('707'),('250');
EXEC dbo.whatever @sp = @p;
Something like a "LIKE IN" does not exist. The following, however, is sargable:
select *
from icd_patient
where icd like '70700%' or
icd like '25002%'
This is because LIKE with a constant initial substring is a special case for SQL Server. It does not work when the strings on the right are variables.
One solution is to create an indexed view on the icd_patient table with an index on the first five characters of the icd code.
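A rough sketch of that indexed-view idea, assuming icd_patient has a unique id column (an indexed view needs SCHEMABINDING and a unique clustered index before any other index can be added):
CREATE VIEW dbo.icd_patient_prefix
WITH SCHEMABINDING
AS
SELECT id, pat_id, SUBSTRING(icd, 1, 5) AS icd5 -- first five characters of the code
FROM dbo.icd_patient;
GO
CREATE UNIQUE CLUSTERED INDEX ix_icd_patient_prefix ON dbo.icd_patient_prefix (id);
CREATE NONCLUSTERED INDEX ix_icd_patient_prefix_icd5 ON dbo.icd_patient_prefix (icd5) INCLUDE (pat_id);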
Using "IN" makes that part of a command non-sargable on both sides. End of discussion.
Saying he fixes it using substring, completely changes what it would return while it remains non sarged.
Any "fix" should exactly match results. The actual fix is to join the cte so the five characters match or put three characters in the cte and match that in a join or put 4 characters in the cte where the fourth is "%" and join matching by using LIKE
Using a "like" that starts with "%" increases the complexity of the search, but it would still use the index to find the value because parsing the index should use less reading by only getting the full table row when a search is successful.
I have a view in SQL Server that is somewhat similar to the following example.
SELECT *
FROM PEOPLE
LEFT OUTER JOIN (SELECT ID
FROM OTHER_TABLE
WHERE SOME_FIELD = 'x'
OR SOME_FIELD = 'y'
OR SOME_FIELD = 'z') AS PEOPLE_TO_EXCLUDE ON PEOPLE.ID = PEOPLE_TO_EXCLUDE.ID
WHERE PEOPLE_TO_EXCLUDE.ID IS null
The hassle:
I am perfectly capable of adding and modifying "OR SOME_FIELD = 'w'" countless times. However, I am making this view for a user to pull up in Excel via ODBC. The user needs to be able to modify the inner select to her liking, to match whatever she happens to be limiting on at that time of the day/week/month/year/etc. I need to make this in a way that allows her to easily limit on SOME_FIELD.
Does anyone have suggestions on how to accomplish this? Ideally I could give her a view, which she could put a comma separated list of values that SOME_FIELD cannot be. Since people may have multiple rows in OTHER_TABLE I can't just have her limit off of that table specifically. For example someone may have SOME_FIELD = 'x' but also have a row in the table where SOME_FIELD = 's'. This person should be excluded because they have 'x' even though they also have 's'. So that is why the inner select is necessary.
Thanks for your help.
Don't create queries for Excel users; they always break them and then you have to debug them. Instead, create a stored procedure and pass in a CSV string. In the stored procedure, split the CSV using a split function and join to it. The user will only have an Excel query like:
EXEC YourProcedure 'x,y,z'
As a result, they will not break the query.
To help with the split function, see "Arrays and Lists in SQL Server 2008 Using Table-Valued Parameters" by Erland Sommarskog; there are many ways to split a string in SQL Server, and that article covers the pros and cons of just about every method.
You need to create a split function. This is how a split function can be used:
SELECT
*
FROM YourTable y
INNER JOIN dbo.yourSplitFunction(@Parameter) s ON y.ID=s.Value
I prefer the numbers table approach to splitting a string in T-SQL, but there are numerous ways to split strings in SQL Server; see the previous link, which explains the pros and cons of each.
For the Numbers Table method to work, you need to do this one-time table setup, which will create a table Numbers containing the integers 1 to 10,000:
SELECT TOP 10000 IDENTITY(int,1,1) AS Number
INTO Numbers
FROM sys.objects s1
CROSS JOIN sys.objects s2
ALTER TABLE Numbers ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (Number)
Once the Numbers table is set up, create this split function:
CREATE FUNCTION [dbo].[FN_ListToTable]
(
@SplitOn char(1) --REQUIRED, the character to split the @List string on
,@List varchar(8000)--REQUIRED, the list to split apart
)
RETURNS TABLE
AS
RETURN
(
SELECT
ListValue
FROM (SELECT
LTRIM(RTRIM(SUBSTRING(List2, number+1, CHARINDEX(@SplitOn, List2, number+1)-number - 1))) AS ListValue
FROM (
SELECT @SplitOn + @List + @SplitOn AS List2
) AS dt
INNER JOIN Numbers n ON n.Number < LEN(dt.List2)
WHERE SUBSTRING(List2, number, 1) = @SplitOn
) dt2
WHERE ListValue IS NOT NULL AND ListValue!=''
);
GO
You can now easily split a CSV string into a table and join on it:
Create Procedure YourProcedure
@Filter VARCHAR(1000)
AS
SELECT
p.*
FROM PEOPLE p
LEFT OUTER JOIN (SELECT
o.ID
FROM OTHER_TABLE o
INNER JOIN (SELECT
ListValue
FROM dbo.FN_ListToTable(',', @Filter)
) f ON o.SOME_FIELD=f.ListValue
) x ON p.ID=x.ID
WHERE x.ID IS null
GO
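The Excel-side call stays as simple as before; if the user later also needs to exclude 'w', she only changes the list, not the SQL:
EXEC YourProcedure 'x,y,z,w'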
In my MS SQL database I have a table containing articles(id, name, content), a table containing keywords(id, name), and a link table between articles and keywords, ArticleKeywords(articleId, keywordID, count). Count is the number of occurrences of that keyword in the article.
How can I write a stored procedure that takes a comma-separated list of keywords and gives me the articles that have these keywords, ordered by the number of occurrences of the keywords in the article?
If an article contains more keywords I want to sum the occurrences of each keyword.
Thanks, Radu
Although it isn't completely clear to me what the source of your comma-separated string is, I think what you want is an SP that takes a string as input and produces the desired result:
CREATE PROC KeywordArticleSearch(@KeywordString NVARCHAR(MAX)) AS BEGIN...
The first step is to verticalize the comma-separated string into a table with the values in rows. This is a problem that has been extensively treated in this question and another question, so just look there and choose one of the options. Whichever way you choose, store the results in a table variable or temp table.
DECLARE @KeywordTable TABLE (Keyword NVARCHAR(128))
-- or alternatively...
CREATE TABLE #KeywordTable (Keyword NVARCHAR(128))
For lookup speed, it is even better to store the KeywordID instead so your query only has to find matching ID's:
DECLARE @KeywordIDTable TABLE (KeywordID INT)
INSERT INTO @KeywordIDTable
SELECT K.KeywordID FROM SplitFunctionResult S
-- INNER JOIN: keywords that are nonexistent are omitted
INNER JOIN Keywords K ON S.Keyword = K.Keyword
Next, you can go about writing your query. This would be something like:
SELECT AK.articleId, SUM(AK.[count]) AS TotalOccurrences
FROM ArticleKeywords AK
WHERE AK.keywordID IN (SELECT KeywordID FROM @KeywordIDTable)
GROUP BY AK.articleId
Or instead of the WHERE you could use an INNER JOIN. I don't think the query plan would be much different.
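For completeness, a sketch of that JOIN variant, adding the ORDER BY the question asks for (same names as above):
SELECT AK.articleId, SUM(AK.[count]) AS TotalOccurrences
FROM ArticleKeywords AK
INNER JOIN @KeywordIDTable KT ON AK.keywordID = KT.KeywordID
GROUP BY AK.articleId
ORDER BY TotalOccurrences DESC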
For the sake of argument, let's say you want to look up all articles containing the keywords Foo, Bar and Shazam.
ALTER PROCEDURE spArticlesFromKeywordList
@KeyWords varchar(1000) = 'Foo,Bar,Shazam'
AS
SET NOCOUNT ON
DECLARE @KeyWordInClause varchar(1000)
SET @KeyWordInClause = REPLACE(@KeyWords, ',', ''',''')
EXEC(
'
SELECT
t1.Name as ArticleName,
t2.Name as KeyWordName,
t3.Count as [COUNT]
FROM ArticleKeywords t3
INNER JOIN Articles t1 on t3.ArticleId = t1.Id
INNER JOIN Keywords t2 on t3.KeywordId = t2.Id
WHERE t2.Name in ( ''' + @KeyWordInClause + ''')
ORDER BY
3 descending, 1
'
)
SET NOCOUNT OFF
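Called like this (the default parameter above means the keyword list is optional; any comma-separated set of keywords works):
EXEC spArticlesFromKeywordList 'Foo,Bar,Shazam'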
I think I understand what you are after, so here goes. I'm not sure what language you are using, but in PHP (going from your description) I would query ArticleKeywords using an ORDER BY count DESC clause (i.e. the highest count comes first); obviously you can select by keywordID or articleId. In very simple terms (keeping it simple is my style), you can return the array but also build a string from it, a bit like this:
$arraytostring .= $row->keywordID.',';
If you left join the tables you could create something like this:
$arraytostring .= $row->keywordID.'-'.$row->name.' '.$row->content.',';
Or you could catch the array as
$array[] = $row->keywordID;
and create your string outside the loop.
Note: you have two fields called "name", one in articles and one in keywords. It would be easier to rename one of them to avoid any conflicts (assuming they do not hold the same content), e.g. articles.name = title and keywords.name = keyword.
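In SQL terms, the ordered lookup described above could look something like this sketch, using the column names from the question and aliasing the two name columns apart as suggested:
SELECT a.id, a.name AS title, k.name AS keyword, ak.[count]
FROM ArticleKeywords ak
INNER JOIN articles a ON a.id = ak.articleId
INNER JOIN keywords k ON k.id = ak.keywordID
ORDER BY ak.[count] DESC;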