SQL queries using SELECT statement - sql

I have this table from a database. The table WORKER has two fields: ID_W (number) and DUTY (text).
WORKER(ID_W,DUTY)
I want to create an SQL query that selects only the ID_W values (worker IDs) of workers that have the same DUTY, where DUTY is a text-type field.
Can someone help me? I want to use an aggregate function, but none of them helps.

I used a temp table as an example, but I believe this does what you want:
SELECT Duty,
STUFF((
SELECT ', ' + CAST(ID_W AS VARCHAR(MAX))
FROM #WORKER
WHERE Duty = T.DUTY
FOR XML PATH(''),TYPE).value('(./text())[1]','VARCHAR(MAX)')
,1,2,'') AS IDs FROM #WORKER T
GROUP BY Duty

If I am understanding your question correctly, you are looking for only the IDs of the workers whose DUTY field matches that of another worker.
You can do this with a WHERE EXISTS clause, but you need to CONVERT the TEXT column into a VARCHAR (MAX) in order to compare them. This conversion on both sides will make the query expensive, but this is another way to do it:
Select ID_W
From WORKER A
Where Exists
(
Select *
From WORKER B
Where A.ID_W <> B.ID_W
And Convert(Varchar (Max), A.DUTY) = Convert(Varchar (Max), B.DUTY)
)
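As a rough sketch of another way to express the same idea, the workers sharing a duty can be found by grouping on DUTY and keeping only the groups with more than one row. The example uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it runs on its own. The sample rows are invented, and SQLite needs no conversion of the text column, unlike the TEXT type discussed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE WORKER (ID_W INTEGER, DUTY TEXT)")
# Invented sample rows: workers 1 and 3 share the duty "welder".
conn.executemany(
    "INSERT INTO WORKER VALUES (?, ?)",
    [(1, "welder"), (2, "driver"), (3, "welder"), (4, "cook")],
)

# Keep only workers whose DUTY appears on more than one row.
shared = [
    row[0]
    for row in conn.execute(
        """
        SELECT ID_W
        FROM WORKER
        WHERE DUTY IN (
            SELECT DUTY FROM WORKER GROUP BY DUTY HAVING COUNT(*) > 1
        )
        ORDER BY ID_W
        """
    )
]
print(shared)
```

With the sample rows, only workers 1 and 3 are returned.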

Related

Comparing multiple tables with one another

I do not necessarily spend my time creating SQL queries; I maintain databases and search for mistakes in them. I constantly have to compare two types of tables, and if the database is small, I don't mind just writing a small query. But at times, some databases are huge and the number of tables is overwhelming.
I have one table type that holds compressed data and another that holds aggregates computed from the compressed data. At times, the aggregate tables are missing some IDs because some data was not calculated. If it is just one aggregate table, I compare it to its corresponding compressed table and I can immediately see what needs to be recalculated (the code for that is shown below).
select distinct taguid from TLG.TagValueCompressed_0_100000
where exists
(select * from tlg.AggregateValue_0_100000 where
AggregateValue_0_100000.TagUID = TagValueCompressed_0_100000.TagUID)
I would like to have a table that compares all tables with one another and outputs a table with all non-existing tags. My SQL knowledge is in its infancy, and my job does not require me to be an SQL freak. But a query that solves this problem would help me a lot. Does anyone have any suggestions for a solution?
Relevant columns: TagUIDs, that's it.
Optimal Table:
Existing Tags missing Tags
1
2
3
4
.
.
Numbers meaning what table : "_0_100000", "_0_100001" ...
So let's assume this query produced the output you want, i.e. the list of all possible tags in a given table set (0_100000, etc.) and columns denoting whether the given tag exists in AggregateValue and TagValueCompressed:
SELECT '0_100000' AS TableSet
, ISNULL(AggValue.TagUID, TagValue.TagUID) AS TagUID
, IIF(TagValue.TagUID IS NOT NULL, 1, 0) AS ExistsInTag
, IIF(AggValue.TagUID IS NOT NULL, 1, 0) AS ExistsInAgg
FROM (SELECT DISTINCT TagUID FROM tlg.AggregateValue_0_100000) AggValue
FULL OUTER
JOIN (SELECT DISTINCT TagUID FROM TLG.TagValueCompressed_0_100000) TagValue
ON AggValue.TagUID = TagValue.TagUID
So in order to execute it for multiple tables, we can make this query a template:
DECLARE @QueryTemplate NVARCHAR(MAX) = '
SELECT ''$SUFFIX$'' AS TableSet
, ISNULL(AggValue.TagUID, TagValue.TagUID) AS TagUID
, IIF(TagValue.TagUID IS NOT NULL, 1, 0) AS ExistsInTag
, IIF(AggValue.TagUID IS NOT NULL, 1, 0) AS ExistsInAgg
FROM (SELECT DISTINCT TagUID FROM tlg.AggregateValue_$SUFFIX$) AggValue
FULL OUTER
JOIN (SELECT DISTINCT TagUID FROM TLG.TagValueCompressed_$SUFFIX$) TagValue
ON AggValue.TagUID = TagValue.TagUID';
Here $SUFFIX$ denotes the 0_100000 etc. We can now execute this query dynamically for all tables matching a certain pattern, say if you have 500 of these tables.
DECLARE @Query NVARCHAR(MAX) = ''
-- Produce a query for a given suffix out of the template
-- suffix is the last 8 characters of the table's name
-- combine all the queries, for all tables, using UNION ALL
SELECT @Query += CONCAT(REPLACE(@QueryTemplate, '$SUFFIX$', RIGHT(name, 8)), ' UNION ALL ')
FROM sys.tables
WHERE name LIKE 'TagValueCompressed%';
-- Get rid of the trailing UNION ALL
SET @Query = LEFT(@Query, LEN(@Query) - LEN('UNION ALL '));
EXECUTE sp_executesql @Query
This will yield combined results for all matching tables.
Here is a working example on dbfiddle.
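The template-and-UNION-ALL approach above can also be sketched outside SQL Server. The following is a minimal illustration using Python's built-in sqlite3 module. The table contents and the NOT EXISTS form of the per-table query are assumptions made for the example (it lists only the tags missing from each aggregate table), and the schema prefix is dropped because SQLite has no tlg schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sample data: suffix -> (tags in compressed table, tags in aggregate table).
sample = {
    "0_100000": ([1, 2, 3], [1, 2]),
    "0_100001": ([7, 8], [8]),
}
for suffix, (compressed, aggregated) in sample.items():
    conn.execute(f"CREATE TABLE TagValueCompressed_{suffix} (TagUID INTEGER)")
    conn.execute(f"CREATE TABLE AggregateValue_{suffix} (TagUID INTEGER)")
    conn.executemany(
        f"INSERT INTO TagValueCompressed_{suffix} VALUES (?)", [(t,) for t in compressed]
    )
    conn.executemany(
        f"INSERT INTO AggregateValue_{suffix} VALUES (?)", [(t,) for t in aggregated]
    )

# One query per table set; $SUFFIX$ is replaced with the table suffix.
TEMPLATE = """
SELECT DISTINCT '$SUFFIX$' AS TableSet, c.TagUID
FROM TagValueCompressed_$SUFFIX$ AS c
WHERE NOT EXISTS (
    SELECT 1 FROM AggregateValue_$SUFFIX$ AS a WHERE a.TagUID = c.TagUID
)
"""

# Discover the table sets from the catalog instead of listing them by hand.
prefix = "TagValueCompressed_"
suffixes = [
    name[len(prefix):]
    for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name LIKE ?",
        (prefix + "%",),
    )
]

# Joining with UNION ALL avoids having to trim a trailing separator.
query = " UNION ALL ".join(TEMPLATE.replace("$SUFFIX$", s) for s in suffixes)
query += " ORDER BY TableSet, TagUID"

missing = conn.execute(query).fetchall()
print(missing)
```

Each result row names the table set and one tag that exists in the compressed table but not in the matching aggregate table.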

How to use Join with like operator and then casting columns

I have 2 tables with these columns:
CREATE TABLE #temp
(
Phone_number varchar(100) -- example data: "2022033456"
)
CREATE TABLE orders
(
Addons ntext -- example data: "Enter phone:2022033456<br>Thephoneisvalid"
)
I have to join these two tables using LIKE because the phone numbers are not in the same format. A little background: I am joining the #temp table on the phone number with the orders table on its Addons value. Then, in the WHERE condition, I try to match them again to get some results. Here is my code. My results are not accurate: the query returns no data. I don't know what I am doing wrong. I am using SQL Server.
select
*
from
order_no as n
join
orders as o on n.order_no = o.order_no
join
#temp as t on t.phone_number like '%'+ cast(o.Addons as varchar(max))+'%'
where
t.phone_number = '%' + cast(o.Addons as varchar(max)) + '%'
You cannot use the LIKE statement in the JOIN condition. Please provide more information on your tables. You have to convert the format of one of the phone fields to match the format of the other phone field in order to join.
I think your join condition is in the wrong order. Because your question explicitly mentions two tables, let's stick with those:
select *
from orders o JOIN
#temp t
on cast(o.Addons as varchar(max)) like '%' + t.phone_number + '%';
It has been so long since I dealt with the text data type (in SQL Server), that I don't remember if the cast() is necessary or not.
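To illustrate why the operand order matters, here is a small sketch using Python's built-in sqlite3 module rather than SQL Server. The table name temp_phones stands in for #temp, the rows are invented, and SQLite concatenates with || where T-SQL uses +. The longer text has to be on the left of LIKE and the pattern built from the shorter phone number on the right.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temp_phones (phone_number TEXT)")  # stands in for #temp
conn.execute("CREATE TABLE orders (order_no INTEGER, Addons TEXT)")
conn.executemany(
    "INSERT INTO temp_phones VALUES (?)", [("2022033456",), ("5550001111",)]
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        (1, "Enter phone:2022033456<br>Thephoneisvalid"),
        (2, "Enter phone:9998887777<br>Thephoneisvalid"),
    ],
)

# Correct order: does the long Addons text contain the phone number?
matches = conn.execute(
    """
    SELECT o.order_no, t.phone_number
    FROM orders AS o
    JOIN temp_phones AS t
      ON o.Addons LIKE '%' || t.phone_number || '%'
    """
).fetchall()

# Reversed order, as in the question: does the phone number contain the Addons text?
reversed_matches = conn.execute(
    """
    SELECT o.order_no, t.phone_number
    FROM orders AS o
    JOIN temp_phones AS t
      ON t.phone_number LIKE '%' || o.Addons || '%'
    """
).fetchall()

print(matches)
print(reversed_matches)
```

The first query finds the one matching order, while the reversed comparison finds nothing, which mirrors the empty result described in the question.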
Instead of trying to do everything in a single top-level query, you should apply a transformation projection to your orders table and use that as a subquery, which will make the query easier to understand.
Using the CHARINDEX function will make this a lot easier. However, it does not support ntext, so you will need to change your schema to use nvarchar(max) instead, which you should be doing anyway as ntext is deprecated. Fortunately, you can use CONVERT( nvarchar(max), someNTextValue ), though this will reduce performance as you won't be able to use any indexes on your ntext values. This query will run slowly anyway.
SELECT
orders2.*,
CASE WHEN orders2.PhoneStart > 0 AND orders2.PhoneEnd > 0 THEN
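-- skip past the 'Enter phone:' marker so that only the number itself is returned
SUBSTRING( orders2.Addons, orders2.PhoneStart + LEN('Enter phone:'), orders2.PhoneEnd - orders2.PhoneStart - LEN('Enter phone:') )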
ELSE
NULL
END AS ExtractedPhoneNumber
FROM
(
SELECT
orders.*, -- never use `*` in production, so replace this with the actual columns in your orders table
CHARINDEX('Enter phone:', CONVERT(nvarchar(max), Addons)) AS PhoneStart,
CHARINDEX('<br>Thephoneisvalid', CONVERT(nvarchar(max), Addons), CHARINDEX('Enter phone:', CONVERT(nvarchar(max), Addons)) ) AS PhoneEnd
FROM
orders
) AS orders2
I suggest converting the above into a VIEW or CTE so you can directly query it in your JOIN expression:
CREATE VIEW ordersWithPhoneNumbers AS
-- copy and paste the above query here, then execute the batch to create the view, you only need to do this once.
Then you can use it like so:
SELECT
* -- again, avoid the use of the star selector in production use
FROM
ordersWithPhoneNumbers AS o2 -- this is the above query as a VIEW
INNER JOIN order_no ON o2.order_no = order_no.order_no
INNER JOIN #temp AS t ON o2.ExtractedPhoneNumber = t.phone_number
Actually, I take back my previous remark about performance - if you add an index to the ExtractedPhoneNumber column of the ordersWithPhoneNumbers view then you'll get good performance.
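The extraction step in this answer can be sketched in a self-contained way with Python's built-in sqlite3 module. This is only an illustration under assumptions: SQLite's instr and substr stand in for CHARINDEX and SUBSTRING, the second order row is invented, and the start offset skips the length of the 'Enter phone:' marker so that only the digits remain.

```python
import sqlite3

MARKER = "Enter phone:"
TERMINATOR = "<br>Thephoneisvalid"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_no INTEGER, Addons TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        (1, "Enter phone:2022033456<br>Thephoneisvalid"),
        (2, "no phone here"),  # invented row without a phone number
    ],
)

# The inner query locates the marker and terminator; the outer query cuts out
# the text between them, or yields NULL when either one is missing.
extracted = conn.execute(
    """
    SELECT order_no,
           CASE WHEN start_pos > 0 AND end_pos > start_pos
                THEN substr(Addons,
                            start_pos + length(:marker),
                            end_pos - start_pos - length(:marker))
           END AS ExtractedPhoneNumber
    FROM (
        SELECT order_no,
               Addons,
               instr(Addons, :marker) AS start_pos,
               instr(Addons, :terminator) AS end_pos
        FROM orders
    )
    ORDER BY order_no
    """,
    {"marker": MARKER, "terminator": TERMINATOR},
).fetchall()
print(extracted)
```

Once the number is isolated like this, the join against the phone table becomes a plain equality comparison instead of a LIKE.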

Combine Unique Column Values Into One to Avoid Duplicates

For simplicity, assume I have two tables joined by account#. The second table has two columns, id and comment. Each account could have one or more comments and each unique comment has a unique id.
I need to write a T-SQL query to generate one row for each account, which I assume means I need to combine as many comments as might exist for each account. This assumes the result set will only show the account# once. Simple?
SQL Server is an RDBMS best tuned for storing and retrieving data. You can retrieve the desired data with one very simple query, but the desired format should be handled by one of the available reporting tools, such as SSRS or Crystal Reports.
Your query will be a simple inner join, something like this:
SELECT A.Account , B.Comment
FROM TableA AS A INNER JOIN TableB AS B
ON A.Account = B.Account
Now you can use your reporting tool to Group all the Comments by Account when Displaying data.
I do agree with M. Ali, but if you don't have that option, the following will work.
SELECT [accountID]
, [name]
, (SELECT CAST(Comment + ', ' AS VARCHAR(MAX))
FROM [comments]
WHERE (accountID = accounts.accountID)
FOR XML PATH ('')
) AS Comments
FROM accounts
SQL Fiddle
In my current project I have this exact situation.
What you need is a solution to aggregate the comments in order to show only one line per account#.
I solve it by creating a function to concatenate the comments, like this:
create function dbo.aggregateComments( @accountId integer, @separator varchar( 5 ) )
returns varchar( max )
as
begin
declare @comments varchar( max ); set @comments = '';
select @comments = @comments + @separator + YourCommentsTableName.CommentColumn
from dbo.YourCommentsTableName
where YourCommentsTableName.AccountId = @accountId;
return @comments;
end;
You can use it in your query this way:
select account#, dbo.aggregateComments( account#, ',' )
from dbo.YourAccountTableName
Creating a function will give you a common place to retrieve your comments. It's a good programming practice.
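For comparison, here is a minimal sketch of the "one row per account" result using a built-in string aggregate. It runs on SQLite through Python's sqlite3 module, where the aggregate is called group_concat. SQL Server 2017 and later offer STRING_AGG for the same purpose. The table layout follows the accounts/comments answer above and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (accountID INTEGER, name TEXT)")
conn.execute("CREATE TABLE comments (id INTEGER, accountID INTEGER, comment TEXT)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?)",
    [(10, 1, "called"), (11, 1, "emailed"), (12, 2, "visited")],
)

# One row per account; all of its comments are collapsed into a single column.
rows = conn.execute(
    """
    SELECT a.accountID, a.name, group_concat(c.comment, ', ') AS comments
    FROM accounts AS a
    LEFT JOIN comments AS c ON c.accountID = a.accountID
    GROUP BY a.accountID, a.name
    ORDER BY a.accountID
    """
).fetchall()

for row in rows:
    print(row)
```

The order of the comments inside each concatenated value is not guaranteed here, so code that depends on a particular order would need to impose one explicitly.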

Vendor database contains table with 100 user defined string columns need to find all occurrences in that table that are not null

I have a vendor database for a web application with 100 "user defined string" columns. They have a datatype of varchar with a length of 255.
I need to return all the rows that aren't null so that I can find out what is being stored in each one. There's no controlling for what the input has been over the years so userdefinedstring1 can contain text, dates, numbers, empty strings or NULL across multiple rows.
My initial solution was just
SELECT
*
FROM userdefinedstring table
WHERE userdefinedstring1 IS NOT NULL
OR userdefinedstring2 IS NOT NULL
repeated 98 more times.
There is likely a better way to do this but I haven't yet determined it so any tips you have are appreciated.
The only improvement to that that I can think of would be to use COALESCE in the WHERE clause instead of OR:
SELECT *
FROM userdefinedstringTable
WHERE COALESCE( userdefinedstring1
, userdefinedstring2
...
) IS NOT NULL
Depending on your DBMS product, there may be vendor-specific improved ways to do this, but generically, this is probably the best.
RBarry's COALESCE is a good idea, and you can use this to list out all columns of interest:
SELECT c.name ColumnName
FROM sys.columns AS c
JOIN sys.types AS t ON c.user_type_id=t.user_type_id
WHERE t.name = 'varchar'
AND c.max_length = 255 -- the declared length is stored on the column, not on the type
ORDER BY c.OBJECT_ID;
Good chance to use EXCEL to craft a query quickly:
=A1&"," copy down to craft your badass COALESCE statement.
Here is query to generate that coalesce on all columns:
SELECT 'SELECT *
FROM userdefinedstringTable
WHERE COALESCE (' + STUFF(
(SELECT ', [' + name + ']'
FROM sys.columns WHERE Object_ID = object_id('userdefinedstringTable')
FOR XML PATH (''))
, 1,1,'') + ') IS NOT NULL'
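The same idea of generating the COALESCE list from column metadata can be sketched with Python's built-in sqlite3 module, where PRAGMA table_info plays the role of sys.columns. The three-column table and its rows are invented stand-ins for the 100-column vendor table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE userdefinedstringTable ("
    "id INTEGER, userdefinedstring1 TEXT, userdefinedstring2 TEXT, userdefinedstring3 TEXT)"
)
conn.executemany(
    "INSERT INTO userdefinedstringTable VALUES (?, ?, ?, ?)",
    [(1, None, None, None), (2, None, "2014-01-01", None), (3, "", None, None)],
)

# Read the column names from the catalog; field 1 of each row is the name.
columns = [
    row[1]
    for row in conn.execute("PRAGMA table_info(userdefinedstringTable)")
    if row[1].startswith("userdefinedstring")
]

# Build the COALESCE argument list from the discovered columns.
quoted = ", ".join(f'"{name}"' for name in columns)
query = (
    "SELECT id FROM userdefinedstringTable "
    f"WHERE COALESCE({quoted}) IS NOT NULL ORDER BY id"
)

ids = [row[0] for row in conn.execute(query)]
print(ids)
```

Row 3 is returned even though it only holds an empty string, because an empty string is not NULL. If empty strings should also be treated as unused, the filter would need an extra condition.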

Is there any way of improving the performance of this SQL Function?

I have a table which looks something like
Event ID Date Instructor
1 1/1/2000 Person 1
1 1/1/2000 Person 2
Now what I want to do is return this data so that each event is on one row and the Instructors are all in one column split with a <br> tag like 'Person 1 <br> Person 2'
Currently the way I have done this is to use a function
CREATE FUNCTION fnReturnInstructorNamesAsHTML
(
@EventID INT
)
RETURNS VARCHAR(max)
BEGIN
DECLARE @Result VARCHAR(MAX)
SELECT
@Result = coalesce(@Result + '<br>', '') + inst.InstructorName
FROM
[OpsInstructorEventsView] inst
WHERE
inst.EventID = @EventID
RETURN @Result
END
Then my main stored procedure calls it like
SELECT
ev.[BGcolour],
ev.[Event] AS name,
ev.[eventid] AS ID,
ev.[eventstart],
ev.[CourseType],
ev.[Type],
ev.[OtherType],
ev.[OtherTypeDesc],
ev.[eventend],
ev.[CourseNo],
ev.[Confirmed],
ev.[Cancelled],
ev.[DeviceID] AS resource_id,
ev.Crew,
ev.CompanyName ,
ev.Notes,
dbo.fnReturnInstructorNamesAsHTML(ev.EventID) as Names
FROM
[OpsSimEventsView] ev
JOIN
[OpsInstructorEventsView] inst
ON
ev.EventID = inst.EventID
This is very slow; I'm looking at 4 seconds per call to the DB. Is there a way for me to improve the performance of the function? It's a fairly small function, so I'm not sure what I can do here, and I couldn't see a way to work the COALESCE into the SELECT of the main procedure.
Any help would be really appreciated, thanks.
You could try something like this.
SELECT
ev.[BGcolour],
ev.[Event] AS name,
ev.[eventid] AS ID,
ev.[eventstart],
ev.[CourseType],
ev.[Type],
ev.[OtherType],
ev.[OtherTypeDesc],
ev.[eventend],
ev.[CourseNo],
ev.[Confirmed],
ev.[Cancelled],
ev.[DeviceID] AS resource_id,
ev.Crew,
ev.CompanyName ,
ev.Notes,
STUFF((SELECT '<br>'+inst.InstructorName
FROM [OpsInstructorEventsView] inst
WHERE ev.EventID = inst.EventID
FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 4, '') as Names
FROM
[OpsSimEventsView] ev
Not sure why you have joined OpsInstructorEventsView in the main query. I removed it here but if you needed you can just add it again.
A few things to look at:
1) The overhead of functions makes them expensive to call, especially in the select statement of a query that could potentially be returning thousands of rows. It will have to execute that function for every one of them. Consider merging the behavior of the function into your main stored procedure, where the SQL Server can make better use of its optimizer.
2) Since you are joining on event id in both tables, make sure you have an index on those two columns. I would expect that you do, given that those both appear to be primary key columns, but make sure. An index can make a huge difference.
3) Convert your coalesce call into its equivalent case statements to remove the overhead of calling that function.
Yes make it an INLINE Table-Valued SQL function:
CREATE FUNCTION fnReturnInstructorNamesAsHTML
( @EventID INT )
RETURNS Table
As
Return
SELECT InstructorName + '<br>' result
FROM OpsInstructorEventsView
WHERE EventID = @EventID
Go
Then, in your SQL Statement, use it like this
SELECT [Other stuff],
(Select result from dbo.fnReturnInstructorNamesAsHTML(ev.EventID)) as Names
FROM OpsSimEventsView ev
JOIN OpsInstructorEventsView inst
ON ev.EventID = inst.EventID
I'm not exactly clear how the query you show in your question concatenates data from multiple rows into one row of the result, but the problem is that ordinary UDFs are compiled on use, on EVERY use, so for each row in your output the query processor has to recompile the UDF. This is NOT true for an "inline table-valued" UDF, as its SQL is folded into the outer SQL before it is passed to the SQL optimizer (the subsystem that generates the cached statement plan), so the UDF is only compiled once.