Joining multiple columns (value not known until procedure start) - sql

I am using SQL Server 2008, and I have a task to make some reports that require me to sort some data and then JOIN it with a table. I am writing a procedure for this. It looks a bit like:
CREATE PROCEDURE getReport @ReportType int
AS
DECLARE @DataToJoin table
--DETAILS OMITTED
INSERT INTO @DataToJoin
--DETAILS OMITTED (sorting, fiddling with data)
SELECT table.col1, table.col2, joined.col3
FROM table
JOIN @DataToJoin joined ON table.x = joined.x
GO
Everything seemed fine until someone told me that @ReportType determines how many data sets to sort and then join. Since this data needs to be fiddled with, it can't be a simple JOIN from the start.
How should I approach the matter of multiple JOINs returning one table? Initially I thought about a WHILE loop around the final select, with the sorting and joining inside it, but it seems that approach won't work :( Then I thought about another table that would hold the joined columns, but I can't declare a table with a dynamic list of columns.
Any thoughts on that matter? Any help is appreciated! :)

Try building dynamic SQL: http://msdn.microsoft.com/en-us/library/ms188001(v=sql.100).aspx
This way you can build a single SQL statement with all the tables JOINed together and sorted any way you need.
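For example, here is a minimal sketch of that approach, assuming the procedure has already created and populated temp tables #DataToJoin1 through #DataToJoinN (real # temp tables rather than table variables, so they remain visible to the dynamic batch); the base table and column names are placeholders:
DECLARE @sql  nvarchar(max) = N'SELECT t.col1, t.col2';
DECLARE @from nvarchar(max) = N' FROM dbo.BaseTable t';
DECLARE @i int = 1;
DECLARE @n nvarchar(10);

WHILE @i <= @ReportType
BEGIN
    SET @n = CAST(@i AS nvarchar(10));
    -- one extra output column and one extra JOIN per requested data set
    SET @sql  += N', j' + @n + N'.col3 AS col3_' + @n;
    SET @from += N' JOIN #DataToJoin' + @n + N' j' + @n + N' ON t.x = j' + @n + N'.x';
    SET @i += 1;
END;

SET @sql += @from;
EXEC sp_executesql @sql;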

Related

Creating stored procedure with multiple temp tables

So I created a report that essentially runs a DML statement built off multiple volatile tables. The way I built it, each of my temp tables essentially derives from the prior one. For example, my first table defines the 'dataset', my other temp tables define my "exclusions", and my last couple of temp tables combine it all and execute it in a final query.
I want to automate this report to pull in data daily, but I'm not sure whether to create a macro or a stored procedure for it. The bigger problem with either approach is how I would even be able to effectively utilize each temp table. I've thought about combining ALL of my tables into a GIANT (1000+ line) DML statement, but SURELY, surely there are better, easier options out there.
Any help is deeply appreciated, thanks!
Alternatively, you could use a Common Table Expression (CTE) instead of temp tables:
WITH cte1 AS
(
    SELECT *
    FROM table_1
    WHERE ...
), cte2 AS
(
    SELECT ...
    FROM cte1
    JOIN ...
    WHERE ...
)
...
SELECT *
FROM cte_n;
Each CTE can depend on a previous one (or not); you can also use recursion, and then combine the results in the final query.
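For instance, here is a small self-contained recursive CTE (it just generates the numbers 1 through 10, but the same shape works for hierarchies):
WITH numbers AS
(
    SELECT 1 AS n
    UNION ALL
    SELECT n + 1
    FROM numbers
    WHERE n < 10
)
SELECT n
FROM numbers;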

Stored procedure using too many selects?

I recently started doing some performance tuning on a client's stored procedures, and I bumped into this chunk of code and couldn't find a way to make it work more efficiently.
declare @StationListCount int;
select @StationListCount = count(*) from @StationList;
declare @FleetsCnt int;
select @FleetsCnt = COUNT(*) from @FleetIds;
declare @StationCnt int;
select @StationCnt = COUNT(*) from @StationIds;
declare @VehiclesCnt int;
select @VehiclesCnt = COUNT(*) from @VehicleIds;
declare @TrIds table (VehicleId bigint, TrId bigint, InRange bit);
insert into @TrIds (VehicleId, TrId, InRange)
select t.VehicleID, t.FuelTransactionId, 1
from dbo.FuelTransaction t
join dbo.Fleet f on f.FleetID = t.FleetID and f.CompanyID = @ActorCompanyID
where t.TransactionTime >= @From and (@To is null or t.TransactionTime < @To)
and (@StationListCount = 0 or exists (select ID from @StationList where t.FuelStationID = ID))
and (@FleetsCnt = 0 or exists (select ID from @FleetIds where ID = t.FleetID))
and (@StationCnt = 0 or exists (select ID from @StationIds where ID = t.FuelStationID))
and (@VehiclesCnt = 0 or exists (select ID from @VehicleIds where ID = t.VehicleID))
and t.VehicleID is not null
The INSERT command slows the whole procedure down and takes 99% of the resources.
I am not sure, but I think the nested loops in the plan correspond to the queries inside the WHERE clause.
I would very much appreciate any help I can get on this.
Thank you!
There are a couple of things you should actually go over and compare for performance differences. First of all, as the previous answer suggests, you should avoid the count(*)-style aggregates as much as possible; if the tables are large, the cost of these functions increases dramatically. You could even think about storing those counts in a separate table with proper index constraints.
I also suggest splitting the select statement into multiple statements, because when you use so many NULL checks and OR/AND conditions in combination, your indexes may be bypassed and your query cost increases a lot. Sometimes, using UNIONs can provide far better performance than such conditions.
Try all of these and see what fits your needs.
Hope it helps.
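To illustrate the splitting idea on the query above, here is a sketch that branches on just one of the four optional filters (the others would branch the same way; it assumes the IDs in @StationList are unique, so the JOIN doesn't duplicate rows):
IF @StationListCount = 0
    insert into @TrIds (VehicleId, TrId, InRange)
    select t.VehicleID, t.FuelTransactionId, 1
    from dbo.FuelTransaction t
    join dbo.Fleet f on f.FleetID = t.FleetID and f.CompanyID = @ActorCompanyID
    where t.TransactionTime >= @From and (@To is null or t.TransactionTime < @To)
    and t.VehicleID is not null;
ELSE
    insert into @TrIds (VehicleId, TrId, InRange)
    select t.VehicleID, t.FuelTransactionId, 1
    from dbo.FuelTransaction t
    join dbo.Fleet f on f.FleetID = t.FleetID and f.CompanyID = @ActorCompanyID
    join @StationList s on s.ID = t.FuelStationID  -- filter via the join instead of OR/EXISTS
    where t.TransactionTime >= @From and (@To is null or t.TransactionTime < @To)
    and t.VehicleID is not null;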
The INSERT uses only one table for the vehicle ID, so joining the other tables isn't required.
I don't see the declaration of the @table variables, but (assuming the IDs in them are unique) consider communicating this information to the optimizer; in other words, add primary key constraints to them.
Also, add OPTION (RECOMPILE) to the end of the query.
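A sketch of both suggestions applied to the query above (the primary key assumes each VehicleId/TrId pair appears only once; the optional list filters are omitted for brevity):
declare @TrIds table
(
    VehicleId bigint,
    TrId bigint,
    InRange bit,
    primary key (VehicleId, TrId)  -- tells the optimizer the rows are unique
);

insert into @TrIds (VehicleId, TrId, InRange)
select t.VehicleID, t.FuelTransactionId, 1
from dbo.FuelTransaction t
join dbo.Fleet f on f.FleetID = t.FleetID and f.CompanyID = @ActorCompanyID
where t.TransactionTime >= @From and (@To is null or t.TransactionTime < @To)
and t.VehicleID is not null
option (recompile);  -- compiles the plan with the actual variable values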

Speed up queries when many SPs are based on the same tables

I have eight T-SQL stored procedures that are called every time I load a form in my VB.NET program. The queries look at which Windows user sent the query and return results depending on that.
The structure of the procedures is always the same except for the last statement:
Create Procedure dbo.Name @one decimal(18,2) Output as
...
Create Table #Temp1
Insert Into
...
Create Table #Temp2
Insert Into
...
Select ...
The last Select statement changes in all eight SPs; it uses the Temp1 and Temp2 information and returns results depending on the user who is logged in. Everything works fine, but it is very slow, since all eight queries are triggered by the load event. What is the best way to speed that up?
If you are always using all eight procs together, then combine them into one and only create the temp tables once.
You can also index temp tables if you need to improve their speed.
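For example (the column names here are made up), in the combined procedure you would create and populate each temp table once, add an index on the columns the final Selects filter or join on, and then run all eight final Selects against it:
Create Table #Temp1 (UserName nvarchar(128), Amount decimal(18,2))

Insert Into #Temp1 (UserName, Amount)
Select ... -- same population query as before

Create Nonclustered Index IX_Temp1_UserName On #Temp1 (UserName)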
Since it's a performance issue: for SQL Server 2000 and above, you can try using table variables (prefixed with @) instead of temporary tables.

What's the best way to select fields from multiple tables with a common prefix?

I have sensor data from a client, with acquisition ongoing. Every week we get a table of new data (about one million rows each), and each table has the same prefix. I'd like to run a query and select some columns across all of these tables.
What would be the best way to go about this?
I have seen some solutions that use dynamic SQL, and I was considering writing a stored procedure that would form a dynamic SQL statement and execute it for me. But I'm not sure this is the best way.
I see you are using PostgreSQL. This is an ideal case for partitioning with constraint exclusion based on dates. You create one master table without data, and the weekly tables inherit from it. In your case, you don't even have to worry about the nuisance of INSERT triggers; it sounds like there is never any insertion other than the bulk creation of each new weekly table. See the link above for full documentation.
Queries can be run against the parent table, and Postgres takes care of looking in all the child tables, plus it is smart enough to skip child tables ruled out by WHERE criteria.
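A minimal sketch of that inheritance-based setup (the table and column names are made up; adjust the CHECK ranges to your weekly deliveries):
CREATE TABLE sensor_data (
    sensor_id   int,
    recorded_at timestamp,
    reading     double precision
);

-- one child table per weekly delivery; the CHECK constraint is what lets
-- constraint exclusion skip children outside the queried date range
CREATE TABLE sensor_data_2013_w01 (
    CHECK (recorded_at >= DATE '2013-01-01' AND recorded_at < DATE '2013-01-08')
) INHERITS (sensor_data);

-- load each weekly file into its child table, then query the parent:
SELECT sensor_id, avg(reading)
FROM sensor_data
WHERE recorded_at >= DATE '2013-01-01' AND recorded_at < DATE '2013-01-08'
GROUP BY sensor_id;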
You could query the metadata for tables with the same prefix:
select table_name from information_schema.tables where table_name like 'week%'
Then you could use UNION ALL to combine the queries, like:
select * from week001
union all
select * from week002
[...]
However, I suggest appending new records to one single table and putting an index on the timestamp column. This would especially speed up queries that span multiple weeks, and it will simplify your queries a lot if you only have to deal with one table. If the table gets too large, you could partition it by date, so there should be no need to partition manually by having multiple tables.
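The single-table alternative is as simple as this sketch (column names hypothetical):
CREATE TABLE sensor_data (
    sensor_id   int,
    recorded_at timestamp,
    reading     double precision
);

CREATE INDEX idx_sensor_data_recorded_at ON sensor_data (recorded_at);

-- each weekly delivery then becomes an append instead of a new table, e.g.:
-- COPY sensor_data FROM '/path/to/weekly_file.csv' CSV;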
You are correct, sometimes you have to write dynamic SQL to handle cases such as this.
If all of your tables are loaded you can query for table names within your stored procedure. Something like this:
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'
Play with that to get the specific table names you need.
How are the table names differentiated? By date? Some incrementing ID?
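If the names do follow one prefix pattern, a hedged PostgreSQL option (the column names and the 'week' prefix are placeholders) is to generate the combined query text straight from the catalog and then execute the result:
SELECT string_agg(
           format('SELECT col1, col2 FROM %I', table_name),
           ' UNION ALL '
       ) AS query_text
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
  AND table_name LIKE 'week%';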

Why use a table-valued function instead of a temp table in SQL?

I am trying to speed up my monster of a stored procedure that works on millions of records across many tables.
I've stumbled on this:
Is it possible to use a Stored Procedure as a subquery in SQL Server 2008?
My question is: why would using a table-valued function be better than using a temp table?
Suppose my stored procedure SP1 contains:
declare @temp table (a int)

insert into @temp
select a from BigTable
where someRecords like 'blue%'

update t
set someRecords = 'were blue'
from AnotherBigTable t
inner join @temp tmp
on t.RecordID = tmp.a
After reading the above link, it seems the consensus is that instead of using my @temp table variable, I should create a table-valued function that performs that select
(and inline it if it's a simple select like the one in this example). But my actual selects are multiple and often not simple (i.e. with subqueries, etc.).
What is the benefit?
Thanks
Generally, you would use a temporary table (#) instead of a table variable. Table variables are really only useful for:
functions, which cannot create temporary objects
passing table-valued data (sets) as read-only parameters
gaming statistics for certain query edge cases
execution-plan stability (related to statistics, and also to the fact that INSERT INTO a table variable cannot use a parallel plan)
prior to SQL Server 2012, #temp tables inherit their collation from tempdb, whereas @table variables use the current database's collation
Other than those cases, a #temporary table will work as well as, if not better than, a table variable.
Further reading: What's the difference between a temp table and table variable in SQL Server?
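Applied to the snippet above, a rewrite with a #temp table might look like this (the primary key assumes the selected a values are unique; drop it or select distinct otherwise):
create table #temp (a int primary key)

insert into #temp (a)
select a from BigTable
where someRecords like 'blue%'

update t
set someRecords = 'were blue'
from AnotherBigTable t
inner join #temp on t.RecordID = #temp.a

drop table #temp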
Probably no longer relevant... but two things I might suggest that take two different approaches.
Simple approach 1:
Try a primary key on your table variable:
declare @temp table(a int, primary key(a))
Simple approach 2:
In this particular case try a common table expression (CTE)...
;with
temp as (
    SELECT a as Id
    FROM BigTable
    WHERE someRecords like 'blue%'
)
UPDATE AnotherBigTable
SET someRecords = 'were blue'
FROM AnotherBigTable
JOIN temp
ON temp.Id = AnotherBigTable.RecordId
CTEs are really great and help to isolate the specific data sets you actually want to work on from the myriad of records contained in larger tables. And if you find yourself using the same CTE declaration repeatedly, consider formalizing that expression into a view. Views are an often-overlooked and very valuable tool for DBAs and DB programmers managing large, complex data sets with lots of records and relationships.
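For example, the CTE body above could be formalized like this (the view name is made up):
create view dbo.BlueRecords
as
select a as Id
from BigTable
where someRecords like 'blue%'
GO

-- later queries join the view instead of re-declaring the CTE
update t
set someRecords = 'were blue'
from AnotherBigTable t
join dbo.BlueRecords b on b.Id = t.RecordID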