Wildcard in SQL table name dynamic creation - sql

I am using a query string to dynamically loop through table names. Now I need to add a wildcard to the table name so that it picks up the new tables I get. Example below:
WHILE @Year_Id <= 2018
BEGIN
    SET @YearVar = CONVERT(varchar(4), @Year_Id)
    SET @TABLENAME = '[SWDI].[dbo].[hail-' + @YearVar + ']'
    SET @SQLQUERY = 'SELECT CELL_ID, LAT, LON, SEVPROB, PROB, MAXSIZE, _ZTIME' +
                    ' FROM ' + @TABLENAME +
So my earlier tables were hail-2001, hail-2002, hail-2003, and so on through 2017. Now I get tables named hail-201801, hail-201802, etc.
I want to incorporate the extra 01, 02 as a wildcard when referencing the table.
Thanks a lot for the help. I am new to this.

Uh, no you don't. You clearly don't have a complete understanding of how tables work in a database or in SQL Server.
You gain nothing by having multiple tables with exactly the same columns and types, whose names are differentiated only by numbers or dates. That is not how SQL works. You lose a lot: foreign key references, query simplicity, maintainability, and more.
Instead, include the date column in the data and store everything in one table.
If you are concerned about performance, then you can create an index on the date column to get the data that you need. Another method (if the data is large) is to store the data in separate data partitions. These are an important part of SQL Server functionality.
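A minimal sketch of that single-table design (only the column names come from the question; the column types and index name are guesses):
-- Hypothetical single hail table with the date in a column, not the name
CREATE TABLE [SWDI].[dbo].[hail] (
    CELL_ID int,
    LAT     float,
    LON     float,
    SEVPROB int,
    PROB    int,
    MAXSIZE float,
    _ZTIME  datetime   -- the date that used to live in the table name
);
-- Index the date column so year- or month-based filters stay fast
CREATE INDEX IX_hail_ZTIME ON [SWDI].[dbo].[hail] (_ZTIME);
-- One plain query replaces the whole dynamic-SQL loop
SELECT CELL_ID, LAT, LON, SEVPROB, PROB, MAXSIZE, _ZTIME
FROM [SWDI].[dbo].[hail]
WHERE _ZTIME >= '2018-01-01' AND _ZTIME < '2019-01-01';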

As a general solution, you could do something like this:
SET @TableName = '[SWDI].[dbo].[hail-' + @YearVar + ']';
-- Check if the year table exists
IF (OBJECT_ID(@TableName, 'U') IS NULL) BEGIN
    -- Implement your 'wildcard' logic here
    SET @NumVar = '01';
    SET @TableName = '[SWDI].[dbo].[hail-' + @YearVar + @NumVar + ']';
END
Another solution would be to have the missing numbered tables as views on top of the existing tables, but this might have negative performance effects.
A third option is to have yearly views on top of the new numbered tables; with clever constraints on the tables and in the view definition, this can have insignificant overhead.
Last, but not least, you should consider building a partitioned view on top of these tables and maintaining that view. You can query the view directly without messing with table names all the time.
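A minimal sketch of such a view (for SQL Server to treat it as a true partitioned view, each monthly table would also need a CHECK constraint on the partitioning column, which is assumed here):
-- Hypothetical view stitching the monthly tables together
CREATE VIEW [dbo].[hail_2018]
AS
SELECT CELL_ID, LAT, LON, SEVPROB, PROB, MAXSIZE, _ZTIME FROM [dbo].[hail-201801]
UNION ALL
SELECT CELL_ID, LAT, LON, SEVPROB, PROB, MAXSIZE, _ZTIME FROM [dbo].[hail-201802];
-- ...add one SELECT branch per monthly table as it arrives
Queries then target [dbo].[hail_2018] and never touch the monthly table names.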
Please read Gordon's answer!
In any case, I'd highly suggest being careful with dynamic queries. You might want to take a look at functions like PARSENAME and QUOTENAME.
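For example, a sketch of how QUOTENAME can protect a dynamically built table name (reusing the variables from the question):
-- QUOTENAME wraps the name in brackets and escapes any ] inside it,
-- so an unexpected value cannot break out of the identifier
SET @TableName = QUOTENAME('hail-' + @YearVar + @NumVar);
SET @SQLQUERY = 'SELECT CELL_ID, LAT, LON, SEVPROB, PROB, MAXSIZE, _ZTIME'
              + ' FROM [SWDI].[dbo].' + @TableName;
EXEC (@SQLQUERY);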

Related

SQL Query on table with 30mill records

I have been having problems building a table in my local SQL Server. Originally the query was causing the tempdb database to fill up and throw an exception. The query has a lot of joins and outer applies, so to find specifically where the problem lay, I did a select on the first table in the query to determine how long it took. That was fast, so I added the next table that was the first join in the query and reran, and I continued doing this until I found the table that stalled.
I found that the problem (or at least the first problem) was with the shipper_container table. This table is huge, and selecting from it alone raises a System.OutOfMemoryException just displaying the results (it has only 5 columns). The output cuts out at 16 million records, but the table has 30 million rows and is 1.2 GB in size. That doesn't seem so big that SQL Server Management Studio couldn't handle it.
Using a WHERE clause to collect values between 1 January and 10 January 2015 still resulted in a query that had been executing for over 5 minutes when I cancelled it. I have also added indexes on each of the selected columns, and this did not improve performance either.
Here is the SQL Query. You can see I have commented out the other parameters that have yet to be added in other joins and outer applies.
DECLARE @startDate DATETIME
DECLARE @endDate DATETIME
DECLARE @Shipper_Key INT = NULL
DECLARE @Part_Key INT = NULL
SET @startDate = '2015-01-01'
SET @endDate = '2015-01-10'
SET NOCOUNT ON;
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
INSERT Shipped_Container
(
Ship_Date,
Invoice_Quantity,
Shipper_No,
Serial_No,
Truck_Key,
Shipper_Key
)
SELECT
S.Ship_Date,
SC.Quantity,
S.Shipper_No,
SC.Serial_No,
S.Truck_Key,
S.Shipper_Key
FROM Shipper AS S
JOIN Shipper_Line AS SL
--ON SL.PCN = S.PCN
ON SL.Shipper_Key = S.Shipper_Key
JOIN Shipper_Container AS SC
--ON SC.PCN = SL.PCN
ON SC.Shipper_Line_Key = SL.Shipper_Line_Key
WHERE S.Ship_Date >= @startDate AND S.Ship_Date <= @endDate
AND S.Shipper_Key = ISNULL(@Shipper_Key, S.Shipper_Key)
AND SL.Part_Key = ISNULL(@Part_Key, SL.Part_Key)
The server instance runs on the local network - could this be an issue? I have minimal experience with this and would really appreciate help that is as detailed and clear as possible. Often in SQL forums people jump right into technical details I don't follow very well.
Don't do a Select ... From yourtable in SQL Server Management Studio when it returns hundreds of thousands or millions of rows. 1 GB of data gets a lot bigger when the system has to draw and show it on screen in the Management Studio results grid.
The server instance is run on the local network
When you do a Select ... From yourtable in SSMS, the server must send all the data to your laptop/desktop. This is quite a lot of unneeded pressure on the network.
It should not be an issue when you insert because everything stays on the server. However, staying on the server does not mean it will be fast if your data model is not good enough.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
You may get dirty reads if you use that... It may be better to remove it unless you know why it is there and why you need it.
I have also added indexes on each of the select parameters and this did not increase performance either
If you mean indexes on:
S.Ship_Date,
SC.Quantity,
S.Shipper_No,
SC.Serial_No,
S.Truck_Key,
S.Shipper_Key
What are their definitions?
If they are individual indexes on 1 column, you can drop the indexes on SC.Quantity, S.Shipper_No, SC.Serial_No and S.Truck_Key. They are not used.
Ship_Date and Shipper_Key may be useful. It all depends on your model and existing primary keys (which you need to describe, see below).
It will help to give a more accurate answer if you could tell us:
the relation between your 3 tables (which field links A to B and in which direction)
the primary key on your 3 tables
a complete list of all your indexes (and columns) on your 3 tables
If none of your indexes is useful, or if they are missing, SQL Server will most likely read all 3 tables in full and try to match them. Because the data is pretty big, it does not have enough memory to process it and uses tempdb to store intermediate data.
For now I will suppose that shipper_key + PCN is the primary key on each table.
I think you can try that:
You can create an index on S.Ship_Date:
CREATE INDEX Shipper_Ship_Date ON Shipper (Ship_Date) -- subject to updates according to your primary key
The query optimizer may not use the indexes (if they exist) with such a WHERE clause:
AND S.Shipper_Key = ISNULL(@Shipper_Key, S.Shipper_Key)
AND SL.Part_Key = ISNULL(@Part_Key, SL.Part_Key)
You can use:
AND (S.Shipper_Key = @Shipper_Key OR @Shipper_Key IS NULL)
AND (SL.Part_Key = @Part_Key OR @Part_Key IS NULL)
It would help to have indexes on Shipper_Key and PCN
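A sketch of those indexes, assuming the table and column names used in the query (adjust to your actual schema):
-- Hypothetical supporting indexes for the join and filter columns
CREATE INDEX IX_Shipper_Shipper_Key ON Shipper (Shipper_Key);
CREATE INDEX IX_Shipper_Line_Shipper_Key ON Shipper_Line (Shipper_Key);
CREATE INDEX IX_Shipper_Line_Part_Key ON Shipper_Line (Part_Key);
CREATE INDEX IX_Shipper_Container_Line_Key ON Shipper_Container (Shipper_Line_Key);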
Finally
As I already said above, we need to know more about your data model (CREATE TABLE ...), primary keys, and indexes (CREATE INDEX ...). You can create a model here http://sqlfiddle.com/ with all 3 CREATE TABLE statements and their indexes, then add the link here.
In SSMS, you can right click on a table and go to Script Table as / Create To / New Query Window and add it here or in http://sqlfiddle.com/. Only keep the CREATE TABLE ... part down to the first GO.
You can then do the same thing for all your indexes.
You should also add a copy of your query plan.
In SSMS, go to Query menu / Display Estimated Execution Plan and right click to save it as xml (xml is better). It is only an estimation and it won't execute the whole query. It should be pretty fast.

Overwrite ONE column in select statement with a fixed value without listing all columns

I have a number of tables with a large number of columns (> 100) in a SQL Server database. In some cases when selecting (using views) I need to replace exactly ONE of the columns with a fixed result value instead of the data from the row(s).
Is there a way to use something like
select table.*, 'value' as Column1 from table
if Column1 is a column name within the table?
Of course I can list all the columns expected in the result in the SELECT statement, replacing the one with a value.
However, this is very inconvenient, and with 3 or 4 of those views I have to maintain them all if columns are added to or removed from the table.
Nope, you have to specify columns in this case.
And you have much more serious problems if tables are being changed often. This may be a signal of large architectural defects.
Anyway, listing all columns instead of * is good practice, because if the number of columns changes, it may cause cascading errors.
As other responses have noted, this can't be done in a single statement. There is a workaround, however, which is not perfect but does circumvent the need to list columns manually: save your initial, unmodified query to a temp table, update the column(s) you need to overwrite, then select the results:
--we're going to use a temp table; make sure it doesn't already exist
if (object_id('tempdb..#tmpTbl') is not null)
drop table #tmpTbl
--initial query to retrieve all the columns
select *
into #tmpTbl
from TblWithManyColumns
--update column(s) from another table or query
update #tmpTbl
set ColToBeReplaced = trv.ColWithReplacementValue
from #tmpTbl t
join TableWithReplacementValue trv
on trv.KeyCol = t.KeyCol
--where trv.FilterCol = #FilterVal -- if needed
--this select contains the final output data
select * from #tmpTbl
drop table #tmpTbl
This has plenty of drawbacks. Complexity, performance, etc. But it is very flexible and solves the major problem of preventing changes to the main table (TblWithManyColumns) from breaking the query or requiring manual changes. This is particularly important if you're trying to generate SQL.

What's the best way to select fields from multiple tables with a common prefix?

I have sensor data from a client which is in ongoing acquisition. Every week we get a table of new data (about one million rows each) and each table has the same prefix. I'd like to run a query and select some columns across all of these tables.
What would be the best way to go about this?
I have seen some solutions that use dynamic SQL, and I was considering writing a stored procedure that would form a dynamic SQL statement and execute it for me, but I'm not sure this is the best way.
I see you are using PostgreSQL. This is an ideal case for partitioning with constraint exclusion based on dates. You create one master table without data, and the weekly tables inherit from it. In your case, you don't even have to worry about the nuisance of INSERT triggers; it sounds like there is never any insertion other than the weekly bulk creation of a new table. See the link above for full documentation.
Queries can be run against the parent table, and Postgres takes care of looking in all the child tables, plus it is smart enough to skip child tables ruled out by WHERE criteria.
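A minimal sketch of that inheritance setup (table and column names are assumed):
-- Parent table; holds no data itself
CREATE TABLE sensor_data (
    sensor_id   integer,
    recorded_at timestamp,
    value       numeric
);
-- Each weekly table inherits from the parent; the CHECK constraint
-- lets constraint exclusion skip it when the WHERE clause rules it out
CREATE TABLE sensor_data_2018_w01 (
    CHECK (recorded_at >= '2018-01-01' AND recorded_at < '2018-01-08')
) INHERITS (sensor_data);
-- Queries go against the parent and automatically cover all children
SELECT avg(value)
FROM sensor_data
WHERE recorded_at >= '2018-01-01' AND recorded_at < '2018-01-08';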
You could query the metadata for tables with the same prefix:
select table_name from information_schema.tables where table_name like 'week%'
Then you could use union all to combine queries like
select * from week001
union all
select * from week002
[...]
However, I suggest appending new records to one single table and using an index on the timestamp column. This would especially speed up queries which span multiple weeks. It will simplify your queries a lot if you only have to deal with one table. If the table gets too large, you could partition by date, etc., so there should be no need to partition manually by having multiple tables.
You are correct, sometimes you have to write dynamic SQL to handle cases such as this.
If all of your tables are loaded you can query for table names within your stored procedure. Something like this:
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'
Play with that to get the specific table names you need.
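For instance, a sketch (PostgreSQL; the week prefix and column names are assumptions) that builds the text of one UNION ALL query from the catalog:
-- format('%I') quotes each table name safely; execute the resulting string
-- via EXECUTE in a PL/pgSQL function, or from the client
SELECT string_agg(
           format('SELECT sensor_id, value FROM %I', table_name),
           ' UNION ALL '
       ) AS stitched_query
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
  AND table_name LIKE 'week%';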
How are the table names differentiated? By date? Some incrementing ID?

Updating 100k records in one update query

Is it possible, or recommended at all, to run one update query that will update nearly 100k records at once?
If so, how can I do that? I am trying to pass an array to my stored proc, but it seems not to work. This is my SP:
CREATE PROCEDURE [dbo].[UpdateAllClients]
    @ClientIDs varchar(max)
AS
BEGIN
    DECLARE @vSQL varchar(max)
    SET @vSQL = 'UPDATE Clients SET LastUpdate=GETDATE() WHERE ID IN (' + @ClientIDs + ')';
    EXEC(@vSQL);
END
I have no idea what's not working, but it's just not updating the relevant records.
Anyone?
The UPDATE is reading your @ClientIDs (a comma-separated value) as a whole. To illustrate, assume @ClientIDs = 1,2,3,4,5.
Your UPDATE command is interpreting it like this:
UPDATE Clients SET LastUpdate=GETDATE() WHERE ID IN ('1,2,3,4,5');
and not
UPDATE Clients SET LastUpdate=GETDATE() WHERE ID IN (1,2,3,4,5);
One suggestion is to use a subquery in your UPDATE, for example:
UPDATE Clients
SET LastUpdate = GETDATE()
WHERE ID IN
(
SELECT ID
FROM tableName
-- WHERE condition
)
Hope this makes sense.
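Alternatively, if you are on SQL Server 2016 or later (an assumption; the question does not state a version), the built-in STRING_SPLIT function avoids dynamic SQL entirely. A hypothetical rewrite of the procedure:
CREATE PROCEDURE [dbo].[UpdateAllClients]
    @ClientIDs varchar(max)
AS
BEGIN
    UPDATE Clients
    SET LastUpdate = GETDATE()
    -- STRING_SPLIT turns '1,2,3,4,5' into one row per id
    WHERE ID IN (SELECT CAST(value AS int) FROM STRING_SPLIT(@ClientIDs, ','));
END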
A few notes to be aware of.
Big updates like this can lock up the target table. If > 5000 rows are affected by the operation, the individual row locks will be promoted to a table lock, which would block other processes. Worth bearing in mind if this could cause an issue in your scenario. See: Lock Escalation
With a large number of rows to update like this, an approach I'd consider is (basic):
bulk insert the 100K Ids into a staging table (e.g. from .NET, use SqlBulkCopy)
update the target table, using a join onto the above staging table
drop the staging table
This gives you some more room for controlling the process, e.g. breaking the workload up into chunks and doing x rows at a time. A sketch of the approach follows below.
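A minimal sketch of that staging-table approach (names are assumed; only the Clients table comes from the question):
-- 1) Staging table to receive the 100K ids
--    (bulk inserted from the client, e.g. with SqlBulkCopy from .NET)
CREATE TABLE #ClientIdStaging (ID int PRIMARY KEY);
-- 2) Set-based update via a join onto the staging table
UPDATE c
SET LastUpdate = GETDATE()
FROM Clients c
JOIN #ClientIdStaging s ON s.ID = c.ID;
-- 3) Clean up
DROP TABLE #ClientIdStaging;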
There is a limit on the number of items you can pass to IN if you are giving it a list.
So, if you just want to update the whole table, skip the IN condition.
If not, specify a subquery inside the IN. That should do the job.
The database will very likely reject that SQL statement because it is too long.
When you need to update so many records at once, then maybe your database schema isn't appropriate. Maybe the LastUpdate datum should not be stored separately for each client but only once globally or only once for a constant group of clients?
But it's hard to recommend a good course of action without seeing the whole picture.
What version of SQL Server are you using? If it is 2008+, I would recommend using TVPs (table-valued parameters - http://msdn.microsoft.com/en-us/library/bb510489.aspx). The transfer of data will be faster (as opposed to building a huge string) and your query would look nicer:
update c
set lastupdate = getdate()
from clients c
join @mytvp t on c.Id = t.Id
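A sketch of the plumbing behind that (the type name dbo.IdList is an assumption):
-- Hypothetical table type and procedure using a table-valued parameter
CREATE TYPE dbo.IdList AS TABLE (Id int PRIMARY KEY);
GO
CREATE PROCEDURE dbo.UpdateAllClients
    @mytvp dbo.IdList READONLY
AS
BEGIN
    UPDATE c
    SET lastupdate = GETDATE()
    FROM clients c
    JOIN @mytvp t ON c.Id = t.Id;
END
From ADO.NET, the parameter is passed as SqlDbType.Structured (e.g. from a DataTable), so the ids travel as rows rather than one huge string.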
Each SQL statement on its own is a transaction. This means SQL Server is going to grab locks for all these millions of rows, which can really degrade the performance of the table, so you really don't want to update millions of rows in one statement. The workaround is to set ROWCOUNT before the DML operation:
SET ROWCOUNT 100
UPDATE Clients SET LastUpdate=GETDATE()
WHERE ID IN (1,2,3,4,5)
SET ROWCOUNT 0
Or, from SQL Server 2008, you can parameterize the TOP keyword:
DECLARE @value int
SET @value = 100000
again:
UPDATE TOP (@value) Clients SET LastUpdate=GETDATE()
WHERE ID IN (1,2,3,4,5) -- the WHERE must exclude already-updated rows, or the loop never ends
IF @@ROWCOUNT <> 0 GOTO again
See how long the above query takes, then adjust the value of the variable. You need to break the task into smaller units, as suggested in the other answers.
Method 1:
Split @ClientIDs on the ',' delimiter, put the values in an array, iterate over that array, and update the Clients table for each id.
OR
Method 2:
Instead of taking @ClientIDs as a varchar, create a table type for the ids and use a join.
For faster processing you can also create an index on ClientID.

How do I optimize a database for superstring queries?

So I have a database table in MySQL that has a column containing a string. Given a target string, I want to find all the rows that have a substring contained in the target, i.e. all the rows for which the target string is a superstring of the column value. At the moment I'm using a query along the lines of:
SELECT * FROM table WHERE 'my superstring' LIKE CONCAT('%', column, '%')
My worry is that this won't scale. I'm currently doing some tests to see if this is a problem but I'm wondering if anyone has any suggestions for an alternative approach. I've had a brief look at MySQL's full-text indexing but that also appears to be geared toward finding a substring in the data, rather than finding out if the data exists in a given string.
You could create a temporary table with a full text index and insert 'my superstring' into it. Then you could use MySQL's full text match syntax in a join query with your permanent table. You'll still be doing a full table scan on your permanent table because you'll be checking for a match against every single row (what you want, right?). But at least 'my superstring' will be indexed so it will likely perform better than what you've got now.
Alternatively, you could consider simply selecting column from table and performing the match in a high level language. Depending on how many rows are in table, this approach might make more sense. Offloading heavy tasks to a client server (web server) can often be a win because it reduces load on the database server.
If your superstrings are URLs, and you want to find substrings in them, it would be useful to know if your substrings can be anchored on the dots.
For instance, you have superstrings :
www.mafia.gov.ru
www.mymafia.gov.ru
www.lobbies.whitehouse.gov
If your rules contain 'mafia' and you want the first 2 to match, then what I'll say doesn't apply.
Otherwise, you can parse your URLs into tokens like ['www', 'mafia', 'gov', 'ru'].
Then, it will be much easier to look up each element in your table.
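A sketch of that token-table idea (table and column names are invented for illustration):
-- One row per dot-separated label of each stored string
CREATE TABLE url_token (
    url_id INT NOT NULL,
    token  VARCHAR(64) NOT NULL,
    INDEX idx_token (token)
);
-- Tokenize the target 'www.mafia.gov.ru' in the application, then use the
-- index to narrow candidates; verify the full match afterwards with LIKE
SELECT DISTINCT url_id
FROM url_token
WHERE token IN ('www', 'mafia', 'gov', 'ru');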
Well it appears the answer is that you don't. This type of indexing is generally not available and if you want it within your MySQL database you'll need to create your own extensions to MySQL. The alternative I'm pursuing is to do the indexing in my application.
Thanks to everyone that responded!
I created a search solution using views that needed to be robust enough to grow with the customer's needs. For example:
CREATE TABLE tblMyData
(
MyId bigint identity(1,1),
Col01 varchar(50),
Col02 varchar(50),
Col03 varchar(50)
)
CREATE VIEW viewMySearchData
as
SELECT
MyId,
ISNULL(Col01,'') + ' ' +
ISNULL(Col02,'') + ' ' +
ISNULL(Col03,'') + ' ' AS SearchData
FROM tblMyData
SELECT
t1.MyId,
t1.Col01,
t1.Col02,
t1.Col03
FROM tblMyData t1
INNER JOIN viewMySearchData t2
ON t1.MyId = t2.MyId
WHERE t2.SearchData like '%search string%'
If they then decide to add columns to tblMyData and want those columns searched, modify viewMySearchData by adding the new columns to the "AS SearchData" section.
If they decide there are too many columns in the search, just modify viewMySearchData by removing the unwanted columns from the "AS SearchData" section.