Is there a way to pull part of a SQL query from a .sql file? - sql

Let me simplify with an example. Let's say I have the following query saved at:
C:\sample.sql
grp.id IN
(001 --Bob
,002 --Tom
,003 --Fay
)
Now, that group of IDs could change, but instead of updating those IDs in every query it's related to, I was hoping to just update in sample.sql and the rest of the queries will pull from that SQL file directly.
For example, I have several queries that would have a section like this:
SELECT *
FROM GROUP grp
WHERE grp.DATERANGE >= '2017-12-01' AND grp.DATERANGE <= '2017-12-31'
AND -- **this is where I would need to insert that query (ie. C:\sample.sql)**
More explained update:
Issue: I have several reports/queries having the same ID filter (that's the only thing in common between those reports)
What's needed: Instead of updating those IDs every time they change on each report, I was wondering if I can update those IDs in their own SQL file (like the example above) and have the rest of the queries pull from there.
Note. I can't create a table or database in the used database.

Maybe the BULK INSERT utility could help. Hold your data in CSV files and load them into temp tables at run time. Use these temp tables to drive your query.
CREATE TABLE #CsvData(
    Column1 VARCHAR(40),
    Column2 VARCHAR(40)
)
GO
BULK INSERT #CsvData
FROM 'c:\csvtest.txt'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
)
GO
--Use #CsvData to drive your query
SELECT *
FROM #CsvData
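If the group of IDs from the question lives in the first CSV column, the filter could then be rewritten against the temp table (a minimal sketch, assuming the IDs are loaded into Column1 and that you may create temp tables even though you can't create permanent ones):
SELECT *
FROM [GROUP] grp --brackets because GROUP is a reserved word
WHERE grp.DATERANGE >= '2017-12-01' AND grp.DATERANGE <= '2017-12-31'
AND grp.id IN (SELECT Column1 FROM #CsvData) --IDs loaded from the CSV file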

Maybe what you could use is a CTE (Common Table Expression) to pull your IDs using an additional query, especially if you only have read access. It would look something like this:
WITH myIDs AS (select IDs from grp where (conditions to get the IDs))
SELECT *
FROM grp
WHERE grp.DATERANGE BETWEEN '2017-12-01' AND '2017-12-31'
AND IDs in (select * from myIDs)
I've changed the date syntax to use BETWEEN since it's more practical, but it only works if you have SQL Server 2008 or later.
Hope this helps!
Cheers!

The only way to build a query out of text fragments is dynamic SQL:
Try this:
DECLARE @SomeCommand VARCHAR(MAX)='SELECT * FROM sys.objects';
EXEC(@SomeCommand);
This returns a list of all sys.objects entries.
Now I append a WHERE clause to the string:
SET @SomeCommand=@SomeCommand + ' WHERE object_id IN(1,2,3,4,5,6,7,8,9)';
EXEC(@SomeCommand);
And you get a reduced result.
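Tying this back to the original question: the fragment saved in C:\sample.sql could be read into a variable and appended to the dynamic statement, for example with OPENROWSET ... SINGLE_CLOB (a sketch, assuming the SQL Server service account can read that path and that ad hoc BULK access is permitted):
DECLARE @Fragment VARCHAR(MAX);
DECLARE @Sql VARCHAR(MAX) = 'SELECT * FROM [GROUP] grp
WHERE grp.DATERANGE >= ''2017-12-01'' AND grp.DATERANGE <= ''2017-12-31''
AND ';
--read the whole file into a single value (BulkColumn)
SELECT @Fragment = BulkColumn
FROM OPENROWSET(BULK 'C:\sample.sql', SINGLE_CLOB) AS f;
EXEC(@Sql + @Fragment);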
Another option is a dynamic IN-list with a CSV parameter.
This is forbidden: DECLARE @idList VARCHAR(100)='1,2,3,4' and then using it like IN (@idList).
But this works:
DECLARE @idList VARCHAR(100)='1,2,3,4,5,6,7,8,9';
SELECT sys.objects.*
FROM sys.objects
--use REPLACE to transform the list to <x>1</x><x>2</x>...
OUTER APPLY(SELECT CAST('<x>' + REPLACE(@idList,',','</x><x>') + '</x>' AS XML)) AS A(ListSplitted)
--Now use the XML (the former CSV) within your IN() as a set-based filter
WHERE @idList IS NULL OR LEN(@idList)=0 OR object_id IN(SELECT B.v.value('.','int') FROM ListSplitted.nodes('/x') AS B(v));
With SQL Server 2016+ this can be done much more easily using STRING_SPLIT().
This approach allows you to pass the id-list as a simple text parameter.
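For example, on SQL Server 2016 or later the same filter might look like this (a sketch of the STRING_SPLIT() variant):
DECLARE @idList VARCHAR(100)='1,2,3,4,5,6,7,8,9';
SELECT sys.objects.*
FROM sys.objects
--STRING_SPLIT returns one row per value in a column named value
WHERE object_id IN (SELECT CAST(value AS INT) FROM STRING_SPLIT(@idList, ','));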

Related

Generate unique ID CSV list from table

I am trying to get a unique list of IDs from a table in CSV format. I am close; I just need to be able to remove duplicates. So far I have:
DECLARE @csv VARCHAR(max)
set @csv = null
SELECT @csv = COALESCE(@csv + ',', '') + ''''+ID+''''
FROM TABLE
select @csv
The only problem is the table can contain the same ID multiple times, and I only want each ID once. I tried adding a DISTINCT before the ID but it doesn't like being there.
Using the syntax SELECT @Variable = @Variable + ... FROM is a documented antipattern and should be avoided; it relies on the data engine processing your data in a row-by-row order, which there is no guarantee of. Instead, use string aggregation to achieve the same result. In recent versions of SQL Server that means STRING_AGG; in older versions you'll need to use FOR XML PATH (and STUFF).
Assuming you are on a fully supported version of SQL Server, then use a CTE/derived table to get the DISTINCT values, and then aggregate that:
WITH CTE AS(
    SELECT DISTINCT ID
    FROM dbo.YourTable)
SELECT STRING_AGG(ID,',')
FROM CTE;
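On versions without STRING_AGG, the FOR XML PATH / STUFF equivalent might look like this (a sketch against the same dbo.YourTable; CAST the ID first if it is not already a character type):
SELECT STUFF(
    (SELECT ',' + ID
     FROM (SELECT DISTINCT ID FROM dbo.YourTable) AS d
     FOR XML PATH(''), TYPE).value('.', 'VARCHAR(MAX)'),
    1, 1, ''); --STUFF removes the leading comma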

Is there any SQL query character limit while executing it by using the JDBC driver [duplicate]

I'm using the following code:
SELECT * FROM table
WHERE Col IN (123,123,222,....)
However, if I put more than ~3000 numbers in the IN clause, SQL throws an error.
Does anyone know if there's a size limit or anything similar?!!
Depending on the database engine you are using, there can be limits on the length of a statement.
SQL Server has a very large limit:
http://msdn.microsoft.com/en-us/library/ms143432.aspx
Oracle, on the other hand, has a limit that is very easy to reach.
So, for large IN clauses, it's better to create a temp table, insert the values and do a JOIN. It usually performs better, too.
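A minimal sketch of that approach (the table and column names are placeholders):
CREATE TABLE #Ids (Id INT PRIMARY KEY);
INSERT INTO #Ids (Id) VALUES (123), (222), (456); --load all your values here
SELECT t.*
FROM MyTable t
JOIN #Ids i ON i.Id = t.Col;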
There is a limit, but you can split your values into separate blocks of IN():
Select *
From table
Where Col IN (123,123,222,....)
or Col IN (456,878,888,....)
Parameterize the query and pass the ids in using a Table Valued Parameter.
For example, define the following type:
CREATE TYPE IdTable AS TABLE (Id INT NOT NULL PRIMARY KEY)
Along with the following stored procedure:
CREATE PROCEDURE sp__Procedure_Name
@OrderIDs IdTable READONLY
AS
SELECT *
FROM table
WHERE Col IN (SELECT Id FROM @OrderIDs)
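Calling it from T-SQL might then look like this (a sketch; from application code such as JDBC you would bind the table-valued parameter through the driver instead):
DECLARE @ids IdTable;
INSERT INTO @ids (Id) VALUES (123), (222), (456);
EXEC sp__Procedure_Name @OrderIDs = @ids;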
Why not do a WHERE IN with a sub-select...
Pre-query into a temp table or something...
CREATE TABLE SomeTempTable AS
SELECT YourColumn
FROM SomeTable
WHERE UserPickedMultipleRecordsFromSomeListOrSomething
then...
SELECT * FROM OtherTable
WHERE YourColumn IN ( SELECT YourColumn FROM SomeTempTable )
Depending on your version, use a table valued parameter in 2008, or some approach described here:
Arrays and Lists in SQL Server 2005
For MS SQL 2016, passing ints into the IN clause, it looks like it can handle close to 38,000 values:
select * from user where userId in (1,2,3,etc)
I solved this by simply using ranges
WHERE Col >= 123 AND Col <= 10000
then removed the unwanted records in that range by looping in the application code. It worked well for me because I was looping over the records anyway, and skipping a couple of thousand records didn't make any difference.
Of course, this is not a universal solution, but it can work when most of the values between the minimum and the maximum are required.
You did not specify the database engine in question; in Oracle, an option is to use tuples like this:
SELECT * FROM table
WHERE (Col, 1) IN ((123,1),(123,1),(222,1),....)
This ugly hack only works in Oracle SQL, see https://asktom.oracle.com/pls/asktom/asktom.search?tag=limit-and-conversion-very-long-in-list-where-x-in#9538075800346844400
However, a much better option is to use stored procedures and pass the values as an array.
You can use tuples like this:
SELECT * FROM table
WHERE (Col, 1) IN ((123,1),(123,1),(222,1),....)
There is no restriction on the number of these; it compares pairs.

Create Microsoft SQL Temp Tables (without declaring columns – like Informix)?

I recently changed positions, and came from an Informix database environment, where I could use SQL statements to select one or more columns ... and direct the output to a temporary table. In Informix, for temp tables, I neither had to declare the column names, nor the column lengths (only the name of a temp table) - I could simply write:
select [columnname1, columnname2, columnname3 ..] from
[database.tablename] where... etc. into temp tablename1 with no log;
Note that in Informix, the temp table stores the column names by default... as well as the data types [by virtue of the data-type being stored in the temp table]. So, if the above statement was executed, then a developer could merely write:
select columname1, columnname2, etc. from tablename1
In my experience, I found this method was very useful - for numerous reasons ('slicing/dicing' the data, using various data sources, etc.)... as well as tremendously fast and efficient.
However, now that I am using Microsoft SQL Server, I have not found a way (yet) to do the same. In SQL Server, I must declare each column, along with its length:
Create table #tablename1 ( column1 numeric(13,0) );
insert into #tablename1(column1) select [column] from
[database.tablename] where …
[Then use the info, as needed]:
select * from #tablename1 [ and do something...]
Drop table #tablename1
Does anyone know of how I could do this and/or set-up this capability in Microsoft SQL Server? I looked at anonymous tables (i.e. Table-Value constructors: http://technet.microsoft.com/en-us/library/dd776382.aspx)... but the guidance stated that declaring the columns was still necessary.
Thanks ahead of time
- jrd
The syntax is:
select [columnname1], [columnname2], [columnname3] into tablename1 from [database].[schema].[tablename] where...
Prefix tablename1 with # if you want the table to be temporary.
It should be noted that, while you can use the syntax below:
SELECT col1, col2...
INTO #tempTable1
FROM TABLEA
You should also give your calculated columns names.
Such that you get:
SELECT col1, col2...,AVG(col9) AS avgCol9
INTO #tempTable1
FROM TABLEA
It's very simple in SQL Server as well; all you have to do is:
SELECT Column1, Column2, Column3,...... INTO #Temp
FROM Table_Name
This statement will create a temp table on the fly, copying the data and data types over to the temp table. The # sign makes this a temporary table; you can also create a permanent table by using the same syntax but without the # sign, something like this:
SELECT Column1, Column2, Column3,...... INTO New_Table_Name
FROM Table_Name

how to overcome the limitation of the IN clause in an SQL query

I have written an SQL query like:
select field1, field2 from table_name;
The problem is this query will return a million records, or at least more than 100k records.
I have a directory of input files (around 20,000 to 50,000 records) that contain field1. This is the main data I am concerned with.
Using a Perl script, I am extracting those values from the directory.
But , if I write a query like :
select field1 , field2 from table_name
where field1 in (need to write a query to take field1 from directory);
If I use the IN clause, it has a limit of 1000 entries, so how should I overcome that limitation?
In any DBMS, I would insert them into a temporary table and perform a JOIN to work around the IN clause limitation on the size of the list.
E.g.
CREATE TABLE #idList
(
    ID INT
)
INSERT INTO #idList VALUES(1)
INSERT INTO #idList VALUES(2)
INSERT INTO #idList VALUES(3)
SELECT *
FROM MyTable m
JOIN #idList AS t
    ON m.id = t.id
In SQL Server 2005, in one of our previous projects, we used to convert the list of values that came from querying another data store (a Lucene index) into XML, pass it as an XML variable to the SQL query, convert it into a table using the nodes() function on the XML data type, and perform a JOIN with that:
DECLARE @IdList XML
SELECT @IdList = '
<Requests>
    <Request id="1" />
    <Request id="2" />
    <Request id="3" />
</Requests>'
SELECT *
FROM MyTable m
JOIN (
    SELECT id.value('(@id)[1]', 'INT') as 'id'
    FROM @IdList.nodes('/Requests/Request') as T(id)
) AS t
    ON m.id = t.id
Vikdor is right: you shouldn't be querying this with an IN() clause; it's faster and more memory-efficient to use a table and JOIN against it.
Expanding on his answer I would recommend the following approach:
Get a list of all input files via Perl
Think of some clever way to compute a hash value for your list that is unique and based on all input files (I'd recommend the filenames or similar)
This hash will serve as the name of the table that stores the input filenames (think of it as a quasi temporary table that gets discarded once the hash changes)
JOIN that table to return the correct records
For step 2. you could either use a cronjob or compute whenever the query is actually needed (which would delay the response, though). To get this right you need to consider how likely it is that files are added/removed.
For step 3. you would need some logic that drops the previously generated tables once the current hash value differs from that of the last execution, then recreates the table named after the current hash.
For the quasi temporary table names I'd recommend something along the lines of
input_files_XXX (i.e. prefix_<hashvalue>)
which makes it easier to know what stale tables to drop.
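A rough SQL sketch of steps 3 and 4, assuming the Perl script computed a hash of abc123 and bulk-loads the field1 values from the input files (all names here are hypothetical, and the DROP syntax varies by engine):
DROP TABLE input_files_oldhash; --drop the table generated for the previous hash
CREATE TABLE input_files_abc123 (field1 VARCHAR(100) PRIMARY KEY);
--...the Perl script bulk-loads the 20,000 to 50,000 field1 values here...
SELECT t.field1, t.field2
FROM table_name t
JOIN input_files_abc123 i ON i.field1 = t.field1;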
You could split your 50,000 IDs into 50 lists of 1,000 IDs, run a query for each such list, and collect the result sets in Perl.
Oracle-wise, the best solution (rather than using a temporary table, which without indexing won't give you much performance) is to use a nested table type.
CREATE TYPE my_ntt is table of directory_rec;
Then create a function f1 that returns a variable of the my_ntt type and use it in the query:
select field1, field2 from table_name where field1 in (select * from table(cast(f1 as my_ntt)));
