I keep many work-related SQL convenience scripts.
For a while I've been using a convention of having
several ANDed conditions in the WHERE clause that I can activate
by providing a value or values to search on. For example,
where color like '%&color' and size like '%&size'
When I run such SQL in my preferred client (Golden6) it pops up
a dialog box where I can provide values for color, size or both.
Very convenient, but the performance of LIKE '%string' is often
terrible, typically forcing a full table scan, or so I have read.
Is there some alternative technique for writing and managing these
scripts that maintains the convenience of being able to fill in only the
arguments I want to use, yet avoids the performance issues around LIKE '%string'? I don't want to have to edit the script each time I use it,
because I keep them in git and I don't want to manage having a bunch of locally modified files to sort out all the time.
If you want to support optional input parameters, then you could try:
with data as
(select '123' col1, 'ABC' col2 from dual union select '124', 'AB' from dual)
select *
from data a
where a.col1 = nvl('&col1', a.col1)
and a.col2 = nvl('&col2', a.col2)
If the data contains additional rows with NULL values, switch to explicit IS NULL checks:
with data as
(select '123' col1, 'ABC' col2
from dual
union
select '124', 'AB'
from dual
union
select '123', null from dual
union
select '124', null from dual)
select *
from data
where ('&col1' is null or '&col1' is not null and '&col1' = col1)
and ('&col2' is null or '&col2' is not null and '&col2' = col2)
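The two predicates behave differently when a column itself is NULL: the NVL form silently drops those rows. A minimal sketch of the difference, using Python's sqlite3 with COALESCE standing in for Oracle's NVL and driver bind parameters standing in for the client's &var substitution (table and values taken from the sample data above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (col1 TEXT, col2 TEXT)")
conn.executemany("INSERT INTO data VALUES (?, ?)",
                 [("123", "ABC"), ("124", "AB"), ("123", None), ("124", None)])

# NVL-style predicate: a NULL parameter disables the filter,
# but rows where the column itself is NULL are always dropped,
# because NULL = NULL is not true in SQL.
nvl_style = """SELECT * FROM data
               WHERE col1 = COALESCE(:col1, col1)
                 AND col2 = COALESCE(:col2, col2)"""

# IS NULL-style predicate: a NULL parameter disables the filter
# and rows with NULL column values survive.
is_null_style = """SELECT * FROM data
                   WHERE (:col1 IS NULL OR col1 = :col1)
                     AND (:col2 IS NULL OR col2 = :col2)"""

# With no parameters supplied, the NVL form loses the two NULL rows:
print(conn.execute(nvl_style, {"col1": None, "col2": None}).fetchall())
# The IS NULL form returns all four rows:
print(conn.execute(is_null_style, {"col1": None, "col2": None}).fetchall())
```

Using bind parameters instead of text substitution also keeps the scripts unchanged in git: the values live in the client dialog, not in the file.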
I am querying Google BigQuery databases using JetBrains' DataGrip. I love the UI in many ways, but one thing I'd like to know is if there is a more friendly way to see structs and arrays.
I hate the BigQuery WebUI for many reasons, but the one thing I like is the way they implicitly render structs and arrays.
select struct("s1","s2","s3")
, array(select "a1" union all select "a2" union all select "a3")
union all
select struct("s4","s5","s6")
, array(select "a4" union all select "a5" union all select "a6")
To clarify with images, I like how BigQuery renders this (screenshot omitted), while DataGrip is harder to read (screenshot omitted).
You'll finally be able to see your structs in the upcoming DataGrip 2022.3 release.
Next steps:
DBE-16175: Support editing tables with BigQuery struct and array data types
DBE-16173: Better Data Viewer presentation/layout for hierarchical data types (struct, array, json)
DBE-16176: Erase extra Data Viewer border when showing flattened hierarchical data (struct, array, json)
Here is an example of how it will be displayed in DataGrip (screenshot omitted):
I'm not sure how it will look in DataGrip, but try the query below:
select format('%T', col1) col1, format('%T', col2) col2 from (
select struct("s1","s2","s3") col1
, array(select "a1" union all select "a2" union all select "a3") col2
union all
select struct("s4","s5","s6")
, array(select "a4" union all select "a5" union all select "a6")
)
I sometimes use this trick within the BigQuery Console to get an even better view; the results are more compact and easier to digest.
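Outside BigQuery, FORMAT('%T', ...) isn't available, but a similar compact literal rendering can be done client-side. A hypothetical Python sketch (the as_literal helper and its struct/array conventions are invented for illustration, not part of any driver API):

```python
def as_literal(value):
    """Render a nested value roughly the way BigQuery's FORMAT('%T', ...)
    prints it: strings quoted, tuples shown as struct(...), lists as [...]."""
    if isinstance(value, str):
        return '"%s"' % value
    if isinstance(value, tuple):   # convention: tuples stand in for structs
        return "struct(%s)" % ", ".join(as_literal(v) for v in value)
    if isinstance(value, list):    # convention: lists stand in for arrays
        return "[%s]" % ", ".join(as_literal(v) for v in value)
    return "NULL" if value is None else str(value)

# One result row shaped like the query above: a struct and an array.
row = (("s1", "s2", "s3"), ["a1", "a2", "a3"])
print(as_literal(row[0]))  # struct("s1", "s2", "s3")
print(as_literal(row[1]))  # ["a1", "a2", "a3"]
```

The point is the same as %T: collapse the nested value to one short line per cell so a wide grid stays readable.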
I have a table with 40 columns and about 15 rows (the count may vary with the volume of records). Many of the columns contain only NULL values; in other words, those columns were not used in the production environment, so they need no validation and should not appear in the output.
I want to select only the columns that contain non-NULL values, skipping any column that is NULL in every row.
        Col_A  Col_B  Col_C  Col_D  Col_E  Col_F
ROW1    Val_1  Val_2  NULL   Val_4  Val_5  Val_6
ROW2    Val_1  Val_2  NULL   Val_4  Val_5  NULL
Here I want to list all the columns except Col_C, which is NULL in every row.
In SQL this is not directly possible, because that is not how a SQL query works. In short, this is what happens:
You tell the database engine which columns you want back and what the conditions are (conditions filter rows, not columns). At this point the database knows nothing about the results. This is done using the query syntax.
The db engine runs the query and returns the result, row by row. The db engine does not know in advance what is in those rows/columns.
The requirement is to skip columns that have no data. As explained above, that is not known at the time the query runs, but you can work around it, for example by creating a view that only has the columns with data. Bear in mind that this means an extra query has to run before the real query each time, which could be a downside if you're talking about large datasets.
Let's create a sample table:
CREATE TABLE tab1 (c1, c2, c3, c4) AS
(
SELECT 1,CAST(NULL AS NUMBER), 2,4 FROM DUAL UNION ALL
SELECT NULL,NULL, 1,3 FROM DUAL UNION ALL
SELECT 1,NULL, NULL,3 FROM DUAL UNION ALL
SELECT 1,NULL, 2,NULL FROM DUAL UNION ALL
SELECT 4,NULL, 3,4 FROM DUAL
);
Now, it's not possible to know which columns contain only NULL values before we run a query, but we can find out with an aggregate function such as MAX, MIN, or SUM. Let's call this QUERY1. For example:
SELECT MAX(c1),MAX(c2),MAX(c3),MAX(c4) FROM tab1;
MAX(C1) MAX(C2) MAX(C3) MAX(C4)
---------- ---------- ---------- ----------
4 3 4
Now we can convert that select so it generates another statement (here, a view definition) that selects only the columns with non-NULL values. Let's call this QUERY2.
SELECT
'CREATE OR REPLACE VIEW tab1_v AS SELECT '||
RTRIM(
NVL2(SUM(c1),'c1,','') ||
NVL2(SUM(c2),'c2,','') ||
NVL2(SUM(c3),'c3,','') ||
NVL2(SUM(c4),'c4,','')
,',')||' FROM tab1;' AS stmt FROM tab1;
STMT
---------------------------------------------------------------
CREATE OR REPLACE VIEW tab1_v AS SELECT c1,c3,c4 FROM tab1;
There you go: the generated statement selects only the columns that are not entirely NULL. Note that this is not foolproof. If a row is created or updated after you run QUERY1 but before you run QUERY2, in a way that alters the results of QUERY1, it might not select the correct columns.
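The same two-step idea works from any client: probe which columns are entirely NULL, then build the second statement from the result. A sketch using Python's sqlite3, with COUNT(col) as the probe (it counts only non-NULL values, so unlike SUM it also works for non-numeric columns):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab1 (c1, c2, c3, c4)")
conn.executemany("INSERT INTO tab1 VALUES (?, ?, ?, ?)",
                 [(1, None, 2, 4), (None, None, 1, 3),
                  (1, None, None, 3), (1, None, 2, None),
                  (4, None, 3, 4)])

cols = ["c1", "c2", "c3", "c4"]

# QUERY1: COUNT(col) counts only non-NULL values, so a zero
# marks a column that is NULL in every row.
counts = conn.execute(
    "SELECT " + ", ".join("COUNT(%s)" % c for c in cols) + " FROM tab1"
).fetchone()

# QUERY2: build the projection from the columns that had data.
keep = [c for c, n in zip(cols, counts) if n > 0]
query2 = "SELECT %s FROM tab1" % ", ".join(keep)
print(query2)  # SELECT c1, c3, c4 FROM tab1
```

The same race condition applies here: rows changed between the probe and the generated query can invalidate the column list.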
I saw the following posted as a basic way to de-dup entries, without any explanation of how it works. I can see that it works, but I want to understand how, and the process by which it is evaluated. Below I post the code and my thoughts. I'm hoping somebody can tell me whether my step-by-step understanding of the evaluation is correct or, if I'm off, break it down for me.
CREATE TABLE #DuplicateRcordTable (Col1 INT, Col2 INT)
INSERT INTO #DuplicateRcordTable
SELECT 1, 1
UNION ALL
SELECT 1, 1
UNION ALL
SELECT 1, 1
UNION ALL
SELECT 1, 2
UNION ALL
SELECT 1, 2
UNION ALL
SELECT 1, 3
UNION ALL
SELECT 1, 4
GO
This returns a basic table:
Then this code is used to exclude duplicates:
SELECT col1,col2
FROM #DuplicateRcordTable
EXCEPT
SELECT col1,col2
FROM #DuplicateRcordTable WHERE 1=0
My understanding is that WHERE 1=0 creates a "temp" table with the same structure but no data.
Does this code then start adding data to that new empty table?
For example, does it look at the first Col1, Col2 pair of 1,1 and say "I don't see it in the table", add it to the "temp" table and the end result, then check the next row, which is also 1,1, see that it's already in the "temp" table, and skip it... and so on through the data?
EXCEPT is a set operation that removes duplicates. That is, it takes everything in the first table that is not in the second and then does duplicate removal.
With an empty second set, all that is left is the duplicate removal.
Hence,
SELECT col1, col2
FROM #DuplicateRcordTable
EXCEPT
SELECT col1, col2
FROM #DuplicateRcordTable
WHERE 1 = 0;
is equivalent to:
SELECT DISTINCT col1, col2
FROM #DuplicateRcordTable
This would be the more typical way to write the query.
This would also be equivalent to:
SELECT col1,col2
FROM #DuplicateRcordTable
UNION
SELECT col1,col2
FROM #DuplicateRcordTable
WHERE 1 = 0;
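SQLite implements EXCEPT with the same distinct-set semantics, so the equivalence is easy to verify outside SQL Server; a quick sketch with the sample data from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dup (col1 INT, col2 INT)")
conn.executemany("INSERT INTO dup VALUES (?, ?)",
                 [(1, 1), (1, 1), (1, 1), (1, 2), (1, 2), (1, 3), (1, 4)])

# EXCEPT against an empty right-hand side: nothing is subtracted,
# but the set operator still de-duplicates the left-hand rows.
except_rows = conn.execute(
    "SELECT col1, col2 FROM dup EXCEPT "
    "SELECT col1, col2 FROM dup WHERE 1 = 0").fetchall()

# The straightforward way to say the same thing.
distinct_rows = conn.execute(
    "SELECT DISTINCT col1, col2 FROM dup").fetchall()

# Both forms collapse the seven rows down to the four distinct pairs.
print(sorted(except_rows))    # [(1, 1), (1, 2), (1, 3), (1, 4)]
print(sorted(distinct_rows))  # [(1, 1), (1, 2), (1, 3), (1, 4)]
```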
The reason that this works is due to the definition of EXCEPT which according to the MS docs is
EXCEPT returns distinct rows from the left input query that aren't
output by the right input query.
The key word here being distinct. Putting where 1 = 0 makes the second query return no results, but the EXCEPT operator itself then reduces the rows from the left query down to those which are distinct.
As @Gordon Linoff says in his answer, there is a simpler, more straightforward way to accomplish this.
The fact that the example uses the same table in the left and right queries could be misleading; the following query will accomplish the same thing, as long as the values in the right query don't exist in the left:
SELECT col1, col2
FROM #DuplicateRcordTable
EXCEPT
SELECT -1, -1
REF: https://learn.microsoft.com/en-us/sql/t-sql/language-elements/set-operators-except-and-intersect-transact-sql?view=sql-server-2017
Is it possible to select the value of the second column using something like an index or column position if I don't know the name of the column?
Select col(2) FROM (
Select 'a', 'b',' c', 'd' from dual
)
Is it possible? Sure. You could write a PL/SQL block that used dbms_sql to open a cursor using the actual query against dual, describe the results, bind a variable to whatever you find the second column to be, fetch from the cursor, and then loop. That would be a terribly involved and generally rather painful process but it could be done.
The SQL language does not define a way to do this in a static SQL statement and Oracle does not provide an extension that would allow this. I'd be rather concerned about the underlying problem that you're trying to solve, though, if you somehow know that you want the second column but don't know what that column represents. That's not something that makes a lot of sense in a relational database.
SELECT ORDINAL_POSITION, COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'dual'
Not sure if this is what you need (note that INFORMATION_SCHEMA is SQL Server/MySQL; the Oracle equivalent would be ALL_TAB_COLUMNS with COLUMN_ID).
If you don't know the name, just give it a nice name ;).
Select b FROM (
Select 'a', 'b' as b, 'c', 'd' from dual
)
It is possible in the case when you know the number of columns in the subquery:
SELECT COL2 FROM (
SELECT NULL COL1, NULL COL2, NULL COL3, NULL COL4 FROM DUAL
UNION ALL
select 'a', 'b', 'c', 'd' from dual
)
WHERE COL2 IS NOT NULL;
There are limitations:
the query is hard to read and understand
you have to know the number of columns
the value should not be NULL
not recommended for production use
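If the real goal is just to read the second column whatever it happens to be called, most client APIs already expose result rows positionally, which avoids dynamic SQL entirely. A sketch with Python's DB-API (sqlite3 here; other drivers behave the same way):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.execute("SELECT 'a', 'b', 'c', 'd'")
row = cur.fetchone()

# Rows come back as tuples, so the second column is simply index 1,
# regardless of what (if anything) it is named.
print(row[1])  # b

# The driver also reports the column names it saw, in order:
print([d[0] for d in cur.description])
```

cursor.description lists the result columns in positional order, so the name-to-position mapping is available even when the query text is not under your control.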
I'm working on a stored procedure in SQL Server 2000 with a temp table defined like this:
CREATE TABLE #MapTable (Category varchar(40), Code char(5))
After creating the table I want to insert some standard records (which will then be supplemented dynamically in the procedure). Each category (about 10) will have several codes (typically 3-5), and I'd like to express the insert operation for each category in one statement.
Any idea how to do that?
The best idea I've had so far is to keep a real table in the db as a template, but I'd really like to avoid that if possible. The database where this will live is a snapshot of a mainframe system, such that the entire database is blown away every night and re-created in a batch process- stored procedures are re-loaded from source control at the end of the process.
The issue I'm trying to solve isn't so much keeping it to one statement as it is trying to avoid re-typing the category name over and over.
DJ's answer is a fine solution but could be simplified (see below).
Why does it need to be a single statement?
What's wrong with:
insert into #MapTable (category,code) values ('Foo','AAAAA')
insert into #MapTable (category,code) values ('Foo','BBBBB')
insert into #MapTable (category,code) values ('Foo','CCCCC')
insert into #MapTable (category,code) values ('Bar','AAAAA')
For me this is much easier to read and maintain.
Simplified DJ solution:
CREATE TABLE #MapTable (Category varchar(40), Code char(5))
INSERT INTO #MapTable (Category, Code)
SELECT 'Foo', 'AAAAA'
UNION
SELECT 'Foo', 'BBBBB'
UNION
SELECT 'Foo', 'CCCCC'
SELECT * FROM #MapTable
There's nothing really wrong with DJ's, it just felt overly complex to me.
From the OP:
The issue I'm trying to solve isn't so much keeping it to one statement as it
is trying to avoid re-typing the category name over and over.
I feel your pain -- I try to find shortcuts like this too and realize that by the time I solve the problem, I could have typed it long hand.
If I have a lot of repetitive data to input, I'll sometimes use Excel to generate the insert codes for me. Put the Category in one column and the Code in another; use all of the helpful copying techniques to do the hard work
then
="insert into #MapTable (category,code) values ('"&A1&"','"&B1&"')"
in a third column, and I've generated my inserts.
Of course, all of this is assuming that the Categories and Codes can't be pulled from a system table.
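The same generation trick works in any scripting language, not just Excel. A hypothetical Python sketch that spells each category once and emits the repetitive INSERT statements (categories and codes invented for illustration):

```python
# Spell each category once; the script does the repetitive typing.
codes = {
    "Foo": ["AAAAA", "BBBBB", "CCCCC"],
    "Bar": ["AAAAA"],
}

# Flatten to (category, code) pairs and render one INSERT per pair,
# sorted so the output is deterministic.
statements = [
    "insert into #MapTable (category,code) values ('%s','%s')" % (cat, code)
    for cat, code in sorted((c, v) for c, vs in codes.items() for v in vs)
]
for stmt in statements:
    print(stmt)
```

As with the Excel version, this assumes the values are literals; if they came from a system table you would query that instead.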
insert into #maptable (category, code)
select 'foo1', b.bar
from
( select 'bar11' as bar
union select 'bar12'
union select 'bar13'
) b
union
select 'foo2', b.bar
from
( select 'bar21' as bar
union select 'bar22'
union select 'bar23'
) b
This might work for you:
CREATE TABLE #MapTable (Category varchar(40), Code char(5))
INSERT INTO #MapTable
SELECT X.Category, X.Code FROM
(SELECT 'Foo' as Category, 'AAAAA' as Code
UNION
SELECT 'Foo' as Category, 'BBBBB' as Code
UNION
SELECT 'Foo' as Category, 'CCCCC' as Code) AS X
SELECT * FROM #MapTable
Here's the notation I ended up using. It's based on Arvo's answer, but a little shorter and uses letter case to help make things clearer:
SELECT 'foo1', b.code
FROM ( select 'bar11' as code
union select 'bar12'
union select 'bar13' ) b
UNION SELECT 'foo2', b.code
FROM ( select 'bar21' as code
union select 'bar22'
union select 'bar23' ) b
This way highlights the category names a little, lines up codes vertically, and uses less vertical space.
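When the load runs through a database driver rather than a script file, executemany removes the repetition without any UNION scaffolding at all. A sketch using Python's sqlite3 (an ordinary table stands in for the temp table, since SQLite has no # temp-table syntax):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MapTable (Category TEXT, Code TEXT)")

# One (category, codes) entry per category; the driver repeats the
# category for us instead of one SELECT ... UNION branch per code.
seed = {"foo1": ["bar11", "bar12", "bar13"],
        "foo2": ["bar21", "bar22", "bar23"]}
rows = [(cat, code) for cat, codes in seed.items() for code in codes]
conn.executemany("INSERT INTO MapTable VALUES (?, ?)", rows)

print(conn.execute("SELECT COUNT(*) FROM MapTable").fetchone()[0])  # 6
```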