teradata case when issue - sql

I have the following queries, which are supposed to give the same result but give drastically different ones:
1.
select count(*)
from qigq_sess_parse_2
where str_vendor = 'natural search' and str_category is null and destntn_url = 'http://XXXX.com';
2.
create table qigq_test1 as
(
select case
when (str_vendor = 'natural search' and str_category is null and destntn_url = 'http://XXXX.com' ) then 1
else 0
end as m
from qigq_sess_parse_2
) with data;
select count(*) from qigq_test1 where m = 1;
The first block returns a count of 132868, while the second returns only 1.
What subtlety in the queries causes this difference?
Thanks

When you create a table in Teradata, you can specify it to be SET or MULTISET. If you don't specify, it defaults to SET. A SET table cannot contain duplicate rows. So at most, your new table will contain two rows, a 0 and a 1, since those are the only values your CASE expression can produce.
EDIT:
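After a bit more digging, the defaults aren't quite that simple (as far as I can tell, they depend on the session mode: SET in Teradata mode, MULTISET in ANSI mode). But in any case, I suspect that if you add the MULTISET option to your create statement, you'll see the behavior you expect.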
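To see why duplicate elimination collapses the count to 1, here is a minimal sketch using Python's built-in sqlite3 module. SQLite has no SET tables, so SELECT DISTINCT stands in for the SET behavior; that substitution, the table name sess, and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sess (str_vendor TEXT, str_category TEXT, destntn_url TEXT)"
)
# Five rows that match the condition and three that do not (made-up data).
rows = [("natural search", None, "http://XXXX.com")] * 5
rows += [("paid search", "shoes", "http://other.example")] * 3
conn.executemany("INSERT INTO sess VALUES (?, ?, ?)", rows)

flag = (
    "CASE WHEN str_vendor = 'natural search' AND str_category IS NULL "
    "AND destntn_url = 'http://XXXX.com' THEN 1 ELSE 0 END"
)

# MULTISET-like table: every source row is kept.
conn.execute(f"CREATE TABLE test_multiset AS SELECT {flag} AS m FROM sess")
# SET-like table: duplicate rows are dropped (emulated here with DISTINCT).
conn.execute(f"CREATE TABLE test_set AS SELECT DISTINCT {flag} AS m FROM sess")

multiset_count = conn.execute(
    "SELECT COUNT(*) FROM test_multiset WHERE m = 1"
).fetchone()[0]
set_count = conn.execute(
    "SELECT COUNT(*) FROM test_set WHERE m = 1"
).fetchone()[0]
print(multiset_count, set_count)
```

The table that keeps duplicates reports one row per matching source row, while the deduplicated table can never hold more than one row with m = 1.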

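My guess would be that your CREATE TABLE statement is only pulling in one row of data that fits the parameters for the following COUNT statement. Try this instead (note the MULTISET keyword, since an INSERT ... SELECT into a SET table also discards duplicate rows):
CREATE MULTISET TABLE qigq_test1 (m INTEGER);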
INSERT INTO qigq_test1
SELECT
CASE
WHEN (str_vendor = 'natural search' and str_category IS NULL AND destntn_url = 'http://XXXX.com' ) THEN 1
ELSE 0
END AS m
FROM qigq_sess_parse_2;
SELECT COUNT(*) FROM qigq_test1 WHERE m = 1;
This should pull ALL ROWS of data from qigq_sess_parse_2 into qigq_test1 as either a 0 or 1.

Displaying an alternative result when derived table is empty

I have this SQL code where I try to display an alternative value as the result whenever the table is empty, or the single column of the top row when it is not:
select top 1 case when count(*)!=0 then derrivedTable.primarykey
else 0 end endCase
from
(
select top 1 m.primarykey
from mytable m
where 0=1
)derrivedTable
The problem is that when I run this, I get the error message "column 'derrivedTable.primarykey' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause."
But when I put 'derrivedTable.primarykey' in the group by clause, I just get an empty table.
Does anyone have a solution?
thanks in advance
You can use aggregation:
select coalesce(max(m.primarykey), 0)
from mytable m;
An aggregation query with no group by always returns exactly one row. If the table is empty (or all rows are filtered out), then the aggregation functions -- except for COUNT() -- return NULL -- which can be transformed to a value using COALESCE().
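A quick way to check this behavior is a small sketch with Python's built-in sqlite3 module. The table mirrors mytable from the question, and the sample key values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (primarykey INTEGER PRIMARY KEY)")

query = "SELECT COALESCE(MAX(primarykey), 0) FROM mytable"

# Empty table: the aggregate still yields exactly one row, and
# COALESCE turns the NULL from MAX() into the fallback value 0.
empty_rows = conn.execute(query).fetchall()

# Non-empty table: the same query returns the real maximum.
conn.executemany("INSERT INTO mytable VALUES (?)", [(3,), (7,)])
filled_rows = conn.execute(query).fetchall()

print(empty_rows, filled_rows)
```

In both cases the result set has exactly one row, which is what makes the fallback value possible without a CASE or a GROUP BY.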
Such a construct makes me worry. If you are using this to set the primary key on an insert, then you should learn about identity columns or sequences. The database will do the work for you.
Can you try the script below?
SELECT
CASE
WHEN COUNT(*) = 1 THEN derrivedTable.primarykey
ELSE 0
END endCase
FROM
(
SELECT TOP 1 m.primarykey
FROM mytable m
WHERE 0 = 1
) derrivedTable
GROUP BY derrivedTable.primarykey;

Show results from query if results exist without running the same query again

I want to have a stored procedure that will take one SerialNumber nvarchar as its input and check several databases to see if that serial number exists. If it does exist, return the result of the query; otherwise move on to the next database and do the same thing until all databases have been checked.
Current pseudocode:
IF(exists(select top 1 * from Server1.Database1.Table where num = @SerialNumberInput))
BEGIN
    select top 1 * from Server1.Database1.Table where num = @SerialNumberInput
END ELSE
IF(exists(select top 1 * from Server2.Database2.Table where num = @SerialNumberInput))
BEGIN
    select top 1 * from Server2.Database2.Table where num = @SerialNumberInput
END ELSE
--Server3.Database3
--Server4.Database4
--etc...
But I don't like all this query repetition and I don't like how I'm having to make a call to the server twice by calling the same query twice. I could save the result to a table variable and just check that but that feels hacky.
Too long to comment.
But I don't like all this query repetition
Me neither, but for this case, it's the cleanest method or most readable IMHO.
I don't like how I'm having to make a call to the server twice by
calling the same query twice.
You aren't, at least not exactly. EXISTS returns a BOOLEAN value so as long as there is an INDEX on your predicate, it should be pretty fast. The second query, where you are returning the first row with all of the columns would be slightly slower. Also, you don't need top 1 * in the EXISTS unless you just like that. You can use SELECT 1 or anything since the result is BOOLEAN.
Another thing is you are using TOP without an ORDER BY, which means you don't care what row is returned, and are OK with that row being different (potentially) each time you execute this. More on that in this blog.
If you really want to not use EXISTS, you can break this up using @@ROWCOUNT.
select top 1 * from Server1.Database1.Table where num = @SerialNumberInput
if @@ROWCOUNT = 1
    return
else
    select top 1 * from Server2.Database2.Table where num = @SerialNumberInput
if @@ROWCOUNT = 1
    return
else
...
Or, if the schema is the same and you don't want NULL datasets... something like you said with the table variable.
create table #Temp(...)
insert into #Temp
select top 1 * from Server1.Database1.Table where num = @SerialNumberInput
if @@ROWCOUNT = 1
begin
    select * from #Temp
    return
end
insert into #Temp
select top 1 * from Server2.Database2.Table where num = @SerialNumberInput
if @@ROWCOUNT = 1
begin
    select * from #Temp
    return
end
...
Since you are only inserting a single row, it'd be pretty quick. Larger datasets would naturally take longer.
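If the lookup can be driven from application code instead of a stored procedure, the same "run each query once and stop at the first hit" idea can be sketched as a loop. This is only an illustration using Python's built-in sqlite3 module: three tables in one in-memory database stand in for the separate servers, and the names find_first, db1_items, db2_items, db3_items and the serial numbers are hypothetical.

```python
import sqlite3


def find_first(conn, tables, serial):
    """Return (table, row) for the first table containing serial, else None."""
    for table in tables:
        # Table names come from a fixed list in code, never from user input;
        # the serial number itself is passed as a bound parameter.
        row = conn.execute(
            f"SELECT * FROM {table} WHERE num = ? LIMIT 1", (serial,)
        ).fetchone()
        if row is not None:
            return table, row  # stop here: later sources are never queried
    return None


conn = sqlite3.connect(":memory:")
sources = ["db1_items", "db2_items", "db3_items"]
for name in sources:
    conn.execute(f"CREATE TABLE {name} (num TEXT, note TEXT)")
conn.execute("INSERT INTO db2_items VALUES ('SN-42', 'found in second source')")
conn.execute("INSERT INTO db3_items VALUES ('SN-42', 'never reached')")

hit = find_first(conn, sources, "SN-42")
miss = find_first(conn, sources, "SN-00")
print(hit, miss)
```

Each source is queried at most once, and the row that proves existence is the row that gets returned, so no query is repeated.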

SQL Query : should return Single Record if Search Condition met, otherwise return Multiple Records

I have a table with billions of records. The table structure is like this:
ID NUMBER PRIMARY KEY,
MY_SEARCH_COLUMN NUMBER,
MY_SEARCH_COLUMN holds numeric values up to 15 digits long.
What I want is: if a specific record is matched, I should get only that matched value.
For example, if I enter WHERE MY_SEARCH_COLUMN = 123454321 and the table has the value 123454321, then only that row should be returned.
But if there is no exact match, I should get the next 10 values from the table.
For example, if I enter WHERE MY_SEARCH_COLUMN = 123454321 and the column does not contain 123454321, then the query should return the next 10 values from the table that are greater than 123454321.
Both cases should be covered by a single SQL query, and I have to keep the query's performance in mind. I have already created an index on MY_SEARCH_COLUMN, so other suggestions to improve performance are welcome.
This could be tricky to do without using a proc or maybe some dynamic SQL, but we can try using ROW_NUMBER here:
WITH cte AS (
SELECT ID, MY_SEARCH_COLUMN,
ROW_NUMBER() OVER (ORDER BY MY_SEARCH_COLUMN) rn
FROM yourTable
WHERE MY_SEARCH_COLUMN >= 123454321
)
SELECT *
FROM cte
WHERE rn <= CASE WHEN EXISTS (SELECT 1 FROM yourTable WHERE MY_SEARCH_COLUMN = 123454321)
THEN 1
ELSE 10 END;
The basic idea of the above query is that we assign a row number to all records matching the target or greater. Then, we query using either a row number of 1, in case of an exact match, or all row numbers up to 10 in case of no match.
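The behavior of the ROW_NUMBER query can be checked on a small data set. The sketch below runs essentially the same query through Python's built-in sqlite3 module (window functions need SQLite 3.25 or later); the sample values and the search helper are made up for illustration, and it says nothing about performance on billions of rows.

```python
import sqlite3

QUERY = """
WITH cte AS (
    SELECT ID, MY_SEARCH_COLUMN,
           ROW_NUMBER() OVER (ORDER BY MY_SEARCH_COLUMN) AS rn
    FROM yourTable
    WHERE MY_SEARCH_COLUMN >= :target
)
SELECT MY_SEARCH_COLUMN
FROM cte
WHERE rn <= CASE WHEN EXISTS (SELECT 1 FROM yourTable
                              WHERE MY_SEARCH_COLUMN = :target)
                 THEN 1
                 ELSE 10 END
ORDER BY rn
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (ID INTEGER PRIMARY KEY, MY_SEARCH_COLUMN INTEGER)"
)
# Sample values 100, 110, ..., 390.
conn.executemany(
    "INSERT INTO yourTable (MY_SEARCH_COLUMN) VALUES (?)",
    [(v,) for v in range(100, 400, 10)],
)


def search(target):
    """Run the query for one target and return the matched column values."""
    return [row[0] for row in conn.execute(QUERY, {"target": target})]


exact = search(200)    # the value exists, so only that row comes back
nearest = search(205)  # no exact match, so the next ten larger values
print(exact, nearest)
```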
SELECT *
FROM your_table AS src
WHERE src.MY_SEARCH_COLUMN = CASE WHEN EXISTS (SELECT 1 FROM your_table AS src2 WITH(NOLOCK) WHERE src2.MY_SEARCH_COLUMN = 123454321)
THEN 123454321
ELSE src.MY_SEARCH_COLUMN
END

Left Join Not Joining with a Single Record

I have the following query:
Insert into cet_database.dbo.termData
(
termID,
studentID,
course,
[current],
program,
StbyCurrentClassID,
class,
classCode,
cancelled
)
Select
fm_stg.classByStudent_termData_assessmentData.termID,
fm_stg.classByStudent_termData_assessmentData.studentID,
fm_stg.classByStudent_termData_assessmentData.class_code,
case when fm_stg.classByStudent_termData_assessmentData.[current] = 'Yes' then 1 else 0 end,
fm_stg.classByStudent_termData_assessmentData.program,
fm_stg.classByStudent_termData_assessmentData.classByStudentID,
fm_stg.classByStudent_termData_assessmentData.class,
fm_stg.classByStudent_termData_assessmentData.classID,
case when fm_stg.classByStudent_termData_assessmentData.cancelled_flag = 1 then 1 else 0 end
From fm_stg.classByStudent_termData_assessmentData left outer join termData
On fm_stg.classByStudent_termData_assessmentData.class_code = termData.course
and fm_stg.classByStudent_termData_assessmentData.termID = termData.termID
and fm_stg.classByStudent_termData_assessmentData.studentID = fm_stg.classByStudent_termData_assessmentData.studentID
Where termData.StbyCurrentClassID is null
I use the query to import data into a staging table from another database (fm_stg.classByStudent_termData_assessmentData) before importing it into my database's tables. This particular query is part of a larger stored procedure that imports data into multiple tables related to termData.
When I run the sproc, I get the record inserted into fm_stg.classByStudent_termData_assessmentData but not into termData. I am only inserting one record when having this problem, but it works for the 10,000 records I did previously. I use the left join to establish what already exists in my database's table and what doesn't, then take the relevant records from the staging table. However, with this record:
316a, 39520, DEC 10, Yes, DEC10, 105713, DEC 10 (18), 6078, NULL, 2
The select returns nothing - why is this? The record definitely doesn't exist in my termData table and records insert into all my other tables from the staging table. The sproc is running all of the inserts in a transaction so as to avoid precisely this scenario where records are inserted in some tables and not others, but it doesn't seem to be working.
You say the query worked for the previous 10,000 records, but doesn't for the current one. The only thing that looks strange in your query is the third line in your ON clause where you compare a field (the studentID) with itself.
On fm_stg.classByStudent_termData_assessmentData.class_code = termData.course
and fm_stg.classByStudent_termData_assessmentData.termID = termData.termID
and fm_stg.classByStudent_termData_assessmentData.studentID = fm_stg.classByStudent_termData_assessmentData.studentID
I am just guessing here, but as this line is in the ON clause, did you want to compare the student ID, too? So it may be you were just lucky the query worked so far and now you stumble upon the student ID. I suppose the ON clause should look like this:
On fm_stg.classByStudent_termData_assessmentData.class_code = termData.course
and fm_stg.classByStudent_termData_assessmentData.termID = termData.termID
and fm_stg.classByStudent_termData_assessmentData.studentID = termData.studentID
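The effect of the self-comparison can be reproduced on two tiny tables. The following sketch uses Python's built-in sqlite3 module with simplified stand-ins for the staging table and termData; the second student (11111) is a hypothetical record, and the null check is done on t.studentID rather than StbyCurrentClassID to keep the tables small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (termID TEXT, studentID INTEGER, class_code TEXT)")
conn.execute("CREATE TABLE termData (termID TEXT, studentID INTEGER, course TEXT)")

# A different (hypothetical) student already has the same course and term.
conn.execute("INSERT INTO termData VALUES ('316a', 11111, 'DEC 10')")
# The record that refuses to import.
conn.execute("INSERT INTO staging VALUES ('316a', 39520, 'DEC 10')")

ANTI_JOIN = """
SELECT s.studentID
FROM staging s
LEFT OUTER JOIN termData t
  ON s.class_code = t.course
 AND s.termID = t.termID
 AND s.studentID = {rhs}
WHERE t.studentID IS NULL
"""

# Comparing the staging column with itself is always true, so the other
# student's row counts as a match and the new record is filtered out.
buggy = conn.execute(ANTI_JOIN.format(rhs="s.studentID")).fetchall()
# Comparing against termData finds no match, so the record is selected.
fixed = conn.execute(ANTI_JOIN.format(rhs="t.studentID")).fetchall()
print(buggy, fixed)
```

This would also explain why the earlier 10,000 records imported fine: the bug only shows up when termData already holds the same course and term for some other student.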
By the way, queries get more readable by using table aliases. In the following query I use ad for fm_stg.classByStudent_termData_assessmentData and td for termData:
Insert into cet_database.dbo.termData
(
termID,
studentID,
course,
[current],
program,
StbyCurrentClassID,
class,
classCode,
cancelled
)
Select
ad.termID,
ad.studentID,
ad.class_code,
case when ad.[current] = 'Yes' then 1 else 0 end,
ad.program,
ad.classByStudentID,
ad.class,
ad.classID,
case when ad.cancelled_flag = 1 then 1 else 0 end
From fm_stg.classByStudent_termData_assessmentData ad
Left Outer Join termData td On ad.class_code = td.course
And ad.termID = td.termID
And ad.studentID = td.studentID
Where td.StbyCurrentClassID is null;
Moreover, when checking for existence, why do you use the anti-join trick? Did you have issues with a straightforward NOT EXISTS? Use tricks only when really needed. The query reads better as follows:
Insert into cet_database.dbo.termData
(
termID,
studentID,
course,
[current],
program,
StbyCurrentClassID,
class,
classCode,
cancelled
)
Select
termID,
studentID,
class_code,
case when [current] = 'Yes' then 1 else 0 end,
program,
classByStudentID,
class,
classID,
case when cancelled_flag = 1 then 1 else 0 end
From fm_stg.classByStudent_termData_assessmentData ad
Where Not Exists
(
Select *
From termData td
Where ad.class_code = td.course
And ad.termID = td.termID
And ad.studentID = td.studentID
);
With another DBMS you could even have used NOT IN (i.e. Where (class_code, termID, studentID) Not In (Select ...)), which is not correlated, so a typo such as yours could not even have occurred. Unfortunately, SQL Server doesn't support tuples in the IN clause.

Randomly Select a Row with SQL in Access

I have a small Access database with some tables. I am trying the code in the SQL design view within Access. I just want to randomly select a record from a table.
I created a simple table called StateAbbreviation. It has two columns: ID and Abbreviation. ID is just an autonumber and Abbreviation holds the abbreviations for the states.
I saw this thread here. So I tried
SELECT Abbreviation
FROM STATEABBREVIATION
ORDER BY RAND()
LIMIT 1;
I get the error Syntax error (missing operator) in query expression RAND() LIMIT 1. So I tried RANDOM() instead of RAND(). Same error.
None of the others worked either. What am I doing wrong? Thanks.
Ypercude provided a link that led me to the right answer below:
SELECT TOP 1 ABBREVIATION
FROM STATEABBREVIATION
ORDER BY RND(ID);
Note that, I believe, the argument to RND() has to be an integer value or variable.
You need both a variable and a time seed to not get the same sequence(s) each time you open Access and run the query - and to use Access SQL in Access:
SELECT TOP 1 Abbreviation
FROM STATEABBREVIATION
ORDER BY Rnd(-Timer()*[ID]);
where ID is the primary key of the table.
Please try this; it may be helpful to you.
This is possible with a stored procedure that I wrote. It needs an extra column named FLAG in your table, with the value set to 0 in every row. Then it works:
create Procedure proc_randomprimarykeynumber
as
declare @Primarykeyid int
select top 1
    @Primarykeyid = u.ID
from
    StateAbbreviation u
left join
    StateAbbreviation v on u.ID = v.ID + 1
where
    v.flag = 1
if(@Primarykeyid is null )
begin
    UPDATE StateAbbreviation
    SET flag = 0
    UPDATE StateAbbreviation
    SET flag = 1
    WHERE ID IN (SELECT TOP 1 ID
                 FROM dbo.StateAbbreviation)
END
ELSE
BEGIN
    UPDATE StateAbbreviation
    SET flag = 0
    UPDATE StateAbbreviation
    SET flag = 1
    WHERE ID IN (@Primarykeyid)
END
SET @Primarykeyid = 1
SELECT TOP 1
    ID, Abbreviation
FROM
    StateAbbreviation
WHERE
    flag = 1
Run the stored procedure to get the primary keys one after another, in sequence:
exec proc_randomprimarykeynumber
Thanks and regards
Try this:
SELECT TOP 1 *
FROM tbl_name
ORDER BY NEWID()
Of course this may have performance considerations for large tables.