In SQL, can I reference multiple columns in a single WHERE clause?

Suppose I've got a table called SAMPLE_TABLE, and its columns include USERCODE1, USERCODE2, ... USERCODE6, and that I need to run a query that excludes any rows where any of these USERCODEs = 25. I know I can write a query like this:
SELECT * FROM SAMPLE_TABLE
WHERE USERCODE1 <> 25 AND USERCODE2 <> 25 AND USERCODE3 <> 25
AND USERCODE4 <> 25 AND USERCODE5 <> 25 AND USERCODE6 <> 25
Is there a way to group all those USERCODE columns together in the WHERE clause without all the ANDs? Something along these lines:
SELECT * FROM SAMPLE_TABLE
WHERE (USERCODE1, USERCODE2, USERCODE3, USERCODE4, USERCODE5, USERCODE6) <> 25
I'm simplifying this -- there are actually 40 USERCODE columns in the real data set, which is why I'm looking for something more concise. Any thoughts? Thanks!
EDIT: Right after I posted this, I came up with something that works, although it's a bit clunky:
SELECT *
FROM SAMPLE_TABLE
WHERE CONCAT_WS('-', USERCODE1, USERCODE2, USERCODE3, USERCODE4, USERCODE5, USERCODE6) NOT LIKE '%25%'

You don't mention the specific database, so I'll assume it's PostgreSQL. You can use 25 NOT IN (col1, col2, ...) as in:
create table t (a int, b int, c int, d int, e int, f int);
-- sample rows (chosen to match the result below); the last row contains a 25 and is filtered out
insert into t values (1, 2, 3, 4, 5, 6), (11, 12, 13, 14, 15, 16), (21, 22, 23, 24, 25, 26);
select * from t where 25 not in (a, b, c, d, e, f);
Result:
a b c d e f
--- --- --- --- --- --
1 2 3 4 5 6
11 12 13 14 15 16
See running example at db<>fiddle.

There is no straightforward way to group all the columns together in a single comparison the way you're describing, but you can simplify your query by using an INNER JOIN with a subquery to exclude any rows where any of the 40 USERCODE columns has a value of 25.
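A minimal sketch of that idea, assuming PostgreSQL (as in the answer above) and a primary-key column id that is not in the original question: the USERCODE columns are unpivoted into rows with a LATERAL VALUES list, and only the ids whose codes never include 25 are joined back.
-- hypothetical sketch; "id" is an assumed primary key
SELECT st.*
FROM SAMPLE_TABLE st
INNER JOIN (
    SELECT s.id
    FROM SAMPLE_TABLE s
    CROSS JOIN LATERAL (VALUES
        (s.USERCODE1), (s.USERCODE2), (s.USERCODE3),
        (s.USERCODE4), (s.USERCODE5), (s.USERCODE6)
    ) AS v(code)
    GROUP BY s.id
    -- keep only ids with no 25 among their codes
    HAVING COUNT(*) FILTER (WHERE v.code = 25) = 0
) ok ON ok.id = st.id;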

Aggregate column text where dates in table a are between dates in table b

Sample data
CREATE TEMP TABLE a AS
SELECT id, adate::date, name
FROM ( VALUES
(1,'1/1/1900','test'),
(1,'3/1/1900','testing'),
(1,'4/1/1900','testinganother'),
(1,'6/1/1900','superbtest'),
(2,'1/1/1900','thebesttest'),
(2,'3/1/1900','suchtest'),
(2,'4/1/1900','test2'),
(2,'6/1/1900','test3'),
(2,'7/1/1900','test4')
) AS t(id,adate,name);
CREATE TEMP TABLE b AS
SELECT id, bdate::date, score
FROM ( VALUES
(1,'12/31/1899', 7 ),
(1,'4/1/1900' , 45),
(2,'12/31/1899', 19),
(2,'5/1/1900' , 29),
(2,'8/1/1900' , 14)
) AS t(id,bdate,score);
What I want
What I need to do is aggregate column text from table a where the id matches table b and the date from table a is between the two closest dates from table b. Desired output:
id date score textagg
1 12/31/1899 7 test, testing
1 4/1/1900 45 testinganother, superbtest
2 12/31/1899 19 thebesttest, suchtest, test2
2 5/1/1900 29 test3, test4
2 8/1/1900 14
My thoughts are to do something like this:
create table date_join
select a.id, string_agg(a.text, ','), b.*
from tablea a
left join tableb b
on a.id = b.id
having a.date between b.date and b.date;
but I am really struggling with the last line, figuring out how to aggregate only where the date in table a is between the closest two dates in table b. Any guidance is much appreciated.
I can't promise it's the best way to do it, but this is a way to do it.
with b_values as (
  select
    id, bdate as from_date, score,
    lead (bdate, 1, '3000-01-01')
      over (partition by id order by bdate) - 1 as thru_date
  from b
)
select
  bv.id, bv.from_date, bv.score,
  string_agg (a.name, ',')
from
  b_values as bv
  left join a on
    a.id = bv.id and
    a.adate between bv.from_date and bv.thru_date
group by
  bv.id, bv.from_date, bv.score
order by
  bv.id, bv.from_date
I'm presupposing you will never have a date in your table greater than 12/31/2999, so if you're still running this query after that date, please accept my apologies.
Here is the output I got when I ran this:
id from_date score string_agg
1 0 7 test,testing
1 92 45 testinganother,superbtest
2 0 19 thebesttest,suchtest,test2
2 122 29 test3,test4
2 214 14
I might also note that a BETWEEN in a join is a performance killer. If you have large data volumes, there might be better ideas on how to approach this, but that depends largely on what your actual data looks like.

SQL: how to compare values from two tables and report per-row results

I have two Tables.
table A
id name Size
===================
1 Apple 7
2 Orange 15
3 Banana 22
4 Kiwi 2
5 Melon 28
6 Peach 9
And Table B
id size
==============
1 14
2 5
3 31
4 9
5 1
6 16
7 7
8 25
My desired result is Table A with one added column: the number of rows in Table B whose size is smaller than the Size in Table A.
id name Size Num.smaller.in.B
==============================
1 Apple 7 2
2 Orange 15 5
3 Banana 22 6
4 Kiwi 2 1
5 Melon 28 7
6 Peach 9 3
Both Table A and Table B are pretty huge. Is there a clever way of doing this? Thanks!
Use this query; the Size in the subquery has to be qualified with TableA so it doesn't resolve to TableB's own size column:
SELECT id,
       name,
       Size,
       (SELECT COUNT(*) FROM TableB WHERE TableB.size < TableA.Size) AS Num_smaller_in_B
FROM TableA
The standard way to get your result involves a non-equi-join, which will be a product join in Explain: first the 20,000 rows are duplicated, followed by 7,000,000 * 20,000 comparisons and a huge intermediate spool before the count.
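For reference, a minimal sketch of that non-equi-join, using the table and column names from the question (the LEFT JOIN keeps rows of Table A that have no smaller size in Table B):
SELECT a.id,
       a.name,
       a.Size,
       COUNT(b.size) AS Num_smaller_in_B
FROM TableA a
LEFT JOIN TableB b
  ON b.size < a.Size
GROUP BY a.id, a.name, a.Size;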
There's a solution based on OLAP (window) functions which is usually quite efficient:
SELECT dt.*,
-- Do a cumulative count of the rows of table #2
-- sorted by size, i.e. count number of rows with a size #2 less size #1
Sum(CASE WHEN NAME = '' THEN 1 ELSE 0 end)
Over (ORDER BY SIZE, NAME DESC ROWS Unbounded Preceding)
FROM
( -- mix the rows of both tables, an empty name indicates rows from table #2
SELECT id, name, size
FROM a
UNION ALL
SELECT id, '', size
FROM b
) AS dt
-- only return the rows of table #1
QUALIFY name <> ''
If there are multiple rows with the same size in table #2, it's better to count before the UNION to reduce the intermediate size:
SELECT dt.*,
-- Do a cumulative sum of the counts of table #2
-- sorted by size, i.e. count number of rows with a size #2 less size #1
Sum(CASE WHEN NAME ='' THEN id ELSE 0 end)
Over (ORDER BY SIZE, NAME DESC ROWS Unbounded Preceding)
FROM
( -- mix the rows of both tables, an empty name indicates rows from table #2
SELECT id, name, size
FROM a
UNION ALL
SELECT Count(*), '', SIZE
FROM b
GROUP BY SIZE
) AS dt
-- only return the rows of table #1
QUALIFY NAME <> ''
There is no clever way of doing that; you just need to join the tables like this:
select a.*, b.size
from TableA a join TableB b on a.id = b.id
To improve performance you'll need to have indexes on the id columns.
Maybe this, if you're working with large tables:
select
id,
name,
a.Size,
sum(cnt) as sum_cnt
from
a inner join
(select size, count(*) as cnt from b group by size) s on
s.size < a.size
group by id,name,a.size
Indexing table b's size field could help. I'm also assuming the values in table B converge, i.e. there are many duplicates you don't care about other than counting them.
sqlfiddle
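In case it's useful, the index mentioned above would be something along these lines (a sketch; the index name is made up):
CREATE INDEX idx_b_size ON b (size);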
@Ritesh's solution is perfectly correct; another, similar solution uses CROSS APPLY, as shown below:
use tempdb
create table dbo.A (id int identity, name varchar(30), size int );
create table dbo.B (id int identity, size int);
go
insert into dbo.A (name, size)
values ('Apple', 7)
,('Orange', 15)
,('Banana', 22)
,('Kiwi', 2 )
,('Melon', 28)
,('Peach', 9 )
insert into dbo.B (size)
values (14), (5),(31),(9),(1),(16), (7),(25)
go
-- using cross join
select a.*, t.cnt
from dbo.A
cross apply (select cnt=count(*) from dbo.B where B.size < A.size) T(cnt)
Try this query:
SELECT
    A.id, A.name, A.size, COUNT(B.size) AS num_smaller_in_B
FROM A, B
WHERE A.size > B.size
GROUP BY A.id, A.name, A.size
ORDER BY A.id;

Using a set in the place of a table (or another elegant solution)

I answered a question where I had to generate a temporary derived table on the fly (or use an actual table); see https://stackoverflow.com/a/24890815/1688441.
Instead of using the following derived table (using select and union):
(SELECT 21 AS id UNION SELECT 22) AS tmp
within:
SELECT GROUP_CONCAT(CASE WHEN COLUMN1 IS NULL THEN "NULL" ELSE COLUMN1 END)
FROM archive
RIGHT OUTER JOIN
(SELECT 21 AS id UNION SELECT 22) AS tmp ON tmp.id=archive.column2;
I would much prefer to be able to use something much more elegant such as:
([[21],[22]]) AS tmp
Is there any such notation in any of the SQL databases, or any similar feature? Is there an easy way to use a set in place of a table in FROM (when I say set, I mean a one-dimensional list of values), as we do with IN?
So, using such a notation a temporary table with 1 int column, and 1 string column having 2 rows would have:
([[21,'text here'],[22,'text here2']]) AS tmp
SQL Server allows this syntax:
SELECT A, B, C,
CASE WHEN D < 21 THEN ' 0-20'
WHEN D < 51 THEN '21-50'
WHEN D < 101 THEN '51-100'
ELSE '>101' END AS E
,COUNT(*) as "Count"
FROM (
values ('CAR', 1,2,22)
,('CAR', 1,2,23)
,('BIKE',1,3,2)
)TABLE_X(A,B,C,D)
GROUP BY A, B, C,
CASE WHEN D < 21 THEN ' 0-20'
WHEN D < 51 THEN '21-50'
WHEN D < 101 THEN '51-100'
ELSE '>101' END
yielding this:
A B C E Count
---- ----------- ----------- ------ -----------
BIKE 1 3 0-20 1
CAR 1 2 21-50 2
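To tie this back to the original query: in PostgreSQL (and standard SQL) the UNION-based derived table can be replaced by a bare VALUES list. A sketch, using string_agg as PostgreSQL's counterpart of GROUP_CONCAT (older MySQL versions do not accept a VALUES list as a derived table):
-- tmp LEFT JOIN archive is equivalent to the original archive RIGHT OUTER JOIN tmp
SELECT string_agg(COALESCE(COLUMN1::text, 'NULL'), ',')
FROM (VALUES (21), (22)) AS tmp(id)
LEFT JOIN archive ON archive.column2 = tmp.id;
The two-column case from the question works the same way, e.g. (VALUES (21, 'text here'), (22, 'text here2')) AS tmp(id, txt).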

I have a table Trn_Clear with columns [SplitUpKey], [Clr_Key], [FeeComponent], [Amt]

In this table, the FeeComponent column contains three kinds of values, say A, B, and C, and the Amt column holds the price:
A 40
A 20
B 30
B 20
B 40
C 60
Now I want the result in the form of a table with columns
A,B,C
showing the amount per FeeComponent. It's all in a single table.
I tried something like:
Select
(Select Amt from Trn_Clear where Amt='A') As 'A1',
(Select Amt From Trn_Clear where Amt='B') As 'B1'
From Trn_Clear
I think what you want to do is pivot your table; in that case, if you are using SQL Server 2005 or later, you could try something like this:
SELECT *
FROM (SELECT FeeComponent, Amt FROM Trn_Clear) AS S
PIVOT(SUM(Amt) FOR FeeComponent IN ([A],[B],[C])) AS PT
In this case I used SUM as the aggregation function, but you could use whichever one you need (AVG, MIN, MAX). Hope this helps.
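For the sample values above, the SUM version should return a single row along these lines (my own arithmetic, not part of the original answer):
A    B    C
---- ---- ----
60   90   60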

Counting the rows of multiple distinct columns

I'm trying to count the number of distinct combinations of values in the columns "a" and "b" in my Sybase ISQL 9 database.
What I mean is, the following dataset should produce the answer "4":
a b
1 9
2 9
3 8
3 7
2 9
3 7
Something like the following syntax would be nice:
SELECT COUNT(DISTINCT a, b) FROM MyTable
But this doesn't work.
I do have a solution:
SELECT COUNT(*) FROM
(SELECT a, b
FROM MyTable
WHERE c = 'foo'
GROUP BY a, b) SubTable
But I was wondering if there is a neater way of constructing this query?
How about:
SELECT COUNT(*)
FROM (SELECT DISTINCT a, b FROM MyTable) AS dt
For more information on why this can't be done in a simpler way (besides concatenating strings, as noted in a different answer), you can refer to this Google Answers post: Sql Distinct Count.
You could concatenate a and b together into one string like this (T-SQL; hopefully something very similar exists in Sybase):
SELECT COUNT(DISTINCT(STR(a) + ',' + STR(b)))
FROM #YourTable