Generating a set of permutations in SQL Server without reverse duplicates - sql

Table Code:
Col1
----
A1
A2
A3
B1
B2
C1
D1
D2
(I have other columns as well)
I am trying to create every possible combination, excluding a value paired with itself (i.e. COL1: A1, COL2: A1) and excluding the same pair in reverse order (i.e. A1 A2 and A2 A1). The two values are to be in separate columns, and there are other columns included as well. I am a newbie, go easy on me :)
So far I have:
SELECT
a.Col1, a.[differentcolumn],
b.Col1, b.[differentcolumn]
FROM
[dbo].code a
CROSS JOIN
[dbo].code b
WHERE
a.[col1] != b.[col1]
This is almost it but it gives me:
A1 A2
A2 A1
I only want it one way (The first one). How do I do this?

I'm not completely clear on your requirement, but do you just need this?
SELECT
a.Col1, a.[differentcolumn],
b.Col1, b.[differentcolumn]
FROM
[dbo].code a
INNER JOIN [dbo].code b ON a.[col1] < b.[col1]
This will join the table to itself on col1, but using < means that you won't see the values where the left-hand copy has a col1 greater than or equal to the right-hand copy, which seems to be what you want.
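A minimal, self-contained sketch of that self-join on a subset of the sample values. The table variable @code is a hypothetical stand-in for [dbo].code, and the other columns are left out:

```sql
-- Hypothetical stand-in for [dbo].code, holding a subset of the sample values.
DECLARE @code TABLE (Col1 varchar(10) NOT NULL);
INSERT INTO @code (Col1) VALUES ('A1'), ('A2'), ('A3'), ('B1');

-- Each unordered pair appears exactly once, with the smaller value on the left.
SELECT a.Col1 AS left_col1, b.Col1 AS right_col1
FROM @code AS a
INNER JOIN @code AS b ON a.Col1 < b.Col1
ORDER BY a.Col1, b.Col1;
-- 4 values give 6 pairs: A1 A2, A1 A3, A1 B1, A2 A3, A2 B1, A3 B1
```

The comparison is a string comparison under the column's collation, so which value ends up on the left follows that ordering. If Col1 can contain duplicate values, pairs of identical values are dropped as well.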

Related

OracleSQL refer to columns just created

Is there a way to refer to columns created in the same SELECT?
SELECT 2*a as a2,
3*a2 as a6
FROM ...
I am aware that I could use a nested query, but with many variables created after each other this seems tedious.
You cannot refer to an alias in the same SELECT clause in which it is defined. This is standard SQL behavior, and it prevents ambiguities.
There are two options, as I also mentioned in the comment.
-- Use expression of the a2 in a6
SELECT 2*a as a2,
3*(2*a) as a6
FROM ...
-- OR use the sub-query
SELECT a2, 3*a2 as a6 FROM
(SELECT 2*a as a2
--3*a2 as a6
FROM ...)
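When many derived columns build on each other, the sub-query option can also be written as a chain of common table expressions, which keeps each step flat instead of nested. This is only a sketch, assuming an Oracle database; the CTE names base, step1 and step2 and the input value 5 are made up for illustration:

```sql
-- Hypothetical input: a single row with a = 5, selected from Oracle's DUAL table.
WITH base AS (
    SELECT 5 AS a FROM dual
),
step1 AS (
    -- a2 is defined here ...
    SELECT a, 2 * a AS a2 FROM base
),
step2 AS (
    -- ... and can be referenced by name in the next step.
    SELECT a, a2, 3 * a2 AS a6 FROM step1
)
SELECT a, a2, a6 FROM step2;
-- a = 5, a2 = 10, a6 = 30
```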

Union of big SQL queries vs multiple database calls and merging in the application

Currently we have two batch jobs which fetch different but functionally related information.
Say Job 1 fetches details D1 and D2
Job 2 fetches D3
We are going to merge these two jobs into one, such that it retrieves D1, D2 and D3 and writes to a single file.
One performance challenge in merging is that D1 contains most of the information from D3. In the merged job we want to exclude the D3 information while fetching D1. I am considering the options below. Please advise which one is better, or whether there is a better alternative.
1. Merge the data in the application.
1.1 Application executes query for D3 information and stores key values in a Set
1.2 Application executes query for D1 and D2
1.3 While writing D1 information into the file, it checks the Set and excludes the record if its key exists there.
2. Use SQL UNION and create a single query
fetch D1, D2 where key not in (fetch All keys for D3)
UNION
fetch D3
Which one will be more efficient, considering huge tables and joins?
As with any performance issue, you should test the different approaches to see what works in your environment.
My bias is to do all the work in the database. The database can marshal more resources for this type of work:
with d1 as (. . .),
d2 as (. . .),
d3 as (. . .)
select d3.*
from d3
union all
select d1.*
from d1
where not exists (select 1 from d3 where d1.key = d3.key)
union all
select d2.*
from d2
where not exists (select 1 from d3 where d2.key = d3.key);
This assumes there are no duplicates within each data source and none between d1 and d2.
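A concrete sketch of the same pattern on toy data. The three tables, the id and detail columns, and the sample rows are all hypothetical; in the real job each table would be the corresponding query:

```sql
-- Hypothetical sample tables standing in for the three data sources.
CREATE TABLE d1 (id INTEGER PRIMARY KEY, detail VARCHAR(20));
CREATE TABLE d2 (id INTEGER PRIMARY KEY, detail VARCHAR(20));
CREATE TABLE d3 (id INTEGER PRIMARY KEY, detail VARCHAR(20));

INSERT INTO d1 VALUES (1, 'd1-only');
INSERT INTO d1 VALUES (2, 'also in d3');
INSERT INTO d2 VALUES (3, 'd2-only');
INSERT INTO d3 VALUES (2, 'from d3');

-- Everything from d3, plus the d1 and d2 rows whose key is not in d3.
SELECT d3.id, d3.detail FROM d3
UNION ALL
SELECT d1.id, d1.detail FROM d1
WHERE NOT EXISTS (SELECT 1 FROM d3 WHERE d3.id = d1.id)
UNION ALL
SELECT d2.id, d2.detail FROM d2
WHERE NOT EXISTS (SELECT 1 FROM d3 WHERE d3.id = d2.id);
-- Rows: (2, 'from d3'), (1, 'd1-only'), (3, 'd2-only'), in no guaranteed order.
```

UNION ALL skips the duplicate-elimination step that plain UNION performs, which is why the no-duplicates assumption matters.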

How to compare table records for an entire table in a SQL SELECT statement?

I need to compare an attribute value of row 1 with row 2, row 2 with row 3, row 3 with row 4 and so on for the entire table. The table data set is small after applying a condition in the WHERE clause.
The comparison part of it involves joins to other tables (which I can manage).
ID PROJECT VALUE
B1 PRJ001 100
B2 PRJ001 200
B3 PRJ001 200
B4 PRJ001 300
.....
.....
B9 PRJ001 600
In the example above, I need to compare B1 to B2, B2 to B3 ... B8 to B9 and count the number of times the VALUES don't match.
Any help with this would be much appreciated.
Thank you.
You need the LEAD function:
Select t.*,
LEAD(value,1) over (order by id) as next_value
From your_table t;
Also, just to be clear, I hope this is just an example, because if you add more ids such as 'B10', the ORDER BY would sort B10 between B1 and B2.
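To get the count the question asks for, the LEAD query can be wrapped in a sub-query and filtered. This is a sketch on the first four sample rows; the table definition is assumed, since the question does not show one:

```sql
-- Hypothetical sample table using the first four rows from the question.
CREATE TABLE your_table (id VARCHAR(10), project VARCHAR(10), value INTEGER);
INSERT INTO your_table VALUES ('B1', 'PRJ001', 100);
INSERT INTO your_table VALUES ('B2', 'PRJ001', 200);
INSERT INTO your_table VALUES ('B3', 'PRJ001', 200);
INSERT INTO your_table VALUES ('B4', 'PRJ001', 300);

-- Pair each row with its successor, then count the pairs whose values differ.
-- The last row has no successor (next_value is NULL), so it is not counted.
SELECT COUNT(*) AS mismatches
FROM (
    SELECT value,
           LEAD(value, 1) OVER (ORDER BY id) AS next_value
    FROM your_table
) t
WHERE value <> next_value;
-- mismatches = 2 (B1 vs B2 and B3 vs B4)
```

Rows where value itself is NULL are skipped by the `<>` comparison too, so they would need separate handling if they can occur.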

Unique aggregate function when singular value is guaranteed by the WHERE clause

Given the following:
CREATE TABLE A (A1 INTEGER, A2 INTEGER, A3 INTEGER);
INSERT INTO A(A1, A2, A3) VALUES (1, 1, 1);
INSERT INTO A(A1, A2, A3) VALUES (2, 1, 1);
I want to select the maximum A1 given specific A2 and A3 values, and have those values (A2 and A3) also appear in the returned row (e.g. so that I may use them in a join since the SELECT below is meant for a sub-query).
It would seem logical to be able to do the following, given that A2 and A3 are hardcoded in the WHERE clause:
SELECT MAX(A1) AS A1, A2, A3 FROM A WHERE A2=1 AND A3=1
However, PostgreSQL (and I suspect other RDBMSs as well) balks at that and requests an aggregate function for A2 and A3 even though their values are fixed. So instead, I either have to do a:
SELECT MAX(A1) AS A1, MAX(A2), MAX(A3) FROM A WHERE A2=1 AND A3=1
or a:
SELECT MAX(A1) AS A1, 1, 1 FROM A WHERE A2=1 AND A3=1
I don't like the first alternative because I could have used MIN instead and it would still work, whereas the second alternative doubles the number of positional parameters I have to supply values for when the query is used from a programming language interface. Ideally I would have wanted a UNIQUE aggregate function that asserts that all values are equal and returns that single value, or even a RANDOM aggregate function that returns one value at random (since I know from the WHERE clause that they are all equal).
Is there an idiomatic way to write the above in PostgreSQL?
Even simpler, you only need ORDER BY / LIMIT 1
SELECT a1, a2, a3 -- add more columns as you please
FROM a
WHERE a2 = 1 AND a3 = 1
ORDER BY 1 DESC -- 1 is just a positional reference (syntax shorthand)
LIMIT 1;
LIMIT is not part of the SQL standard, although Postgres and several other databases support it.
The SQL standard would be:
...
FETCH FIRST 1 ROWS ONLY
My first answer with DISTINCT ON was for the more complex case where you'd want to retrieve the maximum a1 per various combinations of (a2,a3)
Aside: I am using lower case identifiers for a reason.
How about GROUP BY?
select
a2
,a3
,MAX(a1) as maximumVal
from a
group by a2, a3
Does this work for you?
select max(A1),A2,A3 from A GROUP BY A2,A3;
EDIT
select A1,A2,A3 from A where A1=(select max(A1) from A ) limit 1
A standard trick to obtain the maximal row without an aggregate function is to guarantee the absence of a larger value by means of a NOT EXISTS subquery. (This does not work when there are ties, but neither would the subquery with the max.) When needed, it would not be too difficult to add a tie-breaker condition.
Another solution would be a subquery with a window function such as row_number() or rank().
SELECT *
FROM a src
WHERE NOT EXISTS ( SELECT * FROM a nx
WHERE nx.a2 = src.a2
AND nx.a3 = src.a3
AND nx.a1 > src.a1
);
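The window-function alternative mentioned above could look roughly like this. It is a sketch that reuses the table and the two sample rows from the question; the alias names ranked and rn are arbitrary:

```sql
-- Table and sample rows as given in the question.
CREATE TABLE a (a1 INTEGER, a2 INTEGER, a3 INTEGER);
INSERT INTO a (a1, a2, a3) VALUES (1, 1, 1);
INSERT INTO a (a1, a2, a3) VALUES (2, 1, 1);

-- Number the rows within each (a2, a3) group by a1, largest first, and keep number 1.
SELECT a1, a2, a3
FROM (
    SELECT a1, a2, a3,
           row_number() OVER (PARTITION BY a2, a3 ORDER BY a1 DESC) AS rn
    FROM a
) ranked
WHERE rn = 1;
-- With the two sample rows this returns (2, 1, 1).
```

row_number() always keeps exactly one row per group, picking arbitrarily among ties; rank() would keep every row tied for the maximum.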

how to get the comma separated values of the column stored in the Sql server

How do I split comma-separated values stored in a SQL Server database column into individual values?
E.g. in the database, the column stores comma-separated values as shown below:
EligibleGroup
A11,A12,A13
B11,B12,B13
I need to get
EligibleGroup
A11
A12
A13
B11
B12
...
I have written a query that will fetch me some list of employees with employee name and eligible group
XXX A11
YYY B11
ZZZ C11
I need to check whether each employee's (XXX, YYY, ZZZ) eligible group falls within this:
EligibleGroup
A11,A12,A13
B11,B12,B13
and return only those rows.
Use a user-defined function like the one shown here (including source code). It returns the split values as a table (one row per value) that you can select from, like this:
select txt_value from dbo.fn_ParseText2Table('A11,A12,A13')
returns
A11
A12
A13
You could use a subquery:
SELECT employee_name, eligible_group
FROM YourTable
WHERE eligible_group IN
(SELECT SPLIT(EligibleGroup)
FROM tblEligibleGroup
WHERE <some conditions here>)
I don't believe the "SPLIT" function exists in SQL Server so you'll have to either create a user defined function to handle that, or you could use the nifty workaround suggested here: How do I split a string so I can access item x?
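On SQL Server 2016 or later (database compatibility level 130 or higher), the built-in STRING_SPLIT table-valued function can take the place of a user-defined splitter. The sketch below assumes that version; the table variables and the employee rows are hypothetical stand-ins built from the question's sample data:

```sql
-- Hypothetical stand-ins for the tables in the question.
DECLARE @tblEligibleGroup TABLE (EligibleGroup varchar(100));
INSERT INTO @tblEligibleGroup VALUES ('A11,A12,A13'), ('B11,B12,B13');

DECLARE @employees TABLE (employee_name varchar(50), eligible_group varchar(10));
INSERT INTO @employees VALUES ('XXX', 'A11'), ('YYY', 'B11'), ('ZZZ', 'C11');

-- STRING_SPLIT returns one row per list element, in a column named value.
SELECT e.employee_name, e.eligible_group
FROM @employees AS e
WHERE e.eligible_group IN (
    SELECT s.value
    FROM @tblEligibleGroup AS g
    CROSS APPLY STRING_SPLIT(g.EligibleGroup, ',') AS s
);
-- Returns XXX and YYY; ZZZ is dropped because C11 is in neither list.
```

STRING_SPLIT does not trim whitespace, so lists stored as 'A11, A12' would need a TRIM or LTRIM/RTRIM around s.value.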
I think you can do it this way,
select left('A11,A12,A13',3) + SUBSTRING('A11,A12,A13',charindex(',','A11,A12,A13'),10)
I think you may not have to split EligibleGroup. You can do another way by just:
select empId
from yourTempEmpTable t1, EligibleGroup t2
where t2.elibigle like '%'+t1.elibigle+'%'
I think it should work.
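One caveat with a plain '%...%' pattern is that it also matches partial elements, for example 'B1' inside 'B11'. A possible refinement, sketched here with hypothetical table variables and sample rows, is to wrap both sides in the delimiter so that only whole elements match:

```sql
-- Hypothetical stand-ins for the two tables.
DECLARE @groups TABLE (EligibleGroup varchar(100));
INSERT INTO @groups VALUES ('A11,A12,A13'), ('B11,B12,B13');

DECLARE @emp TABLE (employee_name varchar(50), eligible_group varchar(10));
INSERT INTO @emp VALUES ('XXX', 'A11'), ('YYY', 'B1'), ('ZZZ', 'C11');

-- Wrapping both sides in commas restricts the match to whole elements,
-- so 'B1' does not match 'B11'.
SELECT e.employee_name, e.eligible_group
FROM @emp AS e
WHERE EXISTS (
    SELECT 1
    FROM @groups AS g
    WHERE ',' + g.EligibleGroup + ',' LIKE '%,' + e.eligible_group + ',%'
);
-- Returns only XXX.
```

This still treats %, _ and [ in eligible_group as LIKE wildcards, so it is only safe when the group codes are plain alphanumerics.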
Assuming that EligibleGroup holds fixed-length data, you can try using SUBSTRING as follows:
select substring(EligibleGroup,1,3) from #test union all
select substring(EligibleGroup,5,3) from #test union all
select substring(EligibleGroup,9,3) from #test
This will return:
A11
A12
A13
B11
B12
...
You can try it in Data Explorer
And if you need to check which EligibleGroup an employee falls into, try this:
Select EligibleGroup from #test where EligibleGroup like '%A11%'