How to join to a table using aliases - sql

Thanks in advance for any help.
I'm wondering how to join two SQL statements I have, based on the fact that I've given the columns aliases.
The source table looks like this (please bear in mind that changing the layout of this table is enormously expensive due to other requirements):
ColumnID RowIdx Val
1 0 "Mr A"
1 1 "Mr B"
1 2 "Mr C"
2 0 "M40 2TB"
2 1 "G23 XYN"
2 2 "HJ 23N"
I use two sql statements to create views on the table, which I think join. This looks something like this:
-- Statement #1
CREATE TEMP TABLE nameTable AS (SELECT
max(CASE WHEN ColumnID=1 THEN Val ELSE null END) AS Name,
row_number() over (ORDER BY RowIdx) AS rowindex
FROM myTable
WHERE (ColumnID=1)
GROUP BY RowIdx
ORDER BY RowIdx);
-- Statement #2
CREATE TEMP TABLE postcodeTable AS (SELECT
max(CASE WHEN ColumnID=2 THEN Val ELSE null END) AS PostCode,
row_number() over (ORDER BY RowIdx) AS rowindex
FROM myTable
WHERE (ColumnID=2)
GROUP BY RowIdx
ORDER BY RowIdx);
-- Statement #3
SELECT * FROM nameTable
INNER JOIN postcodeTable ON nameTable.rowIndex=postcodeTable.rowIndex
The result is the following:
Name PostCode RowIndex
Mr A M40 2TB 0
Mr B G23 XYN 1
Mr C HJ 23N 2
I would like to combine these into a single statement, as it makes generating the statement simpler for my program.
I have chosen to omit some logic from statements 1 and 2 as it is needlessly complicated for my example, but what I have left out prohibits me doing this:
SELECT
max(CASE WHEN ColumnID=1 THEN Val ELSE null END) AS Name,
max(CASE WHEN ColumnID=2 THEN Val ELSE null END) AS PostCode,
row_number() over (ORDER BY RowIdx) AS rowindex
FROM myTable
WHERE (ColumnID=1 OR ColumnID=2)
GROUP BY RowIdx
ORDER BY RowIdx
I'm now considering doing something like this, but I'm not sure if it's possible:
SELECT
max(CASE WHEN ColumnID=1 THEN Val ELSE null END) AS Name,
row_number() over (ORDER BY RowIdx) AS rowindex
FROM myTable
INNER JOIN (SELECT
max(CASE WHEN ColumnID=2 THEN Val ELSE null END) AS PostCode,
row_number() over (ORDER BY RowIdx) AS rowindex
FROM myTable
WHERE (ColumnID=2)
GROUP BY RowIdx
ORDER BY RowIdx) AS postcodeTable ON rowIndex=postcodeTable.rowIndex
WHERE (ColumnID=1)
GROUP BY RowIdx
ORDER BY RowIdx);
Is this something I should even be considering? Or should I just stick with temporary tables? If it is, how do I get this to work? The compile error is on rowIndex=postcodeTable.rowIndex: rowIndex doesn't exist.
I know the original schema is formatted weirdly, but there are all sorts of external reasons for that. Please let me know if you need any more info.

I may be missing something, but I don't understand the need for the complexity in your example queries. Would something like the following (untested) query work?
SELECT Name, PostCode, A.RowIdx AS RowIdx
FROM
(SELECT Val AS Name, RowIdx FROM myTable WHERE ColumnID=1) A
INNER JOIN
(SELECT Val AS PostCode, RowIdx FROM myTable WHERE ColumnID=2) B
ON A.RowIdx = B.RowIdx;
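As a quick check of the join above, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module and runs the same derived-table join. SQLite and the Python harness are assumptions made for the demonstration, not the asker's actual database. The original attempt most likely failed because rowindex is an alias defined in the outer select list, and such an alias is not visible in that same query's ON clause; joining two derived tables on the real RowIdx column sidesteps that.

```python
import sqlite3

# In-memory SQLite stands in for the real database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (ColumnID INTEGER, RowIdx INTEGER, Val TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [
        (1, 0, "Mr A"), (1, 1, "Mr B"), (1, 2, "Mr C"),
        (2, 0, "M40 2TB"), (2, 1, "G23 XYN"), (2, 2, "HJ 23N"),
    ],
)

# Each derived table filters one ColumnID and renames Val;
# the join key is the real RowIdx column, not a select-list alias.
join_rows = conn.execute(
    """
    SELECT A.Name, B.PostCode, A.RowIdx
    FROM (SELECT Val AS Name, RowIdx FROM myTable WHERE ColumnID = 1) AS A
    INNER JOIN (SELECT Val AS PostCode, RowIdx FROM myTable WHERE ColumnID = 2) AS B
        ON A.RowIdx = B.RowIdx
    ORDER BY A.RowIdx
    """
).fetchall()

for row in join_rows:
    print(row)
```

Any extra logic that was left out of statements 1 and 2 could, in principle, live inside the two derived tables, as long as each one still exposes RowIdx.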

Related

SQL Joined Tables - Multiple rows on joined table per 'on' matched field merged into one row?

I have two tables I am pulling data from. Here is a minimal recreation of what I have:
Select
Jobs.Job_Number,
Jobs.Total_Amount,
Job_Charges.Charge_Code,
Job_Charges.Charge_Amount
From
DB.Jobs
Inner Join
DB.Job_Charges
On
Jobs.Job_Number = Job_Charges.Job_Number;
So, what happens is that I end up getting a row for each different Charge_Code and Charge_Amount per Job_Number. Everything else on the row is the same. Is it possible to have it return something more like:
Job_Number - Total_Amount - Charge_Code[1] - Charge_Amount[1] - Charge_Code[2] - Charge_Amount[2]
ETC?
This way it creates one line per job number with each associated charge and amount on the same line. I have been reading through W3 but haven't been able to tell definitively if this is possible or not. Anything helps, thank you!
To pivot your resultset over a fixed number of columns, you can use row_number() and conditional aggregation:
select
job_number,
total_amount,
max(case when rn = 1 then charge_code end) charge_code1,
max(case when rn = 1 then charge_amount end) charge_amount1,
max(case when rn = 2 then charge_code end) charge_code2,
max(case when rn = 2 then charge_amount end) charge_amount2,
max(case when rn = 3 then charge_code end) charge_code3,
max(case when rn = 3 then charge_amount end) charge_amount3
from (
select
j.job_number,
j.total_amount,
c.charge_code,
c.charge_amount,
row_number() over(partition by j.job_number, j.total_amount order by c.charge_code) rn
from DB.Jobs j
inner join DB.Job_Charges c on j.job_number = c.job_number
) t
group by job_number, total_amount
The above query handles up to 3 charge codes and amounts per job number (ordered by charge code). You can expand the select clause with more max(case ...) expressions to handle more of them.
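To see the row_number() plus conditional aggregation pattern run end to end, here is a small sketch using Python's sqlite3 module. The table contents are invented for illustration, the DB. schema prefix is dropped, and only two charge slots are generated to keep it short. It assumes a SQLite build with window function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Jobs (job_number INTEGER, total_amount REAL)")
conn.execute(
    "CREATE TABLE Job_Charges (job_number INTEGER, charge_code TEXT, charge_amount REAL)"
)
# Hypothetical sample data: job 100 has two charges, job 200 has one.
conn.executemany("INSERT INTO Jobs VALUES (?, ?)", [(100, 250.0), (200, 80.0)])
conn.executemany(
    "INSERT INTO Job_Charges VALUES (?, ?, ?)",
    [(100, "FUEL", 50.0), (100, "LABOR", 200.0), (200, "FUEL", 80.0)],
)

# Number the charges within each job, then fold each position into its own column.
pivot_rows = conn.execute(
    """
    SELECT job_number,
           total_amount,
           MAX(CASE WHEN rn = 1 THEN charge_code END)   AS charge_code1,
           MAX(CASE WHEN rn = 1 THEN charge_amount END) AS charge_amount1,
           MAX(CASE WHEN rn = 2 THEN charge_code END)   AS charge_code2,
           MAX(CASE WHEN rn = 2 THEN charge_amount END) AS charge_amount2
    FROM (
        SELECT j.job_number, j.total_amount, c.charge_code, c.charge_amount,
               ROW_NUMBER() OVER (
                   PARTITION BY j.job_number ORDER BY c.charge_code
               ) AS rn
        FROM Jobs j
        INNER JOIN Job_Charges c ON j.job_number = c.job_number
    ) t
    GROUP BY job_number, total_amount
    ORDER BY job_number
    """
).fetchall()

for row in pivot_rows:
    print(row)
```

Jobs with fewer charges than there are slots simply get NULL (None in Python) in the unused columns.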

pivot table returns more than 1 row for the same ID

I have a sql code which I am using to do pivot. Code is as follows:
SELECT DISTINCT PersonID
,MAX(pivotColumn1)
,MAX(pivotColumn2) -- originally these were in 2 separate rows
FROM (SELECT srcID, PersonID, detailCode, detailValue FROM src) AS SrcTbl
PIVOT(MAX(detailValue) FOR detailCode IN ([pivotColumn1],[pivotColumn2])) pvt
GROUP BY PersonID
In the source data the ID has 2 separate rows, since each row has its own ID that separates the values. I have now pivoted it and it's still giving me 2 separate rows for the ID even though I grouped it and used aggregation on the pivot columns. Any idea what's wrong with the code?
So I have all my possible detailCode listed in the IN clause. So I have null returned when the value is none but I want it all summarised in 1 row. See image below.
If those are all the options of detailCode, you can use conditional aggregation with a CASE expression instead of PIVOT:
SELECT t.personID,
MAX(CASE WHEN t.detailCode = 'cas' then t.detailValue END) as cas,
MAX(CASE WHEN t.detailCode = 'buy' then t.detailValue END) as buy,
MAX(CASE WHEN t.detailCode = 'sel' then t.detailValue END) as sel,
MAX(CASE WHEN t.detailCode = 'pla' then t.detailValue END) as pla
FROM YourTable t
GROUP BY t.personID
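A likely reason the original PIVOT kept two rows (an assumption, since the source data is not shown) is that srcID was part of the pivot source, and PIVOT implicitly groups by every source column that is not the pivoted or aggregated one. The sketch below imitates that in SQLite through Python's sqlite3 module: grouping by srcID as well keeps the rows apart, while grouping by PersonID alone collapses them. The table contents and the two codes used are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE src (srcID INTEGER, PersonID INTEGER, detailCode TEXT, detailValue TEXT)"
)
# Hypothetical data: one person, two detail rows, each with its own srcID.
conn.executemany(
    "INSERT INTO src VALUES (?, ?, ?, ?)",
    [(1, 7, "cas", "x"), (2, 7, "buy", "y")],
)

select_list = """
    SELECT PersonID,
           MAX(CASE WHEN detailCode = 'cas' THEN detailValue END) AS cas,
           MAX(CASE WHEN detailCode = 'buy' THEN detailValue END) AS buy
    FROM src
"""

# srcID in the grouping keeps the two source rows apart.
split_rows = conn.execute(
    select_list + " GROUP BY srcID, PersonID ORDER BY srcID"
).fetchall()

# Grouping by PersonID alone merges them into one row.
merged_rows = conn.execute(select_list + " GROUP BY PersonID").fetchall()

print(split_rows)
print(merged_rows)
```

With PIVOT itself, the equivalent change would be to leave srcID out of the source subquery.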

2 Rows to 1 Row - Nested Query

I have a response column that stores 2 different values for a same product based on question 1 and question 2. That creates 2 rows for each product but I want only one row for each product.
Example:
select Product, XNumber from MyTable where QuestionID IN ('Q1','Q2')
result shows:
Product XNumber
Bat abc
Bat abc12
I want it to display like below:
Product Xnumber1 Xnumber2
Bat abc abc12
Please help.
Thanks.
If you always have two different values you can try this:
SELECT a.Product, a.XNumber as XNumber1, b.XNumber as XNumber2
FROM MyTable a
INNER JOIN MyTable b
ON a.Product = b.Product
WHERE a.QuestionId = 'Q1'
AND b.QuestionId = 'Q2'
I assume that XNumber1 is the result for Q1 and Xnumber2 is the result for Q2.
This will work best if you don't have answers for both Q1 and Q2 for all ids
SELECT a.Product, b.XNumber as XNumber1, c.XNumber as XNumber2
FROM (SELECT DISTINCT Product FROM MyTable) a
LEFT JOIN MyTable b ON a.Product = b.Product AND b.QuestionID = 'Q1'
LEFT JOIN MyTable c ON a.Product = c.Product AND c.QuestionID = 'Q2'
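To make the difference between the two join variants above concrete, here is a small sketch run against an in-memory SQLite database via Python's sqlite3 module. The sample rows are invented, and "Ball" deliberately has no Q2 answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Product TEXT, QuestionID TEXT, XNumber TEXT)")
# Hypothetical data: Bat answers both questions, Ball only Q1.
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [("Bat", "Q1", "abc"), ("Bat", "Q2", "abc12"), ("Ball", "Q1", "xyz")],
)

# Inner self-join: only products that have both a Q1 and a Q2 row survive.
inner_rows = conn.execute(
    """
    SELECT a.Product, a.XNumber AS XNumber1, b.XNumber AS XNumber2
    FROM MyTable a
    INNER JOIN MyTable b ON a.Product = b.Product
    WHERE a.QuestionID = 'Q1' AND b.QuestionID = 'Q2'
    ORDER BY a.Product
    """
).fetchall()

# Left joins from the distinct product list: missing answers become NULL.
left_rows = conn.execute(
    """
    SELECT a.Product, b.XNumber AS XNumber1, c.XNumber AS XNumber2
    FROM (SELECT DISTINCT Product FROM MyTable) a
    LEFT JOIN MyTable b ON a.Product = b.Product AND b.QuestionID = 'Q1'
    LEFT JOIN MyTable c ON a.Product = c.Product AND c.QuestionID = 'Q2'
    ORDER BY a.Product
    """
).fetchall()

print(inner_rows)
print(left_rows)
```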
This is one way to achieve your expected results. However, it relies on knowing that abc and abc12 are the only XNumber values. If that is not the case, a dynamic pivot would likely be needed.
SELECT product, max(case when XNumber = 'abc' then xNumber end) as XNumber1,
max(Case when xNumber = 'abc12' then xNumber end) as xNumber2
FROM MyTable
GROUP BY Product
The problem is that SQL needs to know how many columns will be in the result at the time it compiles the SQL. Since the number of columns could be dependent on the data itself (2 rows vs 5 rows) it can't complete the request. Using Dynamic SQL you can find out the number of rows, then pass those values in as the column names which is why the dynamic SQL works.
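The two-step idea described above (first ask the data how many columns are needed, then generate the statement) can be sketched as follows. This uses Python's sqlite3 module to build the SQL text rather than T-SQL dynamic SQL, so it is an illustration of the approach under that assumption, not the exact mechanism the answer refers to. The sample rows are invented, and window function support (SQLite 3.25 or later) is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Product TEXT, XNumber TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [("Bat", "abc"), ("Bat", "abc12"), ("Ball", "xyz")],
)

# Step 1: find the largest number of values any single product has.
(max_count,) = conn.execute(
    "SELECT MAX(n) FROM (SELECT COUNT(*) AS n FROM MyTable GROUP BY Product)"
).fetchone()

# Step 2: generate one MAX(CASE ...) column per position.
# Only integers computed here are interpolated, never user input.
columns = ", ".join(
    f"MAX(CASE WHEN rn = {i} THEN XNumber END) AS XNumber{i}"
    for i in range(1, max_count + 1)
)
sql = f"""
    SELECT Product, {columns}
    FROM (
        SELECT Product, XNumber,
               ROW_NUMBER() OVER (PARTITION BY Product ORDER BY XNumber) AS rn
        FROM MyTable
    ) AS numbered
    GROUP BY Product
    ORDER BY Product
"""
dynamic_rows = conn.execute(sql).fetchall()

print(max_count)
print(dynamic_rows)
```

The column count is fixed by the time the final statement is compiled, which is the constraint the paragraph above describes.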
This will get you two columns, the first will be the product, and the 2nd will be a comma delimited list of xNumbers.
SELECT DISTINCT T.Product,
xNumbers = Stuff((SELECT DISTINCT ', ' + T1.XNumber
FROM MyTable T1
WHERE t.Product = T1.Product
FOR XML PATH ('')),1,1,'')
FROM MyTable T
To get what you want, we need to know how many columns there will be, what to name them, and how to determine which value goes into which column
Been using rank() a lot in current code we have been working on at my day job. So this fun variant came to mind for your solution.
Using rank to get the 1st, 2nd, and 3rd possible item identifier then grouping them to create a simulated pivot
DECLARE @T TABLE (PRODUCT VARCHAR(50), XNumber VARCHAR(50))
INSERT INTO @T VALUES
('Bat','0-12345-98765-6'),
('Bat','0-12345-98767-2'),
('Bat','0-12345-98768-1'),
('Ball','0-12345-98771-6'),
('Ball','0-12345-98772-7'),
('Ball','0-12345-98777-9'),
('Hat','0-12345-98711-6'),
('Hat','0-12345-98712-3'),
('Tee','0-12345-98465-1')
SELECT
PRODUCT,
MAX(CASE WHEN I = 1 THEN XNumber ELSE '' END) AS Xnumber1,
MAX(CASE WHEN I = 2 THEN XNumber ELSE '' END) AS Xnumber2,
MAX(CASE WHEN I = 3 THEN XNumber ELSE '' END) AS Xnumber3
FROM
(
SELECT
PRODUCT,
XNumber,
RANK() OVER(PARTITION BY PRODUCT ORDER BY XNumber) AS I
FROM @T
) AS DATA
GROUP BY
PRODUCT

SQL using CASE in SELECT with GROUP BY. Need CASE-value but get row-value

So basically there is 1 question and 1 problem:
1. question - when I have like 100 columns in a table (and no key or unique index is set) and I want to join or subselect that table with itself, do I really have to write out every column name?
2. problem - the example below shows the 1. question and my actual SQL-statement problem
Example:
SELECT
A.FIELD1,
(SELECT CASE WHEN B.FIELD2 = 1 THEN B.FIELD3 ELSE null END FROM TABLE B WHERE A.* = B.*) AS CASEFIELD1,
(SELECT CASE WHEN B.FIELD2 = 2 THEN B.FIELD4 ELSE null END FROM TABLE B WHERE A.* = B.*) AS CASEFIELD2
FROM TABLE A
GROUP BY A.FIELD1
The story is: if I don't put the CASE into its own select statement, then I have to put the actual column name into the GROUP BY, and the GROUP BY then groups by the actual value from the row rather than by the NULL value from the CASE. Because of that I would have to either join or subselect on all columns, since there is no key and no unique index, or somehow find another solution.
DBServer is DB2.
So now to describing it just with words and no SQL:
I have "order items" which can be divided into "ZD" and "EK" (1 = ZD, 2 = EK) and can be grouped by "distributor". Even though "order items" can have one of two different "departements"(ZD, EK), the fields/rows for "ZD" and "EK" are always both filled. I need the grouping to consider the "departement" and only if the designated "departement" (ZD or EK) is changing, then I want a new group to be created.
SELECT
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END) AS ZD,
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END) AS EK,
TABLE.DISTRIBUTOR,
sum(TABLE.SOMETHING) AS SOMETHING
FROM TABLE
GROUP BY
ZD,
EK,
TABLE.DISTRIBUTOR,
TABLE.DEPARTEMENT
This worked, with the CASE expressions in the SELECT and plain ZD, EK in the GROUP BY. The only problem was that even if EK was not the designated DEPARTEMENT, a change in EK still opened a new group, because the GROUP BY was using the real EK value and not the NULL from the CASE, as I already explained above.
And here, ladies and gentlemen, is the solution to the problem:
SELECT
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END) AS ZD,
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END) AS EK,
TABLE.DISTRIBUTOR,
sum(TABLE.SOMETHING) AS SOMETHING
FROM TABLE
GROUP BY
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END),
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END),
TABLE.DISTRIBUTOR,
TABLE.DEPARTEMENT
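The effect of grouping by the CASE expressions instead of the raw columns can be checked with a small sketch. It runs on SQLite through Python's sqlite3 module rather than DB2, and the table name items, the output aliases and the sample rows are all hypothetical. The first two rows belong to department 1 and differ only in EK, so they should end up in one group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (DEPARTEMENT INTEGER, ZD TEXT, EK TEXT, "
    "DISTRIBUTOR TEXT, SOMETHING INTEGER)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Z1", "E1", "D1", 10),
        (1, "Z1", "E2", "D1", 5),   # EK differs, but EK is not the designated department
        (2, "Z1", "E2", "D1", 7),
    ],
)

# Grouping by the raw columns splits on the irrelevant EK change.
(raw_group_count,) = conn.execute(
    "SELECT COUNT(*) FROM "
    "(SELECT 1 FROM items GROUP BY ZD, EK, DISTRIBUTOR, DEPARTEMENT)"
).fetchone()

# Grouping by the same CASE expressions used in the SELECT ignores it.
case_rows = conn.execute(
    """
    SELECT CASE WHEN DEPARTEMENT = 1 THEN ZD END AS ZD_OUT,
           CASE WHEN DEPARTEMENT = 2 THEN EK END AS EK_OUT,
           DISTRIBUTOR,
           SUM(SOMETHING) AS TOTAL
    FROM items
    GROUP BY CASE WHEN DEPARTEMENT = 1 THEN ZD END,
             CASE WHEN DEPARTEMENT = 2 THEN EK END,
             DISTRIBUTOR,
             DEPARTEMENT
    ORDER BY DEPARTEMENT
    """
).fetchall()

print(raw_group_count)
print(case_rows)
```

The output aliases are deliberately different from the column names so there is no question of which one the GROUP BY resolves to.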
@t-clausen.dk: Thank you!
@others: ...
Actually there is a wildcard equality test.
I am not sure why you would group by field1, that would seem impossible in your example. I tried to fit it into your question:
SELECT FIELD1,
CASE WHEN FIELD2 = 1 THEN FIELD3 END AS CASEFIELD1,
CASE WHEN FIELD2 = 2 THEN FIELD4 END AS CASEFIELD2
FROM
(
SELECT * FROM A
INTERSECT
SELECT * FROM B
) C
UNION -- results in a distinct
SELECT
FIELD1,
null,
null
FROM
(
SELECT * FROM A
EXCEPT
SELECT * FROM B
) C
This will fail for datatypes that are not comparable
No, there's no wildcard equality test. You'd have to list every field you want tested individually. If you don't want to test each individual field, you could use a hack such as concatenating all the fields, e.g.
WHERE (a.foo + a.bar + a.baz) = (b.foo + b.bar + b.baz)
but either way, you're listing all of the fields.
I might tend to solve it something like this
WITH q as
(SELECT
DEPARTEMENT
, (CASE WHEN DEPARTEMENT = 1 THEN ZD
WHEN DEPARTEMENT = 2 THEN EK
ELSE null
END) AS GRP
, DISTRIBUTOR
, SOMETHING
FROM mytable
)
SELECT
DEPARTEMENT
, Grp
, Distributor
, sum(SOMETHING) AS SumTHING
FROM q
GROUP BY
DEPARTEMENT
, GRP
, DISTRIBUTOR
If you need to find all rows in TableA that match in TableB, how about INTERSECT or INTERSECT DISTINCT?
select * from A
INTERSECT DISTINCT
select * from B
However, if you only want rows from A where the entire row matches the values in a row from B, then why does your sample code take some values from A and others from B? If the row matches on all columns, then that would seem pointless. (Perhaps your question could be explained a bit more fully?)
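For whole-row matching without listing columns, the set operators can be tried out like this. The sketch uses SQLite via Python's sqlite3 module (plain INTERSECT, since SQLite does not accept the INTERSECT DISTINCT spelling), and the three-column tables and their rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for name in ("A", "B"):
    conn.execute(f"CREATE TABLE {name} (FIELD1 INTEGER, FIELD2 TEXT, FIELD3 TEXT)")
conn.executemany(
    "INSERT INTO A VALUES (?, ?, ?)",
    [(1, "x", "p"), (2, "y", "q"), (3, "z", "r")],
)
conn.executemany(
    "INSERT INTO B VALUES (?, ?, ?)",
    [(1, "x", "p"), (2, "y", "DIFFERENT")],
)

# Rows of A that match some row of B on every column.
matching_rows = conn.execute(
    "SELECT * FROM A INTERSECT SELECT * FROM B ORDER BY 1"
).fetchall()

# Rows of A with no full-row match in B.
unmatched_rows = conn.execute(
    "SELECT * FROM A EXCEPT SELECT * FROM B ORDER BY 1"
).fetchall()

print(matching_rows)
print(unmatched_rows)
```

Both tables need the same number of columns with comparable types for this to work.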

Optimize help for sql query

We've got some SQL code I'm trying to optimize. In the code is a view that is rather expensive to run. For the sake of this question, let's call it ExpensiveView. On top of the view there is a query that joins the view to itself via two sub-queries.
For example:
select v1.varCharCol1, v1.intCol, v2.intCol from (
select someId, varCharCol1, intCol from ExpensiveView where rank=1
) as v1 inner join (
select someId, intCol from ExpensiveView where rank=2
) as v2 on v1.someId = v2.someId
An example result set:
some random string, 5, 10
other random string, 15, 15
This works, but it's slow since I'm having to select from ExpensiveView twice. What I'd like to do is use a case statement to only select from ExpensiveView once.
For example:
select someId,
case when rank = 1 then intCol else 0 end as rank1IntCol,
case when rank = 2 then intCol else 0 end as rank2IntCol
from ExpensiveView where rank in (1,2)
I could then group the above results by someId and get almost the same thing as the first query:
select sum(rank1IntCol), sum(rank2Intcol)
from ( *the above query* ) SubQueryData
group by someId
The problem is the varCharCol1 that I need to get when the rank is 1. I can't use it in the group since that column will contain different values when rank is 1 than it does when rank is 2.
Does anyone have any solutions to optimize the query so it only selects from ExpensiveView once and still is able to get the varchar data?
Thanks in advance.
It's hard to guess since we don't see your view definition, but try this:
SELECT MIN(CASE rank WHEN 1 THEN varCharCol1 ELSE NULL END),
SUM(CASE rank WHEN 1 THEN intCol ELSE 0 END),
SUM(CASE rank WHEN 2 THEN intCol ELSE 0 END)
FROM ExpensiveView
WHERE rank IN (1, 2)
GROUP BY
someId
Note that in most cases for the queries like this:
SELECT *
FROM mytable1 m1
JOIN mytable1 m2
ON …
the SQL Server optimizer will just build an Eager Spool (a temporary index), which will later be used for searching for the JOIN condition, so probably these tricks are redundant.
select someId,
case when rank = 1 then varCharCol1 else '_' end as varCharCol1,
case when rank = 1 then intCol else 0 end as rank1IntCol,
case when rank = 2 then intCol else 0 end as rank2IntCol
from ExpensiveView where rank in (1,2)
Then use min() or max() in the enclosing query.
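Here is a minimal sketch of the single-pass version, with an ordinary table standing in for ExpensiveView in an in-memory SQLite database (Python's sqlite3 module). The sample rows are invented to reproduce the example result set from the question, and NULL is used instead of a '_' filler so that MAX simply ignores the rank 2 rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A plain table stands in for the expensive view (assumption).
conn.execute(
    'CREATE TABLE ExpensiveView (someId INTEGER, "rank" INTEGER, '
    "varCharCol1 TEXT, intCol INTEGER)"
)
conn.executemany(
    "INSERT INTO ExpensiveView VALUES (?, ?, ?, ?)",
    [
        (1, 1, "some random string", 5),
        (1, 2, "ignored", 10),
        (2, 1, "other random string", 15),
        (2, 2, "ignored too", 15),
    ],
)

# One scan of the view: each aggregate picks out the rank it cares about.
single_pass_rows = conn.execute(
    """
    SELECT MAX(CASE WHEN "rank" = 1 THEN varCharCol1 END)      AS varCharCol1,
           SUM(CASE WHEN "rank" = 1 THEN intCol ELSE 0 END)    AS rank1IntCol,
           SUM(CASE WHEN "rank" = 2 THEN intCol ELSE 0 END)    AS rank2IntCol
    FROM ExpensiveView
    WHERE "rank" IN (1, 2)
    GROUP BY someId
    ORDER BY someId
    """
).fetchall()

for row in single_pass_rows:
    print(row)
```

One difference from the inner join is worth keeping in mind: an id that has only one of the two ranks still produces a row here, so an extra filter may be needed if that can happen in the real data.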