Query to search for 2 years and exclude 1 - sql

My apologies for the oddly worded question as I wasn't quite sure how I would name the title without explaining the situation.
I am currently working with a vendor table which gives a unique ID to each vendor, but the table is not normalized.
For example, the ID 100000003744450 appears multiple times in the table with different data in each row.
There are many columns but the only ones that matter to me at the moment are the ID and the year column. I am attempting to find the vendors who have rows for 2013, 2014 but not 2015.
So far I have:
select *
from table
where ls_d_yr = '2013'
or ls_d_yr = '2014'
I need to filter these results so that vendors with rows for 2013/2014 are kept only if they have no rows for 2015.
Here are the column

If either 2013 or 2014 is enough, use NOT EXISTS to exclude IDs that have a row with ls_d_yr = '2015'.
select *
from table t1
where ls_d_yr IN ('2013', '2014')
and not exists (select 1 from table t2
where t2.ID = t1.ID
and t2.ls_d_yr = '2015')
If both 2013 and 2014 are required, add a GROUP BY and use HAVING to make sure two different years are provided:
select ID
from table t1
where ls_d_yr IN ('2013', '2014')
and not exists (select 1 from table t2
where t2.ID = t1.ID
and t2.ls_d_yr = '2015')
group by ID
having count(distinct ls_d_yr) = 2
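To see how the two variants differ, here is a minimal sketch that runs both against an in-memory SQLite database from Python. The table name vendor_rows and the three sample vendors are invented for illustration; only the ID and ls_d_yr columns come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name and sample data; ID and ls_d_yr are from the question.
conn.execute("CREATE TABLE vendor_rows (ID INTEGER, ls_d_yr TEXT)")
conn.executemany(
    "INSERT INTO vendor_rows VALUES (?, ?)",
    [
        (1, "2013"), (1, "2014"),               # both years, no 2015
        (2, "2013"), (2, "2014"), (2, "2015"),  # has a 2015 row: excluded
        (3, "2013"),                            # only one of the two years
    ],
)

# Shared part of both queries: 2013/2014 rows of vendors without a 2015 row.
not_2015 = """
    FROM vendor_rows t1
    WHERE ls_d_yr IN ('2013', '2014')
      AND NOT EXISTS (SELECT 1 FROM vendor_rows t2
                      WHERE t2.ID = t1.ID AND t2.ls_d_yr = '2015')
"""

# Variant 1: either year is enough.
either_year = [r[0] for r in conn.execute(
    "SELECT DISTINCT ID " + not_2015 + " ORDER BY ID")]

# Variant 2: both years are required.
both_years = [r[0] for r in conn.execute(
    "SELECT ID " + not_2015 +
    " GROUP BY ID HAVING COUNT(DISTINCT ls_d_yr) = 2 ORDER BY ID")]

print(either_year)
print(both_years)
```

With this sample data, vendor 3 appears only in the first result, because it has a 2013 row but no 2014 row.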

You can use NOT EXISTS for this:
select *
from table AS t1
where ls_d_yr IN ('2013', '2014') AND
NOT EXISTS (SELECT 1
FROM table AS t2
WHERE t1.ID = t2.ID AND t2.ls_d_yr = '2015')

Another variation, should work in both Teradata and Aster (and probably every other DBMS):
select vendor
from table
where ls_d_yr in ('2013','2014','2015') -- probably numbers instead of strings?
group by vendor
having min(ls_d_yr) = '2013' -- at least one row from 2013
and max(ls_d_yr) = '2014' -- at least one row from 2014, but none from 2015

One method for doing this uses aggregation and having:
select t.vendor
from table t
group by t.vendor
having sum(case when ls_d_yr = '2013' then 1 else 0 end) > 0 and
sum(case when ls_d_yr = '2014' then 1 else 0 end) > 0 and
sum(case when ls_d_yr = '2015' then 1 else 0 end) = 0;
Each condition in the having clause tests for one year. The > 0 means that one or more records exist for the year. The = 0 means that no record exists.
This logic is based on the statement: "I am attempting to find the vendors who have rows for 2013, 2014 but not 2015." I don't follow the logic in the last paragraph.
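A small sketch comparing the conditional-sum HAVING clause with the MIN/MAX variant, using Python's sqlite3 module. The table name vendor_rows and the sample vendors A, B and C are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data; vendor and ls_d_yr follow the answers' column names.
conn.execute("CREATE TABLE vendor_rows (vendor TEXT, ls_d_yr TEXT)")
conn.executemany("INSERT INTO vendor_rows VALUES (?, ?)", [
    ("A", "2012"), ("A", "2013"), ("A", "2014"),  # qualifies
    ("B", "2013"), ("B", "2014"), ("B", "2015"),  # has 2015
    ("C", "2014"),                                # no 2013
])

conditional_sum = """
    SELECT vendor FROM vendor_rows
    GROUP BY vendor
    HAVING SUM(CASE WHEN ls_d_yr = '2013' THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN ls_d_yr = '2014' THEN 1 ELSE 0 END) > 0
       AND SUM(CASE WHEN ls_d_yr = '2015' THEN 1 ELSE 0 END) = 0
    ORDER BY vendor
"""

# The WHERE filter matters here: without it, vendor A's 2012 row would
# make MIN(ls_d_yr) = '2012' and wrongly drop the vendor.
min_max = """
    SELECT vendor FROM vendor_rows
    WHERE ls_d_yr IN ('2013', '2014', '2015')
    GROUP BY vendor
    HAVING MIN(ls_d_yr) = '2013' AND MAX(ls_d_yr) = '2014'
    ORDER BY vendor
"""

by_sum = [r[0] for r in conn.execute(conditional_sum)]
by_min_max = [r[0] for r in conn.execute(min_max)]
print(by_sum, by_min_max)
```

Both queries agree on this data. The conditional-sum form extends more easily to non-adjacent years, since the MIN/MAX trick relies on the three years being consecutive.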

select to_char(id), ls_d_yr
from table
where ls_d_yr like '%2014%'
or ls_d_yr like '%2013%';
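Something like that. As written, this returns the 2013 and 2014 rows but does not yet exclude vendors that also have a 2015 row.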

Related

Show data from table even if there is no data

I have 3 tables:
Data in Compania table:
Data in Periodo table:
Data in VAC_PERIODOCIA table:
I want to show all the companies (COMPANIA) and the value in (vac_aplica column) searching by Periodo, whether or not they are registered.
I tried this:
SELECT
COMPANIA.CIA_CLAVE, COMPANIA.CIA_NOM,
CASE
WHEN VAC_PERIODOCIA.VAC_APLICA IS NULL
THEN 'N'
ELSE 'Y'
END VAC_APLICA
FROM
COMPANIA
LEFT JOIN
VAC_PERIODOCIA ON COMPANIA.CIA_CLAVE = VAC_PERIODOCIA.CIA_CLAVE
WHERE
VAC_PERIODOCIA.PERIODO = '2018 - 2019'
Result:
What I want is this:
First of all, the question is a mess: the tables and columns in the question differ from those in the examples you've provided. Please fix that.
I don't speak Spanish, so I can only assume that VAC_PERIODOCIA is Periodo. In that case, you need to move the condition from the WHERE clause into the join's ON clause, like this:
SELECT COMPANIA.CIA_CLAVE,COMPANIA.CIA_NOM,
CASE
WHEN Periodo.valor IS NULL THEN 'N'
ELSE 'Y'
END VAC_APLICA
FROM Compania
LEFT JOIN Periodo
ON COMPANIA.CIA_CLAVE = Periodo.valor
AND Periodo.PERIODO = '2018 - 2019'
order by 1
dbfiddle
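The effect of moving the filter from WHERE into the ON clause can be sketched with Python's sqlite3 module. The schema below is an assumption reconstructed from the question's query (COMPANIA and VAC_PERIODOCIA with the columns it references), and the three companies are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema and made-up rows; column names follow the question's query.
conn.executescript("""
    CREATE TABLE COMPANIA (CIA_CLAVE INTEGER, CIA_NOM TEXT);
    CREATE TABLE VAC_PERIODOCIA (CIA_CLAVE INTEGER, PERIODO TEXT, VAC_APLICA TEXT);
    INSERT INTO COMPANIA VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO VAC_PERIODOCIA VALUES (1, '2018 - 2019', 'S'), (2, '2017 - 2018', 'S');
""")

# Same query twice; only the keyword in front of the period filter changes.
query = """
    SELECT c.CIA_NOM,
           CASE WHEN v.VAC_APLICA IS NULL THEN 'N' ELSE 'Y' END AS VAC_APLICA
    FROM COMPANIA c
    LEFT JOIN VAC_PERIODOCIA v
           ON c.CIA_CLAVE = v.CIA_CLAVE {placement} v.PERIODO = '2018 - 2019'
    ORDER BY c.CIA_CLAVE
"""

# Filter in WHERE: rows where v.PERIODO is NULL are discarded after the join,
# so the LEFT JOIN behaves like an INNER JOIN.
filter_in_where = conn.execute(query.format(placement="WHERE")).fetchall()

# Filter in ON: unmatched companies survive with NULLs on the right side.
filter_in_on = conn.execute(query.format(placement="AND")).fetchall()

print(filter_in_where)
print(filter_in_on)
```

With the filter in WHERE only the company registered for the period comes back; with the filter in ON all three companies are listed, and the unregistered ones get 'N'.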

Flag on condition

Here's my table :
key date
a 2002
a 2014
a 2011
b 2004
b 2016
b 2001
I'd like a SELECT statement that adds a flag for the most recent date, like this:
key date flag
a 2002 0
a 2014 1
a 2011 0
b 2004 0
b 2016 1
b 2001 0
Thanks
You can use an analytic function if you don't want to do a GROUP BY or self-join. You could probably consolidate this a little, but I find that splitting it out using WITH makes it more obvious what is going on.
with max_date_query as (
select key, date, max(date) over (partition by key) max_date
from mytable
)
select key, date, case when date = max_date then 1 else 0 end flag
from max_date_query
There are other variations on the same theme where you order the window by date desc and use row_number() instead of max() to determine the flag. I would imagine the one I showed is better, but I'm not sure how much difference it will really make. You might need the row_number() method if you have duplicate max dates and need to flag only one of them.
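The difference between the max() and row_number() variants only shows up when a key has duplicate maximum dates. Here is a minimal sketch using Python's sqlite3 module, assuming a SQLite build with window-function support (3.25 or later). The sample rows are invented and deliberately contain a duplicate 2014 for key a; the column names are quoted only to keep them clearly identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented sample data; key "a" has its maximum date twice on purpose.
conn.execute('CREATE TABLE mytable ("key" TEXT, "date" INTEGER)')
conn.executemany("INSERT INTO mytable VALUES (?, ?)",
                 [("a", 2002), ("a", 2014), ("a", 2014), ("b", 2004), ("b", 2016)])

# MAX() OVER: every row that equals the per-key maximum is flagged.
max_flags = conn.execute("""
    SELECT "key", "date",
           CASE WHEN "date" = MAX("date") OVER (PARTITION BY "key")
                THEN 1 ELSE 0 END AS flag
    FROM mytable
    ORDER BY "key", "date"
""").fetchall()

# ROW_NUMBER(): exactly one row per key is flagged, even with ties.
row_number_flags = conn.execute("""
    SELECT "key", "date",
           CASE WHEN ROW_NUMBER() OVER (PARTITION BY "key" ORDER BY "date" DESC) = 1
                THEN 1 ELSE 0 END AS flag
    FROM mytable
    ORDER BY "key", "date"
""").fetchall()

print(max_flags)
print(row_number_flags)
```

Which of the two tied rows row_number() picks is not defined unless the window's ORDER BY includes a tie-breaking column.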
select t1.*, case when t2.key is null
then 0
else 1
end as flag
from your_table t1
left join
(
select key, max(date) as mdate
from your_table
group by key
) t2 on t1.key = t2.key and t1.date = t2.mdate
Not really sure what the "most recent" condition is (last "X" years?). Assuming the values are in fact DATE values (not char), try:
select
t1.key,
t1.date,
CASE WHEN DATEDIFF('year', t1.date, CURRENT_DATE) < 2 THEN 1 ELSE 0 END as flag
from table t1;
If "date" is in fact an integer:
select
t1.key,
t1.date,
CASE WHEN EXTRACT(YEAR FROM CURRENT_DATE) - t1.date < 2 THEN 1 ELSE 0 END as flag
from table t1;
Hope it helps
Sérgio

2 Rows to 1 Row - Nested Query

I have a response column that stores 2 different values for a same product based on question 1 and question 2. That creates 2 rows for each product but I want only one row for each product.
Example:
select Product, XNumber from MyTable where QuestionID IN ('Q1','Q2')
result shows:
Product XNumber
Bat abc
Bat abc12
I want it to display like below:
Product Xnumber1 Xnumber2
Bat abc abc12
Please help.
Thanks.
If you always have two different values you can try this:
SELECT a.Product, a.XNumber as XNumber1, b.XNumber as XNumber2
FROM MyTable a
INNER JOIN MyTable b
ON a.Product = b.Product
WHERE a.QuestionId = 'Q1'
AND b.QuestionId = 'Q2'
I assume that XNumber1 is the result for Q1 and Xnumber2 is the result for Q2.
This works better if some products don't have answers for both Q1 and Q2:
SELECT a.Product, b.XNumber as XNumber1, c.XNumber as XNumber2
FROM (SELECT DISTINCT Product FROM MyTable) a
LEFT JOIN MyTable b ON a.Product = b.Product AND b.QuestionID = 'Q1'
LEFT JOIN MyTable c ON a.Product = c.Product AND c.QuestionID = 'Q2'
This is one way to achieve your expected results. However, it relies on knowing that only xNumber abc and abc12 are the values. If this is not the case, then a dynamic pivot would be likely needed.
SELECT product, max(case when XNumber = 'abc' then xNumber end) as XNumber1,
max(Case when xNumber = 'abc12' then xNumber end) as xNumber2
FROM MyTable
GROUP BY Product
The problem is that SQL needs to know how many columns will be in the result at the time it compiles the SQL. Since the number of columns could be dependent on the data itself (2 rows vs 5 rows) it can't complete the request. Using Dynamic SQL you can find out the number of rows, then pass those values in as the column names which is why the dynamic SQL works.
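When the set of output columns is known up front (here one per question ID), the same MAX(CASE ...) pattern can key on QuestionID instead of on the XNumber values themselves. This is a sketch using Python's sqlite3 module, not the method from any of the answers verbatim; the Ball row is invented sample data to show a product with a missing Q2 answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Table and column names from the question; the 'Ball' row is made up.
conn.execute("CREATE TABLE MyTable (Product TEXT, QuestionID TEXT, XNumber TEXT)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", [
    ("Bat", "Q1", "abc"), ("Bat", "Q2", "abc12"),
    ("Ball", "Q1", "xyz"),  # no Q2 answer for this product
])

# One output column per question; a CASE without ELSE yields NULL,
# which MAX() ignores, so each column holds that question's value.
pivoted = conn.execute("""
    SELECT Product,
           MAX(CASE WHEN QuestionID = 'Q1' THEN XNumber END) AS XNumber1,
           MAX(CASE WHEN QuestionID = 'Q2' THEN XNumber END) AS XNumber2
    FROM MyTable
    WHERE QuestionID IN ('Q1', 'Q2')
    GROUP BY Product
    ORDER BY Product
""").fetchall()

print(pivoted)
```

A product without a Q2 row still appears, with NULL (None in Python) in the second column, which matches what the LEFT JOIN version produces.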
This will get you two columns, the first will be the product, and the 2nd will be a comma delimited list of xNumbers.
SELECT DISTINCT T.Product,
xNumbers = Stuff((SELECT DISTINCT ', ' + T1.XNumber
FROM MyTable T1
WHERE t.Product = T1.Product
FOR XML PATH ('')),1,2,'')
FROM MyTable T
To get what you want, we need to know how many columns there will be, what to name them, and how to determine which value goes into which column.
Been using rank() a lot in current code we have been working on at my day job. So this fun variant came to mind for your solution.
It uses RANK() to get the 1st, 2nd, and 3rd possible item identifier, then groups them to create a simulated pivot:
DECLARE @T TABLE (PRODUCT VARCHAR(50), XNumber VARCHAR(50))
INSERT INTO @T VALUES
('Bat','0-12345-98765-6'),
('Bat','0-12345-98767-2'),
('Bat','0-12345-98768-1'),
('Ball','0-12345-98771-6'),
('Ball','0-12345-98772-7'),
('Ball','0-12345-98777-9'),
('Hat','0-12345-98711-6'),
('Hat','0-12345-98712-3'),
('Tee','0-12345-98465-1')
SELECT
PRODUCT,
MAX(CASE WHEN I = 1 THEN XNumber ELSE '' END) AS Xnumber1,
MAX(CASE WHEN I = 2 THEN XNumber ELSE '' END) AS Xnumber2,
MAX(CASE WHEN I = 3 THEN XNumber ELSE '' END) AS Xnumber3
FROM
(
SELECT
PRODUCT,
XNumber,
RANK() OVER(PARTITION BY PRODUCT ORDER BY XNumber) AS I
FROM @T
) AS DATA
GROUP BY
PRODUCT
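The same rank-then-group idea can be tried outside SQL Server. Below is a sketch using Python's sqlite3 module, assuming a SQLite build with window functions (3.25 or later); a regular table named T stands in for the table variable, and only the Hat and Tee rows from the sample data are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A plain table T stands in for the T-SQL table variable (an assumption).
conn.execute("CREATE TABLE T (PRODUCT TEXT, XNumber TEXT)")
conn.executemany("INSERT INTO T VALUES (?, ?)", [
    ("Hat", "0-12345-98711-6"), ("Hat", "0-12345-98712-3"),
    ("Tee", "0-12345-98465-1"),
])

# Rank the XNumbers within each product, then spread ranks 1 and 2
# into separate columns; missing ranks come out as empty strings.
pivot = conn.execute("""
    SELECT PRODUCT,
           MAX(CASE WHEN I = 1 THEN XNumber ELSE '' END) AS Xnumber1,
           MAX(CASE WHEN I = 2 THEN XNumber ELSE '' END) AS Xnumber2
    FROM (SELECT PRODUCT, XNumber,
                 RANK() OVER (PARTITION BY PRODUCT ORDER BY XNumber) AS I
          FROM T) AS DATA
    GROUP BY PRODUCT
    ORDER BY PRODUCT
""").fetchall()

print(pivot)
```

Note that RANK() gives tied values the same rank, so a product with a duplicated XNumber would leave a gap in the numbered columns; ROW_NUMBER() avoids that if duplicates are possible.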

SQL using CASE in SELECT with GROUP BY. Need CASE-value but get row-value

So basically there is one question and one problem:
1. question - when I have like 100 columns in a table(and no key or uindex is set) and I want to join or subselect that table with itself, do I really have to write out every column name?
2. problem - the example below shows the 1. question and my actual SQL-statement problem
Example:
SELECT
A.FIELD1,
(SELECT CASE WHEN B.FIELD2 = 1 THEN B.FIELD3 ELSE null END FROM TABLE B WHERE A.* = B.*) AS CASEFIELD1,
(SELECT CASE WHEN B.FIELD2 = 2 THEN B.FIELD4 ELSE null END FROM TABLE B WHERE A.* = B.*) AS CASEFIELD2
FROM TABLE A
GROUP BY A.FIELD1
The story is: if I don't put the CASE into its own select statement then I have to put the actual rowname into the GROUP BY and the GROUP BY doesn't group the NULL-value from the CASE but the actual value from the row. And because of that I would have to either join or subselect with all columns, since there is no key and no uindex, or somehow find another solution.
DBServer is DB2.
So now to describing it just with words and no SQL:
I have "order items" which can be divided into "ZD" and "EK" (1 = ZD, 2 = EK) and can be grouped by "distributor". Even though "order items" can have one of two different "departements"(ZD, EK), the fields/rows for "ZD" and "EK" are always both filled. I need the grouping to consider the "departement" and only if the designated "departement" (ZD or EK) is changing, then I want a new group to be created.
SELECT
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END) AS ZD,
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END) AS EK,
TABLE.DISTRIBUTOR,
sum(TABLE.SOMETHING) AS SOMETHING
FROM TABLE
GROUP BY
ZD,
EK,
TABLE.DISTRIBUTOR,
TABLE.DEPARTEMENT
This worked with the CASE in the SELECT and ZD, EK in the GROUP BY. The only problem was that even if EK was not the designated DEPARTEMENT, a change in EK still opened a new group, because the GROUP BY was using the real EK value and not the NULL from the CASE, as I explained above.
And here ladies and gentleman is the solution to the problem:
SELECT
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END) AS ZD,
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END) AS EK,
TABLE.DISTRIBUTOR,
sum(TABLE.SOMETHING) AS SOMETHING
FROM TABLE
GROUP BY
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END),
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END),
TABLE.DISTRIBUTOR,
TABLE.DEPARTEMENT
@t-clausen.dk: Thank you!
@others: ...
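The difference between grouping by the raw columns and grouping by the CASE expressions can be reproduced with a few rows. This is a sketch using Python's sqlite3 module rather than DB2; the table name order_items and the sample values are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and rows; column names follow the question.
conn.execute("""CREATE TABLE order_items
                (DEPARTEMENT INTEGER, ZD TEXT, EK TEXT,
                 DISTRIBUTOR TEXT, SOMETHING INTEGER)""")
conn.executemany("INSERT INTO order_items VALUES (?, ?, ?, ?, ?)", [
    (1, "z1", "e1", "D", 10),
    (1, "z1", "e2", "D", 5),   # same ZD; EK differs but is not the designated department
    (2, "z1", "e1", "D", 7),
])

select_list = """
    SELECT CASE WHEN o.DEPARTEMENT = 1 THEN o.ZD END AS ZD_OUT,
           CASE WHEN o.DEPARTEMENT = 2 THEN o.EK END AS EK_OUT,
           o.DISTRIBUTOR,
           SUM(o.SOMETHING) AS SOMETHING
    FROM order_items o
"""

# Grouping by the raw columns: the irrelevant EK change splits department 1.
by_raw_columns = conn.execute(select_list + """
    GROUP BY o.ZD, o.EK, o.DISTRIBUTOR, o.DEPARTEMENT
    ORDER BY o.DEPARTEMENT
""").fetchall()

# Grouping by the CASE expressions: only the designated value defines the group.
by_case_expressions = conn.execute(select_list + """
    GROUP BY CASE WHEN o.DEPARTEMENT = 1 THEN o.ZD END,
             CASE WHEN o.DEPARTEMENT = 2 THEN o.EK END,
             o.DISTRIBUTOR, o.DEPARTEMENT
    ORDER BY o.DEPARTEMENT
""").fetchall()

print(len(by_raw_columns))
print(by_case_expressions)
```

With the raw columns the two department-1 rows land in separate groups; with the CASE expressions they are summed into one.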
Actually there is a wildcard equality test.
I am not sure why you would group by field1, that would seem impossible in your example. I tried to fit it into your question:
SELECT FIELD1,
CASE WHEN FIELD2 = 1 THEN FIELD3 END AS CASEFIELD1,
CASE WHEN FIELD2 = 2 THEN FIELD4 END AS CASEFIELD2
FROM
(
SELECT * FROM A
INTERSECT
SELECT * FROM B
) C
UNION -- results in a distinct
SELECT
FIELD1,
null,
null
FROM
(
SELECT * FROM A
EXCEPT
SELECT * FROM B
) C
This will fail for data types that are not comparable.
No, there's no wildcard equality test. You'd have to list every field you want tested individually. If you don't want to test each individual field, you could use a hack such as concatenating all the fields, e.g.
WHERE (a.foo + a.bar + a.baz) = (b.foo + b.bar + b.baz)
but either way, you're listing all of the fields.
I might tend to solve it something like this
WITH q as
(SELECT
DEPARTEMENT
, (CASE WHEN DEPARTEMENT = 1 THEN ZD
WHEN DEPARTEMENT = 2 THEN EK
ELSE null
END) AS GRP
, DISTRIBUTOR
, SOMETHING
FROM mytable
)
SELECT
DEPARTEMENT
, Grp
, Distributor
, sum(SOMETHING) AS SumTHING
FROM q
GROUP BY
DEPARTEMENT
, GRP
, DISTRIBUTOR
If you need to find all rows in TableA that match in TableB, how about INTERSECT or INTERSECT DISTINCT?
select * from A
INTERSECT DISTINCT
select * from B
However, if you only want rows from A where the entire row matches the values in a row from B, then why does your sample code take some values from A and others from B? If the row matches on all columns, then that would seem pointless. (Perhaps your question could be explained a bit more fully?)
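For comparing whole rows without listing every column, INTERSECT and EXCEPT can be tried on a small example. The sketch below uses Python's sqlite3 module; the tables A and B with three columns and their rows are invented for illustration, and SQLite spells the operator plain INTERSECT (it already removes duplicates).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up tables with identical column lists, as set operators require.
conn.executescript("""
    CREATE TABLE A (FIELD1 TEXT, FIELD2 INTEGER, FIELD3 TEXT);
    CREATE TABLE B (FIELD1 TEXT, FIELD2 INTEGER, FIELD3 TEXT);
    INSERT INTO A VALUES ('x', 1, 'foo'), ('y', 2, NULL), ('z', 3, 'baz');
    INSERT INTO B VALUES ('x', 1, 'foo'), ('y', 2, NULL), ('z', 3, 'other');
""")

# Rows that are identical in every column of both tables.
in_both = conn.execute(
    "SELECT * FROM A INTERSECT SELECT * FROM B ORDER BY FIELD1").fetchall()

# Rows of A that have no fully identical row in B.
only_in_a = conn.execute(
    "SELECT * FROM A EXCEPT SELECT * FROM B ORDER BY FIELD1").fetchall()

print(in_both)
print(only_in_a)
```

The 'y' row matches even though FIELD3 is NULL on both sides: set operators treat NULLs as equal to each other, whereas a column-by-column comparison with = would not match that row.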

most optimal sql query for a simple table design

I'm trying to come up with the most optimal query to solve this problem.
I have a simple table made up of the columns name (string) and organization_id (int). This table contains a list of names that belong to one or more organizations.
How can I get a list of all the names that belong to the organizations that both "Jim" and "Andy" belong to?
Example:
- John,1
- Jim,1
- Jim,2
- Andy,2
- Carl,2
- Jim,3
- Carl,3
- Andy,4
- John,4
- Jim,5
- Randy,5
- Andy,5
So the query should return to me Jim,2|Andy,2|Carl,2|Jim,5|Randy,5|Andy,5 as both Jim and Andy belong to organizations 2 and 5.
Any ideas?
A straightforward JOIN should do it:
SELECT DISTINCT t1.name
FROM Table1 t1
JOIN Table1 t2 ON t1.organization_id = t2.organization_id AND t2.name = 'Jim'
JOIN Table1 t3 ON t1.organization_id = t3.organization_id AND t3.name = 'Andy'
ORDER BY t1.name
An SQLfiddle to test with.
EDIT: An Oracle SQLfiddle with the same query.
To get the organizations that "Jim" and "Andy" belong to, I like to use aggregation:
select organization_id
from t
group by organization_id
having sum(case when name = 'Jim' then 1 else 0 end) > 0 and
sum(case when name = 'Andy' then 1 else 0 end) > 0
You can then get all the people in these organizations using:
select *
from t
where organization_id in (select organization_id
from t
group by organization_id
having sum(case when name = 'Jim' then 1 else 0 end) > 0 and
sum(case when name = 'Andy' then 1 else 0 end) > 0
)
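As a quick check, the aggregation approach can be run against the sample data from the question. This sketch uses Python's sqlite3 module; the table name t is taken from the answer and the column names name and organization_id from the question's description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, organization_id INTEGER)")
# Sample rows exactly as listed in the question.
conn.executemany("INSERT INTO t VALUES (?, ?)", [
    ("John", 1), ("Jim", 1), ("Jim", 2), ("Andy", 2), ("Carl", 2),
    ("Jim", 3), ("Carl", 3), ("Andy", 4), ("John", 4),
    ("Jim", 5), ("Randy", 5), ("Andy", 5),
])

# Inner query: organizations containing both Jim and Andy.
# Outer query: everyone who belongs to those organizations.
members = conn.execute("""
    SELECT name, organization_id
    FROM t
    WHERE organization_id IN (
        SELECT organization_id
        FROM t
        GROUP BY organization_id
        HAVING SUM(CASE WHEN name = 'Jim' THEN 1 ELSE 0 END) > 0
           AND SUM(CASE WHEN name = 'Andy' THEN 1 ELSE 0 END) > 0
    )
    ORDER BY organization_id, name
""").fetchall()

print(members)
```

The result contains the members of organizations 2 and 5, which is the set the question asks for (in a different order, since the query sorts by organization and name).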