Is it possible to create a SELECT clause with a varying number of columns to be returned depending on joined tables?
For instance:
If I join a table depending on a value in the WHERE clause, I want to return either tbl1.col1, tbl1.col2 if table tbl1 is joined, or tbl2.col4, tbl2.col5, tbl2.col8 if table tbl2 is joined.
Is this possible? How?
No, you can't write one query that returns n columns one time and m columns another time. What you can do is something like this: use UNION ALL on two queries whose conditions ensure that either query 1 or query 2 returns data. Make the columns match, so where one query has no value for a column, have it select null in that place.
select tbl1.col1 as firstname, tbl1.col2 as lastname, null as street, tbl1.col3 as job from ...
where #variable = 1
UNION ALL
select tbl2.col4 as firstname, tbl2.col5 as lastname, tbl2.col8 as street, null as job from ...
where #variable = 2;
Or you just build your SQL dynamically in whatever host language you use and issue completely different SQL for each case, which is what one would normally do.
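To make the UNION ALL idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table contents and the :variable parameter name are made up for illustration; only the column names come from the question. Each branch pads the column it lacks with NULL, and the parameter decides which branch contributes rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (col1 TEXT, col2 TEXT, col3 TEXT);
    CREATE TABLE tbl2 (col4 TEXT, col5 TEXT, col8 TEXT);
    INSERT INTO tbl1 VALUES ('Ann', 'Lee', 'Engineer');   -- hypothetical sample row
    INSERT INTO tbl2 VALUES ('Bob', 'Ray', 'Main St');    -- hypothetical sample row
""")

# Both branches expose the same four columns; the missing one is NULL.
QUERY = """
    SELECT col1 AS firstname, col2 AS lastname, NULL AS street, col3 AS job
    FROM tbl1 WHERE :variable = 1
    UNION ALL
    SELECT col4 AS firstname, col5 AS lastname, col8 AS street, NULL AS job
    FROM tbl2 WHERE :variable = 2
"""

def fetch(variable):
    """Run the combined query; only the branch matching `variable` returns rows."""
    return conn.execute(QUERY, {"variable": variable}).fetchall()

rows_for_1 = fetch(1)
rows_for_2 = fetch(2)
print(rows_for_1)
print(rows_for_2)
```

The result always has four columns, so the calling code can treat both cases uniformly and just ignore the column that is NULL.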
Related
SELECT
ID, PRIM_EMAIL, SEC_EMAIL, PHONE
FROM
STUDENTS.RECORDS
WHERE
ID IN (SELECT ID FROM STUDENTS.INFO WHERE ROLL_NO = '554')
UNION
SELECT NAME
FROM STUDENTS.INFO
WHERE ROLL_NO = '554';
Here ROLL_NO is user-supplied data, so for now I have hard-coded it. Basically, with the help of ROLL_NO I filter the STUDENTS.INFO table to get the ID, and based on that I try to get PRIM_EMAIL, SEC_EMAIL, PHONE from the STUDENTS.RECORDS table by matching the foreign keys of both tables. In addition to the current result set I also want to have the prov_name column.
Any help is very much appreciated. Thank you!
I suspect that you want to put all this information on the same row, which suggests a join rather than union all:
select
r.ID,
r.PRIM_EMAIL,
r.SEC_EMAIL,
r.PHONE,
i.NAME
from STUDENTS.RECORDS r
inner join STUDENTS.INFO i ON i.ID = r.ID
where i.ROLL_NO = '554';
I think the error "query block has incorrect number of result columns" comes from trying to union a query that returns 4 columns (id, prim_email, sec_email, phone) with one that returns 1 column (name).
From your question, I am gathering that you want a single table of id, prim_email, sec_email, phone from students.records and name from students.info.
I think the following query using CTEs might get you (partially) to your final result. You may want to refactor it to optimize performance.
with s_records as ( select * from students.records ),
s_info as ( select * from students.info ),
joined as (
select
s_records.id,
s_records.prim_email,
s_records.sec_email,
s_records.phone,
s_info.name
from s_records
left join s_info
on s_records.id = s_info.id
where s_info.roll_no = '554' )
select * from joined
Overall, I think that a join will be part of your solution rather than a union :-)
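Here is a small sketch of the difference, using Python's sqlite3 module. The unqualified table names records and info and the sample rows are assumptions standing in for STUDENTS.RECORDS and STUDENTS.INFO. The UNION of a 4-column and a 1-column query is rejected, while the join puts the name on the same row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE info (id INTEGER PRIMARY KEY, roll_no TEXT, name TEXT);
    CREATE TABLE records (id INTEGER REFERENCES info(id),
                          prim_email TEXT, sec_email TEXT, phone TEXT);
    -- hypothetical sample data
    INSERT INTO info VALUES (1, '554', 'Alice'), (2, '555', 'Bob');
    INSERT INTO records VALUES
        (1, 'a@example.com', 'a2@example.com', '555-0100'),
        (2, 'b@example.com', NULL, '555-0101');
""")

def union_error():
    """Try the mismatched UNION and return the database's error message, if any."""
    try:
        conn.execute(
            "SELECT id, prim_email, sec_email, phone FROM records "
            "UNION SELECT name FROM info"
        )
    except sqlite3.OperationalError as exc:
        return str(exc)
    return None

# The join returns all five columns on one row per matching student.
joined = conn.execute("""
    SELECT r.id, r.prim_email, r.sec_email, r.phone, i.name
    FROM records AS r
    INNER JOIN info AS i ON i.id = r.id
    WHERE i.roll_no = ?
""", ("554",)).fetchall()

print(union_error())
print(joined)
```

Passing the roll number as a bound parameter instead of concatenating it into the SQL also takes care of the fact that it is user input.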
My original table (File05292019) has 22,904 records. I perform a self join on 3 of the fields as shown below and the result is 22,886. Why is this the case? What do the missing records represent?
SELECT File05292019.LastName, File05292019.FirstName, File05292019.SubscriberSocialSecurityNumber
FROM File05292019
INNER JOIN File05292019 AS File05292019_1
ON (File05292019.SubscriberSocialSecurityNumber = File05292019_1.SubscriberSocialSecurityNumber)
AND (File05292019.LastName = File05292019_1.LastName)
AND (File05292019.FirstName = File05292019_1.FirstName)
GROUP BY File05292019.LastName, File05292019.FirstName, File05292019.SubscriberSocialSecurityNumber;
Because of the GROUP BY clause. You must have duplicate records in the result set, and the grouping collapses them.
Check by running this query without the GROUP BY:
SELECT File05292019.LastName, File05292019.FirstName, File05292019.SubscriberSocialSecurityNumber
FROM File05292019
INNER JOIN File05292019 AS File05292019_1
ON (File05292019.SubscriberSocialSecurityNumber = File05292019_1.SubscriberSocialSecurityNumber)
AND (File05292019.LastName = File05292019_1.LastName)
AND (File05292019.FirstName = File05292019_1.FirstName)
The GROUP BY collapses rows that share the same values,
so the lower count means that some rows in your table have identical values in these three fields.
You could try using
SELECT File05292019.LastName
, File05292019.FirstName
, File05292019.SubscriberSocialSecurityNumber
, count(*)
FROM File05292019
GROUP BY File05292019.LastName
, File05292019.FirstName
, File05292019.SubscriberSocialSecurityNumber
HAVING count(*) > 1
to find these rows.
A couple of possibilities:
NULL values exist in the JOIN fields: SubscriberSocialSecurityNumber, LastName, and FirstName. Because NULL = NULL never evaluates to true, inner joins exclude rows with NULLs in these fields.
Duplicate values in the GROUP BY fields, where the aggregation returns distinct values by grouping. Add the COUNT(*) AS RecordCount aggregate to see which groups contain more than one record.
Possibly subscribers changed their names but retained the same SSNs; names and SSNs were entered incorrectly; or several records use a default placeholder like 999-99-9999?
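Both effects can be reproduced on a tiny table. The sketch below uses Python's sqlite3 module; the people table and its four rows are made up for illustration. One exact duplicate and one NULL SSN are enough to make the grouped self-join return fewer rows than the table holds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (LastName TEXT, FirstName TEXT, SSN TEXT);
    INSERT INTO people VALUES
        ('Smith', 'Ann', '111-11-1111'),
        ('Smith', 'Ann', '111-11-1111'),  -- exact duplicate
        ('Jones', 'Bob', NULL),           -- NULL in a join field
        ('Brown', 'Cy',  '333-33-3333');
""")

SELF_JOIN = """
    SELECT a.LastName, a.FirstName, a.SSN
    FROM people AS a
    INNER JOIN people AS b
        ON a.SSN = b.SSN AND a.LastName = b.LastName AND a.FirstName = b.FirstName
"""

def count(sql):
    """Return the number of rows produced by the given SELECT."""
    return conn.execute(f"SELECT COUNT(*) FROM ({sql})").fetchone()[0]

total_rows = count("SELECT * FROM people")
# Duplicates match each other (2 x 2 rows); the NULL row matches nothing.
joined_rows = count(SELF_JOIN)
# GROUP BY collapses the duplicated rows back into one row per distinct person.
grouped_rows = count(SELF_JOIN + " GROUP BY a.LastName, a.FirstName, a.SSN")

print(total_rows, joined_rows, grouped_rows)
```

Here the table has 4 rows but the grouped self-join returns 2: one row is lost to the NULL and one to the duplicate being collapsed.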
I have the following statement:
Select No, Region = 'Ohio'
FROM table
where PostCode >='0001'
AND PostCode <= '4999'
which gives me the correct state in the field Region. How can I extend that statement with several other WHERE conditions in the same statement?
e.g.
Region = 'NewYork'
Where PostCode >='5000'
AND PostCode <= '7999'
My solution would be to write a separate statement for each region, but there must be a better way to have them all in one.
Two common ways to select/set different values based on multiple criteria in a single query are CASE expressions and a join to another table that holds those values. I should also point out that you can take advantage of the BETWEEN operator in SQL Server for much of this.
CASE statements in a single query
A case statement might be useful if you have a small set of criteria, or if you just need to throw together an ad hoc query. Here is an example of using a case statement:
select
No,
Region = case
when (PostCode >= '0001' and PostCode <= '4999')
then 'Ohio'
when (PostCode between '5000' and '7999')
then 'NewYork'
else
'Unknown'
end
from [...]
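A runnable sketch of the CASE approach, using Python's sqlite3 module. The table name my_table and the sample rows are assumptions. SQLite does not accept the T-SQL "Region = case ..." form, so the standard "CASE ... END AS Region" spelling is used instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table ("No" INTEGER, PostCode TEXT);
    -- hypothetical sample data, including both range boundaries
    INSERT INTO my_table VALUES (1, '0001'), (2, '4999'), (3, '5000'), (4, '9000');
""")

# One pass over the table; each row gets the region of the first matching range.
regions = conn.execute("""
    SELECT "No",
           CASE
               WHEN PostCode BETWEEN '0001' AND '4999' THEN 'Ohio'
               WHEN PostCode BETWEEN '5000' AND '7999' THEN 'NewYork'
               ELSE 'Unknown'
           END AS Region
    FROM my_table
    ORDER BY "No"
""").fetchall()

print(regions)
```

Note that BETWEEN is inclusive on both ends, which matches the >= and <= conditions in the question.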
JOIN a table with the values and criteria
This is definitely the better method for something like evaluating 50 states - especially since this data is likely static. The idea is that you will want to have a table that contains the criteria and the value, and then join it to the table.
Here is an example using a temp table - you would likely want to use a real table for something as common as states.
-- Setup a #states table
create table #states (state varchar(20), PostCodeMin char(4), PostCodeMax char(4))
insert into #states values ('Ohio', '0001', '4999')
insert into #states values ('NewYork', '5000', '7999')
-- Now query it
select
t.No,
State = isnull(s.state, 'Unknown')
from
my_table t
left outer join #states s
on (t.PostCode between s.PostCodeMin and s.PostCodeMax)
Note that in the above query, I do a left outer join to #states, in case the state isn't set up. I also select the State using isnull, in case the outer join doesn't return anything for that particular row in my_table.
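The same lookup-table join can be tried out with Python's sqlite3 module. This is only a sketch: the sample rows are made up, a regular states table stands in for the #states temp table, and COALESCE replaces SQL Server's isnull, which SQLite does not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE states (state TEXT, PostCodeMin TEXT, PostCodeMax TEXT);
    INSERT INTO states VALUES ('Ohio', '0001', '4999'), ('NewYork', '5000', '7999');

    CREATE TABLE my_table ("No" INTEGER, PostCode TEXT);
    -- hypothetical sample data; '9000' falls outside every range
    INSERT INTO my_table VALUES (1, '0001'), (2, '4999'), (3, '5000'), (4, '9000');
""")

# LEFT JOIN keeps rows without a matching range; COALESCE labels them 'Unknown'.
result = conn.execute("""
    SELECT t."No", COALESCE(s.state, 'Unknown') AS State
    FROM my_table AS t
    LEFT OUTER JOIN states AS s
        ON t.PostCode BETWEEN s.PostCodeMin AND s.PostCodeMax
    ORDER BY t."No"
""").fetchall()

print(result)
```

Adding another region is then a single insert into states, with no change to the query.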
You can create a calculated field using a case statement on region. If there are going to be many "Unknown" records returned, then you may want to tweak the WHERE clause to filter out nonessential records for better performance.
SELECT
*
FROM
(
Select
No,
Region =
CASE
WHEN PostCode >='0001' AND PostCode <='4999' THEN 'Ohio'
WHEN PostCode >='5000' AND PostCode <='7999' THEN 'New York'
ELSE
'Unknown'
END
FROM table
where PostCode >='0001' AND PostCode <= '7999'
)AS X
ORDER BY
Region
I have a table with two columns, CountryCode and CountryName. There are duplicate entries in CountryCode. But I want to remove the non-duplicate entries and keep the rows which are duplicates in the CountryCode column. So I am trying to write an SQL statement to do this. I think I have to use HAVING but I'm not too sure how exactly to incorporate it. Thanks
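That's a bit odd. I was expecting you to want to remove the duplicate entries, not the other way around. But something like this should work in most databases (MySQL does not allow the delete target to appear directly in a subquery, so there the subquery would need to be wrapped in a derived table):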
delete from TableName
where CountryCode in (select CountryCode
from TableName
group by CountryCode
having count(*) = 1)
So to be clear, the subquery:
select CountryCode
from TableName
group by CountryCode
having count(*) = 1
... returns the CountryCode values that occur exactly once. And then the delete statement:
delete from TableName
where CountryCode in (...)
... deletes those unique rows so that the only rows remaining in your table should be the ones with duplicates.
However, by your comments, it sounds like you just want a query that returns only the duplicates. If that's the case, then just use the subquery inside a select statement, but modify the having clause to return only duplicates:
select *
from TableName
where CountryCode in (select CountryCode
from TableName
group by CountryCode
having count(*) > 1)
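Both statements can be checked on a small table. This is a sketch using Python's sqlite3 module; the table name countries and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (CountryCode TEXT, CountryName TEXT);
    -- hypothetical sample data: US and FR are duplicated, DE is unique
    INSERT INTO countries VALUES
        ('US', 'United States'), ('US', 'USA'),
        ('DE', 'Germany'),
        ('FR', 'France'), ('FR', 'French Republic');
""")

# Rows whose CountryCode occurs more than once.
duplicates = conn.execute("""
    SELECT CountryCode, CountryName
    FROM countries
    WHERE CountryCode IN (SELECT CountryCode
                          FROM countries
                          GROUP BY CountryCode
                          HAVING COUNT(*) > 1)
    ORDER BY CountryCode, CountryName
""").fetchall()

# Delete the rows whose CountryCode occurs exactly once.
deleted = conn.execute("""
    DELETE FROM countries
    WHERE CountryCode IN (SELECT CountryCode
                          FROM countries
                          GROUP BY CountryCode
                          HAVING COUNT(*) = 1)
""").rowcount

remaining = conn.execute("SELECT COUNT(*) FROM countries").fetchone()[0]
print(duplicates, deleted, remaining)
```

After the delete, the table contains exactly the rows that the select returned beforehand.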
This is a quick solution. It is probably not the fastest with a lot of entries, but it works.
SELECT * FROM [table] AS tbl
WHERE countrycode IN
(SELECT countrycode FROM [table] WHERE tbl.countryname <> countryname)
/* Words in uppercase are SQL Syntax */
By giving the outer table an alias (tbl), you can reference it in the nested query.
I am trying to construct a single SQL statement that returns unique, non-null values from multiple columns all located in the same table.
SELECT distinct tbl_data.code_1 FROM tbl_data
WHERE tbl_data.code_1 is not null
UNION
SELECT tbl_data.code_2 FROM tbl_data
WHERE tbl_data.code_2 is not null;
For example, tbl_data is as follows:
id code_1 code_2
--- -------- ----------
1 AB BC
2 BC
3 DE EF
4 BC
For the above table, the SQL query should return all unique non-null values from the two columns, namely: AB, BC, DE, EF.
I'm fairly new to SQL. My statement above works, but is there a cleaner way to write this SQL statement, since the columns are from the same table?
It's better to include code in your question, rather than ambiguous text data, so that we are all working with the same data. Here is the sample schema and data I have assumed:
CREATE TABLE tbl_data (
id INT NOT NULL,
code_1 CHAR(2),
code_2 CHAR(2)
);
INSERT INTO tbl_data (
id,
code_1,
code_2
)
VALUES
(1, 'AB', 'BC'),
(2, 'BC', NULL),
(3, 'DE', 'EF'),
(4, NULL, 'BC');
As Blorgbeard commented, the DISTINCT clause in your solution is unnecessary because the UNION operator eliminates duplicate rows. There is a UNION ALL operator that does not eliminate duplicates, but it is not appropriate here.
Rewriting your query without the DISTINCT clause is a fine solution to this problem:
SELECT code_1
FROM tbl_data
WHERE code_1 IS NOT NULL
UNION
SELECT code_2
FROM tbl_data
WHERE code_2 IS NOT NULL;
It doesn't matter that the two columns are in the same table. The solution would be the same even if the columns were in different tables.
If you don't like the redundancy of specifying the same filter clause twice, you can encapsulate the union query in a virtual table before filtering that:
SELECT code
FROM (
SELECT code_1
FROM tbl_data
UNION
SELECT code_2
FROM tbl_data
) AS DistinctCodes (code)
WHERE code IS NOT NULL;
I find the syntax of the second more ugly, but it is logically neater. But which one performs better?
I created a sqlfiddle that demonstrates that the query optimizer of SQL Server 2005 produces the same execution plan for the two different queries.
If SQL Server generates the same execution plan for two queries, then they are practically as well as logically equivalent.
Compare the above to the execution plan for the query in your question:
The DISTINCT clause makes SQL Server 2005 perform a redundant sort operation, because the query optimizer does not know that any duplicates filtered out by the DISTINCT in the first query would be filtered out by the UNION later anyway.
This query is logically equivalent to the other two, but the redundant operation makes it less efficient. On a large data set, I would expect your query to take longer to return a result set than the two here. Don't take my word for it; experiment in your own environment to be sure!
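If you want to check the logical equivalence yourself, here is a small sketch using Python's sqlite3 module with the sample data assumed above. It says nothing about SQL Server execution plans. SQLite does not support the "AS DistinctCodes (code)" column list, so the column alias is set inside the first SELECT of the subquery instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_data (id INT NOT NULL, code_1 CHAR(2), code_2 CHAR(2));
    INSERT INTO tbl_data (id, code_1, code_2) VALUES
        (1, 'AB', 'BC'),
        (2, 'BC', NULL),
        (3, 'DE', 'EF'),
        (4, NULL, 'BC');
""")

# Filter NULLs in each branch, then let UNION remove the duplicates.
FILTER_THEN_UNION = """
    SELECT code_1 FROM tbl_data WHERE code_1 IS NOT NULL
    UNION
    SELECT code_2 FROM tbl_data WHERE code_2 IS NOT NULL
"""

# Union first, then filter the NULL once on the derived table.
UNION_THEN_FILTER = """
    SELECT code
    FROM (
        SELECT code_1 AS code FROM tbl_data
        UNION
        SELECT code_2 FROM tbl_data
    ) AS DistinctCodes
    WHERE code IS NOT NULL
"""

def codes(sql):
    """Return the single-column result as a sorted list of values."""
    return sorted(row[0] for row in conn.execute(sql))

first = codes(FILTER_THEN_UNION)
second = codes(UNION_THEN_FILTER)
print(first, second)
```

Both queries return AB, BC, DE and EF for this data.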
Try using a subquery:
SELECT derivedtable.NewColumn
FROM
(
SELECT code_1 as NewColumn FROM tbl_data
UNION
SELECT code_2 as NewColumn FROM tbl_data
) derivedtable
WHERE derivedtable.NewColumn IS NOT NULL
The UNION already returns DISTINCT values from the combined query.
UNION can be applied wherever the rows being combined are compatible in number and type of columns. It doesn't matter whether the columns come from the same table or from different tables; the results are the same (as already mentioned in one of the answers above).
As you don't want duplicates, there's no point in using UNION ALL, and DISTINCT is simply unnecessary because UNION already returns distinct data.
Creating a view may be the best choice, as a view is a virtual representation of the query. Later modifications can then be made neatly in that view.
Create VIEW getData AS
(
SELECT tbl_data.code_1
FROM tbl_data
WHERE tbl_data.code_1 is not null
UNION
SELECT tbl_data.code_2
FROM tbl_data
WHERE tbl_data.code_2 is not null
);
Try this if you have more than two columns:
CREATE TABLE #temptable (Name1 VARCHAR(25),Name2 VARCHAR(25))
INSERT INTO #temptable(Name1, Name2)
VALUES('JON', 'Harry'), ('JON', 'JON'), ('Sam','harry')
SELECT t.Name1+','+t.Name2 Names INTO #t FROM #temptable AS t
SELECT DISTINCT ss.value FROM #t AS t
CROSS APPLY STRING_SPLIT(T.Names,',') AS ss