Merging two columns but only unique combinations - sql

I have two columns, each with identification numbers that have been brought in from different datasheets.
I want to combine this into one column with both identification numbers if they are different, but only one of the identification numbers if they are the same.
I'm using SELECT DISTINCT CONCAT(column 1, column 2) AS column 3 to combine the columns, but I cannot filter the result down to UNIQUE combinations.
When I try WHERE column 1 <> column 2, I get an error message.
Any suggestions?

You can use CASE WHEN to test for conditions:
SELECT DISTINCT CASE WHEN column1 = column2 THEN column1
ELSE CONCAT(column1, column2)
END AS column3
FROM table1

Try this, using IIF or CASE with CONCAT:
select
distinct
iif(col1<>col2,concat(col1,col2),col1) [myid]
from mytable
or
select
distinct
case when col1<>col2 then
concat(col1,col2)
else col1 end [myid]
from mytable

You should do something like:
SELECT DISTINCT CASE WHEN column1 = column2 THEN column1
ELSE column1 + '|' + column2
END AS combinedColumn
FROM table1
Consider the following chart:
column1  column2  column1+column2  column1+'|'+column2
12       34       1234             12|34
123      4        1234             123|4
1234     1234     1234             1234
Without a separator, column1+column2 loses information: all three rows in the chart produce 1234, so you can no longer tell what the original parts were.
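To make the ambiguity concrete, here is a minimal sketch that runs both variants against the three rows from the chart. It assumes SQLite through Python's sqlite3 module (my choice, not something the question specifies), so || stands in for + or CONCAT; table1, column1 and column2 are the names used in the answers.

```python
import sqlite3

# In-memory SQLite database; an assumption made only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (column1 TEXT, column2 TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("12", "34"), ("123", "4"), ("1234", "1234")],
)

# Plain concatenation: SQLite uses || where SQL Server would use + or CONCAT.
plain = conn.execute(
    """
    SELECT DISTINCT CASE WHEN column1 = column2 THEN column1
                         ELSE column1 || column2
                    END AS column3
    FROM table1
    """
).fetchall()

# Same query, but with a separator between the two parts.
separated = conn.execute(
    """
    SELECT DISTINCT CASE WHEN column1 = column2 THEN column1
                         ELSE column1 || '|' || column2
                    END AS combinedColumn
    FROM table1
    """
).fetchall()

print(plain)      # a single row: every input collapses to '1234'
print(separated)  # three distinct rows, one per input pair
```

With the separator, DISTINCT only removes rows that really are the same combination.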

Related

PostgreSQL count multiple columns of the same table

I want to count some columns from a table in PostgreSQL.
For each column count I have some conditions and I want to have all in one query. My problem is the fact that I don't get the expected results in counting because I tried to apply all the conditions for the entire data set.
The table:
column1  column2  column3
UUID10   UUID20   UUID30
NULL     UUID21   NULL
NULL     UUID22   UUID31
UUID11   UUID20   UUID30
This is what I tried so far:
SELECT
COUNT(DISTINCT column1) AS column1_count,
COUNT(DISTINCT column2) AS column2_count,
COUNT(DISTINCT column3) AS column3_count
FROM TABLE
WHERE
column2 IN ('UUID20', 'UUID21', 'UUID22')
AND column1 = 'UUID10' -- this condition should be removed from this WHERE clause
OR column3 IN ('UUID30', 'UUID31')
Result:
column1_count  column2_count  column3_count
2              3              2
The result is not correct because I should have column1_count = 1. I mean, this is what the query does, but it is not what I intended. So I thought of putting the constraints for column2 and column3 in a subquery, with a separate condition just for column1.
A second try:
SELECT *
FROM
(
SELECT
column1,
column2,
column3
FROM TABLE
WHERE
column2 IN ('UUID20', 'UUID21', 'UUID22')
OR column3 IN ('UUID30', 'UUID31')
) x
WHERE
column1 = 'UUID10'
Result:
column1_count  column2_count  column3_count
1              1              1
Because the last condition on column1 is restricting my result, I end up having 1 for all the counts.
How can I apply different conditions for counting each column?
I would prefer not to use UNION if possible. Maybe the subqueries can be arranged differently from what I tried so far. I just need a way to keep the constraint on column1 out of the WHERE clause that applies to column2 and column3.
I think you want conditional aggregation:
SELECT COUNT(DISTINCT CASE WHEN column1 = 'UUID10' THEN column1 END) AS column1_count,
COUNT(DISTINCT column2) AS column2_count,
COUNT(DISTINCT column3) AS column3_count
FROM TABLE
WHERE column2 IN ('UUID20', 'UUID21', 'UUID22') OR
column3 IN ('UUID30', 'UUID31');
I assume that you are aware that COUNT(DISTINCT CASE WHEN column1 = 'UUID10' THEN column1 END) is not particularly useful code. It returns 1 or 0 depending on whether the value is present. I assume your code is actually more interesting.
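As a quick check, the conditional aggregation can be run against the four sample rows from the question. This is only a sketch under some assumptions: it uses SQLite through Python's sqlite3 rather than PostgreSQL, and the table is named t because TABLE itself is a reserved word.

```python
import sqlite3

# In-memory SQLite stand-in for the PostgreSQL table; "t" is a hypothetical name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (column1 TEXT, column2 TEXT, column3 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("UUID10", "UUID20", "UUID30"),
        (None, "UUID21", None),
        (None, "UUID22", "UUID31"),
        ("UUID11", "UUID20", "UUID30"),
    ],
)

# The column1 condition lives inside the aggregate, so it no longer
# restricts the rows that feed the column2 and column3 counts.
counts = conn.execute(
    """
    SELECT COUNT(DISTINCT CASE WHEN column1 = 'UUID10' THEN column1 END),
           COUNT(DISTINCT column2),
           COUNT(DISTINCT column3)
    FROM t
    WHERE column2 IN ('UUID20', 'UUID21', 'UUID22')
       OR column3 IN ('UUID30', 'UUID31')
    """
).fetchone()

print(counts)  # (1, 3, 2)
```

The CASE expression yields NULL for rows that fail the column1 condition, and COUNT ignores NULLs, which is why only the matching value is counted.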

SQL select earlier date (including NULL)

I am trying to select the earlier date/time from two given columns. However, I am running into issues if one of the two columns has a null value.
My thought is:
select case when dateTime1 < datetime2 then column1 else column2
end as EarlierDate
from table
However, the above method always returns null values, regardless of how I change the comparison operator.
You can have:
SELECT CASE
         WHEN Column1 IS NULL THEN Column2
         WHEN Column2 IS NULL THEN Column1
         WHEN Column1 > Column2 THEN Column2
         ELSE Column1 -- Column1 <= Column2; ELSE also covers equal values, which would otherwise return NULL
       END AS EarlierDate
FROM TableName
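Here is a small sketch of the NULL-aware CASE on a few made-up rows. It is run in SQLite via Python's sqlite3, which is an assumption on my part. The id column and the sample timestamps are hypothetical, and the dates are stored as ISO-8601 text, which compares correctly as strings. This variant ends with ELSE so that two equal dates return that date rather than NULL.

```python
import sqlite3

# In-memory SQLite database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableName (id INTEGER, Column1 TEXT, Column2 TEXT)")
conn.executemany(
    "INSERT INTO TableName VALUES (?, ?, ?)",
    [
        (1, "2020-01-01 08:00:00", "2020-06-01 08:00:00"),  # Column1 earlier
        (2, None, "2020-03-01 08:00:00"),                   # Column1 missing
        (3, "2020-02-01 08:00:00", None),                   # Column2 missing
        (4, "2020-05-05 08:00:00", "2020-05-05 08:00:00"),  # equal values
        (5, None, None),                                    # both missing
    ],
)

earlier = [
    row[0]
    for row in conn.execute(
        """
        SELECT CASE
                 WHEN Column1 IS NULL THEN Column2
                 WHEN Column2 IS NULL THEN Column1
                 WHEN Column1 > Column2 THEN Column2
                 ELSE Column1
               END AS EarlierDate
        FROM TableName
        ORDER BY id
        """
    )
]

for value in earlier:
    print(value)  # the last row prints None because both inputs are NULL
```

The NULL checks have to come first: any comparison with NULL is unknown, so a bare Column1 < Column2 test never takes the THEN branch when either side is NULL.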

SQL sum different columns into one statement

I am building an application that uses a SQL statement to sum three columns. Below is a sample table:
Table1
column1  column2  column3
NULL     30.00    NULL
60.00    NULL     NULL
NULL     10.00    NULL
NULL     NULL     15.00
I want to sum column1, column2, and column3 into one statement. I want the result to be 115.00 (30.00 + 60.00 + 10.00 + 15.00). The table can have data in one of the three columns, but never in any two.
This is what I have so far:
SELECT ISNULL(sum(column1),ISNULL(sum(column2),sum(column3))) as amount FROM Table1
The result is something not remotely close.
The COALESCE function will also work. In the given example:
SELECT SUM(COALESCE(column1,0))
+ SUM(COALESCE(column2,0))
+ SUM(COALESCE(column3,0))
AS TOTAL FROM Table1
In your case, you can do:
select sum(column1) + sum(column2) + sum(column3)
from table1 t;
Because each column has at least one value, the individual sums will not be NULL, so this will give the expected value.
To be safe, you could use coalesce():
select coalesce(sum(column1), 0) + coalesce(sum(column2), 0) + coalesce(sum(column3), 0)
from table1 t;
Use coalesce to assign 0 when there is null in the column and then sum the values.
SELECT SUM(coalesce(column1,0)+coalesce(column2,0)+coalesce(column3,0)) as amount
FROM Table1
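The difference between these variants is easy to see on the sample table. The sketch below assumes SQLite via Python's sqlite3, so COALESCE is used in place of SQL Server's ISNULL for the question's nested attempt. Table1 and the column names come from the question.

```python
import sqlite3

# In-memory SQLite database loaded with the question's four rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (column1 REAL, column2 REAL, column3 REAL)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [(None, 30.0, None), (60.0, None, None), (None, 10.0, None), (None, None, 15.0)],
)


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


# Sum each column separately, then add the three totals.
per_column = scalar(
    "SELECT COALESCE(SUM(column1), 0) + COALESCE(SUM(column2), 0)"
    " + COALESCE(SUM(column3), 0) FROM Table1"
)

# Replace NULLs row by row, then sum the row totals.
per_row = scalar(
    "SELECT SUM(COALESCE(column1, 0) + COALESCE(column2, 0)"
    " + COALESCE(column3, 0)) FROM Table1"
)

# Without COALESCE every row total is NULL, so the sum is NULL as well.
naive = scalar("SELECT SUM(column1 + column2 + column3) FROM Table1")

# Nested form from the question: it stops at the first non-NULL column sum.
nested = scalar(
    "SELECT COALESCE(SUM(column1), COALESCE(SUM(column2), SUM(column3))) FROM Table1"
)

print(per_column, per_row, naive, nested)  # 115.0 115.0 None 60.0
```

The nested form explains the question's wrong result: it is a fallback chain, not an addition, so it returns only the first column's total.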

Searching in SQL based on values in 2 columns

I currently have 2 columns in my table and I'm trying to return all values in column 1 that are never paired with a certain value in column 2.
Example: column 1 holds a random 9-digit value, sometimes repeated. Column 2 has 4 possible values: P1, P2, P3, P4.
I only want to display the column 1 values that have no P4 in column 2. If a column 1 value never appears with P4, I want all of its rows displayed; but once a column 1 value is associated with P4 on any row, I don't want any rows for that value displayed. In the end, the only column 1 values shown should be those with no associated P4 in column 2.
You mean something like this?
SELECT *
FROM YOUR_TABLE
WHERE COLUMN1 NOT IN (
SELECT COLUMN1
FROM YOUR_TABLE
WHERE COLUMN2 = 'P4'
)
Wouldn't this just be
SELECT column1 FROM <table> WHERE column2 != 'P4'
This is an example of a query where you are looking at sets within sets -- that is, sets of column2 within values of column1. I prefer using group by and having for these queries:
select column1
from t
group by column1
having sum(case when column2 = 'P4' then 1 else 0 end) = 0
To get all the values, you would join back to the original table:
select t.*
from t join
(select column1
from t
group by column1
having sum(case when column2 = 'P4' then 1 else 0 end) = 0
) c1
on t.column1 = c1.column1
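To compare the approaches side by side, here is a minimal sketch on made-up data. It assumes SQLite via Python's sqlite3, and both the table name t and the sample IDs are hypothetical. The value 111111111 has both a P1 row and a P4 row, so it should be excluded entirely.

```python
import sqlite3

# In-memory SQLite database with hypothetical 9-digit IDs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (column1 TEXT, column2 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("111111111", "P1"),
        ("111111111", "P4"),
        ("222222222", "P2"),
        ("222222222", "P3"),
        ("333333333", "P4"),
    ],
)

# Row-level filter: drops the P4 rows but keeps 111111111 via its P1 row.
naive = [
    r[0]
    for r in conn.execute(
        "SELECT DISTINCT column1 FROM t WHERE column2 != 'P4' ORDER BY column1"
    )
]

# Group-level filter: keeps only IDs with zero P4 rows.
grouped = [
    r[0]
    for r in conn.execute(
        """
        SELECT column1
        FROM t
        GROUP BY column1
        HAVING SUM(CASE WHEN column2 = 'P4' THEN 1 ELSE 0 END) = 0
        """
    )
]

# NOT IN subquery: returns the full rows for the same surviving IDs.
not_in = conn.execute(
    """
    SELECT column1, column2
    FROM t
    WHERE column1 NOT IN (SELECT column1 FROM t WHERE column2 = 'P4')
    ORDER BY column2
    """
).fetchall()

print(naive)    # ['111111111', '222222222']
print(grouped)  # ['222222222']
print(not_in)   # [('222222222', 'P2'), ('222222222', 'P3')]
```

The simple column2 != 'P4' filter works per row, which is why it cannot express "this ID never appears with P4".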

Select BETWEEN column values

I'm trying to use the BETWEEN with column names instead of direct values, something like this:
SELECT * FROM table WHERE column1 BETWEEN column2 AND column3;
This returns something like 17 rows, but if I write:
SELECT * FROM table WHERE (column1 <= column2 AND column1 >= column3) OR (column1 >= column2 AND column1 <= column3)
I get around 600 rows.
In both cases I only get rows where the column1 value is actually the middle value, but the 2nd method gives me many more results, so something seems wrong with the 1st method.
I suspect the problem might be in using the BETWEEN clause with column names instead of literal values, and that SQL is somehow converting the column names to actual values. It's strange; can someone enlighten me, please?
Thanks
SELECT * FROM table WHERE column1 BETWEEN column2 AND column3; # gives 17 rows
is the same as
SELECT * FROM table WHERE (column1 >= column2 AND column1 <= column3) # gives 17 rows
Because of your additional check of
(column1 <= column2 AND column1 >= column3)
which is ORed, you get additional rows.
BETWEEN A AND B assumes that A <= B, i.e., that the first expression (A) is no greater than the second (B); it does not also check the opposite order.
For example, WHERE 3 BETWEEN 4 AND 2 returns no rows.
Or, if you write
Select Case When 3 Between 4 and 2 then 'true' else 'false' end
it will return 'false'.
Your logic for the two statements is not the same:
SELECT * FROM table WHERE (column1 <= column2 AND column1 >= column3) OR (column1 >= column2 AND column1 <= column3)
It has two clauses ORed together. Remove the first and you should get the same results as your BETWEEN statement.
SELECT * FROM table WHERE (column1 >= column2 AND column1 <= column3)
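The row-count difference can be reproduced with three made-up rows. This sketch assumes SQLite via Python's sqlite3. The table is named t (TABLE is a reserved word) and the integer values are hypothetical. The second row has its bounds reversed, so BETWEEN rejects it while the ORed condition accepts it.

```python
import sqlite3

# In-memory SQLite database; values chosen only to illustrate bound ordering.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (column1 INTEGER, column2 INTEGER, column3 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (5, 1, 10),   # column2 <= column1 <= column3
        (5, 10, 1),   # bounds reversed: column3 <= column1 <= column2
        (20, 1, 10),  # column1 outside both bounds
    ],
)

# BETWEEN only tests column1 >= column2 AND column1 <= column3.
between = conn.execute(
    "SELECT column1, column2, column3 FROM t WHERE column1 BETWEEN column2 AND column3"
).fetchall()

# The ORed condition from the question accepts either ordering of the bounds.
either = conn.execute(
    """
    SELECT column1, column2, column3
    FROM t
    WHERE (column1 <= column2 AND column1 >= column3)
       OR (column1 >= column2 AND column1 <= column3)
    ORDER BY column2
    """
).fetchall()

# The literal case from the answer: SQLite returns 0 for false and 1 for true.
literal = conn.execute("SELECT 3 BETWEEN 4 AND 2, 3 BETWEEN 2 AND 4").fetchone()

print(between)  # [(5, 1, 10)]
print(either)   # [(5, 1, 10), (5, 10, 1)]
print(literal)  # (0, 1)
```

So BETWEEN is not broken with column names; the extra rows in the second query are the ones where column2 is greater than column3.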