SQL order by two different (possibly null) columns - sql

I have a table with three columns; the first column contains IDs and the other two columns contain dates (where at most one is null, but I don't think this should affect anything). How would I go about ordering the IDs based on which date is larger? I've tried
ORDER BY CASE
WHEN date1 > date2 THEN date1
ELSE date2
END
but this didn't work. Can anyone help me? Also, all of the similar problems that I've seen others post have it so that the query sorts the results based on the first column, and then if the first column is null, the second column. Would I first have to define every single null value? I'm creating this table by a full outer join, so that would be an entirely different question to ask, so hopefully it can be done with null values.

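I believe the problem is that a comparison involving NULL evaluates to unknown rather than true, so when date2 is NULL your CASE falls through to ELSE and returns NULL. You probably need to handle the NULL cases explicitly: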
ORDER BY CASE
WHEN date1 IS NULL THEN date2
WHEN date2 IS NULL THEN date1
WHEN date1 > date2 THEN date1
ELSE date2
END
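To see the difference between the two CASE expressions, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database. The table name t and the sample dates are made up for illustration, and SQLite places NULL keys first in ascending order, which may differ in other databases.

```python
import sqlite3

# Hypothetical sample data: at most one of date1/date2 is NULL per row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, date1 TEXT, date2 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "2020-01-05", None),
        (2, None, "2020-01-03"),
        (3, "2020-01-01", "2020-01-09"),
        (4, "2020-01-07", "2020-01-02"),
    ],
)

# Original attempt: when date2 is NULL the comparison is unknown,
# so ELSE returns date2 (NULL) and the row sorts as if it had no date.
naive_ids = [r[0] for r in conn.execute(
    """
    SELECT id FROM t
    ORDER BY CASE WHEN date1 > date2 THEN date1 ELSE date2 END
    """
)]

# NULL-aware version from the answer above.
fixed_ids = [r[0] for r in conn.execute(
    """
    SELECT id FROM t
    ORDER BY CASE
        WHEN date1 IS NULL THEN date2
        WHEN date2 IS NULL THEN date1
        WHEN date1 > date2 THEN date1
        ELSE date2
    END
    """
)]

print(naive_ids)  # id 1 is misplaced because its sort key is NULL
print(fixed_ids)  # ordered by the later of the two dates
```

With this data the NULL-aware version orders the ids by 2020-01-03, 2020-01-05, 2020-01-07, 2020-01-09, while the original attempt sorts id 1 on a NULL key.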

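Try GREATEST. In most databases MAX is an aggregate that takes a single argument; the two-argument form is spelled GREATEST in MySQL, Oracle, and PostgreSQL. In MySQL and Oracle, GREATEST returns NULL if either argument is NULL, so fall back to whichever date is present with COALESCE:
SELECT id, COALESCE(GREATEST(date1, date2), date1, date2) AS latest_date FROM your_table ORDER BY latest_date;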

Related

SQL - Join two tables based on a condition

I have an issue.
Table 1 ->
It has columns ID1 and Date1 (yyyymmdd) and Time1 (hh:mm:ss)
Table 2 ->
It has columns ID2 and Date2 (yyyymmdd hh:mm:ss)
None of the ID columns are unique.
What my first intention was, joining the tables on ID1=ID2 and date1 = cast (date2 as date)
and time1= cast(date2 as time).
However, I realized that the time in Time1 and the time part of Date2 are not always the same, although they are supposed to be.
Instead, I wanted to see what happens if I join using only ID and date, excluding time.
With a left join, I get exactly one extra row. That is because table 2 has more than one row
with the corresponding ID1 and Date1 values, so the join returns 2 rows instead of 1. BUT for one of those rows, the time values in table1 and table2 actually match up.
To sum up :
I can't get the correct results when I join the tables using the ID, date, and time columns.
I get one extra row when I join the tables using only ID and date.
Is there any way to set up a condition somewhere in the query so that the join uses the ID and date columns by default, but also includes the time column when this row duplication is possible?
In other words, I want to combine "join on" with "case when".
I hope I have explained the problem clearly enough. Thank you all!

Condition inside a count -SQL

I am trying to write a condition inside a count so that it only counts the entries which do not have an ENDDATE. I want the condition inside the count itself because this is a very small part of a large SQL query.
Sample query:
select product, count(*) as quantity
from table
where end_date is null
group by product
This query lists the quantity for each product that does not have an end date.
One method uses conditional aggregation:
select sum(case when end_date is null then 1 else 0 end) as NumNull
. . .
Another method is just to subtract two counts:
select ( count(*) - count(end_date) ) as NumNull
count(end_date) counts the number that are not NULL, so subtracting this from the full count gets the number that are NULL.
Uhmmmm.
It sounds like you are looking for conditional aggregation.
So, if you have a current statement that's sort of working (and we're just guessing because we don't see anything you have attempted so far...)
SELECT COUNT(1)
FROM mytable t
And you want another expression that returns a count of rows that meet some set of conditions...
and when you say "do not have an ENDDATE", you are referring to rows that have an ENDDATE value of NULL (and again, we're just guessing that the table has a column named ENDDATE. Every row will have an ENDDATE column.)
We'll use an ANSI standards compliant CASE expression, because this would work in most databases (SQL Server, Oracle, MySQL, Postgres...) and we don't have a clue what database you are using.
SELECT COUNT(1)
, COUNT(CASE WHEN t.ENDDATE IS NULL THEN 1 ELSE NULL END) AS cnt_null_enddate
FROM mytable t
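The three approaches mentioned in these answers (SUM over a CASE, COUNT over a CASE, and subtracting two counts) should agree. A minimal sketch to check that, using Python's built-in sqlite3 module; the products and dates are invented sample data, and the column is named end_date here as in the question's sample query.

```python
import sqlite3

# Hypothetical sample data: a NULL end_date means "no end date".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (product TEXT, end_date TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        ("apple", None),
        ("apple", "2021-03-01"),
        ("apple", None),
        ("pear", "2021-04-01"),
        ("pear", None),
    ],
)

rows = conn.execute(
    """
    SELECT product,
           COUNT(*) AS total,
           -- conditional aggregation with SUM
           SUM(CASE WHEN end_date IS NULL THEN 1 ELSE 0 END) AS by_sum,
           -- COUNT skips NULLs, so only rows matching the WHEN are counted
           COUNT(CASE WHEN end_date IS NULL THEN 1 END) AS by_count_case,
           -- COUNT(end_date) counts non-NULL values only
           COUNT(*) - COUNT(end_date) AS by_subtraction
    FROM mytable
    GROUP BY product
    ORDER BY product
    """
).fetchall()

for row in rows:
    print(row)
```

Because the condition lives inside the aggregate rather than in WHERE, the same query can still return the unfiltered total alongside the conditional count.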

Turning multiple rows into single row based on ID, and keeping null values

I have tried some of the various solutions posted on Stack for this issue but none of them keep null values (and it seems like the entire query is built off that assumption).
I have a table with 1 million rows. There are 10 columns. The first column is the id. Each id is unique to "item" (in my case a sales order) but has multiple rows. Each row is either completely null or has a single value in one of the columns. No two rows with the same ID have data for the same column. I need to merge these multiple rows into a single row based on the ID. However, I need to keep the null values. If the first column is null in all rows I need to keep that in the final data.
Can someone please help me with this query? I've been stuck on it for 2 hours now.
id   age   firstname   lastname
1    13    null        null
1    null  chris       null
should output
1    13    chris       null
It sounds like you want an aggregation query:
select id, max(col1) as col1, max(col2) as col2, . . .
from t
group by id;
If all values are NULL, then this will produce NULL. If one of the rows (for an id) has a value, then this will produce that value.
select id, max(col1), max(col2) -- and so on for the remaining columns
from mytable
group by id
As some others have mentioned, you should use an aggregation query to achieve this.
select t1.id, max(t1.col1), max(t1.col2)
from tableone t1
group by t1.id
This should return nulls wherever every row for an ID is null in that column, because MAX() ignores nulls and only returns null when there is nothing else to return. If you're having issues handling your nulls, maybe implement some logic using ISNULL(). Make sure your data fields really are nulls and not empty strings.
If nulls aren't being returned, check that EVERY single row with that particular ID really has a null in that column. If one of them holds an empty string, then MAX() will skip the nulls and return the empty string instead.
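A minimal sketch of the aggregation approach, run through Python's built-in sqlite3 module. The table name t and the rows for ids 2 and 3 are invented for illustration; id 3 shows the empty-string pitfall described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, age INTEGER, firstname TEXT, lastname TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, 13, None, None),       # rows from the question
        (1, None, "chris", None),
        (2, None, None, None),     # a completely NULL row
        (2, None, None, "smith"),
        (3, None, "", None),       # empty string, not NULL
        (3, None, None, None),
    ],
)

# MAX() ignores NULLs, so each column collapses to its single non-NULL
# value, or stays NULL when every row for that id is NULL in the column.
merged = conn.execute(
    """
    SELECT id, MAX(age), MAX(firstname), MAX(lastname)
    FROM t
    GROUP BY id
    ORDER BY id
    """
).fetchall()

for row in merged:
    print(row)
```

For id 3 the firstname comes back as an empty string rather than NULL, which is why it is worth checking what the "empty" cells really contain.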

Exclude leading NULL values from table

To give some context, I am using time series data (one column) and I want to study gaps in the data, represented by NULL values in the data set. However, I expect some leading NULL values that I am not interested in including in my final data set, and the number of leading NULL values will vary between data sets.
I would like to exclude the top x number of rows of my data set where the value of a particular column is NULL, without excluding NULL values that appear lower in the same column.
Any help would be much appreciated.
Thanks!
EDIT: I also know that my first record in the value column is always 1, if that helps.
Unfortunately, for SQL Server 2008, I can't think of anything cleaner than:
SELECT row_number,value FROM <table> t1
WHERE value is not NULL OR
EXISTS (select * FROM <table> t2
where t2.value is not null and
t2.row_number < t1.row_number)
Just as an aside, for SQL Server 2012, you could use MAX() with an appropriate OVER() clause such that it considers all previous rows. If that MAX() returns NULL then all preceding rows are known to be NULL, and that's what I'd recommend if/when you upgrade.
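Here is a rough sketch of both ideas side by side, using Python's built-in sqlite3 module rather than SQL Server. It assumes an SQLite build with window function support (3.25 or later); the table name series and the columns rn and val are hypothetical stand-ins for the placeholders in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE series (rn INTEGER PRIMARY KEY, val INTEGER)")
conn.executemany(
    "INSERT INTO series VALUES (?, ?)",
    [(1, None), (2, None), (3, 1), (4, 5), (5, None), (6, 7), (7, None)],
)

# EXISTS version: keep a row if it is non-NULL or some earlier row is.
exists_rows = conn.execute(
    """
    SELECT rn, val FROM series t1
    WHERE val IS NOT NULL
       OR EXISTS (SELECT 1 FROM series t2
                  WHERE t2.val IS NOT NULL AND t2.rn < t1.rn)
    ORDER BY rn
    """
).fetchall()

# Window version: a running MAX over all rows up to the current one is
# NULL only while nothing but NULLs has been seen so far.
window_rows = conn.execute(
    """
    SELECT rn, val
    FROM (
        SELECT rn, val,
               MAX(val) OVER (
                   ORDER BY rn
                   ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
               ) AS seen
        FROM series
    )
    WHERE seen IS NOT NULL
    ORDER BY rn
    """
).fetchall()

print(exists_rows)
print(window_rows)
```

Both queries drop rows 1 and 2 and keep the NULLs at rows 5 and 7. The window function has to sit in a subquery because it cannot be referenced directly in WHERE.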
You could find the first non-null item for each data set and then just query everything after that:
WITH FirstItem AS
(
SELECT
DataSetID,
MIN(row_number) row_number
FROM Data
WHERE value IS NOT NULL
GROUP BY DataSetID
)
SELECT d.* FROM Data d
INNER JOIN FirstItem fi
ON d.DataSetID = fi.DataSetID
AND d.row_number >= fi.row_number

Checksum for datetime field in SQL

Task:
We are validating the data in 2 different tables (of same structure). We tried to use the checksum function on the records for this.
Problem:
The records in the tables are the same, but when we use checksum(*), it gives a different checksum for each table.
SELECT statusName,CheckSum(*) from OrderStatus
If I calculate the checksum excluding the DateTime columns, it gives the same value in both tables.
SELECT statusName,CheckSum(StatusName,CreatedByUser,ModifiedByUser) from OrderStatus
Columns in Tables:
StatusName,CreatedByUser,ModifiedByUser,CreatedDateTime,LastModifiedTime
How can we resolve this while still including the datetime columns?
Any help is appreciated!
For a checksum, the order of the columns makes a difference.
Replace the * in the first query with the exact same list of columns. CheckSum should work on date and datetimes.
It appears to be working for me. Here is the SQL Fiddle.
I suspect something is off with your DateTime columns -- run this to see if anything is different:
SELECT *
FROM OrderStatus OS1
LEFT JOIN OrderStatus OS2 ON OS1.CreatedDateTime = OS2.CreatedDateTime AND OS1.LastModifiedTime=OS2.LastModifiedTime
WHERE OS2.LastModifiedTime IS NULL
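As a sketch of how that kind of check can surface a difference, here is the same LEFT JOIN pattern run between two copies of the data with Python's built-in sqlite3 module. The table names status_a and status_b, the sample rows, and the millisecond difference are all assumptions made up for illustration; nothing in the question confirms that this is the actual cause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
ddl = "CREATE TABLE {} (StatusName TEXT, CreatedDateTime TEXT, LastModifiedTime TEXT)"
conn.execute(ddl.format("status_a"))
conn.execute(ddl.format("status_b"))

conn.executemany(
    "INSERT INTO status_a VALUES (?, ?, ?)",
    [
        ("Open", "2020-01-01 10:00:00.000", "2020-01-02 09:00:00.000"),
        ("Closed", "2020-01-03 10:00:00.000", "2020-01-04 09:00:00.000"),
    ],
)
conn.executemany(
    "INSERT INTO status_b VALUES (?, ?, ?)",
    [
        ("Open", "2020-01-01 10:00:00.000", "2020-01-02 09:00:00.000"),
        # Same row, but the timestamp differs by 3 ms (hypothetical).
        ("Closed", "2020-01-03 10:00:00.000", "2020-01-04 09:00:00.003"),
    ],
)

# Rows in status_a with no exact datetime match in status_b.
unmatched = conn.execute(
    """
    SELECT a.StatusName, a.LastModifiedTime
    FROM status_a a
    LEFT JOIN status_b b
           ON a.CreatedDateTime = b.CreatedDateTime
          AND a.LastModifiedTime = b.LastModifiedTime
    WHERE b.LastModifiedTime IS NULL
    """
).fetchall()

print(unmatched)
```

Any row that comes back looks identical at a glance but would contribute a different value to a checksum that includes the datetime columns.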
Try this:
SELECT statusName,
CheckSum(CreatedByUser,ModifiedByUser,CreatedDateTime,LastModifiedTime)
from OrderStatus