SQL - Join two tables based on a condition

I have two tables that I need to join.
Table 1 has columns ID1, Date1 (yyyymmdd), and Time1 (hh:mm:ss).
Table 2 has columns ID2 and Date2 (yyyymmdd hh:mm:ss).
Neither ID column is unique.
My first intention was to join the tables on ID1 = ID2 and Date1 = cast(Date2 as date) and Time1 = cast(Date2 as time).
However, I realized that Time1 and the time part of Date2 are not always the same, although they are supposed to be.
So I tried joining on ID and date only, excluding time.
With a left join, I get exactly one extra row. That is because Table 2 has more than one row with the corresponding ID1 and Date1 values, so the join returns 2 rows instead of 1. BUT for one of those rows, the time values in Table 1 and Table 2 actually do match.
To sum up:
I can't get the correct results when I join the tables on the ID, date, and time columns.
I get one extra row when I join the tables on ID and date only.
Is there any way to set up a condition somewhere in the query so that it joins on the ID and date columns by default, but also includes the time column when this row duplication is possible?
In other words, I want to combine "join on" with "case when".
I hope I have explained the problem clearly enough. Thank you all!
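
One possible way to express the "date by default, time only when needed" rule is to put the condition inside the ON clause. The sketch below is only an illustration under assumptions: it uses SQLite through Python's sqlite3 module, the table and column names (table1, table2, id1, date1, time1, id2, date2) are made up to mirror the question, and the dates are stored as ISO text so that SQLite's date() and time() functions can split Date2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables mirroring the question; dates kept as ISO text.
    CREATE TABLE table1 (id1 INTEGER, date1 TEXT, time1 TEXT);
    CREATE TABLE table2 (id2 INTEGER, date2 TEXT);  -- 'YYYY-MM-DD hh:mm:ss'

    INSERT INTO table1 VALUES
        (1, '2020-01-01', '10:00:00'),  -- two candidates in table2, one exact
        (2, '2020-01-02', '09:30:00');  -- one candidate, times disagree
    INSERT INTO table2 VALUES
        (1, '2020-01-01 10:00:00'),
        (1, '2020-01-01 15:45:00'),
        (2, '2020-01-02 09:31:07');
""")

query = """
    SELECT t1.id1, t1.date1, t1.time1, t2.date2
    FROM table1 AS t1
    LEFT JOIN table2 AS t2
      ON  t2.id2 = t1.id1
      AND date(t2.date2) = t1.date1
      AND (
            -- exactly one candidate for this ID and date: accept it as-is
            (SELECT COUNT(*) FROM table2 AS c
              WHERE c.id2 = t1.id1 AND date(c.date2) = t1.date1) = 1
            -- several candidates: use the time column as the tie-breaker
            OR time(t2.date2) = t1.time1
          )
    ORDER BY t1.id1
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this sample data, ID 1 keeps only the row whose time matches, and ID 2 keeps its single candidate even though the times differ. If several candidates exist and none of them matches on time, the left join returns the Table 1 row with NULLs, which may or may not be what you want.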

Related

DB2 select with different date format

My problem is that in an ancient database there are two tables that I need to query for matching rows based on date. However, one table represents the date as YYYYMM in a decimal(6) column, and the other as YYYY-MM-DD in a date column.
How do I join these two tables together?
I am perfectly happy to match on any day of the month, or on day 01.

You can format that date as YYYYMM using TO_CHAR or VARCHAR_FORMAT, then join the two tables together.
Assuming table A has the date field in col1, and table B has the decimal(6) field in col2, it would look like this:
select *
from A
join B on dec(varchar_format(a.col1, 'YYYYMM'),6,0) = b.col2
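
The same idea can be tried out without a DB2 instance. This is a rough SQLite analogue, not DB2 syntax: strftime('%Y%m', ...) stands in for varchar_format, CAST stands in for dec, and the sample rows are invented. Table and column names follow the answer above (A.col1 is the full date, B.col2 is the YYYYMM number).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (col1 TEXT);      -- full date, stored as 'YYYY-MM-DD'
    CREATE TABLE b (col2 INTEGER);   -- year and month as the number YYYYMM
    INSERT INTO a VALUES ('2019-03-15'), ('2019-04-01'), ('2020-03-15');
    INSERT INTO b VALUES (201903), (202003), (202112);
""")

# Reduce the full date to YYYYMM, convert it to a number, then compare.
rows = conn.execute("""
    SELECT a.col1, b.col2
    FROM a
    JOIN b ON CAST(strftime('%Y%m', a.col1) AS INTEGER) = b.col2
    ORDER BY a.col1
""").fetchall()
for row in rows:
    print(row)
```

Only the dates whose year and month appear in B survive the join; the day of the month is ignored, which matches the "any day" requirement in the question.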

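You can join those two tables directly. Note that the names here are the other way round from the previous answer: suppose the table where the date is stored as decimal(6) is A, with the value in column col1, and the other table is B, with the date stored in column col2. This relies on the date rendering as YYYY-MM-DD when it is treated as a string. The query would be something like this: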
SELECT * FROM A, B
WHERE INT(A.col1) = INT(SUBSTR(B.col2,1,4)|| SUBSTR(B.col2,6,2))

Cross joining two tables with "using" instead of "on"

I found a SQL query in a book that I am not able to understand. From what I understand there are two tables: date, which has date_id and test_date columns, and transactions, which has date_id and obs_cnt columns.
select t1.test_date
,sum(t2.obs_cnt)
from date t1
cross join
(transactions join date using (date_id)) as t2
where t1.test_date>=t2.test_date
group by t1.test_date
order by t1.test_date
Can someone help me understand what this code does and what the output will look like?
I understand that the obs_cnt variable is being aggregated at the test_date level.
I understand the use of using in place of on. But what I don't get is how the date table is being referenced twice; does it mean it is being joined twice?

"But what I don't get is how the date table is being referenced twice; does it mean it is being joined twice?"
Yes it is, although it's probably easier to think of t2 as a whole rather than as a function of the date table: t2 is the transactions table, but with the actual test_date value attached to each row rather than just a date ID.
I assume there's actually some context for all of this in the book, but it looks like this will produce one row of output for every test_date in the date table (t1) that has at least one transaction on or before it, in order of test_date. For each such row, it totals up the observations for all transactions that happened on or before that date, using our transactions-with-date table t2. In other words, it is a running total.
"I understand that the obs_cnt variable is being aggregated at the test_date level."
It's being aggregated against t1.test_date, which is the constraint we're using to select the rows in t2 that are summed.
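
To see the running total in action, here is a small sketch in SQLite via Python's sqlite3 module. The sample rows are invented, and two things differ from the book's query as assumptions of this example: the date table is named dates, and t2 is written as a subquery because SQLite does not accept an alias on a parenthesized join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dates (date_id INTEGER PRIMARY KEY, test_date TEXT);
    CREATE TABLE transactions (date_id INTEGER, obs_cnt INTEGER);
    INSERT INTO dates VALUES
        (1, '2021-01-01'), (2, '2021-01-02'), (3, '2021-01-03');
    INSERT INTO transactions VALUES (1, 5), (2, 3), (2, 2), (3, 10);
""")

rows = conn.execute("""
    SELECT t1.test_date, SUM(t2.obs_cnt)
    FROM dates AS t1
    CROSS JOIN (
        -- t2: every transaction with its real date instead of a date_id
        SELECT test_date, obs_cnt
        FROM transactions JOIN dates USING (date_id)
    ) AS t2
    WHERE t1.test_date >= t2.test_date   -- keep transactions up to this date
    GROUP BY t1.test_date
    ORDER BY t1.test_date
""").fetchall()
for row in rows:
    print(row)
```

Each output row carries the sum of obs_cnt for that date and all earlier dates: 5 on the first day, 10 on the second, and 20 on the third.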

SQL: how to control contiguous periods of time

I want to query a table of this type.
What I want appears on the right side. The query should return the rows for NIFs that have overlapping periods.
If a NIF has one or more periods that overlap, that NIF should be included in the result.

You can use the query below for this kind of result:
SELECT NIF -- use distinct if you want to get distinct NIF value in your result
FROM T T1 -- T is your tablename
WHERE EXISTS (SELECT 1
FROM T T2
WHERE T1.NIF = T2.NIF
AND T1."START" BETWEEN T2."START" AND T2."END"
AND T1.ROWID <> T2.ROWID);
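
A quick way to check the behaviour is to run the same EXISTS pattern against a few invented rows. The sketch below uses SQLite via Python's sqlite3 module, which also exposes a rowid for ordinary tables; the table name t and the sample periods are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (nif TEXT, "START" TEXT, "END" TEXT);
    INSERT INTO t VALUES
        ('A', '2020-01-01', '2020-03-31'),
        ('A', '2020-03-15', '2020-06-30'),  -- starts inside the previous period
        ('B', '2020-01-01', '2020-01-31'),
        ('B', '2020-02-01', '2020-02-29');  -- starts after the previous one ends
""")

rows = conn.execute("""
    SELECT DISTINCT t1.nif
    FROM t AS t1
    WHERE EXISTS (SELECT 1
                  FROM t AS t2
                  WHERE t1.nif = t2.nif
                    -- t1 starts while another period of the same NIF is open
                    AND t1."START" BETWEEN t2."START" AND t2."END"
                    -- do not compare a row with itself
                    AND t1.rowid <> t2.rowid)
""").fetchall()
print(rows)
```

Only NIF A is returned here. Because BETWEEN is inclusive, a period that starts on the exact day another one ends would also count as overlapping.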

SQL: Move duplicates to another table where condition

I am quite new to SQL and Stack Overflow, so pardon the layout of my post.
Currently, I am struggling with putting the following workflow into an executable SQL statement:
I have a table containing the following columns:
ID (not unique)
PARTYTYPE (1 or 2)
DATE column
several other, not relevant columns
Now I need to find those observations (rows) that have the same ID and same PARTYTYPE but are not the most recent, i.e. have a date in the DATE column that is earlier than the most recent one for the given combination of PARTYTYPE and ID. The rows that satisfy this condition need to be moved to another table with the same table schema in order to archive them.
Is there an efficient, yet simple way to accomplish this in SQL?
I have been looking for a long time, but since it involves finding duplicates with certain conditions and inserting it into a table, it is a rather specific problem.
This is what I have so far:
INSERT INTO table_history
select ID, PARTYTYPE, count(*) as count_
from table
group by ID, PARTYTYPE, DATE
having DATE = MAX(DATE)
Any help would be appreciated!

Your description of the problem maps almost exactly onto a correlated subquery:
INSERT INTO table_history( . . . )
select t.*
from table t
where date < (select max(date)
from table t2
where t2.id = t.id and t2.partytype = t.partytype
);
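
Since the question asks for the rows to be moved, the insert would normally be paired with a delete that uses the same condition. The following is a minimal sketch of that pairing, assuming SQLite via Python's sqlite3 module; the table names parties and parties_history and the sample rows are hypothetical, and the delete step is an addition that the answer above does not show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parties (id INTEGER, partytype INTEGER, date TEXT);
    CREATE TABLE parties_history (id INTEGER, partytype INTEGER, date TEXT);
    INSERT INTO parties VALUES
        (1, 1, '2020-01-01'),
        (1, 1, '2020-06-01'),   -- most recent for (1, 1): stays
        (1, 2, '2019-12-31'),   -- only row for (1, 2): stays
        (2, 1, '2020-02-01'),
        (2, 1, '2020-03-01');   -- most recent for (2, 1): stays
""")

# Correlated subquery: a row is "old" if a later date exists for the
# same id and partytype.
older_rows = """
    date < (SELECT MAX(t2.date) FROM parties AS t2
            WHERE t2.id = parties.id AND t2.partytype = parties.partytype)
"""
with conn:  # one transaction: the copy and the delete commit together
    conn.execute(
        f"INSERT INTO parties_history SELECT * FROM parties WHERE {older_rows}"
    )
    conn.execute(f"DELETE FROM parties WHERE {older_rows}")

archived = conn.execute(
    "SELECT * FROM parties_history ORDER BY id, date").fetchall()
remaining = conn.execute(
    "SELECT * FROM parties ORDER BY id, partytype").fetchall()
print(archived)
print(remaining)
```

Running both statements in one transaction avoids ending up with rows that were copied but not deleted, or deleted but not copied.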

SQL order by two different (possibly null) columns

I have a table with three columns; the first column contains IDs and the other two columns contain dates (where at most one is null, but I don't think this should affect anything). How would I go about ordering the IDs based on which date is larger? I've tried
ORDER BY CASE
WHEN date1 > date2 THEN date1
ELSE date2
END
but this didn't work. Can anyone help me? Also, in all of the similar problems I've seen others post, the query sorts the results by the first column and falls back to the second column only when the first is null. Would I first have to replace every null value? I'm creating this table with a full outer join, and changing that would be an entirely different question, so hopefully this can be done with the null values in place.

I believe your problem is related to the comparison failing when either column is NULL. So, you probably need:
ORDER BY CASE
WHEN date1 IS NULL THEN date2
WHEN date2 IS NULL THEN date1
WHEN date1 > date2 THEN date1
ELSE date2
END
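
Here is a small runnable check of that ORDER BY, assuming SQLite via Python's sqlite3 module and an invented table t with columns id, date1 and date2 (none of these names come from the question).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER, date1 TEXT, date2 TEXT);
    INSERT INTO t VALUES
        (1, '2020-05-01', '2020-01-01'),  -- later date is date1
        (2, NULL,         '2020-03-01'),  -- only date2 is known
        (3, '2020-02-01', NULL),          -- only date1 is known
        (4, '2020-01-15', '2020-04-01');  -- later date is date2
""")

# Sort each row by whichever date is later, treating NULL as "missing".
ids = [row[0] for row in conn.execute("""
    SELECT id
    FROM t
    ORDER BY CASE
                 WHEN date1 IS NULL THEN date2
                 WHEN date2 IS NULL THEN date1
                 WHEN date1 > date2 THEN date1
                 ELSE date2
             END
""")]
print(ids)
```

The rows come back ordered by their later date: 3 (February), 2 (March), 4 (April), 1 (May). Without the two IS NULL branches, a row with a NULL date2 would get a NULL sort key, because a comparison against NULL is never true and the CASE falls through to ELSE date2.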

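Try the following. Two-argument MAX is the SQLite spelling (several other databases call it GREATEST), and in SQLite it returns NULL when either argument is NULL, hence the COALESCE fallback:
SELECT COALESCE(MAX(date1,date2), date1, date2) date FROM table ORDER BY date;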
