sql: how to join on a field that results from computation

Is it possible to join on a field that isn't in a table, but is derived from it?
For example, suppose I have one table mapping calendar dates to data, and another mapping days of the week (0-6) to data. How would one join the calendar dates table to the days of week table without adding a "day of week" field to the former?

Try something like this:
select
a.one+a.two, b.three
from TableA a
inner join TableB b on a.one+a.two=b.three
Just put your calculation in the join condition; index usage is unlikely, though. You don't say which database you use, but if it has a function that returns the weekday of a date, you can join on that:
inner join TableB b on weekday(a.EventDate)=b.Weekday
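As a rough illustration of joining on a computed value, here is a minimal sketch that runs SQL through Python's built-in sqlite3 module. The `events` and `weekdays` tables and their columns are made up for the example, and SQLite's `strftime('%w', ...)` (0 = Sunday through 6 = Saturday) stands in for whatever weekday function your database offers.

```python
import sqlite3

# In-memory database with hypothetical tables (names are assumptions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (event_date TEXT, note TEXT);
    CREATE TABLE weekdays (weekday INTEGER, label TEXT);
    INSERT INTO events VALUES ('2024-01-01', 'new year'), ('2024-01-06', 'market');
    INSERT INTO weekdays VALUES
        (0, 'Sunday'), (1, 'Monday'), (2, 'Tuesday'), (3, 'Wednesday'),
        (4, 'Thursday'), (5, 'Friday'), (6, 'Saturday');
""")

# The join condition is an expression computed from events.event_date;
# no "day of week" column is stored in the events table.
rows = conn.execute("""
    SELECT e.event_date, w.label
    FROM events AS e
    INNER JOIN weekdays AS w
        ON CAST(strftime('%w', e.event_date) AS INTEGER) = w.weekday
    ORDER BY e.event_date
""").fetchall()

print(rows)
```

The CAST is there because `strftime` returns text while the hypothetical `weekday` column holds integers.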

If you're using SQL Server, you can use the DATEPART function to get the day of the week for a particular date (it returns 1-7, so you may need to offset it to match a 0-6 column). You should be able to join the date column to your day-of-the-week number using this function:
select * from
t1 inner join t2 on
DATEPART(weekday,t1.dateColumnName) = t2.dayOfTheWeek
A gotcha though: the result depends on which day of the week is set as the first in your SQL Server settings (SET DATEFIRST).
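To make the first-day-of-week gotcha concrete, here is a small Python simulation (it does not talk to SQL Server). `datepart_weekday` is a hypothetical helper that imitates how DATEPART(weekday, ...) shifts with the DATEFIRST setting (1 = Monday through 7 = Sunday), and `normalized_weekday` mirrors the commonly used expression `(DATEPART(weekday, d) + @@DATEFIRST - 1) % 7`, which should yield 0 = Sunday through 6 = Saturday regardless of the setting.

```python
from datetime import date

def datepart_weekday(d: date, datefirst: int) -> int:
    """Simulate DATEPART(weekday, d) for a DATEFIRST of 1 (Monday) to 7 (Sunday)."""
    return (d.isoweekday() - datefirst) % 7 + 1

def normalized_weekday(d: date, datefirst: int) -> int:
    """Mirror (DATEPART(weekday, d) + @@DATEFIRST - 1) % 7: 0 = Sunday ... 6 = Saturday."""
    return (datepart_weekday(d, datefirst) + datefirst - 1) % 7

sunday = date(2024, 1, 7)  # a Sunday

# The raw value changes with the setting; the normalized one does not.
raw = [datepart_weekday(sunday, df) for df in (1, 7)]
fixed = [normalized_weekday(sunday, df) for df in (1, 7)]
print(raw, fixed)
```

Joining on the normalized expression rather than the raw DATEPART value keeps the join stable if a session changes DATEFIRST.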

Sure, why not.
select foo.dayofweek, bar.date from foo
join bar on datepart(dw, bar.date) = foo.dayofweek
Don't think this will leverage your indexes though, as the other guy said.

Related

Question on getting number of day from single date column

Above is the screenshot of the tables for my practice. I want to extract the number of days between the earliest and latest sales made by staff 'Ali'. I do not have a SQL IDE to run the code, so I would like to check whether there are any problems with it.
SELECT DAYDIFF(day, MAX(st.Date), MIN(st.Date)) AS Duration
FROM SALES_TRANSACTION AS ST
LEFT JOIN SALES_MASTER AS sm
ON sm.Product_ID = st.Product_ID
GROUP BY sm.Staff_Name
HAVING sm.Staff_Name = 'Ali'
ORDER BY st.Date DESC
Here is the dataset
https://drive.google.com/file/d/13XCxQgbEONU22ZDYhQq-I1u-dh3A2fPc/view?usp=sharing

You want logic more like this:
SELECT DATEDIFF(day, MIN(st.Date), MAX(st.Date)) AS Duration
FROM SALES_TRANSACTION ST JOIN
STAFF_MASTER sm
ON sm.Staff_id = st.Staff_Id
WHERE sm.Staff_Name = 'Ali';
Note the changes:
The filtering is in the WHERE clause rather than the HAVING clause. In general, it is better to filter before aggregating if possible.
The LEFT JOIN is replaced by a JOIN. First, you need a match to get the name. Second, the foreign key reference should be valid so an outer join should not be necessary.
The correct table for the staff name is STAFF_MASTER.
If you are using SQL Server (which has the 3 argument DATEDIFF() syntax), then the smaller date is the second argument.
And finally, there are many tools on the web where you can test SQL, such as db<>fiddle, SQL Fiddle, and db-fiddle. You can also download free databases onto almost any platform.
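If no database is at hand, the same logic can be tried with Python's built-in sqlite3 module. This is only a sketch: the sample rows are invented, the column names are simplified, and because SQLite has no DATEDIFF, the difference is taken with `julianday()` instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Invented sample data; the real tables are in the linked dataset.
    CREATE TABLE staff_master (staff_id INTEGER PRIMARY KEY, staff_name TEXT);
    CREATE TABLE sales_transaction (staff_id INTEGER, sale_date TEXT);
    INSERT INTO staff_master VALUES (1, 'Ali'), (2, 'Bea');
    INSERT INTO sales_transaction VALUES
        (1, '2021-03-01'), (1, '2021-03-11'), (1, '2021-03-05'), (2, '2021-01-01');
""")

# Filter in WHERE before aggregating, then subtract the earliest
# sale date from the latest one.
(duration,) = conn.execute("""
    SELECT CAST(julianday(MAX(st.sale_date)) - julianday(MIN(st.sale_date)) AS INTEGER)
    FROM sales_transaction AS st
    JOIN staff_master AS sm ON sm.staff_id = st.staff_id
    WHERE sm.staff_name = 'Ali'
""").fetchone()

print(duration)
```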

Join two tables on date where date format is different DB2

I've searched here for answers to similar problems, but I have not found a solution for DB2 SQL.
I need to join two tables on dates, pulling their date information and running sum functions on information pulled from both tables, with the eventual goal of combining both sum values and doing other analysis. The date columns in the two tables are a VARCHAR(6) displayed as YYYYMM and a VARCHAR(32) displayed as YYYY-MM. I do not have the ability to change the tables directly.
I've attempted the following (pseudo) solution:
Select TIMESTAMP_FORMAT(Date.Table1) as Date1,
TIMESTAMP_FORMAT(Date.Table1) as Date2,
SUM(Value.Table1) as Sum1,
SUM(Value.Table2) as Sum2
From Table1
Full Outer Join Table2 on Date.Table1 = Date.Table2
Order By Date.Table1, Date.Table2,
Group By Date.Table1, Date.Table2;
The result puts all the information on the same table, as expected, but not side by side where dates are the same.
Any help would be greatly appreciated.

You can remove the hyphen:
From Table1 Full Outer Join
Table2
on Date.Table1 = replace(Date.Table2, '-', '')

sap hana sql dates aggregation

I have an issue with a query I have written for sap hana.
There are basically two tables.
The first is a dates table containing one row for each day in a calendar. The second is a results table containing a customer reference number and, for each customer reference number, a start date and an end date. The customer reference table has approximately 4 million records, so the inner part of the query would essentially return 4 million records for each day since 01012011. There must be a simple way of aggregating the results. I have tried an inner select query, but HANA seems to have performance issues with it.
I have written the code like this; however, it is not optimal.
select date_sql, count(*) as count
from (
select date_sql
from tbl_ref_cal_link tbl_date
where date_sql between '2011-01-01' and add_days (to_date(current_date, 'YYYY-MM-DD'), -1)
)tbl_date
Left join #cust_ref_table M1
On tbl_date.date_sql between m1.startdate and m2.enddate)z
I would appreciate anyone's help or suggestions.

You could use GROUP BY here.
You also need to change m2 in the join condition to m1, as in the following SQLScript code:
select
date_sql, count(m1.CustomerId) as count
from (
-- dates table here
) tbl_date
Left join cust_ref_table m1 On tbl_date.date_sql between m1.startdate and m1.enddate
group by date_sql
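The same shape of query can be tried on a toy data set with Python's sqlite3 module. This is a sketch under assumptions: the `cal` and `cust_ref` tables and their rows are invented, and it says nothing about HANA performance on 4 million records. Counting `m1.customer_id` rather than `*` makes days without any active customer come out as 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cal (date_sql TEXT);
    CREATE TABLE cust_ref (customer_id INTEGER, startdate TEXT, enddate TEXT);
    INSERT INTO cal VALUES
        ('2011-01-01'), ('2011-01-02'), ('2011-01-03'), ('2011-01-04');
    INSERT INTO cust_ref VALUES
        (1, '2011-01-01', '2011-01-02'),
        (2, '2011-01-02', '2011-01-03');
""")

# One output row per calendar day, counting the customers whose
# start/end range covers that day.
rows = conn.execute("""
    SELECT d.date_sql, COUNT(m1.customer_id) AS active_customers
    FROM cal AS d
    LEFT JOIN cust_ref AS m1
        ON d.date_sql BETWEEN m1.startdate AND m1.enddate
    GROUP BY d.date_sql
    ORDER BY d.date_sql
""").fetchall()

print(rows)
```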

Working around unsupported Correlated Where Subqueries in Hive

I'm trying to figure out a work around for the fact HIVE doesn't support correlated subqueries. Ultimately, I've been counting how many items exist in the data each week over the last month, and now I want to know how many items dropped out this week, came back, or are totally new. Wouldn't be too hard if I could use a where subquery but I'm having a tough time thinking of a work around without it.
Select
count(distinct item)
From data
where item in (Select item from data where date <= ("2016-05-10"))
And date between "2016-05-01" and getdate()
Any help would be great. Thank you.

The workaround is a left join between the two result sets, keeping only the rows where the second result set's column is null. This counts the items in the first set that have no match in the second, i.e.:
Select count (a.item)
from
(select distinct item from data where date between "2016-05-01" and getdate()) a
left join (Select distinct item from data where date <= ("2016-05-10")) b
on a.item =b.item
where b.item is null
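Here is a minimal sketch of the left join plus null check pattern, run in SQLite through Python rather than Hive. The sample rows are invented and a fixed end date replaces `getdate()`. The null check sits in the WHERE clause so that it filters after the join; placed in the ON clause, it would prevent any row from matching.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (item TEXT, date TEXT);
    INSERT INTO data VALUES
        ('apple', '2016-05-02'), ('apple', '2016-05-12'),
        ('pear',  '2016-05-11'), ('plum',  '2016-05-03');
""")

# Items seen during the month that never appear on or before 2016-05-10.
(new_items,) = conn.execute("""
    SELECT COUNT(a.item)
    FROM (SELECT DISTINCT item FROM data
          WHERE date BETWEEN '2016-05-01' AND '2016-05-31') AS a
    LEFT JOIN (SELECT DISTINCT item FROM data
               WHERE date <= '2016-05-10') AS b
        ON a.item = b.item
    WHERE b.item IS NULL
""").fetchone()

print(new_items)
```

Only 'pear' first shows up after the cutoff here, so it is the single item counted.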

Cumulative Summing Values in SQLite

I am trying to perform a cumulative sum of values in SQLite. I initially only needed to sum a single column and had the code
SELECT
t.MyColumn,
(SELECT Sum(r.KeyColumn1) FROM MyTable as r WHERE r.Date < t.Date)
FROM MyTable as t
Group By t.Date;
which worked fine.
Now I wanted to extend this to more columns KeyColumn2 and KeyColumn3 say. Instead of adding more SELECT statements I thought it would be better to use a join and wrote the following
SELECT
t.MyColumn,
Sum(r.KeyColumn1),
Sum(r.KeyColumn2),
Sum(r.KeyColumn3)
FROM MyTable as t
Left Join MyTable as r On (r.Date < t.Date)
Group By t.Date;
However this does not give me the correct answer (instead it gives values that are much larger than expected). Why is this and how could I correct the JOIN to give me the correct answer?

You are likely getting what I would call mini-Cartesian products: your Date values are probably not unique and, as a result of the self-join, you are getting matches for each of the non-unique values. After grouping by Date the results are just multiplied accordingly.
To solve this, the left side of the join must be rid of duplicate dates. One way is to derive a table of unique dates from your table:
SELECT DISTINCT Date
FROM MyTable
and use it as the left side of the join:
SELECT
t.Date,
Sum(r.KeyColumn1),
Sum(r.KeyColumn2),
Sum(r.KeyColumn3)
FROM (SELECT DISTINCT Date FROM MyTable) as t
Left Join MyTable as r On (r.Date < t.Date)
Group By t.Date;
I noticed that you used t.MyColumn in the SELECT clause, while your grouping was by t.Date. If that was intentional, you may be relying on undefined behaviour there, because the t.MyColumn value would probably be chosen arbitrarily among the (potentially) many in the same t.Date group.
For the purpose of this example, I assumed that you actually meant t.Date, so, I replaced the column accordingly, as you can see above. If my assumption was incorrect, please clarify.
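The effect of duplicate dates can be reproduced on a tiny table with Python's sqlite3 module. The rows below are invented for illustration; the first query is the naive self-join and the second uses the derived table of distinct dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyTable (Date TEXT, KeyColumn1 INTEGER, KeyColumn2 INTEGER);
    INSERT INTO MyTable VALUES
        ('2020-01-01', 1, 10),
        ('2020-01-02', 2, 20),
        ('2020-01-02', 3, 30),   -- duplicate date
        ('2020-01-03', 4, 40);
""")

# Naive self-join: each duplicate date on the left repeats the matched rows.
naive = conn.execute("""
    SELECT t.Date, SUM(r.KeyColumn1), SUM(r.KeyColumn2)
    FROM MyTable AS t
    LEFT JOIN MyTable AS r ON r.Date < t.Date
    GROUP BY t.Date ORDER BY t.Date
""").fetchall()

# Left side reduced to unique dates, so each earlier row is summed once.
fixed = conn.execute("""
    SELECT t.Date, SUM(r.KeyColumn1), SUM(r.KeyColumn2)
    FROM (SELECT DISTINCT Date FROM MyTable) AS t
    LEFT JOIN MyTable AS r ON r.Date < t.Date
    GROUP BY t.Date ORDER BY t.Date
""").fetchall()

print(naive)
print(fixed)
```

For '2020-01-02' the naive query sums the single earlier row twice, once per duplicate, while the version with distinct dates sums it once.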

Your join does not work because it finds far more matching rows than your subselect does.
The join is exploding your table.
The subselect sums all records whose date is lower than that of the current record.
The join matches every row multiple times, as long as the date is lower than that of the current record. This means a single record can produce as many joined rows as there are records with a lower date. That causes duplicate records and, in the end, a higher SUM.
If you want the sums of multiple columns, you will have to use three subqueries or define a unique join.
