Joins in SQL giving me weird results

I have two queries, Q1 and Q2.
Q1 produces one result for each demo and date.
Q2 produces one result for each demo, date and site.
Also, the dates for a given demo and site in Q2 will overlap with those in Q1,
but not every Q1 date will be present, and Q2 may even contain dates that are not in Q1.
What I want to do is produce a resulting table that has the results of Q1 basically repeated (rows beneath rows) equal to the number of sites in Q2.
And the results from Q2 should be in the second column with a match on the date and demo.
If a date in Q1 doesn't exist in that site of Q2, the entry should be zero or null. I know this can be achieved with joins, but I can't get it to work. I tried -
select a.result, b.site, b.result from
(Q1) as a right join (Q2) as b on a.demo = b.demo and a.date=b.date
but this is producing some weird results. The entries of a.result are different for each site of Q2 though they shouldn't be.
edit - here is what I'm trying to do -
Q1 -
demo | date
------------------------------
1 | 10/31/2013
1 | 11/01/2013
2 | 11/02/2013
Q2 -
demo | site | date
------------------------------
1 | A | 10/31/2013
1 | A | 11/01/2013
2 | B | 11/01/2013
2 | B | 11/02/2013
desired result -
demo | date | site
---------------------------------------
1 | 10/31/2013 | A
1 | 11/01/2013 | A
2 | 11/02/2013 | null
1 | 10/31/2013 | null
1 | 11/01/2013 | B
2 | 11/02/2013 | B

Use inner join instead of right join
select a.result, b.site, b.result from (Q1) as a
inner join (Q2) as b on a.demo = b.demo and a.date=b.date

Here is an SQL Fiddle example of what I think you are asking for:
SELECT M.demo, M.date, M.site FROM
(
SELECT 2 AS FromQuery, Q2.demo, Q2.date, Q2.site
FROM Q2
UNION
SELECT 1 AS FromQuery, Q1.demo, Q1.date, null AS site
FROM Q1
) AS M
ORDER BY M.FromQuery

Based on your clarification, you could get that result with this query.
SELECT
a.demo,
a.date,
b.site
FROM (Q1) a
LEFT JOIN (Q2) b ON b.date = a.date
Sorting it as you have in your result list would require more information in the subqueries, however. You'd need to use a function like Row_Number() (assuming you're using MSSQL) to generate unique IDs in the sub-queries to use for sorting.
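The result described in the question (every Q1 row repeated once per site, with NULL where that site has no row for the same demo and date) can also be sketched with a cross join against the distinct sites followed by a left join. This is a minimal sketch, not a definitive answer. It assumes SQLite through Python's sqlite3 module as a stand-in engine, plain tables q1 and q2 in place of the two subqueries, ISO-formatted dates, and a made-up output column matched_site.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- q1 and q2 are hypothetical tables standing in for the subqueries Q1 and Q2
    CREATE TABLE q1 (demo INTEGER, date TEXT);
    CREATE TABLE q2 (demo INTEGER, site TEXT, date TEXT);
    INSERT INTO q1 VALUES (1, '2013-10-31'), (1, '2013-11-01'), (2, '2013-11-02');
    INSERT INTO q2 VALUES (1, 'A', '2013-10-31'), (1, 'A', '2013-11-01'),
                          (2, 'B', '2013-11-01'), (2, 'B', '2013-11-02');
""")

rows = conn.execute("""
    SELECT s.site, a.demo, a.date, b.site AS matched_site
    FROM q1 AS a
    -- repeat every Q1 row once per distinct site found in Q2
    CROSS JOIN (SELECT DISTINCT site FROM q2) AS s
    -- attach the Q2 row only when site, demo and date all match
    LEFT JOIN q2 AS b
           ON b.site = s.site AND b.demo = a.demo AND b.date = a.date
    ORDER BY s.site, a.demo, a.date
""").fetchall()

for row in rows:
    print(row)
```

With the sample data this yields six rows, three per site, and matched_site is NULL wherever the site has no Q2 row for that demo and date.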

Related

Sql inner join only with last row in second table

I have two tables: leads and tracking_leads.
Table structure is as below,
---------------------------- ----------------------
| leads | | tracking_leads |
---------------------------- ----------------------
| id | | tracking_id |
| lead_id | | lead_id |
| anzahl_tickets | | field_name |
| bearbeitungs_id_einkauf | | date |
---------------------------- -----------------------
I need SQL that joins the leads table with the tracking_leads table but uses only the LAST matching row from tracking_leads.
Sql example:
SELECT DATE_FORMAT(tracking_leads.date, "%d.%m.%Y") as trackDate, SUM(l.anzahl_tickets)
as sumValue FROM leads as l INNER JOIN tracking_leads ON l.lead_id=tracking_leads.lead_id
WHERE bearbeitungs_id_einkauf <> '' AND tracking_leads.field_name='bearbeitungs_id_einkauf'
GROUP BY DATE_FORMAT(tracking_leads.date, "%d.%m.%Y")
In this part: INNER JOIN tracking_leads ON l.lead_id=tracking_leads.lead_id, I need only the last record from the tracking_leads table.
For example, leads data:
id lead_id anzahl_tickets bearbeitungs_id_einkauf
1 20 2 100
tracking_leads data:
tracking_id lead_id field_name date
1 20 bearbeitungs_id_einkauf 2019-05-31 13:55
2 20 bearbeitungs_id_einkauf 2019-05-31 15:00
The result I need is:
2019-05-31 2
But currently I get:
2019-05-31 4
This is because lead_id is duplicated in tracking_leads (I need only the last record).
How can I solve this problem?
Thanks!
My preference would be to use an inline view to get the max dates.
A correlated subquery would be executed once for each row, while the inline view would only need to be executed once.
This should work:
SELECT DATE_FORMAT(tl.date, "%d.%m.%Y") as trackDate,
SUM(l.anzahl_tickets) as sumValue
FROM leads as l
INNER JOIN (
select x.lead_id, max(x.date) date from tracking_leads x where x.field_name = 'bearbeitungs_id_einkauf' group by x.lead_id
) tl ON l.lead_id=tl.lead_id
WHERE bearbeitungs_id_einkauf <> ''
GROUP BY DATE_FORMAT(tl.date, "%d.%m.%Y")
Side note: the test for an empty value of bearbeitungs_id_einkauf in the WHERE clause is database-specific, so watch out for issues there. In Oracle, for example, an empty string is treated as NULL, so you would have to test it with IS NOT NULL. I'm assuming this is not Oracle.
First, I don't like the date format DD.MM.YYYY, because you cannot sort by it. Just use YYYY-MM-DD.
Second, you can use a correlated subquery to get the most recent date:
SELECT DATE(tl.date) as trackDate, SUM(l.anzahl_tickets) as sumValue
FROM leads l INNER JOIN
tracking_leads tl
ON l.lead_id = tl.lead_id
WHERE l.bearbeitungs_id_einkauf <> '' AND
tl.field_name = 'bearbeitungs_id_einkauf' AND
tl.date = (SELECT MAX(tl2.date)
FROM tracking_leads tl2
WHERE tl2.lead_id = tl.lead_id AND
tl2.field_name = tl.field_name
)
GROUP BY DATE(tl.date);
Of course, you can leave your original date format if you prefer. If you do, you can use:
ORDER BY MIN(tl.date)
so the results are ordered by date.
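The double counting from the question, and the effect of keeping only the latest tracking row, can be reproduced on the sample data. This is a minimal sketch assuming SQLite through Python's sqlite3 module instead of MySQL, so date() replaces DATE_FORMAT and the output uses YYYY-MM-DD. The alias last_date is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE leads (id INTEGER, lead_id INTEGER, anzahl_tickets INTEGER,
                        bearbeitungs_id_einkauf TEXT);
    CREATE TABLE tracking_leads (tracking_id INTEGER, lead_id INTEGER,
                                 field_name TEXT, date TEXT);
    INSERT INTO leads VALUES (1, 20, 2, '100');
    INSERT INTO tracking_leads VALUES
        (1, 20, 'bearbeitungs_id_einkauf', '2019-05-31 13:55'),
        (2, 20, 'bearbeitungs_id_einkauf', '2019-05-31 15:00');
""")

# Plain join: the lead matches both tracking rows, so its tickets are summed twice.
naive = conn.execute("""
    SELECT date(tl.date), SUM(l.anzahl_tickets)
    FROM leads AS l
    JOIN tracking_leads AS tl ON tl.lead_id = l.lead_id
    WHERE l.bearbeitungs_id_einkauf <> ''
      AND tl.field_name = 'bearbeitungs_id_einkauf'
    GROUP BY date(tl.date)
""").fetchall()

# Inline view: one row per lead holding only its latest tracking date.
latest_only = conn.execute("""
    SELECT date(tl.last_date), SUM(l.anzahl_tickets)
    FROM leads AS l
    JOIN (SELECT lead_id, MAX(date) AS last_date
          FROM tracking_leads
          WHERE field_name = 'bearbeitungs_id_einkauf'
          GROUP BY lead_id) AS tl ON tl.lead_id = l.lead_id
    WHERE l.bearbeitungs_id_einkauf <> ''
    GROUP BY date(tl.last_date)
""").fetchall()

print(naive)
print(latest_only)
```

The first query reports 4 tickets for 2019-05-31 and the second reports 2, which matches the numbers in the question.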

How to fill in empty date rows multiple times?

I am trying to fill in missing dates with empty data, so that my query returns every date and does not skip any.
My application needs to count bookings for activities by date in a report, and I cannot have skipped dates in what is returned by my SQL.
I am trying to use a date table (I have a table with every date from 1/1/2000 to 12/31/2030) to accomplish this by doing a RIGHT OUTER JOIN on this date table, which works when dealing with one set of activities. But I have multiple sets of activities, each needing their own full range of dates regardless if there were bookings on that date.
I also have a function (DateRange) I found that allows for this:
SELECT IndividualDate FROM DateRange('d', '11/01/2017', '11/10/2018')
Let me give an example of what I am getting and what I want to get:
BAD: Without empty date rows:
date | activity_id | bookings
-----------------------------
1/2 | 1 | 5
1/4 | 1 | 4
1/3 | 2 | 6
1/4 | 2 | 2
GOOD: With empty date rows:
date | activity_id | bookings
-----------------------------
1/2 | 1 | 5
1/3 | 1 | NULL
1/4 | 1 | 4
1/2 | 2 | NULL
1/3 | 2 | 6
1/4 | 2 | 2
I hope this makes sense. I get the whole point of joining to a table of just a list of dates OR using the DateRange table function. But neither gets me the "GOOD" result above.
Use a cross join to generate the rows and then left join to fill in the values:
select d.IndividualDate as date, a.activity_id, t.bookings
from DateRange('d', '2017-11-01', '2018-11-10') d cross join
(select distinct activity_id from t) a left join
t
on t.date = d.IndividualDate and t.activity_id = a.activity_id;
It is a bit hard to follow what your data is and what comes from the function. But the idea is the same, wherever the data comes from.
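The same idea can be tried on the sample rows from the question. This is a minimal sketch assuming SQLite through Python's sqlite3 module, where a recursive CTE named d stands in for the DateRange function. The table name t, the column name day, and the year 2018 are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- t is a hypothetical bookings table; the year is assumed
    CREATE TABLE t (date TEXT, activity_id INTEGER, bookings INTEGER);
    INSERT INTO t VALUES ('2018-01-02', 1, 5), ('2018-01-04', 1, 4),
                         ('2018-01-03', 2, 6), ('2018-01-04', 2, 2);
""")

rows = conn.execute("""
    WITH RECURSIVE d(day) AS (
        -- generate every day from 2018-01-02 through 2018-01-04
        SELECT '2018-01-02'
        UNION ALL
        SELECT date(day, '+1 day') FROM d WHERE day < '2018-01-04'
    )
    SELECT d.day, a.activity_id, t.bookings
    FROM d
    -- one row per (day, activity) pair, whether or not a booking exists
    CROSS JOIN (SELECT DISTINCT activity_id FROM t) AS a
    LEFT JOIN t ON t.date = d.day AND t.activity_id = a.activity_id
    ORDER BY a.activity_id, d.day
""").fetchall()

for row in rows:
    print(row)
```

Each activity gets all three days, and bookings is NULL on the days without a row in t, as in the "GOOD" table.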
I figured it out:
SELECT TOP 100 PERCENT masterlist.dt, masterlist.activity_id, count(r_activity_sales_bymonth.bookings) AS totalbookings
FROM (SELECT c.activity_id, dateadd(d, b.incr, '2016-12-31') AS dt
FROM (SELECT TOP 365 incr = row_number() OVER (ORDER BY object_id, column_id), *
FROM (SELECT a.object_id, a.column_id
FROM sys.all_columns a CROSS JOIN
sys.all_columns b) AS a) AS b CROSS JOIN
(SELECT DISTINCT activity_id
FROM r_activity_sales_bymonth) AS c) AS masterlist LEFT OUTER JOIN
r_activity_sales_bymonth ON masterlist.dt = r_activity_sales_bymonth.purchase_date AND masterlist.activity_id = r_activity_sales_bymonth.activity_id
GROUP BY masterlist.dt, masterlist.activity_id
ORDER BY masterlist.dt, masterlist.activity_id

OraSQL Select Command where multiple entry have same data

I have a table (We call it t_table) with the columns "DATE" and "TIME" (There are more columns, but only these are interesting).
I want my SELECT command to show only the entries for which more than one entry has the same combination of "DATE" and "TIME".
example:
Entry | DATE | TIME
1 | 1/1/14 | 8:00
2 | 1/1/14 | 8:00
3 | 2/1/14 | 8:10
4 | 3/1/14 | 8:10
5 | 3/1/14 | 8:10
It should display only rows (1+2) and (4+5), because those entries share the same DATE/TIME combination with another entry.
I'm quite new to SQL, so I am really thankful for any help. Thanks!
You need to check the table again with EXISTS for another row that has the same combination of data (everything other than the id here).
SELECT A.* FROM t_table A
WHERE EXISTS
(SELECT 'X' FROM t_table B
WHERE A.DATE = B.DATE
AND A.TIME = B.TIME
AND A.ID <> B.ID)
SQL Fiddle DEMO
If I understand well, it might work with this:
Select a.id,b.id from t_table a, t_table b where a.date=b.date and a.time = b.time and a.id <> b.id;
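Another way to find the repeated combinations is to group by the two columns, keep the groups with more than one row, and join back to the table. This is a minimal sketch assuming SQLite through Python's sqlite3 module rather than Oracle. The columns are renamed entry, d and t here (an assumption) to avoid clashing with keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- entry, d and t are illustrative names for the Entry, DATE and TIME columns
    CREATE TABLE t_table (entry INTEGER, d TEXT, t TEXT);
    INSERT INTO t_table VALUES (1, '1/1/14', '8:00'), (2, '1/1/14', '8:00'),
                               (3, '2/1/14', '8:10'), (4, '3/1/14', '8:10'),
                               (5, '3/1/14', '8:10');
""")

dupes = conn.execute("""
    SELECT a.entry, a.d, a.t
    FROM t_table AS a
    -- g holds only the (d, t) combinations that occur more than once
    JOIN (SELECT d, t
          FROM t_table
          GROUP BY d, t
          HAVING COUNT(*) > 1) AS g
      ON g.d = a.d AND g.t = a.t
    ORDER BY a.entry
""").fetchall()

print([row[0] for row in dupes])
```

On the sample data this keeps entries 1, 2, 4 and 5 and drops entry 3.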

Oracle SQL: Optimizing LEFT OUTER JOIN of two similar select statements to be smaller and/or more efficient

So I have this Oracle SQL query:
SELECT man.Toilet_Type, NVL(man.manual_PORTA_POTTY, 0) MANUAL, NVL(reg.regular_PORTA_POTTY, 0) REGULAR FROM (
SELECT A.Visitor Toilet_Type, COUNT(A.Toilet_ID) MANUAL_PORTA_POTTY FROM
BORE.EnragedPotty A,
BORE.SemiEnragedPotty B,
BORE.ManualPotty C
WHERE B.SemiEnragedPotty_ID = C.SemiEnragedPotty_ID
AND B.Toilet_ID = A.Toilet_ID
GROUP BY Visitor
ORDER BY Visitor ASC) man
LEFT OUTER JOIN
(SELECT A.Visitor Toilet_Type, COUNT(B.Toilet_ID) REGULAR_PORTA_POTTY FROM
BORE.EnragedPotty A,
BORE.RegularPotty B
WHERE B.Toilet_ID = A.Toilet_ID
GROUP BY Visitor
ORDER BY Visitor ASC) reg ON man.Toilet_Type = reg.Toilet_Type
This gives two table results. The first query, man, gives me the following output:
+===============+========+
| Toilet_Type | Manual |
+===============+========+
| Portable | 234 |
+---------------+--------+
| Home | 10 |
+---------------+--------+
| Assassination | 2 |
+---------------+--------+
The second query, reg, gives me the same output as above, but with REGULAR instead of MANUAL.
What I want to do is query the databases in a more efficient manner. I want the output to be formatted like so:
+===============+========+=========+
| Toilet_Type | Manual | Regular |
+===============+========+=========+
| Portable | 234 | 444 |
+---------------+--------+---------+
| Home | 10 | 222 |
+---------------+--------+---------+
| Assassination | 2 | 111 |
+---------------+--------+---------+
Surely this can be done in a single query without using a LEFT OUTER JOIN?
This is untested, as I didn't have any sample data, but I think something similar to this might get it done in one query:
SELECT
E.Visitor Toilet_Type,
SUM(case when SE.SemiEnragedPotty_ID is not null and
M.SemiEnragedPotty_ID is not null then 1 else 0 end) MANUAL_PORTA_POTTY,
SUM(case when R.Toilet_ID is not null then 1 else 0 end) REGULAR_PORTA_POTTY
FROM
BORE.EnragedPotty E,
BORE.SemiEnragedPotty SE,
BORE.ManualPotty M,
BORE.RegularPotty R
WHERE
E.Toilet_ID = SE.Toilet_ID (+) AND
SE.SemiEnragedPotty_ID = M.SemiEnragedPotty_ID (+) AND
E.Toilet_ID = R.Toilet_ID (+)
GROUP BY E.Visitor
ORDER BY E.Visitor ASC
I may have some of the details off -- I had to rename your aliases to follow which table was which, so it wouldn't shock me if I misplaced one of them.
If you need to pull from the same dataset twice, you should consider using subquery factoring.
WITH
some_result_you_dont_want_to_repeat AS (
-- Chunk of SQL goes here
)
SELECT
-- More SQL here
FROM some_result_you_dont_want_to_repeat once
JOIN some_result_you_dont_want_to_repeat twice
ON ...
In your case, it appears that your A-B table join can be factored out.
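One way to combine a WITH block and conditional aggregation is to stack the manual and regular matches with UNION ALL and count each kind per visitor, which needs no outer join. This is a minimal sketch, not the original schema: it assumes SQLite through Python's sqlite3 module instead of Oracle, and the simplified table names, column names, and sample rows are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical, simplified versions of the four BORE tables
    CREATE TABLE enraged (toilet_id INTEGER, visitor TEXT);
    CREATE TABLE semi (semi_id INTEGER, toilet_id INTEGER);
    CREATE TABLE manual (semi_id INTEGER);
    CREATE TABLE regular (toilet_id INTEGER);
    INSERT INTO enraged VALUES (1, 'Portable'), (2, 'Portable'), (3, 'Home');
    INSERT INTO semi VALUES (10, 1), (11, 3);
    INSERT INTO manual VALUES (10), (10), (11);
    INSERT INTO regular VALUES (1), (2), (2), (3);
""")

rows = conn.execute("""
    WITH hits AS (
        -- one row per manual match, tagged with its kind
        SELECT a.visitor, 'manual' AS kind
        FROM enraged AS a
        JOIN semi AS b ON b.toilet_id = a.toilet_id
        JOIN manual AS c ON c.semi_id = b.semi_id
        UNION ALL
        -- one row per regular match
        SELECT a.visitor, 'regular' AS kind
        FROM enraged AS a
        JOIN regular AS r ON r.toilet_id = a.toilet_id
    )
    SELECT visitor AS toilet_type,
           SUM(CASE WHEN kind = 'manual' THEN 1 ELSE 0 END) AS manual,
           SUM(CASE WHEN kind = 'regular' THEN 1 ELSE 0 END) AS regular
    FROM hits
    GROUP BY visitor
    ORDER BY visitor
""").fetchall()

print(rows)
```

Because the two kinds are stacked rather than joined to each other, the counts are not multiplied by one another. A visitor with no manual and no regular rows would not appear in this output at all.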

Select multiple (non-aggregate function) columns with GROUP BY

I am trying to select the max value from one column, while grouping by another non-unique id column which has multiple duplicate values. The original database looks something like:
mukey | comppct_r | name | type
65789 | 20 | a | 7n
65789 | 15 | b | 8m
65789 | 1 | c | 1o
65790 | 10 | a | 7n
65790 | 26 | b | 8m
65790 | 5 | c | 1o
...
This works just fine using:
SELECT c.mukey, Max(c.comppct_r) AS ComponentPercent
FROM c
GROUP BY c.mukey;
Which returns a table like:
mukey | ComponentPercent
65789 | 20
65790 | 26
65791 | 50
65792 | 90
I want to be able to add other columns in without affecting the GROUP BY function, to include columns like name and type into the output table like:
mukey | comppct_r | name | type
65789 | 20 | a | 7n
65790 | 26 | b | 8m
65791 | 50 | c | 7n
65792 | 90 | d | 7n
but it always outputs an error saying I need to use an aggregate function with select statement. How should I go about doing this?
You have yourself a greatest-n-per-group problem. This is one of the possible solutions:
select c.mukey, c.comppct_r, c.name, c.type
from c
inner join(
select c.mukey, max(c.comppct_r) comppct_r
from c
group by c.mukey
) ss on c.mukey = ss.mukey and c.comppct_r= ss.comppct_r
Another possible approach, same output:
select c1.*
from c c1
left outer join c c2
on (c1.mukey = c2.mukey and c1.comppct_r < c2.comppct_r)
where c2.mukey is null;
There's a comprehensive and explanatory answer on the topic here: SQL Select only rows with Max Value on a Column
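Both approaches can be checked against the sample rows from the question. This is a minimal sketch assuming SQLite through Python's sqlite3 module as a stand-in engine; only the two mukey groups shown in the question are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE c (mukey INTEGER, comppct_r INTEGER, name TEXT, type TEXT);
    INSERT INTO c VALUES (65789, 20, 'a', '7n'), (65789, 15, 'b', '8m'),
                         (65789, 1, 'c', '1o'), (65790, 10, 'a', '7n'),
                         (65790, 26, 'b', '8m'), (65790, 5, 'c', '1o');
""")

# Approach 1: join each row to the per-mukey maximum.
via_max_join = conn.execute("""
    SELECT c.mukey, c.comppct_r, c.name, c.type
    FROM c
    JOIN (SELECT mukey, MAX(comppct_r) AS comppct_r
          FROM c GROUP BY mukey) AS ss
      ON ss.mukey = c.mukey AND ss.comppct_r = c.comppct_r
    ORDER BY c.mukey
""").fetchall()

# Approach 2: keep the rows for which no row with a larger comppct_r exists.
via_anti_join = conn.execute("""
    SELECT c1.mukey, c1.comppct_r, c1.name, c1.type
    FROM c AS c1
    LEFT JOIN c AS c2
           ON c2.mukey = c1.mukey AND c1.comppct_r < c2.comppct_r
    WHERE c2.mukey IS NULL
    ORDER BY c1.mukey
""").fetchall()

print(via_max_join)
print(via_anti_join)
```

Both queries return the full row holding the largest comppct_r for each mukey. If two rows tie for the maximum, both approaches return both rows.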
Any non-aggregate column has to appear in the GROUP BY clause. Why?
t1
x1 y1 z1
1 2 5
2 2 7
Now suppose you write a query like:
select x1,y1,max(z1) from t1 group by y1;
This query returns only one row, but what should the value of x1 be? It is ambiguous, since the group contains both x1 = 1 and x1 = 2, so most SQL engines reject the query with an error.
To fix this, either apply an aggregate function to x1 or add x1 to the GROUP BY clause. Which one is right depends on your requirement.
If you want all rows, each with the aggregate of z1 for its y1 group, you can use a subquery:
Select x1,y1,(select max(z1) from t1 where tt.y1=y1 group by y1)
from t1 tt;
This will produce a result like:
t1
x1 y1 max(z1)
1 2 7
2 2 7
Try using a virtual table as follows:
SELECT vt.*, c.name FROM (
SELECT c.mukey, Max(c.comppct_r) AS ComponentPercent
FROM c
GROUP BY c.mukey
) AS vt, c
WHERE vt.mukey = c.mukey
AND vt.ComponentPercent = c.comppct_r
You can't just add additional columns without adding them to the GROUP BY or applying an aggregate function. The reason for that is, that the values of a column can be different inside one group. For example, you could have two rows:
mukey | comppct_r | name | type
65789 | 20 | a | 7n
65789 | 20 | b | 9f
What should the aggregated group look like for the columns name and type?
If name and type are always the same inside a group, just add them to the GROUP BY clause:
SELECT c.mukey, c.name, c.type, Max(c.comppct_r) AS ComponentPercent
FROM c
GROUP BY c.mukey, c.name, c.type;
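Use a 'Having' clause. HAVING can only compare grouped columns and aggregates, so join the table to itself to make the per-mukey maximum available:
SELECT c.mukey, c.comppct_r, c.name, c.type
FROM c
INNER JOIN c c2 ON c2.mukey = c.mukey
GROUP BY c.mukey, c.comppct_r, c.name, c.type
HAVING c.comppct_r = Max(c2.comppct_r);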