SQL join not not real results - sql

I have a database with some tables and I want to retrieve from a user the last 8 tee of user that I follow:
This is my table:
Table users:
- id
- name
- surname
- created
- modified
2 | Caesar | Surname1
3 | Albert | Surname2
4 | Paul | Surname3
5 | Nicol | Surname4
Table tee
- id
- name
- user_id
1 | first | 3
2 | second | 3
3 | third | 4
4 | fourth | 4
5 | fifth | 5
6 | sixth | 5
7 | seventh | 5
table user_follow
- id
- user_follower_id //the user that decide to follo someone
- user_followed_id //the user that I decide to follow
1 | 2 | 3
2 | 2 | 5
I expect to retrieve this tee with its creator because my id is 2 (I'm Caesar for example):
1 | first | 3
2 | second | 3
5 | fifth | 5
6 | sixth | 5
7 | seventh | 5
For example if I user that I follow have created 4 tee another that I follow 1, another 2, I think that I can retrieve all this tee if are the last inserted in all sites because are created from user that I follow.
But I retrieve only one tee of an user
This is my query:
SELECT *, `tee`.`id` as id, `tee`.`created` as created, `users`.`id` as user_id, `users`.`created` as user_created
FROM (`tee`)
LEFT JOIN `users`
ON `users`.`id` = `tee`.`user_id`
LEFT JOIN `user_follow` ON `tee`.`user_id` = `user_follow`.`user_followed_id`
WHERE `tee`.`id` != '41' AND
`tee`.`id` != '11' AND
`tee`.`id` != '13' AND
`tee`.`id` != '20' AND
`tee`.`id` != '14' AND
`tee`.`id` != '35' AND
`tee`.`id` != '31' AND
`tee`.`id` != '36' AND
`user_follow`.`user_follower_id` = '2'
ORDER BY `tee`.`created` desc LIMIT 8
I have added 8 tee id that I don't want to retrieve because are "special".
Why this query doesn't work?
I think the problem is in left join or I have to make other thing to retreve this results.
Thanks

I don't see anything wrong with your query -- I have updated the syntax to use INNER JOINs and to use NOT IN though:
SELECT *,
`tee`.`id` as id, `tee`.`created` as created, `users`.`id` as user_id, `users`.`created` as user_created
FROM `tee`
JOIN `users` ON `users`.`id` = `tee`.`user_id`
JOIN `user_follow` ON `tee`.`user_id` = `user_follow`.`user_followed_id`
WHERE `tee`.`id` NOT IN (41,11,13,20,14,35,31,36)
AND `user_follow`.`user_follower_id` = '2'
ORDER BY `tee`.`created` desc
LIMIT 8
Condensed SQL Fiddle Demo

you can use Not In ('41','11',...) instead of tee.id != '41' AND tee.id != '11'.....

Use a UNION clause. First select your tees then select the tees from people you follow.

Related

I need merge 2 lines in only one based on other column attribute name

I am stuck for some time in this. Imagine that I have this table:
diagId | astigmatic
1 | No
1 | Yes
2 | No
3 | No
4 | No
5 | No
5 | Yes
6 | No
And I want the output:
diagId | astigmatic
1 | Yes
2 | No
3 | No
4 | No
5 | Yes
6 | No
So if there is a diagId with Yes and No, I want the Yes tuple to pervail and the No tuple to disappear.
How can I achieve this?
Thanks!
One method is aggregation:
select diagid, max(astigmatic) as astigmatic
from t
group by diagid;
This works because 'yes' > 'no'.
Or, a conceptually similar method but one that is probably faster in Postgres:
select distinct on (diagid) t.*
from t
order by diagid, astigmatic desc;
Another approach is or and not exists:
select t.*
from t
where t.astigmatic = 'yes' or
(t.astigmatic = 'no' and
not exists (select 1
from t t2
where t2.id = t.id and
t2.astigmatic = 'yes'
)
);
The first two method return one row per id -- guaranteed. This last method could return multiple rows, if there are multiple 'yes's or 'no's for a given id.

IDs in Two Columns - Find One ID and Retrieve Other ID

This question has probably been asked before, but I haven't found it. I have a PostgreSQL table with IDs in two columns.
user_left | user_right |
-----------+------------
1 | 2 |
2 | 3 |
4 | 2 |
If I select user 2, I'd like to return something like:
users |
------+
1 |
3 |
4 |
I know I will have to join something, but what I've tried (inner joins, outer joins) baffles me. Any help or direction is appreciated.
Here is one method:
select user_left
from t
where user_right = 2
union all
select user_right
from t
where user_left = 2;
You can also use a case expression:
select (case when user_left = 2 then user_right else user_left end)
from t
where 2 in (user_left, user_right);

SQL query that finds dates between a range and takes values from another query & iterates range over them?

Sorry if the wording for this question is strange. Wasn't sure how to word it, but here's the context:
I'm working on an application that shows some data about the how often individual applications are being used when users make a request from my web server. The way we take data is by every time the start page loads, it increments a data table called WEB_TRACKING at the date of when it loaded. So there are a lot of holes in data, for example, an application might've been used heavily on September 1st but not at all September 2nd. What I want to do, is add those holes with a value on hits of 0. This is what I came up with.
Select HIT_DATA.DATE_ACCESSED, HIT_DATA.APP_ID, HIT_DATA.NAME, WORKDAYS.BENCH_DAYS, NVL(HIT_DATA.HITS, 0) from (
select DISTINCT( TO_CHAR(WEB.ACCESS_TIME, 'MM/DD/YYYY')) as BENCH_DAYS
FROM WEB_TRACKING WEB
) workDays
LEFT join (
SELECT TO_CHAR(WEB.ACCESS_TIME, 'MM/DD/YYYY') as DATE_ACCESSED, APP.APP_ID, APP.NAME,
COUNT(WEB.IP_ADDRESS) AS HITS
FROM WEB_TRACKING WEB
INNER JOIN WEB_APP APP ON WEB.APP_ID = APP.APP_ID
WHERE APP.IS_ENABLED = 1 AND (APP.APP_ID = 1 OR APP.APP_ID = 2)
AND (WEB.ACCESS_TIME > TO_DATE('08/04/2018', 'MM/DD/YYYY')
AND WEB.ACCESS_TIME < TO_DATE('09/04/2018', 'MM/DD/YYYY'))
GROUP BY TO_CHAR(WEB.ACCESS_TIME, 'MM/DD/YYYY'), APP.APP_ID, APP.NAME
ORDER BY TO_CHAR(WEB.ACCESS_TIME, 'MM/DD/YYYY'), app_id DESC
) HIT_DATA ON HIT_DATA.DATE_ACCESSED = WORKDAYS.BENCH_DAYS
ORDER BY WORKDAYS.BENCH_DAYS
It returns all the dates that between the date range and even converts null hits to 0. However, it returns null for app id and app name. Which makes sense, and I understand how to give a default value for 1 application. I was hoping someone could help me figure out how to do it for multiple applications.
Basically, I am getting this (in the case of using just one application):
| APP_ID | NAME | BENCH_DAYS | HITS |
| ------ | ---------- | ---------- | ---- |
| NULL | NULL | 08/04/2018 | 0 |
| 1 | test_app | 08/05/2018 | 1 |
| NULL | NULL | 08/06/2018 | 0 |
But I want this(with multiple applications):
| APP_ID | NAME | BENCH_DAYS | HITS |
| ------ | ---------- | ---------- | ---- |
| 1 | test_app | 08/04/2018 | 0 |<- these 0's are converted from null
| 1 | test_app | 08/05/2018 | 1 |
| 1 | test_app | 08/06/2018 | 0 | <- these 0's are converted from null
| 2 | prod_app | 08/04/2018 | 2 |
| 2 | prod_app | 08/05/2018 | 0 | <- these 0's are converted from null
So again to reiterate the question in this long post. How should I go about populating this query so that it fills up the holes in the dates but also reuses the application names and ids and populates that information as well?
You need a list of dates, that probably comes from a number generator rather than a table (if that table has holes, your report will too)
Example, every date for the past 30 days:
select trunc(sysdate-30) + level as bench_days from dual connect by level < 30
Use TRUNC instead of turning a date into a string in order to cut the time off
Now you have a list of dates, you want to add in repeating app id and name:
select * from
(select trunc(sysdate-30) + level as bench_days from dual connect by level < 30) dat
CROSS JOIN
(select app_id, name from WEB_APP WHERE APP.IS_ENABLED = 1 AND APP_ID in (1, 2) app
Now you have all your dates, crossed with all your apps. 2 apps and 30 days will make a 60 row resultset via a cross join. Left join your stat data onto it, and group/count/sum/aggregate ...
select app.app_id, app.name, dat.artificialday, COALESCE(stat.ct, 0) as hits from
(select trunc(sysdate-30) + level as artificialday from dual connect by level < 30) dat
CROSS JOIN
(select app_id, name from WEB_APP WHERE APP.IS_ENABLED = 1 AND APP_ID in (1, 2) app
LEFT JOIN
(SELECT app_id, trunc(access_time) accdate, count(ip_address) ct from web_tracking group by app_id, trunc(access_time)) stat
ON
stat.app_id = app.app_id AND
stat.accdate = dat.artificialday
You don't have to write the query this way/do your grouping as a subquery, I'm just representing it this way to lead you to thinking about your data in blocks, that you build in isolation and join together later, to build more comprehensive blocks

Best Way to Join One Column on Columns From Two Other Tables

I have a schema like the following in Oracle
Section:
+--------+----------+
| sec_ID | group_ID |
+--------+----------+
| 1 | 1 |
| 2 | 1 |
| 3 | 2 |
| 4 | 2 |
+--------+----------+
Section_to_Item:
+--------+---------+
| sec_ID | item_ID |
+--------+---------+
| 1 | 1 |
| 1 | 2 |
| 2 | 3 |
| 2 | 4 |
+--------+---------+
Item:
+---------+------+
| item_ID | data |
+---------+------+
| 1 | a |
| 2 | b |
| 3 | c |
| 4 | d |
+---------+------+
Item_Version:
+---------+----------+--------+
| item_ID | start_ID | end_ID |
+---------+----------+--------+
| 1 | 1 | |
| 2 | 1 | 3 |
| 3 | 2 | |
| 4 | 1 | 2 |
+---------+----------+--------+
Section_to_Item has FK into Section and Item on the *_ID columns.
Item_version is indexed on item_ID but has no FK to Item.item_ID (ran out of space in the snapshot group).
I have code that receives a list of version IDs and I want to get all items in sections in a given group that are valid for at least one of the versions passed in. If an item has no end_ID, it's valid for anything starting with start_ID. If it has an end_id, it's valid for anything up until (not including) end_ID.
What I currently have is:
SELECT Items.data
FROM Section, Section_to_Items, Item, Item_Version
WHERE Section.group_ID = 1
AND Section_to_Item.sec_ID = Section.sec_ID
AND Item.item_ID = Section_to_Item.item_ID
AND Item.item_ID = Item_Version.item_ID
AND exists (
SELECT *
FROM (
SELECT 2 AS version FROM DUAL
UNION ALL SELECT 3 AS version FROM DUAL
) passed_versions
WHERE Item_Version.start_ID <= passed_versions.version
AND (Item_Version.end_ID IS NULL or Item_Version.end_ID > passed_version.version)
)
Note that the UNION ALL statement is dynamically generated from the list of passed in versions.
This query currently does a cartesian join and is very slow.
For some reason, if I change the query to join
AND Item_Version.item_ID = Section_to_Item.item_ID
which is not a FK, the query does not do the cartesian join and is much faster.
A) Can anyone explain why this is?
B) Is this the right way to be joining this sequence of tables (I feel weird about joining Item.item_ID to two different tables)
C) Is this the right way to get versions between start_ID and end_ID?
Edit
Same query with inner join syntax:
SELECT Items.data
FROM Item
INNER JOIN Section_to_Items ON Section_to_Items.item_ID = Item.item_ID
INNER JOIN Section ON Section.sec_ID = Section_to_Items.sec_ID
INNER JOIN Item_Version ON Item_Version.item_ID = Item_.item_ID
WHERE Section.group_ID = 1
AND exists (
SELECT *
FROM (
SELECT 2 AS version FROM DUAL
UNION ALL SELECT 3 AS version FROM DUAL
) passed_versions
WHERE Item_Version.start_ID <= passed_versions.version
AND (Item_Version.end_ID IS NULL or Item_Version.end_ID > passed_version.version)
)
Note that in this case the performance difference comes from joining on Item_Version first and then joining Section_to_Item on Item_Version.item_ID.
In terms of table size, Section_to_Item, Item, and Item_Version should be similar (1000s) while Section should be small.
Edit
I just found out that apparently, the schema has no FKs. The FKs specified in the schema configuration files are ignored. They're just there for documentation. So there's no difference between joining on a FK column or not. That being said, by changing the joins into a cascade of SELECT INs, I'm able to avoid joining the entire Item table twice. I don't love the resulting query, and I don't really understand the difference, but the stats indicate it's much less work (changes the A-Rows returned from the inner most scan on Section from 656,000 to 488 (it used to be 656k starts returning 1 row, now it's 488 starts returning 1 row)).
Edit
It turned out to be stale statistics - the two queries were equivalent the whole time but with the incomplete statistics, the DB happened to notice the correct plan only in the second instance. After updating statistics, both queries generated the same plan.
I'm not sure if this is the best idea but this seems to avoid the cartesian join:
select data
from Item
where item_ID in (
select item_ID
from Item_Version
where item_ID in (
select item_ID
from Section_to_Item
where sec_ID in (
select sec_ID
from Section
where group_ID = 1
)
)
and exists (
select 1
from (
select 2 as version
from dual
union all
select 3 as version
from dual
) versions
where versions.version >= start_ID
and (end_ID is null or versions.version <)
)
)

SQL query filtering

Using SQL Server 2005, I have a table where certain events are being logged, and I need to create a query that returns only very specific results. There's an example below:
Log:
Log_ID | FB_ID | Date | Log_Name | Log_Type
7 | 4 | 2007/11/8 | Nina | Critical
6 | 4 | 2007/11/6 | John | Critical
5 | 4 | 2007/11/6 | Mike | Critical
4 | 4 | 2007/11/6 | Mike | Critical
3 | 3 | 2007/11/3 | Ben | Critical
2 | 3 | 2007/11/1 | Ben | Critical
The query should do the following: return ONLY one row per each FB_ID, but this needs to be the one where Log_Name has changed for the first time, or if the name never changes, then the first dated row.
In layman's terms I need this to browse through a DB to check for each instance where the responsibility of a case (FB_ID) has been moved to another person, and in case it never has, then just get the original logger's name.
In the example above, I should get rows (Log_ID) 2 and 6.
Is this even possible? Right now there's a discussion going on whether the DB was just made the wrong way. :)
I imagine I need to somehow be able to store the first resulting Log_Name into a variable and then compare it with an IF condition etc. I have no idea how to do such a thing with SQL though.
Edit: Updated the date. And to clarify on this, the correct result would look like this:
Log_ID | FB_ID | Date | Log_Name | Log_Type
6 | 4 | 2007/11/6 | John | Critical
2 | 3 | 2007/11/1 | Ben | Critical
It's not the first date per FB_ID I'm after, but the row where the Log_Name is changed from the original.
Originally FB_ID 4 belongs to Mike, but the query should return the row where it moves on to John. However, it should NOT return the row where it moves further on to Nina, because the first responsibility change already happened when John got it.
In the case of Ben with FB_ID 3, the logger is never changed, so the first row for Ben should be returned.
I guess that there is a better and more performant way, but this one seems to work:
SELECT *
FROM log
WHERE log_id IN
( SELECT MIN(log_id)
FROM log
WHERE
( SELECT COUNT(DISTINCT log_name)
FROM log log2
WHERE log2.fb_id = log.fb_id ) = 1
OR log.log_name <> ( SELECT log_name
FROM log log_3
WHERE log_3.log_id =
( SELECT MIN(log_id)
FROM log log4
WHERE log4.fb_id = log.fb_id ) )
GROUP BY fb_id )
This will efficiently use an index on (fb_id, cdate, id):
SELECT lo4.*
FROM
(
SELECT CASE WHEN ln.log_id IS NULL THEN lo2.log_id ELSE ln.log_id END AS log_id,
ROW_NUMBER() OVER (PARTITION BY lo2.fb_id ORDER BY lo2.cdate) AS rn
FROM (
SELECT
lo.*,
(
SELECT TOP 1 log_id
FROM t_log li
WHERE li.fb_id = lo.fb_id
AND li.cdate >= lo.cdate
AND li.log_id <> lo.log_id
AND li.log_name <> lo.log_name
ORDER BY
cdate, log_id
) AS next_id
FROM t_log lo
) lo2
LEFT OUTER JOIN
t_log ln
ON ln.log_id = lo2.next_id
) lo3, t_log lo4
WHERE lo3.rn = 1
AND lo4.log_id = lo3.log_id
If I've understood the problem correctly, the following SQL should do the trick:
SELECT Log_ID, FB_ID, min(Date), Log_Name, Log_Type
FROM Log
GROUP BY Date
The SQL will select the row with the earliest date for each FP_ID.