SQL JOIN - taking all left rows with grouping - sql

I have tables named files and folders with ids and the usual relation between them: files.folder_id = folders.id
Additionally, some files/folders can be marked as ignored via the ignored field.
I'm trying to get a list of folders and the count of files in each folder.
My first approach worked fine but missed empty folders:
SELECT folders.id, folders.name, count(files.id) kount
FROM folders, files
WHERE folders.site_id=111
AND files.ignored=0
AND folders.ignored=0
AND files.site_id=111
AND files.folder_id=folders.id
GROUP BY folders.name
ORDER BY folders.name
So I tried a LEFT JOIN:
SELECT folders.id, folders.name, count(files.id) kount
FROM folders
LEFT JOIN files
ON files.folder_id=folders.id
WHERE folders.site_id=111
AND folders.ignored=0
AND files.ignored=0
AND files.site_id=111
GROUP BY folders.name
ORDER BY folders.name
but again, empty folders are missing. What am I doing wrong?

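You need to put the conditions that filter the joined table directly into the LEFT JOIN's ON clause. In the WHERE clause they reject the NULL-extended rows produced for empty folders, which effectively turns the LEFT JOIN back into an inner join: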
SELECT folders.id, folders.name, count(files.id) kount
FROM folders
LEFT JOIN files ON files.folder_id=folders.id
AND files.ignored=0
AND files.site_id=111
WHERE folders.site_id=111
AND folders.ignored=0
GROUP BY folders.name
ORDER BY folders.name
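To see the difference in action, here is a minimal sketch using Python's built-in sqlite3 module. The column names follow the question, but the rows (folders 'docs', 'empty', 'hidden') are made up for illustration, and the GROUP BY lists both selected columns so the query also runs on stricter engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE folders (id INTEGER PRIMARY KEY, name TEXT, site_id INTEGER, ignored INTEGER);
    CREATE TABLE files (id INTEGER PRIMARY KEY, folder_id INTEGER, site_id INTEGER, ignored INTEGER);
    -- Hypothetical sample data: 'empty' has no files, 'hidden' is ignored.
    INSERT INTO folders VALUES (1, 'docs', 111, 0), (2, 'empty', 111, 0), (3, 'hidden', 111, 1);
    INSERT INTO files VALUES (10, 1, 111, 0), (11, 1, 111, 0), (12, 1, 111, 1), (13, 3, 111, 0);
""")

# File filters in WHERE: the NULL-extended row for the empty folder is discarded.
where_version = conn.execute("""
    SELECT folders.name, COUNT(files.id)
    FROM folders
    LEFT JOIN files ON files.folder_id = folders.id
    WHERE folders.site_id = 111 AND folders.ignored = 0
      AND files.ignored = 0 AND files.site_id = 111
    GROUP BY folders.id, folders.name
    ORDER BY folders.name
""").fetchall()

# File filters in ON: they only decide which files match, so every folder survives.
on_version = conn.execute("""
    SELECT folders.name, COUNT(files.id)
    FROM folders
    LEFT JOIN files ON files.folder_id = folders.id
                   AND files.ignored = 0 AND files.site_id = 111
    WHERE folders.site_id = 111 AND folders.ignored = 0
    GROUP BY folders.id, folders.name
    ORDER BY folders.name
""").fetchall()

print(where_version)  # [('docs', 2)]
print(on_version)     # [('docs', 2), ('empty', 0)]
```

COUNT(files.id) skips NULLs, which is why the empty folder reports 0 rather than 1.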

Try this.
SELECT folders.id, folders.name, count(files.id) kount
FROM folders
LEFT JOIN files
ON files.folder_id=folders.id
AND files.ignored=0
AND files.site_id=111
WHERE folders.site_id=111
AND folders.ignored=0
GROUP BY folders.name
ORDER BY folders.name

Related

Counting between two different tables Oracle

I have two ORACLE tables, FOLDER and FILES. Each folder contains several files.
I am trying to get the number of folders for each file count, that is, the number of folders x that contain y files.
For example 50 folders contain 10 files, 35 folders contain 8 files...
Can I get some help please on the query :
select count(fl.id_folder) ,count(fi.fileID) from FOLDER fl inner join FILES fi on fl.id_folder=fi.fileID group by fl.id_folder;
You can use two levels of aggregation. Assuming that table files has a column called id_folder, you would do:
select cnt_files, count(*) cnt_folders
from (
select count(*) cnt_files
from files
group by id_folder
) t
group by cnt_files
We can write the query using group by as follows:
Select cnt_files, count(1) as num_of_folders
from
(select fl.id_folder, count(fi.fileid) as cnt_files
from FOLDER fl
Left join FILES fi on fl.id_folder=fi.id_folder -- assumes FILES has an id_folder column
Group by fl.id_folder)
Group by cnt_files;
Note: I have used the LEFT JOIN to consider all the folders (With and Without files in it)
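The two-level aggregation can be tried out with a small sketch in Python's sqlite3 module. The table layout (an id_folder column on files) is the same assumption the first answer makes, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE folder (id_folder INTEGER PRIMARY KEY);
    CREATE TABLE files (fileid INTEGER PRIMARY KEY, id_folder INTEGER);
    -- Hypothetical data: folders 1 and 2 hold two files, folder 3 one, folder 4 none.
    INSERT INTO folder VALUES (1), (2), (3), (4);
    INSERT INTO files VALUES (100, 1), (101, 1), (102, 2), (103, 2), (104, 3);
""")

rows = conn.execute("""
    SELECT cnt_files, COUNT(*) AS num_of_folders
    FROM (
        -- Inner level: one row per folder with its file count (0 for empty folders).
        SELECT fl.id_folder, COUNT(fi.fileid) AS cnt_files
        FROM folder fl
        LEFT JOIN files fi ON fi.id_folder = fl.id_folder
        GROUP BY fl.id_folder
    ) AS per_folder
    -- Outer level: how many folders share each file count.
    GROUP BY cnt_files
    ORDER BY cnt_files
""").fetchall()

print(rows)  # [(0, 1), (1, 1), (2, 2)]
```

Dropping the folder table and grouping files alone, as in the first answer, gives the same rows minus the (0, 1) entry for empty folders.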

Get content data from specific files from bigquery-public-data:github_repos different results with JOIN and WHERE

The most common way of getting content data for specific files from bigquery-public-data:github_repos by name is like this:
SELECT *
FROM [bigquery-public-data:github_repos.sample_contents]
WHERE id IN (SELECT id FROM (
SELECT *
FROM [bigquery-public-data:github_repos.sample_files]
WHERE path = 'README.md'
))
This query gives me 14557 results.
I thought that running the query below would give me the same amount of results:
SELECT contents.*
FROM [bigquery-public-data:github_repos.sample_contents] contents
INNER JOIN [bigquery-public-data:github_repos.sample_files] files
ON contents.id = files.id
WHERE files.path = 'README.md'
But it ends up with 14645 results.
Why is there a difference between these two results, and which one is the proper one for selecting content data of README.md files?
EDIT:
It looks like forked files without modification have the same id across other repos (forks).
The first query gives you all contents with files having path = 'README.md', no matter how many times that file id is present in the files table.
The second query gives you the same content as many times as the respective file appears in the files table, because of the JOIN.
You can run the query below to validate this:
SELECT EXACT_COUNT_DISTINCT(contents.id)
FROM [bigquery-public-data:github_repos.sample_contents] contents
INNER JOIN [bigquery-public-data:github_repos.sample_files] files
ON contents.id = files.id
WHERE files.path = 'README.md'
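The same effect can be reproduced locally with a tiny sketch in Python's sqlite3 module. The two tables below are simplified stand-ins for sample_contents and sample_files, and the repo names and ids are made up; the point is only that one blob id shared by a repo and its fork is counted once by IN and twice by JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contents (id TEXT PRIMARY KEY, content TEXT);
    CREATE TABLE files (repo_name TEXT, path TEXT, id TEXT);
    INSERT INTO contents VALUES ('a1', 'readme A'), ('b2', 'readme B');
    -- Hypothetical rows: the fork shares blob id 'a1' with the original repo.
    INSERT INTO files VALUES
        ('origin/repo', 'README.md', 'a1'),
        ('fork/repo',   'README.md', 'a1'),
        ('other/repo',  'README.md', 'b2');
""")

# IN is a membership test: each contents row appears at most once.
in_count = conn.execute("""
    SELECT COUNT(*) FROM contents
    WHERE id IN (SELECT id FROM files WHERE path = 'README.md')
""").fetchone()[0]

# JOIN emits one row per matching files row, so shared ids are repeated.
join_count = conn.execute("""
    SELECT COUNT(*) FROM contents
    INNER JOIN files ON contents.id = files.id
    WHERE files.path = 'README.md'
""").fetchone()[0]

# Counting distinct ids over the join matches the IN result again.
distinct_join_count = conn.execute("""
    SELECT COUNT(DISTINCT contents.id) FROM contents
    INNER JOIN files ON contents.id = files.id
    WHERE files.path = 'README.md'
""").fetchone()[0]

print(in_count, join_count, distinct_join_count)  # 2 3 2
```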

Merge / Join SQL Select Queries

I am struggling with combining the SELECT statements below. I know I could cheat by adding some fake columns and then using UNION, but I want to do this correctly.
Once I have them joined, I will be putting the statement into an XML file for use with Word and CRM4.
SELECT BILLTO_NAME,
BILLTO_LINE1,
BILLTO_LINE2,
BILLTO_LINE3,
BILLTO_CITY,
BILLTO_COUNTRY,
BILLTO_POSTALCODE,
ORDERNUMBER,
REQUESTDELIVERYBY,
MODIFIEDON,
SHIPTO_NAME,
SHIPTO_LINE1,
SHIPTO_LINE2,
SHIPTO_LINE3,
SHIPTO_CITY,
SHIPTO_STATEORPROVINCE,
SHIPTO_COUNTRY,
SHIPTO_POSTALCODE,
CREATEDBY
FROM SALESORDERBASE
SELECT QUANTITY,
DESCRIPTION
FROM SALESORDERDETAILBASE
SELECT NEW_ORDERNOTES,
NEW_NOTES
FROM SALESORDEREXTENSIONBASE
They all have the common column of SalesOrderID, which I need to add in somewhere as well.
You can use a LEFT JOIN on the tables:
SELECT ob.SalesOrderID,
ob.BILLTO_NAME,
ob.BILLTO_LINE1,
ob.BILLTO_LINE2,
ob.BILLTO_LINE3,
ob.BILLTO_CITY,
ob.BILLTO_COUNTRY,
ob.BILLTO_POSTALCODE,
ob.ORDERNUMBER,
ob.REQUESTDELIVERYBY,
ob.MODIFIEDON,
ob.SHIPTO_NAME,
ob.SHIPTO_LINE1,
ob.SHIPTO_LINE2,
ob.SHIPTO_LINE3,
ob.SHIPTO_CITY,
ob.SHIPTO_STATEORPROVINCE,
ob.SHIPTO_COUNTRY,
ob.SHIPTO_POSTALCODE,
ob.CREATEDBY,
od.QUANTITY,
od.DESCRIPTION,
oe.NEW_ORDERNOTES,
oe.NEW_NOTES
FROM SALESORDERBASE ob
LEFT JOIN SALESORDERDETAILBASE od
on ob.SalesOrderID = od.SalesOrderID
LEFT JOIN SALESORDEREXTENSIONBASE oe
on ob.SalesOrderID = oe.SalesOrderID
Assuming the key column is called id on SalesOrderBase and SalesOrderID on the other two tables, you can do this:
SELECT sob.BILLTO_NAME,
sob.BILLTO_LINE1,
sob.BILLTO_LINE2,
sob.BILLTO_LINE3,
sob.BILLTO_CITY,
sob.BILLTO_COUNTRY,
sob.BILLTO_POSTALCODE,
sob.ORDERNUMBER,
sob.REQUESTDELIVERYBY,
sob.MODIFIEDON,
sob.SHIPTO_NAME,
sob.SHIPTO_LINE1,
sob.SHIPTO_LINE2,
sob.SHIPTO_LINE3,
sob.SHIPTO_CITY,
sob.SHIPTO_STATEORPROVINCE,
sob.SHIPTO_COUNTRY,
sob.SHIPTO_POSTALCODE,
sob.CREATEDBY,
sodb.QUANTITY,
sodb.DESCRIPTION,
soeb.NEW_ORDERNOTES,
soeb.NEW_NOTES
From SalesOrderBase sob
JOIN SalesOrderDetailBase sodb
ON sob.id = sodb.SalesOrderID
JOIN SalesOrderExtensionBase soeb
ON sob.id = soeb.SalesOrderID
You can think of JOINing as slamming together rows side-by-side, whereas UNIONing is slamming together rows one on top of the other. UNIONS require that the columns be the same and JOINs require that there is a relationship of some kind between each row.
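As a rough illustration of that side-by-side versus stacked picture, here is a small sketch in Python's sqlite3 module. The orders and details tables are cut-down, hypothetical stand-ins for SALESORDERBASE and SALESORDERDETAILBASE with invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (SalesOrderID INTEGER PRIMARY KEY, BILLTO_NAME TEXT);
    CREATE TABLE details (SalesOrderID INTEGER, DESCRIPTION TEXT);
    -- Hypothetical data: order 1 has two detail lines, order 2 has none.
    INSERT INTO orders VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO details VALUES (1, 'Widget'), (1, 'Gadget');
""")

# JOIN: columns from both tables sit side by side, matched on SalesOrderID.
joined = conn.execute("""
    SELECT o.SalesOrderID, o.BILLTO_NAME, d.DESCRIPTION
    FROM orders o
    LEFT JOIN details d ON d.SalesOrderID = o.SalesOrderID
    ORDER BY o.SalesOrderID, d.DESCRIPTION
""").fetchall()

# UNION ALL: rows are stacked, so both SELECTs must expose the same number of columns.
stacked = conn.execute("""
    SELECT SalesOrderID, BILLTO_NAME FROM orders
    UNION ALL
    SELECT SalesOrderID, DESCRIPTION FROM details
    ORDER BY 1, 2
""").fetchall()

print(joined)   # [(1, 'Acme', 'Gadget'), (1, 'Acme', 'Widget'), (2, 'Globex', None)]
print(stacked)  # [(1, 'Acme'), (1, 'Gadget'), (1, 'Widget'), (2, 'Globex')]
```

Note that the joined result repeats the order columns once per detail line, which is worth keeping in mind when feeding it into a document template.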

SQL: order by how many rows there are (using COUNT?)

For SMF, I'm making a roster for the members of my clan (please don't come with "You should ask SMF", because that is completely irrelevant; this is just contextual information).
I need it to select all members (from smf_members) and order it by how many permissions they have in smf_permissions (so the script can determine who is higher in rank).
You can retrieve how many permissions there are by using: COUNT(permission) FROM smf_permissions.
I am now using this SQL:
SELECT DISTINCT(m.id_member), m.real_name, m.date_registered
FROM smf_members AS m, smf_permissions AS p
WHERE m.id_group=p.id_group
ORDER BY COUNT(p.permission)
However, this only returns one row! How do I return several rows?
Cheers,
Aart
You need a GROUP BY. I've also rewritten with explicit JOIN syntax. You might need to change to LEFT JOIN if you want to include members with zero permissions.
SELECT m.id_member,
m.real_name,
m.date_registered,
COUNT(p.permission) AS N
FROM smf_members AS m
JOIN smf_permissions AS p
ON m.id_group = p.id_group
GROUP BY m.id_member,
m.real_name,
m.date_registered
ORDER BY COUNT(p.permission)
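Here is a minimal sketch of the GROUP BY version in Python's sqlite3 module, using the LEFT JOIN variant so members with zero permissions still show up. The tables are simplified (date_registered is left out) and the members and permissions are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE smf_members (id_member INTEGER PRIMARY KEY, real_name TEXT, id_group INTEGER);
    CREATE TABLE smf_permissions (id_group INTEGER, permission TEXT);
    -- Hypothetical data: group 1 has three permissions, group 2 one, group 3 none.
    INSERT INTO smf_members VALUES (1, 'Aart', 1), (2, 'Bea', 2), (3, 'Cas', 3);
    INSERT INTO smf_permissions VALUES
        (1, 'post'), (1, 'moderate'), (1, 'admin'),
        (2, 'post');
""")

rows = conn.execute("""
    SELECT m.id_member, m.real_name, COUNT(p.permission) AS n
    FROM smf_members AS m
    LEFT JOIN smf_permissions AS p ON m.id_group = p.id_group
    -- One output row per member; without GROUP BY the aggregate collapses everything to one row.
    GROUP BY m.id_member, m.real_name
    ORDER BY n DESC, m.id_member
""").fetchall()

print(rows)  # [(1, 'Aart', 3), (2, 'Bea', 1), (3, 'Cas', 0)]
```

The sketch sorts descending so the highest rank comes first; drop DESC to match the ascending order of the query above.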

sql join within join?

I need your help building a SQL statement I can't wrap my head around.
In a database, I have four tables - files, folders, folder_files and links.
I have many files. One of them is called "myFile.txt".
I have many folders. "myFile.txt" is in some of them. The first folder it appears in is called "firstFolder".
I have many links to many folders. The first link to "firstFolder" is called "firstLink".
The data structure for the example would be:
// files
Id: 10
Name: "myFile.txt"
// folders
Id: 20
Name: "firstFolder"
// folder_files (join table)
Id: 30
Folder_Id: 20 (meaning "firstFolder")
File_Id: 10 (meaning "myFile.txt")
// links
Id: 40
Name: "firstLink"
Folder_Id: 20 (meaning "firstFolder")
FIRST QUESTION: How do I get the record for "myFile.txt" AND the Name and Id of "firstLink" (the first link), querying on file Id = 10, based on the lowest Id of the folder and the link?
SECOND QUESTION: How do I get the record for "myFile.txt" AND the Name and Id of "firstLink" (the first link), querying on all files, based on the lowest Id of the folder and the link?
put another way - how do I get the first link to the first folder containing "myFile.txt"?
Resulting in a record that looks like:
Id: 10
Name: "myFile.txt"
LinkId: 40
LinkName: "firstLink"
Thanks!
You should try to think about how you want your result set to look. SQL is designed to describe result sets. If you can write out a hypothetical result set, you might have an easier time writing SQL that will render that result set.
I had a hard time understanding what you are looking for, but I'm sure it's a fairly straightforward problem. I would be able to help you more easily if you could describe your results more clearly, although you might not need my help anymore!
For example (going with your original schema) Q1 & Q2:
files.Id, files.Name, links.Id, links.Name (4 columns)
Q1:
SELECT
files.Id, files.Name, links.Id AS LinkId, links.Name AS LinkName
FROM
files
INNER JOIN
folder_files
ON files.Id = folder_files.File_Id
INNER JOIN
links
ON links.Folder_Id = folder_files.Folder_Id
WHERE
files.Id = 10
ORDER BY
folder_files.Folder_Id ASC, links.Id ASC
LIMIT 1;
(JOIN with folders table not necessary)
Q2:
Change both ASC to DESC
This selects all links for file id 10:
select links.id, links.name
from files
left join folder_files on files.id = folder_files.file_id
left join folders on folder_files.folder_id = folders.id
left join links on links.folder_id = folders.id
where files.id=10;
Change the where clause, add limit or whatever for other things you want. It should be simple to modify this.
I would try this:
select f.*
, l.Id as LinkId
, l.Name as LinkName
from links l
inner join folder_files ff on ff.Folder_Id = l.Folder_Id
inner join files f on f.Id = ff.File_Id
where f.Id = 10
Resulting in:
Id | Name | LinkId | LinkName
10 | myFile.txt | 40 | firstLink
Is this what you want?
Taking into account:
more folders per file
more links per folder
taking the lowest id folder for link, and lowest id link for folder
With help of: mysql: group by ID, get highest priority per each ID
The answer for ALL files in the files table (go for JohnB's solution for a single file; it would be faster):
SELECT file_id, file_name, link_id, link_name FROM (
SELECT file_id, file_name, link_id, link_name,
@r := CASE WHEN @prev_file_id = file_id
THEN @r + 1
ELSE 1
END AS r,
@prev_file_id := file_id
FROM (
SELECT
f.id as file_id, f.name as file_name, l.id as link_id, l.name as link_name
FROM files f
JOIN folder_files ff
ON ff.file_id = f.id
JOIN links l
ON l.folder_id = ff.folder_id
ORDER BY f.id, ff.folder_id, l.id -- per file: first folder first, then first link to that folder
) derived1,
(SELECT @prev_file_id := NULL, @r := 0) vars
) derived2
WHERE r = 1;
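If user variables are not available, the same "first link to the first folder, per file" result can be sketched with a NOT EXISTS anti-join instead. This is a swapped-in technique, not the approach above, shown here in Python's sqlite3 module; the second file, second folder and extra links are made-up rows added to exercise the tie-breaking.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE files (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE folder_files (id INTEGER PRIMARY KEY, folder_id INTEGER, file_id INTEGER);
    CREATE TABLE links (id INTEGER PRIMARY KEY, name TEXT, folder_id INTEGER);
    -- Rows 10/20/30/40 follow the question; the rest are hypothetical extras.
    INSERT INTO files VALUES (10, 'myFile.txt'), (11, 'other.txt');
    INSERT INTO folder_files VALUES (30, 20, 10), (31, 21, 10), (32, 21, 11);
    INSERT INTO links VALUES (40, 'firstLink', 20), (41, 'secondLink', 20), (42, 'thirdLink', 21);
""")

rows = conn.execute("""
    SELECT f.id, f.name, l.id AS link_id, l.name AS link_name
    FROM files f
    JOIN folder_files ff ON ff.file_id = f.id
    JOIN links l ON l.folder_id = ff.folder_id
    -- Keep a (folder, link) pair only if no pair for the same file sorts before it.
    WHERE NOT EXISTS (
        SELECT 1
        FROM folder_files ff2
        JOIN links l2 ON l2.folder_id = ff2.folder_id
        WHERE ff2.file_id = f.id
          AND (ff2.folder_id < ff.folder_id
               OR (ff2.folder_id = ff.folder_id AND l2.id < l.id))
    )
    ORDER BY f.id
""").fetchall()

print(rows)  # [(10, 'myFile.txt', 40, 'firstLink'), (11, 'other.txt', 42, 'thirdLink')]
```

Like the query above, this only considers folders that actually have links, so a file whose lowest-id folder has no link falls through to its next folder.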