How to write a query using self join and group by?

I have a SQL Server 2008 table, FILE_DETAILS, in the following format:
ID FileName Filesize_in_MB
--------------------------------
1 a.txt 5
2 b.txt 2
3 c.txt 2
3 d.txt 4
4 e.txt 6
4 f.txt 1
4 g.txt 2
5 h.txt 8
6 i.txt 7
What I want to fetch is the following:
ID FileName Filesize_in_MB
--------------------------------
1 a.txt 5
2 b.txt 2
3 c.txt;d.txt 6
4 e.txt;f.txt;g.txt 9
5 h.txt 8
6 i.txt 7
In the results above, ID becomes a unique key, the FileName values for each ID are concatenated and separated by ;, and Filesize_in_MB is the sum of the sizes for that ID.
I tried various combinations, such as GROUP BY plus a self join and subqueries,
but I think I am missing something.
Is it possible to handle this in a single SQL query?
Thanks in advance.

Try this:
SELECT ID,
       STUFF((SELECT ';' + [FileName]
              FROM FILE_DETAILS
              WHERE ID = f.ID
              FOR XML PATH('')), 1, 1, '') AS FileName,
       SUM(Filesize_in_MB) AS Filesize_in_MB
FROM FILE_DETAILS f
GROUP BY ID
Here's some more information:
Concatenate many rows into a single text string?
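For readers without a SQL Server instance, the same grouping can be tried out with Python's built-in sqlite3 module. This is only a sketch under assumptions: SQLite has no STUFF or FOR XML PATH, so it swaps in SQLite's group_concat aggregate, and the function name concat_files_by_id is made up for illustration. SQLite does not guarantee the order of names inside each group.

```python
import sqlite3

# Sample rows from the question: (ID, FileName, Filesize_in_MB)
ROWS = [
    (1, "a.txt", 5), (2, "b.txt", 2), (3, "c.txt", 2), (3, "d.txt", 4),
    (4, "e.txt", 6), (4, "f.txt", 1), (4, "g.txt", 2), (5, "h.txt", 8),
    (6, "i.txt", 7),
]

def concat_files_by_id(rows):
    """Return one (ID, 'a;b;c', total_size) tuple per ID (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE FILE_DETAILS (ID INTEGER, FileName TEXT, Filesize_in_MB INTEGER)"
    )
    conn.executemany("INSERT INTO FILE_DETAILS VALUES (?, ?, ?)", rows)
    # group_concat stands in for the STUFF ... FOR XML PATH('') idiom.
    result = conn.execute(
        """
        SELECT ID, group_concat(FileName, ';'), SUM(Filesize_in_MB)
        FROM FILE_DETAILS
        GROUP BY ID
        ORDER BY ID
        """
    ).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    for row in concat_files_by_id(ROWS):
        print(row)
```

Calling concat_files_by_id(ROWS) yields one row per ID with the summed size, which is the shape the question asks for.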

You should be able to do this using a GROUP BY. Filesize_in_MB can be aggregated with SUM. FileName is harder: SQL Server 2008 R2 has no built-in string-concatenation aggregate, so you would need to create a user-defined aggregate (CREATE AGGREGATE, backed by a CLR assembly). Once an aggregate named Concatenate exists, you can use it like any other aggregation function:
select ID, dbo.Concatenate(FileName), sum(Filesize_in_MB) FROM FILE_DETAILS group by ID
There is another way of doing aggregate concatenation which seems fairly simple, but I haven't tried it.

Very simple BigQuery SQL script won't return "0" for Count rows with no results

I am trying to make this very simple SQL script work:
SELECT
DATE(SEC_TO_TIMESTAMP(created_utc)) date_submission,
COUNT(*) AS num_apples_oranges_submissions
FROM
[fh-bigquery:reddit_comments.2008]
WHERE
(LOWER(body) CONTAINS ('apples')
AND LOWER(body) CONTAINS ('oranges'))
GROUP BY
date_submission
ORDER BY
date_submission
The results look like this:
1 2008-01-07 3
2 2008-01-08 1
3 2008-01-09 2
4 2008-01-10 3
5 2008-01-11 2
6 2008-01-13 2
7 2008-01-15 2
8 2008-01-16 3
As you can see, for days where there were no submissions containing both "apples" and "oranges", instead of a value of 0 being returned, the entire row is simply missing (such as on the 12th and 14th).
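How can I fix this? I'm at my wits' end. Thank you.
Try the query below. Instead of filtering rows out in the WHERE clause, it counts the matches inside the aggregate, so every day that has at least one comment is returned, with 0 when none of them match. A day with no comments at all will still be missing.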
SELECT
DATE(SEC_TO_TIMESTAMP(created_utc)) date_submission,
SUM((LOWER(body) CONTAINS ('apples') AND LOWER(body) CONTAINS ('oranges'))) AS num_apples_oranges_submissions
FROM
[fh-bigquery:reddit_comments.2008]
GROUP BY
date_submission
ORDER BY
date_submission
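The difference between filtering in WHERE and counting inside the aggregate can be reproduced locally. The sketch below is an assumption-laden illustration, not the BigQuery query itself: it uses Python's sqlite3 module, a made-up comments table with a precomputed day column, LIKE in place of CONTAINS, and a hypothetical function name count_matches_per_day.

```python
import sqlite3

def count_matches_per_day(comments):
    """comments: iterable of (day, body) tuples; returns [(day, match_count), ...]."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE comments (day TEXT, body TEXT)")
    conn.executemany("INSERT INTO comments VALUES (?, ?)", comments)
    # No WHERE clause: every day with at least one comment stays in the result,
    # and the CASE expression contributes 0 for non-matching comments.
    rows = conn.execute(
        """
        SELECT day,
               SUM(CASE WHEN LOWER(body) LIKE '%apples%'
                         AND LOWER(body) LIKE '%oranges%'
                        THEN 1 ELSE 0 END) AS num_matches
        FROM comments
        GROUP BY day
        ORDER BY day
        """
    ).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    sample = [
        ("2008-01-11", "Apples and oranges"),
        ("2008-01-12", "just bananas"),
        ("2008-01-13", "oranges, apples"),
        ("2008-01-13", "nothing relevant"),
    ]
    print(count_matches_per_day(sample))
```

In this sample the 12th has a comment but no match, so it appears with a count of 0 rather than disappearing.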

SQL Nested Aggregation

I am working with SQL.
I have a table named parta. I want to count the fields 40b1 and 40b2 and find the sum of the two counts. My query is below.
select
count(40b1) as 40b1,
count(40b2) as 40b2,
sum(count(40b1) + count(40b2) ) as sum,
code/100 as code
from parta
where 40b1=true and mandays>=1000
group by code/100 ;
Expected output
40b1 40b2 sum code verticalsum
5 5 10 20 7
2 2 4 21 7
How can this be done? Please help.
Also, what query can I use to get the verticalsum column?
You don't need to SUM() the COUNT()'s. Just add them together.
select count(40b1) as 40b1,
count(40b2) as 40b2,
count(40b1) + count(40b2) as sum,
code/100 as code
from parta
where 40b1=true and mandays>=1000
group by code/100 ;
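To see the "just add the counts" idea run end to end, here is a small sketch using Python's sqlite3 module. Everything about the data is assumed: the column types, the sample rows, and the function name counts_by_code_group are invented for illustration, and the column names are quoted because they start with a digit. It does not attempt the verticalsum column.

```python
import sqlite3

def counts_by_code_group(rows):
    """rows: iterable of (b1, b2, mandays, code); returns per-group counts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE parta ("40b1" INTEGER, "40b2" INTEGER, mandays INTEGER, code INTEGER)'
    )
    conn.executemany("INSERT INTO parta VALUES (?, ?, ?, ?)", rows)
    # COUNT(col) counts non-NULL values; the two counts are simply added,
    # with no SUM() wrapped around them.
    result = conn.execute(
        """
        SELECT COUNT("40b1") AS c1,
               COUNT("40b2") AS c2,
               COUNT("40b1") + COUNT("40b2") AS total,
               code / 100 AS code_group
        FROM parta
        WHERE "40b1" = 1 AND mandays >= 1000
        GROUP BY code / 100
        ORDER BY code_group
        """
    ).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    sample = [
        (1, 1, 1000, 2001),
        (1, None, 1500, 2050),  # NULL in 40b2 is not counted
        (1, 1, 2000, 2101),
        (0, 1, 5000, 2102),     # filtered out: 40b1 is false
        (1, 1, 10, 2003),       # filtered out: mandays below 1000
    ]
    print(counts_by_code_group(sample))
```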

How to sort the string 'MH/122020/[xx]x' in an Access query?

I am trying to sort the numbers,
MH/122020/101
MH/122020/2
MH/122020/145
MH/122020/12
How can I sort these in an Access query?
I tried format(mid(first(P.PFAccNo),11),"0") but it didn't work.
You need to use expressions in your ORDER BY clause. For test data
ID PFAccNo
-- -------------
1 MH/122020/101
2 MH/122020/2
3 MH/122020/145
4 MH/122020/12
5 MH/122021/1
the query
SELECT PFAccNo, ID
FROM P
ORDER BY
Left(PFAccNo,9),
Val(Mid(PFAccNo,11))
returns
PFAccNo ID
------------- --
MH/122020/2 2
MH/122020/12 4
MH/122020/101 1
MH/122020/145 3
MH/122021/1 5
You have to convert the substring beginning at position 11 to a number; once it is a number, it sorts numerically instead of as text.
How about this?
SELECT
tmpTbl.yourFieldName
FROM
tmpTbl
ORDER BY
CLng(Mid([tmpTbl].[yourFieldname],InStrRev([tmpTbl].[yourFieldname],"/")+1));
Given the following data in my test_table, column DATETIMESTAMP:
XXX123
YYY000
XXX-1234
my statement:
SELECT CInt(Mid(datetimestamp,4)) AS Ausdr1
FROM test_table
ORDER BY 1;
sorts my data. Change the 4 to 11 and it will work for your values.
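The idea shared by the answers above (sort by the text prefix first, then by the trailing part converted to a number) can be checked outside Access. This is a plain Python sketch, not Access code; the function name pf_sort_key is hypothetical, and it assumes every value ends in "/" followed by digits.

```python
def pf_sort_key(acc_no):
    """Split 'MH/122020/101' into ('MH/122020', 101) so the suffix sorts numerically."""
    prefix, _, suffix = acc_no.rpartition("/")
    return (prefix, int(suffix))

if __name__ == "__main__":
    values = [
        "MH/122020/101",
        "MH/122020/2",
        "MH/122020/145",
        "MH/122020/12",
        "MH/122021/1",
    ]
    for value in sorted(values, key=pf_sort_key):
        print(value)
```

Splitting on the last "/" mirrors the InStrRev variant, so it does not depend on the prefix always being exactly ten characters long.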

Will multiple columns concatenate in the same order if using STUFF and For Xml Path

Please see http://www.sqlfiddle.com/#!3/fb107/3 for an example schema and query I want to run.
I want to use the STUFF and FOR XML PATH('') solution to concatenate columns having grouped by another column.
If I use this method to concatenate multiple columns into a CSV list, am I guaranteed that the order will be the same in each concatenated string? So if the table was:
ID Col1 Col2 Col3
1 1 1 1
1 2 2 2
1 3 3 3
2 4 4 4
2 5 5 5
2 6 6 6
Am I certain that if Col1 is concatenated such that the result is:
ID Col1Concatenated
1 1,2,3
2 4,5,6
That Col2Concatenated will also be in the same order ("1,2,3", "4,5,6") as opposed to ("2,3,1", "5,6,4") for example?
This solution will only work for me if the index of each row's value is the same in each of the concatenated values. i.e. first row is first in each csv list, second row is second in each csv list etc.
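You can add an ORDER BY clause to the subquery inside each STUFF call, before FOR XML PATH(''). If every subquery uses the same ORDER BY on a column (or columns) that orders the rows unambiguously, each concatenated list will come out in the same order.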

Getting a comma-delimited list of PK's for duplicates of a record in SQL Server 2005?

This is an off-shoot of a previous question I had: A little fuzzy on getting DISTINCT on one column?
This query makes a little more sense, given the data:
SELECT Receipts.ReceiptID, FolderLink.ReceiptFolderID
FROM dbo.tbl_ReceiptFolderLnk AS FolderLink INNER JOIN
dbo.tbl_Receipt AS Receipts ON FolderLink.ReceiptID = Receipts.ReceiptID
With results:
ReceiptID ReceiptFolderID NewColumn (duplicate folder ID list)
-------------------- --------------- ----------
1 3
2 3
3 7
4 <---> 4 8,9
5 4
6 1
3 8
4 <---> 8 4,9
4 <---> 9 4,8
The answer to that question let me view the distinct ReceiptIDs, which is great. Now, ReceiptIDs 3 and 4 each exist in multiple ReceiptFolderIDs.
Given this NON-unique list of ReceiptID's, I'd like an additional column, of comma-delimited ReceiptFolderLinkID's where the ReceiptID also exists.
So for ReceiptID=4, the new column, say, DuplicateFoldersList, should read, "8,9", etc, and similar with ID=3, or any other duplicates.
So basically, I'd like another column to indicate the ReceiptFolderID's additional occurrences of ReceiptID in other folders.
Thanks!
You can create a function that, given a ReceiptID and the "current" ReceiptFolderID for that row, returns the other ReceiptFolderIDs as a concatenated, comma-delimited list. Example:
CREATE FUNCTION [dbo].[GetOtherReceiptFolderIDs](@receiptID int, @receiptFolderID int)
RETURNS varchar(MAX) AS
BEGIN
    DECLARE @returnValue varchar(MAX)
    SELECT @returnValue = COALESCE(@returnValue + ', ', '') + COALESCE(CONVERT(varchar(MAX), ReceiptFolderID), '')
    FROM tbl_ReceiptFolderLink AS FolderLink
    WHERE FolderLink.ReceiptID = @receiptID
    AND FolderLink.ReceiptFolderID <> @receiptFolderID
    RETURN @returnValue
END
Then, you can run a query that uses this function to obtain your new column:
SELECT Receipts.ReceiptID, ReceiptFolderID, dbo.GetOtherReceiptFolderIDs(Receipts.ReceiptID, ReceiptFolderID) AS NewColumn
FROM tbl_Receipt AS Receipts
INNER JOIN tbl_ReceiptFolderLink AS FolderLinks
ON Receipts.ReceiptID = FolderLinks.ReceiptID
I tested this and it produces the following results (if I got your schema correctly):
ReceiptID ReceiptFolderID NewColumn
6 1 NULL
1 3 NULL
2 3 NULL
4 4 8, 9
5 4 NULL
3 7 8
3 8 7
4 8 4, 9
4 9 4, 8
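The same "other folders for this receipt" lookup can be sketched without a T-SQL function, using a correlated subquery in Python's sqlite3 module. This is an illustration under assumptions: the table name link, the function name other_folder_ids, and the use of SQLite's group_concat in place of the variable-concatenation trick are all stand-ins, and the order of IDs inside each list is not guaranteed.

```python
import sqlite3

# (ReceiptID, ReceiptFolderID) pairs taken from the question's result set
LINKS = [(1, 3), (2, 3), (3, 7), (4, 4), (5, 4), (6, 1), (3, 8), (4, 8), (4, 9)]

def other_folder_ids(links):
    """Return (ReceiptID, ReceiptFolderID, other folder IDs or None) per link row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE link (ReceiptID INTEGER, ReceiptFolderID INTEGER)")
    conn.executemany("INSERT INTO link VALUES (?, ?)", links)
    # For each row, gather the folder IDs of the same receipt, excluding the
    # row's own folder. group_concat over no rows yields NULL (None in Python).
    rows = conn.execute(
        """
        SELECT l.ReceiptID,
               l.ReceiptFolderID,
               (SELECT group_concat(o.ReceiptFolderID, ', ')
                FROM link AS o
                WHERE o.ReceiptID = l.ReceiptID
                  AND o.ReceiptFolderID <> l.ReceiptFolderID) AS others
        FROM link AS l
        ORDER BY l.ReceiptFolderID, l.ReceiptID
        """
    ).fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in other_folder_ids(LINKS):
        print(row)
```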
MySQL has the GROUP_CONCAT aggregate function for this, but in T-SQL (as of SQL Server 2005) and Oracle you need to use another approach. This site lists multiple approaches for T-SQL, but none of them is as simple as the MySQL one.