SQL: Aggregation of string values from different rows in a multi-level JOIN - sql

I have a multi-leveled table hierarchy in SQL Server, and when joining them I want to do an aggregation of strings from rows of one of the tables. In my (simplified) example in the screenshot below, I have Level1->Level2->Level3 and also a table PerformingUser "hanging under" Level1.
I have done a JOIN on all these four tables, resulting in 8 rows. So far all is fine.
Now, what I want is for each of these 8 rows to get a new column ("AllUsersForLevel1") with an aggregation of all PerformingUsers for its Level1.
(The intent is that the PerformingUser also can see what other PerformingUsers have the same Level1.)
So for the first 2 rows I want a new column with 'aretha, mary'. And in the other 6 rows that column should have 'john, jim'.
I have tried STRING_AGG (as indicated in the comment in the image), but it requires a grouping (which I do not want; I want all 8 rows).
I also tried STRING_AGG with an OVER (PARTITION BY Level1ID) clause, but got "The function 'STRING_AGG' is not a valid windowing function, and cannot be used with the OVER clause."
Does anyone have advice on how to do this?
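One way around both restrictions is to aggregate in a separate step and join the result back, so the GROUP BY never touches the 8-row result. A minimal sketch, assuming PerformingUser has a Level1ID column and a UserName column and that the levels are linked by Level1ID/Level2ID keys (the real column names are only visible in the screenshot, so these are guesses). STRING_AGG itself needs SQL Server 2017 or later.

```sql
-- Hypothetical column names (UserName, Level1ID, Level2ID, Level3ID);
-- adjust them to the real schema.
WITH UsersPerLevel1 AS (
    -- One row per Level1, with its users collapsed into a single string.
    SELECT pu.Level1ID,
           STRING_AGG(pu.UserName, ', ') AS AllUsersForLevel1
    FROM PerformingUser AS pu
    GROUP BY pu.Level1ID
)
SELECT l1.Level1ID,
       l2.Level2ID,
       l3.Level3ID,
       pu.UserName,
       u.AllUsersForLevel1
FROM Level1 AS l1
JOIN Level2 AS l2 ON l2.Level1ID = l1.Level1ID
JOIN Level3 AS l3 ON l3.Level2ID = l2.Level2ID
JOIN PerformingUser AS pu ON pu.Level1ID = l1.Level1ID
-- Joining back on Level1ID adds the string without removing any rows.
JOIN UsersPerLevel1 AS u ON u.Level1ID = l1.Level1ID;
```

The order of the names inside the string is not guaranteed; if it matters, STRING_AGG accepts a WITHIN GROUP (ORDER BY ...) clause.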

Related

How can I create multiple rows based on the value of one column in SQL?

I have a column of type string in my table, where multiple values are separated by the pipe character. For example, like this,
Value1|Value2|Value3
Now, what I want is to have a query, which will show three rows for this row. Basically something similar to the concept of explode in Dataframes.
Note that I am using Spark SQL. And I want to achieve this using SQL, not dataframes.
I got it working by using the following query.
select t.*, explode(split(values, "\\|")) as value
from table t
Here \\| can also be replaced by [|]. A bare | doesn't work because split takes a regular expression, in which | is the alternation operator.
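The same result can also be written with LATERAL VIEW, which Spark SQL inherits from Hive. A sketch with placeholder names (my_table and a pipe-delimited column called vals are not from the question):

```sql
-- my_table and vals are placeholder names.
-- '[|]' matches a literal pipe, so no backslash escaping is needed.
SELECT t.*, v.value
FROM my_table t
LATERAL VIEW explode(split(t.vals, '[|]')) v AS value;
```

Each input row produces one output row per delimited value, with the original columns repeated alongside it.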

SQL - Count(*) not behaving in expected manner [duplicate]

This question already has an answer here:
Access query producing results like ROW_NUMBER() in T-SQL
I have the following code
SELECT C_Record.BunchOfColumns, Count(*) AS Degrees
FROM C_Record
WHERE (((C_Record.[C#])=[Enter Value])) //Parameter Input from User
GROUP BY C_Record.BunchofColumns;
My Degrees column never increments; it always shows 1 no matter how many rows are returned from the query. I suspect that I have not written my GROUP BY properly. If I understand it correctly, all columns that are selected and are not part of the aggregate function (COUNT in my case) should be listed in the GROUP BY. Any help is much appreciated. Thanks in advance
Edit: What I am trying to achieve is to check how many rows have a particular value for a column, then select all other relevant columns and create an index column. For example, if there are three rows that meet my requirement
Col1 Col2 Degrees
A X 1
B Y 2
C Z 3
and if only 2 rows meet my requirement then
Col1 Col2 Degrees
P X 1
Q Y 2
P.S - my C_Record.BunchofColumns consists of about 10 columns that I did not include for the sake of brevity.
P.P.S - If I try to leave out any column it gives me the error "You Tried to execute a query that does not include the specified expression <<column_name>> as part of an aggregate function"
When you use Count() with a GROUP BY, the count returned is the number of rows in each group. So to get a count greater than one you would need more than one row in your table with exactly the same values. If you are selecting 10 different columns, it seems likely that no two rows in the table have exactly the same values in all 10.
If you start by selecting and grouping by a single column, you will see counts of more than one.
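As a quick check, something like the following should show counts above 1, assuming Col2 (borrowed from the example output in the question) is a column whose values actually repeat:

```sql
-- Col2 stands in for any single column with repeated values.
-- Each result row is one distinct Col2 value plus the number of rows sharing it.
SELECT Col2, COUNT(*) AS RowsInGroup
FROM C_Record
GROUP BY Col2;
```

Adding more columns to both the SELECT list and the GROUP BY splits the groups further, until every group holds a single row and every count is 1.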
That is not how GROUP BY works.
GROUP BY completely changes the meaning of your query. Each row of the result is an "aggregate grouping" of the original rows. Each aggregate grouping consists of all the rows with a particular combination of values for their GROUP BY columns. So if you GROUP BY ten columns, each grouping will consist of rows which are identical on all ten columns.
Once these groupings have been formed, you SELECT various aggregate values like count() or sum(), which provide you with information about the group as a whole. count(*) gives you the number of rows in the group, while count(column) gives you the number of rows in which column is non-NULL. You can also select any of the columns which appear in the GROUP BY clause, because those columns are identical across the whole group.
You are getting a count(*) of one because each of your groups only contains a single row. This is probably because you are grouping by ten columns, and there are no two rows which are identical for all ten columns.
If you just want a count of how many rows satisfy some query, and you don't want this aggregation at all, you write it like this:
SELECT count(*)
FROM something
WHERE something
-- no GROUP BY
;
That will form a single aggregate group of your whole query, and count the rows.
If you want something else, you will need to further explain what you're trying to do.
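If the goal is the running index described in the question's edit rather than a count per group, the usual workaround in Access (which has no ROW_NUMBER()) is a correlated subquery that counts how many matching rows come at or before the current one. A sketch, assuming the table has a unique key column, called ID here as a guess; Col1 and Col2 are taken from the example output. The block has no inline comments because Access queries do not accept them.

```sql
SELECT c1.Col1,
       c1.Col2,
       (SELECT COUNT(*)
        FROM C_Record AS c2
        WHERE c2.[C#] = c1.[C#]
          AND c2.ID <= c1.ID) AS Degrees
FROM C_Record AS c1
WHERE c1.[C#] = [Enter Value]
ORDER BY c1.ID;
```

No GROUP BY is involved, so every matching row is returned, and Degrees runs 1, 2, 3, ... in ID order.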

String Grouping from a single column in Oracle database having million rows and removing duplicates

We have a huge table, and one of its columns contains query strings. For example, row 1 contains
1. (((firstname:Adam OR firstname:Neil ) AND lastname:Lee) ) AND category:"Legal" AND type:Individual
and row 2 of the same column contains
2. (((firstname:Adam* OR firstname:Neil ) AND lastname:Lee) ) AND category:"Legal" AND type:Organization
Similarly, there are a few other types of query strings, which are eventually used to query external services.
The issue is that, based on certain criteria, I have to group rows and remove duplicates from this table.
There are a few rules that determine how strings in different rows are grouped. One of them is that if the first name and last name are the same, the category and type values are ignored, so the two rows above will be grouped into one. There are around a million rows. Comparing strings and grouping them that way does not look like an elegant solution. What would be the best possible solution using SQL?

SQL query with minus and joins

I have three columns, all in one table, and a query on them returns 96 rows. I'm trying to use a MINUS statement and a JOIN to see if there are any duplicates in the database/other tables. The 3 columns are in 3 separate tables, so I am trying to use MINUS and JOIN to get the number of rows that are duplicates. Can anyone help? First-time user.
AFAIK, SQL Server does not support MINUS.
Use EXCEPT instead.
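A minimal sketch of the EXCEPT form, assuming SQL Server as above; TableA, TableB and the column names are placeholders, since the question does not give the real ones:

```sql
-- Placeholder table and column names.
-- Returns the distinct rows of the first query that do not appear in the second.
SELECT Col1, Col2, Col3 FROM TableA
EXCEPT
SELECT Col1, Col2, Col3 FROM TableB;
```

Both queries must return the same number of columns with compatible types. If what you want is the rows that appear in both tables, that is INTERSECT rather than EXCEPT.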

Is there any reason this simple SQL query should be so slow?

This query takes about a minute to give results:
SELECT MAX(d.docket_id), MAX(cus.docket_id) FROM docket d, Cashup_Sessions cus
Yet this one:
SELECT MAX(d.docket_id) FROM docket d UNION SELECT MAX(cus.docket_id) FROM Cashup_Sessions cus
gives its results instantly. I can't see what the first one is doing that would take so much longer - I mean they both simply check the same two lists of numbers for the greatest one and return them. What else could it be doing that I can't see?
I'm using Jet SQL on an MS Access database via Java.
The first one is doing a cross join between the 2 tables, while the second one is not.
That's all there is to it.
The first one uses a Cartesian product to form its source data, which means that every row from the first table is paired with every row from the second one. After that, it scans that source to find the max values of the two columns.
The second doesn't join the tables. It just finds the max from the first table and the max from the second table, and then returns two rows.
The first query makes a cross join between the tables before getting the maximums, that means that each record in one table is joined with every record in the other table.
If you have two tables with 1,000 items each, you get a result with 1,000,000 items to go through to find the maximums.
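If you want both maximums side by side in a single row without paying for the big cross join, one option is to compute each maximum in its own derived table first. This is a sketch that has not been tested against Jet specifically; the aliases a, b, max_docket and max_cashup are made up for illustration.

```sql
SELECT a.max_docket, b.max_cashup
FROM (SELECT MAX(docket_id) AS max_docket FROM docket) AS a,
     (SELECT MAX(docket_id) AS max_cashup FROM Cashup_Sessions) AS b;
```

This is still a cross join, but each derived table returns exactly one row, so the product is 1 x 1 instead of every docket row paired with every Cashup_Sessions row.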