mysql query to dynamically convert row data to columns - sql

I am working on a pivot table query.
The schema is as follows
Sno, Name, District
The same name may appear in many districts. Take this sample data, for example:
1 Mike CA
2 Mike CA
3 Proctor JB
4 Luke MN
5 Luke MN
6 Mike CA
7 Mike LP
8 Proctor MN
9 Proctor JB
10 Proctor MN
11 Luke MN
As you can see, I have a set of 4 distinct districts (CA, JB, MN, LP). I want to generate a pivot table that maps each name against the districts:
Name CA JB MN LP
Mike 3 0 0 1
Proctor 0 2 2 0
Luke 0 0 3 0
I wrote the following query for this:
select name,
sum(if(District="CA",1,0)) as "CA",
sum(if(District="JB",1,0)) as "JB",
sum(if(District="MN",1,0)) as "MN",
sum(if(District="LP",1,0)) as "LP"
from district_details
group by name
However, the number of districts may grow, in which case I would have to edit the query manually and add each new district.
I want to know whether there is a query that can dynamically pick up the distinct district names and run the above query. I know I can do it with a procedure that generates the script on the fly; is there any other method?
I ask because the query "select distinct District from district_details" returns a single column with one district name per row, which I would like to transpose into columns.

You simply cannot have a static SQL statement that returns a variable number of columns. You need to build such a statement each time the number of different districts changes. To do that, first execute
SELECT DISTINCT District FROM district_details;
This gives you the list of districts for which there are details. You then build a SQL statement by iterating over that result (pseudocode):
statement = "SELECT name"
For each row d in (SELECT DISTINCT District FROM district_details)
    statement = statement & ", SUM(IF(District = '" & d.District & "', 1, 0)) AS `" & d.District & "`"
statement = statement & " FROM district_details GROUP BY name;"
Then execute that query. Your code will also have to handle the variable number of columns in the result.
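The two-step approach above can be sketched in a host language. This is only an illustration under assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL (so SUM(CASE ...) replaces SUM(IF(...))), and the helper name build_pivot_sql is hypothetical.

```python
import sqlite3


def build_pivot_sql(districts):
    """Build one conditional-sum column per district (hypothetical helper)."""
    columns = ["Name"]
    for d in districts:
        literal = d.replace("'", "''")   # escape for a SQL string literal
        alias = d.replace('"', '""')     # escape for a quoted column alias
        columns.append(
            f"SUM(CASE WHEN District = '{literal}' THEN 1 ELSE 0 END) AS \"{alias}\""
        )
    return (
        "SELECT " + ", ".join(columns)
        + " FROM district_details GROUP BY Name ORDER BY Name"
    )


# In-memory stand-in for the district_details table, using the question's sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE district_details (Sno INTEGER, Name TEXT, District TEXT)")
conn.executemany(
    "INSERT INTO district_details VALUES (?, ?, ?)",
    [
        (1, "Mike", "CA"), (2, "Mike", "CA"), (3, "Proctor", "JB"),
        (4, "Luke", "MN"), (5, "Luke", "MN"), (6, "Mike", "CA"),
        (7, "Mike", "LP"), (8, "Proctor", "MN"), (9, "Proctor", "JB"),
        (10, "Proctor", "MN"), (11, "Luke", "MN"),
    ],
)

# Step 1: fetch the current list of districts.
districts = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT District FROM district_details ORDER BY District"
    )
]

# Step 2: build the statement from that list and run it.
cursor = conn.execute(build_pivot_sql(districts))
header = [col[0] for col in cursor.description]  # column names vary with the data
pivot = cursor.fetchall()

print(header)
for row in pivot:
    print(row)
```

Reading the column names from cursor.description is one way to cope with the variable number of columns: the calling code never hard-codes the district list.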

a) MySQL stored procedures have no "For each" construct; you iterate with a cursor inside a LOOP, WHILE or REPEAT block instead.
b) Stored procedures can run dynamic SQL built from concatenated strings, but only through PREPARE / EXECUTE / DEALLOCATE PREPARE. The rows come back to the caller as a result set, not through OUT parameters, which hold a single value each.
c) Stored functions and triggers cannot execute dynamic SQL at all.
It is a nightmare to keep track of: once you have a good idea, everyone seems to debunk it before they even ask "Why would anyone wanna..."
I hope you find your solution, I am still searching for mine.
The closest I got was
(excuse the pseudocode)
-> in a stored procedure, build a routine that...
1) creates a temp table
2) loads data into the temp table from the columns using your if statements
3) loads the temp table out to INOUT or OUT parameters in a stored procedure as you would a table call... IF you can get it to return more than one row
Also, another tip...
Store your districts in a conventional table, load it, and loop through the districts marked active to dynamically concatenate a query string, which could be plain text for all the system cares.
Then use:
prepare stmName from @yourquerystring;
execute stmName;
deallocate prepare stmName;
(you can find much more in the stored procedures section of the MySQL forum)
to run a different set of districts every time, without having to redesign your original proc.
Maybe it's easier in numerical form.
I work with plain-text content in my tables and have nothing to sum, count or add up.

The following assumes you want matches of distinct (name/district) pairs. I.e. Luke/CA and Duke/CA would yield two results:
SELECT name, District, count(District) AS count
FROM district_details
GROUP BY District, name
If this is not the case simply remove name from the GROUP BY clause.
Lastly, notice that I switched sum() for count() as you are trying to count all of the grouped rows rather than getting a summation of values.

Thanks to the comment by @cballou above, I was able to do something similar. It is not exactly what the OP asked for, but it suited my situation, so I am adding it here to help those who come after.
Normal select statement:
SELECT d.id ID,
q.field field,
q.quota quota
FROM defaults d
JOIN quotas q ON d.id=q.default_id
Vertical results:
ID field quota
1 male 25
1 female 25
2 male 50
Select statement using group_concat:
SELECT d.id ID,
GROUP_CONCAT(q.field SEPARATOR ",") fields,
GROUP_CONCAT(q.quota SEPARATOR ",") quotas
FROM defaults d
JOIN quotas q ON d.id=q.default_id
GROUP BY d.id
Then I get comma-separated fields of "fields" and "quotas" which I can then easily process programmatically later.
Horizontal results:
ID fields quotas
1 male,female 25,25
2 male 50
Magic!
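As a sketch of the "process programmatically later" step, the two comma-separated strings can be zipped back into pairs. The function name parse_concat_row is hypothetical, and the code assumes that both strings list their values in the same row order and that no value contains the separator. An ORDER BY inside each GROUP_CONCAT would make that ordering explicit.

```python
def parse_concat_row(row_id, fields, quotas, sep=","):
    """Turn e.g. (1, 'male,female', '25,25') into (1, {'male': 25, 'female': 25})."""
    names = fields.split(sep)
    values = [int(q) for q in quotas.split(sep)]
    if len(names) != len(values):
        # The two lists must line up position by position.
        raise ValueError(f"row {row_id}: {len(names)} fields but {len(values)} quotas")
    return row_id, dict(zip(names, values))


# Rows as they appear in the horizontal result above.
horizontal = [(1, "male,female", "25,25"), (2, "male", "50")]
parsed = dict(parse_concat_row(*row) for row in horizontal)
print(parsed)
```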

Related

Counting from different categories within the same query

I am trying to make a query from a table in Access that would give me totals for different types of product based off of 2 categories, all within one query. For example my Table looks as follows:
Type  Description 1  Description 2  Date
New   Shiny          Black          1/1/2022
New   Black          Dull           1/1/2022
Old   Shiny          Grey           1/1/2022
Old   Grey           Dull           1/1/2022
The query results that I want to receive are as follows:
Description  New  Old
Shiny        1    1
Black        2    0
Dull         1    1
Grey         0    2
The dataset that I am working with isn't as clean as my example shown here and is causing some of the issues. I never had an issue with the code running, but I just felt that there had to be an easier way that I was missing.
The way I was doing it originally just turned into a bunch of separate queries and was messy to work with. I essentially wrote a query to separate the table into new and old types. From there I used a bunch of
SUM(IIF([Description 1] = "x" OR [Description 2] = "x", 1, 0)) AS X
SUM(IIF([Description 1] = "y" OR [Description 2] = "y", 1, 0)) AS Y
expressions to count my totals for each of the objects. This would give me a query where all the totals were displayed in columns. Then I created a separate query to join these data sets together into a presentable manner, but it was turning into too much for how many different "types" I had.
I was just looking for a way to combine all of this into 1 query that would make pulling reports much easier.
I strongly advise against using spaces or reserved words in names. Date is a reserved word.
Consider:
Query1
SELECT Type, Description1 AS D, [Date], 1 AS Category FROM Table1
UNION SELECT Type, Description2, [Date], 2 FROM Table1;
UNION will not allow duplicate rows. Use UNION ALL to include all records, even if there are duplicates. There is no query designer or wizard for UNION; you must type or copy/paste it in the SQL View of the query builder.
Query2
TRANSFORM Nz(Count(Query1.Category),0) AS CountOfCategory
SELECT Query1.D
FROM Query1
GROUP BY Query1.D
PIVOT Query1.Type;
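The unpivot-then-count idea behind Query1 and Query2 can be tried outside Access. The sketch below is an assumption-laden stand-in: it uses Python's sqlite3 module, which has no TRANSFORM/PIVOT, so the crosstab is written as conditional sums with the two Type values spelled out. The Date and Category columns are left out because the counts do not need them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Type TEXT, Description1 TEXT, Description2 TEXT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [
        ("New", "Shiny", "Black"),
        ("New", "Black", "Dull"),
        ("Old", "Shiny", "Grey"),
        ("Old", "Grey", "Dull"),
    ],
)

sql = """
SELECT D,
       SUM(CASE WHEN Type = 'New' THEN 1 ELSE 0 END) AS new_count,
       SUM(CASE WHEN Type = 'Old' THEN 1 ELSE 0 END) AS old_count
FROM (
    -- Stack both description columns into one column D (the Query1 step).
    SELECT Type, Description1 AS D FROM Table1
    UNION ALL
    SELECT Type, Description2 AS D FROM Table1
) AS unpivoted
GROUP BY D
ORDER BY D
"""
counts = conn.execute(sql).fetchall()
for description, new_count, old_count in counts:
    print(description, new_count, old_count)
```

UNION ALL is used here so that two identical rows would both be counted.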

SAP HANA SQL - Concatenate multiple result rows for a single column into a single row

I am pulling data and when I pull in the text field my results for the "distinct ID" are sometimes being duplicated when there are multiple results for that ID. Is there a way to concatenate the results into a single column/row rather than having them duplicated?
It looks like there are ways in other SQL platforms but I have not been able to find something that works in HANA.
Example
Select
Distinct ID
From Table1
If I pull only Distinct ID I get the following:
ID
1
2
3
4
However when I pull the following:
Example
Select
Distinct ID,Text
From Table1
I get something like
ID Text
1 Dog
2 Cat
2 Dog
3 Fish
4 Bird
4 Horse
I am trying to Concat the Text field when there is more than 1 row for each ID.
What I need the results to be (Having a "break" between results so that they are on separate lines would be even better but at least a "," would work):
ID Text
1 Dog
2 Cat,Dog
3 Fish
4 Bird,Horse
I see Kiran has just referred to another valid answer in the comment, but in your example this would work.
SELECT ID, STRING_AGG(Text, ',')
FROM TABLE1
GROUP BY ID;
You can replace the ',' with other characters, maybe a '\n' for a line break
I would caution against the approach to concatenate rows in this way, unless you know your data well. There is no effective limit to the rows and length of the string that you will generate, but HANA will have a limit on string length, so consider that.
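To see the shape of the aggregated result without a HANA system, here is a small sketch that uses SQLite's group_concat through Python's sqlite3 module as a stand-in for STRING_AGG; that substitution is an assumption, not HANA behaviour. The line-break variant is produced on the Python side, since the separator each engine accepts may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Table1 (ID INTEGER, "Text" TEXT)')
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [(1, "Dog"), (2, "Cat"), (2, "Dog"), (3, "Fish"), (4, "Bird"), (4, "Horse")],
)

# One row per ID, with all of its Text values joined by a comma.
sql = """
SELECT ID, group_concat("Text", ',')
FROM Table1
GROUP BY ID
ORDER BY ID
"""
aggregated = conn.execute(sql).fetchall()

# Split again and sort, so the order inside each list is predictable.
by_id = {row_id: sorted(text.split(",")) for row_id, text in aggregated}

# Same data with a line break between the values instead of a comma.
multiline = {row_id: "\n".join(values) for row_id, values in by_id.items()}
print(by_id)
```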

HANA concat rows

I use an SAP HANA database. I have a simple 4-column table whose columns are number, name, noodles, fish. The rows are these:
number name noodles fish
1 tom x
1 tom x
1 jack
2 jack x
I would like to group the rows by number, and concatenate the names into a field, and thus obtain this:
number name noodles fish
1 tom x x
2 jack x
Can you please tell me how we can perform this operation in sap-hana? Thanks in advance.
Well, you did not really concatenate the names, but instead kept the same ones (if you would have concatenated the names as well, you would get something like jackjack in your result). I guess your x's indicate some sort of ABAP-style flags.
In any case, you would do this with grouping. This is a completely non-HANA thing (you can use the same basic SQL for any DB). You can group against several columns. All other columns that you want to select must be used in an aggregated expression (e.g. a SUM, MAX, COUNT, etc.).
To get the output from your question, I wrote the following code:
SELECT "ID", "NAME", MAX("FISH"), MAX("NOODLES")
FROM #TEST GROUP BY "ID", "NAME";
And got the same output as you. I used the MAX function based on the following assumption: you would want to get X if there is any X in the "concatenated" (aggregated) rows in that column. You get nothing / space if all the "concatenated" rows have space in them.

SQL list only unique / distinct values

I have a table which contains geometry lines (ways). Some lines have a unique geometry (not repeating) and some share the same geometry (2, 3, 4 or more of them). I want to list only the unique ones. If there are, for example, 2 lines with the same geometry, I want to drop them both. I tried DISTINCT, but it also shows the first result from the duplicated lines. I only want to see the unique ones.
I tried window function but I get similar result (I get a counter on first line from the duplicating ones). Sorry for a newbie question but I'm learning :) Thanks!
Example:
way|
1 |
1 |
2 |
3 |
3 |
4 |
Result should be:
way|
2 |
4 |
That actually worked. Thanks a lot. I also have other tags in this table for every way (name, ref and a few other tags). When I add them to the query I lose the segregation.
select count(way), way, name
from planet_osm_line
group by way, name
having count(way) = 1;
Without "name" in the query I get all unique values listed, but I want to keep "name" for every line. With this example I still get all the lines in the table listed.
To expound on @Nithila's answer:
select count(way), way
from your_table
group by way
having count(way) = 1;
You first calculate the rows you want, and then look up the rest of the fields, so the aggregation doesn't cause you problems.
WITH singleRow as (
select count(way), way
from planet_osm_line
group by way
having count(way) = 1
)
SELECT P.*
FROM planet_osm_line P
JOIN singleRow S
ON P.way = S.way
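A quick way to check the "aggregate first, then join back" pattern is to run it on the question's example. The sketch below makes several assumptions: it uses Python's sqlite3 module instead of PostgreSQL, represents each geometry by a plain integer, and invents the name values purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE planet_osm_line (way INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO planet_osm_line VALUES (?, ?)",
    # way values follow the question's example; the names are made up.
    [(1, "a"), (1, "b"), (2, "c"), (3, "d"), (3, "e"), (4, "f")],
)

sql = """
WITH singleRow AS (
    -- Keep only the way values that occur exactly once.
    SELECT way
    FROM planet_osm_line
    GROUP BY way
    HAVING COUNT(*) = 1
)
SELECT P.way, P.name
FROM planet_osm_line P
JOIN singleRow S ON P.way = S.way
ORDER BY P.way
"""
unique_lines = conn.execute(sql).fetchall()
print(unique_lines)
```

Because the grouping happens only on way, any number of extra columns can be selected from P without changing which rows survive.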
You can group by way and keep only the groups whose count is 1. That returns the non-duplicated data.
@voyteck
As I understand your question, you need only the non-duplicated records of the way column, and for each of them you want to show the name. Is that right?
If so, put every column you need in the select statement, but group only by way and wrap the other columns in an aggregate such as min(). With count(way) = 1 each group holds exactly one row, so min(name) is simply that row's name.
select count(way), way, min(name) as name
from planet_osm_line
group by way
having count(way) = 1;

SQL index match to find duplicate data

I have the following table
Code Name Task
aa jones DC
ab dave DC
aca james IF
aca james DC
ab trevor IF
aa jones IF
ag francis DC
ag francis IF
af derek SF
af derek DC
This is a very big table, above is just a quick example.
So, I would like some help finding the code and name pairs that have completed an IF or SF task and a DC task.
I would like it to show where one person has touched both of these tasks. The hierarchy of the tasks is; it comes in as either a SF or IF then someone will do that, then off the back of that we receive a DC task, and I want the ones where it has been completed by the same person, with the same reference number.
I am able to do this in excel with an INDEX MATCH function, but this takes up a tremendous amount of calculation time due to the size of the table.
One way to approach this is using group by with a having. This is a flexible way of expressing these types of conditions:
select code, name
from table t
group by code, name
having sum(case when task = 'DC' then 1 else 0 end) > 0 and
sum(case when task in ('IF', 'SF') then 1 else 0 end) > 0;
Each condition in the having clause counts the number of rows that meet the particular condition. The first, for instance, counts the rows that match 'DC' and takes only the code, name pairs that have at least one such match.
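The HAVING approach can be checked against the sample rows from the question. This is a sketch under assumptions: it runs on Python's sqlite3 module, and the table name tasks is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (code TEXT, name TEXT, task TEXT)")
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?)",
    [
        ("aa", "jones", "DC"), ("ab", "dave", "DC"), ("aca", "james", "IF"),
        ("aca", "james", "DC"), ("ab", "trevor", "IF"), ("aa", "jones", "IF"),
        ("ag", "francis", "DC"), ("ag", "francis", "IF"), ("af", "derek", "SF"),
        ("af", "derek", "DC"),
    ],
)

sql = """
SELECT code, name
FROM tasks
GROUP BY code, name
HAVING SUM(CASE WHEN task = 'DC' THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN task IN ('IF', 'SF') THEN 1 ELSE 0 END) > 0
ORDER BY code
"""
both = conn.execute(sql).fetchall()
print(both)
```

The pairs ab/dave and ab/trevor drop out because each of them has only one of the two kinds of task.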
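SELECT DISTINCT a.code, a.name FROM YOUR_TABLE_NAME a JOIN YOUR_TABLE_NAME b ON a.code = b.code AND a.name = b.name WHERE a.task = 'DC' AND b.task IN ('IF', 'SF')
Try this query. A single row can never have task = 'DC' and task = 'IF' at the same time, so the table is joined to itself to compare two rows with the same code and name.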
Gordon Linoff's query can be simplified under the hypothesis that IF and SF are synonyms and cannot both be present for the same code/name pair, as the data provided by the OP suggests
SELECT code, name
FROM table t
GROUP BY code, name
HAVING SUM(CASE WHEN task IN ('IF', 'SF', 'DC') THEN 1 ELSE 0 END) = 2;
select code, name
from (select distinct code, name from table1 where task='SF' or task='IF') as temp1
inner join (select distinct code as code2, name as name2 from table1 where task='DC') as temp2
on code=code2 and name=name2;
I'm assuming that you have the table in table1. The code constructs two tables, temp1 and temp2. temp1 contains the codes and names which have been assigned SF or IF. temp2 contains the codes and names which have been assigned DC. Finally, I join the two tables together to find the code/name pairs present in both. This is faster than in Excel because the database engine probably temporarily indexes the columns being joined on.
Actually, you can do this in Excel. You sort the table by code and name, then enter the following formulas (assuming "Code" is in A1):
D2=if(and(A2=A1,B2=B1,D1),true,or(C2="IF",C2="SF"))
E2=if(and(A2=A1,B2=B1,E1),true,C2="DC")
Select these two cells, and double-click the fill-handle (the little square at the bottom right of the selection). Then, with the two columns selected, copy, and then "Paste Special..." > "Values". Then, filter (Alt-D-F-F) for the rows with values in columns D and E being both true. That is the result you want. Select these rows and copy to a new sheet if desired.
Alternatively, you can follow the SQL "group by" solution given by Gordon, so that you do not need to sort: Create two new columns like the above, but:
D1: "D"
E1: "E"
D2=if(or(C2="IF",C2="SF"),1,0)
E2=if(C2="DC",1,0)
Then, "Insert" > "PivotTable", drag "Code" and "Name" to be row labels. Drag "D" to be under Values, click on it, "Value Field Settings...", and then select "Max". Do the same for "E", and then the rows with 1 in both D and E will be the result you want.