SQL convert column names to row values

I have a table organized as follows:
Account Location Measure1 Measure2 Measure3
-------------------------------------------
123a A 100 20% 5
234b A 75 80% 8
I want to create records as follows:
Account Location Measure Value
-----------------------------------
123a A Measure1 100
123a A Measure2 20%
123a A Measure3 5
234b A Measure1 75
234b A Measure2 80%
234b A Measure3 8
Because my measure names are the column headings, and not column values under a heading called "Measure", I cannot pivot the data on the measure name.
I know how to get the column names by querying INFORMATION_SCHEMA.COLUMNS, but I'm not sure how to proceed from there. I don't want to do a UNION because there are 100+ measure columns and the table is large.
The only assistance I have been able to find on the web refers to splitting values in a single column (e.g. semicolon-delimited strings) into multiple records. UNPIVOT doesn't seem to work because, again, the measure name is not a value in a column; it is a column heading.
I would appreciate any assistance you can give me.

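A simple method that works in most databases uses union all (if the measure columns have different types, as 100 and 20% do here, cast them to a common type first):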
select account, location, 'Measure1', Measure1
from t
union all
select account, location, 'Measure2', Measure2
from t
union all
select account, location, 'Measure3', Measure3
from t;
In databases that support lateral joins, there is more convenient syntax, such as:
select t.account, t.location, v.measure, v.value
from t cross join lateral -- or maybe cross apply
(values ('Measure1', t.Measure1), ('Measure2', t.Measure2), ('Measure3', t.Measure3)
) v(measure, value);
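With 100+ measure columns, the union all statement does not have to be typed by hand; it can be generated from the column metadata. The following is a minimal sketch of that idea, not a definitive implementation. It uses Python's sqlite3 module purely as a stand-in database, so `PRAGMA table_info` plays the role of INFORMATION_SCHEMA.COLUMNS, and the helper name `build_unpivot_sql` and the `KEY_COLUMNS` set are hypothetical.

```python
import sqlite3

# Stand-in for the wide table from the question (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (account TEXT, location TEXT, "
    "Measure1 INTEGER, Measure2 TEXT, Measure3 INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [("123a", "A", 100, "20%", 5), ("234b", "A", 75, "80%", 8)],
)

# Columns that identify a row and must not be unpivoted (assumption).
KEY_COLUMNS = {"account", "location"}


def build_unpivot_sql(conn, table):
    """Build one UNION ALL branch per non-key column of `table`."""
    # In SQLite, PRAGMA table_info returns one row per column; row[1] is its name.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    measures = [c for c in columns if c.lower() not in KEY_COLUMNS]
    branches = [
        # CAST gives every branch the same type so 100 and '20%' can share a column.
        f"SELECT account, location, '{m}' AS measure, "
        f'CAST("{m}" AS TEXT) AS value FROM {table}'
        for m in measures
    ]
    return "\nUNION ALL\n".join(branches)


unpivot_sql = build_unpivot_sql(conn, "t")
rows = conn.execute(unpivot_sql + "\nORDER BY account, measure").fetchall()
for row in rows:
    print(row)
```

The same generated text could be adapted to the lateral join form by emitting one `('MeasureN', t.MeasureN)` pair per column instead of one select per column.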

Related

SQL: Finding Quantiles Across Columns

Here's an easy one. I have a sales table that looks like this:
store_id industry_code sales_person_1 sales_person_2 ... sales_person_n
1 1000 20.75 15.50 ... 100
2 2000 15.54 16.84 ... 125
Suppose I want to find out which quantile sales_person_2 falls into for store_id=1. I know I can use a window function ntile(5) OVER(PARTITION BY ____ ORDER BY SUM(__) DESC) to divide a column into 5 buckets and use that to identify which bucket an arbitrary value falls into. What's the best way to do that across columns rather than within a column?
What you can do is explode your columns into several rows:
select t.store_id,
t.industry_code,
s.val
from test_table t
lateral view explode(array(sales_person_1, sales_person_2, ..., sales_person_n)) s as val
and only then use ntile.
See the example from the Hive docs.
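To illustrate the "explode, then ntile" idea end to end, here is a small sketch run through Python's sqlite3 module. SQLite has no `lateral view explode`, so a union all inside a CTE is swapped in for the explode step; window functions require SQLite 3.25 or later. The five sales_person columns and their values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_table (store_id INTEGER, industry_code INTEGER, "
    "sales_person_1 REAL, sales_person_2 REAL, sales_person_3 REAL, "
    "sales_person_4 REAL, sales_person_5 REAL)"
)
conn.execute(
    "INSERT INTO test_table VALUES (1, 1000, 20.75, 15.50, 100, 42.0, 8.25)"
)

query = """
WITH exploded AS (
    -- One row per (store, sales person); stands in for Hive's explode.
    SELECT store_id, 'sales_person_1' AS person, sales_person_1 AS val FROM test_table
    UNION ALL
    SELECT store_id, 'sales_person_2', sales_person_2 FROM test_table
    UNION ALL
    SELECT store_id, 'sales_person_3', sales_person_3 FROM test_table
    UNION ALL
    SELECT store_id, 'sales_person_4', sales_person_4 FROM test_table
    UNION ALL
    SELECT store_id, 'sales_person_5', sales_person_5 FROM test_table
)
SELECT store_id, person, val,
       NTILE(5) OVER (PARTITION BY store_id ORDER BY val DESC) AS bucket
FROM exploded
"""

# Map each sales person to the bucket they fall into within their store.
buckets = {person: bucket for _, person, _, bucket in conn.execute(query)}
print(buckets["sales_person_2"])
```

Once the values sit in a single column, the bucket for any one sales person is an ordinary lookup on the result.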

grouping with right function sql

I have a query on grouping which I need to do a quick fix on. I am at present grouping column A and counting the value in column B.
select
[Column A],
Count ([Column B])
from table1
Group by [Column A]
The issue is that column A has some entries which are not standard for example.
ABC 100
ABC~ 3
BCA 120
BCA* 4
I need to blast the data to fix long term, but there are 3m rows, so not a quick job, as I need to create a mapping file to deal with the problem.
I currently get returned duplicate entries which is right in theory, but in practice I would like to group the ABC, by either trimming the column to only 3 characters or doing a right. However I have tried it in the select statement and it just removes the ~ or * entry and sums the standard ABC or BCA.
Have you tried something like this? Group by the same expression that you select:
select LEFT([Column A], 3),
Count ([Column B])
from table1
Group by LEFT([Column A], 3)
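The key point is that the trimmed expression has to appear in both the select list and the group by. A quick way to check the behaviour is the sketch below, which uses Python's sqlite3 module with invented rows; SQLite has no LEFT function, so `substr(x, 1, 3)` is used as the equivalent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE table1 ("Column A" TEXT, "Column B" INTEGER)')
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("ABC", 1), ("ABC", 2), ("ABC", 3), ("ABC~", 4),
     ("BCA", 5), ("BCA", 6), ("BCA*", 7)],
)

query = """
SELECT substr("Column A", 1, 3) AS prefix,   -- LEFT([Column A], 3) in SQL Server
       COUNT("Column B") AS cnt
FROM table1
GROUP BY substr("Column A", 1, 3)            -- same expression as in the select list
ORDER BY prefix
"""
grouped = conn.execute(query).fetchall()
print(grouped)
```

Here the ABC~ row is counted together with the three ABC rows, and BCA* together with the two BCA rows.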

SQL Server unpivot columns

I have a table that I would like to unpivot in a SQL statement. It consists of a person and phone 1 through 5. Right now I'm doing a union for each phone but I fear it is causing performance issues.
Columns:
PERSON_GUID,
PHONE_1, PHONE_1_VOICE_FLG,
PHONE_2, PHONE_2_VOICE_FLG,
PHONE_3, PHONE_3_VOICE_FLG,
PHONE_4, PHONE_4_VOICE_FLG,
PHONE_5, PHONE_5_VOICE_FLG
How would I best unpivot the row with performance in mind so that the results are:
PERSON_GUID, PHONE_NO, VOICE_FLG
I prefer UNPIVOT, but as for your current solution:
Make sure you are using UNION ALL and not UNION.
UNION ALL simply appends one query result after the other.
UNION eliminates duplicate rows, and this is where you pay in performance.
select PERSON_GUID,PHONE_NO,
case right(col,1)
when '1' then PHONE_1_VOICE_FLG
when '2' then PHONE_2_VOICE_FLG
when '3' then PHONE_3_VOICE_FLG
when '4' then PHONE_4_VOICE_FLG
when '5' then PHONE_5_VOICE_FLG
end VOICE_FLG
from t unpivot (PHONE_NO for col in
(PHONE_1,PHONE_2,PHONE_3,PHONE_4,PHONE_5)) u
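For databases without UNPIVOT, a comparable single-scan approach is to cross join the table to a small list of slot numbers and pick the matching phone and flag with CASE. The sketch below is only an illustration of that alternative, run through Python's sqlite3 module; it is cut down to two phone slots, and the sample rows and the `slot` column name are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (PERSON_GUID TEXT, "
    "PHONE_1 TEXT, PHONE_1_VOICE_FLG TEXT, "
    "PHONE_2 TEXT, PHONE_2_VOICE_FLG TEXT)"
)
conn.execute("INSERT INTO t VALUES ('p1', '555-0100', 'Y', '555-0101', 'N')")
conn.execute("INSERT INTO t VALUES ('p2', '555-0102', 'Y', NULL, NULL)")

query = """
SELECT PERSON_GUID, PHONE_NO, VOICE_FLG
FROM (
    SELECT t.PERSON_GUID,
           -- Each slot number selects one phone and its matching flag.
           CASE n.slot WHEN 1 THEN t.PHONE_1
                       WHEN 2 THEN t.PHONE_2 END AS PHONE_NO,
           CASE n.slot WHEN 1 THEN t.PHONE_1_VOICE_FLG
                       WHEN 2 THEN t.PHONE_2_VOICE_FLG END AS VOICE_FLG
    FROM t
    CROSS JOIN (SELECT 1 AS slot UNION ALL SELECT 2) AS n
) AS unpivoted
WHERE PHONE_NO IS NOT NULL   -- mirrors UNPIVOT, which drops NULL values
ORDER BY PERSON_GUID, PHONE_NO
"""
phones = conn.execute(query).fetchall()
for row in phones:
    print(row)
```

Extending this to five phones means adding slots 3 to 5 to the numbers list and to both CASE expressions.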

Select query to fetch required data from SQL table

I have some data like this as shown below:
Acc_Id || Row_No
1 1
2 1
2 2
2 3
3 1
3 2
3 3
3 4
and I need a query to get the results as shown below:
Acc_Id || Row_No
1 1
2 3
3 4
Please consider that I'm a beginner in SQL.
I assume you want the count of rows for each Acc_Id:
SELECT Acc_Id, COUNT(*)
FROM Table
GROUP BY Acc_Id
Try this:
select Acc_Id, MAX(Row_No)
from table
group by Acc_Id
As a beginner, this is probably your first exposure to aggregation and grouping. You may want to look at the documentation on group by now that this problem has motivated your interest in a solution. Grouping operates by looking at rows with common values in the columns that you specify, and collapsing them into a single row which represents the group. In your case, the values in Acc_Id are the names of your groups.
The other answers are both correct, in that the final two columns below are going to be equivalent with your data.
select Acc_Id, count(*), max(Row_No)
from T
group by Acc_Id;
If you have gaps in the numbering then they won't be the same. You'll have to decide whether you're actually looking for a count of rows or a maximum of a value within a column. At this point you can also consider a number of other aggregate functions that will be useful to you in the future. (Note that the actual values here are pretty much meaningless in this context.)
select Acc_Id, min(Row_No), sum(Row_No), avg(Row_No)
from T
group by Acc_Id;
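To make the point about gaps concrete, here is a small sketch run through Python's sqlite3 module. The rows are invented: account 2 has Row_No values 1, 2 and 5, so its row count and its maximum differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (Acc_Id INTEGER, Row_No INTEGER)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?)",
    [(1, 1), (2, 1), (2, 2), (2, 5)],  # note the gap: no Row_No 3 or 4 for account 2
)

query = """
SELECT Acc_Id, COUNT(*) AS row_count, MAX(Row_No) AS max_row_no
FROM T
GROUP BY Acc_Id
ORDER BY Acc_Id
"""
summary = conn.execute(query).fetchall()
for acc_id, row_count, max_row_no in summary:
    print(acc_id, row_count, max_row_no)
```

For account 1 the two aggregates agree; for account 2 the count is 3 while the maximum is 5.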

SELECT DISTINCT is not working

Let's say I have a table name TableA with the below partial data:
LOOKUP_VALUE LOOKUPS_CODE LOOKUPS_ID
------------ ------------ ----------
5% 120 1001
5% 121 1002
5% 123 1003
2% 130 2001
2% 131 2002
I wanted to select only 1 row of 5% and 1 row of 2% as a view using DISTINCT, but it fails. My query is:
SELECT DISTINCT lookup_value, lookups_code
FROM TableA;
The above query gives me the result shown below.
LOOKUP_VALUE LOOKUPS_CODE
------------ ------------
5% 120
5% 121
5% 123
2% 130
2% 131
But that is not my expected result; my expected result is shown below:
LOOKUP_VALUE LOOKUPS_CODE
------------ ------------
5% 120
2% 130
May I know how can I achieve this without specifying any WHERE clause?
Thank you!
I think you're misunderstanding the scope of DISTINCT: it gives you distinct rows (distinct combinations of all selected columns), not just distinct values of the first field.
If you want one row for each distinct LOOKUP_VALUE, you either need a WHERE clause that will work out which one of them to show, or an aggregation strategy with a GROUP BY clause plus logic in the SELECT that tells the query how to aggregate the other columns (e.g. AVG, MAX, MIN)
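The scope of DISTINCT is easy to verify on the sample data. The sketch below uses Python's sqlite3 module as a stand-in database and loads the five rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TableA (lookup_value TEXT, lookups_code INTEGER, lookups_id INTEGER)"
)
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?, ?)",
    [("5%", 120, 1001), ("5%", 121, 1002), ("5%", 123, 1003),
     ("2%", 130, 2001), ("2%", 131, 2002)],
)

# DISTINCT applies to the whole (lookup_value, lookups_code) pair,
# and every pair in the sample data is already unique.
distinct_pairs = conn.execute(
    "SELECT DISTINCT lookup_value, lookups_code FROM TableA"
).fetchall()

# With a single column in the select list, duplicates do collapse.
distinct_values = conn.execute(
    "SELECT DISTINCT lookup_value FROM TableA ORDER BY lookup_value"
).fetchall()

print(len(distinct_pairs), distinct_values)
```

All five pairs survive the first query, while the second returns only the two distinct lookup values.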
Here's my guess at your problem - when you say
"The above query give me the result as shown in the data table above."
this is simply not true - please try it and update your question accordingly.
I am speculating here: I think you are trying to use "Distinct" but also output the other fields. If you run:
select distinct Field1, Field2, Field3 ...
Then your output will be "one row per distinct combination" of the 3 fields.
Try GROUP BY instead - this will let you select the Max, Min, Sum of other fields while still yielding "one row per unique combined values" for fields included in GROUP BY
The example below uses your table to return one row per LOOKUP_VALUE, with the min and max of the remaining fields and the total record count:
select
LOOKUP_VALUE, min( LOOKUPS_CODE) LOOKUPS_CODE_min, max( LOOKUPS_CODE) LOOKUPS_CODE_max, min( LOOKUPS_ID) LOOKUPS_ID_min, max( LOOKUPS_ID) LOOKUPS_ID_max, Count(*) Record_Count
From TableA
Group by LOOKUP_VALUE
I wanted to select only 1 row of 5% and 1 row of 2%
This will get the lowest value lookups_code for each lookup_value:
SELECT lookup_value,
lookups_code
FROM (
SELECT lookup_value,
lookups_code,
ROW_NUMBER() OVER ( PARTITION BY lookup_value ORDER BY lookups_code ) AS rn
FROM TableA
)
WHERE rn = 1
You could also use GROUP BY:
SELECT lookup_value,
MIN( lookups_code ) AS lookups_code
FROM TableA
GROUP BY lookup_value
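As a sanity check, the two queries can be compared on the sample data. This is a sketch run through Python's sqlite3 module (window functions require SQLite 3.25 or later); the subquery alias `ranked` and the added ORDER BY are only there to make the comparison deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TableA (lookup_value TEXT, lookups_code INTEGER, lookups_id INTEGER)"
)
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?, ?)",
    [("5%", 120, 1001), ("5%", 121, 1002), ("5%", 123, 1003),
     ("2%", 130, 2001), ("2%", 131, 2002)],
)

row_number_sql = """
SELECT lookup_value, lookups_code
FROM (
    SELECT lookup_value, lookups_code,
           ROW_NUMBER() OVER (PARTITION BY lookup_value
                              ORDER BY lookups_code) AS rn
    FROM TableA
) AS ranked
WHERE rn = 1
ORDER BY lookup_value
"""

group_by_sql = """
SELECT lookup_value, MIN(lookups_code) AS lookups_code
FROM TableA
GROUP BY lookup_value
ORDER BY lookup_value
"""

via_row_number = conn.execute(row_number_sql).fetchall()
via_group_by = conn.execute(group_by_sql).fetchall()
print(via_row_number)
print(via_group_by)
```

The ROW_NUMBER form becomes the more useful one when other columns from the chosen row (for example lookups_id) must come along, since MIN on each column separately could mix values from different rows.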
How about the MIN() function
I believe this works for your desired output, but am currently not able to test it.
SELECT Lookup_Value, MIN(LOOKUPS_CODE)
FROM TableA
GROUP BY Lookup_Value;
I'm going to take a total shot in the dark on this one, but because of the way you have named your fields it implies you are attempting to mimic the vlookup function within Microsoft Excel. If this is the case, the behavior when there are multiple matches is to pick the first match. As arbitrary as that sounds, it's the way it works.
If this is what you want, AND the first value is not necessarily the lowest (or highest, or best looking, or whatever), then the row_number analytic (window) function would probably suit your needs.
I give you a caveat that my ordering criterion is based on the database row number (Oracle's rownum), which could conceivably be different from what you think. If, however, you insert the rows into a clean table (with a reset high water mark), then I think it's a pretty safe bet it will behave the way you want. If not, then you are better off including a field that explicitly tells it in what order the choice should occur.
with cte as (
select
lookup_value,
lookups_code,
row_number() over (partition by lookup_value order by rownum) as rn
from
TableA
)
select
lookup_value, lookups_code
from cte
where rn = 1