Collapse Multiple Records Into a Single Record With Multiple Columns - sql

In a program I'm maintaining, we were given a massive (~500 lines) SQL statement by the customer. It is used to generate flat files with fixed-length records for transmitting data to another big business. Since it's a massive flat file, it's not relational and the standard normal forms of the data are collapsed. So if a record can have multiple codes associated with it (in this case up to 19), they all have to be written into a single line of the flat file, but in separate fields.
Note: this example is simplified.
The data might look like this, with three tables:
RECORDS
record_id  firstname  lastname
--------------------------------
123        Bob        Schmidt
324        George     Washington
325        Ronald     Reagan
290        George     Clooney

CODE_TABLE
code_id  code_cd  code_txt
--------------------------------
5        3        President
2        4        Actor
3        7        Plumber

CODES_FOR_RECORDS
record_id  code_cd
-------------------
123        7
325        3
290        4
324        3
325        4
123        4
This needs to produce records like:
firstname  lastname    code1      code2      code3
Bob        Schmidt     Actor      Plumber    NULL
George     Washington  President  NULL       NULL
Ronald     Reagan      Actor      President  NULL
George     Clooney     Actor      NULL       NULL
The portion of the current query we were given looks like this, but with 19 code columns instead of the 5:
select
x.record_id,
max(case when x.rankk = 1 then x.ctag end) as CodeColumn1,
max(case when x.rankk = 2 then x.ctag end) as CodeColumn2,
max(case when x.rankk = 3 then x.ctag end) as CodeColumn3,
max(case when x.rankk = 4 then x.ctag end) as CodeColumn4,
max(case when x.rankk = 5 then x.ctag end) as CodeColumn5
from
(
select
r.record_id,
ct.code_txt as ctag ,
dense_rank() over (partition by r.record_id order by ct.code_id) as rankk
from
records as r,
codes_for_records as cfr,
code_table as ct
where
r.record_id = cfr.record_id
and ct.code_cd = cfr.code_cd
and cfr.code_cd is not null
and ct.code_txt not like '%V%'
) as x
where
x.record_id is not null
group by
x.record_id
I trimmed things down for simplicity's sake; the actual statement includes an inner query, a join and more where conditions, but this should get the idea across. My brain is telling me there has to be a better way, but I'm not an SQL expert. We are using DB2 v8, if that helps. The codes have to be in separate columns, so no coalescing things into a single string. Is there a cleaner solution than this?
Update:
I ended up just refactoring the original query. It still uses the ugly MAX() business, but overall the query is much more readable thanks to reworking other parts.

It sounds like what you are looking for is pivoting.
WITH joined_table(firstname, lastname, code_txt, rankk) AS
(
SELECT
r.firstname,
r.lastname,
ct.code_txt,
dense_rank() over (partition by r.record_id order by ct.code_id) as rankk
FROM
records r
INNER JOIN
codes_for_records cfr
ON r.record_id = cfr.record_id
INNER JOIN
code_table ct
ON ct.code_cd = cfr.code_cd
),
decoded_table(firstname, lastname,
CodeColumn1, CodeColumn2, CodeColumn3, CodeColumn4, CodeColumn5) AS
(
SELECT
firstname,
lastname,
DECODE(rankk, 1, code_txt),
DECODE(rankk, 2, code_txt),
DECODE(rankk, 3, code_txt),
DECODE(rankk, 4, code_txt),
DECODE(rankk, 5, code_txt)
FROM
joined_table jt
)
SELECT
firstname,
lastname,
MAX(CodeColumn1),
MAX(CodeColumn2),
MAX(CodeColumn3),
MAX(CodeColumn4),
MAX(CodeColumn5)
FROM
decoded_table dt
GROUP BY
firstname,
lastname;
Note that I've never actually done this myself before. I'm relying on the linked document as a reference.
You might need to include the record_id to account for duplicate names.
Edit: Added the GROUP BY.
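For anyone who wants to experiment with the pattern, here is a minimal, self-contained sketch of the MAX(CASE ...) pivot run against the sample data. It uses Python's built-in sqlite3 module rather than DB2 (an assumption made purely so the example can be run anywhere), computes the rank with a correlated subquery instead of dense_rank(), and assumes the codes are ordered by code_table.code_id, which is what the expected output suggests. N_CODES is a hypothetical setting. The column list is generated in a loop, which is one way to avoid typing 19 near-identical lines by hand.

```python
import sqlite3

N_CODES = 3  # hypothetical; the real flat file would use 19

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE records (record_id INTEGER, firstname TEXT, lastname TEXT);
CREATE TABLE code_table (code_id INTEGER, code_cd INTEGER, code_txt TEXT);
CREATE TABLE codes_for_records (record_id INTEGER, code_cd INTEGER);
INSERT INTO records VALUES
    (123, 'Bob', 'Schmidt'), (324, 'George', 'Washington'),
    (325, 'Ronald', 'Reagan'), (290, 'George', 'Clooney');
INSERT INTO code_table VALUES
    (5, 3, 'President'), (2, 4, 'Actor'), (3, 7, 'Plumber');
INSERT INTO codes_for_records VALUES
    (123, 7), (325, 3), (290, 4), (324, 3), (325, 4), (123, 4);
""")

# One MAX(CASE ...) expression per output column, generated instead of typed.
code_columns = ",\n       ".join(
    f"MAX(CASE WHEN rankk = {i} THEN code_txt END) AS code{i}"
    for i in range(1, N_CODES + 1)
)

sql = f"""
SELECT firstname, lastname,
       {code_columns}
FROM (
    SELECT r.record_id, r.firstname, r.lastname, ct.code_txt,
           -- rank = how many of this record's codes have code_id <= this one
           (SELECT COUNT(*)
              FROM codes_for_records cfr2
              JOIN code_table ct2 ON ct2.code_cd = cfr2.code_cd
             WHERE cfr2.record_id = r.record_id
               AND ct2.code_id <= ct.code_id) AS rankk
      FROM records r
      JOIN codes_for_records cfr ON cfr.record_id = r.record_id
      JOIN code_table ct ON ct.code_cd = cfr.code_cd
) AS x
GROUP BY record_id, firstname, lastname
ORDER BY record_id
"""

rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Grouping by record_id as well as the names keeps two people with the same name apart. The correlated-subquery rank assumes a record never lists the same code twice.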

One possible solution is to use a recursive query:
with recursive_view (record_id, rankk, final) as
(
select
record_id,
rankk,
cast (ctag as varchar (100))
from inner_query t1
union all
select
t1.record_id,
t1.rankk,
/* all formatting here */
cast (t2.final || ',' || t1.ctag as varchar (100))
from
inner_query t1,
recursive_view t2
where
t2.rankk < t1.rankk
and t1.record_id = t2.record_id
and locate(t1.ctag, t2.final) = 0
)
select record_id, final from recursive_view;
I can't guarantee that it works, but I hope it will be helpful. Another way is to use a custom aggregate function.

Related

Is there a way to display the total count of rows in a separate row?

I have a table that looks like this:
City_Id  City
41       Athena
39       Beijing
35       London
30       Rio de Janeiro
28       Salt Lake City
18       Sochi
7        Sydney
4        Torino
Is there a way to display another row at the bottom that shows the total count of rows?
City_Id  City
41       Athena
39       Beijing
35       London
30       Rio de Janeiro
28       Salt Lake City
18       Sochi
7        Sydney
4        Torino
Total    8
You can actually use GROUPING SETS for this. This avoids having to scan the table twice.
However, you still have the data-type mismatch problem. You could solve it by casting, but it's probably easier to just swap the columns around:
SELECT
CASE WHEN GROUPING(City) = 0 THEN City ELSE 'Total' END AS City,
CASE WHEN GROUPING(City_Id) = 0 THEN City_Id ELSE COUNT(*) END AS City_Id
FROM Table1
GROUP BY GROUPING SETS (
(City_Id, City),
()
)
ORDER BY GROUPING(City_Id);
What this does is generate separate result-sets, unioned together. You can differentiate between a grouped row and a non-grouped row using the GROUPING function.
I would agree with most of the other comments that acquiring a result set count would be more appropriate from the application code (which usually has a mechanism specifically for this purpose).
However...
If you must have a TSQL solution for your question, an option is to return the count in a separate column. This is different than returning it in a separate row, of course. There are pros & cons with each approach.
DROP TABLE IF EXISTS #Cities;
CREATE TABLE #Cities (
City_Id INT,
City VARCHAR(128)
);
INSERT INTO #Cities
VALUES
(41, 'Athena'),
(39, 'Beijing'),
(35, 'London'),
(30, 'Rio de Janeiro'),
(28, 'Salt Lake City'),
(18, 'Sochi'),
(7 , 'Sydney'),
(4 , 'Torino');
SELECT *, COUNT(*) OVER(ORDER BY (SELECT NULL)) AS Total
FROM #Cities;
--Count is properly reflected based on WHERE clause.
SELECT *, COUNT(*) OVER(ORDER BY (SELECT NULL)) AS Total
FROM #Cities
WHERE City LIKE 'S%';
--Be careful with this one--the COUNT(*) may not be what you expected.
SELECT TOP(4) *, COUNT(*) OVER(ORDER BY (SELECT NULL)) AS Total
FROM #Cities;
NOTE: be aware that this approach may not scale (perform) well for large result sets. Be sure to do some testing!
As you know already, this should be done in the presentation layer. But if you just want to know whether there is any way to do it in SQL, I would suggest using UNION ALL:
select cast(City_Id as varchar(10)) City_Id, City from Table1
union all
select 'Total' as City_Id, cast(count(*) as varchar(14)) from Table1
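One detail worth adding to the UNION ALL approach: without an ORDER BY, nothing guarantees that the total row comes last. A minimal sketch of one way to pin it to the bottom, using Python's built-in sqlite3 module instead of SQL Server (an assumption so the example is runnable as-is; the helper columns id_text, is_total and sort_id are hypothetical names):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (City_Id INTEGER, City TEXT);
INSERT INTO Table1 VALUES
    (41, 'Athena'), (39, 'Beijing'), (35, 'London'), (30, 'Rio de Janeiro'),
    (28, 'Salt Lake City'), (18, 'Sochi'), (7, 'Sydney'), (4, 'Torino');
""")

rows = conn.execute("""
SELECT id_text, City
FROM (
    -- detail rows: ids cast to text so both branches have the same type
    SELECT CAST(City_Id AS TEXT) AS id_text, City,
           0 AS is_total, City_Id AS sort_id
    FROM Table1
    UNION ALL
    -- single summary row, flagged so it can be sorted last
    SELECT 'Total', CAST(COUNT(*) AS TEXT), 1, NULL
    FROM Table1
)
ORDER BY is_total, sort_id DESC
""").fetchall()

for row in rows:
    print(row)
```

The extra is_total column exists only for sorting and is left out of the outer select list.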

SQLite query to get table based on values of another table

I am not sure what title would correctly reflect my question, so I can only describe what I want.
There is a table with the fields:
id, name, city
It contains the following rows:
1 John London
2 Mary Paris
3 John Paris
4 Samy London
I want to get a result like this:
       London  Paris
Total  2       2
John   1       1
Mary   0       1
Samy   1       0
So I need to take all unique values of name and, for each one, count the rows for every unique value of another field (city).
I also want to get the total count for each city.
A simple way to do it is:
1) Get a list of unique names
SELECT DISTINCT name FROM table
2) Get a list of unique cities
SELECT DISTINCT city FROM table
3) Run a query for every name and city
SELECT COUNT(city) FROM table WHERE name = some_name AND city = some_city
4) Get the total for every city
SELECT COUNT(city) FROM table WHERE city = some_city
(I didn't test these queries, so there may be errors in them; they are only meant to show the idea.)
With 3 names and 2 cities, that is 3 * 2 = 6 queries to the DB.
But for a table with 100 cities and 100 names, it is 100 * 100 = 10,000 queries to the DB, which may take a lot of time.
Also, the names and cities may change, so I can't write a query with predefined names or cities: every day there are new ones, so instead of London and Paris it may be Moscow, Turin and Berlin. The same goes for names.
How can I get such a table with one or two queries to the original table using SQLite?
(SQLite because I am doing this for Android.)
You can get the per-name results with conditional aggregation. As for the total, unfortunately SQLite does not support the with rollup clause, which would generate it automatically.
One workaround is union all and an additional column for ordering:
select name, london, paris
from (
select name, sum(city = 'London') london, sum(city = 'Paris') paris, 1 prio
from mytable
group by name
union all
select 'Total', sum(city = 'London'), sum(city = 'Paris'), 0
from mytable
) t
order by prio, name
Actually the subquery might not be necessary:
select name, sum(city = 'London') london, sum(city = 'Paris') paris, 1 prio
from mytable
group by name
union all
select 'Total', sum(city = 'London'), sum(city = 'Paris'), 0
from mytable
order by prio, name
@GMB gave me the idea of using GROUP BY. Since I am doing this for SQLite on Android, my final query looks like this:
SELECT name,
COUNT(CASE WHEN city = :london THEN 1 END) as countLondon,
COUNT(CASE WHEN city = :paris THEN 1 END) as countParis
FROM table2 GROUP BY name
where :london and :paris are the bound parameters, and countLondon and countParis are fields of the response class.
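Because the city names change from day to day, the column list can be built at run time from a DISTINCT query, so the whole table still takes only two queries. A minimal sketch of that idea, using Python's built-in sqlite3 module in place of the Android API (an assumption for the sake of a runnable example; the table name mytable is borrowed from the first answer):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
INSERT INTO mytable VALUES
    (1, 'John', 'London'), (2, 'Mary', 'Paris'),
    (3, 'John', 'Paris'), (4, 'Samy', 'London');
""")

# Query 1: discover today's cities.
cities = [city for (city,) in
          conn.execute("SELECT DISTINCT city FROM mytable ORDER BY city")]

# Query 2: one SUM(city = ?) per city; the city names are bound as
# parameters rather than pasted into the SQL text.
sums = ", ".join("SUM(city = ?)" for _ in cities)
sql = (
    f"SELECT 'Total' AS name, {sums}, 0 AS prio FROM mytable "
    f"UNION ALL "
    f"SELECT name, {sums}, 1 FROM mytable GROUP BY name "
    f"ORDER BY prio, name"
)
# The placeholders appear twice (once per SELECT), so bind the list twice.
rows = conn.execute(sql, cities + cities).fetchall()

print(["name"] + cities)
for row in rows:
    print(row[:-1])  # drop the prio helper column
```

The count columns come back in the same order as the cities list, which can serve as the header row.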

SQL - how to transpose only some row values into column headers without pivot

I have a table similar to this:
stud_ID | first_name | last_name | email | col_num | user_value
1       | tom        | smith     |       | 50      | Retail
1       | tom        | smith     |       | 60      | Product
2       | Sam        | wright    |       | 50      | Retail
2       | Sam        | wright    |       | 60      | Sale
but need to convert it to the following (basically transpose 'col_num' into column headers, changing 50 to Function and 60 to Department):
stud_ID | first_name | last_name | email | Function | Department
1       | tom        | smith     |       | Retail   | Product
2       | Sam        | wright    |       | Retail   | Sale
Unfortunately Pivot doesn't work in my system, just wondering if there is any other way to do this please?
The code that I have so far (sorry for the long list):
SELECT c.person_id_external as stu_id,
c.lname,
c.fname,
c.mi,
a.cpnt_id,
a.cpnt_typ_id,
a.rev_dte,
a.rev_num,
cp.cpnt_title AS cpnt_desc,
a.compl_dte,
a.CMPL_STAT_ID,
b.cmpl_stat_desc,
b.PROVIDE_CRDT,
b.INITIATE_LEVEL1_SURVEY,
b.INITIATE_LEVEL3_SURVEY,
a.SCHD_ID,
a.TOTAL_HRS,
a.CREDIT_HRS,
a.CPE_HRS,
a.CONTACT_HRS,
a.TUITION,
a.INST_NAME,
--a.COMMENTS,
a.BASE_STUD_ID,
a.BASE_CPNT_TYP_ID,
a.BASE_CPNT_ID,
a.BASE_REV_DTE,
a.BASE_CMPL_STAT_ID,
a.BASE_COMPL_DTE,
a.ES_USER_NAME,
a.INTERNAL,
a.GRADE_OPT,
a.GRADE,
a.PMT_ORDER_TICKET_NO,
a.TICKET_SEQUENCE,
a.ORDER_ITEM_ID,
a.ESIG_MESSAGE,
a.ESIG_MEANING_CODE_ID,
a.ESIG_MEANING_CODE_DESC,
a.CPNT_KEY,
a.CURRENCY_CODE,
c.EMP_STAT_ID,
c.EMP_TYP_ID,
c.JL_ID,
c.JP_ID,
c.TARGET_JP_ID,
c.JOB_TITLE,
c.DMN_ID,
c.ORG_ID,
c.REGION_ID,
c.CO_ID,
c.NOTACTIVE,
c.ADDR,
c.CITY,
c.STATE,
c.POSTAL,
c.CNTRY,
c.SUPER,
c.COACH_STUD_ID,
c.HIRE_DTE,
c.TERM_DTE,
c.EMAIL_ADDR,
c.RESUME_LOCN,
c.COMMENTS,
c.SHIPPING_NAME,
c.SHIPPING_CONTACT_NAME,
c.SHIPPING_ADDR,
c.SHIPPING_ADDR1,
c.SHIPPING_CITY,
c.SHIPPING_STATE,
c.SHIPPING_POSTAL,
c.SHIPPING_CNTRY,
c.SHIPPING_PHON_NUM,
c.SHIPPING_FAX_NUM,
c.SHIPPING_EMAIL_ADDR,
c.STUD_PSWD,
c.PIN,
c.PIN_DATE,
c.ENCRYPTED,
c.HAS_ACCESS,
c.BILLING_NAME,
c.BILLING_CONTACT_NAME,
c.BILLING_ADDR,
c.BILLING_ADDR1,
c.BILLING_CITY,
c.BILLING_STATE,
c.BILLING_POSTAL,
c.BILLING_CNTRY,
c.BILLING_PHON_NUM,
c.BILLING_FAX_NUM,
c.BILLING_EMAIL_ADDR,
c.SELF_REGISTRATION,
c.SELF_REGISTRATION_DATE,
c.ACCESS_TO_ORG_FIN_ACT,
c.NOTIFY_DEV_PLAN_ITEM_ADD,
c.NOTIFY_DEV_PLAN_ITEM_MOD,
c.NOTIFY_DEV_PLAN_ITEM_REMOVE,
c.NOTIFY_WHEN_SUB_ITEM_COMPLETE,
c.NOTIFY_WHEN_SUB_ITEM_FAILURE,
c.LOCKED,
c.PASSWORD_EXP_DATE,
c.SECURITY_QUESTION,
c.SECURITY_ANSWER,
c.ROLE_ID,
c.IMAGE_ID,
c.GENDER,
c.PAST_SERVICE,
c.LST_UNLOCK_TSTMP,
c.MANAGE_SUB_SP,
c.MANAGE_OWN_SP,
d.col_num,
d.user_value
FROM pa_cpnt_evthst a,
pa_cmpl_stat b,
pa_student c,
pv_course cp,
pa_stud_user d
WHERE a.cmpl_stat_id = b.cmpl_stat_id
AND a.stud_id = c.stud_id
AND cp.cpnt_typ_id(+) = a.cpnt_typ_id
AND cp.cpnt_id(+) = a.cpnt_id
AND cp.rev_dte(+) = a.rev_dte
AND a.CPNT_TYP_ID != 'SYSTEM_PROGRAM_ENTITY'
AND c.stud_id = d.stud_id
AND d.col_num in ('10','30','50','60')
I would just use conditional aggregation:
select stud_ID, first_name, last_name, email,
max(case when col_num = 50 then user_value end) as function,
max(case when col_num = 60 then user_value end) as department
from t
group by stud_ID, first_name, last_name, email;
Your code seems to have nothing to do with the sample data. I do notice however that you are using implicit join syntax. You really need to learn how to use proper, explicit, standard JOIN syntax.
I'm assuming you have an older SQL Server version (such as SQL Server 2000) without PIVOT. What you need to do in that case is write a script with a cursor.
The cursor builds up a string with something like this:
string var = "CREATE TABLE #Report (Col1 VARCHAR(20), Col2 VARCHAR(20), " + ColumnName
That way you can create a temp table on the fly; at the end you select from the temp table to get your pivoted result.
It's not that easy if you are not familiar with cursors.
OR
If there are only a few values in your 'pivot' column and they are not going to grow, you can also do something like this:
Pivot using SQL Server 2000
I'm unable to understand your code, so I'll just assume the table in the sample data is called stud (because of stud_ID).
Here is what I think can do the work of PIVOT:
SELECT ISNULL(s1.stud_ID, s2.stud_ID) AS stud_ID,
ISNULL(s1.first_name, s2.first_name) AS first_name,
ISNULL(s1.last_name, s2.last_name) AS last_name,
ISNULL(s1.email, s2.email) AS email,
s1.user_value AS [Function], s2.user_value AS Department
FROM (SELECT * FROM stud WHERE col_num = 50) s1
FULL OUTER JOIN (SELECT * FROM stud WHERE col_num = 60) s2
ON s1.stud_ID = s2.stud_ID -- Assuming stud_ID identifies a student, else join on all key columns
Explanation: I'm just trying to simulate what PIVOT does. For every column you want, you add another copy of the table to the JOIN and constrain it to a single value of col_num. If, for example, a student has no row for 50 in s1, the FULL OUTER JOIN makes the s1 columns NULL and we pull the student's details from s2 instead.
Note: If you need more than 2 new columns, you can use COALESCE instead of ISNULL, since it accepts more than two arguments.

Display a blank row between every unique row?

I have a simple query like:
SELECT employee, ITEM_TYPE, COUNT(ITEM_TYPE)
FROM hr_database
GROUP BY employee, ITEM_TYPE
So the output may look like
BOB MUGS 4
BOB PENCILS 10
CAT MUGS 2
CAT PAPERCLIPS 7
SAL MUGS 11
But for readability, I want to put a blank row between each user in the output, like this:
BOB MUGS 4
BOB PENCILS 10

CAT MUGS 2
CAT PAPERCLIPS 7

SAL MUGS 11
Is there a way to do this in Oracle SQL? So far, I found this link, but it doesn't match what I need. I'm thinking of using a WITH clause in the query?
You can do it in the database, but this type of processing should really be done at the application layer.
But, it is kind of an amusing trick to figure out how to do it in the database, and that is your specific question:
WITH e AS (
SELECT employee, ITEM_TYPE, COUNT(ITEM_TYPE) as cnt
FROM hr_database
GROUP BY employee, ITEM_TYPE
)
SELECT (case when cnt is not null then employee end) as employee,
item_type, cnt
FROM (select employee, item_type, cnt, 1 as x from e union all
select distinct employee, NULL, NULL, 2 as x from e
) e
ORDER BY e.employee, x;
I emphasize, though, that this is really for amusement and perhaps for understanding better how SQL works. In the real world, you do this type of work at the application layer.
A summary of how this works: the union all brings in one additional row for each employee. The x is a priority for sorting, because you have to sort the result set to get the proper ordering. The case expression blanks out the employee in the first column on those additional rows, which are recognised by their NULL cnt; cnt should never be NULL for the valid rows.
You can also try it like this, with a plain UNION and DISTINCT (the second sort key places each blank row after its employee's rows):
select emp,item_type,cnt from
(select distinct ' ' as emp,' ' as item_type ,' ' as cnt, employee
from hr_database
union
select employee as emp,item_type ,to_char(count(item_type)) as cnt, employee
from hr_database
group by employee,item_type)a
order by a.employee, a.emp desc
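As noted above, this kind of formatting is usually better done in the application layer. A minimal sketch of what that could look like, assuming a Python client and using the built-in sqlite3 module as a stand-in for Oracle (both are assumptions for illustration; the table and column names come from the question):

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hr_database (employee TEXT, item_type TEXT)")

# Recreate the question's counts by inserting one row per counted item.
sample = [("BOB", "MUGS", 4), ("BOB", "PENCILS", 10), ("CAT", "MUGS", 2),
          ("CAT", "PAPERCLIPS", 7), ("SAL", "MUGS", 11)]
for employee, item_type, n in sample:
    conn.executemany("INSERT INTO hr_database VALUES (?, ?)",
                     [(employee, item_type)] * n)

# The database only aggregates and sorts.
rows = conn.execute("""
    SELECT employee, item_type, COUNT(item_type)
    FROM hr_database
    GROUP BY employee, item_type
    ORDER BY employee, item_type
""").fetchall()

# The application inserts a blank line whenever the employee changes.
lines = []
for i, (employee, group) in enumerate(groupby(rows, key=lambda r: r[0])):
    if i > 0:
        lines.append("")
    lines.extend(f"{emp} {item} {cnt}" for emp, item, cnt in group)

print("\n".join(lines))
```

Unlike the union-based queries, this puts blank lines only between employees, with none trailing after the last one.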

sql combining 2 queries with different order by group by

I have a query that counts the most frequent responses in a database and ranks them from highest to lowest, using GROUP BY and ORDER BY.
The following shows how to do it for one column:
select health, count(health) as count
from [Health].[Questionaire]
group by Health
order by count(Health) desc
which outputs the following:
Health Count
----------- -----
Very Good 6
Good 5
Poor 4
I would like to do the same for another column of the same table, combining the two queries into one SQL statement, with output like the following:
Health      Count  Diet       Count
----------- -----  ---------  -----
Very Good   6      Very Good  6
Good        5      Good       4
Poor        4      Poor       3
UPDATE!!
Hello, this is what the table looks like at the moment:
ID   Diet       Health
---  ---------  ---------
101  Very Good  Very Good
102  Poor       Good
103  Poor       Poor
I would like to do the same for another column of the same table, combining the two queries into one SQL statement, with output like the following:
Health      Count  Diet       Count
----------- -----  ---------  -----
Very Good   2      Very Good  1
Poor        1      Good       1
Good        0      Poor       1
Can anyone please help me out with this one?
Can provide further clarification if needed!
Here are 2 different ways of doing it. Notice that I removed the redundant column:
Test data:
DECLARE @t table(Health varchar(20), Diet varchar(20))
INSERT @t values
('Very good', 'Very good'),
('Poor', 'Good'),
('Poor', 'Poor')
Query 1:
;WITH CTE1 as
(
SELECT Health, count(*) CountHealth
FROM @t --[Health].[Questionaire]
GROUP BY health
), CTE2 as
(
SELECT Diet, count(*) CountDiet
FROM @t --[Health].[Questionaire]
GROUP BY Diet
)
SELECT
coalesce(Health, Diet) Grade,
coalesce(CountHealth, 0) CountHealth,
coalesce(CountDiet, 0) CountDiet
FROM CTE1
FULL JOIN
CTE2
ON CTE1.Health = CTE2.Diet
ORDER BY CountHealth DESC
Result 1:
Grade      CountHealth  CountDiet
Poor       2            1
Very good  1            1
Good       0            1
Mixing the results like that is really not good practice, so here is a different solution
Query 2:
SELECT Health, count(*) Count, 'Health' Grade
FROM @t --[Health].[Questionaire]
GROUP BY health
UNION ALL
SELECT Diet, count(*) CountDiet, 'Diet'
FROM @t --[Health].[Questionaire]
GROUP BY Diet
ORDER BY Grade, Count DESC
Result 2:
Health Count Grade
Good 1 Diet
Poor 1 Diet
Very good 1 Diet
Poor 2 Health
Very good 1 Health
You need to join the table to itself and, as your sample data shows, also deal with values that have no matching rows. Because both joins hang off the same value, each health match gets paired with each diet match, so count the distinct ids rather than the joined rows.
If you have a table that has the range of health/diet values:
select
v.value Status,
count(distinct a.id) healthCount,
count(distinct b.id) DietCount
from health_diet_values v
left join Questionaire a on a.health = v.value
left join Questionaire b on b.diet = v.value
group by v.value
or if you don't have such a table, you need to generate the list of values manually and join from that:
select
v.value Status,
count(distinct a.id) healthCount,
count(distinct b.id) DietCount
from (select 'Very Good' value union all
select 'Good' union all
select 'Poor') v
left join Questionaire a on a.health = v.value
left join Questionaire b on b.diet = v.value
group by v.value
Both of these queries produce zeroes if there is no matching data for the value.
Note that in your desired output you have a redundant column - you repeat the value column. The above queries produce output that looks like:
Status HealthCount DietCount
-------------------------------
Very Good 2 1
Good 1 1
Poor 0 1
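One thing to watch when the same table is LEFT JOINed twice on the same value: every health match is paired with every diet match, so a plain COUNT can over-count, while COUNT(DISTINCT ...) does not. A small sketch comparing the two on the three-row table from the question, run through Python's built-in sqlite3 module rather than SQL Server (an assumption made so it can be tried anywhere):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Questionaire (id INTEGER PRIMARY KEY, diet TEXT, health TEXT);
INSERT INTO Questionaire VALUES
    (101, 'Very Good', 'Very Good'),
    (102, 'Poor',      'Good'),
    (103, 'Poor',      'Poor');
""")

# {d} is replaced by either nothing or the DISTINCT keyword.
QUERY = """
SELECT v.value,
       COUNT({d} a.id) AS healthCount,
       COUNT({d} b.id) AS dietCount
FROM (SELECT 'Very Good' AS value UNION ALL
      SELECT 'Good' UNION ALL
      SELECT 'Poor') v
LEFT JOIN Questionaire a ON a.health = v.value
LEFT JOIN Questionaire b ON b.diet = v.value
GROUP BY v.value
ORDER BY v.value
"""

plain = conn.execute(QUERY.format(d="")).fetchall()
distinct = conn.execute(QUERY.format(d="DISTINCT")).fetchall()

print("plain:   ", plain)     # 'Poor' health is inflated by its two diet rows
print("distinct:", distinct)  # each id is counted once per column
```

Only one row has Poor health, but two rows have a Poor diet, so the plain version reports 2 for both columns on that line.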