Two tables:
Parts Table:
Part_Number  Load_Date  TQTY
m-123        19940102   32
1234Cf       20010809   3
wf9-2        20160421   14
Locations Table:
PartNo  Condition  Location  QTY
m-123   U          A02       2
1234Cf  S          A02       3
m-123   U          B01       1
wf9-2   S          A06       7
m-123   S          A18       29
wf9-2   U          F16       7
Result:
Part_Number  Load_Date  TQTY  U_LOC    UQTY  S_LOC  SQTY
m-123        19940102   32    A02,B01  3     A18    29
1234Cf       20010809   3                    A02    3
wf9-2        20160421   14    F16      7     A06    7
I am having trouble finding a solution to this with my current DB2 version. I am not completely sure how to find the version, but it is running on an AS400 system, and it seems the DB2 version is tied to the OS version, which on this box is: Operating system: i5/OS, Version: V5R4M0.
(I tried some commands to get the DB2 version using the suggestions Here, but none of them worked, as most stated.)
As for concatenating multiple rows of column data into one row, I have come across many articles saying to use XMLAGG or XMLSERIALIZE (Here and Here), but I get an error stating the command is not recognized.
I am not sure where to go from here: there seem to be solutions, but I can't get the suggested functions to work.
EDIT:
I used the accepted answer and its explanation, along with the example HERE to get a basic idea of recursion from a simple case. The example HERE, which uses "SELECT rownumber() over(partition by category)" statements, is what really helped pull it all together, once I understood that statement of course.
I also learned to make sure the data used in the recursion is as narrowed down as possible and then joined up with extra data later. This makes for much faster results. <-- This seems pretty obvious, but when trying to figure all of this out, it wasn't obvious and my query was pretty slow. Once I understood better what was actually happening, it was easier to make adjustments for really fast results.
This is rather complicated, so I will show all my work:
Table definitions
create table parts
(part_number Varchar(64),
load_date Date,
total_qty Dec(5,0));
create table locations
(part_number Varchar(64),
condition Char(1),
location Char(3),
qty Dec(5,0));
insert into parts
values ('m-123', '1994-01-02', 32),
('1234Cf', '2001-08-09', 3),
('wf9-2', '2016-04-21', 14);
insert into locations
values ('m-123', 'U', 'A02', 2),
('1234Cf', 'S', 'A02', 3),
('m-123', 'U', 'B01', 1),
('wf9-2', 'S', 'A06', 7),
('m-123', 'S', 'A18', 29),
('wf9-2', 'U', 'F16', 7);
The query:
with -- CTE's
-- This collects locations into a comma separated list
tmp (part_number, condition, location, csv, level) as (
select part_number, condition, min(location),
cast(min(location) as varchar(128)), 1
from locations
group by part_number, condition
union all
select a.part_number, a.condition, b.location,
a.csv || ',' || b.location, a.level + 1
from tmp a
join locations b using (part_number, condition)
where a.csv not like '%' || b.location || '%'
and b.location > a.location),
-- This chooses the correct csv list, and adds quantity for the condition
tmp2 (part_number, condition, csv, qty) as (
select t.part_number, t.condition, t.csv,
(select sum(qty) qty
from locations
where part_number = t.part_number
and condition = t.condition)
from tmp t
where level = (select max(level)
from tmp
where part_number = t.part_number
and condition = t.condition))
-- This is the final select that combines the parts file with
-- the second stage CTE and arranges things horizontally by condition
select p.part_number, p.load_date,
(select sum(qty)
from locations
where part_number = p.part_number) as total_qty,
coalesce(u.csv, '') as u_loc,
coalesce(u.qty, 0) as uqty,
coalesce(s.csv, '') as s_loc,
coalesce(s.qty, 0) as sqty
from parts p
left outer join tmp2 u
on u.part_number = p.part_number and u.condition = 'U'
left outer join tmp2 s
on s.part_number = p.part_number and s.condition = 'S'
order by p.load_date;
EDIT: I have had to add some extra bits in here to support more than two locations for a part/condition, and I have made the column naming in the CTEs more consistent.
OK, so let me explain this a bit. There are 3 parts to this query: 2 CTEs and the main query. You can see the three parts are separated by comments.
The first CTE is a recursive CTE. Its purpose is to produce the comma separated location list. You should be able to run the select by itself to see just what it does. tmp is the table name; part_number, condition, location, csv, and level are the column names. A recursive CTE needs a SELECT to prime the CTE and a UNION ALL with a SELECT that fills in the next details. In this case the priming SELECT retrieves a part number, a condition, and the first location (alphabetically) for that combination. level is set to 1. If you run just the priming select, you will get:
part_number condition location csv level
----------- --------- -------- --- -----
1234Cf      S         A02      A02 1
m-123       S         A18      A18 1
m-123       U         A02      A02 1
wf9-2       U         F16      F16 1
wf9-2       S         A06      A06 1
Note one line per part/condition. The remainder of the recursive CTE will fill in the remaining locations in csv, but it will actually add additional records, so we need to filter the results here and later. Records are processed as they are added: the first rows listed above are joined with the locations file on part_number and condition. Note that in the priming select I cast the second min(location) to a varchar(128). This leaves room for the CSV column to expand. Without this, it will still expand, but not enough to hold more than 2 locations.
The second select in the recursive CTE concatenates a comma and the next location to the end of CSV. The specific bit that does this is a.csv || ',' || b.location. It also increments the level column. This helps us keep track of where we are in the query. Eventually, the row with the highest level is the one we want to use.
We also have a way to end the recursion, and some filters to reduce the number of rows added to the temporary result set. If we have 2 locations, A02 and B02, and leave the recursion unchecked, we will get the following rows: A02, A02,A02, A02,B02, A02,A02,A02, A02,B02,A02, A02,A02,B02, A02,B02,B02, ... ad infinitum. The anti-duplication filter where a.csv not like '%' || b.location || '%' is sufficient to end the recursion and minimize rows for two locations: for locations A02 and B02, with the anti-duplication filter, we get only rows A02 and A02,B02. None of the rows with duplicate locations from the unchecked example are returned.
Adding a third location C02 will yield, with the anti-duplication filter, the following rows: A02, A02,B02, A02,C02, A02,B02,C02, A02,C02,B02. No duplicates here, but we do have redundant rows, and as you add locations, it gets worse. So we need a way to prevent these redundant rows. Since we are starting with the lowest location, we can always make sure that locations added to CSV are greater than the previously added location. To do that, all we need is a column in the result set that indicates which location was added last (we could interrogate CSV, but that is harder). This is why we need the location column in tmp. Then we can write the filter b.location > a.location. In the above 3 location example, this filter prevents row A02,C02,B02, leaving just a single row with all three locations.
Adding more than three locations to the locations file will cause the number of rows to expand even more in TMP, but for each part and condition, there will only be one row with all locations, and it will contain all locations in ascending order.
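To see the effect of the two filters described above, here is a minimal sketch that runs the recursive CTE against SQLite from Python instead of DB2 (an assumption made only so the example is runnable anywhere). The part number p1, the three locations, and the helper csv_rows are hypothetical test data and names, not part of the original query.

```python
import sqlite3

# Hypothetical data: one part/condition with three locations.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table locations (part_number text, condition text, location text, qty integer)"
)
conn.executemany(
    "insert into locations values (?, ?, ?, ?)",
    [("p1", "U", "A02", 1), ("p1", "U", "B02", 1), ("p1", "U", "C02", 1)],
)


def csv_rows(extra_filter):
    """Run the recursive CTE with the anti-duplication filter plus an optional extra filter."""
    sql = f"""
    with recursive tmp (part_number, condition, location, csv, level) as (
        select part_number, condition, min(location), min(location), 1
          from locations
         group by part_number, condition
        union all
        select a.part_number, a.condition, b.location,
               a.csv || ',' || b.location, a.level + 1
          from tmp a
          join locations b
            on b.part_number = a.part_number and b.condition = a.condition
         where a.csv not like '%' || b.location || '%'
           {extra_filter})
    select csv from tmp order by level, csv
    """
    return [row[0] for row in conn.execute(sql)]


# Anti-duplication filter only: redundant orderings still appear.
only_anti_dup = csv_rows("")
# Adding the ordering filter leaves one row per level.
both_filters = csv_rows("and b.location > a.location")

print(only_anti_dup)
print(both_filters)
```

With only the anti-duplication filter, the row A02,C02,B02 shows up alongside A02,B02,C02; with the ordering filter added, only the ascending list survives at the highest level.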
The second CTE does two things. First, it filters TMP to drop all but the rows containing all locations for a given part/condition. Second, it accumulates the total quantity for each part/condition.
The bit that performs the filtering is in the where clause:
where level = (select max(level)
from tmp
where part_number = t.part_number
and condition = t.condition)
Pretty straightforward. The bit that accumulates the total quantity for a part/condition is also an easy to understand sub-query:
(select sum(qty) qty
from locations
where part_number = t.part_number
and condition = t.condition)
The final piece of this monster query is the main select. It joins the parts file with the results of the second CTE to form the ultimate result set:
select p.part_number, p.load_date,
(select sum(qty) from locations where part_number = p.part_number) as total_qty,
coalesce(u.csv, '') as u_loc, coalesce(u.qty, 0) as uqty,
coalesce(s.csv, '') as s_loc, coalesce(s.qty, 0) as sqty
from parts p
left outer join tmp2 u
on u.part_number = p.part_number and u.condition = 'U'
left outer join tmp2 s
on s.part_number = p.part_number and s.condition = 'S'
order by p.load_date
Bits of note: first, the subquery that retrieves the total quantity from the locations table. You could use the total_qty column in parts (TQTY in the question), but that can get out of sync with the actual quantities in the locations table. In addition, there are two left outer joins with tmp2, one for condition U and another for condition S. These construct the horizontal array of Location/Quantity in the result row. The last thing is the coalesce functions. These give null values (when a result from an outer join is missing) a default value.
End of EDIT
The final result is:
part_number load_date  tqty u_loc   uqty s_loc sqty
----------- ---------- ---- ------- ---- ----- ----
m-123       1994-01-02   32 A02,B01    3 A18     29
1234Cf      2001-08-09    3            0 A02      3
wf9-2       2016-04-21   14 F16        7 A06      7
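For anyone who wants to try the whole thing without an IBM i box, here is a rough sketch that runs essentially the same query against SQLite from Python. The use of SQLite, the text/integer column types, the explicit "recursive" keyword, and the subquery aliases l and x are all assumptions of this sketch; DB2 for i may behave differently in the details.

```python
import sqlite3

# Sample data from the question, loaded into an in-memory SQLite database
# (SQLite stands in for DB2 here; column types are simplified).
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table parts (part_number text, load_date text, total_qty integer);
create table locations (part_number text, condition text, location text, qty integer);
insert into parts values
  ('m-123', '1994-01-02', 32), ('1234Cf', '2001-08-09', 3), ('wf9-2', '2016-04-21', 14);
insert into locations values
  ('m-123', 'U', 'A02', 2), ('1234Cf', 'S', 'A02', 3), ('m-123', 'U', 'B01', 1),
  ('wf9-2', 'S', 'A06', 7), ('m-123', 'S', 'A18', 29), ('wf9-2', 'U', 'F16', 7);
""")

QUERY = """
with recursive
-- collects locations into a comma separated list
tmp (part_number, condition, location, csv, level) as (
  select part_number, condition, min(location), min(location), 1
    from locations
   group by part_number, condition
  union all
  select a.part_number, a.condition, b.location,
         a.csv || ',' || b.location, a.level + 1
    from tmp a
    join locations b
      on b.part_number = a.part_number and b.condition = a.condition
   where a.csv not like '%' || b.location || '%'
     and b.location > a.location),
-- keeps the complete list and adds the quantity per part/condition
tmp2 (part_number, condition, csv, qty) as (
  select t.part_number, t.condition, t.csv,
         (select sum(l.qty) from locations l
           where l.part_number = t.part_number and l.condition = t.condition)
    from tmp t
   where t.level = (select max(x.level) from tmp x
                     where x.part_number = t.part_number
                       and x.condition = t.condition))
select p.part_number, p.load_date,
       (select sum(l.qty) from locations l
         where l.part_number = p.part_number) as total_qty,
       coalesce(u.csv, '') as u_loc, coalesce(u.qty, 0) as uqty,
       coalesce(s.csv, '') as s_loc, coalesce(s.qty, 0) as sqty
  from parts p
  left outer join tmp2 u on u.part_number = p.part_number and u.condition = 'U'
  left outer join tmp2 s on s.part_number = p.part_number and s.condition = 'S'
 order by p.load_date
"""

result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

The rows printed should match the result table above, with an empty string and 0 in the U columns for 1234Cf.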
Note XMLAGG and XMLSERIALIZE became available at DB2 for i v7.1 and LISTAGG became available at DB2 for i v7.2. Most recent version as of 8/9/2017 is v7.3. As you are on v5r4, it is likely you will need not only a software, but also a hardware upgrade to get current.
No idea what the rules are for UQTY, S_LOC, SQTY, but here is the column you asked about:
SELECT
P.Part_Number,
P.Load_Date,
P.TQTY,
LISTAGG(L.Location, ', ') WITHIN GROUP (ORDER BY L.Location) AS U_LOC
FROM "Parts Table" AS P
-- join on the Locations column name (PartNo) and keep only condition U rows
LEFT JOIN "Locations Table" AS L ON P.Part_Number = L.PartNo AND L.Condition = 'U'
GROUP BY P.Part_Number, P.Load_Date, P.TQTY
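As a guess at how the remaining columns could be filled in with the same aggregate approach, here is a sketch that puts a CASE expression inside each aggregate. It uses SQLite's group_concat from Python as a stand-in for LISTAGG (an assumption made so it can be run; the sketch does not show whether LISTAGG accepts the same pattern on a given DB2 release). The table names Parts and Locations are simplified from the question.

```python
import sqlite3

# Question's sample data; SQLite's group_concat stands in for LISTAGG here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Parts (Part_Number text, Load_Date text, TQTY integer);
create table Locations (PartNo text, Condition text, Location text, QTY integer);
insert into Parts values
  ('m-123', '19940102', 32), ('1234Cf', '20010809', 3), ('wf9-2', '20160421', 14);
insert into Locations values
  ('m-123', 'U', 'A02', 2), ('1234Cf', 'S', 'A02', 3), ('m-123', 'U', 'B01', 1),
  ('wf9-2', 'S', 'A06', 7), ('m-123', 'S', 'A18', 29), ('wf9-2', 'U', 'F16', 7);
""")

SQL = """
select p.Part_Number, p.Load_Date, p.TQTY,
       -- the CASE yields NULL for other conditions; the aggregates skip NULLs
       coalesce(group_concat(case when l.Condition = 'U' then l.Location end), '') as U_LOC,
       coalesce(sum(case when l.Condition = 'U' then l.QTY end), 0) as UQTY,
       coalesce(group_concat(case when l.Condition = 'S' then l.Location end), '') as S_LOC,
       coalesce(sum(case when l.Condition = 'S' then l.QTY end), 0) as SQTY
  from Parts p
  left join Locations l on l.PartNo = p.Part_Number
 group by p.Part_Number, p.Load_Date, p.TQTY
 order by p.Load_Date
"""

rows = conn.execute(SQL).fetchall()
for row in rows:
    print(row)
```

Note that group_concat called this way does not promise any order within the list, which is one thing LISTAGG's WITHIN GROUP (ORDER BY ...) clause handles explicitly.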
I have the table below:
id name total
1 a 2
2 b 3
3 c,d,e,f 15
Expected output:
id name total
1 a 2
2 b 3
3 c 15
4 d 15
5 e 15
6 f 15
I tried a split function and also XML, but neither worked.
As you don't specify the database, I am assuming SQL Server. You can try this one.
Working Example
SELECT A.[id],
Split.a.value('.', 'VARCHAR(100)') AS String,A.total
FROM (SELECT [id],
CAST ('<M>' + REPLACE([name], ',', '</M><M>') + '</M>' AS XML) AS String ,
[total]
FROM #t) AS A
CROSS APPLY String.nodes ('/M') AS Split(a);
Refer to this article
Which version of SQL are you using?
The split function is for splitting a string of text, but what you are requesting is a change to the format of the table itself.
Your table has a tuple of id=3, name=c,d,e,f, total=15.
If you want id=3, name=c and so on, you have to change the data.
From the way your question is phrased, it implies that you want the data to be presented in a different way, but the id is the defining column which differentiates between rows in the database.
You could automatically generate a new table, in which case the split statement would be useful to get each element out of your comma separated record.
Once you have that list of items, assuming your id field is an identity field (auto incrementing), you could run an insert statement for each element.
You might be able to get the sort of output you're looking for using an inner select that splits the comma separated list of values, but you would need some procedural SQL (or T-SQL... you do not specify your SQL server) to iterate over the values and insert them into a new table.
If you do go down this route, the id values will have to be thrown away, and you would treat the list as just a raw data set.
EDIT: The example posted by Have No Display Name is about as close as you're going to get with the data in the form it is.
The IDs for the names 'c','d','e' and 'f' will all be 3, but your format will be very close.
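To make the two options concrete (keep the original id on every split row, or throw the ids away and renumber), here is a small sketch in plain Python rather than SQL. The function name split_rows and the in-memory list are illustrative assumptions; in practice the rows would come from the table.

```python
# Rows from the question as (id, name, total) tuples.
rows = [(1, "a", 2), (2, "b", 3), (3, "c,d,e,f", 15)]


def split_rows(rows):
    """Yield one (id, name, total) row per comma-separated name, keeping the original id."""
    for row_id, names, total in rows:
        for name in names.split(","):
            yield (row_id, name.strip(), total)


# Option 1: ids stay as they are, so c, d, e and f all share id 3.
expanded = list(split_rows(rows))

# Option 2: discard the original ids and number the rows from 1,
# as a fresh identity column would do on insert into a new table.
renumbered = [
    (new_id, name, total)
    for new_id, (_, name, total) in enumerate(expanded, start=1)
]

print(expanded)
print(renumbered)
```

The first list corresponds to what the XML-based query returns; the second is what you would get after inserting the split values into a new table with an auto-incrementing id.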
I have the following query that generates my pivot results:
SELECT * FROM
(
SELECT
#tmp1.Name,
DATEDIFF(D,#tmp1.AuthDate,#tmp1.AuthExpirationDate) AS AuthLenInDays,
#tmp1.NbrOfAuthorizations,
#tmp1.MODE
FROM #tmp1
LEFT JOIN #tmp2
ON #tmp2.AuthID = #tmp1.AuthID
GROUP BY #tmp1.Name, #tmp1.NbrOfAuthorizations, #tmp1.AuthDate, #tmp1.AuthExpirationDate, #tmp1.MODE
) AS InnerTbl
PIVOT
(AVG(AuthLenInDays) FOR [MODE] IN ([Preservation])
) PivotResults1
The results are as follows:
Name NbrOfAuthorizations Preservation
Centro 1 79
Dennis 1 92
Therapy Center 1 68
Florez 1 92
I have two problems that I have not been able to figure out. I've tried everything I can think of, and even other suggestions from Stack Overflow.
1) I can't figure out how to change the name of the right-most column (Preservation) in my results. It's an average number, so I'd like to label that column 'Average'.
2) Also, the NbrOfAuthorizations needs to be summed for all the values in the table. I have tried using a pivot, and this gets me close but not all the way there. I have also tried using a SUM in the InnerTbl query, but that isn't it either.
If I take my raw data and export that to excel and do a pivot there, I can see the numbers and what I should be getting. I am trying to take that process and do it purely in SQL. Based on the data in the table, the values for the SUM should be
Name NbrOfAuthorizations Preservation
Centro 5 79
Dennis 1 92
Therapy Center 57 68
Florez 1 92
Any masters of pivot out there?
Looks like you don't need pivot at all:
select
t1.Name,
sum(t1.NbrOfAuthorizations) as NbrOfAuthorizations,
avg(datediff(dd, t1.AuthDate, t1.AuthExpirationDate)) as AuthLenInDays
from #tmp1 as t1
-- it looks like you don't need the join either, unless there are multiple rows
-- in #tmp2 for each row in #tmp1
-- left outer join #tmp2 as t2 on t2.AuthID = t1.AuthID
where t1.mode = 'Preservation'
group by t1.Name
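Here is a small runnable sketch of the same idea (plain GROUP BY with SUM and AVG, no PIVOT), using SQLite from Python instead of SQL Server. The table name auths, the sample rows, and the julianday arithmetic standing in for DATEDIFF are all assumptions made up for illustration; they are not the asker's data. The alias Average shows one way to get the column label asked for.

```python
import sqlite3

# Hypothetical rows standing in for #tmp1.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table auths (name text, auth_date text, auth_expiration_date text,"
    " nbr_of_authorizations integer, mode text)"
)
conn.executemany(
    "insert into auths values (?, ?, ?, ?, ?)",
    [
        ("Centro", "2013-01-01", "2013-03-21", 2, "Preservation"),
        ("Centro", "2013-02-01", "2013-04-21", 3, "Preservation"),
        ("Dennis", "2013-01-01", "2013-04-03", 1, "Preservation"),
        ("Dennis", "2013-01-01", "2013-01-11", 4, "Other"),  # filtered out below
    ],
)

SQL = """
select name,
       sum(nbr_of_authorizations) as NbrOfAuthorizations,
       -- julianday difference plays the role of DATEDIFF in days
       avg(cast(julianday(auth_expiration_date) - julianday(auth_date) as integer))
           as Average
  from auths
 where mode = 'Preservation'
 group by name
 order by name
"""

cursor = conn.execute(SQL)
column_names = [d[0] for d in cursor.description]
summary = cursor.fetchall()
print(column_names)
print(summary)
```

Each name ends up on one row, with the authorizations summed across its rows and the average length labelled Average.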
I have these records coming from my stored procedure, which I am calling in LINQ to SQL:
int_PostTypeId vcr_PostType int_PostTypeId_fk vcr_Slug HLevel
49 c 36 c 1
77 e 49 c/e 2
78 f 77 c/e/f 3
79 g 77 c/e/g 3
Suppose that while editing int_PostTypeId 49 I change the slug to c1.
1) Now the slug in the child records also ought to be changed:
slug in 77 will become c1/e
slug in 78 will become c1/e/f
slug in 79 will become c1/e/g
2) If I edit record 77 and change the slug to c/e2, then the slugs of 78 and 79 should also change to c/e2/f and c/e2/g.
So editing the slug in a record should change the child slugs if any exist. What is the most appropriate and efficient way of doing this in LINQ? I am taking the recursive loop path, but I think that is highly inefficient. Any ideas for a more general approach, or any other approach?
If I'm understanding you correctly, you're actually updating a column called vcr_Slug in a record somewhere, rather than building the column's value in your stored procedure. Since you're actually using a stored procedure, why not calculate the column's value? I'm not sure if you're using a recursive common table expression to select your results, but if you are, it could take the form of something like the following (making some assumptions about your table structure which may not, of course, be completely representative):
CREATE FUNCTION dbo.GetPostTypes(@ParentTypeID int)
RETURNS TABLE
AS RETURN
(
WITH CurrentPostTypes(int_PostTypeId, vcr_PostType, int_PostTypeId_fk, vcr_Slug,
HLevel)
AS
(
-- Anchor member definition
SELECT pt.int_PostTypeId, pt.vcr_PostType, pt.int_PostTypeId_fk,
pt.vcr_Slug, 1 AS HLevel
FROM dbo.tblPostTypes AS pt
WHERE pt.int_PostTypeId = @ParentTypeID
UNION ALL
-- Recursive member definition
SELECT pt.int_PostTypeId, pt.vcr_PostType, pt.int_PostTypeId_fk,
cpt.vcr_Slug + '/' + pt.vcr_Slug AS vcr_Slug, cpt.HLevel + 1 AS HLevel
FROM dbo.tblPostTypes AS pt
INNER JOIN CurrentPostTypes AS cpt
ON pt.int_PostTypeId_fk = cpt.int_PostTypeId
)
SELECT *
FROM CurrentPostTypes
)
You'll notice in the recursive member definition where the previous value of vcr_Slug is suffixed with a slash and the current record's column value: cpt.vcr_Slug + '/' + pt.vcr_Slug AS vcr_Slug
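A quick way to convince yourself that computing the slug beats storing it is the sketch below, which uses SQLite from Python rather than SQL Server. It assumes each row stores only its own slug segment (c, e, f, g) rather than the full path; the table name post_types, its column names, and the helper full_slugs are hypothetical.

```python
import sqlite3

# Assumption: each row holds only its own slug segment, not the full path.
conn = sqlite3.connect(":memory:")
conn.execute("create table post_types (id integer, post_type text, parent_id integer, slug text)")
conn.executemany(
    "insert into post_types values (?, ?, ?, ?)",
    [(49, "c", 36, "c"), (77, "e", 49, "e"), (78, "f", 77, "f"), (79, "g", 77, "g")],
)


def full_slugs(root_id):
    """Return {id: full slug} for root_id and all of its descendants."""
    sql = """
    with recursive cur (id, slug, hlevel) as (
        -- anchor member: the starting record
        select id, slug, 1 from post_types where id = ?
        union all
        -- recursive member: parent's path, a slash, then the child's own segment
        select pt.id, cur.slug || '/' || pt.slug, cur.hlevel + 1
          from post_types pt
          join cur on pt.parent_id = cur.id)
    select id, slug from cur order by id
    """
    return dict(conn.execute(sql, (root_id,)).fetchall())


before = full_slugs(49)

# Renaming the parent is a single-row update; children pick it up on the next read.
conn.execute("update post_types set slug = 'c1' where id = 49")
after = full_slugs(49)

print(before)
print(after)
```

Only record 49 is updated, yet the paths for 77, 78 and 79 change too, because they are rebuilt from the parent chain every time they are selected.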
Currently I have a query which is partly based on a join on two tables according to two number columns within them.
Say one table has a number like 123456789999 (NUM1)
And the other table has a number ranging from 1 - 9999 (NUM2)
I want to pull out the records which have 'NUM2' within the 5th - 8th digits of 'NUM1'
Currently I am doing something like this,
FROM Table1 AS T INNER JOIN Table2 AS S
ON SUBSTRING(T.num1, 5, 4) = S.num2
I know it should be retrieving approximately 100 records, but I only get 8. I believe this is because of the small values in NUM2. Where have I gone wrong? Or how could my code be made more robust/effective?
You need to use CAST like this:
FROM Table1 AS T INNER JOIN Table2 AS S
ON CAST(SUBSTRING(T.num1, 5, 4) AS INT) = S.num2
SEE THIS FIDDLE
For more info see SQL SERVER – Convert Text to Numbers (Integer) – CAST and CONVERT
Since the datatype of NUM2 is int, 0001 is treated as just 1, so try this:
FROM Table1 AS T INNER JOIN Table2 AS S
ON cast(SUBSTRING(T.num1, 5, 4) as int) = S.num2
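The leading-zero effect is easy to see outside the database. The sketch below uses plain Python, with made-up numbers, and assumes the original join ended up comparing the two values as text; middle_digits is a hypothetical helper that mimics SUBSTRING(num1, 5, 4).

```python
# Made-up sample values: NUM1 is a 12-digit number, NUM2 ranges from 1 to 9999.
table1 = [123400129999, 123499999999, 123400019999]
table2 = [1, 12, 9999]


def middle_digits(num1):
    """Return digits 5 through 8 of num1 as text, like SUBSTRING(num1, 5, 4)."""
    return str(num1)[4:8]


# Comparing as text: '0012' is not '12', so only four-digit NUM2 values match.
text_matches = [
    (n1, n2) for n1 in table1 for n2 in table2 if middle_digits(n1) == str(n2)
]

# Comparing as integers (the CAST in the answers): leading zeros no longer matter.
int_matches = [
    (n1, n2) for n1 in table1 for n2 in table2 if int(middle_digits(n1)) == n2
]

print(text_matches)
print(int_matches)
```

Under that assumption, a text comparison only finds the rows where NUM2 happens to have four digits, which would explain getting a handful of matches instead of the expected number.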