Renumbering a summary section field starting from "1" in SQL Server

I have the following input table. I need a smart way to dynamically renumber the parent section indexes starting from "01" and show them in a new column.
I'm using SQL Server 2014 Express SP2.
MyTable:
ID Integer
SECTION Varchar
Query:
SELECT * FROM MyTable
Results:
+--+--------+
|ID|SECTION |
+--+--------+
|1 |03 |
|2 |03.01 |
|3 |03.01.01|
|4 |03.02 |
|5 |03.03 |
|6 |04 |
|7 |04.01 |
|8 |04.02 |
|9 |05 |
+--+--------+
Here is what I'm trying to achieve from my select or procedure:
+--+--------+--------+
|ID|SECTION |NEWSECT |
+--+--------+--------+
|1 |03 |01 |
|2 |03.01 |01.01 |
|3 |03.01.01|01.01.01|
|4 |03.02 |01.02 |
|5 |03.03 |01.03 |
|6 |04 |02 |
|7 |04.01 |02.01 |
|8 |04.02 |02.02 |
|9 |05 |03 |
+--+--------+--------+

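This is just string operations:
select t.*,
       stuff(section, 1, 2,
             right(concat('00', dense_rank() over (order by left(section, 2))), 2)
            ) as newsect
from t;
The dense_rank() does the work of renumbering the main sections: it ranks the distinct two-character prefixes (03, 04, 05) as 1, 2, 3. The rest just zero-pads that rank to two digits and uses stuff() to put it in place of the first two characters of section.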
Here is a db<>fiddle.
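The same idea can be tried without a SQL Server instance. This is only a sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in: SQLite lacks stuff() and right(), so printf() and substr() are swapped in for the padding and the splice, while dense_rank() plays the same role as above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER, SECTION TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [(1, "03"), (2, "03.01"), (3, "03.01.01"), (4, "03.02"), (5, "03.03"),
     (6, "04"), (7, "04.01"), (8, "04.02"), (9, "05")],
)

# printf('%02d', ...) stands in for right(concat('00', ...), 2);
# substr(SECTION, 3) keeps everything after the two-character prefix,
# which is what stuff(section, 1, 2, ...) leaves untouched.
renumbered = conn.execute(
    """
    SELECT ID, SECTION,
           printf('%02d', DENSE_RANK() OVER (ORDER BY substr(SECTION, 1, 2)))
             || substr(SECTION, 3) AS NEWSECT
    FROM MyTable
    ORDER BY ID
    """
).fetchall()

for row in renumbered:
    print(row)
```

Like the original query, this assumes the top-level prefix is always exactly two characters wide.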

Postgres - How to achieve UNION behaviour with UNION ALL?

I have a table with parent and child ids.
create table if not exists stack (
parent int,
child int
)
Each parent can have multiple children and each child can have multiple children again.
insert into stack (parent, child) values
(1,2),
(2,3),
(3,4),
(4,5),
(5,6),
(6,7),
(7,8),
(8,9),
(9,null),
(1,7),
(7,8),
(8,9),
(9,null);
The data looks like this.
|parent|child|
|------|-----|
|1 |2 |
|2 |3 |
|3 |4 |
|4 |5 |
|5 |6 |
|6 |7 |
|7 |8 |
|8 |9 |
|9 |NULL |
|1 |7 |
|7 |8 |
|8 |9 |
|9 |NULL |
I'd like to find all children. I can use a recursive cte with a UNION.
with recursive cte as (
select
child
from
stack
where
stack.parent = 1
union
select
stack.child
from
cte
left join stack on
cte.child = stack.parent
where
cte.child is not null
)
select * from cte;
This gives me the result I'd like to achieve.
|child|
|-----|
|2 |
|7 |
|3 |
|8 |
|4 |
|9 |
|5 |
|NULL |
|6 |
However I'd like to include the depth / level and also the path for each node. I can do this using a different recursive cte.
with recursive cte as (
select
parent,
child,
0 as level,
array[parent,
child] as path
from
stack
where
stack.parent = 1
union all
select
stack.parent,
stack.child,
cte.level + 1,
cte.path || stack.child
from
cte
left join stack on
cte.child = stack.parent
where
cte.child is not null
)
select * from cte;
That gives me this data.
|parent|child|level|path |
|------|-----|-----|--------------------|
|1 |2 |0 |{1,2} |
|1 |7 |0 |{1,7} |
|2 |3 |1 |{1,2,3} |
|7 |8 |1 |{1,7,8} |
|7 |8 |1 |{1,7,8} |
|3 |4 |2 |{1,2,3,4} |
|8 |9 |2 |{1,7,8,9} |
|8 |9 |2 |{1,7,8,9} |
|8 |9 |2 |{1,7,8,9} |
|8 |9 |2 |{1,7,8,9} |
|4 |5 |3 |{1,2,3,4,5} |
|9 | |3 |{1,7,8,9,} |
|9 | |3 |{1,7,8,9,} |
|9 | |3 |{1,7,8,9,} |
|9 | |3 |{1,7,8,9,} |
|9 | |3 |{1,7,8,9,} |
|9 | |3 |{1,7,8,9,} |
|9 | |3 |{1,7,8,9,} |
|9 | |3 |{1,7,8,9,} |
|5 |6 |4 |{1,2,3,4,5,6} |
|6 |7 |5 |{1,2,3,4,5,6,7} |
|7 |8 |6 |{1,2,3,4,5,6,7,8} |
|7 |8 |6 |{1,2,3,4,5,6,7,8} |
|8 |9 |7 |{1,2,3,4,5,6,7,8,9} |
|8 |9 |7 |{1,2,3,4,5,6,7,8,9} |
|8 |9 |7 |{1,2,3,4,5,6,7,8,9} |
|8 |9 |7 |{1,2,3,4,5,6,7,8,9} |
|9 | |8 |{1,2,3,4,5,6,7,8,9,}|
|9 | |8 |{1,2,3,4,5,6,7,8,9,}|
|9 | |8 |{1,2,3,4,5,6,7,8,9,}|
|9 | |8 |{1,2,3,4,5,6,7,8,9,}|
|9 | |8 |{1,2,3,4,5,6,7,8,9,}|
|9 | |8 |{1,2,3,4,5,6,7,8,9,}|
|9 | |8 |{1,2,3,4,5,6,7,8,9,}|
|9 | |8 |{1,2,3,4,5,6,7,8,9,}|
My problem is that I have a lot of duplicate data. I'd like to get the same result as the UNION query but with the level and the path.
I tried something like
where
cte.child is not null
and stack.parent not in (cte.parent)
or
where
cte.child is not null
and not exists (select parent from cte where cte.parent = stack.parent)
but the first does not change anything and the second returns an error.
ERROR: recursive reference to query "cte" must not appear within a subquery
Any ideas? Thank you very much!
Your problem is duplicate rows in the table. For instance, the table records twice that 8 is a direct child of 7. I suggest you remove the duplicate data and add a unique constraint on the (parent, child) pairs.
If you cannot do so for some reason, make the rows distinct in your query:
with recursive
good_stack as (select distinct * from stack)
,cte as
(
select
parent,
child,
0 as level,
array[parent,
child] as path
from good_stack
where good_stack.parent = 1
union all
select
good_stack.parent,
good_stack.child,
cte.level + 1,
cte.path || good_stack.child
from cte
left join good_stack on cte.child = good_stack.parent
where cte.child is not null and good_stack.child is not null
)
select * from cte;
Demo: https://dbfiddle.uk/?rdbms=postgres_13&fiddle=acb1d7a1a1d26c3fd9caf0e7dedc12b2
(You may also make the columns not nullable. The entries 9|null add no information. If the table were lacking these entries, 9 would still be without a child.)
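The effect of deduplicating the edges first can be checked with a small script. This is a sketch under assumptions, not the Postgres query itself: it runs on SQLite through Python's sqlite3 module, builds the path as comma-separated text because SQLite has no array type, and filters the NULL children out in good_stack, in line with the remark above that those rows add no information.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stack (parent INTEGER, child INTEGER)")
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8), (8, 9),
         (9, None), (1, 7), (7, 8), (8, 9), (9, None)]
conn.executemany("INSERT INTO stack VALUES (?, ?)", edges)

rows = conn.execute(
    """
    WITH RECURSIVE
      -- one row per distinct edge, ignoring the "no child" markers
      good_stack AS (
        SELECT DISTINCT parent, child FROM stack WHERE child IS NOT NULL
      ),
      cte(parent, child, level, path) AS (
        SELECT parent, child, 0, parent || ',' || child
        FROM good_stack
        WHERE parent = 1
        UNION ALL
        SELECT g.parent, g.child, cte.level + 1, cte.path || ',' || g.child
        FROM cte
        JOIN good_stack AS g ON g.parent = cte.child
      )
    SELECT parent, child, level, path FROM cte ORDER BY level, path
    """
).fetchall()

for row in rows:
    print(row)
```

Nodes 7, 8 and 9 still show up twice, because two different paths from 1 reach them; deduplicating the edges only removes the repeated rows within each path.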

How to fill a column to differentiate a set of rows from other rows in a group in Impala?

I have the following table in Impala.
|LogTime|ClientId|IsNewSession|
|1 |123 |1 |
|2 |123 | |
|3 |123 | |
|3 |666 |1 |
|4 |666 | |
|10 |123 |1 |
|23 |666 |1 |
|24 |666 | |
|25 |444 |1 |
|26 |444 | |
I want to make a new table as follows:
|LogTime|ClientId|IsNewSession|SessionId|
|1 |123 |1 |1 |
|2 |123 | |1 |
|3 |123 | |1 |
|3 |666 |1 |1 |
|4 |666 | |1 |
|10 |123 |1 |2 |
|23 |666 |1 |2 |
|24 |666 | |2 |
|25 |444 |1 |1 |
|26 |444 | |1 |
Basically, I want a SessionId column that numbers the sessions within each ClientId: the value stays the same across consecutive rows and goes up by one each time IsNewSession is 1.
I've made the IsNewSession column for this purpose, but I'm not sure how to iterate over the rows to build the SessionId column.
Any help would be greatly appreciated!
You can use a cumulative sum:
select t.*,
sum(isnewsession) over (partition by clientid order by logtime) as sessionid
from t;
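To see why the cumulative sum works, the query can be run against the sample rows. The sketch below assumes SQLite (via Python's sqlite3 module) instead of Impala, and assumes the blank IsNewSession cells are NULL, which SUM() ignores.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (LogTime INTEGER, ClientId INTEGER, IsNewSession INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 123, 1), (2, 123, None), (3, 123, None), (3, 666, 1), (4, 666, None),
     (10, 123, 1), (23, 666, 1), (24, 666, None), (25, 444, 1), (26, 444, None)],
)

# Each 1 in IsNewSession bumps the running total for that client,
# so every row of a session carries the same SessionId.
sessions = conn.execute(
    """
    SELECT LogTime, ClientId, IsNewSession,
           SUM(IsNewSession) OVER (PARTITION BY ClientId ORDER BY LogTime) AS SessionId
    FROM t
    ORDER BY LogTime, ClientId
    """
).fetchall()

for row in sessions:
    print(row)
```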

Group rows based on column values in SQL / BigQuery

Is it possible to "group" rows within BigQuery/SQL depending on column values? Let's say I want to assign a string/id to all rows between stream_start_init and stream_start, and then do the same for the rows between stream_resume and the last stream_ad.
The number of stream_ad events can differ, so I can't use RANK() or ROW_NUMBER() to group them based on those values.
|id|timestamp|event            |
|1 |1231231  |first_visit      |
|2 |1231232  |login            |
|3 |1231233  |page_view        |
|4 |1231234  |page_view        |
|5 |1231235  |stream_start_init|
|6 |1231236  |stream_ad        |
|7 |1231237  |stream_ad        |
|8 |1231238  |stream_ad        |
|9 |1231239  |stream_start     |
|6 |1231216  |stream_resume    |
|6 |1231236  |stream_ad        |
|7 |1231217  |stream_ad        |
|8 |1231258  |stream_ad        |
|10|1231240  |page_view        |
How I wish the table to be
|id|timestamp|event            |group_id|
|1 |1231231  |first_visit      |null    |
|2 |1231232  |login            |null    |
|3 |1231233  |page_view        |null    |
|4 |1231234  |page_view        |null    |
|5 |1231235  |stream_start_init|group_1 |
|6 |1231236  |stream_ad        |group_1 |
|7 |1231237  |stream_ad        |group_1 |
|8 |1231238  |stream_ad        |group_1 |
|9 |1231239  |stream_start     |group_1 |
|6 |1231216  |stream_resume    |group_2 |
|6 |1231236  |stream_ad        |group_2 |
|7 |1231217  |stream_ad        |group_2 |
|8 |1231258  |stream_ad        |group_2 |
|10|1231240  |page_view        |null    |
I wouldn't assign a string. I would assign a number. This appears to be a cumulative sum. I think a sum of the number of "stream_start_init" and "stream_resume" does what you want:
select t.*,
countif(event in ('stream_start_init', 'stream_resume')) over (order by timestamp) as group_id
from t;
Note that this produces 0 for the rows before the first group, which seems like a good thing. You can convert that to a NULL using NULLIF().
If you really want strings, you can use CONCAT().
Below is for BigQuery Standard SQL
#standardSQL
SELECT *,
IF(event IN ('stream_start_init', 'stream_start', 'stream_resume', 'stream_ad'),
COUNTIF(event IN ('stream_start_init', 'stream_resume')) OVER(ORDER BY timestamp),
NULL
) AS group_id
FROM `project.dataset.table`
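The running-count idea can be exercised outside BigQuery. The sketch below makes several assumptions: it runs on SQLite through Python's sqlite3 module, it replaces COUNTIF with SUM over a 0/1 expression, and it uses a small hypothetical event list with strictly increasing timestamps (the column is named ts here) rather than the sample rows above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, ts INTEGER, event TEXT)")
# Hypothetical sample: two stream groups surrounded by unrelated events.
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(1, 1, "first_visit"), (2, 2, "login"), (3, 3, "page_view"),
     (4, 4, "stream_start_init"), (5, 5, "stream_ad"), (6, 6, "stream_start"),
     (7, 7, "stream_resume"), (8, 8, "stream_ad"), (9, 9, "stream_ad"),
     (10, 10, "page_view")],
)

# The inner SUM counts how many group-opening events have been seen so far;
# the CASE keeps that count only on stream rows and leaves NULL elsewhere.
grouped = conn.execute(
    """
    SELECT id, event,
           CASE WHEN event IN ('stream_start_init', 'stream_start',
                               'stream_resume', 'stream_ad')
                THEN SUM(event IN ('stream_start_init', 'stream_resume'))
                       OVER (ORDER BY ts)
           END AS group_id
    FROM events
    ORDER BY ts
    """
).fetchall()

for row in grouped:
    print(row)
```

Both queries depend on the ordering column reflecting the real event order, so out-of-order timestamps would have to be sorted out first.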

How to create roll-ups in Informix SQL?

I often see intermediate roll-ups like this in various reports:
|calltypename |rating |number |
+-----------------+----------------------------------------+-------+
|sales |1.0 |1 |
|sales |5.0 |2 |
| |3.666666666666666666666666666667 |3 |
|service |1.0 |1 |
|service |3.0 |1 |
|service |5.0 |3 |
|service |9.0 |1 |
| |4.666666666666666666666666666667 |6 |
Here the records are grouped by calltypename, and each group is followed by a roll-up row holding the average rating and the sum of the numbers.
Informix SQL has no ROLLUP operator, so I'm trying to achieve a similar result with UNION:
select calltypename, TO_NUMBER(datavalue) as rating, count(*) as number
from calldata
where datakey="qrate1"
group by calltypename, rating
union all
select calltypename, AVG(TO_NUMBER(datavalue)) as rating, count(*) as number
from calldata
where datakey="qrate1"
group by calltypename
order by calltypename, rating
It produces the following result:
|calltypename |rating |number |
+-----------------+----------------------------------------+-------+
|sales |1.0 |1 |
|sales |3.666666666666666666666666666667 |3 |
|sales |5.0 |2 |
|service |1.0 |1 |
|service |3.0 |1 |
|service |4.666666666666666666666666666667 |6 |
|service |5.0 |3 |
|service |9.0 |1 |
Is there any way to order the records so that each roll-up row always appears below its related group?
After some time, I found a solution that I do not like very much. The idea is to add a fake column "ROLLUP" that is used in the ORDER BY clause:
select calltypename as queue, "" as rollup,
datavalue as rating, count(*) as number
from calldata
where datakey="qrate1"
group by queue, rating
union all
select calltypename as queue, "rollup" as rollup,
TO_CHAR(AVG(TO_NUMBER(datavalue)),"*.*") as rating, count(*) as number
from calldata
where datakey="qrate1"
group by queue
order by queue, rollup, rating
This produces the following result:
|queue |rollup |rating|number |
+-------+----------+------+-------+
|sales | |1 |1 |
|sales | |5 |2 |
|sales |rollup |3.7 |3 |
|service| |1 |1 |
|service| |3 |1 |
|service| |5 |3 |
|service| |9 |1 |
|service|rollup |4.7 |6 |
But I would like it without ROLLUP column...
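One possible way to hide the helper column is to keep it inside a derived table and sort on it from an outer query that selects only the wanted columns. The sketch below is an assumption-laden illustration, not a tested Informix query: it runs on SQLite through Python's sqlite3 module, uses CAST in place of TO_NUMBER, names the flag is_rollup (a hypothetical name), and rebuilds sample rows from the counts shown above. Whether a given Informix version accepts a subquery in FROM with ORDER BY on a non-selected column would need to be checked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calldata (calltypename TEXT, datakey TEXT, datavalue TEXT)")
# Sample rows reconstructed from the counts in the report above.
ratings = [("sales", "1"), ("sales", "5"), ("sales", "5"),
           ("service", "1"), ("service", "3"), ("service", "5"),
           ("service", "5"), ("service", "5"), ("service", "9")]
conn.executemany(
    "INSERT INTO calldata VALUES (?, 'qrate1', ?)", ratings
)

# is_rollup exists only inside the derived table: the outer query sorts on it
# but does not return it, so roll-ups land below their group.
report = conn.execute(
    """
    SELECT calltypename, rating, number
    FROM (
        SELECT calltypename, 0 AS is_rollup,
               CAST(datavalue AS REAL) AS rating, COUNT(*) AS number
        FROM calldata
        WHERE datakey = 'qrate1'
        GROUP BY calltypename, datavalue
        UNION ALL
        SELECT calltypename, 1,
               AVG(CAST(datavalue AS REAL)), COUNT(*)
        FROM calldata
        WHERE datakey = 'qrate1'
        GROUP BY calltypename
    )
    ORDER BY calltypename, is_rollup, rating
    """
).fetchall()

for row in report:
    print(row)
```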

Pivoting Dates to Day of week

I'm currently trying to create a grid showing worked hours of employees.
Here's what my data look like (simplified) :
|ID |Client |Task |Hours |Date |
------------------------------------------
|1 |ABC |A |3 |09/06/2014|
|2 |ABC |A |5 |09/06/2014|
|3 |DEF |B |8 |10/06/2014|
|4 |DEF |C |8 |11/06/2014|
|5 |ABC |A |8 |12/06/2014|
And here's what the output must look like:
|Client |Task |Sun |Mon |Tue |Wed |Thu |Fri |Sat |
--------------------------------------------------
|ABC |A | |3 | | |8 | | |
|ABC |A | |5 | | | | | |
|DEF |B | | |8 | | | | |
|DEF |C | | | |8 | | | |
My problem is very close to this one. However, there's a major difference: in my case, the same Client-Task-Date combination can have multiple values.
As shown in the desired output, employees will sometimes separate their work hours even if they worked for the same client on the same task, and I can't aggregate because all the data shown in the grid will be interactive for the end user.
Is there a way to obtain such output using PIVOT or any other SQL mechanism such as CASE WHEN?
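WITH t AS (
    SELECT
        Client,
        Task,
        Hours,
        -- rn numbers the entries sharing a Client/Task/Date; because it stays in t,
        -- PIVOT groups by it implicitly and each entry keeps its own row
        ROW_NUMBER() OVER(PARTITION BY Client, Task, Date ORDER BY ID) rn,
        -- with the default SET DATEFIRST 7 (us_english), 1 = Sunday ... 7 = Saturday
        DATEPART(dw, Date) DayOfWeek
    FROM MyTable
)
SELECT Client, Task, [1] Sun, [2] Mon, [3] Tue, [4] Wed, [5] Thu, [6] Fri, [7] Sat
FROM t
PIVOT(SUM(Hours) FOR DayOfWeek IN ([1],[2],[3],[4],[5],[6],[7])) p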
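The row-number trick does not depend on PIVOT itself. As a sketch under assumptions, the same grid can be built with CASE WHEN and GROUP BY on SQLite (through Python's sqlite3 module); the dates are rewritten as ISO strings and read as day/month/year, and strftime('%w') returns 0 for Sunday through 6 for Saturday.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (ID INTEGER, Client TEXT, Task TEXT, Hours INTEGER, Date TEXT)"
)
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?, ?)",
    [(1, "ABC", "A", 3, "2014-06-09"), (2, "ABC", "A", 5, "2014-06-09"),
     (3, "DEF", "B", 8, "2014-06-10"), (4, "DEF", "C", 8, "2014-06-11"),
     (5, "ABC", "A", 8, "2014-06-12")],
)

days = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
# One conditional SUM per weekday column.
day_columns = ",\n       ".join(
    f"SUM(CASE WHEN dow = {i} THEN Hours END) AS {name}"
    for i, name in enumerate(days)
)

# Grouping by rn keeps two entries for the same Client/Task/Date on separate rows.
grid = conn.execute(
    f"""
    WITH t AS (
        SELECT Client, Task, Hours,
               ROW_NUMBER() OVER (PARTITION BY Client, Task, Date ORDER BY ID) AS rn,
               CAST(strftime('%w', Date) AS INTEGER) AS dow
        FROM MyTable
    )
    SELECT Client, Task,
       {day_columns}
    FROM t
    GROUP BY Client, Task, rn
    ORDER BY Client, Task, rn
    """
).fetchall()

for row in grid:
    print(row)
```

The SUM here never adds two values together, since each Client/Task/rn group has at most one entry per date; it only serves to pick the single value for each cell.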