Hive - Group by with respect to following values - sql

I have a table with rows:
id a b
0 1 1
1 1 2
2 2 1
3 1 1
I need the sum of field "b" grouped by "a", but only over consecutive rows: a new group should start every time the value of "a" changes (rows ordered by id).
For my example I want to get:
a b
1 3
2 1
1 1
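One way to approach this is the "gaps and islands" pattern. The following is only a minimal sketch, assuming the table is called t (a hypothetical name), that id defines the row order, and that the Hive version in use supports window functions. The difference between the two row numbers stays constant within each run of equal "a" values, so it can serve as a grouping key.

```sql
-- Sketch only: table name t and ordering by id are assumptions.
SELECT a,
       SUM(b)  AS b,
       MIN(id) AS first_id   -- kept only to order the runs as they appear
FROM (
    SELECT id, a, b,
           ROW_NUMBER() OVER (ORDER BY id)
         - ROW_NUMBER() OVER (PARTITION BY a ORDER BY id) AS grp
    FROM t
) x
GROUP BY a, grp
ORDER BY first_id;
```

For the sample rows, grp works out to 0, 0, 2 and 1 for ids 0 to 3, so the rows with ids 0 and 1 fall into one group and the remaining rows each form their own.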

Related

SQL Query to get multiple resultant on single column

I have a table that looks something like this:
id name status
2 a 1
2 a 2
2 a 3
2 a 2
2 a 1
3 b 2
3 b 1
3 b 2
3 b 1
and the result I want is:
id name total count count(status3) count(status2) count(status1)
2 a 5 1 2 2
3 b 4 0 2 2
Please help me get this result. I can only get id, name, or one of the counts at a time, and I don't know how to write the query to produce this table in one go.
Here's a simple solution using group by and case when.
select id
,count(*) as "total count"
,count(case status when 3 then 1 end) as "count(status3)"
,count(case status when 2 then 1 end) as "count(status2)"
,count(case status when 1 then 1 end) as "count(status1)"
from t
group by id
id total count count(status3) count(status2) count(status1)
2 5 1 2 2
3 4 0 2 2
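The expected output in the question also lists name. A sketch of the same conditional-aggregation idea with name added to both the select list and the GROUP BY, assuming the table is called t as in the answer and that each id maps to a single name:

```sql
-- Sketch only: assumes table t(id, name, status).
SELECT id,
       name,
       COUNT(*)                               AS "total count",
       COUNT(CASE WHEN status = 3 THEN 1 END) AS "count(status3)",
       COUNT(CASE WHEN status = 2 THEN 1 END) AS "count(status2)",
       COUNT(CASE WHEN status = 1 THEN 1 END) AS "count(status1)"
FROM t
GROUP BY id, name
ORDER BY id;
```

COUNT ignores NULLs, and a CASE without an ELSE branch yields NULL for non-matching rows, which is why each column counts only its own status.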
Here's a way to solve it using pivot.
select *
from (select status,id, count(*) over (partition by id) as "total count" from t) tmp
pivot (count(status) for status in ([1],[2],[3])) pvt
id total count 1 2 3
3 4 2 2 0
2 5 2 2 1

Sequence in SELECT statement

I need to create a SELECT statement that generates a sequence in Oracle. When col_flag is 1, the sequence increases and wraps around as mod(col_seq, max_seq); when col_flag is 0, the sequence does not increment.
Example:
col_group col_flag col_seq
--------- -------- --------
A 1 1
A 1 2
A 1 3
A 0 3
A 0 3
B 1 4
B 1 1
B 1 2
B 1 3
B 0 3
B 1 4
B 1 1
C 1 2
C 0 2
...
I guess that a window sum and some arithmetic can do what you want, but you need a column that defines the ordering of the rows; I assumed id.
select col_flag,
-- subtract 1 before mod so the first flagged row gets 1 and the sequence wraps after 4
mod(sum(col_flag) over(order by id) - 1, 4) + 1 col_seq
from mytable
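To check the arithmetic against the first rows of the example, here is a small self-contained sketch. The CTE name demo_rows and the id column are hypothetical, and the running sum is shifted by one before the modulus so that the first flagged row yields 1.

```sql
-- Sketch only: demo_rows and id are made-up names; 4 stands in for max_seq.
WITH demo_rows (id, col_flag) AS (
    SELECT 1, 1 FROM dual UNION ALL
    SELECT 2, 1 FROM dual UNION ALL
    SELECT 3, 1 FROM dual UNION ALL
    SELECT 4, 0 FROM dual UNION ALL
    SELECT 5, 0 FROM dual UNION ALL
    SELECT 6, 1 FROM dual UNION ALL
    SELECT 7, 1 FROM dual
)
SELECT id,
       col_flag,
       MOD(SUM(col_flag) OVER (ORDER BY id) - 1, 4) + 1 AS col_seq
FROM demo_rows
ORDER BY id;
```

The running sum of col_flag is 1, 2, 3, 3, 3, 4, 5, which maps to col_seq 1, 2, 3, 3, 3, 4, 1, the same values as the first seven rows of the example. One caveat: if the data starts with col_flag = 0, the running sum is 0 and this expression gives 0 for those rows, so that case would need separate handling.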

Custom aliases for all fields with GROUP BY ROLLUP

I have such tables:
Group - combination of TypeId and ZoneId
ID TypeID ZoneID
-- -- --
1 1 1
2 1 2
3 2 1
4 2 2
5 2 3
6 3 3
Object
ID GroupId
-- --
1 1
2 1
3 2
4 3
5 3
6 3
I want to build a query that groups these tables by TypeId and ZoneId and returns the number of objects that have each specific combination of these fields:
ResultTable
TypeId ZoneId Number of objects
-- -- --
1 1 2
1 2 1
2 1 3
2 2 1
2 3 0
3 3 0
Query for this:
SELECT
[group].TypeId,
[group].ZoneId,
COUNT(obj.ID) as NumberOfObjects
FROM [Group] [group]
LEFT JOIN [Object] obj on obj.GroupID = [group].ID
GROUP BY [group].TypeId, [group].ZoneId ORDER BY [group].TypeId
However, I want to add a summary row after each group, like this:
ResultTableWithSummary
TypeId ZoneId Number of objects
-- -- --
1 1 2
1 2 1
Summary (empty field) 3
2 1 3
2 2 1
2 3 0
Summary (empty field) 4
3 3 0
Summary (empty field) 0
The problem is that I can use GROUP BY ROLLUP(group.TypeId, group.ZoneId):
TypeId ZoneId Number of objects
-- -- --
1 1 2
1 2 1
1 null 3
2 1 3
2 2 1
2 3 0
2 null 4
3 3 0
3 null 0
but I cannot, or do not know how to, replace the non-null group.TypeId in the summary rows with "Summary".
How can I do this?
The simplest method is coalesce(), but you need to be sure the types match:
SELECT COALESCE(CONVERT(VARCHAR(255), [group].TypeId), 'Summary') as TypeId,
. . .
This is not the most general method, because it does not handle real NULL values in the GROUP BY keys. That doesn't seem to be an issue in this case. If it were, you could use a CASE expression with GROUPING().
EDIT:
For your particular variant (which I find strange), you can use:
SELECT (CASE WHEN [group].TypeId IS NULL OR [group].ZoneID IS NULL
THEN 'Summary' ELSE CONVERT(VARCHAR(255), [group].TypeId)
END) as TypeId,
. . .
In practice, I would use something similar to the COALESCE() in both columns, so I don't lose the information on what the summary is for.
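For reference, the CASE with GROUPING() variant mentioned above might look roughly like this. It is a sketch under assumptions, not a tested answer: the aliases g and obj and the output column TypeLabel are made up for illustration, a LEFT JOIN is used so that groups without objects still show a count of 0, and the HAVING clause drops the grand-total row that ROLLUP would otherwise add.

```sql
-- Sketch only: aliases g, obj and the column name TypeLabel are hypothetical.
SELECT CASE WHEN GROUPING(g.ZoneID) = 1 THEN 'Summary'
            ELSE CONVERT(VARCHAR(255), g.TypeID)
       END            AS TypeLabel,
       g.ZoneID,
       COUNT(obj.ID)  AS NumberOfObjects
FROM [Group] g
LEFT JOIN [Object] obj ON obj.GroupId = g.ID
GROUP BY ROLLUP (g.TypeID, g.ZoneID)
HAVING GROUPING(g.TypeID) = 0            -- exclude the grand total
ORDER BY g.TypeID, GROUPING(g.ZoneID), g.ZoneID;
```

Because GROUPING() reports whether a column was rolled up rather than whether its value is NULL, this keeps working even if ZoneID itself can contain real NULL values.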

Formatting the results of a query

Let's say I have the following table:
first second
A 1
A 1
A 2
B 1
B 2
C 1
C 1
If I run the following query:
select first, second, count(second) from tbl group by first, second
It will produce a table with the following information:
first second count(second)
A 1 2
A 2 1
B 1 1
B 2 1
C 1 2
How can I write the query so that the distinct values of the second column become columns, with the counts as their values, like this:
first 1 2
A 2 1
B 1 1
C 2 0
You can use CASE:
SELECT "first",
SUM(CASE WHEN "second" = 1 THEN 1 ELSE 0 END) AS "1",
SUM(CASE WHEN "second" = 2 THEN 1 ELSE 0 END) AS "2"
FROM tbl
GROUP BY "first"

SQL: Need to create two unique records for each single record

The simple question is how can you take a set of records with a PK and create exactly two records for each source with a slightly altered key for the duplicate? In other words, I take 4000 records and produce 8000 records where 4000 are identical and the other 4000 have a slightly altered key. I cannot do a union because this is essentially two selects (long story).
The rest gets complicated, but maybe necessary to provide examples.
This is my original set (it contains over 4000 records)
dateGroup areaGroup itemID editionID
1 1 1 1
1 1 1 2
1 2 1 1
1 2 2 1
2 1 1 1
2 1 1 2
2 2 1 1
2 2 1 2
For each record I need to create a duplicate record ganging the areaGroups together under '0', then create a comma separated list of original areaGroups as a separate field. (The "why" is some dumb programmer (me) made a mistake about 15 years ago.) I can renumber the editionIDs as necessary, but the original and duplicate record must have the same editionID (thus why a union wouldn't work). The PK remains the same as above (all fields)
dateGroup areaGroup itemID editionID aGroups
1 0 1 1 1
1 0 1 2 1
1 0 1 1 2 // Duplicate (EditionID)
1 0 2 1 2
2 0 1 1 1
2 0 1 2 1
2 0 1 1 2 // Duplicate (EditionID)
2 0 1 2 2
The end result would renumber the editionID as needed to make the record unique.
dateGroup areaGroup itemID editionID aGroups (EditionID is what is altered)
1 0 1 1 1
1 0 1 2 1
1 0 1 2 2 1 changed to 2 (one more than row 1)
1 0 2 1 2
2 0 1 1 1
2 0 1 2 1
2 0 1 2 2 1 changed to 2 (one more than row 1)
2 0 1 2 2
1 1 1 1
1 1 1 2
1 2 1 2 1 changed to 2 (editionID) to match
1 2 2 1
2 1 1 1
2 1 1 2
2 2 1 2 1 changed to 2 to match above
2 2 1 2
I know you could calculate the editionID like a row rank like so:
select row_number() over (
partition by dateGroup, itemID
order by dateGroup, itemID) as editionID
So all I need to know is how to duplicate the records from a single set.
Do a cross join on a derived table:
( select 1 as aGroups union all select 2 )
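As a rough sketch of how that cross join could produce the two copies in one pass: the names source_table and copy_no below are hypothetical, and the ganged copy simply carries the original areaGroup in aGroups rather than a comma-separated list.

```sql
-- Sketch only: source_table and copy_no are made-up names.
SELECT s.dateGroup,
       CASE WHEN d.copy_no = 1 THEN s.areaGroup ELSE 0 END AS areaGroup,
       s.itemID,
       s.editionID,
       CASE WHEN d.copy_no = 2 THEN s.areaGroup END        AS aGroups
FROM source_table s
CROSS JOIN (SELECT 1 AS copy_no UNION ALL SELECT 2) d;
```

Each source row is read once and emitted twice: copy 1 is the unchanged original, copy 2 has areaGroup set to 0. Both copies keep the same editionID, so any renumbering can be applied on top of this result.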
I'd create a temporary table with duplicates and their count.
Then I'd filter the original table to have only unique rows, and insert another row for each row in the temporary table, incrementing their editionID.
In MySQL, I'd use user @variables; not sure about MS SQL.
Did you try UNION ALL instead of just UNION?
UPDATE: perhaps I misunderstood the problem; I thought you were having a problem with the union losing the duplicates.
If the problem is that you want to do a row_number over a union, why don't you do something like
select row_number() over (
partition by dateGroup, itemID
order by dateGroup, itemID) as editionID
FROM
(
SELECT
dateGroup, itemID
FROM TableA
UNION ALL
SELECT
dateGroup, itemID
FROM TableB
) Data