How to group a table and create new columns holding the original values? - sql

I have a table in PostgreSQL:
type id n occurred_at
A 159 4 2013/12/20 18:05
A 159 5 2013/12/27 18:05
A 159 6 2014/1/20 18:05
A 159 8 2014/3/22 12:34
B 180 5 2014/3/29 12:34
B 180 6 2014/4/22 12:34
C 207 4 2014/3/13 03:24
C 207 8 2014/3/20 03:24
C 207 6 2014/4/13 03:24
D 157 4 2013/12/20 18:07
D 157 5 2013/12/27 18:07
D 157 6 2014/1/20 18:07
D 157 8 2013/1/20 17:41
D 157 8 2013/12/27 17:41
D 157 8 2014/1/20 17:41
I want the different n values that share the same type and id collected into one row, ordered by occurred_at (only n values of 6 or 8 should be included). The result should look like below. I was trying to use GROUP BY type, id to get this, but it seems difficult.
Does anyone have a better idea for how to do this?
A 159 6 8
B 180 6
C 207 8 6
D 157 8 8 6 8

If you want a space separated list of n values, you can use string_agg. Since the result should only contain 6 and 8, filter for those first:
select type, id, string_agg(n::text, ' ' order by occurred_at) as n_values
from the_table
where n in (6, 8)
group by type, id;
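For reference, a minimal runnable sketch against a subset of the sample data (the table name the_table, the column types, and the timestamp format are assumptions matching the question):

create table the_table (type text, id int, n int, occurred_at timestamp);

insert into the_table values
    ('A', 159, 4, '2013-12-20 18:05'),
    ('A', 159, 6, '2014-01-20 18:05'),
    ('A', 159, 8, '2014-03-22 12:34'),
    ('C', 207, 8, '2014-03-20 03:24'),
    ('C', 207, 6, '2014-04-13 03:24');

select type, id, string_agg(n::text, ' ' order by occurred_at) as n_values
from the_table
where n in (6, 8)
group by type, id;

-- possible output (group order may vary):
-- type | id  | n_values
-- A    | 159 | 6 8
-- C    | 207 | 8 6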

I'm not 100% sure what you want for the desired output, but if you are looking for a unique "n" value for each row based on the type and id, I think the row_number() analytic function would do that.
select
type, id,
row_number() over (partition by type, id order by occurred_at) as n,
occurred_at
from my_table
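The question title asks for the values in separate new columns rather than in one string. One way to get that is to combine row_number() with conditional aggregation; a sketch assuming at most four n values per group and the same table/column names as above (the where clause applies the question's "only 6 or 8" rule):

select type, id,
       max(case when rn = 1 then n end) as n1,
       max(case when rn = 2 then n end) as n2,
       max(case when rn = 3 then n end) as n3,
       max(case when rn = 4 then n end) as n4
from (
    select type, id, n,
           row_number() over (partition by type, id order by occurred_at) as rn
    from the_table
    where n in (6, 8)
) x
group by type, id;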

Related

Group repeating pattern in pandas DataFrame

So I have a DataFrame with a repeating number series that I want to group like this:
Number Pattern  Value  Desired Group  Value.1
1               723    1              Max of Group
2               400    1              Max of Group
8               235    1              Max of Group
5               387    2              Max of Group
7               911    2              Max of Group
3               365    3              Max of Group
4               270    3              Max of Group
5               194    3              Max of Group
7               452    3              Max of Group
100             716    4              Max of Group
104             69     4              Max of Group
2               846    5              Max of Group
3               474    5              Max of Group
4               524    5              Max of Group
So essentially the number pattern is always monotonically increasing within each group.
Any ideas?
You can compare Number Pattern with 1, take the cumulative sum with Series.cumsum, and then use GroupBy.transform with max:
df['Desired Group'] = df['Number Pattern'].eq(1).cumsum()
df['Value.1'] = df.groupby('Desired Group')['Value'].transform('max')
print(df)
    Number Pattern  Value  Desired Group  Value.1
0                1    723              1      723
1                2    400              1      723
2                3    235              1      723
3                1    387              2      911
4                2    911              2      911
5                1    365              3      452
6                2    270              3      452
7                3    194              3      452
8                4    452              3      452
9                1    716              4      716
10               2     69              4      716
11               1    846              5      846
12               2    474              5      846
13               3    524              5      846
If the pattern does not always restart at 1 but is monotonically increasing within each group, start a new group whenever the value fails to increase:
df['Desired Group'] = (~df['Number Pattern'].diff().gt(0)).cumsum()

How do I change my SQL SELECT GROUP BY query to show me which records are missing a value?

I have a list of codes by area and type. I need to get the unique codes for each type, which I can do with a simple SELECT query with a GROUP BY. I now need to know which areas do not have one of the codes. So how do I write a query that groups by unique values and tells me which records do not have one of the values?
ID Area Type Code
1 10 A 123
2 10 A 456
3 10 B 789
4 10 B 987
5 10 C 654
6 10 C 321
7 20 A 123
8 20 B 789
9 20 B 987
10 20 C 654
11 20 C 321
12 30 A 137
13 30 A 456
14 30 B 579
15 30 B 789
16 30 B 987
17 30 C 654
18 30 C 321
I can run this query to group them by type and get the unique codes:
SELECT tblExample.Type, tblExample.Code
FROM tblExample
GROUP BY tblExample.Type, tblExample.Code
This gives me this:
Type Code
A 123
A 137
A 456
B 579
B 789
B 987
C 321
C 654
Now I need to know which areas do not have a given code. For example, Code 123 does not appear for Area 30 and Code 137 does not appear for Areas 10 and 20. How do I write a query that tells me which areas are missing a code? The format of the output doesn't matter, I just need to get the results. I'm thinking the results could be in one column or spread out in multiple columns:
Type Code  Missing Areas   or   Missing1 Missing2
A    123   30                   30
A    137   10, 20               10       20
A    456   20                   20
B    579   10, 20               10       20
B    789
B    987
C    321
C    654
You can get a list of the missing area/code combinations by first generating all combinations of area and type/code and then filtering out the ones that exist:
select a.area, t.type, t.code
from (select distinct area from tblExample) a cross join
     (select distinct type, code from tblExample) t left join
     tblExample e
     on a.area = e.area and t.type = e.type and t.code = e.code
where e.type is null;
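If you also want the comma separated "Missing Areas" column from the desired output, you can aggregate the missing rows. This is a sketch assuming a database with string_agg (PostgreSQL, or SQL Server 2017 and later); older SQL Server versions would need the FOR XML PATH trick instead:

select t.type, t.code,
       string_agg(cast(a.area as varchar(10)), ', ') as missing_areas
from (select distinct area from tblExample) a cross join
     (select distinct type, code from tblExample) t left join
     tblExample e
     on a.area = e.area and t.type = e.type and t.code = e.code
where e.type is null
group by t.type, t.code;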

Running total of rows by ID

I have a list of IDs, transactions, and the date of those transactions. I want to create a count of each transaction within each ID.
The starting table I have looks something like this:
id trxn_dt trxn_amt
1 10/31/2014 58
1 11/9/2014 34
1 12/10/2014 12
2 7/8/2014 78
2 11/20/2014 99
3 1/5/2014 120
4 2/17/2014 588
4 2/18/2014 8
4 3/9/2014 65
4 4/25/2014 74
and I want the end result to look something like this:
id trxn_dt trxn_amt trxn_count
1 10/31/2014 58 1
1 11/9/2014 34 2
1 12/10/2014 12 3
2 7/8/2014 78 1
2 11/20/2014 99 2
3 1/5/2014 120 1
4 2/17/2014 588 1
4 2/18/2014 8 2
4 3/9/2014 65 3
4 4/25/2014 74 4
Count(distinct(id)) would only give me the overall number of distinct IDs and not a running total by each ID that restarts at each new ID.
Thank you!
In SQL Server you can use ROW_NUMBER as follows:
SELECT id,
       trxn_dt,
       trxn_amt,
       ROW_NUMBER() OVER (PARTITION BY id ORDER BY trxn_dt) AS trxn_count
FROM StartingTable
In MySQL you can do the following with user variables:
SELECT
    t.id,
    t.trxn_dt,
    t.trxn_amt,
    @cur := IF(id = @id, @cur + 1, 1) AS trxn_count,
    @id := id
FROM
    StartingTable t
CROSS JOIN
    (SELECT @id := (SELECT MIN(id) FROM StartingTable), @cur := 0) AS init
ORDER BY
    t.id, t.trxn_dt
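As an aside, MySQL 8.0 and later support window functions, so the ROW_NUMBER version works there too (a sketch, table name assumed as above):

SELECT id,
       trxn_dt,
       trxn_amt,
       ROW_NUMBER() OVER (PARTITION BY id ORDER BY trxn_dt) AS trxn_count
FROM StartingTable;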
Using ROW_NUMBER we can achieve this (ordering by trxn_dt so the count follows the transaction dates):
SELECT *,
       ROW_NUMBER() OVER (PARTITION BY id ORDER BY trxn_dt) AS trxn_count
FROM Transactions

Oracle : Generate rows with slightly different values in a column

I have a table with data like below:
ID SUMMARY_DATE KEYWORD_ID DATA
123 9/1/2014 5 98
I need to generate 18 more rows, with summary_date increasing month by month up to 18 months out, something like below:
ID SUMMARY_DATE KEYWORD_ID DATA
123 9/1/2014 5 98
123 10/1/2014 5 98
123 11/1/2014 5 98
123 12/1/2014 5 98
...
123 3/1/2016 5 98
I could do that using UNION, but it would be very long. Is there another way to do it?
Thanks in advance.
Just generate a list of numbers and use add_months():
with n as (
      select level as m
      from dual
      connect by level <= 18
)
select t.id, add_months(t.summary_date, n.m) as summary_date, t.keyword_id, t.data
from your_table t cross join
     n;
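If the original row should appear in the result as well, as in the sample output, a variant of the same idea starts the month offset at zero (your_table is still a placeholder name):

with n as (
      select level - 1 as m    -- offsets 0 through 18
      from dual
      connect by level <= 19
)
select t.id, add_months(t.summary_date, n.m) as summary_date, t.keyword_id, t.data
from your_table t cross join
     n
order by summary_date;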

Select from table repeat first value for combination of two keys

I would like to transfer some existing data into a new data table.
I have table with substitutions:
- ID
- currentItemId
- formerItemId
- contentId
For the same content it is possible that I have multiple entries for a combination of currentItemId and formerItemId.
Let me show how it is now:
ID_T1 currentItemId formerItemId contentId
1 100 200 300
2 100 200 301
3 100 200 302
4 105 201 303
5 105 201 304
6 110 205 320
7 111 206 321
8 120 204 322
9 130 208 323
10 130 208 324
Now, I would like to select the TOP ID for each combination of formerItemId and currentItemId:
ID ID_T1 contentId
1 1 300
2 1 301
3 1 302
4 4 303
5 4 304
6 6 320
7 7 321
8 8 322
9 9 323
10 9 324
Both tables also contain a timestamp and some other data - I haven't included that, in order to keep the example understandable.
I tried a self join (no success) and a nested select (it gives me the right value for the original combination, but it doesn't repeat; it gives me NULL on ID for the other records), but nothing seems to work. I tried something like:
SELECT di1.ID,
(SELECT TOP(1) di1.ID
FROM TABLE
WHERE
di1.currentItemtId = di2.currentItemtId AND di1.formerItemId = di1.formerItemId
) AS repeat
,di2.deleteItemId
,di1.currentitemtId
,di1.formerItemId
,di1.contentId
FROM Table di1
LEFT JOIN
Table di2 ON di1.ID = di2.ID
But this way ID doesn't repeat - I get the same values for ID as in an ordinary select.
I am using SQL server 2008.
Any help would be greatly appreciated.
Please try:
SELECT
    MIN(ID) OVER (PARTITION BY currentItemId, formerItemId) AS ID,
    currentItemId,
    formerItemId,
    contentId
FROM YourTable
Or, to match your desired output exactly:
SELECT
    ID,
    MIN(ID) OVER (PARTITION BY currentItemId, formerItemId) AS ID_T1,
    contentId
FROM YourTable
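As an aside, the question targets SQL Server 2008, where MIN(...) OVER is the simplest option. From SQL Server 2012 onward the same result can be written with FIRST_VALUE, which would also let you pick the first row by a timestamp instead of the smallest ID (a sketch; YourTable as above):

SELECT
    ID,
    FIRST_VALUE(ID) OVER (PARTITION BY currentItemId, formerItemId
                          ORDER BY ID) AS ID_T1,
    contentId
FROM YourTable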