PostgreSQL Composite Primary Key and Serial increment? - sql

I'm trying to create a table as follows:
CREATE TABLE SCHEDULE (
SESSIONID SERIAL,
MODULECODE VARCHAR(10),
CONSTRAINT SCHEDULE_FOREIGN_KEY FOREIGN KEY (MODULECODE) REFERENCES MODULES (MODULECODE),
CONSTRAINT SCHEDULE_PRIMARY_KEY PRIMARY KEY (SESSIONID, MODULECODE));
The idea being that SESSION ID would auto increment with each new row but only local to MODULECODE, for example:
----------------------
|SESSIONID|MODULECODE|
|---------|----------|
| 1 | A |
| 2 | A |
| 3 | A |
| 1 | B |
| 2 | B |
| 1 | C |
| 2 | C |
|--------------------|
I believe this is how AUTO_INCREMENT functions in MySQL but I suspect PostgreSQL doesn't work this way. How else would I achieve this in PostgreSQL?

Show the data as suggested by @Juan:
select
row_number() over (
partition by modulecode order by sessionid
) as sessionid,
modulecode
from schedule
Then when the user asks for a certain sessionid from a certain module do:
select *
from schedule
where sessionid = (
select sessionid
from (
select
sessionid,
row_number() over (order by sessionid) as module_sessionid
from schedule
where modulecode = 'B'
) s
where module_sessionid = 2
)
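The two queries above can be tried end to end with a minimal sketch. It assumes an in-memory SQLite database (3.25 or later, for window functions) as a stand-in for PostgreSQL; the AUTOINCREMENT column plays the role of SERIAL, and the inserted rows and the alias module_sessionid are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; sessionid is a global counter.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE schedule (
        sessionid  INTEGER PRIMARY KEY AUTOINCREMENT,
        modulecode TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO schedule (modulecode) VALUES (?)",
    [("A",), ("B",), ("A",), ("C",), ("B",), ("A",), ("C",)],
)

# Derive the per-module number at query time instead of storing it.
numbered = conn.execute("""
    SELECT modulecode,
           row_number() OVER (PARTITION BY modulecode
                              ORDER BY sessionid) AS module_sessionid,
           sessionid
      FROM schedule
     ORDER BY modulecode, sessionid
""").fetchall()

# Look up the real row behind "session 2 of module B".
second_b = conn.execute("""
    SELECT sessionid, modulecode
      FROM schedule
     WHERE sessionid = (
           SELECT sessionid
             FROM (SELECT sessionid,
                          row_number() OVER (ORDER BY sessionid) AS module_sessionid
                     FROM schedule
                    WHERE modulecode = ?) s
            WHERE module_sessionid = ?)
""", ("B", 2)).fetchone()

print(numbered)
print(second_b)
```

Note that numbers derived this way shift when an earlier row of the same module is deleted, so they are suitable for display rather than as stable keys.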

As hourse said, you can't do it in your db. But you can assign those values in the select:
SELECT row_number() over (partition by MODULECODE order by SESSIONID) as SESSIONID,
MODULECODE
FROM YourTable

Related

Check records that have gone through the same status more than once in a row

I have a status history table and I need to know which id_user pass by the same status sequentially.
Table structure
create table user (
id_user number,
user_name number,
status_name char(1),
created_at timestamp,
primary key (id_user)
);
create table user_status_hist (
id_user_status_hist number,
id_user number,
status_name char(1),
updated_at timestamp,
primary key (id_user_status_hist),
constraint fk foreign key (id_user) references user(id_user)
);
Imagine that, in the example below, user 123 has passed through status B 2 times in a row.
How can I find all cases like this in my table?
select id_user, status_name, updated_at
from user_status_hist
where id_user = 123;
--------+-------------+------------+
id_user | status_name | updated_at |
--------+-------------+------------+
123 | A | 2020-11-01 |
--------+-------------+------------+
123 | B | 2020-11-02 |
--------+-------------+------------+
123 | B | 2020-11-05 |
--------+-------------+------------+
With this query I find cases where a user passed through the same status more than once, but I cannot tell whether the passes were sequential with respect to the updated_at column.
select count(*), id_user, status_name
from user_status_hist
group by id_user, status_name
having count(*) > 1;
How can I get output like the one below? (The "count" column would be the number of times the user went through that status sequentially.)
--------+-------------+------------+
id_user | status_name | count |
--------+-------------+------------+
123 | A | 3 |
--------+-------------+------------+
456 | B | 2 |
--------+-------------+------------+
789 | B | 6 |
--------+-------------+------------+
Use the LAG() analytic function. Since you must use it in a comparison, and analytic functions can only be computed in the SELECT clause (which comes after all the filters were applied), you must compute the analytic function in a subquery and reference it in an outer query.
select id_user, status_name, updated_at
from (
select id_user, status_name, updated_at,
lag(status_name) over (partition by id_user order by updated_at)
as prev_status
from user_status_hist
)
where status_name = prev_status
;
This will give you the full details of all occurrences. If you then want to group by id_user and status_name and count, you already know how to do that. (You can do it directly in the outer query of the solution shown above.)
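The grouping step mentioned above could look like the following sketch. It assumes an in-memory SQLite database (3.25 or later) in place of the original database, and the rows for users 456 and 789 are invented for the demo. The count here is the number of rows that repeat the immediately preceding status, so a single run of n equal statuses yields n - 1.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; sample rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_status_hist (id_user INTEGER, status_name TEXT, updated_at TEXT)"
)
conn.executemany(
    "INSERT INTO user_status_hist VALUES (?, ?, ?)",
    [
        (123, "A", "2020-11-01"),
        (123, "B", "2020-11-02"),
        (123, "B", "2020-11-05"),  # repeats B
        (456, "A", "2020-11-01"),
        (456, "B", "2020-11-02"),
        (456, "A", "2020-11-03"),  # A again, but not consecutively
        (789, "C", "2020-11-01"),
        (789, "C", "2020-11-02"),  # repeats C
        (789, "C", "2020-11-03"),  # repeats C
    ],
)

# LAG is computed in the subquery; the outer query filters and counts.
repeats = conn.execute("""
    SELECT id_user, status_name, count(*) AS repeat_count
      FROM (SELECT id_user, status_name,
                   lag(status_name) OVER (PARTITION BY id_user
                                          ORDER BY updated_at) AS prev_status
              FROM user_status_hist) h
     WHERE status_name = prev_status
     GROUP BY id_user, status_name
     ORDER BY id_user
""").fetchall()

print(repeats)
```

User 456 does not appear in the result because its two A rows are separated by a B, which is exactly the case a plain GROUP BY ... HAVING count(*) > 1 cannot distinguish.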
You just need to include the columns you want in the select:
select id_user, status_name, count(*)
from user_status_hist
group by id_user, status_name
having count(*) > 1;

Redshift window function for change in column

I have a Redshift table with, among other things, an id and a plan_type column, and I would like a window function that picks out the rows where plan_type changes. For example, given this data:
| user_id | plan_type | created |
|---------|-----------|------------|
| 1 | A | 2019-01-01 |
| 1 | A | 2019-01-02 |
| 1 | B | 2019-01-05 |
| 2 | A | 2019-01-01 |
| 2 | A | 2019-01-05 |
I would like a result like this where I get the first date that the plan_type was "new":
| user_id | plan_type | created |
|---------|-----------|------------|
| 1 | A | 2019-01-01 |
| 1 | B | 2019-01-05 |
| 2 | A | 2019-01-01 |
Is this possible with window functions?
EDIT
Since I have some garbage in the data where plan_type can sometimes be null, and the accepted solution does not include the first row (since I can't have the OR is not null), I had to make some modifications. Hopefully this will help other people if they have similar issues. The final query is as follows:
SELECT * FROM
(
SELECT
user_id,
plan_type,
created_at,
lag(plan_type) OVER (PARTITION by user_id ORDER BY created_at) as prev_plan,
row_number() OVER (PARTITION by user_id ORDER BY created_at) as rownum
FROM tablename
WHERE plan_type IS NOT NULL
) userHistory
WHERE
userHistory.plan_type <> userHistory.prev_plan
OR userHistory.rownum = 1
ORDER BY created_at;
The plan_type IS NOT NULL filters out bad data at the source table and the outer where clause gets any changes OR the first row of data that would not be included otherwise.
ALSO BE CAREFUL about the created_at timestamp if you are working off your prev_plan field, since it would of course give you the time of the new value!!!
This is a gaps-and-islands problem. I think lag() is the simplest approach:
select user_id, plan_type, created
from (select t.*,
lag(plan_type) over (partition by user_id order by created) as prev_plan_type
from t
) t
where prev_plan_type is null or prev_plan_type <> plan_type;
This assumes that plan types can move back to another value and you want each one.
If not, just use aggregation:
select user_id, plan_type, min(created)
from t
group by user_id, plan_type;
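The difference between the two queries only shows up when a user returns to an earlier plan. The sketch below illustrates that under some assumptions: SQLite (3.25 or later) stands in for Redshift, the table name plan_history is hypothetical, and the rows are invented, including an extra row where user 1 moves back to plan A.

```python
import sqlite3

# Assumption: SQLite stands in for Redshift; table name and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plan_history (user_id INTEGER, plan_type TEXT, created TEXT)")
conn.executemany(
    "INSERT INTO plan_history VALUES (?, ?, ?)",
    [
        (1, "A", "2019-01-01"),
        (1, "A", "2019-01-02"),
        (1, "B", "2019-01-05"),
        (1, "A", "2019-01-09"),  # user 1 goes back to plan A
        (2, "A", "2019-01-01"),
        (2, "A", "2019-01-05"),
    ],
)

# lag(): one row per change, including a return to an earlier plan.
changes = conn.execute("""
    SELECT user_id, plan_type, created
      FROM (SELECT t.*,
                   lag(plan_type) OVER (PARTITION BY user_id
                                        ORDER BY created) AS prev_plan_type
              FROM plan_history t) t
     WHERE prev_plan_type IS NULL OR prev_plan_type <> plan_type
     ORDER BY user_id, created
""").fetchall()

# Aggregation: only the first date each plan was ever seen per user.
first_seen = conn.execute("""
    SELECT user_id, plan_type, min(created)
      FROM plan_history
     GROUP BY user_id, plan_type
     ORDER BY user_id, plan_type
""").fetchall()

print(changes)
print(first_seen)
```

The lag() version reports the 2019-01-09 switch back to A as a fourth row, while the aggregation version folds it into the first A row.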
use row_number() window function
select * from
(select *,row_number()over(partition by user_id,plan_type order by created) rn
from tablename
) a where a.rn=1
use lag()
select * from
(
select user_id, plan_type, lag(plan_type) over (partition by user_id order by created) as changes, created
from tablename
)A where plan_type<>changes and changes is not null

Reverse col and rows in SQL

I have to write a query that correctly swaps rows and columns. I am using MS SQL Server 2016.
This is what I have:
Row_ID | Group_ID | Group_Status | MemberRole | name
2807 | 10568 | accept | chairman | Rajah
2808 | 10568 | accept | member | Vaughan
2812 | 10568 | accept | secretary | Susan
This is what I need:
Group_ID | Status | Chairman | Secretary | Member1 | Member2 | Member3 | ... | Member20
10568 | Accept | Rajah | Susan | Vaughan | Kane | Oprah | ... | Imelda
(users with member role can be between 0-20)
Probably I should use pivot, but I have no idea how.
Ok, I have this code:
SELECT *
FROM
(
SELECT group_id,
group_status,
memberRole,
name
FROM DataGroup
) dataSource PIVOT(MAX(name) FOR memberRole IN([chairman],
[secretary],
[member])) pivotTab;
But I am losing the member rows (I get only one member). How can I extract them into columns?
You can try this with a unioned query:
Some mockup (please provide such a dummy table with your sample data yourself in your next question):
DECLARE @mockup TABLE(Row_ID INT,Group_ID INT,Group_Status VARCHAR(100),MemberRole VARCHAR(100),[name] VARCHAR(100));
INSERT INTO @mockup VALUES
(2807,10568,'accept','chairman','Rajah')
,(2808,10568,'accept','member','Vaughan')
,(2812,10568,'accept','secretary','Susan')
,(2899,10568,'accept','member','Onemore');
--The query
SELECT p.*
FROM
(
SELECT Group_ID
,Group_Status
,[name]
,MemberRole
FROM @mockup
WHERE MemberRole IN('chairman','secretary')
UNION ALL
SELECT Group_ID
,Group_Status
,[name]
,CONCAT('Member',ROW_NUMBER() OVER(PARTITION BY Group_ID ORDER BY Row_ID))
FROM @mockup
WHERE MemberRole='member'
) t
PIVOT
(
MAX([name]) FOR MemberRole IN(Chairman,Secretary,Member1,Member2,Member3 /*add as many as you need*/)
) p;
The result
Group_ID Group_Status Chairman Secretary Member1 Member2 Member3
10568 accept Rajah Susan Vaughan Onemore NULL
In short:
The first part of the query will just pick the two fixed names.
The second part will pick the members and number them sorted by their Row_ID.
The PIVOT will then transform this to a single row, using the column MemberRole for the new column names.
You will have to think about some more things:
What if not all the lines are accepted?
What if there are many groups?
If you need help, you can come back with a new question. Happy Coding!
I would simply use conditional aggregation:
select group_id, group_status,
max(case when MemberRole = 'chairman' then name end) as chairman,
max(case when MemberRole = 'secretary' then name end) as secretary,
max(case when MemberRole = 'member' and seqnum = 1 then name end) as member_01,
max(case when MemberRole = 'member' and seqnum = 2 then name end) as member_02,
. . .
from (select m.*,
row_number() over (partition by group_id, MemberRole order by row_id) as seqnum
from @mockup m
) m
group by group_id, group_status;
I find conditional aggregation to be much more flexible than pivot. This is an example of the situation where the query is simpler.
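Because conditional aggregation uses only CASE, MAX and ROW_NUMBER, it also runs on engines without a PIVOT operator. A minimal sketch, assuming an in-memory SQLite database (3.25 or later) instead of SQL Server, a hypothetical table data_group with snake_case column names, and only two member columns:

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE data_group (
        row_id INTEGER, group_id INTEGER, group_status TEXT,
        member_role TEXT, name TEXT
    )
""")
conn.executemany(
    "INSERT INTO data_group VALUES (?, ?, ?, ?, ?)",
    [
        (2807, 10568, "accept", "chairman", "Rajah"),
        (2808, 10568, "accept", "member", "Vaughan"),
        (2812, 10568, "accept", "secretary", "Susan"),
        (2899, 10568, "accept", "member", "Onemore"),
    ],
)

# Number the rows within each (group, role), then pick one name per output column.
pivoted = conn.execute("""
    SELECT group_id, group_status,
           max(CASE WHEN member_role = 'chairman' THEN name END) AS chairman,
           max(CASE WHEN member_role = 'secretary' THEN name END) AS secretary,
           max(CASE WHEN member_role = 'member' AND seqnum = 1 THEN name END) AS member_01,
           max(CASE WHEN member_role = 'member' AND seqnum = 2 THEN name END) AS member_02
      FROM (SELECT m.*,
                   row_number() OVER (PARTITION BY group_id, member_role
                                      ORDER BY row_id) AS seqnum
              FROM data_group m) m
     GROUP BY group_id, group_status
""").fetchall()

print(pivoted)
```

Extending this to 20 members means adding one CASE line per member column; a slot with no matching row comes back as NULL.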

Partitioning function for continuous sequences

There is a table of the following structure:
CREATE TABLE history
(
pk serial NOT NULL,
"from" integer NOT NULL,
"to" integer NOT NULL,
entity_key text NOT NULL,
data text NOT NULL,
CONSTRAINT history_pkey PRIMARY KEY (pk)
);
The pk is a primary key; from and to define a position in the sequence, and the sequence itself, for a given entity identified by entity_key. For example, an entity has one sequence of 2 rows if the first row has from = 1; to = 2 and the second has from = 2; to = 3. The point is that the to of the previous row matches the from of the next one.
The order that determines the "next"/"previous" row is defined by pk, which grows monotonically (since it's a SERIAL).
The sequence does not have to start with 1, and to - from is not necessarily always 1; a row can have from = 1; to = 10. What matters is that the from of the "next" row in the sequence matches the to exactly.
Sample dataset:
pk | from | to | entity_key | data
----+--------+------+--------------+-------
1 | 1 | 2 | 42 | foo
2 | 2 | 3 | 42 | bar
3 | 3 | 4 | 42 | baz
4 | 10 | 11 | 42 | another foo
5 | 11 | 12 | 42 | another baz
6 | 1 | 2 | 111 | one one one
7 | 2 | 3 | 111 | one one one two
8 | 3 | 4 | 111 | one one one three
What I cannot figure out is how to partition by "sequences" here so that I could apply window functions to the group that represents a single "sequence".
Let's say I want to use the row_number() function and would like to get the following result:
pk | row_number | entity_key
----+-------------+------------
1 | 1 | 42
2 | 2 | 42
3 | 3 | 42
4 | 1 | 42
5 | 2 | 42
6 | 1 | 111
7 | 2 | 111
8 | 3 | 111
For convenience I created an SQLFiddle with initial seed: http://sqlfiddle.com/#!15/e7c1c
PS: This is not a "give me the codez" question; I did my own research and I am just out of ideas on how to partition.
It's obvious that I need to LEFT JOIN with the next.from = curr.to, but then it's still not clear how to reset the partition on next.from IS NULL.
PS: It will be a 100 points bounty for the most elegant query that provides the requested result
PPS: the desired solution should be an SQL query not pgsql due to some other limitations that are out of scope of this question.
I don’t know if it counts as “elegant,” but I think this will do what you want:
with Lagged as (
select
pk,
case when lag("to",1) over (order by pk) is distinct from "from" then 1 else 0 end as starts,
entity_key
from history
), LaggedGroups as (
select
pk,
sum(starts) over (order by pk) as groups,
entity_key
from Lagged
)
select
pk,
row_number() over (
partition by groups
order by pk
) as "row_number",
entity_key
from LaggedGroups
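The lag-then-running-sum idea in the query above can be checked against the sample data with the sketch below. It is a variation rather than the answer's exact query: it assumes SQLite (3.25 or later) instead of PostgreSQL, uses SQLite's null-safe IS NOT where PostgreSQL has IS DISTINCT FROM, and additionally partitions by entity_key so that a chain can never continue across two entities.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; data is the question's sample.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE history (
        pk INTEGER PRIMARY KEY,
        "from" INTEGER NOT NULL,
        "to" INTEGER NOT NULL,
        entity_key TEXT NOT NULL,
        data TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO history VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 2, "42", "foo"),
        (2, 2, 3, "42", "bar"),
        (3, 3, 4, "42", "baz"),
        (4, 10, 11, "42", "another foo"),
        (5, 11, 12, "42", "another baz"),
        (6, 1, 2, "111", "one one one"),
        (7, 2, 3, "111", "one one one two"),
        (8, 3, 4, "111", "one one one three"),
    ],
)

rows = conn.execute("""
    WITH lagged AS (
        -- "to" of the previous row of the same entity (NULL for the first row)
        SELECT pk, entity_key, "from",
               lag("to") OVER (PARTITION BY entity_key ORDER BY pk) AS prev_to
          FROM history
    ), grouped AS (
        -- a running count of sequence starts gives each sequence its own id
        SELECT pk, entity_key,
               sum(CASE WHEN prev_to IS NOT "from" THEN 1 ELSE 0 END)
                   OVER (PARTITION BY entity_key ORDER BY pk) AS grp
          FROM lagged
    )
    SELECT pk,
           row_number() OVER (PARTITION BY entity_key, grp ORDER BY pk) AS rn,
           entity_key
      FROM grouped
     ORDER BY pk
""").fetchall()

print(rows)
```

The row numbers restart at pk 4 (where from = 10 does not match the previous to = 4) and again at pk 6 (a new entity), which matches the result requested in the question.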
Just for fun & completeness: a recursive solution to reconstruct the (doubly) linked lists of records. [ this will not be the fastest solution ]
NOTE: I commented out the ascending pk condition(s) since they are not needed for the connection logic.
WITH RECURSIVE zzz AS (
SELECT h0.pk
, h0."to" AS next
, h0.entity_key AS ek
, 1::integer AS rnk
FROM history h0
WHERE NOT EXISTS (
SELECT * FROM history nx
WHERE nx.entity_key = h0.entity_key
AND nx."to" = h0."from"
-- AND nx.pk > h0.pk
)
UNION ALL
SELECT h1.pk
, h1."to" AS next
, h1.entity_key AS ek
, 1+zzz.rnk AS rnk
FROM zzz
JOIN history h1
ON h1.entity_key = zzz.ek
AND h1."from" = zzz.next
-- AND h1.pk > zzz.pk
)
SELECT * FROM zzz
ORDER BY ek,pk
;
You can use generate_series() to generate all the rows between the two values. Then you can use the difference of row numbers on that:
select pk, "from", "to",
row_number() over (partition by entity_key, min(grp) order by pk) as row_number
from (select h.*,
(row_number() over (partition by entity_key order by ind) -
ind) as grp
from (select h.*, generate_series("from", "to" - 1) as ind
from history h
) h
) h
group by pk, "from", "to", entity_key
Because you specify that the difference is between 1 and 10, this might actually not have such bad performance.
Unfortunately, your SQL Fiddle isn't working right now, so I can't test it.
Well,
this is not exactly one SQL query, but:
select a.pk as PK, a.entity_key as ENTITY_KEY, b.pk as BPK, 0 as Seq into #tmp
from history a left join history b on a."to" = b."from" and a.pk = b.pk-1
declare @seq int
select @seq = 1
update #tmp set Seq = case when (BPK is null) then @seq-1 else @seq end,
@seq = case when (BPK is null) then @seq+1 else @seq end
select pk, entity_key, ROW_NUMBER() over (PARTITION by entity_key, seq order by pk asc)
from #tmp order by pk
This is in SQL Server 2008

Selecting row with highest ID based on another column

In SQL Server 2008 R2, suppose I have a table layout like this...
+----------+---------+-------------+
| UniqueID | GroupID | Title |
+----------+---------+-------------+
| 1 | 1 | TEST 1 |
| 2 | 1 | TEST 2 |
| 3 | 3 | TEST 3 |
| 4 | 3 | TEST 4 |
| 5 | 5 | TEST 5 |
| 6 | 6 | TEST 6 |
| 7 | 6 | TEST 7 |
| 8 | 6 | TEST 8 |
+----------+---------+-------------+
Is it possible to select every row with the highest UniqueID number, for each GroupID. So according to the table above - if I ran the query, I would expect this...
+----------+---------+-------------+
| UniqueID | GroupID | Title |
+----------+---------+-------------+
| 2 | 1 | TEST 2 |
| 4 | 3 | TEST 4 |
| 5 | 5 | TEST 5 |
| 8 | 6 | TEST 8 |
+----------+---------+-------------+
Been chomping on this for a while, but can't seem to crack it.
Many thanks,
SELECT *
FROM (SELECT uniqueid, groupid, title,
Row_number()
OVER ( partition BY groupid ORDER BY uniqueid DESC) AS rn
FROM table) a
WHERE a.rn = 1
With SQL-Server as rdbms you can use a ranking function like ROW_NUMBER:
WITH CTE AS
(
SELECT UniqueID, GroupID, Title,
RN = ROW_NUMBER() OVER (PARTITION BY GroupID
ORDER BY UniqueID DESC)
FROM dbo.TableName
)
SELECT UniqueID, GroupID, Title
FROM CTE
WHERE RN = 1
This returns exactly one record for each GroupID even if there are multiple rows with the highest UniqueID (the name does not suggest so). If you want to return all rows in that case, use DENSE_RANK instead of ROW_NUMBER.
Here you can see all functions and how they work: http://technet.microsoft.com/en-us/library/ms189798.aspx
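The ROW_NUMBER versus DENSE_RANK distinction only matters when the ordering column has ties. The sketch below shows the effect under some assumptions: SQLite (3.25 or later) stands in for SQL Server, and the table results with its grp, score and title columns is hypothetical, chosen so that group 1 has two rows tied for the highest score.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; table and data are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (grp INTEGER, score INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [(1, 10, "a"), (1, 10, "b"), (1, 5, "c"), (2, 7, "d")],
)

# ROW_NUMBER: exactly one row per group; which tied row wins is arbitrary.
one_per_group = conn.execute("""
    SELECT grp, title
      FROM (SELECT grp, title,
                   row_number() OVER (PARTITION BY grp ORDER BY score DESC) AS rn
              FROM results) x
     WHERE rn = 1
""").fetchall()

# DENSE_RANK: every row tied for the top score is kept.
all_top = conn.execute("""
    SELECT grp, title
      FROM (SELECT grp, title,
                   dense_rank() OVER (PARTITION BY grp ORDER BY score DESC) AS rnk
              FROM results) x
     WHERE rnk = 1
     ORDER BY title
""").fetchall()

print(len(one_per_group), all_top)
```

With ROW_NUMBER, adding a unique tiebreaker column to the ORDER BY makes the choice between tied rows deterministic.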
Since you have not mentioned any RDBMS, this statement below will work on almost all RDBMS. The purpose of the subquery is to get the greatest uniqueID for every GROUPID. To be able to get the other columns, the result of the subquery is joined on the original table.
SELECT a.*
FROM tableName a
INNER JOIN
(
SELECT GroupID, MAX(uniqueID) uniqueID
FROM tableName
GROUP By GroupID
) b ON a.GroupID = b.GroupID
AND a.uniqueID = b.uniqueID
In the case that your RDBMS supports analytic functions, you can use ROW_NUMBER()
SELECT uniqueid, groupid, title
FROM
(
SELECT uniqueid, groupid, title,
ROW_NUMBER() OVER (PARTITION BY groupid
ORDER BY uniqueid DESC) rn
FROM tableName
) x
WHERE x.rn = 1
TSQL Ranking Functions
The ROW_NUMBER() generates sequential number which you can filter out. In this case the sequential number is generated on groupid and sorted by uniqueid in descending order. The greatest uniqueid will have a value of 1 in rn.
SELECT *
FROM the_table tt
WHERE NOT EXISTS (
SELECT *
FROM the_table nx
WHERE nx.GroupID = tt.GroupID
AND nx.UniqueID > tt.UniqueID
)
;
This should work in any DBMS (no window functions or CTEs are needed).
It is probably faster than a subquery with an aggregate.
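As a sanity check, the NOT EXISTS form can be run next to the MAX-join and ROW_NUMBER forms to confirm that they select the same rows. This is a sketch under assumptions: SQLite (3.25 or later) stands in for SQL Server, the table is named the_table as in the query above, and the data is the sample from the question. It says nothing about relative speed, which depends on the engine and its indexes.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; data is the question's sample.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (UniqueID INTEGER, GroupID INTEGER, Title TEXT)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?)",
    [(1, 1, "TEST 1"), (2, 1, "TEST 2"), (3, 3, "TEST 3"), (4, 3, "TEST 4"),
     (5, 5, "TEST 5"), (6, 6, "TEST 6"), (7, 6, "TEST 7"), (8, 6, "TEST 8")],
)

# Anti-join: keep a row only if no row in its group has a higher UniqueID.
not_exists = conn.execute("""
    SELECT UniqueID, GroupID, Title
      FROM the_table tt
     WHERE NOT EXISTS (SELECT 1 FROM the_table nx
                        WHERE nx.GroupID = tt.GroupID
                          AND nx.UniqueID > tt.UniqueID)
     ORDER BY UniqueID
""").fetchall()

# Join back to the per-group maximum.
max_join = conn.execute("""
    SELECT a.UniqueID, a.GroupID, a.Title
      FROM the_table a
      JOIN (SELECT GroupID, max(UniqueID) AS UniqueID
              FROM the_table GROUP BY GroupID) b
        ON a.GroupID = b.GroupID AND a.UniqueID = b.UniqueID
     ORDER BY a.UniqueID
""").fetchall()

# Window function: rank rows within each group, highest UniqueID first.
row_num = conn.execute("""
    SELECT UniqueID, GroupID, Title
      FROM (SELECT UniqueID, GroupID, Title,
                   row_number() OVER (PARTITION BY GroupID
                                      ORDER BY UniqueID DESC) AS rn
              FROM the_table) x
     WHERE rn = 1
     ORDER BY UniqueID
""").fetchall()

print(not_exists)
```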
Keeping it simple:
select * from test2
where UniqueID in (select max(UniqueID) from test2 group by GroupID)
Considering:
create table test2
(
UniqueID numeric,
GroupID numeric,
Title varchar(100)
)
insert into test2 values(1,1,'TEST 1')
insert into test2 values(2,1,'TEST 2')
insert into test2 values(3,3,'TEST 3')
insert into test2 values(4,3,'TEST 4')
insert into test2 values(5,5,'TEST 5')
insert into test2 values(6,6,'TEST 6')
insert into test2 values(7,6,'TEST 7')
insert into test2 values(8,6,'TEST 8')