Create a pivot table from two tables based on dates - sql

I have two MS Access tables sharing a one to many relationship. Their structures are like the following:
tbl_Persons
+----------+------------+-----------+
| PersonID | PersonName | OtherData |
+----------+------------+-----------+
| 1 | PersonA | etc. |
| 2 | PersonB | |
| 3 | PersonC | |
tbl_Visits
+----------+------------+------------+-----------------------
| VisitID | PersonID | VisitDate | dozens of other fields
+----------+------------+------------+-----------
| 1 | 1 | 09/01/13 |
| 2 | 1 | 09/02/13 |
| 3 | 2 | 09/03/13 |
| 4 | 2 | 09/04/13 | etc...
I wish to create a new table based on the VisitDate field, the column headings of which are Visit-n where n is 1 to the number of visits, Visit-n-Data1, Visit-n-Data2, Visit-n-Data3 etc.
MergedTable
+----------+----------+---------------+-----------------+----------+----------------+
| PersonID | Visit1 | Visit1Data1 | Visit1Data2... | Visit2 | Visit2Data1... |
+----------+----------+---------------+-----------
| 1 | 09/01/13 | | | 09/02/13 |
| 2 | 09/03/13 | | | 09/04/13 |
| 3 | etc. | |
I am really not sure how to do this, whether with a SQL query or by using DAO and looping through records and columns. It is essential that there is only one PersonID per row and that all of that person's data appears chronologically across the columns.

Start off by ranking the visits with something like:
SELECT PersonID, VisitID,
(SELECT COUNT(VisitID) FROM tbl_Visits AS C
WHERE C.PersonID = tbl_Visits.PersonID
AND C.VisitDate < tbl_Visits.VisitDate) AS RankNumber
FROM tbl_Visits
Use this query as the base for the 'pivot'. RankNumber counts the earlier visits of the same person, so it is 0 for the first visit, 1 for the second, and so on.
If a person can have more than one visit on the same day, the WHERE clause needs to be a bit more sophisticated (for example, breaking ties on VisitID). But I hope you get the basic concept.
Pivoting can then be done with multiple LEFT JOINs, one per visit number.
I have not tested this, so I cannot say how well it will perform. This is easier to accomplish in SQL Server than in MS Access.
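The rank-then-join idea can be sketched end to end as follows. This is only an illustration under stated assumptions: it runs against an in-memory SQLite database through Python's `sqlite3` module rather than MS Access, it stores dates as ISO strings, it adds 1 to the rank so that numbering starts at 1, and it breaks same-day ties on VisitID. In Access the `ranked` CTE would have to be a saved query instead.

```python
import sqlite3

# In-memory SQLite stands in for MS Access here (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_Persons (PersonID INTEGER PRIMARY KEY, PersonName TEXT);
CREATE TABLE tbl_Visits (VisitID INTEGER PRIMARY KEY, PersonID INTEGER, VisitDate TEXT);
INSERT INTO tbl_Persons VALUES (1, 'PersonA'), (2, 'PersonB'), (3, 'PersonC');
INSERT INTO tbl_Visits VALUES
  (1, 1, '2013-09-01'), (2, 1, '2013-09-02'),
  (3, 2, '2013-09-03'), (4, 2, '2013-09-04');
""")

pivot_sql = """
WITH ranked AS (
  -- Rank each visit within its person: earlier visits counted, plus 1.
  SELECT v.PersonID, v.VisitDate,
         (SELECT COUNT(*) FROM tbl_Visits AS c
           WHERE c.PersonID = v.PersonID
             AND (c.VisitDate < v.VisitDate
                  OR (c.VisitDate = v.VisitDate AND c.VisitID < v.VisitID))
         ) + 1 AS RankNumber
  FROM tbl_Visits AS v
)
-- One LEFT JOIN per visit number; persons without visits keep NULLs.
SELECT p.PersonID, r1.VisitDate AS Visit1, r2.VisitDate AS Visit2
FROM tbl_Persons AS p
LEFT JOIN ranked AS r1 ON r1.PersonID = p.PersonID AND r1.RankNumber = 1
LEFT JOIN ranked AS r2 ON r2.PersonID = p.PersonID AND r2.RankNumber = 2
ORDER BY p.PersonID
"""
pivot_rows = conn.execute(pivot_sql).fetchall()
for row in pivot_rows:
    print(row)
```

Each additional visit column needs another LEFT JOIN, so the number of visits per person has to be bounded in advance with this approach.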

Related

SQL on Self Table Join

I am trying to do a simple self-join SQL and a join to a 2nd table and for the life of me I can't figure it out. I've done some research and can't seem to glean the answer from similar questions. This query is for MS-Access running in VB.NET.
I have 2 tables:
TodaysTeams
-----------
TeamNum PlayerName PlayerID
------- ---------- --------
1 Mark 100
1 Brian 101
2 Mike 102
2 Mike 102
(Note the last 2 rows above are not a typo. In this case a player can be paired with themselves to form a team)
TodaysTeamsPoints
-----------------
TeamNum Points
------- ------
1 90
2 85
The result I want is (2 rows, 1 for each team):
TeamNum PlayerName1 PlayerName2 Points
------- ----------- ----------- ------
1 Mark Brian 90
2 Mike Mike 85
Here is my SQL:
SELECT DISTINCT A.TeamNum, A.PlayerName as PlayerName1, B.PlayerName AS PlayerName2, C.Points
FROM ((TodaysTeams A INNER JOIN
TodaysTeamsPoints C ON A.TeamNum = C.TeamNum) INNER JOIN
TodaysTeams B ON A.TeamNum = B.TeamNum)
ORDER BY C.Points DESC
I know I am missing another join as I'm returning a Cartesian product (i.e. too many rows).
I would appreciate help as to what I am missing here.
Thank you.
Whilst Gordon's suggested method will work well providing that there are at most two players per team, the method breaks down if ever you add another team member and wish to display them in a separate column.
Difficulty in displaying data in a manner that you can describe logically but cannot easily produce with a query usually implies that the database structure is suboptimal.
For your particular setup, I would personally recommend the following structure:
+---------------+ +----------+------------+
| Players | | PlayerID | PlayerName |
+---------------+ +----------+------------+
| PlayerID (PK) | | 100 | Mark |
| PlayerName | | 101 | Brian |
+---------------+ | 102 | Mike |
+----------+------------+
+-------------+ +--------+----------+
| Teams | | TeamID | TeamName |
+-------------+ +--------+----------+
| TeamID (PK) | | 1 | Team1 |
| TeamName | | 2 | Team2 |
+-------------+ +--------+----------+
+-------------------+ +--------+--------------+----------+
| TeamPlayers | | TeamID | TeamPlayerID | PlayerID |
+-------------------+ +--------+--------------+----------+
| TeamID (PK) | | 1 | 1 | 100 |
| TeamPlayerID (PK) | | 1 | 2 | 101 |
| PlayerID (FK) | | 2 | 1 | 102 |
+-------------------+ | 2 | 2 | 102 |
+--------+--------------+----------+
Using this method, you can use conditional aggregation or a crosstab query pivoting on the TeamPlayerID to produce each of your columns, and you would not be limited to two columns.
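As a rough sketch of conditional aggregation on the proposed structure, the following runs against in-memory SQLite via Python's `sqlite3` module (an assumption made so the example is runnable; it is not MS Access). Access SQL has no CASE expression, so an Access version would use IIf or a crosstab query instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Players (PlayerID INTEGER PRIMARY KEY, PlayerName TEXT);
CREATE TABLE TeamPlayers (
  TeamID INTEGER,
  TeamPlayerID INTEGER,
  PlayerID INTEGER REFERENCES Players(PlayerID),
  PRIMARY KEY (TeamID, TeamPlayerID)
);
INSERT INTO Players VALUES (100, 'Mark'), (101, 'Brian'), (102, 'Mike');
INSERT INTO TeamPlayers VALUES (1, 1, 100), (1, 2, 101), (2, 1, 102), (2, 2, 102);
""")

# One MAX(CASE ...) per slot: TeamPlayerID decides which column a name lands in.
team_rows = conn.execute("""
SELECT tp.TeamID,
       MAX(CASE WHEN tp.TeamPlayerID = 1 THEN p.PlayerName END) AS PlayerName1,
       MAX(CASE WHEN tp.TeamPlayerID = 2 THEN p.PlayerName END) AS PlayerName2
FROM TeamPlayers AS tp
INNER JOIN Players AS p ON p.PlayerID = tp.PlayerID
GROUP BY tp.TeamID
ORDER BY tp.TeamID
""").fetchall()
for row in team_rows:
    print(row)
```

Supporting a third team member would only require one more MAX(CASE ...) column keyed on TeamPlayerID = 3.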
You can use aggregation:
SELECT ttp.TeamNum, MIN(tt.PlayerName) as PlayerName1,
MAX(tt.PlayerName) as PlayerName2,
ttp.Points
FROM TodaysTeamsPoints as ttp INNER JOIN
TodaysTeams as tt
ON tt.TeamNum = ttp.TeamNum
GROUP BY ttp.TeamNum, ttp.Points
ORDER BY ttp.Points DESC;
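To see what the MIN/MAX aggregation returns on the sample data, here is a small check using in-memory SQLite through Python's `sqlite3` module (an assumption for the demo; the question targets MS Access). One thing worth noticing is that MIN and MAX pick names alphabetically, so the column order within a team may differ from the row order in TodaysTeams.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TodaysTeams (TeamNum INTEGER, PlayerName TEXT, PlayerID INTEGER);
CREATE TABLE TodaysTeamsPoints (TeamNum INTEGER, Points INTEGER);
INSERT INTO TodaysTeams VALUES
  (1, 'Mark', 100), (1, 'Brian', 101), (2, 'Mike', 102), (2, 'Mike', 102);
INSERT INTO TodaysTeamsPoints VALUES (1, 90), (2, 85);
""")

# Same shape as the aggregation query above: one row per team.
agg_rows = conn.execute("""
SELECT ttp.TeamNum,
       MIN(tt.PlayerName) AS PlayerName1,
       MAX(tt.PlayerName) AS PlayerName2,
       ttp.Points
FROM TodaysTeamsPoints AS ttp
INNER JOIN TodaysTeams AS tt ON tt.TeamNum = ttp.TeamNum
GROUP BY ttp.TeamNum, ttp.Points
ORDER BY ttp.Points DESC
""").fetchall()
for row in agg_rows:
    print(row)
```

For team 1 this yields Brian as PlayerName1 and Mark as PlayerName2, because 'Brian' sorts before 'Mark'.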

SQL: tricky question for finding lockout dates

Hope you can help. We have a table with two columns Customer_ID and Trip_Date. The customer receives 15% off on their first visit and on every visit where they haven't received the 15% off offer in the past thirty days. How do I write a single SQL query that finds all days where a customer received 15% off?
The table looks like this
+-------------+----------+
| Customer_ID | date     |
+-------------+----------+
| 1           | 01-01-17 |
| 1           | 01-17-17 |
| 1           | 02-04-17 |
| 1           | 03-01-17 |
| 1           | 03-15-17 |
| 1           | 04-29-17 |
| 1           | 05-18-17 |
+-------------+----------+
The desired output would look like this:
+-------------+----------+-------------------+
| Customer_ID | date     | received_discount |
+-------------+----------+-------------------+
| 1           | 01-01-17 | 1                 |
| 1           | 01-17-17 | 0                 |
| 1           | 02-04-17 | 1                 |
| 1           | 03-01-17 | 0                 |
| 1           | 03-15-17 | 1                 |
| 1           | 04-29-17 | 1                 |
| 1           | 05-18-17 | 0                 |
+-------------+----------+-------------------+
We are doing this work in Netezza. I can't think of a way using just window functions, only using recursion and looping. Is there some clever trick that I'm missing?
Thanks in advance,
GF
You didn't tell us what your backend is, nor did you give sample data, expected output, or a sensible schema :( This example guesses at the schema and uses PostgreSQL as the backend (it would be too messy as a comment):
(I think you have Customer_Id, Trip_Date and LocationId in the trips table?)
select * from trips t1
where not exists (
select * from trips t2
where t1.Customer_id = t2.Customer_id and
t1.Trip_Date > t2.Trip_Date
and t1.Trip_date - t2.Trip_Date < 30
);
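Note that a NOT EXISTS check against any earlier trip within 30 days is not quite the lockout rule from the question: whether a trip is discounted depends on when the last discounted trip happened, not on when the last trip happened. On the sample data, 02-04-17 follows a trip 18 days earlier yet still earns the discount, because the last discount was on 01-01-17. The sequential logic can be sketched in plain Python as below. The function name `flag_discounts` and the `lockout_days` parameter are hypothetical, and treating exactly 30 days as eligible is an assumption, since the sample data does not settle that boundary.

```python
from datetime import date

def flag_discounts(trip_dates, lockout_days=30):
    """Return (trip_date, received_discount) pairs for a single customer.

    A trip gets the discount if it is the first trip, or if at least
    lockout_days have passed since the last discounted trip (assumed boundary).
    """
    flagged = []
    last_discount = None
    for trip in sorted(trip_dates):
        eligible = (
            last_discount is None
            or (trip - last_discount).days >= lockout_days
        )
        if eligible:
            last_discount = trip  # the lockout window restarts here
        flagged.append((trip, 1 if eligible else 0))
    return flagged

# Sample data for Customer_ID 1 from the question.
trips = [
    date(2017, 1, 1), date(2017, 1, 17), date(2017, 2, 4), date(2017, 3, 1),
    date(2017, 3, 15), date(2017, 4, 29), date(2017, 5, 18),
]
flags = [flag for _, flag in flag_discounts(trips)]
print(flags)  # [1, 0, 1, 0, 1, 1, 0]
```

Because each decision depends on the previous decision, a pure SQL version generally needs a recursive query or a procedural loop rather than a single window function.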

Calculate Final outcome based on Results/ID

For a Table T1
+----------+-----------+-----------------+
| PersonID | Date | Employment |
+----------+-----------+-----------------+
| 1 | 2/28/2017 | Stayed the same |
| 1 | 4/21/2017 | Stayed the same |
| 1 | 5/18/2017 | Stayed the same |
| 2 | 3/7/2017 | Improved |
| 2 | 4/1/2017 | Stayed the same |
| 2 | 6/1/2017 | Stayed the same |
| 3 | 3/28/2016 | Improved |
| 3 | 5/4/2016 | Improved |
| 3 | 4/19/2017 | Worsened |
| 4 | 5/19/2016 | Worsened |
| 4 | 2/16/2017 | Improved |
+----------+-----------+-----------------+
I'm trying to calculate a Final Result field partitioning on Employment/PersonID fields, based on the latest result/person relative to prior results. What I mean by that is explained in the logic behind Final Result:
For every Person:
(1) If all results for a person are "Stayed the same", then only should the final result for that person be "Stayed the same".
(2) If Worsened/Improved are in the result set for a person, the final result should be the latest Worsened/Improved result for that person, irrespective of "Stayed the same" after a W/I result.
Eg:
Person 1 Final result -> Stayed the same, as per (1)
Person 2 Final result -> Improved, as per (2)
Person 3 Final result -> Worsened, as per (2)
Person 4 Final result -> Improved, as per (2)
Desired Result:
+----------+-----------------+
| PersonID | Final Result |
+----------+-----------------+
| 1 | Stayed the same |
| 2 | Improved |
| 3 | Worsened |
| 4 | Improved |
+----------+-----------------+
I know this might involve Window functions or Sub-queries but I'm struggling to code this.
Hmmm. This is a prioritization query. That sounds like row_number() is called for:
select t1.personid, t1.employment
from (select t1.*,
row_number() over (partition by personid
order by (case when employment <> 'Stayed the same' then 1 else 2 end),
date desc
) as seqnum
from t1
) t1
where seqnum = 1;
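A quick way to check the prioritization query against the sample data is to run it in SQLite through Python's `sqlite3` module. This is a sketch under assumptions: the question does not name its database, window functions need SQLite 3.25 or newer, and the dates are stored as ISO strings so that they sort chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (personid INTEGER, date TEXT, employment TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [
        (1, "2017-02-28", "Stayed the same"),
        (1, "2017-04-21", "Stayed the same"),
        (1, "2017-05-18", "Stayed the same"),
        (2, "2017-03-07", "Improved"),
        (2, "2017-04-01", "Stayed the same"),
        (2, "2017-06-01", "Stayed the same"),
        (3, "2016-03-28", "Improved"),
        (3, "2016-05-04", "Improved"),
        (3, "2017-04-19", "Worsened"),
        (4, "2016-05-19", "Worsened"),
        (4, "2017-02-16", "Improved"),
    ],
)

# Improved/Worsened rows sort first (priority 1), newest first within a priority.
final_rows = conn.execute("""
SELECT personid, employment
FROM (SELECT t1.*,
             ROW_NUMBER() OVER (
               PARTITION BY personid
               ORDER BY (CASE WHEN employment <> 'Stayed the same' THEN 1 ELSE 2 END),
                        date DESC
             ) AS seqnum
      FROM t1) AS ranked
WHERE seqnum = 1
ORDER BY personid
""").fetchall()
for row in final_rows:
    print(row)
```

A person only ends up with "Stayed the same" when no priority 1 row exists in their partition, which is exactly rule (1).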

Window functions limited by value in separate column

I have a "responses" table in my postgres database that looks like
| id | question_id |
| 1 | 1 |
| 2 | 2 |
| 3 | 1 |
| 4 | 2 |
| 5 | 2 |
I want to produce a table with the response and question id, as well as the id of the previous response with that same question id, as such
| id | question_id | lag_resp_id |
| 1 | 1 | |
| 2 | 2 | |
| 3 | 1 | 1 |
| 4 | 2 | 2 |
| 5 | 2 | 4 |
Obviously pulling "lag(responses.id) over (order by responses.id)" will pull the previous response id regardless of question_id. I attempted the below subquery, but I know it is wrong since I am basically making a table of all lag ids for each question id in the subquery.
select
responses.question_id,
responses.id as response_id,
(select
lag(r2.id, 1) over (order by r2.id)
from
responses as r2
where
r2.question_id = responses.question_id
)
from
responses
I don't know if I'm on the right track with the subquery, or if I need to do something more advanced (which may involve "partition by", which I do not know how to use).
Any help would be hugely appreciated.
Use partition by. There is no need for a correlated subquery here.
select id,question_id,
lag(id) over (partition by question_id order by id) lag_resp_id
from responses
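The partitioned lag can be checked against the sample table with a short script. The sketch below uses SQLite through Python's `sqlite3` module instead of Postgres (an assumption for portability; it needs SQLite 3.25 or newer for window functions) and adds an explicit ORDER BY so the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE responses (id INTEGER PRIMARY KEY, question_id INTEGER)")
conn.executemany(
    "INSERT INTO responses VALUES (?, ?)",
    [(1, 1), (2, 2), (3, 1), (4, 2), (5, 2)],
)

# PARTITION BY restarts the lag for each question_id, so the first
# response to each question has no predecessor (NULL / None).
lag_rows = conn.execute("""
SELECT id, question_id,
       LAG(id) OVER (PARTITION BY question_id ORDER BY id) AS lag_resp_id
FROM responses
ORDER BY id
""").fetchall()
for row in lag_rows:
    print(row)
```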

Filter by value in last row of LEFT OUTER JOIN table

I have a Clients table in PostgreSQL (version 9.1.11), and I would like to write a query to filter that table. The query should return only clients which meet one of the following conditions:
--The client's last order (based on orders.created_at) has a fulfill_by_date in the past.
OR
--The client has no orders at all
I've looked for around 2 months, on and off, for a solution.
I've looked at custom last aggregate functions in Postgres, but could not get them to work, and feel there must be a built-in way to do this.
I've also looked at Postgres last_value window functions, but most of the examples are of a single table, not of a query joining multiple tables.
Any help would be greatly appreciated! Here is a sample of what I am going for:
Clients table:
| client_id | client_name |
----------------------------
| 1 | FirstClient |
| 2 | SecondClient |
| 3 | ThirdClient |
Orders table:
| order_id | client_id | fulfill_by_date | created_at |
-------------------------------------------------------
| 1 | 1 | 3000-01-01 | 2013-01-01 |
| 2 | 1 | 1999-01-01 | 2013-01-02 |
| 3 | 2 | 1999-01-01 | 2013-01-01 |
| 4 | 2 | 3000-01-01 | 2013-01-02 |
Desired query result:
| client_id | client_name |
----------------------------
| 1 | FirstClient |
| 3 | ThirdClient |
Try it this way:
SELECT c.client_id, c.client_name
FROM clients c LEFT JOIN
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY client_id ORDER BY created_at DESC) rnum
FROM orders
) o
ON c.client_id = o.client_id
AND o.rnum = 1
WHERE o.fulfill_by_date < CURRENT_DATE
OR o.order_id IS NULL
Output:
| CLIENT_ID | CLIENT_NAME |
|-----------|-------------|
| 1 | FirstClient |
| 3 | ThirdClient |
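For readers who want to reproduce that output locally, here is a minimal sketch that runs the same query shape in SQLite through Python's `sqlite3` module. This is an assumption for convenience, not the PostgreSQL 9.1 setup from the question; it needs SQLite 3.25 or newer, and dates are stored as ISO strings so that the comparison with CURRENT_DATE behaves chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clients (client_id INTEGER PRIMARY KEY, client_name TEXT);
CREATE TABLE orders (
  order_id INTEGER PRIMARY KEY,
  client_id INTEGER,
  fulfill_by_date TEXT,
  created_at TEXT
);
INSERT INTO clients VALUES (1, 'FirstClient'), (2, 'SecondClient'), (3, 'ThirdClient');
INSERT INTO orders VALUES
  (1, 1, '3000-01-01', '2013-01-01'),
  (2, 1, '1999-01-01', '2013-01-02'),
  (3, 2, '1999-01-01', '2013-01-01'),
  (4, 2, '3000-01-01', '2013-01-02');
""")

# rnum = 1 keeps only each client's most recent order; the LEFT JOIN keeps
# clients with no orders, which show up with o.order_id IS NULL.
client_rows = conn.execute("""
SELECT c.client_id, c.client_name
FROM clients AS c
LEFT JOIN (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY client_id ORDER BY created_at DESC) AS rnum
  FROM orders
) AS o
  ON c.client_id = o.client_id AND o.rnum = 1
WHERE o.fulfill_by_date < CURRENT_DATE
   OR o.order_id IS NULL
ORDER BY c.client_id
""").fetchall()
for row in client_rows:
    print(row)
```

Keeping `o.rnum = 1` in the ON clause rather than the WHERE clause matters: moving it to WHERE would filter out clients that have no orders at all.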