Graphically represent SQL Data - sql

Given a table with the following structure, holding 11M+ transactions:
ID  ProductKey  CloseDate  Part  PartAge  Sales
1   XXXXP1      5/10/15    P1    13       100
2   XXXXP2      6/1/16     P1    0        15
3   XXXXP3      4/1/08     P1    0        280
4   XXXXP1      3/18/11    P1    0        10
5   XXXXP3      6/29/15    P1    45       15
6   XXXXP1      8/11/13    P1    30       360
Products XXXXP1 and XXXXP3 are entered more than once since they are resales. PartAge = 0 indicates it's a new sale. So these products went from:
New Sale --> ReSale --> ReSale
Using a self-joining query, I can retrieve all the products which were resales. But is there a way to display these in a pretty graph or tree format?
Something which depicts the life-span of the sale transaction of the product?
Any ideas will be appreciated.
TIA,
B
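One possible starting point, sketched in Python rather than SQL: group the rows by ProductKey, order each group by CloseDate, and print the chain as text. The function names `sale_chains` and `render` are made up for this illustration, and it assumes the rows have already been fetched from the table; the same ordered chains could be fed to a charting or graph library.

```python
from collections import defaultdict
from datetime import datetime

# Sample rows from the question: (ID, ProductKey, CloseDate, Part, PartAge, Sales)
rows = [
    (1, "XXXXP1", "5/10/15", "P1", 13, 100),
    (2, "XXXXP2", "6/1/16", "P1", 0, 15),
    (3, "XXXXP3", "4/1/08", "P1", 0, 280),
    (4, "XXXXP1", "3/18/11", "P1", 0, 10),
    (5, "XXXXP3", "6/29/15", "P1", 45, 15),
    (6, "XXXXP1", "8/11/13", "P1", 30, 360),
]


def sale_chains(rows):
    """Group rows by ProductKey and sort each group by CloseDate."""
    by_product = defaultdict(list)
    for row_id, key, close, _part, age, sales in rows:
        closed = datetime.strptime(close, "%m/%d/%y").date()
        # PartAge = 0 marks a new sale; anything else is treated as a resale.
        label = "New Sale" if age == 0 else "ReSale"
        by_product[key].append((closed, label, row_id, sales))
    return {key: sorted(events) for key, events in by_product.items()}


def render(chains):
    """Return one text line per product showing its sale life-span."""
    lines = []
    for key in sorted(chains):
        steps = " --> ".join(
            f"{label} ({closed.isoformat()}, {sales})"
            for closed, label, _row_id, sales in chains[key]
        )
        lines.append(f"{key}: {steps}")
    return lines


if __name__ == "__main__":
    print("\n".join(render(sale_chains(rows))))
```

For XXXXP1 this yields the New Sale --> ReSale --> ReSale sequence described above, with the close date and sales amount attached to each step.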

Related

how to access repeat purchase records for the next three months without self join?

I have a table with customer transaction information, for example records for one customer (identified by customer_id) look like this:
order_id  bk_date  booking_has_insurance_indicator
1         7/20     0
2         8/2      0
3         8/3      1
4         8/9      1
5         11/6     0
6         12/2     0
7         12/6     0
8         12/7     0
I'd like to find out, for each customer and each order_id, whether there are repeat purchases within 90 days and how many, and if so, whether any of them has insurance attached. For example, for order_id = 1 there are three repeat purchases (order_id = 2, 3, 4) within 90 days, and some of those orders have insurance (order_id = 3, 4). The ideal output would look like:
order_id  bk_date  repeat_count  repeat_has_insurance_indicator
1         7/20     3             1
2         8/2      2             1
3         8/3      2             1
4         8/9      1             0
5         11/6     3             0
6         12/2     2             0
7         12/6     1             0
8         12/7     0             0
I'm aware that if I only wanted the next order record I could use the LEAD window function without joining, but for the question above I can only think of a self join that matches each order_id to the orders with a bk_date within the following 90 days. However, given the volume of data (millions of customers), a self join is not an option due to memory limits. Could someone help me find a more efficient solution?
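To illustrate the kind of logic that avoids pairing every order with every other order, here is a minimal Python sketch for a single customer: sort the orders by date once, then use a binary search for the end of the 90-day window and a prefix sum over the insurance flag. The function name `repeat_stats` is hypothetical, the year 2021 is an assumption (the question gives only month/day), and the window is taken to include day 90.

```python
from bisect import bisect_right
from datetime import date, timedelta


def repeat_stats(orders, window_days=90):
    """orders: (order_id, bk_date, has_insurance) tuples for ONE customer.

    Returns (order_id, bk_date, repeat_count, repeat_has_insurance) tuples.
    """
    orders = sorted(orders, key=lambda o: o[1])
    dates = [o[1] for o in orders]
    # prefix[i] = number of insured orders among orders[:i]
    prefix = [0]
    for _order_id, _bk_date, insured in orders:
        prefix.append(prefix[-1] + insured)

    result = []
    for i, (order_id, bk_date, _insured) in enumerate(orders):
        # j is one past the last order dated within the window.
        j = bisect_right(dates, bk_date + timedelta(days=window_days))
        repeat_count = j - (i + 1)
        has_insurance = 1 if prefix[j] - prefix[i + 1] > 0 else 0
        result.append((order_id, bk_date, repeat_count, has_insurance))
    return result


# Sample data from the question; the year is an assumption.
orders = [
    (1, date(2021, 7, 20), 0), (2, date(2021, 8, 2), 0),
    (3, date(2021, 8, 3), 1), (4, date(2021, 8, 9), 1),
    (5, date(2021, 11, 6), 0), (6, date(2021, 12, 2), 0),
    (7, date(2021, 12, 6), 0), (8, date(2021, 12, 7), 0),
]

if __name__ == "__main__":
    for row in repeat_stats(orders):
        print(row)
```

The work per customer is one sort plus one binary search per order. In SQL dialects that support RANGE frames with an interval offset, the same idea may be expressible as a window aggregate over the rows following the current one, but that depends on the engine.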

Is it possible to set a dynamic window frame bound in the SQL OVER (ROWS BETWEEN ...) clause?

Consider the following table, describing a patient's medication plan. For example, the first row describes that the patient with patient_id = 1 is treated from timestamp 0 to 4. At time = 0, the patient has not yet received any medication (kum_amount_start = 0). At time = 4, the patient has received a cumulative amount of 100 units of a certain drug. It can be assumed that the drug is given at a constant rate. For the first row, this means that the drug is given at a rate of 25 units/h.
patient_id  starttime [h]  endtime [h]  kum_amount_start  kum_amount_end
1           0              4            0                 100
1           4              5            100               300
1           5              15           300               550
1           15             18           550               700
2           0              3            0                 150
2           3              6            150               350
2           6              10           350               700
2           10             15           700               1100
2           15             19           1100              1500
I want to add the two columns "kum_amount_start_last_6hr" and "kum_amount_end_last_6hr" that describe the amount that has been given within the last 6 hours of the treatment (for the respective timestamps start, end).
I've been stuck on this problem for a while now.
I tried to tackle it with something like this:
SUM(kum_amount) OVER (PARTITION BY patient_id ROWS BETWEEN "dynamic window size" AND CURRENT ROW)
but I'm not sure whether this is the right approach.
I would be very happy if you could help me out here, thanks!
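To make the intended arithmetic concrete, here is a minimal Python sketch (not SQL) of one reading of the two new columns: the amount given in the last 6 hours at time t is the cumulative amount at t minus the cumulative amount at t - 6, with the cumulative curve interpolated linearly inside each row. The function names are hypothetical, and the sketch assumes each patient's rows are contiguous and sorted by starttime.

```python
def cumulative_at(plan, t):
    """Cumulative amount at time t, linearly interpolated within each row.

    plan: one patient's rows as
    (starttime, endtime, kum_amount_start, kum_amount_end),
    sorted by starttime and assumed contiguous.
    """
    if t <= plan[0][0]:
        return float(plan[0][2])
    for start, end, kum_start, kum_end in plan:
        if start <= t <= end:
            rate = (kum_end - kum_start) / (end - start)  # units per hour
            return kum_start + rate * (t - start)
    return float(plan[-1][3])


def amount_last_hours(plan, t, hours=6):
    """Amount given in the interval [t - hours, t]."""
    return cumulative_at(plan, t) - cumulative_at(plan, t - hours)


def with_last_6h(plan):
    """Append the two requested columns to each row of one patient's plan."""
    return [
        row + (amount_last_hours(plan, row[0]), amount_last_hours(plan, row[1]))
        for row in plan
    ]


# Patient 1 from the table above.
patient_1 = [(0, 4, 0, 100), (4, 5, 100, 300), (5, 15, 300, 550), (15, 18, 550, 700)]

if __name__ == "__main__":
    for row in with_last_6h(patient_1):
        print(row)
```

For the third row of patient 1 (5 h to 15 h), the cumulative amount at 9 h is 300 + 25 * 4 = 400, so the amount given in the 6 hours before the end time of 15 h is 550 - 400 = 150. Because the lower bound depends on a time offset rather than a row count, a ROWS frame alone does not capture it; the interpolation step is what handles a window that starts in the middle of a row.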

In UniQuery, how do you get the count of unique values found while doing a BREAK.ON

I know I can get the counts for how many individual entries are in each unique group of records with the following.
LIST CUSTOMER BREAK-ON CITY TOTAL EVAL "1" COL.HDG "Customer Count" TOTAL CUR_BALANCE BY CITY
And I end up with something like this.
Cust......  City......  Customer Count  Currently Owes
6           Arvada                   1            4.54
            **********  --------------  --------------
            Arvada                   1            4.54
190         Boulder                  1            0.00
1           Boulder                  1           13.65
            **********  --------------  --------------
            Boulder                  2           13.65
...
                        ==============  ==============
TOTAL                               29           85.28
29 records listed
Which becomes this, after we suppress the details and focus on the groups themselves.
City......  Customer Count  Currently Owes
Arvada                   1            4.54
Boulder                  2           13.65
Chicago                  3            4.50
Denver                   6            0.00
...
            ==============  ==============
TOTAL                   29           85.28
29 records listed
But can I get a count of how many unique groupings there are in the same report? Something like this.
City......  Customer Count  Currently Owes  City Count
Arvada                   1            4.54           1
Boulder                  2           13.65           1
Chicago                  3            4.50           1
Denver                   6            0.00           1
...
            ==============  ==============  ==========
TOTAL                   29           85.28          17
29 records listed
Essentially, I want the unique value count integrated into the other report so that I don't have to create an extra report just for something so simple.
SELECT CUSTOMER SAVING UNIQUE CITY
17 records selected to list 0.
I swear that this should be easier. I see various # variables in the documentation that hint at the possibility of doing this easily, but I have never been able to get one of them to work.
If your data is structured in such a way that your ID is what you would be grouping by, the data you want is stored in a value-delimited field, and you don't want to include or exclude anything, you can use something like the following.
In UniVerse, using the CUSTOMER table in the demo HS.SALES account installed on many systems, you can do this. The CUSTID is the record #ID, and attribute 13 is where the PRICE is stored in a value-delimited array.
LIST CUSTOMER BREAK-ON CUSTID TOTAL EVAL "DCOUNT(#RECORD<13>,#VM)" TOTAL PRICE AS P.PRICE BY CUSTID DET.SUP
Which outputs this.
             DCOUNT(#RECORD<13>,#
Customer ID  VM).................  P.PRICE
          1                     1   $4,200
          2                     3  $19,500
          3                     1   $4,250
          4                     1  $16,500
          5                     2   $3,800
          6                     0       $0
          7                     2   $5,480
          8                     2  $12,900
          9                     0       $0
         10                     3  $10,390
         11                     0       $0
         12                     0       $0
             ====================  =======
                               15  $77,020
That is a little juice for a lot of squeeze, but I hope you find it useful.
Good Luck!
Since the system variable #NB is set only on the total lines, this will allow your counter to calculate the number of TOTAL lines, which occur per unique city, excluding the grand total.
LIST CUSTOMER BREAK-ON CITY TOTAL EVAL "IF #NB < 127 THEN 1 ELSE 0" COL.HDG "Customer Count" TOTAL CUR_BALANCE BY CITY
I don't have a system to try this on, but this is my understanding of the variable.
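For reference, the numbers the desired report should contain can be pinned down outside UniQuery. The Python sketch below is not UniQuery and says nothing about how #NB behaves; it only shows the target computation (one detail count and balance total per city, plus a count of distinct cities on the grand-total line), using the three detail rows visible in the question. The function name `break_on_city` is made up for this illustration.

```python
def break_on_city(records):
    """records: (customer_id, city, cur_balance) tuples.

    Returns (per-city subtotals, grand totals), where the grand totals
    include the number of distinct cities.
    """
    groups = {}
    for _cust, city, balance in sorted(records, key=lambda r: r[1]):
        count, owed = groups.get(city, (0, 0.0))
        groups[city] = (count + 1, owed + balance)
    totals = {
        "customer_count": sum(count for count, _ in groups.values()),
        "currently_owes": round(sum(owed for _, owed in groups.values()), 2),
        "city_count": len(groups),  # one per break line, grand total excluded
    }
    return groups, totals


# Detail rows shown in the first report above.
records = [(6, "Arvada", 4.54), (190, "Boulder", 0.00), (1, "Boulder", 13.65)]

if __name__ == "__main__":
    groups, totals = break_on_city(records)
    for city, (count, owed) in groups.items():
        print(city, count, round(owed, 2), 1)
    print("TOTAL", totals)
```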

Query: Employee Training Schedules Based on Position/Workrole

My company sends folks to training. Based on projected new hires/transfers, I was asked to generate a report that estimates the number of seats we need in each course broken out by quarter.
Question: My question is two-fold:
What is the best way to represent a sequence of courses (i.e. prerequisites) in a relational DB?
How do I create the query(-ies) necessary to produce the following desired output:
Desired Output:
ID PersonnelID CourseID ProjectedStartDate ProjectedEndDate
1 1 1 1/14/2017 1/14/2017
2 2 1 2/17/2017 2/17/2017
3 2 2 2/18/2017 2/19/2017
4 2 3 2/20/2017 2/20/2017
5 3 49 1/18/2017 2/03/2017
6 …
Background Info: The courses are taken in-sequence: the first few courses are orientation courses for the company, and later courses are more specific to the employee's workrole. There are over 50 different courses, 40 different workroles and we're projecting ~1k new hires/transfers. Each work role must take a sequence of courses in a prescribed order, but I'm having trouble representing this ordering and subsequently writing the necessary query.
Existing Tables:
I have several tables that I've used to store the data: Personnel, LnkPersonnelToWorkroles, Workroles, LnkWorkrolesToCourses, and Courses (there are many others as well, but I omit them for the sake of scoping this question down). Here's some notional data from these tables:
Personnel (These are the projected new hires and their estimated arrival date.)
ID DisplayName RequiredCompletionDate
1 Kristel Bump 10/1/2016
2 Shelton Franke 3/11/2017
3 Shaunda Launer 4/16/2017
4 Clarinda Kestler 3/13/2017
5 My Wimsatt 6/6/2017
6 Gillian Bramer 10/25/2016
7 ...
Workroles (These are the positions in the company)
ID Workrole
1 Manager
2 Secretary
3 Admin Asst.
4 ...
LnkPersonnelToWorkroles (Links projected new hires to their projected workrole)
ID PersonnelID WorkroleID
1 1 1
2 2 1
3 3 1
4 4 1
5 5 1
6 6 1
7 ...
Courses (All courses available)
ID CourseName LengthInDays
1 Orientation 1
2 Email Etiquette 2
3 Workplace Safety 1
4 ...
LnkWorkrolesToCourses
(Links workroles to their required courses in a Many-to-Many relationship)
ID WorkroleID CourseID
1 1 1
2 2 1
3 2 2
4 2 3
5 3 49
6 ...
Thoughts: My approach is to first develop a person-by-person schedule based upon the new hire's target completion date and workrole. Then for each class, I could sum the number of new hires starting in that quarter.
I've considered trying to represent the courses in the most general way I could think of (i.e. using a directed acyclic graph), but since most of the courses have only a single prerequisite course, I think it's much easier to represent the prerequisites using the Prerequisites table below; however, I don't know how I would use this in a query.
Prerequisites (Is this a good idea?)
ID CourseID PrereqCourseID
1 2 1
2 3 1
3 4 1
4 5 4
5 ...
Note: I am not currently concerned with whether or not the courses are actually offered on those days; we will figure out the course schedules once we know approximately how many we need each quarter. Right now, we're trying to estimate the demand for each course.
Edit 1: To clarify the Desired Output table: if a person begins course 1 on day D, they can't start course 2 until after they finish course 1, i.e. until the next day. For a course with a length of L > 1 days, the start date of the subsequent course is delayed by L days. Notice this effect playing out for workrole ID 2 in the Desired Output table: he is expected to arrive on 2/17, start and complete course 1 the same day, begin course 2 the next day (on 2/18), and finish course 2 the day after that (on 2/19).
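The date rule in Edit 1 can be sketched in a few lines of Python. This is only an illustration of the rule, not a query: the function name `forward_schedule` is hypothetical, and it assumes the courses are already listed in the order they must be taken.

```python
from datetime import date, timedelta


def forward_schedule(arrival, courses):
    """courses: (course_id, length_in_days) tuples in the order to be taken.

    Returns (course_id, projected_start, projected_end) tuples. A course of
    length L occupies L consecutive days, and the next course starts the
    day after the previous one ends.
    """
    schedule = []
    start = arrival
    for course_id, length in courses:
        end = start + timedelta(days=length - 1)
        schedule.append((course_id, start, end))
        start = end + timedelta(days=1)
    return schedule


if __name__ == "__main__":
    # Courses 1, 2, 3 with the lengths from the Courses table, arrival on 2/17/2017.
    for row in forward_schedule(date(2017, 2, 17), [(1, 1), (2, 2), (3, 1)]):
        print(row)
```

With an arrival date of 2/17/2017 and courses 1, 2 and 3 (lengths 1, 2 and 1 days), this gives the 2/17, 2/18 to 2/19, and 2/20 rows shown in the Desired Output table.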
I'm posting this answer because it gives me an approximate solution; other answers are still welcome.
I avoided a prerequisite table altogether and opted for a simpler approach: a partial ordering of the courses.
First, I drew the course prerequisite tree; it looked similar to this image:
I defined a partial ordering of the courses based on their depth in the prerequisite tree. In the picture above, CHM124 and High School Chem w/ Lab are priority 1, CHM152 is priority 2, CHM 153 is priority 3, CHM260 and CHM 270 are priority 4, and so on... This partial ordering was stored in the CoursePriority table:
CoursePriority:
ID CourseID Priority
1 1 1
2 2 2
3 3 3
4 4 3
5 5 4
6 6 3
7 ...
So that no two courses would ever be taken at the same time, I perturbed each course's priority by a small random number using the following Update query:
UPDATE CoursePriority SET CoursePriority.Priority = [Priority]+Rnd([ID])/1000;
(I used [ID] as input to the Rnd method to ensure each course was perturbed by a different random number.) I ended up with this:
ID CourseID Priority
1 1 1.000005623
2 2 2.000094955
3 3 3.000036401
4 4 3.000052486
5 5 4.000076711
6 6 3.00000535
7 ...
The approach above answers my first question "What is the best [sensible] way to represent a sequence of courses (i.e. prerequisites) in a relational DB?" Now as for generating the course schedule...
First, I created a query qryLnkCoursePriorities to link Courses to the CoursePriority table:
SELECT Courses.ID AS CourseID, Courses.DurationInDays, CoursePriority.Priority
FROM Courses INNER JOIN CoursePriority ON Courses.ID = CoursePriority.CourseID;
Result:
CourseID DurationInDays Priority
1 35 1.000076177
2 21 2.000148297
3 28 3.000094352
4 14 3.000081442
5...
Second, I created the qryWorkrolePriorityDelay query:
SELECT LnkWorkrolesToCourses.WorkroleID,
       qryLnkCoursePriorities.CourseID AS CourseID,
       qryLnkCoursePriorities.Priority,
       qryLnkCoursePriorities.DurationInDays,
       ([DurationInDays]+Nz(DSum("DurationInDays","qryLnkCoursePriorities","[Priority]>" & [Priority] & ""))) AS LeadTimeInDays
FROM LnkWorkrolesToCourses INNER JOIN qryLnkCoursePriorities ON LnkWorkrolesToCourses.CourseID = qryLnkCoursePriorities.CourseID
ORDER BY LnkWorkrolesToCourses.WorkroleID, qryLnkCoursePriorities.Priority;
Simply put: The qryWorkrolePriorityDelay query tells me how many days in advance each course should be taken to ensure the new hire can complete all subsequent courses prior to their required training completion deadline. It looks like this:
WorkroleID CourseID Priority DurationInDays LeadTimeInDays
1 7 1.000060646 7 147
1 1 1.000076177 35 140
1 2 2.000148297 21 105
1 4 3.000081442 14 84
1 6 3.000082824 14 70
1 3 3.000094352 28 56
1 5 4.000106905 28 28
2...
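The lead-time column above is a reverse running total: each course's own duration plus the durations of every course that comes after it in priority order. A minimal Python sketch of that calculation follows. The function name `lead_times` is hypothetical, and unlike the DSum expression it sums only over the courses passed in (here, those of workrole 1), which is an assumption about the intent.

```python
def lead_times(courses):
    """courses: (course_id, priority, duration_in_days) tuples for ONE workrole.

    Returns {course_id: lead_time_in_days}, where the lead time is the
    course's own duration plus the durations of all higher-priority-number
    (i.e. later) courses.
    """
    ordered = sorted(courses, key=lambda c: c[1])
    result = {}
    running = 0
    # Walk from the last course back to the first, accumulating durations.
    for course_id, _priority, duration in reversed(ordered):
        running += duration
        result[course_id] = running
    return result


# Workrole 1 rows from the table above.
workrole_1 = [
    (7, 1.000060646, 7), (1, 1.000076177, 35), (2, 2.000148297, 21),
    (4, 3.000081442, 14), (6, 3.000082824, 14), (3, 3.000094352, 28),
    (5, 4.000106905, 28),
]

if __name__ == "__main__":
    print(lead_times(workrole_1))
```

For the workrole 1 rows this gives the same LeadTimeInDays values as the table: 147 for course 7 down to 28 for course 5.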
Finally, I was able to bring this all together to create the qryCourseSchedule query:
SELECT Personnel.ID AS PersonnelID, LnkWorkrolesToCourses.CourseID, [ProjectedHireDate]-[leadTimeInDays] AS ProjectedStartDate, [ProjectedHireDate]-[leadTimeInDays]+[Courses].[DurationInDays] AS ProjectedEndDate
FROM Personnel INNER JOIN (((LnkWorkrolesToCourses INNER JOIN (Courses INNER JOIN qryWorkrolePriorityDelay ON Courses.ID = qryWorkrolePriorityDelay.CourseID) ON (Courses.ID = LnkWorkrolesToCourses.CourseID) AND (LnkWorkrolesToCourses.WorkroleID = qryWorkrolePriorityDelay.WorkroleID)) INNER JOIN LnkPersonnelToWorkroles ON LnkWorkrolesToCourses.WorkroleID = LnkPersonnelToWorkroles.WorkroleID) INNER JOIN CoursePriority ON Courses.ID = CoursePriority.CourseID) ON Personnel.ID = LnkPersonnelToWorkroles.PersonnelID
ORDER BY Personnel.ID, [ProjectedHireDate]-[leadTimeInDays]+[Courses].[DurationInDays];
This query gives me the following output:
PersonnelID CourseID ProjectedStartDate ProjectedEndDate
1 7 5/7/2016 5/14/2016
1 1 5/14/2016 6/18/2016
1 2 6/18/2016 7/9/2016
1 4 7/9/2016 7/23/2016
1 6 7/23/2016 8/6/2016
1 3 8/6/2016 9/3/2016
1 5 9/3/2016 10/1/2016
2...
With this output, I created a pivot table, where course start dates were grouped by quarter and counted. This gave me exactly what I needed.
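The last two steps (subtracting the lead time from the deadline, then counting course starts per quarter) can be sketched in Python as follows. The function names are hypothetical, calendar quarters are assumed, and the deadline of 10/1/2016 is the RequiredCompletionDate of personnel ID 1.

```python
from collections import Counter
from datetime import date, timedelta


def backward_schedule(deadline, courses):
    """courses: (course_id, duration_in_days, lead_time_in_days) tuples.

    Mirrors the query above: start = deadline - lead time,
    end = start + duration. Rows are returned ordered by start date.
    """
    rows = []
    for course_id, duration, lead in courses:
        start = deadline - timedelta(days=lead)
        rows.append((course_id, start, start + timedelta(days=duration)))
    return sorted(rows, key=lambda r: r[1])


def starts_per_quarter(schedule):
    """Count course starts keyed by (course_id, year, calendar quarter)."""
    counts = Counter()
    for course_id, start, _end in schedule:
        quarter = (start.month - 1) // 3 + 1
        counts[(course_id, start.year, quarter)] += 1
    return counts


# Durations and lead times for workrole 1, taken from the tables above.
workrole_1_courses = [
    (7, 7, 147), (1, 35, 140), (2, 21, 105), (4, 14, 84),
    (6, 14, 70), (3, 28, 56), (5, 28, 28),
]

if __name__ == "__main__":
    schedule = backward_schedule(date(2016, 10, 1), workrole_1_courses)
    for row in schedule:
        print(row)
    print(starts_per_quarter(schedule))
```

To estimate demand, the schedules of all projected hires would be concatenated before calling `starts_per_quarter`, so that each (course, quarter) key accumulates one count per person.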

SQL Server 2005 – How to split multiple Insert into... Output Select TOP (z)...From if (Approx) Max number of child rows in a Transaction is known

I have these staging tables:
Order (PK=OrderID),
SubOrder (PK=SubOrderID, FK=OrderID) and
Item (PK=ItemID, FK1=SubOrderID, FK2=OrderID).
I established the relationships on the client (C#/.NET) and copied the tables to staging tables in SQL Server using SqlBulkCopy.
Now I need to establish Parent/Child/Grand-Child relationships on the server.
I have scripts that can do that (I am using OUTPUT clauses along with INSERT statements, and I output the PKs to temporary tables which I later use to insert child rows).
Notice that initially I had Foreign key relationships between Grand-child and parent (Item and Order) established on the client.
The SubOrder is introduced as a quantity limit. (Imagine it as the maximum number of items that can fit into a shipment box. All items are of the same size; in my case, Item rows are of the same size.)
The main problem: I can have tens of thousands of Items to insert into Production tables, let’s call them: OrderP, SubOrderP and ItemP. I also dynamically generate temp tables: OrderPWithRealPK and SubOrderPWithRealPK which hold just inserted Parent PKs.
I can have as little as 1 Order, 1 SubOrder and 1 Item, many times over, or 1 Order, 10 SubOrders and up to 100 Item elements in each SubOrder (so the distribution of (n) Order, (m) SubOrder and (k) Item elements is not predictable).
In the table below I have these parameters:
N=7 number of Order-s
M=14 number of SubOrder-s
K=23 number of Item-s
L=2 max number of Items in a SubOrder
J= Approx. number of items to be inserted in a transaction. (but items that are included need to belong to the same Order, but it may be OK to be together in the same SubOrder)
P=No. of Items in the largest Order. (this can drive what J number can be, but only if we have larger Order-s).
If we have many small Order-s then J can be predetermined. (In our example about 10)
Given (K) number of Items I would like to create relatively equal buckets of elements that can be inserted at once in a transaction, but to be submitted along with their parents and preferably Grand-parents.
Right now I have a manual transaction where I first insert a special field with the value 'TR' (representing 'In Transaction'), do the insert, and then update that field to '00' to denote that all the Items belonging to an Order are inserted; other processes query that special field for the value '00'.
It would be good if I could avoid this. I think it would be OK to have transactional scope at the SubOrder level if doing an automatic transaction (with BEGIN TRAN/COMMIT TRAN).
If I have a table below let’s say that I would like to have items with these orders to go together when being saved into Item table (of course Item PK will be generated with OUTPUT clause):
- 1, 3, 4, and 5 (10 items)
- 2 (9 items)
- 6, 7 (4 items)
The orders can be inserted in any order and preferably Suborder and Items elements need to be inserted in the order in which they were created.
Imagine that I would use a WHILE loop with TOP (Z) and a proper join query to select the Items (grand-children belonging to the parents and associated child elements) to be inserted in a transaction.
SeqNo OrderID SubOrder ItemID No. of Items
-----------------------------------------------------------------------------------
01 1 1 100 2
02 1 1 101
====================================================
03 2 2 201 9
04 2 2 202
05 2 3 301
06 2 3 302
07 2 4 401
08 2 4 402
09 2 5 501
10 2 5 502
11 2 6 503
===================================================
12 3 7 601 2
13 3 7 602
===================================================
14 4 8 801 1
===================================================
15 5 9 901 5
16 5 9 902
17 5 10 1001
18 5 10 1002
19 5 11 1201
==================================================
20 6 12 1201 1
==================================================
21 7 13 1301 3
22 7 13 1302
23 7 14 1401
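One way to think about the grouping is as bin packing: assign whole Orders to batches so that no batch exceeds roughly J items. Below is a minimal first-fit sketch in Python, not T-SQL. The function name `batch_orders` and the capacity of 10 are assumptions for illustration, and an Order larger than the capacity simply gets a batch of its own.

```python
def batch_orders(item_counts, capacity):
    """item_counts: (order_id, number_of_items) tuples in processing order.

    First-fit packing: each order goes, whole, into the first batch that
    still has room; otherwise it opens a new batch. An order with more
    items than `capacity` ends up alone in its own batch.
    """
    batches = []  # each batch is {"orders": [...], "items": n}
    for order_id, n in item_counts:
        for batch in batches:
            if batch["items"] + n <= capacity:
                batch["orders"].append(order_id)
                batch["items"] += n
                break
        else:
            batches.append({"orders": [order_id], "items": n})
    return batches


# Items per Order, from the "No. of Items" column of the table above.
item_counts = [(1, 2), (2, 9), (3, 2), (4, 1), (5, 5), (6, 1), (7, 3)]

if __name__ == "__main__":
    for batch in batch_orders(item_counts, capacity=10):
        print(batch)
```

With a capacity of 10 this produces three batches: Orders 1, 3, 4, 5 (10 items), Orders 2, 6 (10 items), and Order 7 (3 items). That is close to, but not identical with, the example grouping given earlier, since first-fit places Order 6 next to Order 2. The batch assignment could then be written to a column in the staging tables and used to drive one transaction per batch.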