In UniQuery, how do you get the count of unique values found while doing a BREAK.ON - unidata

I know I can get a count of how many individual entries are in each unique group of records with the following.
LIST CUSTOMER BREAK-ON CITY TOTAL EVAL "1" COL.HDG "Customer Count" TOTAL CUR_BALANCE BY CITY
And I end up with something like this.
Cust...... City...... Customer Count Currently Owes
6          Arvada                  1           4.54
           ********** -------------- --------------
           Arvada                  1           4.54
190        Boulder                 1           0.00
1          Boulder                 1          13.65
           ********** -------------- --------------
           Boulder                 2          13.65
...
                      ============== ==============
TOTAL                             29          85.28
29 records listed
Which becomes this, after we suppress the details and focus on the groups themselves.
City...... Customer Count Currently Owes
Arvada                  1           4.54
Boulder                 2          13.65
Chicago                 3           4.50
Denver                  6           0.00
...
           ============== ==============
TOTAL                  29          85.28
29 records listed
But can I get a count of how many unique groupings there are in the same report? Something like this.
City...... Customer Count Currently Owes City Count
Arvada                  1           4.54          1
Boulder                 2          13.65          1
Chicago                 3           4.50          1
Denver                  6           0.00          1
...
           ============== ============== ==========
TOTAL                  29          85.28         17
29 records listed
Essentially, I want the unique value count integrated into the other report so that I don't have to run a separate report like the one below just for something so simple.
SELECT CUSTOMER SAVING UNIQUE CITY
17 records selected to list 0.

I swear that this should be easier. I see various # variables in the documentation that hint at the possibility of doing this easily, but I have never been able to get one of them to work.
If your data is structured so that the ID is what you would group by, the data you want is stored in a value-delimited field, and you don't need to include or exclude anything, you can use something like the following.
In UniVerse, using the CUSTOMER table in the demo HS.SALES account installed on many systems, you can do this. CUSTID is the record #ID, and attribute 13 is where PRICE is stored as a value-delimited array.
LIST CUSTOMER BREAK-ON CUSTID TOTAL EVAL "DCOUNT(#RECORD<13>,#VM)" TOTAL PRICE AS P.PRICE BY CUSTID DET.SUP
Which outputs this.
            DCOUNT(#RECORD<13>,#
Customer ID VM)................. P.PRICE
          1                    1  $4,200
          2                    3 $19,500
          3                    1  $4,250
          4                    1 $16,500
          5                    2  $3,800
          6                    0      $0
          7                    2  $5,480
          8                    2 $12,900
          9                    0      $0
         10                    3 $10,390
         11                    0      $0
         12                    0      $0
            ==================== =======
                              15 $77,020
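For readers unfamiliar with DCOUNT, here is a rough Python sketch of what the expression above computes per record: the number of value-mark-delimited entries in attribute 13, with an empty attribute counting as zero. The `dcount` helper and the sample price strings are hypothetical illustrations (the individual prices are made up), and the value mark is assumed to be character 253.

```python
VM = chr(253)  # value mark; assumed here to be character 253


def dcount(text: str, delimiter: str) -> int:
    """Count delimited fields; an empty string is treated as 0 fields."""
    if text == "":
        return 0
    return text.count(delimiter) + 1


# Hypothetical attribute-13 contents for three records (prices are made up).
prices_by_customer = {
    "1": "4200",
    "2": VM.join(["6500", "6000", "7000"]),
    "6": "",
}

counts = {cust: dcount(prices, VM) for cust, prices in prices_by_customer.items()}
print(counts)                # per-record counts, like the middle column
print(sum(counts.values()))  # what TOTAL accumulates on the last line
```

This only mirrors the counting logic; it says nothing about how the query processor evaluates the expression internally.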
That is a little juice for a lot of squeeze, but I hope you find it useful.
Good Luck!

Since the system variable #NB is set only on the total lines, this should let your counter count the number of TOTAL lines, one per unique city, excluding the grand total.
LIST CUSTOMER BREAK-ON CITY TOTAL EVAL "IF #NB < 127 THEN 1 ELSE 0" COL.HDG "Customer Count" TOTAL CUR_BALANCE BY CITY
I don't have a system to try this on, but this is my understanding of the variable.

Related

how to access repeat purchase records for the next three months without self join?

I have a table with customer transaction information, for example records for one customer (identified by customer_id) look like this:
order_id  bk_date  booking_has_insurance_indicator
1         7/20     0
2         8/2      0
3         8/3      1
4         8/9      1
5         11/6     0
6         12/2     0
7         12/6     0
8         12/7     0
For each customer and each order_id, I'd like to find out whether there are repeat purchases within 90 days, how many there are, and whether any of them has insurance attached. For example, for order_id = 1 there are three repeat purchases (order_id = 2, 3, 4) within 90 days, and some of them have insurance (order_id = 3, 4). The ideal output would look like this:
order_id  bk_date  repeat_count  repeat_has_insurance_indicator
1         7/20     3             1
2         8/2      2             1
3         8/3      2             1
4         8/9      1             0
5         11/6     3             0
6         12/2     2             0
7         12/6     1             0
8         12/7     0             0
I'm aware that if I only wanted the next order record I could use the LEAD window function without joining, but for the question above the only approach I can think of is a self join that matches each order_id to the orders with a bk_date within 90 days. However, given the volume of data (millions of customers), a self join is not an option due to memory limits. Is there a more efficient solution?
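One way to avoid pairing every order with every other order is a single forward pass per customer: sort the orders by date, slide a window end pointer forward, and use a prefix sum to test for insurance inside the window. The Python sketch below illustrates that idea on the sample rows; the function name `repeat_stats`, the tuple layout, and the year 2023 are assumptions made for the example. Some SQL engines can express a similar look-ahead with a date-based RANGE window frame, which would be worth checking for the engine in use.

```python
from datetime import date, timedelta
from itertools import accumulate


def repeat_stats(orders, window_days=90):
    """orders: (order_id, bk_date, has_insurance) tuples for ONE customer.

    Returns (order_id, repeat_count, repeat_has_insurance) per order.
    """
    orders = sorted(orders, key=lambda o: o[1])
    # prefix[i] = number of insured orders among the first i sorted orders
    prefix = [0] + list(accumulate(o[2] for o in orders))
    result = []
    end = 0  # one past the last order inside the current window
    for i, (order_id, bk_date, _) in enumerate(orders):
        end = max(end, i + 1)
        limit = bk_date + timedelta(days=window_days)
        while end < len(orders) and orders[end][1] <= limit:
            end += 1
        count = end - (i + 1)
        insured = prefix[end] - prefix[i + 1]
        result.append((order_id, count, 1 if insured else 0))
    return result


# Sample rows from the question; the year is an assumption.
sample = [
    (1, date(2023, 7, 20), 0), (2, date(2023, 8, 2), 0),
    (3, date(2023, 8, 3), 1), (4, date(2023, 8, 9), 1),
    (5, date(2023, 11, 6), 0), (6, date(2023, 12, 2), 0),
    (7, date(2023, 12, 6), 0), (8, date(2023, 12, 7), 0),
]
stats = repeat_stats(sample)
for row in stats:
    print(row)
```

Because the window end only ever moves forward, the work per customer is the sort plus one linear pass. With a strict 90-day window this sketch counts one repeat for order_id 3 (only 8/9), since 11/6 falls 95 days after 8/3.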

increase rank based on particular value in column

I would appreciate some help with the issue below. I have the following table:
id  items
1   Product
2   Tea
3   Coffee
4   Sugar
5   Product
6   Rice
7   Wheat
8   Product
9   Beans
10  Oil
I want output like the table below. Basically, I want to increase the rank each time the item is 'Product'. How can I do that? For data privacy and compliance purposes I have modified the data and column names.
id  items    ranks
1   Product  1
2   Tea      1
3   Coffee   1
4   Sugar    1
5   Product  2
6   Rice     2
7   Wheat    2
8   Product  3
9   Beans    3
10  Oil      3
I have tried the LAG and LEAD functions but was unable to get the expected output.
Here is a solution that uses a derived value of 1 or 0 to mark the data boundaries and sums it with the ROWS UNBOUNDED PRECEDING option, which is the key here.
SELECT
id,
items,
SUM(CASE WHEN items='Product' THEN 1 ELSE 0 END) OVER (ORDER BY id ROWS UNBOUNDED PRECEDING) as ranks
FROM
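A minimal runnable sketch of this running-sum technique, using Python's built-in sqlite3 module (window functions need SQLite 3.25 or later). The table name `item_list` is a placeholder chosen for the example, since the query above does not show one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Placeholder table name; the rows are the sample data from the question.
conn.execute("CREATE TABLE item_list (id INTEGER PRIMARY KEY, items TEXT)")
rows = [
    (1, "Product"), (2, "Tea"), (3, "Coffee"), (4, "Sugar"),
    (5, "Product"), (6, "Rice"), (7, "Wheat"),
    (8, "Product"), (9, "Beans"), (10, "Oil"),
]
conn.executemany("INSERT INTO item_list VALUES (?, ?)", rows)

query = """
SELECT id,
       items,
       SUM(CASE WHEN items = 'Product' THEN 1 ELSE 0 END)
           OVER (ORDER BY id ROWS UNBOUNDED PRECEDING) AS ranks
FROM item_list
ORDER BY id
"""
ranked = conn.execute(query).fetchall()
for row in ranked:
    print(row)
```

Each 'Product' row contributes 1 to the running total and every other row contributes 0, so the total only steps up at a boundary row and stays flat until the next one.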

SQL - Sum a row of values based on Dates

I have the following type of data
ingredients:
Milk
Apple
Rice
...
Then its purchase date:
26.10.2020
25.10.2020
etc
Each item is recorded when it's purchased.
I now want a column on the right-hand side that shows how many times I bought apples, rice, and milk.
At the moment I only see:
Dates --->  25.10.2020 | 24.10.2020
Rice        1            NULL
Milk        1            1
Apples      NULL         1
My goal is to see:
Dates --->  25.10.2020 | 24.10.2020   SUM
Rice        1            NULL         1
Milk        1            1            2
Apples      NULL         1            1
Thank you for your support!
I want to see the total SUM at the end, since there will be multiple days.
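As a rough illustration of the desired result, the Python sketch below builds the per-date counts from one row per purchase and appends a per-item total. The `purchases` list is reconstructed from the small example above, and all names are hypothetical. In SQL the same idea usually means either adding a plain per-item count next to the pivoted columns, or treating NULL as 0 (for example with COALESCE) when adding the date columns together.

```python
from collections import Counter, defaultdict
from datetime import datetime

# One row per purchase, reconstructed from the example above.
purchases = [
    ("Rice", "25.10.2020"),
    ("Milk", "25.10.2020"),
    ("Milk", "24.10.2020"),
    ("Apples", "24.10.2020"),
]

counts = defaultdict(Counter)
for item, day in purchases:
    counts[item][day] += 1

# Newest date first, matching the column order in the example.
days = sorted(
    {day for _, day in purchases},
    key=lambda d: datetime.strptime(d, "%d.%m.%Y"),
    reverse=True,
)

table = {}
for item, per_day in counts.items():
    cells = [per_day.get(day) for day in days]  # None plays the role of NULL
    table[item] = cells + [sum(per_day.values())]  # last cell is the SUM column

print(["item"] + days + ["SUM"])
for item, cells in table.items():
    print([item] + cells)
```

The total is computed from the raw counts rather than from the displayed cells, so missing dates never need special handling.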

Query: Employee Training Schedules Based on Position/Workrole

My company sends folks to training. Based on projected new hires/transfers, I was asked to generate a report that estimates the number of seats we need in each course broken out by quarter.
Question: My question is two-fold:
What is the best way to represent a sequence of courses (i.e. prerequisites) in a relational DB?
How do I create the query(-ies) necessary to produce the following desired output:
Desired Output:
ID PersonnelID CourseID ProjectedStartDate ProjectedEndDate
1 1 1 1/14/2017 1/14/2017
2 2 1 2/17/2017 2/17/2017
3 2 2 2/18/2017 2/19/2017
4 2 3 2/20/2017 2/20/2017
5 3 49 1/18/2017 2/03/2017
6 …
Background Info: The courses are taken in-sequence: the first few courses are orientation courses for the company, and later courses are more specific to the employee's workrole. There are over 50 different courses, 40 different workroles and we're projecting ~1k new hires/transfers. Each work role must take a sequence of courses in a prescribed order, but I'm having trouble representing this ordering and subsequently writing the necessary query.
Existing Tables:
I have several tables that I've used to store the data: Personnel, LnkPersonnelToWorkroles, Workroles, LnkWorkrolesToCourses, and Courses (there are many others as well, but I omit them to keep this question in scope). Here's some notional data from these tables:
Personnel (These are the projected new hires and their estimated arrival date.)
ID  DisplayName       RequiredCompletionDate
1   Kristel Bump      10/1/2016
2   Shelton Franke    3/11/2017
3   Shaunda Launer    4/16/2017
4   Clarinda Kestler  3/13/2017
5   My Wimsatt        6/6/2017
6   Gillian Bramer    10/25/2016
7   ...
Workroles (These are the positions in the company)
ID  Workrole
1   Manager
2   Secretary
3   Admin Asst.
4   ...
LnkPersonnelToWorkroles (Links projected new hires to their projected workrole)
ID PersonnelID WorkroleID
1 1 1
2 2 1
3 3 1
4 4 1
5 5 1
6 6 1
7 ...
Courses (All courses available)
ID  CourseName        LengthInDays
1   Orientation       1
2   Email Etiquette   2
3   Workplace Safety  1
4   ...
LnkWorkrolesToCourses (Links workroles to their required courses in a many-to-many relationship)
ID WorkroleID CourseID
1 1 1
2 2 1
3 2 2
4 2 3
5 3 49
6 ...
Thoughts: My approach is to first develop a person-by-person schedule based upon the new hire's target completion date and workrole. Then for each class, I could sum the number of new hires starting in that quarter.
I've considered trying to represent the courses in the most general way I could think of (i.e. using a directed acyclic graph), but since most of the courses have only a single prerequisite course, I think it's much easier to represent the prerequisites using the Prerequisites table below; however, I don't know how I would use this in a query.
Prerequisites (Is this a good idea?)
ID CourseID PrereqCourseID
1 2 1
2 3 1
3 4 1
4 5 4
5 ...
Note: I am not currently concerned with whether or not the courses are actually offered on those days; we will figure out the course schedules once we know approximately how many we need each quarter. Right now, we're trying to estimate the demand for each course.
Edit 1: To clarify the Desired Output table: if a person begins course 1 on day D, they can't start course 2 until after they finish course 1, i.e. until the next day. For a course with a length of L > 1 days, the start date of the subsequent course is delayed by L days. Notice this effect playing out for workrole ID 2 in the Desired Output table: he is expected to arrive on 2/17, start and complete course 1 the same day, begin course 2 the next day (on 2/18), and finish course 2 the day after that (on 2/19).
I'm posting this answer because it gives me an approximate solution; other answers are still welcome.
I avoided a prerequisite table altogether and opted for a simpler approach: a partial ordering of the courses.
First, I drew the course prerequisite tree; it looked similar to this image:
I defined a partial ordering of the courses based on their depth in the prerequisite tree. In the picture above, CHM124 and High School Chem w/ Lab are priority 1, CHM152 is priority 2, CHM 153 is priority 3, CHM260 and CHM 270 are priority 4, and so on... This partial ordering was stored in the CoursePriority table:
CoursePriority:
ID CourseID Priority
1 1 1
2 2 2
3 3 3
4 4 3
5 5 4
6 6 3
7 ...
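For illustration, a depth-based priority like this could also be derived from a prerequisite list rather than assigned by hand. The Python sketch below uses the notional Prerequisites rows from the question (not the CoursePriority values shown above, which come from the real course tree); the function and variable names are hypothetical, and it assumes the prerequisites contain no cycles.

```python
from functools import lru_cache

# (CourseID, PrereqCourseID) rows from the notional Prerequisites table.
prerequisites = [(2, 1), (3, 1), (4, 1), (5, 4)]

prereqs_of = {}
for course, prereq in prerequisites:
    prereqs_of.setdefault(course, []).append(prereq)


@lru_cache(maxsize=None)
def priority(course: int) -> int:
    """1 for a course with no prerequisite, else 1 + its deepest prerequisite."""
    return 1 + max((priority(p) for p in prereqs_of.get(course, [])), default=0)


course_priority = {course: priority(course) for course in range(1, 6)}
print(course_priority)
```

Courses at the same depth get the same priority, which is why a tie-breaking step is still needed afterwards.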
So that no two courses would ever be taken at the same time, I perturbed each course's priority by a small random number using the following Update query:
UPDATE CoursePriority SET CoursePriority.Priority = [Priority]+Rnd([ID])/1000;
(I used [ID] as input to the Rnd method to ensure each course was perturbed by a different random number.) I ended up with this:
ID CourseID Priority
1 1 1.000005623
2 2 2.000094955
3 3 3.000036401
4 4 3.000052486
5 5 4.000076711
6 6 3.00000535
7 ...
The approach above answers my first question "What is the best [sensible] way to represent a sequence of courses (i.e. prerequisites) in a relational DB?" Now as for generating the course schedule...
First, I created a query, qryLnkCoursePriorities, to link Courses to the CoursePriority table:
SELECT Courses.ID AS CourseID, Courses.DurationInDays, CoursePriority.Priority
FROM Courses INNER JOIN CoursePriority ON Courses.ID = CoursePriority.CourseID;
Result:
CourseID DurationInDays Priority
1 35 1.000076177
2 21 2.000148297
3 28 3.000094352
4 14 3.000081442
5...
Second, I created the qryWorkrolePriorityDelay query:
SELECT LnkWorkrolesToCourses.WorkroleID,
qryLnkCoursePriorities.CourseID AS CourseID,
qryLnkCoursePriorities.Priority,
qryLnkCoursePriorities.DurationInDays,
([DurationInDays]+Nz(DSum("DurationInDays","qryLnkCoursePriorities","[Priority]>" & [Priority] & ""))) AS LeadTimeInDays
FROM LnkWorkrolesToCourses INNER JOIN qryLnkCoursePriorities ON LnkWorkrolesToCourses.CourseID = qryLnkCoursePriorities.CourseID
ORDER BY LnkWorkrolesToCourses.WorkroleID, qryLnkCoursePriorities.Priority;
Simply put: The qryWorkrolePriorityDelay query tells me how many days in advance each course should be taken to ensure the new hire can complete all subsequent courses prior to their required training completion deadline. It looks like this:
WorkroleID CourseID Priority DurationInDays LeadTimeInDays
1 7 1.000060646 7 147
1 1 1.000076177 35 140
1 2 2.000148297 21 105
1 4 3.000081442 14 84
1 6 3.000082824 14 70
1 3 3.000094352 28 56
1 5 4.000106905 28 28
2...
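The lead-time column is a running total taken from the last course backwards: each course's own duration plus the durations of every course with a higher priority. A small Python sketch of that calculation, using the workrole 1 rows above, is shown here; the function name is hypothetical, and restricting the sum to a single workrole's courses is an assumption about the intent.

```python
# (CourseID, Priority, DurationInDays) for workrole 1, from the table above.
courses = [
    (7, 1.000060646, 7),
    (1, 1.000076177, 35),
    (2, 2.000148297, 21),
    (4, 3.000081442, 14),
    (6, 3.000082824, 14),
    (3, 3.000094352, 28),
    (5, 4.000106905, 28),
]


def lead_times(rows):
    """Map CourseID -> own duration plus durations of all later courses."""
    running = 0
    result = {}
    # Walk from the last course (highest priority value) to the first.
    for course_id, _priority, duration in sorted(
        rows, key=lambda r: r[1], reverse=True
    ):
        running += duration
        result[course_id] = running
    return result


lead = lead_times(courses)
print(lead)
```

Walking the list once from the end avoids recomputing a sum for every row.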
Finally, I was able to bring this all together to create the qryCourseSchedule query:
SELECT Personnel.ID AS PersonnelID, LnkWorkrolesToCourses.CourseID, [ProjectedHireDate]-[leadTimeInDays] AS ProjectedStartDate, [ProjectedHireDate]-[leadTimeInDays]+[Courses].[DurationInDays] AS ProjectedEndDate
FROM Personnel INNER JOIN (((LnkWorkrolesToCourses INNER JOIN (Courses INNER JOIN qryWorkrolePriorityDelay ON Courses.ID = qryWorkrolePriorityDelay.CourseID) ON (Courses.ID = LnkWorkrolesToCourses.CourseID) AND (LnkWorkrolesToCourses.WorkroleID = qryWorkrolePriorityDelay.WorkroleID)) INNER JOIN LnkPersonnelToWorkroles ON LnkWorkrolesToCourses.WorkroleID = LnkPersonnelToWorkroles.WorkroleID) INNER JOIN CoursePriority ON Courses.ID = CoursePriority.CourseID) ON Personnel.ID = LnkPersonnelToWorkroles.PersonnelID
ORDER BY Personnel.ID, [ProjectedHireDate]-[leadTimeInDays]+[Courses].[DurationInDays];
This query gives me the following output:
PersonnelID CourseID ProjectedStartDate ProjectedEndDate
1 7 5/7/2016 5/14/2016
1 1 5/14/2016 6/18/2016
1 2 6/18/2016 7/9/2016
1 4 7/9/2016 7/23/2016
1 6 7/23/2016 8/6/2016
1 3 8/6/2016 9/3/2016
1 5 9/3/2016 10/1/2016
2...
With this output, I created a pivot table, where course start dates were grouped by quarter and counted. This gave me exactly what I needed.
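The date arithmetic and the quarterly grouping can be sketched in Python as follows, using the lead times and durations listed above for workrole 1 and the 10/1/2016 completion date of personnel ID 1. The `YYYY-Qn` quarter label and the variable names are choices made for this example, not part of the original queries.

```python
from collections import Counter
from datetime import date, timedelta

required_completion = date(2016, 10, 1)  # personnel ID 1
# CourseID -> LeadTimeInDays and CourseID -> DurationInDays (workrole 1).
lead = {7: 147, 1: 140, 2: 105, 4: 84, 6: 70, 3: 56, 5: 28}
duration = {7: 7, 1: 35, 2: 21, 4: 14, 6: 14, 3: 28, 5: 28}

schedule = []
for course_id, lead_days in lead.items():
    start = required_completion - timedelta(days=lead_days)
    end = start + timedelta(days=duration[course_id])
    schedule.append((course_id, start, end))


def quarter(d: date) -> str:
    """Label a date with its calendar quarter, e.g. 2016-Q2."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


# Seats needed per (course, quarter), counted by projected start date.
seats = Counter((course_id, quarter(start)) for course_id, start, _ in schedule)
for key, needed in sorted(seats.items()):
    print(key, needed)
```

With one person every count is 1; running the same loop over all projected hires would accumulate the per-quarter demand for each course.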

Graphically represent SQL Data

Given a table with the following structure and 11M+ transactions:
ID ProductKey CloseDate Part PartAge Sales
1 XXXXP1 5/10/15 P1 13 100
2 XXXXP2 6/1/16 P1 0 15
3 XXXXP3 4/1/08 P1 0 280
4 XXXXP1 3/18/11 P1 0 10
5 XXXXP3 6/29/15 P1 45 15
6 XXXXP1 8/11/13 P1 30 360
Products XXXXP1 and XXXXP3 are entered more than once since they are resales. PartAge = 0 indicates it's a new sale. So these products went from:
New Sale --> ReSale --> ReSale
Using a self-joining query, I can retrieve all the products which were resales. But is there a way to display these in a pretty graph or tree format?
Something which depicts the life-span of the sale transaction of the product?
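As a starting point for any drawing step, the transactions can be grouped by ProductKey and ordered by CloseDate to produce one chain per product. The Python sketch below does that with the sample rows; the names are hypothetical, and it assumes PartAge = 0 marks the new sale as described above. The resulting chains are plain text, but the same ordered lists could be handed to whatever graphing tool is available.

```python
from collections import defaultdict
from datetime import datetime

# (ID, ProductKey, CloseDate, PartAge) from the sample table above.
transactions = [
    (1, "XXXXP1", "5/10/15", 13),
    (2, "XXXXP2", "6/1/16", 0),
    (3, "XXXXP3", "4/1/08", 0),
    (4, "XXXXP1", "3/18/11", 0),
    (5, "XXXXP3", "6/29/15", 45),
    (6, "XXXXP1", "8/11/13", 30),
]

by_product = defaultdict(list)
for _tx_id, key, close_date, part_age in transactions:
    closed = datetime.strptime(close_date, "%m/%d/%y")
    by_product[key].append((closed, part_age))


def lifecycle(events):
    """Order a product's transactions by date and label each step."""
    labels = ["New Sale" if age == 0 else "ReSale" for _, age in sorted(events)]
    return " --> ".join(labels)


chains = {key: lifecycle(events) for key, events in by_product.items()}
for key, chain in sorted(chains.items()):
    print(key, chain)
```

Grouping in a single pass like this avoids the self join, which matters at the stated table size.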
Any ideas will be appreciated.
TIA,
B