SQL query for sales pipeline - sql

I need to build a sales pipeline with one query in SQL (BigQuery).
The table has columns:
-timestamp (event time)
-id (user id)
-event
Each event is a number from 1 to 8, and I need to calculate how many unique users reached each step.
A step is counted only if all the previous steps have been completed.
The steps do not have to follow one another immediately; what matters is that step n-1 occurred at some point before step n.
If you sort the table by 'timestamp', you often get sequences like this for one 'id' within one day:
4, 4, 1, 1, 3, 6, 5, 5, 6, 5, 6, 7, 8, 1, 2, 5, 3, 4.
In this example, the longest sequence is 1, 2, 3, 4.
The sequence must be completed within one day!
I failed to solve the problem with the max/min/lag/lead window functions. I even tried a 'case' with sequential comparisons against lag+n values.
I have spent 2 days on this task :(
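One possible way to express the rule above is a chain of CTEs, one per step, where each CTE keeps the earliest occurrence of event k that comes after the previous step on the same day. The sketch below runs it through Python's sqlite3 as a stand-in for BigQuery, so the date function and the column name ts (used instead of timestamp) are assumptions, and the helper name funnel_counts is hypothetical.

```python
import sqlite3

def funnel_counts(rows, n_steps=8):
    """Return [(step, unique_users), ...] for rows of (ts, id, event)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (ts TEXT, id TEXT, event INTEGER)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)

    # s1: earliest time each user reached step 1 on each day.
    ctes = [
        "s1 AS (SELECT id AS id, date(ts) AS day, MIN(ts) AS t "
        "FROM events WHERE event = 1 GROUP BY id, date(ts))"
    ]
    # sk: earliest event k that happens after step k-1 on the same day.
    for k in range(2, n_steps + 1):
        ctes.append(
            f"s{k} AS (SELECT e.id AS id, p.day AS day, MIN(e.ts) AS t "
            f"FROM events e JOIN s{k - 1} p "
            f"ON p.id = e.id AND p.day = date(e.ts) AND e.ts > p.t "
            f"WHERE e.event = {k} GROUP BY e.id, p.day)"
        )
    # One row per step: users who reached it on at least one day.
    counts = " UNION ALL ".join(
        f"SELECT {k} AS step, COUNT(DISTINCT id) AS users FROM s{k}"
        for k in range(1, n_steps + 1)
    )
    sql = "WITH " + ", ".join(ctes) + " " + counts + " ORDER BY step"
    return conn.execute(sql).fetchall()

# The sequence from the question, one event per second for a single user.
sequence = [4, 4, 1, 1, 3, 6, 5, 5, 6, 5, 6, 7, 8, 1, 2, 5, 3, 4]
rows = [(f"2024-01-01 00:00:{i:02d}", "u1", ev) for i, ev in enumerate(sequence)]
print(funnel_counts(rows))
```

Taking the earliest valid timestamp at every step is what makes the chain safe: reaching a step sooner can never rule out a later step. For the sequence in the question the user is counted for steps 1 to 4 and not for steps 5 to 8.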

Related

get the user only 100 rows from database that he hasn't seen yet

I need a way (in any kind of DB) to fetch only the data that a user hasn't already pulled from the database. For example:
I have a user who pulls 5 country names from the database each time, and when he finishes viewing them I want him to get 5 more country names that he hasn't pulled yet.
Can you help me find a way to do it?
*sorry for my English
The key to this is ordered results:
select id, name from country where id > #highest_id_so_far order by id limit 5;
Start with a #highest_id_so_far below any real ID, e.g. -1 if all IDs are positive. You get the first entries, say, IDs 1, 4, 5, 6, 7.
The highest ID returned was 7, so query with #highest_id_so_far = 7 then and you get the next five rows (e.g. 8, 10, 12, 23, 24). And so on.
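A minimal sketch of that loop, using Python's sqlite3 with an in-memory country table. The table contents and the helper names next_page and demo are made up for illustration; only the query shape comes from the answer above.

```python
import sqlite3

def next_page(conn, highest_id_so_far, page_size=5):
    """Return up to page_size (id, name) rows with id above the last one seen."""
    return conn.execute(
        "SELECT id, name FROM country WHERE id > ? ORDER BY id LIMIT ?",
        (highest_id_so_far, page_size),
    ).fetchall()

def demo():
    """Page through a small sample table and return the IDs seen on each page."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT)")
    ids = [1, 4, 5, 6, 7, 8, 10, 12, 23, 24]  # gaps in the IDs are fine
    conn.executemany(
        "INSERT INTO country VALUES (?, ?)", [(i, f"country {i}") for i in ids]
    )
    pages, last_id = [], -1  # -1 is below every real ID in this sample
    while True:
        page = next_page(conn, last_id)
        if not page:
            break
        pages.append([row[0] for row in page])
        last_id = page[-1][0]  # remember the highest ID returned so far
    return pages

print(demo())
```

The only state to keep per user is the last ID returned, which is why the ORDER BY on the same column as the filter is essential.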

Checking for occurrence of value in postgres - NodeJS

Within my database my data can look one of two ways
1 -
hh_match_count: 5,
hh_total_fhc_0: 6,
hh_total_fhc_1: 5,
hh_total_fhc_2: 3,
hh_total_fhc_3: 2,
hh_total_fhc_4: 4
2 -
hh_match_count: 3,
hh_total_fhc_0: 6,
hh_total_fhc_1: 5,
hh_total_fhc_2: 3,
hh_total_fhc_3: null,
hh_total_fhc_4: null
What I want to do is calculate the number of times a value is >= 1 (will want to expand this to >= 2, >= 3 etc) from each of hh_total_fhc_0, hh_total_fhc_1, hh_total_fhc_2, hh_total_fhc_3, hh_total_fhc_4 and then divide that by hh_match_count. So basically getting the % of occurrences.
What query should I be looking at executing here? Slowly getting more involved with SQL statements.
coalesce returns the first non-null value it's passed. That turns your null values into zeroes, which is what you need so that they count as misses in the percentage. Next step is to add least to the mix:
SELECT least(1, coalesce(hh_total_fhc_0, 0)) FROM fixtures gives you a 0 if the value is zero or null, and a 1 if it's a positive value. The coalesce is crucial here: Postgres's least ignores null arguments, so least(1, null) is 1, not null. Apply that to each of your columns, add the results up and divide by hh_match_count, and you have the hit percentage exactly as you were thinking.
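The least trick only covers the ">= 1" case. For the ">= 2", ">= 3" variants mentioned in the question, one option is to compare each column against a threshold and add up the results. The sketch below swaps in that comparison technique and runs it on SQLite through Python, where a comparison already evaluates to 0 or 1 (in Postgres the boolean would need a cast such as ::int). The sample rows are the two from the question; the helper names are hypothetical.

```python
import sqlite3

COLUMNS = [f"hh_total_fhc_{i}" for i in range(5)]

def make_fixtures():
    """Build an in-memory fixtures table holding the two rows from the question."""
    conn = sqlite3.connect(":memory:")
    column_defs = ", ".join(f"{c} INTEGER" for c in COLUMNS)
    conn.execute(f"CREATE TABLE fixtures (hh_match_count INTEGER, {column_defs})")
    conn.executemany(
        "INSERT INTO fixtures VALUES (?, ?, ?, ?, ?, ?)",
        [(5, 6, 5, 3, 2, 4), (3, 6, 5, 3, None, None)],
    )
    return conn

def hit_rates(conn, threshold=1):
    """Fraction of matches per row whose value is >= threshold (null counts as 0)."""
    hits = " + ".join(f"(COALESCE({c}, 0) >= :t)" for c in COLUMNS)
    sql = f"SELECT ({hits}) * 1.0 / hh_match_count FROM fixtures ORDER BY rowid"
    return [row[0] for row in conn.execute(sql, {"t": threshold})]

conn = make_fixtures()
print(hit_rates(conn, 1))
print(hit_rates(conn, 3))
```

Multiplying by 1.0 before dividing avoids integer division, which would otherwise truncate every rate below 100% to 0.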

How to get the most recent rows in a group

I have a Rails 4.2.5.x project running PostgreSQL. I have a table with a similar structure to this:
id, contact_id, date, domain, f1, f2, f3, etc
1, ABC, 01-01-16, abc.com, 1, 2, 3, ...
2, ABC, 01-01-15, abc.com, 1, 2, 3, ...
3, ABC, 01-01-14, abc.com, 1, 2, 3, ...
4, DEF, 01-01-15, abc.com, 1, 2, 3, ...
5, DEF, 01-01-14, abc.com, 1, 2, 3, ...
6, GHI, 01-11-16, abc.com, 1, 2, 3, ...
7, GHI, 01-01-16, abc.com, 1, 2, 3, ...
8, GHI, 01-01-15, abc.com, 1, 2, 3, ...
9, GHI, 01-01-14, abc.com, 1, 2, 3, ...
...
...
99, ZZZ, 01-01-16, xyz.com, 1, 2, 3, ...
I need to query to find:
The most recent rows by date
filtered by domain
for a distinct contact_id (grouped by?)
row-limited result. In this example, I'm not adding this complication but this needs to be factored in. If there are 50 distinct contacts, I am only interested in the top 3 by date.
ID is the primary key.
there are indexes on the other columns
the fX columns indicate other data in the model that is needed (such as contact email, for example).
In MySQL, this would be a simple SELECT * FROM table WHERE domain='abc.com' GROUP BY contact_id ORDER BY date DESC. PostgreSQL, however, complains that:
ActiveRecord::StatementInvalid: PG::GroupingError: ERROR: column "table.id" must appear in the GROUP BY clause or be used in an aggregate function
I expect to get back 3 rows; 1, 4 and 6. Ideally, I'd like to get back the full rows in a single query... but I accept that I may need to do one query to get the IDs first, then another to find the items I want.
This is the closest I have got:
ExampleContacts
.select(:contact_id, 'max(date) AS max_date')
.where(domain: 'abc.com')
.group(:contact_id)
.order('max_date desc')
.limit(3)
However... this returns the contact_id, not the id. I cannot add the ID for the row.
EDIT:
Essentially, I need to get the primary key back for the row which is grouped on the non-primary key and sorted by another field.
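If you want the rows, you don't need grouping. Postgres's DISTINCT ON keeps the first row for each contact_id, provided the ORDER BY starts with that column: ExampleContacts.select('DISTINCT ON (contact_id) *').where(domain: 'abc.com').order(:contact_id, date: :desc). Note that a limit(3) added directly to this query would take the first 3 contacts by contact_id, not the 3 most recent by date.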
Just to clarify @murad-yusufov's accepted answer, I ended up doing this:
subquery = ExampleContacts.select('DISTINCT ON (contact_id) *')
  .where(domain: 'abc.com')
  .order(:contact_id)
  .order(date: :desc)
ExampleContacts.from("(#{subquery.to_sql}) example_contacts")
  .order(date: :desc)
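For reference, the same "latest row per contact, then top 3 by date" result can be sketched without DISTINCT ON, which is Postgres-specific. The example below uses ROW_NUMBER() instead and runs on SQLite (3.25 or later) through Python. The table name example_contacts, the ISO date strings, and the reading of 01-11-16 as 2016-11-01 are assumptions made for the illustration.

```python
import sqlite3

LATEST_PER_CONTACT = """
SELECT id, contact_id, date, domain
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY contact_id ORDER BY date DESC) AS rn
    FROM example_contacts
    WHERE domain = ?
) AS ranked
WHERE rn = 1          -- newest row of each contact
ORDER BY date DESC    -- then rank the contacts themselves by date
LIMIT ?
"""

def latest_ids(domain, limit=3):
    """Return the primary keys of the newest row per contact, newest first."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE example_contacts "
        "(id INTEGER PRIMARY KEY, contact_id TEXT, date TEXT, domain TEXT)"
    )
    conn.executemany(
        "INSERT INTO example_contacts VALUES (?, ?, ?, ?)",
        [
            (1, "ABC", "2016-01-01", "abc.com"),
            (2, "ABC", "2015-01-01", "abc.com"),
            (3, "ABC", "2014-01-01", "abc.com"),
            (4, "DEF", "2015-01-01", "abc.com"),
            (5, "DEF", "2014-01-01", "abc.com"),
            (6, "GHI", "2016-11-01", "abc.com"),
            (7, "GHI", "2016-01-01", "abc.com"),
            (8, "GHI", "2015-01-01", "abc.com"),
            (9, "GHI", "2014-01-01", "abc.com"),
            (99, "ZZZ", "2016-01-01", "xyz.com"),
        ],
    )
    return [row[0] for row in conn.execute(LATEST_PER_CONTACT, (domain, limit))]

print(latest_ids("abc.com"))
```

On this sample the query returns rows 1, 4 and 6, the three the question expects, ordered newest first.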

Can SQL Server perform an update on rows with a set operation on the aggregate max or min value?

I am a fairly experienced SQL Server developer but this problem has me REALLY stumped.
I have a FUNCTION. The function is referencing a table that is something like this...
PERFORMANCE_ID, JUDGE_ID, JUDGING_CRITERIA, SCORE
--------------------------------------------------
101, 1, 'JUMP_HEIGHT', 8
101, 1, 'DEXTERITY', 7
101, 1, 'SYNCHRONIZATION', 6
101, 1, 'SPEED', 9
101, 2, 'JUMP_HEIGHT', 6
101, 2, 'DEXTERITY', 5
101, 2, 'SYNCHRONIZATION', 8
101, 2, 'SPEED', 9
101, 3, 'JUMP_HEIGHT', 9
101, 3, 'DEXTERITY', 6
101, 3, 'SYNCHRONIZATION', 7
101, 3, 'SPEED', 8
101, 4, 'JUMP_HEIGHT', 7
101, 4, 'DEXTERITY', 6
101, 4, 'SYNCHRONIZATION', 5
101, 4, 'SPEED', 8
In this example there are 4 judges (with IDs 1, 2, 3, and 4) judging a performance (101) against 4 different criteria (JUMP_HEIGHT, DEXTERITY, SYNCHRONIZATION, SPEED).
(Please keep in mind that in my real data there are 10+ criteria and at least 6 judges.)
I want to aggregate the results in a score BY JUDGING_CRITERIA and then aggregate those into a final score by summing...something like this...
SELECT SUM (Avgs) FROM
(SELECT AVG(SCORE) Avgs
FROM PERFORMANCE_SCORES
WHERE PERFORMANCE_ID=101
GROUP BY JUDGING_CRITERIA) result
BUT... that is not quite what I want IN THAT I want to EXCLUDE from the AVG the highest and lowest values for each JUDGING_CRITERIA grouping. That is the part that I can't figure out. The AVG should be applied only to the MIDDLE values of the GROUPING FOR EACH JUDGING_CRITERIA. The HI value and the LO value for JUMP_HEIGHT should not be included in the average. The HI value and the LO value for DEXTERITY should not be included in the average. ETC.
I know this could be accomplished with a cursor to set the hi and lo for each criteria to NULL. But this is a FUNCTION and should be extremely fast.
I am wondering if there is a way to do this as a SET operation but still automatically exclude HI and LO from the aggregation?
Thanks for your help. I have a feeling it can probably be done with some advanced SQL syntax but I don't know it.
One last thing. This example is actually a simplification of the problem I am trying to solve. I have other constraints not mentioned here for the sake of simplicity.
Seth
EDIT: -Moved the WHERE clause to inside the CTE.
-Removed JudgeID from the partition
This would be my approach
;WITH Agg1 AS
(
    SELECT PERFORMANCE_ID
          ,JUDGE_ID
          ,JUDGING_CRITERIA
          ,SCORE
          ,MinFind = ROW_NUMBER() OVER ( PARTITION BY PERFORMANCE_ID
                                                     ,JUDGING_CRITERIA
                                         ORDER BY SCORE ASC )
          ,MaxFind = ROW_NUMBER() OVER ( PARTITION BY PERFORMANCE_ID
                                                     ,JUDGING_CRITERIA
                                         ORDER BY SCORE DESC )
    FROM PERFORMANCE_SCORES
    WHERE PERFORMANCE_ID=101
)
SELECT AVG(SCORE)
FROM Agg1
WHERE MinFind > 1
  AND MaxFind > 1
GROUP BY JUDGING_CRITERIA
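The query above returns one trimmed average per criterion. To get the single final score the question asks for, those averages still need to be summed. A sketch of the full chain follows, run on SQLite (3.25 or later) through Python with the sample data from the question. The CTE names ranked and per_criteria and the helper trimmed_total are hypothetical. Note that SQLite's AVG always returns a float, whereas SQL Server's AVG over an integer column returns an integer, so a cast would be needed there to keep the fractions.

```python
import sqlite3

TRIMMED_TOTAL = """
WITH ranked AS (
    SELECT JUDGING_CRITERIA, SCORE,
           ROW_NUMBER() OVER (PARTITION BY PERFORMANCE_ID, JUDGING_CRITERIA
                              ORDER BY SCORE ASC)  AS MinFind,
           ROW_NUMBER() OVER (PARTITION BY PERFORMANCE_ID, JUDGING_CRITERIA
                              ORDER BY SCORE DESC) AS MaxFind
    FROM PERFORMANCE_SCORES
    WHERE PERFORMANCE_ID = ?
),
per_criteria AS (
    -- Drop exactly one lowest and one highest score per criterion.
    SELECT JUDGING_CRITERIA, AVG(SCORE) AS trimmed_avg
    FROM ranked
    WHERE MinFind > 1 AND MaxFind > 1
    GROUP BY JUDGING_CRITERIA
)
SELECT SUM(trimmed_avg) FROM per_criteria
"""

CRITERIA = ["JUMP_HEIGHT", "DEXTERITY", "SYNCHRONIZATION", "SPEED"]
# Scores per judge, in the order of CRITERIA, copied from the question.
SCORES = {1: [8, 7, 6, 9], 2: [6, 5, 8, 9], 3: [9, 6, 7, 8], 4: [7, 6, 5, 8]}

def trimmed_total(performance_id=101):
    """Sum of per-criterion averages with one high and one low score removed."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE PERFORMANCE_SCORES (PERFORMANCE_ID INTEGER, "
        "JUDGE_ID INTEGER, JUDGING_CRITERIA TEXT, SCORE INTEGER)"
    )
    conn.executemany(
        "INSERT INTO PERFORMANCE_SCORES VALUES (?, ?, ?, ?)",
        [
            (101, judge, criterion, score)
            for judge, scores in SCORES.items()
            for criterion, score in zip(CRITERIA, scores)
        ],
    )
    return conn.execute(TRIMMED_TOTAL, (performance_id,)).fetchone()[0]

print(trimmed_total())
```

Because ROW_NUMBER numbers tied scores separately, only one copy of a tied high or low score is removed, which matches dropping a single HI and a single LO per criterion.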

Conditionally creating a list of numbers in Access SQL

This is a homework task for Microsoft Access 2010.
How can I create a query that returns a list of numbers from 1 to 18, on one condition: the list must not contain any number that exists in table A?
So for example.
Table A has 1, 5, 10, 8, and 16.
Therefore, the query will only return this list of numbers (2, 3, 4, 6, 7, 9, 11, 12, 13, 14, 15, 17, 18)
Would anyone be kind enough to shed some pointers?
Thanks.
SQL is a set-based language. So you are always working with intersections, unions, complements, etc.
Consider that Table A could be completely empty. In this case you need to return all the numbers from 1 to 18. You are going to need to get these numbers from somewhere. (Hint, how about creating another table?)
Once you can write a query returning all the numbers from 1 to 18, you can start thinking about getting all the numbers from 1 to 18 EXCEPT those in table A.
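Following those hints, one way this could look is a helper table holding 1 to 18 and a LEFT JOIN that keeps only the numbers with no match in table A. The sketch runs on SQLite through Python rather than Access, and the table names numbers and a, the column name n, and the helper missing_numbers are all assumptions. The LEFT JOIN ... IS NULL form is used because it avoids the EXCEPT keyword, which not every dialect supports.

```python
import sqlite3

def missing_numbers(taken, low=1, high=18):
    """Return the numbers in [low, high] that do not appear in table a."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE numbers (n INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE a (n INTEGER)")
    # The helper table supplies the full range, so an empty table a still works.
    conn.executemany(
        "INSERT INTO numbers VALUES (?)", [(i,) for i in range(low, high + 1)]
    )
    conn.executemany("INSERT INTO a VALUES (?)", [(i,) for i in taken])
    rows = conn.execute(
        "SELECT numbers.n FROM numbers "
        "LEFT JOIN a ON a.n = numbers.n "
        "WHERE a.n IS NULL "  # keep only numbers without a partner in a
        "ORDER BY numbers.n"
    ).fetchall()
    return [row[0] for row in rows]

print(missing_numbers([1, 5, 10, 8, 16]))
```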