I have a table that lists customer purchases by date. The data is sorted by customer and then by purchase date.
I need to place the total number of orders a particular customer has made in a third column (probably by checking the number of previous instances of the customer's name).
My table currently looks like this:
Column A Column B Column C
1 12/03/13 Angela
2 01/05/14 Angela
3 03/07/14 Angela
4 04/01/14 Angela
5 03/06/13 Ben
6 04/02/13 Ben
7 11/11/15 Carl
8 12/11/15 Carl
9 01/01/16 Carl
10 02/03/17 David
11 04/04/17 Ethan
And what I need to see is the following (where Column C is the Total Orders for that customer so far):
Column A Column B Column C
1 12/03/13 Angela 1
2 01/05/14 Angela 2
3 03/07/14 Angela 3
4 04/01/14 Angela 4
5 03/06/13 Ben 1
6 04/02/13 Ben 2
7 11/11/15 Carl 1
8 12/11/15 Carl 2
9 01/01/16 Carl 3
10 02/03/17 David 1
11 04/04/17 Ethan 1
Any help is greatly appreciated!
Try the following in C2 (the formula assumes the customer names are in column B, starting in row 2):
=COUNTIF($B$2:$B2,$B2)
Drag down for as many rows as required.
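For readers who want to see the logic outside a spreadsheet, here is a minimal Python sketch of what the expanding-range COUNTIF does: for each row it counts how often that row's name has appeared from the top of the list down to the current row. The `running_counts` helper is a hypothetical name used for illustration only.

```python
from collections import Counter

# Customer names in sheet order (the name column from the question).
names = ["Angela", "Angela", "Angela", "Angela", "Ben", "Ben",
         "Carl", "Carl", "Carl", "David", "Ethan"]

def running_counts(values):
    """For each position, return how many times that value has appeared so far."""
    seen = Counter()
    result = []
    for value in values:
        seen[value] += 1            # mirrors counting matches in $B$2:$B<current row>
        result.append(seen[value])
    return result

order_numbers = running_counts(names)
print(order_numbers)  # [1, 2, 3, 4, 1, 2, 1, 2, 3, 1, 1]
```

Unlike the formula, this keeps a tally instead of rescanning the range for every row, but the resulting column is the same.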
I'm working in iMIS CMS (iMIS 200) and trying to create an IQA (an iMIS query, using SQL) that will give me a timetable of slots assigned to people per day (I've got this working); but then I want to be able to filter that timetable on a person's profile so they just see the slots they are assigned to.
(This is for auditions for an orchestra. So people make an application per instrument, then those applications are assigned to audition slots, of which there are several slots per day)
As the start/end times for slots are calculated using SUM OVER, when I filter this query by the person ID, I lose the correct start/end times for slots (as the other slots aren't in the data for it to SUM, I guess!)
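To illustrate the effect, here is a small Python sketch (not iMIS/IQA syntax) using the Manchester day from the sample data below. The `timetable` helper is hypothetical; it stands in for the running SUM OVER, and the contact names are resolved by hand from the sample tables. Filtering before accumulating gives the wrong times, while accumulating over the whole day and filtering afterwards keeps them correct.

```python
from datetime import datetime, timedelta

# Manchester day from the sample data: (SlotOrder, SlotType, SlotDuration, contact)
day_start = datetime(2019, 9, 17, 10, 0)
slots = [
    (1, "Audition", 20, "Steve Jones"),
    (2, "Audition", 20, "Graham North"),
    (4, "Chat", 10, "Graham North"),
    (3, "Chat", 10, "Steve Jones"),
    (5, "Audition", 20, None),          # unassigned slot
]

def timetable(rows, start):
    """Assign start/end times by accumulating durations in SlotOrder."""
    out, cursor = [], start
    for order, kind, minutes, name in sorted(rows, key=lambda r: r[0]):
        end = cursor + timedelta(minutes=minutes)
        out.append((name, kind, f"{cursor:%H:%M} - {end:%H:%M}"))
        cursor = end
    return out

# Filter first, then accumulate: other people's slots are missing from the sum.
wrong = timetable([s for s in slots if s[3] == "Steve Jones"], day_start)

# Accumulate over the whole day first, then filter: times stay correct.
right = [row for row in timetable(slots, day_start) if row[0] == "Steve Jones"]

print(wrong)  # Chat slot shows 10:20 - 10:30
print(right)  # Chat slot shows 10:40 - 10:50
```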
Table structure:
tblContacts
===========
ContactID ContactName
---------------------------
1 Steve Jones
2 Clare Philips
3 Bob Smith
4 Helen Winters
5 Graham North
6 Sarah Stuart
tblApplications
===============
AppID FKContactID Instrument
-----------------------------------
1 1 Violin
2 1 Viola
3 2 Cello
4 3 Cello
5 4 Trumpet
6 5 Clarinet
7 5 Horn
8 6 Trumpet
tblAuditionDays
===============
AudDayID AudDayDate AudDayVenue AudDayStart
-------------------------------------------------
1 16-Sep-19 London 10:00
2 17-Sep-19 Manchester 10:00
3 18-Sep-19 Birmingham 13:30
4 19-Sep-19 Leeds 10:00
5 19-Sep-19 Glasgow 11:30
tblAuditionSlots
================
SlotID FKAudDayID SlotOrder SlotType SlotDuration FKAppID
-----------------------------------------------------------------
1 1 1 Audition 20 3
2 1 2 Audition 20 4
3 1 3 Chat 10 3
4 1 5 Chat 10 4
5 1 4 Audition 20
6 2 1 Audition 20 1
7 2 2 Audition 20 6
8 2 4 Chat 10 6
9 2 3 Chat 10 1
10 2 5 Audition 20
11 3 2 Chat 10 8
12 3 1 Audition 20 2
13 3 4 Chat 5 2
14 3 3 Audition 20 8
15 5 1 Audition 30 5
16 5 2 Audition 30 7
17 5 3 Chat 15 7
18 5 4 Chat 15 5
Current SQL for listing all the slots each day (in date/slot order, with the slot timings calculated correctly) is:
SELECT
[tblAuditionSlots].[SlotOrder] as [Order],
CASE
WHEN
SUM([tblAuditionSlots].[SlotDuration]) OVER (PARTITION BY [tblAuditionSlots].[FKAudDayID] ORDER BY [tblAuditionSlots].[SlotOrder] ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) is null
THEN
CONVERT(VARCHAR(5), [tblAuditionDays].[AudDayStart], 108)
ELSE
CONVERT(VARCHAR(5), Dateadd(minute, SUM([tblAuditionSlots].[SlotDuration]) OVER (PARTITION BY [tblAuditionSlots].[FKAudDayID] ORDER BY [tblAuditionSlots].[SlotOrder] ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), [tblAuditionDays].[AudDayStart]), 108)
END
+ ' - ' +
CASE
WHEN
SUM([tblAuditionSlots].[SlotDuration]) OVER (PARTITION BY [tblAuditionSlots].[FKAudDayID] ORDER BY [tblAuditionSlots].[SlotOrder] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) is null
THEN
CONVERT(VARCHAR(5), [tblAuditionDays].[AudDayStart], 108)
ELSE
CONVERT(VARCHAR(5), Dateadd(minute, SUM([tblAuditionSlots].[SlotDuration]) OVER (PARTITION BY [tblAuditionSlots].[FKAudDayID] ORDER BY [tblAuditionSlots].[SlotOrder] ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW), [tblAuditionDays].[AudDayStart]), 108)
END AS [Slot],
[tblAuditionSlots].[SlotType] AS [Type],
[tblContacts].[ContactName] as [Name]
FROM
tblAuditionSlots
LEFT JOIN tblAuditionDays ON tblAuditionSlots.FKAudDayID = tblAuditionDays.AudDayID
LEFT JOIN tblApplications ON tblAuditionSlots.FKAppID = tblApplications.AppID
LEFT JOIN tblContacts ON tblApplications.FKContactID = tblContacts.ContactID
GROUP BY
[tblAuditionSlots].[SlotOrder],
[tblAuditionSlots].[SlotType],
[tblAuditionSlots].[SlotDuration],
[tblAuditionDays].[AudDayStart],
[tblContacts].[ContactName],
[tblContacts].[ContactID],
[tblAuditionDays].[AudDayID],
[tblAuditionDays].[AudDayDate]
ORDER BY
[tblAuditionDays].[AudDayDate],
[tblAuditionSlots].[SlotOrder]
iMIS, the CMS we're using, limits what you can create in an IQA (query).
You can basically insert (some) SQL as a column and give it an alias; you can add (non-calculated) fields to the order by; you can't really control the Group By (whatever fields are added are included in the Group By).
Ultimately, I'd like to be able to filter this by a Contact ID so I can see all their audition slots, but with the times correctly calculated.
From the sample data, for example:
STEVE JONES AUDITIONS
=====================
Date Slot Venue Type Instrument
----------------------------------------------------------------
17-Sep-19 10:00 - 10:20 Manchester Audition Violin
17-Sep-19 10:40 - 10:50 Manchester Chat Violin
18-Sep-19 13:30 - 13:50 Birmingham Audition Viola
18-Sep-19 14:20 - 14:25 Birmingham Chat Viola
HELEN WINTERS AUDITIONS
=======================
Date Slot Venue Type Instrument
----------------------------------------------------------------
19-Sep-19 11:30 - 12:00 Glasgow Audition Trumpet
19-Sep-19 12:45 - 13:00 Glasgow Chat Trumpet
Hopefully that all makes sense and I've provided enough information.
(In this version of iMIS [200], you can't do subqueries, in case that comes up...)
Thanks so much in advance for whatever help/tips/advice you can offer!
Chris
I have been having an issue for a while now and cannot find a solution that works for me. It might be that I am just not doing it correctly, or that there is an alternative that will work better. I am open to, and appreciative of, any ideas.
I have a table (tblDocQueue) in Access that is like the one displayed below. The data comes from a data extract of an older application that we use at work, so neither the source nor the extract can be changed. We upload the data to Access to analyze it and build metrics around it. The table is as follows:
ID DocName OwnerName AccountNum DocRef
1 Doc 1 Matt 1001 Z0005638
2 Doc 1 Matt 1002 Z0005638
3 Doc 1 Tony 5010 Z0005639
4 Doc 2 Luke 1050 Z0005640
5 Doc 3 Luke 1050 Z0005641
6 Doc 3 Gary 1234 Z0005641
7 Doc 4 John 8789 Z0005642
8 Doc 5 Ed 8789 Z0005642
9 Doc 5 Ed 8790 Z0005643
10 Doc 5 Connie 4579 Z0005644
11 Doc 6 Mary 3616 Z0005645
12 Doc 6 Lucy 4795 Z0005646
13 Doc 6 Tina 4795 Z0005646
14 Doc 7 Matt 1001 Z0005638
15 Doc 7 John 8789 Z0005647
There are more columns than those listed, but they are not relevant to the question. I am trying to remove duplicates, keeping one row per unique combination of three columns (DocName, OwnerName, DocRef). I was doing this using the following SQL, but it began taking hours (up to 7) to process around 500,000 rows of data. I am unsure whether the efficiency problem is caused by using Min/Max or by something else.
SELECT tblDocQueue.ID AS Expr1, tblDocQueue.DocName AS Expr2,
tblDocQueue.OwnerName AS Expr3, tblDocQueue.AcctNo AS Expr4,
tblDocQueue.ExpDate AS Expr5, tblDocQueue.EffectiveDate AS Expr6,
tblDocQueue.SignatureDate AS Expr7, tblDocQueue.DocBNYSts AS Expr8,
tblDocQueue.StsDate AS Expr9, tblDocQueue.UserSts AS Expr10,
tblDocQueue.Location AS Expr11, tblDocQueue.Ackngmt AS Expr12,
tblDocQueue.OPID AS Expr13, tblDocQueue.Comments AS Expr14,
tblDocQueue.DocRef AS Expr15, tblDocQueue.ExternalComment AS Expr16,
tblDocQueue.FirstName AS Expr17, tblDocQueue.LastName AS Expr18,
tblDocQueue.ClientID AS Expr19, tblDocQueue.Address AS Expr20,
tblDocQueue.CountryCode AS Expr21
FROM tblDocQueue
WHERE ((([tblDocQueue].[ID])=(
SELECT Min(t.[ID])
FROM [tblDocQueue] AS t
WHERE t.[DocRef]=[tblDocQueue].[DocRef]
AND t.[DocName]=[tblDocQueue].[DocName]
AND t.[OwnerName]=[tblDocQueue].[OwnerName])));
The time this was taking was unacceptable for the business team. I then developed a workaround in VBA that exports the data to an Excel file, uses Excel's built-in Remove Duplicates function, and imports the unique data back into a different table. This only takes a few seconds in Excel. As the use of this database is beginning to expand, and I will be removing duplicates in a similar way for hundreds of datasets a day, I am trying to get this to work without the above workaround.
The anticipated result of the above example data would be:
ID DocName OwnerName AccountNum DocRef
1 Doc 1 Matt 1001 Z0005638
3 Doc 1 Tony 5010 Z0005639
4 Doc 2 Luke 1050 Z0005640
5 Doc 3 Luke 1050 Z0005641
6 Doc 3 Gary 1234 Z0005641
7 Doc 4 John 8789 Z0005642
8 Doc 5 Ed 8789 Z0005642
9 Doc 5 Ed 8790 Z0005643
10 Doc 5 Connie 4579 Z0005644
11 Doc 6 Mary 3616 Z0005645
12 Doc 6 Lucy 4795 Z0005646
13 Doc 6 Tina 4795 Z0005646
14 Doc 7 Matt 1001 Z0005638
15 Doc 7 John 8789 Z0005647
If anyone can help me with the SQL to:
Conditionally remove duplicate values based on three columns
Using Microsoft Access 2010
While keeping one row of each unique value
In some way that is efficient / does not take a long amount of time (tables up to 5,000,000 records)
Any and all help is greatly appreciated!
Consider a straightforward aggregate query, grouped on the columns that must be distinct: DocName, OwnerName, DocRef. Then take the Min() of ID and AccountNum, and do the same for any other columns you need to carry through:
SELECT Min(tblDocQueue.ID) AS MinOfID,
tblDocQueue.DocName,
tblDocQueue.OwnerName,
Min(tblDocQueue.AccountNum) AS MinOfAccountNum,
tblDocQueue.DocRef
FROM tblDocQueue
GROUP BY tblDocQueue.DocName,
tblDocQueue.OwnerName,
tblDocQueue.DocRef
ORDER BY Min(tblDocQueue.ID);
Output
MinOfID DocName OwnerName MinOfAccountNum DocRef
1 Doc 1 Matt 1001 Z0005638
3 Doc 1 Tony 5010 Z0005639
4 Doc 2 Luke 1050 Z0005640
5 Doc 3 Luke 1050 Z0005641
6 Doc 3 Gary 1234 Z0005641
7 Doc 4 John 8789 Z0005642
8 Doc 5 Ed 8789 Z0005642
9 Doc 5 Ed 8790 Z0005643
10 Doc 5 Connie 4579 Z0005644
11 Doc 6 Mary 3616 Z0005645
12 Doc 6 Lucy 4795 Z0005646
13 Doc 6 Tina 4795 Z0005646
14 Doc 7 Matt 1001 Z0005638
15 Doc 7 John 8789 Z0005647
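As a quick way to check the grouping logic outside Access, the same query can be run against the sample rows in an in-memory SQLite database from Python. This is only a sketch under that assumption: SQLite is not the Access engine, so it says nothing about Access performance, and only the five columns shown in the question are modelled.

```python
import sqlite3

# Sample rows from the question: (ID, DocName, OwnerName, AccountNum, DocRef)
rows = [
    (1, "Doc 1", "Matt", 1001, "Z0005638"), (2, "Doc 1", "Matt", 1002, "Z0005638"),
    (3, "Doc 1", "Tony", 5010, "Z0005639"), (4, "Doc 2", "Luke", 1050, "Z0005640"),
    (5, "Doc 3", "Luke", 1050, "Z0005641"), (6, "Doc 3", "Gary", 1234, "Z0005641"),
    (7, "Doc 4", "John", 8789, "Z0005642"), (8, "Doc 5", "Ed", 8789, "Z0005642"),
    (9, "Doc 5", "Ed", 8790, "Z0005643"), (10, "Doc 5", "Connie", 4579, "Z0005644"),
    (11, "Doc 6", "Mary", 3616, "Z0005645"), (12, "Doc 6", "Lucy", 4795, "Z0005646"),
    (13, "Doc 6", "Tina", 4795, "Z0005646"), (14, "Doc 7", "Matt", 1001, "Z0005638"),
    (15, "Doc 7", "John", 8789, "Z0005647"),
]

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE tblDocQueue (
    ID INTEGER PRIMARY KEY, DocName TEXT, OwnerName TEXT,
    AccountNum INTEGER, DocRef TEXT)""")
conn.executemany("INSERT INTO tblDocQueue VALUES (?, ?, ?, ?, ?)", rows)

# One row per (DocName, OwnerName, DocRef); keep the lowest ID in each group.
kept = conn.execute("""
    SELECT MIN(ID), DocName, OwnerName, MIN(AccountNum), DocRef
    FROM tblDocQueue
    GROUP BY DocName, OwnerName, DocRef
    ORDER BY MIN(ID)
""").fetchall()

kept_ids = [r[0] for r in kept]
print(kept_ids)  # every ID except 2, the only duplicate in the sample
```

One thing to keep in mind: each Min() is evaluated independently, so with more columns the aggregated values in one output row are not guaranteed to come from the same source row.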
This is a trivial example, but I am trying to understand how to think creatively using SQL.
For example, I have the following tables below, and I want to query the names of folks who have three or more questions. How can I do this without using HAVING or COUNT? I wonder if this is possible using JOINS or something similar?
FOLKS
folkID name
---------- --------------
01 Bill
02 Joe
03 Amy
04 Mike
05 Chris
06 Elizabeth
07 James
08 Ashley
QUESTION
folkID questionRating questionDate
---------- ---------- ----------
01 2 2011-01-22
01 4 2011-01-27
02 4
03 2 2011-01-20
03 4 2011-01-12
03 2 2011-01-30
04 3 2011-01-09
05 3 2011-01-27
05 2 2011-01-22
05 4
06 3 2011-01-15
06 5 2011-01-19
07 5 2011-01-20
08 3 2011-01-02
Using SUM or CASE seems to be cheating to me!
I'm not sure if it's possible in your current formulation, but if you add a primary key to the question table (questionid) then the following seems to work:
SELECT DISTINCT Folks.folkid, Folks.name
FROM ((Folks
INNER JOIN Question AS Question_1 ON Folks.folkid = Question_1.folkid)
INNER JOIN Question AS Question_2 ON Folks.folkid = Question_2.folkid)
INNER JOIN Question AS Question_3 ON Folks.folkid = Question_3.folkid
WHERE (((Question_1.questionid) <> [Question_2].[questionid] And
(Question_1.questionid) <> [Question_3].[questionid]) AND
(Question_2.questionid) <> [Question_3].[questionid]);
Sorry, this is in MS Access SQL, but it should translate to any flavour of SQL.
Returns:
folkid name
3 Amy
5 Chris
Update: Just to explain why this works. Each join returns all the question ids asked by that person, so the three joins together produce every combination of three of that person's question ids. The WHERE clause then keeps only the rows in which all three question ids are different. If fewer than three questions were asked, no such row exists.
For example, Bill:
folkid name Question_3.questionid Question_1.questionid Question_2.questionid
1 Bill 1 1 1
1 Bill 1 1 2
1 Bill 1 2 1
1 Bill 1 2 2
1 Bill 2 1 1
1 Bill 2 1 2
1 Bill 2 2 1
1 Bill 2 2 2
There are no rows where all the ids are different.
However, for Amy:
folkid name Question_3.questionid Question_1.questionid Question_2.questionid
3 Amy 4 4 5
3 Amy 4 4 4
3 Amy 4 4 6
3 Amy 4 5 4
3 Amy 4 5 5
3 Amy 4 5 6
3 Amy 4 6 4
3 Amy 4 6 5
3 Amy 4 6 6
3 Amy 5 4 4
3 Amy 5 4 5
3 Amy 5 4 6
3 Amy 5 5 4
3 Amy 5 5 5
3 Amy 5 5 6
3 Amy 5 6 4
3 Amy 5 6 5
3 Amy 5 6 6
3 Amy 6 4 4
3 Amy 6 4 5
3 Amy 6 4 6
3 Amy 6 5 4
3 Amy 6 5 5
3 Amy 6 5 6
3 Amy 6 6 4
3 Amy 6 6 5
3 Amy 6 6 6
Several of these rows have three different ids (for example 4, 5, 6), and hence Amy is returned by the above query.
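The self-join idea can be tried out with Python's built-in `sqlite3` module. This is a sketch under a few assumptions: an auto-numbered `questionid` column is added as suggested above, only the `folkID` column of QUESTION is loaded, and the three `<>` tests are replaced by `q1 < q2 < q3`, a small variation that keeps one ordered combination per set of three questions instead of all six permutations.

```python
import sqlite3

folks = [(1, "Bill"), (2, "Joe"), (3, "Amy"), (4, "Mike"),
         (5, "Chris"), (6, "Elizabeth"), (7, "James"), (8, "Ashley")]
# folkID of each question, in the order listed in the QUESTION table.
question_owners = [1, 1, 2, 3, 3, 3, 4, 5, 5, 5, 6, 6, 7, 8]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE FOLKS (folkID INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE QUESTION (
    questionid INTEGER PRIMARY KEY AUTOINCREMENT, folkID INTEGER)""")
conn.executemany("INSERT INTO FOLKS VALUES (?, ?)", folks)
conn.executemany("INSERT INTO QUESTION (folkID) VALUES (?)",
                 [(f,) for f in question_owners])

# No COUNT, HAVING, SUM or CASE: a person qualifies only if three
# distinct questions of theirs can be lined up side by side.
three_or_more = conn.execute("""
    SELECT DISTINCT f.folkID, f.name
    FROM FOLKS AS f
    JOIN QUESTION AS q1 ON q1.folkID = f.folkID
    JOIN QUESTION AS q2 ON q2.folkID = f.folkID
    JOIN QUESTION AS q3 ON q3.folkID = f.folkID
    WHERE q1.questionid < q2.questionid
      AND q2.questionid < q3.questionid
    ORDER BY f.folkID
""").fetchall()

print(three_or_more)  # [(3, 'Amy'), (5, 'Chris')]
```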
You can try SUM to replace COUNT:
SELECT SUM(CASE WHEN field_name >= 3 THEN field_name ELSE 0 END)
FROM table_name
SELECT f.*
FROM (
SELECT DISTINCT
COUNT(*) OVER (PARTITION BY q.folkID) AS [Count] --count questions for folks
,q.folkID
FROM QUESTION AS q
) AS p
INNER JOIN FOLKS as f ON f.folkID = p.folkID
WHERE p.[Count] >= 3
How can I create a new column (inCount) with numbering of occurrences in a specific column?
Here is an example:
id name inCount
1 Orly 1
2 Ernest 1
3 Rachel 1
4 Don 1
5 Don 2
6 Ernest 2
7 Angela 1
8 Ernest 3
9 David 1
10 Rachel 2
11 Sully 1
12 Sully 2
13 Rachel 3
14 David 2
15 David 3
16 Kevin 1
17 Kevin 2
18 Orly 2
19 Angela 2
20 Sully 3
21 Kevin 3
22 Don 3
23 Orly 3
24 Angela 3
Don from id 5 is numbered 2 because Don appears in id 4 too.
Don from id 22 is numbered 3 due to the above preceding occurrences.
I use MS SQL SERVER 2008 R2 Express edition.
Thanks.
You could use partition by, like:
select row_number() over (partition by name order by id) as inCount
, *
from YourTable
order by
id
This should work:
SELECT id, Name, ROW_NUMBER() OVER(PARTITION BY Name ORDER BY id) AS inCount
FROM YourTable
ORDER BY id
EDIT: Added order by clause on the select in order to show results in same order indicated by OP. The ORDER BY in the ROW_NUMBER did not change the outcome, but I changed to id as it will keep the row_number correct for the sample data.
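To see the numbering in action without a SQL Server instance, the same window function can be run on the first eight sample rows with Python's built-in `sqlite3` module. This is a sketch only: it assumes the bundled SQLite is version 3.25 or later (the first release with window functions), and `YourTable` is a placeholder table name.

```python
import sqlite3

# First eight rows of the sample: (id, name)
people = [(1, "Orly"), (2, "Ernest"), (3, "Rachel"), (4, "Don"),
          (5, "Don"), (6, "Ernest"), (7, "Angela"), (8, "Ernest")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO YourTable VALUES (?, ?)", people)

# PARTITION BY restarts the numbering for each name;
# ORDER BY id inside OVER() numbers occurrences in id order.
numbered = conn.execute("""
    SELECT id, name,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS inCount
    FROM YourTable
    ORDER BY id
""").fetchall()

in_counts = [row[2] for row in numbered]
print(in_counts)  # [1, 1, 1, 1, 2, 2, 1, 3]
```

The outer ORDER BY only controls the display order; the numbering itself is decided by the ORDER BY inside OVER().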
I have a database table that I need some SQL for (which is defeating me so far!).
Imagine there are 192 athletic clubs that all take part in 12 track meets per season.
That is 2304 individual performances per season (for example, in the 100 metres).
I would like to find the top 48 (unique) individual performances from the table; these 48 athletes will then take part in the end-of-season World Championships.
So imagine the 2 fastest times are both set by "John Smith", but he can only be entered once in the World Championships. I would then look for the next fastest time not set by "John Smith", and so on until I have 48 unique athletes.
Hope that makes sense.
Thanks in advance if anyone can help.
PS
I did have a nice screenshot that would explain it much better, but as a newish user I cannot post images.
I'll try a copy-and-paste version instead:
ID AthleteName AthleteID Time
1 Josh Lewis 3 11.99
2 Joe Dundee 4 11.31
3 Mark Danes 5 13.44
4 Josh Lewis 3 13.12
5 John Smith 1 11.12
6 John Smith 1 12.18
7 John Smith 1 11.22
8 Adam Bennett 6 11.33
9 Ronny Bower 7 12.88
10 John Smith 1 13.49
11 Adam Bennett 6 12.55
12 Mark Danes 5 12.12
13 Carl Tompkins 2 13.11
14 Joe Dundee 4 11.28
15 Ronny Bower 7 12.14
16 Carl Tompkins 2 11.88
17 Nigel Downs 8 14.14
18 Nigel Downs 8 12.19
Top 4 unique individual performances
1 John Smith 1 11.12
3 Joe Dundee 4 11.28
5 Adam Bennett 6 11.33
6 Carl Tompkins 2 11.88
Basically something like this:
select top 48 *
from (
select athleteId,min(time) as bestTime
from theRaces
where raceId = '123' -- e.g., 123=100 meters
group by athleteId
) x
order by bestTime
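Here is a sketch of that approach run against the sample data with Python's built-in `sqlite3` module. A few assumptions: SQLite has no `TOP`, so `LIMIT` is used instead; the limit is 4 rather than 48 to match the sample; and the `raceId` filter is left out because the sample table has no such column.

```python
import sqlite3

# Sample rows from the question: (ID, AthleteName, AthleteID, Time)
performances = [
    (1, "Josh Lewis", 3, 11.99), (2, "Joe Dundee", 4, 11.31),
    (3, "Mark Danes", 5, 13.44), (4, "Josh Lewis", 3, 13.12),
    (5, "John Smith", 1, 11.12), (6, "John Smith", 1, 12.18),
    (7, "John Smith", 1, 11.22), (8, "Adam Bennett", 6, 11.33),
    (9, "Ronny Bower", 7, 12.88), (10, "John Smith", 1, 13.49),
    (11, "Adam Bennett", 6, 12.55), (12, "Mark Danes", 5, 12.12),
    (13, "Carl Tompkins", 2, 13.11), (14, "Joe Dundee", 4, 11.28),
    (15, "Ronny Bower", 7, 12.14), (16, "Carl Tompkins", 2, 11.88),
    (17, "Nigel Downs", 8, 14.14), (18, "Nigel Downs", 8, 12.19),
]

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE theRaces (
    ID INTEGER PRIMARY KEY, AthleteName TEXT, AthleteID INTEGER, Time REAL)""")
conn.executemany("INSERT INTO theRaces VALUES (?, ?, ?, ?)", performances)

# Collapse each athlete to their best time, then take the fastest four.
top_four = conn.execute("""
    SELECT AthleteID, MIN(Time) AS bestTime
    FROM theRaces
    GROUP BY AthleteID
    ORDER BY bestTime
    LIMIT 4
""").fetchall()

print(top_four)  # [(1, 11.12), (4, 11.28), (6, 11.33), (2, 11.88)]
```

Grouping on AthleteID rather than AthleteName means that a name spelled two different ways still counts as one athlete.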
Try this:
select x.ID, x.AthleteName , x.AthleteID , x.Time
from
(
select rownum tr_count,v.AthleteID AthleteID, v.AthleteName AthleteName, v.Time Time,v.id id
from
(
select
tr1.AthleteName AthleteName, tr1.Time time,min(tr1.id) id, tr1.AthleteID AthleteID
from theRaces tr1
where time =
(select min(time) from theRaces tr2 where tr2.athleteId = tr1.athleteId)
group by tr1.AthleteName, tr1.AthleteID, tr1.Time
having tr1.Time = ( select min(tr2.time) from theRaces tr2 where tr1.AthleteID =tr2.AthleteID)
order by tr1.time
) v
) x
where x.tr_count <= 48