Finding the first occurrence of an element in a SQL database - sql

I have a table with a column for customer names, a column for purchase amount, and a column for the date of the purchase. Is there an easy way I can find how much first time customers spent on each day?
So I have
Name | Purchase Amount | Date
Joe 10 9/1/2014
Tom 27 9/1/2014
Dave 36 9/1/2014
Tom 7 9/2/2014
Diane 10 9/3/2014
Larry 12 9/3/2014
Dave 14 9/5/2014
Jerry 16 9/6/2014
And I would like something like
Date | Total first Time Purchase
9/1/2014 73
9/3/2014 22
9/6/2014 16
Can anyone help me out with this?

The following is standard SQL and works on nearly every DBMS that supports window functions:
select date,
sum(purchaseamount) as total_first_time_purchase
from (
select date,
purchaseamount,
row_number() over (partition by name order by date) as rn
from the_table
) t
where rn = 1
group by date;
The derived table (the inner select) numbers each customer's purchases by date, so the rows with rn = 1 are the "first time" purchases. The outer query then aggregates those rows by date.
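To see the query run end to end, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database (an assumption, the question does not name a DBMS; SQLite needs version 3.25 or later for window functions). The dates are stored as ISO strings so that ordering by date works on text, and an ORDER BY is added to the outer query to make the output order predictable.

```python
import sqlite3

# In-memory SQLite stands in for the real database (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (name TEXT, purchaseamount INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?)",
    [
        ("Joe", 10, "2014-09-01"),
        ("Tom", 27, "2014-09-01"),
        ("Dave", 36, "2014-09-01"),
        ("Tom", 7, "2014-09-02"),
        ("Diane", 10, "2014-09-03"),
        ("Larry", 12, "2014-09-03"),
        ("Dave", 14, "2014-09-05"),
        ("Jerry", 16, "2014-09-06"),
    ],
)

# rn = 1 marks each customer's earliest purchase; the outer query sums per day.
first_time_totals = conn.execute(
    """
    SELECT date, SUM(purchaseamount) AS total_first_time_purchase
    FROM (
        SELECT date, purchaseamount,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY date) AS rn
        FROM the_table
    ) t
    WHERE rn = 1
    GROUP BY date
    ORDER BY date
    """
).fetchall()

for day, total in first_time_totals:
    print(day, total)
# 2014-09-01 73
# 2014-09-03 22
# 2014-09-06 16
```

Days on which only returning customers bought something (9/2 and 9/5 in the sample data) simply do not appear in the result.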

The two key concepts here are aggregates and sub-queries. The details of which DBMS you're using may change the exact implementation, but the basic approach is the same:
1. For each name, determine their first date
2. Using the results of 1, find each person's first day purchase amount
3. Using the results of 2, sum the amounts for each date
In SQL Server, it could look like this:
select Date, [totalFirstTimePurchases] = sum(PurchaseAmount)
from (
select t.Date, t.PurchaseAmount, t.Name
from table1 t
join (
select Name, [firstDate] = min(Date)
from table1
group by Name
) f on t.Name=f.Name and t.Date=f.firstDate
) ftp
group by Date
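One behaviour worth checking with this join: if a customer buys more than once on their first day, every one of those purchases matches the min(Date) row, whereas a ROW_NUMBER() filter keeps only one of them. The sketch below shows the difference, using Python's sqlite3 as a stand-in for SQL Server (an assumption) and made-up sample rows for Ann and Bob.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (Name TEXT, PurchaseAmount INTEGER, Date TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        ("Ann", 5, "2014-09-01"),  # hypothetical data: Ann buys twice on day one
        ("Ann", 7, "2014-09-01"),
        ("Bob", 3, "2014-09-02"),
        ("Ann", 9, "2014-09-03"),
    ],
)

# Join on min(Date): both of Ann's first-day purchases are counted.
join_totals = dict(conn.execute(
    """
    SELECT t.Date, SUM(t.PurchaseAmount)
    FROM table1 t
    JOIN (SELECT Name, MIN(Date) AS firstDate FROM table1 GROUP BY Name) f
      ON t.Name = f.Name AND t.Date = f.firstDate
    GROUP BY t.Date
    """
).fetchall())

# ROW_NUMBER(): exactly one purchase per customer is kept. Which of Ann's two
# rows gets rn = 1 is not defined, because ORDER BY Date cannot break the tie.
row_number_totals = dict(conn.execute(
    """
    SELECT Date, SUM(PurchaseAmount)
    FROM (
        SELECT Date, PurchaseAmount,
               ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date) AS rn
        FROM table1
    ) x
    WHERE rn = 1
    GROUP BY Date
    """
).fetchall())

print(join_totals)        # {'2014-09-01': 12, '2014-09-02': 3}
print(row_number_totals)  # the 2014-09-01 total is 5 or 7, not 12
```

Which result is "right" depends on whether you want everything spent on the first day or only the very first purchase.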

If you are using SQL Server you can accomplish this with either sub-queries or CTEs (Common Table Expressions). Since there is already an answer with sub-queries, here is the CTE version.
First the following will identify each row where there is a first time purchase and then get the sum of those values grouped by date:
;WITH cte
AS (
SELECT [Name]
,PurchaseAmount
,[date]
,ROW_NUMBER() OVER (
PARTITION BY [Name] ORDER BY [date] --start at 1 for each name at the earliest date and count up, reset every time the name changes
) AS rn
FROM yourTableName
)
SELECT [date]
,sum(PurchaseAmount) AS TotalFirstTimePurchases
FROM cte
WHERE rn = 1
GROUP BY [date]

Related

Select particular not grouped column from grouped set

The topic might be a little bit unclear but I couldn't describe in a single sentence what I want to achieve.
Say I have a table that is (columns)
id INT PK
name VARCHAR
date DATE
I have a grouping select
select
name,
max(date)
from table
group by name
that gives me a name and the latest date.
What is the easiest way to join the id column to the current aggregated result set with the id value where the date was the maximum?
Let me explain what my goal is with an example:
The table is filled with the data as follows
id name date
1 david 2012-12-12
2 david 2013-12-02
3 patrick 2014-01-02
4 patrick 2012-11-11
and by my query I'd like to get the following result
id name date
2 david 2013-12-02
3 patrick 2014-01-02
Notice that all the records for name = 'david' are aggregated and the maximum date is selected. How to get the row id for this maximum date?
One option is to use ROW_NUMBER():
SELECT id, name, date
FROM (
SELECT id, name, date,
row_number() over (partition by name order by date desc) rn
FROM yourtable
) t
WHERE rn = 1
Another option is to join the table back to itself using the MAX() aggregate. This option can return more than one row per name if several rows for that name share the same max date:
SELECT t.id, t.name, t.date
FROM yourtable t
JOIN (SELECT name, max(date) maxdate
FROM yourtable
GROUP BY name) t2 on t.name = t2.name AND t.date = t2.maxdate
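The tie behaviour of the two options can be sketched as follows, using Python's sqlite3 as a stand-in database (an assumption). The row with id 5 is hypothetical and was added to the question's sample data to create a tie on david's latest date; the extra `id DESC` sort key is likewise an addition, there to make the ROW_NUMBER() pick deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (id INTEGER PRIMARY KEY, name TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?)",
    [
        (1, "david", "2012-12-12"),
        (2, "david", "2013-12-02"),
        (3, "patrick", "2014-01-02"),
        (4, "patrick", "2012-11-11"),
        (5, "david", "2013-12-02"),  # hypothetical row that ties with id 2
    ],
)

# MAX() join: both tied rows for david come back.
join_ids = [row[0] for row in conn.execute(
    """
    SELECT t.id
    FROM yourtable t
    JOIN (SELECT name, MAX(date) AS maxdate FROM yourtable GROUP BY name) t2
      ON t.name = t2.name AND t.date = t2.maxdate
    ORDER BY t.id
    """
)]

# ROW_NUMBER(): one row per name; id DESC decides which tied row wins.
row_number_rows = conn.execute(
    """
    SELECT id, name, date
    FROM (
        SELECT id, name, date,
               ROW_NUMBER() OVER (PARTITION BY name ORDER BY date DESC, id DESC) AS rn
        FROM yourtable
    ) t
    WHERE rn = 1
    ORDER BY name
    """
).fetchall()

print(join_ids)         # [2, 3, 5]
print(row_number_rows)  # [(5, 'david', '2013-12-02'), (3, 'patrick', '2014-01-02')]
```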

Get adjacent fields with GROUP BY / MAX

This has probably been asked many times but I can't find a solution because I don't know how to phrase this question.
[product] [shop] [price] [date]
Pizza Shop1 10 2014-05-10
Pizza Shop2 12 2014-05-04
Snow Shop1 101 2014-05-02
Snow Shop3 93 2014-05-11
I wish to query this table and get the price of the last added product:
[product] [shop] [price] [date]
Pizza Shop1 10 2014-05-10
Snow Shop3 93 2014-05-11
An obviously wrong syntax:
SELECT
product,
shop WHERE MAX(date),
price WHERE MAX(date),
MAX(date)
FROM myTable
GROUP BY product
This query is already a part of a subquery so I want the best possible performing solution.
Using ROW_NUMBER() you can split the records into partitions, one per product, and assign each record a row number, starting with 1 for the newest record in its partition (product).
Then you just select all the records with a row number of 1, for each product.
WITH
sequenced AS
(
SELECT
myTable.*,
ROW_NUMBER() OVER (PARTITION BY [product] ORDER BY [date] DESC) AS product_date_ordinal
FROM
myTable
)
SELECT
*
FROM
sequenced
WHERE
product_date_ordinal = 1
This assumes SQL Server 2005 onwards, and you should have an index on (product, date DESC) for best performance.
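As a rough illustration of the query and the suggested index together, here is a sketch in Python's sqlite3 (a stand-in for SQL Server, so the square brackets are dropped; the index name is made up). It only checks the result, it does not try to measure whether the index is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (product TEXT, shop TEXT, price INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?, ?)",
    [
        ("Pizza", "Shop1", 10, "2014-05-10"),
        ("Pizza", "Shop2", 12, "2014-05-04"),
        ("Snow", "Shop1", 101, "2014-05-02"),
        ("Snow", "Shop3", 93, "2014-05-11"),
    ],
)

# Index matching the PARTITION BY / ORDER BY of the window (name is hypothetical).
conn.execute("CREATE INDEX ix_myTable_product_date ON myTable (product, date DESC)")

latest = conn.execute(
    """
    WITH sequenced AS (
        SELECT myTable.*,
               ROW_NUMBER() OVER (PARTITION BY product ORDER BY date DESC)
                   AS product_date_ordinal
        FROM myTable
    )
    SELECT product, shop, price, date
    FROM sequenced
    WHERE product_date_ordinal = 1
    ORDER BY product
    """
).fetchall()

print(latest)
# [('Pizza', 'Shop1', 10, '2014-05-10'), ('Snow', 'Shop3', 93, '2014-05-11')]
```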
You can try this as well. It uses a join with a subquery to filter your table. It's probably not as good as the first answer, but it's easier to read and understand:
SELECT Table.*
FROM Table
RIGHT JOIN
(
SELECT product, max(date) AS date FROM Table
GROUP BY product
) AS Filter
ON Filter.product = Table.product AND Filter.date = Table.date
Just replace "Table" with the name of your table.

How to select multiple rows in SQL Server while filling one column with the first value

Each of my rows has a date, and I want the database to keep the correct dates. But in this situation I want only the first date, while still including all the other rows. So I would like every row of my result to show that same date.
For an example (Because I don't think I expressed myself well)
I have this:
name value date
a 10 5/13
b 14 2/13
c 20 1/13
a 11 7/13
a 5 8/13
b 8 9/13
I want it to become like this in the result:
name value date
a 26 5/13
b 22 5/13
c 20 5/13
I searched for this information but I only find the way to select the first row.
for now I'm doing
SELECT name, SUM(value), date FROM table
ORDER BY name
And I'm kind of clueless for what to do next.
Thanks :)
Databases don't have a concept of "first". Here is an attempt, but no guarantees unless you have a way of ordering to determine first:
select name, sum(value), const.date
from table cross join
(select top 1 date from table) const
group by name, const.date
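To make "first" well defined you need something to order by. A minimal sketch, assuming a hypothetical `seq` column that records insertion order and reading the question's dates as month/year (both are assumptions), run through Python's sqlite3, where `LIMIT 1` plays the role of `TOP 1`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# seq is a hypothetical column; INTEGER PRIMARY KEY values follow insertion order here.
conn.execute(
    "CREATE TABLE t (seq INTEGER PRIMARY KEY, name TEXT, value INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO t (name, value, date) VALUES (?, ?, ?)",
    [
        ("a", 10, "2013-05"),
        ("b", 14, "2013-02"),
        ("c", 20, "2013-01"),
        ("a", 11, "2013-07"),
        ("a", 5, "2013-08"),
        ("b", 8, "2013-09"),
    ],
)

# The explicit ORDER BY seq is what makes "first" deterministic.
totals = conn.execute(
    """
    SELECT t.name, SUM(t.value), const.date
    FROM t
    CROSS JOIN (SELECT date FROM t ORDER BY seq LIMIT 1) const
    GROUP BY t.name, const.date
    ORDER BY t.name
    """
).fetchall()

print(totals)
# [('a', 26, '2013-05'), ('b', 22, '2013-05'), ('c', 20, '2013-05')]
```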
If you only want to do this for a query, to provide this aggregated data for some specific client requirement, then #freshPrince's answer is appropriate. But if you want to actually modify the data in the table itself, and prevent the issue from arising again, then you need to change the schema.
Create Table newTable(
name varChar(30) not null,
date datetime not null,
value decimal(10,2) not null default(0),
primary key (name, date) )
Insert newTable (name, date, value)
Select name, Min(date), SUM(value)
FROM currentTable
Group By Name
Then delete the old table and rename the new table to whatever you like.
You will also have to modify the process used to insert new rows so that instead of always inserting a new row, it updates the existing row for a specified name and date if one already exists.
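The "update if it exists, otherwise insert" step might look like the sketch below. It uses SQLite's `ON CONFLICT ... DO UPDATE` clause (available from SQLite 3.24) through Python's sqlite3; on SQL Server you would need a different statement, for example MERGE or an UPDATE followed by a conditional INSERT. The helper name `add_value` is made up, and adding the new amount to the stored one (rather than overwriting it) is an assumption about what "updates" should mean here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE newTable (
        name  TEXT NOT NULL,
        date  TEXT NOT NULL,
        value NUMERIC NOT NULL DEFAULT 0,
        PRIMARY KEY (name, date)
    )
    """
)


def add_value(name, date, value):
    """Insert a row, or add to the existing row with the same (name, date)."""
    conn.execute(
        """
        INSERT INTO newTable (name, date, value) VALUES (?, ?, ?)
        ON CONFLICT (name, date) DO UPDATE SET value = value + excluded.value
        """,
        (name, date, value),
    )


add_value("a", "2013-05-01", 10)
add_value("a", "2013-05-01", 11)  # same key: updates instead of inserting
add_value("b", "2013-02-01", 14)

rows = conn.execute("SELECT name, date, value FROM newTable ORDER BY name").fetchall()
print(rows)  # [('a', '2013-05-01', 21), ('b', '2013-02-01', 14)]
```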
Your question is slightly confusing, since your desired result shows a date that does not exist for either b or c, but if that is the result you want, you could use something similar to the following:
select name, sum(value) value, d.date
from yt
cross join
(
select min(date) date
from yt
where name = (select min(name)
from yt)
) d
group by name, d.date;
But it seems like you actually would want the min(date) for each name:
select name, sum(value) value, min(date)
from yt
group by name;
If the date should be determined by ordering on name first, then you could use:
select t.name, sum(value) value, d.date
from yt t
cross join
(
select top 1 name, date
from yt
order by name, date
) d
group by t.name, d.date;

Selecting specific data based on multiple conditions

I need some help constructing a SQL command for a database query. The database has 5 columns:
Date(string)
Name(string)
number(int)
There can be multiple entries for each date, name, and number.
I want to SELECT only one row for each date and name combination. The problem is there are multiple instances of these. For each date and name combination I want to select the one with the highest number. I would like it ordered by date. For example:
date | name | number
1/1/1 henry 500
1/1/1 henry 2000
1/1/1 jacob 5
1/1/1 jacob 8
1/2/1 henry 6
The command would return:
1/1/1 henry 2000
1/1/1 jacob 8
1/2/1 henry 6
I have been messing around with some commands but I am pretty lost. Is this even possible?
You can use ROW_NUMBER:
WITH cte
AS (SELECT date,
name,
number,
rn = Row_number ()
OVER(
partition BY date, name
ORDER BY number DESC)
FROM dbo.tablename)
SELECT date,
name,
number
FROM CTE
WHERE rn = 1
ORDER BY date ASC
ROW_NUMBER will always select one record per group. If you want to get all rows with the highest number for a given date and name (if there is more than one), use DENSE_RANK instead.
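The difference only shows up when there is a tie for the highest number. A small sketch, using Python's sqlite3 as a stand-in for SQL Server (an assumption) and one hypothetical extra row so that henry has two entries of 2000 on 1/1/1:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (date TEXT, name TEXT, number INTEGER)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [
        ("1/1/1", "henry", 500),
        ("1/1/1", "henry", 2000),
        ("1/1/1", "henry", 2000),  # hypothetical tie for the highest number
        ("1/1/1", "jacob", 5),
        ("1/1/1", "jacob", 8),
        ("1/2/1", "henry", 6),
    ],
)


def top_rows(rank_function):
    """Return the rows ranked 1 within each (date, name) group.

    rank_function must be 'ROW_NUMBER' or 'DENSE_RANK'; it is spliced into
    the SQL text, so it is checked against that fixed list first.
    """
    if rank_function not in ("ROW_NUMBER", "DENSE_RANK"):
        raise ValueError(rank_function)
    return conn.execute(
        f"""
        SELECT date, name, number
        FROM (
            SELECT date, name, number,
                   {rank_function}() OVER (
                       PARTITION BY date, name ORDER BY number DESC) AS rn
            FROM tablename
        ) ranked
        WHERE rn = 1
        ORDER BY date, name
        """
    ).fetchall()


print(top_rows("ROW_NUMBER"))  # one row per group: 3 rows
print(top_rows("DENSE_RANK"))  # ties kept: henry's 2000 appears twice, 4 rows
```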
SELECT date, name, MAX(number)
FROM Table1
GROUP BY date, name
ORDER BY date, name
Try grouping by date and name and then selecting the maximum number. Like so (exact syntax may vary depending on your version of sql):
select
date,
name,
max(number)
from
yourtable
group by
date,
name
order by
date asc

Return min date and corresponding amount to that distinct ID

Afternoon
I am trying to return the min value/ max values in SQL Server 2005 when I have multiple dates that are the same but the values in the Owed column are all different. I've already filtered the table down by my select statement into a temp table for a different query, when I've then tried to mirror I have all the duplicated dates that you can see below.
I now have a table that looks like:
ID| Date |Owes
-----------------
1 20110901 89
1 20110901 179
1 20110901 101
1 20110901 197
1 20110901 510
2 20111001 10
2 20111001 211
2 20111001 214
2 20111001 669
My current query:
Drop Table #Temp
Select Distinct Convert(Varchar(8), DateAdd(dd, Datediff(DD,0,DateDue),0),112)as Date
,ID
,Paid
Into #Temp
From Table
Where Paid <> '0'
Select ,Id
,Date
,Max(Owed)
,Min(Owed)
From #Temp
Group by ID, Date, Paid
Order By ID, Date, Paid
This doesn't strip out any of my dates that are the same. I'm new to SQL, but I presume it's because my owed column has different values. I basically want to be able to pull back the first record, as this will always be my minimum paid, and my last record, which will always be my maximum owed, to work out my total owed by ID.
I'm new to SQL so would like to understand what I've done wrong for my future knowledge of structuring queries?
Many Thanks
Your "select into" statement doesn't include an Owed column, does it?
GROUP BY is the normal way you "strip out values that are the same". If you group by ID and Date, you will get one row in your result for each distinct pair of values in those two columns. Each row in the results represents ALL the rows in the underlying table, and aggregate functions like MIN, MAX, etc. can pull out values.
SELECT id, date, MAX(owes) as MaxOwes, MIN(owes) as minOwes
FROM myFavoriteTable
GROUP BY id, date
In SQL Server 2005 there are "windowing functions" that allow you to use aggregate functions on groups of records, without grouping. An example below. You will get one row for each row in the table:
SELECT id, date, owes,
MAX(Owes) over (PARTITION BY id, date) AS MaxOwes,
MIN(Owes) over (PARTITION BY id, date) AS MinOwes
FROM myfavoriteTable
If you name a column "MinOwes" it might sound like you're just fishing tho.
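To make the contrast between the two queries concrete, here is a sketch that runs both against the question's sample data in Python's sqlite3 (a stand-in for SQL Server, which is an assumption; the window is partitioned by id and date, matching the GROUP BY). The grouped query collapses the table to one row per id and date, while the windowed query keeps all nine rows and repeats the group's max and min on each.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myfavoriteTable (id INTEGER, date TEXT, owes INTEGER)")
conn.executemany(
    "INSERT INTO myfavoriteTable VALUES (?, ?, ?)",
    [
        (1, "20110901", 89), (1, "20110901", 179), (1, "20110901", 101),
        (1, "20110901", 197), (1, "20110901", 510),
        (2, "20111001", 10), (2, "20111001", 211), (2, "20111001", 214),
        (2, "20111001", 669),
    ],
)

# GROUP BY: one output row per (id, date).
grouped = conn.execute(
    """
    SELECT id, date, MAX(owes) AS MaxOwes, MIN(owes) AS MinOwes
    FROM myfavoriteTable
    GROUP BY id, date
    ORDER BY id
    """
).fetchall()

# Window aggregates: every input row survives, annotated with its group's values.
windowed = conn.execute(
    """
    SELECT id, date, owes,
           MAX(owes) OVER (PARTITION BY id, date) AS MaxOwes,
           MIN(owes) OVER (PARTITION BY id, date) AS MinOwes
    FROM myfavoriteTable
    ORDER BY id, owes
    """
).fetchall()

print(grouped)        # [(1, '20110901', 510, 89), (2, '20111001', 669, 10)]
print(len(windowed))  # 9
print(windowed[0])    # (1, '20110901', 89, 510, 89)
```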
If you want to group by date you can't also group by ID, because ID is probably unique. Try:
Select Date
,Min(Owed) AS min_owed
,Max(Owed) AS max_owed
From #Temp
Group by Date
Order By Date
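To get additional values from the row (your question is a bit vague there), you could utilize window functions. Note that FIRST_VALUE and LAST_VALUE are only available from SQL Server 2012 onwards, and that LAST_VALUE needs an explicit frame, otherwise it only looks as far as the current row:
SELECT DISTINCT
Date
,first_value(ID) OVER (PARTITION BY Date ORDER BY Owed) AS min_owed_ID
,last_value(ID) OVER (PARTITION BY Date ORDER BY Owed
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS max_owed_ID
,first_value(Owed) OVER (PARTITION BY Date ORDER BY Owed) AS min_owed
,last_value(Owed) OVER (PARTITION BY Date ORDER BY Owed
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS max_owed
FROM #Temp
ORDER BY Date;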
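One thing to watch with LAST_VALUE: when a window has an ORDER BY but no frame clause, the default frame ends at the current row, so LAST_VALUE returns the current row's value (or that of its last peer) rather than the last value in the partition. The sketch below shows the effect in Python's sqlite3 (a stand-in database, which is an assumption; the table name `temp_owed` is a made-up replacement for #Temp).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temp_owed (ID INTEGER, Date TEXT, Owed INTEGER)")
conn.executemany(
    "INSERT INTO temp_owed VALUES (?, ?, ?)",
    [
        (1, "20110901", 89), (1, "20110901", 179), (1, "20110901", 101),
        (1, "20110901", 197), (1, "20110901", 510),
        (2, "20111001", 10), (2, "20111001", 211), (2, "20111001", 214),
        (2, "20111001", 669),
    ],
)

# Default frame: LAST_VALUE just echoes each row's own Owed value,
# so DISTINCT cannot collapse the rows.
default_frame = conn.execute(
    """
    SELECT DISTINCT Date,
           LAST_VALUE(Owed) OVER (PARTITION BY Date ORDER BY Owed) AS max_owed
    FROM temp_owed
    ORDER BY Date, max_owed
    """
).fetchall()

# Explicit frame covering the whole partition: one row per date.
full_frame = conn.execute(
    """
    SELECT DISTINCT Date,
           LAST_VALUE(Owed) OVER (
               PARTITION BY Date ORDER BY Owed
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS max_owed
    FROM temp_owed
    ORDER BY Date
    """
).fetchall()

print(len(default_frame))  # 9
print(full_frame)          # [('20110901', 510), ('20111001', 669)]
```

FIRST_VALUE is not affected, because the default frame always starts at the first row of the partition.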