Query to count distinct contacts and group them by month - sql

This is my first post and I hope I am following the guidelines.
Beforehand: I am a complete beginner with SQL and have only used it a little in the past.
Here is my issue:
Prerequisite:
I have a table with contacts and timestamps, e.g.
contact_id    timestamp
contact_001   2022-01-03
contact_001   2022-01-16
contact_002   2022-01-03
contact_003   2022-01-05
contact_002   2017-04-27
contact_003   2017-04-27
Expected outcome:
I'd like to have a table which counts(!) the unique(!) contacts based on the contact_id per month and write it in a table so I get something like this:
Month     contactCount
2022-01   3
2017-04   2
Can someone show me how to write a query for that?
I really appreciate your help and I apologize if this is not the right way or place to put that question.
Please see my explanation above.

In a general sense it would be as simple as follows...
SELECT
  the_month,
  COUNT(DISTINCT contact_id)
FROM
  your_table
GROUP BY
  the_month
ORDER BY
  the_month
How to get the month from your timestamp, however, depends on the SQL dialect you're using.
For example:
SQL Server = DATEADD(month, DATEDIFF(month, 0, [timestamp]), 0)
SQL Server = DATEADD(DAY, 1, EOMONTH([timestamp], -1))
MySQL = DATE_FORMAT(timestamp, '%Y-%m-01')
Oracle = TRUNC(timestamp, 'MM')
So, which RDBMS do you use?
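For anyone who wants to try the pattern without a database server, here is a minimal sketch using Python's built-in sqlite3 module and the sample rows from the question. The table name contacts and the column name ts are assumptions made for this example, and strftime('%Y-%m', ...) is SQLite's counterpart to the month expressions listed above.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
# Table name "contacts" and column name "ts" are assumptions for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (contact_id TEXT, ts TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?)",
    [
        ("contact_001", "2022-01-03"),
        ("contact_001", "2022-01-16"),
        ("contact_002", "2022-01-03"),
        ("contact_003", "2022-01-05"),
        ("contact_002", "2017-04-27"),
        ("contact_003", "2017-04-27"),
    ],
)

# strftime('%Y-%m', ...) reduces each date to its year and month.
# COUNT(DISTINCT ...) is evaluated separately inside each month group,
# so contact_001 counts once for January even though it appears twice.
monthly_counts = conn.execute(
    """
    SELECT strftime('%Y-%m', ts) AS month,
           COUNT(DISTINCT contact_id) AS contactCount
    FROM contacts
    GROUP BY month
    ORDER BY month
    """
).fetchall()

for month, contact_count in monthly_counts:
    print(month, contact_count)
```

With the sample data this prints 2 for 2017-04 and 3 for 2022-01, matching the expected outcome in the question.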

Try the following
SQL Server
SELECT
FORMAT(timestamp, 'yyyy-MM') AS month,
COUNT(DISTINCT contact_id) AS contactCount
FROM table
GROUP BY FORMAT(timestamp, 'yyyy-MM')
ORDER BY 1
MySQL
SELECT
DATE_FORMAT(timestamp, '%Y-%m') AS month,
COUNT(DISTINCT contact_id) AS contactCount
FROM table
GROUP BY DATE_FORMAT(timestamp, '%Y-%m')
ORDER BY 1

Many thanks guys! You helped me a lot. I was not aware that COUNT(DISTINCT contact_id) works with "GROUP BY" at the same time as I assumed that unique entries are only considered when they first appeared.
Here is what I finally used:
SELECT FORMAT_DATE('%Y-%m', date) AS month,
COUNT(DISTINCT contact_id) AS contactCount
FROM table
WHERE DATE(date) BETWEEN '2017-01-01' AND '2023-01-31'
GROUP BY month
ORDER BY 1

Related

SQL - where MIN date is exactly 7 days ago

I need to select only the lines where MIN date is 7 days ago.
I have Customers with invoices, the invoices have due dates. I can select MIN due date of these invoices per Customer. I am now stuck as I cannot select only the Customers who are exactly 7 days overdue with their oldest invoice.
This is what I have:
select
customerID,
MIN(dueDate) as min_due_date
from invoices
where (invoiceStatus NOT IN ('PaidPosted','Cancelled'))
and (entity = 'HQ')
group by (customerID)
I tried adding:
and min_due_date = dateadd(day, -7, SYSDATE())
This does not work.
What am I doing wrong?
Thanks.
Use a having clause to filter the minimum date:
select customerID, min(dueDate) as min_due_date
from invoices
where invoiceStatus not in ('PaidPosted', 'Cancelled') and
entity = 'HQ'
group by customerID
having min(duedate) = current_date - interval '7 day';
Note that date functions are highly database specific. The above is standard SQL, but the exact syntax depends on the database you are using.
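To illustrate why the filter has to go into HAVING rather than WHERE, here is a small sketch run against SQLite through Python's sqlite3 module. The invoice rows and the fixed reference date are made up for the example, and date(?, '-7 days') is SQLite's spelling of "7 days before the given date"; other databases use different date arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices ("
    "customerID TEXT, dueDate TEXT, invoiceStatus TEXT, entity TEXT)"
)
# Hypothetical sample rows, not taken from the question.
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?, ?)",
    [
        ("A", "2024-03-08", "Open", "HQ"),        # oldest open invoice: 7 days overdue
        ("A", "2024-03-12", "Open", "HQ"),
        ("B", "2024-03-01", "Open", "HQ"),        # oldest open invoice: 14 days overdue
        ("B", "2024-03-08", "Open", "HQ"),
        ("C", "2024-03-01", "PaidPosted", "HQ"),  # removed by WHERE before grouping
        ("C", "2024-03-08", "Open", "HQ"),
    ],
)

# A fixed reference date keeps the example reproducible;
# in a real query this would be the current date.
reference_date = "2024-03-15"

# WHERE filters individual rows before grouping;
# HAVING filters whole groups after MIN() has been computed.
overdue_customers = conn.execute(
    """
    SELECT customerID, MIN(dueDate) AS min_due_date
    FROM invoices
    WHERE invoiceStatus NOT IN ('PaidPosted', 'Cancelled')
      AND entity = 'HQ'
    GROUP BY customerID
    HAVING MIN(dueDate) = date(?, '-7 days')
    ORDER BY customerID
    """,
    (reference_date,),
).fetchall()

print(overdue_customers)
```

Customer B has an invoice due exactly 7 days ago but is still excluded, because its oldest open invoice is older than that.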
Thank you, Gordon, you put me on the right track!
This is what eventually did the trick:
select customerID, min(dueDate) as min_due_date
from invoices
where invoiceStatus not in ('PaidPosted', 'Cancelled') and
entity = 'HQ'
group by customerID
having MIN(dueDate) = trunc(sysdate) - 7

SQL Server data search with date range

I have a table with the following columns:
Date
Skills,
Customer ID
I want to find out Date(x), Customers, Count of Customers in between Date(x) and Date(x)+6
Can somebody guide me how to make this query, or can I create this function in SQL Server?
If I understand you correctly, you want something like this.
(Take care: the syntax may be off, because I normally work only with Oracle, but I think it should work.)
select date, customer_id, COUNT(*)
from your_table -- replace with your table name
where date between getdate() and DATEADD(day, 6, getdate())
-- between the current database system date and 6 days later
group by date, customer_id
order by COUNT(*) desc -- optional: order the result ASC or DESC
If you have data on each date, then perhaps this is what you want:
select date, count(*),
sum(count(*)) over (order by date rows between 6 preceding and current row) as week_count
from t
group by date;
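The rolling-window idea can be tried out in SQLite (3.25 or later, which added window functions) through Python's sqlite3 module. This is only a sketch: the table t with columns visit_date and customer_id and the generated rows are assumptions, and the daily count is computed in a subquery first purely for readability. As in the query above, the 7-row window only equals a 7-day window when every date has at least one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (visit_date TEXT, customer_id TEXT)")

# Hypothetical data: one row per customer per day,
# with no gaps between 2024-01-01 and 2024-01-08.
rows_per_day = [1, 2, 1, 3, 1, 1, 2, 4]
sample_rows = []
for day, n_rows in enumerate(rows_per_day, start=1):
    for i in range(n_rows):
        sample_rows.append((f"2024-01-{day:02d}", f"cust_{i}"))
conn.executemany("INSERT INTO t VALUES (?, ?)", sample_rows)

# Inner query: rows per day. Outer query: running sum over the current
# row and the 6 rows before it, i.e. a trailing 7-day total when there
# is exactly one row per calendar day.
weekly = conn.execute(
    """
    SELECT visit_date,
           day_count,
           SUM(day_count) OVER (
               ORDER BY visit_date
               ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
           ) AS week_count
    FROM (
        SELECT visit_date, COUNT(*) AS day_count
        FROM t
        GROUP BY visit_date
    )
    ORDER BY visit_date
    """
).fetchall()

for visit_date, day_count, week_count in weekly:
    print(visit_date, day_count, week_count)
```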

Efficient way to query separate days of data?

I want to query statistics using SQL from 3 different days (in a row). The display would be something like:
15 users created today, 10 yesterday, 12 two days ago
The SQL would be something like (for today):
SELECT Count(*) FROM Users WHERE created_date >= '2012-05-11'
And then I would do 2 more queries for yesterday and the day before.
So in total I'm doing 3 queries against the entire database. The format for created_date is 2012-05-11 05:24:11 (date & time).
Is there a more efficient SQL way to do this, say in one query?
For specifics, I'm using PHP and SQLite (so the PDO extension).
The result should be 3 different numbers (one for each day).
Any chance someone could show some performance numbers in comparison?
You can use GROUP BY:
SELECT Count(*), created_date FROM Users GROUP BY created_date
That will give you a list of dates with the number of records found on that date. You can add criteria for created_date using a normal WHERE clause.
Edit: based on your edit:
SELECT Count(*), created_date FROM Users WHERE created_date>='2012-05-09' GROUP BY date(created_date)
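One option is to use GROUP BY DAY(created_date). Note that this is MySQL syntax: SQLite has no DAY() function and no INTERVAL arithmetic, so it will not run there unchanged. Here is the query: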
SELECT DATE(created_date), count(*)
FROM users
WHERE created_date > CURRENT_DATE - INTERVAL 3 DAY
GROUP BY DAY(created_date)
This would work I believe though I have no way to test it:
SELECT
(SELECT Count(*) FROM Users WHERE created_date >= '2012-05-11') as today,
(SELECT Count(*) FROM Users WHERE created_date >= '2012-05-10' AND created_date < '2012-05-11') as yesterday,
(SELECT Count(*) FROM Users WHERE created_date >= '2012-05-09' AND created_date < '2012-05-10') as day_before
;
Use GROUP BY like jeroen suggested, but if you're planning for other periods you can also set ranges like this:
SELECT SUM(IF(created_date BETWEEN '2012-05-01' AND NOW(), 1, 0)) AS `this_month`,
SUM(IF(DATE(created_date) = '2012-05-09', 1, 0)) AS `2_days_ago`
FROM ...
As noted below, SQLite (at least in older versions) lacks an IF function, but CASE is available instead. SQLite also has no NOW(); datetime('now') is the equivalent. So this should work:
SELECT SUM(CASE WHEN created_date BETWEEN '2012-05-01' AND datetime('now') THEN 1 ELSE 0 END) AS `this_month`,
SUM(CASE WHEN date(created_date) = '2012-05-09' THEN 1 ELSE 0 END) AS `2_days_ago`
FROM ...
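Since the question is about SQLite specifically, here is a hedged sketch of the single-query approach using conditional aggregation, run through Python's sqlite3 module rather than PDO. The sample rows are invented, and the reference date is passed in as a parameter so the result is reproducible; in production it would be today's date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (id INTEGER PRIMARY KEY, created_date TEXT)")
# Hypothetical sample rows in the 'YYYY-MM-DD HH:MM:SS' format from the question.
conn.executemany(
    "INSERT INTO Users (created_date) VALUES (?)",
    [
        ("2012-05-11 05:24:11",),
        ("2012-05-11 18:00:00",),
        ("2012-05-10 09:15:00",),
        ("2012-05-09 00:00:00",),
        ("2012-05-09 12:00:00",),
        ("2012-05-09 23:59:59",),
        ("2012-05-08 12:00:00",),  # older than the 3-day window
    ],
)

# One pass over the table: the WHERE clause limits the scan to the last
# three days, and each SUM(CASE ...) counts only the rows of one day.
day_counts = conn.execute(
    """
    SELECT
        SUM(CASE WHEN date(created_date) = date(:today) THEN 1 ELSE 0 END) AS today,
        SUM(CASE WHEN date(created_date) = date(:today, '-1 day') THEN 1 ELSE 0 END) AS yesterday,
        SUM(CASE WHEN date(created_date) = date(:today, '-2 days') THEN 1 ELSE 0 END) AS two_days_ago
    FROM Users
    WHERE created_date >= date(:today, '-2 days')
    """,
    {"today": "2012-05-11"},
).fetchone()

print(day_counts)
```

Keeping the WHERE condition on the bare created_date column (rather than wrapping it in a function) leaves room for an index on that column to be used.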

How can I pull data from a SQL Database that spans an academic year?

Basically, I want to pull data from August to May for a given set of dates. Using the between operator works as long as I do not cross the year marker (i.e. BETWEEN 8 AND 12 works -- BETWEEN 8 AND 5 does not). Is there any way to pull this data? Here is the SQL Query I wrote:
SELECT count(*), MONTH(DateTime)
FROM Downloads
WHERE YEAR(DateTime) BETWEEN 2009 AND 2010 AND MONTH(DateTime) BETWEEN 8 AND 5
GROUP BY MONTH(DateTime)
ORDER BY MONTH(DateTime)
Any help is appreciated.
Thanks,
Eric R.
Using BETWEEN with YEAR() and MONTH() is going to ruin any chance of using indexes on that column anyway. I would use:
SELECT
  COUNT(*) AS [count],
  YEAR(my_date) AS [year],
  MONTH(my_date) AS [month]
FROM
  Downloads
WHERE
  my_date >= '2009-08-01' AND
  my_date < '2010-06-01'
GROUP BY
  YEAR(my_date),
  MONTH(my_date)
ORDER BY
  YEAR(my_date), MONTH(my_date)
(I used my_date because I can't bring myself to refer to a column as DateTime) :)
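The half-open range (>= start, < day after the end) is the key idea here, and it can be checked quickly in SQLite through Python's sqlite3 module. This is a sketch with invented rows; SQLite has no YEAR() or MONTH(), so strftime stands in for them, and the aliases yr and mon are just names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Downloads (my_date TEXT)")
# Hypothetical rows around both edges of the 2009/2010 academic year.
conn.executemany(
    "INSERT INTO Downloads VALUES (?)",
    [
        ("2009-07-31 23:59:59",),  # just before the range
        ("2009-08-01 00:00:00",),  # first instant inside the range
        ("2009-08-15 10:00:00",),
        ("2009-12-24 08:30:00",),
        ("2010-01-05 14:00:00",),
        ("2010-05-31 23:59:59",),  # last second inside the range
        ("2010-06-01 00:00:00",),  # excluded by the strict upper bound
    ],
)

# Filter on the raw column with a half-open interval, then group by
# year and month so that months from different years stay separate.
academic_year = conn.execute(
    """
    SELECT CAST(strftime('%Y', my_date) AS INTEGER) AS yr,
           CAST(strftime('%m', my_date) AS INTEGER) AS mon,
           COUNT(*) AS downloads
    FROM Downloads
    WHERE my_date >= '2009-08-01'
      AND my_date <  '2010-06-01'
    GROUP BY yr, mon
    ORDER BY yr, mon
    """
).fetchall()

for yr, mon, downloads in academic_year:
    print(yr, mon, downloads)
```

Ordering by year first keeps August to December ahead of January to May, which ordering by month alone would not.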
SELECT count(*), MONTH(DateTime)
FROM Downloads
WHERE DateTime>'2009/8/1 00:00:00' AND datetime<'2010/6/1 00:00:00'
GROUP BY MONTH(DateTime)
ORDER BY MONTH(DateTime)

SQL query to group by day

I want to list all sales, and group the sum by day.
Sales (saleID INT, amount INT, created DATETIME)
NOTE: I am using SQL Server 2005.
If you're using SQL Server,
dateadd(DAY,0, datediff(day,0, created)) returns the day the sale was created, with the time portion reset to midnight.
For example, if the sale was created on '2009-11-02 06:12:55.000',
dateadd(DAY,0, datediff(day,0, created)) returns '2009-11-02 00:00:00.000'.
select sum(amount) as total, dateadd(DAY,0, datediff(day,0, created)) as created
from sales
group by dateadd(DAY,0, datediff(day,0, created))
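The same "strip the time, then group" pattern can be tried without a SQL Server instance. Below is a minimal sketch against SQLite via Python's sqlite3 module, where date(created) plays the role of the dateadd/datediff expression; the sample sales rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales (saleID INTEGER PRIMARY KEY, amount INTEGER, created TEXT)"
)
# Hypothetical sales; two of them fall on the same calendar day.
conn.executemany(
    "INSERT INTO Sales (amount, created) VALUES (?, ?)",
    [
        (100, "2009-11-02 06:12:55"),
        (50, "2009-11-02 17:40:00"),
        (75, "2009-11-03 09:00:00"),
        (20, "2009-12-02 11:30:00"),
    ],
)

# date(created) drops the time of day, so all sales made on the
# same calendar date end up in the same group.
daily_totals = conn.execute(
    """
    SELECT date(created) AS saledate, SUM(amount) AS total
    FROM Sales
    GROUP BY saledate
    ORDER BY saledate
    """
).fetchall()

for saledate, total in daily_totals:
    print(saledate, total)
```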
For SQL Server:
GROUP BY datepart(year, datefield),
datepart(month, datefield),
datepart(day, datefield)
or faster (from Q8-Coder):
GROUP BY dateadd(DAY, 0, datediff(day, 0, created))
For MySQL:
GROUP BY year(datefield), month(datefield), day(datefield)
or better (from Jon Bright):
GROUP BY date(datefield)
For Oracle:
GROUP BY to_char(datefield, 'yyyy-mm-dd')
or faster (from IronGoofy):
GROUP BY trunc(created);
For Informix (by Jonathan Leffler):
GROUP BY date_column
GROUP BY EXTEND(datetime_column, YEAR TO DAY)
If you're using MySQL:
SELECT
  DATE(created) AS saledate,
  SUM(amount)
FROM
  Sales
GROUP BY
  saledate
If you're using MS SQL 2008:
SELECT
  CAST(created AS date) AS saledate,
  SUM(amount)
FROM
  Sales
GROUP BY
  CAST(created AS date)
For PostgreSQL:
GROUP BY to_char(timestampfield, 'yyyy-mm-dd')
or using cast:
GROUP BY timestampfield::date
if you want speed, use the second option and add an index:
CREATE INDEX tablename_timestampfield_date_idx ON tablename(date(timestampfield));
This depends on which DBMS you are using, but in SQL Server convert(varchar,DateColumn,101) converts the DATETIME to a date-only string (one value per day),
so:
SELECT
sum(amount)
FROM
sales
GROUP BY
convert(varchar,created,101)
The magic number 101 selects the date format the value is converted to (mm/dd/yyyy).
If you're using SQL Server, you could add three calculated fields to your table:
Sales (saleID INT, amount INT, created DATETIME)
ALTER TABLE dbo.Sales
ADD SaleYear AS YEAR(Created) PERSISTED
ALTER TABLE dbo.Sales
ADD SaleMonth AS MONTH(Created) PERSISTED
ALTER TABLE dbo.Sales
ADD SaleDay AS DAY(Created) PERSISTED
and now you could easily group by, order by etc. by day, month or year of the sale:
SELECT SaleDay, SUM(Amount)
FROM dbo.Sales
GROUP BY SaleDay
Those calculated fields will always be kept up to date (when your "Created" date changes), they're part of your table, they can be used just like regular fields, and can even be indexed (if they're "PERSISTED") - great feature that's totally underused, IMHO.
Marc
For Oracle you can use
group by trunc(created);
as this truncates the created datetime to the previous midnight.
Another option is to
group by to_char(created, 'DD.MM.YYYY');
which achieves the same result, but may be slower as it requires a type conversion.
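The simplest and most intuitive solution for MySQL is:
GROUP BY day(datefield)
Note that DAY() returns only the day of the month (1 to 31), so this merges the same day number from different months and years into one group; it is only safe when the data spans less than a month.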
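Before relying on a day-of-month grouping, it is worth checking how it behaves when the data spans more than one month. The sketch below uses SQLite through Python's sqlite3 module, with strftime('%d', ...) standing in for MySQL's DAY(); the sales rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales (saleID INTEGER PRIMARY KEY, amount INTEGER, created TEXT)"
)
# Hypothetical sales: 2 November and 2 December share a day-of-month.
conn.executemany(
    "INSERT INTO Sales (amount, created) VALUES (?, ?)",
    [
        (100, "2009-11-02 06:12:55"),
        (50, "2009-11-02 17:40:00"),
        (75, "2009-11-03 09:00:00"),
        (20, "2009-12-02 11:30:00"),
    ],
)

# Grouping by the day-of-month number alone: the December sale is
# added to the same group as the two sales from 2 November.
by_day_of_month = conn.execute(
    """
    SELECT CAST(strftime('%d', created) AS INTEGER) AS day_of_month,
           SUM(amount) AS total
    FROM Sales
    GROUP BY day_of_month
    ORDER BY day_of_month
    """
).fetchall()

print(by_day_of_month)
```

Here day 2 totals 170 because it combines sales from two different months, which is usually not what a "per day" report is meant to show.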
Use LINQ:
from c in Customers
group c by DbFunctions.TruncateTime(c.CreateTime) into date
orderby date.Key descending
select new
{
Value = date.Count().ToString(),
Name = date.Key.ToString().Substring(0, 10)
}