Count query with timestamp value - sql

I would like to create a count query (in Postgres) that counts data.data_name depending on data.todb_date.
What I want is for the query to count all the rows whose date is later than the condition in the WHERE clause. I tried COUNT(data.data_name) and COUNT(*), but they didn't work.
My planned result looks like this:
todb_date: 2016-01-01
data.data_name : test1
count: 150
todb_date: 2017-01-01
data.data_name : test1
count: 130
This is the query I have tried:
SELECT data.data_name, parentdata.data_id,
data.data_id, parentdata.todb_date,
COUNT (data.data_name)
FROM parentdata, data
WHERE parentdata.data_id = data.data_id
AND parentdata.todb_date > '2016-01-01'
GROUP BY parentdata.data_id, data.data_id, data.data_name, parentdata.todb_date

As @Usagi Miyamoto suggested, you should use the date_trunc() function to group your results by a given time increment (here: per year):
SELECT d.data_name nam, date_trunc('year',p.todb_date) yr, COUNT(*) cnt
FROM parentdata p
INNER JOIN data d ON p.data_id = d.data_id AND p.todb_date > '2016-01-01'
GROUP BY d.data_name,date_trunc('year',p.todb_date)
ORDER BY nam, yr
If you replace 'year' with 'day' you will get daily counts.
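To see the per-year grouping on a few rows, here is a minimal sketch that runs the same shape of query through Python's built-in sqlite3 module. This is not Postgres: SQLite has no date_trunc(), so strftime('%Y', ...) stands in for date_trunc('year', ...), and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parentdata (data_id INTEGER, todb_date TEXT);
    CREATE TABLE data (data_id INTEGER, data_name TEXT);
    -- Hypothetical sample rows, not taken from the question.
    INSERT INTO data VALUES (1, 'test1'), (2, 'test1'), (3, 'test1');
    INSERT INTO parentdata VALUES
        (1, '2015-12-31'),  -- dropped by the date filter
        (1, '2016-03-10'),
        (2, '2016-07-01'),
        (3, '2017-02-15');
""")

# strftime('%Y', ...) is SQLite's stand-in for Postgres date_trunc('year', ...).
rows = conn.execute("""
    SELECT d.data_name AS nam, strftime('%Y', p.todb_date) AS yr, COUNT(*) AS cnt
    FROM parentdata p
    INNER JOIN data d ON p.data_id = d.data_id AND p.todb_date > '2016-01-01'
    GROUP BY d.data_name, strftime('%Y', p.todb_date)
    ORDER BY nam, yr
""").fetchall()

for row in rows:
    print(row)
```

Each output row is one (name, year, count) bucket, which is the layout asked for in the question.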

Related

Postgres - Pick 1 max row instead of all max rows in the same month on a LOD (group by)

Hi, I have a data set which has product, site, station, date and some numerical fields such as sn_count, blob etc.
Within every combination of product, site and station, if there are multiple entries for the same month from different dates, I want to pick only the one row with the max sn count in that month.
The code I have right now works for the most part. It filters out rows with lower sn counts in that month, but it gives me all rows that share the same max sn count, whereas I want just one row per month.
This is my code:
FROM insight_info_temp a
INNER JOIN
(
SELECT distinct b.product_code,b.site_name,b.station_type,to_char(b.date_b, 'YYYY-MM') as date_new,
MAX(dist_sn_count_at_blob) as max_sn
FROM insight_info_temp b
GROUP BY b.product_code,b.site_name,b.station_type,to_char(b.date_b, 'YYYY-MM')
) b
ON a.product_code = b.product_code and
a.site_name = b.site_name and
a.station_type = b.station_type and
to_char(a.date_b, 'YYYY-MM') = b.date_new
AND a.dist_sn_count_at_blob = b.max_sn
where a.product_code = 'D00'
and a.site_name = 'F00' and a.station_type = 'A00';
This is the result I have:
The highlighted rows have the same sn count, which is the max sn count for that month.
However, I only want one of these rows, not both.
My guess is that you have two observations with the same dist_sn_count_at_blob.
This is a candidate for PostgreSQL's distinct on.
Please try something like this:
select distinct on (product_code, site_name, station_type,
to_char(date_b, 'YYYY-MM'))
dist_sn_count_at_blob, last_updated_at_pkey, <other columns>
from insight_info_temp
where product_code = 'D00'
and site_name = 'F00'
and station_type = 'A00'
order by product_code, site_name, station_type,
to_char(date_b, 'YYYY-MM'), dist_sn_count_at_blob desc;
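For anyone who wants to try the "one row per group" idea without a Postgres instance, here is a rough sketch using Python's sqlite3 module. SQLite does not support DISTINCT ON, so this swaps in ROW_NUMBER() (needs SQLite 3.25 or newer), which is a different technique with the same goal. The sample rows and the date_b DESC tie-breaker are assumptions of mine, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE insight_info_temp (
        product_code TEXT, site_name TEXT, station_type TEXT,
        date_b TEXT, dist_sn_count_at_blob INTEGER
    );
    -- Hypothetical sample rows.
    INSERT INTO insight_info_temp VALUES
        ('D00', 'F00', 'A00', '2020-01-05', 10),
        ('D00', 'F00', 'A00', '2020-01-20', 40),  -- tied for the January max
        ('D00', 'F00', 'A00', '2020-01-25', 40),  -- tied for the January max
        ('D00', 'F00', 'A00', '2020-02-03', 7);
""")

# ROW_NUMBER() numbers the rows inside each (product, site, station, month)
# group; keeping rn = 1 returns exactly one row per group even when the max
# count is tied. The date_b DESC tie-breaker is an assumption.
rows = conn.execute("""
    SELECT month_key, date_b, dist_sn_count_at_blob
    FROM (
        SELECT strftime('%Y-%m', date_b) AS month_key,
               date_b,
               dist_sn_count_at_blob,
               ROW_NUMBER() OVER (
                   PARTITION BY product_code, site_name, station_type,
                                strftime('%Y-%m', date_b)
                   ORDER BY dist_sn_count_at_blob DESC, date_b DESC
               ) AS rn
        FROM insight_info_temp
        WHERE product_code = 'D00' AND site_name = 'F00' AND station_type = 'A00'
    ) AS ranked
    WHERE rn = 1
    ORDER BY month_key
""").fetchall()

for row in rows:
    print(row)
```

In Postgres itself, adding a further column to the ORDER BY of the distinct on query plays the same tie-breaking role.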

How to join tables in SQL?

I have two tables named shoes_type and shoes_list. The shoes_type table includes shoes_id, shoes_size, shoes_type, date, project_id. The shoes_list table has shoes_quantity, shoes_id, shoes_color, date, project_id.
I need to get the sum of shoes_quantity based on the shoes_type, shoes_size, date, and also project_id.
I get how to sum the shoes_quantity based on color by doing:
select shoes_color, sum(shoes_quantity)
from shoes_list group by shoes_color
Basically what I want to see is the total quantity of shoes based on the type, size, date and project_id. The type and size information are available on shoes_type table, while the quantity is coming from the shoes_list table. I expect to see something like:
shoes_type  shoes_size  total quantity  date      project_id
heels       5           3               19/10/02  1
sneakers    5           3               19/10/02  1
sneakers    6           1               19/10/05  1
heels       7           5               19/10/03  1
For the desired result, I have tried:
select shoes_type, shoes_size, date, project_id, sum(shoes_quantity)
from shoes_type st
join shoes_list sl
on st.project_id = sl.project_id
and st.shoes_id = sl.shoes_id
and st.date = sl.date
group by shoes_type, shoes_size, date, project_id
Unfortunately, I got an error that says that the column reference "date" is ambiguous.
How should I fix this?
Thank you.
The date column exists in both tables, so you have to specify which table to select it from. Since the query uses table aliases, replace date with st.date or sl.date.
Qualify all column references to remove the "ambiguous" column error:
select st.shoes_type, st.shoes_size, st.date, st.project_id, sum(sl.shoes_quantity)
from shoes_type st join
shoes_list sl
on st.project_id = sl.project_id and
st.shoes_id = sl.shoes_id and
st.date = sl.date
group by st.shoes_type, st.shoes_size, st.date, st.project_id;
If you want all columns from shoes_type, you might find that a correlated subquery is faster:
select st.*,
(select sum(sl.shoes_quantity)
from shoes_list sl
where st.project_id = sl.project_id and
st.shoes_id = sl.shoes_id and
st.date = sl.date
)
from shoes_type st;
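A small sketch of the ambiguity and its fix, run through Python's sqlite3 module rather than the asker's database. The sample rows are made up, and the error text shown is SQLite's wording, which differs from other engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shoes_type (shoes_id INTEGER, shoes_size INTEGER, shoes_type TEXT,
                             date TEXT, project_id INTEGER);
    CREATE TABLE shoes_list (shoes_quantity INTEGER, shoes_id INTEGER, shoes_color TEXT,
                             date TEXT, project_id INTEGER);
    -- Hypothetical sample rows.
    INSERT INTO shoes_type VALUES
        (1, 5, 'heels',    '2019-10-02', 1),
        (2, 6, 'sneakers', '2019-10-05', 1);
    INSERT INTO shoes_list VALUES
        (2, 1, 'red',   '2019-10-02', 1),
        (1, 1, 'black', '2019-10-02', 1),
        (1, 2, 'white', '2019-10-05', 1);
""")

join_clause = """
    FROM shoes_type st
    JOIN shoes_list sl
      ON st.project_id = sl.project_id
     AND st.shoes_id = sl.shoes_id
     AND st.date = sl.date
"""

# Unqualified columns: date and project_id exist in both tables, so the
# statement is rejected before it runs.
try:
    conn.execute(
        "SELECT shoes_type, shoes_size, date, project_id, SUM(shoes_quantity)"
        + join_clause
        + "GROUP BY shoes_type, shoes_size, date, project_id"
    )
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)
print("unqualified:", error_message)

# Qualified columns: every reference names its table alias.
rows = conn.execute(
    "SELECT st.shoes_type, st.shoes_size, st.date, st.project_id,"
    " SUM(sl.shoes_quantity) AS total_quantity"
    + join_clause
    + "GROUP BY st.shoes_type, st.shoes_size, st.date, st.project_id"
    " ORDER BY st.shoes_type"
).fetchall()
for row in rows:
    print(row)
```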

SQL - Group by with having Result

For the following table I need to fetch the users who made at least 2 distinct transactions and whose net sales sum to 20 or more.
Everything needs to be in the same select (I can't use a temp table). I am using the query below, but the result is ambiguous:
select z.customer_nbr, transaction_nbr
from sales_transaction,
(select customer_nbr
from sales_transaction
group by customer_nbr
having count(transaction_nbr) >=2) z
group by z.customer_nbr, transaction_nbr
having sum(net_sales_rtl)>20
Below is the result
Result ambiguity: this customer_nbr has no transaction with number 16.
By "user", I assume you mean the entity referred to by customer_nbr.
Your query is only looking at the net sales for a single transaction, not for the entire customer.
You seem to want aggregation and having:
select st.customer_nbr
from sales_transaction st
group by st.customer_nbr
having count(distinct st.transaction_nbr) >= 2 and
sum(st.net_sales_rtl) >= 20;
If you wanted all transactions to follow the 20 minimum, then two levels of aggregation would be appropriate:
select ct.customer_nbr
from (select st.customer_nbr, st.transaction_nbr,
sum(st.net_sales_rtl) as transaction_net_sales
from sales_transaction st
group by st.customer_nbr, st.transaction_nbr
) ct
group by ct.customer_nbr
having count(*) >= 2 and
min(ct.transaction_net_sales) >= 20;
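To compare the two readings of the requirement, here is a sketch that runs both queries on a handful of made-up rows via Python's sqlite3 module. It assumes the column is called net_sales_rtl, as in the question, and uses >= 20 to match "equal to or more than 20".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_transaction (
        customer_nbr INTEGER, transaction_nbr INTEGER, net_sales_rtl INTEGER
    );
    -- Hypothetical sample rows.
    INSERT INTO sales_transaction VALUES
        (1, 10, 15), (1, 11, 12),              -- 2 transactions, total 27
        (2, 20, 50),                           -- only 1 transaction
        (3, 30, 5),  (3, 31, 4),               -- 2 transactions, total 9
        (4, 40, 25), (4, 40, 5), (4, 41, 30);  -- transaction 40 has two rows
""")

# Reading 1: the customer's total across all transactions must reach 20.
per_customer = [r[0] for r in conn.execute("""
    SELECT st.customer_nbr
    FROM sales_transaction st
    GROUP BY st.customer_nbr
    HAVING COUNT(DISTINCT st.transaction_nbr) >= 2
       AND SUM(st.net_sales_rtl) >= 20
    ORDER BY st.customer_nbr
""")]

# Reading 2: every single transaction must reach 20 on its own.
per_transaction = [r[0] for r in conn.execute("""
    SELECT ct.customer_nbr
    FROM (SELECT st.customer_nbr, st.transaction_nbr,
                 SUM(st.net_sales_rtl) AS transaction_net_sales
          FROM sales_transaction st
          GROUP BY st.customer_nbr, st.transaction_nbr) ct
    GROUP BY ct.customer_nbr
    HAVING COUNT(*) >= 2
       AND MIN(ct.transaction_net_sales) >= 20
    ORDER BY ct.customer_nbr
""")]

print(per_customer)
print(per_transaction)
```

Customer 1 passes the first query (27 in total) but fails the second, because neither of its transactions reaches 20 on its own.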
I think what is missing here is a join between the results from sales_transaction and the subquery z.
Considering both your tables share column transaction_nbr, you could have something like this:
select z.customer_nbr, s.transaction_nbr
from sales_transaction s,
(select customer_nbr, transaction_nbr
from sales_transaction
group by customer_nbr, transaction_nbr
having count(transaction_nbr) >=2) z
where z.transaction_nbr = s.transaction_nbr
group by z.customer_nbr, s.transaction_nbr
having sum(net_sales_rtl)>20

SQL Server select max date per ID

I am trying to select the record with the max date for each finance_charge_id of a given service_user_id, together with the amount linked to that latest date:
select distinct
s.Finance_Charge_ID, MAX(s.start_date), s.Amount
from
Service_User_Finance_Charges s
where
s.Service_User_ID = '156'
group by
s.Finance_Charge_ID, s.Amount
The issue is that I receive multiple entries where the amount is different. I only want to receive the amount on the latest date for each finance_charge_id
At the moment I receive the below which is incorrect (the third line should not appear as the 1st line has a higher date)
Finance_Charge_ID  (No column name)  Amount
2                  2014-10-19        1.00
3                  2014-10-16        500.00
2                  2014-10-01        1000.00
Remove the Amount column from the group by to get the correct rows. You can then join that query onto the table again to get all the data you need. Here is an example using a CTE to get the max dates:
WITH MaxDates_CTE (Finance_Charge_ID, MaxDate) AS
(
select s.Finance_Charge_ID,
MAX(s.start_date) MaxDate
from Service_User_Finance_Charges s
where s.Service_User_ID = '156'
group by s.Finance_Charge_ID
)
SELECT *
FROM Service_User_Finance_Charges
JOIN MaxDates_CTE
ON MaxDates_CTE.Finance_Charge_ID = Service_User_Finance_Charges.Finance_Charge_ID
AND MaxDates_CTE.MaxDate = Service_User_Finance_Charges.start_date
WHERE Service_User_Finance_Charges.Service_User_ID = '156'
This can be done using a window function which removes the need for a self join on the grouped data:
select Finance_Charge_ID,
start_date,
amount
from (
select s.Finance_Charge_ID,
s.start_date,
max(s.start_date) over (partition by s.Finance_Charge_ID) as max_date,
s.Amount
from Service_User_Finance_Charges s
where s.Service_User_ID = 156
) t
where start_date = max_date;
As the window function does not require you to use group by you can add any additional column you need in the output.
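Here is a quick way to check the window-function version on the three rows from the question, using Python's sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or newer). The row for user 999 is a hypothetical extra, added to show that the WHERE filter keeps other users out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Service_User_Finance_Charges (
        Service_User_ID INTEGER, Finance_Charge_ID INTEGER,
        start_date TEXT, Amount REAL
    );
    INSERT INTO Service_User_Finance_Charges VALUES
        (156, 2, '2014-10-19', 1.00),
        (156, 3, '2014-10-16', 500.00),
        (156, 2, '2014-10-01', 1000.00),
        (999, 2, '2014-11-01', 7.00);  -- hypothetical other user
""")

# MAX(...) OVER (PARTITION BY ...) attaches each charge's latest date to
# every row of that charge; the outer filter keeps only the latest row.
rows = conn.execute("""
    SELECT Finance_Charge_ID, start_date, Amount
    FROM (
        SELECT s.Finance_Charge_ID, s.start_date, s.Amount,
               MAX(s.start_date) OVER (PARTITION BY s.Finance_Charge_ID) AS max_date
        FROM Service_User_Finance_Charges s
        WHERE s.Service_User_ID = 156
    ) t
    WHERE start_date = max_date
    ORDER BY Finance_Charge_ID
""").fetchall()

for row in rows:
    print(row)
```

If two rows of the same charge share the latest date, both are returned. A ROW_NUMBER() based variant would be needed to force a single row.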

Select from derived table sql exception

I'm trying to execute a select statement from derived table as follows in MSSQL SERVER 2005:
The problem I am trying to solve is that there are duplicate rows that differ in the DATE field only by seconds, but I only take minutes into account. For example:
ID DATE
1 08:20:00
1 08:20:01
2 09:21:00
5 10:00:00
5 10:00:01
I want to take DISTINCT values of the IDs and order by DATE, but to order by date I need to include the DATE field, so I can't select distinct on one column.
Derived table query (works by itself perfectly retrieving duplicates)
SELECT p.[SICIL] AS ID, h.[ZAMAN_TRH] AS ZAMAN_TRH
FROM [RF_BIO].[dbo].[PERSONEL] p, [RF_BIO].[dbo].[HAREKETLER] h
WHERE h.[ZAMAN_TRH] > '2013-05-27T00:00:00.000' AND h.[YON]= 2 AND
(p.[KARTNO] = h.[KARTNO] OR p.[SICIL]= h.[SICIL])
ORDER BY h.[ZAMAN_TRH] DESC
The query that uses the derived table:
SELECT DISTINCT [SICIL]
FROM ( SELECT p.[SICIL] AS SICIL, h.[ZAMAN_TRH] AS ZAMAN_TRH
FROM [RF_BIO].[dbo]. [PERSONEL] p, [RF_BIO].[dbo].[HAREKETLER] h
WHERE h.[ZAMAN_TRH] > '2013-05-27T00:00:00.000' AND h.[YON]= 2 AND
(p.[KARTNO] = h.[KARTNO] OR p.[SICIL]= h.[SICIL]) ORDER BY h.[ZAMAN_TRH] DESC ) AS LAST
This gets me sql exception in Java
java.sql.SQLException:
at net.sourceforge.jtds.jdbc.SQLDiagnostic.addDiagnostic(SQLDiagnostic.java:372)
at net.sourceforge.jtds.jdbc.TdsCore.tdsErrorToken(TdsCore.java:2893)
at net.sourceforge.jtds.jdbc.TdsCore.nextToken(TdsCore.java:2335)
at net.sourceforge.jtds.jdbc.TdsCore.getMoreResults(TdsCore.java:638)
at net.sourceforge.jtds.jdbc.JtdsStatement.executeSQLQuery(JtdsStatement.java:505)
at net.sourceforge.jtds.jdbc.JtdsStatement.executeQuery(JtdsStatement.java:1427)
Thank you for your help.
Use a GROUP BY clause, with an aggregate function in the ORDER BY clause:
SELECT p.[ID] AS ID
FROM [RF_BIO].[dbo].[PERSONEL] p, [RF_BIO].[dbo].[HAREKETLER] h
WHERE h.[DATE] > '2013-05-27T00:00:00.000' AND h.[YON]= 2
AND (p.[KART] = h.[KART] OR p.[ID]= h.[ID])
GROUP BY p.[ID]
ORDER BY MAX(h.[DATE]) DESC
The same query using the column names from the question:
SELECT p.[SICIL] AS SICIL
FROM [RF_BIO].[dbo].[PERSONEL] p, [RF_BIO].[dbo].[HAREKETLER] h
WHERE h.[ZAMAN_TRH] > '2013-05-27T00:00:00.000' AND h.[YON]= 2
AND (p.[KARTNO] = h.[KARTNO] OR p.[SICIL]= h.[SICIL])
GROUP BY p.[SICIL]
ORDER BY MAX(h.[ZAMAN_TRH]) DESC
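As a rough check of the GROUP BY plus ORDER BY MAX(...) approach, the sketch below runs it on rows modelled after the example in the question, using Python's sqlite3 module instead of SQL Server. The database prefix is dropped, and the card numbers and full timestamps are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PERSONEL (SICIL INTEGER, KARTNO TEXT);
    CREATE TABLE HAREKETLER (SICIL INTEGER, KARTNO TEXT, ZAMAN_TRH TEXT, YON INTEGER);
    -- Hypothetical rows: IDs 1 and 5 each have two entries one second apart.
    INSERT INTO PERSONEL VALUES (1, 'K1'), (2, 'K2'), (5, 'K5');
    INSERT INTO HAREKETLER VALUES
        (1, 'K1', '2013-05-28T08:20:00', 2),
        (1, 'K1', '2013-05-28T08:20:01', 2),
        (2, 'K2', '2013-05-28T09:21:00', 2),
        (5, 'K5', '2013-05-28T10:00:00', 2),
        (5, 'K5', '2013-05-28T10:00:01', 2);
""")

# GROUP BY collapses the near-duplicate rows to one row per SICIL, and
# ORDER BY MAX(...) sorts the groups by their most recent timestamp.
rows = conn.execute("""
    SELECT p.SICIL AS SICIL
    FROM PERSONEL p, HAREKETLER h
    WHERE h.ZAMAN_TRH > '2013-05-27T00:00:00.000' AND h.YON = 2
      AND (p.KARTNO = h.KARTNO OR p.SICIL = h.SICIL)
    GROUP BY p.SICIL
    ORDER BY MAX(h.ZAMAN_TRH) DESC
""").fetchall()

print(rows)
```

Each ID appears once, newest first, with no derived table and no ORDER BY inside a subquery.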